Banking Academy of Vietnam
International School of Business

Factors affecting Gen Z's online shopping behavior in Hanoi in 2022

Lecturer: PhD Dinh Thi Thanh Binh
Class: CityU9A
Group members:
Phan Anh Dũng CA9-016 (25%)
Nguyễn Kiều Linh CA9-040 (25%)
Dương Quỳnh Phương CA9-059 (25%)
Vũ Hoàng Long CA9-045 (25%)

Hanoi, December 2022

Contents
Introduction
Chapter 1: Theoretical and practical basis of research subjects and a number of main factors
1.1 Definition
1.2 Literature review on the factors affecting the online shopping frequency of Gen Z in the Hanoi area
Chapter 2: Research methodology and econometric model
2.1 Research method
2.1.1 Model building method
2.1.2 Methods of data collection and processing
2.2 Building econometric models
2.2.1 A random sample regression model
2.3 Description of the data
2.3.1 Data source
2.3.2 Describe the statistics
Chapter 3: Quantitative Analysis
3.1 Regression model
3.2 Analysis of the results after running the regression model
3.3 Meaning of Partial Regression Coefficients
3.4 Check the suitability of the model
3.5 Check the model's defects
3.5.1 Multicollinearity
3.5.2 Heteroskedasticity
Chapter 4: Hypothesis test
4.1 Wage test
4.2 Onl test
4.4 Gen test
4.5 Lvp test
4.6 Std test
Conclusion
References

Introduction
In recent years, online shopping has become more and more popular with many generations of consumers in the digital transformation era, especially during the Covid-19 period, creating new shopping habits and cultures. Regarding the change in consumer behavior, one report found that 58% of Vietnamese consumers think they will continue to shop for groceries on e-commerce platforms because of the convenience. Generation Z are the ones who create new trends and make an impact in the consumer segment. They adopt online shopping quickly because they use online tools frequently and easily keep up with online shopping trends. Besides, they also have the ability to influence family decisions
in general shopping activities (Nielsen, 2021). Therefore, Gen Z is a very important customer group that needs to be studied. Hanoi, the capital of Vietnam, contributes the second largest share (11%) of total retail sales in Vietnam, and online shopping has become extremely popular among the residents of this city. Therefore, the authors conducted a survey and research on online shopping behavior and the relationship between demographic factors and the online shopping frequency of Gen Z subjects in Hanoi in the context of the Covid-19 pandemic, in order to propose some solutions that help businesses and retailers providing online sales services make more appropriate adjustments to meet the expectations of Gen Z customers after an online purchase.

This essay uses table data analysis methods to study the degree of influence of six main factors: wage, online time, age, gender, living place, and level of study. By applying knowledge from econometrics together with socio-economic knowledge to analyze and find relationships between variables, the essay will answer the following questions: How do the main factors affect the frequency of online purchases of Gen Z? What is the level of impact? What should sellers do to attract more of these customers?
During the research process, data was collected through Google Forms and analyzed with econometric tools in STATA. The essay contains the following chapters:
Chapter 1: Rationale and research hypothesis, and econometric model
Chapter 2: Research methodology and econometric model
Chapter 3: Estimation, model testing, and statistical inference
Chapter 4: Solutions and recommendations
Due to many limitations of expertise and circumstances, the essay cannot avoid errors and omissions. The research team looks forward to receiving the comments of the subject lecturers in order to improve the essay. Our team sincerely thanks you!

Chapter 1: Theoretical and practical basis of research subjects and a number of main factors
1.1 Definition
Purchase frequency is the number of times that a customer makes a purchase in a given period of time. In this case, the frequency of a consumer's online shopping is measured by the number of times they made an online purchase in the last six months. Purchase frequency represents how often Gen Z shops online. We use levels from 1 to 4 to represent this variable: the higher the level, the more frequent the purchases, and vice versa.

In recent years, online shopping has become more and more popular with many generations of consumers in the digital transformation era, especially during the Covid-19 period, creating new shopping habits and cultures. Regarding the change in consumer behavior, one report found that 58% of Vietnamese consumers think they will continue to shop for groceries on e-commerce platforms because of the convenience. Generation Z are the ones who create new trends and make an impact in the consumer segment. They adopt online shopping quickly because they use online tools frequently and easily keep up with online shopping trends. Besides, they also have the ability to influence family decisions in general shopping activities
(Nielsen, 2021). Therefore, Gen Z is a very important customer group that needs to be studied. Hanoi, the capital of Vietnam, contributes the second largest proportion (11%) of total retail sales in Vietnam, and online shopping has become extremely popular among the residents of this city. Therefore, the authors conducted a survey and research on online shopping behavior and the relationship between demographic factors and the online shopping frequency of Gen Z subjects in Hanoi, in order to propose a number of solutions that help businesses and retailers providing online selling services make more appropriate adjustments to meet the expectations of Gen Z after buying online.

1.2 Literature review on the factors affecting the online shopping frequency of Gen Z in the Hanoi area
In general, both domestically and internationally, there have been research papers on the impact of Covid-19 on the shopping behavior of Gen Z and on the influence of some demographic factors on people's online shopping behavior. Specifically:
Consumers of all generations tended to buy goods and services online during the Covid-19 crisis (Jílková, P.; Králová, P., 2021). Sethuraman (2020) points out that the Covid-19 epidemic increased the demand for home delivery of food, groceries and healthcare. Gen Z in Croatia who have been self-isolating at home because of the Covid-19 pandemic say they s Pap, 2021).
Regarding the impact of demographic factors, according to Gurmu & Etana (2014), age is an important demographic aspect affecting online purchasing behavior, as purchasing decisions change with age. Young people tend to spend more on lifestyle, entertainment and fashion, while older people spend most on health-related expenses. Average income also plays an important role in online purchasing behavior (Ryscavage, 2015). Low-income people approach online shopping with caution because they have a lower tolerance for financial loss than high-income people (Gunes, 2018). In addition, within Vietnam, there is also a
research paper by the authors Nguyen Minh Hieu, Jimmy Armoogum and Nguyen Thi Binh (2021) indicating that women tend to shop online more often. This may be because women in Vietnam are often responsible for household chores and for taking care of other family members. So, when they have difficulty shopping in person, they are the first to adopt online shopping as an alternative.
However, in Vietnam today there are still very few studies that analyze the relationship between demographic factors and the online shopping frequency of Gen Z. Therefore, the purpose of this study is to examine online shopping behavior and the relationship between demographic factors and the online shopping frequency of Gen Z subjects in Hanoi in the context of the Covid-19 pandemic, in order to help businesses find more ways to attract these customers on online shopping channels.
Based on the studies reviewed above, we chose the following variables for the model:
Dependent variable: FRQ, Online purchase frequency (Unit: Levels)
Independent variables:
- Wage WGL (Unit: Levels)
- Online times ONL (Unit: Levels)
- Ages AGL (Unit: Levels)
- Gender GEN (dummy variable, Unit: 1)
- Living place LVP (dummy variable, Unit: 1)
- Study levels STD (Unit: Levels)

Chapter 2: Research methodology and econometric model
2.1 Research method
2.1.1 Model building method
Regression analysis method: find the dependence of one variable, called the dependent variable, on one or more other variables, called independent variables, for the purpose of estimating or predicting the expected value of the dependent variable given known values of the independent variables. Specifically, this study analyzes the relationship between the independent variables (Wage, Online times, Ages, Gender, Living place, Study levels) and the dependent variable (Online purchase frequency).
2.1.2 Methods of data collection and processing
- Methods of data collection
For secondary data: Our team collects data from other scientific research papers, articles in
newspapers or scientific journals that specialize in the subject.
For primary data: To reduce survey cost and time, our study uses convenience sampling. The targets are members of Gen Z (aged 10-25) who live in Hanoi and have shopped online. The survey was conducted from 01/12/2022 to 05/12/2022. After screening out 26 invalid answers, 114 responses were included in the analysis. The data is distributed fairly evenly over gender, as well as over two geographical regions, the key and non-key districts of Hanoi. Because the survey respondents were Gen Z, the share of respondents currently studying at college/university was quite large (78.3%), while 12.2% had a bachelor's degree and only 9.6% were in secondary school. Respondents with less than 5 million VND per month in personal income accounted for 54.8%, but more than 19% earned over 10 million VND per month. Regarding internet usage, the largest group spends more than five hours a day on the internet, 40% spend two to five hours a day, and only 18.3% spend less than two hours a day.
- Data processing method
The coefficients of the model are estimated by ordinary least squares (OLS). The team then checks the statistical significance of the regression coefficients and the suitability of the model based on the observations, and compares the results with previous and similar studies to find the best results to use for analysis. During this assignment, the group used knowledge of econometrics, macroeconomics and quantitative methods, with the main support of STATA, Microsoft Excel, and Microsoft Word, to synthesize and complete this essay.
Additionally, for variables measured using quantitative intervals, such as FRQ, Wage, Online time, Ages and STD, our team converts the data into numbered levels using the command: encode (var), gen(new var). The higher the level, the higher the performance, value, volume,
frequency, and vice versa. In particular, for LVP (place of residence), our team classified the districts of Hanoi from the original data into two regions: key areas (Hoan Kiem District, Ba Dinh District, Dong Da District, Hai Ba Trung District) and non-key areas (the remaining districts).

2.2 Building econometric models
After studying and referencing previous studies, our team decided to use multiple regression analysis to find the dependence of the dependent variable FRQ on the independent variables Wage, Online times, Ages, Gender, Living place and Study levels for the year 2022. The model consists of the following variables:
Dependent variable: FRQ, Online purchase frequency (Unit: Levels)
Independent variables:
- Wage WGL (Unit: Levels)
- Online times ONL (Unit: Levels)
- Ages AGL (Unit: Levels)
- Gender GEN (dummy variable, Unit: 1)
- Living place LVP (dummy variable, Unit: 1)
- Study levels STD (Unit: Levels)
2.2.1 A random sample regression model
$$FRQ_i = \hat{\beta}_1 + \hat{\beta}_2 WGL_i + \hat{\beta}_3 ONL_i + \hat{\beta}_4 AGL_i + \hat{\beta}_5 GEN_i + \hat{\beta}_6 LVP_i + \hat{\beta}_7 STD_i + \hat{u}_i$$
where:
$\hat{\beta}_1$: the estimate of the intercept
$\hat{\beta}_2$: the estimated slope of the variable WGL
$\hat{\beta}_3$: the estimated slope of the variable ONL
$\hat{\beta}_4$: the estimated slope of the variable AGL
$\hat{\beta}_5$: the estimated slope of the variable GEN
$\hat{\beta}_6$: the estimated slope of the variable LVP
$\hat{\beta}_7$: the estimated slope of the variable STD
$\hat{u}_i$: the residual, an estimate of the random error

2.3 Description of the data
2.3.1 Data source
- Scientific research papers, articles in newspapers or scientific journals that specialize in the subject
- Data collected through the Google Form
- Sample space: The survey was conducted in the Hanoi area, covering both key areas and non-key areas. Therefore, it
can be said that this sample space is large enough, objective enough and reliable enough to build a regression model.
2.3.2 Describe the statistics
To give the reader an overview and an initial assessment, the group describes the data before analyzing it. Through this description, the team can anticipate some errors that may arise when running the model due to lack of data. The variables include: Wage (WGL), Online times (ONL), Ages (AGL), Gender (GEN), Living place (LVP), Study levels (STD) for the year 2022.
Table 2.1 Statistical description of the variables
For the purchase frequency variable, 12 people shop online more than 10 times a month (level 4), compared with 35 people at the lowest level (level 1, fewer than 3 times a month). On average, these young people shop online at level 2.5, that is, between level 2 (3-5 times/month) and level 3 (6-10 times/month).
Next, 22 people have the highest monthly income of over 10 million VND (level 3), while the majority, about 63 people, are at level 1 (under 5 million VND). The average income of those who filled out the form is around level 2 (5-10 million VND/month).
For online time, the highest level, level 3 (over 5 hours/day), accounts for 41.7% of respondents, almost equal to level 2 (40%), while level 1 (under 2 hours/day) accounts for only 18.3%. On average, most Gen Z respondents spend time online at level 2 (2-5 hours/day).
For age level, the oldest group (over 25 years old) has only a few people and the lowest level (level 1) has only 19 people, so the remaining group (level 2, 19 to 25 years old) is the most numerous. The average age level of respondents is close to level 2.
Regarding gender, the mean of 0.42 < 0.5 shows that female respondents (0), with 67 votes, outnumber male respondents (1), with 48 votes.
The variable LVP is divided into two areas: the central area of Hanoi (1) has only 22 respondents, while the remaining
areas account for 92 votes. The mean of the variable, 0.26, also shows that difference.
Finally, for the STD variable, the average level is about 2 (currently studying at university/college), so the majority of survey participants are the Gen Z subjects that the group is targeting. The highest level, level 3 (graduated), accounts for only 12.2%, and the lowest level, level 1 (middle school/high school), accounts for only 9.6%.

Chapter 3: Quantitative Analysis
3.1 Regression model
Table 3.1 Regression Model
3.2 Analysis of the results after running the regression model
From the table above we have the sample regression function (SRF):
FRQ = 1.891962 - 0.2105482*wgl + 0.2585366*onl + 0.052408*agl - 0.4680162*gen + 0.1972811*lvp - 0.206545*std + u^
The model shows that Wage, Online times, Ages, Gender, Living place and Study levels have effects on online shopping frequency. The independent variables in the model explain 12.38% of the variation in online shopping frequency. The remaining 87.62% of the variation in FRQ is explained by other factors that are excluded from the model. In theory, they are included in the error term.
3.3 Meaning of Partial Regression Coefficients
- For $\hat{\beta}_1$: When the variables Wage, Online times, Ages, Gender, Living place and Study levels all equal 0, the average FRQ level is 1.891962. This is the average effect on FRQ of other factors that are not in the model.
- For $\hat{\beta}_2$: Holding the other factors constant, if the wage level increases (decreases) by 1 unit, FRQ decreases (increases) by 0.2105482 units.
- For $\hat{\beta}_3$: Holding the other factors constant, if the online time level increases (decreases) by 1 unit, FRQ increases (decreases) by 0.2585366 units.
- For $\hat{\beta}_4$: Holding the other factors constant, if the age level increases (decreases) by 1 unit, FRQ increases (decreases) by 0.052408 units.
- For $\hat{\beta}_5$: Holding the other factors constant, if gender increases (decreases) by 1 unit, that is, male (1) compared with female (0), FRQ decreases (increases) by
0.4680162 units.
- For $\hat{\beta}_6$: Holding the other factors constant, if living place increases (decreases) by 1 unit, that is, the central area (1) compared with the remaining areas (0), FRQ increases (decreases) by 0.1972811 units.
- For $\hat{\beta}_7$: Holding the other factors constant, if the study level increases (decreases) by 1 unit, FRQ decreases (increases) by 0.206545 units.

3.4 Check the suitability of the model
This test considers whether the slope coefficients of all independent variables can simultaneously equal 0.
Method: Testing the overall significance of the regression
$$H_0: R^2 = 0 \qquad H_1: R^2 \neq 0$$
$$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)} = \frac{0.1238 / 6}{(1 - 0.1238) / (117 - 6 - 1)} = 2.59$$
$$F_{\alpha}(k,\ n-k-1) = F_{0.05}(6,\ 110) = 2.31$$
We have F = 2.59 > 2.31 => Reject H0, the model is significant.

3.5 Check the model's defects
3.5.1 Multicollinearity
a) Nature: A good model must have the BLUE (best linear unbiased estimator) properties. However, due to incorrect model building or the nature of the data, a model may not achieve all of the desired properties. Multicollinearity is one of the issues affecting the model that we call an assumption violation. It is a flaw in a regression model that occurs when the independent variables Xi are linearly related to each other.
b) The causes: In practice, perfect multicollinearity rarely occurs. Multicollinearity can arise from the nature of the socio-economic phenomenon, where the independent variables already have a collinear relationship with each other. It can also occur because the survey data is not large enough, or because the survey data is not randomly sampled.
c) How to detect multicollinearity
Method 1: Use the corr command to check for multicollinearity. If the independent variables are strongly correlated with each other (|r| >= 0.8), multicollinearity can occur. Using the corr command, the following output is obtained:
All |r(Xj,Xk)| < 0.8 => No multicollinearity
Method 2: Use the variance inflation factor (VIF). If VIF > 10, multicollinearity occurs. Using the vif
command in STATA, we obtained the following result. We see that all VIF values are