FOREIGN TRADE UNIVERSITY SCHOOL OF ECONOMICS AND INTERNATIONAL BUSINESS
Instructors: PhD Nguyen Thi Hong Vinh
PhD Pham Thi Cam Anh
List of team members
TABLE OF CONTENTS
INTRODUCTION
I THEORETICAL FRAMEWORK AND LITERATURE REVIEW
1.1 Income and Expenditure
1.2 Income effect theory
1.3 Consumer theory
1.4 Literature review
1.4.1 The impact of Family size on the amount of consumption
1.4.2 The impact of Income on people’s expenditure
II RESEARCH MODEL AND RESEARCH HYPOTHESIS
2.1 Research Model
2.2 Research Hypothesis
III DATA PROCESSING
3.1 Handling Values
3.2 Clustering
3.2.1 Overview
3.2.2 Details
4.1 Result
4.2 Diagnostics test
4.2.1 Testing statistical significance of an individual regression coefficient 𝛽i
4.2.2 Testing the significance of the model
4.2.3 Multicollinearity testing
4.2.4 Heteroscedasticity test
4.2.5 Omitted value test
5.1 Recommendations towards Cluster 1
5.2 Recommendations towards Cluster 0
REFERENCE
In today's highly competitive business landscape, understanding the factors that influence consumer spending behavior is paramount for companies aiming to maximize their profits and run successful marketing campaigns. This business analysis report examines the key factors that affect consumers' spending patterns, with the primary objective of providing insights to guide the development of an effective marketing campaign for the company.
To conduct the analysis, a regression model is used to test the hypothesis and examine the relationship between consumer spending and income, marketing campaigns, purchase frequency and family size. By applying this model to the dataset provided, we can extract meaningful insights and draw conclusions that will inform the company's marketing strategy.
This is the first time our team has worked on a business analysis report, so there may be some mistakes in the process. We would like to express our gratitude to the teacher for guiding us and being open to feedback.
I THEORETICAL FRAMEWORK AND LITERATURE REVIEW
1.1 Income and Expenditure
Income refers to the money that a person or entity receives in exchange for their labor or products.
Household income generally refers to the combined gross income (the total amount of a person’s or organization’s income in a particular period before tax is paid) of all members of a household above a specified age. Household income includes every member of a family who lives under the same roof, including spouses and their dependents.
Expenditure is the total amount of money that a government, organization, or person spends during a particular period of time. In this report, expenditure is measured by Spent (the money people spent on goods and services).
1.2 Income effect theory
The income effect is the change or shift in the level of consumption of goods and services when the purchasing power of consumers changes. This can be due to fluctuations in the consumer’s income, which change their consumption patterns and, in turn, the prices of goods. If a consumer’s income rises, they are more likely to buy more goods and services as long as other factors remain constant.
The demand for normal goods rises when the consumer’s income increases. The demand for inferior goods decreases as the income of the consumer increases.
The given dataset does not differentiate items by category. We therefore treat all items in this report as normal goods to determine whether income truly affects consumers' expenditure on them. This means that the higher the income, the higher the spending on these items; that is, income is the determining factor for consumer spending.
1.3 Consumer theory
Consumer theory is the study of how people decide to spend their money based on their individual preferences and budget constraints. Consumer theory seeks to predict their purchasing patterns by making the following three basic assumptions about human behavior:
- Utility maximization: The combination of goods or services that maximizes utility is determined by comparing the marginal utility of two choices and finding the alternative with the highest total utility within the budget limit. The decision favors the option that produces a higher level of satisfaction.
- Non-satiation: People are seldom satisfied with one trip to the shops and always want to consume more.
- Decreasing marginal utility: Consumers lose satisfaction with a product the more they consume it.
Utility maximization is directly related to the issue we are researching. With variables such as Marital Status and Family size, consumers have to consider carefully when making decisions that depend on others, and they aim to maximize benefits for their families. The consumer may consider purchasing more of one item and less of another. By maximizing utility, the consumer will buy the item that produces the greatest marginal utility for the least amount of spending.
1.4 Literature review
1.4.1 The impact of Family size on the amount of consumption
Most existing studies take the view that family size affects both the savings and the consumption expenses of the individual, but in opposing directions (Rehman et al., 2010). Consumption expenditure is regarded as a positive function of household size, as proposed by a number of consumption theories. Every addition to the family places an incremental burden on the household's current income, which diverts income towards consumption (Dornbusch et al., 2004), and meeting the day-to-day consumption needs of the additional family member raises the individual's consumption-income ratio. The effect of family size on consumption expenditure in real terms is assessed by examining how the proportion of income spent on consumption (the consumption-income ratio) responds to an increase in the number of family members. A number of studies agree that additional family members in a household result in an increased propensity to consume, implying that consumption expenses are positively impacted by family size (Kelley, 1988).
1.4.2 The impact of Income on people’s expenditure
Theoretical aspects of household expenditure and the private consumption function have been the subject of many studies. Many theoretical studies in econometrics have been directly or indirectly devoted to these issues or cover these economic processes (e.g., Bardsen et al. 2005; Intriligator et al. 1996; Mills 2003; Klein et al. 2005). Studies devoted solely to consumption behavior and the estimation of income changes are relatively few (for instance, Garratt et al. 2009; Lo et al. 2007; Marquez 2006).
According to research results (Astra, Remigijs, 2010) and mainstream economic theory, a one percent increase or decrease in income has a different impact on expenditures by purpose; thus elastic and inelastic expenditure purposes can be determined. Elasticity values vary widely when statistics on the average household budget are replaced with data on households by income quintile and a more sophisticated study is performed. Income changes have a significant impact on overall consumption and saving and, as a result, on the structure of consumption expenditure.
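The notion of elastic and inelastic expenditure purposes can be made concrete with the standard definition of income elasticity. The symbols below (E for expenditure on a given purpose, I for household income) are our own notation and do not come from the cited studies:

```latex
\varepsilon_I \;=\; \frac{\%\,\Delta E}{\%\,\Delta I}
            \;=\; \frac{\Delta E / E}{\Delta I / I},
\qquad
\begin{cases}
\varepsilon_I > 1 & \text{elastic expenditure purpose} \\
0 < \varepsilon_I < 1 & \text{inelastic expenditure purpose} \\
\varepsilon_I < 0 & \text{inferior good}
\end{cases}
```

For example, if a one percent rise in income is accompanied by a 1.5 percent rise in spending on a purpose, the elasticity is 1.5 and that purpose would be classed as elastic.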
II RESEARCH MODEL AND RESEARCH HYPOTHESIS
2.1 Research Model
Based on the theoretical framework as well as previous studies, the group has built a function to study the relationship between different factors and the amount of customer’s household spending on purchases:
Spent = f(Income, Cmp5, Cmp4, Cmp1, Frequency, Family_Size)
In which:
● Spent: Amount of customer’s household spending on purchases (USD)
● Income: Customer’s yearly household income (USD)
● Cmp5: Result of the 5th campaign
● Cmp4: Result of the 4th campaign
● Cmp1: Result of the 1st campaign
● Frequency: Total number of purchases made (purchasing unit)
● Family_Size: Total number of people in customer’s household (person)
With the data provided, the dependent variable y is related to more than one independent variable. Therefore, we use Multiple Linear Regression.
According to Natural Resources Biometrics (Kiernan, 2023), Multiple Linear Regression is essentially an extension of Simple Linear Regression. Accordingly, the Population Regression Model is constructed as:
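$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u_i$$
with the mean value of y given as:
$$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$
The Sample Regression Model is:
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots + b_k x_k$$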
Where:
● k = the number of independent variables (also called predictor variables)
● y = the random response variable/ dependent variable
● $\hat{y}$ = the estimated mean value of the dependent variable y given values for $x_1, x_2, \ldots, x_k$ (computed by using the multiple regression equation)
● $x_1, x_2, \ldots, x_k$ = the independent variables
● $\beta_0$ is the y-intercept (the value of y when all the predictor variables equal 0)
● $b_0$ is the estimate of $\beta_0$ based on the sample data
● $\beta_1, \beta_2, \beta_3, \ldots, \beta_k$ are the coefficients of the independent variables $x_1, x_2, \ldots, x_k$
● $b_1, b_2, b_3, \ldots, b_k$ are the sample estimates of the coefficients $\beta_1, \beta_2, \beta_3, \ldots, \beta_k$
● $u_i$ is the random error, which allows each response to deviate from the average value of y. The errors are assumed to be independent, have a mean of zero and a common variance ($\sigma^2$), and are normally distributed.
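As a minimal sketch of how the sample estimates b0, b1, ..., bk can be obtained from data, the snippet below fits a multiple linear regression by ordinary least squares with NumPy. The data are synthetic and the "true" coefficients are made up for illustration; none of it comes from the report's dataset.

```python
import numpy as np

# Synthetic data: two hypothetical predictors x1, x2 and made-up coefficients.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
true_beta = np.array([1.0, 2.0, -3.0])   # beta_0, beta_1, beta_2 (illustrative)
u = rng.normal(scale=0.1, size=n)        # random error with mean zero

# Design matrix: a column of ones for the intercept, then the predictors.
design = np.column_stack([np.ones(n), X])
y = design @ true_beta + u

# Least-squares estimates b_0, b_1, b_2 of the population coefficients.
b, *_ = np.linalg.lstsq(design, y, rcond=None)

# Fitted values from the sample regression equation.
y_hat = design @ b
print(np.round(b, 2))
```

With a small error variance and 200 observations, the estimates should land close to the made-up coefficients, which is what the sample regression model is meant to achieve.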
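Using the data provided, the Population Regression Model and Sample Regression Model for this work would be:
Population Regression Model:
$$\text{Spent} = \beta_0 + \beta_1 \text{Income} + \beta_2 \text{Cmp5} + \beta_3 \text{Cmp4} + \beta_4 \text{Cmp1} + \beta_5 \text{Frequency} + \beta_6 \text{Family\_Size} + u_i$$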
In which:
● Spent: dependent variable
● Income, Cmp5, Cmp4, Cmp1, Frequency, Family_Size: independent
variables
● β0: the intercept term of the model
● β1, β2, β3, β4, β5, β6: the regression coefficients of the corresponding independent variables
● ui: the disturbance term of the model
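Sample Regression Model:
$$\widehat{\text{Spent}} = b_0 + b_1 \text{Income} + b_2 \text{Cmp5} + b_3 \text{Cmp4} + b_4 \text{Cmp1} + b_5 \text{Frequency} + b_6 \text{Family\_Size}$$
In which:
● Income, Cmp5, Cmp4, Cmp1, Frequency, Family_Size: independent variables
● b0, b1, b2, b3, b4, b5, b6: the estimators of β0, β1, β2, β3, β4, β5, β6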
2.2 Research Hypothesis
H0: Regression coefficients of independent variables are equal to 0
H1: Regression coefficients of independent variables are different from 0
Research Hypothesis:
Regression coefficients of independent variables are different from zero, which means there is a positive or negative correlation between each independent variable and the dependent variable.
In other words, the customer’s yearly household income; the number of customers accepting the offer in the 5th, 4th and 1st campaigns; the total number of purchases made; and the total number of people in the customer’s household all have notable impacts on the amount of the customer’s household spending on purchases, whether positive or negative.
More specifically, if a customer’s yearly household income changes, the amount of the customer’s household spending on purchases would change in the same or the opposite direction. Similarly, if the number of customers accepting the offer in the 5th, 4th or 1st campaign, the total number of purchases made, or the total number of people in the customer’s household varies, the amount of the customer’s household spending on purchases would also differ notably.
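One common way to check H0 for an individual coefficient is the t-statistic t_j = b_j / se(b_j). The sketch below computes it with NumPy on synthetic data; the function name `t_statistics` and the data are hypothetical and are not the report's own procedure or results.

```python
import numpy as np

def t_statistics(design, y):
    """Return OLS estimates b_j and t_j = b_j / se(b_j) for H0: beta_j = 0."""
    n, p = design.shape
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ b
    sigma2 = residuals @ residuals / (n - p)           # estimated error variance
    cov_b = sigma2 * np.linalg.inv(design.T @ design)  # covariance of the estimates
    return b, b / np.sqrt(np.diag(cov_b))

# Synthetic example: x1 has a real effect on y, x2 has none.
rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 5.0 + 2.0 * x1 + rng.normal(size=n)

design = np.column_stack([np.ones(n), x1, x2])
b, t = t_statistics(design, y)
print(np.round(t, 2))
```

For large samples, an absolute t-statistic well above roughly 1.96 leads to rejecting H0 at the 5% level for that coefficient, while a value near zero does not.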
III DATA PROCESSING
3.1 Handling Values
Handling Missing Values
Select the entire range containing data, then select “Go To” in “Find & Select”. Click “Special”, tick “Blanks”, then select “Delete Sheet Rows” to delete all the rows that have missing values.
To build the “Living_with” column, replace values like "Married" and "Together" with "Partner", and values like "Absurd", "Widow", "YOLO", "Divorced", and "Single" with "Alone". Use the formula:
“=IF(OR(D2="Married"; D2="Together"); "Partner"; IF(OR(D2="Absurd"; D2="Widow"; D2="YOLO"; D2="Divorced"; D2="Single"); "Alone"; ""))”
Drag the formula down the column to apply it to all the rows. Next, use “Replace” in “Find & Select” to label encode all the data in the “Living_with” column as:
Calculate the total number of children by summing the "Kidhome" and "Teenhome" columns. Create a new column labeled "Children" and use the formula:
"Children = Kidhome + Teenhome"
“Family_size” column
Create a new column named "Family_size" to determine the size of the customer's family. Calculate it by summing the values from the "Living_with" and "Children" columns created above:
“Family_size = Children + Living_with”
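The spreadsheet steps in this section (deleting rows with blanks, mapping marital status, and deriving Children and Family_size) could be sketched in Python as follows. The sample rows and the column names "ID" and "Marital_Status" are hypothetical, and the numeric encoding Partner = 2, Alone = 1 is an assumption, since the encoding used in the report is not shown in the text.

```python
# Hypothetical sample rows; None or "" marks a blank cell.
rows = [
    {"ID": 1, "Marital_Status": "Married", "Kidhome": 1, "Teenhome": 1},
    {"ID": 2, "Marital_Status": "Widow", "Kidhome": 0, "Teenhome": 1},
    {"ID": 3, "Marital_Status": "Single", "Kidhome": None, "Teenhome": 0},
    {"ID": 4, "Marital_Status": "Together", "Kidhome": 0, "Teenhome": 0},
]

PARTNER = {"Married", "Together"}
ALONE = {"Absurd", "Widow", "YOLO", "Divorced", "Single"}
# Assumed label encoding (not stated in the text): adults in the household.
LIVING_WITH_CODE = {"Partner": 2, "Alone": 1}

def is_blank(value):
    """Treat None and empty or whitespace-only strings as blank cells."""
    return value is None or (isinstance(value, str) and value.strip() == "")

def living_with(status):
    """Mirror the spreadsheet IF/OR formula: Partner, Alone, or "" if unmatched."""
    if status in PARTNER:
        return "Partner"
    if status in ALONE:
        return "Alone"
    return ""

def add_family_columns(row):
    """Return a copy of the row with Living_with, Children and Family_size added.

    Raises KeyError for a marital status that matches neither group.
    """
    out = dict(row)
    out["Living_with"] = LIVING_WITH_CODE[living_with(row["Marital_Status"])]
    out["Children"] = row["Kidhome"] + row["Teenhome"]
    out["Family_size"] = out["Children"] + out["Living_with"]
    return out

# Step 1: drop rows that contain any blank cell.
complete = [r for r in rows if not any(is_blank(v) for v in r.values())]
# Step 2: derive the new columns.
prepared = [add_family_columns(r) for r in complete]
for r in prepared:
    print(r["ID"], r["Living_with"], r["Children"], r["Family_size"])
```

Under the assumed encoding, a married customer with one child and one teenager at home gets a Family_size of 4, and the row with a blank Kidhome value is dropped before any column is derived.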
To determine the number of customers in each cluster, we use the code below:
The total number of customers in Cluster 0 is 638, while the figures for Cluster 1 and Cluster 2 are 516 and 1062 respectively.
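The counting step can be sketched as below. This is not the report's original code, which is not reproduced in the text; the list `cluster_labels` is a hypothetical stand-in for the cluster assignment of each customer.

```python
from collections import Counter

# Hypothetical cluster labels, one per customer, as a clustering step might produce.
cluster_labels = [0, 2, 1, 2, 0, 2, 1, 2]

# Count how many customers fall into each cluster.
cluster_sizes = Counter(cluster_labels)
for cluster in sorted(cluster_sizes):
    print(f"Cluster {cluster}: {cluster_sizes[cluster]} customers")
```

Applied to the real label column, the same count would yield the cluster sizes reported above.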
Cluster distribution:
From the distribution chart of all clusters, 3 observations have an age above 100 and 1 observation has an income above 60000. These observations are treated as noise in the model and need to be removed.
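A minimal sketch of this removal step is shown below, assuming each observation carries "Age" and "Income" fields (hypothetical names) and using the cutoffs quoted in the text as parameters. The sample customers are made up.

```python
def remove_outliers(records, max_age, max_income):
    """Drop observations whose age or income exceeds the given limits."""
    return [r for r in records if r["Age"] <= max_age and r["Income"] <= max_income]

# Made-up observations for illustration only.
customers = [
    {"ID": 1, "Age": 45, "Income": 35000},
    {"ID": 2, "Age": 121, "Income": 36000},   # age above 100
    {"ID": 3, "Age": 38, "Income": 666000},   # income far above the cutoff
    {"ID": 4, "Age": 52, "Income": 52000},
]

# Cutoffs as stated in the text; adjust them to the thresholds actually used.
kept = remove_outliers(customers, max_age=100, max_income=60000)
print([r["ID"] for r in kept])
```

Keeping the limits as parameters makes it easy to rerun the filter if a different cutoff is chosen after inspecting the distribution chart.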
After removing the noise, the cluster distribution shows 3 clear clusters.
Figure 1 Overview of Clustering
From an overview of the values of Spent, Frequency, Success rate, Income and Family_Size, we can observe that:
● The average amount spent by Cluster 1 dominates the other two clusters in every year, at above 1305 USD. Meanwhile, Cluster 2 spent very little in the same period, at under 130 USD, one tenth of Cluster 1's spending.
● Cluster 1 also has the highest average income at about 76000 USD, and Cluster 2 has the lowest at only 35000 USD.
● There is a large difference in Family Size across the clusters. While customers in Cluster 0 and Cluster 2 are mostly married with children, as shown by their average family sizes of 2.87 and 2.88, almost all customers in Cluster 1 are still single.
● Cluster 1 and Cluster 0 are the most frequent buyers across all of the firm's channels in the period 2012-2014, which is about three times higher than Cluster 2, with about 4.58 to 6.45 times.
● The success rate of marketing campaigns targeting Cluster 1 outperforms the others. In other words, Cluster 1 appears to be the most promising group of target customers for the firm.
In conclusion, a glance at the values by cluster shows the dominance of Cluster 1 across all statistics, which helps the firm make better decisions in choosing target customers.
3.2.2 Details
a Purchase by channel
Figure 2 Purchase by channel
In general, all the clusters choose the Store as their main purchasing channel, which takes the largest proportion at 36.99%, 40.48% and 40.39% respectively. While the Web is the second choice of Cluster 0 and Cluster 2, Cluster 1 frequently buys from the Catalog. On the other hand, the proportion of Web purchases in Cluster 1 is still higher than that of Deals, with 24.45%.
b Purchase by Product