
Optimize the New Campaign for BAMBO Fashion Store


DOCUMENT INFORMATION

Basic information

Title: Optimize The New Campaign For Bambo Fashion Store
Authors: Nguyen Xuan Hieu, Le Van Dong, Nguyen Thi Minh Tuyet, Tran Minh Thuc, To Duc Tin
Supervisor: Tran Thanh Cong
School: Ho Chi Minh City University of Economics and Finance
Major: Business Intelligence
Type: final report
Year of publication: 2024
City: Ho Chi Minh City
Number of pages: 56
File size: 5.58 MB

Structure

  • 1.1 OVERVIEW
  • 1.2 PROBLEM STATEMENT
  • 1.3 OBJECTIVES AND STUDY METHODOLOGY
  • 1.4 REPORT STRUCTURE
  • Chapter 2 DATA
    • 2.1 DATASET DESCRIPTION
      • 2.1.1 Customer Shopping Trends Dataset
      • 2.1.2 Data shape
      • 2.1.3 Data columns
      • 2.1.4 Data head
      • 2.1.5 Data tail
      • 2.1.6 Data information
    • 2.2 DATA STATISTICS DESCRIPTION
    • 2.3 DATA CLEANING
      • 2.3.1 Check for duplication
      • 2.3.2 Missing values calculation
      • 2.3.3 Data reduction
    • 2.4 FEATURE ENGINEERING
    • 2.5 SUMMARY
  • Chapter 3 EXPLORATORY DATA ANALYSIS
    • 3.1 EDA UNIVARIATE ANALYSIS
      • 3.1.1 Age segment
      • 3.1.2 Gender
      • 3.1.3 Item Purchased
      • 3.1.4 Category
      • 3.1.5 Purchase Amount (USD)
      • 3.1.6 Location
      • 3.1.7 Size
      • 3.1.8 Color
      • 3.1.9 Season
      • 3.1.10 Review rating
      • 3.1.11 Payment method
      • 3.1.12 Shipping Type
      • 3.1.13 Previous Purchases
      • 3.1.14 Frequency of Purchases
    • 3.2 EDA BIVARIATE ANALYSIS
      • 3.2.1 Analyze Average purchase
      • 3.2.2 Analyze average amount spent
      • 3.2.3 Analyze average purchase by age segment and season
    • 3.3 Summary
  • Chapter 4 SOLUTION PROPOSAL
    • 4.1 OPTIMIZE THE NEW CAMPAIGN
      • 4.1.1 Product
      • 4.1.2 Price
      • 4.1.3 Place
      • 4.1.4 Promotion
    • 4.2 Summary

Content

Group 6

In this chapter, we explore different aspects of data description, including dataset shape, column names, data head and tail, data information, and statistical descriptions.

OVERVIEW

The diverse shopping preferences of customers pose challenges for business owners aiming to maximize profits. The COVID-19 pandemic has accelerated the growth of digital industries such as e-commerce and digital marketing, making social media marketing crucial. Given the high cost of advertising on platforms such as Facebook, YouTube, Instagram, and TikTok, it is essential to identify distinct customer segments and concentrate on selling products that ensure stable sales and maximum profitability.

Customer segmentation is a core part of marketing: it categorizes a company's customers into distinct groups based on shared characteristics such as needs, behaviors, and preferences. This approach lets a business tailor its marketing to each segment, which in turn increases conversion rates. By accurately identifying and targeting the right customer groups, companies can optimize their advertising budgets and reduce sales costs.

To analyze the customer data of BAMBO fashion store and propose solutions that improve business operations, we will perform the following steps:

• Data collection and preprocessing: We will collect customer data from the different sources of BAMBO fashion store, then use Python to clean, normalize, and consolidate the data into a homogeneous dataset.

• Descriptive analysis: We conduct a descriptive data analysis using statistics and visual representations, including histograms and bar charts, to examine essential features of the customer data. This analysis focuses on age distribution, gender demographics, shopping preferences, and spending behaviors.

• Customer segmentation: We categorize customers into groups based on shared shopping behaviors and preferences. Using Python, we analyze and divide these groups and present the results in charts that illustrate the distinct customer segments.

• Recommendations: To enhance sales and profitability for BAMBO fashion store, we provide marketing and business strategy recommendations for each customer segment, grounded in our analytical findings.

The use of Python and the data analysis techniques in this section will help BAMBO fashion store better understand its customer structure, orient its business strategy, and optimize its marketing resources.

PROBLEM STATEMENT

BAMBO fashion store is currently struggling with low revenue despite significant investment in various marketing campaigns, including discounts, membership cards, and referral codes. These efforts have not yielded the expected results, leading to financial pressure and sales that do not cover the costs incurred. If this trend continues without timely intervention, the store risks losses and potential closure. The key challenge is to identify the root causes of the ineffective campaigns and to develop strategies that optimize future efforts, increase revenue, and improve the store's financial health.

OBJECTIVES AND STUDY METHODOLOGY

• Better understand the shopping trends of customer groups: Analyze customer data to identify different customer groups with distinct needs and behaviors.
• Optimize the new marketing campaigns for the shopping trends of customer groups: Develop new marketing campaigns tailored to each customer segment.
• Improve marketing efficiency based on customer shopping trends: Increase conversion rates, revenue, and profits by targeting marketing campaigns more effectively.

This study uses Python charts and data analysis tools to analyze customer behavior and preferences, enabling the development of more effective business strategies. The analysis relies on the following chart types:

• Histogram: Illustrates the frequency distribution of continuous variables, including age, spending, and purchase frequency, which gives insight into the distinct characteristics of customer segments.
• Bar chart: Compares discrete variable values across groups, including gender, marital status, and shopping preferences, which helps uncover characteristics that matter for customer segmentation.
• Box plot: Shows the distribution of a continuous variable by group, allowing us to detect anomalous data points and compare differences between groups.

In addition, we will use the 4Ps strategy in Marketing (Product, Price, Place, Promotion) to build and implement appropriate business strategies for each customer segment:

• Product: Based on an analysis of customer preferences and needs, we will determine suitable products to develop or adjust the current product portfolio.
• Price: Through the analysis of our customers' spending and affordability, we will determine the appropriate pricing strategy to ensure competitiveness and profitability.
• Place: Based on an analysis of the customer's geographic location and preferred distribution channel, we will determine suitable distribution channels and sales locations.
• Promotion: Through analysis of our customers' preferred communication methods and channels, we will build effective advertising and promotion campaigns.

Integrating Python data analysis with the 4Ps marketing strategy enables us to gain deeper insights into customer needs and behaviors, allowing us to provide tailored business solutions, optimize resources, and enhance profitability.

REPORT STRUCTURE

This report has 3 main chapters:

DATA

DATASET DESCRIPTION

The Customer Shopping Preferences Dataset provides insight into consumer behavior and purchasing habits, enabling businesses to customize their products, marketing strategies, and overall customer experience. It captures customer attributes such as age, gender, purchase history, preferred payment methods, and purchase frequency, which supports informed decision-making and optimization of product offerings. Note that this dataset is synthetic and was designed for beginners to practice data analysis and machine learning.

Key features include customer demographics such as age and gender, along with purchase amounts, preferred payment methods, and purchase frequency. The dataset also captures feedback ratings, types of items bought, shopping seasons, and responses to promotional offers.

With a collection of 3900 records, this dataset serves as a foundation for businesses looking to apply data-driven insights for better decision-making and customer-centric strategies.

• Customer ID - Unique identifier for each customer
• Age - Age of the customer
• Gender - Gender of the customer (Male/Female)
• Item Purchased - The item purchased by the customer
• Category - Category of the item purchased
• Purchase Amount (USD) - The amount of the purchase in USD
• Location - Location where the purchase was made
• Size - Size of the purchased item
• Color - Color of the purchased item
• Season - Season during which the purchase was made
• Review Rating - Rating given by the customer for the purchased item
• Shipping Type - Type of shipping chosen by the customer
• Previous Purchases - The total count of transactions concluded by the customer at the store, excluding the ongoing transaction
• Payment Method - Customer's most preferred payment method
• Frequency of Purchases - Frequency at which the customer makes purchases (e.g., Weekly, Fortnightly, Monthly)

Figure 2.1: Structure of the Dataset

These lines of code print the number of columns and rows in the dataset stored in the variable `df`.

• `print("The number of the columns is : ", df.shape[1])` outputs the number of columns by reading the second element of the tuple returned by `df.shape`.
• `print("The number of the rows is : ", df.shape[0])` outputs the number of rows by reading the first element of the same tuple.

In the output:

• "The number of the columns is : 15" indicates that there are 15 columns in the dataset.
• "The number of the rows is : 3900" indicates that there are 3900 rows in the dataset.
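As a minimal sketch of the shape check described above, the snippet below builds a small hypothetical sample (three rows and three columns, not the report's data) and unpacks `df.shape`. On the report's dataset the same two prints would show 15 and 3900.

```python
import pandas as pd

# Hypothetical three-row sample; the report's real dataset has 3900 rows and 15 columns.
df = pd.DataFrame({
    "Customer ID": [1, 2, 3],
    "Age": [55, 19, 50],
    "Gender": ["Male", "Male", "Female"],
})

n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple
print("The number of the columns is : ", n_cols)
print("The number of the rows is : ", n_rows)
```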

The code `df.columns` retrieves and displays the names of the columns in the dataset assigned to the variable `df`. It returns an Index object that contains all the column names, each represented as a string.
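A short illustration, using a hypothetical one-row sample with three of the report's column names, of what `df.columns` returns and how to turn it into a plain list:

```python
import pandas as pd

# Hypothetical one-row sample using three of the report's column names.
df = pd.DataFrame({"Customer ID": [1], "Age": [55], "Gender": ["Male"]})

print(df.columns)                # an Index object holding the column labels
column_names = list(df.columns)  # the same labels as a plain Python list
print(column_names)
```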

In the output, the column names are displayed as follows: 'Customer ID', 'Age', 'Gender', 'Item Purchased', 'Category', 'Purchase Amount (USD)', 'Location', 'Size', 'Color', 'Season', 'Review Rating', 'Payment Method', 'Shipping Type', 'Previous Purchases', 'Frequency of Purchases'.

=> This line of code is useful for quickly inspecting and identifying the columns present in the dataset.

• Customer ID: Unique identifier for each customer.
• Age: Age of the customer.
• Gender: Gender of the customer (Male/Female).
• Item Purchased: The item purchased by the customer.
• Category: Category of the item purchased.
• Purchase Amount (USD): The amount of the purchase in USD.
• Location: Location where the purchase was made.
• Size: Size of the purchased item.
• Color: Color of the purchased item.
• Season: Season during which the purchase was made.
• Review Rating: Rating given by the customer for the purchased item.
• Shipping Type: Type of shipping chosen by the customer.
• Previous Purchases: Number of previous purchases made by the customer.
• Payment Method: Customer's most preferred payment method.
• Frequency of Purchases: Frequency at which the customer makes purchases (e.g., Weekly, Fortnightly, Monthly).

2.1.4 Data head

`head()` displays the first 5 observations of the dataset.

2.1.5 Data tail

`tail()` displays the last 5 observations of the dataset.
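A small sketch of both calls on a hypothetical eight-row sample (the values are illustrative, not taken from the report), so that the first five and last five rows differ:

```python
import pandas as pd

# Hypothetical sample with 8 rows so that head() and tail() return different rows.
df = pd.DataFrame({"Age": [55, 19, 50, 21, 45, 46, 63, 27]})

first_rows = df.head()  # first 5 rows by default
last_rows = df.tail()   # last 5 rows by default
print(first_rows)
print(last_rows)
```

Both methods accept an optional row count, for example `df.head(3)`.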

• RangeIndex: 3900 entries, 0 to 3899: This line describes the index of the DataFrame, a RangeIndex with 3900 entries ranging from 0 to 3899.
• Data columns: This line states that the DataFrame has a total of 15 columns.
• Column information: The subsequent lines list each column with additional information:
    • Column: The column name.
    • Non-Null Count: The number of non-null (non-missing) values in the column.
    • Dtype: The data type of the column.
• Memory usage: 457.2+ KB: The approximate memory usage of the DataFrame.

This output summarizes the structure, data types, and memory consumption of a DataFrame. The Non-Null Count gives a quick view of missing values, Dtype identifies each column's data type, and the memory line reports the footprint, which makes `info()` a useful first inspection step.
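The following sketch shows the same kind of summary on a hypothetical three-row sample. It writes the `info()` output to a buffer only so that the text can be inspected programmatically; calling `df.info()` with no arguments prints it directly.

```python
import io

import pandas as pd

# Hypothetical three-row sample, not the report's data.
df = pd.DataFrame({"Age": [55, 19, 50], "Gender": ["Male", "Male", "Female"]})

buffer = io.StringIO()
df.info(buf=buffer)          # write the summary to a buffer instead of stdout
summary = buffer.getvalue()
print(summary)
```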

DATA STATISTICS DESCRIPTION

- The describe() method generates descriptive statistics that summarize the central tendency, dispersion, and shape of a dataset's distribution, excluding NaN values.

- A breakdown of what each row in the output represents:

• count: The number of non-null values in each column.
• unique: The number of unique values in each column.
• top: The most frequently occurring value in each column.
• freq: The frequency of the most common value in each column.
• mean: The mean (average) value of each column.
• std: The standard deviation of each column.
• min: The minimum value in each column.
• 25%: The 25th percentile (first quartile) of each column.
• 50%: The median (50th percentile or second quartile) of each column.
• 75%: The 75th percentile (third quartile) of each column.
• max: The maximum value in each column.

The describe() function delivers a summary of a dataset's numerical columns, with statistical measures such as mean, median, minimum, maximum, and quartiles. For categorical columns it reports the most frequent value and its frequency. This information is the starting point for analyzing the distribution and attributes of the dataset.
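A minimal sketch of the numerical summary, assuming a hypothetical four-row sample with two of the report's numeric columns:

```python
import pandas as pd

# Hypothetical sample; values chosen so the statistics are easy to check by hand.
df = pd.DataFrame({
    "Age": [20, 30, 40, 50],
    "Purchase Amount (USD)": [25, 50, 75, 100],
})

numeric_summary = df.describe()  # rows: count, mean, std, min, 25%, 50%, 75%, max
print(numeric_summary)
```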

The output of describe(exclude=np.number) provides a statistical summary of the categorical variables in the dataset. In this summary, each column corresponds to one categorical variable, and each row holds one statistic for that variable.

• Gender to Age_Segment: These are the names of the categorical variables being summarized.
• count: The total number of non-null entries for each categorical variable.
• unique: The number of unique categories within each categorical variable.
• top: The most frequently occurring category within each variable.
• freq: The frequency of the most common category within each variable.

It helps us understand the distribution of different categories within each categorical variable in the dataset.

• In the Gender variable, there are 2 unique categories: 'Male' and 'Female'. The most common gender is 'Male', occurring 2652 times.
• In the Item Purchased variable, there are 25 unique categories, with 'Blouse' being the most common (occurring 171 times).

The summary gives the same statistics for the remaining variables, including Category, Location, Size, Color, Season, Payment Method, Shipping Type, Frequency of Purchases, and Age Segment. This information shows how frequent and how varied the categories are within each variable.
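To make the layout of the categorical summary concrete, here is a sketch on a hypothetical four-row sample. Passing `exclude=np.number` drops the numeric `Age` column, so only the text columns are summarized.

```python
import numpy as np
import pandas as pd

# Hypothetical sample, not the report's data.
df = pd.DataFrame({
    "Age": [20, 30, 40, 50],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Season": ["Winter", "Spring", "Spring", "Fall"],
})

# One column per categorical variable; rows are count, unique, top, freq.
categorical_summary = df.describe(exclude=np.number)
print(categorical_summary)
```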

DATA CLEANING

• This line of code goes through each column of the DataFrame `df` and counts the number of unique values in it.
• Output: Each index entry is the name of a column in `df`, and each value is the count of unique values in that column.

Figure 2.9: Check for duplication

The result is a Series with the column names as its index and the counts of unique values as its values.

Here is the summary of unique value counts for each column:

• Customer ID: 3900 unique values (indicating that there are 3900 unique customers in the dataset)
• Gender: 2 unique values (male and female)
• Item Purchased: 25 unique values (indicating that 25 different items have been purchased)
• Category: 4 unique values (indicating that there are 4 different item categories)
• Purchase Amount (USD): 81 unique values (indicating that there are 81 different purchase amounts)
• Size: 4 unique values (indicating that there are 4 different sizes)
• Location: 50 unique values (indicating 50 different purchase locations)
• Color: 25 unique values (indicating 25 different colors)
• Season: 4 unique values (indicating 4 different seasons for purchases)
• Review Rating: 26 unique values (indicating 26 different review ratings given by customers)
• Payment Method: 6 unique values (indicating 6 different payment methods used by customers)
• Shipping Type: 6 unique values (indicating 6 different types of shipping services used)
• Previous Purchases: 50 unique values (indicating 50 different counts of previous purchases made by customers)
• Preferred Payment Method: 6 unique values (indicating 6 different preferred payment methods selected by customers)
• Frequency of Purchases: 7 unique values (indicating 7 different frequencies of purchases made by customers)
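The report's own code appears only as a figure, so the following is a sketch under assumptions rather than the authors' exact code: it uses `DataFrame.nunique()` for the per-column unique counts and `DataFrame.duplicated()` as one common way to count fully duplicated rows, on a hypothetical four-row sample.

```python
import pandas as pd

# Hypothetical sample, not the report's data.
df = pd.DataFrame({
    "Customer ID": [1, 2, 3, 4],
    "Gender": ["Male", "Female", "Male", "Male"],
    "Season": ["Winter", "Spring", "Winter", "Fall"],
})

unique_counts = df.nunique()             # Series: column name -> number of unique values
duplicate_rows = df.duplicated().sum()   # rows that repeat an earlier row exactly
print(unique_counts)
print("Duplicated rows:", duplicate_rows)
```

A unique count equal to the number of rows, as for `Customer ID` here, means no value repeats in that column.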

Figure 2.10: Missing values

This method counts the number of missing values (null or NaN) in each column of the DataFrame. The DataFrame has various columns including Customer ID, Age, Gender, Item Purchased, Category, Purchase Amount (USD), Size, Color, Season, Review Rating, Payment Method, Shipping Type, Previous Purchases, and Frequency of Purchases.

The result indicates that there are no missing values in any of the columns, with all counts showing as 0.

The next expression calculates the percentage of missing (null) values in the dataset:

• `data.isnull()` creates a table of the same size as the data, where each cell is True if the value is null and False otherwise.
• `data.isnull().sum()` calculates the total number of null values for each column, returning a Series whose index holds the column names and whose values are the counts of null entries in those columns.
• `len(data)` returns the number of rows in `data`.

To calculate the percentage of null values, use the formula `(data.isnull().sum()/(len(data)))*100`. This divides the number of null values in each column by the total number of rows, then multiplies the result by 100. The outcome is a Series that lists each column name alongside the percentage of null values it contains.
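A brief worked example of the formula. The sample below is hypothetical and deliberately contains one missing `Age` so that the result is not all zeros, unlike the report's dataset, which has no missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing Age value out of four rows.
data = pd.DataFrame({
    "Age": [20, np.nan, 40, 50],
    "Gender": ["Male", "Female", "Male", "Male"],
})

missing_percent = (data.isnull().sum() / len(data)) * 100
print(missing_percent)  # Age: 25.0, Gender: 0.0
```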

• `df.drop(['Customer ID'], axis=1)`: This calls the `drop()` method of the DataFrame `df` to remove the 'Customer ID' column. The parameter `axis=1` specifies that a column, not a row, is being removed. The returned DataFrame no longer contains the 'Customer ID' column.

The `data.info()` method provides a comprehensive overview of a DataFrame, detailing the number of rows and columns, the count of non-null values in each column, the data types associated with each column, and the memory usage statistics.

• The first line removes the column named 'Customer ID' from the DataFrame `df`. Because `drop()` returns a new DataFrame rather than modifying `df` in place, the result has to be assigned to a variable, here `data`.
• The second line displays information about the DataFrame `data` after the 'Customer ID' column has been removed.

Removing the Customer ID column simplifies the analysis and slightly reduces the data to process: an identifier that is unique for every row carries no information for general trend analysis or for predictive modeling that does not need to identify individuals.
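A sketch of this reduction step on a hypothetical two-row sample. The variable name `data` for the result follows the report's text; the exact assignment in the authors' code is an assumption.

```python
import pandas as pd

# Hypothetical sample, not the report's data.
df = pd.DataFrame({
    "Customer ID": [1, 2],
    "Age": [55, 19],
    "Gender": ["Male", "Female"],
})

# drop() returns a new DataFrame; df itself keeps the column.
data = df.drop(["Customer ID"], axis=1)
data.info()
```

An equivalent spelling is `df.drop(columns=["Customer ID"])`, which avoids having to remember the axis number.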

FEATURE ENGINEERING

Figure 2.13: Creating features

The provided code snippet creates a new categorical feature named "Age_Segment" based on the "Age" column in the DataFrame `df`. This feature segments the age into three categories: "Young", "Adult", and "Senior".

• `df["Age_Segment"] = pd.cut(df["Age"], bins=[0, 25, 45, 70], labels=["Young", "Adult", "Senior"])`: This line creates the new column "Age_Segment" in the DataFrame `df` using the `cut()` function from pandas. The `cut()` function assigns each value in the "Age" column to one of the specified bins. By default each bin includes its right edge, so ages are segmented into "Young" (up to 25), "Adult" (26-45), and "Senior" (46-70). The `labels` parameter specifies the label for each bin.
• `df.head()`: This line displays the first few rows of the DataFrame `df`, including the newly added "Age_Segment" column.

As a result, `df` has a new column named "Age_Segment", which categorizes each entry according to the age ranges defined above, and the `head()` output shows it alongside the existing columns.

Incorporating the "Age_Segment" column groups customer ages into meaningful categories, which supports a more in-depth analysis of shopping trends and age-related behaviors. Campaigns can then be designed for specific age groups, improving engagement and effectiveness. Age data also helps optimize sales strategies by identifying suitable products or services for each segment, so that offerings match customer preferences. Overall, segmenting age organizes the data systematically and leads to more accurate strategic decisions.
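The binning can be checked on a few boundary ages. The sample ages below are hypothetical; the `pd.cut` call itself is the one quoted in the text. With the default `right=True`, each bin includes its upper edge, so 25 falls in "Young" and 45 in "Adult".

```python
import pandas as pd

# Hypothetical ages chosen to sit on and around the bin edges.
df = pd.DataFrame({"Age": [18, 25, 26, 45, 46, 70]})

df["Age_Segment"] = pd.cut(
    df["Age"],
    bins=[0, 25, 45, 70],               # intervals (0, 25], (25, 45], (45, 70]
    labels=["Young", "Adult", "Senior"],
)
print(df)
```

Ages outside the outer edges (0 or below, above 70) would receive a missing value, so the bins should cover the full age range of the data.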

SUMMARY

In Chapter 2, we explored various aspects of data description, including dataset shape, column names, data head and tail, data information, and statistics description.

In our exploration of data cleaning techniques, we focused on identifying and eliminating duplicate entries, calculating missing values, and implementing data reduction methods Additionally, we discussed feature engineering strategies, which involve generating new features derived from existing data to enhance analysis and improve model performance.

Overall, this chapter provides a comprehensive overview of data understanding and preparation techniques, laying the foundation for further analysis and modeling.

EXPLORATORY DATA ANALYSIS

EDA UNIVARIATE ANALYSIS

Univariate analysis examines the distribution of values for a single variable. Commonly used chart types for this analysis include histograms, bar charts, and box plots.

Below we will analyze each chart in detail.

Figure 3.14: Code of Age segment

The bar chart illustrates the distribution of buyers at the store across three age segments: Young, Adult, and Senior. Seniors are the largest group at 47.64% of the customer base, followed by Adults at 37.72% and Young customers at 14.64%. Seniors and Adults are therefore the predominant age segments among the store's clientele.
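The percentages behind such a bar chart can be computed with `value_counts(normalize=True)`. The sketch below uses a hypothetical six-customer sample, so its shares differ from the report's figures.

```python
import pandas as pd

# Hypothetical sample of six customers, not the report's data.
segments = pd.Series(["Senior", "Senior", "Senior", "Adult", "Adult", "Young"])

# Share of each segment as a percentage, sorted from largest to smallest.
share = segments.value_counts(normalize=True).mul(100).round(2)
print(share)
```

Calling `share.plot(kind="bar")` would draw the corresponding bar chart when matplotlib is installed.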

Figure 3.17: Gender bar chart

The bar chart shows the number of male and female customers. The y-axis, labeled "Count", represents the number of customers, and the x-axis, labeled "Gender", classifies customers by gender. There are two bars in the chart.

The chart shows that the number of male customers is more than double that of female customers (2652 compared to 1248).

Analyzing the gender distribution of customers purchasing various products allows us to identify trends in buying behavior, enabling us to customize our products and marketing strategies effectively.

The bar graph titled "Item Purchased" shows the quantity of each clothing type bought. The y-axis, marked "Count", indicates the number of items purchased, while the x-axis, labeled "Item Purchased", lists the different types of clothing.

• Socks are the most popular clothing item purchased at 160 units.
• Shoes and T-shirts follow closely behind at 140 and 120 units respectively.
• Belts, hats and scarves are the least popular clothing items listed on the chart, each purchased at 20 units.

The height of the bars lets us compare the popularity of the different clothing types. The chart highlights both favored and less frequently bought items, which informs product decisions and business strategy.

Figure 3.21: Category chart

The chart is a bar graph of purchases across four categories: clothing, accessories, footwear, and outerwear.

The y-axis, labeled "Count", represents the number of purchases in each category. The x-axis, labeled "Category", lists the four categories.

Here are some observations from the chart:

• Clothing is the most popular category, with a count of 1750.
• Footwear is the least popular category, with a count of 599.
• Accessories and outerwear fall in between at 1240 and 1237, respectively.

Overall, the chart indicates that customers bought clothing most often and footwear least often in this dataset.

Figure 3.22: Code of Purchased Amount (USD)

Figure 3.23: Purchase Amount (USD) box plot

The chart is a box plot of purchase amount in dollars.

The center line in the box shows the median purchase amount, which is $50. The box contains the middle 50% of the data. In this case, the values between $30 and $70 represent the middle 50% of the purchase amounts.

The whiskers extend to the most extreme values within 1.5 times the interquartile range (IQR) from the top and bottom of the box. The IQR is the difference between the 75th percentile and the 25th percentile. In this chart, the whiskers extend to $20 and $80.

Any data points beyond the whiskers are considered outliers and are plotted as individual circles. There are two outliers in this chart, one at $10 and another at $90.

Overall, the box plot shows that most purchases fall between $30 and $70, with a median of $50. There are a few outliers at $10 and $90.
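The quantities a box plot draws can be computed directly. The sketch below applies the 1.5 × IQR rule described above to a hypothetical list of purchase amounts; the numbers are illustrative and unrelated to the report's chart.

```python
import pandas as pd

# Hypothetical purchase amounts in USD, with one unusually large value.
amounts = pd.Series([20, 30, 40, 50, 60, 70, 200])

q1 = amounts.quantile(0.25)      # lower edge of the box
median = amounts.quantile(0.50)  # center line
q3 = amounts.quantile(0.75)      # upper edge of the box
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points that lie inside the fences.
inside = amounts[(amounts >= lower_fence) & (amounts <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Points outside the fences are drawn individually as outliers.
outliers = amounts[(amounts < lower_fence) | (amounts > upper_fence)].tolist()
print(q1, median, q3, whisker_low, whisker_high, outliers)
```

Note that the whiskers stop at actual data values, not at the fences themselves, which is why a whisker can be shorter than 1.5 × IQR.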

The "Locations TreeMap" illustrates the distribution of purchases across locations in the United States using a hierarchical area chart. The overall rectangle represents all purchases in the dataset, while each state is drawn as a smaller rectangle inside it. The area of each state's rectangle is proportional to its number of purchases.

Here are some observations about the distribution of purchases by state based on this chart:

California appears to have the most purchases, followed by Alabama, Oregon, and Florida.

States in the Northeast and Midwest appear to have fewer purchases than the Southern and Western states.

Note that this chart does not show the exact number of purchases in each state. It only shows the relative proportions between states.

The chart titled "Distribution of Sizes" shows the number of purchased items in each size: M, L, S, and XL. The x-axis lists the sizes, while the y-axis indicates the count for each size.

• Most items are medium (M), making up 45% of the total with 1755 customers.
• Large (L) comes second at 27% with 1053 customers.
• Small (S) makes up 17% of the total with 663 customers.
• Extra Large (XL) is the least common size, at 11% of the total.

Figure 3.28: Code of Item Color

The chart titled "Distribution of Item Colors" shows the number of items sold in each color. The x-axis lists the colors, while the y-axis indicates the count of items sold for each color.

• The most popular color is green, with 177 items sold.
• Black and olive are tied for the second most popular color.
• Teal is the least popular color, selling only 142 items.

Overall, the chart indicates that green, black, and olive are the most popular colors for items sold in this dataset.

The Season bar chart shows the number of purchases made in each season: Spring, Summer, Fall, and Winter. The x-axis lists the seasons, while the y-axis shows the count of purchases for each season.

• Spring appears to have the most purchases, at about 975.
• Summer follows closely behind at about 955.
• Fall has the second fewest purchases, at about 900.
• Winter has the fewest purchases, at about 600.

Overall, the chart indicates that spring and summer see the most purchases, while winter sees the fewest.

Figure 3.32: Code of Review rating

Figure 3.33: Review Rating Histogram

The chart is titled "Histogram of Review Ratings" and shows the distribution of the ratings customers left for purchased items.

The x-axis, labeled "Review Rating", shows the rating on a scale from 2.5 to 5.0. The y-axis, labeled "Frequency", represents the number of reviews that gave a particular rating.

The majority of reviews average a 4.5-star rating, indicating a high level of customer satisfaction. This central tendency at 4.5 suggests an equal number of reviews above and below this rating, reflecting a balanced range of customer experiences.

The data tapers off towards the ends of the graph, indicating that there are fewer reviews with ratings that are very high or very low.
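A histogram is a count of values per bin. The sketch below shows that computation with `numpy.histogram` on a handful of hypothetical ratings over the 2.5 to 5.0 scale; the bin count of 5 is an arbitrary choice for illustration, not the setting used in the report's figure.

```python
import numpy as np

# Hypothetical review ratings, not the report's data.
ratings = np.array([2.6, 3.1, 3.2, 3.7, 4.4, 4.9, 5.0])

# Five equal-width bins over the rating scale; the last bin includes 5.0.
counts, edges = np.histogram(ratings, bins=5, range=(2.5, 5.0))
print(edges)   # bin edges from 2.5 to 5.0 in steps of 0.5
print(counts)  # number of ratings falling in each bin
```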

Figure 3.34: Code of Payment method

Figure 3.35: Payment Method bar chart

The chart titled "Number of Users by Payment Method" is a bar graph that shows the number of users who prefer different payment methods.

The x-axis, labeled "Payment Method", lists the payment methods users can choose from: credit card, debit card, cash, Venmo, bank transfer, and PayPal.

The y-axis, labeled "Count", represents the number of users who prefer each payment method.

• Credit card is the most popular payment method, with 700 users.
• Debit card is the second most popular payment method, with 696 users.
• Cash is the least favored payment method, used by only 200 users.

Comparing the heights of the bars makes it easy to identify the most and least popular payment options. This gives a better understanding of customer preferences at checkout, so the store can tailor the checkout process and offer suitable payment options.

Figure 3.36: Code of Shipping Type

The line graph titled "Cost of Shipping by Shipping Type" illustrates various shipping options along the x-axis, which includes Free Shipping, Standard Shipping, Store Pickup, Next Day Air, Express 2-Day Shipping, and Not Purchased The y-axis represents the corresponding costs associated with each shipping type, providing a clear comparison of expenses for consumers considering different delivery methods.

"Count" represents the cost of shipping for each option.

Free Shipping has a cost of $0, as indicated by the line touching the x-axis at that point.

Standard Shipping has the next lowest cost, at around $5.

Store Pickup, Next Day Air, Express, and 2-Day Shipping are the most expensive options, costing around $20.

Not Purchased likely refers to items that were not purchased, and therefore have no shipping cost associated with them.

Figure 3.38: Code of Previous Purchases

Figure 3.39: Previous Purchases box plot

The chart titled "Box Plot of Previous Purchases" shows the distribution of the number of previous purchases made by customers.

A box plot effectively divides data into quartiles, with the central box representing the middle 50% of the dataset. The median, indicated by a line within the box, marks the 50th percentile, revealing that the median number of previous purchases is 2.

In box plots, the whiskers extend to the smallest and largest values that lie within 1.5 times the interquartile range (IQR) of the box edges; the IQR is calculated as the difference between the 75th and 25th percentiles. For this specific chart, the whiskers extend to the values of 0 and 5.

- Any data points beyond the whiskers are considered outliers and are plotted as individual circles. There are two outliers in this chart, one at 8 and another at 12.

Overall, the box plot shows that most customers have made between 0 and 5 previous purchases. There are a few outliers who have made 8 or 12 previous purchases.
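To make the quartile, whisker, and outlier rules concrete, the sketch below recomputes them by hand with the Python standard library. The list `sample` is made up so that its shape mirrors the description above; it is not the store's data, and `box_plot_stats` is a hypothetical helper name.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Return the quartiles, whisker ends, and outliers a box plot would draw."""
    data = sorted(values)
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


# Made-up values for illustration only (NOT the real "Previous Purchases" column).
sample = [0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 5, 8, 12]
stats = box_plot_stats(sample)
print(stats)
```

For this toy sample the median is 2, the IQR is 2, the fences fall at -2 and 6, the whiskers end at 0 and 5, and the values 8 and 12 are flagged as outliers.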

Figure 3.40: Code of Frequency of Purchases

Figure 3.41: Frequency of Purchases chart

The chart illustrates the purchasing frequency of a product over a specified time period, with the x-axis representing the frequency of purchases and the y-axis indicating the total count. Additionally, the chart provides options to view the x-axis in various time units, including quarterly, annually, bi-weekly, fortnightly, and weekly.

In general, the columns in this chart are almost equal in height; the difference between the lowest and highest columns is just over 40 people.

EDA BIVARIATE ANALYSIS

Figure 3.42: Code of Analyze Average purchase

Figure 3.43: Analyze Average purchase

The X-axis represents the store's product categories, the Y-axis represents the average number of product purchases, and the columns represent age segments.

- Overall, the three segments differ clearly in the number of buyers: Senior has the most, followed by Adult, and finally Young.

The senior segment leads in purchases, particularly favoring items like shoes, jewelry, and handbags. In contrast, the adult segment prioritizes essential items such as backpacks, shirts, and pants. Meanwhile, the young segment shows the lowest purchasing activity, with a variety of products nearly equal in sales; however, sweaters remain the most popular, followed closely by coats and dresses.
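The "top items per age segment" reading of the chart can be reproduced with a simple group-and-count. The sketch below uses only the standard library; the `purchases` records are invented for illustration (not the real dataset), and `top_items` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Hypothetical (age_segment, item) purchase records, NOT the real dataset.
purchases = (
    [("Senior", "Shoes")] * 4
    + [("Senior", "Jewelry")] * 3
    + [("Senior", "Handbag")] * 2
    + [("Senior", "Shirt")] * 1
    + [("Adult", "Backpack")] * 4
    + [("Adult", "Shirt")] * 3
    + [("Adult", "Pants")] * 2
    + [("Adult", "Shoes")] * 1
)


def top_items(records, n=3):
    """Return the n most purchased items for each age segment."""
    per_segment = defaultdict(Counter)
    for segment, item in records:
        per_segment[segment][item] += 1
    return {
        segment: [item for item, _ in counter.most_common(n)]
        for segment, counter in per_segment.items()
    }


top3 = top_items(purchases)
print(top3)
```

With pandas, an equivalent result would typically come from grouping on the age segment and item columns and counting rows.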

Building on the previous chart, we analyze the chart below to understand how much consumers are willing to spend on their favorite products.

Figure 3.44: Code of average amount

The bar chart illustrates the average spending on merchandise across different age segments. The X-axis categorizes the various product types available in the store, while the Y-axis indicates the average monetary expenditure by consumers. Each column in the chart represents a distinct age group, highlighting its purchasing behavior.

Across all three age groups, there is minimal variation in the amount consumers are willing to spend on products, with prices generally falling between $55 and $70 per item.

In the Senior segment, the most purchased products include shoes, jewelry, and handbags, with average spending per item being $60, $55, and $59, respectively.

In the Adult segment, the top three most purchased products are backpacks, shirts, and pants, with average spending of $59, $60, and $60 per item, respectively. For the Young segment, the leading products are sweaters, coats, and dresses, with average spending of $58, $63, and $66 per item, respectively.
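The per-item averages quoted above are group means of the purchase amount. A minimal standard-library sketch of that calculation follows; the `records` values are invented for illustration (not the real dataset), and `average_spend` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (age_segment, item, purchase_amount_usd) rows, NOT the real dataset.
records = [
    ("Senior", "Shoes", 58.0),
    ("Senior", "Shoes", 62.0),
    ("Adult", "Shirt", 55.0),
    ("Adult", "Shirt", 65.0),
    ("Young", "Dress", 66.0),
]


def average_spend(rows):
    """Return the mean purchase amount for each (age segment, item) pair."""
    amounts = defaultdict(list)
    for segment, item, amount in rows:
        amounts[(segment, item)].append(amount)
    return {key: round(mean(values), 2) for key, values in amounts.items()}


averages = average_spend(records)
for (segment, item), value in averages.items():
    print(f"{segment:<7} {item:<6} {value:.2f} USD")
```

With pandas, this would typically be a `groupby` on the segment and item columns followed by `mean()` on the purchase amount column.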

3.2.3 Analyze average purchase by age segment and season

Figure 3.46: Code of average purchase

- Chart showing average purchase by age segment and season

The chart illustrates the average purchase count across the three age segments (Young, Adult, and Senior) during each season: Winter, Fall, Spring, and Summer. It highlights how purchasing behavior varies by age group throughout the year.

- In general, within each customer segment the difference in the number of buyers between the Spring, Summer, Fall, and Winter seasons is insignificant; the largest difference is 50 people.

The purchasing patterns of the age segments still reveal seasonal preferences: the Young segment tends to buy more during spring and winter, while Adults prioritize their shopping in winter. In contrast, Seniors show a preference for making purchases in the spring.
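Both observations above (the preferred season per segment and the gap between the busiest and quietest season) can be derived from a table of counts. The sketch below assumes such a table is already available as a nested dictionary; the numbers in `toy_counts` are invented for illustration and are not the report's figures.

```python
def seasonal_summary(counts_by_segment):
    """For each segment, find the busiest season and the max-min gap in buyers."""
    summary = {}
    for segment, counts in counts_by_segment.items():
        peak_season = max(counts, key=counts.get)
        spread = max(counts.values()) - min(counts.values())
        summary[segment] = {"peak_season": peak_season, "spread": spread}
    return summary


# Hypothetical buyer counts per season (NOT the real dataset).
toy_counts = {
    "Young": {"Spring": 12, "Summer": 9, "Fall": 10, "Winter": 11},
    "Adult": {"Spring": 12, "Summer": 11, "Fall": 14, "Winter": 15},
    "Senior": {"Spring": 18, "Summer": 14, "Fall": 15, "Winter": 16},
}

summary = seasonal_summary(toy_counts)
largest_spread = max(entry["spread"] for entry in summary.values())
print(summary)
print("Largest seasonal gap:", largest_spread)
```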

Summary

In Chapter 3, we segmented customer groups to address key questions about customer identity, thoughts, interests, and needs. This analysis enables our team to effectively develop and optimize a new campaign for the fashion company, ensuring it aligns with customer expectations and purchasing behavior.

SOLUTION PROPOSAL

OPTIMIZE THE NEW CAMPAIGN

To optimize the new campaign for the BAMBO fashion store, our team uses the 4P marketing model.

We target the Senior and Adult age segments because these two segments account for most of the store's customers.

For Product, we will select the product based on the following characteristics: Product, Size, and Color.

- Product: Based on Figure 3.30, we choose the three products that each age segment loves the most.

- Size: Based on Figure 3.14, we will choose all sizes, but stock double the amount of size L compared to other sizes.

- Color: Because there is not much difference in product color selection across age segments, we will choose all colors.

- For Seniors:
  o Product: Shoes, Jewelry, and Handbags.
  o Size: S, M, L, XL, with double the quantity of size L products.
  o Color: All colors.

- For Adults:
  o Product: Backpacks, Shirts, and Pants.
  o Size: S, M, L, XL, with double the quantity of size L products.
  o Color: All colors.
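The size rule above (all sizes stocked, size L doubled) amounts to a weighted split of the order quantity. The sketch below shows the arithmetic; the order size of 500 units and the helper name `allocate_stock` are assumptions made for illustration, not figures from the campaign.

```python
def allocate_stock(total_units, weights):
    """Split total_units across sizes in proportion to their weights.

    Integer division is used, so any remainder is left unallocated.
    """
    total_weight = sum(weights.values())
    return {size: total_units * w // total_weight for size, w in weights.items()}


# Size L carries twice the weight of every other size.
size_weights = {"S": 1, "M": 1, "L": 2, "XL": 1}

# Hypothetical order of 500 units for one product (assumed, not from the report).
allocation = allocate_stock(500, size_weights)
print(allocation)
```

With five weight units in total, each unit is worth 100 items, so size L receives 200 and each of the other sizes receives 100.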

For Price, based on the analysis in Figure 3.32 of how much customers are willing to spend on each product, we set the product prices as follows:

- For Seniors:
  o Shoes: 60 USD.
  o Jewelry: 55 USD.
  o Handbag: 59 USD.

- For Adults:
  o Backpack: 59 USD.
  o Shirt: 60 USD.
  o Pants: 60 USD.

For Place, we will analyze Location and Season to build the strategy.

- Location: Based on Figure 3.12, there is not much difference in customers' purchasing locations, so we will choose all of our store locations.

- Season: Based on Figure 3.34, we will select the seasons with the highest number of customers buying during the year to run the campaign.

- For Seniors:
  o Location: All store locations.
  o Season: Spring.

- For Adults:
  o Location: All store locations.
  o Season: Winter, Fall.

For Promotion, because all of our stores are traditional retail stores, we will run in-store campaigns in the following ways:

Summary

In summary, our campaign strategy targets our main customer segments, Seniors and Adults, by providing a thoughtfully selected range of products, competitive pricing, easy access across all store locations, and attractive promotions. This strategy is designed to boost sales and improve customer satisfaction, ensuring our ongoing success in the marketplace.

In this report, we utilized Python along with libraries such as Pandas, NumPy, and Matplotlib to analyze customer data and create visual graphs. By employing various chart types, including histograms, bar charts, and box plots, we gained valuable insights into our customers' characteristics, behaviors, and preferences, enhancing our understanding of the customer demographics in our store.

We have categorized customers into distinct groups based on shared shopping behaviors and preferences, allowing us to gain insights into their specific needs and expectations. This customer segmentation informs our tailored business and marketing strategies. Utilizing the 4P model (Product, Price, Place, Promotion), we have developed targeted marketing strategies for each segment, which involve optimizing product offerings, pricing approaches, distribution methods, and promotional campaigns.

By leveraging data analytics alongside effective marketing strategies, BAMBO fashion store aims to improve customer experience, boost sales and profits, and navigate challenging periods of cost-revenue imbalance. Our proposed solutions focus on resource optimization, targeting the right customers, and enhancing the alignment with customer needs.

Date posted: 05/02/2025, 16:21