
Btec Level 5 Hnd Diploma In Business Unit 42 - Statistics For Management.pdf


DOCUMENT INFORMATION

Basic information

Title Unit 42 - Statistics For Management
Author Pham Thi Huong Lan
Supervisor Nguyen Thi Thu Ha
School BTEC
Major Business
Type Assignment
Year of publication 2023
Format
Pages 48
File size 3.09 MB

Structure

  • I. Introduction
  • II. Apply Statistical Methods in Business Planning
    • 1. Measuring the Variability in Business Processes or Quality Management
    • 2. Probability distributions and application to business operations and processes
      • 2.1. Poisson Distribution
      • 2.2. Normal distribution
    • 3. Inferential statistics
      • 3.1. One Sample T-test
      • 3.3. Regression
  • III. Communicate findings using appropriate charts/tables
    • 1. Different types of visual representations for variables in the dataset
      • 1.1. Frequency tables
      • 1.2. Simple tables
      • 1.3. Pie charts
      • 1.4. Histogram
    • 2. Application for assigned dataset
      • 2.1. Frequency tables
      • 2.2. Simple tables
      • 2.3. Pie charts
      • 2.4. Histogram
  • IV. Conclusion
  • V. Reference lists

Content

Introduction

The demand for analyzing and evaluating business data is increasing day by day in the recovery period after COVID-19. Written from the role of a Research Analyst at SSI Securities Corporation, one of the top financial investment firms in Viet Nam over the last 20 years, this report applies statistical methods to the business planning and operations management of some Vietnamese companies. Its aim is to provide the necessary data and point out risks, trends and opportunities to support post-COVID-19 business recovery planning, as well as to demonstrate the effectiveness of statistical techniques in analyzing business data. Specifically, the report focuses on sample business data covering Garments (a4a), Wholesale (a4aQ) and Retail (a4a R), combining inferential statistics, regression and descriptive statistics to assess the data and report accurate results to stakeholders. The report is divided into two parts: the first applies statistical methods in business planning, and the second presents general information on, and applications of, different variables, charts and tables.

Apply Statistical Methods in Business Planning

Measuring the Variability in Business Processes or Quality Management

Figure 1: Chart measuring the variability of weekly operating hours

The graph illustrates the distribution of weekly operating hours among companies in the Garments, Wholesale, and Retail sectors in all regions covered by the VES dataset. In general, it is rare to find companies operating fewer than 40 hours per week; most companies fall in the range of 40 to almost 60 hours per week. Additionally, some companies in these industries operate more than 100 hours per week.

Probability distributions and application to business operations and processes

The Poisson distribution, a discrete probability distribution, models the likelihood of discrete events occurring within a fixed interval. It predicts the probability of observing a specific number of events, denoted as k, occurring within that interval (Turney, 2022).

The Poisson distribution serves as a statistical tool used to predict or interpret the occurrence of events within a specified temporal or spatial interval. These events can take various forms, including disease outbreaks, customer purchases, and even celestial events such as meteor strikes. The interval can be defined as a given amount of time or space, for example a 10-day time frame or a 5-square-inch area (Turney, 2022).

A Poisson distribution can be shown as a probability mass function graph. A probability mass function is a function that characterizes a discrete probability distribution. The peak of the distribution (the mode) represents the most likely number of events. With λ denoting the mean number of events per interval:

  • When λ is not an integer, the mode is the nearest integer less than λ.

  • When λ is an integer, there are two modes: λ and λ − 1.

  • When λ is small, the distribution is significantly longer on the right side of its peak than on the left (i.e., it is highly skewed to the right).

The Poisson distribution, a probability distribution model, is widely used in various fields of business to describe the random occurrence of events (Hayes, 2023). It is commonly applied in scenarios where the number of events happening in a given time interval is of interest, such as accident rates, phone calls, or website traffic. In finance, the Poisson distribution can be used to analyze the number of losses a business might experience, helping to estimate the probability of default or loan risk. In addition, it can be used to model patterns in customer arrival rates, which assists in optimizing staffing and reducing customer wait times. Overall, the Poisson distribution is a valuable tool for businesses to quantify the probability of events occurring, leading to more informed and accurate decision-making processes (Hayes, 2023).
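The properties of the Poisson distribution described above can be checked numerically. The following is a minimal sketch, not part of the original report: the function names and the arrival rate of 4.5 customers per hour are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_modes(lam):
    """Most likely event count(s): lam - 1 and lam if lam is an integer, else floor(lam)."""
    if lam >= 1 and lam == int(lam):
        return [int(lam) - 1, int(lam)]
    return [math.floor(lam)]


# Hypothetical rate: on average 4.5 customers arrive per hour.
arrival_rate = 4.5
pmf = {k: poisson_pmf(k, arrival_rate) for k in range(15)}
most_likely = max(pmf, key=pmf.get)

for k, p in pmf.items():
    print(f"{k:2d} arrivals: {p:.4f}")
print("most likely number of arrivals:", most_likely)
```

A staffing decision could then be based on, for example, the probability of more than a given number of arrivals in an hour, obtained by summing the relevant values of the probability mass function.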

Normal distribution, also called Gaussian distribution, is a fundamental concept in statistics (Britannica, 2023). It is a distribution in which the data is symmetrically distributed around the mean, and the data points are evenly distributed on both sides of the mean value. The normal distribution is often used as a model for many naturally occurring phenomena, such as biological variables or socio-economic phenomena. It has a bell-shaped curve with the highest frequency at the mean value, gradually decreasing towards the tails. The standard deviation of the data in a normal distribution measures the spread of the data points from the mean value (Britannica, 2023).

The mean is the location parameter, and the standard deviation is the scale parameter. The mean decides where the curve's apex is located. Increasing the mean shifts the curve to the right, while reducing it shifts the curve to the left. The standard deviation either stretches or compresses the curve. A narrow curve is produced by a small standard deviation, whereas a wide curve is produced by a large standard deviation.
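The effect of the two parameters can be illustrated with the normal density itself. This is a small sketch with made-up parameter values; the function name is hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


# Changing the mean moves the peak without changing its height.
peak_at_0 = normal_pdf(0, mean=0, sd=1)
peak_at_5 = normal_pdf(5, mean=5, sd=1)

# Doubling the standard deviation widens the curve and halves the peak height.
wide_peak = normal_pdf(0, mean=0, sd=2)

print(peak_at_0, peak_at_5, wide_peak)
```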

Table 1: Normal distribution probability chart of days of inventory in the Garments, Non-metallic mineral products and Retail industry

The normal-distribution probability for the number of days of inventory (d16) was 0.011425026. It was calculated by assuming that the mean and standard deviation of the population are equal to the sample mean and standard deviation in the dataset. Therefore, the probability that a business holds less than 6 days of inventory is very low. In addition, the number of such companies is relatively small, estimated at around 28 out of all the businesses examined in the Garments, Non-metallic mineral products and Retail industry.
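A probability of this kind is a value of the normal cumulative distribution function. The sketch below shows the calculation with Python's standard library; the mean of 40 days and standard deviation of 15 days are hypothetical placeholders, since the report does not state the sample values used.

```python
from statistics import NormalDist


def prob_below(threshold, mean, sd):
    """P(X < threshold) for a normal distribution with the given mean and sd."""
    return NormalDist(mu=mean, sigma=sd).cdf(threshold)


# Hypothetical sample statistics for days of inventory (not taken from the report).
sample_mean = 40.0
sample_sd = 15.0

p_less_than_6_days = prob_below(6, sample_mean, sample_sd)
print(f"P(days of inventory < 6) = {p_less_than_6_days:.4f}")
```

Multiplying such a probability by the number of firms in the sample gives the expected number of firms below the threshold.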

Inferential statistics

The one-sample t-test is a statistical hypothesis test used to determine whether an unknown population mean is different from a specific value (SAS, 2016).

Figure 4: One-sample T-test example (Gerald, 2018)

To compare a sample mean to a specific value, use the one-sample t-test. The sample mean and the presumed population mean can be compared using a one-sample t-test to see whether there is a significant difference between the two. One-sample t-tests are employed, for instance, to compare the sample mean and sample midpoint of the test variable or to ascertain whether a sample of observations might have been produced by a process with a certain mean (Gerald, 2018).

Step 1: Write null and alternative hypothesis

Let μ be the average hours of operation in a week of firms in the garment, wholesale and retail industries in Viet Nam in 2019. It will be tested against the hypothesis that it is different from 54 hours, at the significance level of 5%.

Ho: μ = 54 (null hypothesis)

Hα: μ ≠ 54 (alternative hypothesis - need to test)

Assuming the null hypothesis is true (μ = 54), the report calculates the probabilities from the figures collected in the survey of enterprises.

Table 2: The table of One-sample T-test

From the calculated data in the above table, it can be seen that P-value = 0.0000000036477 < 0.05. The null hypothesis is therefore rejected and Hα is accepted: with 95% confidence, the average hours of operation in a week of firms (Medium size) is different from 54 hours.
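The steps of the one-sample t-test can be sketched in code. This is an illustration only: the report obtained its result from statistical software, the sample of weekly hours below is hypothetical, and the p-value is computed by numerically integrating the t density, which is an implementation choice made here to avoid external libraries (it loses precision for extremely small p-values).

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the interval [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central)


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - mu0) / standard_error
    return t, n - 1, two_sided_p(t, n - 1)


# Hypothetical weekly operating hours of ten firms (not the survey data).
hours = [48, 56, 60, 72, 54, 66, 70, 58, 62, 75]
t_stat, df, p_value = one_sample_t(hours, mu0=54)
reject_h0 = p_value < 0.05  # reject Ho when the p-value is below the 5% level
print(t_stat, df, p_value, reject_h0)
```

The decision rule is the one used in the report: when the p-value is below 0.05, Ho is rejected and Hα is accepted.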

The two-sample t-test (also known as the independent samples t-test) is a method used to test whether the unknown population means of two groups are equal or not (JMP, 2023).

Figure 5: Two Sample T-test example (Street, 2023)

Explanation: This table shows the result of a two-sample t-test with independent samples. From the output, T = 1.4779 with 13.9939 degrees of freedom, and p-value = Sig.(2-tailed) = 0.1616. Since p-value = 0.1616 > 0.05 = α, we fail to reject the null hypothesis. Therefore, there is not enough evidence to conclude that the mean pollution indexes are different for the two areas.
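The fractional degrees of freedom in this example (13.9939) suggest that the unequal-variance (Welch) form of the test was used; that is an inference, not something the source states. Under that assumption, the statistic and degrees of freedom can be sketched as follows. The two samples are hypothetical, and the p-value helper uses numerical integration of the t density rather than a statistics package.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value from Student's t density, using Simpson's rule."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2 * total * h / 3)


# Hypothetical weekly hours for two groups of firms (not the survey data).
micro = [48, 50, 56, 60, 54, 52]
medium = [50, 55, 58, 53, 57, 49, 61]
t_stat, df = welch_t(micro, medium)
p_value = two_sided_p(t_stat, df)

# p-value recomputed from the statistic and degrees of freedom quoted in the example.
p_from_quoted_output = two_sided_p(1.4779, 13.9939)
print(t_stat, df, p_value, p_from_quoted_output)
```

The last line recomputes the p-value from the quoted T and degrees of freedom, which should land close to the reported 0.1616.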

Step 1: Write null and alternative hypothesis

Let μ1, μ2 be the average hours of operation in a week of micro and medium firms, respectively, in the Garments, Non-metallic mineral products and Retail industry in Viet Nam in 2015. The hypothesis to be tested is that the two types of company (Micro firms and Medium firms) differ, at the significance level of 5%.

Ho: μ1 = μ2 (null hypothesis)

Hα: μ1 ≠ μ2 (alternative hypothesis - need to test)

Step 2: Find the P-value using Jamovi

Table 3: The table of the firm dataset in terms of the Two-sample T-test

From the calculation in the above table, it is clearly seen that P-value = 0.867 > 0.05. Therefore, the null hypothesis cannot be rejected: at the 5% significance level there is not sufficient evidence that the average weekly hours of operation in the Garments, Non-metallic mineral products and Retail industry are different for micro and medium enterprises.

Regression is a statistical method used in finance, investing, and other disciplines that attempts to determine the strength and character of the relationship between one dependent variable (usually denoted by Y) and a series of other variables (known as independent variables) (Beers, 2023).

The height coefficient in the regression equation is 106.5. This coefficient represents the mean increase of weight in kilograms for every additional one meter in height. If your height increases by 1 meter, the average weight increases by 106.5 kilograms.

The regression line visually represents the relationship between height and weight. A one-meter increase in height corresponds to a 106.5-kilogram increase in weight, as indicated by the line's slope. However, it is important to note that these results are only valid within the range of observed data, which in this case is limited to middle-school girls between 1.3 m and 1.7 m in height.

Consequently, we can’t shift along the line by a full meter for these data.
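The slope interpretation and the warning about the observed range can be sketched with an ordinary least-squares fit. The data points below are hypothetical: they were constructed so that the fitted slope equals the 106.5 quoted in the text, and the intercept they imply is not taken from the source.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_within_range(x, xs, slope, intercept):
    """Predict y only for x inside the observed range, refusing to extrapolate."""
    if not min(xs) <= x <= max(xs):
        raise ValueError("x lies outside the observed range")
    return intercept + slope * x


# Hypothetical heights (m) and weights (kg), constructed for illustration.
heights = [1.30, 1.40, 1.50, 1.60, 1.70]
weights = [29.45, 37.10, 51.75, 58.40, 72.05]

slope, intercept = fit_line(heights, weights)
print("slope (kg per meter):", slope)
print("predicted weight at 1.55 m:", predict_within_range(1.55, heights, slope, intercept))
```

Within the observed range it is more natural to read the slope per 0.1 m: one tenth of the slope, or about 10.65 kg.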

Step 1: Write null and alternative hypothesis

This analysis will examine the relationship between sales and hours of operation in 2015 from the ES data (at the 95% confidence level).

Ho: There is no significant relation between sales and hours of operation

Hα: There is a positive relation between sales and hours of operation

Step 2: Assume Ho is true, find P-value:

Table 4: The table of the firm dataset in terms of the Regression technique

Based on the information in the regression table, it can be seen that P-value = 0.014 < 0.05. The null hypothesis Ho is therefore rejected and Hα is accepted: with 95% confidence, there is a relationship between hours of operation and sales.
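The P-value in a regression table of this kind comes from a t statistic for the slope. A minimal sketch of that statistic is given below, assuming a simple regression with one predictor; the hours and sales figures are hypothetical, and the report's own output was produced by statistical software.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (slope, t statistic for Ho: slope = 0, degrees of freedom)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    df = n - 2
    se_slope = math.sqrt(sse / df / sxx)
    return slope, slope / se_slope, df


# Hypothetical weekly hours of operation and sales (not the survey data).
hours = [40, 48, 54, 60, 72, 84]
sales = [110, 150, 140, 190, 230, 250]

slope, t_stat, df = slope_t_statistic(hours, sales)
print(slope, t_stat, df)
```

The P-value is then the two-sided tail probability of this t statistic under a t distribution with n − 2 degrees of freedom.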

Communicate findings using appropriate charts/tables

Different types of visual representations for variables in the dataset

A frequency table is a statistical tool used to organize and summarize data by categorizing it into distinct groups or intervals and displaying the frequency or count of occurrences in each category.

It provides a clear and concise summary of the data distribution, making it easier to identify patterns, trends, and outliers (Splashlearn, 2022).

The advantages of frequency tables include their simplicity and ease of interpretation. They offer a structured format that presents information in an organized manner, facilitating quick comprehension and analysis. Frequency tables allow for easy comparison between categories, making it straightforward to identify the most common or rare occurrences. They are particularly useful for handling large datasets by condensing the information into manageable categories, which simplifies the presentation and analysis process (Splashlearn, 2022).

However, frequency tables also have some limitations. They can oversimplify complex datasets, potentially losing granularity by aggregating data into categories. This aggregation can lead to information loss, as individual data points are not explicitly represented. Additionally, frequency tables may not provide detailed insights into the relationships or distributions within each category. As a result, they might not be suitable for more sophisticated analyses that require a deeper understanding of individual data points or complex data structures (Splashlearn, 2022).
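Building a frequency table amounts to counting how often each category occurs. The sketch below uses a short, made-up list of region labels; the function name is hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, count, relative frequency) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(category, n, n / total) for category, n in counts.most_common()]


# Hypothetical region labels for eight firms.
regions = ["South East", "Red River Delta", "South East", "Mekong River Delta",
           "South East", "Red River Delta", "South East", "Red River Delta"]

for category, count, share in frequency_table(regions):
    print(f"{category:<20} {count:>3} {share:>6.1%}")
```

The individual records are no longer visible in the output, which is the loss of granularity noted above.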

Simple tables present data in a structured format, organizing information into rows and columns for easy readability. Their simplicity and ease of understanding make them suitable for displaying data for clear comparisons, pattern recognition, and quick information retrieval. Tables provide flexibility in customization to meet specific requirements, facilitating efficient data analysis. However, their limitations arise when handling large or highly detailed datasets, which require cumbersome navigation and can be difficult to update. Additionally, simple tables may struggle to represent complex data relationships or hierarchies, reducing their effectiveness in certain analytical or presentation contexts.

Table 6: Table of Quantitative data for the A4A (Regions)

The table provides quantitative information for the sample's l1 (number of employees) variable. First, according to Taylor (2023), the mean represents the average: it is the sum of a collection of data divided by the number of data points. The mean for this dataset shows a company having an average of 214.136 employees. Secondly, the median is the midpoint of the dataset when it is sorted in ascending order, and the median employee count is 24. Taylor (2023) defines the mode as the value that occurs most often in the dataset, so 5 is the most frequently occurring number of employees among the 324 companies. According to Bhandari (2022), the range is the difference between the highest and lowest value for a specific variable in a sample. It is calculated by subtracting the minimum from the maximum value. The range indicates the spread of the data; for this dataset, the number of employees differs across companies by up to 8996. Finally, Bhandari (2022) defines the standard deviation as the average amount of variation in the dataset. It shows how far, on average, each value lies from the mean. The larger the standard deviation, the more dispersed the data. This sample's standard deviation is 845.7793. Overall, these descriptive statistics give a thorough account of the workforce, reflecting both the dataset's central tendency and its variability.
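The five measures defined above can be computed with Python's standard `statistics` module. The employee counts below are hypothetical and much smaller than the 324-company sample; they only show how each measure is obtained.

```python
import statistics


def describe(values):
    """Mean, median, mode, range and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical employee counts for six firms (not the survey data).
employees = [5, 5, 12, 24, 30, 200]
summary = describe(employees)
for name, value in summary.items():
    print(f"{name}: {value}")
```

As in the report's data, one very large firm pulls the mean well above the median, which is why both measures are worth reporting.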

A pie chart is a circular graphical representation that displays data as slices, with each slice representing a different category or proportion of the whole. It is commonly used to showcase the distribution or composition of a dataset (Byjus, 2019). The advantages of pie charts include their ability to provide a visual representation of proportions, making it easy to compare and understand relative sizes. They are particularly useful for highlighting the dominant or significant categories within a dataset. However, pie charts can be limited in displaying precise numerical values and are less effective for comparing multiple datasets or categories (Byjus, 2019). They may also become cluttered and difficult to interpret when dealing with too many slices or small proportions, leading to potential misinterpretation or confusion.

The descriptive analysis of the frequency of industries depicted in the pie chart provides valuable insights into the distribution of qualitative data. The pie chart shows that among the industries represented, garments have the highest frequency with 46%. This suggests that a significant proportion of the dataset belongs to the garment industry. Retail accounts for 35% of the frequency, indicating a substantial presence in the dataset as well. Wholesale represents 28% of the frequency, suggesting a comparatively lower occurrence within the dataset. This descriptive analysis offers a visual representation of the distribution of industries, allowing for a quick and comprehensive understanding of the relative frequencies in the dataset.

A histogram is a graphical representation of data that uses bars to display the frequency distribution of a continuous variable. It consists of a series of adjacent rectangles, where the width of each rectangle represents a specific interval or range of values, and the height represents the frequency or count of occurrences within that interval (Javatpoint, 2023). Histograms are particularly useful for visualizing the shape, central tendency, and spread of data. They provide a clear and concise overview of the data distribution, making it easy to identify patterns, outliers, and gaps. Histograms also allow for easy comparison between different categories or groups.

Moreover, they can handle large datasets effectively, as the data can be grouped into intervals. However, histograms may oversimplify the data by aggregating it into intervals, potentially losing some granularity. They are sensitive to the choice of interval width, which can impact the appearance and interpretation of the distribution. Additionally, histograms do not provide detailed information about individual data points and may not be suitable for displaying categorical or qualitative data (Javatpoint, 2023).
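The sensitivity to interval width can be seen by binning the same values twice. This is a minimal sketch with hypothetical weekly operating hours; the function name and bin settings are assumptions made for illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins adjacent intervals [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the covered range are ignored
            counts[index] += 1
    return counts


# Hypothetical weekly operating hours of ten firms.
hours = [40, 44, 48, 48, 52, 56, 56, 60, 72, 100]

wide_bins = histogram_counts(hours, start=40, width=20, n_bins=4)
narrow_bins = histogram_counts(hours, start=40, width=10, n_bins=7)

for label, counts in (("width 20", wide_bins), ("width 10", narrow_bins)):
    print(label, counts)
```

With the wider intervals most firms fall into a single bar, while the narrower intervals reveal more of the shape and make the gap before the largest value visible.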

Application for assigned dataset

The frequency table displays information about the regions. The Southeast has the highest frequency with 139 firms, while the Mekong River has the lowest frequency with only 81 firms. Due to its advantages in labor and raw materials, it can be said that textiles, non-metallic minerals and wholesale are growing most strongly in the Southeast region.

The table presents various statistical measures for four business categories: food, retail, garment, and wholesale. The data includes mean (2.317901), median (2), mode (2), standard deviation (1.047267), and range (3). These metrics provide insights into the central tendency and spread of the data within each category.

The pie chart depicts the percentage of firms in the four regions: the Mekong River Delta, North Central Area & Central Coastal Area, South East and Red River Delta. The pie chart shows that the South East region has the largest percentage (33%), followed by the Red River Delta (30%). The remaining two regions account for a comparatively minor share, with the North Central Area & Central Coastal Area accounting for 23% and the Mekong River Delta accounting for 14%.

As a result, the South East and the Red River Delta have reaffirmed their appeal for textile, retail, and wholesale enterprises.

The graph illustrates the number of businesses in the three sectors of Garments, Retail and Wholesale in four regions, namely the Mekong River Delta, North Central Area & Central Coastal Area, South East and Red River Delta. In general, the number of enterprises increases gradually from the Mekong River Delta to the South East. The South East ranks highest with 107 operating businesses. Next is the Red River Delta with 96 operating businesses. The North Central Area & Central Coastal Area and the Mekong River Delta have lower business activity than the South East and Red River Delta regions, with 75 and 46 businesses respectively. This disparity suggests that the South East and Red River Delta have more robust economic development compared to the other two regions.

2.1 Use two different visual representations to describe this variable

The histogram and the pie chart above show information about the number of businesses in the 3 industries of Garment, Retail and Wholesale in 4 regions: the Mekong River Delta, North Central & Central Coast, Southeast and the Red River Delta. The Southeast ranks highest with 107 enterprises in operation, equivalent to 33% of the total. Next is the Red River Delta, accounting for 30% of enterprises (96 enterprises) in operation. North Central & Central Coast and the Mekong River Delta account for 23% and 14%, with 75 and 46 enterprises respectively. From the number of enterprises operating in each region, it can be seen that the economic situation of the Southeast region and the Red River Delta is more vibrant and developed than that of the remaining two regions.
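The percentages quoted for the pie chart follow directly from the enterprise counts. The short sketch below recomputes them from the four counts given in the text (107, 96, 75 and 46); the variable names are illustrative.

```python
# Enterprise counts per region, as quoted in the text.
region_counts = {
    "South East": 107,
    "Red River Delta": 96,
    "North Central Area & Central Coastal Area": 75,
    "Mekong River Delta": 46,
}

total = sum(region_counts.values())
shares = {region: round(100 * n / total) for region, n in region_counts.items()}

for region, share in shares.items():
    print(f"{region}: {region_counts[region]} enterprises, {share}%")
print("total enterprises:", total)
```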

2.2 Evaluate the appropriateness of above visual representations

The pie chart and histogram are both suitable for portraying the number of businesses in different regions. The pie chart effectively displays proportions, allowing for comparisons of relative sizes. It excels at emphasizing prominent categories within a dataset. However, the histogram surpasses the pie chart by providing an additional visual representation of data distribution. It reveals the shape of the distribution, highlights peaks and gaps, and demonstrates the data's dispersion. This becomes essential for large datasets where a frequency table alone may lack context. Consequently, the histogram enables the identification of patterns, trends, and outliers, making it a more comprehensive choice for understanding data distribution.

In conclusion, while both the histogram and the pie chart can be used to show the distribution of staff numbers in a business, the histogram is a more powerful visual tool for displaying and analyzing the statistics. A histogram is preferable to a frequency table because it provides a clearer and more intuitive visual depiction of the data distribution.

Conclusion

In conclusion, applying statistical techniques and visual representations is essential for analyzing a business dataset: it yields insights into the variables and their relationships within the business context, helps identify patterns, and quantifies the impact of certain factors on business outcomes. In addition, each method suits different variables and situations, so the appropriateness of each visual representation needs to be evaluated to justify the best choice. Lastly, the findings and conclusions drawn from the analysis in this report can contribute to evidence-based decision-making, enabling businesses to optimize their operations, identify growth opportunities, mitigate risks, and enhance overall performance. Statistical techniques also enabled a comprehensive exploration of the variables in the business dataset, uncovering insights that can be leveraged for strategic planning, resource allocation, and overall business success.

Date posted: 08/05/2024, 19:40