MINISTRY OF EDUCATION AND TRAINING
NATIONAL ECONOMICS UNIVERSITY
GROUP MID-TERM EXAM
Course: Business Statistics
TOPIC: PROBABILITY
Lecturer: Assoc. Prof. Tran Thi Bich
Members:
Đỗ Phương Anh: 11210339
Nguyễn Mai Thảo Anh: 11215493
Nguyễn Thái Châu Anh: 11211030
HA NOI - 2023
II Additional references
1 Application of probability and statistics methods in arrangement of railway
2 The application of probability theory in speculative game of stock market
Expectation
PART B: DATA ANALYSIS TO SPECIFIC ORGANIZATIONAL PROBLEMS
1 Organization overview and the purpose of the survey
1 Step 1: Select and measure expected return of each asset
2 Step 2: Calculate standard deviation
3 Step 3: Calculate covariance and correlation
4 Step 4: Add more classes of assets to diversify the portfolio
5 Step 5: Build the most optimal portfolio combining HPG, MSN, VHM and SBT
election results, lottery, medical diagnosis, sports outcomes, stock markets, ... The greatest advantage of probability is that it helps to model our world, enabling us to estimate the probability that a certain event may occur, or to estimate the variability of its occurrence.
In this report, our team illustrates how probability techniques are used and how to apply them in real-life business cases. In the first part, we analyze and summarize the main content of several articles that present the core definitions of probability. The second part is our data analysis based on the theory and techniques we have learned.
PART A: SUMMARIZING THE ARTICLE
I Article summary
1 The article
Article name: Applications of Probability Theory in Criminalistics
Author(s): Charles R. Kingston
Source: Journal of the American Statistical Association, Vol. 60, No. 309 (Mar., 1965), pp. 70-80
Published by: Taylor & Francis, Ltd. on behalf of the American Statistical Association
(Kingston, 2018)
2 The issue of interest
The research paper considers some problems in the probabilistic analysis of physical evidence in criminal investigations. The primary purpose of the article is to suggest a model that is applicable to an important type of physical evidence and to show that the model can be analyzed by an objective probabilistic approach. Moreover, the paper creates a better understanding of the nature of physical evidence with respect to the probabilistic aspects of its evaluation.
In terms of methodology, two basic assumptions are made at the outset: the number of persons or objects possessing a particular set of properties can be considered as a random variable, and it is possible to estimate the probability function of this random variable. Building on these assumptions, two models (one with and one without the assumption that the suspect is a random selection from the set of possible suspects) are applied to the evaluation of partial transfer evidence (PTE), an important category of physical evidence that is found in most criminal investigations. Some models and an example are presented for the case when the estimated probability distribution is binomial with an expected value less than 1. As the expected value becomes smaller, the assumption of randomness in the selection of the suspect becomes immaterial to the evaluation of the evidential significance.
The application of probability is to find some basis for deciding which reconstruction we are willing to accept as being closest to the true situation, and some basis for deciding whether we are willing to act as though the reconstruction reflected the true situation. As a result, the criminalist assists the statistician toward enough understanding of the problems so that effective teamwork can be generated.
3 The technique of probability
be considered to be the PTE available for examination at the scene, is examined and the color of the ball is noted to be black. Random selections of balls from the box are now made until the first black ball is found. The question to be considered is: What is the probability that the black ball just found is B?
Suppose that the $n$ balls in the box have been randomly selected from some inexhaustible population of balls, and that the probability of randomly drawing a black ball from this population is $p$. Thus the box of $n$ balls with which we are concerned is only one of many possible collections. The number of black balls that could be in any one collection of balls is the random variable $X$, which has the binomial distribution $b(x; n, p)$. The value $p'$ is an estimate of $p$, and $p(x)$ is the binomial distribution function:
$$p(x) = \binom{n}{x} (p')^{x} (1 - p')^{n - x}$$
From the available evidence, we know that at least one black ball exists, so we have the condition that $x \geq 1$. The probability that the correct origin, B, has been found unconditional
TABLE 1. SOME VALUES OF 1 - E
* Estimated from a table of the exponential function for values of z up to 9.
The value of 1 - E in this case is about 1/70 to 1/1400 with respect to the world population. It may be of interest to note the published demonstration of two core areas in fingerprints taken from two different persons that have a rough similarity in six characteristic points of comparison. Under adverse conditions, an inexperienced technician might consider the two areas a match if only the central portions were available.
3.2 Probability model 2:
The assumption that the selection of the origin (or the suspected origin) is made randomly from the members of the ID-set can be criticized as not being a realistic assumption. If the search for the origin is directed preferentially toward a particular segment of the population, or if the origin is actively trying to avoid being selected, the "drawings" are not likely to be truly random.
Consider the probability that a second member of the ID-set could be found, or equivalently, that an error in the identification of the origin on the basis of the PTE is possible. Represent this probability by $P(Er \mid X = x)$, which is dependent on the value of $X$.
It is clear that:
$$P(Er \mid X = x) = 0 \text{ for } x = 1$$
$$P(Er \mid X = x) = 1 \text{ for } x > 1$$
The expected value of $P(Er \mid X = x)$ is just the probability of error unconditional on $x$, and is calculated under the condition that $x \geq 1$ as before. If $n$ is large, and $\lambda = np' < 1$, $X$ is approximately Poisson distributed with parameter $\lambda$, and we have:
TABLE 2. SOME VALUES OF P(Er)
* Estimated from a table of the exponential function for values of z up to 9.
The values of $P(Er)$ are approximately double those in the first model for $1 - E$. The importance of this difference between $P(Er)$ and $1 - E$ in evaluating the evidential significance becomes less as $\lambda$ becomes smaller, and Model 1 is, in this respect, robust against violations of the assumption of randomness in the selection from the ID-set. It is readily shown that
$$\lim_{\lambda \to 0} f(\lambda) = \lim_{\lambda \to 0} \frac{P(Er)}{1 - E} = 2,$$
and that
$$1 < f(\lambda) < 2 \text{ when } 0 < \lambda < 1.$$
Therefore the above statement is valid for all sufficiently small values of $\lambda$.
(Note: $f(\lambda) = \dfrac{P(Er)}{1 - E}$)
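The relationship between the two models can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $X$ to be Poisson with parameter $\lambda$, computes $P(Er)$ as $P(X \geq 2 \mid X \geq 1)$, and assumes that Model 1 uses $E = \sum_{x \geq 1} \frac{1}{x}\, p(x) / (1 - p(0))$ (a found black ball is B with probability $1/x$ when there are $x$ black balls). That form of $E$ is our reading of Model 1 and is not stated explicitly in the text above; the function names are ours.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability p(x) with parameter lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def p_error(lam: float) -> float:
    """Model 2: P(Er) = P(X >= 2 | X >= 1) under the Poisson approximation."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    return (1.0 - p0 - p1) / (1.0 - p0)


def one_minus_e(lam: float, terms: int = 40) -> float:
    """Model 1 (assumed form): 1 - E, with E = sum_{x>=1} p(x)/x / (1 - p(0))."""
    p0 = poisson_pmf(0, lam)
    e = sum(poisson_pmf(x, lam) / x for x in range(1, terms)) / (1.0 - p0)
    return 1.0 - e


def ratio_f(lam: float) -> float:
    """f(lambda) = P(Er) / (1 - E)."""
    return p_error(lam) / one_minus_e(lam)


if __name__ == "__main__":
    for lam in (0.001, 0.01, 0.1, 0.5, 0.9):
        print(f"lambda={lam}: P(Er)={p_error(lam):.6f}, "
              f"1-E={one_minus_e(lam):.6f}, f={ratio_f(lam):.4f}")
```

Under these assumptions the ratio stays between 1 and 2 for $0 < \lambda < 1$ and moves toward 2 as $\lambda$ shrinks, which is the behaviour the article describes.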
4 Viewpoint
Note that if $n$ is large, and $\lambda = np' < 1$, $X$ is approximately Poisson distributed with parameter $\lambda$, and we have:
$$P(Er) = \frac{\sum_{x=2}^{\infty} p(x)}{1 - p(0)} = 1 - \frac{\lambda}{e^{\lambda} - 1}$$
In this case, the value of $P(Er)$ is not the same as previously proven in Table 2.
Therefore, the statement might be invalid for all sufficiently small values of $\lambda$.
II Additional references
1 Application of probability and statistics methods in arrangement of railway
Article name: Application of probability and statistics methods in arrangement of railway
Author(s): Grigory G, Irkutsk State University of Railway Engineering, 634074 Irkutsk, Russia
Published by: MATEC Web of Conferences 216, 02004 (2018)
(Grigory G, 2018)
1.1 Issue
Today, the capacity of the rail network in many cities is not upgraded at the pace necessary to keep up with the increase in traffic demand. The sensitivity of the railway system rises as the capacity utilization increases. This may lead to longer travel times and increased sensitivity to delays.
1.2 Purpose
The article applies probabilistic and statistical methods to problems in the design of railway transportation, specifically the fluctuations in loading of railway stations.
1.3 Techniques
- If, during time $T$, each of the available $n$ trains reliably arrives at the station, the probability of the event that exactly $k$ trains ($0 \leq k \leq n$) will arrive at the station within time $t$ is determined by the Bernoulli formula:
$$p_k(t) = \binom{n}{k} \left(\frac{t}{T}\right)^{k} \left(1 - \frac{t}{T}\right)^{n-k} \quad (1)$$
1.3.2 Poisson’s Distribution
- $\lambda = n/T$ represents the average intensity of the onset of events (the average number of trains arriving per unit time), and the value $\lambda t$ is the average number of events occurring within a time interval of duration $t$. The Poisson distribution has the form:
$$P(X = k) = p_k(t) = \frac{(\lambda t)^{k} e^{-\lambda t}}{k!}, \quad k = 0, 1, 2, \ldots$$
Application
Trains arrive at the station in accordance with a Poisson flow of events:
+ On average, within $T$ hours, $n$ trains arrive (i.e., the average arrival intensity is $\lambda = n/T$ trains per hour).
+ The probabilities of arrival of $k$ trains within 1 hour are distributed in accordance with Poisson's law:
$$P(X = k) = p_k(t) = \frac{(\lambda t)^{k} e^{-\lambda t}}{k!}, \quad k = 0, 1, 2, \ldots \quad (2)$$
It is required to decide whether additional arrival and departure tracks need to be built, in accordance with the following criterion: the probability of arrival of 6 or more trains at the station within 1 hour should not exceed a certain critical value $P_{cr}$.
If, for $T = 16$ hours, $n = 25$ trains arrive (i.e., the average arrival intensity is $\lambda = n/T = 25/16 \approx 1.56$ trains per hour), then the probabilities of arrival of 0 to 5 trains within 1 hour, in accordance with formula (2), are: $p_0(1) = 0.210$; $p_1(1) = 0.328$; $p_2(1) = 0.256$; $p_3(1) = 0.133$; $p_4(1) = 0.052$; $p_5(1) = 0.016$, and the probability of arrival of 6 or more trains at the station within 1 hour is $P(k \geq 6) = 1 - 0.995 = 0.005$.
If this value exceeds the critical probability value $P_{cr}$, then a decision must be made on the necessity of constructing additional arrival and departure tracks.
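The figures in this example can be reproduced with a few lines of Python. This is a minimal sketch, not the article's own computation: it assumes arrivals follow the Poisson law with intensity 25/16 trains per hour, and the critical value `p_critical` is a hypothetical placeholder, since the article does not give a number for it.

```python
import math


def poisson_pmf(k: int, rate: float, t: float = 1.0) -> float:
    """P(X = k) for a Poisson flow with intensity `rate` over duration `t`."""
    mean = rate * t
    return mean**k * math.exp(-mean) / math.factorial(k)


n_trains = 25        # trains observed
period_hours = 16    # observation period in hours
rate = n_trains / period_hours  # average arrival intensity, trains per hour

# Probabilities of 0 to 5 arrivals within 1 hour
probs = [poisson_pmf(k, rate) for k in range(6)]
# Probability of 6 or more arrivals within 1 hour
p_six_or_more = 1.0 - sum(probs)

# Hypothetical critical value (not given in the article), for illustration only
p_critical = 0.01
needs_more_tracks = p_six_or_more > p_critical

if __name__ == "__main__":
    for k, p in enumerate(probs):
        print(f"p_{k}(1) = {p:.3f}")
    print(f"P(k >= 6) = {p_six_or_more:.3f}")
    print("Build additional tracks:", needs_more_tracks)
```

With the placeholder threshold above, the probability of six or more arrivals stays below the critical value, so no additional tracks would be required; a stricter threshold could change that decision.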
2 The application of probability theory in speculative game of stock market
Article name: The application of probability theory in speculative game of stock market
Author(s): Ya Guo Liu, Heyuan vocational and technical college, Heyuan, Guangdong, China
Published by: E3S Web of Conferences 233, 01172 (2021)
(Heyuan, 2021)
2.1 Issue
In China's A-share market, it is difficult for retail investors to make a living because they hold only small amounts of funds and there is too little capital in the market. In bear and volatile markets especially, even less capital enters the stock market.
2.2 Purpose
This paper tries to use the methods of probability theory to clear away the fog of the market, help the majority of retail investors become more aware, realize that the strong always remain strong, and form their own investment style, so as to achieve stable profits from the market.
trading limit board), the probability of failure is 0.3, the average profit in the case of success is A%, and the average loss in the case of failure is B%.
Then the mathematical expectation of the return $X$ is
$$E(X) = A\% \times 0.7 + (-B\%) \times 0.3$$
If $A = 3$, $B = 5$, then $E(X) = 3\% \times 0.7 + (-5\%) \times 0.3 = 0.6\%$
If $A = 3$, $B = 3$, then $E(X) = 3\% \times 0.7 + (-3\%) \times 0.3 = 1.2\%$
Judging from this, there is still potential in pursuing the limit-up. At least from the perspective of probability, as long as the number of operations is large enough, the profit will gradually accumulate. In fact, hot money that has grown tenfold or a hundredfold within four years is this model put into practice.
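The expectation above is simple enough to check in code. The sketch below is our own illustration (the function names are not from the article); it assumes a success probability of 0.7, as in the formula, and also computes the break-even loss, i.e. the value of B at which the expected return falls to zero.

```python
def expected_return(win_pct: float, loss_pct: float, p_success: float = 0.7) -> float:
    """Expected return in percent: A% * p + (-B%) * (1 - p)."""
    return win_pct * p_success - loss_pct * (1.0 - p_success)


def break_even_loss(win_pct: float, p_success: float = 0.7) -> float:
    """Average loss B (in percent) at which the expected return is zero."""
    return win_pct * p_success / (1.0 - p_success)


if __name__ == "__main__":
    print(f"A=3, B=5: E(X) = {expected_return(3, 5):.1f}%")
    print(f"A=3, B=3: E(X) = {expected_return(3, 3):.1f}%")
    print(f"Break-even loss for A=3: B = {break_even_loss(3):.1f}%")
```

For A = 3 the expectation stays positive only while the average loss is below 7%, which shows how sensitive the strategy is to the size of losing trades.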
PART B: DATA ANALYSIS TO SPECIFIC ORGANIZATIONAL PROBLEMS
I Starbucks Customer Survey
1 Organization overview and the purpose of the survey
1.1 Organization Overview:
Starbucks is a global coffeehouse chain that started in Seattle in 1971 and now operates over 31,000 stores in 82 countries. Starbucks maintains a strong relationship with its customers through various channels. The company's loyalty program, Starbucks Rewards, offers personalized perks and convenience to members. Starbucks' exceptional customer service fosters a positive experience and cultivates loyalty among customers. The company also values customer feedback and incorporates it into product development and improvement. Starbucks' social responsibility initiatives resonate with customers and reinforce brand loyalty. The company's commitment to sustainability and community engagement has also earned recognition and admiration from customers (HAMZAH, 2020).
1.2 The survey overview
- Purpose: The survey conducted in Malaysia analyzes customer behavior at Starbucks including demographic information, current buying behavior, and preferences of Starbucks customers Insights from the survey can help identify areas for improvement and prioritize investments in facilities and features that are important to
2 Your age | Optional
3 Your current job | Employed; Self-employed; Housewife; Student
4 What is your annual income? (The currency ... | Less than RM25,000; From RM25,000 to RM50,000
5 The amount you spend at Starbucks per visit (The currency used is Ringgit Malaysia) | 1 Zero; 2 Less than RM20; 3 RM20 - RM40; 4 More than RM40
6 What do you most frequently purchase at ... | Coffee; Cold drinks; Pastries; Sandwiches; Juices
8 How would you rate the quality of Starbucks compared to other brands (Coffee Bean, Old ... | Evaluate on the scale from 1 to 5: 1 - Poor; 2 - Fair; 3 - Good; 4 - Very good; 5 - Excellent
10 How would you rate the service at Starbucks? (Promptness, friendliness, ...) | Evaluate on the scale from 1 to 5: 1 - Poor; 2 - Fair; 3 - Good; 4 - Very good; 5 - Excellent
11 Will you continue buying at Starbucks? | 1 Yes; 2 No
1.3 The purpose of probability application & techniques applied
- The purpose of probability application
The purpose of applying probability techniques in this context is to analyze and understand various aspects of Starbucks customer behavior in Malaysia
- Techniques applied
A probability tree is used to identify the likelihood of different buying behaviors based on demographic information. Bivariate distribution helps to analyze the relationship between different variables and provides insights into customer behavior. Cross tabulation and conditional probability evaluate customer ratings and identify customer segments and preferences. Binomial probability distribution is used to calculate the probability of customers returning to Starbucks and the probability of a certain number of customers coming back to the store. Overall, these techniques assist in making data-driven decisions to improve customer retention and satisfaction.
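As an illustration of the binomial technique mentioned above, the following sketch computes the probability that a given number of customers return. The values used here (10 customers, a return probability of 0.8) are purely hypothetical and are not taken from the survey; the function name is ours.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability that exactly k of n independent customers return."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Hypothetical inputs for illustration only (not survey results)
n_customers = 10
p_return = 0.8

# Probability that exactly 8 of the 10 customers come back
p_exactly_8 = binomial_pmf(8, n_customers, p_return)
# Probability that at least 8 of the 10 customers come back
p_at_least_8 = sum(binomial_pmf(k, n_customers, p_return)
                   for k in range(8, n_customers + 1))

if __name__ == "__main__":
    print(f"P(exactly 8 return)  = {p_exactly_8:.4f}")
    print(f"P(at least 8 return) = {p_at_least_8:.4f}")
```

In the actual analysis, `p_return` would be replaced by the proportion of respondents who answered "Yes" to question 11.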
2 Data analysis
2.1 Customer analysis
To analyze the characteristics of customers, five factors are used: gender (female or male), current job (student, employed, housewife or self-employed), age, annual income range, and the average amount of money spent at Starbucks per visit.
Table 1 Frequency table based on gender
Customers' gender | Probability of customers of that gender coming to Starbucks
the customers are female, 46.7% are male. It can be seen that there is not much difference between the proportions of the two genders coming to Starbucks.
Table 2 Frequency table based on customers' current job
Customers' current job | Probability of customers with that job coming to Starbucks
Based on the visiting frequency recorded by gender group and by customers' current job, the frequency distribution helps to estimate the probability that customers of a given gender or job will visit the store more frequently. Managers can therefore propose suitable marketing projects to reach target customers.
Figure 1 Histogram of customers' age (x-axis: Age; Mean = 27.34, Std. Dev. = 5.921)
In terms of customers' age, the respondents' answers show that Starbucks customers belong to various age groups, with an average age (mean) of about 27.34 and a standard deviation of approximately 5.921. As the histogram above shows, the variable is nearly normal, so the standard normal random variable (denoted by Z) can be used to analyze the characteristics of customer age groups.
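The standardization step can be sketched as follows. This is only an illustration, assuming age is exactly normal with the mean and standard deviation reported above; the helper names are ours, and the standard normal CDF is computed from the error function in Python's standard library.

```python
import math

MEAN_AGE = 27.34  # sample mean reported with the histogram
STD_AGE = 5.921   # sample standard deviation reported with the histogram


def z_score(age: float) -> float:
    """Standardize an age: Z = (age - mean) / standard deviation."""
    return (age - MEAN_AGE) / STD_AGE


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_age_below(age: float) -> float:
    """P(customer age < age) under the normal approximation."""
    return normal_cdf(z_score(age))


def prob_age_between(low: float, high: float) -> float:
    """P(low < customer age < high) under the normal approximation."""
    return prob_age_below(high) - prob_age_below(low)


if __name__ == "__main__":
    print(f"Z for age 20: {z_score(20):.3f}")
    print(f"P(age < 20): {prob_age_below(20):.4f}")
    print(f"P(20 < age < 30): {prob_age_between(20, 30):.4f}")
```

Because the histogram is only approximately normal, these probabilities are approximations rather than exact proportions of the sample.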
- Probability that the customers are below 20: