ASSIGNMENT REPORT

International Business Administration, Intake 62B

Report title: Hypothesis Testing and Applications



HANOI MINISTRY OF EDUCATION AND TRAINING

National Economics University

School of Advanced Educational Programs

11207155

Lê Ngọc Trâm

11205721

Đặng Bảo Linh

Course: BUSINESS STATISTICS

Date of completion: November 13th, 2022


I INTRODUCTION
1 Definition of Hypothesis testing
2 The necessity
3 Application

II ARTICLES AND ADDITIONAL SOURCE ANALYSIS
1 Article (Hypothesis testing and earthquake prediction - DAVID D JACKSON - Southern California Earthquake Center, University of California, Los Angeles, CA 90095-1567)
a Article summary
b Issue
c Technique
d Why do you care about technique as the organization manager?
e Application in this article
2 Article 2 (STATISTICAL ANALYSIS OF THE RELATION BETWEEN THE DEVELOPMENTAL STATUS OF NATIONS AND THEIR BANKS’ NPA USING HYPOTHESIS TESTING)
a Summary
b Issue
c Techniques
e Application
3 Article 3 (CO2 Emission - Truong Quynh Tam)
a Summary
b Issue
c Technique
e Application in the source

III DATA ANALYSIS
1 Database and analysis 1
2 Database and analysis 2

IV CONCLUSION

V REFERENCES


I INTRODUCTION

1 Definition of Hypothesis testing

Hypothesis testing is a type of statistical inference that uses data from a sample to draw conclusions about a population parameter or a population probability distribution.

A tentative assumption is first made about the parameter or distribution. This assumption is called the null hypothesis and is denoted by H0. An alternative hypothesis (denoted Ha), which is the opposite of what is stated in the null hypothesis, is then defined. Using sample data, the hypothesis-testing procedure determines whether or not H0 can be rejected. If H0 is rejected, the statistical conclusion is that the data support the alternative hypothesis Ha.

There are two possible errors. Rejecting a true null hypothesis is a Type I error. A Type II error is defined as not rejecting a false null hypothesis. The significance level, also known as alpha, is the probability of a Type I error. The probability of a Type II error is denoted by beta. For a fixed sample size, any attempt to decrease one of the two error probabilities, alpha or beta, results in a rise in the other.
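The trade-off between the two error probabilities can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a one-sided z-test with known standard deviation, and a hypothetical true effect, standard deviation, and sample size that do not come from this report. The function name `type_ii_error` is likewise made up for the example.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """Beta for a one-sided z-test of H0: mu = mu0 against a true mean of mu0 + effect.

    Assumes a known population standard deviation `sigma` and sample size `n`.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection threshold for the z statistic
    shift = effect * n ** 0.5 / sigma        # how far the true mean moves the z statistic
    return std_normal.cdf(z_crit - shift)    # probability of NOT rejecting a false H0


# Hypothetical setting: true effect of 0.5 sigma, 25 observations
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, effect=0.5, sigma=1.0, n=25)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

With the sample size held fixed, each smaller alpha in the loop produces a larger beta.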

● Steps for a hypothesis test

- State the null and alternative hypotheses

- Calculate the test statistic

- Formulate a decision rule using the rejection region, the p-value (found from the appropriate distribution), or the confidence interval approach

- Reach a conclusion: either reject the null hypothesis in favor of the alternative, or fail to reject the null hypothesis
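The four steps can be sketched in a few lines of code. This is a minimal illustration, assuming a one-sided large-sample z-test for a mean; the summary statistics (sample size, sample mean, sample standard deviation) and the hypothesized value are hypothetical numbers chosen for the example, not data from this report.

```python
from statistics import NormalDist

# Hypothetical summary statistics (assumptions for illustration only)
n = 100             # sample size
sample_mean = 5.30  # observed sample mean
sample_sd = 1.20    # observed sample standard deviation
mu0 = 5.00          # value of the mean stated in the null hypothesis
alpha = 0.05        # significance level

# Step 1: state the hypotheses -> H0: mu <= mu0, Ha: mu > mu0
# Step 2: calculate the test statistic (large-sample z approximation)
z = (sample_mean - mu0) / (sample_sd / n ** 0.5)

# Step 3: decision rule via the p-value (upper tail of the standard normal)
p_value = 1 - NormalDist().cdf(z)

# Step 4: reach a conclusion
reject_h0 = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The same decision could be reached with the rejection-region approach by comparing z with the critical value `NormalDist().inv_cdf(1 - alpha)`.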

2 The necessity

Hypothesis testing helps researchers decide whether a result observed in a sample is statistically significant. A statistic calculated from a sample may differ from the corresponding population value, and a sample gives a much blurrier picture than the whole population. This can lead to misunderstanding when making decisions. Repeating the experiment may bring the sample result closer to the population result, and hypothesis testing helps quantify how likely it is that an observed result arose by chance.

Therefore, hypothesis testing is of paramount importance when measuring the validity and reliability of outcomes.

Hypothesis testing helps evaluate the strength of a claim or assumption before it is applied to a data set. It is also a standard way to judge whether the data support a claim that something “is or is not” the case. Hypothesis testing works best when extrapolating from a sample to a population, which gives researchers a reliable framework for making decisions at the population scale.


3 Application

a) In healthcare

Many pharmacists and doctors use hypothesis testing in clinical trials. The impact of new clinical methods, medicines, or procedures on the condition of patients is analyzed through hypothesis testing. We will take a specific example of this application. A researcher at Fontys hospital hypothesizes that a new medication lowers the systolic blood pressure of patients by at least 10 mmHg compared to not taking the new medication. The level of uncertainty the researcher is willing to accept, that is, the significance level, is α = 0.05 (5%). The researcher can now carry out the study by giving 50 patients the new medication and 50 other patients no medication. The following hypothesis-testing procedure is then followed.

● Null Hypothesis (H0): "Taking the new medication will not lower systolic blood pressure by at least 10 mmHg compared to not taking the new medication."

● Alternative Hypothesis (HA): "Taking the new medication will lower systolic blood pressure by at least 10 mmHg compared to not taking the medication."

In our example, the researcher arrives at a p-value of 0.02, which is less than the significance level of 0.05. Therefore, the researcher rejects the null hypothesis in favor of the alternative. In other words, it can be concluded at the 5% significance level that the new medication is responsible for a decline of at least 10 mmHg in the systolic blood pressure of the patients.

b) In business

The real value of hypothesis testing in business is that it allows professionals to test their theories and assumptions before putting them into action. In addition, hypothesis testing helps firms gain customer insight. This essentially allows an organization to prepare for implementing a broader strategy. We will take a specific example of this application. JIS company wants to predict the mean price a customer would pay for the firm’s product, organic apple juice boxes. The company’s financial analyst then formulated a hypothesis: “The average value that customers will pay for the product is larger than $5.” We have a sample of 100 customers and pick the significance level


c) In manufacturing

Hypothesis testing is applied in manufacturing, for example to determine whether the implementation of a new technique or process in a plant caused anomalies in product quality. We will take a specific example. Manufacturing plant Splash wants to verify whether a particular method results in an increase in the number of defective products per quarter, say to 150. The significance level is set at 5%. To verify this, the researcher needs to calculate the mean number of defective products produced before the start and at the end of the quarter, that is, before and after the new method was implemented.

● Null Hypothesis (H0): The average number of defective products produced is the same before and after the implementation of the new manufacturing method

● Alternative Hypothesis (HA): The average number of defective products produced is different before and after the implementation of the new manufacturing method

After testing, the manufacturing plant obtains a p-value of 0.04, which is smaller than the significance level of 0.05. Therefore, the null hypothesis is rejected in favor of the alternative. In other words, it can be concluded at the 5% significance level that the change in the production method is associated with a change, here a rise, in the number of defective products produced per quarter.


II ARTICLES AND ADDITIONAL SOURCE ANALYSIS

1 Article (Hypothesis testing and earthquake prediction - DAVID D JACKSON - Southern California Earthquake Center, University of California, Los Angeles, CA 90095-1567)

a Article summary

Leon Knopoff, Keiiti Aki, Clarence R. Allen, James R. Rice, and Lynn R. Sykes’s (1995) study explored the statistical method of hypothesis testing as used in 4 earthquake prediction methods: single prediction, multiple prediction, probabilistic prediction, and prediction based on rate-density maps. The authors believed that it was worthwhile to formulate testable hypotheses carefully in advance, as it may take decades to validate prediction methods. Therefore, fully specified hypotheses were used, with no parameter adjustments or arbitrary decisions allowed during the test period.

In the single prediction method, the authors pointed out the difficulty of deciding whether a future earthquake qualifies as the predicted one when prerequisite factors such as magnitude scales, listings of hypocenters, and time zones are not specified in advance. The authors used the Parkfield, California prediction as an example of a failed approach under this method. The Parkfield case used a data set of previous earthquakes as input to calculate the probability that one qualifying earthquake would occur at random, at a rate determined by past behavior, and used that probability as a null hypothesis for long-term predictions.

In the multiple predictions method, by contrast, the authors considered several predictions at once, whether for separate regions, separate times, separate magnitude ranges, or some combination thereof. According to this scheme only the first qualifying earthquake counts in each region, so that implicitly the prediction for each region is terminated as soon as it succeeds. Constructing a reasonable null hypothesis is difficult because it requires the background or unconditional rate of large earthquakes within the TIP regions. However, this rate is low, and the catalog available for determining the rate is short. This led to questions: “Is it a region with a higher rate than other regions, or higher than for other times within that region? How much higher than normal is the rate supposed to be?” The authors turned to a third approach as a solution: “Probabilistic Prediction”.

In the probabilistic prediction method, the authors found that an earthquake prediction hypothesis was much more useful, and much more testable, if probabilities were attached to each region. The authors introduced 3 hypothesis testing schemes: the "N test" based on the total number of successes; the "L test" based on the likelihood values according to each distribution; and the "R test" based on the ratio of the likelihood value of the test hypothesis to that of the null hypothesis. They reached the conclusion that the success of a prediction scheme is measured by comparing it with a hypothetical scheme which occupies the same proportion of space-time.

The final method, Predictions Based on Rate-Density Maps, focused on addressing the drawbacks of the previous prediction methods, namely their failure to account for uncertainty in measurement and for imperfect specificity in stating hypotheses. To this end, the authors formulated hypotheses in terms of smaller regions, allowing for some smoothing of probability from one region to another.

In conclusion, a fair test of an earthquake prediction hypothesis must involve an adequate null hypothesis that incorporates well-known features of earthquake occurrence. The authors came to the conclusion that predictions specified in this form were no more difficult to test than other predictions, and suggested that earthquake predictions should be expressed as conditional rate densities whenever possible.

b Issue

Evaluating earthquake predictions relies on hypothesis testing. Because it may take decades to validate prediction methods, it is worthwhile to formulate testable hypotheses carefully in advance. Earthquake prediction generally implies that the probability will be temporarily higher than normal. Such a statement requires knowledge of "normal behavior"; that is, it requires a null hypothesis. Predictions made without a statement of probability are very difficult to test, and any test must be based on the ratio of earthquakes in and out of the forecast regions. Hypothesis testing is an essential part of the scientific method, and it is especially important in earthquake prediction because public safety, public funds, and public trust are involved. However, earthquakes occur apparently at random, and the larger, more interesting earthquakes are infrequent enough that a long time may be required to test a hypothesis. For this reason, it is important to formulate hypotheses carefully so that they may be reasonably evaluated at some future time. A fair test of an earthquake prediction hypothesis must involve an adequate null hypothesis that incorporates well-known features of earthquake occurrence.

c Technique

In this article, hypothesis testing is used in 4 methods: Single predictions, Multiple predictions, Probabilistic predictions, and Predictions Based on Rate-Density Maps. Hypotheses were suggested to be tested in three ways: (i) by comparing the number of actual earthquakes to the number predicted, (ii) by comparing the likelihood score of actual earthquakes to the predicted distribution, and (iii) by comparing the likelihood ratio to that of a null hypothesis.

- Single predictions:

In this method, a data set of previous earthquakes was used as input for calculating the probability that one qualifying earthquake would occur at random, and this probability was used as a null hypothesis:

$$p_0 = 1 - \exp(-rt)$$

For r = 1.5/yr and t = 1 yr, $p_0 \approx 0.78$

where,

r is the rate at which qualifying earthquakes occur

t is the length of the prediction period

$p_0$ is the probability that at least one qualifying earthquake would occur at random
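The worked value can be checked numerically. The short sketch below evaluates the formula for the rate and period quoted in the text; the function name `prob_at_least_one` is a hypothetical helper introduced only for this illustration.

```python
import math


def prob_at_least_one(rate_per_year, years):
    """Probability of at least one event in `years`, for events arriving at random at `rate_per_year`."""
    return 1.0 - math.exp(-rate_per_year * years)


# Values quoted in the text: r = 1.5 per year, t = 1 year
p0 = prob_at_least_one(1.5, 1.0)
print(f"p0 = {p0:.2f}")
```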

- Multiple predictions:

To simplify, the authors referred to the magnitude-space-time interval for each prediction as a "region." Let $p_{0j}$, for j = 1, ..., P, be the random probabilities of satisfying the predictions in each region, according to the null hypothesis, and let $c_j$, for j = 1, ..., P, be 1 for each region that was "filled" by a qualifying earthquake, and 0 for those not filled. Thus, $c_j$ was 1 for each successful prediction and 0 for each failure. According to this scheme only the first qualifying earthquake counted in each region, so that implicitly the prediction for each region was terminated as soon as it succeeded.

A reasonable measure of the success of the predictions was the total number of regions filled by qualifying earthquakes, $N = \sum_{j=1}^{P} c_j$. The probability of having as many or more successes at random was approximately that given by the Poisson distribution:

$$P(\text{at least } N \text{ successes}) \approx \sum_{n=N}^{\infty} \frac{\lambda^{n} e^{-\lambda}}{n!}$$

where λ is the expected number of successes according to the null hypothesis, $\lambda = \sum_{j=1}^{P} p_{0j}$
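As a rough illustration of this Poisson approximation, the sketch below computes the probability of at least a given number of successes at random. It assumes that λ is obtained by summing the null-hypothesis probabilities over the regions; the five probabilities and the observed count are hypothetical values, not figures from the article, and `poisson_tail` is a made-up helper name.

```python
import math


def poisson_tail(n_successes, lam):
    """P(N >= n_successes) for N ~ Poisson(lam), computed as 1 - P(N <= n_successes - 1)."""
    term = math.exp(-lam)   # Poisson probability of exactly 0 successes
    below = 0.0
    for k in range(n_successes):
        below += term
        term *= lam / (k + 1)   # move from P(N = k) to P(N = k + 1)
    return 1.0 - below


# Hypothetical null-hypothesis probabilities for five regions (illustration only)
null_probs = [0.10, 0.20, 0.15, 0.05, 0.30]
lam = sum(null_probs)    # expected number of successes under the null hypothesis
observed = 3             # hypothetical number of regions filled by qualifying earthquakes

tail = poisson_tail(observed, lam)
print(f"lambda = {lam:.2f}, P(N >= {observed}) = {tail:.4f}")
```

A small tail probability indicates that the observed number of successes would be unlikely if earthquakes occurred at the background rate alone.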

- Probabilistic predictions:

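In this method, the probabilities for each region were labeled $p_j$, for j = 1 through P. For simplicity, a prediction for any region was terminated if it succeeded, so that the only possibilities for each region were a failure (no qualifying earthquake) or a success (one or more qualifying earthquakes). Three tests were considered: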

● The “N Test”, based on the total number of successes. This approach used the rejection-region method for making decisions. The number of successes was compared with the distributions predicted by the null hypothesis (just as described above) and by the experimental hypothesis to be tested. Two critical values of N could be established:

○ Let N1 be the smallest value of N such that the probability of N or more successes according to the null hypothesis is less than 0.05. Then the null hypothesis could be rejected if the number of successes in the experiment exceeds N1.

○ Let N2 be the largest value of N such that the probability of N or fewer successes according to the test hypothesis is less than 0.05. Then the test hypothesis could be rejected if N was less than or equal to N2. If N2 was less than N1, then there was a range of possible success counts, above N2 and up to N1, for which neither hypothesis could be rejected.

● The "L test", based on the likelihood values according to each distribution. The authors supposed, as above, that they had P regions of time-space-magnitude, a test hypothesis with probabilities $p_j$, and a null hypothesis with probabilities $p_{0j}$. Here there are only two possibilities for each region, and the outcome probability is $p_j$ for a success and $1 - p_j$ for a failure. The log-likelihood function can then be represented by:

$$L = \sum_{j=1}^{P} \left[ c_j \log p_j + (1 - c_j) \log(1 - p_j) \right]$$

where $c_j$ is 1 if region j was filled by a qualifying earthquake and 0 otherwise.

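● The "R test," based on the ratio of the likelihood value of the test hypothesis to that of the null hypothesis. The log of this ratio is

$$R = L - L_0 = \sum_{j=1}^{P} \left[ c_j \log\frac{p_j}{p_{0j}} + (1 - c_j) \log\frac{1 - p_j}{1 - p_{0j}} \right]$$

where,

$R$ is the difference between the two log-likelihood functions $L$ (test hypothesis) and $L_0$ (null hypothesis), with the test hypothesis having the positive sign and the null hypothesis the negative sign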
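The three tests can be sketched together in code. This is a minimal illustration, not the article's implementation: it assumes that the number of successes follows a Poisson distribution under both hypotheses (for the N test), and that each region is an independent success/failure outcome with the stated probability (for the L and R tests). All probabilities, outcomes, and expected counts are hypothetical, and the function names are made up for the example.

```python
import math


def poisson_cdf(n, lam):
    """P(N <= n) for N ~ Poisson(lam); returns 0.0 when n is negative."""
    term = math.exp(-lam)
    total = 0.0
    for k in range(n + 1):
        total += term
        term *= lam / (k + 1)
    return total


def n_test_critical_values(lam_null, lam_test, level=0.05, n_max=200):
    """Critical values N1 and N2 of the N test under a Poisson assumption."""
    # N1: smallest N with P(N or more successes | null hypothesis) < level
    n1 = next(n for n in range(n_max) if 1.0 - poisson_cdf(n - 1, lam_null) < level)
    # N2: largest N with P(N or fewer successes | test hypothesis) < level (None if no such N)
    candidates = [n for n in range(n_max) if poisson_cdf(n, lam_test) < level]
    n2 = max(candidates) if candidates else None
    return n1, n2


def log_likelihood(outcomes, probs):
    """Sum over regions of c*log(p) + (1 - c)*log(1 - p), with c = 1 for a success."""
    return sum(c * math.log(p) + (1 - c) * math.log(1 - p)
               for c, p in zip(outcomes, probs))


# Hypothetical outcomes and probabilities for five regions (illustration only)
outcomes = [1, 0, 1, 0, 0]                   # 1 = region filled by a qualifying earthquake
test_probs = [0.60, 0.20, 0.50, 0.10, 0.30]  # probabilities under the test hypothesis
null_probs = [0.10, 0.20, 0.15, 0.05, 0.30]  # probabilities under the null hypothesis

L_test = log_likelihood(outcomes, test_probs)   # L test score of the test hypothesis
L_null = log_likelihood(outcomes, null_probs)   # same score under the null hypothesis
R = L_test - L_null                             # R test: log of the likelihood ratio

# Hypothetical expected success counts under each hypothesis for the N test
n1, n2 = n_test_critical_values(lam_null=2.0, lam_test=8.0)
print(f"N1 = {n1}, N2 = {n2}, L = {L_test:.3f}, L0 = {L_null:.3f}, R = {R:.3f}")
```

A positive R means the observed outcomes are more likely under the test hypothesis than under the null hypothesis; how large R must be to count as significant depends on its distribution under the null hypothesis, which this sketch does not compute.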

- Predictions Based on Rate-Density Maps:

The authors considered a refinement of the "region" concept to include infinitesimal regions of time-space, in which $p_j = \Lambda(x_j, y_j, t_j, m_j)\,dx\,dy\,dt\,dm$. Here dx, dy, dt, and dm were small increments of longitude, latitude, time, and magnitude, respectively. The function Λ was often referred to in the statistical literature as the "conditional intensity," but the term "conditional rate density" would be used to avoid confusion with seismic intensity.

If dx dy dt dm is made arbitrarily small, then $1 - p_j$ approaches 1, and $\log(1 - p_j)$ approaches $-p_j$. After some algebra, we then obtain:

$$R = -N' + N'_0 + \sum_{i=1}^{N} \log\frac{\Lambda(x_i, y_i, t_i, m_i)}{\Lambda_0(x_i, y_i, t_i, m_i)}$$

where,

$N$ is the number of earthquakes that actually occurred

$N'$ is the total number of earthquakes predicted by the test hypothesis

$N'_0$ is the total number of earthquakes predicted by the null hypothesis

$\Lambda_0$ is the conditional rate density under the null hypothesis

The sum is over those earthquakes that actually occurred
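The rate-density form of the log-likelihood ratio can be evaluated directly once the rate densities at the observed earthquakes and the predicted totals are known. The sketch below assumes the form R = -N' + N'0 + Σ log(Λ/Λ0), which follows from the small-increment limit described in the text; all numerical values are hypothetical and `rate_density_log_ratio` is a made-up helper name.

```python
import math


def rate_density_log_ratio(test_rates, null_rates, n_pred_test, n_pred_null):
    """Log-likelihood ratio for rate-density forecasts.

    test_rates / null_rates: rate densities of each hypothesis evaluated at the
    earthquakes that actually occurred (one pair per earthquake).
    n_pred_test / n_pred_null: total number of earthquakes each hypothesis predicts.
    """
    log_terms = sum(math.log(lam / lam0) for lam, lam0 in zip(test_rates, null_rates))
    return -n_pred_test + n_pred_null + log_terms


# Hypothetical rate densities at three observed earthquakes (illustration only)
test_rates = [0.8, 1.2, 0.5]
null_rates = [0.4, 0.4, 0.4]

R = rate_density_log_ratio(test_rates, null_rates, n_pred_test=3.5, n_pred_null=2.0)
print(f"R = {R:.4f}")
```

The two count terms penalize a hypothesis for predicting more earthquakes in total, while the sum rewards it for placing high rate density where earthquakes actually occurred.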

d Why do you care about technique as the organization manager?

Looking across the 4 approaches to hypothesis testing for earthquake prediction, managers who want to avoid misleading conclusions when calculating probabilities should use rational, multi-dimensional approaches for problems with many external influences; a probability is only valid when all the data used in the calculation have been consolidated. Earthquake predictions give warning of potentially damaging earthquakes early enough to allow an appropriate response to the disaster, enabling people to minimize loss of life and property. For nations in general, as the population increases, expanding urban development and construction works encroach upon areas susceptible to earthquakes. With a greater understanding of the causes and effects of earthquakes, we may be able to reduce damage and loss of life from this destructive phenomenon. For businesses, and especially agribusinesses, earthquake predictions play a vital role in helping companies reduce costs and protect resources from the effects of sudden earthquakes.

Title	Hypothesis Testing and Applications
Authors	Nguyễn Hà My, Nguyễn Hà Phương, Trần Lê Khanh, Trần Nguyễn Minh Hạnh, Lê Ngọc Trâm, Đặng Bảo Linh
School	National Economics University
Major	BUSINESS STATISTICS
Type	Assignment Report
Year of publication	2022
City	Hanoi
