The Uniform Distribution

DOCUMENT INFORMATION

Basic information

Format
Pages	22
File size	903.28 KB

Content


Journal of Water and Environment Technology, Vol.4, No.1, 2006

Effect of urban emissions on the horizontal distribution of metal concentration in sediments in the vicinity of Asian large cities

T. Urase 1*, K. Nadaoka 2, H. Yagi 2, T. Iwasa 1, Y. Suzuki 1, F. Siringan 3, T. P. Garcia 4, T. T. Thao 5

1: Dept. of Civil Engineering, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8552 Japan. *: Corresponding author. turase@fluid.cv.titech.ac.jp, +81-3-5734-3548
2: Dept. of Mechanical and Environmental Informatics, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8552 Japan.
3: National Institute of Geological Sciences, University of the Philippines, 1101 Diliman, Quezon City, Philippines
4: Dept. Civil Engineering, College of Engineering, Technological University of the Philippines, Manila, Philippines
5: Department of Analytical Chemistry, Hanoi University of Science, 19 Le Thanh Tong Street, Hanoi, Vietnam

Abstract: Metal contents of sediments in the Manila Bay – Laguna Lake watershed in the Philippines were measured and a detailed horizontal distribution was obtained. The distribution of zinc and lead concentrations in Manila Bay clearly shows the effect of anthropogenic contamination, and it was explained by the diffusion of lead- and zinc-rich anthropogenic particles discharged from the Pasig River. The sediments in Laguna Lake were mostly natural particulate matter from surrounding mountains and contained 20 mgPb/kg and 100 mgZn/kg, while the sediment taken at the heavily polluted branches of the Pasig River contained as much as 88 mgPb/kg and 310 mgZn/kg. The lead and zinc concentrations in the sediments of the Manila Bay – Laguna Lake watershed were compared with those in the mouth of the Tama River, Tokyo, where the faster deposition of coarser particles of natural origin and the slower deposition of lead- and zinc-rich anthropogenic particles determined the sediment concentration.
The comparison was also made with Hanoi City, Vietnam. In spite of the difference in the time when leaded gasoline was prohibited, the difference in the lead concentrations of roadside deposits and sediments was not obvious in the vicinity of these three target cities. This is probably due to dilution by a large amount of suspended solids conveyed by the Pasig River in the case of the Philippines. Storm water runoff containing roadside deposits and discharge of untreated wastewater were identified as factors increasing zinc and lead concentrations of sediments in receiving waters, based on the measurements of roadside deposits and the estimation of the contribution of untreated wastewater.

Keywords: Laguna Lake; lead; Manila Bay; sediment; wastewater; zinc.

Introduction

Asian cities generally have large populations. Human activities and their impacts on natural environments are concentrated in the vicinity of urban regions. High precipitation in Asian regions results in erosion of land and induces urban runoff during wet weather days. A large amount of particulate matter from natural and anthropogenic sources flows into receiving watersheds.
Incomplete sewer problems such as low coverage and

The Uniform Distribution

By: OpenStaxCollege

The uniform distribution is a continuous probability distribution and is concerned with events that are equally likely to occur. When working out problems that have a uniform distribution, be careful to note if the data is inclusive or exclusive.

The data in [link] are 55 smiling times, in seconds, of an eight-week-old baby.

10.4  19.6  18.8  13.9  17.8  16.8  21.6  17.9  12.5  11.1   4.9
12.8  14.8  22.8  20.0  15.9  16.3  13.4  17.1  14.5  19.0  22.8
 1.3   0.7   8.9  11.9  10.9   7.3   5.9   3.7  17.9  19.2   9.8
 5.8   6.9   2.6   5.8  21.7  11.8   3.4   2.1   4.5   8.9   9.4
 9.4   7.6  10.0   3.3   7.8  11.6  13.8  18.6   6.7   6.3  10.7

The sample mean = 11.49 and the sample standard deviation = 6.23.

We will assume that the smiling times, in seconds, follow a uniform distribution between zero and 23 seconds, inclusive. This means that any smiling time from zero to and including 23 seconds is equally likely. The histogram that could be constructed from the sample is an empirical distribution that closely matches the theoretical uniform distribution.

Let X = length, in seconds, of an eight-week-old baby's smile.

The notation for the uniform distribution is X ~ U(a, b) where a = the lowest value of x and b = the highest value of x.

The probability density function is $f(x) = \frac{1}{b-a}$ for a ≤ x ≤ b.

For this example, X ~ U(0, 23) and $f(x) = \frac{1}{23-0}$ for 0 ≤ X ≤ 23.

Formulas for the theoretical mean and standard deviation are $\mu = \frac{a+b}{2}$ and $\sigma = \sqrt{\frac{(b-a)^2}{12}}$.

For this problem, the theoretical mean and standard deviation are $\mu = \frac{0+23}{2} = 11.50$ seconds and $\sigma = \sqrt{\frac{(23-0)^2}{12}} = 6.64$ seconds.

Notice that the theoretical mean and standard deviation are close to the sample mean and standard deviation in this example.

Try It

The data that follow are the number of passengers on 35 different charter fishing boats. The sample mean = 7.9 and the sample standard deviation = 4.33. The data follow a uniform distribution where all
values between and including zero and 14 are equally likely. State the values of a and b. Write the distribution in proper notation, and calculate the theoretical mean and standard deviation.

12 10 14 11 11 13 10 12 10 13 10 14 12 11 10 11 11 13

a is zero; b is 14; X ~ U(0, 14); μ = 7 passengers; σ = 4.04 passengers

a. Refer to [link]. What is the probability that a randomly chosen eight-week-old baby smiles between two and 18 seconds?

a. Find P(2 < x < 18).
$P(2 < x < 18) = (\text{base})(\text{height}) = (18 - 2)\left(\frac{1}{23}\right) = \frac{16}{23}$

b. Find the 90th percentile for an eight-week-old baby's smiling time.

b. Ninety percent of the smiling times fall below the 90th percentile, k, so P(x < k) = 0.90.
$P(x < k) = 0.90$
$(\text{base})(\text{height}) = 0.90$
$(k - 0)\left(\frac{1}{23}\right) = 0.90$
$k = (23)(0.90) = 20.7$

c. Find the probability that a random eight-week-old baby smiles more than 12 seconds KNOWING that the baby smiles MORE THAN EIGHT SECONDS.

c. This probability question is a conditional. You are asked to find the probability that an eight-week-old baby smiles more than 12 seconds when you already know the baby has smiled for more than eight seconds.

Find P(x > 12|x > 8). There are two ways to do the problem. For the first way, use the fact that this is a conditional and changes the sample space. The graph illustrates the new sample space. You already know the baby smiled more than eight seconds.

Write a new f(x): $f(x) = \frac{1}{23-8} = \frac{1}{15}$ for 8 < x < 23.
$P(x > 12 \mid x > 8) = (23 - 12)\left(\frac{1}{15}\right) = \frac{11}{15}$

For the second way, use the conditional formula from Probability Topics with the original distribution X ~ U(0, 23):
$P(A \mid B) = \frac{P(A \text{ AND } B)}{P(B)}$
For this problem, A is (x > 12) and B is (x > 8). So,
$P(x > 12 \mid x > 8) = \frac{P(x > 12 \text{ AND } x > 8)}{P(x > 8)} = \frac{P(x > 12)}{P(x > 8)} = \frac{11/23}{15/23} = \frac{11}{15}$

Try It

A distribution is given as X ~ U(0, 20). What is P(2 < x < 18)?
Find the 90th percentile.

P(2 < x < 18) = 0.8; 90th percentile = 18

The amount of time, in minutes, that a person must wait for a bus is uniformly distributed between zero and 15 minutes, inclusive.

a. What is the probability that a person waits fewer than 12.5 minutes?

a. Let X = the number of minutes a person must wait for a bus. a = 0 and b = 15. X ~ U(0, 15). Write the probability density function: $f(x) = \frac{1}{15-0} = \frac{1}{15}$ for 0 ≤ x ≤ 15.
Find P(x < 12.5). Draw a graph.
$P(x < k) = (\text{base})(\text{height}) = (12.5 - 0)\left(\frac{1}{15}\right) = 0.8333$
The probability a person waits less than 12.5 minutes is 0.8333.

b. On the average, how long must a person wait? Find the mean, μ, and the standard deviation, σ.

b. $\mu = \frac{a+b}{2} = \frac{15+0}{2} = 7.5$. On the average, a person must wait 7.5 minutes.
$\sigma = \sqrt{\frac{(b-a)^2}{12}} = \sqrt{\frac{(15-0)^2}{12}} = 4.3$. The standard deviation is 4.3 minutes.

c. Ninety percent of the time, the time a person must wait falls below what value?
Note: This asks for the 90th percentile.

c. Find the 90th percentile. Draw a graph. Let k = the 90th percentile.
$P(x < k) = (\text{base})(\text{height}) = (k - 0)\left(\frac{1}{15}\right)$
$0.90 = (k)\left(\frac{1}{15}\right)$
$k = (0.90)(15) = 13.5$
k is sometimes called a critical value.

...

CHAPTER 7 The Binomial Distribution

Introduction

Many probability problems involve assigning probabilities to the outcomes of a probability experiment. These probabilities and the corresponding outcomes make up a probability distribution. There are many different probability distributions. One special probability distribution is called the binomial distribution. The binomial distribution has many uses such as in gambling, in inspecting parts, and in other areas.

Copyright © 2005 by The McGraw-Hill Companies, Inc.

Discrete Probability Distributions

In mathematics, a variable can assume different values. For example, if one records the temperature outside every hour for a 24-hour period, temperature is considered a variable since it assumes different values.
Variables whose values are due to chance are called random variables. When a die is rolled, the value of the spots on the face up occurs by chance; hence, the number of spots on the face up on the die is considered to be a random variable. The outcomes of a die are 1, 2, 3, 4, 5, and 6, and the probability of each outcome occurring is 1/6. The outcomes and their corresponding probabilities can be written in a table, as shown, and make up what is called a probability distribution.

Value, x            1    2    3    4    5    6
Probability, P(x)   1/6  1/6  1/6  1/6  1/6  1/6

A probability distribution consists of the values of a random variable and their corresponding probabilities.

There are two kinds of probability distributions. They are discrete and continuous. A discrete variable has a countable number of values (countable means values of zero, one, two, three, etc.). For example, when four coins are tossed, the outcomes for the number of heads obtained are zero, one, two, three, and four. When a single die is rolled, the outcomes are one, two, three, four, five, and six. These are examples of discrete variables.

A continuous variable has an infinite number of values between any two values. Continuous variables are measured. For example, temperature is a continuous variable since the variable can assume any value between 10° and 20° or any other two temperatures or values for that matter. Height and weight are continuous variables. Of course, we are limited by our measuring devices and values of continuous variables are usually "rounded off."

EXAMPLE: Construct a discrete probability distribution for the number of heads when three coins are tossed.

SOLUTION: Recall that the sample space for tossing three coins is TTT, TTH, THT, HTT, HHT, HTH, THH, and HHH.

The outcomes can be arranged according to the number of heads, as shown.
0 heads   TTT
1 head    TTH, THT, HTT
2 heads   THH, HTH, HHT
3 heads   HHH

Finally, the outcomes and corresponding probabilities can be written in a table, as shown.

Outcome, x          0    1    2    3
Probability, P(x)   1/8  3/8  3/8  1/8

The sum of the probabilities of a probability distribution must be 1.

A discrete probability distribution can also be shown graphically by labeling the x axis with the values of the outcomes and letting the values on the y axis represent the probabilities for the outcomes. The graph for the discrete probability distribution of the number of heads occurring when three coins are tossed is shown in Figure 7-1.

There are many kinds of discrete probability distributions; however, the distribution of the number of heads when three coins are tossed is a special kind of

CHAPTER 9 The Normal Distribution

Introduction

A branch of mathematics that uses probability is called statistics. Statistics is the branch of mathematics that uses observations and measurements called data to analyze, summarize, make inferences, and draw conclusions based on the data gathered. This chapter will explain some basic concepts of statistics such as measures of average and measures of variation. Finally, the relationship between probability and normal distribution will be explained in the last two sections.

Measures of Average

There are three statistical measures that are commonly used for average. They are the mean, median, and mode. The mean is found by adding the data values and dividing by the number of values.

EXAMPLE: Find the mean of 18, 24, 16, 15, and 12.

SOLUTION: Add the values: 18 + 24 + 16 + 15 + 12 = 85
Divide by the number of values, 5: 85 ÷ 5 = 17
Hence the mean is 17.

EXAMPLE: The ages of 6 executives are 48, 56, 42, 52, 53 and 52. Find the mean.

SOLUTION: Add: 48 + 56 + 42 + 52 + 53 + 52 = 303
Divide by 6: 303 ÷ 6 = 50.5
Hence the mean age is 50.5.
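The two mean examples above can be checked with a few lines of Python. This is a minimal sketch, not part of the original text: the helper name `mean` is chosen here for illustration, and exact fractions are used so that the division introduces no rounding.

```python
from fractions import Fraction


def mean(values):
    """Arithmetic mean: add the data values, then divide by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return Fraction(sum(values), len(values))


# Data taken from the two worked examples in the text.
first_example = mean([18, 24, 16, 15, 12])
executive_ages = mean([48, 56, 42, 52, 53, 52])

print(float(first_example))   # 17.0
print(float(executive_ages))  # 50.5
```

Python's standard library also offers `statistics.mean`, which gives the same results for these data.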
The median is the middle data value if there is an odd number of data values or the number halfway between the two data values at the center, if there is an even number of data values, when the data values are arranged in order.

EXAMPLE: Find the median of 18, 24, 16, 15, and 12.

SOLUTION: Arrange the data in order: 12, 15, 16, 18, 24
Find the middle value: 12, 15, 16, 18, 24
The median is 16.

EXAMPLE: Find the median of the number of minutes 10 people had to wait in a checkout line at a local supermarket: 3, 0, 8, 2, 5, 6, 1, 4, 1, and 0.

SOLUTION: Arrange the data in order: 0, 0, 1, 1, 2, 3, 4, 5, 6, 8
The middle falls between 2 and 3; hence, the median is (2 + 3) ÷ 2 = 2.5.

The third measure of average is called the mode. The mode is the data value that occurs most frequently.

EXAMPLE: Find the mode for 22, 27, 30, 42, 16, 30, and 18.

SOLUTION: Since 30 occurs twice and more frequently than any other value, the mode is 30.

EXAMPLE: Find the mode for 2, 3, 3, 3, 4, 4, 6, 6, 6, 8, 9, and 10.

SOLUTION: In this example, 3 and 6 occur most often; hence, 3 and 6 are used as the mode. In this case, we say that the distribution is bimodal.

EXAMPLE: Find the mode for 18, 24, 16, 15, and 12.

SOLUTION: Since no value occurs more than any other value, there is no mode.

A distribution can have one mode, more than one mode, or no mode. Also, the mean, median, and mode for a set of values most often differ somewhat.

PRACTICE

1. Find the mean, median, and mode for the number of sick days nine employees used last year. The data are 3, 6, 8, 2, 0, 5, 7, 8, and 5.
2. Find the mean, median, and mode for the number of rooms seven hotels in a large city have. The data are 332, 256, 300, 275, 216, 314, and 192.
3. Find the mean, median, and mode for the number of tornadoes that occurred in a specific state over the last 5 years. The data are 18, 6, 3, 9, and 10.
4.
Find the mean, median, and mode for the number of items 9 people purchased at the express checkout register. The data are 12, 8, 6, 1, 5, 4, 6, 2, and 6.
5. Find the mean, median, and mode for the ages of 10 children who participated in a field trip to the zoo. The ages are 7, 12, 11, 11, 5, 8, 11, 7, 8, and 6.

ANSWERS

1. Mean = 3 + 6 + 8 + 2 + 0 + 5 +

OptiX OSN 3500 Installation Manual

Contents
6 Installing the Cable Distribution Plate
6.1 Installation Position
6.2 Installing the Cable Distribution Plate in the Cabinet

Issue 05 (2006-11-20) Huawei Technologies Proprietary

Figures
Installing the cable distribution plate

T2-0416xx- 20050330-C- 1.20

6 Installing the Cable Distribution Plate

About This Chapter

This chapter guides you to install the cable distribution plate. One cable distribution plate is delivered with one subrack. The cable distribution plate is installed over the subrack.

The following table lists the contents of this chapter.

Section                                                      Description
6.1 Installation Position                                    Describes the position of the cable distribution plate in the cabinet.
6.2 Installing the Cable Distribution Plate in the Cabinet   Describes the steps to install the cable distribution plate.

6.1 Installation Position

Figure 1.1 shows the position of the cable distribution plate in the 2200 mm-high cabinet.

Figure 1.1 Position for mounting ears

Cable distribution plate                           Installation position
Cable distribution plate for the lower subrack     37, 38
Cable distribution plate for the upper subrack     62, 63

Note: The holes are numbered from bottom to top. The hole at the bottom is numbered 1.

6.2 Installing the Cable Distribution Plate in the Cabinet

Purpose

This procedure guides you to install the cable distribution plate in the cabinet.
Tools/Materials
Cross screwdriver
Cable distribution plate

Prerequisites
The cabinet and the subrack have been installed.

Required/As needed
Required

Step 1 Secure the cable distribution plate into the cabinet over the subrack. See Figure 1.1.

Figure 1.1 Installing the cable distribution plate

End

Genome Biology 2007, 8:R69
Open Access
2007, Fodor et al., Volume 8, Issue 5, Article R69

Method

Towards the uniform distribution of null P values on Affymetrix microarrays

Anthony A Fodor*, Timothy L Tickle* and Christine Richardson*†

Addresses: *Bioinformatics Resource Center, The University of North Carolina at Charlotte, University City Boulevard, Charlotte, North Carolina 28223, USA. †Department of Biology, The University of North Carolina at Charlotte, University City Boulevard, Charlotte, North Carolina 28223, USA.

Correspondence: Anthony A Fodor. Email: anthony.fodor@gmail.com

© 2007 Fodor et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Uniform distribution of microarray P values
Estimating the P value from the overall distribution of scores on the microarray can produce P values that are much closer to a uniform distribution.

Abstract

Methods to control false-positive rates require that P values of genes that are not differentially expressed follow a uniform distribution. Commonly used microarray statistics can generate P values that do not meet this assumption.
We show that poorly characterized variance, imperfect normalization, and cross-hybridization are among the many causes of this non-uniform distribution. We demonstrate a simple technique that produces P values that are close to uniform for nondifferentially expressed genes in control datasets.

Background

Microarray data typically involve tens of thousands of genes but only a handful of replicates. It is therefore difficult to establish appropriate P value thresholds for significance. For example, consider the response of 40,000 genes to two different experimental conditions, say diseased and healthy tissue. If a significance level of P < 0.05 is chosen, then one would expect an unacceptable number (2,000 [40,000 × 0.05]) of false positives. A conceptually simple procedure, the Bonferroni correction, would set a threshold of P = 1.25 × 10^-6 (0.05/40,000). Using this P value as the threshold for significance, there is only a 0.05 chance of any false positives across all of the 40,000 comparisons between the two conditions. Such metrics are said to control the 'family-wise error rate'. Family-wise error rate is often assumed to be too conservative for microarray experiments, because there are often no results with P values below the threshold for the modest number of samples that make up most microarray experiments.

Recently, 'false discovery rate' (FDR) was proposed as an alternative, more permissive approach to estimating significance of microarray experiments [1-4]. This metric acknowledges that biologists are often able to tolerate some error in gene lists. For example, an FDR could be set at 10%, in which case a list of 100 genes would be expected to have as many as 10 false positives.

No matter what threshold is used to control significance in microarray experiments, there is an inherent assumption that the P values of genes that are not differentially expressed follow a uniform distribution.
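The threshold arithmetic in the Background above can be reproduced directly. The sketch below is illustrative only: the function names are made up here and do not come from the paper, and the first function relies on the same assumption the text states, namely that null P values are uniform, so that a fraction alpha of them falls below alpha.

```python
import math


def expected_false_positives(n_tests, alpha):
    """Expected count of null tests with P < alpha, assuming uniform null P values."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, family_alpha):
    """Per-test threshold that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


def expected_false_discoveries(list_size, fdr):
    """Expected number of false positives in a gene list at a given FDR."""
    return list_size * fdr


n_genes = 40_000  # number of genes in the example from the text

uncorrected = expected_false_positives(n_genes, 0.05)
per_test = bonferroni_threshold(n_genes, 0.05)
tolerated = expected_false_discoveries(100, 0.10)

print(f"expected false positives at P < 0.05: {uncorrected:.0f}")
print(f"Bonferroni per-test threshold: {per_test:.2e}")
print(f"expected false discoveries in 100 genes at 10% FDR: {tolerated:.0f}")
```

The three printed values correspond to the 2,000 false positives, the 1.25 × 10^-6 threshold, and the 10 tolerated false positives quoted in the text.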
For example, genes that are not differentially expressed should have a P value of 0.01 or smaller only 1% of the time. The uniform distribution of null P values seems like a safe assumption that is guaranteed by the laws of statistics. However, if for some reason this assumption is not met, then attempts to determine a threshold of significance may yield meaningless results [2,5]. In this report we show that commonly used statistics can in fact generate distributions of P values for non-differentially expressed genes that ...

... it represents the highest value of x

What is the probability density function?

What is the theoretical mean?
six

What is the theoretical standard deviation?

Draw the graph of the distribution. ..

Find the 40th percentile
4.8

Use the following information to answer the next eleven exercises. The age of cars in the staff parking lot of a suburban college is uniformly...

... minutes. The waiting times for the train are known to follow a uniform distribution. What is the average waiting time (in minutes)?
zero
two
three
four

d

Find the 30th
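The worked examples in The Uniform Distribution (smiling times, X ~ U(0, 23), and bus waiting times, X ~ U(0, 15)) all reduce to the same few formulas: area = (base)(height), μ = (a + b)/2, σ = √((b − a)²/12), and the conditional formula P(A|B) = P(A AND B)/P(B). The following is a minimal sketch for checking such answers. The class and method names are hypothetical, chosen here for illustration rather than taken from the source.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Uniform:
    """Continuous uniform distribution X ~ U(a, b) with a < b."""
    a: float
    b: float

    def __post_init__(self):
        if not self.a < self.b:
            raise ValueError("U(a, b) requires a < b")

    def pdf(self, x):
        # Height of the rectangle: 1 / (b - a) on [a, b], zero elsewhere.
        return 1.0 / (self.b - self.a) if self.a <= x <= self.b else 0.0

    def cdf(self, x):
        # P(X < x) = (base)(height), with x clamped to the interval [a, b].
        clamped = min(max(x, self.a), self.b)
        return (clamped - self.a) / (self.b - self.a)

    def prob_between(self, lo, hi):
        return self.cdf(hi) - self.cdf(lo)

    def mean(self):
        return (self.a + self.b) / 2

    def std(self):
        return math.sqrt((self.b - self.a) ** 2 / 12)

    def percentile(self, p):
        # Solve (k - a) / (b - a) = p for k, with p given as a fraction.
        if not 0 <= p <= 1:
            raise ValueError("p must lie between 0 and 1")
        return self.a + p * (self.b - self.a)

    def prob_greater_given(self, x, given):
        # P(X > x | X > given) = P(X > x AND X > given) / P(X > given)
        tail_given = 1 - self.cdf(given)
        if tail_given == 0:
            raise ValueError("conditioning event has probability zero")
        return (1 - self.cdf(max(x, given))) / tail_given


smile = Uniform(0, 23)
bus = Uniform(0, 15)

print(f"{smile.prob_between(2, 18):.4f}")       # 16/23
print(f"{smile.percentile(0.90):.1f}")          # 90th percentile, 20.7 in the text
print(f"{smile.prob_greater_given(12, 8):.4f}") # 11/15
print(f"{bus.cdf(12.5):.4f}")                   # 0.8333 in the text
print(f"{bus.mean():.1f} {bus.std():.2f}")      # mean 7.5; the text rounds sigma to 4.3
print(f"{bus.percentile(0.90):.1f}")            # 13.5 in the text
```

Because the density is constant, every probability is a rectangle area, which is why `cdf` is a single linear expression and the conditional probability only rescales the tail.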

Date posted: 31/10/2017, 16:47
