Copyright © Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019— 9781488617249 — Berenson/Basic Business Statistics 5e

5TH EDITION
Basic Business Statistics
Concepts and applications
Berenson Levine Szabat O’Brien Jayne Watson

Copyright © Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019

Pearson Australia
707 Collins Street
Melbourne VIC 3008
www.pearson.com.au

Authorised adaptation from the United States edition entitled Basic Business Statistics, 13th edition, ISBN 0321870026 by Berenson, Mark L., Levine, David M., Szabat, Kathryn A., published by Pearson Education, Inc., Copyright © 2015

Fifth adaptation edition published by Pearson Australia Group Pty Ltd, Copyright © 2019

The Copyright Act 1968 of Australia allows a maximum of one chapter or 10% of this book, whichever is the greater, to be copied by any educational institution for its educational purposes provided that that educational institution (or the body that administers it) has given a remuneration notice to Copyright Agency Limited (CAL) under the Act.

For details of the CAL licence for educational institutions contact: Copyright Agency Limited, telephone: (02) 9394 7600, email: info@copyright.com.au

All rights reserved. Except under the conditions described in the Copyright Act 1968 of Australia and subsequent amendments, no part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.

Portfolio Manager: Rebecca Pedley
Development Editor: Anna Carter
Project Managers: Anubhuti Harsh and Keely Smith
Production Manager: Julie Ganner
Product Manager: Sachin Dua
Content Developer: Victoria Kerr
Rights and Permissions Team Leader: Lisa Woodland
Lead Editor/Copy Editor: Julie Ganner
Proofreader: Katy McDevitt
Indexer: Garry Cousins
Cover and internal design by Natalie Bowra
Cover photograph © kireewong foto/Shutterstock
Typeset by iEnergizer Aptara®, Ltd
Printed in Malaysia
ISBN 9781488617249
23 22 21 20 19
Pearson Australia Group Pty Ltd ABN 40 004 245 943

brief contents

Preface
Acknowledgements
How to use this book
About the authors

PART 1 PRESENTING AND DESCRIBING INFORMATION
1 Defining and collecting data
2 Organising and visualising data
3 Numerical descriptive measures

PART 2 MEASURING UNCERTAINTY
4 Basic probability
5 Some important discrete probability distributions
6 The normal distribution and other continuous distributions
7 Sampling distributions

PART 3 DRAWING CONCLUSIONS ABOUT POPULATIONS BASED ONLY ON SAMPLE INFORMATION
8 Confidence interval estimation
9 Fundamentals of hypothesis testing: One-sample tests
10 Hypothesis testing: Two-sample tests
11 Analysis of variance

PART 4 DETERMINING CAUSE AND MAKING RELIABLE FORECASTS
12 Simple linear regression
13 Introduction to multiple regression
14 Time-series forecasting and index numbers
15 Chi-square tests

ONLINE CHAPTERS
PART 5 FURTHER TOPICS IN STATS
16 Multiple regression model building
17 Decision making
18 Statistical applications in quality management
19 Further non-parametric tests
20 Business analytics
21 Data analysis: The big picture

Appendices A to F
Glossary
Index

detailed contents

Preface
Acknowledgements
How to use this book
About the authors

PART 1 PRESENTING AND DESCRIBING INFORMATION

1 Defining and collecting data
1.1 Basic concepts of data and statistics
1.2 Types of variables
1.3 Collecting data
1.4 Types of survey sampling methods
1.5 Evaluating survey worthiness
1.6 The growth of statistics and information technology
Summary
Key terms
References
Chapter review problems
Continuing cases
Chapter 1 Excel Guide

2 Organising and visualising data
2.1 Organising and visualising categorical data
2.2 Organising numerical data
2.3 Summarising and visualising numerical data
2.4 Organising and visualising two categorical variables
2.5 Visualising two numerical variables
2.6 Business analytics applications – descriptive analytics
2.7 Misusing graphs and ethical issues
Summary
Key terms
References
Chapter review problems
Continuing cases
Chapter 2 Excel Guide

3 Numerical descriptive measures
3.1 Measures of central tendency, variation and shape
3.2 Numerical descriptive measures for a population
3.3 Calculating numerical descriptive measures from a frequency distribution
3.4 Five-number summary and box-and-whisker plots
3.5 Covariance and the coefficient of correlation
3.6 Pitfalls in numerical descriptive measures and ethical issues
Summary
Key formulas
Key terms
Chapter review problems
Continuing cases
Chapter 3 Excel Guide
End of Part 1 problems

PART 2 MEASURING UNCERTAINTY

4 Basic probability
4.1 Basic probability concepts
4.2 Conditional probability
4.3 Bayes’ theorem
4.4 Counting rules
4.5 Ethical issues and probability
Summary
Key formulas
Key terms
Chapter review problems
Continuing cases
Chapter 4 Excel Guide

5 Some important discrete probability distributions
5.1 Probability distribution for a discrete random variable
5.2 Covariance and its application in finance
5.3 Binomial distribution
5.4 Poisson distribution
5.5 Hypergeometric distribution
Summary
Key formulas
Key terms
Chapter review problems
Chapter 5 Excel Guide

6 The normal distribution and other continuous distributions
6.1 Continuous probability distributions
6.2 The normal distribution
6.3 Evaluating normality
6.4 The uniform distribution
6.5 The exponential distribution
6.6 The normal approximation to the binomial distribution
Summary
Key formulas
Key terms
Chapter review problems
Continuing cases
Chapter 6 Excel Guide

7 Sampling distributions
7.1 Sampling distributions
7.2 Sampling distribution of the mean
7.3 Sampling distribution of the proportion
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 7 Excel Guide
End of Part 2 problems

PART 3 DRAWING CONCLUSIONS ABOUT POPULATIONS BASED ONLY ON SAMPLE INFORMATION

8 Confidence interval estimation
8.1 Confidence interval estimation for the mean (σ known)
8.2 Confidence interval estimation for the mean (σ unknown)
8.3 Confidence interval estimation for the proportion
8.4 Determining sample size
8.5 Applications of confidence interval estimation in auditing
8.6 More on confidence interval estimation and ethical issues
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 8 Excel Guide

9 Fundamentals of hypothesis testing: One-sample tests
9.1 Hypothesis-testing methodology
9.2 Z test of hypothesis for the mean (σ known)
9.3 One-tail tests
9.4 t test of hypothesis for the mean (σ unknown)
9.5 Z test of hypothesis for the proportion
9.6 The power of a test
9.7 Potential hypothesis-testing pitfalls and ethical issues
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 9 Excel Guide

10 Hypothesis testing: Two-sample tests
10.1 Comparing the means of two independent populations
10.2 Comparing the means of two related populations
10.3 F test for the difference between two variances
10.4 Comparing two population proportions
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 10 Excel Guide

11 Analysis of variance
11.1 The completely randomised design: One-way analysis of variance
11.2 The randomised block design
11.3 The factorial design: Two-way analysis of variance
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 11 Excel Guide
End of Part 3 problems

PART 4 DETERMINING CAUSE AND MAKING RELIABLE FORECASTS

12 Simple linear regression
12.1 Types of regression models
12.2 Determining the simple linear regression equation
12.3 Measures of variation
12.4 Assumptions
12.5 Residual analysis
12.6 Measuring autocorrelation: The Durbin–Watson statistic
12.7 Inferences about the slope and correlation coefficient
12.8 Estimation of mean values and prediction of individual values
12.9 Pitfalls in regression and ethical issues
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 12 Excel Guide

13 Introduction to multiple regression
13.1 Developing the multiple regression model
13.2 R², adjusted R² and the overall F test
13.3 Residual analysis for the multiple regression model
13.4 Inferences concerning the population regression coefficients
13.5 Testing portions of the multiple regression model
13.6 Using dummy variables and interaction terms in regression models
13.7 Collinearity
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 13 Excel Guide

14 Time-series forecasting and index numbers
14.1 The importance of business forecasting
14.2 Component factors of the classical multiplicative time-series model
14.3 Smoothing the annual time series
14.4 Least-squares trend fitting and forecasting
14.5 The Holt–Winters method for trend fitting and forecasting
14.6 Autoregressive modelling for trend fitting and forecasting
14.7 Choosing an appropriate forecasting model
14.8 Time-series forecasting of seasonal data
14.9 Index numbers
14.10 Pitfalls in time-series forecasting
Summary
Key formulas
Key terms
References
Chapter review problems
Chapter 14 Excel Guide

15 Chi-square tests
15.1 Chi-square test for the difference between two proportions (independent samples)
15.2 Chi-square test for differences between more than two proportions
15.3 Chi-square test of independence
15.4 Chi-square goodness-of-fit tests
15.5 Chi-square test for a variance or standard deviation
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 15 Excel Guide
End of Part 4 problems

PART 5 (ONLINE) FURTHER TOPICS IN STATS

16 Multiple regression model building
16.1 Quadratic regression model
16.2 Using transformations in regression models
16.3 Influence analysis
16.4 Model building
16.5 Pitfalls in multiple regression and ethical issues
Summary
Key formulas
Key terms
References
Chapter review problems
Continuing cases
Chapter 16 Excel Guide

17 Decision making
17.1 Payoff tables and decision trees
17.2 Criteria for decision making
17.3 Decision making with sample information
17.4 Utility
Summary
Key formulas
Key terms
References
Chapter review problems
Chapter 17 Excel Guide

18 Statistical applications in quality management
18.1 Total quality management
18.2 Six Sigma management
18.3 The theory of control charts
18.4 Control chart for the proportion – The p chart
18.5 The red bead experiment – Understanding process variability
18.6 Control chart for an area of opportunity – The c chart
18.7 Control charts for the range and the mean
18.8 Process capability
Summary
Key formulas
Key terms
References
Chapter review problems
Chapter 18 Excel Guide

19 Further non-parametric tests
19.1 McNemar test for the difference between two proportions (related samples)
19.2 Wilcoxon rank sum test – Non-parametric analysis for two independent populations
19.3 Wilcoxon signed ranks test – Non-parametric analysis for two related populations
19.4 Kruskal–Wallis rank test – Non-parametric analysis for the one-way ANOVA
19.5 Friedman rank test – Non-parametric analysis for the randomised block design
Summary
Key formulas
Key terms
Chapter review problems
Continuing cases
Chapter 19 Excel Guide

20 Business analytics
20.1 Predictive analytics
20.2 Classification and regression trees
20.3 Neural networks
20.4 Cluster analysis
20.5 Multidimensional scaling
Key formulas
Key terms
References
Chapter review problems
Chapter 20 Software Guide

21 Data analysis: The big picture
21.1 Analysing numerical variables
21.2 Analysing categorical variables
21.3 Predictive analytics
Chapter review problems
End of Part 5 problems

Appendices A to F
Glossary
Index

GLOSSARY

Paasche price index: Uses consumption quantities in the final year to weight price changes measured as an index number
paired: Observations that are analysed together on the basis of a common characteristic
paired t test for the mean difference in related populations: A test for the difference between the means of two populations that have a common characteristic
parameter: A numerical measure of some population characteristics
parsimony: The process of choosing the simplest model in terms of independent variables that still adequately explains the variation in the dependent variable
partial correlation: The correlation between two variables after removing the effects of other variables; used to identify spurious correlation and hidden correlation (a correlation masked by the effect of other variables)
partial F test: Tests for a significant contribution of an individual independent X variable in multiple regression after all other independent X variables have been included in the regression model, using the F probability distribution
payoff table: A table that shows the values associated with every possible event that can occur for each course of action
payoffs: Values associated with the outcome of events
p chart: A control chart for the proportion of nonconforming items
Pearson correlation: The correlation coefficient, also called the linear or product-moment correlation; determines the extent to which values of two variables are ‘proportional’ to each other
percentage distribution: A summary table for numerical data; it gives the percentage of data values in each class
percentage polygon: A graphical representation of a percentage distribution
permutation: An ordered selection of items
pie chart: A graphical representation of a summary table for categorical data; each category is represented by a slice of a circle
of which the area represents the proportion or percentage share of the category
point estimate: A single value, calculated from a sample, that is used to estimate an unknown population parameter
Poisson distribution: Discrete probability distribution, where the random variable is the number of events in a given interval
pooled-variance t test: A test for the difference between two population means which assumes that the unknown population variances are equal
population: A collection of all members of a group being investigated
population mean: A mean calculated from population data
population standard deviation: A standard deviation calculated from population data
population variance: A variance calculated from population data
portfolio: A combined investment in two or more assets
portfolio expected return: A measure of central tendency; a mean return on investment
portfolio risk: A measure of the variation of investment returns
post-hoc: A comparison where hypotheses are formulated after the data have been inspected
power curve: A graph showing the power of the test for various actual values of the population parameter
power of a statistical test: The probability that you reject the null hypothesis when it is false and should be rejected
prediction interval for an individual response Y: The interval for the prediction of a specific value of Y in regression, given a value of X
prediction line: The straight line derived by a regression equation using the method of least squares
predictive analytics: A form of business analytics that identifies what is likely to occur in the (near) future and finds relationships in data that may not be readily apparent using descriptive analytics
prescriptive analytics: A form of business analytics that investigates what should occur and prescribes the best course of action for the future
price index: A measure of the average price of a group of goods relative to a base year
primary source: Provides information collected by the data analyser
principle of parsimony: The principle that the simplest of two competing statistical processes is to be preferred
probability: The likelihood of an event occurring
probability distribution for a discrete random variable: The values of a discrete random variable with the corresponding probability of occurrence
probability sample: A sample where selection is based on known probabilities
process: The value-added transformation of inputs to outputs
process capability: The ability of a process to consistently meet specified customer expectations
processing elements: The hidden layer in multilayer perceptrons (MLPs)
pth-order autocorrelation: The correlation between values in a time series that are p periods apart
pth-order autoregressive model: A regression model to measure pth-order autocorrelation in a time series
p-value: The probability of getting a test statistic equal to or more extreme than the sample result if the null hypothesis is true
quadratic regression model: A multiple regression model with two independent variables, where the second independent variable is the square of the first independent variable
quadratic trend model: A non-linear forecast model where the second independent variable is the square of the first independent time-series variable
qualitative forecasting methods: Methods that are primarily based on the subjective opinion of the forecaster rather than the analysis of numerical data
quantile–quantile plot: A normal probability plot
quantitative forecasting methods: Methods that use time-series data in a mathematical process to forecast future values of the series
quartiles: Measures of relative standing that partition a data set into quarters
R chart: A control chart for the range
random error: An error that results from unpredictable variations
random experiment: A precisely described scenario that leads to an outcome that cannot be predicted with certainty
randomisation: A process used in an experiment to ensure selection bias is avoided
randomised block design: An experimental technique where data in groups are divided into fairly homogeneous subgroups called blocks to remove variability from random error
randomness and independence: Assumptions necessary in ANOVA to avoid bias
range: A distance measure of variation; the difference between maximum and minimum data values
ratio scale: A ranking where the differences between measurements involve a true zero point
recoded variable: A variable that has been assigned new values that replace the original ones
rectangular distribution: A continuous probability distribution where the values of the random variable have the same probability; also called the ‘uniform distribution’
region of non-rejection: The range of values of the test statistic where the null hypothesis cannot be rejected
region of rejection: The range of values of the test statistic where the null hypothesis is rejected; it is also called the ‘critical region’
regression analysis: A method for predicting the values of a numerical variable based upon the values of one or more other variables
regression coefficients: The calculated parameters in regression that specify the intercept and slope of the straight line defining the relationship between the independent and dependent variables
regression sum of squares (SSR): The degree of variation between X and Y variables that is explained by the defined regression relationship between the two variables. Specifically, the degree of variation in the Y variable that is accounted for by variation in the X variable(s)
relative frequency distribution: A summary table for numerical data which gives the proportion of data values in each class
relevant range: The range of values of the explanatory variable, which are themselves the only values relevant to predicting any value in regression
repeated measurements: Data collected from the same set of persons or items at different times
replicates: Sample sizes for particular combinations of two factors in two-way ANOVA
residual: Difference between the observed values and the corresponding values that are predicted by the regression model; they represent the variance that is not explained by the model
residual analysis: A graphical evaluation of the residuals from regression to test for violations of the assumptions of regression
resistant measures: Summary measures not influenced by extreme values
response variable: A dependent variable
return-to-risk ratio (RTRR): The expected monetary value of an action divided by its standard deviation
risk-averter’s curve: A utility curve that increases rapidly then levels off as dollar amounts increase
risk of Type II error (β): The chance that the null hypothesis will not be rejected when it is incorrect
risk-neutral curve: A utility curve where each additional dollar of profit has the same value
risk-seeker’s curve: A utility curve that increases more rapidly as dollar amounts increase
robust: A test or procedure that is not seriously affected by the breakdown of assumptions
sample: The portion of the population selected for analysis
sample coefficient of correlation: A coefficient of correlation calculated from sample data
sample covariance: A covariance calculated from sample data
sample mean: A mean calculated from sample data
sample proportion: The number of items that have some characteristic of interest divided by the size of the sample
sample space: A collection of all simple events of a random experiment
sample standard deviation: A standard deviation calculated from sample data
sample variance: A variance calculated from sample data
sampling distribution: The probability distribution of a given sample statistic with repeated sampling of the population
sampling distribution of the mean: The distribution of all possible sample means from a given population
sampling distribution of the proportion: The distribution of all possible sample proportions from samples of a certain size
sampling error: The difference in results for different samples of the same size
sampling with replacement: An item in the frame can be selected more than once
sampling without replacement: Each item in the frame can be selected only once
scatter diagram: A graphical representation of the relationship between two numerical variables; plotted points represent the given values of the independent variable and corresponding dependent variable
seasonal component: A factor that measures the regular seasonal change in a time series
second-order autocorrelation: Indicates there is a correlation between values two periods apart in a time series
second-order autoregressive model: A regression model to measure second-order autocorrelation in a time series
second quartile: Usually called the median; the middle value in an array that 50% of data values are smaller than, or equal to
secondary source: Provides data collected by another person or organisation
separate-variance t test: A test for the difference between two population means, used when the unknown population variances cannot be assumed to be equal
shape: The pattern of the distribution of data values
Shewhart–Deming cycle: An improvement process used by TQM: ‘plan, do, study, act’
side-by-side bar chart: A graphical representation of a cross-classification table
simple event: A single outcome of a random experiment
simple linear regression: A regression method using a single independent variable to predict values of the numerical dependent variable
simple price index: A percentage measure of the change in the price of a single item between two time periods
simple random sample: A sample where each item in the frame has an equal chance of being selected
single linkage: A measure of distance that bases the distance between clusters on the minimum distance between objects in one cluster and another cluster
Six Sigma management: An approach to process improvement with an emphasis on accountability and bottom-line results
skewed: Non-symmetrical distribution; where the distribution of data values above and below the mean differs
sparklines: A descriptive analytics method that summarises time-series data as small, compact graphs designed to appear as part of a table
special (or assignable) causes of variation: Large fluctuations or patterns in data that are not inherent to a process; these fluctuations reflect changes in the process
specification limits: Technical requirements based on customers’ needs and expectations
spread (dispersion): The amount of scattering of data values
square-root transformation: Uses the square root of the sample data to overcome breaches of the homoscedasticity or linearity assumptions in regression
standard deviation: A measure of variation based on squared deviations from the mean; closely related to the variance
standard deviation of a discrete random variable: A measure of variation, based on squared deviations from the mean; closely related to the variance
standard deviation of the sum of two random variables: A measure of variation; closely related to the variance
standard error: The square root of the expected squared difference between the random variable and its expected value
standard error of the estimate: The standard deviation of the Y predicted values in a regression around the line of best fit
standard error of the mean: Reflects how much the sample mean varies from its average value in repeated experiments
standard error of the proportion: The standard deviation of the sample proportion for repeated samples
standardised normal random variable: A normal random variable with a mean of 0 and a standard deviation of 1
state of statistical control: A process that is in control
statistic: A numerical measure that describes a characteristic of a sample
statistical independence: The occurrence of an event does not affect the occurrence of a second event
statistical packages: Computer programs designed to perform statistical analysis
statistics: A branch of mathematics concerned with the collection and analysis of data
stem-and-leaf display: A graphical representation of numerical data that partitions each data value into a stem portion and a leaf portion
stepwise regression: A model-building regression technique to find subsets of independent variables that most adequately predict a dependent variable given the specified criteria for adequacy of model fit
strata: Subpopulations composed of items with similar characteristics in a stratified sampling design
stratified sample: Items randomly selected from each of several populations or strata
stress statistic: A goodness-of-fit statistic used in multidimensional scaling
structured data: Data that follow an organised pattern
Studentised deleted residual: A statistical method of residual analysis using the t probability distribution that identifies individual cases in the sample data of a multiple regression that have high individual influence on the regression equation
Studentised range distribution: A probability distribution used for testing all differences between pairs of means
Student’s t distribution: A continuous probability distribution whose shape depends on the number of degrees of freedom
subgroup: A sample used in a control chart
subjective probability: The probability that reflects an individual’s belief that an event occurs
sum of squares (SS): The sum of the squared deviations
sum of squares between blocks (SSBL): That part of the within-group variation that is due to differences between the blocks
sum of squares between groups (SSB): That part of total variation that is due to differences between groups
sum of squares due to factor A (SSA): Variation due to factor A in two-way ANOVA
sum of squares due to factor B (SSBB): Variation due to factor B in two-way ANOVA
sum of squares due to interaction (SSAB): The interacting effect of specific combinations of factor A and factor B
sum of squares error (SSE) (or error sum of squares): The sum of squared differences between the values in each cell and the corresponding mean of that cell
sum of squares total (SST) (or total sum of squares): The total variation; the sum of squared differences between each value and the grand mean
sum of squares within groups (SSW): The sum of squared differences between each value and the mean of its own group
summary table: A table that summarises categorical or numerical data; it gives the frequency, proportion or percentage of data values in each category or class
symmetrical: Where the distribution of data values above and below the mean is identical
systematic sample: A method that involves selecting the first element randomly then choosing every kth element thereafter
table of random numbers: A table showing a list of numbers generated in a random sequence
tampering: Over-adjustment that increases variation in a process
test of independence: Tests for independence between the rows and columns of a contingency table
test statistic: A value derived from sample data that is used to determine whether the null hypothesis should be rejected or not
third (upper) quartile: The value that 75% of data values are smaller than or equal to
time series: A sequence of measurements taken at successive points in time
time-series forecasting methods: Statistical methods for forecasting future values of a variable based entirely on the past values of that variable
time-series plot: A graphical representation of the value of a numerical variable over time
total amount: The sum of values
total quality management (TQM): An approach to quality improvement that emphasises continuous improvement and the total system
total sum of squares (SST) (or sum of squares total): The total variation
total variation: The sum of the squared differences between each individual value and the grand mean
training data: A set of data used by neural networks to uncover a model that by some criterion best describes the patterns and relationships in the data
transformation formula: A Z-score formula used to convert any normal random variable to the standardised normal random variable
treatment effect: A variation due to group membership
treemaps: A descriptive analytics method that helps visualise two variables, one of which must be categorical
trend component: An overall long-term upward or downward movement in the values of a time series
t test for the correlation coefficient: A hypothesis test for the statistical significance of the correlation coefficient in regression using the t probability distribution
t test for the slope: A hypothesis test for the statistical significance of the regression slope b using the t probability distribution
t test of hypothesis for the mean: A test about the population mean that uses a t distribution
Tukey–Kramer multiple comparison procedure: A method of determining which of the group means are significantly different
Tukey procedure: A method of making pairwise comparisons between means
two-factor factorial design: Analysis of variance where two factors are simultaneously evaluated
two-tail test: A hypothesis test where the rejection region is divided into the two tails of the probability distribution
two-way ANOVA: An analysis of variance where two factors are simultaneously evaluated
Type I error: The rejection of a null hypothesis that is true and should not be rejected
Type II error: The non-rejection of a null hypothesis that is false and should be rejected
unbiased: If the average of all possible sample means equals the population mean then the sample mean is unbiased
unexplained variation: The error sum of squares
uniform distribution: A continuous probability distribution in which the values of the random variable have the same probability; also called the ‘rectangular distribution’
unstructured data: Data that have no repeated pattern
unweighted aggregate price index: A price index for a group of items where each item has an equal weight
upper control limit (UCL): The upper limit for a control chart, typically three standard deviations above the process mean
upper specification limit (USL): The largest value a CTQ can have to meet customer expectations
utility: A measure of the desirability of different outcomes for an individual decision maker
variables: Characteristics or attributes that can be expected to differ from one individual to another
variables control charts: Control charts for numerical variables
variance: A measure of variation based on squared deviations from the mean; closely related to the standard deviation
variance inflationary factor (VIF): A factor that measures the impact of collinearity among the Xs in a regression model by stating the degree to which collinearity among the predictors reduces the precision of an estimate
variance of a discrete random variable: A measure of variation, based on squared deviations from the mean; closely related to the standard deviation
variance of the sum of two random variables: A measure of variation; closely related to the standard deviation
variation: The spread, scattering or dispersion of data values
Venn diagram: The graphical representation of a sample space; joint events shown as ‘unions’ and ‘intersections’ of circles representing simple events
Ward’s minimum variance method: A measure of distance that bases the distance between clusters on the sum of squares over all variables between objects in one cluster and another cluster
weighted aggregate price index: A price index for a group of items where each item has a different weight based on volume of consumption
Wilcoxon rank sum test: A non-parametric test for testing the difference between two medians from independent
samples Wilcoxon signed ranks test A non-parametric test for testing the mean difference for paired samples within-group variation That part of total variation due to differences within individual groups – X chart A control chart for the process mean Y intercept Represents the mean value of Y when X equals zero in regression Z scores Measures of relative standing; number of standard deviations given data values are from the mean Z test for the difference between two means A test statistic used in hypothesis tests about the difference between means of two populations Z test for the difference between two proportions A test statistic used in hypothesis tests about the difference between the proportions of two populations Z test for the proportion A test statistic used for a test of the population proportion Z test of hypothesis for the mean A test about the population mean which uses the standard normal distribution Z test statistic A test statistic calculated by converting a sample statistic to a standard normal score Copyright © Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019— 9781488617249 — Berenson/Basic Business Statistics 5e www.freebookslides.com This page is intentionally left blank www.freebookslides.com index Page numbers in bold indicate definitions of key terms Page numbers in italics indicate figures tables 608–9, 611 3 tables 616 3 tables 625 a priori classical probability 148, 148–9 ABS (Australian Bureau of Statistics) 6, 8, 14, 24 accounting 8 adjusted R2 511, 537, 542–3 aggregate price indices 591, 593–4 All Australia Index 596 All Industries Index 597 alternative hypotheses (H1) 316–17, 317, 323–4, 332 analysis of variance (ANOVA), defined 402 see also one-way analysis of variance; randomised block design; two-way analysis of variance annual time-series data 545 ANOVA summary tables 405, 406–8, 408, 419, 430 Anscombe’s quartet 494 arithmetic mean 92, 92–3 artificial data 494 assumptions of regression (LINE) 473 ASX 200 Index 
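A few of the glossary definitions (variance, Z scores and the upper control limit) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names `z_scores` and `upper_control_limit` are hypothetical, the data values are made up, and the population (rather than sample) formulas are used for simplicity.

```python
from statistics import mean, pstdev, pvariance


def z_scores(values):
    """Return, for each value, its number of standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(x - mu) / sigma for x in values]


def upper_control_limit(process_mean, process_sd, k=3):
    """Upper limit for a control chart: k standard deviations above the process mean."""
    return process_mean + k * process_sd


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

# Variance is based on squared deviations from the mean;
# the standard deviation is its square root.
print(pvariance(data))                                 # 4
print(pstdev(data))                                    # 2.0
print(z_scores(data))                                  # first value is -1.5, last is 2.0
print(upper_control_limit(mean(data), pstdev(data)))   # 11.0
```

The default `k=3` follows the "typically three standard deviations" wording of the upper control limit entry; other multiples could be passed where a different convention applies.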
596–7 auditing 300, 300–7 Australian Bureau of Statistics (ABS) 6, 8, 14, 24 Australian Securities Exchange (ASX) 597 autocorrelation defined 477 first-order 570 measuring with Durbin–Watson statistic 477–80, 480, 503 pth-order 570 regression analysis 475, 477–80, 478–80 second-order 570 autoregressive modelling 570, 570–8, 573, 575–8, 600–1, 605–6 averages see moving averages bar charts 39, 39–40, 39–40, 79 base period 591 Bayes’ theorem 163, 163–6, 168–9, 173, 178 bell-shaped distribution 115, 122, 122, 213, 214 between-block variation 416, 416, 417, 417–18 between-group variation 402, 403, 416, 416, 417–18, 439 bias 25 big data 14, 14–15 bin ranges 83 binomial distribution binomial probabilities 191–5, 193–4 defined 189 example 190 formula 191, 204 mean 194, 205 normal approximation to 238–41, 240, 242 properties 189 standard deviation 194, 205 using statistical software 193, 193–4 binomial probabilities 191–5, 193–4 block effects 416, 416–22, 421 blocks 416 see also randomised block design box-and-whisker plots confidence interval estimation 289 defined 121 examples 121–2, 121–2 one-sample tests 338, 350 two-sample tests 364, 375 using Excel 137–8, 138 boxplots 121 brainstorming 14 bullet graphs 64, 64–5, 65, 66, 89–90 business analytics 62–7, 63 business forecasting 545 CAI (Computer Assisted Interview) 24 call monitoring 368 car exports 505 categorical data, tables and charts for 38–42, 73, 77–9 categorical variables 9–11, 10, 10, 55–7, 86–7, 527–8 causal forecasting methods 545 cell means plots 432–4, 432–4, 447 censuses 8, 26 Central Limit Theorem 214, 256, 256–8, 257, 266, 280, 322, 334, 337 central tendency 92, 92–9, 135–6 certain events 148 chartjunk 65 charts bar charts 39, 39–40, 39–40, 79 bullet graphs 64, 64–5, 65 for categorical data 38–42, 39–41 choosing an appropriate chart 73 misusing graphs 69–71, 69–71 for numerical data 46–53 pie charts 40, 40–2, 41, 79 side-by-side bar charts 56, 56–7, 56–7 using statistical software 79 see also scatter 
diagrams; time-series plots Chebyshev rule 116, 116–17, 137 chi-square analysis chi-square (x2) distribution 610 x2 test statistic 609–13, 612, 615–18, 622–33, 641 goodness-of-fit tests 627, 627–31, 635 key formulas 635 Marascuilo procedure 619, 619–20, 620, 635 test for differences between more than two proportions 615–20, 616, 618, 620, 635, 641 test for differences between two proportions 608–13, 610, 612–13, 635, 641 test for standard deviation 632–3, 632–3, 635 test for the variance 632–3, 632–3, 635 test of independence 622, 622, 622–6, 625, 641 using statistical software 612–13, 618, 620, 625, 633, 641 chi-square (x2) distribution 610 chunk samples 17 class boundaries 47 class mid-point 47, 83 class width 46, 46–7 classical multiplicative time-series model 546, 546–7, 585, 600 classical parametric procedures 337 classical statistical inference 214 cluster samples 17, 21 clusters 21 coefficient of correlation 125, 125–8, 138, 486, 497, 503 coefficient of determination 469, 469–71, 470, 497 coefficient of multiple determination 511, 511–12, 537, 542–3 coefficient of partial determination 523, 523–4, 537 coefficient of variation 105, 105–6, 131, 136–7 collectively exhaustive events 16, 47, 153, 215 collinearity 535, 535–6 combinations 170, 170–1, 173, 179, 190–1, 204 complement 150 completely randomised designs 402, 403, 416, 444–5 Computer Assisted Interview (CAI) techniques 24 computer technology 26 conditional probability 156, 156–61, 164, 173 confidence coefficient (1 a) 319 confidence interval estimate, defined 280, 485 confidence interval estimation applications in auditing 300–7 compared with prediction interval 492, 492 difference between the means of two independent populations 364–5, 391 difference between the means of two related populations 377, 391, 397 difference between two proportions 388, 391, 400 ethical issues 307 and hypothesis testing 327–8 for the mean (s known) 280–4, 281, 283, 308, 313 for the mean (s unknown) 285–6, 285–90, 288–9, 308, 
313 for the mean difference 377 for the mean of Y 489–91, 497 of the mean response 508, 508 one-sided 304, 304–5 for the population total 300–2, 314 for the proportion 291–3, 292, 305, 308, 314 of the slope 485, 497, 537 for total difference 314 using statistical software 288, 289 value of a population slope 518–19 see also difference estimation; sample size determination Copyright © Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019— 9781488617249 — Berenson/Basic Business Statistics 5e www.freebookslides.com I-2 INDEX confidence interval statements 287–8, 288 confidence level 319 consent, in research 350 Consumer Price Index (CPI) 595, 596 contingency tables tables 608–9, 611 3 tables 616 3 tables 625 chi-square analysis 608–9, 615 defined 55, 608 examples of 55–7 probability 150 using statistical software 86–7, 625 continuity corrections 238–9 continuous numerical variables 181 continuous probability density function 213, 213–14 continuous probability distributions 146, 213–14, 405 continuous random variables 213–14, 238–9 continuous variables 10 convenience sampling 17, 17 correlation coefficient 125, 125–8, 138, 486, 497, 503 counting rules 168–71, 173, 178–9 covariance 123, 123–5, 138, 185, 185–8, 204, 209 coverage errors 23, 25 CPI (Consumer Price Index) 595, 596 critical range factor A 436, 440 factor B 436, 440 for Marascuilo procedure 619 randomised block design 422–3, 439 for Tukey–Kramer procedure 409–10, 439 for Tukey procedure 438 critical region 318 critical value approach to hypothesis testing 322–5, 323 to one-tail tests 329–30, 330 t test for the mean (s unknown) 334–6, 335 Z test for the proportion 341–2, 341–2 critical values 283, 286, 318, 319, 335, 335, 380–2, 381–2, 391 CRM (customer-relationship management system) 26 cross-classification (contingency) tables 55, 150, 608, 608–9 cross-product term 528 cumulative percentage distributions 49, 49–50, 82, 85–6 cumulative percentage polygons (ogives) 52, 52–3, 53 cumulative 
standardised normal distribution 217, 217–18, 219, 223 curvilinear relationship 457, 457–8 customer-relationship management system (CRM) 26 cyclical component 546, 546–7 dashboards 63–6, 64, 64 data analysing 129, 350–1 big data 14, 14–15 categorical 9–11, 38–42 cleaning 15–16, 350–1 collecting 13–16, 350 defined discarding 350–1 formatting 15 independence of 420 interpreting 129 measuring 10–11 numerical 9–11, 43–53 randomness 350 sources of 13–14 stacked and unstacked data 79 structured 15 unstructured 15 Data Analysis Toolpak see Microsoft Excel data discovery 66 data mining 26 data point data snooping 350 Dax 30 Index 597 decision making, in hypothesis testing 320 decision trees 158, 158–9, 158–9, 165–6 deductive reasoning 281 degrees of freedom 285, 285–7, 286, 366, 391, 404 Delphi technique 14 dependent variables 456, 507–8, 508 descriptive analytics 62–7, 63, 88–90 descriptive statistics 8, 55, 109 see also numerical descriptive measures difference estimation 302, 302–4 diffusion indices 545 directional tests 330 discrete random variables 181–4, 185–7, 208 discrete variables 10 dispersion (spread) 99 distribution bell-shaped 115, 122, 122, 214 exponential 213, 214, 235, 235–7, 236, 242, 247, 247, 257, 257–8 hypergeometric distribution 200, 200–2 probability distribution for a discrete random variable 181, 181–4, 182, 208 shape 107, 107–9 skewed distribution 107, 107, 107–8, 109, 110, 120, 137 symmetrical 107, 107, 107–8, 109, 120 uniform 213, 214, 233, 233–4, 234, 256, 257, 258 see also binomial distribution; chi-square distribution; normal distribution; Poisson distribution; sampling distributions double exponential smoothing 555 Dow Jones Industrial Average 596 drill-down 66 dummy variables 525–31, 526, 527, 529–30, 532, 587 Durbin–Watson statistic 475, 479, 479–80, 480, 497, 503 econometric modelling 545 electronic formats 15 empirical classical probability 149 empirical rule 115, 115–16, 137 encoding 15 equal variance 473, 476, 476 error sum of squares 
(SSE) 467, 468, 468–9, 497, 581 see also sum of squares error (SSE) errors independence of 473 random error 402 residual 579–80 Type I errors 319, 319–21, 344 Type II errors 319, 319–21, 344, 345–6, 345–6 estimated relative efficiency (RE) 422, 439 ethical issues calculating probabilities 172 confidence interval estimation 307 hypothesis testing 349–51 misuse of graphs 70–1, 70–1 selective use of statistics 129–30 survey errors 25 using numerical descriptive measures 129–30 using regression analysis 493–4, 495, 496 events certain events 148 collectively exhaustive events 16, 47, 153, 215 complement of 150 defined 149 impossible events 148 independent events 159–61 joint events 150 mutually exclusive events 16, 47, 153, 215 rare events 60–1 sample spaces and 150 simple events 149 Excel see Microsoft Excel exchange rate 505 executive information systems 63 expected frequency 609, 611–13, 612–13, 615–19, 623–8, 630, 635 expected returns 187–8 expected value of a discrete random variable 182, 182–3, 204 of the sum of two random variables 186, 186–7, 204 explained variation 467 explanatory variables 457, 525 exponential distribution 213, 214, 235, 235–7, 236, 242, 247, 247, 257, 257–8 exponential growth, forecasting equations 586, 601 exponential smoothing 551, 551–3, 552, 600, 604 exponential trend model 558, 558–60, 560–1, 581, 582–3, 585–8, 588, 600–1, 605 exponentially weighted moving averages 551 extrapolation (regression analysis) 463, 493 extreme values (outliers) 106 F distribution 378, 380–1, 381, 391, 405 F test for block effects 419–20, 421 for the difference between two variances 378–83, 379, 381–2, 391, 399–400, 399–400 for differences between more than two means 402–8, 403, 405–6, 408 for factor A effect 428, 440 for factor B effect 428, 428–9, 440 for interaction effect 429, 440 one-way ANOVA F test 402–8, 403, 404, 405–6, 408, 439 overall F test 511–12, 512, 513, 537, 542–3 partial F test 520, 520–3, 537 for the slope 484, 484–5, 484–5, 497 Copyright © 
Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019— 9781488617249 — Berenson/Basic Business Statistics 5e www.freebookslides.com INDEX F test statistic 378, 378–80, 391, 419, 439, 512, 522–3 factor A 426–8, 436, 439, 440 factor B 426–8, 436, 439, 440 factorials 169, 173, 179 factors (variables) 402 finance, use of statistics in 8, 187–8, 209 financial indices 596–7 finite population correction factor 201 first differences 561–3, 605 first (lower) quartile 97, 100, 130 first-order autocorrelation 570 first-order autoregressive model 570–1, 571, 575, 577–8, 577–8, 581, 582–3, 600 fitted pth-order autoregressive equation 573, 601 five-number summary 120, 120–1, 137 focus groups 14 forecasting see time-series forecasting frame 17 frequencies see expected frequency; observed frequency frequency distributions approximating standard deviation from 131 approximating the mean from 131 constructing 46–8 defined 46 finding numerical descriptive measures from 118–19 relative frequency distributions 48–9 using statistical software 80–4, 81 frequency polygons 83 Friedman rank test 420 gauges 64, 64–6, 65, 88–9 Gaussian distribution 214 general addition rule 154, 154–5, 173 general multiplication rule 160, 160–1, 173 geometric mean 98, 98–9, 130 geometric mean rate of return 98–9, 131 goodness-of-fit tests 627, 627–31, 635 Gosset, William S. 
285 grand mean 403 graphs bullet graphs 64, 64–5, 65 misuse of 69–71, 69–71 groups 402 halo effect 25 highest-order autoregressive model 571–3, 575, 601 histograms 50, 50–1, 82–5 Holt–Winters method 567, 567–8, 569, 581, 582–3, 600 homogeneity of variance 411, 412, 445 homoscedasticity 473 households 108, 108–9 Human Development Index 456–64, 460–1, 468, 469, 471, 475–6, 483–6, 484, 489–92 hypergeometric distribution 200, 200–2, 205, 211 hypothesis testing comparing the means of two independent populations 359–68, 361–4, 367, 391, 396–8, 396–8 comparing the means of two related populations 371–7, 374–6, 391, 398–9, 398–9 comparing two population proportions 384–8, 385–6, 391, 400, 400 and confidence interval estimation 327–8 confidence level 319 confirmatory approach 359 critical value approach 322–5, 323, 329–30, 330, 334–6, 335, 341–2, 341–2 decision-making risks 319–21 defined 316 ethical issues 349–51 F test for difference between two variances 378–83, 379, 381–2, 399–400, 399–400 hypothesis-testing methodology 316–21 in multiple regression models 516–19, 517 null and alternative hypotheses 316–17, 323–5, 327, 332, 350 one-sample tests 316–57 p-value approach 325–7, 326, 331–2, 342–3 potential pitfalls 349–51 power of a statistical test 320, 320–1, 344–7, 344–8 regions of rejection and non-rejection 318, 318–19, 323, 324 selecting a test 352, 353 six-step method 323–4 t test of hypothesis for the mean 334, 334–7, 335–8 test statistic 318, 323–4 two-sample tests 359–400 using statistical software 326, 326, 331, 336–7, 336–8, 342, 356–7, 362–3, 367, 374, 376, 381, 386, 396–400, 397–400 Z test of hypothesis for the mean 322, 322–8, 323, 326 Z test of hypothesis for the proportion 340, 340–3, 341 see also chi-square analysis; F test; one-tail tests; t test; two-tail tests; Z test impossible events 148 independence in ANOVA 410, 420 chi-square test of 622, 622, 622–6, 625, 641 of data 420 of errors 473 in regression analysis 475 statistical 159, 159–60 independent 
events 159–61 independent populations 359–68, 361–4, 367, 378, 391, 396–8, 396–8 independent variables 456, 506–7, 520, 523–4, 528–9, 535–6, 537 index numbers 591, 591–7 inductive reasoning 281 inferential statistics 8, 26, 280–1 information technology 26 informed consent 350 insurance premiums 613 interaction effects 420, 422, 426, 426, 426–35, 430–5, 447 interaction terms 528 interactions (variables) 528, 528–31, 529–30, 532 interpolation (regression analysis) 463 interquartile range 100, 100–1, 105, 131, 136 interval estimates see confidence interval estimation interval scales 11, 11 I-3 intervals 280 investment services 14 irregular (random) component 546, 546–7 joint events 150 joint probability 152, 152–3 judgment samples 17, 17 kurtosis 109, 137 labour force participation rates, for females 546, 546 lagged predictor variables 545, 605 Laspeyres price index 594, 594–5, 596, 601 leading indicator analysis 545 leading questions 24 least-squares method 459–61, 460, 460–1, 555–63, 557, 581, 606 least-squares trend-fitting and forecasting 555–63, 557, 585–9, 588, 605 level of confidence 282, 282–4 level of significance (a) 319, 323–4, 350, 431 levels (factors) 402 Levene test 411, 445 LINE 473, 474 linear relationship 457, 457–8 linear trend models 555, 555–6, 557, 561–2, 581, 582–3, 600, 605 linearity 473, 474, 475 lotteries 172 lower quartile 97, 100, 130 lower-tail critical values 380–2, 381–2, 391 MAD (mean absolute deviation) 580, 601, 606 main effects 431, 432 management 8 Marascuilo procedure 619, 619–20, 620, 635 margin of error 295 marginal probability 151, 151–2, 173 market research 14 marketing 8 matched observations 371, 373 mathematical models 189 mean arithmetic mean 92, 92–3 of binomial distribution 194, 205 calculation of 92–3, 135 comparing the means of two independent populations 359–68, 361–4, 367, 391, 396–8, 396–8 comparing the means of two related populations 371–7, 374–6, 391, 398–9, 398–9 confidence interval estimation 280–90, 281, 283, 
285–6, 288–9, 308, 313, 364–5, 377, 391 defined 92 exponential distribution 236, 242 F test for differences between more than two means 402–8, 403, 405–6, 408 from a frequency distribution 118–19, 131 geometric mean 98, 98–9, 130 grand mean 403 of hypergeometric distribution 201, 205 versus median 110 normal distribution 216, 216, 219 one-tail test of hypothesis for 330, 330, 331–2 Copyright © Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019— 9781488617249 — Berenson/Basic Business Statistics 5e www.freebookslides.com I-4 INDEX mean (continued) related populations 371–7, 374–6 sample mean 93, 93–4 sample size determination 294–6, 296, 308, 314 sampling distribution of 249–58, 253, 256, 263 in shape of distribution 107–8 standard error of 251, 251–2, 252, 307 t test of hypothesis for 334, 334–7, 335–8, 353, 356 unbiased property of 249–51 uniform distribution 233, 234, 242 Z test of hypothesis for 322, 322–8, 323, 326, 353, 356 see also population mean mean absolute deviation (MAD) 580, 601, 606 mean difference 302 mean square between (MSB) 404 mean square between A (MSBA) 428 mean square between AB (MSBAB) 428 mean square between B (MSBB) 428 mean square between blocks (MSBL) 418, 418–19 mean square error (MSE) 418, 418–19 mean square total (MST) 404 mean square within (MSW) 404 mean squares 418–19, 439 mean values, estimating 489–92, 503 measurement error 24, 24–5 measurement, scales of 11–12, 11–12 see also numerical descriptive measures median 94, 94–6, 110, 130, 135 Microsoft Excel analysis of variance 408, 412, 430, 433–5, 444–7, 444–7 autoregressive modelling 575–7 bar charts 39–40, 79 basic probabilities 178 basics 30–1 Bayes’ theorem 178 binomial probabilities 193, 193–4, 209–10 box-and-whisker plots 137–8, 138, 289, 338, 375 bullet graphs 89–90 calculating coefficient of correlation 138 calculating covariance 138 calculating mean, median and mode 135, 135 calculating quartiles 136 calculating variation 136–7 Central Limit Theorem 266 
chi-square analysis 612–13, 618, 620, 625, 633, 641 coefficient of variation 136–7 collecting data 35 conditional probability 178 confidence interval estimates 288, 289, 292, 313–14, 492 contingency tables 86–7 counting rules 178–9 covariance of a probability distribution 209 creating charts 33 Data Analysis Toolpak 109, 109 defining classes, bins and mid-points 83–4 defining data 35 descriptive measures for a population 137 descriptive statistics 230 determining sample size 296 entering data 31–3, 32 evaluating normality 246, 246–7 exponential distribution 236, 236, 247, 247 F test for difference between two variances 381 frequency distributions 80–2, 81 gauges 88–9 histograms 50, 83–5 hypergeometric distribution 202, 202, 210, 211 hypothesis testing 326, 326, 331, 336–7, 336–8, 342, 356–7, 362–3, 367, 374, 376, 381, 386, 396–400, 396–400 Marascuilo procedure 620 multiple regression 507–8, 514–15, 515–16, 527, 527, 529–30, 532, 541–3, 542 normal distribution 246 normal probabilities 226, 227, 230, 246, 246–7 normal probability plots 232, 338 numerical descriptive measures for a population 137 ogives 85 one-sample t test 336, 336 opening and saving workbooks 31 ordered arrays 80 paired t tests 376 percentage and cumulative percentage polygons 85–6 pie charts 41, 79 Poisson distribution 198, 210 polygons 85 pooled t tests 362–3 printing workbooks 33–5, 34 probability distribution for a discrete variable 208 probability plots 289, 338 problems with early versions of 26 randomised block design 421, 446, 446 relative frequency 82 residual analysis 478–9, 514–15, 515–16 sample size determination 314 sampling distributions 82, 265, 265 scatter diagrams 59, 87, 87–8, 408, 459, 461 separate-variance t test 367 side-by-side bar charts 56, 87 side-by-side charts 87 simple linear regression 460–1, 461–2, 469, 472, 475, 483, 492, 492, 502, 502–3, 521 sparklines 88 stacked and unstacked data 79 standard deviation 136–7 stem-and-leaf displays 80 summary tables 77–8, 77–8 tables 
and charts 45 time-series forecasting 550, 552, 557–8, 569, 575–7, 582–3, 585, 588, 604, 604–6, 606 time-series plots 60, 88 Tukey–Kramer procedure 410 two-sample tests 338, 396–400, 396–400 types of sampling methods 36 using Excel on a Mac 35 using formulas in worksheets 33 variance 136–7 Z scores 136–7 Z test for difference between two proportions 386 Minitab (software) 26 missing values 16 mode 98, 135 Morningstar 14 moving averages 548, 548–51, 550, 604 MSB (mean square between) 404 MSBA (mean square between A) 428 MSBAB (mean square between AB) 428 MSBB (mean square between B) 428 MSBL (mean square between blocks) 418, 418–19 MSE (mean square error) 418, 418–19 MST (mean square total) 404 MSW (mean square within) 404 multiple comparisons 408, 408–10, 410, 422–3, 435–6, 445 multiple determination, coefficients of 511, 511–12, 537, 542–3 multiple regression models 505–36 coefficients of multiple determination 511, 511–12, 537, 542–3 coefficients of partial determination 523, 523–4, 537 collinearity 535, 535–6 confidence interval estimate for the slope 518–19, 537 defined 505 dummy variables 525–31, 526, 527, 529–30, 532 interactions 528, 528–31, 529–30, 532 interpreting the regression coefficient 505–7 with k independent variables 506, 524, 537 key formulas 537 overall F test 511–12, 512, 513, 537, 542–3 partial F test 520, 520–3 population regression coefficients 516–19, 517, 543 predicting the dependent variable Y 507–8, 508, 542 residual analysis for 514–15, 515–16, 543 testing for significance of overall model 512, 513 testing for the slope 516–18, 517, 537 testing portions of 520–4, 521–2, 543 with two independent variables 506–7, 507, 537 using statistical software 507–8, 527, 527, 529–30, 532, 541–3, 542 multiplication rule for independent events 161, 173 multiplication rules 160–1 multiplicative model 546, 546–7, 585, 600 mutually exclusive events 16, 47, 153, 215 NASDAQ Index 596 National Health Survey (2010–11) 24 negatively skewed distribution 107, 
107–8, 120 net regression coefficients 506 Newspoll 12 Nielsen 14 Nikkei Index 597 nominal scales 10, 10, 10–11 non-compliance, rate of 304–5 non-parametric procedures 337, 364, 375 non-probability sampling 17, 17, 17–18, 25 non-response bias 23 non-response errors 23, 25 Copyright © Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019— 9781488617249 — Berenson/Basic Business Statistics 5e www.freebookslides.com INDEX normal distribution approximating binomial distribution 238–41, 240, 242 bell-shaped curve of 213, 214 chi-square goodness-of-fit tests for 629–31 constructing normal probability plots 231, 232 cumulative standardised 217, 217–18, 219 defined 214 different normal distributions 216, 216, 219 evaluating normality 227, 229–31, 230, 232, 246, 246–7 example calculations 219–26, 220–2, 224–5 misapplication of in business 227 sampling distribution 256–7, 257 standardised normal distribution 285, 285 theoretical properties 214, 229–31 transformation formula 216, 216–19, 217–19, 224 using statistical software 226, 227, 246–7, 246–7 normal probability density function 215, 215–18, 242 normal probability plots 231, 232, 246, 289, 338 normality 227, 229–31, 230, 232, 246, 246–7, 410, 420, 473, 476 null hypothesis (H0) 316, 316–17, 323–5, 327, 332, 350 numerical data, tables and charts for 9–11, 43–53, 73, 79–86 numerical descriptive measures box-and-whisker plots 121, 121–2, 121–2 coefficient of variation 105, 105–6 ethical issues 129–30 five-number summary 120, 120–1 from a frequency distribution 118–19 measures of central tendency 92–9 objectivity in data analysis 129 for a population 113–17, 137 shape 92, 107, 107–9 variance and standard deviation 101–5 Z scores 106, 106–7 numerical variables 9–11, 10, 10, 59–61, 181 objectivity, in data analysis 129 observation 6 observational studies 14 observed frequency 609, 611, 623, 627–8 observed level of significance (p-value) 325 OECD (Organisation for Economic Co-operation and Development) 458 
ogives 52, 52–3, 53, 85 one-factor experiments 402 one-sample tests 316–57 flow chart for selecting 352 one-sample t test 336 potential pitfalls 349–51 power of a statistical test 320, 320–1, 344–7, 344–8 t test of hypothesis for the mean 337 using statistical software 356–7 Z test of hypothesis for the mean 337, 353 Z test of hypothesis for the proportion 340, 340–3, 341, 353 see also hypothesis testing; one-tail tests one-sided confidence interval 304, 304–5 one-tail tests choice of 350 comparing two population proportions 364–5 critical value approach 329–30, 330 defined 330 ethical considerations 350 F test for difference between two variances 382, 382–3 for the mean 330, 330, 331–2 p-value approach 331–2 power of a test 344, 344–5, 347, 347 t test 374 Z test for population mean 344, 344 one-way analysis of variance 402–13 assumptions 402, 410–11 between-group variation 402, 403, 439 calculating mean squares 404, 439 completely randomised design 402–13, 403, 405–6, 408, 410, 412 defined 402 example calculation 412–13 F test for differences between more than two means 402–8, 403, 405–6, 408 F test statistic 404, 404–5, 405, 439 key formulas 439 Levene test 411 summary tables 405, 406–8, 408, 419 total variation 403, 403, 403–4, 439 Tukey–Kramer procedure 408, 408–10, 410, 413, 435, 439, 445 using statistical software 406, 408, 412, 444–5, 445 within-group variation 402, 404, 439 see also randomised block design online surveys, rigging of 24 operational definition 6, 16 opinion polls 307 ordered arrays 43, 43–4, 80 ordinal scales 10–11, 11, 11 Organisation for Economic Co-operation and Development (OECD) 458 outliers 16, 106 overall F test 511–12, 512, 513, 542–3 p-value 325 p-value approach to hypothesis testing 325–7, 326 to one-tail tests 331–2 t test for the mean (s unknown) 336–7 to two-tail tests 325–6, 326 Z test for the proportion 342, 342–3 Paasche price index 595, 595–6, 601 paired observations 371 paired t test 372, 372–6, 374–6, 391, 398–9, 398–9 
parameters 8, parametric procedures 337 parsimony 581 partial determination, coefficient of 523, 523–4, 537 partial F test 520, 520–3, 537 partial regression coefficients 506 Pearson, Karl 227 percentage differences 561–3, 605 percentage distributions 48, 48–9, 82 percentage polygons 51, 52, 85–6 perfect negative correlation 125, 125 I-5 perfect positive correlation 125, 125 perishable inventory 63 permutations 170, 173, 179 PHStat see Microsoft Excel pie charts 40, 40–2, 41, 79 point estimate 280 Poisson distribution calculating probabilities 197–9 chi-square goodness-of-fit tests for 627–9 defined 196 formula 205 properties 196 relation to exponential distribution 235 using statistical software 198, 210 political polls 307 polls 12, 172, 307 polygons 51–3, 52–3, 83, 85–6 pooled-variance t test 360, 360–4, 361–4, 391, 396–7, 396–7 population comparing the means of two independent populations 359–68, 361–4, 367, 391, 396–8, 396–8 comparing the means of two related populations 371–7, 374–6, 391, 397, 398–9, 398–9 comparing two population proportions 384–8, 385–6, 391, 400, 400 defined estimating total amount 300, 300–2, 314 estimating unknown characteristics 280–2, 281 examples of 8–9 independent populations 359–68, 361–4, 367 numerical descriptive measures for 113–17, 137 related populations 371–7, 374–6 sampling from non-normally distributed populations 256–8, 257 sampling from normally distributed populations 252–6, 253 see also confidence interval estimation; proportions population mean calculation of 114, 137 defined 113 formula 114, 131, 263 power of a statistical test 344–7, 344–8 sampling distributions 250 Z test for 344, 344 population parameters 113, 332 population proportions 259–60, 297–9, 298 population regression coefficients 516–19, 517, 543 population standard deviation 114, 114–15, 131, 137, 250–1, 263, 333, 334 population variance 114, 114–15, 131, 137 portfolio expected return 187, 187–8, 204 portfolio risk 187, 187–8, 204 portfolios 187, 209 
positively skewed distribution 107, 107–8, 120 post-hoc comparison 408 poverty, measures of 110 power curve 347, 347 power of a statistical test 320, 320–1, 344–7, 344–8 practical significance 351 Copyright © Pearson Australia (a division of Pearson Australia Group Pty Ltd) 2019— 9781488617249 — Berenson/Basic Business Statistics 5e www.freebookslides.com I-6 INDEX prediction interval for an individual response Y 491, 491–2, 493, 497, 503, 508, 508 prediction line 459, 461–3, 497 predictions, in regression analysis 462–3, 503 predictive analytics 63 prescriptive analytics 63 price indices 591–7, 601 primary sources, of data 13, 13–14 prior probabilities 163 probability a priori classical probability 148, 148–9 basic concepts 148–55 Bayes’ theorem 163, 163–6, 168–9 conditional probability 156, 156–61, 164, 173 contingency tables 150 continuous probability distributions 146, 213–14 counting rules 168–71, 173, 178–9 defined 148 empirical classical approach 149 ethical issues 172 events 149–50 general addition rule 154, 154–5, 173 impossible events 148 joint probability 152, 152–3 marginal probability 151, 151–2, 173 of occurrence 148, 173 sample space 149 subjective 149 using statistical software 178–9 Venn diagrams 150, 150–1, 151 see also binomial distribution; normal distribution; Poisson distribution probability distribution, for a discrete random variable 181, 181–4, 182, 208 probability samples 17, 18 proportions calculating overall proportions 610, 610–11, 615–16, 616, 635 chi-square test for differences between more than two 615–20, 616, 618, 620, 635, 641 chi-square test for differences between two 608–13, 610, 612–13, 635, 641 comparing two population proportions 384– 7, 385–6, 391, 400, 400 confidence interval estimates for 291–3, 292, 305, 314, 388, 400 population proportions 259–60, 263 sample size determination 297–9, 298, 308, 314 standard error of 307 Z test for the difference between two 384–7, 385–6, 391, 400, 400 Z test of hypothesis for 340, 340–3, 
341, 352, 353, 357 pth-order autocorrelation 570 pth-order autoregressive forecasting equation 573–4, 601 pth-order autoregressive model 570–1, 571, 575, 581, 601 quadratic trend model 557, 557–8, 558–9, 562, 581, 582–3, 600, 605 qualitative forecasting methods 545 quantile-quantile plots 231 quantitative forecasting methods 545 quartiles 96, 96–8, 100, 100–1 questionnaires 24–5 quota samples 17 R2 (coefficient of multiple determination) 511, 511–12, 537 random component 546, 546–7 random error 402, 418, 427, 439 random experiments 149 random numbers tables 18, 18–20 randomisation 350 randomised block design 415–23 between-block variation 416, 416, 417, 417–18, 439 between-group variation 402, 403, 416, 416, 417–18, 439 block effects 419–20, 421, 439 compared with completely randomised design 416 critical range 422–3, 439 defined 416 estimated relative efficiency 422, 439 F test statistic 419–20, 421, 439 focus of analysis 416 mean squares 418–19, 439 partitioning the total variation 416 random error 416, 416, 418, 439 tests for the treatment and block effects 416, 416–22, 421 total variation 416, 416–17, 439 Tukey procedure 422, 422–3 using statistical software 421, 446, 446 within-group variation 402 randomness 410, 420 range calculation of 99–100, 136 characteristics of 105 defined 46, 99 formula 131 interquartile 100, 100–1, 105 relevant range 463 rare events 60–1 rate of non-compliance 304–5 ratio scales 11, 11 RE (estimated relative efficiency) 422, 439 real-time monitoring 63 recoded variables 16 rectangular distribution 213, 233, 256 region of non-rejection 318, 318, 323, 324 region of rejection 318, 318, 318–19, 323, 324 regression analysis defined 456 ethical issues 493–4, 495, 496 pitfalls 493–4, 495, 496 predictions 462–3 scatter diagrams for 456–8, 457, 459 types of regression models 456–8, 458–9 see also multiple regression models; simple linear regression regression coefficients defined 459 interpreting 505–7, 507, 541–2, 588 net regression 
coefficients 506 partial regression coefficients 506 population regression coefficients 516–19, 517 producing a prediction line 461–2 testing for significance 484, 512, 513, 517, 517 using dummy variables 525–31, 526, 527, 529–30, 532, 587 regression models 456–8, 458–9 regression sum of squares (SSR) 467, 468, 468, 497, 522, 523 related populations 371–7, 374–6 relative frequency distributions 48, 48–9, 82, 215, 215 relevant range 463 repeated measurements 371 replicates 426 research findings, reporting of 351 Reserve Bank of Australia 14 residual 474, 497 residual analysis defined 473 multiple regression 514–15, 515–16, 543 simple linear regression 473–6, 474–6, 493–4, 495, 496 time-series forecasting models 579, 580, 606 using statistical software 475, 478–9, 503 residual error, measuring magnitude of 579–80 residual plots to assess linearity 474, 475 to detect autocorrelation 477–8, 479 for five forecasting methods 581, 582 multiple regression 514–15, 515–16 simple linear regression 494, 495, 496 resistant measures 101 respondent error 25 response 6 response variables 457 risk of Type II error (β) 320 road fatalities, statistics on 70–1, 70–1, 113–15 robust tests 337, 364 rules, of counting 168–71 sample coefficient of correlation 126, 126–8, 127, 131 sample covariance 123, 123–5, 131 sample mean 93, 93–4, 118, 130 sample proportion 340 sample size 254–5, 256, 323–4, 351 sample size determination in business 295 for the mean 294–6, 296, 308, 314 for the proportion 297–9, 298, 308, 314 using statistical software 296, 298, 314 sample space 149, 150 sample standard deviation 102, 102–3, 131, 334, 337 sample statistic 316, 332 sample variance 102, 103–4, 131 samples cluster 17, 21 convenience 17, 17 defined 8, 17 examples of judgment 17, 17 non-probability 17, 17, 17–18 probability 17, 18 reasons for drawing 17 simple random 17, 18, 18–20 stratified 17, 21 systematic 17, 20, 20–1
sampling applications in auditing 300–7 Central Limit Theorem 214, 256, 256–8, 257 from finite populations 301, 307 from non-normally distributed populations 256–8, 257 from normally distributed populations 252–6, 253 with replacement 18 survey sampling methods 17–21 without replacement 18 sampling distributions defined 249 of the mean 249, 249–58, 253, 263 of the proportion 259–60, 260, 263 using statistical software 82, 265–6 sampling error 23, 25, 295 S&P 500 Index 596 S&P ASX 200 Index 596–7 SAS (software) 26 scales 11–12, 11–12 scatter diagrams defined 59, 456 example 59, 59 necessity of using 493–4 regression analysis 474–5, 493–4, 495, 496 regression models 456–8, 457, 459 sample coefficients of correlation 127 for two numerical variables 59, 59 using statistical software 87, 87–8 seasonal component 546, 546–7 seasonal data, time-series forecasting of 584–9, 585, 588, 600, 601, 606 second differences 561–3, 605 second-order autocorrelation 570 second-order autoregressive model 570–1, 571, 575, 575, 576, 577, 581, 600 second quartile 97 secondary sources, of data 13, 13–14 semi-structured data 15 separate-variance t test 365, 365–7, 367, 391, 397 shape 92, 107, 107–9, 136–7 shark attacks, statistics on 60–1 side-by-side bar charts 56, 56–7, 56–7, 87 significance level 319, 323–4, 350, 431 significance, statistical versus practical 351 simple events 149 simple linear regression 456–98 assumptions of 473, 473–6 calculating the slope 463–5, 497 calculating Y intercept 463–5, 497 coefficient of correlation 486, 497 coefficient of determination 469, 469–72, 470, 497 confidence interval estimate of the slope 485 confidence interval estimation for the mean of Y 489–91 defined 456 determining the equation 458–65, 459–62 Durbin–Watson statistic 475, 479, 479–80, 480, 497, 503 estimation of mean values 489–92 inferences about the slope 482–5, 483–5, 503 key formulas 497 least-squares
method 459–61, 460, 460–1 measures of variation 467–72, 468–70, 497 measuring autocorrelation 477–8, 478–9, 503 pitfalls 493–4, 495, 496 prediction interval for an individual response Y 491, 491–2, 493 prediction line 459 regression coefficients 459, 461–2, 484 regression models 456–7, 456–8, 497 relevant range 463 residual analysis 473, 473–6, 474–6 standard error of the estimate 471, 471–2, 497 total sum of squares (SST) 467, 467–9, 468–9, 497 using statistical software 460–1, 461–2, 469, 472, 475, 483, 492, 492, 502, 502–3, 521 simple price index 591, 591–3, 601 simple random samples 17, 18, 18–20 single data value skewed distribution 107, 107, 107–8, 109, 110, 137 slope calculating 497 confidence interval estimate of 485, 518–19, 537 F test for 484, 484–5, 484–5, 497 simple linear regression 463–5, 482–5, 483–5, 503 t test for 482, 482–3, 483, 497 testing for in multiple regression models 516–18, 517, 537 software, statistical see statistical software spam filters 168–9 sparklines 64, 64, 88 spread (dispersion) 99 SPSS/PASW Statistics (software) 26 SS (sum of squares) 101 SSA (sum of squares due to factor A) 426, 426–7 SSAB (sum of squares due to interaction) 427 SSB (sum of squares between groups) 403, 407, 416, 416 SSB (sum of squares due to factor B) 427 SSBL (sum of squares between blocks) 416, 416, 417, 417–18 SSE (error sum of squares, or sum of squares error) 418, 467, 468, 468–9, 497, 581 SSE Composite Index 597 SSR (regression sum of squares) 467, 468, 468, 497, 522, 523 SST (sum of squares total, or total sum of squares) 403, 407, 416, 416, 467, 467–9, 468–9, 497 SSW (sum of squares within groups) 404, 407, 416, 416 stacked data 79 standard deviation of binomial distribution 194, 205 calculation of 101–2 characteristics of 105 chi-square test for 632–3, 632–3, 635 defined 101 in determining sample size 295 of the difference 302 of a discrete random variable 183, 183–4, 204 of exponential distribution 242 from a frequency distribution 118–19, 131
of hypergeometric distribution 201–2, 202, 205 of normal distribution 216, 216, 217, 222 sample standard deviation 102, 102–3 of the sum of two random variables 186, 186–7, 204 of uniform distribution 234 using statistical software 136–7 see also population standard deviation standard error defined 109 of the estimate 471, 471–2, 497, 580 of the mean 251, 251–2, 252, 263, 307 of the proportion 260, 307 standardised normal distribution 285, 285 standardised normal probability density function 217–18, 242 standardised normal random variables 216 Stata (software) 26 statistic, defined 8, statistical independence 159, 159–60, 173 statistical inference 280–1 statistical packages 26 statistical significance 351 statistical software analysis of variance 406, 408, 412, 430, 433–5, 444–7, 445–7 autoregressive modelling 575–7 bar charts and pie charts 79 binomial distribution 193, 193–4 bullet graphs 89–90 chi-square analysis 612–13, 618, 620, 625, 633, 641 confidence interval estimates 288, 289, 292, 313–14 contingency tables 86–7 cumulative distributions 82 descriptive statistics 109, 109, 135–8 determining sample size 296, 298 frequency distributions 80–2, 81 gauges 88–9 histograms 82–4 hypergeometric distribution 202, 202 hypothesis testing 326, 326, 331, 336–7, 336–8, 342, 356–7, 362–3, 367, 374, 376, 381, 386, 396–400, 397–400 Marascuilo procedure 620 measures of central tendency 135–6 multiple regression 507–8, 527, 527, 529–30, 532, 541–3, 542 normal distribution 226, 227, 246–7, 246–7 one-sample tests of hypothesis 356–7 organising numerical data 79–86 percentage and cumulative percentage polygons 85–6 percentage distributions 82 Poisson distribution 198 probabilities 178–9 probability distribution for a discrete variable 208 randomised block design 421 relative frequency 82 residual analysis 478–9 sample size determination 314 sampling distributions 265–6 scatter diagrams 87, 87–8
simple linear regression 460–1, 461–2, 469, 472, 475, 483, 492, 492, 502, 502–3, 521 sparklines 88 stacked and unstacked data 79 summary tables 77–8, 77–8 time-series forecasting 550, 552, 557–8, 569, 575–7, 582–3, 585, 588, 604, 604–6, 606 time-series plots 88 two-sample tests 396–400, 397–400 variation and shape 136–7 see also Microsoft Excel; Minitab; SAS; SPSS/PASW; Stata statistics 6, 6–8, 26 Statistics New Zealand stem-and-leaf displays 43, 43–5, 80, 350 straight-line relationship 456, 456 strata 21 stratified samples 17, 21 structured data 15 Studentised range distribution 409, 436 Student’s t distribution 285, 285, 285–6 subjective probability 149 subjectivity, in interpretation 129 sum of squares (SS) 101 sum of squares between blocks (SSBL) 416, 416, 417, 417–18 sum of squares between groups (SSB) 403, 407, 416, 416 sum of squares due to factor A (SSA) 426, 426–7 sum of squares due to factor B (SSB) 427 sum of squares due to interaction (SSAB) 427 sum of squares error (SSE) 418 see also error sum of squares sum of squares total (SST) 403, 407, 416, 416, 467, 467–9, 468–9 see also total sum of squares sum of squares within groups (SSW) 404, 407, 416, 416 summary tables ANOVA summary tables 405, 406–8, 408, 419, 430 defined 38 examples of 38–9 using statistical software 77–8, 77–8 see also frequency distributions survey errors 23–5 survey sampling methods 17–21 surveys 14, 22–5 symmetrical distribution 107, 107, 107–8, 109, 120 syndicated services 14 systematic samples 17, 20, 20–1 t distribution, properties of 285, 285–6 t test checking assumptions 337, 337–8 choice of 352 as a classical parametric procedure 337 for the correlation coefficient 486 critical value approach 334–6, 335 highest-order autoregressive model 571–2, 572, 601 of hypothesis for the mean (σ unknown) 334, 334–7, 335–8, 353, 356 means of two independent populations (σ
unknown) 360–4, 361–4 p-value approach 336–7 paired t test 372, 372–6, 374–6, 398–9, 398–9 pooled-variance t test 360, 360–4, 361–4, 391, 396–7, 396–7 robustness of 337 separate-variance t test 365, 365–7, 367, 391, 397 for the slope 482, 482–3, 483, 497 t statistic and the F statistic 523, 537 test means of two related samples 372–6 tables for categorical data 38–42 choosing an appropriate chart 73 frequency distributions 46–8 for numerical data 43–50 of random numbers 18, 18–20 using statistical software for 77–8, 77–8 see also contingency tables; summary tables telephone polling 12 test statistic 318, 323–4, 334 tests goodness-of-fit tests 627, 627–31, 635 power of 344–7, 344–8 robust tests 337, 364 see also hypothesis testing; one-sample tests; one-tail tests; t test; two-sample tests; two-tail tests; Z test third-order autoregressive model 576 third (upper) quartile 97, 100, 130 tied observations 10 time-period forecasting 553, 600 time series 545 time-series forecasting 545–601 assumptions of 546, 599 autoregressive modelling 570, 570–8, 573, 575–8, 581, 582–3, 600–1, 605–6 in business 545 choosing an appropriate model 579–81, 580, 582–3, 606 classical multiplicative model 546, 546–7, 585, 600 defined 545 exponential smoothing 551, 551–3, 552, 600, 604 exponential trend model 558, 558–60, 560–1, 581, 582–3, 585–8, 588, 600–1, 605 five methods compared 581, 582–3, 606 forecasting time period 600 Holt–Winters method 567, 567–8, 569, 581, 582–3, 600 index numbers 591, 591–7 key formulas 600–1 least-squares method 555–63, 581, 582–3 least-squares trend-fitting and forecasting 555–63, 557, 585–9, 588, 605 linear trend model 555, 555–6, 557, 561–2, 581, 582–3, 600, 605 mean absolute deviation (MAD) 580 model selection 561–3, 571–2, 579–81, 580, 582–3, 605 moving averages 548, 548–51, 550 performing a residual analysis 579, 580 pitfalls 599 principle of parsimony 581 quadratic trend model 557, 557–8, 558–9, 562, 581, 582–3, 600, 605 as a quantitative method 545 of
seasonal data 584–9, 585, 588, 600, 606 smoothing the annual time series 547–53 using statistical software 550, 552, 557–8, 569, 575–7, 582–3, 585, 588, 604, 604–6, 606 time-series plots 59, 59–61, 60–1, 88, 548 total amount 300, 300–2, 314 total difference 303–4, 314 total sum of squares (SST) 403, 407, 416, 416, 467, 467–9, 468–9, 497 total variation 403, 403, 403–4, 416, 416–17, 426, 426, 439, 467 trade associations 14 transformation formula 216, 216–19, 217–19, 224 treatment effect 402, 404–6, 416, 416–22, 421 treemaps 65, 65, 65–6 trend 545–6, 546 trend component 546 trend-fitting see least-squares trend-fitting and forecasting triple exponential smoothing 555 Tukey–Kramer multiple comparison procedure 408, 408–10, 410, 413, 435, 439, 445 Tukey procedure 422, 422–3, 435–6 two-factor factorial design 425 two-sample tests 359–400 comparing the means of two independent populations 359–68, 361–4, 367 comparing the means of two related populations 371–7, 374–6 comparing two population proportions 384–7, 385–6 F test for difference between two variances 378–83, 379, 381–2 flow chart for selecting 380 using statistical software 396–400, 397–400 two-tail tests autoregressive modelling 573, 577 choice of 350 defined 322 difference between two means 361–2, 362 difference between two proportions 364–5 ethical considerations 350 F test for difference between two variances 379, 382 hypothesis for the proportion 342 p-value approach 325–6, 326, 342, 342–3 paired t test 372–3 t test of hypothesis for the mean 334–6, 335 two-way analysis of variance 425–36 critical range 438, 440 defined 425 F test for factor A effect 428, 440 F test for factor B effect 428, 428–9, 440 F test for interaction effect 429, 429–30, 440 factor A variation 426–8, 436, 439 factor B variation 426–8, 436, 439 interaction variation 439 interpreting interaction effects 432–5, 432–5 key formulas 439–40 mean squares 428, 439 random error 427, 439 summary tables 430 testing for factor and interaction 
effects 426, 426–32, 430–1 total variation 426, 426, 439 Tukey procedure 435–6 using statistical software 430, 433–5, 446–7, 446–7 Type I errors 319, 319–21, 344 Type II errors 319, 319–21, 344, 345–6, 345–6 unbiased sample mean 249, 249–51, 250 UNDP (United Nations Development Programme) 458 unethical practices see ethical issues unexplained variation 467 uniform distribution 213, 214, 233, 233–4, 234, 242, 256, 257, 258 uniform probability density function 233–4, 234, 242 United Nations Development Programme (UNDP) 458 United States Federal Census 26 unstacked data 79 unstructured data 15 unweighted aggregate price index 593, 593–4, 601 upper quartile 97, 100 values 6, 16 variables categorical 9–10, 10, 10, 55–7, 86–7, 527–8 continuous 10 defined dependent 456, 507–8, 508 discrete 10, 181–4 dummy variables 525–31, 526, 527, 529–30, 532, 587 explanatory 457 independent 456, 506–7, 520, 523–4, 528–9, 535–6, 537 lagged predictor variables 605 numerical 9–10, 10, 10 operational definition recoding 16 response variable 457 types of association 125, 125 variance analysis of variance (ANOVA) 402 calculation of 103–5 characteristics of 105 chi-square test for 632–3, 632–3, 635 defined 101 difference between two means, unequal variances 397–8 of discrete random variables 183, 183–4, 204 equal variance 476, 476 exponential distribution 236, 242 F test for difference between two variances 378–83, 379, 381–2, 399–400, 399–400 from a frequency distribution 118 homogeneity of 411, 412, 445 population variance 114, 114–15 sample variance 102, 103–4 of the sum of two random variables 186, 186–7, 204 uniform distribution 233, 242 using statistical software 136–7, 399–400, 399–400 see also one-way analysis of variance; two-way analysis of variance variance inflationary factor (VIF) 535, 535–6, 537 variation
calculating with Excel 136–7 coefficient of 105, 105–6 defined 92 explained variation 467 measures of 99–105, 467–72, 468–70, 497, 502–3 partitioning the total variation 403, 426 total variation 403, 403, 403–4, 416, 416–17, 426, 426, 439, 467 unexplained 467 Venn diagrams 150, 150–1, 151 VIF (variance inflationary factor) 535, 535–6, 537 wage price index 505 WaldoLands 63, 64, 65, 65 weighted aggregate price indices 594, 594–6 Wilcoxon rank sum test 364 within-group variation 402, 404, 439 X bar (X̄) 93, 252–3, 255–6, 257, 263 X values 223–6, 224–5, 242 Y intercept 457, 460–1, 461, 463–5, 497 Z scores 106, 106–7, 131, 136–7, 214, 216, 231 Z test for the difference between two means 359, 391 for the difference between two proportions 384, 384–7, 385–6, 391, 400, 400, 612–13 for the difference between two related populations 371–7, 374–6 of hypothesis for the mean 322, 322–8, 323, 326, 337, 353, 356 for the mean difference 371–2, 391 for the population mean 344, 344 for the proportion 340, 340–3, 341, 342, 352, 353, 357 versus t test 337 Z test statistic 322, 323–4, 325 Z values 224–5, 224–6, 231, 242, 263