
Statistics: The Art and Science of Learning from Data


Statistics

The Art and Science of Learning from Data


Boston Columbus Indianapolis New York San Francisco Upper Saddle River

Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montréal Toronto

Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo


Senior Development Editor: Elaine Page

Executive Content Editor: Christine O’Brien

Senior Content Editor: Chere Bemelmans

Associate Content Editor: Dana Bettez

Editorial Assistant: Sonia Ashraf

Senior Managing Editor: Karen Wernholm

Senior Production Project Manager: Beth Houston

Digital Assets Manager: Marianne Groth

Supplements Production Coordinator: Katherine Roz

Manager, Multimedia Production: Christine Stavrou

Executive Marketing Manager: Roxanne McCarley

Marketing Manager: Erin Lane

Marketing Coordinator: Kathleen DeChavez

Rights and Permissions Advisor: Michael Joyce

Image Manager: Rachel Youdelman

Senior Manufacturing Buyer: Debbie Rossi

Senior Media Buyer: Ginny Michaud

Design Manager: Andrea Nix

Senior Designer: Heather Scott

Text Design: Ellen Pettengell

Production Coordination, Composition, and Illustrations: Integra

Cover Image: Robyn Mackenzie/iStockphoto

For permission to use copyrighted material, grateful acknowledgment is made to the copyright holders on page P-1, which is hereby made part of this copyright page.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Pearson Education was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publications Data

Copyright © 2013, 2009, 2007 Pearson Education, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. For information on obtaining permission for use of material in this work, please submit a written request to Pearson Education, Inc., Rights and Contracts Department, 501 Boylston Street, Suite 900, Boston, MA 02116, fax your request to 617-671-3447, or e-mail at http://www.pearsoned.com/legal/permissions.htm.

1 2 3 4 5 6 7 8 9 10—Quad—15 14 13 12 11

ISBN-10: 0-321-75594-4 ISBN-13: 978-0-321-75594-0 www.pearsonhighered.com

Chris Franklin

4. Long-run probability demonstrations:
   a. Simulating the probability of rolling a 6
   b. Simulating the probability of rolling a 3 or 4
   c. Simulating the probability of head with a fair coin
   d. Simulating the probability of head with an unfair coin (P(H) = 0.2)
   e. Simulating the probability of head with an unfair coin (P(H) = 0.8)
   f. Simulating the stock market

5. Mean versus median

6. Standard deviation

7. Confidence intervals for a proportion

8. Confidence intervals for a mean (for studying the impact of confidence level and the impact of not knowing the standard deviation)

9. Hypothesis tests for a proportion

10. Hypothesis tests for a mean
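As a rough illustration of what the long-run probability demonstrations listed above show, the following Python sketch tracks the relative frequency of an event as the number of simulated trials grows. It is not the applets' code: the function name running_relative_frequency, the seed, and the trial counts are assumptions chosen only for illustration.

```python
import random

def running_relative_frequency(event, draw, n_trials, seed=0):
    """Return the relative frequency of `event` after each of n_trials draws."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    freqs = []
    for i in range(1, n_trials + 1):
        if event(draw(rng)):
            hits += 1
        freqs.append(hits / i)         # proportion of trials so far with the event
    return freqs

# Rolling a fair die; the event is "a 6 comes up" (true probability 1/6).
die = running_relative_frequency(
    lambda face: face == 6, lambda rng: rng.randint(1, 6), 100_000)

# Unfair coin with P(H) = 0.2; the event is "head".
coin = running_relative_frequency(
    lambda head: head, lambda rng: rng.random() < 0.2, 100_000)

for n in (10, 100, 1_000, 100_000):
    print(f"n={n:>6}  P(6) ~ {die[n - 1]:.4f}   P(H) ~ {coin[n - 1]:.4f}")
```

With few trials the relative frequencies can be far from 1/6 and 0.2; as the number of trials increases they settle close to those values, which is the pattern the demonstrations are meant to make visible.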

Contents

Preface xi

Part One Gathering and Exploring Data

Chapter 1 Statistics: The Art and Science of Learning from Data 2
1.1 Using Data to Answer Statistical Questions 4
Chapter Summary 20
Chapter Problems 20

Chapter 2 Exploring Data with Graphs and Numerical Summaries 23
2.1 Different Types of Data 24
2.3 Measuring the Center of Quantitative Data 47
2.4 Measuring the Variability of Quantitative Data 56
2.5 Using Measures of Position to Describe

Chapter 3 Association: Contingency, Correlation, and Regression 89
3.1 The Association Between Two Categorical Variables 91
3.2 The Association Between Two Quantitative Variables 98
3.3 Predicting the Outcome of a Variable 111
3.4 Cautions in Analyzing Associations 124
Chapter Summary 141
Chapter Problems 141

Chapter 4 Gathering Data 149
4.1 Experimental and Observational Studies 151
Nonexperimental Studies 177
Chapter Summary 189
Chapter Problems 189

Part Review 1 198
Part 1 Questions 198
Part 1 Exercises 202

5.3 Conditional Probability: The Probability of A Given B 230
5.4 Applying the Probability Rules 242
Chapter Summary 255
Chapter Problems 256

6.2 Probabilities for Bell-Shaped Distributions 276

6.3 Probabilities When Each Observation Has Two Possible

Chapter Summary 298

Chapter Problems 299

7.1 How Sample Proportions Vary Around the Population

7.3 The Binomial Distribution Is a Sampling Distribution (Optional) 329

Chapter Summary 332

Chapter Problems 333

Part Review 2 338

Part 2 Questions 338

Part 2 Exercises 342

Part Three Inferential Statistics

Chapter 8 Statistical Inference: Confidence
Chapter Summary 392
Chapter Problems 392

Chapter 9 Statistical Inference: Significance Tests About Hypotheses 400
9.1 Steps for Performing a Significance Test 402
9.2 Significance Tests About Proportions 406
9.4 Decisions and Types of Errors in Significance Tests 435
9.5 Limitations of Significance Tests 440
9.6 The Likelihood of a Type II Error (Not Rejecting H0, Even Though It’s False) 447
Chapter Summary 453
Chapter Problems 454

Chapter 10 Comparing Two Groups 460
10.5 Adjusting for the Effects of Other Variables 508
Chapter Summary 513
Chapter Problems 515

Part Review 3 524
Part 3 Questions 524
Part 3 Exercises 529

Part Four Analyzing Association and Extended Statistical Methods

Chapter 11 Analyzing the Association Between Categorical Variables 536
11.2 Testing Categorical Variables for Independence 542
11.3 Determining the Strength of the Association 556
11.4 Using Residuals to Reveal the Pattern of Association 563
11.5 Small Sample Sizes: Fisher’s Exact
Chapter Summary 571
Chapter Problems 571

Chapter 12 Analyzing the Association Between Quantitative Variables: Regression Analysis 576
12.2 Describe Strength of Association 586
12.3 Make Inferences About the Association 599
12.4 How the Data Vary Around the Regression Line 605
12.5 Exponential Regression: A Model for Nonlinearity 615
Chapter Summary 622
Chapter Problems 623

Chapter 13 Multiple Regression 629
13.1 Using Several Variables to Predict a Response 631
13.2 Extending the Correlation and R² for Multiple
13.3 Using Multiple Regression to Make Inferences 642
13.4 Checking a Regression Model Using Residual Plots 652
13.5 Regression and Categorical Predictors 658
Chapter Summary 673
Chapter Problems 674

Chapter 14 Comparing Groups: Analysis of Variance Methods 679
14.2 Estimating Differences in Groups for a Single Factor 691
Chapter Summary 714
Chapter Problems 714

Chapter 15 Nonparametric Statistics 720
and for Matched Pairs 733
Chapter Summary 744
Chapter Problems 745

Part Review 4 748
Part 4 Questions 748
Part 4 Exercises 753

Tables A-1
Answers A-7
Index I-1
Index of Applications I-9
Photo Credits P-1


An Introduction to the Applets

The applets on the CD-ROM that is bound inside all new copies of this text are designed to help students understand a wide range of introductory statistics topics.

• The sample from a population applet lets the user select samples of various sizes from a wide range of population shapes including uniform, bell-shaped, skewed, and binary populations (including a range of values for the population proportion, p). In addition, one can alter any of the default populations to create a custom distribution by dragging the mouse over the population or by going to Custom binary and typing in the desired population proportion. Small samples are drawn in an animated fashion to help students understand the basic idea of sampling. Larger samples are drawn in an unanimated fashion so that characteristics of larger samples can be quickly compared to population characteristics.

• The sampling distributions applet builds off the previous applet by adding the values of user-selected statistics for each sample. Students can study the resulting sampling distribution and see how characteristics of the sampling distribution, such as center and spread, are affected by sample size and population shape. Students can also compare sampling distributions of different statistics such as the sample mean and the sample median.

• The random numbers applet lets students select a random sample from a range of user-defined integer values. Students can use the applet to study basic probability by considering the relative frequency of particular outcomes among the samples. They can also select samples from a list of values for a hands-on sampling activity.

• Six long-run probability demonstration applets simulate rolling a die, flipping a coin, and fluctuation of the stock market. Students can select the number of times a simulation occurs, and whether they would like it animated. The relative frequency of an event of interest is plotted versus the number of simulations. As the number of simulations increases, the convergence of the relative frequency to the true probability of the event will be evident.

• The mean versus median applet lets students construct a data set interactively by clicking on a graphic that displays the mean and median of the data. Using the applet lets students study the effects of shape and outliers on the mean and the median.

• The standard deviation applet provides a similar type of exploration. This applet is offered in a stacked form so that data sets with different standard deviations can be compared easily.

• Three applets help students better understand confidence intervals. The confidence intervals for a proportion applet lets students simulate 95% and 99% confidence intervals for a population proportion and gain an understanding of how to interpret a 95% and 99% confidence level. The confidence intervals are plotted, illustrating their relationship in terms of width and their random nature. The sample size and the true underlying proportion are specified by the user. Two applets let students study confidence intervals for a mean in a similar manner. The first can be used to show how sample size and distributional shape affect the performance of classic t intervals for the mean. The second lets students compare the performance of z and t intervals for different distributional shapes and sample sizes.

• The applets for hypothesis tests for a proportion and hypothesis tests for a mean help students understand how the underlying assumptions affect the performance of hypothesis tests. These applets plot test statistics and corresponding P-values for data generated under different user-supplied conditions. Tabled rejection proportions allow students to determine how the conditions specified affect the true level of significance (Type I error probability) for the tests. The concepts of power and Type II error can also be explored with these applets.

• The correlation by eye applet helps students guess the value of the correlation coefficient based on a scatterplot of simulated data. In addition, students can see how adding and deleting points affects the correlation coefficient. Likewise, the regression by eye applet lets students attempt to determine the regression line interactively for simulated data.

• The binomial distribution applet generates samples from the binomial distribution at user-specified parameter values. By varying the parameters, students can develop an understanding of how these parameters affect the binomial distribution.
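The behavior shown by the confidence-interval applets can be approximated with a short simulation. The Python sketch below is not the applets' implementation. It assumes the usual large-sample interval for a proportion (sample proportion plus or minus z times the estimated standard error), and the values p = 0.5, n = 400, and 2,000 simulated samples are illustrative choices, as is the function name coverage.

```python
import math
import random

# Standard normal critical values for the two confidence levels.
Z = {0.95: 1.96, 0.99: 2.576}

def coverage(p, n, level, n_intervals=2_000, seed=1):
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)
    z = Z[level]
    covered = 0
    for _ in range(n_intervals):
        # Draw one sample of size n from a binary population with proportion p.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            covered += 1
    return covered / n_intervals

cov95 = coverage(p=0.5, n=400, level=0.95)
cov99 = coverage(p=0.5, n=400, level=0.99)
print(f"95% intervals containing p: {cov95:.3f}")
print(f"99% intervals containing p: {cov99:.3f}")
```

Because both calls use the same seed, they evaluate the same simulated samples; the 99% intervals are wider, so they contain the true proportion at least as often as the 95% intervals. Changing p or n in the calls gives a feel for how the sample size and the underlying proportion affect the intervals.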


Preface

We have each taught introductory statistics for more than 30 years, and we have witnessed the welcome evolution from the traditional formula-driven mathematical statistics course to a concept-driven approach. This concept-driven approach places more emphasis on why statistics is important in the real world and places less emphasis on probability. One of our goals in writing this book was to help make the conceptual approach more interesting and more readily accessible to college students. At the end of the course, we want students to look back at their statistics course and realize that they learned practical concepts that will serve them well for the rest of their lives.

We also want students to come to appreciate that in practice, assumptions are not perfectly satisfied, models are not exactly correct, distributions are not exactly normally distributed, and all sorts of factors should be considered in conducting a statistical analysis. The title of our book reflects the experience of data analysts, who soon realize that statistics is an art as well as a science.

What’s New in This Edition

Our goal in writing the third edition of our textbook was to improve the student and instructor user experience. We have:

• Clarified terminology and streamlined writing throughout the text to improve ease of reading and facilitate comprehension.

• Modified the design to clearly show pedagogical hierarchy and distinguishing features.

• Added concept tags to all examples, which makes it easy for students and instructors to identify what is being demonstrated in the example.

• Added margin Caution boxes to alert students to areas where they need to pay special attention, such as common mistakes to avoid.

• Updated or replaced at least 25 percent of the exercises and examples. In addition, we have updated all General Social Survey (GSS) data with the most current data available.

• Significantly rewritten Chapter 7: Sampling Distribution. In this chapter we now emphasize simulation to develop the concepts of sampling distributions, with less emphasis on the more traditional mathematical approach. We have reorganized the chapter to better distinguish a population, data, and sampling distribution. We now introduce standard error terminology in Chapter 8, where in practice we use the sample proportion and sample standard deviation to estimate the standard deviation of a sampling distribution. We believe this will result in less confusion for the student and emphasize that in practice, when we use the term standard error, we most often are referencing the estimated standard deviation of a sampling distribution, not the theoretical standard deviation.

• Added Learning Objectives for each chapter to the Instructor’s Edition, which helps when preparing lectures.
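The distinction drawn above between the theoretical standard deviation of a sampling distribution and the standard error estimated from a single sample can be made concrete with a small simulation. The following Python sketch is an illustration under assumed values (population proportion 0.3, sample size 100, 5,000 simulated samples) and is not taken from the text; the function name simulate_sample_proportions is hypothetical.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, n_samples=5_000, seed=2):
    """Simulate the sampling distribution of the sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n
            for _ in range(n_samples)]

p, n = 0.3, 100
p_hats = simulate_sample_proportions(p, n)

# Spread of the simulated sampling distribution.
simulated_sd = statistics.pstdev(p_hats)

# Theoretical standard deviation, which requires the true proportion p.
theoretical_sd = math.sqrt(p * (1 - p) / n)

# Standard error: the same formula with one sample's proportion in place of p.
one_p_hat = p_hats[0]
standard_error = math.sqrt(one_p_hat * (1 - one_p_hat) / n)

print(f"simulated SD of sample proportions: {simulated_sd:.4f}")
print(f"theoretical SD:                     {theoretical_sd:.4f}")
print(f"standard error from one sample:     {standard_error:.4f}")
```

The first two quantities describe the sampling distribution itself, while the third is what an analyst can actually compute when only one sample is available and p is unknown.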


• Emphasize statistical literacy and develop statistical thinking

• Use real data

• Stress conceptual understanding rather than mere knowledge of procedures

• Foster active learning in the classroom

• Use technology for developing concepts and analyzing data

• Use assessment to evaluate and improve student learning

We wholeheartedly endorse these recommendations, and our textbook takes every opportunity to support these guidelines.

Ask and Answer Interesting Questions

In presenting concepts and methods, we encourage students to think about the data and the appropriate analyses by posing questions. Our approach, learning by framing questions, is carried out in various ways, including (1) presenting a structured approach to examples that separates the question and the analysis from the scenario presented, (2) providing homework problems that encourage students to think and write, and (3) asking questions in the figure captions that are answered in the Chapter Review.

Present Concepts Clearly

Students have told us that this book is more “readable” and interesting than other introductory statistics texts because of the wide variety of intriguing real data examples and exercises. We have simplified our prose wherever possible, without sacrificing any of the accuracy that instructors expect in a textbook.

A serious source of confusion for students is the multitude of inference methods that derive from the many combinations of confidence intervals and tests, means and proportions, large sample and small sample, variance known and unknown, two-sided and one-sided inference, independent and dependent samples, and so on. We emphasize the most important cases for practical application of inference: large sample, variance unknown, two-sided inference, and independent samples. The many other cases are also covered (except for known variances), but more briefly, with the exercises focusing mainly on the way inference is commonly conducted in practice.

Connect Statistics to the Real World

We believe it’s important for students to be comfortable with analyzing a balance of both quantitative and categorical data so students can work with the data they most often see in the world around them. Every day in the media, we see and hear percentages and rates used to summarize results of opinion polls, outcomes of medical studies, and economic reports. As a result, we have increased the attention paid to the analysis of proportions. For example, we use contingency tables early in the text to illustrate the concept of association between two categorical variables and to show the potential influence of a lurking variable.

Organization of the Book

The statistical investigative process has the following components: (1) asking a statistical question; (2) designing an appropriate study to collect data; (3) analyzing the data; and (4) interpreting the data and making conclusions to answer the statistical questions. With this in mind, the book is organized into four parts.

Part 1 focuses on gathering and exploring data. This equates to components 1, 2, and 3, when the data is analyzed descriptively (both for one variable and the association between two variables).

Part 2 covers probability, probability distributions, and the sampling distribution. This equates to component 3, when the student learns the underlying probability necessary to make the step from analyzing the data descriptively to analyzing the data inferentially (for example, understanding sampling distributions to develop the concept of a margin of error and a P-value).

Part 3 covers inferential statistics. This equates to components 3 and 4 of the statistical investigative process. The students learn how to form confidence intervals and conduct significance tests and then make appropriate conclusions answering the statistical question of interest.

Part 4 covers analyzing associations (inferentially) and looks at extended statistical methods.

The chapters are written in such a way that instructors can teach out of order. For example, after Chapter 1, an instructor could easily teach Chapter 4, Chapter 2, and Chapter 3. Alternatively, an instructor may teach Chapters 5, 6, and 7 after Chapters 1 and 4.

Features of the Third Edition

Promoting Student Learning

To motivate students to think about the material, ask appropriate questions, and develop good problem-solving skills, we have created special features that distinguish this text.

Student Support

To draw students to important material we highlight key definitions, guidelines, procedures, “In Practice” remarks, and other summaries in boxes throughout the text. In addition, we have four types of margin notes:

• In Words: This feature explains, in plain language, the definitions and symbolic notation found in the body of the text (which, for technical accuracy, must be more formal).

• Caution: These margin boxes alert students to areas to which they need to pay special attention, particularly where they are prone to make mistakes or incorrect assumptions.

• Recall: As the student progresses through the book, concepts are presented that depend on information learned in previous chapters. The Recall margin boxes direct the reader back to a previous presentation in the text to review and reinforce concepts and methods already covered.

• Did You Know: These margin boxes provide information that helps with the contextual understanding of the statistical question under consideration.

Graphical Approach

Because many students are visual learners, we have taken extra care to make the text figures informative. We’ve annotated many of the figures with labels that clearly identify the noteworthy aspects of the illustration. Further, most figure captions include a question (answered in the Chapter Review) designed to challenge the student to interpret and think about the information being communicated by the graphic. The graphics also feature a pedagogical use of color to help students recognize patterns and distinguish between statistics and parameters. The use of color is explained in the very front of the book for easy reference.

Hands-On Activities and Simulations

Chapters 1 through 12 include at least one activity each. The instructor can elect to carry out the activities in class, outside of class, or a combination of both. The activity often involves simulation, commonly using an applet available on the companion CD-ROM and within MyStatLab™. These hands-on activities and simulations encourage students to learn by doing.

Connection to History: On the Shoulders of

We believe that knowledge pertaining to the evolution and history of the statistics discipline is relevant to understanding the methods we use for designing studies and analyzing data. Throughout the text, several chapters feature a spotlight on people who have made major contributions to the statistics discipline. These spotlights are titled On the Shoulders of

Real World Connections

Chapter-Opening Example

Each chapter begins with a high-interest example that raises key questions and establishes themes that are woven throughout the chapter. Illustrated with engaging photographs, this example is designed to grab students’ attention and draw them into the chapter. The issues discussed in the chapter’s opening example are referred to and revisited in examples within the chapter. All chapter-opening examples use real data from a variety of applications.

Statistics: In Practice

We realize that there is a difference between proper “academic” statistics and what is actually done in practice. Data analysis in practice is an art as well as a science. Although statistical theory has foundations based on precise assumptions and conditions, in practice the real world is not so simple. In Practice boxes and text references alert students to the way statisticians actually analyze data in practice. These comments are based on our extensive consulting experience and research and by observing what well-trained statisticians do in practice.

Exercises and Examples

Innovative Example Format

Recognizing that the worked examples are the major vehicle for engaging and teaching students, we have developed a unique structure to help students learn to model the question-posing and investigative thought process required to examine issues intelligently using statistics. The five components are as follows:

• Picture the Scenario presents background information so students can visualize the situation. This step places the data to be investigated in context and often provides a link to previous examples.

• Questions to Explore reference the information from the scenario and pose questions to help students focus on what is to be learned from the example and what types of questions are useful to ask about the data.

• Think It Through is the heart of each example. Here, the questions posed are investigated and answered using appropriate statistical methods. Each solution is clearly matched to the question so students can easily find the response to each Question to Explore.

• Insight clarifies the central ideas investigated in the example and places them in a broader context that often states the conclusions in less technical terms. Many of the Insights also provide connections between seemingly disparate topics in the text by referring to concepts learned previously and/or foreshadowing techniques and ideas to come.

• Try Exercise: Each example concludes by directing students to an end-of-section exercise that allows immediate practice of the concept or technique within the example.

Concept tags are included with each example so that students can easily identify the concept demonstrated in the example.

Relevant and Engaging Exercises

The text contains a strong emphasis on real data in both the examples and exercises. We have updated the exercise sets in the third edition to ensure that students have ample opportunity to practice techniques and apply the concepts. Nearly all of the chapters contain more than 100 exercises, and more than 25 percent of the exercises are new to this edition or have been updated with current data. These exercises are realistic and ask students to provide interpretations of the data or scenario rather than merely to find a numerical solution. We show how statistics addresses a wide array of applications including opinion polls, market research, the environment, and health and human behavior. Because we believe that most students benefit more from focusing on the underlying concepts and interpretations of data analyses rather than from the actual calculations, the exercises often show summary statistics and printouts and ask what can be learned from them.

We have exercises in three places:

• At the end of each section. These exercises provide immediate reinforcement and are drawn from concepts within the section.

• At the end of each chapter. This more comprehensive set of exercises draws from all concepts across all sections within the chapter.

• In the Part Reviews. These exercises draw from across all chapters in the part.

Each exercise has a descriptive label. Exercises for which technology is recommended are indicated with an icon. Larger data sets used in examples and exercises are referenced in the text, listed in the back endpapers, and made available on the companion CD-ROM. The exercises are divided into the following three categories:

• Practicing the Basics are the section exercises and the first group of end-of-chapter exercises; they reinforce basic application of the methods.

• Concepts and Investigations exercises require the student to explore real data sets and carry out investigations for mini-projects. They may ask students to explore concepts and related theory, or be extensions of the chapter’s methods. This section contains some multiple-choice and true-false exercises to help students check their understanding of the basic concepts and prepare for tests. A few more difficult, optional exercises (highlighted with the ♦♦ icon) are included to present some additional concepts and methods. Concepts and Investigations exercises are found in the end-of-chapter exercises and the Part Reviews.

• Student Activities are designed for group work based on investigations done by each of the students on a team. Student Activities are found in the end-of-chapter exercises, and additional activities may be found within chapters as well.


Technology Integration

Up-to-Date Use of Technology

The availability of technology enables instruction that is less calculation-based and more concept-oriented. Output from software applications and calculators is displayed throughout the textbook, and discussion focuses on interpretation of the output, rather than on the keystrokes needed to create the output. Although most of our output is from Minitab® and the TI-83+/84, we also show screen captures from IBM® SPSS® and Microsoft Excel® as appropriate. Technology-specific manuals containing keystroke information are available with this text. See the supplements listing for more information.

Applets

Applets referred to in the text are found on the companion CD-ROM or within MyStatLab. Applets have great value because they demonstrate concepts to students visually. For example, creating a sampling distribution is accomplished more readily with applets than with a static text figure. The applets are presented as optional explorations in the text. (Descriptions of the applets may be found on page x.)

Data Sets

We use a wealth of real data sets throughout the textbook. These data sets are available on the companion CD-ROM and on the website www.pearsonhighered.com/mathstatsresources/. The same data set is often used in several chapters, helping reinforce the four components of the statistical investigative process and allowing the students to see the big picture of statistical reasoning. Exercises using data sets are marked with an icon.

An Invitation Rather Than a Conclusion

We hope that students using this textbook will gain a lasting appreciation for the vital role the art and science of statistics plays in analyzing data and helping us make decisions in our lives. Our major goals for this textbook are that students learn how to:

• Produce data that can provide answers to properly posed questions.

• Appreciate how probability helps us understand randomness in our lives, as well as grasp the crucial concept of a sampling distribution and how it relates to inference methods.

• Choose appropriate descriptive and inferential methods for examining and analyzing data and drawing conclusions.

• Communicate the conclusions of statistical analyses clearly and effectively.

• Understand the limitations of most research, either because it was based on an observational study rather than a randomized experiment or survey, or because a certain lurking variable was not measured that could have explained the observed associations.

We are excited about sharing the insights that we have learned from our experience as teachers and from our students through this text. Many students still enter statistics classes on the first day with dread because of its reputation as a dry, sometimes difficult, course. It is our goal to inspire a classroom environment that is filled with creativity, openness, realistic applications, and learning that students find inviting and rewarding. We hope that this textbook will help the instructor and the students experience a rewarding introductory course in statistics.

Supplements

For the Student

Student’s Solutions Manual, by Sarah Streett, contains fully worked solutions to odd-numbered exercises. (ISBN-10: 0-321-75619-3; ISBN-13: 978-0-321-75619-0)

Video Resources on DVD contain example-level videos that explain how to work examples from the text. The videos provide excellent support for students who require additional assistance or want reinforcement on topics and concepts learned in class. (ISBN-10: 0-321-78051-5; ISBN-13: 978-0-321-78051-5)

Excel® Manual (download only), by Jack Morse (University of Georgia), provides detailed tutorial instructions and worked-out examples and exercises for Excel. Available for download from www.pearsonhighered.com/mathstatsresources or within MyStatLab.

Graphing Calculator Manual (download only), by Peter Flanagan-Hyde (Phoenix Country Day School), provides detailed tutorial instructions and worked-out examples and exercises for the TI-83/84 Plus. Available for download from www.pearsonhighered.com/mathstatsresources or within MyStatLab.

MINITAB® Manual (download only), by Linda Dawson (University of Washington, Tacoma), provides detailed tutorial instructions and worked-out examples and exercises for MINITAB. Available for download from www.pearsonhighered.com/mathstatsresources or within MyStatLab.

Student Laboratory Workbook, by Megan Mocko (University of Florida) and Maria Ripol (University of Florida), is a study tool for the first ten chapters of the text. This workbook provides section-by-section review and practice and additional activities that cover fundamental statistical topics. (ISBN-10: 0-321-78342-5; ISBN-13: 978-0-321-78342-4)

Study Cards for Statistics Software. This series of study cards, available for Excel®, MINITAB®, JMP®, SPSS®, R®, StatCrunch®, and the TI-83/84® graphing calculators, provides students with easy, step-by-step guides to the most common statistics software. Visit www.myPearsonStore.com for more information.

For the Instructor

Instructor’s Edition (IE) contains comprehensive Instructor’s Notes for each chapter. Broken down by section, they offer a valuable introduction to each chapter by presenting learning objectives (new to this edition), the author’s rationale for content and presentation decisions made in the chapter, tips for introducing complex material, common pitfalls students encounter, additional examples and activities to use in class, and suggestions for how to integrate applets and activities effectively. Short answers to all of the exercises are given in the Answer Appendix. Full solutions to all of the exercises are in the Instructor’s Solutions Manual (ISBN-10: 0-321-75610-X; ISBN-13: 978-0-321-75610-7).

Instructor to Instructor Videos provide an opportunity for adjuncts, part-timers, TAs, or other instructors who are new to teaching from this text or have limited class prep time to learn about the book’s approach and coverage directly from Chris Franklin. The videos focus on those topics that have proven to be most challenging to students. Chris offers suggestions, pointers, and ideas about how to present these topics and concepts effectively based on her many years of teaching introductory statistics. She also shares insights on how to help students use the textbook in the most effective way to realize success in the course. The videos are available for download from Pearson’s online catalog at www.pearsonhighered.com/irc and through MyStatLab.

Instructor’s Solutions Manual, by Sarah Streett, contains fully worked solutions to every textbook exercise. Available for download from Pearson’s online catalog at www.pearsonhighered.com/irc and through MyStatLab.

Answers to the Student Laboratory Manual is available for download from Pearson’s online catalog at www.pearsonhighered.com/irc and through MyStatLab.

PowerPoint Lecture Slides are fully editable and printable slides that follow the textbook. These slides can be used during lectures or posted to a Web site in an online course. The PowerPoint Lecture Slides are available from Pearson’s online catalog at www.pearsonhighered.com/irc and through MyStatLab.

Active Learning Questions are prepared in PowerPoint® and intended for use with classroom response systems. Several multiple-choice questions are available for each chapter of the book, allowing instructors to quickly assess mastery of material in class. The Active Learning Questions are available from Pearson’s online catalog at www.pearsonhighered.com/irc and through MyStatLab.

TestGen® (www.pearsoned.com/testgen) enables instructors to build, edit, print, and administer tests using a computerized bank of questions developed to cover all the objectives of the text. TestGen is algorithmically based, allowing instructors to create multiple but equivalent versions of the same question or test with the click of a button. Instructors can also modify test bank questions or add new questions. The software and test bank are available for download from Pearson’s online catalog at www.pearsonhighered.com/irc and through MyStatLab.

multiple choice and short answer questions for each section of the text, along with the answer keys. Available for download from Pearson’s online catalog at www.pearsonhighered.com/irc and through MyStatLab.

Technology Resources

Companion CD-ROM

Each new copy of the text comes with a companion CD-ROM containing data sets (.csv, TI-83/84, and .txt files) and applets referenced in the text, which are useful for illustrating statistical concepts.

MyStatLab™ Online Course (access code required)

MyStatLab is a course management system that delivers proven results in helping individual students succeed.

• MyStatLab can be successfully implemented in any environment (lab-based, hybrid, fully online, traditional) and demonstrates the quantifiable difference that integrated usage has on student retention, subsequent success, and overall achievement.

• MyStatLab’s comprehensive online gradebook automatically tracks students’ results on tests, quizzes, homework, and in the study plan. Instructors can use the gradebook to intervene if students have trouble or to provide positive feedback. Data can be easily exported to a variety of spreadsheet programs, such as Microsoft Excel.

MyStatLab provides engaging experiences that personalize, stimulate, and measure learning for each student.

• Tutorial Exercises with Multimedia Learning Aids: The homework and practice exercises in MyStatLab align with the exercises in the textbook, and they regenerate algorithmically to give students unlimited opportunity for practice and mastery. Exercises offer immediate helpful feedback, guided solutions, sample problems, animations, videos, and eText clips for extra help at point-of-use.

• Getting Ready for Statistics: A library of questions now appears within each MyStatLab course to offer the developmental math topics students need for the course. These can be assigned as a prerequisite to other assignments, if desired.

• Conceptual Question Library: In addition to algorithmically regenerated questions that are aligned with your textbook, there is a library of 1,000 Conceptual Questions available in the assessment managers that require students to apply their statistical understanding.

• StatCrunch: MyStatLab includes web-based statistical software, StatCrunch, within the online assessment platform so that students can easily analyze data sets from exercises and the text. In addition, MyStatLab includes access to www.StatCrunch.com, a web site where users can access more than 13,000 shared data sets, conduct online surveys, perform complex analyses using the powerful statistical software, and generate compelling reports.

• Integration of Statistical Software: Knowing that students often use external statistical software, we make it easy to copy our data sets, both from the ebook and MyStatLab questions, into software like StatCrunch, Minitab, Excel and more. Students have access to a variety of support (Technology Instruction Videos, Technology Study Cards, and Manuals) to learn how to effectively use statistical software.

• Expert Tutoring: Although many students describe the whole of MyStatLab as “like having your own personal tutor,” students also have access to live tutoring from Pearson. Qualified statistics instructors provide tutoring sessions for students via MyStatLab.

And, MyStatLab comes from a trusted partner with educational expertise and an eye on the future.

Knowing that you are using a Pearson product means knowing that you are using quality content. That means that our eTexts are accurate, that our assessment tools work, and that our questions are error-free. And whether you are just getting started with MyStatLab, or have a question along the way, we’re here to help you learn about our technologies and how to incorporate them into your course.

To learn more about how MyStatLab combines proven learning applications with powerful assessment, visit www.mystatlab.com or contact your Pearson representative.

MathXL® for Statistics Online Course (access code required)

MathXL® is the homework and assessment engine that runs MyStatLab. (MyStatLab is MathXL plus a learning management system.) With MathXL for Statistics, instructors can:

• Create, edit, and assign online homework and tests using algorithmically generated exercises correlated at the objective level to the textbook.

• Create and assign their own online exercises and import TestGen tests for added flexibility.

• Maintain records of all student work, tracked in MathXL’s online gradebook.

With MathXL for Statistics, students can:

• Take chapter tests in MathXL and receive personalized study plans and/or personalized homework assignments based on their test results.

• Use the study plan and/or the homework to link directly to tutorial exercises for the objectives they need to study.

• Access supplemental animations and video clips directly from selected exercises.

• Knowing that students often use external statistical software, we make it easy to copy our data sets, both from the eText and the MyStatLab questions, into software like StatCrunch, Minitab, Excel and more.

MathXL for Statistics is available to qualified adopters. For more information, visit our website at www.mathxl.com, or contact your Pearson representative.

StatCrunch®

StatCrunch® is powerful web-based statistical software that allows users to perform complex analyses, share data sets, and generate compelling reports of their data. The vibrant online community offers more than 13,000 data sets for students to analyze.

• Collect. Users can upload their own data to StatCrunch or search a large library of publicly shared data sets, spanning almost any topic of interest. Also, an online survey tool allows users to quickly collect data via web-based surveys.

• Crunch. A full range of numerical and graphical methods allow users to analyze and gain insights from any data set. Interactive graphics help users understand statistical concepts, and are available for export to enrich reports with visual representations of data.

• Communicate. Reporting options help users create a wide variety of visually appealing representations of their data.

Full access to StatCrunch is available with a MyStatLab kit, and StatCrunch is available by itself to qualified adopters. For more information, visit our website at www.statcrunch.com, or contact your Pearson representative.

The Student Edition of MINITAB® (CD Only)

The Student Edition of MINITAB is a condensed version of the Professional Release of MINITAB statistical software. It offers the full range of statistical methods and graphical capabilities, along with worksheets that can include up to 10,000 data points. Only available for bundling with the text (ISBN-10: 0-321-11313-6; ISBN-13: 978-0-321-11313-9).

JMP® Student Edition

JMP Student Edition is an easy-to-use, streamlined version of JMP desktop statistical discovery software from SAS Institute Inc. and is only available for bundling with the text (ISBN-10: 0-321-67212-7; ISBN-13: 978-0-321-67212-4).

IBM® SPSS® Statistics Student Version

SPSS, a statistical and data management software package, is also available for bundling with the text (ISBN-10: 0-321-67537-1; ISBN-13: 978-0-321-67537-8).

XLSTAT for Pearson

Used by leading businesses and universities, XLSTAT is an Excel® add-in that offers a wide variety of functions to enhance the analytical capabilities of Microsoft Excel, making it the ideal tool for your everyday data analysis and statistics requirements. XLSTAT is compatible with all Excel versions (except Mac 2008). Available for bundling with the text (ISBN-10: 0-321-75932-X; ISBN-13: 978-0-321-75932-0).

For more information, please contact your local Pearson Education Sales Representative.

Acknowledgments

We are indebted to the following individuals, who provided valuable feedback for the third edition:

Larry Ammann, University of Texas, Dallas

Ellen Breazel, Clemson University

Dagmar Budikova, Illinois State University

Richard Cleary, Bentley University

Winston Crawley, Shippensburg University

Jonathan Duggins, Virginia Tech

Brian Karl Finch, San Diego State University

Kim Gilbert, University of Georgia

Hasan Hamdan, James Madison University

John Holcomb, Cleveland State University

Nusrat Jahan, James Madison University

Martin Jones, College of Charleston

Gary Kader, Appalachian State University

Jackie Miller, The Ohio State University

Megan Mocko, University of Florida

June Morita, University of Washington

Sister Marcella Louise Wallowicz, Holy Family University

Peihua Qui, University of Minnesota

We are also indebted to the many reviewers, class testers, and students who gave us invaluable feedback and advice on how to improve the quality of the book.

ARIZONA: Russel Carlson, University of Arizona; Peter Flanagan-Hyde, Phoenix Country Day School

CALIFORNIA: James Curl, Modesto Junior College; Christine Drake, University of California at Davis; Mahtash Esfandiari, UCLA; Dawn Holmes, University of California Santa Barbara; Rob Gould, UCLA; Rebecca Head, Bakersfield College; Susan Herring, Sonoma State University; Colleen Kelly, San Diego State University; Marke Mavis, Butte Community College; Elaine McDonald, Sonoma State University; Corey Manchester, San Diego State University; Amy McElroy, San Diego State University; Helen Noble, San Diego State University; Calvin Schmall, Solano Community College

COLORADO: David Most, Colorado State University

CONNECTICUT: Paul Bugl, University of Hartford; Anne Doyle, University of Connecticut; Pete Johnson, Eastern Connecticut State University; Dan Miller, Central Connecticut State University; Kathleen Mclaughlin, University of Connecticut; Nalini Ravishanker, University of Connecticut; John Vangar, Fairfield University; Stephen Sawin, Fairfield University

DISTRICT OF COLUMBIA: Hans Engler, Georgetown University; Mary W. Gray, American University; Monica Jackson, American University

FLORIDA: Nazanin Azarnia, Santa Fe Community College; Brett Holbrook; James Lang, Valencia Community College; Karen Kinard, Tallahassee Community College; Maria Ripol, University of Florida; James Smart, Tallahassee Community College; Latricia Williams, St. Petersburg Junior College, Clearwater; Doug Zahn, Florida State University

GEORGIA: Carrie Chmielarski, University of Georgia; Ouida Dillon, Oconee County High School; Katherine Hawks, Meadowcreek High School; Todd Hendricks, Georgia Perimeter College; Charles LeMarsh, Lakeside High School; Steve Messig, Oconee County High School; Broderick Oluyede, Georgia Southern University; Chandler Pike, University of Georgia; Kim Robinson, Clayton State University; Jill Smith, University of Georgia; John Seppala, Valdosta State University; Joseph Walker, Georgia State University

IOWA: John Cryer, University of Iowa; Kathy Rogotzke, North Iowa Community College; R. P. Russo, University of Iowa; William Duckworth, Iowa State University

ILLINOIS: Linda Brant Collins, University of Chicago; Ellen Fireman, University of Illinois; Jinadasa Gamage, Illinois State University; Richard Maher, Loyola University Chicago; Cathy Poliak, Northern Illinois University; Daniel Rowe, Heartland Community College

KANSAS: James Higgins, Kansas State University; Michael Mosier, Washburn University

KENTUCKY: Lisa Kay, Eastern Kentucky University

MASSACHUSETTS: Katherine Halvorsen, Smith College; Xiaoli Meng, Harvard University; Daniel Weiner, Boston University

MICHIGAN: Kirk Anderson, Grand Valley State University; Phyllis Curtiss, Grand Valley State University; Roy Erickson, Michigan State University; Jann-Huei Jinn, Grand Valley State University; Sango Otieno, Grand Valley State University; Alla Sikorskii, Michigan State University; Mark Stevenson, Oakland Community College; Todd Swanson, Hope College; Nathan Tintle, Hope College

MINNESOTA: Bob Dobrow, Carleton College; German J. Pliego, University of St. Thomas; Engin A. Sungur, University of Minnesota–Morris

MISSOURI: Lynda Hollingsworth, Northwest Missouri State University; Larry Ries, University of Missouri–Columbia; Suzanne Tourville, Columbia College

MONTANA: Jeff Banfield, Montana State University

NEW JERSEY: Harold Sackrowitz, Rutgers, The State University of New Jersey; Linda Tappan, Montclair State University

NEW MEXICO: David Daniel, New Mexico State University

NEW YORK: Brooke Fridley, Mohawk Valley Community College; Martin Lindquist, Columbia University; Debby Lurie, St. John’s University; David Mathiason, Rochester Institute of Technology; Steve Stehman, SUNY ESF; Tian Zheng, Columbia University

NEVADA: Alison Davis, University of Nevada-Reno

NORTH CAROLINA: Pamela Arroway, North Carolina State University; E. Jacquelin Dietz, North Carolina State University; Alan Gelfand, Duke University; Scott Richter, UNC Greensboro; Roger Woodard, North Carolina State University

NEBRASKA: Linda Young, University of Nebraska

OHIO: Jim Albert, Bowling Green State University; Stephan Pelikan, University of Cincinnati; Teri Rysz, University of Cincinnati; Deborah Rumsey, The Ohio State University; Kevin Robinson, University of Akron

OREGON: Michael Marciniak, Portland Community College; Henry Mesa, Portland Community College, Rock Creek; Qi-Man Shao, University of Oregon; Daming Xu, University of Oregon

PENNSYLVANIA: Douglas Frank, Indiana University of Pennsylvania; Steven Gendler, Clarion University; Bonnie A. Green, East Stroudsburg University; Paul Lupinacci, Villanova University; Deborah Lurie, Saint Joseph’s University; Linda Myers, Harrisburg Area Community College; Tom Short, Villanova University; Kay Somers, Moravian College

SOUTH CAROLINA: Beverly Diamond, College of Charleston; Murray Siegel, The South Carolina Governor’s School for Science and Mathematics

SOUTH DAKOTA: Richard Gayle, Black Hills State University; Daluss Siewert, Black Hills State University; Stanley Smith, Black Hills State University

TENNESSEE: Bonnie Daves, Christian Academy of Knoxville; T. Henry Jablonski, Jr., East Tennessee State University; Robert Price, East Tennessee State University; Ginger Rowell, Middle Tennessee State University; Edith Seier, East Tennessee State University

TEXAS: Tom Bratcher, Baylor University; Jianguo Liu, University of North Texas; Mary Parker, Austin Community College; Robert Paige, Texas Tech University; Walter M. Potter, Southwestern University; Therese Shelton, Southwestern University; James Surles, Texas Tech University; Diane Resnick, University of Houston-Downtown

UTAH: Patti Collings, Brigham Young University; Carolyn Cuff, Westminster College; Lajos Horvath, University of Utah; P. Lynne Nielsen, Brigham Young University

VIRGINIA: David Bauer, Virginia Commonwealth University; Ching-Yuan Chiang, James Madison University; Steven Garren, James Madison University; Debra Hydorn, Mary Washington College; D’Arcy Mays, Virginia Commonwealth University;

University of Alberta; David Loewen, University of Manitoba

We thank the following individuals, who made invaluable contributions to the third edition:

Ellen Breazel, Clemson University

Linda Dawson, Washington State University, Tacoma

Bernadette Lanciaux, Rochester Institute of Technology

Scott Nickleach, Sonoma State University

The detailed assessment of the text fell to our accuracy checkers, Ann Cannon, Cornell College; Dave Bregenzer, Utah State University; Stan Seltzer, Ithaca College; Sarah Streett; and the Pearson math tutors Alice Armstrong and Abdellah Dakhama, who checked the manuscript in both the preliminary and final versions. Thank you to Sarah Streett, who took on the task of revising the solutions manuals to reflect the many changes to the third edition. We also want to thank Jackie Miller (The Ohio State University) for her contributions to the Instructor’s Notes, Webster West (Texas A&M) for his work in producing the applets, and our student technology manual and workbook authors, Jack Morse (University of Georgia), Linda Dawson (University of Washington, Tacoma), Peter Flanagan-Hyde (Phoenix Country Day School), Megan Mocko (University of Florida), and Maria Ripol (University of Florida).

We would like to thank the Pearson team who has given countless hours in developing this text; without their guidance and assistance, the text would not have come to completion. We thank Marianne Stepanian, Chere Bemelmans, Dana Bettez, Sonia Ashraf, Beth Houston, Erin Lane, Kathleen DeChavez, and Christine Stavrou. We also thank Allison Campbell, Senior Project Manager at Integra-Chicago, for keeping this book on track throughout production. And we extend a very special note of appreciation to Elaine Page, our development editor.

Alan Agresti would like to thank those who have helped us in some way, often by suggesting data sets or examples. These include Anna Gottard, Wolfgang Jank, Bernhard Klingenberg, René Lee-Pack, Jacalyn Levine, Megan Lewis, Megan Meece, Dan Nettleton, Yongyi Min, and Euijung Ryu. Many thanks also to Tom Piazza for his help with the General Social Survey. Finally, Alan Agresti would like to thank his wife Jacki Levine for her extraordinary support throughout the writing of this book. Besides putting up with the evenings and weekends he was working on this book, she offered numerous helpful suggestions for examples and for improving the writing.

Chris Franklin gives a special thank you to her husband and sons, Dale, Corey, and Cody Green. They have patiently sacrificed spending many hours with their spouse and mom as she has worked on this book through three editions. A special thank you also to her parents Grady and Helen Franklin and her two brothers, Grady and Mark, who have always been there for their daughter and sister. Chris also appreciates the encouragement and support of her colleagues and her many students who used the book, offering practical suggestions for improvement. Chris appreciates the support of teachers who have used the previous editions of the book. Finally, Chris thanks her coauthor, Alan Agresti, for making this book a reality, a book they began discussing oh so many years ago.

Alan Agresti, Gainesville, Florida

Chris Franklin, Athens, Georgia


About the Authors

Alan Agresti is Distinguished Professor Emeritus in the Department of Statistics at the University of Florida. He taught statistics there for 38 years, including the development of three courses in statistical methods for social science students and three courses in categorical data analysis. He is author of more than 100 refereed articles and five texts including Statistical Methods for the Social Sciences (with Barbara Finlay, Prentice Hall, 4th edition, 2009) and Categorical Data Analysis (Wiley, 2nd edition, 2002). He is a Fellow of the American Statistical Association and recipient of an Honorary Doctor of Science from De Montfort University in the UK. In 2003 he was named Statistician of the Year by the Chicago chapter of the American Statistical Association, and in 2004 he was the first honoree of the Herman Callaert Leadership Award in Biostatistical Education and Dissemination, awarded by the University of Limburgs, Belgium. He has held visiting positions at Harvard University, Boston University, the London School of Economics, and Imperial College and has taught courses or short courses for universities and companies in about 30 countries worldwide. He has also received teaching awards from the University of Florida and an excellence in writing award from John Wiley & Sons.

Christine Franklin is a Senior Lecturer and Lothar Tresp Honoratus Honors Professor in the Department of Statistics at the University of Georgia. She has been teaching statistics for more than 30 years at the college level. Chris has been actively involved at the national and state level with promoting statistical education at Pre-K–16 since the 1980s. She is a past Chief Reader for AP Statistics. She has developed three graduate level courses at the University of Georgia in statistics for elementary, middle, and secondary teachers. Chris served as the lead writer for the ASA-endorsed Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report: A Pre-K–12 Curriculum Framework. Chris has been honored by her selection as a Fellow of the American Statistical Association, the 2006 Mu Sigma Rho National Statistical Education Award recipient for her teaching and lifetime devotion to statistics education, and numerous teaching and advising awards at the University of Georgia including election to the UGA Teaching Academy. Chris has written more than 50 journal articles and resource materials for textbooks. Most important for Chris is her family. Her most recent memorable experience was a mission trip to Mexico with her husband and sons. In June 2012, Chris will be returning to Philmont, New Mexico, for her third trip backpacking for 11 days with both her sons and other Boy Scouts.

Gathering and Exploring Data

Chapter 1

Statistics: The Art and Science of Learning from Data

1.1 Using Data to Answer Statistical Questions

1.2 Sample Versus Population

1.3 Using Calculators and Computers

In the business world, managers use statistics to analyze results of marketing studies about new products, to help predict sales, and to measure employee performance. In finance, statistics is used to study stock returns and investment opportunities. Medical studies use statistics to evaluate whether new ways to treat disease are better than existing ways. In fact, most professional occupations today rely heavily on statistical methods. In a competitive job market, understanding statistics provides an important advantage.

But it’s important to understand statistics even if you will never use it in your job. Understanding statistics can help you make better choices. Why? Because every day you are bombarded with statistical information from news reports, advertisements, political campaigns, and surveys. How do you know what to heed and what to ignore? An understanding of the statistical reasoning (and in some cases statistical misconceptions) underlying these pronouncements will help. For instance, this book will enable you to evaluate claims about medical research studies more effectively so that you know when you should be skeptical. For example, does taking an aspirin daily truly lessen the chance of cancer?

Example 1

How Statistics Helps Us Learn about the World

Picture the Scenario

In this book, you will explore a wide variety of everyday scenarios. For example, you will evaluate media reports about opinion surveys, medical research studies, the state of the economy, and environmental issues. You’ll face financial decisions such as choosing between an investment with a sure return and one that could make you more money but could possibly cost you your entire investment. You’ll learn how to analyze the available information to answer necessary questions in such scenarios. One purpose of this book is to show you why an understanding of statistics is essential for making good decisions in an uncertain world.

Questions to Explore

This book will show you how to collect appropriate information and how to apply statistical methods so you can better evaluate that information and answer the questions posed. Here are some examples of questions we’ll investigate in coming chapters:

• How can you evaluate evidence about global warming?

• Are cell phones dangerous to your health?

• What’s the chance your tax return will be audited?

• How likely are you to win the lottery?

• Is there bias against women in appointing managers?

• What “hot streaks” should you expect in basketball?

• How can you analyze whether a diet really works?

• How can you predict the selling price of a house?

Thinking Ahead

Each chapter uses questions like these to introduce a topic and then introduces tools for making sense of the available information. We’ll see that statistics is the art and science of designing studies and analyzing the information that those studies produce.

We realize that you are probably not reading this book in the hope of becoming a statistician. (That’s too bad, because there’s a severe shortage of statisticians: more jobs than trained people. And with the ever-increasing ways in which statistics is being applied, it’s an exciting time to be a statistician.) You may even suffer from math phobia. Please be assured that to learn the main concepts of statistics, logical thinking and perseverance are more important than high-powered math skills. Don’t be frustrated if learning comes slowly and you need to read about a topic a few times before it starts to make sense. Just as you would not expect to sit through a single foreign language class session and be able to speak that language fluently, the same is true with the language of statistics. It takes time and practice. But we promise that your hard work will be rewarded. Once you have completed even part of this text, you will understand much better how to make sense of statistical information, and hence the world around you.

1.1 Using Data to Answer Statistical Questions

Does a low-carbohydrate diet result in significant weight loss? Are people more likely to stop at a Starbucks if they’ve seen a recent Starbucks TV commercial? Information gathering is at the heart of investigating answers to such questions. The information we gather with experiments and surveys is collectively called data.

For instance, consider an experiment designed to evaluate the effectiveness of a low-carbohydrate diet. The data might consist of the following measurements for the people participating in the study: weight at the beginning of the study, weight at the end of the study, number of calories of food eaten per day, carbohydrate intake per day, body-mass index (BMI) at the start of the study, and gender. A marketing survey about the effectiveness of a TV ad for Starbucks could collect data on the percentage of people who went to a Starbucks since the ad aired and analyze how it compares for those who saw the ad and those who did not see it.

Defining Statistics

You already have a sense of what the word statistics means. You hear statistics quoted about sports events (number of points scored by each player on a basketball team), statistics about the economy (median income, unemployment rate), and statistics about opinions, beliefs, and behaviors (percentage of students who indulge in binge drinking). In this sense, a statistic is merely a number calculated from data. But statistics as a field can be broadly viewed as a way of thinking about data and quantifying uncertainty, not a maze of numbers and messy formulas.

Statistics

Statistics is the art and science of designing studies and analyzing the data that those studies produce. Its ultimate goal is translating data into knowledge and understanding of the world around us. In short, statistics is the art and science of learning from data.

Statistical methods help us investigate questions in an objective manner. Statistical problem solving is an investigative process that involves four components: (1) formulate a statistical question, (2) collect data, (3) analyze data, and (4) interpret results. The following examples ask questions that we’ll learn how to answer using statistical investigations.

Scenario 1: Predicting an Election Using an Exit Poll. In elections, television networks often declare the winner well before all the votes have been counted. They do this using exit polling, interviewing voters after they leave the voting booth. Using an exit poll, a network can often predict the winner after learning how several thousand people voted, out of possibly millions of voters.

The 2010 California gubernatorial race pitted Democratic candidate Jerry Brown against Republican candidate Meg Whitman. A TV exit poll used to project the outcome reported that 53.1% of a sample of 3889 voters said they had voted for Jerry Brown.¹ Was this sufficient evidence to project Brown as the winner, even though information was available from such a small portion of the more than 9.5 million voters in California? We’ll learn how to answer that question in this book.
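As a rough preview of the kind of calculation involved, the sketch below computes an approximate 95% margin of error for the exit-poll percentage quoted above. It assumes that the 3889 voters behave like a simple random sample and uses the normal approximation with the multiplier 1.96. The function name `margin_of_error` is our own, chosen for illustration. Real exit polls use more complicated sampling designs, so this is a sketch under stated assumptions and not the method the networks actually used.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion (normal approximation)."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return z * standard_error

p_hat = 0.531   # share of sampled voters who said they voted for Brown
n = 3889        # exit-poll sample size (assumed to act like a simple random sample)

moe = margin_of_error(p_hat, n)
lower, upper = p_hat - moe, p_hat + moe

print(f"margin of error: {moe:.3f}")
print(f"approximate 95% interval: {lower:.3f} to {upper:.3f}")
```

Under these assumptions the margin of error is about 1.6 percentage points, so the whole interval lies above 50%.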

Scenario 2: Making Conclusions in Medical Research Studies. Statistical reasoning is at the foundation of the analyses conducted in most medical research studies. Let’s consider three examples of how statistics can be relevant.

Heart disease is the most common cause of death in industrialized nations. In the United States and Canada, nearly 30% of deaths yearly are due to heart disease, mainly heart attacks. Does regular aspirin intake reduce deaths from heart attacks? Harvard Medical School conducted a landmark study to investigate. The people participating in the study regularly took either an aspirin or a placebo (a pill with no active ingredient). Of those who took aspirin, 0.9% had heart attacks during the study. Of those who took the placebo, 1.7% had heart attacks, nearly twice as many.

Can you conclude that it’s beneficial for people to take aspirin regularly? Or, could the observed difference be explained by how it was decided which people would receive aspirin and which would receive the placebo? For instance, might those who took aspirin have had better results merely because they were health-ier, on average, than those who took the placebo? Or, did those taking aspirin have a better diet or exercise more regularly, on average?

For years there has been controversy about whether regular intake of large doses of vitamin C is beneficial Some studies have suggested that it is But some scientists have criticized those studies’ designs, claiming that the subsequent statistical analysis was meaningless How do we know when we can trust the statistical results in a medical study that is reported in the media?

Suppose you wanted to investigate whether, as some have suggested, heavy use of cell phones makes you more likely to get brain cancer You could pick half the students from your school and tell them to use a cell phone each day for the next 50 years, and tell the other half never to use a cell phone Fifty years from now you could see whether more users than nonusers of cell phones got brain cancer Obviously it would be impractical to carry out such a study And who wants to wait 50 years to get the answer? Years ago, a British statistician figured out how to study whether a particular type of behavior has an effect on cancer, using already available data He did this to answer a then controversial question: Does smoking cause lung cancer? How did he do this?

This book will show you how to answer questions like these You’ll learn when you can trust the results from studies reported in the media and when you should

be skeptical

Scenario 3: Using a Survey to Investigate People’s Beliefs. How similar are your opinions and lifestyle to those of others? It’s easy to find out. Every other year, the National Opinion Research Center at the University of Chicago conducts the General Social Survey (GSS). This survey of a few thousand adult Americans provides data about the opinions and behaviors of the American public. You can use it to investigate how adult Americans answer a wide diversity of questions, such as, “Do you believe in life after death?” “Would you be willing to pay higher prices in order to protect the environment?” “How much TV do you watch per day?” and “How many sexual partners have you had in the past year?” Similar surveys occur in other countries, such as the Eurobarometer survey within the European Union. We’ll use data from such surveys to illustrate the proper application of statistical methods.

¹Source: Data from www.cnn.com/ELECTION/2010/results/polls/

Reasons for Using Statistical Methods

The scenarios just presented illustrate the three main components of statistics for answering a statistical question:

• Design: Planning how to obtain data to answer the questions of interest

• Description: Summarizing and analyzing the data that are obtained

• Inference: Making decisions and predictions based on the data for answering the statistical question

Design refers to planning how to obtain data that will efficiently shed light on the problem of interest. How could you conduct an experiment to determine reliably whether regular large doses of vitamin C are beneficial? In marketing, how do you select the people to survey so you’ll get data that provide good predictions about future sales?

Description means exploring and summarizing patterns in the data. Files of raw data are often huge. For example, over time the General Social Survey has collected data about hundreds of characteristics on many thousands of people. Such raw data are not easy to assess; we simply get bogged down in numbers. It is more informative to use a few numbers or a graph to summarize the data, such as an average amount of TV watched or a graph displaying how number of hours of TV watched per day relates to number of hours per week exercising.

Inference means making decisions or predictions based on the data. Usually the decision or prediction refers to a larger group of people, not merely those in the study. For instance, in the exit poll described in Scenario 1, of 3889 voters sampled, 53.1% said they voted for Jerry Brown. Using these data, we can predict (infer) that a majority of the 9.5 million voters voted for him. Stating the percentages for the sample of 3889 voters is description, while predicting the outcome for all 9.5 million voters is inference.

Statistical description and inference are complementary ways of analyzing data. Statistical description provides useful summaries and helps you find patterns in the data, while inference helps you make predictions and decide whether observed patterns are meaningful. You can use both to investigate questions that are important to society. For instance, “Has there been global warming over the past decade?” “Is having the death penalty available for punishment associated with a reduction in violent crime?” “Does student performance in school depend on the amount of money spent per student, the size of the classes, or the teachers’ salaries?”

Long before we analyze data, we need to give careful thought to posing the questions to be answered by that analysis. The nature of these questions has an impact on all stages: design, description, and inference. For example, in an exit poll, do we just want to predict which candidate won, or do we want to investigate why by analyzing how voters’ opinions about certain issues related to how they voted? We’ll learn how questions such as these and the ones posed in the previous paragraph can be phrased in terms of statistical summaries (such as percentages and means) so that we can use data to investigate their answers.

Finally, a topic that we have not mentioned yet but that is fundamental for statistical inference is probability, which is a framework for quantifying how likely various possible outcomes are. We’ll study probability because it will help us to answer questions such as, “If Brown were actually going to lose the election (that is, if he were supported by less than half of all voters), what’s the chance that an exit poll of 3889 voters would show support by 53.1% of the voters?” If the chance were extremely small, we’d feel comfortable making the inference that his reelection was supported by the majority of all 9.5 million voters.
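The chance asked about in that question can be roughed out with a few lines of arithmetic. The sketch below is not the method developed later in the book; it assumes a simple random sample, takes exactly 50% support as the boundary case for Brown losing, and uses the normal approximation for a sample proportion. The function name is hypothetical.

```python
import math

def upper_tail_probability(p_hat, p0, n):
    """Approximate P(sample proportion >= p_hat) when the true proportion is p0.

    Assumes simple random sampling and a normal approximation.
    """
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper tail of the standard normal, via the complementary error function.
    probability = 0.5 * math.erfc(z / math.sqrt(2))
    return z, probability

# Exit poll from Scenario 1: 53.1% of 3889 sampled voters chose Brown.
z, prob = upper_tail_probability(0.531, 0.50, 3889)
print(f"z = {z:.2f}, approximate chance = {prob:.6f}")
```

Under these assumptions the observed 53.1% sits nearly four standard errors above 50%, so the chance is far below 1 in 1000. That is the sense in which the chance is "extremely small."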

In Words

The verb infer means to arrive at a decision or prediction by reasoning from known evidence.

In Words

We’ll see in Activity 1 that the term variable refers to the characteristic being measured, such as number of hours per day that you watch TV.


Activity 1

Downloading Data from the Internet

It is simple to get descriptive summaries of data from the General Social Survey (GSS). We’ll demonstrate, using one question asked in recent surveys, “On a typical day, about how many hours do you personally watch television?”

• Go to the Web site sda.berkeley.edu/GSS […] selection.

• The GSS name for the number of hours of TV watching is TVHOURS. Type TVHOURS as the row variable name.

• In the Weight menu, make sure that No Weight is selected. Click on Run the Table.

Now you’ll see a table that shows the number of people and, in bold, the percentage who made each of the possible responses. For all the years combined in which this question was asked, the most common response was 2 hours of TV a day (about 27% made this response).

What percentage of the people surveyed reported watching 0 hours of TV a day? How many people reported watching TV 24 hours a day?

Another question asked in the GSS is, “Taken all together, would you say that you are very happy, pretty happy, or not too happy?” The GSS name for this item is HAPPY. What percentage of people reported being very happy?

You might use the GSS to investigate what sorts of people are more likely to be very happy. Those who are happily married? Those who are in good health? Those who have lots of friends? We’ll see how to find out in this book.

*If this doesn’t work, your computer’s firewall settings may be restricting access.

Try Exercises 1.3 and 1.4

1.1 Aspirin and heart attacks. The Harvard Medical School study mentioned in Scenario 2 included about 22,000 male physicians. Whether a given individual would be assigned to take aspirin or the placebo was determined by flipping a coin. As a result, about 11,000 physicians were assigned to take aspirin and about 11,000 to take the placebo. The researchers summarized the results of the experiment using percentages. Of the physicians taking aspirin, 0.9% had a heart attack, compared to 1.7% of those taking the placebo. Based on the observed results, the study authors concluded that taking aspirin reduces the risk of having a heart attack. Specify the aspect of this study that pertains to (a) design, (b) description, and (c) inference.

1.2 Poverty and race. The Current Population Survey (CPS) is a survey conducted by the U.S. Census Bureau for the Bureau of Labor Statistics. It provides a comprehensive body of data on the labor force, unemployment, wealth, poverty, etc. The data can be found online at www.census.gov/hhes/www/cpstc/cps_table_creator.html. A report from the 2009 CPS focused on a sample of about 50,000 households, each consisting of at least one related person under the age of 18. The report indicated that 14.7% of white households, 30.4% of black households, and 11.1% of Asian households had annual incomes below the poverty line. Based on these results, the study authors concluded that the percentage of all such black households with annual incomes below the poverty line is between 28.6% and 32.2%. Specify the aspect of this study that pertains to (a) description and (b) inference.

1.3 GSS and heaven. Go to the General Social Survey Web site, http://sda.berkeley.edu/GSS. Enter HEAVEN as the row variable and then click Run the Table. When asked whether or not they believed in heaven, what percentage of those surveyed said yes, definitely; yes, probably; no, probably not; and no, definitely not? (Data from CSM, UC Berkeley.)

1.4 GSS and heaven and hell. Refer to the previous exercise. You can obtain data for a particular survey year such as 2008 by entering YEAR(2008) in the Selection Filter option box before you click on Run the Table.

a. Do this for HEAVEN in 2008, giving the percentages for the four possible outcomes.

b. Summarize opinions in 2008 about belief in hell (row variable HELL). Was the percentage of “yes, definitely” responses higher for belief in heaven or in hell?

1.5 GSS for subject you pick. At the GSS Web site, click on Standard Codebook under Codebooks and then on Sequential Variable List. Find a subject that interests you and look up a relevant GSS code name to enter as the row variable. Summarize the results that you obtain.

1.2 Sample Versus Population

We’ve seen that statistics consists of methods for designing investigative studies, describing (summarizing) data obtained for those studies, and making inferences (decisions and predictions) based on those data to answer a statistical question of interest.

We Observe Samples But Are Interested in Populations

The entities that we measure in a study are called the subjects. Usually subjects are people, such as the individuals interviewed in a General Social Survey. But they need not be. For instance, subjects could be schools, countries, or days. We might measure characteristics such as the following:

• For each school: the per-student expenditure, the average class size, the average score of students on an achievement test

• For each country: the percentage of residents living in poverty, the birth rate, the percentage unemployed, the percentage who are computer literate

• For each day in an Internet café: the amount spent on coffee, the amount spent on food, the amount spent on Internet access

The population is the set of all the subjects of interest. In practice, we usually have data for only some of the subjects who belong to that population. These subjects are called a sample.

Population and Sample

The population is the total set of subjects in which we are interested. A sample is the subset of the population for whom we have (or plan to have) data, often randomly selected.

In the 2008 General Social Survey (GSS), the sample was the 2023 people who participated in this survey. The population was the set of all adult Americans at that time, more than 200 million people.

Occasionally data are available from an entire population. For instance, every ten years the U.S. Census Bureau gathers data from the entire U.S. population (or nearly all). But the census is an exception. Usually, it is too costly and time-consuming to obtain data from an entire population. It is more practical to get data for a sample. The General Social Survey and polling organizations such as the Gallup poll usually select samples of about 1000 to 2500 Americans to learn about opinions and beliefs of the population of all Americans. The same is true for surveys in other parts of the world, such as the Eurobarometer in Europe.

Sample and population

Example 2

An Exit Poll

Picture the Scenario

Scenario 1 in the previous section discussed an exit poll. The purpose was to predict the outcome of the 2010 gubernatorial election in California. The exit poll sampled 3889 of the 9.5 million people who voted.

Insight

The ultimate goal of most studies is to learn about the population. For example, the sponsors of this exit poll wanted to make an inference (prediction) about all voters, not just the 3889 voters sampled by the poll.

Try Exercises 1.9 and 1.10

Did You Know?

Examples in this book use the five parts shown in this example: Picture the Scenario introduces the context. Question to Explore states the question addressed. Think It Through shows the reasoning used to answer that question. Insight gives follow-up comments related to the example. Try Exercises direct you to a similar “Practicing the Basics” exercise at the end of the section. Also, each example title is preceded by a label highlighting the example’s concept. In this example, the concept label is “Sample and population.”

Descriptive Statistics and Inferential Statistics

Using the distinction between samples and populations, we can now tell you more about the use of description and inference in statistical analyses.

Description in Statistical Analyses

Descriptive statistics refers to methods for summarizing the collected data (where the data constitutes either a sample or a population). The summaries usually consist of graphs and numbers such as averages and percentages.

A descriptive statistical analysis usually combines graphical and numerical summaries. For instance, Figure 1.1 is a bar graph that shows the percentages of various types of U.S. households in 2005. It summarizes a survey of 50,000 American households by the U.S. Census Bureau. The main purpose of descriptive statistics is to reduce the data to simple summaries without distorting or losing much information. Graphs and numbers such as percentages and averages are easier to comprehend than the entire set of data. It’s much easier to get a sense of the data by looking at Figure 1.1 than by reading through the questionnaires filled out by the 50,000 sampled households. From this graph, it’s readily apparent that the “traditional” household, defined as being a married man and woman with children in which only the husband is in the labor force, is no longer very common in the United States. In fact, “Other” households, which include female-headed households and households headed by young adults or older Americans who do not reside with spouses, are most common.

Descriptive statistics are also useful when data are available for the entire population, such as in a census. By contrast, inferential statistics are used when data are available for a sample only, but we want to make a decision or prediction about the entire population.
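As a small illustration of reducing raw data to a few summaries, the sketch below tabulates counts, percentages, and an average. The ten responses are made up for illustration (they are not GSS data), and the variable names are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical answers (hours of TV per day) from 10 respondents; not GSS data.
tv_hours = [2, 0, 3, 2, 1, 4, 2, 1, 5, 2]

counts = Counter(tv_hours)
n = len(tv_hours)

# Percentage of respondents giving each answer, in increasing order of hours.
percentages = {hours: 100 * count / n for hours, count in sorted(counts.items())}
for hours, pct in percentages.items():
    print(f"{hours} hours: {counts[hours]} people ({pct:.0f}%)")

# Two single-number summaries of the same data.
average_hours = mean(tv_hours)
most_common_hours, _ = counts.most_common(1)[0]
print(f"Average: {average_hours:.1f} hours; most common response: {most_common_hours} hours")
```

The table and the two numbers describe only these ten respondents. Saying anything about a larger group would be inference.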

[Figure 1.1: bar graph. Horizontal axis: household type (Dual-income no children, Dual-income with children, Traditional, Other); vertical axis: percentage, with ticks from 0 to 70.]

Figure 1.1 Types of U.S. Households, Based on a Sample of 50,000 Households in the 2005 Current Population Survey. (Source: Data from United States Census Bureau.)

Inference in Statistical Analyses

Inferential statistics refers to methods of making decisions or predictions about a population, based on data obtained from a sample of that population.

In most surveys, we have data for a sample, not for the entire population. We use descriptive statistics to summarize the sample data and inferential statistics to make predictions about the population.

Descriptive and inferential statistics

Example 3

Polling Opinions on Handgun Control

Picture the Scenario

Suppose we’d like to know what people think about controls over the sales of handguns. Let’s consider how people feel in Florida, a state with a relatively high violent crime rate. The population of interest is the set of more than 10 million adult residents of Florida.

Since it is impossible to discuss the issue with all these people, we can study results from a recent poll of 834 Florida residents conducted by the Institute for Public Opinion Research at Florida International University. In that poll, 54.0% of the sampled subjects said they favored controls over the sales of handguns. A newspaper article about the poll reports that the “margin of error” for how close this number falls to the population percentage is 3.4%. We’ll see (later in the textbook) that this means we can predict with high confidence (about 95% certainty) that the percentage of all adult Floridians favoring control over sales of handguns falls within 3.4% of the survey’s value of 54.0%, that is, between 50.6% and 57.4%.

[…] but in the population of all adult Florida residents. The prediction that the percentage of all adult Floridians who favor handgun control falls between 50.6% and 57.4% is an inferential statistical analysis. In summary, we describe the sample, and we make inferences about the population.

Insight

The sample size of 834 was small compared to the population size of more than 10 million. However, because the values between 50.6% and 57.4% are all above 50%, the study concluded that a slim majority of Florida residents favored handgun control.

Try Exercises 1.11, part a, and 1.12, parts a–c

An important aspect of statistical inference involves reporting the likely precision of a prediction. How close is the sample value of 54% likely to be to the true (unknown) percentage of the population favoring gun control? We’ll see (in Chapters 4 and 6) why a well-designed sample of 834 people yields a sample percentage value that is very likely to fall within about 3–4% (the so-called margin of error) of the population value. In fact, we’ll see that inferential statistical analyses can predict characteristics of entire populations quite well by selecting samples that are small relative to the population size. Surprisingly, the absolute size of the sample matters much more than the size relative to the population total. For example, the population of China is about four times that of the United States, but a random sample of 1000 people from the Chinese population and a random sample of 1000 people from the U.S. population would achieve similar levels of accuracy. That’s why most polls take samples of only about a thousand people, even if the population has millions of people. In this book, we’ll see why this works.

Sample Statistics and Population Parameters

In Example 3, the percentage of the sample favoring handgun control is an example of a sample statistic. It is crucial to distinguish between sample statistics and the corresponding values for the population. The term parameter is used for a numerical summary of the population.

Parameter and Statistic

A parameter is a numerical summary of the population. A statistic is a numerical summary of a sample taken from the population.

For example, the percentage of the population of all adult Florida residents favoring handgun control is a parameter. We hope to learn about parameters so that we can better understand the population, but the true parameter values are almost always unknown. Thus, we use sample statistics to estimate the parameter values.
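To see how a statistic is used to estimate a parameter, the following sketch starts from the handgun poll's statistic (54.0% of 834 people) and computes an approximate 95% margin of error. It then checks the earlier remark that the absolute sample size matters much more than the population size. The formula is the usual large-sample approximation for a proportion under simple random sampling, with an optional finite population correction. The book has not introduced it at this point, so treat the function and the population figures as assumptions for illustration.

```python
import math

def margin_of_error(p_hat, n, population_size=None, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Assumes simple random sampling. If population_size is given, the
    finite population correction is applied.
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    if population_size is not None:
        standard_error *= math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error

# Florida handgun poll: the statistic is 54.0% of n = 834 sampled residents.
p_hat, n = 0.540, 834
moe = margin_of_error(p_hat, n)
low, high = p_hat - moe, p_hat + moe
print(f"margin of error: {moe:.3f}; plausible range for the parameter: {low:.3f} to {high:.3f}")

# Same sample size (1000), very different population sizes (rough, assumed figures).
by_population = {
    label: margin_of_error(0.5, 1000, population_size=size)
    for label, size in [("10 million", 10_000_000),
                        ("300 million", 300_000_000),
                        ("1.3 billion", 1_300_000_000)]
}
for label, value in by_population.items():
    print(f"population of {label}: margin of error {value:.4f}")
```

With these inputs the margin of error rounds to 3.4 percentage points, which matches the figure reported for the poll. The three margins of error for samples of 1000 are practically identical, because the correction factor is almost exactly 1 whenever the sample is a tiny fraction of the population.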

Randomness and Variability

Random is often thought to mean chaotic or haphazard, but randomness is an extremely powerful tool for obtaining good samples and conducting experiments. A sample tends to be a good reflection of a population when each subject in the population has the same chance of being included in that sample. That’s the basis of random sampling, which is designed to make the sample representative of the population. A simple example of random sampling is when a teacher puts each student’s name on a slip of paper, places it in a hat, and then draws names from the hat without looking.

• Random sampling allows us to make powerful inferences about populations.

• Randomness is also crucial to performing experiments well.
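The names-in-a-hat procedure is easy to imitate on a computer. This is a minimal sketch with a made-up class roster; the fixed seed is there only so that the draw can be repeated.

```python
import random

# Hypothetical class roster; the names are made up for illustration.
roster = ["Ana", "Ben", "Chloe", "Dev", "Elena", "Farid", "Grace", "Hiro"]

rng = random.Random(2024)          # fixed seed so the draw can be reproduced
sample = rng.sample(roster, k=3)   # draw 3 names without replacement
print(sample)
```

Every student has the same chance of being drawn, and no name can be drawn twice, just as with slips of paper pulled from a hat.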

If, as in Scenario 2, we want to compare aspirin to a placebo in terms of the percentage of people who later have a heart attack, it’s best to randomly select those in the sample who use aspirin and those who use placebo. This approach tends to keep the groups balanced on other factors that could affect the results. For example, suppose we allowed people to choose whether or not to use aspirin (instead of randomizing whether the person receives aspirin or the placebo). Then, the people who decided to use aspirin might have tended to be healthier than those who didn’t, which could produce misleading results.
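Random assignment can be sketched the same way. The code below mimics a coin flip for each of about 22,000 participants, the approximate size of the aspirin study given in Exercise 1.1. The participant IDs and the seed are hypothetical, and this is an illustration of the idea rather than the study's actual procedure.

```python
import random

rng = random.Random(7)  # fixed seed so the assignment can be reproduced

# Hypothetical participant IDs, roughly the size of the aspirin study.
participants = [f"physician_{i}" for i in range(1, 22001)]

# One simulated coin flip per participant decides the treatment.
assignment = {
    person: ("aspirin" if rng.random() < 0.5 else "placebo")
    for person in participants
}

n_aspirin = sum(1 for treatment in assignment.values() if treatment == "aspirin")
n_placebo = len(participants) - n_aspirin
print(f"aspirin: {n_aspirin}, placebo: {n_placebo}")
```

Because chance alone decides who gets which treatment, the two groups end up close to equal in size and, on average, similar in health, diet, and other factors.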

People are different from each other, so, not surprisingly, the measurements we make on them vary from person to person. For the GSS question about TV watching in Activity 1, different people reported different amounts of TV watching. In the exit poll of Example 1, not all people voted the same way. If subjects did not vary, we’d need to sample only one of them. We learn more about this variability by sampling more people. If we want to predict the outcome of an election, we’re better off sampling 100 voters than one voter, and our prediction will be even more reliable if we sample 1000 voters.

• Just as people vary, so do samples vary.

Recall

A population is the total group of individuals about whom you want to make conclusions. A sample is a subset of the population for whom you actually have data.

Suppose you take an exit poll of 1000 voters to predict the outcome of an election. Suppose the Gallup organization also takes an exit poll of 1000 voters. Your sample will have different people than Gallup’s. Consequently, the predictions will also differ. Perhaps your exit poll of 1000 voters has 480 voting for the Republican candidate, so you predict that 48% of all voters voted for that person. Perhaps Gallup’s exit poll of 1000 voters has 440 voting for the Republican candidate, so they predict that 44% of all voters voted for that person. Activity 2 at the end of the chapter shows that with random sampling, the amount of variability from sample to sample is actually quite predictable. Both of your predictions are likely to fall within 5% of the actual population percentage who voted Republican, assuming the samples are random. If, on the other hand, Republicans are more likely than Democrats to refuse to participate in the exit poll, then we would need to account for this. In the 2004 U.S. presidential election, much controversy arose when George W. Bush won several states in which exit polling predicted that John Kerry had won. Is it likely that the way the exit polls were conducted led to these incorrect predictions?
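A quick simulation shows how predictable sample-to-sample variability is under random sampling. In the sketch below, the true share of Republican voters (46%) is an assumed value chosen only for illustration, and each simulated poll treats voters as independent random draws. Neither detail comes from a real election.

```python
import random

def poll(rng, true_share, n):
    """Simulate one random sample of n voters and return the sample proportion."""
    votes = sum(1 for _ in range(n) if rng.random() < true_share)
    return votes / n

rng = random.Random(1)   # fixed seed so the simulation can be reproduced
true_share = 0.46        # assumed population proportion, for illustration only
n_voters = 1000

# Run 500 independent exit polls of 1000 voters each.
estimates = [poll(rng, true_share, n_voters) for _ in range(500)]

within_5_points = sum(abs(e - true_share) <= 0.05 for e in estimates) / len(estimates)
print(f"smallest estimate: {min(estimates):.3f}, largest: {max(estimates):.3f}")
print(f"share of polls within 5 points of the truth: {within_5_points:.1%}")
```

The individual polls disagree with one another, as your poll and Gallup's do in the text, yet nearly all of them land within 5 percentage points of the assumed population value. The simulation says nothing about nonresponse: if one group refuses to participate more often, the samples are no longer random and this guarantee no longer applies.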

The Basic Ideas of Statistics

Here is a summary of the key concepts of statistics that you’ll learn about in this book:

Chapter 2: Exploring Data with Graphs and Numerical Summaries. How can you present simple summaries of data? You replace lots of numbers with simple graphs and numerical summaries.

Chapter 3: Association: Contingency, Correlation, and Regression. How does annual income ten years after graduation correlate with college GPA? You can find out by studying the association between those characteristics.

Chapter 4: Gathering Data. How can you design an experiment or conduct a survey to get data that will help you answer questions? You’ll see why results may be misleading if you don’t use randomization.

Chapter 5: Probability in Our Daily Lives. How can you determine the chance of some outcome, such as winning a lottery? Probability, the basic tool for evaluating chances, is also a key foundation for inference.

Chapter 6: Probability Distributions. You’ve probably heard of the normal distribution or “bell-shaped curve” that describes people’s heights or IQs or test scores. What is the normal distribution, and how can we use it to find probabilities?

Chapter 7: Sampling Distributions. Why is the normal distribution so important? You’ll see why, and you’ll learn its key role in statistical inference.

Chapter 8: Statistical Inference: Confidence Intervals. How can an exit poll of 3889 voters possibly predict the results for millions of voters? You’ll find out by applying the probability concepts of Chapters 5, 6, and 7 to make statistical inferences that show how closely you can predict summaries such as population percentages.

Chapter 9: Statistical Inference: Significance Tests about Hypotheses. How can a medical study make a decision about whether a new drug is better than a placebo? You’ll see how you can control the chance that a statistical inference makes a correct decision about what works best.

Chapters 10–15: Applying Descriptive and Inferential Statistics to Many Kinds of Data. After Chapters 2–9 introduce you to the key concepts of statistics, the rest of the book shows you how to apply them in lots of situations. For instance, Chapter 10 shows how to compare two groups, such as using a sample of students from your university to make an inference about whether male and female students have different rates of binge drinking.


1.6 Description and inference

a. Distinguish between description and inference as reasons for using statistics. Illustrate the distinction using an example.

b. You have data for a population, such as obtained in a census. Explain why descriptive statistics are helpful but inferential statistics are not needed.

1.7 Number of good friends. One year the General Social Survey asked, “About how many good friends do you have?” Of the 840 people who responded, 6.1% reported having only one good friend. Identify (a) the sample, (b) the population, and (c) the statistic reported. (Source: Data from CSM, Berkeley.)

1.8 Concerned about global warming? The Institute for Public Opinion Research at Florida International University has conducted the FIU/Florida Poll (www2.fiu.edu/orgs/ipor/globwarm2.htm) of about 1200 Floridians annually since 1988 to track opinions on a wide variety of issues. In 2006 the poll asked, “How concerned are you about the problem of global warming?” The possible responses were very concerned, somewhat concerned, not very concerned, and haven’t heard about it. The poll reported percentages (44, 30, 21, 6) in these categories.

a. Identify the sample and the population.

b. Are the percentages quoted statistics or parameters? Why?

1.9 EPA. The Environmental Protection Agency (EPA) uses a few new automobiles of each model every year to collect data on pollution emissions and gasoline mileage performance. For the Honda Accord model, identify what’s meant by the (a) subject, (b) sample, and (c) population.

1.10 Babies and social preference. A recent study at Yale University’s Infant Cognition Center, published in the journal Nature, investigated whether babies develop social preferences at an early age. As part of the study, 16 six-month-old infants were each shown a sequence of videos. One video focused on a figure whose actions toward others were helpful, while the other focused on a figure whose actions were hurtful. After viewing the videos, each infant was presented with the two figures and allowed to choose one to play with. Of the 16 infants in the study, 14 chose to play with the helper object. The researchers concluded that six-month-old infants have both the ability to recognize and the preference to align themselves with the helpful figure. Identify (a) the sample, (b) the population, and (c) the inference being drawn.

1.11 Graduating seniors’ salaries. The job placement center at your school surveys all graduating seniors at the school. Their report about the survey provides numerical summaries such as the average starting salary and the percentage of students earning more than $30,000 a year.

a. Are these statistical analyses descriptive or inferential?

[…] 1820, which she treats as a sample of all marriage records from the early 19th century. The average age of the women in the records is 24.1 years. Using the appropriate statistical method, she estimates that the average age of brides in early 19th-century New England was between 23.5 and 24.7.

a. Which part of this example gives a descriptive summary of the data?

b. Which part of this example draws an inference about a population?

c. To what population does the inference in part b refer?

d. The average age of the sample was 24.1 years. Is 24.1 a statistic or a parameter?

1.13 Age pyramids as descriptive statistics. The figure shown is a graph published by Statistics Sweden. It compares Swedish society in 1750 and in 2010 on the numbers of men and women of various ages, using “age pyramids.” Explain how this indicates that

a. In 1750, few Swedish people were old.

b. In 2010, Sweden had many more people than in 1750.

c. In 2010, of those who were very old, more were female than male.

d. In 2010, the largest five-year group included people born during the era of first manned space flight.

[Figure: two age pyramids. Vertical axis: five-year age groups from 0–4 to 80+; horizontal axis: population (in thousands).]

Graphs of number of men and women of various ages, in 1750 and in 2010. (Source: From Statistics Sweden.)

1.14 Gallup polls. Go to the Web site www.galluppoll.com for the Gallup poll. From reports listed on their home page, give an example of (a) a descriptive statistical analysis and (b) an inferential statistical analysis.

1.15 National service. Consider the population of all students at your school. A certain proportion support mandatory national service (MNS) following high school. Your friend randomly samples 20 students from the school, and uses the sample proportion who

Date posted: 28/06/2014, 20:20
