ABRAHAM DE MOIVRE: PAVING THE WAY FOR PROPORTION INFERENCES
13.94 Race in America. A newspaper article titled “On Race in
                  Race relations
Race     Generally good   Generally bad   No opinion
White         736              455            147
Black          86              175             36
At the 1% significance level, do the data provide sufficient evidence to conclude that U.S. whites and blacks are nonhomogeneous with respect to their views on race relations in the United States?
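The calculation behind Exercise 13.94 can be sketched as follows. This is a minimal illustration, not the textbook's worked solution: the function name `chi_square_statistic` is hypothetical, the critical value 9.210 is the standard chi-square table entry for a right-tail area of 0.01 with 2 degrees of freedom, and the P-value uses the closed form exp(-x/2) that holds only when df = 2.

```python
import math

# Observed frequencies from the table in Exercise 13.94.
observed = {
    "White": [736, 455, 147],
    "Black": [86, 175, 36],
}

def chi_square_statistic(rows):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row_total, row in zip(row_totals, rows):
        for col_total, obs in zip(col_totals, row):
            expected = row_total * col_total / n
            stat += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df

stat, df = chi_square_statistic(list(observed.values()))

# With df = 2 the area to the right of x under the chi-square curve is exp(-x / 2).
p_value = math.exp(-stat / 2)

CRITICAL_1PCT_DF2 = 9.210  # table value for alpha = 0.01, df = 2
print(f"chi-square = {stat:.2f}, df = {df}, P-value = {p_value:.3g}")
print("reject H0" if stat > CRITICAL_1PCT_DF2 else "do not reject H0")
```

For tables with other degrees of freedom the exponential shortcut does not apply, so the comparison with the tabled critical value is the part that generalizes.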
13.95 Foreign Affairs. From the Web site of Gallup, Inc., we found polls regarding Americans’ approval of how the president is handling foreign affairs. In two particular polls, the question asked was, “Do you approve of the way Barack Obama is handling foreign affairs?” In a February 2009 poll of 1007 national adults, 54% said they approved, and in a March 2009 poll of 1007 national adults, 61% said they approved. At the 5% significance level, do the data provide sufficient evidence to conclude that a difference exists in the approval percentages of all U.S. adults between the two months?
a. Use the two-proportions z-test (Procedure 12.3 on page 565).
b. Use the chi-square homogeneity test.
c. Compare your results in parts (a) and (b).
d. What does this exercise illustrate?
13.96 Auto Bailout. From two USA Today/Gallup polls, we found information about Americans’ approval of government bailouts to two of the Big Three U.S. automakers. The question asked was, “Do you approve or disapprove of the federal loans given to General Motors and Chrysler last year to help them avoid bankruptcy?” In a February 2009 poll of 1007 national adults, 41% said they approved, and in a March 2009 poll of 1007 national adults, 39% said they approved. At the 5% significance level, do the data provide sufficient evidence to conclude that a difference exists in the approval percentages of all U.S. adults between the two months?
a. Use the two-proportions z-test (Procedure 12.3 on page 565).
b. Use the chi-square homogeneity test.
c. Compare your results in parts (a) and (b).
d. What does this exercise illustrate?
Extending the Concepts and Skills
Chi-Square Homogeneity Test and Two-Proportions z-Test.
As we mentioned on page 619, the chi-square homogeneity test for comparing two population proportions and the two-tailed two-proportions z-test are equivalent; that is, they always yield the same result. In the following exercises, you are to establish that fact.
13.97 Foreign Affairs. Refer to Exercise 13.95 and show that the value of the χ²-statistic equals the square of the value of the z-statistic. (Note: You may observe slight differences due to roundoff error.)
13.98 Auto Bailout. Refer to Exercise 13.96 and show that the value of the χ²-statistic equals the square of the value of the z-statistic. (Note: You may observe slight differences due to roundoff error.)
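A numerical check of the relationship in Exercises 13.97 and 13.98 might look like the sketch below. The approval counts are an assumption: the exercises report only percentages, so the counts here are reconstructed by rounding each percentage times 1007. The function names are illustrative, and the z-statistic is the pooled form used for a two-proportions z-test.

```python
import math

def two_proportions_z(x1, n1, x2, n2):
    """z-statistic of the pooled two-proportions z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square homogeneity statistic for the 2 x 2 table of approve / not approve."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    n = n1 + n2
    stat = 0.0
    for row, row_total in zip(observed, (n1, n2)):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (obs - expected) ** 2 / expected
    return stat

# ASSUMPTION: counts reconstructed by rounding percentage * 1007.
polls = {
    "13.95 Foreign Affairs": (544, 1007, 614, 1007),  # 54% and 61%
    "13.96 Auto Bailout": (413, 1007, 393, 1007),     # 41% and 39%
}

results = {}
for name, args in polls.items():
    z = two_proportions_z(*args)
    chi2 = chi_square_2x2(*args)
    results[name] = (z, chi2)
    print(f"{name}: z = {z:.3f}, z^2 = {z * z:.3f}, chi-square = {chi2:.3f}")
```

Because both statistics are computed from the same counts in full floating-point precision, the two printed values agree; the roundoff differences mentioned in the exercises arise when intermediate results are rounded by hand.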
13.99 From Exercises 13.97 and 13.98, we conjecture that, for a comparison of two population proportions, the value of the χ²-statistic of a chi-square homogeneity test equals the square of the value of the z-statistic of a two-proportions z-test. Establish that fact.
13.100 It can be shown that the square of a standard normal variable has the chi-square distribution with one degree of freedom. Use that fact to show, for a chi-square curve with one degree of freedom, that $\chi^2_{\alpha} = z^2_{\alpha/2}$.
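The identity in Exercise 13.100 can be checked numerically with the standard library alone. The sketch below is an illustration rather than a proof. It relies on the stated fact that the square of a standard normal variable is chi-square with one degree of freedom, and the three table values are the usual rounded chi-square table entries for df = 1.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Chi-square table values for df = 1, rounded to three decimal places.
table_df1 = {0.10: 2.706, 0.05: 3.841, 0.01: 6.635}

squares = {}
for alpha, table_value in table_df1.items():
    z_half = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    # P(Z^2 > z^2) = P(|Z| > z) = 2 * P(Z > z), which should come back as alpha.
    right_tail = 2 * (1 - std_normal.cdf(z_half))
    squares[alpha] = (z_half ** 2, right_tail)
    print(f"alpha = {alpha}: z^2 = {z_half ** 2:.3f}, table = {table_value}, "
          f"area to the right = {right_tail:.4f}")
```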
13.101 Use Exercises 13.99 and 13.100 to show that the chi-square homogeneity test for comparing two population proportions and the two-tailed two-proportions z-test are equivalent.
CHAPTER IN REVIEW
You Should Be Able to
1. use and understand the formulas in this chapter.
2. identify the basic properties of χ²-curves.
3. use the chi-square table, Table VII.
4. explain the reasoning behind the chi-square goodness-of-fit test.
5. perform a chi-square goodness-of-fit test.
6. group bivariate data into a contingency table.
7. find and graph marginal and conditional distributions.
8. decide whether an association exists between two variables of a population, given bivariate data for the entire population.
9. explain the reasoning behind the chi-square independence test.
10. perform a chi-square independence test to decide whether an association exists between two variables of a population, given bivariate data for a sample of the population.
11. perform a chi-square homogeneity test to compare the distributions of a variable of two or more populations.
622 CHAPTER 13 Chi-Square Procedures
Key Terms
associated variables, 596
association, 596
bivariate data, 593
cells, 593
$\chi^2_{\alpha}$, 581
chi-square (χ²) curve, 581
chi-square distribution, 581
chi-square goodness-of-fit test, 582, 585
chi-square homogeneity test, 613, 615
chi-square independence test, 603, 606
chi-square procedures, 580
chi-square subtotals, 584
conditional distribution, 595
contingency table, 593
cross tabs, 593
cross-tabulation table, 593
expected frequencies, 583
homogeneous, 614
marginal distribution, 595
nonhomogeneous, 614
observed frequencies, 583
segmented bar graph, 595
statistically dependent variables, 596
statistically independent variables, 596
two-way table, 593
univariate data, 592
REVIEW PROBLEMS
Understanding the Concepts and Skills
1. How do you distinguish among the infinitely many different chi-square distributions and their corresponding χ²-curves?
2. Regarding a χ²-curve,
a. at what point on the horizontal axis does the curve begin?
b. what shape does it have?
c. As the number of degrees of freedom increases, a χ²-curve begins to look like another type of curve. What type of curve is that?
3. Recall that the number of degrees of freedom for the t-distribution used in a one-mean t-test depends on the sample size. Is that true for the chi-square distribution used in a chi-square
a. goodness-of-fit test?
b. independence test?
c. homogeneity test?
Explain your answers.
4. Explain why a chi-square goodness-of-fit test, a chi-square independence test, or a chi-square homogeneity test is always right tailed.
5. If the observed and expected frequencies for a chi-square goodness-of-fit test, a chi-square independence test, or a chi-square homogeneity test matched perfectly, what would be the value of the test statistic?
6. Regarding the expected-frequency assumptions for a chi-square goodness-of-fit test, a chi-square independence test, or a chi-square homogeneity test,
a. state them.
b. how important are they?
7. Race and Region. T. G. Exter’s book Regional Markets, Vol. 2/Households (Ithaca, NY: New Strategist Publications, Inc.) provides information on U.S. households by region of the country. This problem gives data current at the time of the book’s publication. One table in the book cross-classifies households by race (of the householder) and region of residence. The table shows that 7.8% of all U.S. households are Hispanic.
a. If race and region of residence are not associated, what percentage of Midwest households would be Hispanic?
b. There are 24.7 million Midwest households. If race and region of residence are not associated, how many Midwest households would be Hispanic?
c. In fact, there are 645 thousand Midwest Hispanic households. Given this information and your answer to part (b), what can you conclude?
8. Suppose that you have bivariate data for an entire population.
a. How would you decide whether an association exists between the two variables under consideration?
b. Assuming that you make no calculation mistakes, could your conclusion be in error? Explain your answer.
9. Suppose that you have bivariate data for a sample of a population.
a. How would you decide whether an association exists between the two variables under consideration?
b. Assuming that you make no calculation mistakes, could your conclusion be in error? Explain your answer.
10. Consider a χ²-curve with 17 degrees of freedom. Use Table VII to determine
a. $\chi^2_{0.99}$.
b. $\chi^2_{0.01}$.
c. the χ²-value that has area 0.05 to its right.
d. the χ²-value that has area 0.05 to its left.
e. the two χ²-values that divide the area under the curve into a middle 0.95 area and two outside 0.025 areas.
11. Educational Attainment. The U.S. Census Bureau compiles census data on educational attainment of Americans. From the document Current Population Reports, we obtained the 2000 distribution of educational attainment for U.S. adults 25 years old and older. Here is that distribution.
Highest level          Percentage
Not HS graduate           15.8
HS graduate               33.2
Some college              17.6
Associate’s degree         7.8
Bachelor’s degree         17.0
Advanced degree            8.6
A random sample of 500 U.S. adults (25 years old and older) taken this year gave the following frequency distribution.
Highest level          Frequency
Not HS graduate            84
HS graduate               160
Some college               88
Associate’s degree         32
Bachelor’s degree          87
Advanced degree            49
Decide, at the 5% significance level, whether this year’s distribution of educational attainment differs from the 2000 distribution.
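The goodness-of-fit computation for Problem 11 can be sketched as below, using the two tables above. The variable names are illustrative, and the critical value 11.070 is the standard chi-square table entry for a right-tail area of 0.05 with 5 degrees of freedom; treat this as one way to organize the arithmetic, not as the book's solution.

```python
# Year-2000 percentages (the hypothesized distribution) and this year's sample counts.
levels = ["Not HS graduate", "HS graduate", "Some college",
          "Associate's degree", "Bachelor's degree", "Advanced degree"]
percent_2000 = [15.8, 33.2, 17.6, 7.8, 17.0, 8.6]
observed = [84, 160, 88, 32, 87, 49]

n = sum(observed)
# Expected frequency = sample size * hypothesized relative frequency.
expected = [n * p / 100 for p in percent_2000]
# Chi-square subtotals: (O - E)^2 / E for each level.
subtotals = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
stat = sum(subtotals)
df = len(levels) - 1

CRITICAL_5PCT_DF5 = 11.070  # table value for alpha = 0.05, df = 5

for level, o, e, s in zip(levels, observed, expected, subtotals):
    print(f"{level:<20} O = {o:>3}  E = {e:6.1f}  (O - E)^2 / E = {s:.3f}")
print(f"chi-square = {stat:.3f}, df = {df}")
print("reject H0" if stat > CRITICAL_5PCT_DF5 else "do not reject H0")
```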
12. Presidents. From the Information Please Almanac, we compiled the following table on U.S. region of birth and political party of the first 44 U.S. presidents. The table uses these abbreviations: F=Federalist, DR=Democratic-Republican, D=Democratic, W=Whig, R=Republican, U=Union; NE=Northeast, MW=Midwest, SO=South, WE=West.
Region Party Region Party Region Party
SO F SO R MW R
NE F SO U NE D
SO DR MW R MW D
SO DR MW R SO R
SO DR MW R NE D
NE DR NE R SO D
SO D NE D WE R
NE D MW R MW R
SO W NE D SO D
SO W MW R MW R
SO D NE R NE R
SO W MW R SO D
NE W SO D NE R
NE D MW R WE D
NE D NE R
a. What is the population under consideration?
b. What are the two variables under consideration?
c. Group the bivariate data for the variables “birth region” and “party” into a contingency table.
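Grouping the region and party pairs of Problem 12 into a contingency table can be done by counting each pair. The sketch below copies the table rows into a string and tallies them with `collections.Counter`; the row and column orderings are arbitrary choices made for display.

```python
from collections import Counter

# Region/party pairs copied row by row from the table in Problem 12.
DATA = """
SO F   SO R  MW R
NE F   SO U  NE D
SO DR  MW R  MW D
SO DR  MW R  SO R
SO DR  MW R  NE D
NE DR  NE R  SO D
SO D   NE D  WE R
NE D   MW R  MW R
SO W   NE D  SO D
SO W   MW R  MW R
SO D   NE R  NE R
SO W   MW R  SO D
NE W   SO D  NE R
NE D   MW R  WE D
NE D   NE R
"""

tokens = DATA.split()
pairs = list(zip(tokens[0::2], tokens[1::2]))  # (region, party) for each president
counts = Counter(pairs)                        # cell frequencies

regions = ["NE", "MW", "SO", "WE"]
parties = ["F", "DR", "D", "W", "R", "U"]

print("Region " + " ".join(f"{p:>3}" for p in parties) + "  Total")
for region in regions:
    row = [counts[(region, p)] for p in parties]
    print(f"{region:<6} " + " ".join(f"{c:>3}" for c in row) + f"  {sum(row):>5}")
col_totals = [sum(counts[(r, p)] for r in regions) for p in parties]
print("Total  " + " ".join(f"{c:>3}" for c in col_totals) + f"  {len(pairs):>5}")
```

A `Counter` returns 0 for pairs that never occur, so empty cells need no special handling.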
13. Presidents. Refer to Problem 12.
a. Find the conditional distributions of birth region by party and the marginal distribution of birth region.
b. Find the conditional distributions of party by birth region and the marginal distribution of party.
c. Does an association exist between the variables “birth region” and “party” for the U.S. presidents? Explain your answer.
d. What percentage of presidents are Republicans?
e. If no association existed between birth region and party, what percentage of presidents born in the South would be Republicans?
f. In reality, what percentage of presidents born in the South are Republicans?
g. What percentage of presidents were born in the South?
h. If no association existed between birth region and party, what percentage of Republican presidents would have been born in the South?
i. In reality, what percentage of Republican presidents were born in the South?
14. Hospitals. From data in Hospital Statistics, published by the American Hospital Association, we obtained the following contingency table for U.S. hospitals and nursing homes by type of facility and type of control. We used the abbreviations Gov for Government, Prop for Proprietary, and NP for nonprofit.
                        Control
Facility         Gov    Prop      NP   Total
General         1697     660    3046    5403
Psychiatric      266     358     113     737
Chronic           21       1       4      26
Tuberculosis       3       0       1       4
Other             59     148     203     410
Total           2046    1167    3367    6580
In the following questions, the term hospital refers to either a hospital or nursing home:
a. How many hospitals are government controlled?
b. How many hospitals are psychiatric facilities?
c. How many hospitals are government controlled psychiatric facilities?
d. How many general facilities are nonprofit?
e. How many hospitals are not under proprietary control?
f. How many hospitals are either general facilities or under proprietary control?
15. Hospitals. Refer to Problem 14.
a. Obtain the conditional distribution of control type within each facility type.
b. Does an association exist between facility type and control type for U.S. hospitals? Explain your answer.
c. Determine the marginal distribution of control type for U.S. hospitals.
d. Construct a segmented bar graph for the conditional distributions and marginal distribution of control type. Interpret the graph in light of your answer to part (b).
e. Without doing any further calculations, respond true or false to the following statement and explain your answer: “The conditional distributions of facility type within control types are identical.”
f. Determine the marginal distribution of facility type and the conditional distributions of facility type within control types.
g. What percentage of hospitals are under proprietary control?
h. What percentage of psychiatric hospitals are under proprietary control?
i. What percentage of hospitals under proprietary control are psychiatric hospitals?
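Marginal and conditional distributions such as those asked for in Problem 15 are ratios of cell counts to the appropriate totals. The sketch below shows one way to compute them from the hospital table in Problem 14; the dictionary layout and variable names are illustrative choices.

```python
# Cell counts from the contingency table in Problem 14 (totals are recomputed).
table = {
    "General":      {"Gov": 1697, "Prop": 660, "NP": 3046},
    "Psychiatric":  {"Gov": 266,  "Prop": 358, "NP": 113},
    "Chronic":      {"Gov": 21,   "Prop": 1,   "NP": 4},
    "Tuberculosis": {"Gov": 3,    "Prop": 0,   "NP": 1},
    "Other":        {"Gov": 59,   "Prop": 148, "NP": 203},
}
controls = ["Gov", "Prop", "NP"]

col_totals = {c: sum(row[c] for row in table.values()) for c in controls}
grand_total = sum(col_totals.values())

# Marginal distribution of control type: column total / grand total.
marginal_control = {c: col_totals[c] / grand_total for c in controls}

# Conditional distribution of control type within each facility type:
# cell count / row total.
conditional_control = {
    facility: {c: row[c] / sum(row.values()) for c in controls}
    for facility, row in table.items()
}

print("Marginal:", {c: f"{p:.1%}" for c, p in marginal_control.items()})
for facility, dist in conditional_control.items():
    print(f"{facility:<13}", {c: f"{p:.1%}" for c, p in dist.items()})
```

Dividing by column totals instead of row totals gives the conditional distributions of facility type within control types.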
16. Hodgkin’s Disease. Hodgkin’s disease is a malignant, progressive, sometimes fatal disease of unknown cause characterized by enlargement of the lymph nodes, spleen, and liver. The following contingency table summarizes data collected during a study of 538 patients with Hodgkin’s disease. The table cross-classifies the histological types of patients and their responses to treatment 3 months prior to the study.
                                    Response
Histological type          Positive   Partial   None   Total
Lymphocyte depletion           18        10       44      72
Lymphocyte predominance        74        18       12     104
Mixed cellularity             154        54       58     266
Nodular sclerosis              68        16       12      96
Total                         314        98      126     538
At the 1% significance level, do the data provide sufficient evidence to conclude that histological type and treatment response are statistically dependent?
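Before running an independence test such as the one in Problem 16, the expected frequencies are worth inspecting. The sketch below computes them for the Hodgkin's disease table and then the test statistic. Two things are assumptions here: the expected-frequency check is written as the common rule of thumb (every expected frequency at least 1, and at most 20% of them below 5), and 16.812 is the standard chi-square table entry for a right-tail area of 0.01 with 6 degrees of freedom.

```python
# Observed counts (Positive, Partial, None) from the table in Problem 16.
observed = {
    "Lymphocyte depletion":    [18, 10, 44],
    "Lymphocyte predominance": [74, 18, 12],
    "Mixed cellularity":       [154, 54, 58],
    "Nodular sclerosis":       [68, 16, 12],
}
rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

# Expected frequency for each cell: (row total * column total) / sample size.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# ASSUMPTION: rule of thumb that all expected frequencies are at least 1
# and at most 20% of them are below 5.
flat = [e for row in expected for e in row]
share_below_5 = sum(e < 5 for e in flat) / len(flat)
assumptions_ok = min(flat) >= 1 and share_below_5 <= 0.20

stat = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(rows, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(rows) - 1) * (len(col_totals) - 1)

CRITICAL_1PCT_DF6 = 16.812  # table value for alpha = 0.01, df = 6
print(f"smallest expected frequency = {min(flat):.1f}, assumptions met: {assumptions_ok}")
print(f"chi-square = {stat:.2f}, df = {df}")
print("reject H0" if stat > CRITICAL_1PCT_DF6 else "do not reject H0")
```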
17. Income and Residence. The U.S. Census Bureau compiles information on money income of people by type of residence and publishes its findings in Current Population Reports. Independent simple random samples of people residing inside principal cities (IPC), outside principal cities but within metropolitan areas (OPC), and outside metropolitan areas (OMA), gave the following data on income level.
                          Residence
Income level         IPC    OPC    OMA   Total
Under $5,000          30     49     18      97
$5,000–$9,999         36     45     20     101
$10,000–$14,999       41     57     27     125
$15,000–$24,999       82    122     46     250
$25,000–$34,999       69    108     41     218
$35,000–$49,999       73    126     40     239
$50,000–$74,999       67    135     34     236
$75,000 & over        68    146     20     234
Total                466    788    246    1500
a. Identify the populations under consideration here.
b. Identify the variable under consideration.
c. At the 5% significance level, do the data provide sufficient evidence to conclude that people residing in the three types of residence are nonhomogeneous with respect to income level?
18. Economy in Recession? The Quinnipiac University Poll conducts nationwide surveys as a public service and for research. This problem is based on the results of one such poll conducted in May 2008. Independent simple random samples of registered Democrats, Republicans, and Independents were asked, “Do you think the United States economy is in a recession now?” Of the 628 Democrats sampled, 528 responded “yes,” as did 231 of the 471 Republicans sampled and 472 of the 646 Independents sampled. At the 1% significance level, do the data provide sufficient evidence to conclude that a difference exists in the percentages of registered Democrats, Republicans, and Independents who thought the U.S. economy was in a recession at the time?
Working with Large Data Sets
19. Yakashba Estates. The document Arizona Residential Property Valuation System, published by the Arizona Department of Revenue, describes how county assessors use computerized systems to value single-family residential properties for property tax purposes. On the WeissStats CD are data on lot size (in acres) and house size (in square feet) for homes in the Yakashba Estates, a private community in Prescott, AZ. We used the following codings for lot size and home size.
        Lot size                    House size
Size (acres)    Coding      Size (sq. ft.)    Coding
Under 2.25      L1          Under 3000        H1
2.25–2.49       L2          3000–3999         H2
2.50–2.74       L3          4000 & over       H3
2.75 & over     L4
Use the technology of your choice to do the following tasks for the coded variables.
a. Group the bivariate data for the variables “lot size” and “house size” into a contingency table.
b. Find the conditional distributions of lot size by house size and the marginal distribution of lot size.
c. Find the conditional distributions of house size by lot size and the marginal distribution of house size.
d. Does an association exist between the variables “lot size” and “house size” for homes in the Yakashba Estates? Explain your answer.
20. Withholding Treatment. Several years ago, a Gallup Poll asked 1528 adults the following question: “The New Jersey Supreme Court recently ruled that all life-sustaining medical treatment may be withheld or withdrawn from terminally ill patients, provided that is what the patients want or would want if they were able to express their wishes. Would you like to see such a ruling in the state in which you live, or not?” The data on the WeissStats CD give the responses by opinion and educational level. Use the technology of your choice to decide, at the 1% significance level, whether the data provide sufficient evidence to conclude that opinion on this issue and educational level are associated.
FOCUSING ON DATA ANALYSIS
UWEC UNDERGRADUATES Recall from Chapter 1 (see pages 30–31) that the Focus database and Focus sample contain information on the undergraduate students at the University of Wisconsin - Eau Claire (UWEC). Now would be a good time for you to review the discussion about these data sets.
Open the Focus sample worksheet (FocusSample) in the technology of your choice. In each part, apply the chi-square independence test to decide, at the 5% significance level, whether the data provide sufficient evidence to conclude that an association exists between the indicated variables for the population of all UWEC undergraduates. Be sure to check whether the assumptions for performing each test are satisfied. Interpret your results.
a. sex and classification b. sex and residency c. sex and college
d. classification and residency e. classification and college
f. college and residency
CASE STUDY DISCUSSION