Using Stata for Quantitative Analysis


Kyle C. Longest
Furman University

SAGE
Los Angeles | London | New Delhi | Singapore | Washington DC

FOR INFORMATION:

SAGE Publications, Inc.
2455 Teller Road
Thousand Oaks, California 91320
E-mail: order@sagepub.com

SAGE Publications Ltd.
1 Oliver's Yard
55 City Road
London EC1Y 1SP
United Kingdom

SAGE Publications India Pvt. Ltd.
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road, New Delhi 110 044
India

SAGE Publications Asia-Pacific Pte. Ltd.
33 Pekin Street #02-01
Far East Square
Singapore 048763

Copyright © 2012 by SAGE Publications, Inc.

All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.

Printed in the United States of America

Library of Congress Cataloging-in-Publication Data

Longest, Kyle C.
Using Stata for quantitative analysis / Kyle C. Longest.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-4129-9711-9 (pbk.)
1. Stata. 2. Social sciences--Graphic methods--Computer programs. 3. Social sciences--Statistical methods--Computer programs. I. Title.
HA32.L66 2012
005.5'5--dc23
2011041851

Executive Editor: Jerry Westby
Production Editor: Brittany Bauhaus
Copy Editor: QuADS (P) Ltd.
Typesetter: C&M Digitals (P) Ltd.
Proofreader: Eleni-Maria Georgiou
Cover Designer: Anupama Krishnan
Marketing Manager: Erica DeLuca
Permissions Editor: Karen Ehrmann

Certified Chain of Custody. Sustainable Forestry Initiative: Promoting Sustainable Forestry. www.sfiprogram.org, SFI-01268. SFI label applies to text stock.

Brief Contents

Preface
Acknowledgments

PART I: FOUNDATIONS FOR WORKING WITH STATA
Chapter 1: Getting to Know Stata 12
Chapter 2: The Essentials
Chapter 3: Do Files and Data Management

PART II: QUANTITATIVE ANALYSIS WITH STATA
Chapter 4: Descriptive Statistics
Chapter 5: Relationships Between Nominal and Ordinal Variables
Chapter 6: Relationships Between Different Measurement Levels
Chapter 7: Relationships Between Interval-Ratio Variables
Chapter 8: Enhancing Your Command Repertoire

Appendix: Getting to Know Stata 11
Chapter Exercise Solutions
"How To" Index
About the Author

Detailed Contents

Preface
    Motivation and Purpose
    About the National Study of Youth and Religion
    A Note on Versions
    A Note on Notation
    References
Acknowledgments

PART I: FOUNDATIONS FOR WORKING WITH STATA

Chapter 1: Getting to Know Stata 12
    What You See
    Getting Started With Data Files
    Opening and Saving Stata Data Files
    Data Browser and Editor
    Entering Your Own Data
    Using Different Types of Data Files in Stata
    Types of Variables in Data Files
    Exercises

Chapter 2: The Essentials
    Intuition and Stata Commands
    The Structure of Stata Commands
        Command
        Variables
        if Statements
        Options
    Executing a Command Using the Command Window
    The Essential Commands
        tabulate
        summary
        generate
        replace (if)
        recode
    Nonessential, Everyday Commands
        rename
        drop/keep (if)
        describe
        display
        set more off
    Summary of Commands Used in This Chapter
    Exercises

Chapter 3: Do Files and Data Management
    What Is a Do File?
    Opening and Saving Do Files
    Translation From the Command Window
    Getting the Most Out of Do Files
    Data Management
    Working With Labels
    Missing Data
    Using String Variables
    Saving Results
    Summary of Commands Used in This Chapter
    Exercises

PART II: QUANTITATIVE ANALYSIS WITH STATA

Chapter 4: Descriptive Statistics
    Frequency Distributions
    Histograms, Bar Graphs, and Pie Charts
    Measures of Central Tendency and Variability
    Box Plots
    Summary of Commands Used in This Chapter
    Exercises

Chapter 5: Relationships Between Nominal and Ordinal Variables
    Cross-Tabulations
    Chi-Square Test
    Measures of Association
    Elaboration
    Multivariate Bar Graphs
    Summary of Commands Used in This Chapter
    Exercises

Chapter 6: Relationships Between Different Measurement Levels
    Testing Means
    Confidence Intervals
    Testing a Specific Value (One-Sample t Test)
    Testing the Mean of Two Groups (Independent-Samples t Test)
    Analysis of Variance (ANOVA)
    Summary of Commands Used in This Chapter
    Exercises

Chapter 7: Relationships Between Interval-Ratio Variables
    Correlation
    Scatterplots
    Linear Regression
    Multiple Linear Regression
    Dichotomous (Dummy) Variables and Linear Regression
    Summary of Commands Used in This Chapter
    Exercises

Chapter 8: Enhancing Your Command Repertoire
    Stata Help Files
    Ways to Search and Access
    Structure and Language
    Advanced Convenience Commands
        tab, gen(newvar)
        egen
        mark and markout
        alpha, gen(newvar)
    Summary of Commands Used in This Chapter
    Exercises

Appendix: Getting to Know Stata 11
Chapter Exercise Solutions
"How To" Index
About the Author

Preface

Motivation and Purpose

The motivation for this book, as I assume is true for most, came from a series of personal experiences. First, as a graduate student, I remember literally lying awake at night dreading the idea of using a computer program to conduct statistical analyses. The first statistics course I took required Stata to complete the assignments and the final research project. This necessity was so overwhelming at the time, in part, because there did not seem to be any straightforward, concise texts explaining the basics of Stata. Over my time in graduate school, I came to be very familiar with Stata, even to the point that I developed a serious passion for both learning Stata and teaching it to students who were facing the same fears I once did.

In a somewhat mirrored experience, I was hoping to use Stata as a significant portion of the classroom experience and requirements when I first began teaching a course on Quantitative Analysis. I soon realized that there still was not a manageable introductory text on the use of Stata for quantitative research.[1] Thus, I sought to contribute to filling this void by providing a straightforward, applied introduction to using Stata.

This book will be most beneficial to readers who are novices when it comes to Stata and are at least in the early stages of learning strategies for conducting quantitative analysis. It does assume that the reader has a working knowledge of basic statistical techniques and terminology. The organization and coverage of the book is guided by the content and ordering
of topics found in most introductory social statistics textbooks. In this manner, it can serve as an excellent companion, either for a class or self-learner, to such a textbook.

[1] Assuredly, there are several very good and effective texts on learning Stata. Virtually all of these, however, are aimed at experienced users or are so detailed and long that they are not helpful for a typical classroom in which teaching Stata is not the primary purpose.

To be clear, this book should not be used to learn statistics or quantitative analysis. Some basic assumptions and explanations are provided, but these should not be used in place of a more thorough coverage of each of the analytic strategies. The statistical grounding for this book is based primarily on Frankfort-Nachmias and Leon-Guerrero's (2009) Social Statistics for a Diverse Society. The definitions and interpretations of the specific measures and tests are based on those presented in this text. Of course, any inaccuracies or mistakes are solely mine.

Also, this book does not attempt to cover every aspect of each Stata command that is introduced. More experienced users undoubtedly know shortcuts or alternative methods for the techniques that are presented. But the given description has been geared to introduce complete novice users to Stata. This targeted audience requires that the explanation starts with the basics before jumping into the advanced features. The presented commands and procedures are discussed because they are the most simplified strategies that effectively accomplish the pertinent goals.

About the National Study of Youth and Religion

The data for this book come from the National Study of Youth and Religion (NSYR). The NSYR is a longitudinal, nationally representative telephone survey of U.S. young adults. There are three
waves of data, all of which are publicly available. The variables that are used in the examples throughout this book come from the most recent follow-up survey of 2,532 young adults completed in the fall of 2007. At the time of this survey, the respondents were all between the ages of 18 and 24. Each respondent completed a computer-assisted telephone interviewing (CATI) survey that lasted approximately an hour. This data set covers a broad array of topics, making it possible, across examples, to use variables pertinent to several disciplines. For example, it contains several standard self-esteem measures of interest to psychologists, a wide array of questions on religion useful for sociologists, numerous questions on finances (e.g., debt) applicable to economics, and measures of substance use behaviors that would be pertinent to social work or health researchers. The full data set and documentation can be downloaded from the Association of Religion Data Archives (http://www.thearda.com/Archive/Files/Descriptions/NSYRW3.asp).

The first wave of the survey sampled 3,290 U.S. English- and Spanish-speaking teenagers, ages 13 to 17. The sampling and survey were conducted from July 2002 to August 2003 using random-digit-dialing, drawing on a sample of randomly generated telephone numbers representative of all noncellular phone numbers in the United States. The overall response rate of 57% for the [...]

Chapter Exercise Solutions

10. hist relretrt, freq

    [Graph: frequency histogram of relretrt]

Chapter 5 Exercises - Solutions

tab crelder faith1, col

    +-------------------+
    | Key               |
    |-------------------|
    |     frequency     |
    | column percentage |
    +-------------------+

       (crelder_w3) R:28 |
            How much you |
      personally care or |   (faith1_w3) F:1 How important or unimportant is
       not about [INSERT |          religious faith in shaping how y
               LIST A-C] |  Extremely       Very   Somewhat   Not very  Not impor |     Total
    ---------------------+--------------------------------------------------------+----------
               Very much |        337        356        339        163        139 |     1,334
                         |      72.01      59.04      45.87      44.05      41.49 |     53.04
    ---------------------+--------------------------------------------------------+----------
                Somewhat |        110        203        320        160        129 |       922
                         |      23.50      33.67      43.30      43.24      38.51 |     36.66
    ---------------------+--------------------------------------------------------+----------
                A little |         12         34         57         36         48 |       187
                         |       2.56       5.64       7.71       9.73      14.33 |      7.44
    ---------------------+--------------------------------------------------------+----------
      Do not really care |          9         10         23         11         19 |        72
                         |       1.92       1.66       3.11       2.97       5.67 |      2.86
    ---------------------+--------------------------------------------------------+----------
                   Total |        468        603        739        370        335 |     2,515
                         |     100.00     100.00     100.00     100.00     100.00 |    100.00

tab crelder faith1, col chi

    [TABLE OMITTED]

    Pearson chi2(12) = 149.7612   Pr = 0.000

tab crelder faith1, col chi gamma taub

    [TABLE OMITTED]

               gamma =   0.2835   ASE = 0.024
     Kendall's tau-b =   0.1924   ASE = 0.017

4a. tab attend
    tab attend, nol
    recode attend (0/1=0) (2/6=1), gen(freqatt)

4b. bysort freqatt: tab crelder faith1, col chi gamma taub

    -> freqatt = 0

    +-------------------+
    | Key               |
    |-------------------|
    |     frequency     |
    | column percentage |
    +-------------------+

       (crelder_w3) R:28 |
            How much you |
      personally care or |   (faith1_w3) F:1 How important or unimportant is
       not about [INSERT |          religious faith in shaping how y
               LIST A-C] |  Extremely       Very   Somewhat   Not very  Not impor |     Total
    ---------------------+--------------------------------------------------------+----------
               Very much |         64        117        220        150        136 |       687
                         |      73.56      59.69      47.11      45.05      41.46 |     48.69
    ---------------------+--------------------------------------------------------+----------
                Somewhat |         16         60        195        140        125 |       536
                         |      18.39      30.61      41.76      42.04      38.11 |     37.99
    ---------------------+--------------------------------------------------------+----------
                A little |          3         17         37         34         48 |       139
                         |       3.45       8.67       7.92      10.21      14.63 |      9.85
    ---------------------+--------------------------------------------------------+----------
      Do not really care |          4          2         15          9         19 |        49
                         |       4.60       1.02       3.21       2.70       5.79 |      3.47
    ---------------------+--------------------------------------------------------+----------
                   Total |         87        196        467        333        328 |     1,411
                         |     100.00     100.00     100.00     100.00     100.00 |    100.00

    Pearson chi2(12) =  58.1602   Pr = 0.000
               gamma =   0.1986   ASE = 0.034
     Kendall's tau-b =   0.1349   ASE = 0.023

    -> freqatt = 1

    +-------------------+
    | Key               |
    |-------------------|
    |     frequency     |
    | column percentage |
    +-------------------+

       (crelder_w3) R:28 |
            How much you |
      personally care or |   (faith1_w3) F:1 How important or unimportant is
       not about [INSERT |          religious faith in shaping how y
               LIST A-C] |  Extremely       Very   Somewhat   Not very  Not impor |     Total
    ---------------------+--------------------------------------------------------+----------
               Very much |        273        238        118         13          3 |       645
                         |      71.65      58.62      43.54      35.14      42.86 |     58.53
    ---------------------+--------------------------------------------------------+----------
                Somewhat |         94        143        125         20          4 |       386
                         |      24.67      35.22      46.13      54.05      57.14 |     35.03
    ---------------------+--------------------------------------------------------+----------
                A little |          9         17         20          2          0 |        48
                         |       2.36       4.19       7.38       5.41       0.00 |      4.36
    ---------------------+--------------------------------------------------------+----------
      Do not really care |          5          8          8          2          0 |        23
                         |       1.31       1.97       2.95       5.41       0.00 |      2.09
    ---------------------+--------------------------------------------------------+----------
                   Total |        381        406        271         37          7 |     1,102
                         |     100.00     100.00     100.00     100.00     100.00 |    100.00

    Pearson chi2(12) =  65.1490   Pr = 0.000
               gamma =   0.3461   ASE = 0.041
     Kendall's tau-b =   0.2142   ASE = 0.026

Chapter 6 Exercises - Solutions

ci longstr

        Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
    -------------+---------------------------------------------------------------
         longstr |       2211         748          12             724         771

ci longstr, level(99)

        Variable |        Obs        Mean    Std. Err.       [99% Conf. Interval]
    -------------+---------------------------------------------------------------
         longstr |       2211         748          12             717         778

ttest longstr==365

    One-sample t test

    Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
     longstr |    2211    747.6278    11.87787    558.5124    724.3348    770.9207

        mean = mean(longstr)                                  t =  32.2135
    Ho: mean = 365                           degrees of freedom =     2210

       Ha: mean < 365            Ha: mean != 365             Ha: mean > 365
     Pr(T < t) = 1.0000     Pr(|T| > |t|) = 0.0000        Pr(T > t) = 0.0000

ttest longstr, by(cu_cohab)

    Two-sample t test with equal variances

       Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
          No |    1595    630.6307    12.35235    493.3212    606.4022    654.8593
         Yes |     616    1050.567    24.26674    602.2847    1002.911    1098.222
    ---------+--------------------------------------------------------------------
    combined |    2211    747.6278    11.87787    558.5124    724.3348    770.9207
    ---------+--------------------------------------------------------------------
        diff |           -419.9358    24.94891               -468.8616   -371.0101

        diff = mean(No) - mean(Yes)                           t = -16.8318
    Ho: diff = 0                             degrees of freedom =     2209

       Ha: diff < 0              Ha: diff != 0               Ha: diff > 0
     Pr(T < t) = 0.0000     Pr(|T| > |t|) = 0.0000        Pr(T > t) = 1.0000

anova longstr employst

    Number of obs =     2211        R-squared     = 0.0223
    Root MSE      =  552.887        Adj R-squared = 0.0200

      Source |   Partial SS     df            MS        F    Prob > F
    ---------+-------------------------------------------------------
       Model |   15344694.4      5    3068938.88    10.04      0.0000
    employst |   15344694.4      5    3068938.88    10.04      0.0000
    Residual |    674034008   2205    305684.357
    ---------+-------------------------------------------------------
       Total |    689378703   2210    311936.065

Chapter 7 Exercises - Solutions

1. scatter kidwntmn relretrt

    [Graph: scatterplot of kidwntmn against relretrt]
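
The one-sample t test reported in the Chapter 6 solutions can be checked by hand from the printed mean and standard error. The lines below are only a sketch and are not part of the book's solutions. They assume nothing beyond the display command (listed among the Chapter 2 commands) and Stata's built-in invttail() function, with every number copied from the ttest longstr==365 output.

```stata
* Hand check of the one-sample t statistic:
* (sample mean - hypothesized value) / standard error of the mean
display (747.6278 - 365) / 11.87787

* Two-sided 95% critical value of Student's t with 2210 degrees of freedom
* (invttail(df, p) returns the t value whose upper-tail probability is p)
display invttail(2210, 0.025)

* Rebuild the 95% confidence limits from the mean, standard error, and critical value
display 747.6278 - invttail(2210, 0.025) * 11.87787
display 747.6278 + invttail(2210, 0.025) * 11.87787
```

The first line should come out at roughly 32.21, in line with the reported t = 32.2135, and the last two should land close to the printed interval of 724.3348 to 770.9207. Any small differences would come from rounding in the printed mean and standard error.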

Date posted: 02/09/2021, 21:04
