Agricultural Statistical Data Analysis Using Stata


Agricultural Statistical Data Analysis Using Stata

George E. Boyhan

Downloaded by [Hanoi University of Agriculture] at 02:21 03 April 2014

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2013 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Version Date: 20130503

International Standard Book Number-13: 978-1-4665-8586-7 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or
registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

To Dr. Norton who answered the phone over the Christmas holidays

Contents

Introduction
About the Author
Chapter 1 General Statistical Packages Comparisons
    Program
    Windows and Menus
    What’s on the Menu?
    Conclusion
Chapter 2 Data Entry
    Importing Data
    Manipulating Data and Formats
Chapter 3 Descriptive Statistics
    Output Formats
    Experimentation Ideas
Chapter 4 Two Sample Tests
    ANOVA
    Output and Meaning
Chapter 5 Variations of One Factor ANOVA Designs
    Randomized Complete Block Design
    Latin Square Designs
    Balanced Incomplete Block Designs
    Balanced Lattice Designs
    Group Balanced Block Design
    Subsampling
Chapter 6 Two and More Factors ANOVA
    Split-Plot Design
    Split-Block Design
    Evaluation over Years or Seasons
    Three-Factor Design
    Split-Split Plot Design
    Covariance Analysis
Chapter 7 Programming Stata
Chapter 8 Post Hoc Tests
    Planned Comparisons
    Built-in Multiple Range Tests
    Programming Scheffé’s Test
Chapter 9 Preparing Graphs
    Graphing in Stata
Chapter 10 Correlation and Regression
    Correlation
    Linear Regression
Chapter 11 Data Transformations
Chapter 12 Binary, Ordinal, and Categorical Data Analysis
Appendix
References
Index

Introduction

Stata is a statistical software package that began as a command-line program. A graphical user interface (GUI) was added to the program sometime after its introduction, which has generally been very well executed. It allows beginners and novice users to conduct statistical procedures without having to type commands that can become rather complex with certain models. The command-line approach is never very far away and, as you gain confidence with the program, you will find yourself using it more and more.

The program has matured into a user-friendly environment with a wide variety of statistical functions. A couple of nice features that have dramatically improved usability are being able to have a dataset visible on the desktop while analyzing data, and help menus that indicate where in the menus the specific statistical function can be found.

This book will attempt to introduce the reader to using Stata to solve agricultural statistical problems. Stata, as a general purpose statistical program, has a large suite of commands that are applicable in a variety of disciplines. Based on the number and scope of textbooks available on Stata, it has a strong following in medical, large population, and regression analyses. This is not to detract from its overall capabilities to solve a wide range of problems.

This book provides an overview of using the Stata program. It includes a discussion of the various menus, many of the dialog boxes, and an explanation of how the parts are integrated. An explanation of how data can be entered into the program or imported is also presented. Surprisingly, for those new to statistical software and analyses, this can be one of the most time-consuming aspects of statistics. Stata
has a very in-depth set of capabilities for entering, importing, and manipulating data prior to analyses. This is followed by a chapter on the simplest of descriptive statistics. An ever-increasing level of complexity follows as different models and approaches to agricultural statistical problems are introduced.

One of the biggest changes in Stata is the ability to create graphs. This gives the Stata user another tool in preparing results for presentation and publication.

This book attempts to explain how to use Stata to analyze agricultural experiments. Data that violate the underlying assumptions in many parametric tests must be handled differently. This may involve transformation or the use of nonparametric tests. Various examples from agricultural experiments are covered.

Agricultural Statistical Data Analysis Using Stata includes the more important statistical procedures used in agricultural research. Various experimental designs and how to handle them within Stata are discussed. Analysis of variance and covariance applications for agricultural experiments are covered. Post hoc tests and comparisons are covered as well. How to perform regression and correlations with some agricultural examples is included. The more important nonparametric tests used in agricultural research are also covered, in particular the use of chi-square for categorical data, such as from inheritance studies.

As mentioned earlier, Stata grew out of a command-line interface, which is still recognizable as part of its foundation. In fact, this command-line interface is one of its strongest attributes because these commands can be organized and executed as a program, which expands the capabilities of Stata and ultimately makes things easier for users willing to devote some time to developing unique programs to solve their particular problems. An introduction to programming Stata is included, which should help users in this area. How to program Stata to extend its usability is also covered. Multiple-range tests
are part of Stata, but they are also used as examples of how to implement such tests as user-written programs. How various programming files relate to one another and how to develop your own programs are also discussed.

Although the programming capabilities of Stata are some of its best attributes, for the occasional user they may seem quite daunting. This is where the GUI can be a real help. In this book, I present the GUI approach along with the command-line approach, so that the occasional user can use the program without feeling intimidated or thinking they have to climb a steep learning curve.

All of the datasets used in the book are from other texts, from my own research, or made up to highlight a procedure. Where datasets are taken from other texts, the text and page number are listed. These textbooks are listed in the References at the end of the book and all are excellent sources for more information about using the statistics described in this book. In addition, Stata includes all of its reference materials as PDF files with the program. There are links to these files in the online help. These reference manuals have a more in-depth discussion of the specific procedure in question as well as references from the scientific literature.

I try to use the typesetting conventions in Stata’s manuals, but won’t be presenting commands in as formal a manner. There’s no use re-inventing the wheel. For a comprehensive presentation of a particular command, the reference manuals are always there, as is excellent online help both within the program and from the Internet.

The figures that present different parts of the program generally alternate between Macintosh® and Microsoft Windows®-based computers. These elements are almost identical between the two systems.

So, with that, let’s begin.

George Boyhan

Data sets available for download at
http://www.crcpress.com/product/isbn/9781466585850
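The Introduction's remark that commands can be organized and executed as a program can be made concrete with a short sketch. The command name `semean` is hypothetical and is not taken from the book; the sketch assumes the `auto` example dataset that ships with Stata and simply wraps `summarize` to report the standard error of a mean.

```stata
* Hypothetical user-written command (semean); an illustration only,
* not one of the programs developed in the book.
capture program drop semean
program define semean, rclass
    syntax varname(numeric)
    quietly summarize `varlist'
    local n  = r(N)
    local se = r(sd) / sqrt(r(N))
    display as text "Standard error of the mean: " as result %9.4f `se'
    * Store the results so later commands can reuse them as r(N) and r(se)
    return scalar N  = `n'
    return scalar se = `se'
end

* Try the new command on a dataset installed with Stata
sysuse auto, clear
semean mpg
display r(se)
```

Saved in a do-file, these lines can be rerun as a unit, which is the sense in which a sequence of commands becomes a program.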
... on using both Stata as well as statistical textbooks. They even have a journal, Stata Journal, with articles on using Stata to implement various statistical functions. Finally, unlike other statistical packages that may only offer a limited number of statistical functions, Stata offers a comprehensive set of statistical functions as well as extensibility through its built-in programming language. Stata ...

... File, Example Datasets…, which when selected brings up a Viewer window with links to Stata example datasets. One link is to datasets that were loaded on your computer when Stata was installed. As you read through Stata’s documentation, it refers to these example datasets to illustrate Stata’s capabilities. Clicking on the link Example datasets installed with Stata will bring up a list of datasets used ...

... the data and a separate dictionary file, with a .dct extension, that describes the data for the purposes of importation. Finally, for text file importation, there is an item for importing an unformatted text file. Importing SAS XPORT, ODBC data source, and XML data also are for importing data into Stata, but deal with importing from another statistical or software package, SAS XPORT from SAS, from a database ...

... run on and the size of datasets they can handle. Stata/MP is for multiprocessor machines, while Stata/SE is for single processor machines. Both of these are considered the professional versions of the software and both handle the largest datasets. Stata/IC, which was formerly known as Intercooled Stata, is the intermediate-sized program, while Small Stata handles the smallest of datasets and is the slowest ...
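The example datasets installed with Stata, described in the excerpt above, can also be reached from the Command window. The following is a minimal sketch that assumes the `auto` dataset, one of the files shipped with Stata; the choice of dataset and variables is only for illustration.

```stata
* List the example datasets that were installed with Stata
sysuse dir

* Load one of them and look at its contents
sysuse auto, clear
describe
summarize mpg weight
```

`sysuse` looks in the folders where Stata keeps its shipped files, so no path or working-directory change is needed.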
... program searches. Stata maintains many of these examples, and many are available from third parties.

Program

Stata is available on the three major operating systems: Windows, Macintosh, and Unix. In addition, there are several flavors of Stata available. These include Stata/MP, Stata/SE, Stata/IC, and Small Stata. These versions differ...

... a database source (ODBC, open database connectivity), or from any application that supports the open source XML format. The Export menu also has selections for exporting Microsoft Excel files (.xls, .xlsx). There are also Comma- or tab-separated data, Text data (fixed- or free-format), SAS XPORT, ODBC data source, and XML data items for exporting data files. As mentioned previously, Stata maintains tight integration...

... Packages Comparisons

Stata is a general-purpose statistical program that has some unique features not found in other such general packages. Two other popular general-purpose statistical packages are SAS (Statistical Analysis System) and SPSS (Statistical Package for the Social Sciences). Each of these has its strengths and weaknesses. SAS probably has the greatest user base among agricultural researchers...

... programs pasted into Stata. Data that include the column titles, if present, can be pasted into the Data Editor window, and Stata will enter the data into the cells. Stata asks if column titles are present and places that information in the gray column titles row at the top if needed. In addition to the Paste command is the Paste Special…, which is available for pasting into the Data Editor. This menu...
... the Command area of the Main window. The Data Editor also can be opened so that changes cannot be made by typing browse in the Command window. The Data Editor works just like any spreadsheet. If you are familiar with Excel, the Data Editor works in a similar fashion where data are entered in cells defined by the row number and column heading. In Stata, as in most statistical software, the rows are referred...

... appear with a black rectangle. The Data Editor is not capable of producing a noncontiguous dataset; therefore, if you select a cell by itself and enter a value, the Data Editor will enter missing values in all the empty cells from the first cell (row 1, column 1) to the cell in which you have entered data. The missing data will appear as periods (.). At the top of the Data Editor are several buttons. One

Date posted: 21/11/2015, 07:15


Contents

  • 1
    • Introduction
    • About the Author
  • 2
    • General Statistical Packages Comparisons
    • Program
    • Windows and Menus
    • What’s on the Menu?
    • Conclusion
  • 3
    • Data Entry
    • Importing Data
    • Manipulating Data and Formats
  • 4
    • Descriptive Statistics
    • Output Formats
    • Experimentation Ideas
  • 5
    • Two Sample Tests
    • ANOVA
    • Output and Meaning
  • 6
    • Variations of One Factor ANOVA Designs
    • Randomized Complete Block Design
    • Latin Square Designs
    • Balanced Incomplete Block Designs
    • Balanced Lattice Designs
    • Group Balanced Block Design
    • Subsampling
  • 7
    • Two and More Factors ANOVA
    • Split-Plot Design
    • Split-Block Design
    • Evaluation over Years or Seasons
