Biostatistical Design and Analysis Using R
A Practical Guide

Murray Logan

A John Wiley & Sons, Inc., Publication

Companion website
A companion website for this book is available at: www.wiley.com/go/logan/r
The website includes figures from the book for downloading.

This edition first published 2010, © 2010 by Murray Logan

Blackwell Publishing was acquired by John Wiley & Sons in February 2007. Blackwell's publishing program has been merged with Wiley's global Scientific, Technical and Medical business to form Wiley-Blackwell.

Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial offices:
9600 Garsington Road, Oxford, OX4 2DQ, UK
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
111 River Street, Hoboken, NJ 07030-5774, USA

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloguing-in-Publication Data
Logan, Murray.
Biostatistical design and analysis using R : a practical guide / Murray Logan.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-4443-3524-8 (hardcover : alk. paper) – ISBN 978-1-4051-9008-4 (pbk. : alk. paper)
Biometry. R (Computer program language). I. Title.
QH323.5.L645 2010
570.1'5195 – dc22
2009053162

A catalogue record for this book is available from the British Library.

Typeset in 10.5/13pt Minion by Laserwords Private Limited, Chennai, India
Printed and bound in Singapore 2010

Contents

Preface
R quick reference card
General key to statistical methods

1 Introduction to R
  1.1 Why R?
  1.2 Installing R
    1.2.1 Windows
    1.2.2 Unix/Linux
    1.2.3 MacOSX
  1.3 The R environment
    1.3.1 The console (command line)
  1.4 Object names
  1.5 Expressions, Assignment and Arithmetic
  1.6 R Sessions and workspaces
    1.6.1 Cleaning up
    1.6.2 Workspaces
    1.6.3 Current working directory
    1.6.4 Quitting R
  1.7 Getting help
  1.8 Functions
  1.9 Precedence
  1.10 Vectors - variables
    1.10.1 Regular or patterned sequences
    1.10.2 Character vectors
    1.10.3 Factors
  1.11 Matrices, lists and data frames
    1.11.1 Matrices
    1.11.2 Lists
    1.11.3 Data frames - data sets
  1.12 Object information and conversion
    1.12.1 Object information
    1.12.2 Object conversion
  1.13 Indexing vectors, matrices and lists
    1.13.1 Vector indexing
    1.13.2 Matrix indexing
    1.13.3 List indexing
  1.14 Pattern matching and replacement (character search and replace)
    1.14.1 grep - pattern searching
    1.14.2 regexpr - position and length of match
    1.14.3 gsub - pattern replacement
  1.15 Data manipulation
    1.15.1 Sorting
    1.15.2 Formatting data
  1.16 Functions that perform other functions repeatedly
    1.16.1 Along matrix margins
    1.16.2 By factorial groups
    1.16.3 By objects
  1.17 Programming in R
    1.17.1 Grouped expressions
    1.17.2 Conditional execution – if and ifelse
    1.17.3 Repeated execution – looping
    1.17.4 Writing functions
  1.18 An introduction to the R graphical environment
    1.18.1 The plot() function
    1.18.2 Graphical devices
    1.18.3 Multiple graphics devices
  1.19 Packages
    1.19.1 Manual package management
    1.19.2 Loading packages
  1.20 Working with scripts
  1.21 Citing R in publications
  1.22 Further reading

2 Data sets
  2.1 Constructing data frames
  2.2 Reviewing a data frame - fix()
  2.3 Importing (reading) data
    2.3.1 Import from text file
    2.3.2 Importing from the clipboard
    2.3.3 Import from other software
  2.4 Exporting (writing) data
  2.5 Saving and loading of R objects
  2.6 Data frame vectors
    2.6.1 Factor levels
  2.7 Manipulating data sets
    2.7.1 Subsets of data frames – data frame indexing
    2.7.2 The %in% matching operator
    2.7.3 Pivot tables and aggregating datasets
    2.7.4 Sorting datasets
    2.7.5 Accessing and evaluating expressions within the context of a dataframe
    2.7.6 Reshaping dataframes
  2.8 Dummy data sets - generating random data

3 Introductory statistical principles
  3.1 Distributions
    3.1.1 The normal distribution
    3.1.2 Log-normal distribution
  3.2 Scale transformations
  3.3 Measures of location
  3.4 Measures of dispersion and variability
  3.5 Measures of the precision of estimates - standard errors and confidence intervals
  3.6 Degrees of freedom
  3.7 Methods of estimation
    3.7.1 Least squares (LS)
    3.7.2 Maximum likelihood (ML)
  3.8 Outliers
  3.9 Further reading

4 Sampling and experimental design with R
  4.1 Random sampling
  4.2 Experimental design
    4.2.1 Fully randomized treatment allocation
    4.2.2 Randomized complete block treatment allocation

5 Graphical data presentation
  5.1 The plot() function
    5.1.1 The type parameter
    5.1.2 The xlim and ylim parameters
    5.1.3 The xlab and ylab parameters
    5.1.4 The axes and ann parameters
    5.1.5 The log parameter
  5.2 Graphical Parameters
    5.2.1 Plot dimensional and layout parameters
    5.2.2 Axis characteristics
    5.2.3 Character sizes
    5.2.4 Line characteristics
    5.2.5 Plotting character parameter - pch

R Index

-> ->>= (assignment), 11
: (sequence), 10, 11, 12
:: (name space), 11
< (less than?), 11