
Hair multivariate data analysis 7th revised


DOCUMENT INFORMATION

Basic information

Pages: 739
File size: 10.92 MB

Content

Hair's Multivariate Data Analysis, seventh edition, provides an applications-oriented introduction to multivariate analysis for readers who want to learn statistics for research, theses, and graduation projects.

Multivariate Data Analysis
Seventh Edition
Joseph F. Hair Jr., William C. Black, Barry J. Babin, Rolph E. Anderson

Pearson Education Limited
Edinburgh Gate, Harlow, Essex CM20 2JE, England
and Associated Companies throughout the world

Visit us on the World Wide Web at: www.pearsoned.co.uk

© Pearson Education Limited 2014

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written permission of the publisher or a licence permitting restricted copying in the United Kingdom issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.

All trademarks used herein are the property of their respective owners. The use of any trademark in this text does not vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such trademarks imply any affiliation with or endorsement of this book by such owners.

ISBN 10: 1-292-02190-X
ISBN 13: 978-1-292-02190-4

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Printed in the United States of America

PEARSON CUSTOM LIBRARY

Table of Contents

All chapters are by Joseph F. Hair, Jr., William C. Black, Barry J. Babin, and Rolph E. Anderson.

1. Overview of Multivariate Methods
2. Examining Your Data
3. Exploratory Factor Analysis
4. Multiple Regression Analysis
5. Multiple Discriminant Analysis
6. Logistic Regression: Regression with a Binary Dependent Variable
7. Conjoint Analysis
8. Cluster Analysis
9. Multidimensional Scaling
10. Analyzing Nominal Data with Correspondence Analysis
11. Structural Equations Modeling Overview
12. Confirmatory Factor Analysis
13. Testing Structural Equations Models
14. MANOVA and GLM
Index

Overview of Multivariate Methods

From Chapter 1 of Multivariate Data Analysis, 7/e. Joseph F. Hair, Jr., William C. Black, Barry J. Babin, Rolph E. Anderson. Copyright © 2010 by Pearson Prentice Hall. All rights reserved.

LEARNING OBJECTIVES

Upon completing this chapter, you should be able to do the following:

• Explain what multivariate analysis is and when its application is appropriate.
• Discuss the nature of measurement scales and their relationship to multivariate techniques.
• Understand the nature of measurement error and its impact on multivariate analysis.
• Determine which multivariate technique is appropriate for a specific research problem.
• Define the specific techniques included in multivariate analysis.
• Discuss the guidelines for application and interpretation of multivariate analyses.
• Understand the six-step approach to multivariate model building.

CHAPTER PREVIEW

This chapter presents a simplified overview of multivariate analysis. It stresses that multivariate analysis methods will increasingly influence not only the analytical aspects of research but also the design and approach to data collection for decision making and problem solving. Although multivariate techniques share many characteristics with their univariate and bivariate counterparts, several key differences arise in the transition to a multivariate analysis. To illustrate this transition, this chapter presents a classification of multivariate techniques. It then provides general guidelines for the application of these techniques as well as a structured approach to the formulation, estimation, and interpretation of multivariate results. The chapter concludes with a discussion of the databases utilized throughout the text to illustrate application of the techniques.

KEY TERMS

Before starting the chapter, review the key terms to develop an understanding of the concepts and terminology used. Throughout the chapter, the key terms appear in boldface. Other points of emphasis in the chapter are italicized. Also, cross-references within the key terms appear in italics.

Alpha (α): See Type I error.

Beta (β): See Type II error.

Bivariate partial correlation: Simple (two-variable) correlation between two sets of residuals (unexplained variances) that remain after the association of other independent variables is removed.

Bootstrapping: An approach to validating a multivariate model by drawing a large number of subsamples and estimating models for each subsample. Estimates from all the subsamples are then combined, providing not only the “best” estimated coefficients (e.g., means of each estimated coefficient across all the subsample models), but their expected variability and thus their likelihood of differing from zero; that is, are the estimated coefficients statistically different from zero or not?
This approach does not rely on statistical assumptions about the population to assess statistical significance, but instead makes its assessment based solely on the sample data.

Composite measure: See summated scales.

Dependence technique: Classification of statistical techniques distinguished by having a variable or set of variables identified as the dependent variable(s) and the remaining variables as independent. The objective is prediction of the dependent variable(s) by the independent variable(s). An example is regression analysis.

Dependent variable: Presumed effect of, or response to, a change in the independent variable(s).

Dummy variable: Nonmetrically measured variable transformed into a metric variable by assigning a 1 or a 0 to a subject, depending on whether it possesses a particular characteristic.

Effect size: Estimate of the degree to which the phenomenon being studied (e.g., correlation or difference in means) exists in the population.

Independent variable: Presumed cause of any change in the dependent variable.

Indicator: Single variable used in conjunction with one or more other variables to form a composite measure.

Interdependence technique: Classification of statistical techniques in which the variables are not divided into dependent and independent sets; rather, all variables are analyzed as a single set (e.g., factor analysis).

Measurement error: Inaccuracies of measuring the “true” variable values due to the fallibility of the measurement instrument (i.e., inappropriate response scales), data entry errors, or respondent errors.

Metric data: Also called quantitative data, interval data, or ratio data, these measurements identify or describe subjects (or objects) not only on the possession of an attribute but also by the amount or degree to which the subject may be characterized by the attribute. For example, a person’s age and weight are metric data.

Multicollinearity: Extent to which a variable can be explained by the other variables in the analysis. As multicollinearity increases, it complicates the interpretation of the variate because it is more difficult to ascertain the effect of any single variable, owing to their interrelationships.

Multivariate analysis: Analysis of multiple variables in a single relationship or set of relationships.

Multivariate measurement: Use of two or more variables as indicators of a single composite measure. For example, a personality test may provide the answers to a series of individual questions (indicators), which are then combined to form a single score (summated scale) representing the personality trait.

Nonmetric data: Also called qualitative data, these are attributes, characteristics, or categorical properties that identify or describe a subject or object. They differ from metric data by indicating the presence of an attribute, but not the amount. Examples are occupation (physician, attorney, professor) or buyer status (buyer, nonbuyer). Also called nominal data or ordinal data.

Power: Probability of correctly rejecting the null hypothesis when it is false; that is, correctly finding a hypothesized relationship when it exists. Determined as a function of (1) the statistical significance level set by the researcher for a Type I error (α), (2) the sample size used in the analysis, and (3) the effect size being examined.

Practical significance: Means of assessing multivariate analysis results based on their substantive findings rather than their statistical significance. Whereas statistical significance determines whether the result is attributable to chance, practical significance assesses whether the result is useful (i.e., substantial enough to warrant action) in achieving the research objectives.

Reliability: Extent to which a variable or set of variables is consistent in what it is intended to measure. If multiple measurements are taken, the reliable measures will all be consistent in their values. It differs from validity in that it relates not to what should be measured, but instead to how it is measured.

Specification error: Omitting a key variable from the analysis, thus affecting the estimated effects of included variables.

Summated scales: Method of combining several variables that measure the same concept into a single variable in an attempt to increase the reliability of the measurement through multivariate measurement. In most instances, the separate variables are summed and then their total or average score is used in the analysis.

Treatment: Independent variable the researcher manipulates to see the effect (if any) on the dependent variable(s), such as in an experiment (e.g., testing the appeal of color versus black-and-white advertisements).

Type I error: Probability of incorrectly rejecting the null hypothesis. In most cases, it means saying a difference or correlation exists when it actually does not. Also termed alpha (α). Typical levels are 5 or 1 percent, termed the .05 or .01 level, respectively.

Type II error: Probability of incorrectly failing to reject the null hypothesis. In simple terms, the chance of not finding a correlation or mean difference when it does exist. Also termed beta (β), it is inversely related to Type I error. The value of 1 minus the Type II error (1 - β) is defined as power.

Univariate analysis of variance (ANOVA): Statistical technique used to determine, on the basis of one dependent measure, whether samples are from populations with equal means.

Validity: Extent to which a measure or set of measures correctly represents the concept of study, that is, the degree to which it is free from any systematic or nonrandom error. Validity is concerned with how well the concept is defined by the measure(s), whereas reliability relates to the consistency of the measure(s).

Variate: Linear combination of variables formed in the multivariate technique by deriving empirical weights applied to a set of variables specified by the researcher.

WHAT IS MULTIVARIATE ANALYSIS?
Today businesses must be more profitable, react quicker, and offer higher-quality products and services, and do it all with fewer people and at lower cost. An essential requirement in this process is effective knowledge creation and management. There is no lack of information, but there is a dearth of knowledge. As Tom Peters said in his book Thriving on Chaos, “We are drowning in information and starved for knowledge” [7].

The information available for decision making has exploded in recent years, and will continue to do so in the future, probably even faster. Until recently, much of that information just disappeared. It was either not collected or discarded. Today this information is being collected and stored in data warehouses, and it is available to be “mined” for improved decision making. Some of that information can be analyzed and understood with simple statistics, but much of it requires more complex, multivariate statistical techniques to convert these data into knowledge.

A number of technological advances help us to apply multivariate techniques. Among the most important are the developments in computer hardware and software. The speed of computing equipment has doubled every 18 months while prices have tumbled. User-friendly software packages brought data analysis into the point-and-click era, and we can quickly analyze mountains of complex data with relative ease. Indeed, industry, government, and university-related research centers throughout the world are making widespread use of these techniques.

We use the generic term researcher when referring to a data analyst within either the practitioner or academic communities. We feel it inappropriate to make any distinction between these two areas, because research in both relies on theoretical and quantitative bases. Although the research objectives and the emphasis in interpretation may vary, a researcher within either area must address all of the issues, both conceptual and empirical, raised in the discussions of the statistical methods.

MULTIVARIATE ANALYSIS IN STATISTICAL TERMS

Multivariate analysis techniques are popular because they enable organizations to create knowledge and thereby improve their decision making. Multivariate analysis refers to all statistical techniques that simultaneously analyze multiple measurements on individuals or objects under investigation. Thus, any simultaneous analysis of more than two variables can be loosely considered multivariate analysis.

Many multivariate techniques are extensions of univariate analysis (analysis of single-variable distributions) and bivariate analysis (cross-classification, correlation, analysis of variance, and simple regression used to analyze two variables). For example, simple regression (with one predictor variable) is extended in the multivariate case to include several predictor variables. Likewise, the single dependent variable found in analysis of variance is extended to include multiple dependent variables in multivariate analysis of variance. Some multivariate techniques (e.g., multiple regression and multivariate analysis of variance) provide a means of performing in a single analysis what once took multiple univariate analyses to accomplish. Other multivariate techniques, however, are uniquely designed to deal with multivariate issues, such as factor analysis, which identifies the structure underlying a set of variables, or discriminant analysis, which differentiates among groups based on a set of variables.

Confusion sometimes arises about what multivariate analysis is because the term is not used consistently in the literature. Some researchers use multivariate simply to mean examining relationships between or among more than two variables. Others use the term only for problems in which all the multiple variables are assumed to have a multivariate normal distribution. To be considered truly multivariate, however, all the variables must be random and interrelated in such ways that their different
effects cannot meaningfully be interpreted separately. Some authors state that the purpose of multivariate analysis is to measure, explain, and predict the degree of relationship among variates (weighted combinations of variables). Thus, the multivariate character lies in the multiple variates (multiple combinations of variables), and not only in the number of variables or observations. For our present purposes, we do not insist on a rigid definition of multivariate analysis. Instead, multivariate analysis will include both multivariable techniques and truly multivariate techniques, because we believe that knowledge of multivariable techniques is an essential first step in understanding multivariate analysis.

SOME BASIC CONCEPTS OF MULTIVARIATE ANALYSIS

Although the roots of multivariate analysis lie in univariate and bivariate statistics, the extension to the multivariate domain introduces additional concepts and issues of particular relevance. These concepts range from the need for a conceptual understanding of the basic building block of multivariate analysis (the variate) to specific issues dealing with the types of measurement scales used and the statistical issues of significance testing and confidence levels. Each concept plays a significant role in the successful application of any multivariate technique.

The Variate

As previously mentioned, the building block of multivariate analysis is the variate, a linear combination of variables with empirically determined weights. The variables are specified by the researcher, whereas the weights are determined by the multivariate technique to meet a specific objective. A variate of n weighted variables (X_1 to X_n) can be stated mathematically as:

\text{Variate value} = w_1 X_1 + w_2 X_2 + w_3 X_3 + \dots + w_n X_n

where X_n is the observed variable and w_n is the weight determined by the multivariate technique.

The result is a single value representing a combination of the entire set of variables that best achieves the objective of the specific multivariate analysis. In multiple regression, the variate is determined in a manner that maximizes the correlation between the multiple independent variables and the single dependent variable. In discriminant analysis, the variate is formed so as to create scores for each observation that maximally differentiate between groups of observations. In factor analysis, variates are formed to best represent the underlying structure or patterns of the variables as represented by their intercorrelations.

In each instance, the variate captures the multivariate character of the analysis. Thus, in our discussion of each technique, the variate is the focal point of the analysis in many respects. We must understand not only its collective impact in meeting the technique’s objective but also each separate variable’s contribution to the overall variate effect.

Measurement Scales

Data analysis involves the identification and measurement of variation in a set of variables, either among themselves or between a dependent variable and one or more independent variables. The key word here is measurement because the researcher cannot identify variation unless it can be measured. Measurement is important in accurately representing the concept of interest and is instrumental in the selection of the appropriate multivariate method of analysis. Data can be classified into one of two categories, nonmetric (qualitative) and metric (quantitative), based on the type of attributes or characteristics they represent.

The researcher must define the measurement type, nonmetric or metric, for each variable. To the computer, the values are only numbers. As we will see in the following section, defining data as either metric or nonmetric has substantial impact on what the data can represent and how it can be analyzed.

NONMETRIC MEASUREMENT SCALES

Nonmetric data describe differences in type or kind by indicating the presence or absence of a characteristic or property. These properties are
discrete in that by having a particular feature, all other features are excluded; for example, if a person is male, he cannot be female. An “amount” of gender is not possible, just the state of being male or female. Nonmetric measurements can be made with either a nominal or an ordinal scale.

Nominal Scales

A nominal scale assigns numbers as a way to label or identify subjects or objects. The numbers assigned to the objects have no quantitative meaning beyond indicating the presence or absence of the attribute or characteristic under investigation. Therefore, nominal scales, also known as categorical scales, can only provide the number of occurrences in each class or category of the variable being studied.

For example, in representing gender (male or female) the researcher might assign numbers to each category (e.g., one code for females and another for males). With these values, however, we can only tabulate the number of males and females; it is nonsensical to calculate an average value of gender.

Nominal data only represent categories or classes and do not imply amounts of an attribute or characteristic. Commonly used examples of nominally scaled data include many demographic attributes (e.g., individual’s sex, religion, occupation, or political party affiliation), many forms of behavior (e.g., voting behavior or purchase activity), or any other action that is discrete (happens or not).

Ordinal Scales

Ordinal scales are the next “higher” level of measurement precision. In the case of ordinal scales, variables can be ordered or ranked in relation to the amount of the attribute possessed. Every subject or object can be compared with another in terms of a “greater than” or “less than” relationship. The numbers utilized in ordinal scales, however, are really nonquantitative because they indicate only relative positions in an ordered series. Ordinal scales provide no measure of the actual amount or magnitude in absolute terms, only the order of the values. The researcher knows the order, but not the amount of difference between the values.
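The practical difference between nominal and ordinal measurement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the respondent records, the field names `sex` and `satisfaction_rank`, and the helper functions are all hypothetical and do not come from the text. The sketch tabulates a nominal variable, converts it to a 1/0 dummy variable as described in the key terms, and makes a "greater than" comparison on an ordinal variable.

```python
from collections import Counter

# Hypothetical survey records; field names and values are illustrative only.
respondents = [
    {"sex": "female", "satisfaction_rank": 3},
    {"sex": "male", "satisfaction_rank": 1},
    {"sex": "female", "satisfaction_rank": 2},
    {"sex": "male", "satisfaction_rank": 3},
    {"sex": "female", "satisfaction_rank": 1},
]


def tabulate_nominal(records, key):
    """Count occurrences per category, the only summary a nominal scale supports."""
    return Counter(record[key] for record in records)


def dummy_code(records, key, category):
    """Return 1 where a record has the given category and 0 otherwise."""
    return [1 if record[key] == category else 0 for record in records]


def outranks(a, b, key):
    """Ordinal comparison: tells which record ranks higher, not by how much."""
    return a[key] > b[key]


sex_counts = tabulate_nominal(respondents, "sex")
female_dummy = dummy_code(respondents, "sex", "female")
first_outranks_second = outranks(respondents[0], respondents[1], "satisfaction_rank")
```

Averaging the `sex` labels is deliberately not attempted, because a nominal code carries no amount. The rank comparison is valid, but the difference between ranks 3 and 1 should not be read as a measured distance.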

Date posted: 29/12/2021, 06:17
