Data cleaning is a fairly broad term for the preliminary operations performed on a dataset before analysis. It is often the first task given to a research assistant, and it is the tedious part of any research project that makes us wish we had a research assistant. Stata is a good tool for cleaning and manipulating data, regardless of the software you plan to use for the analysis itself. Your first pass at a dataset may involve any or all of the following: dropping observations, deleting variables, moving variables, dealing with outliers, creating new variables, labeling variables, and renaming variables.

MOSL uses the Stata dataset moslauto.dta to present and run the examples throughout this lesson. The dataset covers two banks, ABB and ACB, observed from 2010 to 2018, with explanatory variables ROA, QM (bank size), TGHĐ (exchange rate), CPDT, VT, and ND. Download the data using the big button below and practice the data-handling steps along with us.
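The cleaning steps listed above can be sketched as a short Stata do-file. This is only an illustration, not the lesson's own code: it assumes moslauto.dta sits in the current working directory and that ROA and QM are numeric variables spelled exactly as written here (Stata variable names are case-sensitive). The names roa_outlier, QM_sq, and bank_size are hypothetical and introduced purely for the example, and the 1st and 99th percentile cutoffs are an arbitrary choice.

```stata
* A minimal sketch under the assumptions stated above.
use "moslauto.dta", clear

* Check the actual variable names and storage types before anything else.
describe

* Drop observations: remove rows where ROA is missing.
drop if missing(ROA)

* Deal with outliers: flag ROA values outside the 1st-99th percentile range.
* -summarize, detail- stores the percentiles in r(p1) and r(p99).
summarize ROA, detail
generate byte roa_outlier = (ROA < r(p1) | ROA > r(p99))

* Inspect the flagged rows before deciding what to do with them.
list ROA if roa_outlier

* Create a new variable (illustrative choice: the square of QM).
generate QM_sq = QM^2

* Move variables: put ROA, QM, and QM_sq first in the variable list.
order ROA QM QM_sq

* Label a variable.
label variable ROA "Return on assets"

* Rename a variable (bank_size is a hypothetical new name).
rename QM bank_size

* Delete a variable: remove the helper flag created above.
drop roa_outlier
```

Running `describe` first is worth the extra line, since every later command fails if a variable name differs from what the sketch assumes. Flagging outliers rather than dropping them immediately keeps the decision reversible until the flagged rows have been inspected.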
BASIC ECONOMETRICS
FOURTH EDITION

Damodar N. Gujarati
United States Military Academy, West Point

McGraw-Hill Higher Education
A Division of The McGraw-Hill Companies

Boston; Burr Ridge, IL; Dubuque, IA; Madison, WI; New York; San Francisco; St. Louis; Bangkok; Bogota; Caracas; Kuala Lumpur; Lisbon; London; Madrid; Mexico City; Milan; Montreal; New Delhi; Santiago; Seoul; Singapore; Sydney; Taipei; Toronto

BASIC ECONOMETRICS

Published by McGraw-Hill/Irwin, a business unit of The McGraw-Hill Companies, Inc., 1221 Avenue of the Americas, New York, NY, 10020. Copyright © 2003, 1995, 1988, 1978 by The McGraw-Hill Companies, Inc. All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written consent of The McGraw-Hill Companies, Inc., including, but not limited to, in any network or other electronic storage or transmission, or broadcast for distance learning.

Some ancillaries, including electronic and print components, may not be available to customers outside the United States.

This book is printed on acid-free paper.

Domestic: ISBN 978-0-07-233542-2, MHID 0-07-233542-4
International: ISBN 978-0-07-112342-6, MHID 0-07-112342-3

Publisher: Gary Burke
Executive sponsoring editor: Lucille Sutton
Developmental editor: Aric Bright
Marketing manager: Martin D. Quinn
Associate project manager: Catherine R. Schultz
Senior production supervisor: Lori Koetters
Senior designer: Jenny El-Shamy
Media producer: Melissa Kansa
Supplement producer: Erin Sauder
Cover design: Jamie O'Neal
Typeface: 10/12 New Aster
Compositor: Interactive Composition Corporation
Printer: R. R. Donnelley & Sons Company

Library of Congress Control Number: 2001099577

INTERNATIONAL EDITION ISBN 0-07-112342-3
Copyright © 2003. Exclusive rights by The McGraw-Hill Companies, Inc. for manufacture and export. This book cannot be re-exported from the country to which it is sold by McGraw-Hill. The International Edition is not available in North America.

www.mhhe.com

ABOUT THE AUTHOR

After teaching for more than 28 years at the City University of New York, Damodar N. Gujarati is currently a professor of economics in the Department of Social Sciences at the U.S. Military Academy at West Point, New York. Dr. Gujarati received his M.Com. degree from the University of Bombay in 1960, his M.B.A. degree from the University of Chicago in 1963, and his Ph.D. degree from the University of Chicago in 1965. Dr. Gujarati has published extensively in recognized national and international journals, such as the Review of Economics and Statistics, the Economic Journal, the Journal of Financial and Quantitative Analysis, the Journal of Business, the American Statistician, and the Journal of Industrial and Labor Relations. Dr. Gujarati is an editorial referee to several journals and book publishers and was a member of the Board of Editors of the Journal of Quantitative Economics, the official journal of the Indian Econometric Society. Dr. Gujarati is also the author of Pensions and the New York City Fiscal Crisis (the American Enterprise Institute, 1978), Government and Business (McGraw-Hill, 1984), and Essentials of Econometrics (McGraw-Hill, 2d ed., 1999). Dr. Gujarati's books on econometrics have been translated into several languages.

Dr. Gujarati was a Visiting Professor at the University of Sheffield, U.K. (1970-1971), a Visiting Fulbright Professor to India (1981-1982), a Visiting Professor in the School of Management of the National University of Singapore (1985-1986), and a Visiting Professor of Econometrics, University of New South Wales, Australia (summer of 1988). As a regular participant in USIA's lectureship program abroad, Dr. Gujarati has lectured extensively on micro- and macroeconomic topics in countries such as Australia, China, Bangladesh, Germany, India, Israel, Mauritius, and the Republic of South Korea. Dr. Gujarati has also given seminars and lectures in Canada and Mexico.
To my wife, Pushpa, and my daughters, Joan and Diane

BRIEF CONTENTS

PREFACE
Introduction

PART I  SINGLE-EQUATION REGRESSION MODELS
1  The Nature of Regression Analysis
2  Two-Variable Regression Analysis: Some Basic Ideas
3  Two-Variable Regression Model: The Problem of Estimation
4  Classical Normal Linear Regression Model (CNLRM)
5  Two-Variable Regression: Interval Estimation and Hypothesis Testing
6  Extensions of the Two-Variable Linear Regression Model
7  Multiple Regression Analysis: The Problem of Estimation
8  Multiple Regression Analysis: The Problem of Inference
9  Dummy Variable Regression Models

PART II  RELAXING THE ASSUMPTIONS OF THE CLASSICAL MODEL
10  Multicollinearity: What Happens if the Regressors Are Correlated?
11  Heteroscedasticity: What Happens if the Error Variance Is Nonconstant?
12  Autocorrelation: What Happens if the Error Terms Are Correlated?
13  Econometric Modeling: Model Specification and Diagnostic Testing

PART III  TOPICS IN ECONOMETRICS
14  Nonlinear Regression Models
15  Qualitative Response Regression Models
16  Panel Data Regression Models
17  Dynamic Econometric Models: Autoregressive and Distributed-Lag Models

PART IV  SIMULTANEOUS-EQUATION MODELS
18  Simultaneous-Equation Models
19  The Identification Problem
20  Simultaneous-Equation Methods
21  Time Series Econometrics: Some Basic Concepts
22  Time Series Econometrics: Forecasting

Appendix A  A Review of Some Statistical Concepts
Appendix B  Rudiments of Matrix Algebra
Appendix C  The Matrix Approach to Linear Regression Model
Appendix D  Statistical Tables
Appendix E  Economic Data on the World Wide Web

SELECTED BIBLIOGRAPHY

CONTENTS

PREFACE

Introduction
I.1  WHAT IS ECONOMETRICS?
I.2  WHY A SEPARATE DISCIPLINE?
I.3  METHODOLOGY OF ECONOMETRICS
     Statement of Theory or Hypothesis
     Specification of the Mathematical Model of Consumption
     Specification of the Econometric Model of Consumption
     Obtaining Data
     Estimation of the Econometric Model
     Hypothesis Testing
     Forecasting or Prediction
     Use of the Model for Control or Policy Purposes
     Choosing among Competing Models
I.4  TYPES OF ECONOMETRICS
I.5  MATHEMATICAL AND STATISTICAL PREREQUISITES
I.6  THE ROLE OF THE COMPUTER
I.7  SUGGESTIONS FOR FURTHER READING

PART I  SINGLE-EQUATION REGRESSION MODELS

1  The Nature of Regression Analysis
1.1  HISTORICAL ORIGIN OF THE TERM REGRESSION
1.2  THE MODERN INTERPRETATION OF REGRESSION
     Examples
1.3  STATISTICAL VERSUS DETERMINISTIC RELATIONSHIPS
1.4  REGRESSION VERSUS CAUSATION
1.5  REGRESSION VERSUS CORRELATION
1.6  TERMINOLOGY AND NOTATION
1.7  THE NATURE AND SOURCES OF DATA FOR ECONOMIC ANALYSIS
     Types of Data
     The Sources of Data
     The Accuracy of Data
     A Note on the Measurement Scales of Variables
1.8  SUMMARY AND CONCLUSIONS
     EXERCISES

2  Two-Variable Regression Analysis: Some Basic Ideas
2.1  A HYPOTHETICAL EXAMPLE
2.2  THE CONCEPT OF POPULATION REGRESSION FUNCTION (PRF)
2.3  THE MEANING OF THE TERM LINEAR
     Linearity in the Variables
     Linearity in the Parameters
2.4  STOCHASTIC SPECIFICATION OF PRF
2.5  THE SIGNIFICANCE OF THE STOCHASTIC DISTURBANCE TERM
2.6  THE SAMPLE REGRESSION FUNCTION (SRF)
2.7  AN ILLUSTRATIVE EXAMPLE
2.8  SUMMARY AND CONCLUSIONS
     EXERCISES

3  Two-Variable Regression Model: The Problem of Estimation
3.1  THE METHOD OF ORDINARY LEAST SQUARES
3.2  THE CLASSICAL LINEAR REGRESSION MODEL: THE ASSUMPTIONS UNDERLYING THE METHOD OF LEAST SQUARES
     A Word about These Assumptions
3.3  PRECISION OR STANDARD ERRORS OF LEAST-SQUARES ESTIMATES
3.4  PROPERTIES OF LEAST-SQUARES ESTIMATORS: THE GAUSS-MARKOV THEOREM
3.5  THE COEFFICIENT OF DETERMINATION $r^2$: A MEASURE OF "GOODNESS OF FIT"
3.6  A NUMERICAL EXAMPLE
3.7  ILLUSTRATIVE EXAMPLES
3.8  A NOTE ON MONTE CARLO EXPERIMENTS
3.9  SUMMARY AND CONCLUSIONS
     EXERCISES
     APPENDIX 3A
3A.1  DERIVATION OF LEAST-SQUARES ESTIMATES
3A.2  LINEARITY AND UNBIASEDNESS PROPERTIES OF LEAST-SQUARES ESTIMATORS
3A.3  VARIANCES AND STANDARD ERRORS OF LEAST-SQUARES ESTIMATORS
3A.4  COVARIANCE BETWEEN $\hat{\beta}_1$ AND $\hat{\beta}_2$
3A.5  THE LEAST-SQUARES ESTIMATOR OF $\sigma^2$
3A.6  MINIMUM-VARIANCE PROPERTY OF LEAST-SQUARES ESTIMATORS
3A.7  CONSISTENCY OF LEAST-SQUARES ESTIMATORS

4  Classical Normal Linear Regression Model (CNLRM)
4.1  THE PROBABILITY DISTRIBUTION OF DISTURBANCES $u_i$
4.2  THE NORMALITY ASSUMPTION FOR $u_i$
     Why the Normality Assumption?
4.3  PROPERTIES OF OLS ESTIMATORS UNDER THE NORMALITY ASSUMPTION
4.4  THE METHOD OF MAXIMUM LIKELIHOOD (ML)
4.5  SUMMARY AND CONCLUSIONS
     APPENDIX 4A
4A.1  MAXIMUM LIKELIHOOD ESTIMATION OF TWO-VARIABLE REGRESSION MODEL
4A.2  MAXIMUM LIKELIHOOD ESTIMATION OF FOOD EXPENDITURE IN INDIA
     APPENDIX 4A EXERCISES

5  Two-Variable Regression: Interval Estimation and Hypothesis Testing
5.1  STATISTICAL PREREQUISITES
5.2  INTERVAL ESTIMATION: SOME BASIC IDEAS
5.3  CONFIDENCE INTERVALS FOR REGRESSION COEFFICIENTS $\beta_1$ AND $\beta_2$
     Confidence Interval for $\beta_2$
     Confidence Interval for $\beta_1$
     Confidence Interval for $\beta_1$ and $\beta_2$ Simultaneously
5.4  CONFIDENCE INTERVAL FOR $\sigma^2$
5.5  HYPOTHESIS TESTING: GENERAL COMMENTS
5.6  HYPOTHESIS TESTING: THE CONFIDENCE-INTERVAL APPROACH
     Two-Sided or Two-Tail Test
     One-Sided or One-Tail Test
5.7  HYPOTHESIS TESTING: THE TEST-OF-SIGNIFICANCE APPROACH
     Testing the Significance of Regression Coefficients: The $t$ Test
     Testing the Significance of $\sigma^2$: The $\chi^2$ Test
5.8  HYPOTHESIS TESTING: SOME PRACTICAL ASPECTS
     The Meaning of "Accepting" or "Rejecting" a Hypothesis
     The "Zero" Null Hypothesis and the "2-t" Rule of Thumb
     Forming the Null and Alternative Hypotheses
     Choosing $\alpha$, the Level of Significance
     The Exact Level of Significance: The p Value
     Statistical Significance versus Practical Significance
     The Choice between Confidence-Interval and Test-of-Significance Approaches to Hypothesis Testing
5.9  REGRESSION ANALYSIS AND ANALYSIS OF VARIANCE
5.10  APPLICATION OF REGRESSION ANALYSIS: THE PROBLEM OF PREDICTION
     Mean Prediction
     Individual Prediction
5.11  REPORTING THE RESULTS OF REGRESSION ANALYSIS
5.12  EVALUATING THE RESULTS OF REGRESSION ANALYSIS
     Normality Tests
     Other Tests of Model Adequacy
5.13  SUMMARY AND CONCLUSIONS
     EXERCISES
     APPENDIX 5A
5A.1  PROBABILITY DISTRIBUTIONS RELATED TO THE NORMAL DISTRIBUTION
5A.2  DERIVATION OF EQUATION (5.3.2)
5A.3  DERIVATION OF EQUATION (5.9.1)
5A.4  DERIVATIONS OF EQUATIONS (5.10.2) AND (5.10.6)
     Variance of Mean Prediction
     Variance of Individual Prediction

6  Extensions of the Two-Variable Linear Regression Model
6.1  REGRESSION THROUGH THE ORIGIN
     $r^2$ for Regression-through-Origin Model
6.2  SCALING AND UNITS OF MEASUREMENT
     A Word about Interpretation
6.3  REGRESSION ON STANDARDIZED VARIABLES
6.4  FUNCTIONAL FORMS OF REGRESSION MODELS
6.5  HOW TO MEASURE ELASTICITY: THE LOG-LINEAR MODEL
6.6  SEMILOG MODELS: LOG-LIN AND LIN-LOG MODELS
     How to Measure the Growth Rate: The Log-Lin Model
     The Lin-Log Model
...