(Master's thesis) Construct credit scoring models using logistic regression, neural network and the hybrid model


UNIVERSITY OF ECONOMICS, HO CHI MINH CITY, VIETNAM
INSTITUTE OF SOCIAL STUDIES, THE HAGUE, THE NETHERLANDS

VIETNAM – NETHERLANDS PROGRAMME FOR M.A. IN DEVELOPMENT ECONOMICS

CONSTRUCT CREDIT SCORING MODELS USING LOGISTIC REGRESSION, NEURAL NETWORK AND THE HYBRID MODEL

A thesis submitted in partial fulfilment of the requirements for the degree of MASTER OF ARTS IN DEVELOPMENT ECONOMICS

By LE MINH TIEN
Academic Supervisor: DR PHAM DINH LONG

HO CHI MINH CITY, NOVEMBER 2015

Abstract

Viet Nam's economy is facing many difficulties. Enterprises are operating inefficiently, which has pushed up the non-performing loan ratio of banks. Between 2007 and 2014, credit growth in Viet Nam fell from 53.89% in 2007 to 11.8% in 2014, with no sign of a strong recovery in the next period. Declining credit growth implies that enterprises are finding it difficult to obtain credit from lending institutions, and the enterprises that rely mainly on credit are the most strongly affected. Over the same period the non-performing loan ratio of banks in Viet Nam rose from 2% in 2007 to 3.25% in 2014, peaking at 4.08% in 2012. During this period most enterprises could not obtain bank loans, while banks feared a further rise in the non-performing loan ratio. At the same time, banks are competing strongly with domestic and foreign rivals to gain market share and maintain their current profits. Viet Nam is a densely populated country (a market of 90 million people with a high proportion of young people), which makes it a potential retail market for banks to expand into in the next period. To increase the
competitiveness of banks and to improve loan risk management, this study applies methods commonly used to build credit scoring models: logistic regression, a neural network and a hybrid model. Credit scoring has been developed and widely applied in finance and banking over recent decades, and it helps banks accelerate their credit analysis. The final results confirm that age, education, marital status, current living status, living time in the current place, type of job, working time in current job, working time in current field, number of dependent people and historical payment have a statistically significant effect on a customer's repayment capacity. Credit scoring models can classify customers according to the different strategic purposes of their users, and the hybrid models appeared to perform better and more reliably than the separate ones.

Content

CHAPTER 1: INTRODUCTION
CHAPTER 2: LITERATURE REVIEW
2.1 The concept of credit scoring model
2.2 Judgmental analysis method and credit scoring model
2.3 Advantages and disadvantages of credit scoring models
2.4 Historical development of credit scoring models
2.4.1 Development in credit card and instant loan markets
2.4.2 Development in mortgage markets
2.4.3 Development in consumer credit market
2.5 Common variables in constructing credit scoring models
2.6 Common techniques employed in credit scoring models
CHAPTER 3: METHODOLOGY
3.1 Data
3.1.1 Variables
3.1.2 Assumptions
3.2 Methodology
3.3 Logistic regression
3.3.1 Theory
3.3.2 Odds ratio
3.3.3 Information value
3.3.4 Quality of the model
3.3.4.1 Log-likelihood ratio (LR) test
3.3.4.2 Pearson Chi-Square test
3.3.4.3 Akaike Information Criterion (AIC)
3.4 Neural Network
3.4.1 Theory
3.4.2 Components of artificial neural network
3.4.3 Back Propagation Algorithm
3.5 The hybrid model
3.6 Comparison of models
CHAPTER 4: EMPIRICAL RESULTS
4.1 Data
4.1.1 Dependent Variable
4.1.2 Independent Variables
4.2 Estimation results
4.2.1 Construction of Logit models
4.2.1 Comparison of Logit models
4.2.1.1 Log-likelihood ratio (LR) test
4.2.1.2 Pearson Chi-square test
4.2.1.3 Akaike Information Criterion (AIC)
4.2.1.4 Classification tables
4.2.1.5 Comparison summary
4.3 Neural network
4.3.1 Measurement of Model performance
4.3.2 Importance of independent variables
4.4 Hybrid model
4.4.1 Hybrid model 1
4.4.2 Hybrid model 2
4.5 Summary comparison
CHAPTER 5: CONCLUSION
5.1 Research summary and implication
5.1.1 Research summary
5.1.2 Implication
5.2 Limitations of the study
References

List of tables

Table 01 Common variables in previous studies
Table 02 Common methods in previous studies
Table 03 Variables and their definitions
Table 04 Summary of selected variables in logit models
Table 05 Log-likelihood ratio (LR) test
Table 06 Pearson Chi-square test result
Table 07 Akaike Information Criterion (AIC) result
Table 08 Classification table of logit models
Table 09 Summary logit model comparison
Table 10 Neural network model summary
Table 11 Classification of Neural network model
Table 12 Importance of independent variables of Neural network model
Table 13 Hybrid model summary
Table 14 Classification of Hybrid model 1
Table 15 Hybrid model summary
Table 16 Classification of Hybrid model 2
Table 17 Selected model summary
Table 18 Correlation Matrix
Table 19 Collinearity Test
Table 20 Results of logit model
Table 21 Results of logit model
Table 22 Results of logit model
Table 23 Results of Neural network model
Table 24 Results of Hybrid model 1
Table 25 Results of Hybrid model 2
Table 26 Summary of Information value of variables

List of figures

Figure 01 Viet Nam credit growth in 2006-2014
Figure 02 Non performing loan ratio in 2006-2014
Figure 03 Steps to construct Credit scoring model
Figure 04 Processing information in an Artificial Neuron
Figure 05 Neural network with one hidden layer
Figure 06 Example of Summation function
Figure 07 Example of Sigmoid function of ANN
Figure 08 Back propagation algorithm of single neuron
Figure 09 Ratio of good/bad customer of dataset
Figure 10 Ratio of good/bad customer based on age of customer
Figure 11 Ratio of good/bad customer based on Current living status
Figure 12 Ratio of good/bad customer based on Education level
Figure 13 Ratio of good/bad customer based on Gender
Figure 14 Ratio of good/bad customer based on Marital status
Figure 15 Ratio of good/bad customer based on Living time at current place
Figure 16 Ratio of good/bad customer based on Type of job
Figure 17 Ratio of good/bad customer based on Working time in present job
Figure 18 Ratio of good/bad customer based on Working time in current field
Figure 19 Ratio of good/bad customer based on Number of dependent people
Figure 20 Ratio of good/bad customer based on Historical payment

CHAPTER 1: INTRODUCTION

In 2007, the financial crisis began
in the United States (US) with a sharp decline in home prices, then affected the entire US economy and spread through the world economy. A deep cut in demand all over the world created many difficulties for Viet Nam's export sector at that time. Enterprises had to scale down their operations, and as a result credit growth in the banking system has slowed in the recent period.

Figure 01: Viet Nam credit growth in 2006-2014 (chart of annual credit growth in % for the years 2006 to 2014). Source: The State Bank of Vietnam's annual report 2006-2014

Banks fear losing their capital because of the economy's ongoing difficulties while signs of recovery remain very weak, so they are cautious in their lending decisions. Economists forecast that this situation will continue for the next few years. Some economists suggest that retail banking is an alternative strategy that could help banks develop their businesses in the coming period, and may be the key growth segment, because Viet Nam has a market of 90 million people (with a high proportion of young people). This market creates opportunities for banks to expand services that help consumers increase asset value, manage their businesses better and carry out daily payments. Viet Nam has typical characteristics of a low-income developing country, such as a dynamic young population, rising incomes and a desire for a better quality of life, and these give retail banking great potential. To take advantage of this opportunity, banks have to make their procedures more convenient and improve risk management in order to develop this new segment.

Figure 02: Non performing loan ratio in 2006-2014 (chart of the non-performing loan ratio in % for the years 2006 to 2014). Source: The State Bank of Vietnam's annual report 2006-2014

A credit scoring model is a useful tool that was first introduced in 1940 and has developed rapidly over the last two decades. It is a statistical technique that helps banks and other lending institutions predict the probability that a customer will pay back a loan on time (Mester, 1997). It enables banks and financial institutions to classify and evaluate customer risk quickly and easily, so that lending decisions are faster and more accurate than under a judgmental system. This paper builds credit scoring models using logistic regression, a neural network and a hybrid of the two, in order to find the most suitable one and to draw implications for assessing customer risk.

Research questions and objectives:

This study aims to build credit scoring models using three techniques (logistic regression, a neural network and a hybrid of the two) to identify which customer characteristics affect default probability, and then to compare the performance of the models and find the best one. The results of this study will answer the questions below:

• Which characteristics of a customer can be used to identify whether that customer can pay back the loan or not?
• Which technique is better for constructing a credit scoring model in this study?

Scope of this study:

The sample for this study comprises personal information on 690 customers of MBBank who had a loan with the bank during 2012. The status of their loans (default/delinquent or good) is recorded at the end of 2013. The personal information covers the following characteristics: gender, age, education, marital status, current living status, living time in the current place, type of job, working time in current job, working time in current field, number of dependent people and historical payment.

Structure of this study:

The first chapter states the reason for conducting the study, the research questions and the objectives. The second chapter gives an overview of credit scoring models, then reviews the independent variables that are commonly used and the methods most often applied to construct a credit scoring model. The third chapter details the independent variables used in the research, including the meaning of each variable and the assumptions made about it, and the steps used to construct the models. The first part of the fourth chapter gives an overview of the dataset used in this study and illustrates the initial relationships between the independent variables and the dependent variable. The second part of the fourth chapter outlines the steps taken to construct and select the final credit scoring model. Findings, implications and limitations of this study are discussed in the last chapter.

Variable                 Category       B        S.E.     Wald      Sig.
Marital Status           Divorce
                         Married        -1.081   0.488    4.907     0.027
                         Single         -0.513   0.548    0.878     0.349
Current Living Status    Renting
                         Owner          -0.876   0.330    7.049     0.008
                         With parent    0.175    0.374    0.218     0.641
(unlabeled)              4-7            -0.828   0.351    5.575     0.018
                         >7             -0.896   0.302    8.812     0.003
(unlabeled)              Officer        -1.634   0.271    36.250    0.000
                         Manager        -2.335   0.342    46.589    0.000
(unlabeled)              2-4            -0.485   0.517    0.877     0.349
                         >4             -0.850   0.500    2.893     0.089
(unlabeled)                             2.505    0.461    29.569    0.000
(unlabeled)                             3.806    0.785    23.506    0.000
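
The coefficient rows above can be cross-checked with a short calculation. The sketch below is only an illustration and rests on an assumption: the column headers did not survive in this excerpt, so the four numbers in each row are read here as the coefficient (B), its standard error (S.E.), the Wald statistic and the significance level (Sig.). That reading is consistent with the usual relation Wald = (B / S.E.)^2, compared against a chi-square distribution with one degree of freedom. The names ROWS and wald_summary are hypothetical, and exp(B) is the odds ratio for the outcome the model predicts, relative to whichever category carries no coefficient; the coding of the dependent variable is not shown in this excerpt.

```python
import math

# (category, B, S.E., reported Wald, reported Sig.) copied from the rows above.
# Reading the columns this way is an assumption; the headers are not in the excerpt.
ROWS = [
    ("Married", -1.081, 0.488, 4.907, 0.027),
    ("Single", -0.513, 0.548, 0.878, 0.349),
    ("Owner", -0.876, 0.330, 7.049, 0.008),
    ("With parent", 0.175, 0.374, 0.218, 0.641),
    ("Officer", -1.634, 0.271, 36.250, 0.000),
    ("Manager", -2.335, 0.342, 46.589, 0.000),
]


def wald_summary(b, se):
    """Return (Wald statistic, p-value, odds ratio) for one logit coefficient."""
    wald = (b / se) ** 2
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(wald / 2.0))
    odds_ratio = math.exp(b)
    return wald, p_value, odds_ratio


for name, b, se, wald_reported, sig_reported in ROWS:
    wald, p_value, odds_ratio = wald_summary(b, se)
    print(
        f"{name:12s} Wald={wald:7.3f} (reported {wald_reported:7.3f})  "
        f"p={p_value:.3f} (reported {sig_reported:.3f})  exp(B)={odds_ratio:.3f}"
    )
```

Small differences between the recomputed and reported Wald values are expected, because B and S.E. are printed to only three decimals.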

Date posted: 30/12/2020, 17:34
