
Graduation thesis in Political Economy: Analysis of Factors Affecting Firm Innovation: An Empirical Investigation for Vietnamese Firms


DOCUMENT INFORMATION

Basic information

Title: Analysis of Factors Affecting Firm Innovation: An Empirical Investigation for Vietnamese Firms
Author: Dao Viet Anh
Supervisor: Assoc. Prof. To The Nguyen
Institution: Vietnam National University, Hanoi University of Economics and Business
Major: Political Economy
Type: graduation thesis
Year of publication: 2023
City: Ha Noi
Number of pages: 209
File size: 73.2 MB

Structure

  • 1. Research objective and questions
  • 2. Research object and scope of the study
  • 3. Structure of this study
  • CHAPTER 1: LITERATURE REVIEW AND ANALYTICAL FRAMEWORK
    • 1.1. Definitions and concepts
      • 1.1.1. The evolution, concepts and definitions of innovation
      • 1.1.2. The attributes and aspects of innovation
      • 1.1.3. Firm innovation
      • 1.1.5. Firm [title illegible in source]
    • 1.2. Empirical studies and hypotheses
      • 1.2.1. Empirical studies on innovation
      • 1.2.2. Empirical studies on firm innovation capacity
      • 1.2.3. Determinants of firm innovation
        • 1.2.3.1. External factors
        • 1.2.3.2. Internal factors
      • 1.2.4. Research gap
    • SUB-CONCLUSION CHAPTER 1
  • CHAPTER 2: RESEARCH METHODOLOGY
    • 2.1. Methodology
      • 2.1.1. Model 1: Ordered logit
        • 2.1.1.1. Ordered logit model estimation
        • 2.1.1.2. Model assumptions
        • 2.1.1.3. Statistical and model 1 test assumptions
        • 2.1.1.4. Proposed analytical framework of the model 1
        • 2.1.2.2. Model assumptions
        • 2.1.2.3. Statistical and model 2 test assumptions
        • 2.1.2.4. Potential alternative model
      • 2.1.3. Model 3
        • 2.1.3.1. Logistic regression model estimation
        • Model assumptions
        • 2.1.3.2. Statistical and model 3 test assumptions
        • 2.1.3.3. Proposed analytical framework for the model 3
      • 2.2.1. Survey population
      • 2.2.2. Scope of survey
      • 2.2.3. General descriptive statistics in research data
      • 2.2.4. Descriptive data model 1: ordered logit regression
      • 2.2.5. Descriptive data model 2: Poisson regression
      • 2.2.6. Descriptive data model 3: Logit regression
    • SUB-CONCLUSION CHAPTER 2
  • CHAPTER 3: RESULTS AND DISCUSSION
    • 3.1. The current state of innovative activities of enterprises in Vietnam
      • 3.1.1. Financing activities for innovative start-ups in Vietnam
      • 3.1.2. Intermediate organizations
    • 3.2. Science and technology enterprise development

Content

To complete the topic "Analysis of Factors Affecting Firm Innovation: An Empirical Investigation for Vietnamese Firms", I would like to express my sincere thanks to the leadership, staff

LITERATURE REVIEW AND ANALYTICAL FRAMEWORK

1.2 Empirical studies and hypotheses

1.2.1 Empirical studies on innovation

Economists have consistently highlighted the critical role of innovation in driving both firm growth and overall economic development. Schumpeter and Nichol (1934) identified innovation as the key catalyst for economic advancement in the early twentieth century. Building on this foundation, endogenous growth theory has underscored innovation's significance in enhancing competitiveness, productivity, output, and employment (Romer, 1990).

Research indicates a significant relationship between innovation and enterprise performance, with studies categorizing the effects into three main types: the direct effect, which highlights how innovation directly enhances firm performance; the moderating effect, which examines how external factors influence the innovation-performance relationship; and the mediating effect, which explores how innovation acts as a bridge between various antecedents and firm performance outcomes.

Research shows varying direct effects of innovation on company performance, with some studies indicating a positive impact (Roberts, 1999; Pucik, 2005; Hua & Wemmerlöv, 2006; Salavou, 2002), while others report no significant effect (Ram & Jung, 1991; Hultink & Atuahene-Gima, 2000). Notably, companies can attain sustained high profits by consistently launching new and valuable innovations that address previously unmet consumer needs (Roberts, 1999).

Research indicates that while profits from individual innovations may decline over time, consistent innovation is crucial for overall strong company performance (Hunt and Morgan, 1995; Jacobson, 1988; Porter, 1985; Rumelt, 1991). Alternatively, companies may achieve sustained high profits by effectively sidestepping competition, even if they innovate less frequently. A study by Roberts (1999) on the U.S. pharmaceutical industry supports the link between a high propensity for innovation and sustained superior profitability, but it does not establish a connection between profitability persistence and the ability to evade competition.

Numerous empirical studies have established a positive relationship between innovation and business performance, examining not only profit but also indicators like sales growth and market share. Prajogo and Ahmed (2006) explored this connection by analyzing data from 194 Australian managers, roughly evenly split between manufacturing (52%) and service sectors (47%). Their findings indicated no significant differences in product and process innovation performance between the two sectors. However, a stronger correlation was identified between innovation and business performance specifically within manufacturing companies, particularly concerning process innovation.

Research indicates that process innovation is more closely linked to business performance than product innovation in the manufacturing sector. Deshpandé et al. (1993) established a positive correlation between innovativeness and organizational success in Japanese firms, highlighting its impact on profitability, market share, and growth. Similarly, Johnson et al. (1996) found that innovation significantly affects various performance metrics, including market share and return on investment, in Canadian companies. In the food sector in Greece, Salavou (2002) identified product innovation as a crucial determinant of company performance based on Return on Assets (ROA). Additional studies, such as Yamin et al. (1997), explored the effects of different types of innovation on corporate success, focusing on liquidity and ROI. Furthermore, Subramanian and Nilakanta (1996) examined the timing and quantity of innovation adoption in the banking sector, concluding that both factors significantly impact organizational efficiency, while only the timing affects effectiveness.

Innovation serves as a crucial moderating variable within models that encompass internal capital stock, external markets, and environmental factors (Huang & Rice, 2009). It influences outcomes in both stable environments (Hargadon & Douglas, 2001) and dynamic settings (Garg et al., 2003), as well as in various market channels, including domestic and international markets (Otero-Neira et al., 2009). For instance, a study examining the relationships with high-achieving CEOs involved a field survey of 105 manufacturing companies, which evaluated the interest of CEOs in research and its impact on company performance (Garg et al., 2003). This highlights the necessity for executives to strategically allocate their limited time across relevant areas of their business.

The study by Garg et al. (2003) explores how a company's external environment influences its innovation performance, utilizing a hierarchical regression model to highlight the moderating role of scanning emphasis on innovation. Results indicate that in dynamic environments, CEOs focus more on external tasks, linking internal innovation functions to higher performance. Conversely, in stable environments, enhanced scanning of both internal and external areas related to functional efficiency leads to improved performance, particularly in sales growth. Additionally, Huang and Rice (2009) advocate for open innovation, emphasizing knowledge exchange, yet they overlook knowledge efficiency post-acquisition. This research investigates the link between open innovation strategies and absorption capacity, measuring innovation performance through R&D intensity (RDI). To enhance limited survey data, a new metric is introduced, comparing R&D spending to total new product or process development costs. Findings reveal that the ability to absorb knowledge is vital for fostering successful innovation.

Scholars have explored the mediating effect of innovation in various contexts, such as transformation outcomes (Liao and Rice, 2010), innovation output (Neely et al., 2001), and IT investment (Dibrell et al., 2008). For instance, Choi et al. (2009) conducted a study on 118 major construction companies in Korea from 1997 to 2003 to empirically assess the relationship between firm innovation and performance, particularly in light of the market changes following the Asian Financial Crisis. Utilizing AMOS analysis, the research revealed a significant correlation between a firm's innovation and its performance outcomes.

The performance of a firm is indirectly influenced by its innovation capabilities, specifically through product and process fits, which are critical indicators of success or failure. A study by Gunday et al. (2011) examined the impact of various types of innovations (organizational, process, product, and marketing) on corporate operations across 184 Turkish manufacturing firms. The research established a theoretical framework that highlights the positive relationship between innovation and business performance, demonstrating that effective innovation significantly enhances performance in the manufacturing sector.

Small and medium-sized enterprises (SMEs) are crucial to the economic and technological advancement of nations, despite their resource constraints. They often excel in innovative projects, allowing them to establish monopolies and achieve sustainable entrepreneurial success. According to Schumpeter (1934), innovation enables SMEs to gain a competitive edge by introducing unique products, services, and processes. This innovation allows them to penetrate niche markets, foster brand loyalty, and reduce price sensitivity among customers. The agility of SMEs enables them to adapt quickly, securing monopoly rents over extended periods. By focusing on highly innovative offerings, SMEs can evade price competition, attract a broader customer base, and ultimately drive growth.

Innovation does not always lead to enhanced corporate success, particularly when ideas are introduced without proper implementation across the organization. To truly improve performance, innovations must deliver tangible results, such as reduced manufacturing costs or enhanced customer service. This can be achieved through technical innovations, information and communication technologies, and organizational changes, along with practices like quality management and teamwork. However, SMEs face challenges in adopting innovation, including resistance within the firm or market, significant investment risks, and the need for advanced organizational resources and competencies for effective product development.

1.2.2 Empirical studies on firm innovation capacity

The central question of innovation studies is why some firms are more innovative than others. This inquiry can be rephrased to explore why certain firms have a greater capacity to innovate. A firm's capacity to innovate is defined by its potential to generate innovative outcomes, which is influenced by its available resources and capabilities. These resources and abilities empower firms to identify and leverage new opportunities for growth.

Fan (2011) analyzes the role of innovation capacity in the economic growth of China and India from 1981 to 2004, revealing its significant impact, especially during the 1990s. The study measures innovation outputs through patents and high-tech/service exports, highlighting substantial advancements driven by increased R&D investment and personnel. It emphasizes the governments' vital role in fostering innovation by connecting science and business, incentivizing innovation activities, and balancing technology imports with local R&D. Additionally, the paper examines micro-level insights through case studies of domestic biotech firms, illustrating how innovation capacity affects market success and is influenced by global institutional factors and national policies. Yeo (2010) further explores innovation capacity in the context of developed countries, particularly the U.S., underscoring the shift towards a knowledge-based economy where knowledge is a key production element. The research analyzes innovation capacity across U.S. metropolitan areas from 1988 to 2007, demonstrating its growing impact on regional economic growth.

The study highlights that variations in industry bases and region-specific factors in the U.S. contribute to the economic impact of innovation capacity during the dot-com era, albeit to a limited extent. To further elucidate these findings, the research concludes with a literature review advocating for a contextual approach to analyze how innovation capacity influences a region's economic performance.

RESEARCH METHODOLOGY

2.1.1 Model 1: Ordered logit

2.1.1.3 Statistical and model 1 test assumptions

• The parallel slopes assumption (the proportional odds assumption)

Various statistical tests, including the Brant, Wald, likelihood ratio (LR), and score (Rao or Lagrange multiplier) tests, have been developed to evaluate the validity of the parallel slopes assumption. These methodologies have been explored in numerous studies, notably by Brant (1990), Buse (1982), Engle (1984), Fullerton and Xu (2016), Greene and Hensher (2012), Long (1997), and Powers and Xie (2009).

This study uses the Brant (1990) test to assess the proportional odds assumption in ordinal logistic regression models. The Brant test evaluates whether the influence of the independent variables on the dependent variable remains consistent across all levels of the dependent variable. It does so by comparing the slope coefficients of the separate binary logit models implied by each cut point of the dependent variable; under the parallel slopes assumption, these sets of coefficients should not differ systematically.

The test statistic is derived from the difference between these sets of coefficients. Its significance is assessed using a chi-squared distribution, with degrees of freedom corresponding to the number of independent variables involved.

A significant outcome in the Brant test suggests a violation of the parallel slopes assumption, indicating that the model's coefficients vary across different categories of the dependent variable. This violation can result in biased or inconsistent estimates, ultimately impacting the interpretation of the findings.

The null hypothesis in the Brant test asserts that the parallel slopes assumption is valid, suggesting that the impact of each independent variable remains constant across all levels of the dependent variable. Conversely, the alternative hypothesis posits that this assumption is violated, implying that the model's coefficients differ across the categories of the dependent variable.

H0: The slopes do not differ.

→ If the test fails to reject H0 (that is, if the p-value is large), we conclude that the proportional odds assumption is reasonable, and we can use the ordinal model.

H1: The slopes are different.

→ If the test rejects H0 (that is, if the p-value is small), we conclude that the proportional odds assumption is not reasonable, and we cannot use the ordinal model.
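As a reference point for the hypotheses above, the proportional odds restriction can be written out explicitly. The notation below (cut points \tau_j, slope vector \beta, J ordered categories) is an assumption introduced for illustration; it may differ from the notation of the thesis's own Model 1 equations, which are not reproduced in this excerpt.

```latex
% Ordered logit with J categories: one slope vector shared by all cut points
\[
\log\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \tau_j - x'\beta,
\qquad j = 1, \dots, J-1
\]
% Unrestricted alternative: slopes are allowed to differ by cut point
\[
\log\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \tau_j - x'\beta_j,
\qquad j = 1, \dots, J-1
\]
% Parallel slopes hypothesis (H0): \beta_1 = \beta_2 = \dots = \beta_{J-1} = \beta
```

Under H0 the J-1 cumulative logit curves differ only in their intercepts, which is why the assumption is described as "parallel slopes".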

2.1.1.4 Proposed analytical framework of the model 1

Following the literature review, the study puts forward a research model, denoted as Model 1, which is presented below:

[Figure: proposed analytical framework of Model 1. Elements recoverable from the diagram:]

• Characteristics of the organization: firm size
• 1) Resource management: purchase of fixed assets, land access, money
• 2) Work climate: number of employees paid social insurance
• Training cost, labor education, technological skill of labor
• Competitive impact; reason: limited capacity
• Internet, IT in the internal communication department

Count data often serve as the dependent variable in various contexts, such as the annual number of zoo visits, patents granted to a company, doctor visits, speeding fines issued, or vehicles passing through a toll booth in a short time frame. These variables are discrete, meaning they can only take non-negative integer values.

In model 2 of this study, the dependent variable is count data indicating the number of 4.0 technologies applied in enterprises. Therefore, the choice of Poisson regression is reasonable.

A Poisson distribution is characterized by the counting of whole number observations where the occurrence of one event does not influence the likelihood of another, and the data are organized into identical known time intervals. The Poisson probability mass function gives the probabilities associated with this distribution.

The Poisson distribution, represented by Equation (2.5), is defined by a single parameter, λ, which indicates the rate of occurrence of events within a fixed time interval. In this context, Y denotes the variable of interest, while K represents the count value (K = 0, 1, 2, ...). The factorial notation K! is used to express the product of all positive integers up to K, with the special case of 0! equal to 1. Since observed counts are non-negative whole numbers, the parameter λ will also be non-negative. Unlike distributions with two parameters, such as the normal distribution characterized by mean (μ) and variance (σ²), the Poisson distribution's mean and variance are the same and both equal λ. In other words, to make inferences about the Poisson distribution, only the single parameter λ needs to be estimated.
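The properties described above can be checked numerically. The sketch below assumes the standard Poisson probability mass function, P(Y = K) = e^(-λ) λ^K / K!, since Equation (2.5) itself is not reproduced in this excerpt; the function names and the rate value 2.5 are illustrative choices, not taken from the thesis.

```python
import math


def poisson_pmf(k, lam):
    """Return P(Y = k) for a Poisson variable with rate lam (standard formula)."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam ** k / math.factorial(k)


def mean_and_variance(lam, k_max=100):
    """Approximate the mean and variance by summing over k = 0..k_max."""
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    lam = 2.5  # illustrative rate, not an estimate from the thesis data
    mean, var = mean_and_variance(lam)
    # Both numbers should agree with lam up to truncation error.
    print(mean, var)
```

Truncating the sum at `k_max = 100` is harmless for small rates, because the probabilities beyond that point are negligible.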

Regression analysis is a valuable method for examining the relationships among multiple variables and count data. However, linear regression is often unsuitable for count data due to its violation of model assumptions, particularly when the dependent variable is not normally distributed. Traditional techniques based on the general linear model (GLM) assume a continuous and normally distributed dependent variable, which is not applicable for count data, especially in cases of rare events with a mean below 10 that tend to be skewed. Therefore, classic methods that rely on normality are inadequate for analyzing count data. Fortunately, alternative statistical techniques are available that can effectively model count data without the requirement for normal distribution.

Poisson regression is a statistical technique that models the relationship between a count dependent variable and one or more independent variables. Named after the French mathematician Siméon-Denis Poisson, this method utilizes the Poisson distribution to analyze count data, representing the probability of event occurrences within a specific time or space interval. The primary aim of Poisson regression is to estimate model parameters that illustrate how independent variables influence the mean count of the dependent variable. This approach is especially beneficial for scenarios where the dependent variable is a non-negative integer count.

The Poisson regression model can be written as:

\[ y_i = E(y_i) + u_i = \lambda_i + u_i \]

where the y_i are independently distributed as Poisson random variables with mean λ_i for each individual, expressed as follows:

\[ \lambda_i = E(y_i \mid X_i) = \exp[B_1 + B_2 X_{2i} + \dots + B_k X_{ki}] = \exp(BX) \]

where exp(BX) denotes the exponential of the expression BX, and BX is shorthand for the multiple regression function in square brackets.

The X variables are the explanatory variables that can determine the mean of the dependent variable. Therefore, they also determine its variance if the Poisson model is appropriate. Taking exp(BX) ensures that the mean of the count variable, λ_i, is positive.

For estimation purposes, this model can be written as follows:

2.1.2.2 Model assumptions

Poisson regression, similar to linear least squares regression, relies on specific model assumptions for accurate inference. The response variable must represent counts per unit of time or space and follow a Poisson distribution. Additionally, observations should be independent, and the mean must equal the variance, a defining characteristic of Poisson random variables. Lastly, the logarithm of the mean rate, log(λ), needs to be a linear function of the predictor variable x.
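To make the log-linear mean concrete, the following minimal sketch computes λ_i = exp(BX) for a coefficient vector and evaluates the Poisson log-likelihood on a toy sample. The function names, coefficients, and data are hypothetical and only illustrate the standard formulas; they are not the thesis's estimation code, which relies on Stata.

```python
import math


def poisson_mean(coefs, x):
    """lambda_i = exp(B1 + B2*x2 + ... + Bk*xk); x holds the non-constant regressors."""
    linear = coefs[0] + sum(b * xj for b, xj in zip(coefs[1:], x))
    return math.exp(linear)  # the exponential keeps the mean strictly positive


def poisson_loglik(coefs, X, y):
    """Sum of log P(Y = y_i | x_i) under the Poisson model."""
    total = 0.0
    for xi, yi in zip(X, y):
        lam = poisson_mean(coefs, xi)
        # log pmf: y*log(lam) - lam - log(y!), with log(y!) = lgamma(y + 1)
        total += yi * math.log(lam) - lam - math.lgamma(yi + 1)
    return total


if __name__ == "__main__":
    X = [[1.0], [2.0], [3.0]]  # hypothetical single regressor
    y = [1, 2, 4]              # hypothetical counts
    print(poisson_loglik([0.0, 0.0], X, y))   # constant mean of 1
    print(poisson_loglik([0.0, 0.45], X, y))  # mean rising with x
```

Maximum likelihood estimation searches for the coefficients that make this log-likelihood as large as possible; on the toy data, the coefficients that let the mean rise with x fit better than a constant mean of one.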

One of the key assumptions of Poisson regression is that the mean and variance of the dependent variable are equal, which is known as the equidispersion assumption. If this assumption is violated, alternative regression models such as negative binomial regression may be more appropriate.

\[ E(y_i) = \lambda_i, \qquad \operatorname{var}(y_i) = \lambda_i \]

The Poisson distribution is characterized by the unique property of equidispersion, where the mean and variance of a variable are equal. However, in real-world applications, it is common for the variance of observed variables to exceed the mean, a phenomenon known as overdispersion.

2.1.2.3 Statistical and model 2 test assumptions

In Poisson regression analysis with Stata, it is essential to test several key assumptions, including the goodness-of-fit test, the test for overdispersion, and the test for zero-inflation.

One fundamental assumption of Poisson regression is that the mean and variance of the dependent variable are equal, a condition known as equidispersion. To assess this, researchers can utilize the Pearson chi-square test and graphical methods to analyze mean and variance values. In Stata, the Pearson chi-square statistic divided by its degrees of freedom should be approximately 1 if the equidispersion assumption is valid. Additionally, Stata facilitates the likelihood ratio test, which compares a Poisson regression model to a more flexible negative binomial regression model that accounts for overdispersion. For evaluating model fit, Stata offers various goodness-of-fit statistics, including the Pearson chi-square test, the deviance goodness-of-fit test, and the likelihood ratio test. These tests evaluate whether the model fits the data adequately, and a significant result indicates a lack of fit.

Hypothesis testing on overdispersion is as follows (Hartono et al., 2021):

H0: θ = 0 (without overdispersion)
H1: θ ≠ 0 (with overdispersion)

The test statistic used is:

The test criterion rejects the null hypothesis (H0) if the value of χ² exceeds the critical value χ²(n−p) or if the p-value is less than the significance level (α). Overdispersion in estimation is assessed by dividing the Pearson chi-square value by the degrees of freedom; a result greater than 1 indicates that the data are overdispersed (Hartono et al., 2021).
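The dispersion check described above can be sketched in a few lines. This sketch assumes the usual Pearson chi-square statistic for a Poisson model, the sum of (y_i − μ_i)² / μ_i over all observations, divided by n − p degrees of freedom; the thesis's own test-statistic formula is not reproduced in this excerpt, and the function name and data below are hypothetical.

```python
def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by its degrees of freedom (n - p).

    y        : observed counts
    mu       : fitted Poisson means, one per observation (all > 0)
    n_params : number of estimated regression parameters p
    """
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    dof = len(y) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / dof


if __name__ == "__main__":
    # Hypothetical samples sharing the same fitted mean of 2.5.
    spread_out = [0, 0, 0, 10, 0, 0, 0, 10]
    tight = [2, 3, 2, 3]
    print(pearson_dispersion(spread_out, [2.5] * 8, n_params=1))  # well above 1
    print(pearson_dispersion(tight, [2.5] * 4, n_params=1))       # below 1
```

A ratio well above 1, as in the first sample, is the pattern that would motivate the alternative models discussed next.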

As stated in the previous section, the Poisson regression model assumes equidispersion, where the mean and variance of the response variable are equal (Tinungki, 2019). Count data can, however, exhibit overdispersion, where the variance exceeds the mean. This phenomenon necessitates careful consideration, as failing to address overdispersion can lead to inaccurate statistical inferences. Specifically, it can result in underestimated standard errors, which are crucial for constructing test statistics and confidence intervals. Consequently, this oversight may lead to an inflated number of statistically significant results (Hayat & Higgins, 2014).

The prevalence of overdispersion in data has led to the development of several alternative models, including the negative binomial, quasi-Poisson, generalized Poisson, and zero-inflated models.

• Negative binomial regression (NBREG) and quasi-Poisson regression

The quasi-Poisson and negative binomial models remain popular choices in statistical analysis, largely because they are readily accessible in software and can be easily adapted for regression applications.

The quasi-Poisson and negative binomial models effectively address overdispersion in data, each featuring two parameters. Scholars often advocate for the negative binomial distribution as a preferable alternative to the Poisson distribution when overdispersion is evident, as it incorporates a parameter that adjusts Poisson dispersion. Additionally, the quasi-Poisson regression model offers a generalized approach to handling overdispersion without needing to specify the distribution of the response variable, relying instead on assumptions about the first two moments. Parameter estimation in quasi-Poisson regression is achieved through the quasi-likelihood estimation method.

According to Ver Hoef and Boveng (2007), both quasi-Poisson and negative binomial models can be framed as generalized linear models. Let Y be a random variable such that:

\[ E(Y) = \mu, \qquad \operatorname{var}(Y) = \nu_{QP}(\mu) = \theta\mu \]

where E(Y) is the expectation of Y, var(Y) is the variance of Y, μ > 0 and θ > 1. E(Y) is also known as the "mean" of the distribution. Although μ > 0, the data themselves can take any non-negative integer value.

The quasi-Poisson model, written as Y ~ Poi(μ, θ), is founded on the connection between its formulation and the expected value and variance of a Poisson distribution, utilizing a log link function. This model is defined by its mean and variance, which are its first two moments (Wedderburn, 1974). Efron (1986) and Gelfand and Dalal (1990) illustrated the process of creating a distribution for this model through reparameterization. Estimation typically relies on the first two moments and estimating equations (Lee and Nelder, 2000). The quasi-Poisson model maintains parameters in a clear and interpretable manner, enabling standard model diagnostics while ensuring efficient fitting algorithms.

The generalized Poisson regression model extends the traditional Poisson regression by accommodating extra-Poisson variation, making it suitable for data exhibiting equidispersion or overdispersion. This model establishes a linear relationship between the logarithm of the mean and the covariates, incorporating a dispersion parameter to effectively address overdispersion issues.

With regard to the negative binomial model, a random variable Y having a negative binomial distribution is denoted Y ~ NB(μ, κ), with a parameterization such that:

\[ E(Y) = \mu, \qquad \operatorname{var}(Y) = \nu_{NB}(\mu) = \mu + \kappa\mu^2 \]

The relationship between mean and variance is what distinguishes the two models. In the quasi-Poisson model, the variance is a linear function of the mean, whereas in the negative binomial model it is a quadratic function of the mean. For the negative binomial model, overdispersion appears as the multiplicative factor 1 + κμ, which depends on the mean, unlike the constant factor θ of the quasi-Poisson model.
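The contrast between the two variance functions can be seen by evaluating them side by side. The sketch below uses the quasi-Poisson form var(Y) = θμ and the negative binomial form var(Y) = μ + κμ² from Ver Hoef and Boveng (2007); the parameter values are arbitrary illustrations, not estimates from the thesis data.

```python
def quasi_poisson_variance(mu, theta):
    """Linear variance function: var(Y) = theta * mu."""
    return theta * mu


def negative_binomial_variance(mu, kappa):
    """Quadratic variance function: var(Y) = mu + kappa * mu**2."""
    return mu + kappa * mu ** 2


if __name__ == "__main__":
    theta, kappa = 2.0, 0.5  # arbitrary illustrative values
    for mu in (1.0, 4.0, 16.0):
        qp_ratio = quasi_poisson_variance(mu, theta) / mu
        nb_ratio = negative_binomial_variance(mu, kappa) / mu
        # qp_ratio stays at theta; nb_ratio equals 1 + kappa * mu and grows with mu.
        print(mu, qp_ratio, nb_ratio)
```

The variance-to-mean ratio is constant under the quasi-Poisson model but increases with the mean under the negative binomial model, which is why the two models can weight large counts differently when fitted to the same data.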

Both quasi-Poisson regression and negative binomial regression are popular models because they both have a single mean parameter that can be varied based on covariates.

Quasi-Poisson regression models the response variable Y as following a Poisson-type distribution with a mean that varies based on covariates. The mean μ_i of the i-th observation is required to be greater than zero, and the approach is suitable for count data with overdispersion.

Excessive zeros in data can lead to overdispersion and compromise the validity of Poisson regression models (Tang & Tang, 2019). This issue often arises when the number of zero outcomes exceeds what a Poisson distribution predicts, typically due to population heterogeneity. For example, individuals immune to the event being studied will consistently yield zero outcomes, resulting in structural zeros. To address this, Tang and Tang (2019) suggest using zero-inflated models, which combine a Poisson regression for at-risk subjects with a binary regression for the structural zero group. Various tests, such as the Wald, score, and likelihood ratio tests and the Vuong test, can assess the presence of inflated zeros by determining if the constant probability of zeros is statistically significant.

The zero-inflated Poisson model addresses the issue of excessive zeros in data that exceed the expectations of the Poisson distribution. This model posits two distinct groups within the dataset: the Always-0 group, which consistently has a zero count with a probability of 1, and the Not always-0 group, which may exhibit varying counts.

In the Not always-0 group, counts follow the standard Poisson distribution, so a zero count observed in the data could originate from either group. If the zero count is from the Always-0 group, it indicates that the observation is unlikely to yield a positive outcome. The overall model integrates probabilities from both groups, effectively addressing the overdispersion and excess zeros that the standard Poisson model cannot accommodate (Jeon, 2013).

The probability of an observation belonging to the Always-0 group can be effectively predicted using logit or probit models, where the likelihood (wi) is influenced by the specific characteristics of each observation.
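The two-group structure can be illustrated with the standard zero-inflated Poisson probability function, in which a zero arises either from the Always-0 group or from the Poisson part. This is a minimal sketch: the parameter name `p_always_zero` and the numeric values are hypothetical, and in a full model this probability would itself come from a logit or probit equation, as described above.

```python
import math


def zip_pmf(k, lam, p_always_zero):
    """P(Y = k) under a zero-inflated Poisson model.

    lam           : Poisson rate for the Not always-0 group
    p_always_zero : probability of belonging to the Always-0 group
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from both groups.
        return p_always_zero + (1.0 - p_always_zero) * poisson
    # Positive counts can only come from the Poisson group.
    return (1.0 - p_always_zero) * poisson


if __name__ == "__main__":
    lam, p0 = 2.0, 0.3  # hypothetical values
    print(zip_pmf(0, lam, p0))  # share of zeros under the mixture
    print(math.exp(-lam))       # share of zeros under a plain Poisson model
```

With these illustrative values, the mixture assigns noticeably more probability to zero than a plain Poisson model with the same rate, which is the excess-zero pattern the model is designed to capture.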

2.1.3.1 Logistic regression model estimation

To begin, the dependent variable Y is a binary variable, taking a value of either 0 or 1.

Using the ordinary least squares (OLS) method, we analyze individual behavior in relation to independent variables, employing a standard model for this assessment.

The primary goal is to calculate the probability based on the explanatory variables' values. It is essential to keep in mind two key conditions while developing the probability function: first, the estimated probability must always remain between 0 and 1 as the explanatory variables (X_i) vary; second, the relationship between the probability (P_i) and the explanatory variables (X_i) is non-linear, in that the probability approaches 0 at a slow rate as X_i gets small, and approaches 1 at a slow rate as X_i gets very large. The logit and probit models satisfy these conditions.

Suppose an individual's decision depends on an unobservable utility index (I*) that depends on various explanatory variables. This is represented as follows:

\[ I_i^* = BX + u_i \]

in which i = the i-th individual and u = the error term.

\[ Y_i = 1 \ \text{if} \ I_i^* \ge 0, \qquad Y_i = 0 \ \text{if} \ I_i^* < 0 \]
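The link between the latent index and the probability of observing Y = 1 can be sketched as follows. The sketch assumes the standard logit case, in which the error term follows a logistic distribution so that P(Y = 1) equals the logistic function evaluated at BX; the function names and coefficient values are hypothetical.

```python
import math


def logistic(z):
    """Logistic function 1 / (1 + exp(-z)), computed without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def choice_probability(coefs, x):
    """P(Y = 1 | x) under a logit model; x holds the non-constant regressors."""
    index = coefs[0] + sum(b * xj for b, xj in zip(coefs[1:], x))
    return logistic(index)


if __name__ == "__main__":
    coefs = [-1.0, 0.5]  # hypothetical intercept and slope
    for x in (-20.0, 0.0, 2.0, 20.0):
        # The probability stays strictly between 0 and 1 and flattens at both ends.
        print(x, choice_probability(coefs, [x]))
```

The output stays inside the unit interval and changes slowly at the extremes, which are the two conditions on the probability function stated earlier in this section.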

Date posted: 08/12/2024, 21:21
