6. CORRELATION AND REGRESSION ANALYSIS
6.4.2. Fitting linearizable multivariable nonlinear models in Statgraphics
In Statgraphics, fitting a multivariable nonlinear model is simpler because no extra columns of transformed variables have to be created: the variables are transformed directly in the dialog box when the model is set up.
Then open this data file in Statgraphics Plus and go to the multiple regression procedure.
Transforming the variables directly in the dialog box
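Outside Statgraphics, the same idea of transforming the variables at fit time, rather than storing them as extra data columns, can be sketched in Python. This is only an illustration: the data below are synthetic (the book's data set is not reproduced here), the "true" coefficients are assumed values, and NumPy's least-squares solver stands in for the Statgraphics procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the book's data set): 40 rows of H and N.
n = 40
H = rng.uniform(5.0, 30.0, size=n)
N = rng.uniform(200.0, 2000.0, size=n)

# Assumed power law M = a * H^b1 * N^b2 with multiplicative noise.
a_true, b1_true, b2_true = 0.006, 2.2, 0.64
M = a_true * H**b1_true * N**b2_true * np.exp(rng.normal(0.0, 0.05, size=n))

# Transform the variables while building the design matrix,
# instead of saving log(H) and log(N) as additional columns.
X = np.column_stack([np.ones(n), np.log(H), np.log(N)])
y = np.log(M)

# Ordinary least squares on the linearized model.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
const, b1, b2 = coef
print(f"log(M) = {const:.4f} + {b1:.4f}*log(H) + {b2:.4f}*log(N)")
```

Taking logarithms turns the power law into a model that is linear in its coefficients, which is why an ordinary linear least-squares fit is sufficient.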
Result of fitting the multivariable nonlinear function reduced to linear form:
Multiple Regression - log(M)
Dependent variable: log(M)
Independent variables:
     log(H)
     log(N)

                              Standard        T
Parameter      Estimate       Error           Statistic      P-Value
CONSTANT       -5.14328       0.681084        -7.55161       0.0000
log(H)         2.20541        0.095991        22.9752        0.0000
log(N)         0.641785       0.0818578       7.84025        0.0000
Analysis of Variance
Source            Sum of Squares    Df    Mean Square    F-Ratio    P-Value
Model             29.988            2     14.994         271.40     0.0000
Residual          2.04411           37    0.0552463
Total (Corr.)     32.0321           39
R-squared = 93.6185 percent
R-squared (adjusted for d.f.) = 93.2736 percent
Standard Error of Est. = 0.235045
Mean absolute error = 0.16285
Durbin-Watson statistic = 1.47918 (P=0.0243)
Lag 1 residual autocorrelation = 0.243443
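The summary statistics in this output follow from the sums of squares and degrees of freedom by simple arithmetic. The short check below recomputes them from the printed values; small differences in the last digit are expected because the printed inputs are rounded.

```python
import math

# Values copied from the output above.
ss_model, df_model = 29.988, 2
ss_resid, df_resid = 2.04411, 37
ss_total, df_total = 32.0321, 39

ms_model = ss_model / df_model            # mean square of the model
ms_resid = ss_resid / df_resid            # mean square of the residuals
f_ratio = ms_model / ms_resid             # F-ratio of the ANOVA table
r2 = ss_model / ss_total                  # R-squared as a fraction
r2_adj = 1.0 - ms_resid / (ss_total / df_total)  # adjusted for d.f.
se_est = math.sqrt(ms_resid)              # standard error of the estimate

# Each T statistic is the estimate divided by its standard error.
t_const = -5.14328 / 0.681084
t_log_h = 2.20541 / 0.095991
t_log_n = 0.641785 / 0.0818578

print(f"F = {f_ratio:.2f}, R2 = {100 * r2:.4f}%, adj R2 = {100 * r2_adj:.4f}%")
print(f"SE of estimate = {se_est:.6f}")
print(f"T: {t_const:.4f}, {t_log_h:.4f}, {t_log_n:.4f}")
```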
The StatAdvisor
The output shows the results of fitting a multiple linear regression model to describe the relationship between log(M) and 2 independent variables. The equation of the fitted model is
log(M) = -5.14328 + 2.20541*log(H) + 0.641785*log(N)
Since the P-value in the ANOVA table is less than 0.05, there is a statistically significant relationship between the variables at the 95.0% confidence level.
The R-Squared statistic indicates that the model as fitted explains 93.6185% of the variability in log(M). The adjusted R-squared statistic, which is more suitable for comparing models with different numbers of independent variables, is 93.2736%. The standard error of the estimate shows the standard deviation of the residuals to be 0.235045. This value can be used to construct prediction limits for new observations by selecting the Reports option from the text menu. The mean absolute error (MAE) of 0.16285 is the average value of the residuals. The Durbin-Watson (DW) statistic tests the residuals to determine if there is any significant correlation based on the order in which they occur in your data file. Since the P-value is less than 0.05, there is an indication of possible serial correlation at the 95.0% confidence level. Plot the residuals versus row order to see if there is any pattern that can be seen.
In determining whether the model can be simplified, notice that the highest P-value on the independent variables is 0.0000, belonging to log(N). Since the P-value is less than 0.05, that term is statistically significant at the 95.0% confidence level. Consequently, you probably don't want to remove any variables from the model.
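The fitted equation is linear in the logarithms, so it can be converted back to the original power form M = exp(a) * H^b1 * N^b2. The sketch below assumes that log in the Statgraphics output is the natural logarithm (the text writes ln(M) for the same quantity); the function names and the input values H = 15 and N = 800 are hypothetical and chosen only for illustration. A plain back-transformation of this kind does not correct for retransformation bias.

```python
import math

# Coefficients copied from the fitted model above.
B0, B1, B2 = -5.14328, 2.20541, 0.641785

def predict_log_m(h, n):
    """Predicted log(M) from the linearized model (natural log assumed)."""
    return B0 + B1 * math.log(h) + B2 * math.log(n)

def predict_m(h, n):
    """Predicted M from the equivalent power form exp(B0) * h^B1 * n^B2."""
    return math.exp(B0) * h**B1 * n**B2

# Hypothetical inputs, not taken from the book's data.
h, n = 15.0, 800.0
print(f"log(M) = {predict_log_m(h, n):.4f}, M = {predict_m(h, n):.4f}")
```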
[Figure: Plot of log(M), observed versus predicted]
Statgraphics Plus also allows a composite variable to be created directly in the dialog box. For example, a function of the form ln(M) = a + b1*(N*H) can be fitted, where N*H is the composite variable. The composite variable is entered in the dialog box as follows:
The result is a function that relates the variables through the composite variable:
Multiple Regression - log(M)
Dependent variable: log(M)
Independent variables:
     N*H

                              Standard        T
Parameter      Estimate       Error           Statistic      P-Value
CONSTANT       3.17609        0.248379        12.7873        0.0000
N*H            0.000133068    0.0000252748    5.26485        0.0000
Analysis of Variance
Source            Sum of Squares    Df    Mean Square    F-Ratio    P-Value
Model             13.5104           1     13.5104        27.72      0.0000
Residual          18.5217           38    0.487412
Total (Corr.)     32.0321           39
R-squared = 42.1778 percent
R-squared (adjusted for d.f.) = 40.6561 percent
Standard Error of Est. = 0.698149
Mean absolute error = 0.515141
Durbin-Watson statistic = 0.780029 (P=0.0000)
Lag 1 residual autocorrelation = 0.559301
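Both models describe the same dependent variable, log(M), with the same total sum of squares (32.0321 on 39 degrees of freedom), so their fit statistics can be set side by side. The sketch below does this from the printed residual sums of squares; the dictionary keys and the helper name `summarize` are illustrative labels, not Statgraphics terms.

```python
import math

SS_TOTAL, DF_TOTAL = 32.0321, 39  # identical for both models

# Residual sum of squares and residual d.f. copied from the two outputs.
models = {
    "log(H) + log(N)": (2.04411, 37),
    "N*H": (18.5217, 38),
}

def summarize(ss_resid, df_resid):
    """Return (R-squared, adjusted R-squared, standard error of estimate)."""
    r2 = 1.0 - ss_resid / SS_TOTAL
    r2_adj = 1.0 - (ss_resid / df_resid) / (SS_TOTAL / DF_TOTAL)
    se = math.sqrt(ss_resid / df_resid)
    return r2, r2_adj, se

results = {name: summarize(*vals) for name, vals in models.items()}
for name, (r2, r2_adj, se) in results.items():
    print(f"{name:16s} R2 = {100 * r2:.2f}%  adj R2 = {100 * r2_adj:.2f}%  SE = {se:.4f}")

# The model with the higher adjusted R-squared fits log(M) more closely.
best = max(results, key=lambda name: results[name][1])
print("Higher adjusted R-squared:", best)
```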
The StatAdvisor
The output shows the results of fitting a multiple linear regression model to describe the relationship between log(M) and 1 independent variables. The equation of the fitted model is
log(M) = 3.17609 + 0.000133068*N*H
Since the P-value in the ANOVA table is less than 0.05, there is a statistically significant relationship between the variables at the 95.0% confidence level.
The R-Squared statistic indicates that the model as fitted explains 42.1778% of the variability in log(M). The adjusted R-squared statistic, which is more suitable for comparing models with different numbers of independent variables, is 40.6561%. The standard error of the estimate shows the standard deviation of the residuals to be 0.698149. This value can be used to construct prediction limits for new observations by selecting the Reports option from the text menu. The mean absolute error (MAE) of 0.515141 is the average value of the residuals. The Durbin-Watson (DW) statistic tests the residuals to determine if there is any significant correlation based on the order in which they occur in your data file. Since the P-value is less than 0.05, there is an indication of possible serial correlation at the 95.0% confidence level. Plot the residuals versus row order to see if there is any pattern that can be seen.
In determining whether the model can be simplified, notice that the highest P-value on the independent variables is 0.0000, belonging to N*H. Since the P-value is less than 0.05, that term is statistically significant at the 95.0% confidence level. Consequently, you probably don't want to remove any variables from the model.
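The Durbin-Watson statistic and the lag 1 residual autocorrelation reported in both outputs are computed from the residuals in row order. A minimal sketch is given below. The residual series are made up for illustration, since the book's residuals are not listed, and the autocorrelation formula is a common textbook definition that may differ in detail from the one Statgraphics uses.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

def lag1_autocorrelation(residuals):
    """Lag 1 autocorrelation of the mean-centered residuals."""
    mean = sum(residuals) / len(residuals)
    dev = [e - mean for e in residuals]
    num = sum(dev[i] * dev[i - 1] for i in range(1, len(dev)))
    den = sum(d * d for d in dev)
    return num / den

# Made-up residual series: one drifting steadily, one alternating in sign.
trending = [-2.0, -1.0, 0.0, 1.0, 2.0]
alternating = [1.0, -1.0, 1.0, -1.0]

for label, res in (("trending", trending), ("alternating", alternating)):
    print(label, durbin_watson(res), lag1_autocorrelation(res))
```

A statistic well below 2, as for the drifting series, goes with positive serial correlation, and a value well above 2 goes with negative serial correlation. This matches the low values of 1.47918 and 0.780029 reported for the two models together with their positive lag 1 autocorrelations.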