The fourth Scientific Conference - SEMREGG 2018
MANAGEMENT MODEL OF AIR QUALITY AND FORECAST OF SO2 AND NOX CONCENTRATION OF NHON TRACH INDUSTRIAL ZONE COMBINING NEURAL NETWORK AND GEOSTATISTICS TECHNOLOGY

Le Thi Thu Thao, Pham Hoang Thu Na, Pham Van Tat*
Department of Environmental Engineering, Hoa Sen University, Ho Chi Minh City
* Email: vantat@gmail.com

ABSTRACT

The emitted gases SO2 and NOx at the Nhon Trach industrial zone were modelled to assess air quality using nonlinear multivariate regression, a neural network and the Kriging technique. The industrial zone expanded quickly over the years 2011 to 2018. Nonlinear multivariate models and a three-layer neural network with architecture I(3)-HL(8)-O(2) were constructed to predict the concentrations of SO2 and NOx. The values predicted by these models were compared with the concentrations monitored at various locations in Nhon Trach. The nonlinear models were established with statistical parameters R2fit = 0.9334, RMSE = 0.0499, SSE = 0.6217, Fstat = 64775.9226 for SO2 and R2fit = 0.9704, RMSE = 0.03813, SSE = 0.7918, Fstat = 68385.4330 for NOx. The neural network model I(3)-HL(8)-O(2) was built with RMSE = 0.0344, R2train = 0.9777 for SO2 and RMSE = 0.0263, R2train = 0.9986 for NOx. Geostatistical techniques and a Kriging interpolation model were used to find the trend of air pollution across the industrial zones of Nhon Trach. The results indicate that the neural network gives better predictions, with a lower root mean square error, than the nonlinear multivariate models. The obtained models can effectively support air quality management in the industrial areas of Nhon Trach, Dong Nai.

Keywords: Air pollution, geostatistical analysis, interpolation method, neural network, multivariate analysis

INTRODUCTION

In recent years, air pollution in the industrial area of Nhon Trach district has been rising at an alarming
rate, driven by rapid population growth and the accompanying development of industrial areas. In response to the air pollution caused by industrialization, an air monitoring network with 46 positions has been operating in residential, traffic and industrial zones since 1999. In addition, the plan was to continue offering control solutions while adding several monitoring locations in industrial clusters by 2010 [1]. So far, Nhon Trach district has 10 industrial zones, including Nhon Trach 1, 2, 2-D2D, 2-Loc Khang, 2-Nhon Phu, 3, 5, 6, Nhon Trach and Ong Keo. This number will continue to increase in the future [2,3].

Many current studies address the dispersion of pollutants in the atmosphere. Various prediction techniques, including Gaussian and numerical models, are generally used. The inputs to dispersion models include emissions, meteorological data and monitoring locations. The output of these models is the predicted concentration at the specified monitoring locations [4,5]. The models are mainly based on a mathematical formulation of the physics and chemistry of the atmosphere. However, under normal conditions, the horizontal dispersion of pollutants caused by changes in wind direction over a one-hour period cannot be well represented by the Gaussian distribution [6]. Dispersion models also rely on physical parameters and detailed information about the pollutant sources, yet other parameters in such models are generally not known. To overcome this, statistical models are also employed to simplify the prediction of pollutant concentrations [7,8]. Statistical models can capture the relationships between the variables. However, such models require information about the data distribution [6]. Recently, neural network models have also been applied to predict pollutant concentrations. A neural network model can be a better
alternative to statistical models because it can handle high-dimensional data [9,10]. Furthermore, geographic information system (GIS) techniques have been widely used in several fields of environmental management to analyze and manage the factors affecting the environment more clearly and accurately.

In this work we report the construction of nonlinear multivariate models and neural networks to predict the concentrations of SO2 and NOx at the industrial zone of Nhon Trach. These models relate the influencing factors humidity (%), temperature (°C) and wind speed (km/h) to the concentrations of SO2 and NOx. The geostatistics technique is used to identify the areas contaminated by air pollutants and the trend at neighboring locations within an industrial area in Nhon Trach. The output of these models is compared with the monitoring data. Kriging interpolation is used to evaluate the trend of pollutants in the industrial areas of Nhon Trach.

MATERIALS AND METHODS

2.1 Dataset

The air quality monitoring network in Nhon Trach consists of 11 monitoring positions. To assess air quality we calculated the AQI index for SO2 and NOx. The monitoring stations are located close to industrial areas in Nhon Trach, such as Ong Keo, Nhon Trach, Dai Phuoc, Long Phuoc and the industrial center Phu Thanh-Vinh Thanh. The monitoring data are used to build the various models and to chart the trend of the pollution level of each gas over the years 2011 to 2018. ArcGIS 10.2 is used to map Nhon Trach in Dong Nai and the monitoring locations, and to provide information on the characteristics of the GIS map. The dataset, comprising spatial and attribute data (the x-y coordinates of the monitoring stations, the concentrations of SO2 and NOx, and meteorological data), is used to build the shapefile of the study area at Nhon Trach, Dong
Nai province. The monitoring stations with their sample notation are given in Table. The map of the study area at Nhon Trach is presented in Figure.

Table. Location of the monitoring stations with sample notation of the industrial zone

No  Monitoring station                                 X       Y        Sample notation
1   The industrial zone Nhon Trach                     409198  1188812  AI-NT-01-L1
2   The industrial zone Nhon Trach                     413132  1185690  AI-NT-02-L1
3   The industrial zone Nhon Trach                     410982  1182587  AI-NT-03-L1
4   The industrial zone Nhon Trach                     407036  1186484  AI-NT-04-L1
5   Ong Keo                                            402445  1177682  AI-OK-01-L1
6   Ong Keo                                            397441  1178192  AI-OK-02-L1
7   Nhon Trach district                                405107  1185926  AI-NT-13-L1
8   Dai Phuoc Ward                                     398527  1186989  AI-NT-14-L1
9   Long Phuoc Ward                                    420014  1185571  AI-SB-07-L1
10  The industrial center Phu Thanh-Vinh Thanh         401628  1183706  AI-PT-01-L1
11  The industrial zone of high technology Long Thanh  410077  1191703  AI-LTa-01-L1

The map layout of the study area and the air-monitoring stations at Nhon Trach was produced with ArcGIS.

Figure. Study area map of Nhon Trach and location of the air-monitoring stations

2.2 Methods

2.2.1 Nonlinear multivariable model

A nonlinear multivariate model relating the meteorological factors to gas concentration was developed as a baseline against which to compare the performance of the neural network. The data were first checked for suitability for nonlinear regression analysis. For this, all variables were examined for autocorrelation. Noise in the concentration data was reduced using a log transformation. With wind speed, temperature and relative humidity as input variables and the logarithm of the SO2 and NOx concentrations as output variables, the nonlinear models were tested as proposed in [9,10]. Ljung-Box statistics [5,11] were used to examine the adequacy of the model. The suitable model equation is

$$y_{n+1} = b_1 / y_n H_n + b_2 (b_3 H_n + b_4)^{b_5} + b_6 (b_7 T_n + b_8 R_n + b_9)^{-b_{10}} \qquad (1)$$

where y_n is the log of the concentration (mg/m3) of SO2 or NOx; R_n is the wind speed (km/h); T_n is the temperature (°C); H_n is the relative humidity (%); the suffix n denotes the year; and b1, b2, b3, b4, b5, b6, b7, b8, b9 and b10 are the coefficients of the nonlinear multivariate model. These coefficients are determined by the least squares technique using an Advanced Differential Evolution algorithm with a population size of 20, a mutation rate of 0.85 and a crossover rate of 0.7.

2.2.2 Neural network model

A neural network is a biologically motivated structure whose ith neuron has an input value xi and an output value yi = f(xi), and whose connections with the other neurons are described by weights wij. A three-layer network I(k)-HL(m)-O(n) with one hidden layer is given in Fig. A brief description of neural networks is given in [12,13]. Generalization and error tolerance are the main features of neural networks [14]. The neural network I(k)-HL(m)-O(n) was trained and tested using JMP Pro 13. The three-layer neural network consists of an input layer, one hidden layer and an output layer. A typical Elman network is presented in Fig. Each layer has a number of nodes called neurons. The nodes in the input layer distribute the input signals to the network. The nodes in the output layer represent the air quality characteristics. Each node in the hidden and output layers has an activation function, which transforms the node input into an output signal. The output is a function of the inputs to the first layer. The sigmoid transfer function, with range from -1 to 1, is given by equation (2), where y is the node output and x is the total node input.

2.2.3 Geostatistics method

Geostatistics is a class of statistics used to analyze and predict the values associated with spatial or spatiotemporal phenomena. It incorporates the spatial coordinates of the data within the analyses [15]. Geostatistical tools were developed as a practical means to
describe spatial patterns and interpolate values at locations where samples were not taken. These tools and methods have since evolved to provide not only interpolated values but also measures of uncertainty for those values [16,17]. Geostatistical analysis has also evolved from uni- to multivariate and offers mechanisms to incorporate secondary datasets that complement a primary variable of interest, thus allowing the construction of more accurate interpolation and uncertainty models [15]. The Kriging model assumes that at least some of the spatial variation observed in natural phenomena can be modeled by random processes with spatial autocorrelation, and requires that the spatial autocorrelation be explicitly modeled. Kriging techniques can be used to describe and model spatial patterns, predict values at unmeasured locations, and assess the uncertainty. The procedure consists of the following steps:
- Collecting the spatial vector data and attributes; the air pollution parameters were observed in Nhon Trach district in the years 2011 to 2018
- Building the base map of the Dong Nai provincial boundary, rivers and lakes, roads, etc.
- Calculating the Air Quality Index (AQI) for gases SO2 and NOx in the study area
- Interpolating the AQI index with the Kriging interpolation method
- Validating the accuracy and the standard deviation of the interpolation results

RESULTS AND DISCUSSION

3.1 Dataset of air monitoring quality

The air quality index (AQI) is calculated separately from the data of each automatic air monitoring station for the ambient air environment; an AQI value is calculated for each monitored parameter of environmental quality. Each environmental parameter is used to determine a specific AQI value; the final AQI value is the maximum of the AQI values over all parameters; the AQI scale is divided into ranges. When the AQI value falls within a given range, the corresponding warning message for the community will be
given [18]. To fully assess the air quality of the industrial zone at Nhon Trach, AQI values for SO2 and NOx are calculated following the handbook for calculation of the air quality index (AQI) issued together with Decision No. 878/QD-TCMT of July 1, 2011 by the Director General of the General Department of Environment [18]. Average AQI values of SO2 and NOx were calculated for the monitoring stations in Nhon Trach district of Dong Nai province for the years 2011 to 2018, as given in Fig. The figure shows that the overall SO2 parameter has remained at a safe level since 2011 according to the calculated AQI values. For NOx, the average AQI values in 2011 exceeded the AQI criterion of Decision No. 878/QD-TCMT, but in the following years the AQI values tended to fall.

Figure. The average AQI values of gases SO2 and NOx at monitoring locations in the industrial zone of Nhon Trach over the years 2011 to 2018 (panels: AQI value of SO2 and AQI value of NOx versus sample notation, one series per year from 2011 to 2018, with the criterion level marked)

3.2 Nonlinear multivariate model

To construct the management model of air quality, nonlinear multivariate models for the important gases SO2 and NOx need to be established to predict their concentrations [5, 9]. The coefficients of equation (1) were determined by the least squares technique using an Advanced Differential Evolution algorithm with a population size of 20, a mutation rate of 0.85 and a crossover rate of 0.7. This new evolution algorithm is used in this work. The quality of the nonlinear multivariate regression models for gases SO2 and NOx is
presented by equations (3) and (4). For these nonlinear models, the meteorological variables H (%), T (°C) and R (km/h) measured at one location are used as input variables for all locations. For validation of the models, the RMSE values are used, calculated as RMSE = (SSQ/n)^(1/2), where n is the number of residuals and SSQ is the sum of squared residuals. The RMSE values for the concentrations fitted at the industrial sites are 0.04999 for SO2 and 0.03813 for NOx, and for the predicted values R2pred = 0.9602 for SO2 and R2pred = 0.9776 for NOx. The Fig. illustrates the fitted and predicted results obtained using the nonlinear multivariate models. The correlation coefficients (R2fit) between the monitored and fitted data for the monitoring sites are 0.9334 for SO2 and 0.9704 for NOx.

The nonlinear multivariate regression model constructed for gas SO2 in the industrial zone is

$$y_{n+1} = 0.016 / y_n H_n + 142.014 (1.346 H_n + 214.673)^{-0.450} - 129.635 (-9.342 T_n - 20.893 R_n + 27543.100)^{-0.243} \qquad (3)$$

R2fit = 0.9334, RMSE = 0.04999, SSE = 0.6217, Fstat = 64775.9226

The nonlinear multivariate regression model constructed for gas NOx in the industrial zone is

$$y_{n+1} = 0.015 / y_n H_n + 56.360 (0.005 H_n + 2.933)^{-0.419} - 4.543 (1.015 T_n + 2.260 R_n + 8976.532)^{-0.222} \qquad (4)$$

R2fit = 0.9704, RMSE = 0.03813, SSE = 0.7918, Fstat = 68385.4330

Figure. Fitting plot between the monitored, fitted and predicted concentration of gases SO2 and NOx at the industrial zone using the nonlinear multivariate model (panels: SO2 and NOx concentration in mg/m3 versus year; monitored against fitted values, and monitored against predicted values)

3.3 Construction of neural network

A neural network is constructed and trained on the monitoring data of
the industrial zone collected at 11 monitoring locations, as given in Table. Owing to variations in the micrometeorological data, this approach is preferable for prediction of air quality; it was possible to train either separate neural networks or one combined neural network. As for the nonlinear models, the input parameters are wind speed R (km/h), temperature T (°C) and relative humidity H (%). The output parameters are the concentrations of gases SO2 and NOx. The neural network architecture I(k)-HL(m)-O(n) is constructed by carefully choosing the number of neurons in the hidden layer. The predictability of the gas concentrations at the monitoring stations can then be validated for the years 2011 to 2018. The convergence of the neural network guides the selection of the optimum network structure, including the number of neurons in the hidden layer. The parameters used for the training process are the sigmoid transfer function, a learning rate of 0.1, a momentum of 0.7, a target of 10000 epochs and a target MSE of 0.0001. The neural network architecture I(3)-HL(8)-O(2) is chosen for management modeling of air quality at the industrial zone of Nhon Trach, as shown in Fig. This neural network offers flexible predictive capability.

Figure. The neural network architecture I(3)-HL(8)-O(2) with three nodes in the input layer, H (%), T (°C) and R (km/h), and two nodes in the output layer, the concentrations of SO2 and NOx

The training and prediction quality of the neural network I(3)-HL(8)-O(2) is assessed using the monitoring data of the 11 locations for the years 2011 to 2018, as shown in Fig. In the training process the data set was randomly partitioned into a training set and a test set. On the training set, the correlation coefficients between the monitored and trained values are R2train = 0.9777 for SO2 and R2train = 0.9986 for NOx. The statistical errors of the neural network I(3)-HL(8)-O(2) for the training and test
processes are given by RMSE values of 0.0344 and 0.0231 for gas SO2, and 0.0263 and 0.0435 for gas NOx, respectively, as shown in Fig. 5 and Table.

Figure. Fitting plot between the training and test concentration of gases SO2 and NOx at the industrial zone using the neural network I(3)-HL(8)-O(2) (panels: SO2 and NOx concentration in mg/m3 versus year; monitored against training values, and monitored against test values)

The results show that the neural network can be trained on the data set and gives accurate predictions. The discrepancy between the actual and test lines in Fig. resulting from the neural network I(3)-HL(8)-O(2) is insignificant. This is also shown by the statistical values in Table.
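The structure of the I(3)-HL(8)-O(2) network and the RMSE measure used to judge it can be illustrated with a minimal Python sketch. It is not the JMP Pro 13 model from this work: the trained weights are not reported, so the weights below are random placeholders, `math.tanh` is assumed as a stand-in for a sigmoid bounded between -1 and 1, and the input values are made up and assumed to be pre-scaled. The function names `rmse`, `init_layer` and `forward` are hypothetical.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error, RMSE = (SSQ / n) ** 0.5."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ssq = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ssq / len(observed))


def init_layer(n_in, n_out, rng):
    """Random placeholder weights and biases (assumption: the paper's
    trained weights are not available)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def forward(x, layers):
    """Propagate one input vector through each layer in turn.

    Every node applies tanh, assumed here as the bounded sigmoid.
    """
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


rng = random.Random(0)
# Three inputs (H, T, R), eight hidden neurons, two outputs (SO2, NOx).
network = [init_layer(3, 8, rng), init_layer(8, 2, rng)]

# Made-up, pre-scaled inputs; these are not monitoring data.
sample = [0.75, 0.30, 0.10]
outputs = forward(sample, network)
print(len(outputs), all(-1.0 < y < 1.0 for y in outputs))

# RMSE on a toy pair of series: SSQ = 4, n = 3.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because the outputs of this sketch are bounded by the activation function, real concentrations would have to be scaled into that range before training and scaled back afterwards; how JMP Pro handles this step is not described in the text.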