Machine-learning modelling of tensile force in anchored geomembrane liners
K V N S Raviteja1,2, K V B S Kavya3, R Senapati4 and K R Reddy5

1 SIRE Research Fellow, Department of Civil, Materials, and Environmental Engineering, University of Illinois, Chicago, IL, USA
2 Assistant Professor, Department of Civil Engineering, SRM University AP, Amaravati, Guntur, India, E-mail: raviteja.k@srmap.edu.in
3 Research Scholar, Department of Civil Engineering, SRM University AP, Amaravati, Guntur, India, E-mail: kvbskavya@gmail.com
4 Assistant Professor, Department of Computer Science and Engineering, SRM University AP, Amaravati, Guntur, India, E-mail: rajiv.s@srmap.edu.in
5 Professor, Department of Civil, Materials, and Environmental Engineering, University of Illinois, Chicago, IL, USA, E-mail: kreddy@uic.edu (corresponding author)
Received 28 October 2022, accepted 26 February 2023
ABSTRACT: Geomembrane (GM) liners anchored in the trenches of municipal solid waste (MSW) landfills undergo pull-out failure when the applied tensile stresses exceed the ultimate strength of the liner. The present study estimates the tensile strength of the GM liner against pull-out failure from anchorage with the help of machine-learning (ML) techniques. Five ML models, namely multilayer perceptron (MLP), extreme gradient boosting (XGB), support vector regression (SVR), random forest (RF) and locally weighted regression (LWR), were employed in this work. The effects of anchorage geometry, soil density and interface friction on the tensile strength of the GM were studied. In this study, 1520 samples of soil–GM interface friction were used. The ML models were trained and tested with 90% and 10% of the data, respectively. The performance of the ML models was statistically examined using the coefficients of determination (R2, R2adj) and mean square errors (MSE, RMSE). In addition, an external validation model and K-fold cross-validation techniques were used to check the models' performance and accuracy. Among the chosen ML models, MLP was found to be superior in accurately predicting the tensile strength of the GM liner. The developed methodology is useful for tensile strength estimation and can be beneficially employed in landfill design.
KEYWORDS: Geosynthetics, Anchorage capacity, Machine learning, Geoenvironment, Landfill
REFERENCE: Raviteja, K. V. N. S., Kavya, K. V. B. S., Senapati, R. and Reddy, K. R. (2023). Machine-learning modelling of tensile force in anchored geomembrane liners. Geosynthetics International. [https://doi.org/10.1680/jgein.22.00377]
1 INTRODUCTION
A composite liner consisting of a compacted clay liner (CCL) (or geosynthetic clay liner, GCL) and a geomembrane (GM) is used to prevent leachate from escaping from municipal solid waste (MSW) landfills. The GM is placed over the CCL or GCL and overlain by a leachate drainage layer. An anchor system secures the GM to avoid pull-out failure. Figure 1 shows a schematic representation of the liner and GM anchorage system in MSW landfills. Ensuring the stability and integrity of the composite liner system is crucial in landfill design. The anchor system secures GM liners in order to avoid pull-out failure caused by stresses induced by the drainage layer (Koerner et al. 1986; Sharma and Reddy 2004). It is reported that the geosynthetic interface components are highly influenced by the properties of the overlying waste (Reddy et al. 2017). Conventional limit equilibrium analysis lacks the ability to determine displacement along the critical shear plane and to report strain levels within the composite liner system (Reddy et al. 1996).
Anchor systems can be of different geometries (simple runout, rectangular, L-shaped and V-shaped) with soil backfilled in the trenches (Koerner et al. 1986). GM liners are often prone to pull-out failure along the side slopes of the landfill during installation. Figure 2 presents the pull-out force and the corresponding resistance forces developed along a liner embedded in a V-shaped trench. The anchorage capacity should be designed in an optimal way so that it acts rigid when the mobilised tension is low,
and flexible when the mobilised tension reaches the ultimate tensile strength, to avoid tearing of the GM liner. It is important to determine the tensile force (T) based on all the system variabilities (soil properties, anchorage geometry and CCL–GM interface shear characteristics) to ensure anchorage stability.
A large number of physical tests and evaluations needs to be conducted on pull-out apparatus and shear box equipment for the experimental assessment of tensile forces in the GM liner. It is recommended to conduct one test on the tensile properties of the liner for every 100 000 ft2 (TCEQ 2017); that is, a 600-acre landfill site requires more than 3000 conformance tests to determine the tensile properties of the liner. Further, the variability associated with the various design parameters of the anchorage can demand repetitive testing for proper judgement (Raviteja and Basha 2021). The friction angles at the CCL–GM and sand–GM interfaces are the most critical parameters in anchor trench design. Most pull-out failures on the side slope are initiated at the soil–GM interface. Past and recent studies have shown that low frictional resistance at the interface, the tensile stiffness of the liner and failure of the soil mass along preferential slip lines in granular soils are some of the major causes. Inadequate analysis of soil–geosynthetic interface characteristics can result in pull-out failure.
Koerner et al. (1986) analysed anchorage resistance by determining the pressure exerted by cover and backfill soil on the GM liner. Subsequently, four design models were developed to address the stability and tension factors for cover soils on GM-lined slopes (Koerner and Hwu 1991). Qian et al. (2002) derived an expression for the tensile force in liners for simple, rectangular and V-shaped anchors by considering the normal stress from cover soil. The anchor trench pull-out resistance was analysed and compared for four design models (Raviteja and Basha 2018). A significant variability is associated with soil–GM liner interface friction, which needs to be incorporated in the design of the anchor trench (Raviteja and Basha 2015). A target reliability-based design optimisation was proposed for a V-shaped anchor trench against pull-out failure (Basha and Raviteja 2016). Huang and Bathurst (2009) developed statistical bilinear and nonlinear models for predicting the pull-out capacity of geosynthetics. The cyclic interface shear properties between sandy gravel and high-density polyethylene (HDPE) GM were experimentally evaluated and further modelled through a constitutive relationship (Cen et al. 2019). Miyata et al. (2019) proposed ML regression models to predict the pull-out capacity of steel strip reinforcement. The pull-out coefficient is determined using analytical techniques that rely on soil engineering properties, namely stress-related skin friction between soil and geosynthetics (Samanta et al. 2022).
In general, artificial intelligence (AI) techniques frequently outperform traditional and deterministic solutions. AI approaches such as artificial neural networks (ANN), genetic programming (GP) and support vector machines (SVM) are more sophisticated, resulting in wide usage for geotechnical engineering designs.
Figure 1 Schematic representation of GM liner anchored in a V-shaped trench
Figure 2 Anchorage showing the mobilised tension and interface frictional resistance acting along the length of GM liner
Several authors have identified the importance of extensive database analysis in better predicting experimental results. Machine learning (ML)-based applications are gaining prominence in geotechnical engineering (Sharma et al. 2019; Hu and Solanki 2021; Mittal et al. 2021; Rauter and Tschuchnigg 2021; Zhang et al. 2021). Chou et al. (2015) applied an evolutionary metaheuristic approach to geosynthetic-reinforced soil structures. The applicability of five different ML models was verified in determining the peak shear strength of soil–geocomposite drainage layer interfaces (Chao et al. 2021). ANN models successfully estimate the anticipated settlement in geosynthetic-reinforced soil foundations (Raja and Shukla 2021). It is reported that the pull-out coefficient in geogrids can be accurately predicted using random forest regression (RFR) (Pant and Ramana 2022). Ghani et al. (2021) studied the response of strip footing resting on prestressed geotextile-reinforced industrial waste using ANN and extreme ML. The complex heterogeneous nature of soil properties and the peculiar interaction with various geosynthetic materials can be simulated and well analysed using ML models. Chao et al. (2023) experimentally validated the peak shear strength of the clay–GM interface predicted using AI algorithms.

This paper used five different ML models to build an anchorage model for assessing tensile force against pull-out failure. A dataset was compiled from published test results that include soil parameters, soil–liner interface friction angle (δ), side slope angle (α) and allowable tensile force (Ta). The ML models were studied using K-fold cross-validation (CV) and grid search to find hyperparameters for a better prediction of results. A comparative analysis was carried out to determine the superior ML model.
2 METHODOLOGY
2.1 Anchored GM liner tensile force against pull-out force
The tension mobilised in the anchored GM liner is affected by the friction at the soil–liner interface, the overburden pressure from the soil cover, the liner alignment, the trench geometry, construction activities and equipment loads at the crest portion. A high mobilised tension may pull the liner out of the anchorage. Conversely, a rigid anchorage can lead to tearing of the GM liner. Figure 2 shows the GM liner anchorage indicating the resisting (f) and pull-out (T) forces. The anchor-holding capacity should preferably lie between the allowable tensile force and the ultimate tensile force of the GM liner to avoid both pull-out failure and tearing of the GM liner. However, as suggested by Koerner (1998) and Qian et al. (2002), pull-out failure is preferable to tensile failure of the GM liner. Basha and Raviteja (2016) reported Equation 1 for calculating the allowable GM tensile force against pull-out failure (Ta), based on the Qian et al. (2002) theory and considering the GM liner as a continuous member throughout its length.
$$T_a = \gamma\, d_{cs}\, L_{ro} \tan\delta + 2\gamma\, (d_{cs} + 0.5\, d_{at})\, L_{at} \tan\delta \cos\psi \quad (1)$$

where γ is the unit weight of soil, d_cs is the depth of cover soil, L_ro is the runout length, δ is the interface friction angle, ψ is the trench angle, α is the angle of the side slope, and L_at and d_at are the length and depth of the anchor trench, respectively.
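As a quick numerical illustration of Equation 1, the allowable tensile force can be evaluated directly. The sketch below is not part of the original study; the function name and the input values are illustrative assumptions chosen within the practical ranges discussed later with Table 1.

```python
import numpy as np

def allowable_tensile_force(gamma, d_cs, L_ro, delta, psi, L_at, d_at):
    """Allowable GM tensile force T_a per Equation 1 (kN/m).

    gamma : unit weight of soil (kN/m^3)
    d_cs  : depth of cover soil (m)
    L_ro  : runout length (m)
    delta : soil-GM interface friction angle (rad)
    psi   : trench angle (rad)
    L_at, d_at : length and depth of anchor trench (m)
    """
    runout_term = gamma * d_cs * L_ro * np.tan(delta)
    trench_term = 2.0 * gamma * (d_cs + 0.5 * d_at) * L_at * np.tan(delta) * np.cos(psi)
    return runout_term + trench_term

# Illustrative (assumed) inputs, not values from the paper
Ta = allowable_tensile_force(gamma=17.0, d_cs=0.3, L_ro=1.0,
                             delta=np.radians(25.0), psi=np.radians(45.0),
                             L_at=0.5, d_at=0.5)
print(f"T_a = {Ta:.2f} kN/m")
```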
2.2 Multilayer perceptron (MLP)
The MLP is a multilayered network with input and output layers and hidden units that can represent a variety of nonlinear functions. The MLP is a type of artificial neural network. Interactions between the inputs and outputs can be represented using multilayer neural networks. Each layer consists of neurons linked across different layers with connection weights. The weights of the connections are adjusted based on the output error, which is the difference between the ideal and predicted outputs, when propagating backwards. Backpropagation is the method for updating the weights in such multilayered neural networks. The graphical representation of the MLP architecture is shown in Figure 3.

For example, a two-layer network with one hidden layer and one output layer, provided with a requisite number of hidden units and appropriate activation functions, can represent any Boolean function and approximate any continuous function within a tolerance. The algorithm is trained as given in the following steps. Step 1: initialise the structure of the network, as well as the weights, with small random values at different biases in the network. Step 2: forward computing: apply the training examples ((x1, y1), (x2, y2), …, (xm, ym)) to the network one by one, where x (input vector) = {γ, dcs, Lro, δ, ψ, α, Lat, dat} and y (output vector) = {Ta}. Step 3: update weights: the predicted output, say ŷ = Ta, is obtained for a particular configuration of the network.
Figure 3 Architecture of MLP
If there is a difference between y and ŷ, the weight vectors are adjusted accordingly based on the computation of error signals to the neurons. Step 4: repeat the process with updated weights until the model converges, to obtain a small error between the actual and predicted outputs. Consider a training dataset of m samples (x1, x2, …, xm). The forward propagation calculation is given in Equations 2 and 3. Each neuron consists of linear and activation functions, as shown in Figure 3. Further, the loss is calculated using the function in Equation 4.
$$z_h = a(w_h^T x) \quad (2)$$

$$\hat{y} = v_h^T z \quad (3)$$

$$E = \frac{1}{2}\,(y - \hat{y})^2 \quad (4)$$

where z_h indicates the activation of the hidden layer, a is the activation function, w_h is the weight vector, v_h^T is the transpose of the weight vector and E is the loss function.
The calculated errors are back-propagated, updating the weights w_h and v_h using gradient descent as given in Equations 5 and 6. The number of hidden layers and the number of neurons in each layer affect the model performance.

$$\Delta v_h = -\eta\, \frac{\partial E}{\partial v_h} \quad (5)$$

$$\Delta w_h = -\eta\, \frac{\partial E}{\partial w_h} \quad (6)$$

where η is the learning factor.
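To make Equations 2–6 concrete, the following NumPy sketch implements a single gradient-descent step for a one-hidden-layer network with a ReLU activation. It is a minimal illustration of the update rules only, not the paper's scikit-learn implementation; the layer width and learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 8, 10          # eight predictors; hidden width is assumed
W = rng.normal(scale=0.1, size=(n_hidden, n_in))   # hidden-layer weights w_h
v = rng.normal(scale=0.1, size=n_hidden)           # output weights v_h
eta = 0.01                                         # learning factor (Equations 5-6)

def train_step(x, y):
    """One forward/backward pass for a single training example (x, y)."""
    global W, v
    z = np.maximum(W @ x, 0.0)     # Equation 2: z_h = a(w_h^T x), ReLU activation
    y_hat = v @ z                  # Equation 3: y_hat = v_h^T z
    err = y_hat - y                # dE/dy_hat for E = 0.5 (y - y_hat)^2 (Equation 4)
    grad_v = err * z               # gradient for Equation 5
    grad_W = np.outer(err * v * (z > 0), x)  # gradient for Equation 6 through ReLU
    v -= eta * grad_v              # gradient-descent weight updates
    W -= eta * grad_W
    return 0.5 * err ** 2          # loss E

x = rng.uniform(size=n_in)         # stand-in example: {gamma, d_cs, L_ro, ...}
print(train_step(x, y=1.0))
```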
2.3 Extreme gradient boosting (XGB)
XGB is a supervised ML algorithm proposed by Chen and Guestrin (2016). Gradient tree boosting is one of the techniques that works efficiently for classification and regression applications. The regular boosting algorithm works on the principle of sequentially ensembling multiple weak learners to form a strong learner. Generally, a weak learner is a small decision tree with few splits, wherein each tree learns from the errors made by the previous model until there is no further improvement. XGB works on the same principle with additional regularisation parameters, which improve the model's accuracy by preventing overfitting. Figure 4 illustrates the architecture of XGB. The current output of the mth tree is the sum of the previous tree output and the hypothesis function of the current tree multiplied by the regularisation parameter (Equations 7 and 8).
$$T_m(X) = T_{m-1}(X) + (\alpha_r)_m\, h_m(X,\, r_{m-1}) \quad (7)$$

$$\arg\min \sum_{i=1}^{m} L\left[\,Y_i,\; T_{i-1}(X_i) + \alpha_r\, h_i(X_i,\, r_{i-1})\,\right] \quad (8)$$

where T_m(X) is the mth tree output, (α_r)_i is a regularisation parameter, r_i is the residuals computed with the ith tree, h_i is a function trained to predict the residuals and L(Y, T(X)) is the differentiable loss function.
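A minimal sketch of this boosting scheme with the xgboost Python package is shown below; the synthetic data and the hyperparameter values are placeholders, not the study's tuned settings (the tuning ranges are described in Section 4).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# Synthetic stand-in for the 1441-sample dataset (8 predictors, target T_a)
rng = np.random.default_rng(42)
X = rng.uniform(size=(1441, 8))
y = X @ rng.uniform(size=8) + 0.05 * rng.normal(size=1441)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1,
                                                    random_state=42)

model = XGBRegressor(
    n_estimators=400,      # number of sequentially ensembled weak learners
    learning_rate=0.1,     # shrinkage applied to each tree's contribution
    max_depth=4,           # size of each weak learner (small tree, few splits)
    min_child_weight=3,    # minimum sum of instance weights in a child node
    reg_lambda=1.0,        # regularisation term that curbs overfitting
)
model.fit(X_train, y_train)
print(model.predict(X_test[:3]))
```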
2.4 Support vector regression (SVR)
SVMs are supervised machine-learning models designed for classification and later extended to regression problems (Vapnik 1995). An SVM predicts discrete categorical labels as a classifier, whereas SVR estimates continuous variables as a regressor (Vapnik 1997). Although SVR is based on the same processing principle, it is used to solve regression problems, unlike SVM. Problem-solving involves constructing a hyperplane that separates the positive and negative parts of the data, along with two decision boundaries that are parallel to the hyperplane. The region between these boundaries is known as the insensitive region (the ε-tube), which makes the data linearly separable. In SVR, the algorithm forms the best tube by formulating an optimisation problem. SVR thus balances model complexity against prediction errors by providing a good approximation.
Figure 4 Process of XGB
In this method, the search converges at a hyperplane that holds the maximum amount of training data within the boundaries (the ε-tube). To estimate a linear function, SVR can be formulated as given in Equation 9.

$$f(x) = (w \cdot x) + b \quad (9)$$

Consider a two-dimensional (2D) training set (x1, x2, …, xn), where x is an input variable, y is a target variable and n is the number of variables. The core goal of SVR is to obtain y. The divergence from the actual output (ε) can be controlled by minimising the Euclidean norm of the weight vector (w) (Equation 10), subject to the constraints (Equation 11). The algorithm finds a weight vector for which most samples lie within the margin. Prediction error that lies outside the margin can be decreased by inserting the slack variables ξi and ξi*, which convert the hard margin into a soft one. The optimisation functions are provided in Equations 12 and 14, and the corresponding constraints are given in Equations 13 and 15. If the hyperparameter (C) is too high, the model will not allow large slacks. If C = 0, the slack variables are not penalised, so they can grow arbitrarily large, resulting in poor model performance.
$$\text{Minimise:} \quad \frac{1}{2}\, \|w\|^2 \quad (10)$$

$$\text{Constraints:} \quad \begin{cases} y_i - (w \cdot x_i) - b \le \varepsilon \\ (w \cdot x_i) + b - y_i \le \varepsilon \end{cases} \quad (11)$$

where b is a dimensionless constant variable.

$$\text{Minimise:} \quad \frac{1}{2}\, \|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad (12)$$

$$\text{Constraints:} \quad \begin{cases} y_i - (w \cdot x_i) - b \le \varepsilon + \xi_i \\ (w \cdot x_i) + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i,\ \xi_i^* \ge 0 \end{cases} \quad (13)$$

$$\text{Minimise}(w, b): \quad \frac{1}{2}\, \|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad (14)$$

$$\text{Constraints:} \quad \begin{cases} \sum_{k=1}^{K} w_k\, \phi(x_{i,k}) + b - y_i \le \varepsilon + \xi_i^* \\ y_i - \sum_{k=1}^{K} w_k\, \phi(x_{i,k}) - b \le \varepsilon + \xi_i \\ \xi_i,\ \xi_i^* \ge 0 \end{cases} \quad (15)$$

The model described above is for linear regression problems. SVR is flexible and handles nonlinear regression problems by projecting the data into a high-dimensional space using kernel methods to avoid complexity. In the nonlinear process, SVR adopts a kernel function (Φ) that represents the nonlinear relationship between w and x. Figure 5 presents the architecture of SVR. Among the various kernel functions, radial basis and polynomial functions have been successfully employed for geotechnical engineering problems (Debnath and Dey 2018).
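The following scikit-learn sketch shows an RBF-kernel SVR of the form described above; the data and the hyperparameter values (C, epsilon, gamma) are illustrative assumptions rather than the study's tuned settings.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (8 predictors, target T_a)
rng = np.random.default_rng(0)
X = rng.uniform(size=(1441, 8))
y = X @ rng.uniform(size=8)

# C trades margin width against slack; epsilon sets the insensitive tube;
# the RBF kernel supplies the nonlinear mapping (Phi)
model = make_pipeline(StandardScaler(),
                      SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma="scale"))
model.fit(X, y)
print(model.predict(X[:3]))
```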
2.5 Random forest (RF)
RF is a bootstrap aggregation-based ensemble machine-learning algorithm built on decision trees, developed by Breiman (2001). The method incorporates randomness during the attribute selection phase while resampling the training data. Bagging techniques force the ensemble model to generate a variety of decision trees, where each tree acts on a different data subset. Since the trees are built from a random selection of samples and features, they create numerous random trees, forming a random forest. The RF modelling procedure is depicted in Figure 6. The RF method is superior to the single decision tree technique: RF has low variance and low bias, as the method averages a number of decision trees trained on various parts of the same training data, making it better for prediction. A dataset with m samples creates a decision tree from several bootstrap samples by considering only a random subset of the total of F features. Hence, D features are evaluated for each tree at each split, where

$$D = \sqrt{F}$$

The correlation between the trees is reduced when they are trained with random subsets of features. As with bagging, training is typically performed for a large number of trees. As given in Equation 16, the average of all random tree outputs O1, O2, O3, …, On is used as the regression model output (Cn):

$$C_n = \frac{1}{n} \sum_{i=1}^{n} O_i \quad (16)$$
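A scikit-learn sketch of the RF regressor follows; max_features="sqrt" realises the D = √F feature subsampling, while the remaining values are illustrative assumptions rather than the study's tuned hyperparameters.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (8 predictors, target T_a)
rng = np.random.default_rng(0)
X = rng.uniform(size=(1441, 8))
y = X @ rng.uniform(size=8)

model = RandomForestRegressor(
    n_estimators=500,        # number of bootstrap-trained trees
    max_depth=5,             # depth of each tree
    min_samples_leaf=5,      # minimum samples at a leaf node
    max_features="sqrt",     # D = sqrt(F) features considered per split
    random_state=42,
)
model.fit(X, y)              # per Equation 16, prediction averages the tree outputs
print(model.predict(X[:3]))
```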
2.6 Locally weighted regression (LWR)
LWR is a non-parametric, supervised learning algorithm. As the name indicates, LWR predictions are based on data close to the new instance, and the contribution of each training example is weighted based on its distance from the new instance. LWR excludes the training phase, so all the work is performed during the testing/prediction phase. Further, unlike simple regression, LWR considers the full dataset when making predictions, since a regression line must be constructed local to each new data point. Thus, LWR overcomes the limitations of linear regression by assigning weights to the training data (Cleveland and Devlin 1998).
Figure 5 Architecture of SVR
The weights are higher for data points close to the new data point being predicted by the algorithm, as shown in Figure 7. The method optimises Θ to a minimum by modifying the cost function (Equation 17). The computation of the weighting function (wi) is given in Equation 18. The learning algorithm chooses the parameters Θ for better predictions. This approximation calculates the estimated target value for the query instance.
$$\sum_{i=1}^{m} w_i \left( y_i - \Theta^T x_i \right)^2 \quad (17)$$

$$w_i = e^{-(x_i - x)^2 / 2\tau^2} \quad (18)$$
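The sketch below implements Equations 17 and 18 directly in NumPy: Gaussian weights are computed around the query point and a weighted least-squares fit is solved for every prediction. The function name and the bandwidth value tau are assumptions; the study itself used the scikit-lego package (see Section 4).

```python
import numpy as np

def lwr_predict(x_query, X, y, tau=0.5):
    """Locally weighted regression prediction at a single query point."""
    # Equation 18: w_i = exp(-(x_i - x)^2 / (2 tau^2)), using a squared
    # Euclidean distance for multivariate inputs
    w = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2.0 * tau ** 2))

    # Minimise Equation 17, sum_i w_i (y_i - Theta^T x_i)^2, via weighted
    # least squares on a design matrix with a bias column
    Xb = np.hstack([np.ones((len(X), 1)), X])
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(Xb * sw[:, None], y * sw, rcond=None)
    return np.concatenate(([1.0], x_query)) @ theta

# Stand-in data: the fit is re-computed locally for every query (no training phase)
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 8))
y = X @ rng.uniform(size=8)
print(lwr_predict(X[0], X, y))
```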
2.7 Grid search hyperparameter optimisation
In ML models, hyperparameters must be specified to adapt a model to the dataset. The general effects of hyperparameters on a model are frequently recognised, but determining the appropriate hyperparameters, and the combinations of interacting hyperparameters, for a given dataset can be complex. Systematically searching different preferences for the model hyperparameters and selecting the subset that produces the best model on a given dataset is one of the best approaches. This is referred to as hyperparameter optimisation or tuning. The scikit-learn library in Python provides this functionality with different optimisation techniques that can deliver a unique set of high-performing hyperparameters. Random and grid searches are the two primary and most widely used optimisation techniques for tuning. In this study, grid search is used to generate an optimised model. It treats the search space as a grid of hyperparameter values and evaluates each position in the grid. Grid search is ideal for double-checking combinations that have previously performed well.
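As an illustration, the grid search can be set up with scikit-learn's GridSearchCV; the estimator and grid values below are assumptions for demonstration, loosely echoing the RF tuning ranges reported in Section 4.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 8))
y = X @ rng.uniform(size=8)

param_grid = {
    "n_estimators": [300, 500, 700],
    "max_depth": [2, 4, 6],
    "min_samples_leaf": [5, 10],
}
# Every grid position is evaluated with 5-fold CV; the best subset is retained
search = GridSearchCV(RandomForestRegressor(random_state=42), param_grid,
                      cv=5, scoring="neg_root_mean_squared_error")
search.fit(X, y)
print(search.best_params_)
```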
2.8 K-fold CV
K-fold CV is a specific type of predictive analytic framework that can be applied with various types of models. It consists of the following steps: (1) the original data is randomly divided into K subsamples, which serve as the training data; (2) for each fold, models are estimated using K − 1 subsamples, with the Kth subsample serving as a validation dataset. The process is repeated until all subsamples have served as validation, and the model result can be averaged across the folds. The K-fold CV can also be extended by dividing the original data into a subset that goes through the K-fold CV process, while the rest of the data is split into another subgroup that can be used to evaluate the final model performance. This final data subset is often termed the test data; the testing set is utilised to assess the generalisation error of the finalised model (Zhang et al. 2021). K-fold CV thus considers all three components: training, validation and testing. Although there is no definite rule for determining the value of K, the number of folds was set to five in this study, as suggested by Kohavi (1995) and Wang et al. (2015).
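A minimal sketch of the five-fold procedure with scikit-learn follows; the estimator choice and the synthetic data are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 8))
y = X @ rng.uniform(size=8)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, val_idx in kf.split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X[train_idx], y[train_idx])               # train on K-1 folds
    scores.append(model.score(X[val_idx], y[val_idx]))  # validate on the Kth fold
print(np.mean(scores))   # result averaged across the folds
```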
Figure 6 Process of RFR
Figure 7 Architecture of LWR
Table 1 Statistical descriptors for input and output variables

Figure 8 Histograms indicating the wide range of variability among input parameters
3 DATABASE

In ML, the sample size is a crucial factor in developing an effective prediction model. In this study, 1520 samples were generated from collated laboratory results of soil–GM interface friction to formulate the models (Raviteja and Basha 2015). The anchorage geometric parameters were considered over wide ranges covering all practical possibilities. A dataset of 1441 samples (excluding outliers) was compiled with nine variable parameters: soil–liner interface friction angle (δ), unit weight of soil (γ), runout length (Lro), depth of anchor trench (dat), depth of cover soil (dcs), slope angle of trench (ψ), length of anchor trench (Lat), side slope angle (α) and allowable GM tensile force (Ta). The statistics of these parameters are listed in Table 1 with the mean value (μ), standard deviation (σ) and size of the dataset (n). The extent of variability among the input parameters is shown in Figure 8.

Raviteja and Basha (2018) developed a mathematical model to accurately predict anchorage capacity as a function of friction angle, unit weight of cover soil and trench geometry. There are standard ranges for the variables, such as the geometry of the trench, the depth of soil cover and the thickness of the GM liner, to be used in MSW containment facilities. The anchorage capacity was computed by varying these variables within the practical ranges shown in Table 1, as well as by using 65 measured friction angle values. These results were then used as data points for the ML modelling. Thus, with the modelled variables and the 65 measured friction angle values, there was a total of 1441 data points for ML modelling.

In general, deep neural network (DNN) models such as convolutional neural networks (CNN), recurrent neural networks (RNN) and generative adversarial networks (GAN) require a large amount of data (from hundreds to thousands of samples) to build a successful ML model. However, such DNN models were not used in the present study. The present study employs MLP (a shallow neural network), XGB (regression/classification), SVR (regression/classification), RF (classification/regression) and LWR (regression). For the chosen ML models, as a general rule of thumb, 50–100 data points per predictor are required to build an efficient ML model (Gopaluni 2010; Bujang et al. 2018; Hecht and Zitzmann 2020). It is crucial to evaluate all other influencing variables when determining the optimal amount of data for analysis. The present study employed 1441 input data samples for each of the nine variables used in the analysis.
3.1 Correlation analysis
The correlation coefficient is determined to understand the relationships between the dataset parameters. Pearson's correlation coefficient is used in this study to quantify the relationship between each pairwise parameter in terms of strength and direction. Correlation coefficient (ρ) values range from −1 to +1 (−1 ≤ ρ ≤ 1), where +1 indicates a strong positive correlation and −1 indicates a strong inverse correlation. The heat map presenting the correlation coefficients of the chosen parameters is given in Figure 9. The analysis was useful in identifying the significant governing parameters and in the further development of the model algorithms. As is evident from Table 2, the soil–liner interface friction (δ) has a strong influence in governing the tensile force against pull-out failure.
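A pandas/matplotlib sketch of such a Pearson heat map follows; the column names mirror the paper's variables, but the generated data is a synthetic placeholder.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

cols = ["delta", "gamma", "L_ro", "d_at", "d_cs", "psi", "L_at", "alpha", "T_a"]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(size=(1441, 9)), columns=cols)  # stand-in data

corr = df.corr(method="pearson")    # rho in [-1, 1] for every pairwise combination
plt.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
plt.xticks(range(len(cols)), cols, rotation=45)
plt.yticks(range(len(cols)), cols)
plt.colorbar(label="Pearson correlation coefficient")
plt.tight_layout()
plt.show()
```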
3.2 Examination of outliers
Outliers are points that differ from the remaining observations by taking extreme values. These outliers can interfere with training and cause algorithms to underperform, resulting in less accurate ML models. In this study, the box plot was used to clean the data. It is a standardised method of displaying the data distribution, measured using a five-point summary (minimum, 1st quartile (Q1), median, 3rd quartile (Q3), maximum). It is used to detect outliers and identify the data distribution.
Figure 9 Heat map representation of correlation coefficient matrix
Equations 19 and 20 present the computation of the maximum and minimum limits. Figure 10 depicts the box plot, with maximum and minimum limits of 45.95 and 0.51, respectively. A total of 79 outliers were identified in the dataset and eliminated before training the model.

$$L_{max} = Q_3 + 1.5\, IQR \quad (19)$$

$$L_{min} = Q_1 - 1.5\, IQR \quad (20)$$

where IQR is the inter-quartile range = (Q3 − Q1).
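The IQR-based limits of Equations 19 and 20 translate directly into a filtering step; a sketch with assumed stand-in data follows.

```python
import numpy as np

rng = np.random.default_rng(0)
Ta = rng.lognormal(mean=2.0, sigma=0.6, size=1520)  # stand-in for raw T_a samples

q1, q3 = np.percentile(Ta, [25, 75])
iqr = q3 - q1                      # inter-quartile range (Q3 - Q1)
upper = q3 + 1.5 * iqr             # Equation 19: maximum limit
lower = q1 - 1.5 * iqr             # Equation 20: minimum limit

Ta_clean = Ta[(Ta >= lower) & (Ta <= upper)]
print(f"removed {len(Ta) - len(Ta_clean)} outliers")
```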
4 IMPLEMENTATION OF DIFFERENT MODELS

In the present study, five different ML models are employed: (1) MLP: a neural network approach to predict output features; (2) XGBoost: a boosting technique that uses multiple decision trees trained sequentially; (3) RF: a bagging technique that uses different decision trees trained in parallel on sub-sampled data; (4) SVR: a technique that uses the kernel trick to fit nonlinear data; and (5) LWR: a technique that fits the model using locally weighted regression based on the original data. The chosen ML models follow different prediction strategies and are suitable for solving regression problems. All the analyses are performed using the Python programming language (v3.8.11). The Pandas and NumPy libraries are used to process and analyse the data, Sklearn is used to code the different algorithms and the matplotlib library is used to visualise the results. The dataset, consisting of 1441 data points, is randomly split at a ratio of 90 : 10 for training and testing, respectively. For optimisation and better model performance, five-fold CV was performed on the 90% training set, wherein four folds were utilised for training and the remaining one for validation. Prediction on the testing set is completed by averaging the results obtained from the five folds for each model. Figure 11 illustrates this framework development.
The MLP algorithm was trained on the data using the scikit-learn package in Python. Optimisation of the algorithm was performed by fine-tuning the hyperparameters using grid search. The critical parameters in MLP are the number of layers (chosen between 1 and 4), the number of hidden units (50 to 300) and the activation function (rectified linear unit, ReLU). Training an MLP with more layers and hidden units may degrade the performance by overfitting, as the available data in the present work is limited. The XGB algorithm was trained using the XGBoost package in Python. As XGBoost is based on sequentially ensembling multiple weak learners (estimators), increasing the number of estimators may overfit the data. Parameters such as the number of trees (ranging from 300 to 600), the learning rate, the regularisation parameter, the maximum depth and the minimum child weight were fine-tuned. The regularisation parameter is critical in determining how much weightage each estimator should receive in the final prediction.
Table 2 Correlation discretisation among the variables

ρ        | Correlation strength | Variables
0.2–0.5  | Moderate             | Lat, dcs, Lro, ψ
Figure 11 Framework for development and operation of ML models
Figure 10 Box plot showing the data pattern for allowable tensile force (Ta)
In SVR, the best fit is a hyperplane that contains the maximum number of points; nonlinear relations are fitted using the kernel trick. The critical hyperparameters are the kernel ('rbf', 'poly', etc.), C (the regularisation parameter) and ε (which specifies the ε-tube). Parameter C is critical to the model performance: a large C value leads to a small margin, and vice versa. The RF algorithm was constructed using the scikit-learn module. Optimisation of the algorithm was performed by fine-tuning hyperparameters such as the number of trees, the number of samples required to split a node, the depth of a tree and the minimum number of samples required at a leaf node. RF was trained by varying the number of trees from 300 to 700 and the depth from 1 to 6, with the minimum number of samples at the leaf node ranging from 5 to 10. It is observed that, as the number of trees in RF increases, the model tends to overfit and then stabilise at a certain point. The LWR algorithm was trained using the scikit-lego module. It has been used for smoothing and can be used in ML applications for interpolating data (Debnath and Dey 2018). The critical hyperparameters are sigma and span, which are used to smooth the curve.
4.1 Metrics of performance
The following performance indicators were used to assess the prediction outputs of the ML models. Equation 21 is used to compute the model's RMSE (root mean square error); an RMSE near zero indicates a low prediction error. The coefficient of determination (R2) is calculated as given in Equation 22; the closer R2 is to 1, the better the model fits the data. The computation of the mean absolute percentage error (MAPE) is shown in Equation 23; a MAPE near zero indicates a high degree of predictability.
$$RMSE = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2} \quad (21)$$

$$R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \quad (22)$$

$$MAPE = \frac{100\%}{m} \sum_{i=1}^{m} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \quad (23)$$
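Equations 21–23 can be coded directly; a minimal NumPy sketch with toy values is shown below.

```python
import numpy as np

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))            # Equation 21

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot                         # Equation 22

def mape(y, y_hat):
    return 100.0 * np.mean(np.abs((y - y_hat) / y))      # Equation 23

y = np.array([10.0, 20.0, 30.0])        # toy observed values
y_hat = np.array([11.0, 19.0, 31.0])    # toy predicted values
print(rmse(y, y_hat), r2(y, y_hat), mape(y, y_hat))
```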
4.2 Predictive analysis by external validation
An external validation method that analyses model predictability using test dataset performance was adopted (Golbraikh and Tropsha 2002). The evaluation of prediction accuracy involves a comparison of the observed and predicted values of an external test set that is not used in model development (Kubinyi et al. 1998; Zefirov and Palyulin 2001). The model must satisfy the four criteria given below in Equations 24–27 to be considered acceptable. The four conditions considered are the external validation criteria (C1, C2, C3 and C4).
$$R^2 > 0.6 \quad (24)$$

$$0.85 \le k \le 1.15 \;\;\text{or}\;\; 0.85 \le k' \le 1.15 \quad (25)$$

$$\frac{R^2 - R_o^2}{R^2} < 0.1 \;\;\text{or}\;\; \frac{R^2 - R_o'^2}{R^2} < 0.1 \quad (26)$$

$$\left| R_o^2 - R_o'^2 \right| < 0.3 \quad (27)$$
The values of k and k′ can be calculated using Equations 28 and 29. The correlation coefficients for the regressions passing through the origin can be obtained from Equations 30–33.

$$k = \frac{\sum_{i=1}^{n} y_{p,i}\, y_{o,i}}{\sum_{i=1}^{n} y_{p,i}^2} \quad (28)$$

$$k' = \frac{\sum_{i=1}^{n} y_{p,i}\, y_{o,i}}{\sum_{i=1}^{n} y_{o,i}^2} \quad (29)$$

$$R_o^2 = 1 - \frac{\sum_{i=1}^{n} y_{p,i}^2\, (1 - k)^2}{\sum_{i=1}^{n} (y_{p,i} - \bar{y}_p)^2} \quad (30)$$

$$R_o'^2 = 1 - \frac{\sum_{i=1}^{n} y_{o,i}^2\, (1 - k')^2}{\sum_{i=1}^{n} (y_{o,i} - \bar{y}_o)^2} \quad (31)$$

$$R_s^2 = R^2 \left( 1 - \sqrt{R^2 - R_o^2} \right) \quad (32)$$

$$R_s'^2 = R^2 \left( 1 - \sqrt{R^2 - R_o'^2} \right) \quad (33)$$
Unlike the coefficient of determination (R2), which increases with the addition of variables, the adjusted coefficient (R2adj) increases only when a significant variable that contributes to the performance of the model is added. It thereby indicates the most influential parameters that govern the design.

$$R_{adj}^2 = 1 - \frac{(1 - R^2)(N - 1)}{N - p - 1} \quad (34)$$

where N is the total sample size and p is the number of independent variables.
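The slope and origin-based coefficients above can be computed directly; the sketch below follows the reconstructed Equations 28–31 and checks the criterion of Equation 25, with synthetic observed/predicted values as placeholders.

```python
import numpy as np

def validation_coefficients(y_obs, y_pred):
    """Slopes and origin-passing R^2 values (Equations 28-31, as reconstructed)."""
    k = np.sum(y_pred * y_obs) / np.sum(y_pred ** 2)          # Equation 28
    k_dash = np.sum(y_pred * y_obs) / np.sum(y_obs ** 2)      # Equation 29
    r2_o = 1.0 - (np.sum(y_pred ** 2 * (1 - k) ** 2)
                  / np.sum((y_pred - y_pred.mean()) ** 2))    # Equation 30
    r2_o_dash = 1.0 - (np.sum(y_obs ** 2 * (1 - k_dash) ** 2)
                       / np.sum((y_obs - y_obs.mean()) ** 2))  # Equation 31
    return k, k_dash, r2_o, r2_o_dash

rng = np.random.default_rng(0)
y_obs = rng.uniform(1.0, 50.0, size=100)            # stand-in observed values
y_pred = y_obs + rng.normal(scale=1.0, size=100)    # stand-in predictions

k, k_dash, r2_o, r2_o_dash = validation_coefficients(y_obs, y_pred)
print(0.85 <= k <= 1.15 or 0.85 <= k_dash <= 1.15)  # criterion of Equation 25
```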
5 RESULTS

The performance of the various models was assessed based on three popular indicators: RMSE, MAPE and the coefficient of determination (R2). The mathematical expressions for these performance indicators are given in Equations 21–23. The calculations were performed using the actual and predicted outputs of the different models on the training, validation and test datasets. A model can be considered best-fit if its performance is consistent across all three evaluation indicators. A model can be considered overfit if it performs well on the training set but fails to predict the test set, while a model is considered underfit if it does not perform well on the training set but performs well on the test set. The training, validation and testing errors for all models across all the performance indicators are summarised in Table 3. Insufficient data points result in overfitting of the ML model. In such cases, the model becomes too specialised to the training data and does not