Accepted Manuscript

Title: An implementation of the Levenberg–Marquardt algorithm for simultaneous-energy-gradient fitting using two-layer feed-forward neural networks
Authors: Hieu T. Nguyen-Truong, Hung M. Le
PII: S0009-2614(15)00256-0
DOI: http://dx.doi.org/10.1016/j.cplett.2015.04.019
Reference: CPLETT 32926
To appear in: Chemical Physics Letters
Received date: 10-3-2015
Revised date: 6-4-2015
Accepted date: 7-4-2015

Please cite this article as: Hieu T. Nguyen-Truong, Hung M. Le, An implementation of the Levenberg–Marquardt algorithm for simultaneous-energy-gradient fitting using two-layer feed-forward neural networks, Chemical Physics Letters (2015), http://dx.doi.org/10.1016/j.cplett.2015.04.019

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

An implementation of the Levenberg–Marquardt algorithm for simultaneous-energy-gradient fitting using two-layer feed-forward neural networks

Hieu T. Nguyen-Truong, Hung M. Le*
Faculty of Materials Science, University of Science, Vietnam National University, Ho Chi Minh City, Vietnam
*Corresponding author. Tel.: 84-838-350-831. Email address: hung.m.le@hotmail.com (Hung M. Le)

Abstract

We present in this study a new and robust algorithm for feed-forward neural network (NN) fitting. The method is developed for application in potential energy surface (PES) construction, in which simultaneous energy-gradient fitting is implemented using the well-established Levenberg–Marquardt (LM) algorithm. Three fitting examples are demonstrated: the vibrational PES of H2O and the reactive PESs of O3 and ClOOCl. In all three test cases, the new LM implementation works very efficiently. Besides increasing the fitting accuracy, it offers two further advantages: fewer training iterations and fewer data points are required for fitting.

Keywords: Levenberg–Marquardt algorithm, feed-forward neural network, potential energy surface

Introduction

For years, artificial neural networks (NNs) have been a robust and powerful tool for constructing ab initio potential energy surfaces with reasonable computational effort [1–23]. However, in most of the above studies, large data sets are required to construct qualitative PESs that sufficiently describe reaction channels and long-range atomic interactions. It is therefore important to develop an efficient training algorithm with better performance to improve the fitting accuracy of NNs.

In most studies, energy is obviously the main fitting objective in NN PES construction. It should be noted that atomic forces, which are derived from the energy gradients, also need to be well approximated, since they are essential for solving the differential equations of motion in molecular dynamics simulations. Not long ago, Pukrittayakamee et al. [7] established an approach to simultaneously fit energies and forces in NN models using the back-propagation algorithm. In principle, this method not only improves training efficiency but also helps to reduce the size of the training data.
We have in fact confirmed the validity of this argument in a previous study, in which the back-propagation method was employed to fit energies and gradients simultaneously; that work also incorporated symmetry permutations of the input variables [13]. However, the back-propagation method often suffers from an extremely low convergence rate, and there have been several efforts to improve its efficiency [24, 25]. The Jacobian-based Levenberg–Marquardt algorithm, on the other hand, is very promising in terms of performance.

In the present work, we develop a new version of the Levenberg–Marquardt algorithm to simultaneously fit energies and forces using a two-layer feed-forward NN. We term this newly implemented algorithm the function-derivative Levenberg–Marquardt (FD-LM) fitting method. To validate its efficiency, we apply the FD-LM approach to three molecular systems with different levels of PES roughness: H2O, O3, and ClOOCl. By directly comparing the PES fitting accuracy of the present work with results obtained by the back-propagation algorithm [13] and by fitting energies only [11], we observe that FD-LM significantly improves the fitting accuracy and thereby saves considerable computational resources.

Mathematical Background and Implementation

It is important to note that the traditional approximation method for two-layer feed-forward NNs fits only the targeted energies. In order to include force (gradient) fitting, we follow Pukrittayakamee et al. [7] and employ a minimization scheme of the mean square error (MSE), the so-called performance index:

P = \frac{1}{L} \sum_{q=1}^{Q} \left[ \left( t_q - a_q \right)^2 + \frac{\rho}{R} \sum_{r=1}^{R} \left( \frac{\partial t_q}{\partial p_{r,q}} - \frac{\partial a_q}{\partial p_{r,q}} \right)^2 \right],    (1)

where L = Q(R + 1), Q is the number of samples, R is the number of input parameters, t_q is the qth scaled target (scaled into a pre-defined range such as [-1, 1] to enhance fitting efficiency), ∂t_q/∂p_{r,q} is the rth partial derivative of the qth target (each quantity is converted in correspondence with the scaled inputs and targets), a_q is the qth output energy predicted by the NN, and ∂a_q/∂p_{r,q} is the rth partial derivative of the qth sample predicted by the NN. An empirical parameter ρ is introduced into the performance index to assign a "penalty" to the significance of the gradients relative to the targeted energies. A small ρ lowers the predicting accuracy of the gradients and thereby puts a tighter constraint on the fitting accuracy of the energies; conversely, a large ρ reduces the fitting accuracy of the main target (the energy). Obviously, if ρ vanishes, the implemented algorithm reduces to a traditional energy-only fitting procedure. Although this parameter can be tuned to adjust the fitting performance, Pukrittayakamee et al. [7] suggested determining ρ first as

\rho = \frac{\max_q \left| t_q \right|}{\max_{r,q} \left| \partial t_q / \partial p_{r,q} \right|}.    (2)

The output a_q of the NN is given by

a_q = \sum_{n=1}^{N} w^2_n a^1_{n,q} + b^2,    (3)

where N, w^2, and b^2 are the number of hidden neurons, the weights, and the bias of the second NN layer, respectively, and

a^1_{n,q} = f\!\left( \sum_{r=1}^{R} w^1_{n,r} p_{r,q} + b^1_n \right),    (4)

where f(x) is the transfer function (here the hyperbolic tangent), and w^1_{n,r} and b^1_n are the weights and biases of the first NN layer, respectively. The scaled variables p_{r,q} are employed as input signals instead of the original inputs p^0_{r,q}. The scaling scheme is as follows:

p_{r,q} = \frac{p^0_{r,q} - \min\{p^0_{r,q}\}}{\max\{p^0_{r,q}\} - \min\{p^0_{r,q}\}}, \qquad q = 1, 2, \ldots, Q,    (5)

where max{p^0_{r,q}} and min{p^0_{r,q}} are the maximum and minimum values of the original input over the Q samples, respectively.
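The following sketch shows how Eqs. (1)-(5) translate into code. It is written in Python/NumPy; the function names, array shapes, and the explicit tanh transfer function are our own illustrative choices and are not taken from the article's supplementary code.

```python
import numpy as np

def forward(W1, b1, w2, b2, P_in):
    """Two-layer feed-forward network of Eqs. (3)-(4) with a tanh hidden layer.
    Returns the outputs a_q and the analytic gradients da_q/dp_{r,q}.
    Assumed shapes: W1 (N, R), b1 (N,), w2 (N,), b2 scalar, P_in (Q, R)."""
    a1 = np.tanh(P_in @ W1.T + b1)     # (Q, N) hidden activations, Eq. (4)
    a = a1 @ w2 + b2                   # (Q,)   predicted energies, Eq. (3)
    # da_q/dp_{r,q} = sum_n w2_n (1 - a1_{n,q}^2) W1_{n,r}
    da = ((1.0 - a1**2) * w2) @ W1     # (Q, R) predicted gradients
    return a, da

def scale_inputs(P0):
    """Min-max scaling of the raw inputs, Eq. (5); P0 has shape (Q, R)."""
    lo, hi = P0.min(axis=0), P0.max(axis=0)
    return (P0 - lo) / (hi - lo)

def penalty_rho(t, dt):
    """Empirical penalty factor of Eq. (2) from scaled targets and gradients."""
    return np.max(np.abs(t)) / np.max(np.abs(dt))

def performance_index(t, dt, a, da, rho):
    """Mean-square performance index of Eq. (1), with L = Q(R + 1)."""
    Q, R = dt.shape
    return (np.sum((t - a)**2) + (rho / R) * np.sum((dt - da)**2)) / (Q * (R + 1))
```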
In the training process, the weights and biases are continuously updated to reduce the performance index. Let x denote a column vector collecting all NN parameters, i.e. the weights and biases of both layers:

x^T = \left[ w^1_{1,1}\; w^1_{1,2}\; \cdots\; w^1_{1,R}\; w^1_{2,1}\; w^1_{2,2}\; \cdots\; w^1_{N,R}\; b^1_1\; b^1_2\; \cdots\; b^1_N\; w^2_1\; w^2_2\; \cdots\; w^2_N\; b^2 \right].    (6)

The number of parameters in x is M = N(R + 1) + (N + 1). In the present work, we use the Levenberg–Marquardt algorithm to update x:

x_{\mathrm{new}} = x - \left( J^T J + \mu I \right)^{-1} J^T e,    (7)

where e is the error vector, a column vector of L elements,

e^T = \left[ e^*_1\; e_{1,1}\; e_{2,1}\; \cdots\; e_{R,1}\; e^*_2\; e_{1,2}\; \cdots\; e_{R,2}\; \cdots\; e^*_Q\; e_{1,Q}\; \cdots\; e_{R,Q} \right],    (8)

with

e^*_q = t_q - a_q,    (9)

e_{r,q} = \sqrt{\frac{\rho}{R}} \left( \frac{\partial t_q}{\partial p_{r,q}} - \frac{\partial a_q}{\partial p_{r,q}} \right).    (10)

The performance index P of Eq. (1) can then be rewritten as a function of the error vector e:

P = \frac{1}{L} \sum_{q=1}^{Q} \left[ \left( e^*_q \right)^2 + \sum_{r=1}^{R} \left( e_{r,q} \right)^2 \right].    (11)

The Jacobian J in Eq. (7) is the L × M matrix of derivatives of the error components with respect to the NN parameters,

J = \begin{bmatrix}
\dfrac{\partial e^*_1}{\partial w^1_{1,1}} & \dfrac{\partial e^*_1}{\partial w^1_{1,2}} & \cdots & \dfrac{\partial e^*_1}{\partial w^2_N} & \dfrac{\partial e^*_1}{\partial b^2} \\
\dfrac{\partial e_{1,1}}{\partial w^1_{1,1}} & \dfrac{\partial e_{1,1}}{\partial w^1_{1,2}} & \cdots & \dfrac{\partial e_{1,1}}{\partial w^2_N} & \dfrac{\partial e_{1,1}}{\partial b^2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\dfrac{\partial e_{R,Q}}{\partial w^1_{1,1}} & \dfrac{\partial e_{R,Q}}{\partial w^1_{1,2}} & \cdots & \dfrac{\partial e_{R,Q}}{\partial w^2_N} & \dfrac{\partial e_{R,Q}}{\partial b^2}
\end{bmatrix},    (12)

where

\frac{\partial e^*_q}{\partial b^2} = -1,    (13)

\frac{\partial e^*_q}{\partial w^2_n} = -a^1_{n,q},    (14)

\frac{\partial e^*_q}{\partial b^1_n} = -w^2_n \left[ 1 - \left( a^1_{n,q} \right)^2 \right],    (15)

\frac{\partial e^*_q}{\partial w^1_{n,r}} = \frac{\partial e^*_q}{\partial b^1_n}\, p_{r,q},    (16)

\frac{\partial e_{r,q}}{\partial b^2} = 0,    (17)

\frac{\partial e_{r,q}}{\partial w^2_n} = -\sqrt{\frac{\rho}{R}} \left[ 1 - \left( a^1_{n,q} \right)^2 \right] w^1_{n,r},    (18)

\frac{\partial e_{r,q}}{\partial b^1_n} = -2\, w^2_n\, a^1_{n,q}\, \frac{\partial e_{r,q}}{\partial w^2_n},    (19)

\frac{\partial e_{r,q}}{\partial w^1_{n,r'}} = \sqrt{\frac{\rho}{R}}\, \frac{\partial e^*_q}{\partial b^1_n} \left( \delta_{r r'} - 2\, w^1_{n,r}\, a^1_{n,q}\, p_{r',q} \right).    (20)
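A minimal sketch of how Eqs. (7)-(20) fit together is given below, again in Python/NumPy. The variable names, array shapes, and the row-wise flattening of the first-layer weights are our own conventions (chosen to match the ordering of Eq. (6)); this is an illustration under those assumptions, not the article's actual implementation.

```python
import numpy as np

def errors_and_jacobian(W1, b1, w2, b2, P_in, t, dt, rho):
    """Error vector e (Eqs. 8-10) and Jacobian J (Eqs. 12-20) for one epoch.
    Assumed shapes: W1 (N, R), b1 (N,), w2 (N,), b2 scalar, P_in (Q, R) scaled
    inputs, t (Q,) scaled targets, dt (Q, R) scaled target gradients.
    Parameters are flattened in the order of Eq. (6): W1 row-wise, b1, w2, b2."""
    Q, R = P_in.shape
    N = W1.shape[0]
    M = N * (R + 1) + N + 1
    L = Q * (R + 1)
    s = np.sqrt(rho / R)

    a1 = np.tanh(P_in @ W1.T + b1)      # (Q, N) hidden activations, Eq. (4)
    a = a1 @ w2 + b2                    # (Q,)   predicted energies, Eq. (3)
    da = ((1.0 - a1**2) * w2) @ W1      # (Q, R) predicted gradients

    e = np.zeros(L)
    J = np.zeros((L, M))
    for q in range(Q):
        row = q * (R + 1)
        g = -w2 * (1.0 - a1[q]**2)      # (N,) = d e*_q / d b1_n, Eq. (15)
        # energy error and its derivatives, Eqs. (9) and (13)-(16)
        e[row] = t[q] - a[q]
        J[row, :N * R] = np.outer(g, P_in[q]).ravel()   # w.r.t. W1, Eq. (16)
        J[row, N * R:N * R + N] = g                      # w.r.t. b1, Eq. (15)
        J[row, N * R + N:M - 1] = -a1[q]                 # w.r.t. w2, Eq. (14)
        J[row, M - 1] = -1.0                             # w.r.t. b2, Eq. (13)
        # gradient errors and their derivatives, Eqs. (10) and (17)-(20)
        for r in range(R):
            i = row + 1 + r
            e[i] = s * (dt[q, r] - da[q, r])
            dw2 = -s * (1.0 - a1[q]**2) * W1[:, r]       # (N,), Eq. (18)
            db1 = -2.0 * w2 * a1[q] * dw2                # (N,), Eq. (19)
            delta = np.zeros(R)
            delta[r] = 1.0
            dW1 = s * g[:, None] * (delta - 2.0 * (a1[q] * W1[:, r])[:, None] * P_in[q])  # Eq. (20)
            J[i, :N * R] = dW1.ravel()
            J[i, N * R:N * R + N] = db1
            J[i, N * R + N:M - 1] = dw2
            # d e_{r,q} / d b2 = 0 (Eq. 17): that column stays zero
    return e, J

def levenberg_marquardt_step(x, e, J, mu):
    """One damped update of Eq. (7); x is the flattened parameter vector."""
    return x - np.linalg.solve(J.T @ J + mu * np.eye(x.size), J.T @ e)
```

In a full training loop one would typically accept a step only if it lowers P of Eq. (11) and adjust μ between iterations, following standard Levenberg–Marquardt practice (the article does not spell out the μ schedule); the stopping criteria actually used in this work are given in the next section.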
Results and Discussion

As mentioned earlier, the newly implemented FD-LM algorithm for simultaneous energy-gradient fitting is applied to construct the PESs of three molecular systems of different complexity for testing purposes: H2O, O3, and ClOOCl (see Table 1 for more details). The PES of O3 was previously reported by Le et al. [11] using the traditional NN fitting method with energy approximation only, while Nguyen-Truong and Le [13] reported NN PESs for H2O and ClOOCl trained by the back-propagation algorithm with simultaneous energy-gradient approximation. By directly comparing the present FD-LM results with those reported in the literature [11–13], we can discuss the major advantages of the newly developed FD-LM algorithm in PES fitting.

Table 1. Smallest and largest values of the input parameters (Å for bond lengths, degrees for angles) and of the potential energy E (Hartree).

            parameter              smallest    largest
  H2O       r1(O–H1)               0.781       1.161
            r2(O–H2)               0.838       1.293
            θ(H1–O–H2)             64.18       164.57
            E                      -76.235     -76.180
  O3        r1(O1–O2)              1.107       2.157
            r2(O2–O3)              1.057       1.557
            θ(O1–O2–O3)            73.72       157.72
            E                      -225.005    -224.913
  ClOOCl    r1(O1–Cl1)             1.481       2.448
            r2(O2–O1)              1.048       2.823
            r3(O2–Cl2)             1.486       2.444
            θ1(O2–O1–Cl1)          73.494      179.663
            θ2(Cl2–O2–O1)          79.872      178.927
            φ(Cl2–O2–O1–Cl1)       0.027       184.672
            E                      -1069.194   -1069.150

In each case, a training set is employed to train the NN PES, while an independent testing set is used to evaluate the statistical error. A validation set to prevent "over-fitting" (often referred to as the "early stopping" technique) is in fact unnecessary in the FD-LM scheme because of the gradient interpolation incorporated during the fitting process. The training process is terminated when either μ in Eq. (7) reaches 10⁻⁶ or the training error increases continuously for 20 epochs.

In the first case, we re-construct the vibrational ground-state PES of molecular H2O based on existing ab initio data [13]. With only three internal variables, the potential function is rather simple compared to the two later cases. In the training process, we utilize a set of 191 configurations spanning a potential-energy range of 1.5 eV. The hidden NN layer contains 10 neurons, which brings the total number of NN parameters to 51 (M = 10 × 4 + 11). Subsequently, a statistical test is performed with an independent set of 5,612 configurations, and the testing error reveals excellent fitting accuracy. Details of the best training and testing results are given in Table 2, where the previously reported results [13] are also included for comparison. The current FD-LM algorithm is clearly more powerful and robust than NN back-propagation. In particular, the root-mean-squared error (RMSE) for the testing set of 5,612 configurations is as small as 6.55 × 10⁻⁴ eV (1.51 × 10⁻² kcal/mol), while the corresponding absolute-average error (AAE) is 4.59 × 10⁻⁴ eV (1.06 × 10⁻² kcal/mol). These errors are very small compared to the energy range (1.5 eV).

Besides the excellent accuracy in energy prediction, we also successfully reproduce the forces with respect to the three internal variables describing the H2O molecule. In Fig. 1, we show the true and interpolated energies and forces of 50 testing configurations; the NN predictions are in excellent agreement with the MP2/6-311G* data. Moreover, the optimization process is much faster: only 188 epochs (training iterations) are required to reach numerical convergence, whereas in the previous study [13] more than 60,000 epochs were used by the NN back-propagation method.
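The RMSE and AAE quoted throughout this section and in Table 2 are the standard error statistics over a set of configurations; a minimal sketch follows (the function name and the eV units are our own illustrative choices).

```python
import numpy as np

def rmse_aae(e_pred, e_ref):
    """Root-mean-squared and absolute-average error between predicted and
    reference energies (same units, e.g. eV; 1 eV = 23.06 kcal/mol)."""
    d = np.asarray(e_pred) - np.asarray(e_ref)
    return np.sqrt(np.mean(d**2)), np.mean(np.abs(d))
```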
The next fitting objective, the reactive PES of O3, is further complicated in terms of electronic structure. Even though it is a three-atom problem like H2O, fourth-order Møller–Plesset perturbation (MP4) calculations [26, 27] suggest that the reactive PES of O3 spans higher energies and involves three different switchable spin states: the singlet ground state and two excited states (triplet and quintet). The potential energy function of a molecule with such a complicated electronic structure is therefore expected to be quite complex. In the training process, we employ the pre-constructed dataset (obtained from grid scans of the internal coordinates) with an energy range of 2.5 eV [11]. The number of hidden neurons used to fit the PES of O3 is 150. At termination, a total of 1,187 epochs have been used to train the NN parameters, and the analysis of numerical accuracy for O3 is provided in Table 2; the fitting error previously reported by Le et al. [11] is also included for comparison. It should be noted that we use only 2,815 configurations to construct the O3 training set, whereas Le et al. [11] used up to 5,906 configurations. The testing result obtained from FD-LM fitting is nevertheless better than that reported in the previous study (0.035 eV vs. 0.045 eV in RMSE, respectively). The improvement is clearly due to the use of simultaneous energy-force fitting to approximate the PES. From these two testing cases, we see that the current FD-LM method not only reduces the fitting error but also downsizes the training set.

Table 2. Fitting errors for H2O, O3, and ClOOCl evaluated in this work and in previous studies [11–13]. The number of configurations in each training/testing set is given in parentheses.

                                              eV              kcal/mol
  H2O     RMSE   training set (191)           3.58 × 10⁻⁴     8.26 × 10⁻³
                 testing set (5,612)          6.55 × 10⁻⁴     1.51 × 10⁻²
                 Ref. [13]                    1.03 × 10⁻²     2.38 × 10⁻¹
          AAE    training set (191)           2.88 × 10⁻⁴     6.63 × 10⁻³
                 testing set (5,612)          4.59 × 10⁻⁴     1.06 × 10⁻²
                 Ref. [13]                    7.80 × 10⁻³     1.80 × 10⁻¹
  O3      RMSE   training set (2,815)         3.54 × 10⁻²     8.16 × 10⁻¹
                 testing set (50)             3.50 × 10⁻²     8.07 × 10⁻¹
                 Ref. [11]                    0.45 × 10⁻¹     1.03
          AAE    training set (2,815)         2.17 × 10⁻²     5.00 × 10⁻¹
                 testing set (50)             2.22 × 10⁻²     5.12 × 10⁻¹
                 Ref. [11]                    7.56 × 10⁻²     1.74
  ClOOCl  RMSE   training set (1,693)         2.37 × 10⁻⁴     5.48 × 10⁻³
                 testing set (17,457)         2.97 × 10⁻²     6.85 × 10⁻¹
                 Ref. [12]                    1.37 × 10⁻²     3.16 × 10⁻¹
                 Ref. [13]                    4.09 × 10⁻²     9.43 × 10⁻¹
          AAE    training set (1,693)         1.73 × 10⁻⁴     3.99 × 10⁻³
                 testing set (17,457)         5.93 × 10⁻³     1.37 × 10⁻¹
                 Ref. [12]                    0.78 × 10⁻²     1.80 × 10⁻¹
                 Ref. [13]                    2.69 × 10⁻²     6.20 × 10⁻¹

[Figure 1] Energies and forces of 50 testing H2O configurations: circles – NN predictions, squares – ab initio calculations.

In Fig. 2, we show the approximated and true energies and forces of 50 randomized configurations. In a small number of cases, the predicted gradients with respect to the O–O bond or the O–O–O bending angle are not quite accurate; however, the majority of the testing cases still show very good gradient predictions.

In addition, we also attempt to train the NN with a smaller dataset of 1,409 configurations. The fitting accuracy for the training set is somewhat improved (RMSE = 0.017 eV), while the testing accuracy over all available data points is not as good (RMSE = 0.076 eV).

In the last case, the PES of ClOOCl is constructed. The four atoms of this system constitute a set of six internal input variables: three bond distances, two bending angles, and one dihedral angle. With more internal variables, we employ an NN with 150 hidden neurons to fit this PES. For comparison purposes, we use a previously constructed training set of 1,693 configurations, as employed in a previous study (with an energy range of 1.2 eV) [13]. The RMSE and AAE of the training set are 2.37 × 10⁻⁴ eV (5.48 × 10⁻³ kcal/mol) and 1.73 × 10⁻⁴ eV (3.99 × 10⁻³ kcal/mol), respectively. Compared to the RMSE reported in the previous study, we believe the current LM implementation is highly advantageous. A large testing set of 17,457 configurations (also obtained from a previous study [12]) is employed to validate the fitting accuracy.
The testing RMSE is estimated as 0.030 eV, which is comparable to the error produced by back-propagation fitting (0.041 eV). For the training set, in particular, the FD-LM algorithm significantly improves the fitting error (almost 60 times smaller than the RMSE reported in the literature [12]). In total, 1,241 epochs are used to approximate the PES with excellent statistical accuracy, whereas more than 60,000 epochs were needed with the back-propagation algorithm. The testing RMSE is, however, unusually larger than the training RMSE, as can be seen in Table 2. Unlike the RMSE, the training and testing AAE in this ClOOCl case differ much less in magnitude. From the reported AAE and RMSE, we conclude that a minority of inaccurately fitted data points in the testing set causes the unusually high RMSE. Still, the reported RMSE and AAE of a very large testing set (17,457 configurations) give sufficient proof of a qualitative fit.

[Figure 2] Energies and forces of 50 testing O3 configurations: circles – NN predictions, squares – ab initio calculations.

The gradients with respect to the six internal variables are also nicely interpolated, as shown in Fig. 3. In these energy and gradient plots, we show the true and predicted values only for a sample of 50 random configurations instead of the full testing set reported in Table 2. It is also necessary to test the adequacy of the fit for data in the zero-point vibrational states. In this test, 886 ClOOCl configurations obtained from ground-state vibrations are employed, and an excellent RMSE of 1.86 × 10⁻³ eV is obtained for this set.

[Figure 3] Energies and forces of 50 testing ClOOCl configurations: circles – NN predictions, squares – ab initio calculations.

Conclusions

In this work, we develop a new fitting algorithm to approximate accurate PESs from ab initio data. This Levenberg–Marquardt implementation with simultaneous energy-gradient fitting is intended to improve PES fitting quality using two-layer feed-forward neural networks. Two improvements should be clearly emphasized. First, the training efficiency is significantly improved (far fewer training iterations are required) compared with the back-propagation method. Second, fewer data points are required to construct the PESs, because gradient fitting interpolates the shape of the PES function in the multi-dimensional hyperspace more effectively. This improvement helps to reduce the expensive computational effort of sampling ab initio data. Overall, these two advantages offer a meaningful improvement in computational efficiency.

In the demonstrated examples, the training and testing performance is much improved compared with the reported studies [11, 13]. In the case of the vibrational PES of H2O, the testing RMSE is 6.55 × 10⁻⁴ eV, which is 15 times smaller in magnitude than that reported in a previous study [13].
In the ozone case, we observe only a slight improvement, because of the complicated electronic structure resulting in spin-state crossings in the PES. In the last case, we achieve excellent fitting results for the training set, which reveals the robustness of the FD-LM algorithm. The RMSE and AAE of the ClOOCl testing set are higher than those of the training set owing to inaccurate predictions for a minority of data points; still, the testing error magnitude is comparable to that reported earlier [13]. Moreover, the gradient fitting is impressive in all three test cases. In conclusion, the new FD-LM algorithm works efficiently and improves fitting performance. In future work, we look forward to applying this robust FD-LM fitting algorithm to molecular systems of higher complexity for dynamics studies; the incorporation of symmetry permutations of the input parameters should also be considered. Our implemented code is available for download in the supplementary material associated with this article.

Acknowledgments

We thank the Faculty of Materials Science, University of Science, Vietnam National University in Ho Chi Minh City for computing support. This work is financially supported by the National Foundation for Science and Technology Development (NAFOSTED) under grant 103.01-2013.28.

References

[1] F. V. Prudente, P. H. Acioli, J. J. S. Neto, J. Chem. Phys. 109 (1998) 8801.
[2] S. Lorenz, A. Groß, M. Scheffler, Chem. Phys. Lett. 395 (2004) 210.
[3] J. Behler, S. Lorenz, K. Reuter, J. Chem. Phys. 127 (2007) 014705.
[4] J. Behler, M. Parrinello, Phys. Rev. Lett. 98 (2007) 146401.
[5] H. M. Le, L. M. Raff, J. Chem. Phys. 128 (2008) 194310.
[6] H. M. Le, S. Huynh, L. M. Raff, J. Chem. Phys. 131 (2009) 014107.
[7] A. Pukrittayakamee, M. Malshe, M. Hagan, L. M. Raff, R. Narulkar, S. Bukkapatnum, R. Komanduri, J. Chem. Phys. 130 (2009) 134101.
[8] H. M. Le, L. M. Raff, J. Phys. Chem. A 114 (2010) 45.
[9] J. Behler, J. Chem. Phys. 134 (2011) 074106.
[10] J. Behler, Phys. Chem. Chem. Phys. 13 (2011) 17930.
[11] H. M. Le, T. S. Dinh, H. V. Le, J. Phys. Chem. A 115 (2011) 10862.
[12] A. T. H. Le, N. H. Vu, T. S. Dinh, T. M. Cao, H. M. Le, Theor. Chem. Acc. 131 (2012) 1158.
[13] H. T. Nguyen-Truong, H. M. Le, J. Phys. Chem. A 116 (2012) 4629.
[14] N. Artrith, B. Hiller, J. Behler, Phys. Status Solidi 250 (2013) 1191.
[15] B. Jiang, H. Guo, J. Chem. Phys. 139 (2013) 054112.
[16] J. Li, B. Jiang, H. Guo, J. Chem. Phys. 138 (2013) 074309.
[17] T. Morawietz, J. Behler, J. Phys. Chem. A 117 (2013) 7356.
[18] T. Morawietz, J. Behler, Z. Phys. Chem. 227 (2013) 1559.
[19] H. T. Nguyen-Truong, C. M. Thi, H. M. Le, Chem. Phys. 426 (2013) 31.
[20] A. M. Sarotti, Org. Biomol. Chem. 11 (2013) 4847.
[21] J. Behler, J. Phys. Condens. Matter 26 (2014) 183001.
[22] B. Jiang, H. Guo, J. Chem. Phys. 141 (2014) 034109.
[23] S. Manzhos, R. Dawes, T. Carrington, Int. J. Quantum Chem. (2014) n/a.
[24] G. D. Magoulas, M. N. Vrahatis, G. S. Androulakis, Neural Comput. 11 (1999) 1769.
[25] S. Nandy, A. Das, P. Pratim Sarkar, Int. J. Comput. Appl. 39 (2012).
[26] C. Møller, M. S. Plesset, Phys. Rev. 46 (1934) 618.
[27] R. Krishnan, J. A. Pople, Int. J. Quantum Chem. 14 (1978) 91.