Applying Bayesian neural network to evaluate the influence of specialized mini projects on final performance of engineering students: A case study


DOI: 10.31276/VJSTE.64(4).10-15

Minh Truong Nguyen1, Viet-Hung Dang2, Truong-Thang Nguyen2*
1University of Sciences, Vietnam National University, Hanoi
2Faculty of Building and Industrial Construction, Hanoi University of Civil Engineering
*Corresponding author: Email: thangnt2@huce.edu.vn

Received 10 June 2022; accepted September 2022

Abstract:

In this article, deep learning probabilistic models are applied to a case study on evaluating the influence of specialized mini projects (SMPs) on the performance of engineering students on their final year project (FYP) and cumulative grade point average (CGPA). This approach also creates a basis to predict the final performance of undergraduate students based on their SMP scores, which is a vital characteristic of engineering training. The study is conducted in two steps: (i) establishing a database by collecting 2890 SMP and FYP scores and the associated CGPA of a group of engineering students that graduated in 2022 in Hanoi; and (ii) engineering two deep learning probabilistic models based on Bayesian neural networks (BNNs) with the corresponding architectures of 8/16/16/1 and 9/16/16/1 for FYP and CGPA, respectively. The significance of this study is that the proposed probabilistic models are capable of (i) providing reasonable analysis results, such as the feature importance score of individual SMPs as well as estimated FYP and CGPA results; and (ii) predicting relatively close estimations with mean relative errors from 6.8% to 12.1%. Based on the obtained results, academic activities to support student progress can be proposed for engineering universities.

Keywords: data, engineering, machine learning, neural network, project

Classification numbers: 1.3, 2.3

Introduction

Nowadays, universities are capable of collecting data on their students in electronic format. As a result, there is an urgent need to effectively transform large volumes of data into knowledge to improve the quality of managerial decisions and to predict the academic performance of students at an early stage. As part of the artificial intelligence (AI) techniques recently adopted in a wide variety of human life applications [1, 2], various machine learning (ML) approaches have been increasingly applied to analyse educational data, such as student scores, to concentrate academic assistance on students as well as to improve university training programs. ML is an especially appealing alternative in the field of engineer training and education, as it is difficult or unfeasible to develop conventional algorithms to perform the required tasks [3, 4].

S.S. Abu-Naser, et al. (2015) [4] developed an artificial neural network (ANN) model for predicting student performance at the Faculty of Engineering and Information Technology, Al-Azhar University, based on the registration records of 1407 students, using a feed-forward back-propagation algorithm for training. The model was tested with an overall result of 84.6%. E.Y. Obsie, et al. (2018) [5] developed a neural network model for predicting the student cumulative grade point average at the 8th semester (CGPA8) and designed an application based on the predictive models. The real dataset employed in the study was gathered from 134 students at the Hawassa University School of Computer Science who graduated in 2015, 2016, and 2017. It was shown that student progress performance, as measured by CGPA8, can be predicted using scores from their first-, second-, and third-year courses.
Z. Iqbal, et al. (2017) [6] utilized collaborative filtering (CF), matrix factorization (MF), and restricted Boltzmann machine (RBM) techniques to systematically analyse real-world data collected from 225 undergraduate students enrolled in the Electrical Engineering program at the Information Technology University (ITU), from which the academic performance of the ITU students was evaluated. It was shown that the RBM technique was better than the other techniques at predicting student performance in the particular course. S.D.A. Bujang, et al. (2021) [7] introduced a comprehensive analysis of machine learning techniques to predict final student grades in first-semester courses by improving predictive accuracy. The performance of six well-known machine learning techniques, namely decision tree (J48), support vector machine (SVM), naive Bayes (NB), K-nearest neighbour (K-NN), logistic regression (LR), and random forest (RF), was evaluated on 1282 student course grades, followed by a multiclass prediction model that reduces the over-fitting and misclassification caused by imbalanced multi-classification using the Synthetic Minority Oversampling Technique (SMOTE) with two feature selection methods. It was shown that the proposed model integrated with RF achieved a significant improvement, with the highest F-measure of 99.5% [7].

It is worth mentioning that most of the aforementioned ML approaches were conducted in a deterministic manner. Hence, there is a need to develop a probabilistic model that is capable of providing well-predicted results as well as estimating the confidence of those results through associated intervals. Such a result is more relevant for experimental data on student exam scores than a single point estimation because, even for the same student, scattered results can be obtained from different series of experiments.

In training programs at engineering universities, specialized mini projects (SMPs) play an important role, as they progressively provide knowledge and build up the conceiving, designing, implementing, and operating skills necessary for the FYP, which is an integrated topic aimed at solving a practical problem of the particular field of engineer training. This article applies a machine learning approach to predict FYP and final CGPA results from those SMPs, based on which the influence of the SMPs on the FYP and CGPA can be evaluated in a data-driven manner. A case study is conducted by collecting 2890 datapoints in the form of score results from eight SMPs, one FYP, and the CGPA of a group of 289 engineering students that graduated in 2022 in Hanoi. Then, two deep learning probabilistic models based on BNNs are established for FYP and CGPA predictions. The obtained results show that the proposed approach is a practical tool providing quick and reasonable analysis results, such as the feature importance score of an individual SMP and the estimated FYP and CGPA results. Furthermore, a relatively close estimation can be captured from the BNN model for CGPA, providing useful information for academic management.
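As a concrete illustration of the case study set-up described above, the short sketch below lays out the assumed shape of the score table: 289 students by ten scores (eight SMPs, the FYP, and the CGPA), i.e., 2890 datapoints. The column labels and random values are illustrative only, and treating the FYP score as the ninth input of the 9/16/16/1 CGPA model is an interpretation made here; the paper does not publish the dataset.

```python
# Hypothetical layout of the case-study data: 289 students x 10 scores on the
# 4-point scale. Random numbers stand in for the real HUCE records.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
cols = [f"SMP.{i}" for i in range(1, 9)] + ["FYP", "CGPA"]
scores = pd.DataFrame(rng.uniform(0.0, 4.0, size=(289, 10)), columns=cols)
assert scores.size == 2890   # 289 students x 10 scores = 2890 datapoints

# FYP model (8/16/16/1): the eight SMP scores are the inputs.
X_fyp, y_fyp = scores[cols[:8]].to_numpy(), scores["FYP"].to_numpy()

# CGPA model (9/16/16/1): assumed here to take the eight SMPs plus the FYP
# as its nine inputs (an interpretation of the 9-input architecture).
X_cgpa, y_cgpa = scores[cols[:9]].to_numpy(), scores["CGPA"].to_numpy()
print(X_fyp.shape, X_cgpa.shape)   # (289, 8) (289, 9)
```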
Stochastic model using BNNs

As pointed out by various authors, a major obstacle to data-driven methods is the scarcity of relevant data, and this problem becomes accentuated when studying the obtained scores of individual students [8-12]. Even with data in hand, there exist unavoidable deviations between them [13, 14]. Thus, this study proposes to engineer a probabilistic machine learning model on the basis of BNNs rather than the deterministic models used in the reviewed works. The advantage of such a probabilistic model is that it is capable of predicting quantities of interest, such as the FYP score or CGPA, as well as estimating the amount of uncertainty associated with the prediction values. It is evident that the more data available, the more accurate the model, and vice versa. In summary, the key contribution of this article is to propose a probabilistic ML model to predict the FYP score and CGPA from the given scores of SMPs so that the effect of an individual SMP, as well as of the FYP, on the final performance of students can be evaluated.

We begin by briefly reviewing the ANN to set up mathematical symbols and terminology. Given a dataset $D=[X,Y]$, the i-th data sample is denoted by $X_i=[x_i^1,\dots,x_i^n]$, with $n$ being the number of features. Herein, each feature is an input related to the SMP and FYP results. It is desirable to develop a non-linear mapping from $X$ to $Y$, i.e., $Y=f(X)$. A standard ANN architecture consists of an input layer, an output layer, and one or many hidden layers, with the total number of layers being $L$. Each layer consists of various neurons that are fully connected with all neurons in the neighbouring layers. Mathematically, a neuron $j$ at layer $l$ can be described by a linear transformation plus a non-linear activation function, as follows:

$$x_j^l = h\left(\sum_{k=1}^{N_{l-1}} w_{j,k}^l \, x_k^{l-1} + b_j^{l-1}\right) \quad (1)$$

where $x_j^l$ is the output of neuron $j$ at layer $l$; $x_k^{l-1}$ is the output of neuron $k$ at the previous layer $l-1$; $w_{j,k}^l$ is the weight assigned to the connection between the former and the latter; $N_{l-1}$ is the total number of neurons of layer $l-1$; $b_j^{l-1}$ is a real value to be determined, also known as the bias; and $h$ is an activation function. Here, the sigmoid activation function is used to squeeze values into the range (0, 1). Moreover, this function is continuous and differentiable everywhere, thus rendering the training of the neural network via a gradient descent-based algorithm faster and more effective. By setting $W^l$ as the matrix of weights corresponding to layer $l$, with $l=1,\dots,L$, the ANN can be described by the following equation derived from the description of neural networks in [15]:

$$\hat{Y}_i = F(X_i|W) = f_L\big(\dots f_2\big(f_1(X_i|W^1)\,\big|\,W^2\big)\dots\big|\,W^L\big) \quad (2)$$

where $\hat{Y}_i$ is a prediction of $Y_i$, and $f_l$ with $l=1,\dots,L$ denotes the transformation operation at layer $l$ of the ANN. The network is iteratively trained to determine the optimal values of $W^l$ that minimize the discrepancy between $\hat{Y}_i$ and $Y_i$.
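To make Eqs. (1) and (2) concrete, the NumPy sketch below runs the forward pass of a fully connected network with sigmoid activations and the 8/16/16/1 layout of the FYP model. The weights are random placeholders rather than trained values, and input/output scaling is not addressed; the sketch only shows how the layer-by-layer composition F(X|W) is evaluated.

```python
# A NumPy sketch of the forward pass in Eqs. (1)-(2): sigmoid neurons composed
# layer by layer with the 8/16/16/1 layout of the FYP model. Weights are random
# placeholders, not trained values.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Evaluate F(X|W) of Eq. (2) by applying Eq. (1) at every layer."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)   # x_j^l = h( sum_k w_{j,k}^l x_k^{l-1} + b_j )
    return a

rng = np.random.default_rng(0)
sizes = [8, 16, 16, 1]                                   # input / hidden / hidden / output
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

smp_scores = rng.uniform(0.0, 4.0, size=8)               # one student's eight SMP scores (placeholder)
print(forward(smp_scores, weights, biases))              # prediction squeezed into (0, 1) by the sigmoid
```

Because the output neuron also uses the sigmoid, the raw prediction lies in (0, 1); mapping it back to the 4-point scale, or normalizing the targets beforehand, is a preprocessing choice not detailed in this section.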
BNN is a probabilistic deep learning model that combines the high prediction performance of an ANN with the ability to estimate uncertainty based on Bayes theory [16]. In the authors' opinion, the model is especially suitable for working with not-so-abundant collected data for two reasons: (i) in practice, similar series of experiments with identical input parameters still provide different results due to unavoidable uncertainty; and (ii) fitting an ANN with many parameters to a limited database may cause the over-fitting problem, i.e., the ANN is likely to yield low-accuracy results on new data despite being well trained. In other words, it is necessary not only to predict the FYP and CGPA results but also to estimate how much confidence we have in the prediction results. For this purpose, rather than assigning deterministic values to the weights $W$ of the neural network, the BNN uses a Gaussian probability distribution for $W$ as below:

$$W = \mu + \sigma \times \epsilon \quad \text{with } \epsilon \sim N(0,1) \quad (3)$$

where $\mu$ and $\sigma$ denote the mean and standard deviation matrices of $W$, and $\epsilon$ is noise drawn from a zero-mean, unit-variance normal distribution. Then, $\mu$ and $\sigma$ are the parameters to be determined through the learning process. Note that the output of the BNN is a probability distribution; thus, a specialized loss function $L$ is required to measure the model's performance. The adopted metric is the Kullback-Leibler divergence (KL), whose formula is:

$$L = KL\big(q(W|\mu,\sigma)\,\big\|\,p(W|D)\big) \quad (4)$$

Next, the optimal values $\mu^*$ and $\sigma^*$ are the solutions of the following minimization problem:

$$\mu^*, \sigma^* = \underset{\mu,\sigma}{\arg\min}\; KL\big(q(W|\mu,\sigma)\,\big\|\,p(W|D)\big) \quad (5)$$

Via Bayes' rule, $p(W|D)$ can be calculated as below:

$$p(W|D) = \frac{p(D|W)\,p(W)}{p(D)} \quad (6)$$

Substituting Eq. (6) into Eq. (4), the loss function $L$ is rewritten as follows:

$$L = \log q(W|\mu,\sigma) + \log p(D) - \log p(W) - \log p(D|W) \quad (7)$$

This loss function can be approximated from observed discrete data as follows:

$$L = \frac{1}{N_s}\sum_{i=1}^{N_s}\big[\log q(W_i|\mu,\sigma) - \log p(W_i) - \log p(D|W_i)\big] \quad (8)$$

where $N_s$ is the total number of samples. Next, the gradients of the loss function with respect to $\mu$ and $\sigma$ are derived by:

$$\Delta\mu = \frac{\partial L(W|\mu,\sigma)}{\partial W} + \frac{\partial L(W|\mu,\sigma)}{\partial \mu}, \qquad \Delta\sigma = \frac{\partial L(W|\mu,\sigma)}{\partial W}\times\epsilon + \frac{\partial L(W|\mu,\sigma)}{\partial \sigma} \quad (9)$$

Finally, $\mu$ and $\sigma$ are updated using a small learning rate $\alpha$:

$$\mu \leftarrow \mu - \alpha\,\Delta\mu, \qquad \sigma \leftarrow \sigma - \alpha\,\Delta\sigma$$

Case study - Database on score results of SMPs, FYP and CGPA

In the training program of civil engineers, a student is required to pass all subjects, including eight specialized mini projects, before being qualified to conduct his/her final year project. The SMPs consist of: Design of Architecture (DoA), Design of Foundation (DoF), Mechanism of Reinforced Concrete Structures (MoRCS), Design of Reinforced Concrete Buildings (DoRCB), Design of Structural Steel Buildings (DoSSB), Construction Technology 1 (CT1), Construction Technology 2 (CT2), and Construction Management (CM), numbered SMP.1 to SMP.8, respectively. Each individual SMP provides students with the corresponding knowledge and professional skills that will be integrated in the FYP to design and build a civil/industrial building in a real situation. It is noteworthy that the SMP and FYP exams are all conducted in the form of an oral defence. As a result, all the aforementioned SMPs significantly influence the FYP, which, together with all theoretical subjects and the SMPs, contributes to the CGPA (Fig. 1).

Fig. 1. Projects in the training program of civil engineers at HUCE.

In this research, a dataset of 2890 scores from eight SMPs, one FYP, and the CGPA is collected from 289 civil engineers who graduated from Hanoi University of Civil Engineering (HUCE) in 2022. Figs. 2-5 display the histograms of all the score results of their SMPs, FYP, and CGPA on the 4-point scale, giving a clear visualization of the range of values as well as their distributions.
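Putting the pieces together, the sketch below shows one possible implementation of the BNN described by Eqs. (3)-(9), applied to data shaped like the score table sketched earlier. The paper does not state its software stack, so PyTorch is assumed here; the Gaussian prior, observation-noise level, number of Monte Carlo samples, optimizer, and epoch count are illustrative choices, and the Adam optimizer stands in for the plain learning-rate update based on Eq. (9). Random placeholder scores are used in place of the real HUCE records.

```python
# A minimal "Bayes by backprop" BNN sketch (PyTorch assumed; not the authors' code).
# It uses the reparameterization of Eq. (3), a Monte Carlo estimate of the loss in
# Eq. (8), and gradient updates in the spirit of Eq. (9). Hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Normal

class BayesianLinear(nn.Module):
    """Fully connected layer whose weights follow Eq. (3): W = mu + sigma * eps."""
    def __init__(self, n_in, n_out, prior_std=1.0):
        super().__init__()
        self.w_mu = nn.Parameter(0.1 * torch.randn(n_out, n_in))
        self.w_rho = nn.Parameter(torch.full((n_out, n_in), -3.0))  # sigma = softplus(rho) > 0
        self.b_mu = nn.Parameter(torch.zeros(n_out))
        self.b_rho = nn.Parameter(torch.full((n_out,), -3.0))
        self.prior = Normal(0.0, prior_std)   # assumed Gaussian prior p(W)
        self.kl = 0.0                         # log q(W|mu,sigma) - log p(W) of the last sample

    def forward(self, x):
        w_sigma, b_sigma = F.softplus(self.w_rho), F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)   # Eq. (3)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        self.kl = (Normal(self.w_mu, w_sigma).log_prob(w) - self.prior.log_prob(w)).sum() \
                + (Normal(self.b_mu, b_sigma).log_prob(b) - self.prior.log_prob(b)).sum()
        return F.linear(x, w, b)

class BNN(nn.Module):
    """8/16/16/1 architecture for FYP prediction (use n_features=9 for the CGPA model)."""
    def __init__(self, n_features=8):
        super().__init__()
        self.l1 = BayesianLinear(n_features, 16)
        self.l2 = BayesianLinear(16, 16)
        self.l3 = BayesianLinear(16, 1)

    def forward(self, x):
        h = torch.sigmoid(self.l1(x))
        h = torch.sigmoid(self.l2(h))
        return self.l3(h)

    def kl_total(self):
        return self.l1.kl + self.l2.kl + self.l3.kl

def loss_fn(model, x, y, noise_std=0.25, n_samples=5):
    """Monte Carlo estimate of Eq. (8): mean of log q(W) - log p(W) - log p(D|W)."""
    total = 0.0
    for _ in range(n_samples):
        pred = model(x)                                       # one weight sample per pass
        log_lik = Normal(pred, noise_std).log_prob(y).sum()   # log p(D|W)
        total = total + model.kl_total() - log_lik
    return total / n_samples

if __name__ == "__main__":
    torch.manual_seed(0)
    x = 4.0 * torch.rand(289, 8)    # placeholder SMP scores on the 4-point scale
    y = 4.0 * torch.rand(289, 1)    # placeholder FYP scores
    model = BNN(n_features=8)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)  # stands in for the alpha-update
    for epoch in range(500):
        opt.zero_grad()
        loss_fn(model, x, y).backward()
        opt.step()
    with torch.no_grad():           # predictive distribution: repeatedly sample W
        preds = torch.stack([model(x) for _ in range(100)])
    print(preds.mean(dim=0)[:3], preds.std(dim=0)[:3])
```

Sampling the trained network many times, as in the last lines, yields both a mean prediction and a standard deviation per student; that spread is the uncertainty estimate which motivates using a BNN instead of a deterministic ANN.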
