Building models for searching and recommending learning resources (doctoral thesis summary, English version)

MINISTRY OF EDUCATION & TRAINING
CAN THO UNIVERSITY

DOCTORAL THESIS SUMMARY
Specialization: Information System
Code: 62 48 01 04

TRAN THANH DIEN

BUILDING MODELS FOR SEARCHING AND RECOMMENDING LEARNING RESOURCES

Can Tho, 2022

The thesis is completed at: CAN THO UNIVERSITY
Academic Instructor: Nguyen Thai Nghe, Assoc. Prof., PhD
The thesis will be defended before the Board of thesis review.
Meeting at: ……………………………
At … hour, … day … month … year
Reviewer 1:
Reviewer 2:
Reviewer 3:
The thesis is available at:
- National Library
- Information and Learning Center, Can Tho University

PUBLISHED ARTICLES

CT1. Tran Thanh Dien, Bui Huu Loc and Nguyen Thai-Nghe, 2019. Article Classification using Natural Language Processing and Machine Learning. The 13th International Conference on Advanced Computing and Applications (ACOMP 2019), pp. 78-84. DOI: 10.1109/ACOMP.2019.00019 (Scopus).
CT2. Tran Thanh Dien, Thanh Hai Nguyen, Nguyen Thai-Nghe, 2020. Deep Learning Approach for Automatic Topic Classification in An Online Submission System. Advances in Science, Technology and Engineering Systems Journal, Vol. 5, No. 4, pp. 700-709. ISSN: 2415-6698. DOI: 10.25046/aj050483 (Scopus).
CT3. Tran Thanh Dien, Huynh Ngoc Han and Nguyen Thai-Nghe, 2019. An Approach for Plagiarism Detection in Learning Resources. The 6th International Conference on Future Data and Security Engineering (FDSE 2019), Lecture Notes in Computer Science, Springer Nature, Vol. 11814, pp. 722-730. E-ISSN: 1611-3349, P-ISSN: 0302-9743. DOI: 10.1007/978-3-030-35653-8_52 (Scopus Q3).
CT4. Tran Thanh Dien, Le Van Trung, Nguyen Thai-Nghe, 2020. An approach for semantic-based searching in learning resources. The 12th IEEE International Conference on Knowledge and Systems Engineering (KSE 2020), pp. 183-188. DOI: 10.1109/KSE50997.2020.9287798 (Scopus).
CT5. Trần Thanh Điện, Nguyễn Ngọc Tuấn, Nguyễn Thanh Hải, Nguyễn Thái Nghe, 2020. Tăng tốc tìm kiếm tài nguyên học tập theo nội dung bằng kỹ thuật xử lý dữ liệu lớn. Kỷ yếu Hội thảo khoa học Quốc gia lần thứ 9: Công nghệ Thông tin và Ứng dụng trong các lĩnh vực (CITA 2020), trang 171-178. ISBN: 978-604-84-5517-0.
CT6. Tran Thanh Dien, Luu Hoai Sang, Thanh Hai Nguyen, Nguyen Thai-Nghe, 2020. Deep Learning with Data Transformation and Factor Analysis for Student Performance Prediction. International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 11, No. 8, pp. 711-721. E-ISSN: 2156-5570, P-ISSN: 2158-107X. DOI: 10.14569/IJACSA.2020.0110886 (Scopus Q3; ESCI).
CT7. Tran Thanh Dien, Luu Hoai Sang, Thanh Hai Nguyen, Nguyen Thai-Nghe, 2020. Course Recommendation with Deep Learning Approach. The 7th International Conference on Future Data and Security Engineering (FDSE 2020), Communications in Computer and Information Science, Springer Nature, Vol. 1306, pp. 63-77. E-ISSN: 1865-0937, P-ISSN: 1865-0929. DOI: 10.1007/978-981-33-4370-2_5 (Scopus Q4).
CT8. Tran Thanh Dien, Le Duy-Anh, Nguyen Hong-Phat, Nguyen Van-Tuan, Trinh Thanh-Chanh, Le Minh-Bang, Nguyen Thanh-Hai, and Nguyen Thai-Nghe, 2021. Four Grade Levels-based Models with Random Forest for Student Performance Prediction at a Multidisciplinary University. The 15th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2021), Lecture Notes in Networks and Systems, Springer Nature, Vol. 278, pp. 1-12. E-ISSN: 2367-3389, P-ISSN: 2367-3370. DOI: 10.1007/978-3-030-79725-6_1 (Scopus Q4).
CT9. Tran Thanh Dien, Pham Huu Phuoc, Nguyen Thanh-Hai, Nguyen Thai-Nghe, 2021. Personalized student performance prediction using multivariate long short-term memory. The 8th International Conference on Future Data and Security Engineering (FDSE 2021), Communications in Computer and Information Science, Springer Nature, Vol. 1500, pp. 238-247. E-ISSN: 1865-0937, P-ISSN: 1865-0929. DOI: 10.1007/978-981-16-8062-5_16 (Scopus Q4).
CT10. Tran Thanh Dien, Nguyen Thanh-Hai and Nguyen Thai-Nghe, 2021. Deep Matrix Factorization for Learning Resources Recommendation. The 13th International Conference on Computational Collective Intelligence (ICCCI 2021), Lecture Notes in Computer Science, Springer Nature, Vol. 12876, pp. 167-179. E-ISSN: 1611-3349, P-ISSN: 0302-9743. DOI: 10.1007/978-3-030-88081-1_13 (Scopus Q3).

CHAPTER 1: INTRODUCTION

1.1 The urgency of the thesis
Open learning has become an innovation movement in education and is constantly developing, and open educational resources are integral to it. Learning resources are educational resources developed and provided for the teaching and learning process to meet learning goals. They can be provided through systems such as e-learning systems, curriculum and lecture management systems, education management systems, publishing management systems, etc.

With the rapid development of information technology, the demand for online learning is rising. In addition, commuting has been limited by the pandemic and other issues, which further increases the demand for online learning and for materials used in online teaching and learning. As the demand for online learning increases, so does the demand for searching learning resources. More effective methods are therefore needed for searching learning resources and for recommending learning resources that suit learners' needs.

Although there are related studies on searching and recommending learning resources, new approaches that better meet the needs of learners should be proposed. Learning resources are mainly in formats such as doc and pdf, so the problem of searching unstructured documents must be solved. Learning resources are also increasingly diverse across many fields (or topics), so effective search methods are needed; for instance, classification can determine the field of a query so that the search runs on the corresponding field instead of the whole dataset. Another issue is that semantics needs attention to make the search process more effective. In addition, methods are needed for rating prediction and for recommending learning resources suitable for each learner. Therefore, the study on building models for searching and recommending learning resources has scientific and practical significance for implementing learning resource management systems.

1.2 Aims, objects, scope and research methods
The overall objective of the thesis is to propose models for searching and recommending learning resources that meet the needs of learners, helping them achieve better learning performance. The specific objectives are (1) building models for searching learning resources with attention to semantic issues in order to improve search effectiveness, and (2) building models for predicting student performance and recommending appropriate learning resources for each learner.

The main objects of the thesis are the models for searching and recommending learning resources. Learning resources are diverse, including lectures, course books, books, articles, theses, dissertations, images, videos, and other digital learning resources; the scope of the thesis focuses on text. The research method is to synthesize and analyse relevant studies gathered from reputable and reliable sources of scientific articles and books, thereby proposing new models or approaches to improve the effectiveness of searching and recommending learning resources.

1.3 The research content of the thesis
To achieve its objectives, the thesis addresses the research contents shown in Figure 1.1. The first research content is to build a learning resource classification model that narrows the search space and makes the search process more effective; it is detailed in Chapter 3 and serves as a premise for the learning resource search presented in Chapter 4. The second research content is to build learning resource search models that take semantics into account and inherit the classification results of the first content; it is detailed in Chapter 4 and serves as a premise for the rating prediction and learning resource recommendation presented in Chapters 5 and 6. The third research content is to build rating prediction models, specifically for predicting student performance; it is detailed in Chapter 5 and serves as a premise for recommending learning resources suited to learners' abilities. The fourth research content is to build models for recommending learning resources suited to learners' abilities, thereby improving learning performance; it is detailed in Chapter 6.

Figure 1.1: The architecture of the learning resource search and recommendation system

1.4 The scientific contributions of the thesis
Firstly, the thesis proposes a classification approach based on deep learning with a multilayer perceptron (MLP), compared with other machine learning techniques. The results show that the proposed approach for classifying learning resources is feasible on the considered datasets. The results of this contribution are published in CT1 and CT2.

Secondly, the thesis proposes two approaches for searching learning resources: one based on an ensemble of cosine and word-order similarities, and one based on ontologies. Both approaches pre-process and classify queries and learning resources to determine the respective domain/topic and narrow the search space. Solutions to speed up search data processing are also tested. The results of this contribution are published in CT3, CT4, and CT5.

Thirdly, the thesis proposes models to predict student performance with different deep learning approaches: a model that predicts learning performance for all students using a convolutional neural network (CNN), a prediction model per learning ability group using MLP and random forest (RF), and a per-student prediction model using long short-term memory (LSTM). The results of this contribution are published in CT6, CT7, CT8, and CT9.

Finally, the thesis proposes a deep matrix factorization (DMF) model extended from standard matrix factorization (MF). Two dataset groups, datasets of learning resources and datasets of students' learning performance, are used to validate the model and to compare it with other recommender system techniques. The results of this contribution are published in CT10.

1.5 The structure of the thesis
The thesis consists of seven chapters. Chapter 1 is the introduction. Chapter 2 presents the background and research related to document classification, document search, rating prediction, and recommendation. Chapter 3 proposes learning resource classification models using deep learning techniques. Chapter 4 proposes learning resource search approaches based on document similarity and on ontologies. Chapter 5 proposes models to predict student performance on all student data, on learning ability groups, and on individual students using deep learning techniques. Chapter 6 proposes a learning resource recommendation model with DMF, compared with other recommender system methods. Chapter 7 summarizes the results of the study and future works.

CHAPTER 2: BACKGROUND AND RELATED STUDIES

As mentioned, the main research objects of the thesis are learning resource search and recommendation models, so learning resource classification, learning resource search, and several related problems need to be reviewed.

2.1 Text classification techniques
In learning resource search systems, especially at large scale, the first stage of the search process is to process the query to determine its topic, and then to search within that domain or topic. Query classification therefore plays an important role in narrowing the search space, increasing speed, and improving the accuracy of search results. Many machine learning algorithms exist for text classification, such as k nearest neighbors (kNN), Naïve Bayes, support vector machines (SVM), decision trees, and random forests; they learn on a set of labelled samples to build a query classification model, and among them SVM is a commonly used and quite effective technique.
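As a concrete illustration of this classification stage (an illustrative sketch, not code from the thesis), the snippet below trains an SVM on TF-IDF features of a few labelled documents and then predicts the topic of a query; the tiny corpus, the labels, and the scikit-learn pipeline are assumptions made only for the example.

```python
# Minimal sketch: topic classification of queries/learning resources with TF-IDF + SVM.
# The toy corpus and labels are fabricated; a real system would train on a large
# collection of labelled learning resources.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

train_docs = [
    "relational database normalization and SQL queries",
    "routing protocols and network topology design",
    "convolutional neural networks for image recognition",
]
train_labels = ["information systems", "computer networks", "computer science"]

# TF-IDF turns each document into a weighted term vector; LinearSVC learns the topic boundaries.
classifier = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("svm", LinearSVC()),
])
classifier.fit(train_docs, train_labels)

query = "index tuning for SQL database performance"
print(classifier.predict([query])[0])  # routes the query to a topic before searching
```

In the proposed search models, the predicted topic is used only to route the query to the matching partition of the learning resource collection before the actual search is performed.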
2.2 Similarity-based text search techniques
The similarity calculation problem is stated as follows: given two documents d_i and d_j, find a value S(d_i, d_j) ∈ (0, 1) representing the similarity between d_i and d_j; the higher the value, the more similar the meaning of the two documents. Several methods exist to calculate text similarity, including semantic (cosine) similarity and word-order similarity.
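To make the two similarity components concrete (an illustrative sketch rather than the exact formulas used in the thesis), the snippet below computes a cosine similarity over TF-IDF vectors and a simple word-order similarity, then combines them with a weighting factor; the weight `alpha` and the particular word-order formula are assumptions.

```python
# Sketch of combining cosine (content) similarity with a word-order similarity.
# The combination weight `alpha` and the order-similarity formula are illustrative choices.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def word_order_similarity(doc_a: str, doc_b: str) -> float:
    """Compare the positions (1-indexed) of the words shared by the two documents."""
    words_a, words_b = doc_a.lower().split(), doc_b.lower().split()
    shared = [w for w in words_a if w in words_b]
    if not shared:
        return 0.0
    pos_a = np.array([words_a.index(w) + 1 for w in shared], dtype=float)
    pos_b = np.array([words_b.index(w) + 1 for w in shared], dtype=float)
    # 1.0 when the shared words appear in the same relative order, smaller otherwise.
    return float(1.0 - np.linalg.norm(pos_a - pos_b) / np.linalg.norm(pos_a + pos_b))

def combined_similarity(doc_a: str, doc_b: str, alpha: float = 0.7) -> float:
    tfidf = TfidfVectorizer().fit_transform([doc_a, doc_b])
    cos = float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])
    return alpha * cos + (1.0 - alpha) * word_order_similarity(doc_a, doc_b)

print(combined_similarity("deep learning for student performance prediction",
                          "predicting student performance with deep learning"))
```

A higher weight on the cosine term favours topical overlap, while the word-order term rewards documents that phrase the shared content in the same sequence.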
2.3 Ontology and semantic-based search techniques
Semantic web and ontology are the two main foundations of semantic search. The structure of the semantic web consists of several layers; one of its main ideas is that meaningful data can be shared between computers in the form of a data model representing the domain. An ontology represents a set of concepts or objects in a particular domain and the relationships between them. Ontology editors are applications designed to aid the creation or manipulation of ontologies.

2.4 Recommender system and its techniques
The recommendation problem is generally formulated as follows. Let U be a set of n users, I a set of m items, R a set of user ratings, and r_{ui} ∈ R (R ⊂ ℝ) user u's rating on item i. Let D^{train} ⊆ U × I × R be the training dataset, D^{test} ⊆ U × I × R the test dataset, and r: U × I → R, (u, i) ↦ r_{ui}. The goal of the recommender system (RS) is to find a function r̂: U × I → ℝ such that an objective function O(r, r̂) satisfies a certain condition. For example, if O measures error by the Root Mean Squared Error, then

RMSE = \sqrt{ \frac{1}{|D^{test}|} \sum_{(u,i,r_{ui}) \in D^{test}} \left( r_{ui} - \hat{r}(u,i) \right)^2 }

should be minimized.
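As a small worked example of this evaluation setup (illustrative only; the rating tuples are made up), the snippet below computes the RMSE on a test set for the simplest baseline named later in this section, the Global Average predictor, which returns the mean training rating for every (user, item) pair.

```python
# RMSE of the Global Average baseline: predict the mean training rating for every pair.
# The tiny (user, item, rating) tuples are fabricated for illustration.
import math

train = [("u1", "i1", 4.0), ("u1", "i2", 3.0), ("u2", "i1", 5.0), ("u2", "i3", 2.0)]
test = [("u1", "i3", 3.0), ("u2", "i2", 4.0)]

global_avg = sum(r for _, _, r in train) / len(train)         # r̂(u, i) = mean rating of D^train
squared_errors = [(r - global_avg) ** 2 for _, _, r in test]  # (r_ui - r̂(u, i))^2
rmse = math.sqrt(sum(squared_errors) / len(test))             # square root of the mean squared error

print(f"global average = {global_avg:.2f}, RMSE = {rmse:.3f}")  # 3.50 and 0.500 here
```

Stronger predictors such as User Average, Item Average, User kNN, or matrix factorization are evaluated in exactly the same way; only the function r̂ changes.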
Some techniques used in the thesis are summarized below. MF divides a large matrix X into two smaller matrices W and H so that X can be reconstructed from them as accurately as possible, i.e., X ≈ WH^T, as shown in Figure 2.1. In biased matrix factorization (BMF), a variant of MF, bias terms are added to the predicted value. Tensor factorization (TF) generalizes matrix factorization to a three-dimensional context: with a tensor Z of size U × I × T, Z can be rewritten as Z ≈ \sum_{k=1}^{K} w_k ∘ h_k ∘ q_k. In addition, some popular recommender system methods are used as baselines for comparison, such as Global Average, User Average, Item Average, and User kNN.

Figure 2.1: Illustration of matrix factorization techniques
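The sketch below shows the core of plain matrix factorization trained with stochastic gradient descent, predicting the rating as the dot product of the user and item factor vectors; the learning rate, regularization weight, latent dimension K, epoch count, and the toy ratings are illustrative assumptions rather than the settings used in the thesis.

```python
# Minimal matrix factorization (X ≈ W·Hᵀ) trained by stochastic gradient descent.
# Hyper-parameters and the toy rating triples are illustrative only.
import numpy as np

ratings = [(0, 0, 4.0), (0, 1, 3.0), (1, 0, 5.0), (1, 2, 2.0), (2, 1, 4.0)]  # (user, item, rating)
n_users, n_items, K = 3, 3, 2
learn_rate, reg = 0.05, 0.02

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(n_users, K))   # latent user factors
H = rng.normal(scale=0.1, size=(n_items, K))   # latent item factors

for epoch in range(200):
    for u, i, r in ratings:
        error = r - W[u] @ H[i]                        # observed minus predicted rating
        W[u] += learn_rate * (error * H[i] - reg * W[u])
        H[i] += learn_rate * (error * W[u] - reg * H[i])

print("predicted rating of user 0 on item 2:", round(float(W[0] @ H[2]), 2))
```

BMF adds a global mean and per-user/per-item bias terms to this dot product, while the DMF model proposed in Chapter 6 replaces the dot product with a small neural network over the embedded factors.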
The checking tool computes the similarity and lets the user select a similarity threshold; the check then returns the articles that are similar to the considered article above the chosen threshold.

Table 4.1: The experimental results in checking similarity of articles (Field: Technology; SIM threshold > 20%)

No. 1. Considered article: "Development of mix proportion for self-compacting concrete based on optimal dense packing of aggregates and paste content"
- Article 1: "Nghiên cứu tận dụng rác thải nhựa gia công bê tông làm vật liệu xây dựng" — SIM = 0.274
- Article 2: "Developing computer vision algorithm for ripe tomato localization and estimation of the distance from the camera system to the centre of the ripe tomato on the tree" — SIM = 0.210
…

Experimentally checking the similarity of two given articles is also performed; the result is described in Table 4.2.

Table 4.2: The experimental result of checking the similarity of two given articles

No. 1. Article 1: "Biomass of Melaleuca forest at the U Minh Thuong National Park, Kien Giang Province"; Article 2: "Biomass and CO2 absorption of Melaleuca forest in Lung Ngoc Hoang Natural Reserve"; SIM threshold > 30%; Result: SIM = 0.556

4.3 Ontology-based search
4.3.1 The proposed model
The general architecture of the semantic-based search model is described in Figure 4.1. The SVM classifier is used in the text classification step.

Figure 4.1: The architecture of the semantic-based search model

In this study, a semantic search system for learning resources is built for the field of information technology (including information systems, computer science, software engineering, and computer networks and communications); the approach can also be extended to other fields.

4.3.2 The experimental data
From the identified fields, relevant course books and lectures are collected. A dictionary for the fields of information technology is then built, and records are extracted from the collected documents. After pre-processing, word segmentation, stop-word removal, and related steps are performed. The processed data comprise 1,114 records with a vector dimension of 1,336 (number of attributes).

4.3.3 The experimental results
The SVM algorithm is used for classification, and the classification model is evaluated with the precision, recall, and F1 measures. The experimental results show that the classification effectiveness of the SVM algorithm is quite good, with an accuracy of over 95%. On this basis, an ontology-based search system is built as in Figure 4.2.

Figure 4.2: Semantic search system

A data processing solution for search based on big data processing techniques is also tested. The experiments show that parallel processing greatly shortens the data processing time compared to the traditional search while the accuracy does not change.
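To illustrate how an ontology can back this kind of semantic search (a minimal sketch, not the thesis's actual ontology or schema), the snippet below loads an OWL/RDF ontology with rdflib and runs a SPARQL query that retrieves learning resources attached to a topic concept; the file name, namespace, and property names (hasTopic, title) are hypothetical placeholders.

```python
# Sketch of ontology-backed retrieval with rdflib. The ontology file, namespace and
# property names (hasTopic, title) are hypothetical placeholders for this example.
from rdflib import Graph

g = Graph()
g.parse("learning_resources.owl")  # hypothetical ontology of topics and learning resources

query = """
PREFIX lr: <http://example.org/learning-resources#>
SELECT ?resource ?title WHERE {
    ?resource lr:hasTopic lr:InformationSystems ;
              lr:title    ?title .
}
"""
for row in g.query(query):
    print(row.resource, row.title)
```

In the proposed model, the query is first classified with SVM to pick the relevant topic, and only then is the semantic lookup performed, which keeps the search space small.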
4.4 Conclusion
This chapter proposes approaches for searching learning resources based on document similarity, combining the semantic (cosine) similarity of documents with word-order similarity. A method for ontology-based searching of learning resources is also proposed, and a solution to speed up data processing is tested. The experimental results show that the proposed methods and models are feasible for searching learning resources based on semantic similarities and ontologies.

CHAPTER 5: LEARNING PERFORMANCE PREDICTION MODELS

5.1 Introduction
This chapter presents models to predict learning performance with three deep learning approaches: a prediction model for all students using CNN, a prediction model per learning ability group using MLP and RF, and a per-student prediction model using LSTM and MLP. The experimental results show that the proposed models give fairly good prediction results and can be applied in practice. The results of this chapter are published in CT6, CT7, CT8 and CT9.

5.2 Learning performance prediction model on the whole data
5.2.1 The proposed model
The deep learning architecture uses a CNN on one-dimensional data, as illustrated in Figure 5.1. The proposed CNN takes as input a data sequence with 21 attributes, which passes through a first convolutional layer with 64 kernels.

Figure 5.1: The proposed CNN architecture
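A minimal Keras sketch of this kind of one-dimensional CNN regressor is shown below; the kernel size, stride, pooling and dense head, and the MAE/Adam settings are illustrative choices, since the summary only fixes the 21 input attributes and the 64 kernels of the first convolutional layer.

```python
# Sketch of a 1D-CNN regressor over a 21-attribute input sequence (Keras).
# Kernel size, stride and the dense head are assumptions; the summary only fixes
# 21 input attributes and 64 kernels in the first convolutional layer.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(21, 1)),                       # the 21 attributes as a 1-D sequence
    layers.Conv1D(64, kernel_size=3, strides=1, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),                                   # predicted mark on the 4-point scale
])
model.compile(optimizer=keras.optimizers.Adam(), loss="mae", metrics=["mae"])
model.summary()
```

Training then uses early stopping and either the Adam or the RMSprop optimizer, which the experimental results below compare.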
5.2.2 The experimental data
The collected data relate to students, courses, marks, and other information from 2007 to 2019, with more than 3.8 million records. The data are split by time: 2007 to 2016 for training and 2017 to 2019 for testing.

5.2.3 The experimental results
For hyper-parameter searching, the two optimization functions RMSprop and Adam are compared, and the early stopping technique is used: if the result does not improve for five consecutive epochs, the learning process stops, running up to 500 epochs. The mean absolute error (MAE) measure is used.

Data transformation: large input values slow down learning and convergence and lengthen training, so attribute values should be scaled to a suitable range. In this study, quantile transformation (QTF) is used as the data transformation, helping the deep learning algorithms converge better.

In the experiments with CNN, QTF and the Adam optimization function, the prediction error is quite good on the 16 considered datasets: the MAE values are all below 0.8 (prediction on a scale of 4), and some are below 0.5. Besides Adam, the RMSprop optimization function is also used to evaluate the proposed model more objectively. The experimental results in Table 5.1 show that, with the CNN prediction model and QTF, RMSprop gives better prediction results than Adam on most of the considered datasets (13 out of 16), suggesting that RMSprop may be suitable for one-dimensional (1D), sequential time data. With QTF and both optimization functions, the CNN model is also used to predict learning performance on the entire dataset of more than 3.8 million records collected from all academic units of Can Tho University; there, Adam gives better results than RMSprop. A possible explanation is that when the entire dataset is used, the sequential nature of the data is limited, so RMSprop may not show its strengths.

Table 5.1: Learning performance prediction results (MAE) using CNN with QTF and the Adam and RMSprop optimization functions

Dataset | CNN-RMSprop | CNN-Adam
Education | 0.5733 | 0.5847
Environment and Natural Resources | 0.5989 | 0.6130
Economics | 0.5922 | 0.6098
Foreign Languages | 0.4853 | 0.4961
Social Sciences and Humanities | 0.5920 | 0.5793
Aquaculture and Fisheries | 0.5918 | 0.6471
Law | 0.5546 | 0.5675
Political Sciences | 0.5765 | 0.5547
Mekong Delta Development Research | 0.5678 | 0.5684
Agriculture | 0.5806 | 0.5828
Biotechnology R&D | 0.5330 | 0.5980
Physical Education | 0.6762 | 0.6853
Engineering Technology | 0.7454 | 0.7487
Information & Communications Technology | 0.6903 | 0.7285
Natural Sciences | 0.6725 | 0.7989
Rural Development | 0.7134 | 0.6936

5.3 Performance prediction model on learning ability group
5.3.1 The proposed model
In this study, four prediction models are proposed for four groups of students with different academic abilities, using the MLP technique, as shown in Figure 5.2. The MLP architecture consists of an input layer, five hidden layers, and an output layer. The input layer contains the data attributes; the output layer has one neuron representing the mark to be predicted, with a value from 0 to 4. The first four hidden layers contain 256 neurons each.

Figure 5.2: The overall diagram of the approach

The early stopping technique is used, running up to 500 epochs; the Adam optimization function is used with the default learning rate of 0.001.
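The sketch below outlines one of these group models in Keras with the pieces stated above (five hidden layers with 256 neurons in the first four, one output neuron, Adam at learning rate 0.001, early stopping up to 500 epochs); the activation functions, the size of the fifth hidden layer, the patience value, the number of input attributes, and the placeholder training arrays are assumptions.

```python
# Sketch of one per-ability-group MLP regressor (Keras). Activations, the fifth hidden
# layer size, patience, feature count and the placeholder data are assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_features = 21  # assumed attribute count, mirroring the CNN model above

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(16, activation="relu"),   # fifth, smaller hidden layer (size assumed)
    layers.Dense(1),                       # predicted mark on the 0-4 scale
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mae")

# Placeholder arrays standing in for one ability group's training split.
X_train, y_train = np.random.rand(512, n_features), np.random.rand(512) * 4
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
model.fit(X_train, y_train, validation_split=0.2, epochs=500, callbacks=[early_stop], verbose=0)
```

One such model is trained per ability group, and each student is routed to the model of their group before prediction.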
5.3.2 The experimental data
The collected data relate to students, courses, marks, and other information from 2007 to 2019, with more than 3.8 million records. The data are divided by time; the training and test datasets have a ratio of 2/3 and 1/3, respectively.

5.3.3 The experimental results
For comparison, baselines such as User Average (prediction based on the average result of the student) and Item Average (prediction based on the average result of the course) are used, and other collaborative filtering methods are also compared. Two common measures, RMSE and MAE, are used to evaluate the models, averaged over 10 experimental runs. The experimental results are presented in Figure 5.3, where GroupMLP denotes the four models built on the four groups of students' learning ability and MLP denotes a single model that predicts the academic performance of all students.

Figure 5.3: Measure comparison between GroupMLP and MLP

The results show that GroupMLP performs better than the other recommender system baselines on both MAE and RMSE, with an improvement of more than 70%. In addition to the MLP technique, another prediction model is proposed that uses GPA to divide students into four groups (excellent, very good, good, and fair) with the RF algorithm; this model also gives good prediction results for each learning ability group.

5.4 Learning performance prediction model on per student
5.4.1 The proposed model and experimental data
In this study, prediction models are proposed to predict the learning performance of each individual student, using LSTM and MLP. The LSTM architecture takes sequences of time steps as input; the LSTM layer has 50 neurons, and a dense (hidden) layer with one neuron gives the predicted value, as shown in Figure 5.4.

Figure 5.4: Architecture of LSTM

Meanwhile, the MLP network architecture consists of an input layer, five hidden layers, and an output layer. The input layer contains the attributes of the input data; the first hidden layer uses the ReLU activation function, the second and third hidden layers have 27 neurons and use the sigmoid activation function, the fourth hidden layer uses ReLU, and the fifth hidden layer has one neuron for the output value, as shown in Figure 5.5.

Figure 5.5: Architecture of MLP network

For the experiments, a dataset of students' learning performance from some academic units (mainly science and engineering technology) of a university is used; the academic performance data are collected from 2017 to 2019 with more than a million records. To diversify the experimental data, the original dataset is divided into two new datasets that retain students with at least 10 records and at least 20 records of learning results, respectively.
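A minimal Keras version of the stated LSTM architecture (an LSTM layer with 50 neurons followed by a one-neuron dense output) is sketched below; the number of time steps, the features per step, the loss, and the placeholder sequences are assumptions for illustration.

```python
# Sketch of the per-student LSTM regressor: 50 LSTM units + a 1-neuron dense output.
# Sequence length, feature count, loss and the placeholder data are assumed.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

time_steps, n_features = 10, 1   # e.g. the last 10 course marks of one student (assumed)

model = keras.Sequential([
    layers.Input(shape=(time_steps, n_features)),
    layers.LSTM(50),             # 50 neurons in the LSTM layer, as stated in the summary
    layers.Dense(1),             # predicted next mark
])
model.compile(optimizer="adam", loss="mse", metrics=[keras.metrics.RootMeanSquaredError()])

# Placeholder sequences: each sample stands in for one student's recent mark history.
X = np.random.rand(256, time_steps, n_features) * 4
y = np.random.rand(256) * 4
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
print(model.predict(X[:1], verbose=0))
```

The RMSE values reported in Table 5.2 compare this LSTM against the MLP variant on the two per-student datasets.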
5.4.2 The experimental results
The prediction results using the RMSE measure with LSTM and MLP are shown in Table 5.2.

Table 5.2: The predictive results (RMSE) with the LSTM and MLP architectures

Dataset | LSTM | MLP | Description
StudentPerformance10 | 0.505 | 0.536 | Dataset having at least 10 records per student
StudentPerformance20 | 0.513 | 0.526 | Dataset having at least 20 records per student

From these results, the LSTM model predicts better than the MLP model on the same dataset, showing that the LSTM network works quite well on sequential time data. The MLP-based model also gives fairly good results compared to the model for all students in the previous section.

5.5 Conclusion
This chapter presents approaches for building learning performance prediction models: on all student data using CNN, per learning ability group using an MLP network and RF, and per student using LSTM and MLP. The experimental results show that the proposed models give increasingly good prediction results, in that order. This indicates that the proposed models and techniques, especially deep learning techniques, can be applied in practice to predict learning performance, which can then be used to recommend appropriate courses to students.

CHAPTER 6: LEARNING RESOURCE RECOMMENDATION MODEL

6.1 Introduction
This chapter proposes a DMF model, extended from standard MF, to recommend learning resources suited to learners' abilities. The results of this chapter are published in CT10.

6.2 Learning resource recommendation model by DMF
The proposed recommendation model using DMF is detailed in Figure 6.1.

Figure 6.1: Framework of the DMF model

The proposed DMF model consists of an input layer describing the current user or learning resource, an embedding layer that embeds the user and learning resource features (latent factors), a hidden MLP layer that takes the concatenated embedded features as input, and an output layer that produces the predicted rating value. In this study, the hidden MLP layer has 128 neurons (the number of hidden layers and neurons can be set depending on the dataset); the number of neurons is selected by hyper-parameter searching. The network is trained with the Adam optimization function with the default learning rate of 0.001.
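A compact Keras sketch of this DMF structure is given below, with user and item embedding layers, concatenation, a 128-neuron hidden MLP layer, a single output neuron, and Adam at learning rate 0.001; the user/item counts and the placeholder training triples are assumptions, and the embedding size of 10 follows the latent factor count reported in the experiments of the next section.

```python
# Sketch of the DMF recommender: user/item embeddings -> concatenate -> MLP -> rating.
# User/item counts and the toy training triples are placeholders; K = 10 latent factors
# and the 128-neuron hidden layer follow the values reported in the summary.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_users, n_items, K = 1000, 500, 10

user_in = keras.Input(shape=(1,), name="user_id")
item_in = keras.Input(shape=(1,), name="item_id")
user_vec = layers.Flatten()(layers.Embedding(n_users, K)(user_in))   # user latent factors
item_vec = layers.Flatten()(layers.Embedding(n_items, K)(item_in))   # item latent factors
hidden = layers.Dense(128, activation="relu")(layers.Concatenate()([user_vec, item_vec]))
rating_out = layers.Dense(1)(hidden)                                 # predicted rating

dmf = keras.Model([user_in, item_in], rating_out)
dmf.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mse")

# Placeholder (user, item, rating) training triples.
users = np.random.randint(0, n_users, size=2048)
items = np.random.randint(0, n_items, size=2048)
ratings = np.random.rand(2048) * 4
dmf.fit([users, items], ratings, epochs=2, batch_size=64, verbose=0)
```

Compared with the plain MF dot product in Chapter 2, the MLP on top of the concatenated embeddings lets the model learn a non-linear interaction between user and learning resource factors.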
6.3 The experimental data
The proposed model is validated on two data groups: datasets of learning resources and datasets of students' learning performance at a university. The learning resource group includes several datasets describing users' ratings of learning resources (items); Table 6.1 summarizes the number of users, items, and ratings of each. These datasets are rather sparse, so they are filtered to retain only users or learning resources having at least a minimum number of ratings.

Table 6.1: Description of the datasets (number of users, items, and ratings for each dataset, including LibraryThings, BX-Book-ratings, Related-Article Recommendation, and book rating datasets)

The group of students' learning performance includes three datasets. The first dataset contains the learning performance of students from some academic units of a university. The second dataset retains at least 10 records (10 courses) for each student, and the third dataset similarly retains at least 20 records for each student.

6.4 The experimental results
The RMSE measure is used to evaluate the model and to compare it with other recommender system methods such as Global Average, User Average, Item Average, User kNN and MF; hyper-parameters are found experimentally. The experiments on the two data groups give quite similar results. For example, on the learning resource datasets, the number of neurons of the MLP layer is about 100, the number of latent factors is K ≈ 10, and the DMF model converges after fewer epochs than the MF model; on the student learning performance datasets, the DMF model likewise always converges earlier. A comparison of the RMSE measure between DMF and the other recommender system methods on the Ratings dataset is shown in Figure 6.2; similar results are found on the other datasets.

Figure 6.2: Comparison of the RMSE measure between the methods on the Ratings dataset

In general, DMF gives superior results compared to the other recommender system methods. The datasets filtered to overcome the sparse data situation give better results than the original datasets. From these results, the predicted ratings can be used to recommend courses or learning resources suitable for learners.

6.5 Conclusion
In this chapter, the DMF model is proposed and tested on two data groups, datasets of learning resources and datasets of students' learning performance, and compared with other recommender system methods. The results show that the DMF model has quite good predictive performance compared to other techniques on the same dataset. From the predicted ratings, it is possible to recommend learning resources or courses suited to each learner.

CHAPTER 7: CONCLUSIONS AND FUTURE WORKS

7.1 Results of the study
To achieve the overall goal of building models for searching and recommending learning resources, the thesis proposes models for classification, learning resource search, learning performance prediction, and learning resource recommendation with different techniques. The results can be summarized as follows.

A learning resource classification model based on MLP is proposed. The comparison of the deep learning technique with other machine learning techniques shows that this approach gives feasible and effective document classification performance.

Two approaches for searching learning resources are proposed, based on document similarity and on ontologies. In each approach, queries and learning resources are classified to identify the domain (or topic) and narrow the search space before searching on the corresponding topic of the built learning resources.

Models to predict learning performance are proposed using deep learning techniques: a prediction model on all student data using CNN, a prediction model per ability group using MLP and RF, and a per-student prediction model using LSTM and MLP. The experimental results show that the three proposed models give increasingly good predictive results, in that order. The proposed models and techniques, especially deep learning techniques, therefore have great potential for building prediction models of learning performance in particular and of learning resources in general.

A learning resource recommendation model using DMF, extended from the standard MF technique, is proposed. The model is validated on two groups of datasets, learning resources and students' learning performance at a university, and compared with other recommender system baselines. The results show that the DMF model has good rating prediction performance compared to other techniques, so it can recommend suitable learning resources or courses for each learner.

7.2 Future works
For the two proposed search approaches, methods to evaluate the effectiveness of the ontology-based search model should be studied, and the search performance of the two approaches should be compared. The predictive models of learning outcome rating and the learning resource recommendation model should be tested on several other datasets for a more comprehensive and objective assessment of the proposed techniques, especially the deep learning ones. In addition, multi-attribute data with time sequences should be studied, selecting the attributes that positively influence the prediction results to improve the effectiveness of the deep learning models. The search, prediction, and recommendation models focus on textual learning resources; further research could extend them to other types of learning resources, such as videos. It is also necessary to link the works of the thesis, and to integrate the search, rating prediction, and learning resource recommendation models into a learning resource management system that can be applied in educational institutions, especially higher education institutions.
