Link prediction in heterogeneous networks and its application to predicting associations between non-coding RNAs and diseases.

DOCUMENT INFORMATION

Basic information

Pages: 151
Size: 3.6 MB

Content

MINISTRY OF EDUCATION AND TRAINING
HANOI NATIONAL UNIVERSITY OF EDUCATION

NGUYEN VAN TINH

LINK PREDICTION IN HETEROGENEOUS INFORMATION NETWORKS AND ITS APPLICATIONS IN PREDICTING ASSOCIATIONS BETWEEN NON-CODING RNAS AND DISEASES

Major: Computer Science
Code: 9480101

DOCTORAL DISSERTATION IN COMPUTER SCIENCE

SUPERVISORS
Assoc. Prof. Dr. TRAN DANG HUNG
Dr. LE THI TU KIEN

Hanoi, 2023

DECLARATION OF AUTHORSHIP

I, NGUYEN VAN TINH, affirm that the dissertation entitled “Link prediction in heterogeneous information networks and its applications in predicting associations between non-coding RNAs and diseases” was completed by myself under the supervision of Assoc. Prof. Dr. Tran Dang Hung and Dr. Le Thi Tu Kien. I affirm the following points:

- This dissertation was completed during my Ph.D. research at Hanoi National University of Education.
- This work has not been submitted for any other degree or qualification at Hanoi National University of Education or any other institution.
- Appropriate acknowledgment has been given in the thesis wherever reference has been made to other published works.
- The submitted thesis is my own work, except where collaborative work has been included. The collaborative contributions have been indicated.

Hanoi, 2023
Ph.D. Student
SUPERVISORS: Assoc. Prof. Dr. TRAN DANG HUNG, Dr. LE THI TU KIEN

ACKNOWLEDGEMENT

This dissertation was completed during my Ph.D. course at Hanoi National University of Education (HNUE). HNUE is a special place where I gained valuable knowledge and skills on the way to becoming a researcher. I am grateful to all the people who have supported and encouraged me in completing the dissertation.

Firstly, I would like to thank my advisors, Assoc. Prof. Dr. Tran Dang Hung and Dr. Le Thi Tu Kien, for their instruction, advice, and encouragement throughout my Ph.D. course. My dissertation could not have been completed without my advisors’ scientific direction, encouragement, and support.

Secondly, I wish to thank all members of the Faculty of Information Technology, HNUE, for their frequent support during my Ph.D. course. I also wish to thank all my colleagues in the Faculty of Information Technology, Hanoi University of Industry (HaUI), for their support in professional work during the Ph.D. course.

Next, I wish to thank Assoc. Prof. Dr. Than Quang Khoat, Hanoi University of Science and Technology, and Dr. Nguyen Tran Quoc Vinh, Faculty of Information Technology, The University of Da Nang - University of Science and Education, for their contributions and suggestions during my Ph.D. course. I would also like to thank all reviewers for their valuable comments and suggestions toward the completion of the dissertation.

Additionally, this work was funded by Gia Lam Urban Development and Investment Company Limited, Vingroup, and supported by Vingroup Innovation Foundation (VINIF) under project code VINIF.2019.DA18.

Finally, I would like to express my sincere gratitude to my family and friends for their continuous support and encouragement to complete the Ph.D. course.

Hanoi, 2023
Ph.D. Student
Nguyen Van Tinh

CONTENTS

DECLARATION OF AUTHORSHIP
ACKNOWLEDGEMENT
CONTENTS
ABBREVIATIONS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
CHAPTER 1. BACKGROUND
  1.1 Basic concepts
    1.1.1 Heterogeneous information networks
    1.1.2 Biological systems
    1.1.3 Non-coding RNAs (ncRNAs)
  1.2 Link prediction in heterogeneous information networks
    1.2.1 Link prediction problem
    1.2.2 Link prediction methods
    1.2.3 Link prediction applications in biological systems
  1.3 Computational methods for predicting associations between non-coding RNAs and diseases
    1.3.1 Non-coding RNA-disease association prediction as a link prediction problem
    1.3.2 Materials used for ncRNA-disease association prediction
    1.3.3 Similarity calculation and network construction
    1.3.4 Literature review of computational methods to predict ncRNA-disease associations
  1.4 Thesis’s research directions
  1.5 Some evaluation methods and metrics to evaluate prediction performance
    1.5.1 Cross-validation
    1.5.2 Area under ROC Curve (AUC)
    1.5.3 Area under Precision-Recall Curve (AUPR)
    1.5.4 Checking case studies
  1.6 Chapter summary
CHAPTER 2. NCRNA-DISEASE ASSOCIATIONS PREDICTION WITH COLLABORATIVE FILTERING AND RESOURCE ALLOCATION PROCESS ON A TRIPARTITE GRAPH
  2.1 Motivations
  2.2 Main related works
    2.2.1 The item-based collaborative filtering algorithm for ncRNA-disease association prediction
    2.2.2 Resource allocation on a tripartite graph
  2.3 The proposed model for predicting ncRNA-disease associations based on a collaborative filtering algorithm and a resource allocation process on a tripartite graph
  2.4 Employing the proposed model to infer miRNA-disease associations based on collaborative filtering and resource allocation
    2.4.1 Detailed description of the proposed model's stages in inferring miRNA-disease associations
    2.4.2 Proposed method's experiments and results
  2.5 Employing the proposed model to predict lncRNA-disease associations based on collaborative filtering and resource allocation
    2.5.1 Detailed description of the proposed model's stages in predicting lncRNA-disease associations
    2.5.2 Proposed method's experiments and results
  2.6 Chapter summary
CHAPTER 3. MIRNA-DISEASE ASSOCIATIONS PREDICTION USING IMPROVED RANDOM WALK WITH RESTART AND INTEGRATING MULTIPLE SIMILARITIES
  3.1 Motivation and main related works
  3.2 Datasets used in the proposed method
    3.2.1 Human miRNA-disease associations
    3.2.2 Disease semantic similarity
    3.2.3 MiRNA functional similarity
  3.3 Proposed method
    3.3.1 Proposed method overview
    3.3.2 Calculating Gaussian interaction profile kernel similarity for miRNAs and diseases
    3.3.3 Calculating integrated similarity for miRNAs and diseases
    3.3.4 Weighted K-nearest known neighbors algorithm
    3.3.5 Constructing miRNA similarity-based and disease similarity-based heterogeneous networks
    3.3.6 Employing improved random walk with restart to predict miRNA-disease associations
    3.3.7 Ranking the final prediction scores of associations to obtain predicted miRNA-disease associations
  3.4 Experiments and results
    3.4.1 Datasets
    3.4.2 Implementing and estimating the time complexity of the proposed method
    3.4.3 Performance measures
    3.4.4 Performance comparison with other related models
    3.4.5 Case studies
  3.5 Chapter summary and discussion
CONCLUSION AND FUTURE WORKS
PUBLICATIONS
REFERENCES

ABBREVIATIONS

No.  Abbreviation  Meaning
1    AUC           Area Under ROC Curve
2    AUPR          Area Under Precision-Recall Curve
3    CF            Collaborative filtering
4    CNN           Convolutional neural network
5    CRC           Colorectal cancer
6    DAGs          Directed acyclic graphs
7    DBN           Deep belief network
8    FN            False negative
9    FP            False positive
10   FPR           False positive rate
11   GCN           Graph convolutional network
12   GIP           Gaussian interaction profile
13   HCC           Hepatocellular carcinoma
14   HF            Heart failure
15   HIN           Heterogeneous information network
16   lncRNAs       Long non-coding RNAs
17   LOOCV         Leave-one-out cross validation
18   MF            Matrix factorization
19   miRNAs        Micro RNAs
20   ncRNAs        Non-coding RNAs
21   NMF           Non-negative matrix factorization
22   OAG           Open-angle glaucoma
23   POAG          Primary open-angle glaucoma
24   ROC           Receiver operating characteristic
25   RWR           Random Walk with Restart
26   SVM           Support vector machine
27   TN            True negative
28   TP            True positive
29   TPR           True positive rate
30   WKNKN         Weighted K nearest known neighbors

LIST OF TABLES

Table 1.1 Databases containing miRNA-related information and miRNA-disease associations
Table 1.2 Databases containing lncRNA-related information
Table 2.1 Performance comparison with other related models
Table 2.2 Top 40 predicted miRNAs for Prostatic Neoplasms
Table 2.3 Top 40 predicted miRNAs for Heart failure
Table 2.4 Top 40 predicted miRNAs for Glioma
Table 2.5 Top 20 miRNAs for Glaucoma, Open-Angle
Table 2.6 AUC and AUPR values of related methods in comparison
Table 2.7 Top 10 predicted Prostate cancer-related lncRNAs
Table 2.8 Top 10 predicted lncRNAs related to Stomach cancer
Table 3.1 AUC and AUPR one-sample t-test
Table 3.2 Evaluation of index changes in WKNKN algorithm
Table 3.3 AUC and AUPR values of RWRMMDA and other latest methods in comparison
Table 3.4 Top 40 predicted Breast Neoplasms-associated miRNAs
Table 3.5 Top 40 predicted Hepatocellular carcinoma-associated miRNAs
Table 3.6 Top 40 predicted Stomach Neoplasms-associated miRNAs
Table 3.7 Top 10 predicted associations between Lung Neoplasms and miRNAs from the simulated experiment for predicting new disease-related miRNAs
Table 3.8 Top 10 predicted associations between Ovarian Neoplasms and miRNAs from the simulated experiment for predicting new disease-related miRNAs

LIST OF FIGURES

Figure 0.1 The dissertation outline
Figure 1.1 An illustration of HIN with multiple node types and multiple link types
Figure 1.2 An illustration of HIN’s network schema
Figure 1.3 An illustration of a link prediction problem
Figure 1.4 An illustration of a ROC curve and AUC
Figure 1.5 An illustration of a Precision-Recall curve and AUPR
Figure 2.1 The proposed model's flowchart
Figure 2.2 The datasets and the numbers of data nodes in the proposed method
Figure 2.3 ROC curve and AUC value of the proposed method with γ = 0.9 in one experimental run
Figure 2.4 Precision-Recall curve and AUPR value of the proposed method with γ = 0.9 in one experimental run
Figure 2.5 The relationships between the different data sources and the numbers of data nodes used in the proposed method
Figure 2.6 The proposed method's ROC curves and AUC values across experimental runs with γ = 0.8
Figure 2.7 The proposed method's Precision-Recall curves and AUPR values across experimental runs with γ = 0.8
Figure 3.1 Illustration of computing miRNA functional similarity
Figure 3.2 The workflow of the proposed method (RWRMMDA)
Figure 3.3 Illustration of the process of weight assignment in disease space and miRNA space
Figure 3.4 The improved RWR process's steps to predict miRNA-disease associations
Figure 3.5 ROC curves and AUC values (a) and PR curves and AUPR values (b) across runs of 5-fold cross-validation experiments

Date posted: 24/04/2023, 14:46
