
(Thesis) Comparing receptor binding properties of 2019-nCoV virus with those of SARS-CoV virus using computational biophysics approach



VIETNAM NATIONAL UNIVERSITY, HANOI
VIETNAM JAPAN UNIVERSITY

CONG PHUONG CAO

COMPARING RECEPTOR BINDING PROPERTIES OF 2019-nCoV VIRUS WITH THOSE OF SARS-CoV VIRUS USING COMPUTATIONAL BIOPHYSICS APPROACH

MASTER'S THESIS

MAJOR: NANOTECHNOLOGY
CODE: 8440140.11 QTD
RESEARCH SUPERVISOR: Associate Prof. Dr. NGUYEN THE TOAN

Hanoi, 2021

Acknowledgements

It could be said that without Prof. Nguyen The Toan, I could not have come this far on my scientific research path, much less completed this master's thesis. Therefore, first of all, I want to express my sincere thanks to Prof. Nguyen The Toan, my beloved master's thesis supervisor at the VNU Key Laboratory on Multiscale Simulation of Complex Systems and the Faculty of Physics, VNU University of Science, Vietnam National University.

I also wish to thank Dr. Pham Trong Lam, who guided me through my very first steps in the machine learning field and gave me precious advice for my research during my internship period and thesis defense preparation.

I would like to thank the lecturers in the VJU Master's Program in Nanotechnology for many inspirational discussions and for the helpful knowledge from their classes. I would also like to thank all the staff, lecturers, and my good friends at VJU for helping me a lot during my memorable studies at VJU.

This research is funded by Vietnam National University under grant number QG.20.82.

Hanoi, 17 July 2021
Cong Phuong Cao

Contents

Acknowledgements
List of Tables
List of Figures
List of Abbreviations
1 INTRODUCTION
2 MOLECULAR DYNAMICS SIMULATION
  2.1 Molecular Dynamics
    2.1.1 Integration Algorithm
    2.1.2 Force field
  2.2 Materials and Models
  2.3 Simulation Details
    2.3.1 Thermostat and Barostat
    2.3.2 Periodic Boundary Conditions
3 ANALYSIS METHODS
  3.1 Sequence Alignment
  3.2 Root Mean Square Deviation
  3.3 Root Mean Square Fluctuation
  3.4 Principal Component Analysis
  3.5 Variational Autoencoder
4 RESULTS AND DISCUSSION
  4.1 Preliminary Sequence Alignments of The Viral RBDs
  4.2 Deviations and Fluctuations of The Structural Backbone Atoms
    4.2.1 Root Mean Square Deviations
    4.2.2 Root Mean Square Fluctuations
  4.3 Principal Component Analysis
  4.4 Machine Learning on 6M0J System
5 CONCLUSIONS
REFERENCES
A IN-HOUSE SOURCE CODE
  A.1 Data Pre-processing Source Code
  A.2 Autoencoder Source Code
B ADDITIONAL VAE RESULTS

List of Tables

2.1 The molecules simulated for each system
3.1 The detailed parameters of the VAE model
4.1 The trace of the covariance matrix of the projections of the protein backbones on the two largest principal components

List of Figures

1.1 The binding of coronavirus spike protein to human ACE2 receptor
1.2 Antibodies neutralizing SARS-CoV-2 virus by blocking its interaction with human ACE2 receptor
2.1 A 2-dimensional PBC view along the z-axis direction of the 6VW1 system. The primitive system is surrounded by and interacts with its images
2.2 A typical snapshot of the 6M0J system after being simulated for 800 ns, showing the arrangement of RBD and ACE2 fluctuating in water
3.1 Illustration of VAE structure used for protein datasets
4.1 The sequence alignments of the viral RBD of 6VW1 and 6M0J for two variants of SARS-CoV-2 virus, and of 2AJF for SARS-CoV virus
4.2 The location of four discovered significant mutations of the viral RBD
4.3 The root mean square deviations of the backbone of the human ACE2 receptor and of the viral RBD protein
4.4 The root mean square fluctuations of the backbone of the human ACE2 receptor and of the viral RBD protein
4.5 The location of residue 113 of the viral RBD in the 6VW1 system
4.6 The location of residue 50 of the viral RBD in the 2AJF system
4.7 The probability density in the plane of the two largest principal components from the PCA of the backbone structures of the proteins
4.8 Latent space projection of variational autoencoder trained on the distance matrix of RBD-ACE2 complex of 6M0J
B.1 Latent space projection of variational autoencoder trained on the distance matrix of RBD-ACE2 complex of 6M0J
B.2 Latent space projection of variational autoencoder trained on the distance matrix of RBD-ACE2 complex of 6M0J
B.3 Latent space projection of variational autoencoder trained on the distance matrix of RBD-ACE2 complex of 6M0J

List of Abbreviations

SARS: Severe Acute Respiratory Syndrome
SARS-CoV-2: Severe Acute Respiratory Syndrome CoronaVirus 2
2019-nCoV: 2019 Novel CoronaVirus, colloquial name of SARS-CoV-2
SARS-CoV or SARS-CoV-1: Severe Acute Respiratory Syndrome CoronaVirus (caused the epidemic in June 2003, different from 2019-nCoV)
RBD: Receptor-Binding Domain
ACE2: Angiotensin Converting Enzyme 2
MD: Molecular Dynamics
EOM: Newton's Equations of Motion
RCSB: The Research Collaboratory for Structural Bioinformatics
PDB: Protein Data Bank
PBC: Periodic Boundary Conditions
PME: Particle Mesh Ewald
RMSD: Root Mean Square Deviation
RMSF: Root Mean Square Fluctuation
PCA: Principal Component Analysis
VAE: Variational Autoencoder
DAE: Deep Autoencoder

Chapter 1

INTRODUCTION

By the end of 2019, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2, also known as 2019-nCoV) was detected in Wuhan city, China, and spread rapidly across countries and regions, forcing the World Health Organization to declare a public health emergency only three months later [1]. Because of the extremely fast spread rate, fast mutation rate, and toxicity of SARS-CoV-2, scientists are rushing to find a cure for the severe acute respiratory syndrome caused by the virus. It turns out that the genome of SARS-CoV-2 is very similar to the genomes of other coronaviruses, and the virus can be classified as a variant of the severe acute respiratory syndrome coronavirus (SARS-CoV), which caused the SARS epidemic in June 2003.

The structure of a coronavirus can be divided into two parts, namely the core and the shell. The viral core is the single-stranded RNA viral genome. The viral shell is a combination of lipids, envelope proteins, and spike proteins, in which the spike proteins play an important role in the entry of the RNA viral genome into the host cell. The receptor-binding domain (RBD) is a subunit of the spike glycoprotein (also known as protein S) attached to the viral outer shell [2], [3]. The RBD recognizes and binds to human cells through a receptor called angiotensin-converting enzyme 2 (ACE2), like a key being inserted into a lock (illustrated in Figure 1.1) [4]. After that, the coronavirus is incorporated into the host cell and releases the viral RNA into the cytoplasm.

FIGURE 1.1: The binding of coronavirus spike protein to human ACE2 receptor (The figure is from [5]).

According to [6]–[10], the RBDs of SARS-CoV and SARS-CoV-2 have significant similarities in genome sequence and also use the same cellular entry receptor, namely ACE2. Because of the close relation between SARS-CoV and SARS-CoV-2, an important question arises: what are the significant differences (mutations) between them that make SARS-CoV-2 much more contagious and dangerous? It is supposed that the mutations in the RBD of SARS-CoV-2 relative to that of SARS-CoV can affect the binding affinity for the ACE2 receptor [8], [11]. In this study, we aim to answer the above question by analyzing the structural differences in the binding of the RBDs of two variants of SARS-CoV-2 and of SARS-CoV to the human ACE2 receptor.

One approach is to study the interactions of coronaviruses (including SARS-CoV-2) with the human ACE2 receptor using computational biophysics methods, such as molecular dynamics and unsupervised machine learning techniques. In this study, we use both molecular dynamics and machine learning. To investigate the characteristics of the binding mechanism of the complex of RBD protein and ACE2 receptor, conventional molecular dynamics is used to simulate the molecular interactions. The trajectories obtained from the molecular dynamics simulation are then used as input for principal component analysis (PCA) and the variational autoencoder (both unsupervised learning methods) to extract features (knowledge) of the binding.

It is expected that, from knowledge of the binding mechanism between the viral RBDs and the ACE2 receptor, one can build and develop antibodies or antiviral drugs based on the binding features of the RBD of the SARS-CoV-2 spike protein. The SARS-CoV-2 spike protein is the main target for antibody and antiviral drug design throughout the
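As a rough illustration of the trajectory-to-PCA step mentioned in the introduction, the sketch below projects a set of coordinate frames onto the two largest principal components and computes the trace of the covariance matrix of those projections, the quantity named in the caption of Table 4.1. It is not the thesis's in-house code: the function name `pca_project`, the array shapes, and the random "trajectory" are assumptions made only for illustration, and the usual alignment of frames to a reference structure is omitted.

```python
import numpy as np


def pca_project(traj, n_components=2):
    """Project trajectory frames onto the leading principal components.

    traj: array of shape (n_frames, n_atoms, 3), assumed to be already
    aligned to a reference structure (alignment is not done here).
    Returns (projections, variances) for the leading components.
    """
    n_frames = traj.shape[0]
    # Flatten each frame into one feature vector of length 3 * n_atoms.
    x = traj.reshape(n_frames, -1)
    # Center every coordinate on its mean over the trajectory.
    x = x - x.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(x, full_matrices=False)
    components = vt[:n_components]
    projections = x @ components.T
    # Variance along each principal axis (sample variance, ddof=1).
    variances = s[:n_components] ** 2 / (n_frames - 1)
    return projections, variances


# Synthetic stand-in for backbone coordinates (hypothetical data, not MD output).
rng = np.random.default_rng(0)
traj = rng.normal(size=(200, 10, 3))

proj, var = pca_project(traj)
cov = np.cov(proj, rowvar=False)  # 2 x 2 covariance of the projections
trace = np.trace(cov)             # total spread in the PC1-PC2 plane
print(proj.shape, trace)
```

With real data, `traj` would hold the backbone atom coordinates extracted from the simulation; a larger trace indicates that the structure explores a wider region of the plane spanned by the two largest principal components.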

Uploaded: 23/10/2023, 14:38
