Graduation thesis: Predicting Protein-Ligand Binding Affinity Using Atomic-Level Descriptors and Convolutional Networks

This study focuses on a novel deep learning architecture, known as Atomic Convolutional Networks (ACNs), designed to model the intricate interactions between proteins and ligands.

Definition of Ligand

In biochemistry, a ligand is defined as a molecule or atom that binds reversibly to a biomolecule, typically a protein, to serve a biological purpose. The term "ligand" originates from the Latin word "ligare", which means "to bind".

When a ligand binds to a protein, it forms a complex with the protein through intermolecular forces such as ionic bonds, hydrogen bonds, and van der Waals forces. This binding interaction can change the protein's conformation or activity, leading to various biological effects. Ligands are diverse in nature, ranging from small molecules such as hormones, neurotransmitters, drugs, and metabolites to larger molecules such as antibodies. They can also include ions or atoms that bind to a central metal atom in a coordination compound.

Ligands can be classified into three major types based on the number of binding sites with the central metal atom, charge, and size:
• Monodentate Ligands: Monodentate ligands are capable of binding to a central metal atom or ion through a single donor atom. Examples of monodentate ligands include chloride ions (chloro), water (aqua), hydroxide ions (hydroxo), and ammonia (ammine).
• Bidentate Ligands: Bidentate ligands have two donor atoms that can bind to a central metal atom or ion simultaneously. The binding of bidentate ligands forms a chelate complex. An example of a bidentate ligand is ethylenediamine (en).
• Polydentate Ligands: Polydentate ligands, also known as multidentate ligands, can donate multiple electron pairs to a central metal ion in a coordination complex. These ligands have multiple binding sites, which allow them to form multiple bonds with the metal ion simultaneously.

The binding of a ligand to a protein is typically reversible, meaning that the ligand can associate with and dissociate from the protein. In some cases ligand binding can be irreversible, but this is atypical in biological systems.

Figure: Monodentate ligands (water, ammonia, cyanide, chloride, pyridine).

Protein-Ligand Interactions

Protein-ligand binding refers to the formation of a complex between a protein and a ligand. A ligand is a molecule that binds to a protein with high affinity and specificity. This binding is based on molecular recognition between the protein and the ligand, where the ligand interacts with the protein through various chemical interactions such as hydrogen bonding, electrostatic interactions, and hydrophobic interactions [19].

The binding is driven by thermodynamics, with factors such as affinity, specificity, and conformational changes influencing the interaction. It is a crucial process in many biological functions, including signal transduction, enzyme catalysis, and drug action. Understanding the mechanisms of protein-ligand interactions is important for elucidating the underlying biological processes and for the development of therapeutic interventions.

2.2.6.2 Factors Influencing Protein-Ligand Binding Affinity

Protein-ligand binding affinity refers to the strength of the interaction between a protein and a ligand molecule. Understanding the characteristics of protein-ligand binding affinity is crucial in drug design and discovery processes.

The factors influencing protein-ligand binding affinity are diverse and complex. Key factors that contribute to the binding affinity include:
• Structural Complementarity: The shape, size, and electrostatic properties of the protein's binding site should be complementary to the ligand's structure for optimal binding.
• Flexibility and Mobility: The flexibility and mobility of both the protein and the ligand can influence binding affinity. The conformational changes and dynamics of the bound ligand can affect the binding affinity [19],[20].
• Hydrophobic and Hydrophilic Interactions: Hydrophobic interactions between nonpolar regions of the protein and ligand, as well as hydrophilic interactions between polar regions, contribute to binding affinity.
• Specific Interactions: Specific interactions, such as hydrogen bonding, salt bridges, and van der Waals forces, between the protein and ligand molecules enhance binding affinity [20].
• Binding Energy: The binding energy, which represents the energy released or absorbed during the formation of the protein-ligand complex, is a crucial factor in binding affinity. A more negative binding energy indicates a stronger binding affinity [18].
• Concentration and Equilibrium: The concentration of the protein and ligand can affect binding affinity. The equilibrium between the free and bound forms of the protein-ligand complex is influenced by the concentrations of the interacting molecules [21].
• Binding Site Accessibility: The accessibility of the binding site on the protein can impact binding affinity. If the binding site is buried or obstructed, binding may be weakened.
• Chemical Properties: The chemical properties of the ligand, such as its size, charge, and polarity, can influence binding affinity. Ligands with favorable chemical properties for the protein's binding site are more likely to have higher binding affinity.
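The binding energy and equilibrium factors listed above can be made concrete with a small calculation. The following is a minimal sketch, assuming simple 1:1 binding, a standard concentration of 1 mol/L, and a temperature of 298.15 K; the function names are illustrative and are not part of the thesis code.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd given in mol/L.

    Uses dG = R * T * ln(Kd / 1 M); a smaller Kd gives a more negative dG.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)

def p_affinity(kd_molar):
    """-log10 of Kd (or Ki) in mol/L, the kind of label reported as -logKd/Ki."""
    return -math.log10(kd_molar)

def fraction_bound(ligand_molar, kd_molar):
    """Fraction of protein occupied at free ligand concentration [L] (1:1 binding)."""
    return ligand_molar / (kd_molar + ligand_molar)

# Example: a 1 nM binder compared with a 1 uM binder.
for kd in (1e-9, 1e-6):
    print(kd, round(delta_g_from_kd(kd), 2), round(p_affinity(kd), 2))
```

Under these assumptions, a thousandfold decrease in Kd raises -log10(Kd) by 3 units, and the protein is half occupied when the free ligand concentration equals Kd.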

2.3 Theoretical Basis of Atomic Convolutional Networks

Theory of ACNs Model

Atomic Convolutional Networks (ACNs) are a type of neural network architecture that is specifically designed for tasks involving atomic-level data, such as predicting protein-ligand binding affinity or analyzing materials science properties. ACNs are end-to-end and fully differentiable, meaning that they can be trained using backpropagation and optimized using gradient-based methods.

The fundamental idea behind ACNs is that each atom has its own characteristics, which serve as inputs to a neural network and are combined through convolution operations into feature maps from which meaningful information can be extracted. These feature maps make it possible to identify correlations between different atom types even when no prior knowledge about them is available, which may help uncover new properties or phenomena in materials science, such as catalysis or energy storage, without extensive experimental studies beforehand. In addition, because each atom acts as its own node within the network, the model design is flexible: the way the input features interact can be customized, which can lead to higher accuracy than traditional methods.

Finally, because they can handle large amounts of data while still producing accurate predictions quickly, ACNs have become increasingly popular among researchers in many disciplines, offering insight into their fields of study through machine learning while saving time, money, and resources. As the technology advances, the capabilities of these networks are expected to grow as well.

Gomes J. et al. [1] introduced two novel primitive convolutional operations: atom type convolution and radial pooling. The atom type convolution utilizes a neighbor-listed distance matrix to extract features that encode local chemical environments from an input representation (Cartesian atomic coordinates) without relying on spatial locality.

Figure 2-7 Schematic of atomic convolution layer.

Distance matrix and neighbor list construction. The method first constructs the distance matrix R and the atomic number matrix Z from the Cartesian atomic coordinates X. To reduce complexity, a neighbor list routine is employed, reducing the cost of this step from O(N²) to O(NM), where M represents the maximum number of neighbors. The neighbor list routine uses a radial interaction cutoff of 12 Å and truncates the neighbor list at a maximum number of neighbors, typically 12. The initial neighbor-listed distance matrix representation is invariant to rigid-body translation and rotation of the molecular system, but not to permutation of the atom index or of the neighbor list atom index.

The input for distance matrix construction is a coordinate matrix C with dimensions (N, 3). This matrix is then transformed into a neighbor-listed matrix R with dimensions (N, M). During the neighbor listing process, the M atoms that are spatially closest to atom i are identified.

Let $N_i = [a_{i,1}, \ldots, a_{i,M}]$ be the list of neighbors of atom $i$. Then $R_{i,j}$ is defined as

$$R_{i,j} = \lVert C_i - C_{a_{i,j}} \rVert_2$$

The neighbor listing operation also constructs, from the (N, 1) atomic number vector Z, an (N, M) matrix Z which lists the atomic numbers of the neighboring atoms (atom types):

$$Z_{i,j} = \text{atom type of } a_{i,j}$$
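The neighbor listing step can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: for clarity it builds the full N×N distance matrix (so it costs O(N²) rather than O(NM)), and it assumes that empty neighbor slots are padded with 0. The function name `neighbor_list` is hypothetical.

```python
import numpy as np

def neighbor_list(coords, atomic_numbers, max_neighbors=12, cutoff=12.0):
    """Return (R, Z), both of shape (N, M).

    R[i, j] is the distance from atom i to its j-th nearest neighbor,
    Z[i, j] is that neighbor's atomic number. Slots without a neighbor
    inside the cutoff are filled with 0 (an assumption of this sketch).
    """
    coords = np.asarray(coords, dtype=float)
    atomic_numbers = np.asarray(atomic_numbers)
    n = coords.shape[0]

    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)      # dense (N, N) distance matrix
    np.fill_diagonal(dist, np.inf)            # an atom is not its own neighbor
    dist[dist > cutoff] = np.inf              # apply the radial cutoff

    order = np.argsort(dist, axis=1)[:, :max_neighbors]
    nearest = np.take_along_axis(dist, order, axis=1)
    valid = np.isfinite(nearest)

    m = order.shape[1]                        # equals min(N, max_neighbors)
    R = np.zeros((n, max_neighbors))
    Z = np.zeros((n, max_neighbors), dtype=int)
    R[:, :m] = np.where(valid, nearest, 0.0)
    Z[:, :m] = np.where(valid, atomic_numbers[order], 0)
    return R, Z

# Usage: a water-like geometry (O, H, H) with at most two neighbors per atom.
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
R, Z = neighbor_list(coords, [8, 1, 1], max_neighbors=2)
```

Because R depends only on interatomic distances, translating or rotating `coords` leaves it unchanged, which matches the invariance described above.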

Atom type convolution. The output of the atom type convolution is generated using the distance matrix R and the atomic number matrix Z. The matrix R is passed through a (1×1) filter with a stride of 1 and a depth of $N_{at}$, where $N_{at}$ is the total number of unique atomic numbers (atom types) in the molecular system. The atom type convolution kernel is implemented as a step function that operates on the neighbor distance matrix R:

$$(K * R)^{a}_{i,j} = R_{i,j} K^{a}_{i,j}, \qquad K^{a}_{i,j} = \begin{cases} 1 & \text{if } Z_{i,j} = N_a \\ 0 & \text{otherwise} \end{cases}$$

Here, $N_a$ represents the atomic number of atom type $a$ (with $a$ ranging from 1 to $N_{at}$). The atom type convolution layer applies this convolution to the neighbor distance matrix, resulting in an output matrix E with dimensions (N, M, $N_{at}$). The atom type convolution can also be thought of as an expansion layer that one-hot encodes the atom type $N_a$ into separate copies of the distance matrix R.
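The expansion described above amounts to masking the distance matrix once per atom type. The sketch below is a minimal NumPy illustration under that reading; the function name and the example values are hypothetical.

```python
import numpy as np

def atom_type_convolution(R, Z, atom_types):
    """Expand (N, M) neighbor distances into E of shape (N, M, N_at).

    E[i, j, a] = R[i, j] if Z[i, j] == atom_types[a], and 0 otherwise.
    """
    R = np.asarray(R, dtype=float)
    Z = np.asarray(Z)
    atom_types = np.asarray(atom_types)
    # Boolean one-hot mask over the atom-type axis, shape (N, M, N_at).
    mask = Z[:, :, None] == atom_types[None, None, :]
    return R[:, :, None] * mask

# Usage: two atoms with two neighbors each, atom types H (1) and O (8).
R = np.array([[0.96, 0.97], [0.96, 1.52]])
Z = np.array([[1, 1], [8, 1]])
E = atom_type_convolution(R, Z, atom_types=[1, 8])
```

Each slice `E[:, :, a]` is a copy of R in which only the neighbors of type `a` are kept, so summing E over the last axis recovers R whenever every neighbor's type is in `atom_types`.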

Radial pooling layer. Radial pooling is a process used to reduce the dimensionality of the output obtained from the atom type convolution. This dimensionality reduction helps prevent overfitting by providing a more abstract representation through feature binning, and it reduces the number of parameters that need to be learned. In addition, radial pooling ensures that the output representation remains invariant to the permutation of atom indices in the neighbor list. During radial pooling, a radial filter is applied to non-overlapping sub-regions of the input representation. Mathematically, the radial pooling layers pool over tensor slices (receptive fields) of size (1×M×1), with a stride of 1 and a depth of $N_r$, where $N_r$ represents the number of desired radial filters.

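The functional form $f_s$ is used for the radial pooling filters:

$$f_s(r_{ij}) = \exp\left(-\frac{(r_{ij} - r_s)^2}{\sigma_s^2}\right) f_c(r_{ij}), \qquad f_c(r_{ij}) = \begin{cases} \frac{1}{2}\left[\cos\left(\frac{\pi r_{ij}}{R_c}\right) + 1\right] & 0 < r_{ij} < R_c \\ 0 & \text{otherwise} \end{cases}$$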

The parameters $r_s$ and $\sigma_s$ are trainable parameters of the pooling function $f_s$. The parameter $R_c$ is the radial interaction cutoff, which is fixed to 12 Å. The resulting pooled matrix P has a shape of (N, $N_{at}$, $N_r$) and contains entries given by the following equation:

$$P_{i,n_a,n_r} = \beta_{n_r} \sum_{j=1}^{M} f_{n_r}(E_{i,j,n_a}) + b_{n_r}$$

Within the equation, $\beta_{n_r}$ represents the non-learnable scaling constant and $b_{n_r}$ represents a non-learnable bias constant. Conceptually, applying radial pooling following an atom type convolution layer produces features which sum the pairwise interactions between atom $i$ of atom type $a_i$ (e.g., H, C, N, etc.) and all adjacent atoms of type $a_j$ (e.g., H-H, H-C, H-N, etc.).
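Radial pooling can be sketched as a sum of radial filter responses over the neighbor axis. The code below is a minimal illustration, assuming a Gaussian filter multiplied by a cosine cutoff of the form 0.5·(cos(πr/R_c) + 1); that exact cutoff form, the function names, and the parameter values are assumptions of this sketch rather than details confirmed by the text.

```python
import numpy as np

def radial_filter(r, r_s, sigma_s, r_c=12.0):
    """Gaussian centered at r_s with width sigma_s, damped by a cosine cutoff.

    Returns 0 outside 0 < r < r_c, so zero-padded entries contribute nothing.
    """
    gauss = np.exp(-((r - r_s) ** 2) / sigma_s ** 2)
    cutoff = 0.5 * (np.cos(np.pi * r / r_c) + 1.0)
    return np.where((r > 0) & (r < r_c), gauss * cutoff, 0.0)

def radial_pooling(E, r_s, sigma_s, beta, bias, r_c=12.0):
    """Pool E of shape (N, M, N_at) into P of shape (N, N_at, N_r).

    r_s, sigma_s, beta and bias each hold one value per radial filter (N_r,).
    """
    E = np.asarray(E, dtype=float)[..., None]                          # (N, M, N_at, 1)
    f = radial_filter(E, np.asarray(r_s), np.asarray(sigma_s), r_c)    # (N, M, N_at, N_r)
    return np.asarray(beta) * f.sum(axis=1) + np.asarray(bias)         # sum over neighbors j

# Usage with random distances and two radial filters.
rng = np.random.default_rng(0)
E = rng.uniform(0.5, 11.0, size=(4, 5, 3))
P = radial_pooling(E, r_s=[1.0, 3.0], sigma_s=[1.0, 2.0], beta=[1.0, 1.0], bias=[0.0, 0.0])
```

Because the neighbor axis is summed out, reordering the neighbors of an atom does not change P, which is the permutation invariance mentioned above.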

Atomistic fully connected network. The shape of the output P from the radial pooling layer is (N, $N_{at}$, $N_r$). To obtain a tensor with the features flattened for each atom, the authors reshape it into a tensor of shape (N, $N_{at} \cdot N_r$). To stack atomic convolution layers, the flattened output from the radial pooling layer is fed back into the atom type convolution operation. Finally, the tensor is fed into a fully connected network in a row-wise manner, with each row representing an atom.

The same fully connected weights and biases are utilized for each atom in a given molecule. The output of the atomistic fully connected network corresponds to the energy $E_i$ associated with atom $i$ ($i = 1, \ldots, N$). The total energy of the molecule, $E = \sum_i E_i$, is the sum of the atomic energies and is consequently invariant to atom index permutation. The input dimension of the fully connected network in an ACNN model is solely determined by the number of features, not the number of atoms; as a result, a fully trained ACNN model has the ability to generalize to larger systems beyond those present in the training set, as long as the number of atom types and radial pooling filters remain fixed.
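The weight sharing and the sum over atomic energies can be illustrated with a small NumPy sketch. It assumes a single hidden layer with ReLU activation and randomly initialized weights; the layer sizes and function names are hypothetical and chosen only to show the shapes involved.

```python
import numpy as np

def init_params(n_features, hidden=8, seed=0):
    """Random weights for a one-hidden-layer network shared by all atoms."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_features, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.1, size=(hidden, 1)),
        "b2": np.zeros(1),
    }

def atomic_energies(P, params):
    """Map pooled features P of shape (N, N_at, N_r) to per-atom energies (N,)."""
    x = P.reshape(P.shape[0], -1)                          # flatten to (N, N_at * N_r)
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)   # ReLU, same weights per row
    return (h @ params["W2"] + params["b2"])[:, 0]

def total_energy(P, params):
    """Total energy E as the sum of the atomic energies E_i."""
    return float(atomic_energies(P, params).sum())

# Usage: the same parameters handle systems with different numbers of atoms.
params = init_params(n_features=3 * 2)
rng = np.random.default_rng(1)
P_small = rng.uniform(size=(4, 3, 2))
P_large = rng.uniform(size=(7, 3, 2))
e_small, e_large = total_energy(P_small, params), total_energy(P_large, params)
```

The weight matrices depend only on the feature dimension N_at·N_r, so the number of atoms N can vary freely between inputs, and reordering the rows of P leaves the total unchanged.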

Application of ACNs in Predicting Protein-Ligand Binding Affinity

The architecture of the atomic convolution network generates an energy that is both size-extensive and differentiable with respect to atomic positions [1]. To capture non-covalent interactions using the atomic convolution energy function, the authors incorporate the following thermodynamic cycle into the learning process:

$$\Delta G_{complex} = G_{complex} - G_{protein} - G_{ligand}$$

To create a model that can accurately predict $\Delta G_{complex}$, the authors build a system containing three weight-sharing replica networks, one each for the complex, the protein, and the ligand, which is trained to minimize the loss (Figure 2-12).

It is important to highlight that the thermodynamic cycle is integrated as a subcomponent within the complete network. This end-to-end system is trained to predict $\Delta G$ while respecting the underlying binding thermodynamics.
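The thermodynamic cycle can be sketched as one energy function, with one shared set of parameters, evaluated on the complex, the protein, and the ligand. In the sketch below a toy pairwise energy stands in for the atomic convolution network; `toy_energy`, its parameters, and the coordinates are illustrative assumptions only.

```python
import numpy as np

def toy_energy(coords, params, cutoff=12.0):
    """Stand-in energy: a sum of Gaussian pair terms within the cutoff."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    pairs = dist[np.triu_indices(len(coords), k=1)]   # each pair counted once
    pairs = pairs[pairs < cutoff]
    return float(np.sum(params["w"] * np.exp(-(pairs / params["s"]) ** 2)))

def binding_free_energy(protein, ligand, params, energy_fn=toy_energy):
    """dG_complex = G_complex - G_protein - G_ligand with shared parameters."""
    complex_coords = np.vstack([protein, ligand])
    return (energy_fn(complex_coords, params)
            - energy_fn(protein, params)
            - energy_fn(ligand, params))

# Usage: the same `params` object is used for all three evaluations.
rng = np.random.default_rng(0)
params = {"w": -1.0, "s": 3.0}
protein = rng.uniform(0.0, 5.0, size=(5, 3))
ligand = rng.uniform(0.0, 5.0, size=(3, 3)) + np.array([3.0, 0.0, 0.0])
dg_near = binding_free_energy(protein, ligand, params)
dg_far = binding_free_energy(protein, ligand + 100.0, params)
```

With this construction the intramolecular terms cancel, so only protein-ligand cross terms remain; when the ligand is moved beyond the cutoff the predicted value goes to zero, which is the behavior a size-extensive energy should show.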

EXPERIMENTAL RESEARCH

PDBBind-CN Website

Website full name: Protein-Ligand Binding Affinity Database of China, abbreviated as PDBBind-CN [2].

Located at www.pdbbind.org.cn, this is the website of the PDBBind-CN database, a comprehensive database of experimentally measured protein-ligand binding data. This database was developed by Professor Renxiao Wang's group at Fudan University in China.

The site offers several key features, including:
• Access to protein-ligand binding data: The PDBBind-CN database contains over 100,000 protein-ligand binding data points, including binding values, melting temperatures, and other information.
• Access to protein-ligand 3D structures: The database also provides access to the 3D structures of protein-ligand complexes.
• Download of data and structure files: Users can download data and structure files from the database.
• Data search: The site provides search tools to help users search for protein-ligand binding data.
• User support: The site provides support to users, including user guides, FAQs, and contact with the development team.

The website serves as a valuable asset for researchers in the fields of protein science, pharmacology, and molecular biology. It offers an extensive and trustworthy dataset of protein-ligand binding information, which can be utilized for the development of prediction models for protein-ligand binding.

3.1.2 UniProt and RCSB PDB website

UniProt [3] is a crucial and widely respected repository of protein knowledge. It is regarded as the largest centralized database that provides comprehensive details about protein sequences, functional information, and associated materials. UniProt's data encompasses diverse information, including protein structure, genetic details, functionality, protein-protein interactions, disease insights, and more.

UniProt is a project developed and maintained by a consortium rather than by a single individual or organization. Key entities involved in the development and maintenance of UniProt comprise:
• European Bioinformatics Institute (EBI): EBI is part of the European Molecular Biology Laboratory (EMBL) and is based near Cambridge, UK. EBI is responsible for the management and development of UniProtKB, an important part of the UniProt project.
• Swiss Institute of Bioinformatics (SIB): SIB is a national organization specializing in research and services in the fields of computational biology and structural biology. SIB is responsible for the management and development of UniProtKB.
• Protein Information Resource (PIR): PIR is an organization at Georgetown University, USA, specializing in research and providing services related to protein information. PIR has contributed important data and knowledge to the UniProt project.

The global community of researchers, molecular biology experts, and other organizations also actively contributes information and knowledge to UniProt through the validation and data provision process. This collaborative effort plays a pivotal role in ensuring the accuracy and reliability of information on UniProt.

The UniProt website provides users with access to protein information through a user-friendly interface. Users can conveniently search by protein name, UniProt ID, gene name, gene product name, protein structure, and various other search criteria.

UniProt plays a significant role in advancing research and applications in molecular biology, medicine, and numerous other fields. Researchers, students, and professionals in the field of proteomics frequently rely on UniProt to gain insights into the properties and functions of proteins, as well as their interactions.

Figure 3-3 Logo of RCSB PDB.

RCSB PDB [4] (RCSB.org) is the US data center for the global Protein Data Bank (PDB) archive of 3D structure data for large biological molecules (proteins, DNA, and RNA) essential for research and education in fundamental biology, health, energy, and biotechnology.

The Protein Data Bank (PDB) was established as the first open-access digital data resource in all of biology and medicine. It is today a leading global resource for experimental data central to scientific discovery.

Through an internet information portal and downloadable data archive, PDB provides access to 3D structure data for the molecules of life, found in all organisms on the planet.

Understanding the 3D structure of a biological macromolecule is crucial for comprehending its impact on human and animal health, its role in plant life, food and energy production, as well as its significance in other areas related to global well-being and longevity.

The vast repository of 3D structure information housed in the PDB has been fundamental in driving notable progress in our comprehension of protein architecture. This has led to recent breakthroughs in protein structure prediction, propelled by artificial intelligence techniques and deep or machine learning methods.

We have selected the PDBbind v2013_core_set for constructing our models predicting protein-ligand binding affinity.

The downloaded data includes 195 folders.

Each folder corresponds to one protein bound to one ligand (a protein-ligand complex).

Here we use these 3 files: ligand.sdf, pocket.pdb and protein.pdb
• Ligand.sdf: This file contains information about the 3D structure of a ligand molecule.
• Protein.pdb: A standard format for storing 3D structural information of proteins.
• Pocket.pdb: This file contains information about a specific binding site on the protein, called a "binding pocket". A binding pocket is a region on a protein that can bind to another molecule, such as a drug molecule.
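A small helper can collect the three files of one complex. This is a minimal sketch that assumes each folder is named after the PDB code and that files follow the pattern `<pdb_id>_ligand.sdf`, `<pdb_id>_pocket.pdb` and `<pdb_id>_protein.pdb`; the root directory name used in the example is hypothetical.

```python
from pathlib import Path

def complex_files(root, pdb_id):
    """Return the expected ligand, pocket and protein paths for one complex."""
    folder = Path(root) / pdb_id
    return {
        "ligand": folder / f"{pdb_id}_ligand.sdf",
        "pocket": folder / f"{pdb_id}_pocket.pdb",
        "protein": folder / f"{pdb_id}_protein.pdb",
    }

def missing_files(root, pdb_id):
    """List which of the three files are absent on disk."""
    return [name for name, path in complex_files(root, pdb_id).items()
            if not path.exists()]

# Usage: "v2013-core" is a placeholder for wherever the core set was unpacked.
files = complex_files("v2013-core", "1a30")
```

Checking `missing_files` for every folder before featurization is one way to catch incomplete downloads early.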

Each Protein file has more than 3,000 lines of data with 12 columns.

Each Pocket file has about 190 lines of data with 11 columns.

Each Ligand file has about 150 lines of data with 8 columns.

Protein PDB ID (protein_pdb_id): 1A30

This is the unique identifier for the protein structure in the Protein Data Bank (PDB).

REMARK GENERATED BY X-TOOL on Mon Nov 18 12:13:08 2013

Figure 3-4 Data in Protein file.

REMARK GENERATED BY X-TOOL on Mon Nov 18 12:13:00 2013

Figure 3-5 Data in Pocket file.

The Protein file provides comprehensive data on the entire protein structure, whereas the Pocket file is a subset of the Protein file, focusing on the relevant atoms and residues within the binding site or relevant functional regions. The Pocket file does not include Chain identifier A.

We have extracted and analyzed Row 1 from the Protein file, revealing detailed information about a specific atom:
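Reading one atom row can be sketched with fixed-column slicing, following the column layout of ATOM/HETATM records in the PDB file format. The example line below is a made-up record used only to demonstrate the parser; it is not taken from the 1A30 file, and the function name is hypothetical.

```python
def parse_atom_record(line):
    """Split one ATOM/HETATM line of a PDB file into named fields."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "record": line[0:6].strip(),         # columns 1-6
        "serial": int(line[6:11]),           # columns 7-11
        "atom_name": line[12:16].strip(),    # columns 13-16
        "res_name": line[17:20].strip(),     # columns 18-20
        "chain_id": line[21].strip(),        # column 22 (empty string if blank)
        "res_seq": int(line[22:26]),         # columns 23-26
        "x": float(line[30:38]),             # columns 31-38
        "y": float(line[38:46]),             # columns 39-46
        "z": float(line[46:54]),             # columns 47-54
        "occupancy": float(line[54:60]),     # columns 55-60
        "temp_factor": float(line[60:66]),   # columns 61-66
        "element": line[76:78].strip(),      # columns 77-78
    }

# Hypothetical record, written field by field so the column widths are visible.
EXAMPLE_LINE = (
    "ATOM  " "    1" " " " N  " " " "PRO" " " "A" "   1" " " "   "
    "  10.000" "  20.000" "  30.000" "  1.00" " 25.00" "          " " N"
)
atom = parse_atom_record(EXAMPLE_LINE)
```

For a record whose chain identifier column is blank, as described for the Pocket file, `chain_id` simply comes back as an empty string.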

Contributions and Thesis Results

3D Visualization and Rendering

Here we use code snippets to render images of the protein and ligand from the PDBbind data set.

First, we install DeepChem by executing the provided installation cell, and then import the required libraries:

    import os

    from rdkit import Chem
    from rdkit.Chem import AllChem
    import deepchem as dc
    from deepchem.utils import download_url, load_from_disk

After the installation, we load the "pdbbind_core_df.csv.gz" file from PDBbind; this file contains protein-ligand complexes.

Each row represents a complex with the following information:
• Unique complex identifier
• Ligand SMILES string
• Binding affinity (Ki)
• Protein and ligand PDB file lines (as Python lists)

    # Location of the cached dataset file inside DeepChem's data directory.
    data_dir = dc.utils.get_data_dir()
    dataset_file = os.path.join(data_dir, "pdbbind_core_df.csv.gz")

    if not os.path.exists(dataset_file):
        print('File does not exist. Downloading file...')
        download_url("https://s3-us-west-1.amazonaws.com/deepchem.io/datasets/pdbbind_core_df.csv.gz")
        print('File downloaded...')

    raw_dataset = load_from_disk(dataset_file)
    raw_dataset = raw_dataset[['pdb_id', 'smiles', 'label']]

    from openmm.app import PDBFile
    from pdbfixer import PDBFixer
    from deepchem.utils.vina_utils import prepare_inputs

Next, we perform the following steps:

1. Extract protein and ligand information from the dataset.

2. Rectify errors in the protein PDB.

3. Optimize ligand geometry for docking.

4. Eliminate errors in protein and ligand molecules to ensure compatibility with [...]

5. Generate separate PDB files for proteins and ligands, facilitating visualization.

    pdbid = raw_dataset['pdb_id'].iloc[1]
    ligand = raw_dataset['smiles'].iloc[1]

    # Download the structure by PDB code and write it to a local file.
    fixer = PDBFixer(pdbid=pdbid)
    PDBFile.writeFile(fixer.topology, fixer.positions, open('%s.pdb' % (pdbid), 'w'))

    p, m = None, None
    # fix protein, optimize ligand geometry, and sanitize molecules
    try:
        p, m = prepare_inputs('%s.pdb' % (pdbid), ligand)
    except Exception:
        print('%s failed PDB fixing' % (pdbid))

    if p and m:  # protein and molecule are readable by RDKit
        print(pdbid, p.GetNumAtoms())
        Chem.rdmolfiles.MolToPDBFile(p, 'display/protein_%s.pdb' % (pdbid))
        Chem.rdmolfiles.MolToPDBFile(m, 'display/ligand_%s.pdb' % (pdbid))

After that, we use the mdtraj library to load previously prepared protein and ligand PDBs, then the nglview library to display protein and ligand molecules in 3D.

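    import mdtraj as md
    from nglview import show_mdtraj
    from IPython.display import display

    protein_mdtraj = md.load_pdb('display/protein_%s.pdb' % (pdbid))
    ligand_mdtraj = md.load_pdb('display/ligand_%s.pdb' % (pdbid))
    p = show_mdtraj(protein_mdtraj)
    l = show_mdtraj(ligand_mdtraj)

Finally, we obtain the results shown in Figures 3-14 and 3-15:

    display(p)

Figure 3-12 Visualization of a protein

    display(l)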

In order to visualize the molecules, we have performed the following steps:
• Load the prepared protein and ligand molecules using the mdtraj library.
• Employ the nglview library to visualize the molecules in a 3D format.
• Save the 3D images of the protein-ligand complexes as HTML files.

The purpose of this step is to prepare the data and create 3D images of protein-ligand complexes for subsequent analysis and modeling; the generated 3D images let users view the molecular structures directly.

    from IPython.display import display
    from nglview import show_mdtraj, write_html

    for i in range(1, len(raw_dataset)):
        pdbid = raw_dataset['pdb_id'].iloc[i]
        ligand = raw_dataset['smiles'].iloc[i]
        print(i)
        print(pdbid)
        # Skip complexes that are excluded from processing.
        if pdbid in ["2zjw", "1e66", "1r5y", "3su2", "1hfs"]:
            continue
        fixer = PDBFixer(pdbid=pdbid)
        # Write the downloaded structure so that prepare_inputs can read it
        # (this line is not visible in the source listing and is assumed).
        PDBFile.writeFile(fixer.topology, fixer.positions,
                          open('templates/display/%s.pdb' % (pdbid), 'w'))
        p, m = None, None
        try:
            print(pdbid, ligand)
            p, m = prepare_inputs('templates/display/%s.pdb' % (pdbid), ligand)
        except Exception:
            print('%s failed PDB fixing' % (pdbid))
        if p and m:
            Chem.rdmolfiles.MolToPDBFile(p, 'templates/display/protein_%s.pdb' % (pdbid))
            Chem.rdmolfiles.MolToPDBFile(m, 'templates/display/ligand_%s.pdb' % (pdbid))
            protein_mdtraj = md.load_pdb('templates/display/protein_%s.pdb' % (pdbid))
            ligand_mdtraj = md.load_pdb('templates/display/ligand_%s.pdb' % (pdbid))
            p = show_mdtraj(protein_mdtraj)
            l = show_mdtraj(ligand_mdtraj)
            # Save each 3D view as a standalone HTML file.
            write_html("templates/display/protein_%s.html" % (pdbid), [p])
            write_html("templates/display/ligand_%s.html" % (pdbid), [l])

This step provides visualization, supports model training, and contributes to the overall improvement of the model on the website we are collaborating with.

In this thesis, we deploy the developed model on a website to predict the binding affinity between new proteins and ligands from the dataset.

When users make POST requests with the Ligand_id and Protein_id, which are located in the LigandInfo and ProteinInfo tables, we validate the selected ligand and protein in our database using ligand_name and protein_name. Following this, we retrieve the pdb_codes for the ligand and protein and use an API call to obtain the model's prediction results for Log(Kd/Ki). Finally, we present this information to the users.

    @login_required
    def home():
        lid = request.form.get('ligand')
        pid = request.form.get('protein')
        ligands = LingadInfo.query.all()
        lproteins = ProteinInfo.query.all()
        if request.method == 'POST':
            selected_ligand = LingadInfo.query.filter_by(id=lid).first()
            selected_protein = ProteinInfo.query.filter_by(id=pid).first()
            # Look up the ligand and protein records by name.
            ligand = (
                db.session.query(Lingads)
                .filter(or_(Lingads.ligand_name.ilike(f'%{selected_ligand.code}%')))
                .first()
            )
            protein = (
                db.session.query(Proteins)
                .filter(or_(Proteins.protein_name.ilike(f'%{selected_protein.name}%')))
                .first()
            )
            if ligand is None or protein is None:
                # Any further arguments of this call are not recoverable from the source.
                return render_template('home.html', ligands=ligands, proteins=lproteins,
                                       lcode=lid, pid=pid)
            # `url` is the prediction API endpoint; its definition is not shown in the source.
            response = requests.request("GET", url, headers={
                'Content-Type': 'application/json'
            })
            try:
                json_data = response.json()
                result_value = round(json_data['result'][0], 2)
            except Exception:
                return render_template('home.html', ligands=ligands, proteins=lproteins,
                                       lcode=lid, pid=pid,
                                       selected_ligand=selected_ligand,
                                       selected_protein=selected_protein,
                                       error="Model run fail")
            return render_template('home.html', ligands=ligands, proteins=lproteins,
                                   lcode=lid, pid=pid,
                                   selected_ligand=selected_ligand,
                                   selected_protein=selected_protein,
                                   lpdb_code=ligand[0], ppdb_code=protein[0],
                                   result=result_value)
        return render_template('home.html', ligands=ligands, proteins=lproteins,
                               lcode=lid, pid=pid)

Here we supply the input X: the two files, protein and ligand, are combined and then featurized into vectors that serve as the model input for predicting the binding strength between protein and ligand.

The call "npa = acm.predict(input)" produces the output that is returned to the website:

    ligand_pdbcode = request.args.get("ligand")
    protein_pdbcode = request.args.get("protein")
    if ligand_pdbcode is None or protein_pdbcode is None:
        # Fall back to a default complex when no codes are supplied.
        ligand_pdbcode = "3zsx"
        protein_pdbcode = "3zsx"
    # The source builds the input from "..._pocket.pdb" % protein_pdbcode and
    # "..._ligand.sdf" % ligand_pdbcode; the path prefixes and the featurization
    # call are not recoverable from the listing.
    npa = acm.predict(input)
    # Construction of response_data from npa is not shown in the source.
    response_data = jsonify(response_data)

• We are predicting the binding strength between 1 protein and 1 ligand, both of which share the same PDB code.

In Figure 3-14, the predicted binding affinity (-logKD/Ki value) for the protein-ligand interaction associated with the PDB code 2w66 is displayed. Following this, Figure 3-15 provides detailed information about the protein associated with the PDB code 2w66. Correspondingly, Figure 3-16 displays the -logKD/Ki values and the respective ligand names for the PDB code 2w66.

Examining this figure allows us to connect the protein information from Figure 3-18 with the predicted binding strengths presented in Figure 3-17.

Info: Function or role in biological or chemical systems. Methods of synthesis or isolation. Physical and chemical properties. Conclusion: N-[(3R,4S,5R,6R,7R)-3,5,6-trihydroxy-7-(hydroxymethyl)azepan-4-yl]acetamide is a novel compound that has not yet been studied in detail. It is a 7-membered nitrogen-containing ring with multiple hydroxyl groups and an acetamidate group. The biological or chemical function of this compound is not yet known. Potential Applications: Based on its structure, N-[(3R,4S,5R,6R,7R)-3,5,6-trihydroxy-7-(hydroxymethyl)azepan-4-yl]acetamide could have a variety of potential applications. For example, it could be used as: a drug or therapeutic agent; a reagent in chemical synthesis; a solvent or additive.

BT_4395

Info: O-GlcNAcase BT_4395: Demystifying a Bacterial Enzyme. O-GlcNAcase BT_4395, found in the bacterium Bacteroides thetaiotaomicron, might sound like a mouthful, but its function is intriguing. Here's a breakdown of what we know about this crucial enzyme: Function: O-GlcNAcase BT_4395 belongs to a family of enzymes called O-GlcNAcases, responsible for removing O-GlcNAc (O-linked N-acetylglucosamine) modifications from proteins. These modifications play diverse roles in cellular processes, ranging from signaling to protein stability. By removing them, BT_4395 likely regulates protein function in B. thetaiotaomicron. Specificity: While most O-GlcNAcases

Figure 3-14 Prediction result of -logKD/Ki for PDB code 2w66

# PDB code, release year, EC number, protein name

Figure 3-15 Protein information for PDB code 2w66

# PDB code, resolution, release year, -logkd/Ki, Kd/Ki, reference, ligand name

Figure 3-16 -logKD/Ki value and ligand name of pdbcode 2w66

Similar for the next three figures.

Info: XOT is a novel ligand that has been shown to bind to the N-terminal active site of angiotensin-converting enzyme (ACE). ACE is a key enzyme involved in the renin-angiotensin system, which plays a role in blood pressure regulation, inflammation, and other important biological processes. XOT has a dissociation constant (Ki) of 12 nM for the N-terminal active site of ACE, which is significantly lower than that of the current standard of care for ACE inhibition, ramipril (Ki = 0.7 uM). This suggests that XOT could be a more potent and effective ACE inhibitor than ramipril. XOT has also been shown to be selective for the N-terminal active site of ACE. It does not inhibit the C-terminal active site of ACE, which is responsible for the degradation of bradykinin. This selectivity could make XOT a safer alternative to ACE inhibitors that inhibit both active sites of ACE. XOT is currently being investigated in preclinical studies for the treatment of hypertension, heart failure, and other diseases associated with the renin-angiotensin system.

Title	Predicting Protein-Ligand binding affinity using atomic-level descriptors and convolutional networks
Authors	Do Hoang Phuc, Le Ngoc Thai Phuong
Supervisors	PhD Do Phuc, MSc Nguyen Thi Kim Phung
University	University of Information Technology
Major	Information Systems
Type	Thesis
Year of publication	2023
City	Ho Chi Minh City
