Crystal structure determination of KREMEN1, DICKKOPF1 and MeCP2

CRYSTAL STRUCTURE DETERMINATION OF KREMEN1, DICKKOPF1 AND MeCP2

VINDHYA B.N.REDDY

A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF BIOLOGICAL SCIENCES
NATIONAL UNIVERSITY OF SINGAPORE
2008

ACKNOWLEDGEMENTS

Firstly, I would like to express my greatest gratitude to my supervisor, Dr. K. Swaminathan, for giving me the opportunity to work on these valuable projects and to gain experience in the field of Structural Biology and X-ray crystallography; for being patient with my flaws; for extending a great deal of technical support and guidance; and for his constant encouragement throughout my tenure as a graduate student. I would like to express my wholehearted thankfulness to him for all the support and knowledge he has imparted to me.

I warmly thank our collaborators from the University of Pennsylvania, Prof. Sarah Miller and Prof. Mariuz Wasik, for helping us with the start of all the projects. I owe my most sincere gratitude to Dr. Davis Ng and Dr. Kazue Kanehara from the Temasek Lifesciences Laboratories, Singapore, for their untiring help with experiments in the yeast system and for clarifying my doubts and queries in that regard.

I would like to thank all my co-workers in the lab and my very good friends, Anupama, Toan, Kuntal, Pankaj, Dileep and Sunil, for all the special moments we have shared and for their kind encouragement, constant help and support. Special thanks to Shiva for the constructive suggestions related to wet-lab experiments and bioinformatics analysis, and for the kind support and help during difficult moments in the lab.

I warmly thank, with best regards, all the members of Structural Biology lab 5 for their comments and expert guidance on wet-lab experiments. I also wish to thank my friend Karthik from SBL-2 for guidance in performing Circular Dichroism experiments and for many valuable suggestions.
Special and heartfelt thanks to all my family members, relatives and friends outside NUS. My parents and sister have provided me with the best long-distance support, love and encouragement possible during my stay in Singapore. The foundation that they have laid for me and their incessant morale boosting cannot be thanked with words. Thanks to my very special friends, Tanushree, Suguna, Kirthan and Nilofer, for being there with me during all the fun-filled and difficult moments in Singapore.

My sincere thanks are due to the thesis committee members, Drs. He Yuehui and Prasanna Kolatkar, for their constructive criticism and excellent advice during the preparation of this thesis.

The financial support from the Department of Biological Sciences, National University of Singapore, is gratefully acknowledged.

TABLE OF CONTENTS

Acknowledgements
Table of contents
Summary
List of abbreviations
List of figures
List of tables

CHAPTER 1 INTRODUCTION
1.1 WNT SIGNALLING PATHWAY: BIOLOGICAL BACKGROUND
1.2 CANONICAL WNT/β-CATENIN PATHWAY
1.3 ANTAGONISTS OF WNT PATHWAY
  1.3.1 Dickkopf-1 protein: significance and characterisation
    1.3.1.1 Homology of Dkk-1 with Colipase family
  1.3.2 Kremen-1: characterisation and biological function
1.4 WNT ANTAGONISTS IN ACTION: INTERACTION OF DKK-1/KRM/LRP5/6
1.5 DNA METHYLATION
  1.5.1 Methyl-CpG binding proteins (MeCP2)
  1.5.2 Structure of MBDs
  1.5.3 MeCP2
  1.5.4 Architecture of MeCP2
  1.5.5 Role of MeCP2 in transcription repression
  1.5.6 MeCP2 and Rett Syndrome
1.6 STRUCTURE DETERMINATION OF PROTEINS
  1.6.1 History and application of macromolecular X-ray crystallography
  1.6.2 Protein crystallisation
1.7 BASIC CONCEPTS IN PROTEIN CRYSTALLOGRAPHY
  1.7.1 Lattices, point groups and space groups
  1.7.2 hkl plane
  1.7.3 Principle of X-ray diffraction and Bragg’s law
  1.7.4 Reciprocal space
  1.7.5 The Ewald sphere
  1.7.6 Fourier transformation and structure factor equation
  1.7.7 Phase problem
1.8 STRUCTURE DETERMINATION
  1.8.1 Solution to phase problem
  1.8.2 Direct methods
  1.8.3 Molecular replacement (MR)
  1.8.4 Multiple isomorphous replacement (MIR)
  1.8.5 Anomalous scattering
    1.8.5.1 MAD
    1.8.5.2 SAD
  1.8.6 Phase improvement
  1.8.7 Model building and refinement
  1.8.8 Validation and structure deposition
1.9 OBJECTIVES OF THE PROJECTS

CHAPTER 2 MATERIALS AND METHODS
2.1 PREPARATION FOR TARGET GENE AMPLIFICATIONS
  2.1.1 Generation of cDNA using RT-PCR
  2.1.2 Primer design for PCR
2.2 PCR OPTIMISATION PROCEDURE
  2.2.1 Kremen1
  2.2.2 Dickkopf1 (Dkk1 FL and Dkk1 Cys2) and MeCP2
  2.2.3 Agarose gel extraction of the PCR products
2.3 CLONING
  2.3.1 pGEM-T-Easy cloning vector
  2.3.2 Preparation of E. coli DH5α competent cells
  2.3.3 Transformation into DH5α competent cells and blue-white screening
2.4 SCREENING OF TRANSFORMANTS
  2.4.1 Colony PCR screening
  2.4.2 Double digestion screening
  2.4.3 Agarose gel electrophoresis
  2.4.4 Plasmid DNA sequencing
    2.4.4.1 Cycle sequencing PCR
    2.4.4.2 Ethanol precipitation
2.5 SUBCLONING INTO EXPRESSION VECTORS
  2.5.1 Subcloning target genes into E. coli and baculovirus vectors
  2.5.2 Subcloning of gene targets into S. cerevisiae
  2.5.3 Phenol/chloroform treatment and ethanol precipitation
2.6 PROTEIN EXPRESSION AND PURIFICATION
  2.6.1 Transformation and small scale expression in E. coli
  2.6.2 Protein expression in yeast
    2.6.2.1 Transformation in Yeast (W303a, pep4::HIS3 strain)
    2.6.2.2 Preparation of protein for western blotting
    2.6.2.3 Western blotting
2.7 PROTEIN PURIFICATION
  2.7.1 Using affinity chromatography
  2.7.2 Enterokinase cleavage
  2.7.3 Thrombin cleavage
  2.7.4 Gel filtration
  2.7.5 Sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE)
2.8 BIOPHYSICAL CHARACTERISATION
  2.8.1 Analysis of purity and homogeneity
  2.8.2 Circular dichroism
2.9 CRYSTALLISATION TRIALS

CHAPTER 3 RESULTS AND DISCUSSION
3.1 PCR OPTIMISATIONS
3.2 MOLECULAR CLONING
  3.2.1 T/A cloning and blue white screening
  3.2.2 Subcloning of the gene inserts into expression vectors
3.3 PROTEIN EXPRESSION TRIALS
  3.3.1 Expression in E. coli
3.4 REASONS FOR PROTEIN EXPRESSION FAILURE AND REMEDIES
  3.4.1 Co-expression of Proteins
  3.4.2 Expression in yeast (S. cerevisiae)
3.5 EXPRESSION AND PURIFICATION OF HIS-TAGGED Dkk1Cys2
3.6 ANALYSIS OF PROTEIN PURITY, HOMOGENEITY AND MOLECULAR WEIGHT
3.7 CIRCULAR DICHROISM OF Dkk1Cys2
3.8 PEPTIDE MASS FINGERPRINTING (PMF)
3.9 EXPRESSION AND PARTIAL PURIFICATION OF MeCP2

CHAPTER 4 CONCLUSIONS AND FUTURE WORK

REFERENCES

SUMMARY

The Kremen1 and Dickkopf1 proteins form an exclusive class (the Dkk class) of evolutionarily well conserved antagonists of the canonical WNT/β-catenin pathway. Their role is to regulate vertebrate development by maintaining an important constituent, β-catenin, at the levels required for it to perform its function. The proteins have been characterised with respect to their domain architectures, and their mutual binding has been established in vivo through co-immunoprecipitation and co-transfection studies. The mechanism by which the binding of the two proteins furthers the process of WNT inhibition remains intriguing.
We have undertaken the crystal structure determination of Krm1, Dkk1FL and the Krm1-Dkk1Cys2 complex. We have been able to express and purify one of the crucial domains of Dkk1 from Mus musculus, which has been shown to be necessary and sufficient for binding to Krm1 and for inhibiting Wnt signalling. This 78-aa C-terminal domain of Dkk1, Dkk1Cys2, has been cloned into the pET32a vector and expressed in the E. coli BL21 (DE3) host strain. The protein has been purified to homogeneity and is presently under crystallisation trials. Krm1 from Mus musculus was cloned into pET32a and found to express entirely in inclusion bodies in E. coli. Several attempts at expression have failed in both E. coli and S. cerevisiae.

MeCP2 is another important mammalian protein with a role in the maintenance of DNA methylation, which is essential for mammalian development. It shares about 70% identity with the MBD family of proteins, whose MBD domains are evolutionarily conserved. The lack of functional similarity between these proteins outside this domain is worth investigating. The structures of the MBD domains from MBD1 and MeCP2 have been determined by NMR and are found to be very similar. However, the pathway that MeCP2 chooses for achieving transcriptional repression is still under investigation. Additionally, a structural model has been proposed for the involvement of MeCP2 in a medically significant neurological disorder, Rett syndrome. We attempt to address the alteration of its DNA binding function and the above questions by solving the structure of full-length MeCP2 using X-ray crystallography. MeCP2 from Homo sapiens has been cloned into vectors compatible with E. coli, and some soluble protein expression has been detected during initial trials.

Protein expression trials are currently underway for Krm1, Dkk1FL and also for MeCP2 in the baculovirus expression system.
LIST OF ABBREVIATIONS

aa              Amino acid
ACT             Actin
AP              Alkaline phosphatase
APC             Adenomatous polyposis coli
AT              Adenine:Thymine
C               End-centered
C/COOH          Carboxy terminal
CBP             Creb binding protein
CD              Circular dichroism
cDNA            Complementary DNA
CIP             Calf intestinal phosphatase
CK1             Casein kinase 1
CNS             Central nervous system
COL             Colipases
CRD             Cysteine-rich domain
CRID            Co-repressor interacting domain
Cys2            Cysteine-rich domain 2
DEPC            Diethylpyrocarbonate
dhkl            Interplanar spacing
Dkk             Dickkopf
DLS             Dynamic light scattering
DMSO            Dimethyl sulfoxide
DNA             Deoxyribonucleic acid
dNTP            Deoxyribonucleotide triphosphate
Dsh             Dishevelled
DTT             Dithiothreitol
E               Embryonic day
E. coli         Escherichia coli
ECD             Extracellular domain
ECL             Enhanced chemiluminescence
EtBr            Ethidium bromide
F               Face-centered
Fhkl            Structure factor
f(j)            Scattering factor
FL              Full length
Fz              Frizzled
g               Relative centrifugal force
GAL             Galactose
GPI             Glycosylphosphatidylinositol
GSK3β           Glycogen synthase kinase-3β
GST             Glutathione S-transferase
HDAC            Histone deacetylase complex
His-Pro         Histidine and proline rich region
I               Body-centered
IgG             Immunoglobulin G
IPTG            Isopropyl β-D-1-thiogalactopyranoside
Kd              Dissociation constant
Krm             Kremen
L               Loop
LB              Luria-Bertani
LEF             Lymphoid enhancer factor
LRP             Low-density lipoprotein receptor related protein
MAD             Multiwavelength anomalous dispersion
MBD             Methyl-CpG binding domain
mCpG            Methyl cytidine (phosphodiester bond) guanosine
MCS             Multiple cloning site
MeCP            Methyl-CpG binding protein
MES             2-(N-morpholino)ethanesulfonic acid
MIR             Multiple isomorphous replacement
MR              Molecular replacement
mRNA            Messenger ribonucleic acid
MW              Molecular weight
MWCO            Molecular weight cut-off
N/NH2           Amino terminal
NMR             Nuclear magnetic resonance
P               Primitive
PBS             Phosphate-buffered saline
PCP             Planar cell polarity
PCR             Polymerase chain reaction
PDB             Protein Data Bank
pI              Isoelectric pH
PLATE           PEG/Li-acetate/TE
PMF             Peptide mass fingerprinting
Psi             Pounds per square inch
PVDF            Polyvinylidene difluoride
RE              Restriction enzyme
R-factor        Residual/reliability factor
Rpm             Revolutions per minute
RT              Reverse transcriptase
S. cerevisiae   Saccharomyces cerevisiae
SAD             Single anomalous dispersion
SC-Ura          Synthetic complete minus uracil
SDS-PAGE        Sodium dodecyl sulfate-polyacrylamide gel electrophoresis
sFRP            Secreted Frizzled-related proteins
Sk              Soggy
TAE             Tris-acetate-EDTA
TCA             Trichloroacetic acid
TCF             T-cell factor
TFIIB           Transcription factor IIB
TM              Transmembrane
TRD             Transcription repressor domain
Trx             Thioredoxin
U               Unit(s)
UV              Ultraviolet
WIF             WNT inhibitory factor
WNT             Wingless (Wg), WNT-1 (int-1)
YPD             Yeast peptone dextrose
βME             β-Mercaptoethanol
θ               Angle of incidence
λ               Wavelength
ρ (x,y,z)       Electron density

LIST OF FIGURES

Chapter 1
Figure 1.1 Canonical WNT/β-catenin signalling pathway
Figure 1.2 Schematic representation of Dkk-1 protein
Figure 1.3 Alignment of the Dkks with colipases and other related molecules
Figure 1.4(a) A three-dimensional model of the colipase fold based on porcine colipase structure
Figure 1.4(b) Domain organisation of colipase-containing proteins based on sequence similarity
Figure 1.5 Sequence comparison of Krm proteins
Figure 1.6 Deletion analysis of Kremen and Dickkopf
Figure 1.7 Model for functional interactions of Dkk1, LRP5/6 and Krm to block the Canonical Wnt signal in cells
Figure 1.8 Domain organisation of MBD family members
Figure 1.9 Sequence alignment of the MBD family proteins
Figure 1.10(a) Solution structure of the MBD of MBD1
Figure 1.10(b) Putative DNA binding site of MBD
Figure 1.11 Proposed potential mechanisms for repression mediated by MBD proteins
Figure 1.12 The unit-cell
Figure 1.13(a) Constructive interference
Figure 1.13(b) Destructive interference
Figure 1.14 Bragg’s law
Figure 1.15 Reciprocal lattice
Figure 1.16 The Ewald sphere

Chapter 3
Figure 3.1 PCR optimisations of Krm1
Figure 3.2 Touchup PCR products of MeCP2, Dkk1FL and Dkk1Cys2
Figure 3.3 Double digested products of Krm1 constructs
Figure 3.4 Double digested products of Dkk1FL constructs
Figure 3.5 Double digested products of MeCP2 constructs
Figure 3.6 Colony PCR products of Krm1 in PTS210
Figure 3.7 Colony PCR products of Dkk1FL in PTS210
Figure 3.8(a) Expression of Krm1 in pQE30 vector / M15 cells
Figure 3.8(b) Expression of Krm1 in pGEX 4T1 vector / BL21 (DE3) cells
Figure 3.8(c) Expression of Krm1 in pET32a vector / BL21 (DE3) cells
Figure 3.8(d) Expression of Dkk1FL in pGEX4T1 vector / BL21 (DE3) cells
Figure 3.8(e) Expression of Dkk1FL in pET32a vector / BL21 (DE3) cells
Figure 3.8(f) Expression of MeCP2 in pET14b vector / pLySS (DE3) cells
Figure 3.9 Western blot analysis of Krm1 and Dkk1FL
Figure 3.10 The Kyte-Doolittle hydropathy plot for Kremen1
Figure 3.11 Small scale expression of Dkk1Cys2
Figure 3.12 Talon affinity purification of His-tagged Dkk1Cys2
Figure 3.13 Gel filtration profile of Dkk1Cys2 on a Sephadex-75 column
Figure 3.14 SDS gel of the thrombin cleaved Dkk1Cys2
Figure 3.15 Dynamic Light Scattering analysis of Dkk1Cys2
Figure 3.16 Native gel of Dkk1Cys2
Figure 3.17 CD spectrum of Dkk1Cys2
Figure 3.18 Predicted secondary structure of Dkk Cys2
Figure 3.19 Peptide mass fingerprinting of Krm1
Figure 3.20 Peptide mass fingerprinting of Dkk1Cys2
Figure 3.21 Expression of MeCP2 in pET32a vector / BL21 (DE3) cells
Figure 3.22 Solubility check of MeCP2
Figure 3.23 Talon affinity purification of His-tagged MeCP2
Figure 3.24 Gel filtration profile of MeCP2 on Sephadex-200 column
Figure 3.25 SDS-PAGE analysis of His-tagged MeCP2 elution fractions

LIST OF TABLES

Table 1.1 Crystal systems and their related unit-cells and lattices
Table 2.1 Expression trials for proteins
Table 3.1 Target proteins with the corresponding expression systems and vectors used
Table 3.2 Summary of protein expression in E. coli, S. cerevisiae and Baculovirus

CHAPTER 1: INTRODUCTION

1.1 WNT SIGNALLING PATHWAY: BIOLOGICAL BACKGROUND

The canonical WNT/β-catenin pathway is one of the most extensively studied pathways in cell signalling. This pathway involves the evolutionarily conserved, secreted, cysteine-rich WNT glycoproteins (named after Wingless from Drosophila and Int-1 from Mus musculus) (Clevers, 2006). About 19 members of the WNT protein family have been identified in mammals, and the functions of WNTs have been elucidated by genetic and cell biological studies in models including Drosophila melanogaster, Danio rerio, Caenorhabditis elegans, Xenopus laevis, Mus musculus, sea urchin, chicken embryos and mammalian cultured cells (Moon et al., 2002; Moon et al., 2004). They act as short-range ligands that mediate signalling through serpentine receptors of the Frizzled gene family. WNTs regulate a wide range of developmental processes in both embryos and adults (Wodarz and Nusse, 1998; Miller et al., 1999; Moon et al., 1997). These comprise embryonic induction, generation of cell polarity, cell fate specification, cell migration (Cadigan and Nusse, 1997), mammary gland and skin appendage morphogenesis and hair follicle formation (Chu et al., 2004).
In addition, deregulation of WNT signalling that leads to elevated β-catenin levels has been widely implicated in the genesis of a number of malignancies (Polakis, 2000; Morin, 1999; Miller et al., 1999; Akiyama et al., 2000), degenerative disorders (Nusse, 2005) and several developmental defects.

WNT genes are not functionally equivalent. They give rise to diverse pleiotropic effects through activation of distinct intracellular pathways that exhibit abundant cross-talk with other signalling pathways (Moon et al., 1997). WNT proteins also depend on a repertoire of receptors and co-factors present on the cell surface to determine the transcriptional endpoints; hence, WNT target genes are mostly cell-type specific. In particular, three pathways have been identified, namely the WNT/Ca2+ cascade, the planar cell polarity (PCP) pathway (also termed a non-canonical WNT pathway) and the canonical WNT/β-catenin pathway.

1.2 CANONICAL WNT/β-CATENIN PATHWAY

Our interest lies in the canonical WNT/β-catenin signalling pathway. WNT signal transduction is mediated by the Frizzled (Fz) genes, which encode seven-transmembrane receptor proteins (Vinson et al., 1989; Wang et al., 1996; Chan et al., 1992) with an N-terminal cysteine-rich domain (CRD) that binds WNTs with high affinity (Hsieh et al., 1999). However, the pathway diverges downstream of the Dishevelled protein and acts through a core set of highly conserved proteins to regulate β-catenin levels in the nucleus and cytoplasm (Fig. 1.1).

Figure 1.1. Canonical WNT/β-catenin signalling pathway in the (a) absence and (b) presence of WNTs. WNTs bind to Fz and LRP5/6 to induce β-catenin release from the catenin destruction complex and its subsequent translocation into the nucleus to activate gene transcription (adapted from Moon et al., 2004).

In the absence of active WNT ligands, free cytoplasmic β-catenin is recruited into a ‘catenin destruction complex’ assembled by the tumour suppressors APC and Axin.
The multiprotein complex, which includes GSK3β and CK1, triggers phosphorylation of β-catenin at its N-terminus, leading to ubiquitylation followed by proteasomal degradation of β-catenin. This leads to low cytoplasmic and nuclear β-catenin levels, and hence inhibition of downstream gene transcriptional events (Moon et al., 2004).

Activation of the canonical signalling cascade is triggered when secreted WNT ligands interact, through the CRD, with Fz receptors, which then bind at the membrane surface to single-pass transmembrane proteins identified as the low-density lipoprotein receptor related proteins 5 and 6 (LRP5/6) in vertebrates and Arrow in Drosophila (Tamai et al., 2000; He et al., 2004). This results in the inhibition of GSK3β phosphorylation of β-catenin through the dissociation of the enzyme from the destruction complex (Willert and Nusse, 1998), possibly through the activation of Dsh. In addition, Axin is also degraded, further decreasing β-catenin phosphorylation. As a result, the stabilised β-catenin accumulates in the cytoplasm before translocating to the nucleus, where it forms a complex with the DNA-bound transcription factors TCF and LEF. These previously repressed transcription factors then activate important downstream target genes, leading to a myriad of effects, most notably regulation of cell proliferation, survival and cell fate.

1.3 ANTAGONISTS OF WNT PATHWAY

Several antagonists work in concert to dampen the WNT signalling pathway to ensure target gene expression in the correct cellular and developmental context. Wnt antagonism plays a central role in anterior specification during anteroposterior patterning of the neural plate during Xenopus gastrulation (Davidson, 2002). Wnt antagonists fall into two functional classes, the secreted Frizzled-related proteins (sFRP) class and the Dickkopf (Dkk) class.
Members of the sFRP class include the sFRP family, WNT inhibitory factor (WIF) and Cerberus, which exert their effect by directly binding and sequestering soluble WNTs (Kawano and Kypta, 2003).

1.3.1 Dickkopf-1 protein: significance and characterisation

Inhibition of WNT signalling can be mediated by the members of the Dickkopf (Dkk) family of proteins and, in particular, Dkk1 (Glinka et al., 1998). The founding member of the multigene Dkk family is Dkk-1, with three other members identified in vertebrates, namely Dkk-2, Dkk-3 and Dkk-4 (Krupnik et al., 1999; Monaghan et al., 1999; Niehrs, 2006). Dkks are an evolutionarily ancient gene family, found in vertebrates, including humans, and in invertebrates like Dictyostelium, Cnidarians, Urochordates and ascidians, but not in Drosophila and Caenorhabditis elegans. A strong functional divergence occurred between the Dkk3 and Dkk1/2/4 gene families during early metazoan evolution (Niehrs, 2006). Unlike Dkk3, Dkk1, Dkk2 and Dkk4 all regulate Wnt signalling and bind to LRP6 and to Krm1 and Krm2 (Mao et al., 2002).

Dkks are glycoproteins of 255-350 aa that contain a signal sequence at the N-terminus and share two conserved, characteristically spaced cysteine-rich domains. The N-terminal cysteine-rich domain, Dkk_N (Cys1), is unique to Dkks, and the C-terminal cysteine-rich domain (Cys2) has a pattern of 10 cysteines related to the colipase fold (Niehrs, 2006; Aravind and Koonin, 1998).

Dkks play an important role in vertebrate development, locally inhibiting Wnt-regulated processes such as antero-posterior axial patterning, limb development, somitogenesis and eye formation. In adults, Dkks are implicated in bone formation and bone disease, cancer and Alzheimer’s disease (Niehrs, 2006). The characteristic developmental function of Dkk-1 is its head-inducing activity (Mukhopadhyay et al., 2001; Glinka et al., 1998). A human homologue of Dkk1, Soggy (Sk) (Fig. 1.2), has been characterised biochemically and is found to complement Dkk1 function in Xenopus laevis (Fedi et al., 1999).

Figure 1.2. Schematic representation of Dkk-1 protein. (a) Dkk-1 architecture representing the C1 and C2 domains and percentage identity between human and mouse or Xenopus cysteine-rich domains. SP: signal peptide; C: cysteine; N: N-glycosylation site. (b) Consensus of Dkk-1 sequence from human, mouse and Xenopus cysteine-rich domains (adapted from Fedi et al., 1999).

1.3.1.1 Homology of Dkk-1 with Colipase family

The structural and sequence homology between colipase and the C-terminal domain of Dkk has been recently discovered (Figs. 1.3 and 1.4). It has been convincingly suggested that Dkks and colipases have the same disulfide-bonding pattern and a similar fold. The structure of the colipase fold has been solved using X-ray crystallography; it consists of short β-strands connected by loops and stabilised by disulfide bonds, resulting in finger-like structures that may serve as interactive surfaces for lipases (Tilbeurgh et al., 1999).

Figure 1.3. Alignment of the Dkks with colipases and other related molecules. Xdkk-1 and Mdkk-1 are the Dkks from Xenopus laevis (Xl) and Mus musculus (Mm), respectively. COL stands for the colipases. The conserved residues are colored according to the 85% consensus rule: polar residues, red; acidic and basic residues, pink; hydroxylic residues, blue; hydrophobic residues, yellow background; tiny residues, green background; small residues, blue background; large residues, gray background. The conserved cysteines, which form the disulfide-bonding pattern typical of this family, are shown in inverse red shading. The disulfide-bonding network connecting the cysteines is shown in a separate color for each pair. The predicted structural elements based on the porcine colipase crystal structure are shown above the alignment, with arrows representing β-strands (adapted from Aravind and Koonin, 1998).
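The 85% consensus rule mentioned in the caption of Fig. 1.3 can be illustrated with a minimal sketch: an alignment column is assigned to a residue class when at least 85% of the aligned sequences carry a residue from that class. The residue groups, the function name and the seven toy sequences below are hypothetical and chosen only for illustration; they are not the classes or sequences used by Aravind and Koonin (1998).

```python
# Illustrative residue classes (an assumption, not the published scheme).
RESIDUE_GROUPS = {
    "cysteine": "C",
    "hydrophobic": "AVLIMFWY",
    "charged": "DEKRH",
}

# Hypothetical toy alignment: seven sequences, five columns.
TOY_ALIGNMENT = [
    "CAKLC",
    "CVKIC",
    "CLRMC",
    "CIKVC",
    "CAEFC",
    "CGKLC",
    "CSDWC",
]


def conserved_columns(alignment, groups, threshold=0.85):
    """Label each alignment column with the first residue group that
    covers at least `threshold` of the sequences, or None otherwise."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    labels = []
    for i in range(width):
        column = [seq[i] for seq in alignment]
        label = None
        for name, members in groups.items():
            # Gaps ('-') belong to no group but still count in the denominator.
            fraction = sum(res in members for res in column) / len(column)
            if fraction >= threshold:
                label = name
                break
        labels.append(label)
    return labels


print(conserved_columns(TOY_ALIGNMENT, RESIDUE_GROUPS))
```

In this toy case the two terminal columns are labelled as conserved cysteines, while the second column, where only five of seven residues are hydrophobic, falls below the threshold and is left unlabelled.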
The positions of the hydrophobic amino acid residues are well conserved between the carboxy-terminal domain of Dkk and the colipases. One direct functional implication of this observation is that the colipase-like domain of Dkk may be necessary for the membrane association of this protein, which in turn may be required for the inhibition of Wnt secretion or Wnt-receptor interaction (Aravind and Koonin, 1998).

Figure 1.4. (a) A three-dimensional model of the colipase fold on the basis of the porcine colipase crystal structure. The β-strands are shown in yellow, the loops in blue and the disulfide bonds in pink. The hydrophobic residues (in the single-letter amino-acid code), which are possibly involved in lipid interaction, are shown as space-filling spheres in gold. (b) Domain organisation of the colipase-domain-containing proteins based on sequence similarity. Blue: signal peptide; green: N-terminal cysteine-rich domain; red: colipase domain; thick bar: 100 amino acids (adapted from Aravind and Koonin, 1998).

1.3.2 Kremen-1: characterisation and biological function

Kremens are type-I transmembrane proteins composed of 473 aa (Fig. 1.5). Two related forms of Krm (Krm1 and Krm2) have been identified; they are widely expressed in adult tissues, including skeletal muscle and brain, and during embryonic development. Krm consists of three conserved extracellular domains, namely the kringle, Wsc and CUB (Complement subcomponents C1r/C1s, Uegf, Bmp1) domains, while the intracellular region has no conserved motif involved in signal transduction (Nakamura et al., 2001). Kringles are autonomous structural domains found predominantly in blood clotting and fibrinolytic proteins and in some serine proteases. They are believed to play a role in binding mediators such as membrane phospholipids and proteoglycans. The Wsc domains are present in yeast cell wall integrity and stress response component proteins.
The CUB domain is involved in protein-protein and glycosaminoglycan-protein interactions, and is found in a number of proteins involved in development and differentiation. Although the amino acid sequence homology between vertebrate Krm1 and Krm2 is only about 35-40%, the occurrence and the order of their domains are conserved in all orthologues (Davidson et al., 2002).

A potential role of Kremen is seen in the regulation of cellular responses to external stimuli or cell-cell interaction in neuronal and/or muscle cells (Nakamura et al., 2001). Krm is expressed maternally in mouse and frog in the early anterior neural folds. Both Krm1 and Krm2 are co-expressed with Dkk1 in the prechordal plate underlying the anterior neurectoderm. These expression domains are consistent with a role of Krm in regulating early anteroposterior (AP) patterning in the CNS and in the Wnt inhibition pathway (Davidson, 2002).

1.4 WNT ANTAGONISTS IN ACTION: INTERACTION OF DKK-1/KRM/LRP5/6

Physiological interaction between the Krm and Dkk1 proteins in vivo has been studied in Xenopus laevis. Kremens bind both Dkk1 and Dkk2 (but not Dkk3) with an apparent Kd in the nM range. It has been shown that Krm and Dkk1 are equally required to block Wnt/LRP signalling (Davidson, 2002). The membrane attachment of Krm proteins via a GPI-anchor is important for mediating the Wnt/LRP inhibition of Dkk1. All three ECDs (Kringle, Wsc and CUB domains) of Krm1 are required for binding to Dkk1, as shown by co-immunoprecipitation and co-transfection assays carried out with ECD deletion constructs of Krm1 and alkaline phosphatase-fused Dkk-1 (Fig. 1.6) (Mao et al., 2002).

Figure 1.5. Sequence comparison of Krm proteins. (a) Alignment of Krm1 and Krm2 protein sequences from Xenopus (X) and mouse (m). The Kringle, Wsc, CUB and transmembrane domains are highlighted, and conserved amino acids are shown in white (within coloured domains) or red.
(b) Krm homology tree and matrix showing an overview of homology and amino acid identity, respectively, between the Xenopus, mouse and human Krm proteins (adapted from Davidson et al., 2002).

Dkk1 is a high-affinity ligand for LRP5/6 and binds both LRP5 and LRP6 with an apparent Kd in the range of 10⁻¹⁰ M (Bafico et al., 2001; Mao et al., 2001; Semenov et al., 2001). The Dkk1/LRP5/6 complex subsequently binds to Kremen (Krm1/2) (Mao et al., 2002). Formation of this ternary complex triggers rapid endocytosis and the consequent removal of LRP5/6 from the plasma membrane, thereby preventing WNT signalling and the stabilisation of β-catenin required for WNT target gene expression (Fig. 1.7).

Figure 1.6. Deletion analysis of Kremen and Dickkopf. (a) Schematic drawing of (left) mkrm2 and (right) Dkk-1 deletion constructs. SP, signal peptide; KR, kringle domain; TM, transmembrane domain; L1, L2, linker regions 1 and 2. (b) Summary of the binding and Wnt inhibition of the mkrm2 deletion constructs. (c) Bound AP activity measured colorimetrically at 405 nm: (left) Dkk1–AP binding to 293T cells transfected with mkrm2 deletion constructs; (right) 293T cells transfected with LRP6 or mkrm2 as indicated, incubated with recombinant XDkk1-AP, AP-XDkk1-Cys1 or AP-XDkk1-Cys2, respectively (adapted from Mao et al., 2002; Mao and Niehrs, 2003).

The colipase fold of Dkk1 (Cys2) is necessary and sufficient for Kremen and LRP6 binding and for Wnt inhibition, as shown by AP-bound deletion analysis of Dkk-1 (Fig. 1.6) (Mao and Niehrs, 2003).

Figure 1.7. A model showing the functional interactions of Dkk1, LRP5/6 and Krm to block the canonical Wnt signal in cells (adapted from Mao et al., 2002).

1.5 DNA METHYLATION

DNA methylation, mainly at the sequence CG, is the most common covalent epigenetic modification of the eukaryotic genome, involving the addition of a methyl group at position 5 of cytosine in cytidine-guanosine (CpG) dinucleotide pairs.
Roughly 70% of all CpG dinucleotides in the mammalian genome are methylated, and the majority of these sites occur in repetitive DNA elements. CpG islands are genomic regions that contain a high frequency of CG dinucleotides. They are found in and near approximately 40% of the promoters of mammalian genes. Unlike CpG sites in the coding region of a gene, the CpG sites in the CpG islands of promoters are, in most instances, unmethylated if the genes are expressed.

Methylation of CpG sequences has been implicated in stable modulation of cell-type specific gene expression during development by affecting the protein-DNA interactions that are required for transcription (Cedar, 1988). Examples include gene silencing observed in the inactive X-chromosome and other chromosomal abnormalities (Riggs and Pfeifer, 1992), in genomic imprinting (Bartolomei and Tilghman, 1997; Neumann and Barlow, 1996; Razin and Cedar, 1994), and in transformed cell lines and tumors (Bird, 1996; Rountree et al., 2001).

1.5.1 Methyl-CpG binding proteins (MeCP2)

Methyl-CpG binding proteins form a family of five proteins, comprising MeCP2, MBD1, MBD2, MBD3 and MBD4 (Hendrich and Bird, 1998; Wade, 2001), that bind methylated CpG (mCpG) sequences within double-stranded DNA and repress transcription by recruiting histone deacetylases (Nan et al., 1998; Jones et al., 1998; Ballestar and Wolffe, 2001). Each member of this family has a stretch of 60-80 amino acid residues, the MBD, displaying 50-70% similarity between all five proteins. MBDs interact with transcriptional repressors and chromatin remodeling factors (Bird et al., 2002; Kimura and Shiota, 2003). The MBD of MBD4 is most similar to that of MeCP2 in primary sequence, while the MBDs of MBD1, 2 and 3 are more similar to each other than to either MBD4 or MeCP2 (Figs. 1.8 and 1.9).
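The description of CpG islands in Section 1.5 as regions with a high frequency of CG dinucleotides can be made concrete with a short sketch. It computes the GC content of a sequence and the observed/expected CpG ratio, a measure commonly used to describe CpG enrichment. The ratio, the function name and the example sequence are assumptions introduced here for illustration; the thesis itself states no numerical criterion for a CpG island.

```python
def cpg_stats(sequence):
    """Return (gc_content, cpg_observed_over_expected) for a DNA sequence.

    Observed/expected CpG = (number of CpG * length) / (number of C * number of G).
    The ratio is reported as 0.0 when the sequence contains no C or no G.
    """
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")

    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact

    gc_content = (c_count + g_count) / length
    if c_count == 0 or g_count == 0:
        return gc_content, 0.0
    obs_exp = (cpg_count * length) / (c_count * g_count)
    return gc_content, obs_exp


# Hypothetical 8-nt example: two CpG sites in a sequence with 50% GC content.
gc, ratio = cpg_stats("CGCGATAT")
print(f"GC content = {gc:.2f}, CpG observed/expected = {ratio:.2f}")
```

A real analysis would slide such a calculation along a genomic sequence in windows; the single-sequence form is kept here only to show the arithmetic.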
The presence of an intron, located at a conserved position in all genes (Hendrich and Bird, 1998), indicates that the MBDs within each protein are evolutionarily related, but the lack of similarity between these proteins outside of the MBD (excluding MBD2 and MBD3) may indicate that each protein carries out a different function within the cell (Hendrich and Bird, 1998).

Figure 1.8. Domain organisation of MBD family members. For each protein, the MBD domain is depicted as a green box, the TRD domain as a yellow box, the CXXC repeats of MBD1 as a blue box, the GR repeat of MBD2 as a purple box and the acidic repeat at the carboxy terminus as an orange box (adapted from Wade, 2001).

1.5.2 Structure of MBDs

The alignment of MBD proteins from selected organisms is given in Fig. 1.9. Ohki et al. (1999) determined the structure of the MBD of the human methylation-dependent repressor MBD1, while Wakefield et al. (1999) determined the structure of the MBD of MeCP2, both using NMR (Fig. 1.10). Although the sequences from MBD1 and MeCP2 exhibit only a moderate degree of homology, they can easily be aligned, with a number of conserved residues throughout the MBD. The structures of the two MBDs are very similar. The structure is a novel wedge-shaped α/β sandwich. A four-stranded, up-and-down antiparallel β-sheet contributed by the NH2-terminal region constitutes one face of the wedge, which is highly positively charged (Wakefield et al., 1999). The other face of the wedge is formed by a three-turn helix together with a single-turn helix in the COOH-terminal region of the domain, and is negatively charged towards the thick end of the wedge.

Figure 1.9. Sequence alignment of the MBD family proteins. Conserved residues are boxed. Important residues are colored (blue, basic; yellow, hydrophobic; green, acidic or polar) (adapted from Ohki et al., 1999).

Figure 1.10. (a) Solution structure of the MBD of MBD1 (adapted from Koradi et al., 1996). (b) Putative DNA binding site of MBD.
Basic residues are colored in blue; aromatic residues, yellow; an acidic residue, green. Main chains of residues strongly affected by the addition of methyl-CpG DNA are colored in red. B-form DNA is also shown in the left-hand figure, with the methyl groups of the symmetric methyl-CpG highlighted in yellow (adapted from Ohki et al., 1999).

MBDs do not dimerise to recognise symmetrical mCpG sequences, but bind to DNA as monomers (Nan et al., 1993). Mutation analysis of the MBD has resulted in a putative model of the MBD bound to DNA, Fig. 1.10 (b) (Ohki et al., 1999). The model proposes that the interaction between the MBD and methylated DNA takes place along the major groove of standard B-form DNA. The two longer β-strands (β2 and β3), as well as the loop between them (L1), would interact with the major groove of the DNA. In addition, the residues between β4 and α1 seem to establish contacts with the phosphate backbone. MBDs from other proteins are likely to contain a similar fold, although MBD1, MBD2 and MBD3 must exhibit local differences in structure, particularly at the thick end of the wedge-shaped domain.

1.5.3 MeCP2

MeCP2 is essential for embryonic development and contributes to methylation-dependent gene silencing (Li et al., 1993; Bird, 1993; Tate and Bird, 1993; Meehan et al., 1989; Boyes and Bird, 1991; Lewis et al., 1992; Nan et al., 1997). MeCP2 is concentrated on the pericentromeric heterochromatin in the genome (Lewis et al., 1992; Nan et al., 1996). Species and tissue comparisons show that MeCP2 is widely distributed in mammals, except in embryonal carcinoma cell lines, which have very low levels (Meehan et al., 1989).

1.5.4 Architecture of MeCP2

MeCP2 is an archetypical methyl-CpG-binding, multidomain protein. It contains the methyl-CpG binding domain, MBD (77-162 aa), the corepressor interacting domain, CRID (163-207 aa), the transcriptional repression domain, TRD (208-311 aa) and the His-Pro domain (312-404 aa).
It was the first methyl-CpG-binding domain (MBD) protein to be identified that is involved in the selective recognition of methylated CpG (Wade, 2001; Lewis et al., 1992; Nan et al., 1993). The protein localises to densely methylated regions (major satellite DNA) in the mouse genome (Wade, 2001; Nan et al., 1996). The TRD overlaps a nuclear localisation signal (Wade, 2001; Nan et al., 1997). The region upstream of the MBD has no known function. The carboxyl terminus of MeCP2 has unusual, repetitive sequences that are similar to those of members of the fork head family (Wade, 2001; Vacca et al., 2001). MeCP2 binds to a single, symmetrically methylated CpG dinucleotide, and can bind hemimethylated as well as fully methylated DNA sites on the chromosome regardless of sequence context (Nan et al., 1996; Lewis et al., 1992). However, hemimethylated DNA is a poor substrate. Both the methyl group and the base identity are important for MeCP2 binding. AT-rich DNA containing methyl-CpG is a preferred substrate for MeCP2 (Meehan et al., 1992; Klose et al., 2005).

1.5.5 Role of MeCP2 in transcription repression

A major breakthrough in the study of MeCP2-dependent repression came with the finding that MeCP2 binds to methylated DNA and recruits the Sin3–histone deacetylase complex to promoters, resulting in deacetylation of core histones and subsequent transcriptional silencing. The region of MeCP2 that interacts with Sin3 significantly overlaps the TRD (Jones et al., 1998; Nan et al., 1998; Wade, 2001). MeCP2 possibly utilises multiple pathways to achieve a repressed state (Fig. 1.11). Presumably, the choice of mechanism is influenced by the cell type, DNA sequence and local chromatin architecture (Wade, 2001).

Figure 1.11. Proposed potential mechanisms for repression mediated by MBD proteins. In the cartoons, histone octamers are represented by gray balls and DNA is in blue.
(a) Model 1: MBD proteins interact with HDAC to generate hypoacetylated, condensed chromatin. (b) Model 2: MBD proteins coat methylated loci, occluding regulatory DNA. (c) Model 3: MBD proteins alter local DNA and/or chromatin architecture. (d) Model 4: an MBD protein sequesters an essential transcription factor, preventing its function (adapted from Wade, 2001).

It has been demonstrated, in vitro and in vivo in mammalian cells, that MeCP2 directly prevents a component of the basal transcription machinery from functioning, probably by direct contact of one of its domains (TRD with TFIIB) during the assembly of the preinitiation complex. It represses transcription at a distance of greater than 500 bp from the transcription start site and selectively inhibits transcription complex assembly on methylated DNA (Kaludov et al., 2000). The TRD actively represses transcription from both unmethylated and methylated promoters, relying significantly on histone deacetylation (Nakao et al., 2001; Nan et al., 1998).

1.5.6 MeCP2 and Rett Syndrome

MeCP2 also participates in the epigenetic control of neuronal function (Amir et al., 1999). About 80% of classic Rett syndrome, a neuro-developmental disorder, is caused by mutations occurring de novo during spermatogenesis, which leads to X chromosome inactivation (Xq28) (Van den Veyver et al., 2001). Although MeCP2 is ubiquitously expressed, the phenotype of the syndrome is restricted to the brain (Roloff et al., 2003). In females, Rett syndrome is one of the most common causes of mental retardation, with an incidence of one in 10,000-15,000 (Hagberg, 1985). Most missense mutations seem to interfere with the normal function of the MBD or TRD, whereas the majority of the truncating mutations delete at least a part of the TRD. This leads to partial or complete loss of protein function, especially of functions involved in brain development (Van den Veyver et al., 2001; Chen et al., 2001).
R106W, R133C, F155S and T158M are mutations in the methyl-CpG binding domain (MBD) that impair binding affinity to methylated DNA (Ballestar et al., 2000).

1.6 STRUCTURE DETERMINATION OF PROTEINS

Proteins, an important class of biological macromolecules present in all living organisms, perform their functions by folding into specific spatial conformations. The function of a protein at the molecular level can be understood more clearly by determining its three-dimensional structure with techniques such as X-ray crystallography, Nuclear Magnetic Resonance (NMR) spectroscopy and cryo-electron microscopy. Each technique has its own strengths and limitations, and the techniques complement one another in structure determination.

1.6.1 History and application of macromolecular X-ray crystallography

X-rays were discovered in 1895 by Wilhelm Röntgen. In 1912, the wave nature of X-rays was demonstrated to the scientific community when the German physicist Max von Laue directed an X-ray beam through a crystal and observed an interference or diffraction pattern. In 1915, W.H. Bragg and W.L. Bragg won the Nobel Prize for Physics for the formulation of X-ray crystallography. They proved that the structure of a crystal at the molecular level could be deduced from the study of an interference pattern. Over the past 40 years, the technique has gained such momentum that nearly 35,000 protein structures have been determined with it. However, this number represents only a small fraction of all the proteins, known and unknown, that exist.

1.6.2 Protein crystallisation

Crystallisation is one of several means (including nonspecific aggregation / precipitation) by which a metastable supersaturated solution can reach a stable lower energy state by reduction of solute concentration (Weber, 1991). The three stages of crystallisation common to all molecules are nucleation, growth and cessation of growth.
Nucleation is the process by which non-crystalline aggregates that are free in solution come together to produce a thermodynamically stable aggregate with a repeating lattice, which must first exceed a specific size (the critical size) to become a supercritical nucleus capable of further growth. The extent to which nucleation occurs is determined by the degree of supersaturation of the solutes in the solution. The critical size is dictated by several operating conditions, such as temperature, supersaturation, pH and ionic strength. The most commonly used methods for crystallisation are the hanging drop and sitting drop vapor-diffusion methods, dialysis and the batch method. The hanging and sitting drop methods rely on the diffusion of a precipitant / volatile agent between a micro-drop of mother liquor and a much larger reservoir solution. Initial crystallisation trials generally apply the sparse matrix method. This method allows quick screening of several conditions, covering wide ranges of pH, salts, precipitants and additives / ligands, that are selected from known crystallisation conditions. Once small crystals are obtained, the corresponding condition(s) can be optimised further to obtain crystals of suitable size and quality for diffraction experiments.

1.7 BASIC CONCEPTS IN PROTEIN CRYSTALLOGRAPHY

Crystals are defined as solids with a periodic three-dimensional arrangement of an atomic structure. The simplest repeating unit in a crystal is called a unit-cell, with three basis vectors a, b and c, and inter-axial angles α, β and γ between them (Fig. 1.12). The smallest unique volume within the unit-cell that can be rotated and translated to generate one unit-cell is called the asymmetric unit. Unit-cells are classified into seven different systems based on their dimensions. The arrangement of molecules in a unit-cell is governed by symmetry (Section 1.7.1), and this symmetrical arrangement defines the system of that crystal.
Table 1.1 shows the seven crystal systems.

Figure 1.12. The unit-cell.

1.7.1 Lattices, point groups and space groups

In a crystal, unit-cells are arranged in a contiguous way to fill space. If a point is taken to represent a whole unit-cell, then the array of all such points forms a lattice. Table 1.1 summarises the collection of 14 such lattices, known as Bravais lattices. The Bravais lattices are classified as primitive, P (a simple unit-cell with one point for each unit-cell), face centered, F (an additional lattice point at the center of each face), body centered, I (an additional point at the center of the cell) and end centered, C (an additional point at the center of one pair of opposite faces). Molecules are arranged in the unit-cell according to certain symmetry operations when packed into a crystal. A symmetry operation gives an identical or similar image of an object. Besides unit translations along the three unit-cell axes, called three-dimensional translation symmetry, the three crystallographic symmetry elements are rotation, reflection and inversion. The combination of these symmetry elements that acts on a unit-cell is commonly called a point group. There are 32 crystallographic point groups in total (11 with proper rotations and 21 with improper rotations).

Table 1.1. The crystal systems and their related unit-cells and lattices.

Rotation or reflection, when combined with translation, generates screw or glide symmetry, respectively. The combination of lattices and point groups leads to 230 different possible ways of molecular packing in a crystal, known as space groups, of which only 65 are applicable to protein crystals (McRee, 1999).

1.7.2 hkl plane

The diffraction effect of a crystal (Section 1.7.3) is well explained using the concept of hkl planes. When X-rays are diffracted by a crystal, each resulting spot is created by an imaginary set of parallel 'hkl' planes that slice the entire crystal in a particular direction.
The index h is the number of equal parts into which the set of planes cuts the X direction (or a-axis) of each unit-cell. Similarly, the indices k and l specify how many such planes exist per unit-cell in the Y and Z directions. The family of planes having indices hkl (h, k and l must be integers) is the hkl family of planes.

1.7.3 Principle of X-ray diffraction and Bragg's law

X-rays are electromagnetic waves that interact with the electrons in matter. X-rays scattered from different electrons travel different distances and therefore differ in their relative phases. Thus, the scattered waves interfere constructively when they are in phase and destructively when they are out of phase (Fig. 1.13).

Figure 1.13. (a) Constructive interference. (b) Destructive interference.

The geometric requirements for diffraction to occur were first explained by Bragg. Bragg showed that a set of parallel planes with indices hkl and interplanar spacing dhkl produces a diffracted beam when X-rays of wavelength λ impinge on the planes at an angle θ and are reflected at the same angle, only if θ meets the condition

$$2 d_{hkl} \sin\theta = n\lambda$$ (Eq. 1.1)

where n is an integer. This is known as Bragg's law for X-ray diffraction (Fig. 1.14).

Figure 1.14. Bragg's law.

1.7.4 Reciprocal space

A reciprocal lattice is defined as a discrete set of diffracted rays whose vectors are perpendicular to the real lattice planes from which they are derived.

Figure 1.15. Reciprocal lattice.

In Fig. 1.15, the Bragg planes and the incoming and reflected rays are shown, as before, for two diffraction angles, but now a vector perpendicular to the Bragg plane is added for each. The phase of a wave diffracting from an object lying between two Bragg planes depends on the fractional distance of the object from one Bragg plane to the next.
If the position of the object is considered as a vector, then the distance of that object from one of the Bragg planes can be obtained by projecting that vector on the plane normal. The phase shift, as a fraction of a full cycle, can then be obtained by dividing the projected distance by the Bragg spacing between the planes. Mathematically, we can carry out the projection and division together by giving the plane normal a length equal to the reciprocal of the Bragg spacing, and then computing the dot product between the position vector and the plane normal. Because the plane normal is a vector with a length reciprocal to the spacing in the object, we call it a vector in reciprocal space.

1.7.5 The Ewald sphere

Ewald's geometrical construction, Fig. 1.16, helps us visualise which Bragg planes are in the correct orientation to diffract.

Figure 1.16. The Ewald sphere.

Rearranging Bragg's law (Eq. 1.1, with n = 1) in reciprocal space gives

$$\sin\theta = \frac{1/d_{hkl}}{2/\lambda}$$ (Eq. 1.2)

Fig. 1.16 demonstrates how each reciprocal lattice point must be arranged with respect to the X-ray beam in order to satisfy Bragg's law and produce a reflection from the crystal. Since the Ewald sphere is three dimensional, reflections by hkl planes exposed to X-rays form a three-dimensional network of diffraction spots. If the Ewald sphere has a diameter of 2/λ, then any reciprocal-lattice point within a distance of 2/λ from the origin can be rotated into contact with the sphere to form a diffraction spot.

1.7.6 Fourier transformation and structure factor equation

The atomic arrangement (or electron density) in a crystal is related to all the diffraction spots obtained from that crystal through the Fourier transformation principle. The electron density at any point can be calculated using Eq. 1.3.

$$\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l} F_{hkl} \, e^{-2\pi i (hx + ky + lz)}$$ (Eq. 1.3)

In the above equation, we are transforming the reciprocal space of lattice planes to the real or direct space electron density at the point x, y, z.
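The Fourier synthesis of Eq. 1.3 can be tried out numerically. The following is a minimal one-dimensional sketch, not code from this work: the two "atoms", their scattering factors and coordinates, the cut-off h_max and the function names are all hypothetical, and the structure factors are generated with the standard sum over atoms that the text gives as Eq. 1.4. Summing those structure factors back with the sign convention of Eq. 1.3 should return a density that peaks at the atomic positions.

```python
import cmath
import math


def structure_factor(h, atoms):
    """F_h = sum_j f_j * exp(+2*pi*i*h*x_j) for a 1D cell (cf. Eq. 1.4)."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)


def density(x, factors, cell_length=1.0):
    """rho(x) = (1/V) * sum_h F_h * exp(-2*pi*i*h*x), the 1D analogue of Eq. 1.3."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total / cell_length


# Hypothetical toy cell: (scattering factor, fractional coordinate) pairs.
atoms = [(6.0, 0.20), (8.0, 0.65)]

# Truncate the sum at |h| <= h_max, as a finite resolution limit would.
h_max = 15
factors = {h: structure_factor(h, atoms) for h in range(-h_max, h_max + 1)}

# Evaluate the density on a grid of 100 points across the cell.
grid = [i / 100 for i in range(100)]
rho = [density(x, factors) for x in grid]

# The strongest peak should sit on the heavier atom.
peak = max(range(len(grid)), key=lambda i: rho[i].real)
print("highest density at x =", grid[peak])
```

Because the sum includes both h and -h, the imaginary parts cancel and the density is real. Lowering h_max broadens the peaks, which is the one-dimensional counterpart of a lower resolution limit.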
Thus, if we know Fhkl, the structure factor (a reciprocal space entity arising from diffraction by electrons), we can calculate the actual real structure (the density of electrons in real space). The structure factor Fhkl for a reflection h, k, l is a complex number, derived in a straightforward manner as follows:

$$F_{hkl} = \sum_{j=1}^{n} f_j \, e^{2\pi i (h x_j + k y_j + l z_j)}$$ (Eq. 1.4)

This is a simple summation that extends over all atoms j at fractional coordinates xj, yj and zj, where fj is the scattering factor of each atom, which depends on the atomic number of that atom and the diffraction angle of the corresponding reflection (h, k, l). Ironically, Eq. 1.4 shows that if we know the positions of all atoms (the structure), then we can easily calculate all structure factors and, with the help of Eq. 1.3, we can calculate the electron density at any point (again, the structure).

1.7.7 Phase problem

In order to compute Eq. 1.3, we need the amplitudes of all diffracted waves (involving the scattering factor component), which are obtained from the intensities of the diffraction spots, and also their relative phase shifts with respect to the origin of the unit-cell (involving the positions of all atoms), which cannot be measured directly. The immeasurability of the phase angle is known as the phase problem. The structure factor equation, Eq. 1.4, can be rearranged as shown in Eq. 1.5 to show the phase angle explicitly.

$$F_{hkl} = \sum_{j=1}^{n} f_j \left[ \cos 2\pi (h x_j + k y_j + l z_j) + i \sin 2\pi (h x_j + k y_j + l z_j) \right]$$ (Eq. 1.5)

The methods used for phase angle estimation are discussed in the next section. Solving the phase problem is the most crucial step in crystal structure determination.

1.8 STRUCTURE DETERMINATION

1.8.1 Solution to phase problem

Four methods are used to solve the phase problem in macromolecular structure determination. They are: direct methods, the heavy-atom method (or isomorphous replacement method), the anomalous scattering method (also called anomalous dispersion) and the molecular replacement method.
All these methods yield phase estimates for only a limited set of reflections. To improve the accuracy of the phases and to obtain an interpretable electron density map, refinement in both reciprocal and real space is carried out with the help of Fourier transformation.

1.8.2 Direct methods

This method relies on statistical relationships between sets of structure factors to deduce their phases. It requires the assumption that a crystal is made up of similarly shaped atoms with positive electron density everywhere. The direct methods estimate the initial phases for a selected set of reflections using a triple relation and then extend the phases to more reflections. A triple relation is one in which there is a trio of reflections where the intensity and phase of one reflection can be explained by the other two. The prime requirement for the direct methods to be successful in protein crystallography is very high resolution data (better than 1.2 Å). This has limited their usefulness, as only structures of small molecules can be solved easily, although the direct methods have been used to phase proteins of up to 1000 atoms.

1.8.3 Molecular replacement (MR)

The molecular replacement method is used when a protein molecule has high sequence similarity, and hence structural similarity, to an already solved protein structure (referred to as the search model). Usually, the Patterson function (a function derived using structure factors) of the search model is first correctly oriented in the new crystal unit-cell by means of rotation functions, and the model is then translated to achieve the best fit, as supported by a convincing correlation factor and residual factor (details of the residual factor are given in Section 1.8.7).

1.8.4 Multiple isomorphous replacement (MIR)

In this technique, an X-ray data set for a protein crystal (the native crystal) is collected first. Following this, the native crystal is soaked in a solution of a specific heavy atom, such as mercury, platinum or gold.
The goal is to obtain derivative crystals in which heavy atoms bind specifically and consistently to each protein molecule in the unit-cell. Another data set is collected for the derivative, and the positions of the heavy atoms are determined using difference Patterson maps. Once the initial heavy atom locations have been determined, the coordinates, occupancy and temperature factors of each heavy atom are refined. At least two derivatives that are essentially isomorphous to the native crystal are needed for successful structure determination by MIR.

1.8.5 Anomalous scattering

The atomic scattering factor of an atom has three components, as shown in Eq. 1.6.

f = f0 + f′ + if″ (Eq. 1.6)

f0 is a scattering term that is dependent on the Bragg angle, while the two terms f′ and f″ are independent of the scattering angle but dependent on the wavelength. These latter two terms represent the anomalous scattering that occurs at the absorption edge, when the X-ray photon energy is sufficient to promote an electron from an inner shell to the next shell. The dispersive term f′ reduces f0, whereas the absorption term f″ is 90° advanced in phase with respect to f′. This leads to a breakdown of Friedel's law (which in the normal case gives |Fhkl| = |F-h-k-l|), giving rise to anomalous differences that can be used to locate anomalous scatterers in a crystal, if any.

1.8.5.1 MAD

Isomorphous replacement has several problems: non-isomorphism between crystals (unit-cell changes, reorientation of the protein, conformational changes, changes in salt and solvent ions), problems in locating all the heavy atoms, problems in refining heavy-atom positions, occupancies and thermal parameters, and errors in intensity measurements. The use of the multiwavelength anomalous dispersion (MAD) method overcomes the non-isomorphism problems.
The method is similar to MIR in its use of difference Patterson maps to locate the scatterers, but here we calculate anomalous difference Patterson maps. Data are collected at several, typically three, wavelengths in order to maximise the absorption and dispersive effects. The method demands excellent optics for accurate wavelength setting with minimum wavelength dispersion. Generally, all data are collected from a single frozen crystal, with high redundancy in order to increase the statistical significance of the measurements and with as high a completeness as possible.

1.8.5.2 SAD

Single-wavelength anomalous dispersion (SAD) is a subset of MAD. It is becoming increasingly practical to collect data at the absorption peak and use density-modification protocols to break the phase ambiguity and provide interpretable maps.

1.8.6 Phase improvement

The experimentally determined phases are usually not accurate enough to give a completely interpretable electron-density map. A variety of density modification techniques, such as solvent flattening, histogram matching and non-crystallographic symmetry averaging, are therefore available to modify the electron density and improve the phases. Solvent flattening is a powerful noise suppression technique that removes negative electron density and sets the electron density in the solvent regions to a value lower than that of the protein electron density. Histogram matching alters the values of electron-density points to concur with an expected distribution of electron-density values. Non-crystallographic symmetry averaging imposes equivalence on electron density values when more than one copy of a molecule is present in the asymmetric unit.
Density modification is a cyclic procedure involving back-transformation of the modified electron-density map to give modified phases, recombination of these phases with the experimental phases and calculation of a new map, which is then modified iteratively until convergence.

1.8.7 Model building and refinement

A model of the subject protein is produced by fitting the components of the structure into the experimentally derived electron-density map, followed by refinement. Generation of an atomic model of the molecule(s) is a crucial step in the structure determination process. The model-building task may be far from straightforward, because the phase information may be poor and the resolution of the diffraction data may be limited. However, automation in recent years has enormously reduced the amount of time involved in manual model building. When an initial atomic model is available, a large number of geometrical restraints on the structure can be applied in the refinement and rebuilding process in order to generate better phases and, in turn, a better and more accurate atomic model. These steps are carried out iteratively to achieve gradual improvement of the model. Refinement of a model is the optimisation of a function of a set of observations by changing selected associated parameters, so that the atomic model agrees well with the diffraction data. Refinement programs refine the geometry and temperature factors of the model atoms to improve the agreement between the observed and calculated structure-factor amplitudes. Low-resolution data must be collected for proper evaluation of the structure because they are used in the bulk solvent correction. The highest possible resolution limit should be used because this maximises the accuracy and precision of the structure. This limit is determined by the signal-to-noise ratio [I/σ(I)], completeness and redundancy of the data within the highest resolution shell.
After several rounds of refinement and map fitting, the model is considered an acceptable final model. The agreement of the structural model with the experimentally observed X-ray diffraction data is measured by a 'residual' or 'reliability' factor (R-factor), Eq. 1.7. The progress of iterative real and reciprocal space refinement is monitored by computing the difference between the measured structure factor amplitude |Fobs| and the structure factor amplitude |Fcalc| calculated from the current model.

$$R = \frac{\sum_{hkl} \left| \, |F_{obs}| - |F_{calc}| \, \right|}{\sum_{hkl} |F_{obs}|}$$ (Eq. 1.7)

When the model converges to a correct structure, the difference between the measured and calculated F's will also converge. A desirable target R-factor for a protein model is less than 0.25.

1.8.8 Validation and structure deposition

The correctness and precision of the atomic parameters in a structure need to be assessed thoroughly, both during and after refinement. Validation of the correctness of the model of a macromolecule obtained from crystal structure analysis is necessary for several reasons. The number of measured unique reflections may not be significantly higher than the number of refined parameters, which grows with the number of atoms in the model, especially at low resolution. Moreover, a large percentage of reflection intensities may be rather weak in relation to their estimated uncertainties. In addition, some parts of the model are usually more flexible or disordered, and their conformation cannot be confidently defined in an electron-density map. When the model is refined without restraints, even at atomic resolution, these weakly defined regions of the model become very disordered, in contrast to well defined fragments (Dauter et al., 1992; Kleywegt, 2000). As long as both the R and Rfree (calculated with about 10% of the reflections, which are not included in refinement) values continue to decrease with each cycle of refinement, the process is repeated and an improved protein structure model can be built.
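The R and Rfree bookkeeping described above can be illustrated with a few lines of code. This is a minimal sketch using the conventional R-factor definition, with no scale factor or weighting; the amplitude values and the choice of held-out reflections are hypothetical, and real refinement programs handle scaling and test-set selection themselves.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical structure-factor amplitudes for ten reflections.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 50.0, 90.0, 30.0, 70.0, 10.0]
f_calc = [95.0, 84.0, 57.0, 44.0, 18.0, 55.0, 85.0, 33.0, 66.0, 14.0]

# Hypothetical test set: reflections flagged here are excluded from refinement
# and are used only to compute Rfree.
free_indices = {3, 7}

work = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc)) if i not in free_indices]
free = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc)) if i in free_indices]

r_work = r_factor([o for o, _ in work], [c for _, c in work])
r_free = r_factor([o for o, _ in free], [c for _, c in free])

print(f"R = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Since the test-set reflections never influence the model, Rfree is usually somewhat higher than R, and a widening gap between the two is a common warning sign of over-fitting.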
Furthermore, as torsion angles are usually not restrained during refinement, their agreement with the expected values in the Ramachandran plot (Ramachandran et al., 1963; Ramakrishnan & Ramachandran, 1965) and the presence of proper side chain rotamers are extremely useful in model validation. Once an X-ray structure of a macromolecule has been determined and has passed a proper validation process, the resulting model is deposited in the Protein Data Bank (PDB) (http://www.rcsb.org/pdb) and made available to the world.

1.9 OBJECTIVES OF THE PROJECTS

Kremen1, Dickkopf1 and MeCP2 are highly evolutionarily conserved proteins, each playing a significant and crucial role in the development of mammals. Krm1 and Dkk1 are in vivo binding partners that specifically inhibit the passage of the Wnt signal through cells by forming a ternary complex with another protein, LRP5/6, on the cell membrane. Internalisation of the complex from the membrane is triggered by Dkk1. The binding sites of Dkk1 for both Krm1 and LRP5/6 are located in the C-terminal cysteine-rich domain, Dkk1Cys2. However, binding of Krm1 to Dkk1 requires the presence of the entire extracellular domain. Structure determination of the proteins involved, especially Dkk1, is of special interest, as Dkk1 is an important therapeutic target, specifically for osteoporosis and cancers. Thus, our aim is to understand the mechanism of Wnt inhibition by solving the protein structures individually (Krm1 and Dkk1FL) and also as a complex (Krm1-Dkk1Cys2).

MeCP2, which belongs to the MBD family of proteins, is a key player in DNA methylation-dependent transcriptional repression in mammals. MeCP2 and its role in regulating the transcriptional machinery have been under scrutiny for the past 10 years. Although solved structures of the MBDs of MBD1 and MeCP2 are available, none of them clearly reveals the exact mechanism of MeCP2 in transcription inhibition and Rett syndrome.
We attempt to address this problem by solving the structure of full-length MeCP2 using X-ray crystallography.

CHAPTER 2: MATERIALS AND METHODS

2.1 PREPARATION FOR TARGET GENE AMPLIFICATION

2.1.1 Generation of cDNA using RT-PCR

Total RNA was extracted from an embryonic day (E) 14.5 mouse embryo using Trizol (Invitrogen) and the RNEasy (Qiagen) kit. RT-PCR was carried out according to the Superscript® III First Strand Synthesis system for RT-PCR protocol (Invitrogen). An RNA / primer mix was prepared as follows: 1 µl of dNTP mix (10 mM), 1 µl of oligo(dT)20 (50 µM) and 1 µl of total RNA (1.6 µg) obtained previously from a mouse embryo were combined in a 0.2 ml eppendorf tube and made up to a total volume of 10 µl with DEPC-treated water. RNA was denatured by incubating the primer mixture in a heat block at 65 °C for 5 min, before placing it on ice for 1 min. A 10 µl cDNA synthesis mix was prepared by adding the components in the following order: 2 µl of 10X RT Buffer, 4 µl of 25 mM MgCl2, 2 µl of 0.1 M DTT, 1 µl of 40 U/µl RNase OUT™ and 1 µl of 200 U/µl Superscript™ III RT. RNase OUT™ recombinant RNase inhibitor was included to minimise the degradation of RNA due to possible ribonuclease contamination. The cDNA synthesis mix was combined with the cooled RNA / primer mixture and incubated at 50 °C for 50 min. The cDNA synthesis reaction was then terminated at 85 °C for 5 min, before the tube was placed back on ice. This was followed by the addition of 1 µl of RNase H and further incubation at 37 °C for 20 min. RNase H was used to improve the sensitivity of the next PCR step by removing the RNA template. The newly synthesised cDNA was stored at -20 °C until further use for PCR.

2.1.2 Primer design for PCR

Primers were designed for PCR of the following genes:
1. 21-353 aa fragment containing the extracellular domain of Kremen1 (Mus musculus)
2. 1-272 aa, full length Dickkopf1 (Mus musculus)
3. 195-273 aa fragment containing the cysteine-rich domain 2 of Dickkopf1 (Mus musculus)
4.
1-487 aa, full length MeCP2 (human Sezary-4 lymphoma cell line)

The upstream and downstream oligonucleotide primers incorporated flanking restriction sites that were present in the multiple cloning site (MCS) of the corresponding vector used for subcloning and absent from the insert to be cloned.

2.2 PCR OPTIMISATION PROCEDURE

2.2.1 Kremen1

Gradient PCR was first performed using an Eppendorf® Mastercycler Personal PCR machine to optimise the annealing temperature across a broad range (55-67 °C), using a set of forward and reverse oligonucleotide primers with the respective RE sites flanking the primer ends. A 50 µl PCR reaction mixture was prepared for amplifying Kremen1, containing 25 µl of 2X ImmoMix™ (BioLine), 1 µl each of 10 µM upstream and downstream oligonucleotide primers and 2 µl of cDNA as the DNA template. A two-step gradient program was designed to amplify the insert in a 30-cycle PCR reaction. The first step involved 10 cycles of denaturation at 94 °C for 30 sec, annealing at 58 °C for 1 min and extension at 72 °C for 1 min, followed by another elongation at 72 °C for 3 min. The second step involved 20 cycles of denaturation at 94 °C for 30 sec, annealing at 66 °C for 1 min and extension at 72 °C for 1 min, followed by a final elongation at 72 °C for 5 min.

2.2.2 Dickkopf1 (Dkk1 FL and Dkk1 Cys2) and MeCP2

A 50 µl reaction mix containing 1 µl each of the upstream and downstream primers (10 µM), 2.5 µl of 10X Mg-free buffer, 1.5 µl of MgCl2, 1 µl of dNTP mix and 0.5 µl of Expand High Fidelity DNA polymerase (Roche) was used. The templates used for Dkk1 FL, Dkk1 Cys2 and MeCP2 were the pGEM-T-Easy clones containing the respective full length inserts. Touch-up PCR was used to amplify Dkk1FL, Dkk1Cys2 and MeCP2. The amplification was performed for 30 cycles, which involved denaturation at 94 °C for 30 sec, annealing at 60 °C for 30 sec and extension at 72 °C for 1 min.
The annealing temperature was raised by 1 °C every cycle for the first 10 cycles, which helps the primers to anneal over a gradation of temperatures. The next amplification step of 20 cycles involved denaturation at 94 °C for 30 sec, annealing at 65 °C for 30 sec and extension at 72 °C for 1 min, with a final extension at 72 °C for 5-10 min.

2.2.3 Agarose gel extraction of the PCR products

The PCR products were analysed on a 1% agarose gel for Krm1, Dkk1 FL and MeCP2 and on a 1.3% gel for Dkk1 Cys2. The piece of agarose gel containing the DNA band of interest was cut out with a scalpel and the DNA was purified using the QIAquick Gel Extraction kit (Qiagen) according to the standard manufacturer's protocol. DNA was eluted in 40-50 μl of elution buffer.

2.3 CLONING

2.3.1 pGEM-T-Easy cloning vector

After PCR and purification, the target DNA was first ligated into the pGEM®-T Easy vector (Promega) in a 10 μl reaction. The DNA concentration of the PCR product was checked using a NanoVue (GE Healthcare) spectrophotometer. An insert:vector molar ratio of 3:1 was used for the ligation reaction, according to the manufacturer's recommendations, and the amount of PCR product required was calculated using the following equation.

\[
\text{ng of insert} = \frac{\text{ng of vector} \times \text{kb size of insert}}{\text{kb size of vector}} \times \text{insert:vector molar ratio} \qquad \text{(Eq. 2.1)}
\]

1 μl of the vector, 5 μl of 2X ligation buffer and 0.5 μl of T4 DNA ligase (Promega) were used to set up a ligation reaction. A control ligation mixture without the PCR product was prepared in another tube. The ligation mixtures were incubated at room temperature for 1 hour before being transformed into DH5α competent cells.

2.3.2 Preparation of E. coli DH5α competent cells

A frozen glycerol stock of DH5α was streaked onto a Luria Bertani (LB) agar plate (Sigma) (without ampicillin) and incubated overnight at 37 °C. A fresh single colony was inoculated into 5 ml of LB broth and incubated in a shaker overnight at 37 °C.
The overnight culture was transferred into 100 ml of LB broth and incubated in a shaker at 37 °C with a shaking speed of 200 rpm. Cells were grown until the O.D. at 600 nm reached 0.4 - 0.6 and then transferred into 50 ml tubes and spun at 1,520 g at 4 °C for 10 min. The supernatant was removed and the cell pellet was resuspended gently in 40 ml of cold glycerol buffer (0.1 M CaCl2, 15% glycerol) and then incubated on ice for 30 min. Cells were spun at 1,520 g for another 10 min at 4 °C and the supernatant was discarded. The pellet was resuspended carefully in 4 ml of cold glycerol buffer, aliquoted at 50 μl per tube, frozen in liquid nitrogen and stored at -80 °C.

2.3.3 Transformation into DH5α competent cells and blue-white screening

A 50 μl aliquot of the previously prepared competent E. coli DH5α cells was thawed on ice for 5 min. 1-5 μl of the plasmid was added, mixed gently and kept on ice for 20 min. The cells were then heat shocked at 42 °C for 90 sec and immediately returned to ice for 2 min. 1 ml of LB medium was added to the tube and the cells were incubated with shaking at 37 °C for an hour. Cells were microfuged at 9000 rpm for 3 min and 900 μl of the supernatant was discarded. The cell pellet was resuspended in the remaining medium and plated on an LB agar plate supplemented with a suitable antibiotic for selection. Plates were then incubated overnight at 37 °C. Additionally, for blue/white screening of colonies with the pGEM-T-Easy vector system, 10 μl of 1 M IPTG and 20 μl of X-gal solution (Bio-Rad) were added to the LB plates (containing the appropriate antibiotic) before the cells were plated.

2.4 SCREENING OF TRANSFORMANTS

2.4.1 Colony PCR screening

The traditional method of colony PCR screening was used for fast screening of ligated transformants.
In this method, 10 well separated colonies were picked from the transformant plate and each was resuspended, by pipetting up and down, in 7 μl of autoclaved water in a separate tube. 1 μl of this cell suspension was used as the template for PCR. The PCR mix was prepared in the same way as the mix for target gene amplification (Section 2.2), and the PCR protocol specific to each gene target was used. The PCR products were analysed on an agarose gel to confirm the presence of amplified target inserts of the expected size.

2.4.2 Double digestion screening

Proper insertion of a gene into a target plasmid (gene orientation) was then verified by double digestion screening of colonies grown on an LB-agar plate with suitable selection antibiotics. Around 10 colonies (only white colonies in the case of blue-white screening) were selected for screening from the ligation plate. Usually, the ligation plate should have 2-3 fold more colonies than the control plate. The colonies were inoculated into 2 ml of LB broth containing suitable selection antibiotics and incubated overnight with shaking at 37 °C. Cell pellets were recovered by microfuging at 13,000 rpm for 3 min. Plasmid DNA was extracted from the pelleted cells using the Qiagen Miniprep kit and eluted with 40 μl of elution buffer. A double digestion mixture of 30 μl was set up containing 3 μl of the DNA purified from the colony, 0.5 μl of each of the two restriction enzymes (NEB) whose sites were present in the upstream and downstream primers of the amplified PCR product, and 3 μl of the compatible NEB buffer. All 10 tubes were incubated at 37 °C for 2.5 - 3 hours. 10 μl of each digestion product was run on a 1% agarose gel at 100 V to confirm the presence of the respective target insert in the vector.
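The insert band expected in these screens, and the insert mass called for by Eq. 2.1, both follow from the construct boundaries listed in Section 2.1.2. The sketch below is illustrative only. The function names are hypothetical, and the 3.015 kb vector size and 50 ng vector amount are assumptions taken as typical for pGEM-T Easy rather than values stated in this chapter. Stop codons and flanking restriction sites are ignored, so the computed lengths are approximate.

```python
def coding_length_bp(first_aa: int, last_aa: int) -> int:
    """Length in bp of the coding region for residues first_aa..last_aa (inclusive)."""
    return (last_aa - first_aa + 1) * 3


def insert_mass_ng(vector_ng: float, insert_kb: float, vector_kb: float,
                   molar_ratio: float = 3.0) -> float:
    """Eq. 2.1: ng of insert needed for a given insert:vector molar ratio."""
    return vector_ng * insert_kb / vector_kb * molar_ratio


# Residue ranges taken from Section 2.1.2
targets = {
    "Krm1": (21, 353),
    "Dkk1 FL": (1, 272),
    "Dkk1 Cys2": (195, 273),
    "MeCP2": (1, 487),
}

VECTOR_KB = 3.015  # assumption: size of the pGEM-T Easy vector
VECTOR_NG = 50.0   # assumption: amount of vector in one ligation

for name, (first, last) in targets.items():
    bp = coding_length_bp(first, last)
    ng = insert_mass_ng(VECTOR_NG, bp / 1000, VECTOR_KB)
    print(f"{name}: {bp} bp insert, {ng:.1f} ng for a 3:1 ligation")
```

The computed lengths (999, 816, 237 and 1461 bp) are close to the 1 kb, 0.8 kb, 0.234 kb and 1.46 kb products reported in Figs. 3.1 and 3.2.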
2.4.3 Agarose gel electrophoresis

Agarose gel electrophoresis was performed using 1% UltraPure™ agarose (Invitrogen) in 60-100 ml of 1X Tris-acetate (TAE) running buffer, containing 2 μl of ethidium bromide (10 mg/ml) solution. Each DNA sample was first mixed with 2 μl of Blue/Orange 6X loading dye (Promega). 3 μl of 1 kb DNA ladder (Promega) was similarly mixed with the dye and loaded alongside the sample wells as a marker for estimation of molecular weight. After loading, the gel was run at 90-100 V for about an hour, until the loading dye reached the end of the gel. DNA bands were then visualised under an ultraviolet (UV) transilluminator using the GeneSnap machine (SynGene BioImaging System). Extra care was taken to keep all the equipment and reagents used for gel electrophoresis separate, and gloves were worn at all times to prevent contamination with toxic EtBr.

2.4.4 Plasmid DNA sequencing

2.4.4.1 Cycle sequencing PCR

1 μl of plasmid DNA containing about 250 ng of DNA was mixed with 4 μl of water, 1 μl of the respective vector specific sequencing primer (3.2 μM), 2 μl of 5X Big Dye Buffer and 2 μl of Big Dye Terminator V3.1 (Applied Biosystems). 25 cycles of PCR were carried out with denaturation at 96 °C for 30 sec, annealing at 50 °C for 15 sec and extension at 60 °C for 4 min.

2.4.4.2 Ethanol precipitation

DNA was then precipitated by adding 1/10th volume of 3 M sodium acetate (pH 5.2) and mixing by brief vortexing. Then, 25 μl of ice cold 100% ethanol was added. The mixture was vortexed well and placed in a -80 °C freezer for 15 min before the tubes were microfuged at 13000 rpm for 20 min at 4 °C. The supernatant was removed and the pellet was washed with 500 μl of 70% ethanol and microfuged at 13000 rpm for 5 min at 4 °C. The supernatant was carefully aspirated and the sample was dried in a vacuum centrifuge before sequencing on an automated ABI Prism 3100 (Applied Biosystems) sequencer.
2.5 SUBCLONING INTO EXPRESSION VECTORS

The sequence verified pGEM-T Easy clones containing the target inserts were used for subcloning into expression vectors compatible with bacterial (E. coli), yeast (Saccharomyces cerevisiae) and baculovirus (SF9 insect cells) expression.

2.5.1 Subcloning target genes into E. coli and baculovirus vectors

After sequence verification, the target insert was excised from the pGEM-T Easy vector using the restriction enzymes whose sites flank its ends. The expression vector of interest was digested with the same set of restriction enzymes. A 50 μl double digestion mixture was set up containing 5 μl of the pGEM-T Easy clone carrying the target insert (about 50 ng/μl) and 20 U of each of the respective restriction enzymes in 5 μl of the compatible buffer. The vector was digested in the same way as the insert, and 2 such replicate tubes of each were incubated for 4 - 5 hours at 37 °C. After incubation, the replicate tubes were pooled separately for insert and vector. To prevent the vector from self-ligating after digestion, it was de-phosphorylated at the 5´ termini using calf intestinal phosphatase, CIP (NEB). 1 μl of CIP was added to 100 μl of the double digested vector and incubated for 1 hour at 37 °C. Both the double digested insert and vector were purified using the PCR Purification kit (Qiagen) to remove the buffers. DNA was eluted in 40 - 50 μl of elution buffer. The digested products were ligated as described for the pGEM-T Easy vector (Section 2.3.1), alongside a control reaction. This was followed by transformation of 5 μl of the ligation product into E. coli DH5α cells. The transformed colonies were then screened by double digestion and final confirmation was obtained by plasmid DNA sequencing.

2.5.2 Subcloning of gene targets into S. 
cerevisiae

PTS210 is a yeast expression vector, a YCp50-based construct containing the GAL1/GAL10 promoter, a short polylinker (BamHI, HindIII, XbaI) and the ACT1 transcriptional terminator (a generous gift from Prof. Davis Ng, Temasek Lifesciences Laboratory, Singapore). Gene targets Krm1 and Dkk1 FL were amplified with primers carrying the BamH1 restriction site at both the upstream and downstream ends. Blunt end ligation was performed after Klenow treatment of the inserts and the vector. The vector and the inserts were digested in a 25 μl reaction at 37 °C overnight using the procedure described in Section 2.4.2 and then subjected to Klenow treatment, in which 0.83 μl of 1 mM dNTPs and 0.2 μl of Klenow (5000 U/ml) were added. The mixture was incubated for 15 min at 25 °C, the samples were gel extracted (Section 2.2.3) and the final DNA was eluted in 10 μl of elution buffer. The vector was then subjected to the CIP treatment described in Section 2.5.1, followed by heat inactivation at 55 °C for 30 min. Phenol/chloroform extraction, followed by ethanol precipitation, was then performed.

2.5.3 Phenol/chloroform treatment and ethanol precipitation

100 μl of phenol/chloroform mixture and 50 μl of dH2O were added to the above tube, vortexed well and microfuged at a maximum speed of 13000 rpm for 5 min at room temperature. The aqueous phase of about 100 μl was transferred to a fresh tube and the interface and organic phase were discarded. The extraction was repeated once more, starting from the addition of phenol/chloroform. After this, ethanol precipitation (Section 2.4.4.2) was performed. The DNA pellet was resuspended in 10 μl of water. Ligation of vector and insert, transformation, colony PCR screening and sequence verification were performed as explained earlier, before protein expression trials.

2.6 PROTEIN EXPRESSION AND PURIFICATION

2.6.1 Transformation and small scale expression in E. 
coli

The plasmids that contained the gene inserts of Krm1, Dkk1 FL, Dkk1 Cys2 and MeCP2 were transformed into several E. coli host strains for expression of recombinant proteins. The transformation procedure was carried out as described in Section 2.3.3. A few well separated colonies were picked, inoculated into 2 ml of LB medium with the appropriate antibiotic/s and incubated overnight at 37 °C in a shaker. The next morning, protein expression was tested on a small scale (10-50 ml of fresh LB medium) containing the appropriate antibiotic/s for the host strains chosen. The LB medium was inoculated and grown at 37 °C until the cell density reached 0.4 - 0.6 at 600 nm. Induction was then performed using IPTG concentrations ranging from 0.1 to 1 mM in different tubes, and protein expression was carried out either at 37 °C for 4 - 5 hours or at 16 °C for 12-16 hours. The samples were then pelleted by spinning at 8,983 g for 20 min at 4 °C. The pellets from 2 ml samples were resuspended in 200 μl of 8 M urea lysis buffer and vortexed thoroughly to lyse the cells. Samples of whole cell lysate were analysed on SDS-PAGE. Samples showing any expression in comparison with the controls (uninduced and empty vector) were tested for solubility. The cell pellets were resuspended in 2 ml of lysis buffer (Tris-Cl or 1X PBS) and sonicated at an amplitude of 22% for 10 sec with pulses of 1 sec on and 1 sec off. Following this, samples were microfuged at 13000 rpm for 3 min and the supernatant and pellets were analysed separately on SDS-PAGE.

Table 2.1. Expression trials for proteins. pFastbacHTB is a baculoviral vector. PTS210 is a yeast vector (S. cerevisiae). The other vectors mentioned in the table are compatible with E. coli. The pI and molecular weight (kDa) of the targets are also given. The manufacturers of the vectors are: pGEX4T1 (GE Healthcare Life Sciences), pET32a, pETDuet1 and pET14b (Novagen), pFastbacHTB (Invitrogen) and pQE30 (Qiagen).

No.  Target            pI / MW (kDa)    Expression vectors used for protein expression
1    Dkk1 FL           8.74 / 29.297    pGEX4T1, pET32a, PTS210, pFastbacHTB
2    Dkk1 Cys2         9.30 / 8.839     pET32a
3    Krm1              5.62 / 36.487    pQE30, pGEX4T1, pET32a, pETDuet1, PTS210, pFastbacHTB
4    Dkk1 FL + Krm1    -                pETDuet1
5    MeCP2             9.95 / 52.440    pET14b, pET32a, pFastbacHTB

Dkk1 Cys2 and MeCP2, cloned in the pET32a vector, were transformed into the E. coli BL21 (DE3) host strain for protein expression in 1 L cultures. Both proteins were expressed at 37 °C with a shaking speed of 200 rpm, after induction with 0.4 and 0.5 mM IPTG, respectively.

2.6.2 Protein expression in yeast

2.6.2.1 Transformation in yeast (W303a, pep4::HIS3 strain)

About 3 μl of the plasmid DNA (Krm1 or Dkk1 FL cloned into the PTS210 vector; about 500 ng) was mixed with 100 μl of PLATE and 2 μl of sonicated salmon sperm DNA (Stratagene) at a concentration of 10 μg/μl. Yeast cells of the W303a, pep4::HIS3 strain were spread on a YPD (Yeast Peptone Dextrose, Sigma) plate and incubated at 30 °C overnight. A few of the grown cells were picked from the plate using a toothpick and added to the vial containing the PLATE-DNA mixture. The contents were mixed well and incubated at room temperature for between 4 hours and overnight (about 8 hours is best). The contents were then microfuged at 9000 rpm, most of the supernatant was discarded and the pellet was resuspended in about 20-30 μl of residual supernatant. This mixture was vortexed, plated on a YPD plate and incubated at 30 °C for 2 days.

2.6.2.2 Preparation of protein for western blotting

The protein samples expressed in yeast were prepared for western blot analysis using the 10% trichloroacetic acid (TCA) cytosolic protein extraction method. Colonies grown on the YPD transformation plate were picked and grown to log phase in minimal synthetic complete medium lacking uracil (SC-Ura medium with 2% glucose for the uninduced control and SC-Ura medium with 2% galactose for induced samples).
Cells at an OD600 of 2 units were taken in a 15 ml Falcon tube and centrifuged at 1500 rpm for 5 min at 4 °C. The pellet was resuspended in 1 ml of 10% TCA on ice. 1 ml of zirconium beads (BioSpec) was added on ice and the cells were lysed by two 30 sec cycles in a mini-bead beater at the high setting. The lysate was then transferred to a 1.5 ml Eppendorf tube. The beads were washed with 400 μl of 10% TCA and the lysate was microfuged at 14000 rpm for 10 min at 4 °C. The supernatant was discarded by aspiration and the pellet containing the precipitated protein was resuspended in 80 μl of TCA resuspension solution (Appendix). The contents were boiled for 10 min at 100 °C and vortexed a few times, followed by microfuging at 14000 rpm for 15 min at 4 °C. 30 μl of the supernatant was transferred to a new tube, 30 μl of 2X SDS gel loading buffer was added and the sample was boiled for 10 min at 100 °C.

2.6.2.3 Western blotting

The polyvinylidene difluoride (PVDF) membrane used in western blotting was first soaked in 10 ml of 100% methanol for 10 sec, washed twice with de-ionised water and equilibrated in transfer buffer (48 mM Tris, 39 mM glycine, 20% methanol, 1.3 mM of 10% SDS). Methanol is necessary to prevent the gel from expanding through heating during the transfer process and to allow the protein to adsorb to the membrane. The gel, obtained after SDS-PAGE with pre-stained protein ladder markers, was then placed on the equilibrated PVDF membrane supported on three layers of filter paper pre-soaked in the transfer buffer. This was then overlaid with another three layers of pre-soaked filter paper. Care was taken to exclude any air bubbles between the gel, membrane and filter paper to permit proper flow of electric current. Protein transfer was then carried out by electrophoresis using a Trans-Blot SD Semi-Dry Transfer Cell (Bio-Rad) at 15 mA for 90 min.
After the transfer, the membrane with the bound proteins was blocked overnight with 5% milk powder in PBST (1X PBS + 0.05% Tween 20, a non-ionic detergent used to break unwanted protein-lipid interactions) with gentle rocking at 4 °C. This step is critical for preventing non-specific binding of immunological reagents to the membrane, which reduces the background noise on the blot that can lead to false positives. Following blocking, the membrane was washed twice for 5 min with 15 ml of PBST. The mouse primary anti-His antibody (the protein of interest has a 6X His tag) was then added for an hour to allow binding to the protein of interest. After that, the membrane was washed with PBST five times for 10 min each to remove any unbound primary antibody before the addition of the goat anti-mouse IgG secondary antibody. After incubation with the secondary antibody for an hour, the membrane was again washed with PBST three times for 10 min each. Subsequently, enhanced chemiluminescent (ECL) developing reagents from the SuperSignal® West HisProbe™ kit (Pierce) were applied evenly to the membrane, which was exposed to X-ray film. The film was then developed and analysed for protein bands.

2.7 PROTEIN PURIFICATION

2.7.1 Using affinity chromatography

Cells from a four liter pellet of Dkk1 Cys2 and a two liter pellet of MeCP2 were each resuspended in 100 ml of lysis buffer [50 mM Tris-HCl (pH 7.0 for Dkk1 Cys2 and pH 8.5 for MeCP2), 200 mM NaCl, 20 mM imidazole, 5 mM βME]. The cell suspension was divided into 25 ml portions and subjected to two cycles of lysis in a French press (Thermo Electron Corporation) at a cell pressure of 12000 psi followed by 14000 psi. The suspension after lysis was centrifuged for 30 min at 39,391 g at 4 °C. The supernatant was collected and added to a 5 ml bed volume of Talon® Metal Affinity Resin (Clontech), pre-equilibrated with the Tris-NaCl lysis buffer.
The protein was allowed to bind to the resin for one hour at 4 °C on a rocking platform. Contaminant proteins loosely bound to the affinity column were removed with three washes using wash buffer 1 [50 mM Tris-HCl (pH 7.0 for Dkk1 Cys2 and pH 8.5 for MeCP2), 200 mM NaCl, 20 mM imidazole, 5 mM βME, 0.5% Tween 20] followed by two additional washes with wash buffer 2 [50 mM Tris-HCl (pH 7.0 for Dkk1 Cys2 and pH 8.5 for MeCP2), 200 mM NaCl, 40 mM imidazole, 5 mM βME]. His-tagged MeCP2 was eluted using elution buffer 1 [50 mM Tris-HCl (pH 8.5), 200 mM NaCl, 250 mM imidazole, 5 mM βME] and elution buffer 2 [50 mM Tris-HCl (pH 8.5), 200 mM NaCl, 400 mM imidazole, 5 mM βME] in a final volume of 10 ml. SDS-PAGE of samples from each stage of the process was carried out to confirm the success of the purification step. The resin was regenerated with 20 mM MES buffer (pH 5.0) containing 0.1 M NaCl, followed by five bed volumes of distilled water, and stored in 20% ethanol at 4 °C for further use.

2.7.2 Enterokinase cleavage

Cleavage of the N-terminal tags (thioredoxin Trx tag™, 6X His tag® and the S-protein affinity containing S tag™) from the Dkk1 Cys2 fusion protein in pET32a was attempted using the protease enterokinase (Roche). 25 μg of the fusion protein was incubated with 0.6 μg of enterokinase according to the standard manufacturer's protocol. The protein was incubated at 4 °C, at 16 °C and at room temperature, and 5 μg samples were collected at different time points (1, 3, 6, 12 and 24 hours) for analysis by SDS-PAGE.

2.7.3 Thrombin cleavage

Thrombin cleaves only the Trx tag and the N-terminal His tag from the fusion protein, leaving the S-tag (15 aa) behind. Thrombin protease (Amersham Biosciences) was added at a concentration of 5 U/mg of fusion protein, i.e., about 200 units for a four liter pellet, and cleavage was carried out on-column at 4 °C in 10-15 ml of Tris-Cl lysis buffer for 20 hours.
The majority of the cleaved Dkk1 Cys2 protein was obtained in the flow through from the column, and a little in the following two washes with the Tris-Cl lysis buffer. The Talon purified and cleaved protein still contained some contaminant proteins, which were visualised by SDS-PAGE analysis.

2.7.4 Gel filtration

The Trx and N-His tag cleaved protein was further purified by size exclusion chromatography on a HiLoad 16/60 Superdex-75 column (Amersham Biosciences) using the Tris-Cl running buffer [50 mM Tris-HCl (pH 7.0), 200 mM NaCl, 5 mM βME]. The gel filtration column was pre-equilibrated with at least 1.5 column volumes of running buffer before use. The protein sample was concentrated to about 2.5-3 mg/ml prior to loading onto the column. Chromatography was carried out at a flow rate of 0.7 ml/min with a total buffer volume of 130 ml. Eluted fractions corresponding to the protein peak at 280 nm were analysed by SDS-PAGE, and pure fractions were pooled and concentrated using a 5000 Da MWCO Microcon (Millipore) at 755 g and 4 °C. The concentrated protein was stored at -80 °C until it was set up for crystallisation trials.

2.7.5 Sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE)

12.5% / 15% resolving gels and 4% stacking gels were cast according to the recipe in the Appendix using the Mini-PROTEAN 3 system (Bio-Rad). Protein samples were mixed with an equal volume of 2X SDS sample loading buffer and boiled for 5 minutes at 100 °C in a heat block. 10 μl of each sample was loaded into the wells, and electrophoresis was carried out at 120 V until the marker dye reached the end of the gel. The gel was stained with Coomassie blue staining solution for about 20 min and then destained overnight using destaining solution (Appendix).
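The buffer volume and run time implied by the gel filtration settings in Section 2.7.4 can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the 120 ml bed volume is an assumed nominal value for a 16/60 column and is not given in the text, equilibration is assumed to run at the same flow rate as the separation, and the helper names are hypothetical.

```python
COLUMN_VOLUME_ML = 120.0  # assumption: nominal bed volume of a 16/60 column
FLOW_RATE_ML_MIN = 0.7    # flow rate stated in Section 2.7.4
RUN_VOLUME_ML = 130.0     # total buffer volume stated in Section 2.7.4


def equilibration_volume_ml(column_volumes: float,
                            cv_ml: float = COLUMN_VOLUME_ML) -> float:
    """Buffer needed to equilibrate with the given number of column volumes."""
    return column_volumes * cv_ml


def run_time_min(volume_ml: float,
                 flow_ml_min: float = FLOW_RATE_ML_MIN) -> float:
    """Time in minutes to pump volume_ml at a constant flow rate."""
    return volume_ml / flow_ml_min


equil_ml = equilibration_volume_ml(1.5)
print(f"Equilibration: {equil_ml:.0f} ml, about {run_time_min(equil_ml) / 60:.1f} h")
print(f"Separation: {RUN_VOLUME_ML:.0f} ml, about {run_time_min(RUN_VOLUME_ML) / 60:.1f} h")
```

Under these assumptions the separation itself takes roughly three hours, which is useful when planning fraction collection and the SDS-PAGE analysis that follows.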
2.8 BIOPHYSICAL CHARACTERISATION

2.8.1 Analysis of purity and homogeneity

Concentrated Dkk1 Cys2 protein was analysed for purity and homogeneity by SDS-PAGE, Native-PAGE and Dynamic Light Scattering (DynaPro from Protein Solutions). Peptide mass fingerprinting was used to confirm the identity of the purified protein.

2.8.2 Circular dichroism

Circular dichroism (CD) experiments were carried out for Dkk1 Cys2 with 25 μM protein in 50 mM Tris-Cl buffer (pH 7.5) at room temperature. The spectra were acquired with a J-810 Spectropolarimeter (Jasco, Japan) using a quartz cuvette with a 1 mm path length (Hellma). The spectra were recorded over the wavelength region from 190 nm to 260 nm at 0.1 nm resolution, using a scan speed of 50 nm/min and a response time of 8 seconds, and were averaged over 3 scans to reduce the background noise.

2.9 CRYSTALLISATION TRIALS

The concentrated Dkk1 Cys2 protein was set up for crystallisation trials using the hanging drop vapor diffusion method at concentrations ranging from 4-8 mg/ml with Screen 1 and 2 (Hampton Research). The protein was also set up under oil using the microbatch method with the Jena Screen buffers Classic (1-10) in a protein:precipitant volume ratio of 1:1. 14 μl of paraffin oil was used to protect the sample from oxidation.

CHAPTER 3: RESULTS AND DISCUSSION

3.1 PCR OPTIMISATIONS

For Krm1, PCR optimisation was initially tried with a single step annealing PCR protocol using the Expand High Fidelity PCR System. The forward and reverse primers carried BamH1 and HindIII restriction sites, respectively. The program contained a denaturation step at 95 °C for 3 min, followed by 30 cycles of amplification (denaturation at 94 °C for 30 sec, annealing at 55 °C for 30 sec and extension at 72 °C for 1 min). This was followed by a final extension at 72 °C for 5 min. The above experiment did not yield any Krm1 PCR product.
The protocol was then modified to incorporate two steps of annealing in order to allow the primers to bind more specifically to the target gene. All the other PCR conditions were kept constant. A broad range of temperatures was used for each of the annealing steps: 52-59 °C for the first step and 55-68 °C for the second step. This trial again failed to give a Krm1 PCR product, although some non-specific primer-dimer formation was observed on the agarose gel. The high GC content (>60%) of Krm1 could give rise to secondary structures, which create problems at different stages of the polymerase chain reaction: (1) denaturation of the template, (2) annealing of the primer and (3) extension of the primer by Taq polymerase. Various concentrations of DMSO (1.5 to 5%) were used as an adjuvant in the PCR mixture to facilitate DNA strand separation (in GC rich, difficult secondary structures), because DMSO disrupts base pairing and has been shown to improve PCR efficiency. When this experiment also failed to give the Krm1 PCR product, 2X ImmoMix™ [Immolase™ DNA polymerase, 32 mM (NH4)2SO4, 134 mM Tris-HCl (pH 8.3 at 25 °C), 0.02% Tween 20, 2 mM dNTPs, 3 mM MgCl2] was included in the PCR mixture. This polymerase is a heat-activated thermostable DNA polymerase used mostly for T/A cloning; it is highly specific and can eliminate non-specific products such as primer-dimers and mis-primed products. A single step annealing PCR protocol with Immolase DNA polymerase gave many non-specific bands, as shown in Fig. 3.1a. Finally, a single band PCR product was obtained for Krm1 using a two step gradient PCR, with annealing temperatures optimised at 58 and 66 °C for the first and second steps, respectively (Fig. 3.1b).

[Gel images, panels a and b; marker band labelled 1 kb in each panel]

Figure 3.1. PCR optimisations of Krm1 (a) PCR amplification using 2X Immomix and single step annealing PCR program. 
Annealing at 55 °C and 30 cycles of amplification gave many non-specific bands (b) Optimised two-step gradient protocol with annealing temperatures at 58 and 66 °C, respectively, using 2X Immomix gave Krm1 at the right size of 1 kb. 1% agarose gel was run.

For the amplification of Dkk1 FL, Dkk1 Cys2 and MeCP2, touchup PCR was used. Such a PCR protocol allows for relaxed and specific annealing of primers, which gives a better yield of PCR product without the product of interest being swamped by non-specific amplifications. The earliest cycles of a touchup PCR have lower annealing temperatures (58 to 60 °C was used). The annealing temperature is then increased in increments (usually of 1 °C) for every subsequent set of cycles. Amplification of the above three targets was thus quite straightforward using a high fidelity Taq polymerase, with full length Dkk1 and MeCP2 in the pGEM-T-Easy vector as templates.

[Gel images, panels a and b; marker bands labelled 1.5 kb and 1 kb in panel a and 0.5 kb, 250 bp and 200 bp in panel b]

Figure 3.2. Touchup PCR products of MeCP2, Dkk1 FL and Dkk1 Cys2 (a) 1% agarose gel for MeCP2 (1.46 kb) and Dkk1 FL (0.8 kb) (b) 1.3% agarose gel for Dkk1 Cys2 (0.234 kb).

3.2 MOLECULAR CLONING

3.2.1 T/A cloning and blue white screening

All the amplified PCR products were initially cloned into the high copy vector pGEM-T-Easy. This is a T/A cloning vector in which single 3´-T overhangs at the insertion site greatly improve the efficiency of ligation of a PCR product into the plasmid by preventing re-circularisation of the vector. The Expand High Fidelity DNA polymerase provides the compatible 'A' overhangs on the PCR product: it adds a deoxyadenosine ('A'), in a template-independent fashion, to the 3´ ends of the amplified gene fragments. The constructs thus generated were sequence verified using the pairwise alignment algorithm from EBI (European Bioinformatics Institute) against the gene sequence obtained from Pubmed (desired sequence).
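Sequence verification of this kind amounts to a global pairwise alignment of the sequenced insert against the reference, followed by a count of substitutions and gaps. The sketch below is a minimal Needleman-Wunsch implementation in plain Python with arbitrary scoring values (match +1, mismatch -1, gap -2). It is not the EBI tool used in this work, the function names are hypothetical, and the example sequences are made up for illustration.

```python
def global_align(ref: str, query: str, match: int = 1,
                 mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    rows, cols = len(ref) + 1, len(query) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if ref[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + step,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_ref, out_query = [], []
    i, j = len(ref), len(query)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if ref[i - 1] == query[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_ref.append(ref[i - 1])
                out_query.append(query[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_ref.append(ref[i - 1])
            out_query.append("-")
            i -= 1
        else:
            out_ref.append("-")
            out_query.append(query[j - 1])
            j -= 1
    return "".join(reversed(out_ref)), "".join(reversed(out_query))


def summarise(aligned_ref: str, aligned_query: str) -> dict:
    """Count substitutions and indels; crude frameshift flag from the net indel."""
    subs = sum(1 for a, b in zip(aligned_ref, aligned_query)
               if a != b and "-" not in (a, b))
    insertions = aligned_ref.count("-")   # bases present only in the clone
    deletions = aligned_query.count("-")  # bases missing from the clone
    return {
        "substitutions": subs,
        "insertions": insertions,
        "deletions": deletions,
        "frameshift": (insertions - deletions) % 3 != 0,
    }


reference = "ATGGCTTCTAAGGAA"   # made-up reference fragment
clone_sub = "ATGGCTTCTAAGGCA"   # one substitution
clone_del = "ATGGCTCTAAGGAA"    # one base deleted

for label, clone in (("substitution", clone_sub), ("deletion", clone_del)):
    print(label, summarise(*global_align(reference, clone)))
```

A clone would pass this check only if it shows no substitutions, insertions or deletions relative to the desired sequence. For real constructs a maintained alignment tool is the safer choice, since this sketch uses quadratic memory and a simple linear gap penalty.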
3.2.2 Subcloning of the gene inserts into expression vectors

After sequence verification, the target gene inserts Krm1, Dkk1 FL, Dkk1 Cys2 and MeCP2 were excised from the pGEM-T-Easy vector by double digestion with the respective restriction enzymes and then gel extracted. They were subcloned into different expression vectors compatible with bacteria (E. coli). Subcloning was also performed into vectors compatible with protein expression in yeast (S. cerevisiae) and baculovirus (SF9 insect cells), Table 3.1. The vectors carried different types of affinity purification tags, such as the His-tag and GST-tag. All the constructs were double digested with the corresponding restriction enzymes and completely sequence verified to check for the absence of erroneous mutations (frame shift and substitution) in the target gene sequence and for alignment of the target gene in frame with the start codon ATG in the vector. Figs. 3.3, 3.4 and 3.5 show the double digestion results for the Krm1, Dkk1 FL and MeCP2 constructs, respectively.

For cloning into the yeast vector PTS210, blunt end ligation was used because several attempts at sticky end ligation using the BamH1 site in the MCS of the vector had failed. Klenow treatment was therefore performed to blunt the BamH1 sticky overhangs at both the 5' and 3' ends of the vector and the inserts (Krm1 and Dkk1 FL). Blunt end ligation is advantageous because the vector and insert ends join more easily owing to the lack of specificity (unlike in sticky end ligation). However, the insert may not be oriented in the right direction in the vector. The orientation was therefore checked by sequencing the positive constructs (confirmed by colony PCR, Figs. 3.6 and 3.7) using PTS210 vector specific forward and reverse primers.

Table 3.1. Target proteins with the corresponding expression systems and vectors used.

S. No  Target protein    Expression system  Expression vector (size)  Restriction sites used for cloning
1      Krm1              E. coli            pQE30 (3.4 kb)            BamH1, HindIII
                         E. coli            pGEX4T1 (4.969 kb)        BamH1, Not1
                         E. coli            pET32a (5.9 kb)           BamH1, Xho1
                         E. coli            pETDuet1 (5.42 kb)        Nde1, Xho1
                         S. cerevisiae      PTS210 (7.95 kb)          Blunt end ligation
                         Baculovirus        pFastbacHTB (4.856 kb)    BamH1, Xho1
2      Dkk1 FL           E. coli            pGEX4T1 (4.969 kb)        BamH1, Not1
                         E. coli            pET32a (5.9 kb)           BamH1, Not1
                         S. cerevisiae      PTS210 (7.95 kb)          Blunt end ligation
                         Baculovirus        pFastbacHTB (4.856 kb)    BamH1, HindIII
3      Dkk1 Cys2         E. coli            pET32a (5.9 kb)           BamH1, Xho1
4      Dkk1 FL + Krm1    E. coli            pETDuet1 (5.42 kb)        Dkk1 FL (BamH1, HindIII), Krm1 (Nde1, Xho1)
5      MeCP2             E. coli            pET14b (4.671 kb)         Nde1, BamH1
                         E. coli            pET32a (5.9 kb)           BamH1, Xho1
                         Baculovirus        pFastbacHTB (4.856 kb)    BamH1, Xho1

[Gel image; marker bands labelled 5 kb, 3.0 kb, 1.0 kb and 0.5 kb]

Figure 3.3. Double digested products of Krm1 constructs. Lanes: M – 1 kb DNA ladder; 1 – pGEM-T-Easy-Krm1 (digested with BamH1 and HindIII); 2 – pQE30-Krm1 (BamH1, HindIII); 3 – pGEX4T1-Krm1 (BamH1, Not1); 4 – pET32a-Krm1 (BamH1, Xho1); 5 – pETDuet1-Krm1 (Nde1, Xho1); 6 – pFastbacHTB (BamH1, Xho1). 1% agarose gel was run.

[Gel image; marker bands labelled 5 kb, 3.0 kb, 1.0 kb and 0.5 kb]

Figure 3.4. Double digested products of Dkk1 FL constructs. Lanes: M – 1 kb DNA ladder; 1 – pET32a-Dkk1FL (digested with BamH1 and Not1); 2 – pGEX4T1-Dkk1FL (BamH1, Not1); 3 – pGEM-T-Easy-Dkk1FL (BamH1, Not1); 4 – pFastbacHTB-Dkk1FL (BamH1, HindIII). 1% agarose gel was run.

[Gel image; marker bands labelled 5 kb, 3.0 kb and 1.5 kb]

Figure 3.5. Double digested products of MeCP2 constructs. Lanes: M – 1 kb DNA ladder; 1 – pGEM-T-Easy-MeCP2 (digested with BamH1 and Xho1); 2 – pET32a-MeCP2 (BamH1, Xho1); 3 – pET14b-MeCP2 (Nde1, BamH1); 4 – pFastbacHTB-MeCP2 (BamH1, Xho1). 1% agarose gel was run.

[Gel image; marker band labelled 1.0 kb]

Figure 3.6. Colony PCR products of Krm1 in PTS210. Lanes: M – 1 kb DNA ladder; 1-8 – PCR products of the 8 colonies picked for PCR. Lanes 3, 6, 7 and 8 show positive clones. 1% agarose gel was run.

[Gel image; marker bands labelled 1.0 kb and 0.5 kb]

Figure 3.7. Colony PCR products of Dkk1 FL in PTS210. 
Lanes: M – 1 kb DNA ladder; 1-6 – PCR products of the 6 colonies picked for PCR. Lanes 3, 4 and 5 show positive clones. A 1% agarose gel was run.

3.3 PROTEIN EXPRESSION TRIALS

3.3.1 Expression in E. coli

Protein expression trials were carried out in E. coli with sequence verified constructs made in suitable expression vectors. The conditions that can influence protein expression, namely host strain, temperature (37, 25 and 16 °C) and IPTG concentration (0.05 to 1.0 mM), were varied. The results are summarised in Table 3.2. The chosen vectors and E. coli host strains are given below.

For vector pQE30: M15 [pREP4] (Qiagen) and SG13009 [pREP4] (Qiagen).

For pGEX4T1, pET32a, pET14b and pETDuet1: BL21 (DE3) (Novagen), pLysS (DE3) (Novagen), BL21 (DE3) Codon Plus-RP (Stratagene), Rosetta (DE3) (Novagen), Origami B (DE3) (Novagen), Rosetta-gami (DE3) (Novagen) and AD494 (DE3) (Novagen) strains were used.

Table 3.2. Summary of protein expression in E. coli, S. cerevisiae and Baculovirus.

S. No | Target protein | pI / MW (kDa) | Expression vector | Protein expression detected
1 | Dkk1 FL | 8.74 / 29.297 | pGEX4T1 | Nil
  |  |  | pET32a | Nil
  |  |  | PTS210 | Nil
  |  |  | pFastbacHTB | In progress
2 | Dkk1 Cys2 | 9.30 / 8.839 | pET32a | Yes, soluble
3 | Krm1 | 5.62 / 36.487 | pQE30 | Nil
  |  |  | pGEX4T1 | Nil
  |  |  | pET32a | Yes, insoluble
  |  |  | pETDuet1 | Nil
  |  |  | PTS210 | Nil
  |  |  | pFastbacHTB | In progress
4 | Dkk1 FL + Krm1 |  | pETDuet1 | Nil
5 | MeCP2 | 9.95 / 52.440 | pET14b | Nil
  |  |  | pET32a | Yes, soluble; in progress
  |  |  | pFastbacHTB | In progress

The negative results, Table 3.2, for Krm1 are shown in Figs. 3.8a, 3.8b and 3.8c, for Dkk1 FL in Figs. 3.8d and 3.8e and for MeCP2 in Fig. 3.8f. Krm1 in pET32a was over-expressed as inclusion bodies, Fig. 3.8c. Attempts to recover some soluble protein failed even when the IPTG concentration was reduced to as low as 0.05 mM and protein expression was carried out at a low temperature of 16 °C.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.8 (a) Expression of Krm1 in pQE30 vector / M15 cells.
Lanes: M – protein ladder (kDa); 1 – sample induced with 0.5 mM IPTG at 37 °C; 2 – IPTG 1 mM, 37 °C; 3 – IPTG 0.5 mM, 16 °C; 4 – IPTG 1 mM, 16 °C; 5 – uninduced control.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.8 (b) Expression of Krm1 in pGEX4T1 vector / BL21 (DE3) cells. Lanes: M – protein ladder (kDa); 1 – sample induced with 1 mM IPTG at 37 °C; 2 – IPTG 0.5 mM, 37 °C; 3 – uninduced control; 4 – induced control of pGEX4T1 empty vector.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.8 (c) Expression of Krm1 in pET32a vector / BL21 (DE3) cells. Lanes: 1 – uninduced supernatant; 2 – uninduced pellet; M – protein ladder (kDa); 3 – supernatant induced with 0.5 mM IPTG at 37 °C; 4 – pellet induced with 0.5 mM IPTG, 37 °C; 5 – supernatant induced with 0.5 mM IPTG, 16 °C; 6 – pellet induced with 0.5 mM IPTG, 16 °C.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.8 (d) Expression of Dkk1 FL in pGEX4T1 vector / BL21 (DE3) cells. Lanes: M – protein ladder (kDa); 1 – sample induced with 1 mM IPTG, 37 °C; 2 – IPTG 0.5 mM, 37 °C; 3 – uninduced control; 4 – induced control of pGEX4T1 empty vector.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.8 (e) Expression of Dkk1 FL in pET32a vector / BL21 (DE3) cells. Lanes: M – protein ladder (kDa); 1 – uninduced control; 2 – sample induced with 1 mM IPTG, 37 °C; 3 – uninduced control; 4 – IPTG 0.5 mM, 37 °C; 5 – induced control of pET32a empty vector.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.8 (f) Expression of MeCP2 in pET14b vector / pLysS (DE3) cells. Lanes: M – protein ladder (kDa); 1 – supernatant induced with 0.5 mM IPTG, 16 °C; 2 – supernatant induced with 0.5 mM IPTG, 37 °C; 3 – pellet induced with 0.5 mM IPTG, 16 °C; 4 – pellet induced with 0.5 mM IPTG, 37 °C; 5 – uninduced control.

Protein expression for both Krm1 and Dkk1 FL was tested in S. cerevisiae strain W303a, pep4::HIS3. The protein expression was analysed by Western blot, Fig. 3.9.

[Blot image; size labels 200, 150, 100, 95, 50, 37 and 25 kDa; lanes C, UK, IK, UD, ID]
Figure 3.9.
Western blot analysis of Krm1 and Dkk1 FL using anti-His primary antibody. Lanes are marked as: C – Alpha-his protein as control, expressed at 95 kDa; UK – uninduced Krm1 control; IK – induced Krm1 sample; UD – uninduced Dkk1 FL control; ID – induced Dkk1 FL sample. Krm1 and Dkk1 FL show no expression in the induced samples at the expected sizes of 36 kDa and 29 kDa, respectively, when compared with their respective uninduced controls.

From the above results, we conclude that neither Krm1 nor Dkk1 FL could be expressed as soluble protein in E. coli or S. cerevisiae.

3.4 REASONS FOR PROTEIN EXPRESSION FAILURE AND REMEDIES

There are several possible reasons for the failure of protein expression in E. coli. Some of them are discussed below.

Toxic gene expression and additional plasmid stability: Even in the absence of IPTG, there is some expression of T7 RNA polymerase from the lacUV5 promoter in λDE3 lysogens and therefore basal expression of the target protein. Any recombinant protein expressed in E. coli may interfere with normal cell function and may therefore be "toxic" to the bacteria. The degree of toxicity varies from protein to protein. If target gene products are sufficiently toxic to E. coli, this basal level of expression can be enough to prevent vigorous growth and the establishment of plasmids in λDE3 lysogens. The proteins Krm1 and Dkk1 FL may be toxic to the host cell when expressed. One approach to control basal expression is to use vectors that contain a T7lac promoter, e.g. the pET32a vector (pET System Manual, Novagen). For highly toxic genes, a combination of a T7lac promoter-containing vector and pLysS is preferable. The use of pLysS (DE3) hosts confers additional stability on the plasmid because the host strain contains a compatible plasmid that provides a small amount of T7 lysozyme, a natural inhibitor of T7 RNA polymerase (pET System Manual, Novagen). We therefore made use of this host strain in our expression trials.
Solubility issues: Bioinformatic analysis predicted a ~80% chance of insolubility for Krm1 and a 93% chance for Dkk1 FL when expressed in E. coli. We attempted to improve protein solubility by using the pET32a vector, which produces a thioredoxin fusion protein. The N-terminal thioredoxin tag promotes disulfide bond formation and thereby increases the yield of soluble product in the cytoplasm (pET System Manual, Novagen). Krm1 has 16 cysteine residues and 24 proline residues; Dkk1 FL has 22 cysteines, 17 arginines and 14 prolines. Such a high number of these amino acids may make protein expression difficult. Additionally, the thioredoxin reductase deficient strain AD494 (DE3) was used to maximise soluble target protein expression. The trxB- cells have been shown to permit disulfide bond formation in the cytoplasm of E. coli, which appears to depend on the presence of an oxidised form of thioredoxin (pET System Manual, Novagen).

Rare codon problem: The Krm1, Dkk1 FL and MeCP2 proteins contain around 46 arginines. However, there are not many arginines at the amino terminus of the proteins, where they would otherwise cause a severe problem (pET System Manual, Novagen). E. coli strains such as BL21-Codon Plus (DE3) RP and BL21 (DE3) Rosetta-gami normally support eukaryotic protein expression by supplying tRNAs for codons that are rare in E. coli.

3.4.1 Co-expression of proteins

Co-expression of two or more interacting proteins is a strategy that is occasionally employed to increase their yield and solubility. Many proteins are only active as heteromeric complexes, and in cases where one protein is unstable without the other, co-expression is the method of choice. Because Krm1 and Dkk1 FL form an authentic biological complex, we attempted to express the two proteins simultaneously from the pETDuet1 vector. However, no expression of Krm1 or Dkk1 FL was detected from the pETDuet1 construct.
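The residue counts used in the solubility and rare codon discussion above can be obtained with a few lines of code. The following is a minimal sketch, not the tool used in this work: it uses the Dkk1 Cys2 sequence as printed in Fig. 3.18, and the function name `residue_composition` is introduced here only for illustration.

```python
from collections import Counter

# Dkk1 Cys2 sequence as printed in Fig. 3.18 (60 + 18 = 78 residues).
DKK1_CYS2 = (
    "CLRSSDCAAGLCCARHFWSKICKPVLKEGQVCTKHKRKGSHGLEIFQRCYCGEGLACRIQ"
    "KDHHQASNSSRLHTCQRH"
)


def residue_composition(sequence, residues="CPR"):
    """Return {residue: (count, percent)} for the given one-letter codes.

    Illustrative helper (hypothetical name): counts cysteine, proline and
    arginine by default, the residues discussed in the text.
    """
    seq = sequence.upper()
    counts = Counter(seq)
    return {r: (counts[r], 100.0 * counts[r] / len(seq)) for r in residues}


if __name__ == "__main__":
    for residue, (count, percent) in residue_composition(DKK1_CYS2).items():
        print(f"{residue}: {count} ({percent:.1f}%)")
```

The same function could be applied to the Krm1 and Dkk1 FL sequences, which are not reproduced in this section.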
Residues 1-20 and 390-409 of Krm1 are more hydrophobic, Fig. 3.10. However, we used only the fragment containing residues 21-350 for our studies. Krm1 may have a natural tendency to form inclusion bodies when expressed in bacteria. This is also one of the reasons why we attempted to express Krm1 along with Dkk1.

3.4.2 Expression in yeast (S. cerevisiae)

Over-expression of proteins in yeast has a particular advantage. Being a eukaryotic system, yeast can aid a protein in proper folding and post-translational modification. These single-celled eukaryotic organisms grow quickly in defined medium, are easier and less expensive to work with than insect or mammalian cells, and are easily adapted to fermentation. Yeast expression systems are ideally suited for large-scale production of recombinant eukaryotic proteins. However, trials of Krm1 and Dkk1 FL expression failed in this system as well.

Figure 3.10. The Kyte-Doolittle hydropathy plot for Kremen1 (473 aa). The peak region above 1.8 (red line) is a transmembrane region.

3.5 EXPRESSION AND PURIFICATION OF HIS-TAGGED Dkk1Cys2

The expression of Dkk1 Cys2 was successful with the pET32a vector. Dkk1 Cys2 was initially expressed in E. coli BL21 (DE3) cells on a small scale, using 50 ml LB, to test the expression level and solubility at 37 °C and 0.4 mM IPTG. The protein was soluble, Fig. 3.11. This was followed by a large scale, 4 L culture in LB supplemented with ampicillin. Cells were harvested, lysed using a French press and centrifuged to separate the supernatant from the cell debris. The supernatant was loaded onto a Talon affinity column as the first step of purification of His-tagged Dkk1 Cys2. After the column had been washed sufficiently with wash buffers, Dkk1 Cys2 was separated from other bacterial proteins that were bound non-specifically to the column.
After several unsuccessful attempts to release the protein from the tag / bead with enterokinase, the protein was cleaved with thrombin and eluted in a final volume of 20 ml. Eluted fractions were checked by SDS-PAGE, Fig. 3.12. However, digestion by thrombin could not remove the S-tag from the protein. Hence, all the work was carried out with the protein containing the S-tag. The elutions were concentrated to 2-3 mg/ml before loading onto a Sephadex-75 gel-filtration FPLC column for further purification. The monomeric protein eluted at an elution volume of about 80 ml as a single peak (Fig. 3.13).

[Gel image; marker bands labelled 31, 21.5 and 14.4 kDa]
Figure 3.11. Small scale expression of Dkk1Cys2. Lanes: M – protein ladder (kDa); U – uninduced; S – supernatant; P – pellet. Dkk1 Cys2 is over-expressed in the soluble fraction.

[Gel image; marker bands labelled 31, 21.5 and 14.4 kDa]
Figure 3.12. Talon affinity purification of His-tagged Dkk1Cys2. Lanes: S – supernatant; FT – flow through; W1, W2 – initial and final washes; RB – beads with bound Dkk1Cys2 before thrombin cleavage; RA – beads with the Trx-His tag after thrombin cleavage; E1, E2 – elutions 1 and 2 from the column after cleavage; M – protein ladder (kDa). It can be seen that most of the contaminating bacterial proteins are removed in the elution lanes.

[Chromatogram; fractions B9 to B2 marked under the peak]
Figure 3.13. Gel filtration profile of Dkk1Cys2 on a Sephadex-75 column. Fractions B9-B2 under the peak were pooled for further characterisation.

The purity of the protein was improved, and it appeared as a single band at a molecular weight of approx. 14 kDa when analysed on a 15% SDS-PAGE gel (Fig. 3.14).

[Gel image; marker bands labelled 31, 21.5, 14.4 and 6.5 kDa]
Figure 3.14. 15% SDS gel of the thrombin cleaved Dkk1 Cys2. The lanes are marked with the corresponding FPLC elution tubes (B2-B9) and M is the protein ladder (kDa).

3.6 ANALYSIS OF PROTEIN PURITY, HOMOGENEITY AND MOLECULAR WEIGHT

Dkk1 Cys2 was purified to at least 98% purity, as analysed by SDS-PAGE.
The homogeneity of the protein was analysed using Dynamic Light Scattering (DLS) and native-PAGE. The DLS results, Fig. 3.15, suggest that the protein is mono-disperse, with a good poly-dispersity index of less than 20% at concentrations below 5 mg/ml, which is suitable for crystallisation trials. However, after the protein was concentrated to 8.5 mg/ml, the molecular weight estimated by DLS was about 62 kDa, much higher than the 12 kDa expected after thrombin cleavage. This suggests that the protein may be an oligomer, or that the molecule may not assume the spherical shape required for accurate size determination by DLS. Therefore, DLS by itself may not be a very accurate technique for determining the homogeneous state of some proteins. A native gel, Fig. 3.16, was run to confirm that the protein used for crystallisation trials is a monomer.

Figure 3.15. Dynamic Light Scattering analysis of Dkk1 Cys2 at a concentration of 8.5 mg/ml.

[Gel image; marker bands labelled 232, 140 and 66 kDa]
Figure 3.16. Native gel of Dkk1Cys2 at a concentration of 5.5 mg/ml shows that the protein is a monomer.

3.7 CIRCULAR DICHROISM OF Dkk1Cys2

A negative peak at 218 nm and a positive peak around 196 nm were observed in the CD spectrum of Dkk1 Cys2, Fig. 3.17. These two peaks are characteristic of a β-sheet structure. However, there is also another minimum at 210 nm, which may be due to a small proportion of α-helical structure. The spectrum is broadly consistent with the secondary structure prediction of Dkk1 Cys2, Fig. 3.18. However, the prediction shows a major portion of Dkk1 Cys2 as random coil, which is not indicated by the CD analysis.

Figure 3.17. CD spectrum of Dkk1 Cys2. Buffer used: 50 mM Tris-NaCl buffer, pH 7.5. Sample concentration: 25 μM.

Pred: CCCCCCCCCCCEEHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEC
AA:   CLRSSDCAAGLCCARHFWSKICKPVLKEGQVCTKHKRKGSHGLEIFQRCYCGEGLACRIQ
              10        20        30        40        50        60

Pred: CCCCCCCCCCEEEEECCC
AA:   KDHHQASNSSRLHTCQRH
              70

Figure 3.18.
Predicted secondary structure of Dkk1 Cys2. H – helix, E – strand and C – coil.

3.8 PEPTIDE MASS FINGERPRINTING (PMF)

The molecular weight of Krm1 is ~53 kDa including the 18 kDa Trx-tag. The PMF result, Fig. 3.19, identifies the source organism as Rattus norvegicus, which is not correct. This may have occurred because of impurities present after denaturation and purification of the inclusion bodies of Krm1, expressed from the pET32a vector in BL21 (DE3) cells.

Figure 3.19. Peptide mass fingerprinting of Krm1 with the thioredoxin tag identifies the protein as Krm-1 from Rattus norvegicus. The probability based Mowse score from the Mascot search of Krm1 shows a hit with gi 16758464.

The molecular weight seen in the result for Dkk1 Cys2, Fig. 3.20, is ~30 kDa, of which ~18 kDa is accounted for by the Trx-tag and the remaining 12 kDa by Dkk1 Cys2, based on the sequence.

Figure 3.20. Peptide mass fingerprinting of Dkk1 Cys2 with a thioredoxin tag confirms the protein to be Dkk-1 from Mus musculus. The probability based Mowse score from the Mascot search of Dkk1 Cys2 shows a hit with gi 31542557.

3.9 EXPRESSION AND PARTIAL PURIFICATION OF MeCP2

MeCP2 cloned into the pET32a vector was expressed in BL21 (DE3) cells, Fig. 3.21. Protein expression was observed at 37 °C and 0.5 mM IPTG. The protein was quite soluble, Fig. 3.22. This was followed by a large scale, 2 L culture in LB supplemented with ampicillin. Cells were harvested, lysed using a French press and centrifuged to separate the supernatant from the cell debris. The supernatant was loaded onto a Talon affinity column as the first step of purification of His-tagged MeCP2. The column was washed sufficiently with wash buffers and MeCP2 was eluted in a final volume of 10 ml. Eluted fractions were checked by SDS-PAGE, Fig. 3.23.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.21. Expression of MeCP2 in pET32a vector / BL21 (DE3) cells.
Lanes: M – protein ladder (kDa); 1 – induced control of empty pET32a vector; 2 – uninduced control; 3 – sample induced with 0.5 mM IPTG, 37 °C; 4 – IPTG 1 mM, 37 °C.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.22. Solubility check of MeCP2. Lanes: U – uninduced control; M – protein ladder (kDa); P – pellet; S – supernatant. MeCP2 is quite soluble.

The protein eluted from the His-tag affinity column was then loaded onto a Sephadex-200 gel-filtration FPLC column for further purification, Fig. 3.24. A single protein peak was obtained at 45 ml. The protein fractions were analysed on SDS-PAGE for purity, Fig. 3.25.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.23. Talon affinity purification of His-tagged MeCP2. Lanes: M – protein ladder (kDa); S – supernatant; FT – flow through; W1, W2 – initial and final washes; E1, E2 – elutions 1 and 2 using elution buffer 1; E3, E4 – elutions 3 and 4 using elution buffer 2.

[Chromatogram; fractions A11 to A15 marked under the peak]
Figure 3.24. Gel filtration profile of His-tagged MeCP2 on a Sephadex-200 gel filtration column.

MeCP2 was visualised on SDS-PAGE at ~97 kDa, including the Trx tag of 18 kDa. However, the actual size of the protein should be 53 kDa. The large difference in size of ~26 kDa may be due to stable association of the protein with co-factors such as Sin3a, Dnmt1, CoREST, SUV39H1 or C-SKI. It is also possible that the protein forms a homo-multimer through self-association.

[Gel image; marker bands labelled 97, 66, 45, 30, 20.1 and 14.4 kDa]
Figure 3.25. 12.5% SDS-PAGE analysis of His-tagged MeCP2 elution fractions under the gel-filtration peak, A12-A15. Lane M: protein ladder (kDa).

CHAPTER 4: CONCLUSIONS AND FUTURE WORK

In the projects undertaken, we aimed to determine the crystal structures of three proteins, Kremen1, Dickkopf1 and MeCP2, which are crucial players in mammalian development. We have expressed and purified the Cys2 domain of Dkk1 from Mus musculus, which has been shown to be necessary and sufficient for binding to Krm1.
The C-terminal domain of Dkk1 from Mus musculus, Dkk1 Cys2, has been cloned into the pET32a vector and over-expressed in the E. coli BL21 (DE3) strain. The protein has been purified by affinity chromatography and gel filtration. The protein is quite homogeneous, as indicated by dynamic light scattering studies and native-PAGE. The secondary structure from Circular Dichroism analysis supports the proposed model that Dkk1 Cys2 is similar to colipase, which is made up of only β-strands. Dkk1 Cys2 has been concentrated up to 8 mg/ml and has been set up for crystallisation using several screens. Once protein crystal formation is observed, we will proceed with optimisation of the conditions to obtain crystals of diffraction quality. Additionally, NMR studies could be performed to solve the protein structure in solution.

Krm1 from Mus musculus was cloned into several E. coli vectors. The pET32a vector, which contains the thioredoxin solubility fusion tag, expressed the protein only in inclusion bodies in E. coli. Several attempts to obtain some soluble protein in E. coli, including Dkk1 FL + Krm1 co-expression, failed. Expression in a eukaryotic expression system, S. cerevisiae, was also attempted for the Krm1 and Dkk1 FL proteins, but unfortunately no protein expression was detected, even in inclusion bodies.

MeCP2 from Homo sapiens was cloned into E. coli vectors and some soluble protein expression has been observed in the initial small scale trials. The protein has been purified by affinity chromatography and gel filtration. However, biophysical characterisation of the protein still has to be performed in order to analyse the molecular size, homogeneity and secondary structure of MeCP2. A molecular size discrepancy of ~26 kDa has been observed on the SDS-PAGE gel (97 kDa, including the 18 kDa Trx tag) when compared to the expected size of the protein (53 kDa).
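The size discrepancy quoted above follows from simple subtraction. As a minimal sketch, using only the approximate SDS-PAGE values given in the text (the function name `unexplained_mass_kda` is hypothetical, and apparent masses read from a gel are rough estimates):

```python
def unexplained_mass_kda(observed_kda, tag_kda, expected_kda):
    """Apparent mass left over after subtracting the fusion tag and the
    expected mass of the target protein (all values in kDa)."""
    return observed_kda - tag_kda - expected_kda


# Values quoted in the text for Trx-tagged MeCP2 on SDS-PAGE.
mecp2_gap = unexplained_mass_kda(observed_kda=97, tag_kda=18, expected_kda=53)
print(f"Unexplained apparent mass: ~{mecp2_gap} kDa")
```

With the values from the text, 97 - 18 - 53 leaves ~26 kDa unaccounted for.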
This discrepancy could be investigated by analysing the purified protein by mass spectrometry and peptide mass fingerprinting, to find any stable association of MeCP2 with co-factors. The homogeneity and the secondary structure of the protein could be checked using DLS / native-PAGE and Circular Dichroism, respectively.

The suitability of the baculovirus system for eukaryotic protein expression may permit us to overcome the failure to express soluble Krm1, Dkk1 FL and MeCP2. The key features of this system could be an advantage to us at this stage. These include suitability for expression of proteins from a mammalian source, ease of post-translational modification, expression of cytotoxic proteins, and faster and easier scale up to larger volumes once soluble protein is detected. At the same time, three other options can be explored for the expression of Krm1 and Dkk1 FL, which contain a high number of cysteine residues. They are:

1. Periplasmic protein expression
In contrast to the cytoplasm, the periplasm of E. coli has an oxidising environment that contains enzymes to catalyse the formation and isomerisation of disulfide bonds. Directing heterologous proteins to the periplasm using a suitable signal sequence can be one of the strategies when attempting to isolate active and properly folded proteins containing disulfide bonds.

2. Solubilisation of inclusion bodies
Krm1 can be purified under denaturing conditions using urea or guanidinium hydrochloride, or using a detergent like Tween 20, followed by refolding. This method is not highly recommended, as it could distort the secondary structure of the protein in the process.

3. Artificial gene synthesis
Krm1 and Dkk1 FL could be fused together using artificial gene synthesis for production of their complex. Gene synthesis aids in codon usage optimisation and optimal expression of the proteins in a heterologous expression system.
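Before a gene is redesigned for codon usage, it can be useful to see where codons that are rare in E. coli occur in the coding sequence, particularly near the amino terminus. The following is a minimal sketch of such a scan. The rare codon set, the 20-codon amino-terminal window, the function names and the toy sequence are all assumptions made for illustration; they are not taken from the constructs used in this work and should be checked against a codon usage table for the actual host strain.

```python
# Codons commonly cited as rare in E. coli (assumed set, for illustration only).
RARE_CODONS = {"AGA", "AGG", "CGA", "CGG", "ATA", "CTA", "CCC", "GGA"}


def find_rare_codons(cds, rare=RARE_CODONS):
    """Return (codon_number, codon) pairs for rare codons in an in-frame CDS.

    Codon numbers are 1-based; RNA input (U) is converted to DNA (T).
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return [(i + 1, codon) for i, codon in enumerate(codons) if codon in rare]


def rare_codon_summary(cds, n_terminal_window=20):
    """Count rare codons overall and within the first n_terminal_window codons."""
    hits = find_rare_codons(cds)
    near_start = [hit for hit in hits if hit[0] <= n_terminal_window]
    return {"total": len(hits), "n_terminal": len(near_start), "hits": hits}


# Hypothetical toy sequence (ATG AGA AGG CTG CCC GGT AAA TAA), not a real construct.
toy_cds = "ATGAGAAGGCTGCCCGGTAAATAA"
print(rare_codon_summary(toy_cds))
```

A gene synthesis design would then replace the flagged codons with synonymous codons preferred by the expression host.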
Recently, constructs for Krm1, Dkk1 FL and MeCP2 have been made in the pFastbacHTB vector for expression in baculovirus. Additionally, trials can be made for co-expression of Dkk1 FL-Krm1 in baculovirus. For MeCP2, whose domains have been characterised, the TRD domain has been shown to play a significant role in transcriptional repression. The structure of this domain is of definite interest and could be attempted as future work, in addition to solving the structure of full-length MeCP2. Cloning of the TRD domain into the pET32a vector has already been initiated.

REFERENCES

Akiyama, T. (2000). Wnt/β-catenin signaling. Cytokine & Growth Factor Reviews, 11, 273-282.

Amir, R.E., Van den Veyver, I.B., Wan, M., Tran, C.Q., Francke, U. and Zoghbi, H.Y. (1999). Rett syndrome is caused by mutations in X-linked MeCP2, encoding methyl-CpG-binding protein 2. Nature Genetics, 23, 185-188.

Aravind, L. and Koonin, E.V. (1998). A colipase fold in the carboxy-terminal domain of the Wnt antagonists - the Dickkopfs. Current Biology, 8, R477-478.

Bafico, A., Liu, G., Yaniv, A., Gazit, A. and Aaronson, S.A. (2001). Novel Mechanism of Wnt Signalling inhibition mediated by Dickkopf-1 interaction with LRP6/Arrow. Nature Cell Biology, 3, 683-686.

Ballestar, E., Yusufzai, T.M. and Wolffe, A.P. (2000). Effects of Rett Syndrome Mutations of the Methyl-CpG Binding Domain of the Transcriptional Repressor MeCP2 on Selectivity for Association with Methylated DNA. Biochemistry, 39, 7100-7106.

Ballestar, E. and Wolffe, A.P. (2001). Methyl-CpG-binding proteins Targeting specific gene repression. Eur. J. Biochem., 268, 1-6.

Bartolomei, M.S. and Tilghman, S.M. (1997). Genomic imprinting in mammals. Annu. Rev. Genet., 31, 493-525.

Bird, A.P. (1993). Functions for DNA methylation in vertebrates. Cold Spring Harbor Symp. Quant. Biol., 58, 281-285.

Bird, A.P. (1996). The relationship of DNA methylation to cancer. Cancer Surv., 28, 87-101.

Bird, A. (2002). DNA methylation patterns and epigenetic memory.
Genes & Dev., 16, 6-21.

Boyes, J. and Bird, A. (1991). DNA methylation inhibits transcription indirectly via a methyl-CpG binding protein. Cell, 64, 1123-1134.

Cadigan, K.M. and Nusse, R. (1997). Wnt signaling: a common theme in animal development. Genes and Dev., 11, 3286-3305.

Cedar, H. (1988). DNA Methylation and Gene Activity. Cell, 53, 3-4.

Chan, S.D.H., Karpf, D.B., Fowlkes, M.E., Hooks, M., Bradley, M.S., Vuong, V., Bambino, T., Liu, M.Y.C., Arnaud, C.D., Strewler, G.J. and Nisseson, R.A. (1992). Two homologs of the Drosophila polarity gene frizzled (fz) are widely expressed in mammalian tissues. J. Biol. Chem., 267, 25202-25207.

Chen, R.Z., Akbarian, S., Tudor, M. and Jaenisch, R. (2001). Deficiency of methyl-CpG binding protein-2 in CNS neurons results in a Rett-like phenotype in mice. Nature Genetics, 27, 327-331.

Chu, E.Y., Hens, J., Andl, T., Kairo, A., Yamaguchi, T.P., Brisken, C., Glick, A., Wysolmerski, J.J. and Millar, S.E. (2004). Canonical WNT signalling promotes mammary placode development and is essential for initiation of mammary gland morphogenesis. Development, 131, 4819-29.

Clevers, H. (2006). Wnt/β-catenin signaling in development and disease. Cell, 127, 469-478.

Dauter, Z., Sieker, L.C. and Wilson, K.S. (1992). Refinement of rubredoxin from Desulfovibrio vulgaris at 1.0 Å with and without restraints. Acta Crystallogr., 48, 42-59.

Davidson, G., Mao, B., Barrentes, I.B. and Niehrs, C. (2002). Kremen proteins interact with Dickkopf1 to regulate anteroposterior CNS patterning. Development, 129, 5587-5596.

Fedi, P., Bafico, A., Soria, A.N., Burgess, W.H., Miki, T., Bottaro, D.P., Kraus, M.H. and Aaronson, S.A. (1999). Isolation and Biochemical Characterization of the Human Dkk-1 Homologue, a Novel Inhibitor of Mammalian Wnt Signaling. J. Biol. Chem., 274(27), 19465-19472.

Glinka, A., Wu, W., Delius, H., Monaghan, A.P., Blumenstock, C. and Niehrs, C. (1998). Dickkopf-1 is a member of a new family of secreted proteins and functions in head induction.
Nature, 391, 357-362.

Hagberg, B. (1985). Rett's syndrome: prevalence and impact on progressive severe mental retardation in girls. Acta Paediatr. Scand., 74, 405-408.

Hendrich, B. and Bird, A. (1998). Identification and Characterization of a Family of Mammalian Methyl-CpG Binding Proteins. Mol. and Cell. Biol., 18(11), 6538-6547.

He, X., Semenov, M., Tamai, K. and Zeng, X. (2004). LDL receptor-related proteins 5 and 6 in Wnt/β-catenin signaling: Arrows point the way. Development, 131(8), 1663-1677.

Hsieh, J.C., Rattner, A., Smallwood, P.M. and Nathans, J. (1999). Biochemical characterization of Wnt-Frizzled interaction using a soluble, biologically active vertebrate Wnt protein. Proc. Natl. Acad. Sci. USA, 96, 3546-3551.

Jones, P.L., Veenstra, G.J.C., Wade, P.A., Vermaak, D., Kass, S.U., Landsberger, N., Strouboulis, J. and Wolffe, A.P. (1998). Methylated DNA and MeCP2 recruit histone deacetylase to repress transcription. Nature Genetics, 19, 187-191.

Kaludov, N.K. and Wolffe, A.P. (2000). MeCP2 driven transcription repression in vitro: selectivity for methylated DNA, action at a distance and contacts with the basal transcription machinery. Nucl. Acids Res., 29(9), 1921-1928.

Kawano, Y. and Kypta, R. (2003). Secreted antagonists of the Wnt signalling pathway. J. Cell Sci., 116, 2627-2634.

Kimura, H. and Shiota, K. (2003). Methyl-CpG-binding Protein, MeCP2, is a target molecule for maintenance DNA methyltransferase, Dnmt1. J. Biol. Chem., 278(7), 4806-4812.

Kleywegt, G.J. (2000). Validation of protein crystal structures. Acta Crystallogr. D, 56, 249-265.

Klose, R.J., Shireen, A.S., Schmiedeberg, L., McDermott, S.M., Stancheva, I. and Bird, A.P. (2005). DNA binding selectivity of MeCP2 due to a requirement for A/T sequences adjacent to methyl-CpG. Mol. Cell, 19, 667-678.

Koradi, R., Billeter, M. and Wuethrich, K. (1996). MOLMOL: a program for the display and analysis of macromolecular structures. J. Mol. Graph., 14, 51-55.
Krupnik, V.E., Sharp, J.D., Jiang, C., Robinson, K., Chickering, T.W., Amaravadi, L., Brown, D.E., Guyot, D., Mays, G., Leiby, K., Chang, B., Duong, T., Goodearl, A.D.J., Gearing, D.P., Sokol, S.Y. and McCarthy, S.A. (1999). Functional and Structural diversity of the human dickkopf gene family. Gene, 238, 301-313.

Lewis, J.D., Meehan, R.R., Henzel, W.J., Maurer-Fogy, I., Jeppesen, P., Klein, F. and Bird, A. (1992). Purification, sequence, and cellular localization of a novel chromosomal protein that binds to methylated DNA. Cell, 69, 905-914.

Li, E., Beard, C., Forster, A., Bestor, T. and Jaenisch, R. (1993). DNA methylation, genomic imprinting and mammalian development. Cold Spring Harbor Symp. Quant. Biol., 58, 297-305.

Mao, B. and Niehrs, C. (2003). Kremen2 modulates Dickkopf2 activity during Wnt/LRP6 signalling. Gene, 302, 179-183.

Mao, B., Wu, W., Davidson, G., Marhold, J., Li, M., Mechler, B.M., Delius, H., Hoppe, D., Stannek, P., Walter, C., Glinka, A. and Niehrs, C. (2002). Kremen proteins are novel Dickkopf receptors that regulate Wnt/β-catenin signalling. Nature, 417, 664-667.

Mao, B., Wu, W., Li, Y., Hoppe, D., Stannek, P., Glinka, A. and Niehrs, C. (2001). LDL-receptor-related-protein 6 is a receptor for Dickkopf proteins. Nature, 411, 321-325.

McRee, D.E. (1999). Practical Protein Crystallography, 2nd edn. Academic Press, San Diego, CA.

Meehan, R.R., Lewis, J.D. and Bird, A.P. (1992). Characterization of MeCP2, a vertebrate DNA binding protein with affinity for methylated DNA. Nucl. Acids Res., 20(19), 5085-5092.

Meehan, R.R., Lewis, J.D., McKay, S., Kleiner, E.L. and Bird, A.P. (1989). Identification of a mammalian protein that binds specifically to DNA containing methylated CpGs. Cell, 58, 499-507.

Miller, J.R., Hocking, A.M., Brown, J.D. and Moon, R.T. (1999). Mechanism and function of signal transduction by the Wnt/β-catenin and Wnt/Ca2+ pathways. Oncogene, 18, 7860-7872.
Monaghan, A.P., Kioschis, P., Wu, W., Zuniga, A., Bock, D., Poustka, A., Delius, H. and Niehrs, C. (1999). Dickkopf genes are co-ordinately expressed in mesodermal lineages. Mech. Dev., 87, 45-56.

Moon, R.T., Bowerman, B., Boutros, M. and Perrimon, N. (2002). The Promise and Perils of Wnt signaling through β-catenin. Science, 296, 1644-46.

Moon, R.T., Brown, J.D. and Torres, M. (1997). WNTs modulate cell fate and behavior during vertebrate development. Trends Gen., 13, 157-162.

Moon, R.T., Kohn, A.D., De Ferrari, G.V. and Kaykas, A. (2004). Wnt and β-catenin signalling: diseases and therapies. Nature Genetic Reviews, 5, 689-698.

Morin, P.J. (1999). β-catenin signaling and cancer. BioEssays, 21, 1021-1030.

Mukhopadhyay, M., Shtrom, S., Rodriguez-Esteban, C., Chen, L., Tsukui, T., Gomer, L. et al. (2001). Dickkopf1 is required for embryonic head induction and limb morphogenesis in the mouse. Dev. Cell, 1, 423-434.

Nakamura, T., Aoki, S., Kitajima, K., Takahashi, T., Matsumoto, K. and Nakamura, T. (2001). Molecular cloning and characterization of Kremen, a novel kringle-containing transmembrane protein. Biochimica et Biophysica Acta, 1518, 63-72.

Nakao, M., Matsui, S., Yamamoto, S., Okumura, K., Shirakawa, M. and Fujita, N. (2001). Regulation of transcription and chromatin by methyl-CpG binding protein MBD1. Brain and Development, 23, S174-S176.

Nan, X., Campoy, F.J. and Bird, A. (1997). MeCP2 Is a Transcriptional Repressor with abundant Binding Sites in Genomic Chromatin. Cell, 88, 471-481.

Nan, X., Meehan, R.R. and Bird, A.P. (1993). Dissection of the methyl-CpG binding domain from the chromosomal protein MeCP2. Nucl. Acids Res., 21(21), 4886-4892.

Nan, X., Ng, H.H., Johnson, C.A., Laherty, C.D., Turner, B.M., Eisenman, R.N. and Bird, A. (1998). Transcriptional repression by the methyl-CpG-binding protein MeCP2 involves a histone deacetylase complex. Nature, 393, 386-389.

Nan, X., Tate, P., Li, E. and Bird, A.
(1996). DNA Methylation Specifies Chromosomal Localization of MeCP2. Mol. and Cell. Biol., 16(1), 414-421.

Neumann, B. and Barlow, D.P. (1996). Multiple roles for DNA methylation in gametic imprinting. Curr. Opin. Genet. Dev., 6, 159-163.

Niehrs, C. (2006). Function and biological roles of the Dickkopf family of Wnt modulators. Oncogene, 25, 7469-7481.

Nusse, R. (2005). Wnt signaling in disease and in development. Cell Research, 15(1), 28-32.

Ohki, I., Shimotake, N., Fujita, N., Nakao, M. and Shirakawa, M. (1999). Solution structure of the methyl-CpG-binding domain of the methylation-dependent transcriptional repressor MBD1. The EMBO Journal, 18(23), 6653-6661.

Polakis, P. (2000). Wnt signaling and cancer. Genes Dev., 14, 1837-1851.

Ramachandran, G.N., Ramakrishnan, C. and Sasisekharan, V. (1963). Stereochemistry of polypeptide chain configurations. J. Mol. Biol., 7, 95-99.

Ramakrishnan, C. and Ramachandran, G.N. (1965). Stereochemical criteria for polypeptide chain conformations. Allowed conformations for a pair of peptide units. Biophys. J., 5, 909-933.

Razin, A. and Cedar, H. (1994). DNA methylation and genomic imprinting. Cell, 77, 473-476.

Riggs, A.D. and Pfeifer, G.P. (1992). X-chromosome inactivation and cell memory. Trends Genet., 8(5), 169-174.

Roloff, T.C., Ropers, H.H. and Nuber, U.A. (2003). Comparative study of methyl-CpG-binding domain proteins. BMC Genomics, 4, 1-9.

Rountree, M.R., Bachman, K.E., Herman, J.G. and Baylin, S.B. (2001). DNA methylation, chromatin inheritance, and cancer. Oncogene, 20, 3156-3165.

Semenov, M.V., Tamai, K., Brott, B.K., Kuhl, M., Sokol, S. and He, X. (2001). Head inducer Dickkopf-1 is a ligand for Wnt coreceptor LRP6. Current Biology, 11(12), 951-61.

Tamai, K., Semenov, M., Kato, Y., Spokony, R., Liu, C., Katsuyama, Y., Hess, F., Saint-Jeannet, J.P. and He, X. (2000). LDL-receptor related proteins in Wnt signal transduction. Nature, 407, 530-535.

Tate, P. and Bird, A. (1993).
Effects of DNA methylation on DNA-binding proteins and gene expression. Curr. Opin. Genet. Dev., 3, 226-231. Tilbeurgh, H.V., Bezzine, S., Cambillau, C., Verger, R., Carriere., F. (1999) Colipase: Structure and interaction with Pancreatic lipase. Biochimica et Biophysica Acta, 1441, 173-184. Vacca, M., Filippini, F., Budillon, A., Rossi, V., Mercadante, G., Manzati, E., Gualandi, F., Bigoni, S., Trabanelli, C., Pini, G., Calzolari, E., Ferlini, A., Meloni, I., Hayek, G., Zappella, M., Renieri, A., D’Urso, M., D’Esposito, M., MacDonald, Kerr, A., Dhanjal, S., Hultén, M. (2001) Mutation analysis of the MECP2 gene in British and Italian Rett syndrome females. J. Mol. Med., 78, 648-655. Van den Veyver, I.B., Zoghbi, H.Y. (2001). Mutations in the gene encoding methyl-CpGbinding protein 2 cause Rett syndrome. Brain & Development, 23, S147–S151. Vinson, C. R., Conover, S. and Adler, P. N. (1989). A Drosophila tissue polarity locus encodes a protein containing seven potential transmembrane domains. Nature, 338, 263-264. Wade, P.A. (2001). Methyl CpG-binding proteins and transcriptional repression. Bioessays, 23, 1131-1137. Wakefield, R.I.D., Smith, B.O., Nan, X., Free, A., Soteriou, A., Uhrin, D., Bird, A.P. and Barlow, P.N. (1999). The Solution Structure of the Domain from MeCP2 that Binds to Methylated DNA. J. Mol. Biol., 291, 1055-1065. Wang, Y., Macke, J. P., Abella, B. S., Andreasson, K., Worley, P., Gilbert, D. J., Copeland, N. G., Jenkins, N. A. and Nathans, J. (1996). A large family of putative transmembrane receptors homologous to the product ofthe Drosophila tissue polarity gene frizzled. J. Biol. Chem., 271, 4468-4476. Weber, P. C. Physical principles of protein crystallization. (1991) Adv. Prot. Chem., 41, 1-36. Willert, K. and Nusse, R. (1998) β-catenin: a key mediator of Wnt Signaling. Current Opin. Genet. Dev., 8, 95-102. 87 Wodarz, A. and Nusse, R. (1998). Mechanisms of Wnt signaling in development. Annu. Rev. Cell Dev. Biol., 14, 59–88. 
APPENDIX

PLATE [PEG/Li-acetate/TE]
40% PEG 3350 (autoclaved)
100 mM LiOAc (filtered)
10X TE pH 7.5 [10 mM Tris-HCl (pH 7.5); 1 mM EDTA (pH 8.0)]

TCA resuspension solution
3% SDS
100 mM Tris-HCl (pH 11.0)
3 mM DTT

Coomassie staining solution
0.1% Coomassie brilliant blue R-250
40% ethanol
10% acetic acid
Coomassie is dissolved in ethanol before adding water and acetic acid.

Destaining solution
10% acetic acid

SDS-PAGE gel composition
Resolving gel composition:
12.5% gel: For 5 ml buffer, 1.3 ml 1.5 M Tris pH 8.8, 2 ml 30% acrylamide, 0.05 ml 10% SDS, 0.05 ml 10% APS and 0.002 ml TEMED are added.
15% gel: For 5 ml buffer, 1.3 ml 1.5 M Tris pH 8.8, 2.5 ml 30% acrylamide, 0.05 ml 10% SDS, 0.05 ml 10% APS and 0.002 ml TEMED are added.
Stacking gel composition:
4% gel: For 2 ml buffer, 0.25 ml 1.0 M Tris pH 6.8, 0.33 ml 30% acrylamide, 0.02 ml 10% SDS, 0.02 ml 10% APS and 0.002 ml TEMED are added.
For native gels, the composition is the same as for the SDS-PAGE gel, except that SDS is omitted from the buffers.

2X SDS gel loading buffer
100 mM Tris-HCl (pH 6.8)
200 mM DTT
4% SDS
20% glycerol
0.2% bromophenol blue
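The gel recipes in the appendix give each component as a stock concentration and a volume, so the final concentration follows from the usual dilution relation C1·V1 = C2·V2. The sketch below is only an illustrative check, not part of the original protocol: it assumes that "for 5 ml buffer" and "for 2 ml buffer" refer to the final gel volume, and the function name `final_percent` is hypothetical. Under that assumption the listed acrylamide volumes may give values slightly different from the nominal gel percentages.

```python
def final_percent(stock_percent: float, stock_ml: float, total_ml: float) -> float:
    """Final concentration (%) after diluting stock_ml of stock to total_ml.

    Uses C1 * V1 = C2 * V2, i.e. C2 = C1 * V1 / V2.
    """
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    return stock_percent * stock_ml / total_ml


# (label, stock %, stock volume in ml, assumed final volume in ml)
# Volumes are copied from the appendix; treating the quoted buffer volume
# as the final volume is an assumption made for this sketch.
recipes = [
    ("resolving gel, nominal 12.5%", 30.0, 2.0, 5.0),
    ("resolving gel, nominal 15%", 30.0, 2.5, 5.0),
    ("stacking gel, nominal 4%", 30.0, 0.33, 2.0),
]

for label, stock, volume, total in recipes:
    value = final_percent(stock, volume, total)
    print(f"{label}: {value:.2f}% acrylamide")
```

The same function applies to the other percentage stocks in the recipes (for example, 10% SDS or 10% APS), given the corresponding volumes.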
