IDENTIFICATION AND CHARACTERIZATION OF NOVEL PROTEINS FROM A RARE AUSTRALIAN ELAPID SNAKE DRYSDALIA CORONOIDES

SHIFALI CHATRATH
(M.Sc. (Biotechnology))

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY AT THE NATIONAL UNIVERSITY OF SINGAPORE

DEPARTMENT OF BIOLOGICAL SCIENCES
FACULTY OF SCIENCE
NATIONAL UNIVERSITY OF SINGAPORE
AUGUST 2010

Acknowledgements

At the very outset, I would like to thank God for giving me the strength to endure the hardships of research life. I am indebted to the National University of Singapore for supporting my stay in Singapore with a research scholarship. The vibrant research environment at NUS helped shape me into a skilled researcher.

I express my sincere gratitude to my supervisor, Prof. R. Manjunatha Kini, for his commendable support during my stay in the ‘Protein Science Lab’. He has always been a source of inspiration, encouragement and support. His critical comments on my experimental designs have improved my way of thinking about science. Thank you, Prof., for making me an independent researcher. I would especially like to thank him for his promptness in expediting my thesis submission.

I feel fortunate to have been co-supervised by Prof. Prakash Kumar. His useful suggestions during lab meetings and during manuscript and thesis writing have greatly helped me improve my writing skills. He has been a kind, humble and patient person who helped me through the ups and downs of research life.

I am also extremely thankful to Dr. J. Sivaraman, Dr. K. Swaminathan and Dr. Henry Mok for always being available to advise me on the structural studies of my project. I also thank Dr. Lin Qingsong for helping me understand the proteomics part of my project. I would also like to thank Dr. Hai Wei Song from the Institute of Molecular and Cell Biology (IMCB) for helping me set up crystallization. My sincere thanks go to our collaborator Prof.
Daniel Bertrand from the Department of Neuroscience, University of Geneva, Geneva, Switzerland, for carrying out a part of the pharmacological studies of drysdalin in his laboratory. This acknowledgement would be incomplete without thanking Prof. Anjali Karande, Indian Institute of Science, Bangalore, India, and Prof. Gurcharan Kaur and Prof. Prabhjeet Singh from Guru Nanak Dev University, Amritsar, India, who always guided me during my tough times. I am extremely grateful to them for giving me a strong background in various fields of biotechnology.

I express my warm gratitude to all the past and present members of the ‘Protein Science Lab’. Thanks to Dr. Rajagopalan Nandhakishore for guiding me through this project, Dr. Cho Yeow for teaching me HPLC, Dr. Susanta Pahari, Dr. Robin Doley and Dr. Md. Abu Reza for useful discussions, Dr. Raghurama Prabhakar Hegde for helping me with the modeling of structures, Dr. Joanna Pawlak for useful suggestions regarding my project and Dr. Alex Chapeaurogue for helping with the proteomics part of my thesis. I would also like to thank Dr. Ryan, Dr. Guna Shekhar, Dr. Pushpalatha, Shi Yang, Girish, Amrita, Bhaskar, Angelina, Sindhuja, Angie and Aldo for their help during my stay in the lab. My special thanks go to Sheena for teaching me organ bath assays and to Aarthi, Dr. Om Praba and Dr. Pushpa for being my lunch and coffee buddies.

I also acknowledge the help offered to me by the members of the ‘Plant Morphogenesis Lab’: Dr. Ramammoorthy, Vivek, Vijay, Mahesh and Petra, for useful comments on my work progress. I also thank Ms. Tay Bee Ling for the timely flow of reagents for my project. The DBS non-academic staff also deserve a vote of thanks for helping me with administrative matters. I sincerely appreciate the help offered to me by the Structure Biology Laboratory members, including Dr. Karthik for useful discussions, Dr. Zhang Jingfeng, Tzer Fong, Lisa, Veerendra, Pankaj, Shaveta, Abhilash, Manjeet, Priyanka and Thangavelu.
They were always there to provide the things I had forgotten to bring when coming one level up! Thanks to Pallavi for helping me with primer design whenever I got stuck. Thanks to Dr. Xing Ding and Meng Kiat from Dr. Hai Wei’s lab, Mr. Mourier Gilles, CEA, Paris, France, for useful advice on refolding, and Milena from Prof. Daniel’s lab.

I am extremely grateful to my aunt and her family in Singapore, who provided me a home away from home. I extend my thanks to my in-laws, who have been very supportive throughout. I have no words to express my gratitude to my brother, who supported my education after my father passed away. I would also like to thank my God-fearing mother and sister for having faith that I can do it. Above all, I am extremely thankful to my husband for being my pillar of strength. I would not have come so far without his support, and I offer my apologies for venting my frustration on him, which he endured quite patiently.

Shifali Chatrath
August, 2010

Table of Contents

Acknowledgements
Table of contents
Summary
List of figures
List of tables
Abbreviations

Chapter 1 Review of Literature
  Introduction
  Venomous snakes
  Drysdalia coronoides; a rare Australian elapid
  Snake venom
    Venom composition, Enzymatic proteins, Non-enzymatic proteins
  Three-finger toxin family
    Neurotoxins, Hannalgesin, Fasciculins, Muscarinic toxins, Cardiotoxins (CTxs), Calciseptine and FS2 toxin, Dendroaspin or mambin, Non-venom proteins with ‘3FTx’ fold, 3FTx fold: molecular scaffold with multiple missions
  Nicotinic Acetylcholine Receptors (nAChRs)
    Muscular type of nAChRs, Neuronal type of nAChRs, Receptor-ligand interface revealed by AChBP (ACh-binding protein)
  Aim and scope of the thesis

Chapter 2 Venom Transcriptome and Proteome of Drysdalia coronoides
  Introduction
  Materials and Methods
    Reagents and kits; Collection of venom and venom gland; RNA isolation and cDNA synthesis; Cloning of ds cDNA; Isolation of plasmids and verification of clones; cDNA sequencing; Sequence analysis; 3’RACE of venom proteins; In-solution tryptic digestion; HPLC separation and mass spectrometric analysis of tryptic peptides; Molecular modeling
  Results and Discussions
    Transcriptome
      Construction of cDNA library; Composition of venom gland transcriptome of D. coronoides; Three-finger toxins family; serine protease inhibitors; Cysteine-rich secretory proteins (CRISPs); Phospholipases A2 (PLA2s); Venom nerve growth factor (VNGF); Phospholipase B (PLBs); a new family of snake venom proteins; Snake venom metalloproteases (SVMPs); Vespryns; Cellular transcripts; Unknown and hypothetical proteins
    Proteomics of crude venom of Drysdalia coronoides
    Description of novel proteins

Chapter 3 Cloning, Expression and Purification of Recombinant Drysdalin
  Introduction
    Overview of pET system; Choosing the host for expression; Choosing the vector for expression; Factors influencing expression and purification; Refolding of proteins
  Materials and Methods
    Reagents and kits used; Bacterial strains and vectors used; Columns used for purification; Cloning of synthetic gene into pET-32a and pET-M; Transformation of cloning and expression host strains; Expression of protein; Purification of protein; Mass determination; Refolding of protein
  Results and discussion
    Expression of the trx-fused drysdalin in pET-32a; Affinity purification and cleavage of the fusion protein; RP-HPLC of untagged drysdalin; Refolding and RP-HPLC of untagged drysdalin; Expression of the His-drysdalin in pET-M; Affinity Purification of His-drysdalin; RP-HPLC purification of His-drysdalin; Refolding and RP-HPLC of His-drysdalin; Expression, purification and refolding of 15N-labeled His-drysdalin

Chapter 4 Structural and Functional Characterization of Recombinant Drysdalin
  Introduction
    Functional characterization
    Structural characterization
  Materials and methods
    Materials; Animals; In vivo toxicity studies; Ex vivo organ bath studies; Electrophysiological studies; Measurement of Circular Dichroism (CD) spectra; Crystallization; NMR data acquisition
  Results and discussion
    Functional characterization
      In vivo toxicity studies; Ex vivo toxicity studies; In vitro refolded versus in vivo folded drysdalin; Reversibility studies of drysdalin on neuromuscular junction; Comparison of drysdalin to α-bungarotoxin (Bgtx); In vitro electrophysiological studies of drysdalin
    Structural characterization
      Measurement of CD spectra; Crystallization; NMR studies

Chapter 5 Gene Regulation of Differentially Expressed Isoforms of Drysdalin
  Introduction
  Materials and methods
    Liver tissue; Kits and reagents used; Isolation of genomic DNA; Construction of genome walker libraries; Genome walking; Isolation and sequencing of clones
  Results and discussion

Chapter 6 Gene structure of a novel protein 513V5
  Introduction
  Materials and Methods
    Materials; Genomic DNA isolation and 513V gene sequencing
  Results and discussion
    Designing of primers for 513V; Analysis of 513V5 gene; Accelerated Segment Switch in Exon to alter Targeting (ASSET) in 513V genes; Novel protein encoded by 513V5 gene; Phylogenetic significance of 513V5 gene

Chapter 7 Conclusions and Future Prospects
  Conclusions
  Future Prospects

Bibliography
Appendix
Publications

Summary

Identification and characterization of novel proteins from a rare Australian elapid snake Drysdalia coronoides

A partial transcriptome of the venom gland of a rare Australian elapid snake, Drysdalia coronoides, whose venom composition was not known, was elucidated by a cDNA library approach, and the results were corroborated by determining the proteome of the crude venom of the snake. Three novel proteins belonging to the three-finger toxin (3FTx) superfamily of snake venom proteins were identified. They consist of three clusters represented by the clones 13A, 342A and 513V5.

One of them, named drysdalin, possesses distinct structural features predicted by the online server I-TASSER, and was chosen for functional and structural characterization. Drysdalin was expressed in E. coli and then purified by affinity chromatography and reverse-phase HPLC, followed by refolding. It showed dose-dependent and time-dependent neurotoxicity in mice with an LD50 of 0.775 mg/kg. It was found to be an irreversible blocker of muscle-type and neuronal-type nicotinic acetylcholine receptors (nAChRs) with EC50 of 37 nM (~2.8 fold less than α-bungarotoxin) and 27 nM, respectively. These data are in stark contrast to what would be expected from its substitutions of functionally conserved residues, which should decrease binding affinity to nAChRs. Our data suggest that, despite these substitutions, there are some structural changes in drysdalin that might have prevented the loss of binding affinity.

Attempts to solve the three-dimensional structure of drysdalin by X-ray crystallography and NMR were initiated because the C-terminus of drysdalin was predicted to acquire a unique fold. The protein could not be crystallized. However, the 2D-NMR spectrum acquired with 15N-labeled drysdalin indicated possible reasons for the failure of the crystallization attempts. The protein sample used for 2D-NMR was suspected to exhibit heterogeneity and/or flexibility. These results will help in future experimental designs for structural characterization of drysdalin.

The gene structures of the differentially expressed, closely related isoforms of drysdalin were studied by a genome walking approach. The promoter regions of the genes encoding these isoforms were highly similar, with several random mutations that bore no relation to the abundance of the isoforms in the transcriptome. This suggests that cis elements in the proximity of the gene are not responsible for the differential expression of these isoforms.
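The statement that the promoter regions are highly similar rests on comparing aligned sequences position by position. As a minimal sketch, and not the procedure used in this thesis, pairwise percent identity over aligned columns could be computed as below. The function name `percent_identity`, the isoform labels and the short toy fragments are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gap columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy fragments, already aligned (not data from the thesis).
promoters = {
    "isoform_A": "AAAGTTACTGTTGTGCC",
    "isoform_B": "AAAGTTACTGTTGCGCC",  # one substitution relative to A
    "isoform_C": "AAAGTTAC-GTTGTGCC",  # one gap relative to A
}

# Identity for every pair of isoforms.
pairwise = {
    (p, q): percent_identity(promoters[p], promoters[q])
    for p, q in combinations(promoters, 2)
}

for (p, q), identity in pairwise.items():
    print(f"{p} vs {q}: {identity:.2f}% identity")
```

With real promoter alignments, the resulting identities could be set against the clone counts of each isoform to check whether sequence differences track transcript abundance.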
The second novel protein, encoded by clone 342A, exhibited a shorter loop II lacking two of the critical residues implicated in binding to nAChRs. Therefore, this 3FTx is expected to have altered pharmaco-physiological properties.

The third novel protein, encoded by clone 513V5, was identified from the genomic DNA. This search was prompted by the identification of a truncated clone, 513A, in the cDNA library. Genomic DNA PCR revealed that 513V5 had a different open reading frame than the 513A transcript due to an insertion of bp in exon II. It also had a different exon-intron boundary than other 3FTxs. These results underline the significance of deciphering the snake genome in the search for new venom proteins.

Our study has contributed significantly to the field of snake venom research, with novel toxins identified through a combined transcriptomic, proteomic and genomic approach. This work opens new avenues for various biochemical and biophysical studies that may have biomedical applications in the long term.

List of Figures

Chapter 1
1.1 Drysdalia coronoides and its geographical distribution
1.2 Snake fang and the venom gland
1.3 Multiple sequence alignment of short-chain and long-chain toxins
1.4 Functional diversity of 3FTxs
1.5 Multiple sequence alignment of atypical long-chain neurotoxins and nonconventional neurotoxins
1.6 ‘3FTx fold’ in non-venom proteins
1.7 Gene organization of ‘3FTx fold’ containing protein from different organisms
1.8 Cartoon representation of nAChRs
1.9 Schematic showing the events leading to neurotransmission followed by muscular contraction
1.10 The three-dimensional structures of nAChR, AChBP alone and in complex with α-cobratoxin

Chapter 2
2.1 Schematic showing construction of cDNA library
2.2 Agarose gel electrophoresis (1%) of RNA and ds cDNA from D. coronoides
2.3 Venom transcripts of D.
coronoides
2.4 Three-finger toxins (3FTxs) from D. coronoides
2.5 Serine Protease Inhibitors (SPIs) from D. coronoides
2.6 Phospholipases A2 (PLA2s) from D. coronoides
2.7 Phospholipase B from D. coronoides
2.8 Snake venom metalloproteases (SVMPs) from D. coronoides
2.9 SDS-PAGE analysis of D. coronoides venom
2.10 RP-HPLC profile of tryptic digest of the crude venom of D. coronoides
2.11 Schematic representing the sequence coverage of the proteins identified in the venom of D. coronoides
2.12 Comparison of three dimensional structural models of the novel proteins to the crystal structures of α-Cobratoxin (Cbtx) and erabutoxin (Ebx)
2.13 Schematic showing the summary of characterization of novel proteins from D. coronoides

Appendix

Appendix table A.1. Sequences of the peptides obtained from the trypsin digestion of the crude venom of D. coronoides identified by tandem mass spectrometry.

Protein (No. of clones)
  Mascot ion score   Tryptic sequences identified   Precursor mass (Da)

3FTx2A
  62   GCGCPTVKPGIQR   1429.70
  44   VCCATDK   853.37
3FTx342A (2)
  40   TWSGTIIER   1062.57
  33   GCGCPPLKPPIR   1351.70
  65   SEPCAPGENLCYTK   1625.70
3FTx20A (10)
  97   VIELGCAATCPPAEPK   1712.85
  107  DITCCSTDNCNTHP   1694.62
  107  KDITCCSTDNCNTHP   1822.73
  43   SWCDAFCSIR   1244.53
  98   VIQLGCAATCPTTKPYEEVTCCSR   2801.32
  46   DKCNPHPAQR   1222.58
  36   CNPHPAQR   979.46
3FTx73A (24)
  49   TWCDYWCHVK   1454.63
  23   CNPHPLQRPR   1274.65
  58   TCPPGENLCYTK   1439.64
3FTx77A (11)
  55   TWCDAFCSIR   1315.58
  106  VDLGCAATCPTAKPGVDITCCSTDK   2697.25
  41   RVDLGCAATCPTAKPGVDITCCSTDK   2853.35
  30   TWCDFR   884.37
(130)   3FTx31A (9)   3FTx469A (6)   3FTx13A (64)
  80   SEPCASGENLCYTK   1615.69
  197  AVELGCAATCPTTKPYEEVTCCSTDDCNR   3365.48
  65   SEPCAPGENLCYTK   1625.70
  63   DRPHFCHLPADTGPCK   1850.85
  53   FQAFYYHPVHR   1464.75
  130  CLEFIYGGCEGNANNFK   1992.91
  81   KCLEFIYGGCEGNANNFK   2121.02
  30   TIDECKR   864.43
SPI18A (42)
  46   DRPHFCHLPADPGR   1674.80
  103  CNALSEAFYYNPVQR   1774.86
  71   DRPDFCHLPHETGPCK   1965.88
  78   IQAFYYNPIYDTCLK   1908.98
  25   TMDECKR   939.41
  50   DRPDFCHLPADSGSCK   1861.80
  59   GNFQAFYYHPVHR   1635.83
  116  TCLEFIYGGCEGNANNFK   2093.96
  47   CTFAHSPPHTR   1310.61
  37   YLYVCQYCPAGNIR   1776.85
  128  SGPTCGDCPSACVNGLCTNPCK   2411.98
  66   YEDAFTNCNELAK   1574.70
  68   CPATCFCHTEII   1508.67
PLB (1)
  65   GYWPSYNIPFHK   1508.77
  31   HNPCNTICCR   1331.55
3FTx43A (16)   SPI28A (75)   SPI87A (9)   SPI161A (12)   CRISP (4)
  72   IHDDCYGDAEK   1322.53
  59   IHDDCYGDAEKK   1450.65
  99   MLAYDYYCGENGPYCK   2003.81
  28   FVCDCDVK   1042.43
  104  CFAGAPYNDANWNIDTTK   2057.96
  55   FVCACDVQAAK   1268.58
  91   GLFSEDYTETHYAPDGR   1957.90
  117  GEECDCGSPQDCQDACCNAATCK   2692.98
  90   LQHDCDSGECCEQCK   1925.70
  86   AAKDDCDLPESCTGQSAECPTDSFQR   2945.25
  176  DDCDLPESCTGQSAECPTDSFQR   2675.06
  76   NGHPCQNNQGYCYNGK   1910.78
  27   CPIMTNQCIALK   1449.71
  61   GPGVNVSPDECFTLK   1619.79
  33   QNDPECGFCR   1265.47
  79   LLCQEGNATCICFPTTDDPDYGMVEPGTK   3289.56
  26   VRPQCILNKPLR   1493.89
  31   DIVTPPVCGNYFVER   1765.91
  94   GEECDCGSPQDCQSACCNATTCK   2694.96
  115  LQHEAQCDSGECCEQCK   2138.84
  55   CLIMTNQCIALK   1449.71
SVMP4 (0)
  62   FSSCSVQEHQR   1364.61
  59   NGHPCQNNEGYCYNGK   1911.78
PLA2147A (1)   PLA222R (0)   SVMP9 (0)   SVMP8 (0)   Vespryn (0)
  34   QNVPECGFCR   1249.52
  83   TVENVGVPQVVPDNPER   1848.97
  32   FSSSPCVLGSPGFR   1497.73
  42   LVPEELIWQR   1282.74
  51   IVVFLDYSEGK   1269.69

Peptides annotated primarily based on the unique sequences are highlighted in bold for closely related isoforms. Unbold peptides are common with other isoforms and are annotated based on the abundance of clones (in parentheses) obtained in the cDNA library. The peptides with MASCOT score below 30 are italicized. Underlined C/M indicates carbamidomethylation of cysteine/methionine.

Figure A.2 Tandem mass spectrum obtained for pro-peptide sequence of snake venom metalloproteases (SVMPs) from D. coronoides. The peaks in red correspond to the ion-series obtained from the pro-peptide sequence (mentioned in the inset). The mass (Da) corresponding to each peak is mentioned at the top of each peak (in blue).

Figure A.3 Vector map of pET-32a and pET-M.
(A) Vector diagram of pET-32a with all the essential features of the vector. (B) pET-32a was engineered to remove regions blocked (in gray box) including thioredoxin tag and S-tag while retaining the His-tag and was subsequently named as ‘pET M’. The region corresponding to the multiple cloning site (MCS) is underlined by dashed line.

Table A.2 (a) Tris-Tricine Gel

Stock Solutions          15% Resolving   15% Stacking   12% Resolving   12% Stacking
30% Acrylamide (29:1)    7.5 ml          972 µl         12 ml           1.4 ml
3X Gel Buffer            ml              1.86 ml        9.99 ml         2.525 ml
Glycerol, 90%            1.58 ml         -              4.6 ml          -
Water                    900 µl          4.67 ml        3.126 ml        6.246 ml
10% APS                  75 µl           40 µl          300 µl          225 µl
TEMED                    10 µl           10 µl          30 µl           30 µl

Table A.2 (b) 3X Gel Buffer

3 M Tris-Cl, pH 8.45
0.3 % (w/v) SDS

Table A.2 (c) Running buffers

Cathode Buffer (1X)        Anode Buffer
0.1 M Tris                 0.2 M Tris-Cl
0.1 M Tricine
0.1 % (w/v) SDS
→ No need to adjust pH

Table A.2 (d) 2X Tricine Sample Buffer

125 mM Tris-HCl, pH 6.8
24 % (w/v) Glycerol
% (w/v) SDS
% (w/v) Mercaptoethanol
0.02 % (w/v) Bromophenol Blue
→ 30 min at 40°C

Exon II
13A (64)
CTTAGCATACACCAGGAAATGCTACAAAACACATCCTTATAAATCTGAGCCTTGTGCATCTGGGGAGAACCTATGCTATACAAAGACTTGGTGTGATTTTCGGT
43A (16)
CTTAGCATACACCAGGAAATGCTACAAAACACATCCTTATAAATCTGAGCCTTGTGCACCTGGGGAGAACCTATGCTATACAAAGACTTGGTGTGATTTTCGGT
173A (1)
CTTAGCATACACCAGGAAATGCTACAAAACACATCCTTATAAATCTGAGCCTTGGGCATCTGGGGAGAACCTATGCTATACAAAGACTTGGTGTGATTTTCGGT

Exon III
13A (64)
TGATTTTCGGTGTAGCCAACTAGGAAAGGCAGTCGAATTGGGATGTGCTGCTACTTGCCCTACAACGAAGCCCTATGAGGAAGTTACCTGTT
43A (16)
TGATTTTCGGTGTAGCCAACTAGGAAAGGCAGTCGAATTGGGATGTGCTGCTACTTGCCCTACAACGAAGCCCTATGAGGAAGTTACCTGTT
173A (1)
TGATTTTCGGTGTAGCCAACTAGGAAAGGCAGTCGAATTGGGATGTGCTGCTACTTGCCCTACAACGAAGCCCTATGAGGAAGTTACCTGTT

13A (64)
GCTCAACAGACGATTGCAACAGATTTCCGAATTGGGAACGGCCTAGACCACGTCCTCGAGGGTTGCTCTCATCCATCATGGACCATCCTTGA
43A (16)
GCTCAACAGACGATTGCAACAGATTTCCGAATTGGGAACGGCCTAGACCACGTCCTCGAGGGTTGCTCTCATCCATCATGGACCATCCTTGA
173A (1)
GCTCAACAGACGATTGCAACAGATTTCCGAATTGGGAACGGCCTAGACCACGTCCTCGAGGGTTGCTCTCATCCATCATAGACCATCCTTGA Figure A.4 Partial cDNA sequences of long-chain neurotoxin (LNTx) isoforms from D. coronoides. The exon II and exon III of LNTx isoforms chosen for studying gene regulation. Bracketed is the number of transcripts obtained for each isoforms in the cDNA library. The protein coding region of exon I is identical in all the three isoforms. 232 Appendix D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 CAAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGGGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA CAAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATGAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAACGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA 
-AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA -AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA ******************** ******************************* ************************** *** -1288 -1285 -1288 -1369 -1369 -1369 -1369 -1369 -1369 -1369 -1369 -1369 -1369 -1369 -1369 -1289 -1369 -1369 -1369 -1369 -1369 43D7 43D8 43D2 AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTTAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTTAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA ******************************************** ************************************** -1279 -1284 -1288 173D2 173D4 173D8 173D9 173D3 173D10 AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA AAAAATGCCTGAATGCGCTGAGATGGATAGCCTAACCAAATTGTGAAAAAATAAACAAGATAGGGACTATTATGAAAAATGGA *********************************************************************************** -1288 -1288 -1288 -1287 -1286 -1286 D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG 
ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTCGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAGGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATGTAGCAATCCTTG ********************* ********************* **************************************** 43D7 43D8 43D2 ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG -1195 ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG -1200 ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG -1204 ************************************************************************************ 173D2 173D4 
173D8 173D9 173D3 173D10 ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG ACATTTGGTATGATTGGTTAAAGAAAAGGTAATATTTGATATTTGGATAAGTTATGGAAATGACAGAAGATTAGCAATCCCTTG ************************************************************************************ -1204 -1201 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1205 -1287 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1204 -1203 -1202 -1202 233 Appendix D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCAAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGGGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG 
AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGTGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGTGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAA-TTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTATACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG ***************************** ******** ***** ********************** **************** -1120 -1117 -1120 -1120 -1120 -1120 -1120 -1120 -1120 -1120 -1120 -1120 -1120 -1120 -1120 -1121 -1120 -1120 -1120 -1120 -1120 43D7 43D8 43D2 AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATGCCATTATGATTGAG -1111 AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATGCCATTATGATTGAG -1116 AAAGTTACTGTTGCGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG -1120 ************* ******************************************************* ************** 173D2 173D4 173D8 173D9 173D3 173D10 AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGGTTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAATG-AGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGGTTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG AAAGTTACTGTTGTGCCCAAAATGAGGGCGAGACGCAGAGAATTGTACGTTCTCTCCACCAACTAAAATTCCATTATGATTGAG ********************* * ****************************************************** ***** -1120 -1120 -1120 -1120 -1118 -1118 D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 
43D11 43D6 43D10 D5 D26 D11 CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCCGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG **************************************************************************** ******* -1036 -1033 -1036 -1036 -1036 -1036 -1036 -1036 -1036 -1036 -1036 -1036 -1036 -1036 -1036 -1037 -1036 
-1036 -1036 -1036 -1036 43D7 43D8 43D2 CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAATGGGACTTAA------TGTGCACCTGTCTCT----ATG -1037 CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAATGGGACTTAA------TGTGCACCTGTCTCT----ATG -1042 CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG -1042 ********************************************** *** ***** *************** *** 173D2 173D4 173D8 173D9 173D3 173D10 CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGATATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG CTAATAACCCTTTTGCTAAAGACACTTGGTATACCTCCAAGTACAACGGGGCTTAAGAGAGATGTGCACCTGTCTCTGTTTATG **************************** ********************************************** ******** -1036 -1036 -1036 -1036 -1034 -1034 234 Appendix D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCATGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAGACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCCCCAGGAATAGGGACGTGGC 
CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGGCGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCCGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC ******* ********************************* ***************** **** ************ ****** -952 -949 -952 -952 -952 -952 -952 -952 -952 -952 -952 -952 -952 -952 -952 -953 -952 -952 -952 -952 -952 43D7 43D8 43D2 CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATTTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC -952 CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATTTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC -958 CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC -958 **************************************** ******************************************* 173D2 173D4 173D8 173D9 173D3 173D10 CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC 
CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC CAAAGCAAACATTGCACCCCATACCAAATTATTGTATTATCTGTATTGTCTCTATTAGCGTGCCTCCAGGAATAGGGACGTGGC ************************************************************************************ -952 -952 -952 -952 -950 -950 D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTCGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACGTTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG 
ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG *********** **************************************** ******************************* -868 -865 -868 -868 -868 -868 -868 -868 -868 -868 -868 -868 -868 -868 -868 -869 -868 -868 -868 -868 -868 43D7 43D8 43D2 ATTCTCTCCTT-GGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG -869 ATTCCCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG -874 ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG -874 **** ****** ************************************************************************ 173D2 173D4 173D8 173D9 173D3 173D10 ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ATTCTCTCCTTTGGTTCCTCAATAAAAATGTTCACAGTTGCAACACAATCACATTTGTATTTGGGTTTTTATGGTTAACGTTAG ************************************************************************************ -868 -868 -868 -868 -866 -866 235 Appendix D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT 
ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATGCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ******************************************* **************************************** -784 -781 -784 -784 -784 -784 -784 -784 -784 -784 -784 -784 -784 -784 -784 -785 -784 -784 -784 -784 -784 43D7 43D8 43D2 ATCAATCATGCACTTCCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT -785 ATCAATCATGCACTTCCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT -790 ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT -790 *************** ******************************************************************** 173D2 173D4 173D8 173D9 173D3 173D10 
ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ATCAATCATGCACTTTCTATTCTTCGGGCAGGAGACCACTGATTCTGGATCCGAAATTTCTTTTTTTCAGTCTCCCTCCATTTT ************************************************************************************ -784 -784 -784 -784 -782 -782 D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACGTCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTT-GTGTGAAAGGAAA-C CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAGTGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACACCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC 
CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACACCACAAGAAAAGAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC *********************************************** ********** ******* ************** * -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 -701 43D7 43D8 43D2 CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACAAAGATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC -702 CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACAAAGATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC -707 CATCTTGGAAAGCCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC -707 ************ ************************* * ******************* *********************** 173D2 173D4 173D8 173D9 173D3 173D10 CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC CATCTTGGAAAGTCTGCACAATGCAATGTATAGAAACATAAATCAGGACATCACAAGAAA-GAGGTTTTGTGTGAAAGGAAAGC ************************************************************ *********************** -701 -701 -701 -701 -699 -699 236 Appendix D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA 
AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAATGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAA-GAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATTAACATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAA-GAGATGTTGTGTTCTAGACAAACGCTGCAA-TCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGGGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATAAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATAAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGTAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA **** ********************* ******* *********** ***** ************ * ************** -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 -617 43D7 43D8 43D2 ATAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA -618 
ATAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA -623 GGAATGGGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA -623 ********************************************************************************** 173D2 173D4 173D8 173D9 173D3 173D10 AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGTAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA AGAATGAGATGTTGTGTTCTAGACAAACGCTGCAAATCCCAAGGAAGCAGATGAGCTTATAGCAATAACAATAGCACTTATATA *********************************************** ************************************ -617 -617 -617 -617 -615 -615 D4 D7 D1 D8 D27 D12 43D13 D24 D14 173D1 D22 D18(2) D6 D28 43D5 43D11 43D6 43D10 D5 D26 D11 CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT 
CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTGTAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCCGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCACTCCTT ******* *************************** ***************** ************************ ***** -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 -534 43D7 43D8 43D2 CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGGGTTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCCTTCCTT -534 CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT -540 CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT -540 *********************************** ***************************************** ****** 173D2 173D4 173D8 173D9 173D3 173D10 CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT CCGCTTTATAATGCTTTATAATCCTCTCTAAGTGG-TTTACAGAGTCAGCCTCTGGCCCCCAACAATTTGTGTCCTCATTCCTT *********************************** 
************************************************ -534 -534 -534 -534 -532 -532

Figure A.5 Alignment of partial gene sequences of the LNTx genes of D. coronoides. The region shown lies upstream of the sequence aligned in Figure 5.2.

Publications/Conference Presentations

Publications

Chatrath, S.T., Chapeaurouge, A., Lin, Q., Lim, T.K., Dunstan, N., Mirtschin, P., Kumar, P.P. and Kini, R.M. Identification of novel proteins from the venom of a cryptic snake Drysdalia coronoides by a combined transcriptomics and proteomics approach. J. Prot. Res., 2010, DOI: 10.1021/pr1008916

Chatrath, S.T., Bertrand, D., Jensen, A.A., Foo, C.S., Kumar, P.P. and Kini, R.M. Functional characterization of a novel protein drysdalin from Drysdalia coronoides. (manuscript under preparation)

International Conference Presentations

1. “Identification of a novel protein from the venom transcriptome of a rare snake Drysdalia coronoides”, Poster presentation. Chatrath, S.T., P., Kumar, P.P. and Kini, R.M.; Joint 5th Structural Biology and Functional Genomics and 1st Biophysics International Conference, organized by National University of Singapore, Singapore, 9-11 December, 2008.
2. “Identification of a novel protein from the venom transcriptome of a rare snake Drysdalia coronoides”, Poster presentation. Chatrath, S.T., P., Kumar, P.P. and Kini, R.M.; 13th Biological Sciences Graduate Congress (BSGC), organized by National University of Singapore, Singapore, 15-17 December, 2008.
3. “Identification and characterization of a novel protein from a rare snake Drysdalia coronoides”, Oral presentation. Chatrath, S.T., P., Kumar, P.P. and Kini, R.M.; 14th BSGC, organized by Chulalongkorn University, Bangkok, Thailand, 10-11 December, 2008.
4. “Identification and characterization of a novel protein from a rare snake Drysdalia coronoides”, Poster presentation. Chatrath, S.T., P., Kumar, P.P. and Kini, R.M.; International Anatomical Sciences and Cell Biology Conference (IASCBC), organized by Department of Anatomy, National University of Singapore, Singapore, 26-29 May, 2010.

[...]

... Viperidae, Crotalidae, Elapidae and Hydrophidae (Table 1.1). Not all of them are dangerous to humans. Colubridae, the largest snake family (~1000 species), produce small volumes of venom and have a poorly developed venom delivery apparatus [10]. Therefore, most of the snakes of this family are harmless, except a few like the African boomslang (Dispholidus typus).

Drysdalia coronoides: a rare Australian elapid

Australian elapids are considered to be the most toxic snake species of the world, with all of the top 10 and 19 of the top 25 elapids with known LD50s residing exclusively on this continent [11]. The venomous terrestrial snakes of the Elapidae family have undergone an extensive radiation in Australia [12]. The elapid snake genus Drysdalia from southern Australia is a group of rare snakes comprising...

... snakes the organism of choice for research by various laboratories worldwide. Snake venom has long been known to be employed in traditional Indian, Chinese and Arabian medicine. Over the years, the perception regarding venom has changed drastically from that of a deadly weapon to a pharmaceutically important cocktail of bioactive proteins and polypeptides that act as lead molecules for therapeutics development...
... development [1-3]. Snakes evolved from burrowing lizards during the Lower Cretaceous period (~100-150 million years ago) [4]. There are about 2,930 species of snakes, distributed on every continent except Antarctica and absent from the islands of Ireland, Iceland and New Zealand [5, 6]. They have successfully colonized various habitats and feed on small animals including lizards, snakes, rodents, small mammals, birds, eggs and insects...

... different snake families as well as within a family depending on the feeding habits, geographical location and environmental conditions [23-25].

Figure 1.2 Snake fangs and the venom gland. Cartoon representations showing the location of the venom gland in the snake's head and a close look at venom ejection from the fang. Photo downloaded from the URL: http://animals.howstuffworks.com/snakes/snake4.htm...

... Three-dimensional structures of 3FTxs isolated from various snake venoms are shown. The protein names with PDB codes and source organisms of the proteins are as follows: (A) Neurotoxins: Erabutoxin (3EBX; Laticauda semifasciata), (B) α-cobratoxin (2CTX; Naja kaouthia), (C) κ-bungarotoxin, a dimer (1KBA; Bungarus multicinctus), (D) Candoxin (1JGK; Bungarus candidus) and (E) Fasciculin (1FSS; Dendroaspis angusticeps)...

... Non-enzymatic proteins from snake venoms

Table 1.3 (continued)

Three-finger toxin family

The three-finger toxin (3FTx) family, a well-characterized non-enzymatic superfamily of snake venom proteins, is found abundantly in the venoms of elapids (cobras, kraits and mambas) and hydrophids (sea snakes and sea kraits) [54]. However, recently, 3FTxs have also been found in the venoms of colubridae...

... scincid lizards and frogs [14]. There are no reports of this snake biting humans.

Snake venom

Snake venom is produced in a highly developed secretory organ called the venom gland (a modified parotid salivary gland of other vertebrates) [4]. This gland is situated on each side of the head, below and behind the eye, wrapped in a muscular sheath (Figure 1.2). Venom is stored in large alveoli before getting channelled...

... erabutoxin b (Ebx) 187
Appendix
A.1 Amino acid sequence alignment of Interferon Gamma Inducible protein 30 (GILT) 224
A.2 Mass spectrum obtained for pro-peptide sequence of snake venom metalloproteases (SVMPs) from D. coronoides 229
A.3 Vector map of pET-32a and pET-M 230
A.4 Partial cDNA sequences of long-chain neurotoxin (LNTx) isoforms from D. coronoides 232
A.5 Alignment of partial gene sequence of...

... colubridae [55-57] and crotalidae (rattlesnakes) family of snakes [58, 59]. Based on the length of the polypeptide chain and the number of disulfide bonds, they are broadly classified as short-chain (generally, 60-64 aa) and long-chain (generally, 66-75 aa) toxins with four and five disulfide bonds, respectively [39] (Figure 1.3). However, some proteins, notwithstanding this length of polypeptides, have been ...

... proteins from a rare Australian elapid snake Drysdalia coronoides

Partial transcriptome from the venom gland of a rare Australian elapid snake Drysdalia coronoides, whose venom composition was not ...

... without thanking Prof. Anjali Karande, Indian Institute of Science, Bangalore, India, Prof. Gurcharan Kaur and Prof. Prabhjeet Singh from Guru Nanak Dev University, Amritsar, India, who always guided ...