PROTEIN SENSE AND ANTISENSE SEQUENCES IN BACTERIA
Most bacterial chromosomes are circular double-stranded DNA molecules formed from two closed single DNA strands. Each single strand is a polynucleotide built from nucleotide units: adenine (A), cytosine (C), thymine (T), and guanine (G).
These nucleotides are joined by phosphodiester bonds, in which one phosphate group links the deoxyribose sugars of two adjacent nucleotides. At one end of the polynucleotide a phosphate group is attached to carbon 5 of the deoxyribose, and at the other end a hydroxyl group is attached to carbon 3; these ends are called the 5' end and the 3' end, respectively. The two polynucleotide strands of the chromosome are held together by hydrogen bonds according to the nucleotide base-pairing model of Watson and Crick []. Each complete strand of the chromosome contains sequences that encode proteins, ribosomal RNAs (rRNA), and transfer RNAs (tRNA), as well as non-coding sequences.

In most bacteria, the chromosomal DNA molecule has a single origin and a single terminus of replication. On each side of the origin, the two complementary DNA strands serve as templates from which the replication machinery synthesizes the leading strand and the lagging strand. Each half of the chromosome contains coding and non-coding DNA segments. Coding segments carry the information for transfer RNA (tRNA), ribosomal RNA (rRNA), and messenger RNA (mRNA). The DNA segments that encode mRNAs are also called protein sense and antisense segments. A protein sense sequence is identical to the mRNA sequence (with T in place of U), whereas a protein antisense sequence is the template from which RNA polymerase synthesizes the mRNA.

Proteins are produced by translating mRNA sequences; they are chains of amino acids joined by peptide bonds. Twenty amino acids take part in translated protein sequences. Each amino acid is encoded by a nucleotide triplet (a codon, also called a genetic code word) on the mRNA. Each codon on the mRNA therefore corresponds to a trimer in the protein sense (or antisense) sequence. During evolution, trimers can change as bacteria come under the pressure of natural selection and diverge into two different species. The stronger the selective pressure, the more the trimers change and the greater the evolutionary distance between the two species.
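The relationship between a sense sequence, its antisense (template) sequence, and the mRNA can be sketched in a few lines. This is only an illustration under simple assumptions: sequences are uppercase DNA strings written 5' to 3', and the function names (`antisense_of`, `mrna_of`, `trimers`) are hypothetical helpers, not part of the study's software.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def antisense_of(sense: str) -> str:
    """Return the antisense (template) strand, written 5'->3'."""
    return sense.translate(COMPLEMENT)[::-1]

def mrna_of(sense: str) -> str:
    """The mRNA has the same sequence as the sense strand, with U for T."""
    return sense.replace("T", "U")

def trimers(seq: str) -> list:
    """Split a sequence into consecutive, non-overlapping trimers."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

sense = "ATGGCTTAA"
antisense = antisense_of(sense)
mrna = mrna_of(sense)
```

Each trimer of the sense sequence lines up with one codon of the mRNA, which is the correspondence the text relies on.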
SCORING MATRICES
A scoring matrix, also called a substitution matrix, is a two-dimensional matrix built from two identical series of elements (nucleotides, codons, or amino acids). The numbers inside the matrix are scores that describe how readily the elements replace one another during evolution at the molecular level. Scoring matrices are used in aligning DNA or protein sequences to determine the degree of similarity between sequences, to search a database for sequences similar to a given sequence, to predict protein function, to build phylogenetic trees, and to reconstruct ancestral protein sequences.
A nucleotide scoring matrix describes the rate at which nucleotides replace one another in a DNA sequence. Such a matrix can be generated from a nucleotide substitution model. The first and simplest model is JC69, established by Jukes & Cantor [2]. JC69 is a matrix of 4 rows and 4 columns (4x4) that describes the substitution rates of nucleotides during the evolution of DNA. Under this model, the rate at which one nucleotide is replaced by another is the same (α) for all four nucleotides A, C, G, and T (Table 1.1), and the nucleotides occur at equal frequencies in the sequence.
Table 1.1 JC69 nucleotide scoring matrix
α is the score representing the equal substitution rate between nucleotides [2].
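As a rough illustration of the JC69 layout described above, the sketch below builds a 4x4 matrix in which every off-diagonal entry is the same value α. The diagonal value (`match_score`) is an assumption introduced only for the example, since the body of Table 1.1 is not available here.

```python
NUCLEOTIDES = "ACGT"

def jc69_matrix(alpha, match_score):
    """Build a JC69-style matrix: one shared score alpha for every substitution.

    match_score (the diagonal) is a hypothetical parameter for illustration.
    """
    return {
        a: {b: (match_score if a == b else alpha) for b in NUCLEOTIDES}
        for a in NUCLEOTIDES
    }

jc69 = jc69_matrix(alpha=-4, match_score=1)
```

Because a single parameter governs all twelve substitutions, the matrix is symmetric and treats transitions and transversions alike.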
BLASTN is a tool that incorporates this nucleotide scoring matrix with α = -4 [3]. BLASTN is widely used to search for similarity between a nucleotide query sequence and nucleotide subject sequences in a biological database. During a search, BLASTN splits the query into short words of a fixed length, looks for these words in each subject sequence in the nucleotide database, and scores them with the built-in matrix: a match (no substitution) receives a positive score, whereas a mismatch (a substitution) receives -4 [3]. BLASTN then selects the word hit with the highest total score and continues scoring the characters on both sides of it using the same scores. It keeps accumulating the score until the total falls below the threshold of 20, at which point it stops. A database segment whose total score is above this threshold is called an HSP (High-scoring Segment Pair) and is reported as similar to the query []. In the current BLASTN program, if the total score drops below the threshold only temporarily and then rises above it again, BLASTN continues extending, so reported HSPs can be very long.
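The seed-and-extend idea can be sketched as follows. This is not BLASTN's actual implementation: it extends to the right only, it stops with a drop-off rule (`x_drop`, the allowed fall below the best score seen so far) rather than the threshold rule described above, and the match score of 1 is a placeholder because the value is not recoverable from the text. The function name `extend_seed` is hypothetical.

```python
def extend_seed(query, subject, q_pos, s_pos, word_size,
                match=1, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, best_length). All scoring values are illustrative.
    """
    score = best = word_size * match   # the seed itself matches exactly
    best_len = word_size
    i = word_size
    while q_pos + i < len(query) and s_pos + i < len(subject):
        score += match if query[q_pos + i] == subject[s_pos + i] else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        elif best - score > x_drop:
            break                      # score fell too far below the best
    return best, best_len

full = extend_seed("ACGTACGTAC", "ACGTACGTAC", 0, 0, 4)
broken = extend_seed("ACGTACGTAC", "ACGTACCTAC", 0, 0, 4)
```

A temporary dip in the running score does not end the extension unless it exceeds `x_drop`, which mirrors the remark that extension continues through short low-scoring stretches.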
The basic amino acid scoring matrix is a matrix of 20 rows and 20 columns whose scores express how readily two amino acids replace each other during evolution.
PAM (Point Accepted Mutation) is a concept introduced by Schwartz and Dayhoff [5]: the replacement of one amino acid by another that has been accepted by natural selection. The PAM matrix was built from mutations observed in families of closely related proteins that showed a high level of identity after alignment. The mutation frequencies were then normalized to a small fixed level of change, that is, to nearly identical sequences, which corresponds to the base PAM matrix. This base matrix was then multiplied by itself (extrapolated) to obtain further matrices for other levels of sequence similarity, so that each higher-order PAM matrix corresponds to a particular percentage identity.
The BLOSUM scoring matrices were introduced by Henikoff & Henikoff [6]. Unlike the PAM matrices, the different BLOSUM matrices are derived from alignments of many protein sequences at different levels of similarity, with substitution rates determined directly. This avoids the errors introduced by extrapolating the base PAM matrix to the other PAM matrices. In addition, instead of relying only on alignments of very closely related sequences [], Henikoff & Henikoff built conserved regions, called blocks. Groups of evolutionarily related proteins were aligned to produce these blocks after gaps had been removed. The blocks were then used to determine the substitution frequencies of the amino acids and to generate the BLOSUM scoring matrices. The steps for building a BLOSUM matrix are detailed in the Appendix.
Table: BLOSUM amino acid scoring matrix [4]
The BLOSUM matrix shown in the table was built from blocks whose sequences share at least a given minimum percentage identity. The substitution rates of the amino acids in these blocks were determined to produce a table of numbers, the BLOSUM scoring matrix []. Other BLOSUM matrices are produced in the same way; the number following "BLOSUM" indicates the minimum identity of the sequences in the blocks used to build the corresponding matrix. For example, a matrix named BLOSUMn is built from blocks of sequences that are at least n% identical, after the gaps created by the alignment have been removed.
As with PAM, substitutions in BLOSUM that leave the amino acid unchanged, or that change it without greatly altering its biological, physical, and chemical properties, receive positive scores, whereas substitutions that occur less often or only rarely receive negative scores. The larger the absolute value of a positive score, the more frequently the substitution occurs; conversely, the larger the absolute value of a negative score, the less likely the substitution is.
When searching a protein database for subject sequences similar to a given protein query, BLASTP splits the query into short words of a fixed length and looks for these words in each database sequence. When a word is found (a match), BLASTP assigns it the scores built in from the BLOSUM matrix, selects the match with the highest total score over the word, and continues scoring the characters on both sides of this match using the same BLOSUM scores. BLASTP keeps accumulating the score until the total falls below the threshold, at which point it stops. A database segment whose total score is above this threshold is called an HSP (High-scoring Segment Pair) and is reported as similar to the query []. As described above for BLASTN, if the total score drops below the threshold only temporarily and then rises above it again, BLASTP continues extending.
The codon scoring matrix is a square matrix, with one row and one column per codon, whose scores describe how readily codons (genetic code words) replace one another in mRNA sequences.
Table: Codon scoring matrix [7]
There are 64 codons in total. Three of them are stop codons; the remaining 61 encode the 20 amino acids. Some amino acids are therefore encoded by more than one codon, because the genetic code is by nature degenerate (Table 1.4).
In Table 1.4, non-degenerate positions are printed in capitals, two-fold degenerate positions in smaller capitals, three-fold degenerate positions in italics, and four-fold degenerate positions in lowercase.
Codons that encode the same amino acid are called degenerate codons. Degenerate codons for the same amino acid may differ at any of the three codon positions. For example, phenylalanine is encoded by UUU and UUC, which differ at the third position. The codons UUA, UUG, CUU, CUC, CUA, and CUG all encode leucine and differ at the first and third positions. The codons UCA, UCC, UCG, UCU, AGC, and AGU, which encode serine, differ at all three positions.
A codon position is non-degenerate if any nucleotide substitution at that position changes the amino acid.
A codon position is two-fold degenerate if only two of the four possible nucleotides at that position yield the same amino acid. The third position of the histidine codons, CAU and CAC (CAT and CAC in DNA), is an example. At two-fold degenerate positions the two equivalent nucleotides are always either both pyrimidines (C/U) or both purines (A/G), so only a substitution from a pyrimidine to a purine, or the reverse, changes the amino acid.
Only the third position of the isoleucine codons (AUU, AUC, and AUA) is three-fold degenerate: with A, U, or C at this position the codon still encodes isoleucine, but with G it becomes AUG, which encodes methionine.
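The degeneracy classes described above can be checked mechanically against the standard genetic code. The sketch below is illustrative: it builds the codon table from the conventional UCAG ordering and counts, for a given codon position, how many of the four nucleotides leave the encoded amino acid unchanged. The helper name `position_degeneracy` is hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def position_degeneracy(codon, pos):
    """Count the nucleotides at position pos (0, 1, or 2) that keep the amino acid."""
    aa = CODON_TABLE[codon]
    return sum(
        CODON_TABLE[codon[:pos] + b + codon[pos + 1:]] == aa for b in BASES
    )

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
```

A result of 1 means the position is non-degenerate, 2 means two-fold, 3 three-fold, and 4 four-fold degenerate.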
STATE OF RESEARCH IN VIETNAM AND ABROAD
A simple nucleotide scoring matrix is built into the widely used tool BLASTN for finding sequences similar to a given nucleotide sequence, under the assumption that all nucleotides in the sequences evolve at the same rate. In practice, however, substitutions between chemically similar nucleotides, such as between C and T or between A and G (transitions), are usually more frequent than substitutions between chemically dissimilar nucleotides, that is, between the pyrimidines (C and T) and the purines (A and G) (transversions), because interconversion between chemically similar nucleotides requires less energy. Moreover, the genetic code tolerates changes between chemically similar structures more often than changes between dissimilar ones without altering the amino acid [9, 10]. On this basis, Kimura [9] proposed the K80 model, in which the substitution rate between chemically similar nucleotides differs from the rate between chemically dissimilar nucleotides; the two rates are α and β, respectively (see the K80 table).
Table: K80 nucleotide scoring matrix [9]
α denotes the rate of change among purines or among pyrimidines (transitions); β denotes the rate of change between a purine and a pyrimidine (transversions).
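A K80-style matrix differs from JC69 only in distinguishing transitions from transversions. The sketch below shows one way to lay it out; the numeric values of `alpha`, `beta`, and `match_score` are placeholders chosen for illustration, not values from the source.

```python
NUCLEOTIDES = "ACGT"
PURINES = {"A", "G"}

def is_transition(a, b):
    """True when a and b are different bases of the same chemical class."""
    return a != b and ((a in PURINES) == (b in PURINES))

def k80_matrix(alpha, beta, match_score):
    """alpha scores transitions, beta scores transversions (illustrative values)."""
    def score(a, b):
        if a == b:
            return match_score
        return alpha if is_transition(a, b) else beta
    return {a: {b: score(a, b) for b in NUCLEOTIDES} for a in NUCLEOTIDES}

k80 = k80_matrix(alpha=-1, beta=-4, match_score=1)
```

Setting `alpha` equal to `beta` reduces this layout to the single-parameter JC69 case.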
Felsenstein [11] extended the JC69 model into the F81 model, which accounts for differences in the frequencies of the nucleotides in a sequence as a result of natural selection. Several later models were developed by extending JC69 and K80, such as the HKY85 model [] and the SYM94 model [13]. The most complex is the GTR model [14], which adds parameters reflecting both different substitution rates for different nucleotide pairs and different nucleotide frequencies. In addition, parameters reflecting sites at which nucleotides do not change over time (invariable sites, +I) [] and parameters reflecting rate variation among sites in the sequence (variation across sites, +G) [16], as in the GTR+G+I model [17], can be added to these models. The GTR86+G+I model usually fits real data better than simpler models, which shows that evolution is a very complex process. However, the GTR86+G+I model is mathematically too complex and is therefore not applied in practice. Arenas [18] therefore argues that future studies need to consider the computational feasibility of the model so that the similarity between nucleotide sequences can be estimated accurately.
For amino acid scoring matrices, only the base PAM matrix has scores computed directly from the protein database; the other PAM matrices are derived from it and therefore do not reflect real data well. Furthermore, because the PAM matrix was built from very closely related protein sequences, it does not correctly reflect the similarity between distantly related protein sequences. Mueller et al [] emphasize that evolutionary models must accurately describe amino acid substitution frequencies over long evolutionary time spans.
In describing the BLOSUM matrices, Henikoff & Henikoff [6] showed that a BLOSUM matrix performs considerably better than the PAM matrix of equivalent divergence. BLOSUM62 is built into BLASTP for searching databases for similar sequences, whereas BLOSUM50 is built into FASTA and SSEARCH. These two tools give more sensitive searches than those using BLOSUM62 but require longer alignments. Because they assume that all positions in a protein sequence evolve at the same rate, the PAM and BLOSUM scoring matrices may not reflect real data correctly. According to Arenas [18], the scores in the BLOSUM matrices strongly affect the sensitivity of the search program, because they directly reflect the amino acid substitution frequencies in the set of sequences used to build them. These matrices also require long alignments to yield meaningful similarity scores. The PAM matrices, by contrast, perform better when searching for short similar sequences, short exons, or very closely related sequences [21]. Arenas therefore considers it necessary to change the matrices currently built in as defaults: BLOSUM62 for BLASTP and BLOSUM50 for FASTA/SSEARCH.
In reality, positions in a protein sequence evolve at different rates. Several studies have therefore proposed evolutionary models that account for differences among regions and sites in the protein sequence [-22]. In addition, Keane et al [23] showed that the scoring matrix that best fits two large protein datasets, one from proteobacteria and one from archaea, was in fact derived from the Pol proteins of retroviruses.
Besides scoring matrices based on available protein sequences, such as PAM and BLOSUM, evolutionary models based on protein folding properties have also been proposed, because they significantly improve on the weaknesses of PAM and BLOSUM in reflecting real data [-31]. These models are not yet fully established, however, because the likelihood functions implemented in evolutionary analysis programs cannot handle the interdependence among regions and sites in a sequence. They have therefore not yet been incorporated into common programs for specific purposes.
For codon scoring matrices, several studies have shown that examining molecular evolution at the codon level reveals evolutionary relationships between strains and species better than analysis at the amino acid level [32, 33].
The first codon scoring matrix was established by Schneider et al [] from the complete genome sequences of five vertebrates: human, mouse, chicken, frog, and zebrafish. They aligned pairs of homologous protein-coding sequences from each pair of genomes and kept only the sequences within a set range of similarity, then determined the frequencies of unchanged and substituted codons between the sequence pairs and used these to build the codon scoring matrix (see the codon scoring matrix table). To date, however, this matrix has not been put to specific use. A possible reason is that it was built from only five vertebrate genomes and only from sequences within that similarity range, so it is not highly representative of other organisms.
Moreover, according to the review by Arenas [18] on trends in developing scoring matrices that reflect evolution at the molecular level, building a codon scoring matrix that reflects evolution correctly requires accounting for heterogeneity along the sequence and over time: regions and sites in a sequence can evolve in different ways over different time spans.
URGENCY OF THE RESEARCH TOPIC
The field of molecular evolution needs accurate tools for analyzing, searching, and assessing the evolutionary relationships between biological macromolecular sequences, and therefore needs reliable scoring matrices. Many studies have worked on improving these tools, producing many scoring matrices that correspond to different substitution models. Nevertheless, the scoring matrices built in as defaults in today's common sequence similarity search tools are not always appropriate and therefore need to be replaced []; likewise, a substitution model may perform best on data that are not representative of the real data on which relationships between species are analyzed and assessed []. Research into improved substitution models has shown that codon substitution models are clearly superior to nucleotide and amino acid models [18, 37-39]. However, neither simple models nor combinations of different models can yet correctly reflect the different evolution of different regions and sites in a sequence []. For this reason, this study examines the degree of change of trimers in the protein sense and antisense sequences on the two replichores of the chromosomes of bacteria of the family Burkholderiaceae, as a basis for building codon scoring matrices in the future.
1.5 NEW CONTRIBUTIONS OF THE STUDY
This study makes two new contributions:
1. First, it is the first study aimed at building scoring matrices that describe codon substitution rates on the basis of the substitution rates of trimers in the protein sense and antisense sequences of the chromosome.
2. Second, scoring matrices have so far always been built from large sets of sequences in biological databases, without considering the position of those sequences in specific chromosomes or the evolutionary distance between them. This study determines the distribution density of trimers in the protein sense and antisense sequences on both sides of the replication origin of the chromosomes of bacteria of the family Burkholderiaceae, and examines how this density changes as the bacteria evolve, with a view to building a codon scoring matrix in subsequent studies.
This research direction is consistent with the assessment of research trends in building scoring matrices for evolutionary analysis given in the review by Arenas [18], namely that heterogeneity along the chromosome sequence and over time must be taken into account.
1.6 RESEARCH OBJECTIVE
The study aims to examine the changes in trimers in the protein sense and antisense sequences during the evolution of bacteria of the family Burkholderiaceae.
SCIENTIFIC AND PRACTICAL SIGNIFICANCE OF THE STUDY
Scientific significance
Protein sense sequences are identical to mRNA sequences, and protein antisense sequences are the templates from which mRNA is synthesized. The degree of change of trimers in the protein sense and antisense sequences of bacteria during evolution therefore provides a basis for calculating codon substitution rates during bacterial evolution, and a foundation for building codon scoring matrices in the future.
Practical significance
Because the results will serve as a foundation for codon scoring matrices used to analyze, search, and assess the evolutionary relationships between biological macromolecular sequences, they will make a practical contribution to the field of evolutionary analysis. They will support studies that search databases for similar sequences, comparative genomics studies, and studies of the evolutionary relationships between species.
2.1 MATERIALS
The objects of study are the complete chromosome sequences of bacteria of the family Burkholderiaceae, downloaded as files with the .fna extension, together with information on the positions of the protein sense and antisense sequences in the chromosome, downloaded as files with the .ptt extension, from the NCBI database.
Study workflow (from the flow chart):
1. Complete chromosome sequences of bacteria of the family Burkholderiaceae
2. Determine the positions of the replication origin and terminus on the chromosomes
3. Determine the distribution density of trimers in the protein sense and antisense sequences
4. Determine the degree of change in trimer distribution density in the protein sense and antisense sequences between bacteria
The figure is a rooted phylogenetic tree showing the order of evolution and the evolutionary distances among bacteria of the family Burkholderiaceae, based on their rDNA sequences []. According to this tree, a common ancestor of all these species diverged into two branches: the first (branch I) leads toward B. rhizoxinica HKI 454 and the species above it, and the second (branch II) leads toward B. phymatum STM815 and the species below it. We randomly selected several species in branch I, together with B. rhizoxinica HKI 454, to examine how trimer density in their protein sense and antisense sequences changes relative to B. rhizoxinica HKI 454. We likewise randomly selected several species in branch II, together with B. phymatum STM815, and examined them in the same way.
2.2.1 Determining the positions of the replication origin and terminus
The positions of the replication origin and terminus in each chromosome were determined from the cumulative GC-skew values along the complete chromosomal DNA sequence contained in the .fna files, using the online GenSkew program [40]. The first replichore (R1) runs from the origin to the terminus of replication, and the second replichore (R2) runs from the terminus back to the origin.
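The cumulative GC-skew idea can be sketched in a few lines. This is not the GenSkew program itself: the window size, the function names, and the common convention used here (origin near the minimum of the cumulative skew, terminus near the maximum) are assumptions made for illustration.

```python
def cumulative_gc_skew(seq, window=1000):
    """Running sum of (G - C) / (G + C) over consecutive non-overlapping windows."""
    total, values = 0.0, []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        if g + c:                      # skip windows without G or C
            total += (g - c) / (g + c)
        values.append(total)
    return values

def origin_and_terminus(seq, window=1000):
    """Estimate origin (minimum) and terminus (maximum) as window end positions."""
    values = cumulative_gc_skew(seq, window)
    ori = (values.index(min(values)) + 1) * window
    ter = (values.index(max(values)) + 1) * window
    return ori, ter

# Synthetic sequence: C-rich, then G-rich, then C-rich again.
toy = "C" * 3000 + "G" * 4000 + "C" * 1000
ori, ter = origin_and_terminus(toy, window=1000)
```

On a real circular chromosome the two estimated positions split the sequence into the two replichores, R1 and R2.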
2.2.2 Extracting the protein sense and antisense sequences
The protein sense and antisense sequences were extracted from the complete chromosome sequences with the extractseq program of the EMBOSS package [41], using their positions along the chromosome as recorded in the .ptt files from the NCBI database mentioned above.
2.2.3 Determining the frequency of trimers in a sequence
The frequency of a trimer in a sequence is the number of times that trimer occurs along the sequence; it was determined with software used in a previous study []. The frequencies of the trimers in a group of protein sense or antisense sequences on each replichore of the chromosome were counted without overlapping nucleotides along each sequence, in the 5' to 3' direction. There are 64 possible trimers, of which 32 are the reverse-complement (RC) trimers of the other 32 (Table 2.1, in which each trimer is followed by its RC trimer).
AAA TTT CAA TTG GAA TTC TAA TTA
AAC GTT CAC GTG GAC GTC TAC GTA
AAG CTT CAG CTG GAG CTC TAG CTA
AAT ATT CAT ATG GAT ATC TAT ATA
ACA TGT CCA TGG GCA TGC TCA TGA
ACC GGT CCC GGG GCC GGC TCC GGA
ACG CGT CCG CGG GCG CGC TCG CGA
ACT AGT CCT AGG GCT AGC TCT AGA
AGA TCT CGA TCG GGA TCC TGA TCA
AGC GCT CGC GCG GGC GCC TGC GCA
AGG CCT CGG CCG GGG CCC TGG CCA
AGT ACT CGT ACG GGT ACC TGT ACA
ATA TAT CTA TAG GTA TAC TTA TAA
ATC GAT CTC GAG GTC GAC TTC GAA
ATG CAT CTG CAG GTG CAC TTG CAA
ATT AAT CTT AAG GTT AAC TTT AAA
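The counting scheme and the pairing shown in Table 2.1 can be reproduced with a short sketch. It is not the software used in the study; the function names are hypothetical, and sequences are assumed to be uppercase DNA strings read 5' to 3'.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(trimer):
    """Return the reverse-complement (RC) of a DNA string."""
    return trimer.translate(COMPLEMENT)[::-1]

def count_trimers(sequences):
    """Count non-overlapping trimers in each sequence, reading 5'->3'."""
    counts = Counter()
    for seq in sequences:
        for i in range(0, len(seq) - 2, 3):   # step 3: no shared nucleotides
            counts[seq[i:i + 3]] += 1
    return counts

ALL_TRIMERS = ["".join(p) for p in product("ACGT", repeat=3)]
# Each unordered pair {trimer, RC trimer} appears once in this set.
RC_PAIRS = {frozenset((t, reverse_complement(t))) for t in ALL_TRIMERS}

counts = count_trimers(["ATGAAATTT", "AAAC"])
```

Because a trimer has odd length it can never equal its own reverse complement, which is why the 64 trimers split cleanly into 32 pairs.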
2.2.4 Determining the distribution density of trimers in a sequence
Each replichore of a chromosome has two groups of sequences: replichore R1 has the groups R1-S and R1-AS, and replichore R2 has the groups R2-S and R2-AS. To determine the distribution density of a trimer (trimer density for short) in a group of protein sense or antisense sequences, the frequency of that trimer is first determined as described in Section 2.2.3. The frequency values of the trimers are then transferred to an Excel spreadsheet. The distribution density $M_T$ of a trimer is calculated as:
$$M_T = \frac{x \times 1000}{L}$$
where $x$ is the frequency of the trimer and $L$ (in bp) is the total length of all sequences in the group used to determine that frequency.
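The density formula is a simple normalization to occurrences per 1000 bp. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def trimer_density(counts, total_length_bp):
    """M_T = x * 1000 / L for every trimer in a group of sequences."""
    if total_length_bp <= 0:
        raise ValueError("total length must be positive")
    return {t: x * 1000 / total_length_bp for t, x in counts.items()}

# Illustrative counts for a group whose sequences total 2000 bp.
density = trimer_density({"AAA": 5, "GAG": 12}, total_length_bp=2000)
```

Normalizing by the total length makes groups of different sizes comparable, which is needed before densities from different replichores or species are compared.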
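The frequency values of the trimers in each group of sequences are split into two series: one holds the values for 32 trimers and the other holds the values for the 32 corresponding RC trimers. With four groups of sequences (R1-S, R1-AS, R2-S, and R2-AS), each chromosome therefore has eight series of density values (see the Appendix).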
2.2.5 Determining the evolutionary distance between the protein sense and antisense sequences of two different bacteria
When bacterium B evolves away from bacterium A, the evolutionary distance $d$ between the protein sense and antisense sequences of the two species is calculated as:
$$d = 1 - r$$
where $r$ is the Pearson correlation coefficient, computed with the CORREL function in Excel, between a series of trimer density values of species A and the corresponding series of species B: between R1-S of A and R1-S of B, between R1-AS of A and R1-AS of B, between R2-S of A and R2-S of B, or between R2-AS of A and R2-AS of B.
For bacteria with more than one chromosome, mean density values are used. For a bacterium with two chromosomes, for example, the mean trimer densities are the arithmetic means of the values in the two corresponding density series of the two chromosomes, such as R1-S of one chromosome and R1-S of the other (see the Appendix).
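The distance calculation can be sketched without a spreadsheet. The code below computes the Pearson coefficient directly, which is the quantity Excel's CORREL returns; the function names and the sample values are illustrative assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def evolutionary_distance(density_a, density_b):
    """d = 1 - r between corresponding density series of species A and B."""
    return 1 - pearson_r(density_a, density_b)

def mean_series(series_1, series_2):
    """Element-wise mean of the matching series from two chromosomes."""
    return [(a + b) / 2 for a, b in zip(series_1, series_2)]

d_same = evolutionary_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
d_opposite = evolutionary_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Since r lies between -1 and 1, d ranges from 0 (identical density profiles up to scale) to 2 (perfectly opposed profiles).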
2.2.6 Determining the degree of change in trimer density in the protein sense and antisense sequences as one species evolves away from another
For each pair of bacteria A and B, where B has evolved further from the ancestor (or intermediate ancestor) than A, the value $\log_{10}(M_T^B / M_T^A)$ indicates how much the density of trimer T in B differs from that in A after the ancestor evolved into A and B. Here $M_T^A$ and $M_T^B$ are the densities of trimer T in the same group of sequences, for example R1-S, of bacterium A and bacterium B, respectively. Specifically, as bacterium B evolves away from bacterium A, the density of trimer T increases when $\log_{10}(M_T^B / M_T^A) > 0$, decreases when $\log_{10}(M_T^B / M_T^A) < 0$, and is unchanged when $\log_{10}(M_T^B / M_T^A) = 0$.
For bacteria with more than one chromosome, mean density values are used.
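A minimal sketch of the log-ratio calculation follows. The function name and the density values are made up for illustration; trimers with a zero density in either species are skipped because the logarithm is undefined for them, which is a choice of this sketch rather than something the text specifies.

```python
from math import log10

def density_change(m_a, m_b):
    """log10(M_T^B / M_T^A) for every trimer with positive density in both."""
    return {
        t: log10(m_b[t] / m_a[t])
        for t in m_a
        if t in m_b and m_a[t] > 0 and m_b[t] > 0
    }

# Illustrative densities for the same group (e.g. R1-S) in species A and B.
m_a = {"GAG": 1.0, "TTA": 10.0, "AAA": 2.0}
m_b = {"GAG": 10.0, "TTA": 1.0, "AAA": 2.0}
change = density_change(m_a, m_b)
```

A positive value marks a trimer whose density is higher in B than in A, a negative value one whose density is lower, and zero no change.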
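2.2.7 Determining the proportion of increase (or decrease) in trimer density in the protein sense and antisense sequences as one species evolves away from another
When bacterium B evolves away from bacterium A, the increase or decrease in the density of a trimer, $\Delta M_T$, is the difference between the trimer's density in the corresponding protein sense or antisense sequences on each replichore of the chromosome of B and that of A:
$$\Delta M_T = M_T^B - M_T^A$$
On R1, among the 32 trimers, $a$ trimers have $\Delta M_T > 0$ in R1-S and $b$ trimers have $\Delta M_T > 0$ in R1-AS ($a$ and $b$ are each at most 32). Likewise, among the 32 corresponding RC trimers, $c$ RC trimers have $\Delta M_T > 0$ in R1-S and $d$ RC trimers have $\Delta M_T > 0$ in R1-AS ($c$ and $d$ are each at most 32). The total increase in trimer density over the whole of R1 is then:
$$\Sigma M_T^{\text{inc-R1}} = \Sigma\Delta M_{T\text{-S}} + \Sigma\Delta M_{T\text{-AS}} + \Sigma\Delta M_{T_{RC}\text{-S}} + \Sigma\Delta M_{T_{RC}\text{-AS}}$$
where $\Sigma\Delta M_{T\text{-S}}$, $\Sigma\Delta M_{T\text{-AS}}$, $\Sigma\Delta M_{T_{RC}\text{-S}}$, and $\Sigma\Delta M_{T_{RC}\text{-AS}}$ are the sums of $\Delta M_T$ over the $a$ trimers in R1-S, the $b$ trimers in R1-AS, the $c$ RC trimers in R1-S, and the $d$ RC trimers in R1-AS, respectively.
The total decrease in trimer density over the whole of R1, $\Sigma M_T^{\text{dec-R1}}$, is calculated in the same way for the trimers and RC trimers with $\Delta M_T < 0$.
On R2, the total increase $\Sigma M_T^{\text{inc-R2}}$ and the total decrease $\Sigma M_T^{\text{dec-R2}}$ are calculated in the same way as for R1.
The percentage increase in the density of trimer T on R1 as B evolves away from A is calculated as:
$$\%\Delta M_T = \frac{\Delta M_T \times 100}{\Sigma M_T^{\text{inc-R1}}}$$
Similarly, the percentage decrease in the density of trimer T on R1 as B evolves away from A is:
$$\%\Delta M_T = \frac{\Delta M_T \times 100}{\left|\Sigma M_T^{\text{dec-R1}}\right|}$$
For bacteria with more than one chromosome, mean density values are used.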
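The percentage calculation can be sketched as follows. The function name, the keying of each value by (trimer, group), and the sample numbers are assumptions made for illustration; the input is taken to hold the ΔM_T values of all trimers and RC trimers across the sense and antisense groups of one replichore.

```python
def percent_changes(delta):
    """Express each delta as a percentage of the total increase or total decrease.

    delta maps (trimer, group) -> delta M_T for one replichore.
    """
    total_inc = sum(v for v in delta.values() if v > 0)
    total_dec = abs(sum(v for v in delta.values() if v < 0))
    result = {}
    for key, v in delta.items():
        if v > 0:
            result[key] = v * 100 / total_inc
        elif v < 0:
            result[key] = v * 100 / total_dec
        else:
            result[key] = 0.0
    return result

# Illustrative delta M_T values on R1.
delta_r1 = {
    ("GAG", "R1-AS"): 3.0,
    ("CTC", "R1-S"): 1.0,
    ("TTA", "R1-S"): -2.0,
    ("TAA", "R1-AS"): -2.0,
}
pct = percent_changes(delta_r1)
```

Increases then sum to 100% and decreases to -100%, so each trimer's share of the overall change on the replichore can be read directly.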
CHAPTER 3. RESULTS AND DISCUSSION
DISTRIBUTION DENSITY OF TRIMERS IN THE SEQUENCES
Trimer density in the protein sense and antisense sequences on the replichores
The protein sense (S) and antisense (AS) sequences were extracted from replichores R1 and R2 to form the sequence groups R1-S (protein sense sequences on R1), R1-AS (protein antisense sequences on R1), R2-S (protein sense sequences on R2), and R2-AS (protein antisense sequences on R2). Trimer density in these groups was determined as described in Section 2.2.4. The distribution densities of trimers in the protein sense and antisense sequences on the two replichores of all chromosomes of the Burkholderiaceae species examined show that bacterial chromosomes always preserve the similarity between the density of a trimer in the protein sense sequences on one replichore and the density of the corresponding RC trimer in the protein antisense sequences on the other replichore. Figures 3.1 and 3.2 show the distribution densities of the trimers and their corresponding RC trimers in the groups R1-S, R1-AS, R2-S, and R2-AS of the two chromosomes of B. mallei ATCC23344. The Appendices show the same densities for several other species.
Figure 3.1 Trimer distribution density in the protein sense and antisense sequences of chromosome 1
Figure 3.2 Trimer distribution density in the protein sense and antisense sequences of chromosome 2
On each chromosome, the density of a trimer in the protein sense sequences on one replichore (solid red line) is approximately equal to the density of the corresponding RC trimer in the protein antisense sequences on the other replichore (dashed green line). Likewise, the density of a trimer in the protein sense sequences on the second replichore (solid pink line) is approximately equal to the density of the corresponding RC trimer in the protein antisense sequences on the first (dashed light-green line). The density of a trimer in the protein antisense sequences on one replichore (solid dark-blue line) is approximately equal to the density of the corresponding RC trimer in the protein sense sequences on the other (dashed yellow line), and the density of a trimer in the protein antisense sequences on the second replichore (solid light-blue line) is approximately equal to the density of the corresponding RC trimer in the protein sense sequences on the first (dashed orange line).
Because this similarity in trimer density between the protein sense or antisense sequences on the two replichores of a chromosome is always preserved, we compared trimer density in the protein sense and antisense sequences on each replichore between bacterial species, in order to assess how much the distribution density of trimers in these sequences differs between species.
DEGREE OF CHANGE IN TRIMER DENSITY IN THE SEQUENCES
Degree of change in trimer distribution density during the evolution of the bacteria
In branch I of the phylogenetic tree, B. rhizoxinica HKI 454 is the species closest to the ancestral bacterium. We calculated the ratios between the trimer densities in the protein sense or antisense sequences on each replichore of this bacterium and those of each of the other selected bacteria in branch I. The randomly selected bacteria were B. lata 383, B. thailandensis E264, B. pseudomallei 1026b, B. mallei SAVP1, and B. mallei ATCC23344. After taking the base-10 logarithm of these ratios, we identified the trimers whose density increased (positive logarithm) and those whose density decreased (negative logarithm) as the bacteria evolved away from B. rhizoxinica HKI 454 (Figures 3.3, 3.4, 3.5, 3.6, and 3.7).
Figure 3.3 Differences in trimer density between B. rhizoxinica HKI 454 and B. lata 383
Figure 3.3A shows that, compared with B. rhizoxinica HKI 454, the trimer densities on R1 of B. lata both increase and decrease; however, fewer trimers increase in density (log10(M_TB/M_TA) > 0) than decrease (log10(M_TB/M_TA) < 0). The largest increase is that of trimer GAG in the protein antisense sequences (dark blue column) and of its reverse-complement trimer, CTC, in the protein sense sequences (orange column). In the protein sense sequences the density of GAG decreases (red column), and in the protein antisense sequences CTC also decreases (yellow-green column), although to a lesser extent. The trimers with the largest decreases are TTA and CTA in the protein sense sequences (red columns) and their reverse-complement trimers, TAA and TAG, in the protein antisense sequences (yellow-green columns). The decrease of TTA and CTA in the protein antisense sequences (dark blue columns) is smaller than in the protein sense sequences (red columns), and the decrease of their reverse-complement trimers, TAA and TAG, in the protein sense sequences (orange columns) is smaller than in the protein antisense sequences (yellow-green columns).
Similarly, on R2 (Figure 3.3B), fewer trimers increase in density than decrease, and the trimer with the largest increase is again GAG in the protein antisense sequences together with its reverse-complement trimer, CTC, in the protein sense sequences. The trimers with the largest decreases are again TTA and CTA in the protein sense sequences and their reverse-complement trimers, TAA and TAG, in the protein antisense sequences. The densities of TTA and CTA also decrease less in the protein antisense sequences than in the protein sense sequences, and those of TAA and TAG decrease less in the protein sense sequences than in the protein antisense sequences.
In addition, Figure 3.3 shows that all trimers and their corresponding reverse-complement trimers on both replichores change in distribution density, to a greater or lesser degree.
Compared with B. rhizoxinica HKI 454, the trimer distribution densities in the protein sense and antisense sequences of B. thailandensis E264, B. pseudomallei 1026b, B. mallei SAVP1 and B. mallei ATCC23344 show the same pattern of increases and decreases as in B. lata (Figures 3.4, 3.5, 3.6 and 3.7). Different species differ in the degree of change in trimer density, as reflected in the values of log10(M_TB/M_TA).
Figure 3.4. Difference in trimer density between B. rhizoxinica HKI 454 and B. thailandensis
Figure 3.5. Difference in trimer density between B. rhizoxinica HKI 454 and B. pseudomallei
Figure 3.6. Difference in trimer density between B. rhizoxinica HKI 454 and B. mallei SAVP1
Figure 3.7. Difference in trimer density between B. rhizoxinica HKI 454 and B. mallei
In clade II of Figure …, compared with B. phymatum STM815, the trimer distribution densities on the replichores of B. xenovorans LB400 also both increase and decrease, as in the clade I bacteria. Here, however, more trimers increase in density than decrease, and the trimers that increase do so by larger amounts than the trimers that decrease (Figure 3.8).
Figure 3.8. Difference in trimer density between B. phymatum STM815 and B. xenovorans
On R1 (Figure 3.8A), the trimers with the largest density increases are: TTA in the protein sense sequences (red column) and its reverse-complement trimer, TAA, in the protein antisense sequences (yellow-green column); TCT in the protein antisense sequences (dark blue column) and its reverse-complement trimer, AGA, in the protein sense sequences (orange column); TAT in the protein antisense sequences (dark blue column) and its reverse-complement trimer, ATA, in the protein sense sequences (orange column); GGT in the protein antisense sequences (dark blue column) and its reverse-complement trimer, ACC, in the protein sense sequences (orange column); and CCT in the protein antisense sequences (dark blue column) and its reverse-complement trimer, AGG, in the protein sense sequences (orange column). Also notable are AGT in the protein sense and antisense sequences (red and dark blue columns, respectively) and its reverse-complement trimer, ACT, in the protein sense and antisense sequences (orange and yellow-green columns, respectively).
The same holds on R2 (Figure 3.8B): more trimers increase in density than decrease, and the trimers that increase do so by larger amounts than the trimers that decrease. Unlike the clade I bacteria, in which the trimer density changes on R1 closely resemble those on R2, in B. xenovorans LB400 the trimer density changes on R1 differ visibly from those on R2. This difference becomes even clearer when B. phymatum STM815 is compared with a more distantly evolved species, Burkholderia sp. CCGE1003 (Figure 3.9).
Figure 3.9. Difference in trimer density between B. phymatum STM815 and Burkholderia sp.
Figure 3.9A shows that the densities of trimers TCT, TCA, TAT, GCA, CCT and ACA change more strongly in the protein antisense sequences on R1 (dark blue columns) than in B. xenovorans LB400 (Figure 3.8A). Their corresponding reverse-complement trimers in the protein antisense sequences on this replichore (yellow-green columns) also change strongly, although less than the trimers themselves in the same sequences (dark blue columns). By contrast, the densities of the trimers and their corresponding reverse-complement trimers on R2 (Figure 3.9B) change less than on R1.
Thus, the results show that the distribution densities of all trimers change as bacteria evolve, with both increases and decreases. In different species, different numbers of specific trimers and of their corresponding reverse-complement trimers change in distribution density, and to very different degrees, in the protein sense and antisense sequences on both replichores of the chromosome.
What, then, do these changes in trimer distribution density have in common that allows the similarity between the densities of trimers and of their corresponding reverse-complement trimers in the protein sense and antisense sequences on the two replichores of the chromosome, as presented in Section …, to be conserved?
Evolutionary distance between protein sense and antisense sequences across species
Because trimer densities in complete chromosome sequences are highly species-specific and can therefore classify bacteria below the genus level better than 16S rDNA […], in this study we used the trimer densities in the protein sense and antisense sequences to compute the evolutionary distances of these sequences between species.
According to the results presented in Section …, as bacteria evolve there is always a similarity between the trimer distribution densities in the protein sense (or antisense) sequences on one replichore and the corresponding reverse-complement trimer distribution densities in the protein antisense (or sense) sequences on the other replichore of the chromosome. In this part of the study we therefore compared, between two species, the trimer distribution densities in the protein sense (or antisense) sequences on each replichore of the chromosome, using the CORREL function in Excel. For species with more than one chromosome, the mean of the trimer density values in these sequences on each replichore over all chromosomes was used. If the trimer distribution densities change or differ only slightly, the result of CORREL, which is the Pearson correlation coefficient r, approaches 1; the more the trimer distribution densities change, the further r falls below 1. The quantity d = 1 - r then expresses the evolutionary distance between the protein sense (or antisense) sequences on R1 or R2 of the two bacteria. Figures 3.10 and 3.11 show the evolutionary distances between the protein sense and antisense sequences on R1 and on R2, respectively, between B. rhizoxinica HKI 454 and the other bacteria in clade I of Figure ….
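The distance d = 1 - r can be sketched in a few lines. The text computes r with Excel's CORREL function, which returns the Pearson correlation coefficient; the pure-Python version below is an equivalent illustration rather than the procedure actually used, and the function names and density values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either profile is constant.
    return cov / math.sqrt(var_x * var_y)

def trimer_distance(density_a, density_b):
    """Distance d = 1 - r between two trimer density profiles."""
    trimers = sorted(set(density_a) & set(density_b))  # shared trimers only
    xs = [density_a[t] for t in trimers]
    ys = [density_b[t] for t in trimers]
    return 1.0 - pearson_r(xs, ys)

# Hypothetical three-trimer profiles, for illustration only.
profile_a = {"AAA": 1.0, "AAC": 2.0, "AAG": 3.0}
profile_b = {"AAA": 3.0, "AAC": 2.0, "AAG": 1.0}

d_same = trimer_distance(profile_a, profile_a)      # identical profiles
d_opposite = trimer_distance(profile_a, profile_b)  # perfectly anti-correlated
```

Since r lies between -1 and 1, d ranges from 0 for perfectly correlated profiles to 2 for perfectly anti-correlated ones. In practice all 64 trimer densities of the two species would be passed in.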
Figure 3.10. Evolutionary distances between the protein sense and antisense sequences on R1 of B. rhizoxinica HKI 454 and of the other species in clade I
On R1, the distance between the protein sense sequences (red columns), which arises from changes in the densities of the trimers in these sequences, and the distance between the protein antisense sequences (yellow-green columns), which arises from changes in the densities of the corresponding reverse-complement trimers in the protein antisense sequences, are the shortest. This indicates that the trimer densities in the protein sense sequences and the corresponding reverse-complement trimer densities in the protein antisense sequences change less than the trimer densities in the protein antisense sequences (dark blue columns) and the corresponding reverse-complement trimer densities in the protein sense sequences (orange columns). It is also noteworthy that on R1 the distance between the protein antisense sequences (dark blue columns) and the distance between the protein sense sequences (orange columns) are not only large but also differ greatly from each other. During evolution, therefore, the trimer densities in the protein antisense sequences and the corresponding reverse-complement trimer densities in the protein sense sequences on R1 not only change more strongly than the trimer densities in the other two groups of sequences, but also differ greatly from each other.
Figure 3.11. Evolutionary distances between the protein sense and antisense sequences on R2 of B. rhizoxinica HKI 454 and of the other species in clade I
On R2, as on R1, the evolutionary distances show that the trimer densities in the protein sense sequences and the corresponding reverse-complement trimer densities in the protein antisense sequences change less than the trimer densities in the protein antisense sequences and the corresponding reverse-complement trimer densities in the protein sense sequences. The trimer densities in the protein antisense sequences and the corresponding reverse-complement trimer densities in the protein sense sequences likewise change more strongly than the trimer densities in the other two groups of sequences, and differ greatly from each other. However, this difference runs in the opposite direction to that on R1, except in B. mallei ATCC23344.
In clade II (Figure …), the distances between the protein sense and antisense sequences on the two replichores of the bacteria in this clade were found to be similar to those of the bacteria in clade I (Figures … and …).
Figure 3.12. Evolutionary distances between the protein sense and antisense sequences on R1 of B. phymatum STM815 and of the two other species in clade II
Figure 3.13. Evolutionary distances between the protein sense and antisense sequences on R2 of B. phymatum STM815 and of the two other species in clade II
In general, during the evolution of bacteria of the family Burkholderiaceae, the trimer densities in the protein sense sequences and the corresponding reverse-complement trimer densities in the protein antisense sequences change less than the trimer densities in the protein antisense sequences and the corresponding reverse-complement trimer densities in the protein sense sequences. Change is much stronger for the trimers in the protein antisense sequences and for the corresponding reverse-complement trimers in the protein sense sequences. The density changes of the trimers in the protein antisense sequences and of the reverse-complement trimers in the protein sense sequences also differ greatly from each other.
These results indicate that, during evolution, trimers in the protein sense sequences are substituted less often than in the protein antisense sequences, and their corresponding reverse-complement trimers in the protein antisense sequences are substituted less often than in the protein sense sequences, on both replichores of the chromosome.
Proportion of trimer density change between bacteria
The distribution density of a trimer is the mean frequency of occurrence of that trimer per kbp of sequence. According to the results presented in Section …, as bacteria evolve the distribution densities of all trimers change, either increasing or decreasing, and in different species different numbers of specific trimers and of their corresponding reverse-complement trimers change in distribution density to different degrees in the protein sense and antisense sequences on both replichores of the chromosome. According to the results presented in Section …, as bacteria evolve the trimer densities in the protein sense sequences and the corresponding reverse-complement trimer densities in the protein antisense sequences change less than the trimer densities in the protein antisense sequences and the reverse-complement trimer densities in the protein sense sequences; the changes in the latter two are much larger and also differ greatly from each other. At the same time, evolving bacteria conserve the density distribution of trimers in the protein sense sequences on one replichore and of the corresponding reverse-complement trimers in the protein antisense sequences on the other replichore (Section …). This part of the study was therefore carried out to determine the proportion by which a trimer increases or decreases when trimer densities change, in relation to the density increases or decreases of the other trimers. The method is described in Section ….7.
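As an illustration of the density definition above, a minimal sketch follows. It assumes overlapping trimers are counted with a sliding window of step 1 and that the density is the count scaled to 1000 bp; the section does not state the counting scheme, so both choices, and the function names, are assumptions. The helper also shows how a reverse-complement trimer is derived.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, e.g. GAG -> CTC."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def trimer_density(seq, per_bp=1000):
    """Occurrences of each trimer per `per_bp` bases (1 kbp by default).

    Assumption: overlapping trimers are counted with a sliding window of
    step 1; trimers containing letters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    return {
        trimer: count * per_bp / len(seq)
        for trimer, count in counts.items()
        if set(trimer) <= set("ACGT")
    }

# Tiny illustrative sequence (not real genome data).
density = trimer_density("GAGGAG")  # windows: GAG, AGG, GGA, GAG
```

With this helper, the density of a trimer in a sense sequence can be compared directly with the density of `reverse_complement(trimer)` in the antisense sequence on the other replichore.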
According to the results presented in Section …, in the clade I bacteria (Figure …) fewer trimers or corresponding reverse-complement trimers on both replichores increase in density than decrease. However, the results in Table … show that on each replichore the total increase in trimer density always equals the total decrease in trimer density. In other words, whatever amount the trimers with increasing density gain on a replichore, the trimers with decreasing density on that replichore must lose the same amount, which shows that the trimers that increase and the trimers that decrease on the same replichore depend on one another.
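The bookkeeping behind this balance, and behind the proportions reported for individual trimers, can be sketched as follows. This is an illustrative reading of the method rather than the procedure from the methods section: each trimer's share is taken as its density change divided by the total increase (or total decrease) on the replichore, with a minus sign marking decreases. The function name and all values are hypothetical.

```python
def change_shares(density_a, density_b):
    """Return (total_increase, total_decrease, shares_in_percent).

    Each density change is expressed as a percentage of the summed increase
    (for gains) or the summed decrease (for losses); losses are negative.
    """
    deltas = {t: density_b[t] - density_a[t] for t in density_a if t in density_b}
    total_up = sum(d for d in deltas.values() if d > 0)
    total_down = sum(d for d in deltas.values() if d < 0)  # negative number
    shares = {}
    for trimer, delta in deltas.items():
        if delta > 0:
            shares[trimer] = 100.0 * delta / total_up
        elif delta < 0:
            shares[trimer] = -100.0 * delta / total_down
        else:
            shares[trimer] = 0.0
    return total_up, total_down, shares

# Hypothetical densities per kbp on one replichore (not data from the study).
reference = {"GAG": 10.0, "GCG": 20.0, "TTA": 8.0, "CTA": 6.0}
compared = {"GAG": 13.0, "GCG": 21.0, "TTA": 5.0, "CTA": 5.0}

total_up, total_down, shares = change_shares(reference, compared)
```

In this toy profile the gains and losses cancel exactly, mirroring the observation that the total increase equals the total decrease on a replichore. A trimer with a small fold change can still hold a large share if its absolute density is high.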
Table …. Total increase or decrease in trimer density as the bacteria evolve away from B. rhizoxinica HKI 454 in clade I. Rx denotes R1 or R2. A minus sign (-) indicates a density decrease.
Columns: ΣΔM_T-S | ΣΔM_T-AS | ΣΔM_TBSĐN-S | ΣΔM_TBSĐN-AS | ΣM_T increase-Rx or ΣM_T decrease-Rx (BSĐN: reverse complement)
First row: B. rhizoxinica HKI 454 / B. lata 383, R1
However, the increase and decrease in trimer density on one replichore are unrelated to those on the other replichore, because the total increase or decrease in trimer density on R1 may be the same as or different from that on R2 (Table …).
Figure 3.14. Proportion of increase or decrease in trimer density as B. lata evolves away from B. rhizoxinica
As B. lata evolves away from B. rhizoxinica HKI 454 in clade I, the largest density increases on R1 are those of trimer GAG in the protein antisense sequences and of its reverse-complement trimer, CTC, in the protein sense sequences (Figure …A). Among the trimers whose density increases, GAG and CTC account for …% and …% of the total density increase, respectively (Figure …A). Meanwhile, GAG in the protein sense sequences decreases, but to a lesser extent, accounting for …% of the total density decrease, and the decrease of CTC in the protein antisense sequences accounts for …% of the total density decrease. Notably, trimer GCG and its reverse-complement trimer, CGC, increase in density in all groups of sequences, with a maximum increase of only about 1.3-fold (corresponding to log10(M_TB/M_TA) = 0.12, Figure 3.3A); nevertheless, the increase of GCG in the protein sense sequences accounts for as much as …% of the total density increase and the increase of CGC in the protein antisense sequences accounts for …% of the total density increase (Figure 3.14A).
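The proportions of increase or decrease of the trimers and reverse-complement trimers on R2 are similar to those on R1 (Figures 3.14A and B), even though the total density increase and decrease are higher on R2.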
The proportions of increase and decrease of the trimers between B. thailandensis E264, B. pseudomallei 1026b, B. mallei SAVP1, B. mallei ATCC23344 and B. rhizoxinica HKI 454 in clade I are shown in Figures 3.15, 3.16, 3.17 and 3.18, respectively. In these pairs of bacteria, GAG in the protein antisense sequences and its reverse-complement trimer, CTC, in the protein sense sequences still show a sharp proportional increase; however, the highest proportions of increase belong to GCG and its reverse-complement trimer, CGC.
Figure 3.15. Proportion of increase or decrease in trimer density as B. thailandensis E264 evolves away from B. rhizoxinica HKI 454
On R1 and R2, the increase of GCG in the protein sense sequences of B. thailandensis E264 relative to B. rhizoxinica HKI 454 accounts for …% and …% of the total density increase, respectively, and the increase of CGC in the protein antisense sequences of B. thailandensis E264 relative to B. rhizoxinica HKI 454 accounts for …% and …% of the total density increase, respectively (Figure 3.15).
Between B. pseudomallei 1026b and B. rhizoxinica HKI 454, the increase of GCG in the protein sense sequences accounts for …% of the total density increase on R1 and …% on R2, while the increase of CGC in the protein antisense sequences accounts for …% of the total density increase on R1 and 12.1% on R2 (Figure 3.16).
Figure 3.16. Proportion of increase or decrease in trimer density as B. pseudomallei 1026b evolves away from B. rhizoxinica HKI 454
Figure 3.17. Proportion of increase or decrease in trimer density as B. mallei SAVP1 evolves away from B. rhizoxinica HKI 454
Between B. mallei SAVP1 and B. rhizoxinica HKI 454, the increase of GCG in the protein sense sequences accounts for …% of the total density increase on R1 and …% on R2, while the increase of CGC in the protein antisense sequences accounts for 13…% of the total density increase on R1 and …% on R2 (Figure 3.17).
Figure 3.18. Proportion of increase or decrease in trimer density as B. mallei ATCC23344 evolves away from B. rhizoxinica HKI 454
Between B. mallei ATCC23344 and B. rhizoxinica HKI 454, the increase of GCG in the protein sense sequences accounts for …% of the total density increase on R1 and …% on R2, while the increase of CGC in the protein antisense sequences accounts for …% of the total density increase on R1 and …% on R2 (Figure 3.18).

In the clade II bacteria (Figure …), unlike in clade I, more trimers or corresponding reverse-complement trimers on both replichores increase in density than decrease (Section …). As in the clade I bacteria, however, the total increase in trimer density always equals the total decrease over the whole replichore, and the total increase or decrease in trimer density on R1 differs from that on R2 (Table …).
Table …. Total increase or decrease in trimer density as the bacteria evolve away from B. phymatum STM815 in clade II. Rx denotes R1 or R2. A minus sign (-) indicates a density decrease.
Columns: ΣΔM_T-S | ΣΔM_T-AS | ΣΔM_TBSĐN-S | ΣΔM_TBSĐN-AS | ΣM_T increase-Rx or ΣM_T decrease-Rx (BSĐN: reverse complement)
First row: B. phymatum STM815 / B. xenovorans LB400, R1
The proportions by which individual trimers increase or decrease in density in clade II, as the bacteria evolve away from B. phymatum STM815, are shown in Figures 3.19 and 3.20.
Between B. xenovorans LB400 and B. phymatum STM815 (Figure 3.19), the density changes of trimer GAG and its reverse-complement trimer, CTC, are very different from those in the clade I pairs. In addition, the trimers with the highest share of the total density increase over the whole replichore are no longer GCG and CGC, but GGC (…%) and its reverse-complement trimer, GCC (…%), and GGT (…%) and its reverse-complement trimer, ACC (…%). Between these two species, in contrast to the clade I pairs, GCG and CGC decrease in density in all groups of sequences on both replichores; the largest decrease is that of CGC in the protein antisense sequences on R…, accounting for …% of the total density decrease.
Between Burkholderia sp. CCGE1003 and B. phymatum STM815 (Figure 3.20), the density changes of trimers GAG and CTC also differ from those in the clade I pairs, and they do not resemble those of the pair B. xenovorans LB400 and B. phymatum STM815 (Figure 3.19) in clade II either. The direction (increase or decrease) and the proportions of change of a great many trimers and corresponding reverse-complement trimers also differ from those in B. xenovorans
Figure 3.19. Proportion of increase or decrease in trimer density as B. xenovorans LB400 evolves away from B. phymatum STM815
Figure 3.20. Proportion of increase or decrease in trimer density as Burkholderia sp. CCGE1003 evolves away from B. phymatum STM815
Thus, as bacteria evolve, differing numbers of trimers and corresponding reverse-complement trimers increase (or decrease) in density, with differing shares of the total density increase (or decrease) on each replichore. Regardless of how many trimers and corresponding reverse-complement trimers increase or decrease in density, the total density increase always equals the total density decrease on each replichore.
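The proportions of density increase or decrease are values that express the capacity of trimers or reverse-complement trimers to be substituted in forming the new genome of a bacterium as it evolves. The higher the absolute value of a trimer's proportion of density increase or decrease, the greater the capacity of that trimer to be substituted into the genome to give rise to a new species.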
CHAPTER … CONCLUSIONS AND RECOMMENDATIONS
CONCLUSIONS
When bacteria of the family Burkholderiaceae evolve:
1 The bacterial chromosomes always conserve the similarity between the trimer densities in the protein sense sequences on one replichore and the corresponding reverse-complement trimer densities in the protein antisense sequences on the other replichore of the chromosome
2 The distribution densities of all trimers change, either increasing or decreasing. In different species, different numbers of specific trimers and of their corresponding reverse-complement trimers change in distribution density, and to very different degrees, in the protein sense and antisense sequences on both replichores of the chromosome
3 The trimer densities in the protein sense sequences and the corresponding reverse-complement trimer densities in the protein antisense sequences change less than the trimer densities in the protein antisense sequences and the reverse-complement trimer densities in the protein sense sequences
4 Differing numbers of trimers and corresponding reverse-complement trimers increase or decrease in density, with differing shares of the total density increase or decrease on each replichore, and the total density increase always equals the total density decrease on each replichore.
RECOMMENDATIONS
Study the degree of change of trimers in the protein sense and antisense sequences in other bacterial families.
1 Watson JD, Crick FHC, 1953 A structure for deoxyribose nucleic acid Nature 171: 737-738
2 Jukes TH, Cantor CR, 1969 Evolution of protein molecules, in Mammalian Protein Metabolism, ed HM Munro, NewYork, NY: Academic Press, 21-132
3 Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ, 1990 Basic local alignment search tool J Mol Biol, 215: 403-410
4 Xiong J, 2006 Essential Bioinformatics Cambridge University Press, NY, USA
5 Dayhoff MO, Schwartz RM, Orcutt BC, 1978 A model of evolutionary change in proteins, in Atlas of Protein Sequence and Structure, ed MO Dayhoff (Washington DC: National Biomedical Research Foundation), pp 345-352
6 Henikoff S, Henikoff JG, 1992 Amino acid substitution matrices from protein blocks Proc Natl Acad Sci USA, 89: 10915-10919
7 Schneider A, Cannarozzi GM, Gonnet GH, 2005 Empirical codon substitution matrix BMC Bioinformatics, 6: 134
8 Muse SV, Gaut BS, 1994 A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome Mol Biol Evol, 11: 715-724
9 Kimura, M 1980 A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences J Mol Evol
10 Collins DW, Jukes TH, 1994 Rates of transition and transversion in coding sequences since the human-rodent divergence Genomics, 20: 386-396
11 Felsenstein J, 1981 Evolutionary trees from DNA sequences: a maximum likelihood approach J Mol Evol, 17: 368-376
12 Hasegawa M, Kishino H, Yano T, 1985 Dating the human-ape splitting by a molecular clock of mitochondrial DNA J Mol Evol, 22: 160-174
13 Zharkikh A, 1994 Estimation of evolutionary distances between nucleotide sequences J Mol Evol, 39: 315-329
14 Tavaré S, 1986 Some probabilistic and statistical problems in the analysis of DNA sequences, in Some Mathematical Questions in Biology-DNA Sequence Analysis, ed RM Miura (Providence, RI:Amer Math Soc), 57-86
15 Shoemaker JS, Fitch WM, 1989 Evidence from nuclear sequences that invariable sites should be considered when sequence divergence is calculated
16 Yang Z, 1994 Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods J Mol Evol, 39: 306-314
17 Sumner JG, Jarvis PD, Fernández-Sánchez J, Kaine BT, Woodhams MD, Holland BR, 2012 Is the general time-reversible model bad for molecular phylogenetics? Syst Biol, 61: 1069-1074
18 Arenas M, 2015 Trends in substitution models of molecular evolution Front Genet, 6: 319
19 Mueller T, Spang R, Vingron M, 2002 Estimating amino acid substitution models: a comparison of Dayhoff's estimator, the resolvent approach and a maximum likelihood method Mol Biol Evol, 19: 8-13
20 Halpern, AL, Bruno WJ, 1998 Evolutionary distances for protein-coding sequences: modeling site-specific residue frequencies Mol Biol Evol, 15: 910-
21 Lartillot N, Philippe H, 2004 A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process Mol Biol Evol, 21:
22 Zoller S, Schneider A, 2013 Improving phylogenetic inference with a semiempirical amino acid substitution model Mol Biol Evol, 30: 469-479
23 Keane TM, Creevey CJ, Pentony MM, Naughton TJ, Mclnerney JO, 2006 Assessment of methods for amino acid matrix selection their use on empirical data shows that adhoc assumptions for choice of matrix are not justified BMC
24 Taverna DM, Goldstein RA, 2000 The distribution of structures in evolving protein populations Biopolymers, 53: 1-8
25 Parisi G, Echave J, 2005 Generality of the structurally constrained protein evolution model: assessment on representatives of the four main fold classes
26 Goldstein RA, 2011 The evolution and evolutionary consequences of marginal thermostability in proteins Proteins, 79: 1396-1407
27 Grahnen JA, Nandakumar P, Kubelka J, Liberles DA, 2011 Biophysical and structural considerations for protein sequence evolution BMC Evol Biol, 11:
28 Wilke CO, 2012 Bringing molecules back into molecular evolution PloS Comput Biol, 8: e1002572
29 Arenas M, DosSantos HG, Posada D, Bastolla U, 2013 Protein evolution along phylogenetic histories under structurally constrained substitution models
30 Arenas M, Sánchez-Cobos A, Bastolla U, 2015 Maximum likelihood phylogenetic inference with selection on protein folding stability Mol Biol Evol, 32: 2195-2207
31 Bordner AJ, Mittelmann HD, 2013 A new formulation of protein evolutionary models that account for structural constraints Mol Biol Evol, 31: 736-749
32 Benner SA, Cohen MA, Gonnet GH, 1994 Amino acid substitution during functionally constrained divergent evolution of protein sequences Protein Eng, 7: 1323-1332
33 Seo TK, Kishino H, 2008 Synonymous substitutions substantially improve evolutionary inference from highly diverged proteins Syst Biol, 57: 367-377
34 Phan TH, Nguyen DL, 2012 Species-specificity of DNA trimer densities in chromosomes and their use in the classification of closely related organisms J Microbiol Methods, 91: 30-37
35 Phan TH, Tran THT, 2018 Trimer distribution patterns in the protein sense and antisense sequences on the two replichores of the Burkholderia lata chromosomes International Conference on Advanced Computing and
Applications (ACOMP) Computer Science and Software Engineering Section, pp 71-75
36 Pearson WR, 2013 Selecting the right similarity-scoring matrix Curr Protoc Bioinformatics, 43: 3.5.1-3.5.9
37 Benner SA, Cohen MA, Gonnet GH, 1994 Amino acid substitution during functionally constrained divergent evolution of protein sequences Protein Eng, 7: 1323-1332
38 Goldman N, Yang Z, 1994 A codon-based model of nucleotide substitution for protein-coding DNA sequences Mol Biol Evol, 11: 725-736
39 Seo TK, Kishino H, 2008 Synonymous substitutions substantially improve evolutionary inference from highly diverged proteins Syst Biol, 57: 367-377
40 Genskew: Harhay GP, Murray RW, Lubbers B, Griffin D, Koren S, Phillippy
AM, Harhay DM, Bono J, Clawson ML, Heaton MP, Chitko-McKown CG, Smith TPL, 2014 Completed closed genome sequences of four Mannheimia varigena isolates from cattle with shipping fever Genome Announc, 2: e00088-
41 Rice P, Longden I, Bleasby A, 2000 EMBOSS: the European molecular biology open software suite Trends Genet, 16: 276-277
42 Phan Thi Huyen, Nguyen Duc Luong, 2012 Non-random DNA trimer arrangement in Bacillus cereus ATCC 10987 chromosome Tạp chí Sinh học,
APPENDIX 1. Steps for constructing the BLOSUM scoring matrix [6]
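Suppose that a block, such as the one highlighted below, taken from the alignment of N sequences has n columns, that is, n conserved positions.
The total number of amino acid pairs X_i→X_j in a column (the possible substitutions of amino acid X_i by amino acid X_j) equals the total number of aligned sequence pairs, that is, t = N(N-1)/2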
Step 1: Compute the frequency of occurrence C_ij^k of the pair X_i→X_j in column k (1 ≤ k ≤ n) of the block:
Let x_i, x_j and C_ij^k be the frequencies of occurrence of amino acid X_i, of amino acid X_j and of the pair X_i→X_j, respectively, in column k of the block
The frequency of occurrence of the pair X_i→X_i is: C_ii^k = x_i(x_i - 1)/2
The frequency of occurrence of the pair X_i→X_j is: C_ij^k = x_i x_j
Table columns: frequency of occurrence | x_i | x_j | C_ii^k or C_ij^k
Step 2: Compute the total frequency of occurrence C_ij of the pair X_i→X_j over all n columns of the block
Step 3: Normalize the total frequencies C_ij of the pairs X_i→X_j so that they sum to 1. The frequency of occurrence q_ij of the pair X_i→X_j is then: q_ij = C_ij/T, where T = n·t = n[N(N-1)/2]
Example: the frequency of occurrence q_A→B of the pair X_A→X_B in the block above is q_A→B = (4 + 8 + 0 + 0 + 0 + 0 + 0)/(7*15) = 12/105 = 0.114
Step 4: Compute the expected (random) frequency of occurrence p_i of amino acid X_i (or p_j of amino acid X_j) in the pair X_i→X_j in the block: p_i = q_ii + Σ(q_ij/2) (i≠j); p_j = q_jj + Σ(q_ij/2) (i≠j)
The expected frequency of occurrence of the pair X_i→X_i in the block is: e_ii = p_i*p_i = p_i^2
The expected frequency of occurrence of the pair X_i→X_j in the block is: e_ij = p_i*p_j + p_j*p_i = 2p_i*p_j (j≠i)
Compute the ratio s_ij = q_ij/e_ij
Step 5: Compute the logarithm (base … or …) of s_ij
Step 6: Round the logarithm values of s_ij to integers and enter the rounded values into the BLOSUM matrix
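The steps above can be sketched as follows for a single block. This is a minimal illustration under stated assumptions: the function name and the toy block are hypothetical, base 2 is assumed for the logarithm, and the log-odds value is rounded directly, as in the steps listed here (the published BLOSUM matrices additionally scale the value before rounding).

```python
import math
from collections import Counter
from itertools import combinations_with_replacement

def blosum_scores(block, base=2):
    """Return (q, p, scores) for one ungapped block of equal-length sequences."""
    n_seqs = len(block)      # N sequences
    n_cols = len(block[0])   # n conserved columns
    pair_counts = Counter()
    for k in range(n_cols):
        column = Counter(seq[k] for seq in block)  # x_i for column k
        for a, b in combinations_with_replacement(sorted(column), 2):
            if a == b:
                pair_counts[(a, a)] += column[a] * (column[a] - 1) // 2  # C_ii^k
            else:
                pair_counts[(a, b)] += column[a] * column[b]             # C_ij^k
    total = n_cols * n_seqs * (n_seqs - 1) // 2   # T = n * N(N-1)/2
    q = {pair: c / total for pair, c in pair_counts.items() if c > 0}
    p = Counter()
    for (a, b), q_ab in q.items():
        if a == b:
            p[a] += q_ab          # q_ii contributes fully to p_i
        else:
            p[a] += q_ab / 2      # q_ij is shared between p_i and p_j
            p[b] += q_ab / 2
    scores = {}
    for (a, b), q_ab in q.items():
        expected = p[a] ** 2 if a == b else 2 * p[a] * p[b]   # e_ii or e_ij
        scores[(a, b)] = round(math.log(q_ab / expected, base))
    return q, dict(p), scores

# Toy block of N = 3 sequences and n = 2 columns (hypothetical, not from [6]).
q, p, scores = blosum_scores(["AA", "AB", "BA"])
```

Pairs that never occur in the block are left out of `q` and `scores`, because their log-odds value is undefined. A real matrix pools the counts from many blocks before normalizing.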
APPENDIX 2. Trimer densities in the protein sense and antisense sequences on R1 of chromosome 1 (NC_006348) of B. mallei ATCC23344
Trimer R1-S_Trimer R1-AS_Trimer Trimer
APPENDIX 3. Trimer densities in the protein sense and antisense sequences on R2 of chromosome 1 (NC_006348) of B. mallei ATCC23344
Trimer R2-S_Trimer R2-AS_Trimer Trimer
APPENDIX 4. Trimer densities in the protein sense and antisense sequences on R1 of chromosome 2 (NC_006349) of B. mallei ATCC23344
APPENDIX 5. Trimer densities in the protein sense and antisense sequences on R2 of chromosome 2 (NC_006349) of B. mallei ATCC23344
Trimer R2-S_Trimer R2-AS_Trimer Trimer
APPENDIX 6. Mean trimer densities on R1 of B. mallei
Trimer R1-S_Trimer R1-AS_Trimer Trimer
APPENDIX 7. Mean trimer densities on R2 of B. mallei
Trimer R2-S_Trimer R2-AS_Trimer Trimer
APPENDIX 8. Trimer distribution densities in the protein sense and antisense sequences of chromosome 1 and chromosome 2 of B. pseudomallei 1026
APPENDIX 9. Trimer distribution densities in the protein sense and antisense sequences of chromosome 1 and chromosome 2 of B. mallei SAVP
APPENDIX 10. Trimer distribution densities in the protein sense and antisense sequences of chromosome 1 and chromosome 2 of B. thailandensis E264
APPENDIX 11. Trimer distribution densities in the protein sense and antisense sequences of chromosome 1 and chromosome 2 of B. phymatum STM815.