Phylogenetic relationships in class I of the superfamily of bacterial, fungal, and plant peroxidases
Marcel Zámocký
Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
Molecular phylogeny among catalase–peroxidases, cytochrome c peroxidases, and ascorbate peroxidases was analysed. Sixty representative sequences covering all known subgroups of class I of the superfamily of bacterial, fungal, and plant heme peroxidases were selected. Each sequence analysed contained the typical peroxidase motifs evolved to bind effectively the prosthetic heme group, enabling peroxidatic activity. The N-terminal and C-terminal domains of catalase–peroxidases matching the ancestral tandem gene duplication event were treated separately in the phylogenetic analysis to reveal their specific evolutionary history. The inferred unrooted phylogenetic tree obtained by three different methods revealed the existence of four clearly separated clades (C-terminal and N-terminal domains of catalase–peroxidases, ascorbate peroxidases, and cytochrome c peroxidases) which were segregated early in the evolution of this superfamily. From the results, it is obvious that the duplication event in the gene for catalase–peroxidase occurred in the later phase of evolution, in which the individual specificities of the peroxidase families distinguished were already formed. Evidence is presented that class I of the heme peroxidase superfamily is spread among prokaryotes and eukaryotes, obeying the birth-and-death process of multigene family evolution.
Keywords: ascorbate peroxidase; catalase–peroxidase; cytochrome c peroxidase; birth-and-death process; lateral gene transfer.
Heme peroxidases are very abundant enzymes present in all living forms. These oxidoreductases are involved in a wide array of physiological processes, the most important of which concern the response to various forms of oxidative stress [1,2]. Attention was mainly drawn to the family of catalase–peroxidases by the representative encoded by KatG in Mycobacterium tuberculosis, which is capable of oxidative activation of isoniazid (isonicotinic acid hydrazide) [3], still the most widely used antituberculosis drug. All heme peroxidases have important features of their catalytic mechanism in common. After their initial oxidation with a molecule of hydrogen peroxide, they oxidize from the reactive intermediate known as compound I a wide variety of substrates according to the simplified reaction scheme: $\mathrm{H_2O_2 + 2\,AH \rightarrow 2\,H_2O + 2\,A}$.
The detailed reaction mechanism and substrate specificity of numerous peroxidases have been investigated for decades, and a large amount of experimental data has accumulated (e.g. [4], for review). It was suggested that similar heme-containing peroxidases which are very abundant in plants, fungi and some bacteria should constitute the plant peroxidase superfamily [5]. They were further classified into three subclasses according to their cellular localization and function. All representatives possess the same heme prosthetic group containing high-spin ferric iron, so the reaction specificity is apparently determined by the protein surroundings of the heme. Catalase–peroxidases, which belong to class I, are the only group of this superfamily that possesses notable catalase activity (i.e. they can oxidize and reduce hydrogen peroxide; see [6] for details). All other members of the superfamily can only reduce hydrogen peroxide with subsequent oxidation of a secondary substrate. These 'noncatalase' members of class I exhibit strong specificity for electron donors: the preferred substrate is ascorbate in the case of ascorbate peroxidases and cytochrome c for cytochrome c peroxidases. Several crystal structures of heme peroxidases have been solved, now covering all subgroups of this superfamily. In class I, the crystal structures of cytochrome c peroxidase (CCP) from Saccharomyces cerevisiae [7] and ascorbate peroxidase (APX) from Pisum sativum [8] are known; the former has served as a benchmark for peroxidase structures for two decades. APX was already crystallized in a complex with its substrate [9]. After many unsuccessful attempts, the crystals of several catalase–peroxidases were also obtained (e.g. [10,11]). Recently, the structure of catalase–peroxidase from the halophilic archaeon Haloarcula marismortui was solved to high resolution [12], and this was followed by the highly resolved structure of KatG from Burkholderia pseudomallei [13].
The phylogenetic relations of heme peroxidases have only been analysed to a certain extent: the evolutionary analysis of the mammalian peroxidase superfamily has been performed, and even a prokaryotic member has been detected [14]. The phylogenetic relations in the plant peroxidase superfamily have been analysed only partially: the common phylogeny of catalase–peroxidases and APXs has been outlined [15], and a dendrogram of 29 lignin and manganese peroxidases has been presented [16]. The present study should contribute to our understanding of the possible modes of evolution of multigene families. In principle, two possible schemes have been suggested for this type of phylogeny: (a) concerted evolution and (b) evolution by a birth-and-death process. In the first case, multigene families arise in the genomes after gene duplications by the mechanisms of unequal crossing over and gene conversion, followed by natural selection [17]. In the second case, new genes are created by repeated gene duplication; some are maintained in the genome for a long time, whereas others are deleted or become nonfunctional [18–20]. It is well documented that almost all KatGs contain two fused copies of the primordial peroxidase gene [5,15]. The copy translated into the N-terminal domain participates in catalysis and possesses the prosthetic heme group, but the catalytic function of the C-terminal domain is not apparent [12], and thus the role of the corresponding part of the gene is also unknown. These gene fusions together with single-copy genes of APXs and CCPs are ideal for investigating the evolution of a widespread multigene family. Hence, the most probable evolutionary route leading to clades of extant heme peroxidases will help to explain the occurrence and function of multigene families present in both prokaryotic and eukaryotic genomes.

Correspondence to M. Zámocký, Institute of Molecular Biology, Slovak Academy of Sciences, Dúbravská cesta 21, SK-845 51 Bratislava, Slovakia. Fax: + 4212 59307416, Tel.: + 4212 59307441, E-mail: umikmzam@savba.sk
Abbreviations: APX, ascorbate peroxidase; CCP, fungal cytochrome c peroxidase; CP, catalase–peroxidase; CPn, N-terminal domain of a catalase–peroxidase; CPc, C-terminal domain of a catalase–peroxidase; KatG, gene for catalase–peroxidase; NJ, neighbor-joining.
(Received 22 April 2004, revised 10 June 2004, accepted 21 June 2004)
Eur. J. Biochem. 271, 3297–3309 (2004) © FEBS 2004 doi:10.1111/j.1432-1033.2004.04262.x
Experimental Procedures
Sequence data
All protein sequences used in this study were obtained from the UniProt database and are listed in Table 1 together with their accession numbers and the organisms from which they originate. The protein sequences were used to infer the phylogenetic relationships. Both codon usage bias of analysed sequences ranging from archaea to higher plants and the presence of introns in only some of the members analysed (plant APXs) can cause serious problems with analysing the DNA sequences directly. Owing to the currently unequal availability of the sequences of the proposed groups of heme peroxidases (known from previous analysis in [15]), only representative sequences from each group and from each kingdom were selected for a statistically equilibrated phylogenetic analysis. All 34 catalase–peroxidases analysed here were divided into N-terminal and C-terminal domains because of the apparent tandem gene-duplication event reported previously [5,15]. The border between the domains was easily discernible because of conserved residues and motifs present in all known KatGs. All sequenced N-terminal domains are longer (average length 430 amino acids) because of several insertions not present in C-terminal domains (309 amino acids on average [6]). Twenty-one APXs, from red algae to higher plants, were selected for the analysis. APX genes expressed both in cytoplasm and chloroplasts were chosen in equal amounts. Two APXs from Euglenozoa were also included. The 25 amino acid-long fragment of APX from bovine eye (accession No. PC4445) could not be used in this analysis. This N-terminal stretch is insufficient in length and of rather unclear origin. Moreover, no other homologs of APX are known in the whole kingdom Animalia. Besides the well-known S. cerevisiae CCP sequence, two additional ascomycetous CCP sequences (as putative ORFs from sequencing projects) were also included in this study. No homologs of yeast CCP from other kingdoms are known, and, for example, bacterial CCPs belong to a different protein family.
Multiple sequence alignments
Multiple sequence alignments of catalase–peroxidases, and of APXs with CCPs, were performed using CLUSTALX [21]. In the case of catalase–peroxidases, two partial alignments were performed, for the N-terminal and C-terminal domain. Suitable parameters for all three partial alignments were: gap opening penalty, 10.0; gap extension penalty, 0.2; and gap separation distance, 8. The Blosum 62 series protein–weight matrix was used in all three cases. These parameters were the same as those used for the first alignment of class I of the peroxidase superfamily [15]. Varying the gap opening penalty setting in the range 5.0–20.0 did not change the alignment output significantly. The sequence alignments were displayed with GENEDOC [22] and refined manually with respect to known structural homology.
Profile alignments
The profile alignment mode of CLUSTALX was used stepwise on the partial alignments. Firstly, the N-terminal domains of catalase–peroxidases were aligned with the prealigned group of APXs and CCPs where the known secondary-structure elements were taken into account. This new profile was finally used to align the group of C-terminal domains of catalase–peroxidases which share the lowest sequence similarity with other superfamily members in catalytically essential regions. Suitable parameters used for all profile alignments were: gap opening penalty, 10.00; gap extension penalty, 0.1; Blosum 30 protein–weight matrix; helix and strand gap penalty, 4; and loop gap penalty, 1. Finally, areas of extensive gaps (i.e. longer than 10 amino acid positions and present in more than 90% of sequences) were omitted from the entire alignment to prevent long-branch attraction in the following procedures.
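The gap-trimming rule stated above can be illustrated with a minimal Python sketch. It is not the procedure actually used in this study (the text does not say how the trimming was carried out); the function name `drop_gap_regions`, its parameters, and the representation of the alignment as a list of equal-length strings with `-` for gaps are assumptions made purely for illustration.

```python
def drop_gap_regions(alignment, min_run=11, min_fraction=0.9, gap="-"):
    """Remove long runs of gap-rich columns from an alignment.

    alignment    -- list of equal-length aligned sequences (hypothetical input format)
    min_run      -- shortest run that is removed (11 = "longer than 10 positions")
    min_fraction -- a column is gap-rich if MORE than this fraction of rows has a gap
    """
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    # Flag every column in which more than 90% of the sequences carry a gap.
    gap_rich = [
        sum(seq[c] == gap for seq in alignment) / n_seqs > min_fraction
        for c in range(n_cols)
    ]
    keep = [True] * n_cols
    c = 0
    while c < n_cols:
        if gap_rich[c]:
            start = c
            while c < n_cols and gap_rich[c]:
                c += 1
            # Only uninterrupted runs longer than 10 columns are discarded.
            if c - start >= min_run:
                for i in range(start, c):
                    keep[i] = False
        else:
            c += 1
    return ["".join(ch for ch, k in zip(seq, keep) if k) for seq in alignment]
```

In this sketch a short gap-rich stretch (10 columns or fewer) is retained, so only the extensive insertions carried by a small minority of sequences are removed.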
Phylogenetic analysis
The profile alignment used for the phylogenetic analysis comprised 94 sequences (each KatG divided into the two corresponding domains) and a total length of 398 amino-acid positions. Three different phylogenetic methods were applied.
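As a quick consistency check, the sequence counts given in the text can be tallied in a few lines of Python. The sketch assumes that the two Euglenozoa APXs are counted in addition to the 21 APXs and that the CCP group consists of the S. cerevisiae sequence plus the two additional ascomycetous ones; the variable names are illustrative only.

```python
# Counts taken from the "Sequence data" section (grouping is an assumption, see lead-in).
catalase_peroxidases = 34          # each later split into an N- and a C-terminal domain
ascorbate_peroxidases = 21 + 2     # 21 APXs plus two from Euglenozoa
cytochrome_c_peroxidases = 1 + 2   # S. cerevisiae CCP plus two ascomycetous ORFs

enzymes = catalase_peroxidases + ascorbate_peroxidases + cytochrome_c_peroxidases
alignment_rows = 2 * catalase_peroxidases + ascorbate_peroxidases + cytochrome_c_peroxidases

print(enzymes, alignment_rows)
```

Under these assumptions the tally reproduces the sixty enzymes mentioned in the abstract and the 94 aligned sequences used for tree building.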
First, the phylogenetic relationships were inferred using the neighbor-joining (NJ) method selected from the package MEGA [23]. The following parameters were used: the Poisson correction of substitutions; the option of 'complete deletion' for handling gaps; and 100 bootstrap replications as a test of inferred phylogeny. The resulting unrooted tree topology was visualized in the TREE EXPLORER.
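The two distance settings named above can be sketched in Python as follows. With complete deletion, every alignment column containing a gap in any sequence is dropped before distances are computed; the Poisson correction then converts the proportion p of differing sites into d = -ln(1 - p). This is a minimal illustration and not the MEGA implementation; the function names and the list-of-strings input format are assumptions.

```python
import math

def complete_deletion(alignment, gap="-"):
    """Drop every column that contains a gap in at least one sequence."""
    columns = [col for col in zip(*alignment) if gap not in col]
    if not columns:
        return ["" for _ in alignment]
    return ["".join(chars) for chars in zip(*columns)]

def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance d = -ln(1 - p) for two gap-free rows."""
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    p = differences / len(seq_a)
    if p >= 1.0:
        return math.inf  # the correction is undefined when all sites differ
    return -math.log(1.0 - p)

# Toy alignment (made-up sequences, for illustration only).
rows = complete_deletion(["ACDE-G", "ACDFHG", "AKDEHG"])
d01 = poisson_distance(rows[0], rows[1])
```

A matrix of such pairwise distances is the input that a neighbor-joining routine would then cluster into a tree.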
Table 1. Sequences of enzymes used in this study. Abbreviations for all peroxidases included in this evolutionary analysis, with their accession numbers from the UniProt database and organisms from which they originate. In the case of catalase–peroxidases, the parts coding for the N-terminal and C-terminal domains of the corresponding genes (KatG) were treated separately. Sequence data for Candida albicans was obtained from the Stanford Genome Technology Center website at http://www-sequence.stanford.edu/group/candida.
Abbreviation | Accession number | Enzyme | Organism (strain)
ArathaAPXc | Q05431 | Ascorbate peroxidase 1 | Arabidopsis thaliana
ArathaAPXt | Q42593 | Ascorbate peroxidase (thylakoid) | Arabidopsis thaliana
ArchfulCP | O28050 | Catalase–peroxidase | Archaeoglobus fulgidus
AspefumCP | Q7Z7W6 | Catalase–peroxidase | Aspergillus fumigatus
AspenidCP | Q96VT4 | Catalase–peroxidase | Emericella nidulans
BacihalCP | Q9KEE6 | Catalase–peroxidase | Bacillus halodurans
BacisteCP | P14412 | Catalase I | Geobacillus stearothermophilus
BlumgraCP | Q8X1N3 | Catalase–peroxidase | Blumeria graminis
BurkcepCP | Q9AP06 | Catalase–peroxidase | Burkholderia cepacia
BurkpseCP | Q939D2, pdb: 1MWV | Catalase–peroxidase | Burkholderia pseudomallei
CandalbCCP | Contig19–10046* | Cytochrome c peroxidase | Candida albicans
CapsannAPX | Q84UH3 | Ascorbate peroxidase | Capsicum annuum
CaulcreCP | O31066 | Peroxidase/catalase | Caulobacter crescentus
ChlamspAPX | Q9SXL5 | Ascorbate peroxidase | Chlamydomonas sp. W80
ChlareiAPX | O49822 | Ascorbate peroxidase | Chlamydomonas reinhardtii
CucusatAPX | Q96399 | Ascorbate peroxidase (cytosolic) | Cucumis sativus
CucurcAPXt | O04873 | Ascorbate peroxidase (thylakoid-bound) | Cucurbita cv. Kurokawa Amakuri
DesulfiCP | ZP_00096951 | Catalase–peroxidase | Desulfitobacterium hafniense
E_coliHPI | P13029 | Catalase HPI | Escherichia coli
E_coliPCP | P77038 | EHEC-strain catalase peroxidase | Escherichia coli (strain O157:H7)
EuglgraAPX | Q8LP26 | Ascorbate peroxidase | Euglena gracilis
FraganaAPX | O48919 | Ascorbate peroxidase | Fragaria x ananassa
GaldparAPX | Q8GT26 | Hybrid-type ascorbate peroxidase | Galdieria partita (Rhodophyta)
GeobactCP | AAR35476 | Catalase–peroxidase | Geobacter sulfurreducens
GloeobaCP | Q7NGW6 | Catalase–peroxidase | Gloeobacter violaceus
GlycmaxAPX | Q43758 | Ascorbate peroxidase 1 | Glycine max
GosshirAPX | Q39780 | Ascorbate peroxidase | Gossypium hirsutum
HalomarCP | O59651, pdb: 1ITK | Catalase–peroxidase | Haloarcula marismortui
HalosalCP | Q9HHP5 | Catalase–peroxidase | Halobacterium salinarum
LegipneCP | Q9ZGM4 | Catalase–peroxidase | Legionella pneumophila
LycoesAPXt | Q8LSK6 | Ascorbate peroxidase (thylakoid) | Lycopersicon esculentum
MesecryAPX | Q42909 | Ascorbate peroxidase | Mesembryanthemum crystallinum
MesolotCP | Q987S0 | Catalase–peroxidase | Mesorhizobium loti
MethaceCP | Q8TS34 | Catalase–peroxidase | Methanosarcina acetivorans
MycoforCP | O08404 | Catalase–peroxidase | Mycobacterium fortuitum
MycosmeCP | Q59557 | Catalase–peroxidase | Mycobacterium smegmatis
MycospeCP | Q9R2E9 | Catalase–peroxidase | Mycobacterium vanbaalenii
MycotubCP | Q08129 | Catalase–peroxidase | Mycobacterium tuberculosis
NcrassaCP | Q8X182 | Catalase–peroxidase | Neurospora crassa
Ncrassahyp | Q7SDV9 | Hypothetical protein | Neurospora crassa
NictabAPXc | Q42941 | Ascorbate peroxidase (cytosolic) | Nicotiana tabacum
NictabAPXt | Q9XPR6 | Ascorbate peroxidase (thylakoid-bound) | Nicotiana tabacum
OryzsatAPX | P93404 | Ascorbate peroxidase | Oryza sativa
PenimarCP | Q8NJN2 | Catalase–peroxidase | Penicillium marneffei
PisusatAPX | P48534, pdb: 1APX | Ascorbate peroxidase | Pisum sativum
PorpyezAPX | Q7Y1X0 | Ascorbate peroxidase (cytosolic) | Porphyra yezoensis (Rhodophyta)
PseuputCP | Q88GQ0 | Catalase–peroxidase HPI | Pseudomonas putida KT2440
RhizlegCP | Q8RJZ6 | Catalase–peroxidase | Rhizobium leguminosarum
SacchceCCP | P00431, pdb: 2CYP | Cytochrome c peroxidase | Saccharomyces cerevisiae
ShewoneCP | Q8EIV5 | Catalase–peroxidase HPI | Shewanella oneidensis
SpinolAPXt | O46921 | Ascorbate peroxidase (thylakoid) | Spinacia oleracea

The same profile alignment of 94 sequences was subjected to the bootstrap procedure of the PHYLIP package [24]. After 100 bootstrap cycles, the data set was subjected to the pairwise protein distance calculation method in which the JTT protein matrix [24] was formed. This output was put in the Fitch–Margoliash 'least squares' phylogenetic tree estimation method, in which the search for the best trees was allowed. In addition, global rearrangement of the sequence order after each cycle in the FITCH program was activated. The series of trees produced was analysed by the Consense method to reveal the majority rule consensus tree. This tree was visualized with the program TREEVIEW [25].
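The bootstrap step that precedes the distance calculation can be pictured as resampling alignment columns with replacement. The following Python sketch shows only this general idea; it is not the PHYLIP code, and the function name, the seed argument, and the list-of-strings input are assumptions introduced for illustration.

```python
import random

def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Yield pseudo-alignments built by sampling columns with replacement."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    n_cols = len(alignment[0])
    for _ in range(n_replicates):
        # Draw as many column indices as the alignment has columns.
        picked = [rng.randrange(n_cols) for _ in range(n_cols)]
        yield ["".join(seq[c] for c in picked) for seq in alignment]

# Toy alignment (made-up sequences, for illustration only).
toy = ["ACDE", "ACDF", "AKDE"]
replicates = list(bootstrap_replicates(toy, n_replicates=100, seed=1))
```

Each replicate would be turned into its own distance matrix and tree, and the fraction of replicate trees that contain a given grouping is the bootstrap value reported for that node.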
The maximum likelihood unrooted phylogenetic tree was also calculated using the program PUZZLE, version 5.0 [26]. The WAG model of amino acid substitution was applied [27]. Slow and accurate parameter estimation and 50 000 puzzling steps were used on the set of data subjected to the above methods. The γ-distribution of rate heterogeneity with parameter estimation from the actual data set was used (value obtained for parameter Gamma = 0.62). In total, 230 300 quartets were analysed, and an unrooted quartet puzzling tree was produced. This tree was also visualized with the program TREEVIEW [25]. The highest likelihood trees resulting from all three methods described were compared to arrive at the expected tree.
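For orientation, the number of distinct quartets that can be drawn from n sequences is the binomial coefficient C(n, 4). The short Python sketch below only performs this arithmetic; the function name is illustrative. As a purely numerical observation, C(50, 4) equals 230 300, whereas 94 sequences would give a much larger number; the text does not state which subset of sequences entered the quartet analysis, so no conclusion about the procedure is drawn here.

```python
from math import comb

def quartet_count(n_sequences):
    """Number of distinct four-sequence subsets among n_sequences."""
    return comb(n_sequences, 4)

for n in (50, 94):
    print(n, quartet_count(n))
```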
Structural comparisons
Experimental 3D co-ordinates of two catalase–peroxidases, one APX and one CCP, were obtained from the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ, USA (http://www.rcsb.org). Their codes are mentioned in Table 1 by the corresponding sequences. The secondary-structure elements of all structures used for comparison were outlined by PDBsum (http://www.biochem.ucl.ac.uk/bsm/pdbsum) [28]. The secondary-structure content was quantified from the resulting plots with the program PROMOTIF implemented in PDBsum.
Results and discussion
Conserved regions and typical motifs in the sequences of class I peroxidases
Sixty heme peroxidases belonging to class I of the plant peroxidase superfamily were aligned with the option of profile alignment in CLUSTALX. The overall sequence similarity is 28.5%, as calculated from the 398 amino acid-long alignment used for the phylogenetic analysis. The three most important sequence areas possibly involved in the catalytic mechanism are presented in Figs 1–3. Region A is located on the distal side of the prosthetic heme group, and regions B and C are located on the proximal side. The unambiguous sequence similarities in these regions can also be traced in the known 3D crystal structures of class I peroxidases presented in Fig. 4 for members of each group analysed.
The greatest sequence conservation is achieved in the area on the distal side of the heme prosthetic group (known as peroxidase consensus pattern PS00436 in the Prosite database) surrounding the active site, where it reaches 76% (Fig. 1). The catalytic triad Arg92, Trp95, and His96 in HalomarCPn (abbreviations of all sequences analysed are listed in Table 1) located in the distal heme cavity is invariantly conserved among all N-domains of catalase–peroxidases, all CCPs and all APXs. Whereas the essential arginine (Arg92) and histidine (His96) are responsible for compound I formation [4], the latter allowing the heterolytic cleavage of the peroxide bond via acid-base catalysis [29], the coessential tryptophan (Trp95) facilitates the two-electron reduction of compound I by hydrogen peroxide [30]. Site-directed mutagenesis in E_coliHPIn [31], MycotubCPn [32], and SyncyspCPn [29], as well as in PisusatAPX [33] and SacchceCCP [34], supported the role of the catalytic triad by affecting the typical reactivity. The level of decrease in the peroxidase activity (in contrast with catalase activity) correlated with the ability of the respective mutants to bind heme. Residues corresponding to the catalytic triad are not conserved in the C-domains of catalase–peroxidases which do not bind heme. The position corresponding to Arg92 (the numbering corresponds to HalomarCP, in which the residues can also be found; Fig. 4A,B) is variable in C-domains, but there a similar basic residue occurs (e.g. Lys465; Fig. 4B). The position corresponding to His96 is even more variable in all C-domains of the catalase–peroxidases investigated. In contrast, the positions Trp95 and Trp468 were invariantly conserved among all class I representatives except MesecryAPX. From Fig. 1 it is obvious that the extension of the distal active site exhibits high sequence conservation, although lower than the region directly involved in the reaction with the peroxidic substrate. An essential asparagine (Asn126 of HalomarCPn) was located here, and its role is supported by a mutagenesis study [35]. The hydrogen-bonding network in which this residue is involved has subtle differences from that present in PisusatAPX [36] visible in the sequence alignment (e.g. Asn121 in HalomarCPn and Glu65 in PisusatAPX).

The area around the proximal heme ligand (His259 in HalomarCPn, His163 in PisusatAPX, His175 in SacchceCCP; Fig. 2) is less conserved. The overall sequence similarity around this iron ligand is only 48%. Nevertheless, the corresponding peroxidase consensus pattern PS00435 (Prosite database) is discernible in all the peroxidases analysed. The lower sequence similarity compared with the distal side can be explained by the fact that this rather variable region contributes significantly to the reaction specificity of the respective groups and therefore each family has its own typical feature in this region. The iron of the prosthetic heme group is invariantly co-ordinated by the above essential proximal histidine (Fig. 4A,C,D). However, in all C-domains of catalase–peroxidases, there is a conserved arginine (Arg622 in HalomarCPc) in the corresponding position, indicating that these domains lost their ability to co-ordinate the heme. Further, a conserved tryptophan (Trp311 in HalomarCPn, Trp179 in PisusatAPX, and Trp191 in SacchceCCP) is thought to participate in an important hydrogen-bond network on the proximal side of the heme. This residue is not conserved in all C-domains of catalase–peroxidases and some APXs (e.g. position Phe171 in MesecryAPX), supporting the theory that it is not essential for the reaction mechanism of APXs [33]. In contrast, for CCPs, Trp191 has been suggested to be the site of the free-radical formation of the corresponding CCP compound I [37]. Site-directed mutagenesis was performed in the proximal heme cavity of SyncyspCPn [38] and SacchceCCP [34] and focused on the function of the two residues. In mutated catalase–peroxidases, substitutions in both residues had a pronounced effect: a decrease in activity and loss of the prosthetic heme group. Similarly to the distal heme region, close to the essential His and Trp, there are highly conserved positions (Fig. 2) among all the sequences investigated with unknown function. Even though the C-domains of the catalase–peroxidases had lost the ability to bind the prosthetic heme group, the structural elements remained conserved.

In addition, in the N-domains of the catalase–peroxidases, there is a large, 36 amino acid-long insertion (between residues Asp268 and Thr304 in HalomarCPn) which has been suggested to have a function in the strength of Fe–N co-ordination on the proximal side [6]. This unique sequence motif in known KatG structure(s) is in principle a large loop [12] leading from the edge on the proximal side of heme to the molecular surface on the distal side. Part of this loop on the surface, around Glu271 of HalomarCP (Fig. 4A, shown in green), forms an entrance to the substrate access channel, and the remainder interacts with the C-domain of the neighboring subunit [12]. This loop contributes to the typical organization of the substrate channel to the active site (compare Fig. 4A with Fig. 4C and Fig. 4D), not surprisingly as it is a very flexible region which could not be located in the electron density map. The role of this unique insertion has been examined in MycotubCPn by mutating Ser315 to a threonine [39]. The mutated protein did not activate isoniazid because of the introduction of a steric hindrance in the access channel. Hence, it is very likely that this extension from the proximal side to the substrate channel guarantees efficient catalytic reaction of catalase–peroxidases via rapid diffusion through a channel to the heme in the active site, similarly to monofunctional catalases [40].

Table 1. (Continued).
Abbreviation | Accession number | Enzyme | Organism (strain)
StreretCP | O87864 | Catalase–peroxidase | Streptomyces reticuli
SyncyspCP | P73911 | Catalase HPI | Synechocystis sp. (PCC6803)
SynecspCP | Q55110 | Catalase–peroxidase | Synechococcus sp. (PCC7942)
TrypcruAPX | Q8I1N3 | Ascorbate-dependent peroxidase | Trypanosoma cruzi
VibrchoCP | Q9KRS6 | Catalase–peroxidase | Vibrio cholerae
VignungAPX | Q41712 | Ascorbate peroxidase | Vigna unguiculata
XantcamCP | Q8PBB7 | Catalase–peroxidase | Xanthomonas campestris
XylefasCP | Q9PBB2 | Catalase–peroxidase | Xylella fastidiosa
YerspesCP | Q9X6B0 | Catalase–peroxidase | Yersinia pestis
ZeamaysAPX | Q41772 | Ascorbate peroxidase | Zea mays

Fig. 1. Multiple sequence alignment of 50 selected representatives of the superfamily of bacterial, fungal and plant heme peroxidases: region on the distal side of the prosthetic heme group. Abbreviations of enzyme sources are defined in Table 1. Numbers indicate the position of each presented segment within the corresponding sequence. Sequences are grouped together as discussed in the text (i.e. catalase–peroxidases divided into two separate domains, CCPs, and APXs). Sequence similarity is graded from light grey (low similarity) to black (highest similarity). Functionally important residues involved in the catalytic mechanism are marked with an asterisk. This figure was constructed using GENEDOC [22]. The complete alignment of these sequences is available upon request.

Fig. 2. Multiple sequence alignment of 50 selected representatives of the superfamily of bacterial, fungal and plant heme peroxidases: region on the proximal side of the prosthetic heme group. Abbreviations of enzyme sources are defined in Table 1. Numbers indicate the position of each presented segment within the corresponding sequence. In the case of catalase–peroxidases and some APXs a large insertion is present here. Sequences are grouped together as discussed in the text (i.e. catalase–peroxidases divided into two separate domains, CCPs, and APXs). Sequence similarity is graded from light grey (low similarity) to black (highest similarity). Functionally important residues involved in the catalytic mechanism are marked with an asterisk. This figure was constructed using GENEDOC [22].
Fig. 3. Multiple sequence alignment of 50 selected representatives of the superfamily of bacterial, fungal and plant heme peroxidases: conserved region around the essential aspartate on the proximal heme side. Abbreviations of enzyme sources are defined in Table 1. Numbers indicate the position of each presented segment within the corresponding sequence. Sequences are grouped together as discussed in the text (i.e. catalase–peroxidases divided into two separate domains, CCPs, and APXs). Sequence similarity is graded from light grey (low similarity) to black (highest similarity). Functionally important residues involved in the catalytic mechanism are marked with an asterisk. This figure was constructed using GENEDOC [22].

A third conserved sequence pattern is located nearer the C-termini of the investigated sequences. With a sequence similarity of 52%, it is above the average for all the sequences. Asp372 and Asp686 in HalomarCP (marked in Fig. 3 with an asterisk) are 100% conserved in all the peroxidases analysed. This invariant aspartate forms an important hydrogen bond with the proximal heme ligand (His259), facilitating the reactivity of the heme iron [8]. Site-directed mutagenesis was performed in this region in SyncyspCPn [38] and SacchceCCP [34], with a large effect on the reactivity of the engineered peroxidases. In contrast with other catalytically important regions, this essential aspartate remained conserved in all known C-domains of catalase–peroxidases, although its function in these domains is not apparent. In this third analysed region, residues in positions for which the function has not yet been determined are highly conserved (Fig. 3).
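Conserved motifs of the kind discussed in this section are commonly written in Prosite pattern syntax and can be searched for with regular expressions. The Python sketch below converts a small subset of that syntax (single residues, `x`, `[...]`, `{...}` and repeat counts) into a regular expression. The pattern `R-x(2)-W-H` used in the example is not the Prosite entry PS00436; it merely encodes the spacing implied by the residue numbers of the distal triad (Arg92, Trp95, His96), and the test sequence is made up. The function name is likewise an assumption.

```python
import re

# One Prosite element: a residue, x, [allowed] or {forbidden}, optionally with (n) or (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")

def prosite_to_regex(pattern):
    """Translate a simple Prosite-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except those listed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)

# Hypothetical pattern and sequence, for illustration only.
triad_regex = prosite_to_regex("R-x(2)-W-H")
hit = re.search(triad_regex, "MGSARLAWHSAG")
```

Here `hit.start()` gives the zero-based position of the arginine in the toy sequence; a real search would use the full Prosite pattern and the sequences listed in Table 1.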
Phylogenetic relationships
The c onsensus phylogenetic trees produced by NJ and Fitch
distance methods as well as the reconstructed t ree produced
by the
PUZZLE
method revealed four main clades of heme
peroxidases belonging t o classIofthesuperfamily of
bacterial, fungal,andplant heme peroxidases. In Fig. 5 the
simplified inferred
FITCH
tree, is presented, andin Fig. 6
the same tree in a simplified form with an outgroup is
shown. The NJ-reconstructed tree exhibited identical topo-
logy with slightly different branch lengths, and a very similar
maximum likelihood tree was revealed b y
PUZZLE
.Thelatter
method w i th the Ô+GÕ option f or rates of heterogeneity
Fig. 4. Structural comparison of four representatives ofclassIofthesuperfamilyofbacterial,fungal,andplant peroxidases. (A) N-terminal domain
and (B) C-terminal domain of catalase–peroxidase from Haloarcula marismortui (PDB code 1ITK); (C) S. cerevisiae CCP (PDB code 2CYP);
(D) cytosolic APX from Pisum sativum (PDB code 1APX). All figures are in solid ribbon presentation. In (A), (C) and (D) the prosthetic heme
group is presented in ball and stick presentation. Functionally important conserved r esidues discussed inthe te xt and marked also in Fig. 1 are
shown with their corresponding number inthe amino acid sequence. Those on the distal s ide ofthe prosthetic heme grou p are coloured yellow and
those o n the proxim al side blue. The large loop in catalase–peroxidases with an essential residue inthe entrance of a substrate chann el is coloured
greenin(A).
3304 M. Za
´
mocky´ (Eur. J. Biochem. 271) Ó FEBS 2004
produced the parameter a ¼ 1.96 estimated from t he actual
data set of 94 a nalysed peroxidase sequences. The bootstrap
support in all main nodes is strong; only in some minor
nodes is refining ofthe particular species within groups
moderate. Inthe case of
PUZZLE
, the likelihood mapping
analysis also revealed strong support of all main nodes.
Hence, the four distinct clades can be understood as four
diverse peroxidase families: ascomycetous CCPs; APXs;
C-domains of catalase–peroxidases; and N-domains of
catalase–peroxidases. This e volutionary branching a lso
Fig. 5. Unrooted ph ylogenetic tree of 60 peroxidase g enes. TheinferredtreeobtainedwiththeFitchmethod [24] is pres ented. This tree is essentially
identical w it h t he majority rule consensus t ree o b tained by the NJ method [23]. A very similar maximum likelihood tree was also obtained by
PUZZLE
[26]. Numbers represent the b ootstrap valu es on the branche s calculated fo r NJ/Fitch, respectively. The third value gives likelihood output from
PUZZLE
. The scale bar represents 10% ofthe estimated sequence divergence. Abbreviations ofthe species are identical with those used in Table 1. In
the case o f catalase–peroxidases, the N-terminal and C-terminal domains are analysed separately (giving rise to a total of 94 analysed sequences) due
to the evident tandem gene-duplication event discussed inthe text. Colour scheme for catalase–peroxidases: brown, Archaeons; cyan, Cyano-
bacteria; orange, Proteobacteria; magenta, Firmicut es; black, Actino bacteria; dark blue, Ascom ycota.
Ó FEBS 2004 Molecular evolution of heme peroxidases (Eur. J. Biochem. 271) 3305
matches the reaction specificity of the corresponding
enzymes, and, in the case of all known catalase–peroxidases,
the two domains are fused together in one KatG. It is
obvious that CCPs are closely related to APXs, and,
although the active centre of catalase–peroxidases located
exclusively in the N-domains of KatGs resembles the active
centres of APX and CCP (Fig. 4), the N-domains are
phylogenetically more closely related to catalytically inactive
C-domains of catalase–peroxidases.
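The bootstrap support quoted for the NJ and Fitch trees rests on resampling the columns of the multiple alignment with replacement and rebuilding a tree from each replicate. The sketch below illustrates only that resampling step. The toy alignment and the name `bootstrap_columns` are assumptions made for this example; they are not the sequences or the software used in the analysis.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate: alignment columns drawn with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have equal length")
    # The same column indices are applied to every sequence, so each
    # resampled column is an intact column of the original alignment.
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


# Hypothetical toy alignment (not real peroxidase sequences).
toy_alignment = {
    "seqA": "MKTAYHW-GL",
    "seqB": "MKSAYHWAGL",
    "seqC": "MRTAFHW-GI",
}

rng = random.Random(0)  # fixed seed so the run is repeatable
replicates = [bootstrap_columns(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "replicates of length", len(replicates[0]["seqA"]))
```

Each replicate would then be passed to the tree-building method, and the fraction of replicate trees containing a given grouping gives the support value for that node.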
The complete sequences are known for catalase–peroxidases
from various prokaryotes, both eubacteria and
archaea. The systematic analysis reveals that KatGs are
distributed unequally among closely related genomes.
Whereas in some complete genomes, no KatG is present
(as discussed below), some bacteria even contain two
different ones (e.g. Mycobacterium fortuitum). This unequal
distribution of KatGs can be attributed to a lateral gene
transfer [41] between otherwise phylogenetically unrelated
micro-organisms. In the case of KatGs, it was first
proposed to occur between archaea and eubacteria based
on the analysis of three archaeal and 16 bacterial KatGs
[42]. Later it was postulated that this phenomenon often
occurs in all lineages of hydroperoxidases capable of
catalytic reaction [43]. From the phylogenetic tree presented
here (with 34 KatGs divided into the separate domains), it
is obvious that archaeal and eubacterial KatGs are
phylogenetically more closely related than the rest of the
genomes. Moreover, several lateral gene transfer events are
discernible in the phylogenetic tree in both branches of
catalase–peroxidases (Fig. 5). Interestingly, the sequence of
ArchfulCP segregated very early on from the remaining
known KatGs. In this paper, I focus on the analysis of
lateral gene transfer between KatGs of Firmicutes, Cyanobacteria,
and Proteobacteria, which is also supported by
high bootstrap values for both domains. Their positions on
the branches indicate that KatGs from pathogenic proteobacteria
are descendants of genes from Firmicutes and
Cyanobacteria. There is also an obvious discrepancy
between the rather high GC content of proteobacterial
KatGs and the GC content of the whole organism
(Table 2), supporting the hypothesis on the direction of
the lateral gene transfer. Pathogenic and soil proteobacteria
could profit from such a mode of lateral gene transfer by
gaining new genes that help them resist more efficiently the harmful
effects of oxidative stress often caused by the host immune
response or the environment. However, no KatG was
found in the completed genome of Bacillus subtilis,
indicating gene loss in some Firmicutes.
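The GC-content argument compares the percentage of G and C bases in a gene with the same percentage for the whole genome. The paper took its values from the codon usage database rather than computing them, so the following is only a minimal sketch of the quantity being compared. The function name `gc_content` and the example string are assumptions made for illustration.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    # Ambiguous symbols such as N are ignored in both numerator and denominator.
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


example = "ATGGCGCCNTTAGC"  # made-up fragment, not a real katG sequence
print(round(gc_content(example), 2))
```

Applying the same function to a gene and to the complete genome of its host gives the two columns that are compared when a transferred gene is suspected.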
Fig. 6. Simplified presentation of the inferred tree with the use of an
outgroup (manganese peroxidase from Phanerochaete chrysosporium
belonging to class II of this superfamily) presented to demonstrate the
order of evolutionary events in class I of the peroxidase superfamily.
Table 2. Analysis of GC content in KatGs thought to be involved in lateral gene transfer between organisms. Values were obtained from the codon
usage database at kazusa.or.jp. Abbreviations of enzymes are described in Table 1.
Type of peroxidase    GC content of KatG (%)    GC content of whole organism (%)    Type of organism
BacihalCP             45.06                     44.32                               Firmicutes
BacisteCP             51.85                     49.67                               Firmicutes
DesulfiCP             57.09                     50.07                               Firmicutes
GeobactCP             67.40                     61.61                               δ-Proteobacterium
GloeobaCP             65.58                     62.86                               Cyanobacterium
LegipneCP             46.44                     39.98                               γ-Proteobacterium
SyncysCP              51.74                     48.25                               Cyanobacterium
SynecspCP             54.14                     55.86                               Cyanobacterium
VibrchoCP             51.18                     48.09                               γ-Proteobacterium
AspefumCP             63.64                     54.17                               Ascomycete
AspenidCP             56.49                     53.01                               Ascomycete
BlumgraCP             46.30                     44.55                               Ascomycete
NcrassaCP             58.62                     56.13                               Ascomycete
PenimarCP             55.81                     51.43                               Ascomycete
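The discrepancy discussed in the text can be read off Table 2 by subtracting the whole-organism GC content from that of the KatG gene. The sketch below does this for the values copied from the table. The names `TABLE_2` and `gc_excess` are illustrative, and no threshold for a "significant" difference is implied, since the text does not state one.

```python
# (KatG GC %, whole-organism GC %, type of organism), copied from Table 2.
TABLE_2 = {
    "BacihalCP": (45.06, 44.32, "Firmicutes"),
    "BacisteCP": (51.85, 49.67, "Firmicutes"),
    "DesulfiCP": (57.09, 50.07, "Firmicutes"),
    "GeobactCP": (67.40, 61.61, "delta-Proteobacterium"),
    "GloeobaCP": (65.58, 62.86, "Cyanobacterium"),
    "LegipneCP": (46.44, 39.98, "gamma-Proteobacterium"),
    "SyncysCP": (51.74, 48.25, "Cyanobacterium"),
    "SynecspCP": (54.14, 55.86, "Cyanobacterium"),
    "VibrchoCP": (51.18, 48.09, "gamma-Proteobacterium"),
    "AspefumCP": (63.64, 54.17, "Ascomycete"),
    "AspenidCP": (56.49, 53.01, "Ascomycete"),
    "BlumgraCP": (46.30, 44.55, "Ascomycete"),
    "NcrassaCP": (58.62, 56.13, "Ascomycete"),
    "PenimarCP": (55.81, 51.43, "Ascomycete"),
}


def gc_excess(table):
    """Map each enzyme to KatG GC minus whole-organism GC, in percentage points."""
    return {
        name: round(gene - genome, 2)
        for name, (gene, genome, _organism) in table.items()
    }


excess = gc_excess(TABLE_2)
# List the enzymes from the largest to the smallest difference.
for name, diff in sorted(excess.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:10s} {diff:+6.2f}  {TABLE_2[name][2]}")
```

In these rows the gene is more GC-rich than its genome for every entry except SynecspCP.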
[...] ... repeatedly in the superfamily of bacterial, fungal, and plant heme peroxidases. In class I, this event apparently occurred in each important period of evolution, firstly in the ancient evolutionary line allowing differentiation into (a) the progenitor for APX and CCP families and (b) ancestral catalase–peroxidase. Furthermore, it acted inside both the APX and catalase–peroxidase families. The subsequent event of a tandem gene duplication in the already formed family of catalase–peroxidases was unique, according to the fact that the duplicons remained fused together in one KatG. Functional adaptation resulted in N-domains having the ability to oxidize and reduce hydrogen peroxide (i.e. the reaction of catalase) and to remain peroxidatically active (oxidation of substrates with hydrogen peroxide). In contrast, ...
... lines of class I of this superfamily occur, i.e. the CCP line and the KatG fusion line, as is the case for Neurospora crassa [47].
Evolution by the birth-and-death process in class I of the superfamily of bacterial, fungal, and plant peroxidases
Gene duplication is one of the most common evolutionary processes by which new genes arise. From the results presented, it is obvious that gene duplication must ...
... contrast, C-domains lost their enzymatic activity and probably remained widespread only to stabilize the whole oligomeric protein. In the course of evolution via (repeated) gene duplications, similar nonfunctional copies in the branch of APX and CCP must also have occurred. It is difficult to follow the presence of such inactive ‘second copies’ of APX and CCP genes in the genomes, as mentioned previously [15] ...
Hence, it is reasonable to suppose a common origin of extant APXs in the ancient line of protists. The occurrence and diversity of APX genes in various lineages that descended from the ancestral protist correlate with the various photosynthetic abilities of these organisms, producing various levels of reactive oxygen species. The abundant present day APX genes are spread among the genomes of photosynthetically ...
... catalase–peroxidases: the catalytically inactive C-domains with as yet unknown or only hypothetical noncatalytic function). So we can conclude that the evolution of this class probably occurred through the death-and-birth process. Whether this is also true for class II and class III remains to be proved. It is clear that this mode of evolution in heme proteins can create novel reaction specificities, thus ...
... presented in this tree, but in several plant species multiple forms of genes coding for cytosolic and chloroplast APXs exist as a rule, indicating a high level of sequence identity [36]. The second subbranch is particularly interesting: the APX gene of a red alga (Galdieria partita) has a common origin with APXs from green algae (Chlamydomonas sp. and Chlamydomonas reinhardtii) and all thylakoid APXs from higher ...
... previously [15]. However, in staying fused with the N-domains, the C-domains of KatGs represent a unique opportunity to address the mode of evolution of class I of this peroxidase superfamily. According to currently established phylogenetic theories [17–20], multigene families can exist in genomes as a consequence of either concerted evolution or evolution by the birth-and-death process. From the results presented ...
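In the birth-and-death model, new gene copies arise by duplication, and individual copies are later either retained, deleted, or left in the genome as nonfunctional copies. The toy simulation below makes that bookkeeping concrete for a single family. It is only an illustrative sketch: the per-generation probabilities, the function name `simulate_birth_death`, and the starting state of one functional copy are assumptions made for this example and are not estimates from the paper.

```python
import random


def simulate_birth_death(generations, p_dup, p_loss, p_inactivate, rng):
    """Toy birth-and-death process for one multigene family.

    In each generation every functional copy undergoes at most one event:
    duplication, deletion, or inactivation (it then stays in the genome as
    a nonfunctional copy). Returns (functional, nonfunctional) copy counts.
    """
    functional, nonfunctional = 1, 0
    for _ in range(generations):
        born = lost = silenced = 0
        for _ in range(functional):
            r = rng.random()
            if r < p_dup:
                born += 1
            elif r < p_dup + p_loss:
                lost += 1
            elif r < p_dup + p_loss + p_inactivate:
                silenced += 1
        functional += born - lost - silenced
        nonfunctional += silenced
        if functional == 0:
            break  # the family has no functional member left
    return functional, nonfunctional


rng = random.Random(1)  # fixed seed so the run is repeatable
families = [simulate_birth_death(50, 0.05, 0.02, 0.02, rng) for _ in range(10)]
print(families)
```

Independent runs end with different numbers of functional and nonfunctional copies, which is the kind of unequal distribution among lineages that this mode of evolution produces.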
... possible difference at the protein level. In fact, this peroxidase, which is the closest phylogenetic neighbor to fungal CCPs, has a unique cellular localization in the endoplasmic reticulum. It also exhibits a unique metabolic role in the trypanothione system [44]. Evidence from an evolutionary tree based on 18S rRNA sequences places the origin of the kingdom Plantae among the phyla of the ancestral kingdom ...
... class I of the superfamily of bacterial, fungal, and plant peroxidases, the concerted mode of evolution is unlikely to have occurred. This mode supposes that polymorphism (genetic diversity) was generated by the introduction of new variants from different loci through interlocus recombination or gene conversion. Multiple sequence alignment of the whole of class I ...