Medium-chain dehydrogenases/reductases (MDR)
Family characterizations including genome comparisons and active site modelling
Erik Nordling(1,2), Hans Jörnvall(1) and Bengt Persson(1,2)
(1) Department of Medical Biochemistry and Biophysics and (2) Stockholm Bioinformatics Centre,
Karolinska Institutet, Stockholm, Sweden
Completed eukaryotic genomes were screened for medium-chain dehydrogenases/reductases (MDR).
In the human genome, 23 MDR forms were found, a number that will probably increase because the
genome is not yet fully interpreted. Partial sequences already indicate that at least three
further members exist. Within the MDR superfamily, at least eight families were distinguished.
Three families are formed by dimeric alcohol dehydrogenases (ADH; originally detected in
animals/plants), cinnamyl alcohol dehydrogenases (originally detected in plants) and tetrameric
alcohol dehydrogenases (originally detected in yeast). Three further families are centred
around forms initially detected as mitochondrial respiratory function proteins, acyl-CoA
reductases of fatty acid synthases, and leukotriene B4 dehydrogenases. The two remaining
families, with polyol dehydrogenases (originally detected as sorbitol dehydrogenase) and
quinone reductases (originally detected as ζ-crystallin), are also distinct but with variable
sequences. The most abundant families in the human genome are the dimeric ADH forms and the
quinone oxidoreductases. The eukaryotic patterns are different from those of Escherichia coli.
The different families were further evaluated by molecular modelling of their active sites as
to geometry, hydrophobicity and volume of substrate-binding pockets. Finally, sequence patterns
were derived that are diagnostic for the different families and can be used in genome
annotations.
Keywords: medium-chain dehydrogenases/reductases;
genome comparisons; polyol dehydrogenase; cinnamyl
alcohol dehydrogenase; quinone oxidoreductase.
Medium-chain dehydrogenases/reductases (MDRs) constitute a large enzyme superfamily with close
to 1000 members, including species variants [1,2]. The MDR enzymes represent many different
enzyme activities, of which alcohol dehydrogenases (ADHs) are the most closely investigated.
They participate in the oxidation of alcohols, detoxification of aldehydes/alcohols and the
metabolism of bile acids [3,4]. Another MDR branch has polyol dehydrogenase (PDH) activities
originally detected for sorbitol dehydrogenase (SDH) [5]. All the corresponding substrates are
widespread in nature because of their derivation from glucose, fructose, and general
metabolism. In some organisms these substrates, such as polyols, can be accumulated at high
concentrations, constituting a protection against environmental stress, such as osmotic shock
[6] and reduced or elevated temperature [7,8]. Polyol accumulation can, however, be harmful
[9], suggesting a further protective role for these enzymes. An MDR family recognized earlier
is cinnamyl alcohol dehydrogenase (CAD). This enzyme type in plants catalyses the last step in
the biosynthesis of the monomeric precursors of lignin, the main constituent of plant cell
walls [10]. This enzyme family has been extensively characterized through CAD from plant
sources [11–13], because of its importance for the pulp industry [14]. Down-regulation or
inhibition of CAD will reduce wood lignin content and yield a pulp of high quality [15]. A
further MDR family long since recognized is the quinone oxidoreductase (QOR) type, of which one
mammalian form functions as a lens protein (ζ-crystallin) [16], mutational loss of which may
result in cataract formation at birth. This suggests that ζ-crystallin has a role in the
protection of the lens against oxidative damage [17]. In common, therefore, as demonstrated by
the examples above, all MDR families appear to have some members with protective functions in
different organismal defences [2]. All MDR enzymes utilize NAD(H) or NADP(H) as cofactor, and
several but not all of the members have one zinc ion with catalytic function at the active
site. Some, in particular the classical dimeric ADHs, also have a second zinc ion at a
structural site, stabilizing an external loop present in those forms [18].
The availability of completed genomes provides an
opportunity to evaluate all these members of the MDR
superfamily. We have therefore studied the MDR enzymes
corresponding to the products from available eukaryotic
genomes (and for comparison, the Escherichia coli genome
is also included, but not further analysed because of the
distant relationships). The total number of MDR forms in
each species was evaluated, orthologies were assigned and
evolutionary relationships were characterized. In addition,
separate sequence motifs were defined and the active site
variability was investigated.
Correspondence to B. Persson, Department of Medical Biochemistry
and Biophysics, Karolinska Institutet, S-171 77 Stockholm,
Sweden. Fax: + 46 8 337 462, Tel.: + 46 8 728 7730,
E-mail: bengt.persson@mbb.ki.se
Abbreviations: MDR, medium-chain dehydrogenases/reductases; ADH, alcohol dehydrogenase;
CAD, cinnamyl alcohol dehydrogenase; YADH, yeast alcohol dehydrogenase; MRF, mitochondrial
respiratory function proteins; PDH, polyol dehydrogenases; QOR, quinone oxidoreductases;
ACR, acyl-CoA reductase; LTD, leukotriene B4 dehydrogenase.
(Received 12 April 2002, revised 24 June 2002, accepted 15 July 2002)
Eur. J. Biochem. 269, 4267–4276 (2002) © FEBS 2002 doi:10.1046/j.1432-1033.2002.03114.x
MATERIALS AND METHODS
Protein sequences translated from the complete genomes of Homo sapiens [19,20], Drosophila
melanogaster [21], Caenorhabditis elegans [22], Arabidopsis thaliana [23], Saccharomyces
cerevisiae [24] and Escherichia coli [25] were searched for MDR members using FASTA [26] with
known MDR proteins [1,2] as query sequences. Hits with an expect value (E value) below 10⁻¹⁰
were extracted and screened for MDR sequence patterns in order to find true members. These
sequences were subsequently subjected to another round of FASTA searches against the protein
sequences from each genome to find further homologues. Multiple sequence alignments were
calculated using CLUSTALW [27]. Evolutionary trees were calculated from the alignments using
the distance-based techniques, neighbour joining and UPGMA, and a heuristic search to find the
most parsimonious tree. The neighbour joining tree was created using CLUSTALW and the other
trees were created using PAUP [28]. The certainty of each branch point was assessed with
bootstrap tests of the different trees. When all three methods agreed on a branch point with a
bootstrap value above 90%, the corresponding branch was considered significant, and is marked
with an asterisk in Fig. 1. Each protein that was not assigned an unambiguous placement during
the bootstrap tests was manually investigated for the appearance of family-specific sequence
patterns to aid in the classification process. The resulting evolutionary trees were displayed
with TREEVIEW [29]. All sequences were checked against the SWISSPROT database [30] for
functional annotations.
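The first screening step, keeping only hits with an E value below 10⁻¹⁰, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the hit tuples, the identifiers and the function name `filter_hits` are hypothetical, and parsing of real FASTA output is left out.

```python
from typing import Iterable, List, Tuple

# Threshold from the text: hits must have an expect value below 1e-10.
E_VALUE_CUTOFF = 1e-10

# Hypothetical hit record: (query id, subject id, E value).
Hit = Tuple[str, str, float]


def filter_hits(hits: Iterable[Hit], cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Return the hits whose E value is strictly below the cutoff, best first."""
    kept = [hit for hit in hits if hit[2] < cutoff]
    return sorted(kept, key=lambda hit: hit[2])


# Invented example hits, for illustration only.
example_hits = [
    ("query_adh", "orf_0001", 3e-45),
    ("query_adh", "orf_0002", 2e-7),
    ("query_qor", "orf_0003", 1e-10),   # equal to the cutoff, so not kept
    ("query_qor", "orf_0004", 5e-12),
]

candidates = filter_hits(example_hits)
for query, subject, e_value in candidates:
    print(f"{subject}\t{query}\t{e_value:.1e}")
```

In the procedure described above, the surviving candidates would then be screened for MDR sequence patterns and used as queries for a second search round.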
The active sites of the MDR proteins were investigated by homology modelling using ICM (Molsoft
LLC, La Jolla, CA, USA) [31]. For the CAD and PDH families, the ketose reductase from Bemisia
argentifolii (PDB accession no. 1e3j) [32] is the closest homologue with known
three-dimensional structure. For the families of QOR, mitochondrial respiratory function (MRF)
proteins, leukotriene B4 dehydrogenases (LTDs) and acyl-CoA reductases (ACRs), the
three-dimensional structure of E. coli quinone oxidoreductase (PDB accession no. 1qor) [33] was
used as template. Active site residues were assigned according to the crystal structure of
horse liver alcohol dehydrogenase with bound substrate (PDB accession no. 3bto) [34]. For the
homology modelling, the active site residues were replaced according to the multiple sequence
alignment of each MDR family (cf. Fig. 1). The replaced residues were positioned initially in
the same rotamers as the original residues. Each replacement was followed by a conjugate
gradient minimization of 100 function evaluations [35]. The last step in the replacement
procedure was a conjugate gradient minimization of 1000 function evaluations to relieve any
remaining unfavourable side chain interactions. The volume of the active site was measured as
the space accessible to a carbon probe in the interior of the protein. The hydrophobicity index
was calculated by averaging the hydrophobicity values [36] of the active site residues.
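The hydrophobicity index described above is a plain average over the active site residues. The sketch below assumes the Kyte-Doolittle hydropathy scale; the text identifies the scale only as reference [36], so that choice is an assumption, as are the function name and the decision to skip alignment gaps. With this scale, the sixteen active site residues listed for human sorbitol dehydrogenase in Table 2 (CSYIGFHEFTEVLFRY) average to 0.26, the value tabulated there, which is consistent with the assumption but does not prove it.

```python
# Kyte-Doolittle hydropathy values (assumed scale; see the note above).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_index(residues: str) -> float:
    """Average hydropathy of the given residues; gap characters are skipped."""
    values = [KYTE_DOOLITTLE[aa] for aa in residues.upper() if aa != "-"]
    if not values:
        raise ValueError("no residues to average")
    return sum(values) / len(values)


# Active site residues of DHSO_HUMAN as listed in Table 2.
index = hydrophobicity_index("CSYIGFHEFTEVLFRY")
print(f"{index:.2f}")
```

How gaps are treated changes the result for entries with missing positions, so the skipping rule used here should be read as one possible convention.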
Fig. 1. Evolutionary tree of the MDR enzymes from the six genomes investigated. Confidence levels of over 90% from the bootstrap test are marked
with an asterisk at the corresponding branch point. The tree branches early, which indicates a divergent superfamily of ancient origin where a
limited number of ancestral genes have diverged during the evolution of the separate species. The eight families are enclosed by thick lines.
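The rule behind the asterisks can be written down compactly: a branch point counts as significant only if all three tree-building methods recover it with bootstrap support above 90%. The sketch below is illustrative; the dictionary layout, the branch labels and the bootstrap numbers are invented, and extracting such values from CLUSTALW or PAUP output is not shown.

```python
from typing import Dict, List

BOOTSTRAP_THRESHOLD = 90.0  # per cent, as stated in the text
METHODS = ("neighbour_joining", "upgma", "parsimony")


def significant_branches(
    support: Dict[str, Dict[str, float]],
    threshold: float = BOOTSTRAP_THRESHOLD,
) -> List[str]:
    """Return branches supported above the threshold by every method."""
    significant = []
    for branch, by_method in support.items():
        # A method that did not recover the branch counts as zero support.
        if all(by_method.get(method, 0.0) > threshold for method in METHODS):
            significant.append(branch)
    return significant


# Invented bootstrap values, for illustration only.
example_support = {
    "branch_A": {"neighbour_joining": 99.0, "upgma": 97.0, "parsimony": 95.0},
    "branch_B": {"neighbour_joining": 98.0, "upgma": 85.0, "parsimony": 96.0},
    "branch_C": {"neighbour_joining": 93.0, "upgma": 94.0},
}

marked = significant_branches(example_support)
print(marked)
```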
RESULTS AND DISCUSSION
MDR forms in the completed genomes
We find 23 MDR forms (Table 1) upon screening the human genome. In addition, we find three
incomplete sequences, which it is still too early to evaluate finally and which are therefore
not included in this study. Thus, the total number of human MDR forms can be expected to
increase slightly. The A. thaliana genome is the one with the greatest number of MDR members
(38), which is consistent with a high gene duplication tendency in this organism [23].
Surprisingly, the genome of S. cerevisiae (like that of E. coli) also has many MDR members,
especially in relation to its size compared with the larger genomes of C. elegans and
D. melanogaster, with only 13 and 10 members, respectively (Table 1). Obviously, the MDR
superfamily exhibits different levels of variability and represents a number of different
ancestral gene duplications followed by repeated acquisitions of new functions,
'enzymogenesis' [37].
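The comparison of MDR counts with genome size can be made explicit by normalizing the totals in Table 1 by the number of open reading frames (ORFs) given there. The snippet below is only a worked calculation on those published numbers; the variable and function names are illustrative.

```python
# (number of ORFs, number of MDR forms) per genome, taken from Table 1.
TABLE1 = {
    "H. sapiens": (38922, 23),
    "C. elegans": (19000, 13),
    "D. melanogaster": (13500, 10),
    "A. thaliana": (25464, 38),
    "S. cerevisiae": (6000, 15),
    "E. coli": (4289, 17),
}


def mdr_per_thousand_orfs(orfs: int, mdr_forms: int) -> float:
    """MDR forms per 1000 open reading frames."""
    return 1000.0 * mdr_forms / orfs


density = {genome: mdr_per_thousand_orfs(*counts) for genome, counts in TABLE1.items()}
ranked = sorted(density, key=density.get, reverse=True)

for genome in ranked:
    print(f"{genome:16s} {density[genome]:.2f} MDR forms per 1000 ORFs")
```

On these numbers E. coli and S. cerevisiae carry the most MDR forms per ORF, and the human genome the fewest, in line with the observation above.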
From the consensus evolutionary tree (Fig. 1), constructed from the aligned MDR sequences of
six genomes (human, D. melanogaster, C. elegans, A. thaliana, S. cerevisiae and E. coli), we
can first divide the MDR superfamily into families. Four families are clearly separated from
the rest of the tree. These are the dimeric ADHs, CAD, yeast alcohol dehydrogenases (YADH) and
mitochondrial respiratory function proteins (MRF). Notably, CAD is not only found in plants,
but also in the S. cerevisiae genome (as in the E. coli genome), indicating that CAD has a
wider function than just lignin biosynthesis, which is consistent with the annotation in
SWISSPROT recently changed to mannitol dehydrogenase. Two families contain sequences that are
distantly related. These are the PDHs and QORs. Finally, there are the two families of ACRs and
LTDs, and a few forms that do not belong to any of the families mentioned.
Half of the families are zinc-containing MDRs and half
are non-zinc-containing MDRs. A division can be drawn
(dashed line in Fig. 1), with the zinc-containing MDRs in
one of the halves (bottom Fig. 1, with families CAD, PDH,
ADH, YADH) and the non-zinc-containing MDRs in the
other (top Fig. 1 with families QOR, MRF, LTD, ACR).
The number of enzymes in each genome belonging to the
different families is listed in Table 1. ADH is the family
branch most frequently found in the human genome, while
ADH and CAD are the most frequent in the A. thaliana
genome. The YADH-type of enzyme is present not only in
yeast but also in C. elegans, A. thaliana (and E. coli). These
latter organisms therefore have alcohol dehydrogenases of
both the dimeric and tetrameric ADH families. In Table 1,
the few forms that do not fit into the eight families are
grouped in the column 'Others'.
Comparing our results with those of other databases, e.g. PFAM [38] and COG [39], we find that
several family members are also represented in the corresponding entries of those databases,
supporting our results. However, in contrast to our work, PFAM does not subclassify the MDR
superfamily. In the COG database, the human and A. thaliana sequences are presently lacking.
Furthermore, COG groups the YADH and CAD families together in COG1064, and the MRF and QOR
families together in COG0604. COG and PFAM also include six yeast proteins distantly related to
MDR forms but with expect values that fall far short of our threshold of 10⁻¹⁰, while we
include some members with better expect values which are more closely related to the MDR family
but not listed in the other classifications. Two of the distantly related yeast proteins were
included in a previous, different genome comparison [2], giving 17 S. cerevisiae MDR forms
instead of the present 15. However, for E. coli, the number of MDR forms (17) is unchanged.
Of the MDR families now observed, the dimeric and
tetrameric ADH families have been recently analysed
elsewhere [40] while six families are further considered
below: PDH, CAD, QOR, MRF, LTD and ACR. Family
distinguishing sequence patterns are also recognized.
The polyol dehydrogenase (PDH) family
The PDH family contains SDH, ketose reductase and threonine dehydrogenase. SDH is present in
all genomes of this study, and a corresponding gene with retained function is traceable from
prokaryotes to man. This conservation emphasizes that SDH has an important function common to
a wide range of life forms. It further shows species-specific duplications, in a manner well
known also in the classical ADH family [41]. The separate SDH duplications appear to have
occurred independently in several lines, as reflected by the human, C. elegans,
D. melanogaster and S. cerevisiae genomes. These isoforms show 81.2–99.7% residue identity in
pairwise comparisons. In addition, S. cerevisiae has one further SDH form that is only 53%
identical to the others, indicating the presence of widely separated duplications in the SDH
group.
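Residue identity in a pairwise comparison is the fraction of aligned positions that carry the same residue. A minimal sketch follows, assuming two sequences that are already aligned to equal length and ignoring columns in which either sequence has a gap; conventions differ on that point, so it should be read as an assumption. The two toy sequences are invented and far shorter than real SDH sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Invented, pre-aligned toy sequences.
seq_a = "MKAVV-LHGA"
seq_b = "MKALVELHGS"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```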
Table 1. Number of sequences within each MDR family, discernible from the genomes investigated.
The six enzymes that do not fit into the eight families are grouped in the column 'Others'.
Genome                        | ADH | CAD | YADH | PDH | QOR | MRF | LTD | ACR | Others | Sum
H. sapiens (38922 ORFs)       |   9 |   – |    – |   2 |   7 |   1 |   2 |   1 |      1 |  23
C. elegans (19000 ORFs)       |   2 |   – |    3 |   2 |   1 |   2 |   1 |   1 |      1 |  13
D. melanogaster (13500 ORFs)  |   1 |   – |    – |   3 |   1 |   1 |   – |   3 |      1 |  10
A. thaliana (25464 ORFs)      |   9 |   8 |    1 |   1 |   7 |   1 |  11 |   – |      – |  38
S. cerevisiae (6000 ORFs)     |   1 |   2 |    4 |   5 |   1 |   1 |   1 |   – |      – |  15
E. coli (4289 ORFs)           |   1 |   2 |    1 |   9 |   1 |   – |   1 |   – |      2 |  17
Total                         |  23 |  12 |    8 |  22 |  18 |   6 |  16 |   5 |      6 | 116

The active site volumes of the PDHs range between 77 and 257 Å³. SDH typically has large
volumes of between 210 and 257 Å³. Most of the active site residues are conserved through all
PDH forms. Within the SDHs (entries 4–14 of Table 2), 10 out of 16 amino acid residues are
strictly conserved, and the remaining residues are exchanged only to a conservative extent in
most cases. A few further enzymes in Table 2 are annotated as SDH, but their E values are much
less significant than those of the verified SDHs. In addition, the residues, hydrophobicity and
geometry at the active sites are different from those of the confirmed SDHs, indicating that
they are likely to represent further types of polyol activities.
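The count of strictly conserved active site residues can be reproduced directly from the active site strings in Table 2. The sketch below takes entries 4–14 of the PDH block of that table as transcribed in this document (so it inherits any transcription error) and lists the columns in which every sequence carries the same residue. The function name is illustrative, and the positions it returns are indices into the 16-residue strings, not sequence numbers.

```python
from typing import List, Sequence

# Active site strings of PDH entries 4-14 in Table 2 (the SDHs).
SDH_ACTIVE_SITES = [
    "CSGFIKHEFTEVTFRY",  # EC_1742.b1774
    "CSYCAFHEFTEVMFRY",  # AT_MSG15-5
    "CSYIGFHEFTEVLFRY",  # CE_R04B5.5
    "CSFIGFHEFTEVLFRS",  # CE_R04B5.6
    "CSYIGFHEFTEVMFRY",  # DM_7298873
    "CSYIGFHEFTEVMFRY",  # DM_7299382
    "CSYIGFHEFTEVLFRY",  # DHSO_HUMAN
    "CSYIGFHEFTEVLFRY",  # Q9UMD6
    "CSYIAYHEFTEVMFRY",  # SC_YLR070C
    "CSYIGYHEFTEVMFRY",  # SC_YJR159W
    "CSYIGYHEFTEVMFRY",  # SC_YDL246C
]


def strictly_conserved_positions(sequences: Sequence[str]) -> List[int]:
    """1-based column indices where all sequences share the same residue."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    conserved = []
    for index, column in enumerate(zip(*sequences), start=1):
        if len(set(column)) == 1 and column[0] != "-":
            conserved.append(index)
    return conserved


positions = strictly_conserved_positions(SDH_ACTIVE_SITES)
print(len(positions), "of", len(SDH_ACTIVE_SITES[0]), "positions strictly conserved")
```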
The zinc-liganding residues, Cys45, His70 and Glu156 (residue numbers according to human SDH)
[42], are conserved in most PDH forms. In six PDHs, Glu156 is exchanged for Asp, Gln or Ser.
The Asp might act as a zinc ligand [43], but the Gln or Ser are not likely to contribute zinc
ligands [43]. The exact nature of one ligand has been much investigated previously [44]. In
DM_7300579, two of the zinc ligands, Glu156 and Cys45, are missing and we postulate that this
protein does not bind zinc. The coenzyme-binding motif in this protein deviates further, having
two of the three 'coenzyme-typical' Gly residues [45] replaced by Cys and Ala, respectively.
This is expected to give a change in the fold within this region, and this protein may
therefore exhibit loss of enzymatic activity, or represent another activity [37].
The cinnamyl alcohol dehydrogenase (CAD) family
This family contains CAD and mannitol dehydrogenases
(MTD), represented in A. thaliana by eight forms. However,
we also find two of this family’s forms in S. cerevisiae (and
in E. coli). This family has 43 residues strictly conserved, of
which close to half (19) are glycines, typical of unaltered
folds [46]. In addition, seven cysteine residues are strictly
conserved, of which six correspond to the zinc-liganding
positions of ADH, suggesting the presence of two zinc ions
in the CAD family.
In plants, the ancestral gene for the CAD family has
been duplicated after the separation from fungi, giving
rise to the CAD and MTD lines. The substrate specificity,
however, has been retained, as both these enzymes
act on primary alcohols/aldehydes. CAD is part of the
shikimic acid pathway, which leads to synthesis of nearly
all plant aromatic compounds. This pathway is unique
for plants, bacteria and fungi [47], consistent with the fact
that no CAD homologue could be found in the other
organisms.
The hydrophobicity index is typically between -0.3 and +0.3 for most CAD/MTD forms (Table 2).
The molecular modelling of the enzymes within the CAD family indicates that some enzymes have a
deep (> 12 Å) and narrow (about 8 Å) active site, while others have a more shallow (about 9 Å)
and somewhat wider (about 10 Å) active site (Table 2).
Apart from the strictly conserved Cys48, His70, Glu71 and Cys164 (residue numbers according to
CAD1_ARATH), the conserved active site residues of the CAD family are one Glu and four
hydrogen-bonding residues (typically Ser/Thr/Gln).
Six A. thaliana CAD forms cluster together (59–98%
residue identity), while two A. thaliana CAD enzymes
(MLD14.17 and F28P22.13) form two separate lines.
The quinone oxidoreductase (QOR) family
The family containing QORs is variable but has distinct
borders (Fig. 1). One enzymatic activity described for these
members is QOR [48], but additional activities are likely to
exist in the QOR family. In plants, QOR members give
protection against diamide compounds, which may be
metabolites of alkylating diazoate-derivatives [49].
Several proteins from the QOR family are found in the human genome only (Fig. 1), showing that
this family has given rise to novel functions in mammals. These enzymes may therefore be highly
important for mammalian metabolic conversions. As some of these enzymes are homologous to the
synaptic vesicle protein VAT-1 from the ray Torpedo californica, the group might be involved in
neuronal functions. This would be consistent with the increased number of QOR forms in mammals.
The human VAT-1 homologue displays the largest active site volume (289 Å³) of the QOR subgroup.
The VAT-1 related proteins have hydrophobic substrate pockets with hydrophobicity indices up to
1.47 (Table 2).
At the active sites of the proteins of the QOR family, three residues are conserved in close to
all forms: Asn41, Asp/Glu44 and Thr127. The QORs and human ζ-crystallin contain Tyr46 and
Tyr52. The human QOR has orthologues in all species investigated except D. melanogaster
(Table 3). The absence of a QOR member in D. melanogaster might indicate that another enzyme
has evolved for this enzyme function in the fruit-fly, as is the case for ethanol dehydrogenase
activity, which is often supplied by MDR enzymes, but in the fruit-fly is supplied by a
short-chain dehydrogenase [5].
The mitochondrial respiratory function proteins (MRF) family
In yeast, it has been shown that SC_YBR026C is essential for mitochondrial respiratory function
(MRF) [50]. This protein has clearly discernible homologues in all investigated eukaryotic
species (Table 3), forming a family distinguishable from the other non-zinc-containing
oxidoreductases (Table 2). The human orthologue (HS_ENSP234985) may be similarly important for
mitochondrial function. The active site volumes are 169–243 Å³, indicating large substrates
(Table 2). The substrate pocket is polar, with hydrophobicity indices as low as -1.48, in
contrast to that of most of the other investigated proteins. The active sites of these MRF
proteins have seven out of 17 residues strictly conserved. All but two of these conserved
residues are polar, contributing to an active site concluded to have many hydrogen bonds to the
substrate(s).
The leukotriene B
4
dehydrogenases (LTD) family
LTDs form a subgroup that have members from all
genomes except that of D. melanogaster. In the human
genome, we find two forms (LTB4_HUMAN and
hCP39255), while in C. elegans and S. cerevisiae,thereis
only one form (as in E. coli). All these proteins form an
orthologue cluster with reciprocal relationships better than
10
)15
(Table 3). In addition, we find 11 A. thaliana members
of this type. As plants have systems of host-defence
4270 E. Nordling et al. (Eur. J. Biochem. 269) Ó FEBS 2002
Table 2. Members of the MDR superfamily. Activesite residues correspond to the noncontinuous sequence. Annotations within parentheses are less
certain due to a log E value above )20.
Protein Annotation
log
E value Activesite residues
Hydro-
phobicity
index
Depth
(A
˚
)
Width
(A
˚
)
Volume
(A
˚
3
)
PDH
a
EC_4248.yjjN Sorbitol dehydrogenase )28 CTANQ-HEVVEILRNA )0.19 13.2 7.0 212
EC_4158.yjgV
L
-idonate 5-dehydrogenase )136 CSYVGFHEFSEVMFRF 0.37 16.7 7.2 173
DM_7300579 Sorbitol dehydrogenase )26
SSVNR-HDLNQLCFRS )0.72 16.2 5.9 144
EC_1742.b1774 Sorbitol dehydrogenase )45
CSGFIKHEFTEVTFRY )0.18 15.8 9.1 257
AT_MSG15–5 Sorbitol dehydrogenase )60
CSYCAFHEFTEVMFRY 0.16 16.4 7.2 211
CE_R04B5.5 Sorbitol dehydrogenase )71
CSYIGFHEFTEVLFRY 0.26 16.8 7.8 226
CE_R04B5.6 Sorbitol dehydrogenase )68 CSFIGFHEFTEVLFRS 0.55 15.8 7.6 244
DM_7298873 Sorbitol dehydrogenase )79
CSYIGFHEFTEVMFRY 0.14 16.9 8.9 210
DM_7299382 Sorbitol dehydrogenase )77
CSYIGFHEFTEVMFRY 0.14 15.9 8.4 210
DHSO_HUMAN Sorbitol dehydrogenase )141
CSYIGFHEFTEVLFRY 0.26 14.6 7.8 226
Q9UMD6 Sorbitol dehydrogenase )140
CSYIGFHEFTEVLFRY 0.26 14.6 7.8 226
SC_YLR070C Sorbitol dehydrogenase )72 CSYIAYHEFTEVMFRY 0.03 12.9 8.7 176
SC_YJR159W Sorbitol dehydrogenase )137
CSYIGYHEFTEVMFRY )0.11 12.9 7.7 207
SC_YDL246C Sorbitol dehydrogenase )136 CSYIGYHEFTEVMFRY )0.11 13.3 8.4 207
EC_1744.b1776 Sorbitol dehydrogenase )23
CAHGS-HENLDAMMAY )0.24 12.5 7.0 212
EC_3538.tdh Threonine 3-dehydrogenase )130
CTIWSKHEGVDNIYGR )0.68 12.5 8.0 205
EC_2496.b2545 Sorbitol dehydrogenase )24
CSYRAKHEKYSTEWVT )1.28 13.2 7.5 202
SC_YAL61W (Sorbitol dehydrogenase) )15
CTEIFSHELAQVMMCY 0.59 9.7 6.7 145
SC_YAL60W Butanediol dehydrogenase )152
CSEIFSHEFLEVVIGY 0.77 12.1 8.5 188
EC_0598.b0608 Glutathione-dependent
formaldehyde dehydrogenase
)67
CSLIP-HELYSTRFKM )0.34 10.2 5.9 77
EC_1550.rspB Starvation sensing
protein RSPB
)130
CSIHN-HEVVEIFRLN 0.05 16.0 6.8 161
EC_2050.gatD Galactitol-1-phosphate )133
CSRAH-HEFSEVTWMN )0.71 12.4 8.6 220
5-dehydrogenase
CAD
b
AT_MLD14–17 Cinnamyl-alcohol )132 CTQGM-HEYVCTFTEE )0.05 9.7 10.7 160
dehydrogenase
AT_F20D10–90 Mannitol dehydrogenase )84
CSCHS-HEYVCSITQE 0.09 12.5 9.8 213
AT_F20D10–110 Mannitol dehydrogenase )91
CSMGM-HEYPCTLTQE )0.01 8.3 9.0 181
AT_F20D10–100 Mannitol dehydrogenase )89
CTMGL-HESKCTLTQE 0.00 12.5 9.8 174
AT_T22F8–230 Mannitol dehydrogenase )87
CTTGY-HEYICTLTQE )0.39 10.9 8.0 183
AT_F7D8–5 Mannitol dehydrogenase )85
CSTGF-HEYRCTLTQE )0.32 12.8 7.3 223
AT_F7D8–21 Mannitol dehydrogenase )84
CSTGF-HEYRCTLTQE )0.32 12.8 7.8 223
AT_F28P22–13 Cinnamyl-alcohol )67
CAWGD-HEFICTITQQ 0.34 12.7 7.8 198
dehydrogenase
EC_0317_b0325 Mannitol dehydrogenase )61
CSQAG-HEYPCTSTQE )0.69 9.1 10.2 213
EC_4160_yjgB Mannitol dehydrogenase )41
CSMGF-HEI-CTVLRK 0.44 12.7 7.8 204
SC_YCR105W Mannitol dehydrogenase )40 CSIGP-HEMPCTLIEQ 0.49 12.5 7.4 248
SC_YMR318C Mannitol dehydrogenase )40
CSCGN-HEYPCTLLNQ )0.03 12.0 8.1 237
QOR
c
HS_hCP39890 (Mycocerosic acid synthase) )18 NADLQYLALGETLFLSR 0.22 17.0 7.1 201
AT_F18E5–200 Quinone oxidoreductase )28
NADLQYLALGETLPAPR )0.21 18.5 6.7 220
SC_YBR046C Quinone oxidoreductase )34
NIEYFYRISTLTNSRLY )0.41 12.5 8.7 124
EC_3946_qor Quinone oxidoreductase )121
NIDYIYTAQLLTNNRLQ )0.43 22.0 7.3 218
AT_k11j9–30 Quinone oxidoreductase )52
NIDYFYMAGMLTQARMM 0.21 11.2 8.9 137
HS_hCP34852 (Quinone oxidoreductase) )16
NSDNYYF-YPVTFLAQG )0.35 18.4 6.7 204
AT_F14J22–2 (Alcohol dehydrogenase
class III)
)9
NSDNFYF-VFTTMLQAG 0.13 17.6 7.0 199
HS_VAT1_HUMAN Synaptic vesicle membrane )119
MAYMVLSVTMQCHL 0.97 18.8 10.2 289
protein VAT-1 homologue
HS_hCP47235 + Synaptic vesicle membrane )25
NIDMVIFAFYMTLYVLW 1.47 17.0 6.3 160
hCP1631114 protein VAT-1 homologue
HS_hCP38146 Synaptic vesicle membrane )22
STHFDYALLLIENAEEA 0.04 17.0 5.5 186
protein VAT-1 homologue
Ó FEBS 2002 MDR familycharacterizations in complete genomes (Eur. J. Biochem. 269) 4271
Table 2. (Continued).
Protein Annotation
log
E value Activesite residues
Hydro-
phobicity
index
Depth
(A
˚
)
Width
(A
˚
)
Volume
(A
˚
3
)
CE_F39B2–3 Quinone oxidoreductase )43
NVDYIYKYGAVTNVAMS 0.14 17.2 7.4 211
HS_QOR_HUMAN Quinone oxidoreductase )123
NVEYIYSSSSITSATS- )0.05 13.4 7.1 160
AT_T5P19–110 Quinone oxidoreductase )21
NANLQYSSFLVTFVYSY 0.35 16.4 7.9 222
HS_QORL_HUMAN Quinone oxidoreductase-like 1 )139
SINKLKRILDRRGLNVW )0.55 16.0 6.2 188
AT_K15M2–24 Quinone oxidoreductase )20
NLDRIGRALTFTGLYGI 0.30 13.1 5.3 168
AT_F5O8–27 Quinone oxidoreductase )109
NVDKRFYNVKLTG-VN- )0.56 19.1 7.6 252
AT_F25G13–100 (Quinone oxidoreductase) )18
NVDKIITVLLVTPMTKK 0.51 15.7 6.9 161
DM_7295851 (ToxD protein) )7
NIDAMGRVVQYTPLTGG )0.01 16.0 6.3 234
MRF
d
SC_YBR026C Mitochondrial respiratory
function protein
)137 NSDNQYNLQQVTGFWEK )1.48 16.6 7.3 198
CE_Y48A6B_9 Mitochondrial respiratory
function protein
)22
NLDNRYSFSTITGFAMW )0.18 14.0 6.5 168
AT_T6D9–100 (Mitochondrial respiratory
function protein)
)15
NSDNRYYSPSVTGFWSW )1.08 14.1 7.1 199
DM_7303260 Mitochondrial respiratory
function protein
)32
NADNTYNLALVTGFWRW )0.31 18.1 7.3 243
CE_W09H1–5 (Mitochondrial respiratory
function protein)
)15
NADNQYNDRLVTGFWRW )1.27 16.6 6.8 202
HS_ENSP234985 Mitochondrial respiratory
function protein
)38
NSDNMYNANLVTGFWQW )0.68 16.1 6.3 201
LTD
e
AT_F2K13–110 NADP-dependent leukotriene
dehydrogenase
)37 SCDYMRKEEV-TMG-IE )0.62 17.6 6.7 242
AT_F2K13–140 NADP-dependent leukotriene
dehydrogenase
)37
SCDYMGKEEV-TMN-IQ )0.56 16.4 6.5 251
AT_F2K13–150 NADP-dependent leukotriene
dehydrogenase
)20
SCDYMGKEEV-TMN-IQ )0.56 19.2 5.9 251
AT_F2K13–120 NADP-dependent leukotriene
dehydrogenase
)37
SCDYMGQEEV-TMN-IQ )0.54 16.5 6.6 236
AT_F2K13–130 NADP-dependent leukotriene
dehydrogenase
)34
SCDYMGQY TMN-IQ )0.45 19.5 6.0 242
AT_T17B22–23 NADP-dependent leukotriene
dehydrogenase
)37
SCDYMGEGEL-TMN-IK )0.41 17.0 5.8 247
AT_F28B23–3 NADP-dependent leukotriene
dehydrogenase
)39
SCDYMGVEEV-TMN-LQ )0.13 20.3 5.9 256
AT_K18L3–100 NADP-dependent leukotriene
dehydrogenase
)37
SCDHSGKEEV-TMN-VQ )0.85 17.5 6.1 249
AT_k19a23–10 NADP-dependent leukotriene
dehydrogenase
)36
SCDHSGKEEV-TMN-VQ )0.85 17.5 6.6 249
AT_F24G16–110 NADP-dependent leukotriene
dehydrogenase
)35
SCDYMRKEET-TMN-MQ )1.25 17.0 10.1 245
AT_F5I14–32 NADP-dependent leukotriene
dehydrogenase
)37
SCDYMRQEEL-TMN-LE )0.85 18.5 7.3 225
HS_LB4D_HUMAN NADP-dependent leukotriene
dehydrogenase
)118
TVDYMKMTTI-TAP-ME )0.02 18.8 7.3 218
EC_1420_b1449 NADP-dependent leukotriene
dehydrogenase
)44
SLDYMSGQDI-TLLRLQ )0.05 11.7 6.4 161
CE_M106–3_short NADP-dependent leukotriene
dehydrogenase
)26
SVDAQNETKV-TQHNRE )1.65 20.0 6.8 248
HS_hCP39255 NADP-dependent leukotriene
dehydrogenase
)26
SVDYMNQQTI-TQANRE )1.18 19.5 7.3 192
SC_YML131W (NADP-dependent leukotriene
dehydrogenase)
)10
SNDAQSETTI-TAG-VK )0.57 17.9 6.4 291
ACR
f
HS_FAS_HUMAN Fatty acid synthase )199 NRDMLLTLVKVTKLLAF 0.78 16.3 4.8 140
CE_F32H2–5 Fatty acid synthase )92
NRDMLLAILQVTKLLSI 0.91 16.2 5.5 186
4272 E. Nordling et al. (Eur. J. Biochem. 269) Ó FEBS 2002
like animals, but based upon linolenic [51] rather than
arachidonic acid, the functions may indeed be correspond-
ing. Another claim for retained function of the LTD family
is that Urtica urens (nettle plant) uses leukotriene B
4
as an
immunoreactive agent in the defence against herbivores [52].
These proteins may also function as allyl ADHs as they
have 70% sequence identity to a protein from Nicotiana
tabacum that acts on monoterpene allylic alcohols [53]. The
LTD active sites all are deep and narrow (Table 2). The
active site is polar in all cases but one, and for the majority
of the LTD members, several charged residues are present at
the active site. The activesite volumes are typically around
250 A
˚
3
, all consistent with activity on leukotrienes or
similarly sized molecules. The typical residues at the active
site are Ser45, Cys46, Asp47, Tyr49, Met50, Glu63, Thr128,
Met241, and Asn256.
The acyl-CoA reductase (ACR) family
ACRs form a family (Table 2) that contributes one
domain of the fatty acid synthases and erythronolide
synthases [54,55]. These ACR members have active site
volumes ranging from 140 to 189 A
˚
3
and they are only
found in the human, D. melanogaster and C. elegans
genomes. Orthologue analysis shows that only two of the
three D. melanogaster forms are closely related to the
human and C. elegans forms (Table 3). The active sites are
hydrophobic (index between 0.64 and 1.08) with narrow
and quite deep (15–17 A
˚
) substrate pockets consistent with
the nature of their fatty acid substrates. Four leucine
residues are strictly conserved at the active site. Several
conserved residues are clustered at a surface corresponding
to the one that is perpendicular to the subunit interacting
surface in dimeric MDR forms. It seems likely that these
conserved residues are involved in protein–protein inter-
actions in the multienzymes of fatty acid synthase and
erythronolide synthase, defining the subunit-interacting
areas.
Sequence patterns
The sequence comparisonsand subdivisions make it possible
to define sequence patterns useful for characterization of
MDR members. For QOR, a
PROSITE
pattern [56] already
exists (PS01162). However, this pattern is too insensitive to
find all the sequences we now classify as QOR members. It
only finds five of our presently recognized 18 QOR
members. The QOR family is highly divergent, which may
explain the poor result of the existing pattern. Based upon
the sequences now available, we propose a new pattern that
Table 2. (Continued).
Protein Annotation
log
E value Activesite residues
Hydro-
phobicity
index
Depth
(A
˚
)
Width
(A
˚
)
Volume
(A
˚
3
)
DM_7289423 Fatty acid synthase )59
NRDMLLIMVGVTKLLSL 1.08 17.4 5.7 159
DM_7295848 Fatty acid synthase )88
NRDMLLAMVKCTKLLSV 0.64 15.6 4.8 189
DM_7295849 Fatty acid synthase )91
NRDMLLAMVKCTKLLSV 0.64 17.3 6.4 189
a
Active site residues are at positions 44, 46, 50, 56, 57, 59, 69, 70, 118, 121, 155, 159, 274, 297, 298 and 299 in the numbering of
DHSO_HUMAN.
b
Active site residues are at positions 48, 50, 54, 60, 61, 70, 71, 122, 125, 164, 168, 284, 307, 308 and 309 in the numbering
of in CAD1_ARATH.
c
Active site residues are at positions 42, 44, 45, 47, 48, 53, 64, 89, 90, 93, 124, 128, 241, 255, 264, 267 and 268 in the
numbering of EC_3946_qor (QOR_ECOLI).
d
Active site residues are at positions 63, 65, 66, 68, 69, 73, 94, 121, 122, 125, 156, 160, 285, 300,
309, 312 and 313 in the numbering of SC_YBR026C (MRF1_YEAST).
e
Active site residues are at positions 45, 46, 47, 49, 50, 55, 63, 92, 93,
96, 128, 241, 256, 267 and 268 in the numbering of LB4D_HUMAN.
f
Active site residues are at positions 1567, 1569, 1570, 1572, 1573,
1576, 1586, 1611, 1612, 1615, 1645, 1649, 1766, 1781, 1790, 1793 and 1794 in the numbering of FAS_HUMAN.
Table 3. Orthologues recognized within the six analysed genomes.
Family | H. sapiens | D. melanogaster | C. elegans | A. thaliana | S. cerevisiae | E. coli
PDH family (SDH activity) (a) | HS_DHSO_HUMAN | DM_7298873 | CE_R04B5.5 | AT_MSG15–5 | YLR070C | EC_1742_b1774
 | HS_Q9UMD6 | DM_7299382 | CE_R04B5.6 | | YJR159W |
 | | | | | YDL246C |
QOR family (QOR activity) (a) | QOR_HUMAN | – | CE_F39B2.3 | AT_k11j9–30 | SC_YBR046C | EC_3946.qor
MRF family (MRF) (b) | ENSP 234985 | DM_7303260 | CE_W09H1.5 | AT_T6D9–100 | – | –
ACR family (ACR activity) (a,c) | HS_FAS_HUMAN | DM_7295848_short | CE_F32H2.5 | – | – | –
 | | DM_7295849_short | | | |
LTD family (LTD activity) (a) | HS_LB4D_HUMAN | | | | |
 | HS_hCP39255 | – | CE_M106.3_short | – (d) | SC_YML131W | EC_1420.b1449
(a) Shown for the human member; (b) shown for the S. cerevisiae member; (c) in one domain of fatty acid synthases; (d) all A. thaliana forms show equidistant relationships to the LTD forms of other species.
© FEBS 2002 MDR family characterizations in complete genomes (Eur. J. Biochem. 269) 4273
will better detect the QOR members (Table 4). It finds 15
of the 18 QOR members, three times as many as the existing
PROSITE pattern, and it misses only three QOR forms, while
detecting no false positives.
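The comparison above can be restated as simple arithmetic. The following minimal Python sketch assumes that the "hits" and "fn" columns of Table 4 can be read as true positives and false negatives; the term "sensitivity" and the function name are our own labels for illustration and do not appear in the original analysis.

```python
def sensitivity(hits, false_negatives):
    """Fraction of recognized family members that a pattern detects."""
    return hits / (hits + false_negatives)

# Counts taken from the text: 18 recognized QOR members in total.
new_pattern = sensitivity(hits=15, false_negatives=3)    # proposed pattern
old_pattern = sensitivity(hits=5, false_negatives=13)    # existing PS01162

print(f"proposed pattern: {new_pattern:.3f}")
print(f"existing pattern: {old_pattern:.3f}")
print(f"fold improvement: {new_pattern / old_pattern:.1f}")
```

Under these assumptions the proposed pattern recovers about 83% of the QOR members, against about 28% for the existing pattern.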
The sequence patterns for the PDH and the CAD
subgroups are based on residues that bind the catalytic zinc
and the substrate. The PDH pattern (Table 4) is somewhat
complex but captures all the different substrate specificities
of this group and gives only one false positive when
screened against SWISSPROT. The CAD pattern (Table 4) is
shorter and is highly specific due to an additional cysteine
residue in the active site region, unique to this family. It finds
no false positive match and misses none of our members.
The MRF and the ACR patterns are based upon unusual
sequence stretches. The MRF proteins have a highly
conserved T-Y-G-G-M motif, ideal to base a pattern upon.
The pattern finds all the members and no false positive
matches in SWISSPROT (Table 4). The ACR pattern is
based upon a sequence stretch with many aromatic residues,
including two tryptophan residues, which are well suited for
pattern recognition because tryptophan is the least frequently
occurring amino acid (Table 4). The pattern for the LTD group is
based upon the comparatively hydrophilic nature of the
active site in this subgroup; it is very specific and recovers all
members with no false positive matches in SWISSPROT
(Table 4).
The patterns are useful for proper recognition of new
genomic sequences. They allow the large numbers of sequences
generated by ongoing genome projects to be rapidly annotated
into the different families of the MDR superfamily. They are
also well suited to finding particular enzymes in the
ever-increasing sequence databases.
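As an illustration of how such patterns might be applied during annotation, the following minimal Python sketch translates the subset of PROSITE syntax used in Table 4 into regular expressions and reports which family patterns match a sequence. The patterns are copied from Table 4; the function names (prosite_to_regex, classify) and the test sequences are hypothetical and were not part of the original analysis.

```python
import re

# Family patterns in PROSITE syntax, copied from Table 4.
PATTERNS = {
    "MRF": "L-x(6)-[VL]-T-Y-G-G-M-[SA]-[KR]",
    "CAD": "C-G-x-C-x(2)-D-x(17)-G-H-E",
    "LTD": "D-x-[YF]-x-[DE]-N-V-G-[GS]-x(3)-[DEN]",
    "ACR": "W-x(5)-W-x(8)-P-x(2)-Y-x(3)-Y-Y",
}

# One pattern element: 'x', a single residue, or a bracketed residue set,
# optionally followed by a repeat count such as (3) or (6,19).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate the PROSITE subset used in Table 4 into a regex string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        core = "." if core == "x" else core  # 'x' means any residue
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(core + repeat)
    return "".join(parts)


def classify(sequence):
    """Return the names of all family patterns found in the sequence."""
    return [name for name, pattern in PATTERNS.items()
            if re.search(prosite_to_regex(pattern), sequence)]


# Synthetic sequences built to satisfy one pattern each (not real proteins).
mrf_like = "MK" + "L" + "A" * 6 + "VTYGGMSK" + "PE"
cad_like = "CGACAAD" + "A" * 17 + "GHE"
print(classify(mrf_like))
print(classify(cad_like))
```

Regular expressions are used here only because the patterns in Table 4 need no PROSITE features beyond fixed residues, residue sets and bounded repeats; a dedicated pattern scanner would be needed for the full syntax.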
ACKNOWLEDGEMENTS
Financial support from the Swedish Research Council, the Swedish
Foundation for Strategic Research and Karolinska Institutet is
gratefully acknowledged.
REFERENCES
1. Persson, B., Zigler, J.S. Jr & Jörnvall, H. (1994) A super-family of
medium-chain dehydrogenases/reductases (MDR). Sub-lines
including ζ-crystallin, alcohol and polyol dehydrogenases, quinone
oxidoreductase, enoyl reductases, VAT-1 and other proteins. Eur.
J. Biochem. 226, 15–22.
2. Jörnvall, H., Höög, J.-O. & Persson, B. (1999) SDR and MDR:
completed genome sequences show these protein families to be
large, of old origin, and of complex nature. FEBS Lett. 445,
261–264.
3. Jörnvall, H., Höög, J.-O., Persson, B. & Parés, X. (2000)
Pharmacogenetics of the alcohol dehydrogenase system. Am. J.
Pharmacol. 61, 184–191.
4. Marschall, H.-U., Oppermann, U.C., Svensson, S., Nordling, E.,
Persson, B., Höög, J.-O. & Jörnvall, H. (2000) Human liver class I
alcohol dehydrogenase γγ isozyme: the sole cytosolic
3beta-hydroxysteroid dehydrogenase of iso bile acids. Hepatology 31,
990–996.
5. Jörnvall, H., Persson, M. & Jeffery, J. (1981) Alcohol and
polyol dehydrogenases are both divided into two protein types,
and structural properties cross-relate the different enzyme
activities within each type. Proc. Natl Acad. Sci. USA 78, 4226–4230.
6. Yancey, P.H., Clark, M.E., Hand, S.C., Bowlus, R.D. & Somero,
G.N. (1982) Living with water stress: evolution of osmolyte
systems. Science 217, 1214–1222.
7. Czajka, M.C. & Lee, R.E. Jr (1990) A rapid cold-hardening
response protecting against cold shock injury in Drosophila
melanogaster. J. Exp. Biol. 148, 245–254.
8. Wolfe, G.R., Smith, C.A., Hendrix, D.L. & Salvucci, M.E. (1999)
Molecular basis for thermoprotection in Bemisia: structural
differences between whitefly ketose reductase and other
medium-chain dehydrogenases/reductases. Insect. Biochem. Mol. Biol. 29,
113–120.
9. Chakrabarti, S., Cukiernik, M., Hileeto, D., Evans, T. & Chen, S.
(2000) Role of vasoactive factors in the pathogenesis of early
changes in diabetic retinopathy. Diabetes Metab. Res. Rev. 16,
393–407.
10. Boudet, A.M., Lapierre, C. & Grima-Pettenati, J. (1995)
Biochemistry and molecular biology of lignification. New Phytol. 129,
203–236.
11. Sarni, F., Grand, C. & Boudet, A.M. (1984) Purification and
properties of cinnamoyl-CoA reductase and cinnamyl alcohol
dehydrogenase from poplar stems (Populus X euramericana). Eur.
J. Biochem. 139, 259–265.
12. Halpin, C., Knight, M.E., Foxon, G.A., Campbell, M., Boudet,
A.M., Boon, J.J., Chabbert, B., Tollier, M.T. & Schuch, W. (1994)
Manipulation of lignin quality by down-regulation of cinnamyl
alcohol dehydrogenase. Plant J. 6, 339–350.
13. Galliano, H., Cabane, M., Eckerskorn, C., Lottspeich, F.,
Sandermann, H. Jr & Ernst, D. (1993) Molecular cloning, sequence
analysis and elicitor-/ozone-induced accumulation of cinnamyl
alcohol dehydrogenase from Norway spruce (Picea abies L.).
Plant Mol. Biol. 23, 145–156.
14. Persson, B., Hallborn, J., Walfridsson, M., Hahn-Hägerdal, B.,
Keränen, S., Penttilä, M. & Jörnvall, H. (1993) Dual relationships
of xylitol and alcohol dehydrogenases in families of two protein
types. FEBS Lett. 324, 9–14.
15. Campbell, M.M. & Sederoff, R.R. (1995) Variation in lignin
content and composition. Mechanisms of control and implications
for the genetic improvement of plants. Plant Physiol. 110, 3–13.
16. Gonzalez, P., Rao, P.V., Nunez, S.B. & Zigler, J.S. Jr (1995)
Evidence for independent recruitment of ζ-crystallin/quinone
reductase (CRYZ) as a crystallin in camelids and hystricomorph
rodents. Mol. Biol. Evol. 12, 773–781.
Table 4. Sequence patterns and screening results. The column hits gives the number of MDR forms detected in the six genomes investigated, fp (false positives) gives the number of nonmembers detected in SWISSPROT, and fn (false negatives) gives the number of proteins classified as members but not found by the pattern.
Group | Pattern | hits | fp | fn
QOR | [GAS]-x-N-x(2)-[DEN]-x(5)-G-x(6,19)-[PS]-x(3)-[GA]-x-[ED]-x(2)-G-x-[VIL]-x(3)-G | 15 | 0 | 3
MRF | L-x(6)-[VL]-T-Y-G-G-M-[SA]-[KR] | 6 | 0 | 0
PDH | [GA]-[VIL]-[CS]-[GN]-[STA]-D-[VILMS]-[HKP]-x(14,27)-G-H-[ED]-x(2)-G-x-[VI]-x(10,12)-G-[DEQ]-x-[IV] | 22 | 1 | 0
CAD | C-G-x-C-x(2)-D-x(17)-G-H-E | 12 | 0 | 0
LTD | D-x-[YF]-x-[DE]-N-V-G-[GS]-x(3)-[DEN] | 16 | 0 | 0
ACR | W-x(5)-W-x(8)-P-x(2)-Y-x(3)-Y-Y | 5 | 0 | 0
17. Rao, P.V., Gonzalez, P., Persson, B., Jörnvall, H., Garland, D. &
Zigler, J.S. Jr (1997) Guinea pig and bovine ζ-crystallins have
distinct functional characteristics highlighting replacements in
otherwise similar structures. Biochemistry 36, 5353–5362.
18. Brändén, C.I., Jörnvall, H., Eklund, H. & Furugren, B. (1975)
Alcohol dehydrogenases. In The Enzymes, 3rd edn. (Boyer, P.D.,
ed.), pp. 104–190. Academic Press, New York.
19. Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J.,
Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A.
et al. (2001) The sequence of the human genome. Science 291,
1304–1351.
20. Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C.,
Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W. et al.
(2001) Initial sequencing and analysis of the human genome.
Nature 409, 860–921.
21. Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A., Gocayne,
J.D., Amanatides, P.G., Scherer, S.E., Li, P.W., Hoskins, R.A.,
Galle, R.F. et al. (2000) The genome sequence of Drosophila
melanogaster. Science 287, 2185–2195.
22. The C. elegans Sequencing Consortium (1998) Genome sequence
of the nematode C. elegans: a platform for investigating biology.
Science 282, 2012–2018.
23. The Arabidopsis Genome Initiative (2000) Analysis of the genome
sequence of the flowering plant Arabidopsis thaliana. Nature 408,
796–815.
24. Goffeau, A., Barrell, B.G., Bussey, H., Davis, R.W., Dujon, B.,
Feldmann, H. & Galibert, F. (1996) Life with 6000 genes. Science
274, 563–567.
25. Blattner, F.R., Plunkett, G. III, Bloch, C.A., Perna, N.T.,
Burland, V., Riley, M., Collado-Vides, J., Glasner, J.D., Rode, C.K.,
Mayhew, G.F., Gregor, J., Davis, N.W., Kirkpatrick, H.A.,
Goeden, M.A., Rose, D.J., Mau, B. & Shao, Y. (1997) The
complete genome sequence of Escherichia coli K-12. Science 277,
1453–1474.
26. Pearson, W.R. & Lipman, D.J. (1988) Improved tools for
biological sequence comparison. Proc. Natl Acad. Sci. USA 85,
2444–2448.
27. Thompson, J.D., Higgins, D.G. & Gibson, T.J. (1994) CLUSTAL
W: improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Res. 22,
4673–4680.
28. Swofford, D.L. (1998) PAUP*. Phylogenetic Analysis Using
Parsimony (*and Other Methods), version 4. Sinauer Associates,
Sunderland, Massachusetts.
29. Page, R.D. (1996) TreeView: an application to display
phylogenetic trees on personal computers. Comput. Appl. Biosci. 12,
357–358.
30. Bairoch, A. & Apweiler, R. (2000) The SWISS-PROT protein
sequence data bank and its supplement TrEMBL in 2000. Nucleic
Acids Res. 28, 45–48.
31. Abagyan, R. & Totrov, M. (1994) Biased probability Monte Carlo
conformational searches and electrostatic calculations for peptides
and proteins. J. Mol. Biol. 235, 983–1002.
32. Banfield, M.J., Salvucci, M.E., Baker, E.N. & Smith, C.A. (2001)
Crystal structure of the NADP(H)-dependent ketose reductase
from Bemisia argentifolii at 2.3 Å resolution. J. Mol. Biol. 306,
239–250.
33. Thorn, J.M., Barton, J.D., Dixon, N.E., Ollis, D.L. & Edwards,
K.J. (1995) Crystal structure of Escherichia coli QOR quinone
oxidoreductase complexed with NADPH. J. Mol. Biol. 249,
785–799.
34. Cho, H., Ramaswamy, S. & Plapp, B.V. (1997) Flexibility of liver
alcohol dehydrogenase in stereoselective binding of
3-butylthiolane 1-oxides. Biochemistry 36, 382–389.
35. Abagyan, R., Batalov, S., Cardozo, T., Totrov, M., Webber, J. &
Zhou, Y. (1997) Homology modeling with internal
coordinate mechanics: deformation zone mapping and
improvements of models via conformational search. Proteins supplement
1, 29–37.
36. Kyte, J. & Doolittle, R.F. (1982) A simple method for displaying
the hydropathic character of a protein. J. Mol. Biol. 157, 105–132.
37. Danielsson, O. & Jörnvall, H. (1992) 'Enzymogenesis': classical
liver alcohol dehydrogenase origin from the
glutathione-dependent formaldehyde dehydrogenase line. Proc. Natl Acad. Sci. USA
89, 9247–9251.
38. Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Finn, R.D. &
Sonnhammer, E.L.L. (1999) Pfam 3.1: 1313 multiple
alignments match the majority of proteins. Nucleic Acids Res. 27,
260–262.
39. Tatusov, R.L., Natale, D.A., Garkavtsev, I.V., Tatusova, T.A.,
Shankavaram, U.T., Rao, B.S., Kiryutin, B., Galperin, M.Y.,
Fedorova, N.D. & Koonin, E.V. (2001) The COG database: new
developments in phylogenetic classification of proteins from
complete genomes. Nucleic Acids Res. 29, 22–28.
40. Nordling, E., Persson, B. & Jörnvall, H. (2002) Differential
multiplicity of MDR alcohol dehydrogenases. Cell. Mol. Life Sci. 59,
1070–1075.
41. Hjelmqvist, L., Shafqat, J., Siddiqi, A.R. & Jörnvall, H. (1996)
Linking of isozyme and class variability patterns in the
emergence of novel alcohol dehydrogenase functions. Characterization
of isozymes in Uromastix hardwickii. Eur. J. Biochem. 236,
563–570.
42. Johansson, K., El-Ahmad, M., Kaiser, C., Jörnvall, H., Eklund,
H., Höög, J. & Ramaswamy, S. (2001) Crystal structure of sorbitol
dehydrogenase. Chem. Biol. Interact. 130–132, 351–358.
43. Vallee, B.L. & Auld, D.S. (1990) Active-site zinc ligands and
activated H2O of zinc enzymes. Proc. Natl Acad. Sci. USA 87,
220–224.
44. Karlsson, C., Jörnvall, H. & Höög, J.-O. (1995) Zinc binding of
alcohol and sorbitol dehydrogenases. Adv. Exp. Med. Biol. 372,
397–406.
45. Wierenga, R.K., Terpstra, P. & Hol, W.G. (1986) Prediction of the
occurrence of the ADP-binding beta alpha beta-fold in proteins,
using an amino acid sequence fingerprint. J. Mol. Biol. 187,
101–107.
46. Jörnvall, H., Danielsson, O., Höög, J.-O. & Persson, B. (1993)
Alcohol dehydrogenase: patterns of protein evolution. In:
Methods in Protein Sequence Analysis (Imahori, K. & Sakiyama, F.,
eds), pp. 275–282. Plenum, New York.
47. Bentley, R. (1990) The shikimate pathway – a metabolic tree with
many branches. Crit. Rev. Biochem. Mol. Biol. 25, 307–384.
48. Rao, P.V. & Zigler, J.S. Jr (1991) ζ-Crystallin from guinea pig lens
is capable of functioning catalytically as an oxidoreductase. Arch.
Biochem. Biophys. 284, 181–185.
49. Mano, J., Babiychuk, E., Belles-Boix, E., Hiratake, J., Kimura, A.,
Inze, D., Kushnir, S. & Asada, K. (2000) A novel
NADPH:diamide oxidoreductase activity in Arabidopsis thaliana P1
ζ-crystallin. Eur. J. Biochem. 267, 3661–3671.
50. Yamazoe, M., Shirahige, K., Rashid, M.B., Kaneko, Y.,
Nakayama, T., Ogasawara, N. & Yoshikawa, H. (1994) A protein which
binds preferentially to single-stranded core sequence of
autonomously replicating sequence is essential for respiratory function in
mitochondrial of Saccharomyces cerevisiae. J. Biol. Chem. 269,
15244–15252.
51. Bergey, D.R., Howe, G.A. & Ryan, C.A. (1996) Polypeptide
signaling for plant defensive genes exhibits analogies to
defense signaling in animals. Proc. Natl Acad. Sci. USA 93,
12053–12058.
52. Czarnetzki, B.M., Thiele, T. & Rosenbach, T. (1990)
Immunoreactive leukotrienes in nettle plants (Urtica urens). Int. Arch.
Allergy. Appl. Immunol. 91, 43–46.
53. Hirata, T., Tamura, Y., Yokobatake, N., Shimoda, K. & Ashida,
Y. (2000) A 38 kDa allylic alcohol dehydrogenase from the
cultured cells of Nicotiana tabacum. Phytochemistry 55, 297–303.
54. Amy, C.M., Witkowski, A., Naggert, J., Williams, B., Randhawa,
Z. & Smith, S. (1989) Molecular cloning and sequencing of
cDNAs encoding the entire rat fatty acid synthase. Proc. Natl
Acad. Sci. USA 86, 3114–3118.
55. Donadio, S., Staver, M.J., McAlpine, J.B., Swanson, S.J. & Katz,
L. (1991) Modular organization of genes required for complex
polyketide biosynthesis. Science 252, 675–679.
56. Hofmann, K., Bucher, P., Falquet, L. & Bairoch, A. (1999) The
PROSITE database, its status in 1999. Nucleic Acids Res. 27,
215–219.