REVIEW ARTICLE
Flavogenomics – a genomic and structural view of
flavin-dependent proteins
Peter Macheroux1,2, Barbara Kappes3 and Steven E. Ealick2
1 Institute of Biochemistry, Graz University of Technology, Austria
2 Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
3 Department of Parasitology, University Hospital Heidelberg, Germany
Keywords
enzymes; flavin adenine dinucleotide (FAD);
flavin mononucleotide (FMN); genomic
distribution; oxidoreductases; redundancy;
structures
Correspondence
P. Macheroux, Institute of Biochemistry,
Graz University of Technology, Petersgasse
12/II, A-8010 Graz, Austria
Fax: +43 316 873 6952
Tel: +43 316 873 6450
E-mail: peter.macheroux@tugraz.at
(Received 17 March 2011, revised 11 May
2011, accepted 31 May 2011)
doi:10.1111/j.1742-4658.2011.08202.x
Riboflavin (vitamin B2) serves as the precursor for FMN and FAD in
almost all organisms that utilize the redox-active isoalloxazine ring system
as a coenzyme in enzymatic reactions. The role of flavin, however, is not
limited to redox processes, as 10% of flavin-dependent enzymes catalyze
nonredox reactions. Moreover, the flavin cofactor is also widely used as a
signaling and sensing molecule in biological processes such as phototropism
and nitrogen fixation. Here, we present a study of 374 flavin-dependent
proteins analyzed with regard to their function, structure and distribution
among 22 archaeal, eubacterial, protozoan and eukaryotic genomes. More
than 90% of flavin-dependent enzymes are oxidoreductases, and the
remaining enzymes are classified as transferases (4.3%), lyases (2.9%),
isomerases (1.4%) and ligases (0.4%). The majority of enzymes utilize
FAD (75%) rather than FMN (25%), and bind the cofactor noncovalently
(90%). High-resolution structures are available for about half of the
flavoproteins. FAD-containing proteins predominantly bind the cofactor in a
Rossmann fold (~50%), whereas FMN-containing proteins preferably
adopt a (βα)8-(TIM)-barrel-like or flavodoxin-like fold. The number of
genes encoding flavin-dependent proteins varies greatly in the genomes
analyzed, and covers a range from 0.1% to 3.5% of the predicted genes.
It appears that some species depend heavily on flavin-dependent
oxidoreductases for degradation or biosynthesis, whereas others have minimized
their flavoprotein arsenal. An understanding of ‘flavin-intensive’ lifestyles,
such as in the human pathogen Mycobacterium tuberculosis, may result in
valuable new intervention strategies that target either riboflavin
biosynthesis or uptake.
Abbreviations
PDB, Protein Data Bank; RI, redundancy index.
FEBS Journal 278 (2011) 2625–2634 © 2011 The Authors Journal compilation © 2011 FEBS
Introduction
Biological cofactors are generally employed by enzymes
to enable a wide and diverse range of biochemical
transformations necessary for all aspects of life. Some
of these cofactors, such as vitamin B12 and vitamin H
(biotin), catalyze a small but nevertheless important set
of biochemical reactions. Other cofactors, on the other
hand, perform very different chemical tasks, and
compete for the title of master of versatility, with
vitamin B2 (riboflavin)-derived, vitamin B6-derived
(e.g. pyridoxine and pyridoxamine) cofactors and
cytochrome P450 being the most serious contenders.
The yellow vitamin B2, or riboflavin, is synthesized by
many bacteria and plants [1,2], and then converted to
FMN and FAD (for structures see Fig. 1) by riboflavin
kinase (which catalyzes the phosphorylation of the
ribityl side chain attached to N10 of the isoalloxazine
ring system) and further adenylated by FAD-synthetase
in two ATP-dependent reactions [3–5]. These two
modified forms of riboflavin occur exclusively in flavin-
dependent enzymes. The biochemical utility of FMN
and FAD is based on their redox-active isoalloxazine
ring system, which is capable of one-electron and two-
electron transfer reactions and, most importantly, of
dioxygen activation [6]. Generations of enzymologists
have marvelled at the astonishing diversity of
flavin-dependent reactions, encompassing dehydrogena-
tion [7], oxidation [8–10], monooxygenation [11–13],
halogenation [14–16], and reduction (e.g. of disulfides
and various types of double bond) [17], as well as their
utility in biological sensing processes (e.g. light and
redox status) [18–25]. Not surprisingly, this area has
been the subject of numerous review articles that have
attempted to fathom and rationalize the capabilities of
the flavin cofactor [26–32]. The complexity of flavin-
catalyzed reactions is further increased when they join
forces with other redox-active cofactors, such as iron–sulfur
clusters ([2Fe–2S], [3Fe–4S] and/or [4Fe–4S])
[33–35], heme [36], molybdopterin [37], or thiamine
diphosphate [38].
Since the discovery of the first flavin-containing
enzyme by Otto Warburg in the 1930s [39], the number
of ‘yellow’ enzymes has steadily increased, and there
has been a sharp rise in the last 20–30 years, owing to
the rapid progress in molecular cloning and full
genome sequencing. More recently, structural genomics
has led to the structural characterization of many more
and hitherto unknown flavoproteins. To gain an over-
view of flavoproteins, their genomic distribution, and
their structural topologies, we have assembled a list of
flavoproteins and searched for the encoding sequences
in a selection of genomes. In addition, structural infor-
mation on flavoproteins in the Protein Data Bank
(PDB) was analyzed in order to define the flavin-bind-
ing pocket according to the PFAM classification
scheme [40].
Nature’s flavoprotein arsenal
The list of flavin-dependent proteins was assembled by
using, mainly, three on-line databases. First, the
enzyme database BRENDA (http://www.brenda-enzymes.org/)
was searched for FMN-dependent and
FAD-dependent enzymes to compile a preliminary list.
This initial list contained many false positives and also
missed several flavin-dependent enzymes, as well as
flavoproteins with no catalytic or no known catalytic
function (e.g. flavin storage proteins). To verify the
dependence of a protein on flavin, the primary literature
was consulted, and a complementary search for
classified enzymes in the Enzyme Structures Database
(http://www.ebi.ac.uk/thornton-srv/databases/enzymes/)
and the PDB (http://www.pdb.org/pdb/home/home.do)
was conducted to link the list of flavoproteins to the
available structural information.
Fig. 1. Structure of riboflavin, FMN, and FAD. The redox-active isoalloxazine ring is
shown in its oxidized and two-electron reduced state (red and blue). The numbering
scheme for the isoalloxazine ring is indicated in the oxidized structure on the left.
[Labels in the figure: riboflavin, FMN, FAD, isoalloxazine ring, oxidized, reduced.]
The current list of flavoproteins contains 276 fully
classified enzymes and 98 entries for enzymes with no
or incomplete classification as well as flavoproteins
without a demonstrated enzymatic activity (cofactor
storage, electron transfer, repressor and response
proteins; 17 entries). As could be expected for a
redox-active cofactor, the majority of flavoenzymes are found
in enzyme class 1: oxidoreductases account for 91%
(251 entries), whereas transferases, lyases, isomerases
and ligases contribute only 4.3% (12 entries), 2.9%
(eight entries), 1.4% (four entries), and 0.4% (one
entry), respectively (Fig. 2A). Within the class of oxidoreductases,
the three largest subgroups are enzymes in EC 1.14
(61 entries for monooxygenases/hydroxylases), EC 1.1
(38 entries for enzymes oxidizing a CH–OH group), and
EC 1.3 (30 entries for enzymes oxidizing a
CH–CH group) (Fig. 2B).
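The class shares quoted above can be recomputed from the entry counts given in the text. The following Python sketch is only an illustration of that arithmetic; the dictionary and function names are our own choices and are not taken from Table S1.

```python
# Entry counts for the fully classified flavoenzymes, as quoted in the text.
# The names EC_CLASS_COUNTS and class_shares are illustrative assumptions.
EC_CLASS_COUNTS = {
    "1 (oxidoreductases)": 251,
    "2 (transferases)": 12,
    "4 (lyases)": 8,
    "5 (isomerases)": 4,
    "6 (ligases)": 1,
}


def class_shares(counts):
    """Return the percentage share of each enzyme class."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


shares = class_shares(EC_CLASS_COUNTS)
for name, pct in shares.items():
    print(f"EC class {name}: {pct:.1f}%")
```

The counts sum to the 276 fully classified enzymes, so the shares are expressed relative to that subset and not to all 374 entries.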
FAD is clearly more common as a cofactor than
FMN, with 289 proteins depending on FAD (75%)
and 98 on FMN (25%) (note: entries where cofactor
utilization is unclear were not considered; see
Table S1). Riboflavin is not used in any enzymes
(except for riboflavin kinase ⁄ FAD synthetase as a sub-
strate), but appears to be the preferred storage form of
the cofactor in some organisms (e.g. riboflavin-binding
protein in chicken eggs and dodecin in archaeons
[41,42]). In addition, organisms (e.g. mammals) lacking
vitamin B2 biosynthesis employ riboflavin-specific
transporters to sequester it from dietary sources by
facilitated diffusion [43].
In the majority of enzymes, the cofactor is noncovalently
bound in the active site. Covalent attachment of
the flavin cofactor has been confirmed in 40 cases (see
Table S2), corresponding to 10.8% of all flavoproteins
listed in Table S1. Apparently, covalent attachment
of FMN (five entries) occurs rarely as compared
with that of FAD (35 entries). Different types of covalent
attachment have been found for FMN. It is linked
either to the 8α-position (via N3 of a histidine) or to
the 6-position (via the thiol group of a cysteine) of the
isoalloxazine ring [44], or, in one case, it is bicovalently
linked to N1 of a histidine and the thiol group of a
cysteine [45]. Only recently, a novel attachment of
FMN to redox-driven ion pumps (RnfG and RnfD)
via an ester linkage between the hydroxyl group of a
threonine and the ribitylphosphate side chain of the
cofactor was discovered [46]. On the other hand, covalent
linkage of FAD always occurs via the 8α-position,
to either the N1 or N3 of a histidine, a cysteine thiol,
a tyrosine hydroxyl, or an aspartate carboxyl group
(Table S2) [44,47]. In five enzymes, FAD is bicovalently
attached via the 8α-position and 6-position of the
isoalloxazine ring system [48]. Bicovalent attachment
was first discovered only 5 years ago, but appears to
be more common than monocovalent attachment to
the 8α-position via cysteine, tyrosine, or aspartate
[49,50].
Flavoprotein structures
The first structure of a flavin-dependent protein was
reported in 1972 for a bacterial flavodoxin [51,52]. Several
years later, the structures of the FAD-dependent
enzymes glutathione reductase (EC 1.8.1.7) and
4-hydroxybenzoate 3-monooxygenase (p-hydroxybenzoate
hydroxylase; EC 1.14.13.2) were described [53,54].
Since that time, the numbers of deposited structures of
FMN-dependent and FAD-dependent proteins have risen
to 646 and 1179, respectively (as of
31 December 2010), and this has been paralleled by
efforts to relate the structures of flavoproteins to their
functions [55–58]. The structure of flavodoxin, a small
electron transfer protein that uses FMN as a cofactor,
is not only the first but also by far the most frequently
solved structure of all flavin-dependent proteins
(> 120 entries in the PDB).
Currently, structures are available for 55 FMN-utilizing
and 141 FAD-utilizing flavoproteins, accounting
for 52% of all flavoproteins listed in Table S1.
Overall, a total of 23 structural clans (according to the
PFAM classification [40]) is represented by
flavin-dependent proteins, and the structural topologies are
therefore quite diverse in comparison with other
cofactor-dependent enzyme families; for example, all
pyridoxal 5′-phosphate-dependent enzymes adopt one
of five different structural topologies [59].
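The structural coverage stated above follows directly from the counts in the text. The Python sketch below repeats that calculation and, as a rough derived figure that is not given in the article, the mean number of PDB entries per structurally characterized protein. It assumes that the FMN and FAD groups do not overlap; all names are illustrative.

```python
# Counts quoted in the text (PDB status as of 31 December 2010).
# Assumption: FMN- and FAD-utilizing proteins are counted separately.
TOTAL_FLAVOPROTEINS = 276 + 98  # fully classified enzymes plus other entries
PROTEINS_WITH_STRUCTURE = {"FMN": 55, "FAD": 141}
PDB_ENTRIES = {"FMN": 646, "FAD": 1179}


def structural_coverage(with_structure, total):
    """Percentage of listed flavoproteins with at least one structure."""
    return 100.0 * sum(with_structure.values()) / total


def mean_entries_per_protein(entries, proteins):
    """Average number of PDB entries per structurally characterized protein."""
    return {cofactor: entries[cofactor] / proteins[cofactor] for cofactor in proteins}


coverage = structural_coverage(PROTEINS_WITH_STRUCTURE, TOTAL_FLAVOPROTEINS)
per_protein = mean_entries_per_protein(PDB_ENTRIES, PROTEINS_WITH_STRUCTURE)
print(f"structural coverage: {coverage:.1f}%")
for cofactor, mean in per_protein.items():
    print(f"{cofactor}: {mean:.1f} PDB entries per protein on average")
```

The averages are skewed by heavily studied proteins such as flavodoxin, so they should be read as an order-of-magnitude indication only.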
Fig. 2. Pie chart of flavoproteins found in various enzyme classes: yellow, class 1 (oxidoreductases); orange, class 2 (transferases); red,
class 4 (lyases); blue, class 5 (isomerases); and green, class 6 (ligases). This chart was generated by using the fully classified flavoenzymes
(a total of 276) from Table S1.
[Panel A labels: class 1, 91%; class 2, 4.3%; class 4, 2.9%; class 5, 1.4%; class 6, 0.4%. Panel B labels (subclasses of class 1): 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.10, 1.11, 1.12, 1.13, 1.14, 1.16, 1.17, 1.18, 1.21.]
As can be seen from Fig. 3, FMN and FAD binding
are vastly different with respect to the topology of the
binding pocket, indicating that the adenosine moiety
strongly affects the mode of cofactor binding. The pre-
ferred structure for FMN binding is the classical
(b ⁄ a)
8
-barrel (clan TIM_barrel), with 16 entries, and
the flavodoxin-like fold (clan Flavoprotein), with 12
entries. Together, these two clans account for more
than half of the currently known FMN-dependent
structural types. Graphical representations of these
two most common topologies in FMN-dependent pro-
teins are shown in Fig. 4A,B. Within the clan
TIM_barrel, five families are found in FMN-dependent
enzymes: FMN_dh (six entries), Oxidored_FMN (five
entries), and DHO_dh, Glu_synthase and NPD (one
entry for each family). In the clan Flavoprotein, nine
proteins adopt a Flavodoxin_1, two an FMN_red and
one a recently discovered Flavodoxin_NrdI fold. All of
the FMN-dependent proteins in this clan serve as electron
transfer proteins or act as two-electron reductases
for free flavin (FMN reductase, EC 1.5.1.29) or
other electron acceptors (e.g. azobenzene reductase,
EC 1.7.1.6). In addition to these two most abundant
structural clans, FMN-dependent proteins are found in
12 rare folds. Some of these folds are unique struc-
tures, and are found in only one or a few enzymes,
such as bacterial luciferase (Bac_luciferase), nitroreduc-
tase (Nitroreductase fold), phosphopantothenate-cyste-
ine ligase (clan NADP_Rossmann ⁄ family DFP), and
chorismate synthase (chorismate_syn). The latter two
examples are very interesting, because these two
enzymes do not catalyze net redox reactions and are
not classical oxidoreductases, like most flavin-depen-
dent enzymes (Fig. 2). This observation suggests that
FMN-dependent enzymes used for ‘aberrant’ activities
have evolved independently from the canonical FMN-
dependent oxidoreductases, or, in other words, the
folds necessary to carry out the enzymatic reaction
were not ‘borrowed’ from the oxidoreductases, but
instead novel topologies have arisen during the
evolution of these enzymes. As will be discussed below,
this tendency for unusual reactions to call for unusual
folds is also found in FAD-dependent enzymes.
Fig. 3. Bar plot of the distribution of structural clans (according to
the PFAM classification) in FMN-dependent (A) and FAD-dependent
(B) flavoproteins.
[Clans on the x-axis of (A): TIM_barrel, Flavoprotein, FMN-binding, Bac_luciferase, Nitroreductase, "diverse" Flavoprotein, FAD_Lum_binding, Chorismate_syn, FAD_PCMH, Glutaminase_I, NADP_Rossmann, Oxidored_q6, PAS, PK_C. Clans on the x-axis of (B): NADP_Rossmann, FAD_PCMH, FAD_Lum_binding, acyl-CoA_dh, FAD_DHS, PAS, Flavoprotein, FMN-binding, FAD_oxidored, DNA photolyases, 4Fe_4S, bluF, "diverse" Flavoprotein, ERO1, Erv1_Alr, FCSD-flav_bind, Thy1. The y-axes give the number of entries.]
Fig. 4. Graphical representation of the two most common structural
clans for FMN-dependent (A, B) and FAD-dependent (C, D)
proteins. The examples show the structures of (A) flavodoxin from
Desulfovibrio vulgaris (PDB entry 1fx1), (B) old yellow enzyme
from Sa. cerevisiae (PDB entry 1oyc), (C) glutathione disulfide
reductase (PDB entry 3grs), and (D) UDP-N-acetylmuramate
dehydrogenase (PDB entry 1mbt), representing the clans Flavoprotein,
TIM_barrel, NADP_Rossmann, and FAD_PCMH, respectively. The
structure representations were generated with PYMOL.
The topologies found for FAD binding are
dominated by the Rossmann fold or variations thereof,
contained in the clan NADP_Rossmann (Fig. 4C) [56].
This structure clan comprises a large number of
families (148), with nine families reported to serve for
FAD binding. Almost half of the FAD-dependent pro-
teins exhibit a fold in this clan (Fig. 3, bottom panel).
Second to the clan NADP_Rossmann is the clan
FAD_PCMH (two families; for a graphical example,
see Fig. 4D), followed by the clan FAD_Lum_binding
(five families) and the clan Acyl-CoA_dh (four fami-
lies). Together, the structures found in these four clans
account for 75% of all FAD-dependent proteins. The
clans that are rare appear to occur predominantly in
proteins with special biological functions, such as
light-dependent DNA repair (deoxyribodipyrimidine
photolyase, EC 4.1.99.3), oxidoreductase activity in the
endoplasmic reticulum (ERO1), or electron transfer
from acyl-CoA dehydrogenases to the electron trans-
port chain (clan 4Fe–4S). As discussed above for
FMN-dependent proteins, this observation suggests
that employment of FAD-dependent enzymes for novel
or unusual functions requires the adaptation of already
existing topologies and, in some cases, new structural
designs to fulfill the desired role.
The majority of covalently bound flavins are present
as FAD rather than FMN (Table S2). Interestingly,
covalent attachment of FAD occurs only in the
two most abundant clans, NADP_Rossmann and
FAD_PCMH, and is almost equally distributed between
these two clans (Table S2). Several families in the clan
NADP_Rossmann are associated with covalent FAD
linkage (DAO, GMC_oxred_N, FAD_binding_2,
Amino_oxidase, and Trp_halogenase). This is in con-
trast to the clan FAD_PCMH, where covalent linkage is
found in the family FAD_binding_4 but not in the family
FAD_binding_5, which comprises FAD-containing
and molybdopterin-containing enzymes, such as xanthine
oxidase (EC 1.1.3.22) and quinoline-2-oxidoreductase
(EC 1.3.99.17), to mention only two representatives
of this family (Table S1). Covalent linkage is highly
prevalent in the family FAD_binding_4: 11 of the 14
structures reported for this family show monocovalent
or bicovalent flavin attachment, with UDP-N-acetylmuramate
dehydrogenase (EC 1.1.1.158), D-lactate
dehydrogenase (EC 1.1.1.28) and alkyldihydroxyacetone
phosphate synthase (EC 2.5.1.26) being the only
exceptions (Table S2).
Impact of structural genomics consortia
Several structural genomics projects on prokaryotic
and eukaryotic species have been initiated, in order to
define the structures of expressed proteins in the target
organism. A total of 173 entries (86 for FMN-utilizing
proteins and 87 for FAD-utilizing proteins) have
been deposited by structural genomics consortia since
1999, amounting to 10% of the total entries (~1800
entries; 640 for FMN-utilizing proteins and 1160 for
FAD-utilizing proteins). Analysis of the structural
classification for FMN-dependent proteins reveals a
strong bias towards the clan Nitroreductase, with
a total of 27 entries (~31%). As this clan has only a
moderate frequency among FMN-dependent proteins
(Fig. 3, top panel), this overrepresentation suggests
that this type of structure is favored by the methodologies
currently used in structural genomics pipelines.
The aim of the consortia to elucidate the structures of
as many different proteins as possible also leads to a
serious lack of biochemical information, which renders
some of the PDB entries difficult to interpret in terms
of the biological function of the flavoprotein. On the
other hand, several structures of new flavoproteins
with unknown roles have been contributed by structural
genomics initiatives. For example, a zinc-dependent
protease from Bacteroides thetaiotaomicron (clan
Glutaminase_I, family DJ-1/PfpI, PDB entry 3cne)
and protein structures with a fold similar to the C-terminal
domain of pyruvate kinase in the archaeons
Archaeoglobus fulgidus and Methanobacterium
thermoautotrophicum were recently deposited in the PDB
(clan PK_C, PDB entries 1vp8 and 1t57). However,
the role of the FMN cofactor in these two proteins is
unclear. In the putative protease, the flavin isoalloxazine
ring is sandwiched by two tryptophans at the
interface of the dimeric protein, with the edge of the
pyrimidine ring moiety at a distance of 15 Å from the
presumably catalytic mononuclear zinc center. Hence,
the flavin does not appear to play a role in catalysis,
but may instead be involved in dimerization of the
protein or act as a gate for potential substrates to
enter the active site. On the other hand, the flavin in
the pyruvate kinase fold in archaeons is located in a
central cavity of the protein, and engages in hydrogen
bond interactions with several amino acid side chains.
In this case, it seems plausible that the flavin plays a
catalytic role, albeit in a type of fold that has not
previously been implicated in flavoenzyme catalysis.
Furthermore, an FMN-dependent oxidoreductase
from Thermotoga maritima was the first structure of a
flavin-dependent tRNA dihydrouridine synthase
(clan TIM_barrel, family Dus; PDB entry 1vhn),
an enzyme that has recently been characterized
biochemically [60].
In the case of FAD, the entries provided by struc-
tural genomics consortia reflect the predominance of
the clan NADP_Rossmann, with 44 of 87 entries
belonging to this clan. Interestingly, several new structural
families for FAD-dependent proteins were
defined in the course of structural genomics efforts,
such as the BLUF domain of blue light sensors in
cyanobacteria (PDB entry 1x0p), the glucose-inhibited division
protein A (GidA) domain in the clan NADP_Rossmann,
the HI0933-like proteins (first discovered in target 0933
from Haemophilus influenzae, PDB entry 2gqf), and a
siderophore-interacting protein (family FAD_binding_9
in the clan FAD_Lum_binding). In addition, a
novel covalent attachment between a side chain carboxylate
group of an aspartate and the 8α-position of
the isoalloxazine system was discovered in an
FAD-dependent halogenase involved in chloramphenicol
biosynthesis in Streptomyces venezuelae [47]. As noted
before, this structural information provides interesting
leads for biochemists to follow up and subject these
proteins to thorough biochemical characterization in
order to reveal their cellular role.
Flavogenomics – occurrence and
distribution of flavoproteins in
prokaryotes and eukaryotes
Despite the availability of genomic sequence information,
it proved difficult to obtain reliable information
on the occurrence of flavoproteins encoded in the
genomes of various organisms. This is mostly because
of the lack of information on whether a flavin (FMN
and/or FAD) cofactor is present and the precise
biochemical reaction catalyzed by the enzyme. On the
other hand, it is doubtful that all, or even most, of the
proteins predicted by genomics will ever be subjected
to a detailed characterization that would enable accurate
functional assignment of a putative flavoenzyme.
For most of the species analyzed, we used the annota-
tions provided by the responsible sequencing facility,
and included only those entries that gave a clear
indication of flavin dependence (see Methods). This
approach probably leads to an underestimation of the
number of flavoproteins, as many ‘hypothetical’ or
‘putative’ proteins may be flavin-dependent but are not
annotated as such. An interesting alternative to use of
the existing annotations is the analysis of predicted
protein families as provided by the Broad Institute
for Neurospora crassa
(http://www.broadinstitute.org/annotation/genome/neurospora/Pfam.html)
and on the tuberculosis research platform for Mycobacterium
tuberculosis and Streptomyces coelicolor
(http://www.tbdb.org/). Therefore, we have also used our set of
structural families (Table S3) to search for proteins
predicted in the above-mentioned species. In the case
of M. tuberculosis, a parallel analysis of the available
genome annotation was conducted. The ‘structural
family approach’ has generated a significantly higher
number of predicted flavoproteins (141 versus 113), as
many hypothetical proteins are found in protein
families that are typical or even specific for flavopro-
teins (e.g. FAD_binding_4 or NPD) and hence were
included as predicted flavin-dependent proteins. The
disadvantage of this more ‘inclusive’ analysis is that
some of the protein families, such as PAS_3, are not
specific for flavin and may utilize other cofactors (e.g.
heme). In any case, the task of eliminating the false
positives and false negatives inherent in both
approaches can only be performed by biochemical
characterization of predicted and suspected flavopro-
teins. To this end, structural genomics may also play
an important role; however, flavoenzymes that do not
hold on tightly to the flavin cofactor (e.g. chorismate
synthase) or use it only transiently during catalysis
(e.g. hydroxypropylphosphonic acid epoxidase) may
elude identification as flavin-dependent proteins.
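The ‘structural family approach’ described above amounts to comparing the protein families assigned to each predicted gene product with a set of flavin-associated families. The Python sketch below illustrates that idea under stated assumptions: only families named in the text are used (the complete set is in Table S3 and is not reproduced here), and the gene identifiers and their family assignments are invented for illustration.

```python
# Families named in the text. FAD_binding_4 and NPD are described as typical
# or specific for flavoproteins; PAS_3 is described as not flavin-specific.
FLAVIN_SPECIFIC = {"FAD_binding_4", "NPD"}
FLAVIN_AMBIGUOUS = {"PAS_3"}  # may bind other cofactors, e.g. heme


def classify(gene_families):
    """Map each gene id to 'predicted', 'ambiguous' or 'none'."""
    result = {}
    for gene, families in gene_families.items():
        if families & FLAVIN_SPECIFIC:
            result[gene] = "predicted"   # at least one flavin-specific family
        elif families & FLAVIN_AMBIGUOUS:
            result[gene] = "ambiguous"   # candidate that needs biochemical follow-up
        else:
            result[gene] = "none"
    return result


# Hypothetical annotations, invented for illustration only.
example = {
    "gene_0001": {"FAD_binding_4", "BBE"},
    "gene_0002": {"PAS_3"},
    "gene_0003": {"ABC_tran"},
}
calls = classify(example)
for gene, call in calls.items():
    print(gene, call)
```

Keeping the ambiguous families in a separate category makes the source of likely false positives explicit, which reflects the caveat raised in the text for families such as PAS_3.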
Although it is presently not possible to determine
the exact number of flavoproteins, our analysis
has revealed striking differences in the utilization of
flavin-dependent proteins in various prokaryotic and
eukaryotic species, which are reflected both by the
total number and the percentage of genes encoding
flavoproteins (Fig. 5). Several species appear to have a
minimum number of flavin-dependent proteins that are
required to maintain basic metabolic functions, such as
succinate dehydrogenase, which is necessary for primary
energy metabolism, and chorismate synthase and
acetolactate synthase, which are necessary for amino
acid biosynthesis. Examples of species with a minimal
set of enzymes are Pyrococcus abyssi, T. maritima, and
Saccharomyces cerevisiae (with 12, 12 and 48 entries,
respectively). On the other hand, organisms such as
M. tuberculosis, Neurospora crassa, S. coelicolor and
Arabidopsis thaliana contain a relatively large number
of genes encoding flavin-dependent proteins. In these
cases, flavoenzymes are apparently involved in a
species-specific lifestyle that requires a much larger set
of flavoenzymes than is needed by the ‘flavin minimalists’
mentioned before. Closer inspection of the set
of flavoenzymes in these organisms reveals multiple
copies of one or several types of flavin-dependent proteins. In
order to estimate this redundancy of a ‘flavogenome’,
we have defined the quotient of the number of distinct
flavin-dependent proteins (i.e. with different EC
numbers) and the total number of flavin-dependent
proteins as a ‘redundancy’ index (RI) (RI = 1 indicates
a nonredundant flavogenome, whereas RI < 1
indicates increasing redundancy; Fig. 5C). In the case
of M. tuberculosis, 34 genes encoding acyl-CoA
dehydrogenases and 10–15 genes encoding
flavin-containing monooxygenases and oxidoreductases give
rise to high redundancy (RI = 0.55; Fig. 5C). The
occurrence of this many acyl-CoA dehydrogenases is
apparently related to the extensive and complex utiliza-
tion of lipids from host cells by this pathogenic bacte-
rium [61]. A large number of genes encoding acyl-CoA
dehydrogenases is also found in S. coelicolor, and is
only exceeded by putative flavin-dependent oxidore-
ductases, with 57 predicted genes. Again, the
abundance of these flavoenzymes can be rationalized
on the basis of the lifestyle: S. coelicolor is a rather
immobile soil bacterium that can adapt to various car-
bon and nitrogen sources and produces a large number
of biologically active compounds, such as antibiotics.
In other words, the organism depends on metabolic
power and versatility that are certainly conferred to
some degree by flavin-dependent enzymes. In contrast
to M. tuberculosis and S. coelicolor, N. crassa has
apparently pursued a different metabolic strategy by
using a broader array of flavoenzymes rather than a
highly similar set, as indicated by the rather high RI
(0.74 versus 0.5 and 0.55 for S. coelicolor and
M. tuberculosis, respectively). As a result, N. crassa
contains more than 100 different flavoproteins, more
than any other species analyzed in our study. The large
number of flavoenzymes in this filamentous fungus
may be attributable to diverse biosynthetic routes
leading to secondary metabolites, as well as the sapro-
trophic lifestyle, which requires the generation and
secretion of oxidases and dehydrogenases to access
organic matter in the environment. In this context, it is
noteworthy that the protein family FAD_binding_4
constitutes the largest group among the predicted puta-
tive flavoenzymes in this species. Members of this fam-
ily are typically oxidases that are capable of
performing a wide range of substrate (e.g. sugars and
alcohols) oxidation reactions [48].
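The redundancy index defined above is the number of distinct EC numbers divided by the total number of flavin-dependent proteins. A minimal Python sketch of this quotient follows. The toy genome is hypothetical and only reuses EC numbers that appear elsewhere in this article; how entries without a complete EC number were treated in the actual analysis is not stated, so the sketch simply assumes that every protein carries one.

```python
def redundancy_index(ec_numbers):
    """RI = number of distinct EC numbers / total number of flavoproteins.

    `ec_numbers` holds one EC number per predicted flavoprotein, so that
    duplicates represent redundant copies of the same enzyme type.
    """
    if not ec_numbers:
        raise ValueError("at least one flavoprotein is required")
    return len(set(ec_numbers)) / len(ec_numbers)


# Hypothetical toy genome: four copies of one enzyme type plus four
# single-copy enzymes (eight flavoproteins, five distinct EC numbers).
toy_genome = ["1.14.13.2"] * 4 + ["1.8.1.7", "1.5.1.29", "1.7.1.6", "4.1.99.3"]
ri = redundancy_index(toy_genome)
print(f"RI = {ri:.3f}")
```

A genome in which every flavoprotein has a different EC number yields RI = 1, and each additional copy of an existing enzyme type lowers the value, in line with the interpretation given in the text.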
The flavogenome of the model plant A. thaliana is
the most prolific among the analyzed genomes. This is
mostly because of the occurrence of two large groups
of flavoproteins, monooxygenases and oxidases of the
(S)-tetrahydroprotoberberine oxidase/berberine bridge
enzyme family, with 31 and 26 members, respectively.
As previously discussed for microbial genomes, the
large number of enzymes in these two flavoprotein
families is a reflection of the diversity of metabolic
processes employed to synthesize a vast array of
bioactive compounds. In the case of plants, natural products
such as alkaloids and terpenes are among the compounds
synthesized for signaling and defense purposes.
Several members of the berberine bridge enzyme family
are implicated in plant metabolism, such as
(S)-tetrahydroprotoberberine oxidase, nectarin V [62], and
pollen allergen proteins [63]. Therefore, it can be expected
that most of the flavoproteins occurring in these two
groups will catalyze distinct reactions on various
different substrates.
Fig. 5. Occurrence and distribution of flavoproteins in 22 selected
genomes. (A) The number of genes encoding flavin-dependent
proteins in the genomes of My. genitalium, Ar. fulgidus, Me. jannaschii,
P. abyssi, Pl. falciparum, To. gondii, Sa. cerevisiae, N. crassa,
A. thaliana, D. melanogaster and Homo sapiens. (B) The numbers
of predicted flavoproteins as percentages of the total proteins for
the species in (A). (C) The RIs of flavoproteins in these genomes.
Yellow bars indicate genomes with low redundancy, and brown
bars indicate genomes with high redundancy.
[Axes: the x-axis of each panel lists the 22 species My. genitalium, Ar. fulgidus, Me. jannaschii, P. abyssi, B. subtilis, C. trachomatis, De. radiodurans, E. coli, H. pylori, M. tuberculosis, Ps. aeruginosa, St. aureus, S. coelicolor, T. maritima, V. fischeri, Pl. falciparum, To. gondii, N. crassa, Sa. cerevisiae, A. thaliana, D. melanogaster and Ho. sapiens; the y-axes show the number of genes (A, 0–250), the % of genes (B, 0–3.50) and the redundancy (C, 0–1).]
The RI seems to be a useful tool with which to iden-
tify organisms that have a ‘flavin-dependent’ lifestyle
because of their high demand for chemically complex
biomolecules, and which are thus potentially vulnerable
to inhibitors of riboflavin biosynthesis and/or
uptake [64–66]. Although it is apparent that major spe-
cies-specific differences exist, the currently estimated
RIs are probably too low for several species, owing to
the lack of biochemical knowledge of the enzymes in
the most common flavoprotein families. Hence, future
efforts to define the flavoprotein arsenal of an organ-
ism have to focus on three aspects: to capture all true
flavin-dependent proteins, to eliminate false positives,
and to characterize the flavoproteins biochemically in
order to classify them accurately. As a significant first
step, it would be useful to conduct an HMMER analy-
sis [67] of the existing genomes to provide a list of
potential flavoproteins, to enable scientists to target
specifically these putative genes for biochemical and
structural studies.
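As a rough illustration of the HMMER-based screen suggested above, the following Python sketch filters the tabular output written by `hmmsearch --tblout` and lists, for each protein, the profiles it matched below an E-value cutoff. It is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `parse_tblout`, the cutoff of 1e-5, and all protein and family identifiers in the example are hypothetical placeholders.

```python
from collections import defaultdict


def parse_tblout(text, max_evalue=1e-5):
    """Collect candidate flavoproteins from hmmsearch --tblout text.

    Returns {target protein: sorted list of matching profile names}.
    The cutoff is an assumed value, not one taken from the article.
    """
    hits = defaultdict(set)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        # Columns: 0 = target name, 2 = query (profile) name,
        # 4 = full-sequence E-value.
        target, query, evalue = fields[0], fields[2], float(fields[4])
        if evalue <= max_evalue:
            hits[target].add(query)
    return {target: sorted(queries) for target, queries in hits.items()}


# Hypothetical example output; identifiers and scores are invented.
EXAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
protA - famX PF_X 1.2e-40 140.1 0.1 2.0e-40 139.5 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein
protB - famX PF_X 0.5 10.2 0.0 0.9 9.0 0.0 1.0 1 0 0 1 1 1 1 hypothetical protein
protA - famY PF_Y 3.0e-12 45.0 0.0 4.0e-12 44.6 0.0 1.0 1 0 0 1 1 1 1 hypothetical protein
"""

candidates = parse_tblout(EXAMPLE)
```

Hits obtained in this way would only be candidates; as noted above, false positives still have to be eliminated and the proteins characterized biochemically.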
Methods
Flavoproteins from different species were identified by
screening pertinent databases. Microbial genomes were
analyzed by screening the databases provided by the J. Craig
Venter Institute (Ar. fulgidus DSM4304, Bacillus subtilis 168,
Chlamydia trachomatis serovar D, Deinococcus radiodurans R1,
Escherichia coli K-12, Helicobacter pylori 26695,
Methanocaldococcus jannaschii, M. tuberculosis CDC1551,
Mycoplasma genitalium G-37, Pseudomonas aeruginosa PAO,
P. abyssi, Staphylococcus aureus MW2, T. maritima, and
Vibrio fischeri ES114). Putative flavoproteins in N. crassa
were retrieved by a web-based analysis of the known
flavin-dependent protein families listed in Table S1 on
http://www.broadinstitute.org/annotation/genome/neurospora/MultiHome.html.
Flavoproteins in the yeast Sa. cerevisiae
were identified with the annotations available on the yeast
genome website at http://www.yeastgenome.org. A similar
approach was used for M. tuberculosis and S. coelicolor A3(2)
(http://www.tbdb.org/). Information on flavoproteins
in the human parasites Plasmodium falciparum and
Toxoplasma gondii was retrieved by inspection of
http://plasmodb.org/plasmo/ and http://toxodb.org/toxo,
respectively. Flavoproteins from A. thaliana were retrieved by a
keyword and protein name search (flavin, FMN, FAD,
dioxygenase, monooxygenase, hydroxylase, and the individual
names of all flavoproteins listed in Table S1) in the
ARabidopsis Gene EXpression Database (AREX)
(http://www.arexdb.org/index.jsp). Analysis of flavoproteins in
Drosophila melanogaster was based on a search in
http://flybase.org/ and http://www.brenda-enzymes.org. Human
flavoproteins were identified by a text search with the
enzyme names from Table S1 in the Online Mendelian
Inheritance in Man (OMIM) database
(http://www.ncbi.nlm.nih.gov/omim).
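The keyword search described for A. thaliana can be sketched in a few lines of Python. This is an illustrative sketch only: the keyword list is taken from the text above, but the function names, the matching rules (acronyms as whole words, other terms as case-insensitive substrings) and the example annotations are assumptions made here, and the protein names from Table S1 would have to be appended to the list.

```python
import re

# Keywords from the Methods text; names from Table S1 are not included here.
KEYWORDS = ["flavin", "FMN", "FAD", "dioxygenase", "monooxygenase", "hydroxylase"]


def keyword_hits(annotation, keywords=KEYWORDS):
    """Return the keywords found in one annotation string."""
    hits = []
    for kw in keywords:
        if kw.isupper():
            # Acronyms: case-sensitive whole-word match, so "FADS" is ignored.
            pattern, flags = r"\b" + re.escape(kw) + r"\b", 0
        else:
            # Other terms: case-insensitive substring ("Riboflavin" matches).
            pattern, flags = re.escape(kw), re.IGNORECASE
        if re.search(pattern, annotation, flags):
            hits.append(kw)
    return hits


def screen(annotations, keywords=KEYWORDS):
    """Map gene id -> matched keywords, keeping only genes with a match."""
    selected = {}
    for gene_id, text in annotations.items():
        hits = keyword_hits(text, keywords)
        if hits:
            selected[gene_id] = hits
    return selected


# Hypothetical annotations, invented for illustration.
EXAMPLE = {
    "gene1": "FAD-dependent oxidoreductase",
    "gene2": "Riboflavin synthase alpha chain",
    "gene3": "fatty acid desaturase (FADS-like)",
    "gene4": "putative phenol hydroxylase",
}

selected = screen(EXAMPLE)
```

Broad terms such as dioxygenase or hydroxylase will also retrieve enzymes that do not depend on flavin, so a list produced this way needs manual inspection before any protein is counted as a flavoprotein.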
References
1 Bacher A, Eberhardt S, Fischer M, Kis K & Richter G
(2000) Biosynthesis of vitamin B2 (riboflavin). Annu Rev
Nutr 20, 153–167.
2 Fischer M & Bacher A (2008) Biosynthesis of vita-
min B2: structure and mechanism of riboflavin syn-
thase. Arch Biochem Biophys 474, 252–265.
3 Efimov I, Kuusk V, Zhang X & McIntire WS (1998)
Proposed steady-state kinetic mechanism for Corynebac-
terium ammoniagenes FAD synthetase produced by
Escherichia coli. Biochemistry 37, 9716–9723.
4 Manstein DJ & Pai EF (1986) Purification and charac-
terization of FAD synthetase from Brevibacterium am-
moniagenes. J Biol Chem 261, 16169–16173.
5 Wu M, Repetto B, Glerum DM & Tzagoloff A
(1995) Cloning and characterization of FAD1, the
structural gene for flavin adenine dinucleotide synthe-
tase of Saccharomyces cerevisiae. Mol Cell Biol 15,
264–271.
6 Massey V (1994) Activation of molecular oxygen by
flavins and flavoproteins. J Biol Chem 269, 22459–
22462.
7 Ghisla S & Thorpe C (2004) Acyl-CoA dehydrogenases.
A mechanistic overview. Eur J Biochem 271, 494–508.
8 Fitzpatrick PF (2010) Oxidation of amines by flavopro-
teins. Arch Biochem Biophys 493, 13–25.
9 Fass D (2008) The Erv family of sulfhydryl oxidases.
Biochim Biophys Acta 1783, 557–566.
10 Vrielink A & Ghisla S (2009) Cholesterol oxidase:
biochemistry and structural features. FEBS J 276,
6826–6843.
11 Ellis HR (2010) The FMN-dependent two-component
monooxygenase systems. Arch Biochem Biophys 497,
1–12.
12 Palfey BA & McDonald CA (2010) Control of catalysis
in flavin-dependent monooxygenases. Arch Biochem Bio-
phys 493, 26–36.
13 van Berkel WJ, Kamerbeek NM & Fraaije MW (2006)
Flavoprotein monooxygenases, a diverse class of oxida-
tive biocatalysts. J Biotechnol 124, 670–689.
14 Anderson JL & Chapman SK (2006) Molecular mecha-
nisms of enzyme-catalysed halogenation. Mol BioSyst 2,
350–357.
15 Blasiak LC & Drennan CL (2009) Structural perspec-
tive on enzymatic halogenation. Acc Chem Res 42,
147–155.
16 van Pee KH, Dong C, Flecks S, Naismith J, Patallo EP
& Wage T (2006) Biological halogenation has moved
far beyond haloperoxidases. Adv Appl Microbiol 59,
127–157.
17 Argyrou A & Blanchard JS (2004) Flavoprotein disul-
fide reductases: advances in chemistry and function.
Prog Nucleic Acid Res Mol Biol 78, 89–142.
18 Demarsy E & Fankhauser C (2009) Higher plants use
LOV to perceive blue light. Curr Opin Plant Biol 12,
69–74.
19 Gomelsky M & Klug G (2002) BLUF: a novel FAD-
binding domain involved in sensory transduction in
microorganisms. Trends Biochem Sci 27, 497–500.
20 Kavakli IH & Sancar A (2002) Circadian photorecep-
tion in humans and mice. Mol Interv 2, 484–492.
21 Lin C & Todo T (2005) The cryptochromes. Genome
Biol 6, 220.
22 Losi A & Gartner W (2011) Old chromophores, new
photoactivation paradigms, trendy applications:
flavins in blue light-sensing photoreceptors. Photochem
Photobiol 87, 491–510.
23 Ozturk N, Song SH, Ozgur S, Selby CP, Morrison L,
Partch C, Zhong D & Sancar A (2007) Structure and
function of animal cryptochromes. Cold Spring Harbor
Symp Quant Biol 72, 119–131.
24 Braatsch S, Gomelsky M, Kuphal S & Klug G (2002)
A single flavoprotein, AppA, integrates both redox and
light signals in Rhodobacter sphaeroides. Mol Microbiol
45, 827–836.
25 Macheroux P, Hill S, Austin S, Eydmann T, Jones T,
Kim SO, Poole R & Dixon R (1998) Electron donation
to the flavoprotein NifL, a redox-sensing transcriptional
regulator. Biochem J 332, 413–419.
26 Ghisla S & Massey V (1989) Mechanisms of flavopro-
tein-catalyzed reactions. Eur J Biochem 181, 1–17.
27 Mansoorabadi SO, Thibodeaux CJ & Liu HW (2007)
The diverse roles of flavin coenzymes – nature’s most
versatile thespians. J Org Chem 72, 6329–6342.
28 Mathews FS, Cunane L & Durley RC (2000) Flavin
electron transfer proteins. Subcell Biochem 35, 29–72.
29 Miura R (2001) Versatility and specificity in flavoen-
zymes: control mechanisms of flavin reactivity. Chem
Rec 1, 183–194.
30 Vervoort J & Rietjens IM (1996) Unifying concepts in
flavin-dependent catalysis. Biochem Soc Trans 24, 127–
130.
31 Fagan RL & Palfey BA (2010) Flavin-dependent
enzymes. In Comprehensive Natural Products II (Begley
TP, ed.), pp. 37–114. Elsevier, Amsterdam.
32 Joosten V & van Berkel WJ (2007) Flavoenzymes. Curr
Opin Chem Biol 11, 195–202.
33 Johnson DA, Gassner GT, Bandarian V, Ruzicka FJ,
Ballou DP, Reed GH & Liu HW (1996) Kinetic charac-
terization of an organic radical in the ascarylose biosyn-
thetic pathway. Biochemistry 35, 15846–15856.
34 Cecchini G, Schroder I, Gunsalus RP & Maklashina E
(2002) Succinate dehydrogenase and fumarate reductase
from Escherichia coli. Biochim Biophys Acta 1553,
140–157.
35 Vanoni MA, Dossena L, van den Heuvel RH & Curti
B (2005) Structure–function studies on the complex
iron–sulfur flavoprotein glutamate synthase: the key
enzyme of ammonia assimilation. Photosynth Res 83,
219–238.
36 Mowat CG, Gazur B, Campbell LP & Chapman SK
(2010) Flavin-containing heme enzymes. Arch Biochem
Biophys 493, 37–52.
37 Garattini E, Mendel R, Romao MJ, Wright R & Terao
M (2003) Mammalian molybdo-flavoenzymes, an
expanding family of proteins: structure, genetics, regula-
tion, function and pathophysiology. Biochem J 372,
15–32.
38 Tittmann K (2009) Reaction mechanisms of thiamine
diphosphate enzymes: redox reactions. FEBS J 276,
2454–2468.
39 Warburg O & Christian W (1932) Ein zweites
Sauerstoff-übertragendes Ferment und sein
Absorptionsspektrum. Naturwissenschaften 20, 688.
40 Bateman A, Coin L, Durbin R, Finn RD, Hollich V,
Griffiths-Jones S, Khanna A, Marshall M, Moxon S,
Sonnhammer EL et al. (2004) The Pfam protein families
database. Nucleic Acids Res 32, D138–D141.
41 Grininger M, Staudt H, Johansson P, Wachtveitl J &
Oesterhelt D (2009) Dodecin is the key player in flavin
homeostasis of archaea. J Biol Chem 284, 13068–13076.
42 Monaco HL (1997) Crystal structure of chicken ribofla-
vin-binding protein. EMBO J 16, 1475–1483.
43 Yamamoto S, Inoue K, Ohta KY, Fukatsu R, Maeda
JY, Yoshida Y & Yuasa H (2009) Identification
and functional characterization of rat riboflavin
transporter 2. J Biochem 145, 437–443.
44 Mewies M, McIntire WS & Scrutton NS (1998) Cova-
lent attachment of flavin adenine dinucleotide (FAD)
and flavin mononucleotide (FMN) to enzymes: the cur-
rent state of affairs. Protein Sci 7, 7–20.
45 Li YS, Ho JY, Huang CC, Lyu SY, Lee CY, Huang
YT, Wu CJ, Chan HC, Huang CJ, Hsu NS et al. (2007)
A unique flavin mononucleotide-linked primary alcohol
oxidase for glycopeptide A40926 maturation. J Am
Chem Soc 129, 13384–13385.
46 Backiel J, Juarez O, Zagorevski DV, Wang Z, Nilges
MJ & Barquera B (2008) Covalent binding of flavins to
RnfG and RnfD in the Rnf complex from Vibrio chole-
rae. Biochemistry 47, 11273–11284.
47 Podzelinska K, Latimer R, Bhattacharya A, Vining LC,
Zechel DL & Jia Z (2010) Chloramphenicol biosynthesis:
the structure of CmlS, a flavin-dependent halogenase
showing a covalent flavin–aspartate bond. J Mol
Biol 397, 316–331.
48 Leferink NG, Heuts DP, Fraaije MW & van Berkel WJ
(2008) The growing VAO flavoprotein family. Arch
Biochem Biophys 474, 292–301.
49 Huang C-H, Lai W-L, Lee M-H, Chen C-J, Vasella A,
Tsai Y-C & Liaw S-H (2005) Crystal structure of glu-
cooligosaccharide oxidase from Acremonium strictum.
A novel flavinylation of 6-S-cysteinyl, 8a-N1-histidyl
FAD. J Biol Chem 280, 38831–38838.
50 Winkler A, Hartner F, Kutchan TM, Glieder A &
Macheroux P (2006) Biochemical evidence that berber-
ine bridge enzyme belongs to a novel family of flavo-
proteins containing a bi-covalently attached FAD
cofactor. J Biol Chem 281, 21276–21285.
51 Andersen RD, Apgar PA, Burnett RM, Darling GD,
Lequesne ME, Mayhew SG & Ludwig ML (1972)
Structure of the radical form of clostridial flavodoxin:
a new molecular model. Proc Natl Acad Sci USA 69,
3189–3191.
52 Watenpaugh KD, Sieker LC, Jensen LH, Legall J &
Dubourdieu M (1972) Structure of the oxidized form of
a flavodoxin at 2.5-Angstrom resolution: resolution of
the phase ambiguity by anomalous scattering. Proc Natl
Acad Sci USA 69, 3185–3188.
53 Schulz GE, Schirmer RH, Sachsenheimer W & Pai EF
(1978) The structure of the flavoenzyme glutathione
reductase. Nature 273, 120–124.
54 Wierenga RK, De Jong RJ, Kalk KH, Hol WGJ &
Drenth J (1979) Crystal structure of p-hydroxybenzoate
hydroxylase. J Mol Biol 131, 55–73.
55 De Colibus L & Mattevi A (2006) New frontiers in
structural flavoenzymology. Curr Opin Struct Biol 16,
722–728.
56 Dym O & Eisenberg D (2001) Sequence-structure
analysis of FAD-containing proteins. Protein Sci 10,
1712–1728.
57 Karplus PA, Fox KM & Massey V (1995) Flavoprotein
structure and mechanism. 8. Structure–function rela-
tions for old yellow enzyme. FASEB J 9, 1518–1526.
58 Massey V (1995) Introduction: flavoprotein structure
and mechanism. FASEB J 9, 473–475.
59 Percudani R & Peracchi A (2003) A genomic overview
of pyridoxal-phosphate-dependent enzymes. EMBO Rep
4, 850–854.
60 Rider LW, Ottosen MB, Gattis SG & Palfey BA (2009)
Mechanism of dihydrouridine synthase 2 from yeast
and the importance of modifications for efficient tRNA
reduction. J Biol Chem 284, 10324–10333.
61 Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C,
Harris D, Gordon SV, Eiglmeier K, Gas S, Barry CE
III et al. (1998) Deciphering the biology of Mycobacte-
rium tuberculosis from the complete genome sequence.
Nature 393, 537–544.
62 Carter CJ & Thornburg RW (2004) Tobacco nectarin V
is a flavin-containing berberine bridge enzyme-like pro-
tein with glucose oxidase activity. Plant Physiol 134,
460–469.
63 Liaw S, Lee DY, Chow LP, Lau GX & Su SN (2001)
Structural characterization of the 60-kDa bermuda
grass pollen isoallergens, a covalent flavoprotein. Bio-
chem Biophys Res Commun 280, 738–743.
64 Dutta P (1991) Enhanced uptake and metabolism of
riboflavin in erythrocytes infected with Plasmodium
falciparum. J Protozool 38, 479–483.
65 Dutta P, Pinto J & Rivlin R (1985) Antimalarial effects
of riboflavin deficiency. Lancet 2, 1040–1043.
66 Dutta P, Pinto J & Rivlin RS (1986) Malaria chemo-
therapy through interference of riboflavin metabolism.
Lancet 1, 679–680.
67 Marchin M, Kelly PT & Fang J (2005) Tracker: contin-
uous HMMER and BLAST searching. Bioinformatics
21, 388–389.
Supporting information
The following supplementary material is available:
Table S1. List of flavin-dependent proteins (FMN,
FAD, riboflavin and derivatives).
Table S2. Covalent attachment of FMN and FAD.
Table S3. Protein families (PFAM) used for structural
classification of flavin-dependent proteins.
This supplementary material can be found in the
online version of this article.
Please note: As a service to our authors and readers,
this journal provides supporting information supplied
by the authors. Such materials are peer-reviewed and
may be re-organized for online delivery, but are not
copy-edited or typeset. Technical support issues arising
from supporting information (other than missing files)
should be addressed to the authors.