A novel carbonic anhydrase from the giant clam Tridacna
gigas contains two carbonic anhydrase domains
William Leggat¹,², Ross Dixon¹, Said Saleh¹ and David Yellowlees¹
1 Biochemistry and Molecular Biology, James Cook University, Townsville, Queensland, Australia
2 Centre for Marine Studies, University of Queensland, Queensland, Australia
Carbonic anhydrase (CA; EC 4.2.1.1) catalyses the
hydration of CO₂ to HCO₃⁻, is ubiquitous amongst all
living organisms and fulfils a variety of metabolic roles
[1]. Currently there are five evolutionarily distinct CA
gene families (α, β, γ, δ and ε) [1,2] and it is generally
believed that all animal CAs belong to the α-CA family.
α-CAs are characterized by 36 amino acids found
around the active site [3]. Of these, 15 are conserved in
all active CAs, suggesting that they are required for
CA activity [3].
To date 15 α-CA or α-CA-like proteins have been
identified in mammals. These can be divided into five
broad subgroups, the cytosolic CAs (CA I, CA II,
CA III, CA VII and CA XIII), mitochondrial CAs
(CA VA and CA VB), secreted CAs (CA VI), mem-
brane-associated CAs (CA IV, CA IX, CA XII and
CA XIV) and those without CA activity, the
CA-related proteins (CA-RP VIII, X and XI). The
cytosolic and mitochondrial CAs and the secreted
and membrane-associated CAs are often further
Keywords
carbonic anhydrase; clam; symbiosis
Correspondence
W. Leggat, Centre for Marine Studies,
University of Queensland, Queensland
4072, Australia
Fax: +61 7 33654755
Tel: +61 7 33469576
E-mail: b.leggat@marine.uq.edu.au
Notes
The nucleotide sequences for carbonic
anhydrase from T. gigas have been
deposited in the GenBank database under
GenBank accession numbers AY790884 and
AY799986-AY799998.
The alignment for the genomic sequence of
tgCA between positions 101 and 1810 of
the cDNA has been submitted to
EMBL-ALIGN database under the accession
number ALIGN_000833.
(Received 14 February 2005, revised
11 April 2005, accepted 28 April 2005)
doi:10.1111/j.1742-4658.2005.04742.x
This report describes the presence of a unique dual domain carbonic
anhydrase (CA) in the giant clam, Tridacna gigas. CA plays an important
role in the movement of inorganic carbon (Cᵢ) from the surrounding seawater
to the symbiotic algae that are found within the clam’s tissue. One of
the clam’s CA isoforms is a glycoprotein which is significantly larger (70 kDa) than
any previously reported from animals (generally between 28 and 52 kDa).
This α-family CA contains two complete carbonic anhydrase domains
within the one protein, accounting for its large size; dual domain CAs have
previously only been reported from two algal species. The protein contains
a leader sequence, an N-terminal CA domain and a C-terminal CA
domain. The two CA domains have relatively little identity at the amino
acid level (29%). The genomic sequence spans in excess of 17 kb and contains
at least 12 introns and 13 exons. A number of these introns are in
positions that are only found in the membrane attached/secreted CAs. This
fact, along with phylogenetic analysis, suggests that this protein represents
the second example of a membrane attached invertebrate CA and that it contains
a dual domain structure unique amongst all animal CAs characterized
to date.
Abbreviations
CA, carbonic anhydrase; Cᵢ, inorganic carbon; GPI, glycosylphosphatidylinositol.
FEBS Journal 272 (2005) 3297–3305 © 2005 FEBS
consolidated into two separate groups based upon
sequence similarity [3]. Of the membrane-associated
CAs, CA IX, XII and XIV contain integral trans-
membrane domains while CA IV is membrane atta-
ched through a glycosylphosphatidylinositol (GPI)
anchor.
Mammalian CAs function in a variety of roles including
pH balance, H⁺ secretion, HCO₃⁻ secretion, CO₂
exchange, bone resorption and ion transport [1]. The
diversity in mammalian genes and the presence of
homologs in other animals suggest that a large number
of CAs are yet to be characterized from the animal
kingdom. Complete cDNA sequences from invertebrate
sources have only been obtained from the tubeworm
Riftia pachyptila [4], cnidarians [5] and
mosquitoes [6,7]. In addition, a cDNA sequence encoding
a protein involved in calcification from the pearl
oyster Pinctada fucata contains an α-CA-like domain
[8].
All animal CAs purified to date have subunit
molecular weights less than 58 kDa. The only exception
is a report of the purification of a protein displaying
CA activity from the giant clam Tridacna gigas,
where the purified protein has a molecular weight
of 70 kDa [9]. This protein is glycosylated through
both N- and O-links and is thought to be involved in
the transport of inorganic carbon (Cᵢ) from seawater
to symbiotic photosynthetic dinoflagellates that are
found intercellularly within the clam tissue [9,10]. It
has been demonstrated that these algae can supply
up to 100% of the clam’s energy requirements [11].
In addition, T. gigas contains two other CA isoforms,
one of 32 kDa and one of approximately 200 kDa
[9,10].
Here we present further characterization of the
70 kDa CA isoform from T. gigas, including cDNA
sequence indicating that it codes for a unique animal
CA containing two putative CA catalytic domains
within the one protein. Each domain contains those
residues thought to be essential for CA activity.
Results
tgCA cDNA sequence, deduced amino acid
sequence and domains
Only cDNA sequence for the first 1438 bp of the
70 kDa CA could be obtained from both the gill and
mantle cDNA libraries; both sequences were identical.
In both cases the sequence terminated in an identical
position (1438 bp after the start codon), suggesting that
the mRNA secondary structure prevented complete sec-
ond strand synthesis. For this reason cDNA sequence
was also obtained by RT-PCR, using a higher than
normal temperature for second strand synthesis. Using
RT-PCR the 3′ end of the cDNA was obtained, yielding
an open reading frame of 1803 bp encoding 600 amino
acids and a protein of 66.7 kDa (Fig. 1). This is flanked
by a 58 bp 5′-UTR and an 89 bp 3′-UTR. Although no
polyadenylation signal was found, a poly A tail was
sequenced. The translated protein sequence (tgCA) was
found to contain three domains, based upon database
searches: a signal sequence (Met1–Ala17) and two
domains with homology to the α-CA family, n-tgCA
(Ala18–Thr289) and c-tgCA (Ala290–Ser600) (Fig. 2).
The predicted cleavage point of the signal sequence,
between Ala17 and Ala18 (Centre for Biological
Sequence Analysis Database), produces a mature
N-terminal amino acid sequence almost identical (21
out of 22 residues) to an N-terminal peptide sequence
previously obtained [9], suggesting that this is the cor-
rect cleavage point for the signal sequence. A potential
GPI-modification site was identified at Gly577 by the
DPGI database.
Five consensus sites for N-glycosylation (NXS or
NXT) were found in the deduced amino acid sequence
at positions Asn66, Asn97, Asn177, Asn421 and Asn452.
Phylogenetic comparison of both CA domains with
a number of characterized human CA isoforms and
representative invertebrate CAs shows the clear grouping
of the three recognized α-CA groups, the cytosolic,
Fig. 1. Translated protein sequence of tgCA
from T. gigas. The first CA domain (n-tgCA)
begins at Ala18, the second CA domain
(c-tgCA) begins at Ala290 (bold). The signal
sequence is highlighted and the five poten-
tial glycosylation sites (three in n-tgCA and 2
in c-tgCA) are underlined. Gly577 (double
underline) is the predicted GPI-anchor site.
secreted/membrane-associated and the CA-related proteins
(Fig. 3). Both domains of the clam CA group
with the membrane-associated CAs.
Intron/exon mapping
There are a number of intron/exon locations that are
specific for the various CA classes [3,12]. With this in
mind the introns for tgCA were mapped to further
characterize the two CA domains. The genomic
sequence of tgCA, between positions 101 and 1810 of
the cDNA, was amplified in a number of PCR reactions
spanning in excess of 17 kb. These sequences
included the complete coding sequence for the gene
between positions 101 and 1810. Twelve introns and
13 exons were found in this region, five in n-tgCA, six
Fig. 2. Alignment of n-tgCA and c-tgCA with other CAs showing intron position and conserved motifs. The 15 amino acids thought to be
required for CA activity are indicated (#), while cysteine residues involved in disulfide bonding in CA IV (two disulfide bonds) and CA VI, XII,
XIV (one disulfide bond) are indicated (∧) above the alignment. Numbers below the alignment indicate the intron number while intron positions
are represented by: (/) intron between amino acids, (\) intron located after the first codon position of the following amino acid, (+)
intron located after the second codon position of the following amino acid, and (*) represents no intron present. Residues shared by more
than 50% of the CAs examined are shaded. The alignment was performed using CLUSTALW. Note that hCA1 contains an alanine at position
122 rather than the conserved Val122; however, the consensus for the CA-1 isoform from vertebrates is valine [3]. hCA, human CA; NCBI
accession numbers in brackets, hCA1 (NM_001738), hCA2 (NM_000067), hCA3 (NM_005181), hCA4 (NM_000717), hCA5 (NM_001739),
hCA6 (NM_001215), hCA7 (NM_005182), hCA8 (NM_004056), hCA12 (NP_001209), hCA14 (NP_036245).
in c-tgCA and one between the two domains (Fig. 2).
Unfortunately, despite repeated attempts it was not
possible to obtain genomic sequence corresponding to
the cDNA sequence prior to position 101; this may
have been due to the presence of
an extremely large intron. Alignment of the intron
positions with those of the human CAs shows that
both n-tgCA and c-tgCA share the majority of intron
locations with the secreted/membrane-associated CAs
(Fig. 2). All introns conformed to the gt-ag rule [13]
(Table 1). In three cases (introns 2, 7 and 11) multiple
sequences were obtained for introns, suggesting that
tgCA is a multiple copy gene; this was confirmed by
Southern analysis (data not shown). Only one base
pair of the genomic sequence differed from that previously
obtained for the cDNA sequence (1586C→T). Of
the 15.7 kb of intron in the gene, sequence data were
obtained for 9.8 kb. The GC content of the coding
sequence (44.5%) was significantly higher than that of
the introns (36.0%). In addition, microsatellite repeat
sequences were found in exon 5 (CAAA, 21 repeats)
and exon 7 (GTTT, 13 repeats) (data not shown).
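The gt-ag check and the intron total quoted in this section can be reproduced from the values in Table 1. The sketch below is illustrative only: the names `TABLE_1` and `obeys_gt_ag` are assumptions, the larger of the two sizes reported for intron 11 is used arbitrarily, and the genomic span is estimated simply as amplified cDNA length plus summed intron length.

```python
# Rows transcribed from Table 1: (intron number, size in bp, 5' donor, 3' acceptor).
# Only the lowercase (intronic) bases at each junction are kept.
# Intron 11 has two reported sizes (1300 and 1200 bp); 1300 is used here arbitrarily.
TABLE_1 = [
    (1, 1040, "gtaaac", "ttccag"),
    (2, 956, "gtaggt", "ttgtag"),
    (3, 1200, "gtgggt", "ttctag"),
    (4, 1830, "gttagt", "ttctag"),
    (5, 2280, "gtatat", "tttcag"),
    (6, 2000, "gttggg", "ttacag"),
    (7, 1380, "gtatgt", "ttacag"),
    (8, 950, "gtactt", "ctacag"),
    (9, 1100, "gtactg", "tttcag"),
    (10, 1370, "gtaagt", "tttcag"),
    (11, 1300, "gtatgt", "tttcag"),
    (12, 347, "gtaagt", "ttccag"),
]


def obeys_gt_ag(donor: str, acceptor: str) -> bool:
    """True if the intron starts with 'gt' and ends with 'ag'."""
    return donor.lower().startswith("gt") and acceptor.lower().endswith("ag")


violations = [n for n, _, donor, acceptor in TABLE_1 if not obeys_gt_ag(donor, acceptor)]
total_intron_bp = sum(size for _, size, _, _ in TABLE_1)
amplified_cdna_bp = 1810 - 101 + 1  # cDNA positions 101 to 1810, inclusive

print(violations)                           # → []
print(total_intron_bp)                      # → 15753
print(amplified_cdna_bp + total_intron_bp)  # → 17463
```

The total of 15 753 bp agrees with the 15.7 kb of intron stated in the text, and the estimated span of about 17.5 kb is consistent with "in excess of 17 kb"; several intron sizes are themselves estimates from PCR product size, so these figures are approximate.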
tgCA subunit size
In addition to the 70 kDa CA, another protein of
200 kDa with characteristics of CA was also identified
[9]. Although a minor component of the purified CA
fraction this protein had an identical N-terminal amino
acid sequence to the 70 kDa isoform [9]. When gel
purified and separated on an 8% SDS/PAGE gel it
was found that the apparent molecular mass of this
Fig. 3. Phylogeny of the CA domains of
tgCA (n-tgCA and c-tgCA) with representatives
of other CA classes using maximum
likelihood. Alignments were performed using
CLUSTALW, bootstrapped 1000 times and the
trees constructed using maximum likelihood.
hCA, human CA; ce, Caenorhabditis
elegans; Dr, Drosophila melanogaster; ae,
Anthopleura elegantissima; CAH1 from the
alga Chlamydomonas reinhardtii was used as
an outgroup.
Table 1. Exon/intron junctions of tgCA. No genomic sequence was obtained before the position corresponding to base pair 101 in the cDNA
sequence. Numbering of the cDNA sequence begins at the first codon position of the first in-frame methionine. The exact intron size is
given where known; estimates were made from the size of PCR products. Uppercase letters indicate coding sequence, while lowercase
letters indicate intron sequence.
Intron   cDNA codon position preceding intron (bp)   Intron size (bp)   5′ Splice donor   3′ Splice acceptor
1        265                                         1040               CACGG gtaaac      ttccag TGGTA
2        417                                         956ᵃ               TTGAG gtaggt      ttgtag GTACA
3        510                                         1200               TTGAG gtgggt      ttctag ATCGA
4        583                                         1830               TAAAA gttagt      ttctag ATGGA
5        744                                         2280               CTCAG gtatat      tttcag CTTGC
6        868                                         2000               TACAG gttggg      ttacag CTCAA
7        910                                         1380ᵃ              TCAAG gtatgt      ttacag GAGTG
8        1093                                        950                TACAA gtactt      ctacag TCCAA
9        1242                                        1100               TCGAG gtactg      tttcag CTACA
10       1335                                        1370               TTGAA gtaagt      tttcag ATCGG
11       1399                                        1300, 1200ᵃ        TAAAG gtatgt      tttcag ATGCC
12       1581                                        347                GTCAG gtaagt      ttccag TTGGT
ᵃ Two intron sequences were found for these introns.
protein is approximately 145 kDa, as opposed to
the 200 kDa suggested in [9]. In addition a protein of
70 kDa was also observed (Fig. 4A). When purified
CA containing the 32, 70 and 145 kDa isoforms was
separated in the presence of increasing concentrations
of the reducing agent 2-mercaptoethanol, it was found
that the 145 kDa isoform disappeared (Fig. 4B). These
two observations (the presence of the 70 kDa isoform
in gel purified 145 kDa extract, and the disappearance
of the 145 kDa isoform under reducing conditions), in
addition to the identical N-terminal amino acid sequences [9],
suggest that the 70 kDa isoform forms a homodimer of
145 kDa.
Separation of affinity purified CA by 2D-PAGE
shows that both the 32 and 70 kDa CAs have multiple
isoforms with identical masses but differing pI values
between 4 and 4.5 for the 32 kDa isoform and between
5.2 and 6.0 for the 70 kDa isoform (Fig. 5). The predicted
pI of the mature protein derived from the
cDNA sequence is 5.84.
Discussion
The data presented here represent the first example of
an animal protein containing two CA catalytic domains
within the one coding sequence. Only two other transcripts
have been found to contain duplicate CA domains that are
translated into the one peptide: one, from the green alga
Dunaliella salina, contains two α-CA domains, while the
second, from the red alga Porphyridium purpureum, contains
two β-CA repeats.
tgCA contains a 17 amino acid leader sequence, an
N-terminal CA domain of 272 amino acids and a
C-terminal domain of 311 amino acids. The cDNA
encodes a protein of 66.7 kDa; when the leader
sequence is removed the mature molecular mass is
reduced to 64.8 kDa, which is similar to the 62 kDa
molecular weight obtained for the deglycosylated protein
by SDS/PAGE [9]. Both CA domains
of tgCA contain all residues thought to be required for
CA activity [3], suggesting that both domains are catalytically
active. In addition, both domains contain a
histidine residue (His87 in n-tgCA, His363 in c-tgCA)
conserved with His64 in human CA2 (Fig. 2). This histidine
residue has been found to act as a proton shuttle
in CO₂ hydration in high activity CAs (reviewed in
[14,15]), supporting the notion that both CA domains
within this protein are catalytically functional. However,
this will have to be confirmed experimentally
through either mutational studies or the use of selective
inhibitors.
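The domain sizes quoted at the start of this paragraph follow directly from the residue ranges given in the Results, as the short check below shows. It is a sketch under stated assumptions: the variable names are illustrative, the 1803 bp reading frame is assumed to include the stop codon, and 110 Da per residue is a generic rule of thumb rather than a value taken from the paper.

```python
# Residue ranges reported for tgCA (1-based, inclusive).
REGIONS = {
    "signal": (1, 17),     # Met1-Ala17
    "n-tgCA": (18, 289),   # Ala18-Thr289
    "c-tgCA": (290, 600),  # Ala290-Ser600
}

lengths = {name: end - start + 1 for name, (start, end) in REGIONS.items()}
total_residues = sum(lengths.values())

# Assumption: the 1803 bp open reading frame counts the stop codon.
orf_bp = 3 * total_residues + 3

# Rule-of-thumb average residue mass (assumption, not a value from the paper).
AVG_RESIDUE_DA = 110
approx_kda = total_residues * AVG_RESIDUE_DA / 1000

print(lengths)                 # → {'signal': 17, 'n-tgCA': 272, 'c-tgCA': 311}
print(total_residues, orf_bp)  # → 600 1803
print(approx_kda)              # → 66.0
```

The rough 66 kDa estimate is close to the 66.7 kDa calculated from the cDNA; an exact mass would require the actual sequence shown in Fig. 1.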
The predicted cleavage position of the leader
sequence between Ala17 and Ala18 produces a mature
N-terminal protein sequence which is identical for the
first 21 amino acids to that obtained from N-terminal
sequencing of the purified protein. The putative signal
sequence for tgCA contains two potential in-frame
start codons (Met1 and Met3), similar to that found
in the membrane-associated human CA4 (Fig. 2). The
presence of this signal sequence is consistent with pre-
vious localization studies that have indicated extracel-
lular localization of this protein in both the mantle
and gills of giant clams [9,10].
Previous studies on the purified tgCA protein [9]
and immunolocalization results [9,10] suggest that
tgCA is a membrane CA, although its mode of attach-
ment was not clear. It was also found [9] that
Fig. 4. SDS/PAGE separation of the various 70 kDa CA isoforms.
(A) An 8% SDS/PAGE gel showing the presence of both the
145 kDa and 70 kDa CA isoforms in a purified extract of the
145 kDa isoform. Lane 1, purified 145 kDa CA isoform; lane 2, purified
70 kDa isoform. (B) Purified gill CA separated by SDS/PAGE in
the presence of increasing concentrations of 2-mercaptoethanol.
Lane 1, 0.25 M 2-mercaptoethanol; lane 2, 1.3 M 2-mercaptoethanol;
lane 3, 5.3 M 2-mercaptoethanol.
Fig. 5. Two-dimensional electrophoresis separation of affinity puri-
fied T. gigas CA. pI values are shown across the top while
molecular masses are shown on the right. Box 1 surrounds the
70 kDa isoform while box 2 surrounds the 32 kDa isoform.
phosphoinositol phospholipase C digestion did not
result in the release of tgCA from crude gill homogen-
ate, suggesting that it is not GPI-anchored and is
instead an integral membrane protein. However, analysis
of the predicted protein sequence presented here
does not seem to support this hypothesis, as evidence
of a putative transmembrane domain region is ambiguous
(data not shown). In addition the DPGI database
predicts that a GPI anchor may be present at Gly577
(it must be stated that the big-PI Predictor did not
identify a GPI-anchor for this protein). How tgCA is
associated with the membrane is still unknown and
requires further research, although the lack of a hydro-
phobic domain suggests that, as with human CA4, it is
attached through a GPI-anchor.
Perhaps surprisingly there is very little identity
between the two CA domains of this protein at either
the coding sequence (48%) or amino acid level (29%). This level
of identity is similar to that seen when comparing the
different domains of tgCA to human CAs. Of the vertebrate
and invertebrate CAs used for the tree construction
(Fig. 3), n-tgCA had the greatest identity at
the amino acid level with human CA2 and human
CA7 (32% for both) while c-tgCA was most similar to
human CA1 and human CA7 (28% identity for both).
The lowest identity was with the alga C. reinhardtii
CAH1 gene (13% for n-tgCA and 16% for c-tgCA).
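Identity percentages of this kind are computed from an alignment. The sketch below shows one common convention, counting identical residues over columns where neither sequence has a gap. The paper does not state which denominator was used, so this convention, the function name `percent_identity` and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over alignment columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment; these are NOT the n-tgCA and c-tgCA sequences.
print(percent_identity("ACDE", "ACDF"))  # → 75.0
print(round(percent_identity("ACD-EFGH", "ACDKE-GY"), 1))  # → 83.3
```

Other conventions (dividing by the full alignment length or by the shorter sequence) give different values, which matters when identities are as low as those reported here.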
Despite this greater identity with human cytosolic
CAs, phylogenetic analysis suggests that both the
N- and C-terminal domains belong to the secreted/membrane-associated
CA group (Fig. 3). Despite low
bootstrap values for the tree in general, the support
for the division between the cytosolic and secreted/membrane-associated
CAs is high (80%). This tree
clearly groups the human cytosolic CAs, the CA-like
proteins and secreted/membrane-associated CAs. Of
the invertebrate CAs, both the fly and anemone fall
within the cytosolic group while the two putative
Caenorhabditis elegans CAs (CAH1 and CAH2) group
with the CA-RP vertebrate genes. To our knowledge
there is only one published report of a cDNA coding
for an invertebrate membrane attached CA [7], indicating
that tgCA represents the second example of a
membrane-associated CA from the invertebrates.
The phylogenetic grouping of tgCA with the mem-
brane-associated CAs from vertebrates is supported by
a range of other properties of this protein including
the presence of a signal sequence and the presence of a
conserved disulfide bond. All CAs of this group have
been found to contain two conserved cysteine residues
involved in an intrachain disulfide bond (Fig. 2). The
tgCA sequence is no exception with each CA domain
containing two conserved cysteine residues (Cys41,
Cys229, Cys315, Cys508) that are homologous to those
found in all other CAs of this group (Fig. 2). The pres-
ence of disulfide bonds in tgCA is supported by chan-
ges in electrophoretic mobility when the purified
protein is subjected to varying levels of reducing agent.
In the presence of higher concentrations of 2-merca-
ptoethanol, the 70 kDa isoform migrates more slowly
(Fig. 4A). This is consistent with human CA4 which
shows a similar pattern in the presence of 5% 2-merca-
ptoethanol [16].
While four of the six cysteines in tgCA are implica-
ted in intrachain disulfide bonds, evidence suggests
that at least one of the remaining two cysteines forms
a disulfide bond with another tgCA subunit making a
dimer of the 70 kDa protein. Gel filtration experiments
had previously suggested that tgCA exists as a dimer,
with a native weight of approximately 141 kDa [9].
When a gel purified fraction of similar estimated molecular
mass (145 kDa) is separated by SDS/PAGE, the
70 kDa band is found in addition to the original
145 kDa protein. Upon addition of high concentrations
of the reducing agent 2-mercaptoethanol this
band (145 kDa) disappears (Fig. 4B), supporting the
conclusion that this represents a dimer. The gel filtration
results in combination with reduction of the
145 kDa protein to the 70 kDa isoform suggest that
there are interchain disulfide bonds between two
70 kDa subunits.
Analysis of the genomic sequence data, where differ-
ent intron sequences have been obtained (Table 1),
Southern blots (data not shown) and protein two-
dimensional gels all suggest that tgCA is a multicopy
gene. Despite this, no sequence differences were
observed in the coding sequence or intron position
where different copies, evidenced by different intron
sequences, were obtained. Given this, it seems reasonable
to use the combined genomic data, even if they
do not represent one gene, for analysis of intron/exon
structure.
Intron/exon positions are considered diagnostic for
animal CAs, with characteristic pattern differences
being found in cytosolic and secreted/membrane-associated
CAs [3,12]. For example, of the 15 possible
intron sites shown in Fig. 2, three introns are shared
between these two groups, three are found only in the
cytosolic CAs and at least two more are found only in
the secreted/membrane-associated CAs. Genomic
sequence of tgCA revealed 12 introns and 13 exons
(Fig. 2), all of which conformed to the gt-ag rule for
splice junctions [13] (Table 1). The majority of these
introns (11) were found to be homologous to those in
the secreted/membrane-associated CAs (intron positions
5, 9 and 13 of all possible introns, Fig. 2) or
those introns common to the majority of vertebrate
CAs (introns 7, 8, 11). This distribution of intron/exon
boundaries supports the phylogeny and protein properties
discussed above that group both CA domains
of tgCA with the secreted/membrane attached CAs.
However, surprisingly, one intron in c-tgCA was
found to differ from this pattern. Intron 3 in c-tgCA
(Fig. 2) is diagnostic for the cytosolic CAs. The presence
of a cytosolic specific intron in a CA that would
otherwise belong to the secreted/membrane attached
CA grouping suggests that this intron was present
before the division of these two groups and has subsequently
been lost from the membrane-associated CAs.
The dual domain structure of tgCA could have arisen
through one of two mechanisms, either the fusion of
two separate CA genes or a duplication of a single gene
followed by a fusion event. If this protein arose through
a duplication and fusion event, then given the poor identity
between the two CA domains (29%), the duplication
event must be old, thereby allowing time for the two
domains to diverge. This low identity between domains
is especially striking when compared to other duplicated
domain CA proteins (52 and 72% identity for D. salina
and P. purpureum, respectively [17,18]). Furthermore
there are similar examples of non-CA duplicated domain
proteins from invertebrates. Phosphagen kinases from a
number of bivalves [19–21], sea anemones [22] and sea
urchins [23] have been shown to contain bi- or tripartite
repeat domains. In all of these examples, identity
between the domains is in excess of 60%. For each protein
it has been concluded that the dual domain structure
arose through duplication of one gene and
subsequent fusion. Where genomic sequence is available
[20,22] this is supported by the presence of an intron
between the two domains. Similarly in tgCA an intron is
found between the two domains. Given the low homology
between the two domains of tgCA in comparison to
other duplicated domain proteins it is not possible
to exclude the possibility that this protein has arisen
through the fusion of two different CA genes rather
than a duplication event.
Given the unique structure of this CA protein it
would be interesting to know if both domains display
CA activity. Previous studies [9] have shown that the
purified protein is active; however, from these data it is
not possible to conclude whether this is due to one or both
domains. As both domains contain all the required
residues it seems likely that they are both active. A further
area of study is the interaction of the two
domains: for example, are both required for activity
and/or do they function cooperatively, and what is
their three-dimensional arrangement? These questions
are areas of future study.
To date the dual domain structure of tgCA is unique
amongst animals; whether this gene duplication of CA
is present in other symbiotic or nonsymbiotic bivalves,
and possibly other invertebrates, remains to be seen. If
this CA arrangement is restricted to symbiotic bivalves
it may represent a mechanism by which a symbiotic
animal can increase the rate of inorganic carbon transport
to its photosynthetic symbionts, and thereby
maximize the benefits of symbiosis.
Experimental procedures
Purification of carbonic anhydrase from clam gills
The 70 kDa CA isoform was purified from the gills of
the giant clam T. gigas as previously described [9]. The
145 kDa CA isoform was electroeluted from affinity purified
CA after separation by SDS/PAGE.
Separation of CA isoforms by two-dimensional
gel electrophoresis
Affinity purified CA was analyzed by two-dimensional gel
electrophoresis (2D-PAGE). Separation in the first dimen-
sion was performed using an Immobiline DryStrip (pH 4–7,
Pharmacia, Piscataway, NJ, USA) which was then further
separated on an 8–18% SDS/PAGE gradient gel using the
manufacturer’s protocol (Pharmacia, Cat # 18-1038-63).
Gels were visualized using Sypro-Ruby (Molecular Probes,
Eugene, OR, USA).
Purification of RNA and cDNA library construction
Total RNA was prepared from T. gigas mantle and gill
tissue. Fresh tissue (1.3 g) was snap frozen in liquid nitrogen
and ground in a mortar and pestle. Total RNA was
prepared from the tissue using cesium chloride [24].
mRNA was then purified from total RNA using the
QuickPrep® mRNA Purification Kit (Pharmacia). The gill
cDNA was synthesized for rapid amplification of cDNA
ends by the polymerase chain reaction (RACE-PCR) using
the Clontech™ cDNA Amplification Kit. The mantle library
was initially synthesized as a phage library in λ-ZapII
(Stratagene, La Jolla, CA, USA). It was used as a
template for RACE-PCR using specific primers for the
adaptors.
Clam CA primers were designed to previously known
cDNA sequence of the 70 kDa CA isoform from T. gigas
[25] and to N-terminal amino acid sequence [9] (Fig. 1).
From the derived sequence further primers were designed
to amplify the remaining portion of the cDNA.
Products were also amplified using RT-PCR. mRNA was
purified as previously described and first strand synthesis
performed using Omniscript™ Reverse Transcriptase
(Qiagen, Valencia, CA, USA) using the following primer:
5′-CCAgTgAgCAgAgTgACggAggACTCgAgCTCAAgCTTTTTTTTTTTTTTTT-3′.
PCR products were then amplified
using a gene specific primer and Q₀ (5′-CCAgTgAgCAgAgTgACg-3′),
whose sequence was contained in the poly(T)
primer. The second-strand synthesis was conducted at 38 °C
rather than 16 °C to overcome problems associated with secondary
structure inhibition of second-strand synthesis which
had previously been observed.
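The statement that the Q₀ sequence is contained in the poly(T) primer can be checked directly from the two primer sequences given above. The following is a minimal sketch: the helper names `clean` and `poly_t_tail` are assumptions, and the primer strings are copied from the text with layout whitespace removed.

```python
def clean(seq: str) -> str:
    """Remove layout whitespace and normalise case."""
    return "".join(seq.split()).upper()


def poly_t_tail(seq: str) -> int:
    """Length of the run of T bases at the 3' end of a primer."""
    return len(seq) - len(seq.rstrip("T"))


# Primer sequences as printed in the text (5' to 3'), in the printed fragments.
POLY_T_PRIMER = clean("CCAgTgAgCAg AgTgACggAggACTCgAgCTCA AgCTT" "TTTTTTTTTTTTTT")
Q0 = clean("CCAgTgAgCAgAg TgACg")

print(POLY_T_PRIMER.startswith(Q0))  # → True
print(len(Q0))                       # → 18
print(poly_t_tail(POLY_T_PRIMER))    # length of the 3' poly(T) run
```

Because Q₀ matches the 5′ end of the poly(T) primer, every first-strand cDNA carries a Q₀ binding site, so a single gene specific primer paired with Q₀ is enough to amplify towards the 3′ end of the transcript.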
DNA sequencing
Gel purified PCR products (High Pure PCR Product Purification
Kit, Roche, Mannheim, Germany) or 1.5 µL of the
PCR reaction were ligated into T-Vector Easy (Promega,
USA) and transformed into XL-1 Blue cells. After plasmid
purification (High Purity Plasmid Isolation Kit, Roche)
clones were sequenced using capillary separation on an ABI
3730xl sequencer using the ABI v3.0 sequencing kit (Applied
Biosystems, Foster City, CA, USA).
Sequence analysis
Sequence alignments were performed using the program
CLUSTAL W [26], bootstrapped 1000 times and trees constructed
using maximum likelihood [27]; the C. reinhardtii
α-CA gene (CAH1) was used as an outgroup. All analyses
were performed using BioManager by ANGIS (http://www.angis.org.au).
Signal sequences were identified using
the Centre for Biological Sequence Analysis database [28].
Potential GPI-anchor sites were examined using the DPGI
database (http://129.194.185.165/dgpi/DGPI_demo_en.html)
and the big-PI Predictor (http://mendel.imp.univie.ac.at/gpi/gpi_server.html) [29].
Isolation of genomic DNA
Genomic DNA was isolated from T. gigas sperm. Spawning
was induced by the injection of approximately 5 mL of a
2 mM serotonin solution into the gonads. The sperm was
collected from the water and centrifuged (1000 g for
5 min). Sperm (1 mL packed cell volume) was diluted with
11 mL proteinase K solution [50 mM Tris/HCl pH 7.5,
20 mM EDTA, 100 mM NaCl, 1% (w/v) SDS, 100 µg·mL⁻¹
proteinase K] and the sample was incubated overnight at
55 °C. The solution was centrifuged (1000 g for 15 min at
4 °C), the supernatant removed, mixed with 10 mL of
ultra-pure phenol (Sigma, St Louis, MO, USA) and then
equilibrated with 4 mL of TE (10 mM Tris pH 8.0, 1 mM
EDTA) buffer. After adding an equal volume of chloroform,
the solution was left overnight. The solution was
again centrifuged (1000 g for 15 min at 4 °C) and the aqueous
phase removed. This was re-extracted twice with chloroform
and the DNA precipitated with 0.1 volume sodium
acetate (3 M, pH 5.2) and 2.5 volume 100% (v/v) ethanol.
After precipitation the DNA was spooled and resuspended
in TE buffer.
Genomic sequencing
The intron/exon structure of the 70 kDa CA was mapped
using a series of sequence specific primers obtained from the
cDNA sequence that bracketed possible intron positions [3].
Acknowledgements
This work was supported by an Australian Research
Council grant to David Yellowlees. We would like to
thank three anonymous referees for their helpful com-
ments.
References
1 Pastorekova S, Parkkila S, Pastorek J & Supuran CT
(2004) Carbonic anhydrases: current state of the art,
therapeutic applications and future prospects. J Enz
Inhib Med Chem 19, 199–229.
2 So AKC, Espie GS, Williams EB, Shively JM, Heinhorst
S & Cannon GC (2004) A novel evolutionary lineage
of carbonic anhydrase ε is a component of the
carboxysome shell. J Bacteriol 186, 623–630.
3 Hewett-Emmett D & Tashian RE (1996) Functional
diversity, conservation, and convergence in the evolution
of the α-, β-, and γ-carbonic anhydrase gene families.
Mol Phylo Evol 5, 50–77.
4 De Cian M, Bailly X, Morales J, Strub J, van Doressel-
aer A & Lallier FH (2003) Characterization of carbonic
anhydrases from Riftia pachyptila, a symbiotic inverte-
brate from deep-sea hydrothermal vents. Proteins Struct
Func Genet 51, 327–339.
5 Weis VM & Reynolds WS (1999) Carbonic anhydrase
expression and synthesis in the sea anemone Anthopleura
elegantissima are enhanced by the presence of dinoflagel-
late symbionts. Physiol Biochem Zool 72, 307–316.
6 Pilar Corena M, Seron TJ, Lehman HK, Ochrietor JD,
Kohn A, Tu C & Linser PJ (2002) Carbonic anhydrase
in the midgut of larval Aedes aegypti: cloning, localiza-
tion and inhibition. J Exp Biol 205, 591–602.
7 Seron TJ, Hill J & Linser PJ (2004) A GPI-linked car-
bonic anhydrase expressed in the larval mosquito mid-
gut. J Exp Biol 207, 4559–4572.
8 Miyamoto H, Miyashita T, Okushima M, Nakano S,
Morita T & Matsushiro A (1996) A carbonic anhydrase
from the nacreous layer in oyster pearls. Proc Natl Acad
Sci USA 93, 9657–9660.
9 Baillie B & Yellowlees D (1998) Characterization and
function of carbonic anhydrase in the zooxanthellae-giant
clam symbiosis. Proc R Soc Lond B 265, 465–473.
A dual domain carbonic anhydrase W. Leggat et al.
3304 FEBS Journal 272 (2005) 3297–3305 © 2005 FEBS
10 Leggat W, Marendy EM, Baillie B, Whitney SM,
Ludwig M, Badger MR & Yellowlees D (2002) Dinofla-
gellate symbioses: strategies and adaptations for the
acquisition and fixation of inorganic carbon. Funct
Plant Biol 29, 309–322.
11 Fisher CR, Fitt WK & Trench RK (1985) Photosynthesis
and respiration in Tridacna gigas as a function of
irradiance and size. Biol Bull 169, 230–245.
12 Jiang W & Gupta D (1999) Structure of the carbonic
anhydrase VI (CA6) gene: evidence for two distinct
groups within the α-CA gene family. Biochem J 344,
385–390.
13 Mount SM (1982) A catalog of splice junction
sequences. Nucleic Acids Res 10, 459–472.
14 Lindskog S (1997) Structure and mechanism of carbonic
anhydrase. Pharmacol Ther 74, 1–20.
15 Supuran CT, Scozzafava A & Casini A (2003) Carbonic
anhydrase inhibitors. Med Res Rev 23, 146–189.
16 Zhu XL & Sly WS (1990) Carbonic anhydrase IV from
human lung: purification, characterization, and comparison
with membrane carbonic anhydrase from human
kidney. J Biol Chem 265, 8795–8801.
17 Fisher M, Gokhman I, Pick U & Zamir A (1996) A
salt-resistant plasma membrane carbonic anhydrase is
induced by salt in Dunaliella salina. J Biol Chem 271,
17718–17723.
18 Mitsuhashi S & Miyachi S (1996) Amino acid sequence
homology between N- and C-terminal halves of a carbo-
nic anhydrase in Porphyridium purpureum, as deduced
from a cloned cDNA. J Biol Chem 271, 28703–28709.
19 Compaan DM & Ellington WR (2003) Functional con-
sequences of a gene duplication and fusion event in an
arginine kinase. J Exp Biol 206, 1545–1556.
20 Suzuki T, Kawasaki Y, Unemi Y, Nishimura Y, Soga
T, Kamidochi M, Yazawa Y & Furukohri T (1998)
Gene duplication and fusion have occurred frequently
in the evolution of phosphagen kinases – a two-domain
arginine kinase from the clam Pseudocardium sachalinensis.
Biochim Biophys Acta 1388, 253–259.
21 Suzuki T, Sugimura N, Taniguchi T, Unemi Y, Murata
T, Hayashida M, Yokouchi K, Uda K & Furukohri T
(2002) Two-domain arginine kinases from the clams
Solen strictus and Corbicula japonica: exceptional amino
acid replacement of the functionally important D62 by
G. Int J Biochem Cell Biol 34, 1221–1229.
22 Suzuki T, Kawasaki Y & Furukohri T (1997) Evolution
of phosphagen kinase: Isolation, characterization and
cDNA-derived amino acid sequence of two-domain
arginine kinase from the sea anemone Anthopleura
japonicus. Biochem J 328, 301–306.
23 Wothe DD, Charbonneau H & Shapiro BM (1990) The
phosphocreatine shuttle of sea urchin sperm: flagellar
creatine kinase resulted from a gene triplication. Proc
Natl Acad Sci USA 87, 5203–5207.
24 Sambrook J, Fritsch EF & Maniatis T (1989) Molecular
Cloning, 2nd edn. Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY.
25 Baillie BK (1995) Inorganic carbon acquisition and utilisation
in the giant clam symbiosis. PhD Thesis, James
Cook University of North Queensland, Australia.
26 Thompson JD, Higgins DG & Gibson TJ (1994) CLUS-
TAL W: improving the sensitivity of progressive multi-
ple sequence alignment through sequence weighting,
position-specific gap penalties and weight matrix choice.
Nucleic Acids Res 22, 4673–4680.
27 Felsenstein J (1989) PHYLIP – Phylogeny Inference Package
Version 3.2. Cladistics 5, 164–166.
28 Bendtsen JD, Nielsen H, von Heijne G & Brunak S
(2004) Improved prediction of signal peptides: SignalP
3.0. J Mol Biol 340, 783–795.
29 Eisenhaber B, Bork P & Eisenhaber F (1999) Prediction
of potential GPI-modification sites in proprotein
sequences. J Mol Biol 292, 741–758.