Interallelic recombination is probably responsible for the occurrence of a new αs1-casein variant found in the goat species
Claudia Bevilacqua1,2,*, Pasquale Ferranti3,4, Giuseppina Garro3,4, Cristina Veltri1, Raffaella Lagonigro1, Christine Leroux2, Emilio Pietrolà1, Francesco Addeo3,4, Fabio Pilla1, Lina Chianese3 and Patrice Martin2
1 Dipartimento di Scienze Animali, Vegetali e dell’Ambiente, Facoltà di Agraria dell’Università del Molise, Campobasso, Italy;
2 Laboratoire de Génétique biochimique et de Cytogénétique, INRA, Domaine de Vilvert, Jouy-en-Josas, France;
3 Dipartimento di Scienza degli Alimenti, Facoltà di Agraria, Università di Napoli ‘Federico II’, Portici, Italy;
4 Istituto di Scienze dell’Alimentazione del CNR, Avellino, Italy
The αs1-casein (αs1-Cas) locus in the goat is characterized by a polymorphism that is both qualitative and quantitative. A systematic analysis performed in an autochthonous southern Italian breed identified a new rare allele (M), which was characterized at both the protein and genomic levels. The M protein displays the slowest electrophoretic mobility of the αs1-Cas variants described so far. MS and automated Edman degradation experiments showed that this behavior was due to the loss of two phosphate residues in the multiple phosphorylation site (64SP-SP-SP-SP-SP-E-70E) as a consequence of a Ser→Leu substitution at position 66 of the peptide chain (64S-SP-L-SP-SP-E-70E). This was confirmed by sequencing a genomic DNA fragment encompassing exon 9, in which the 8th codon (TCG) was shown to be mutated to TTG. Sequencing of amplified genomic DNA segments spanning the 5′ and 3′ flanking regions of each exon allowed us to identify 23 single nucleotide polymorphisms and two insertion/deletion events in the coding as well as the noncoding regions. A comparison of specific haplotypes defined for each of the αs1-CasF, A and M alleles indicates that the M allele probably arose from interallelic recombination between alleles A and B2, followed by a C→T transition at nucleotide 23 of the ninth exon. The region encompassing the recombination break point was putatively located between nucleotide 86 upstream and nucleotide 40 downstream of exon 8. Interallelic recombination therefore appears to be a possible means of generating allelic diversity at the αs1-Cas locus, at least in the goat. The previously proposed molecular phylogeny must now be revised, possibly starting from two ancestral allelic lineages.
Keywords: αs1-casein gene; allelic recombination; genetic polymorphism; goat milk.
Caseins comprise the main protein fraction of ruminant milk. They are encoded by four tightly linked genes [1], clustered in a 250-kb genomic DNA segment [2] in the following order: αs1, β, αs2 and κ [3]. They have been mapped on chromosome 6 in cattle and goats [4,5]. The αs1-casein locus (αs1-Cas) is characterized in the goat by a polymorphism that is both qualitative and quantitative. Indeed, more than 11 alleles have so far been characterized [6], distributed among seven different classes of protein variants (αs1-CasA to αs1-CasG), associated with four levels of expression ranging between 0 (αs1-Cas0) and 3.5 g·L⁻¹ (αs1-CasA, B, and C) per allele. Whereas the αs1-CasE variant, which is 199 amino-acid residues in length, differs from variants A, B and C only by single amino-acid substitutions [7], the F variant displays an internal deletion of 37 residues [8], leading to the loss of a hydrophilic cluster of five contiguous phosphoseryl residues: 64SerP-SerP-SerP-SerP-SerP-Glu-70Glu. This deletion arises from the outsplicing of three exons (9, 10 and 11) during the processing of primary transcripts, probably because of a single nucleotide deletion occurring within the first unspliced exon (exon 9) [9]. More recently, the B allele has been split into four alleles giving rise to the synthesis of four protein variants, B1, B2, B3, and B4, which differ as a result of amino-acid substitutions [6]. These substitutions have no effect on the net charge of the protein, which makes the relevant variants indistinguishable on PAGE. Variant B1 is considered to be the original type in the goat because it shows the closest homology to its bovine and ovine counterparts [6].
The distribution of these different alleles or variants has been investigated in a great variety of breeds and populations [6,10–13]. Breeds from the Mediterranean area usually display a high frequency of ‘strong’ alleles (mainly A and B). However, local and now rare breeds generally do not follow this rule and are often the source of rare ‘germplasms’. Three novel αs1-Cas variants (H, I and L) have been identified by Chianese et al. [14] in southern Italian goat populations. More recently, a further novel and rare variant, named M, was detected in the Molisane Montefalcone goat breed [15], which was shown, in addition, to display a rather high frequency of the F allele [16].
In this paper, we report the characterization of this new variant at both the protein and genomic levels. The complete amino-acid sequence of the M variant has been determined. Starting from genomic DNA, we amplified, by PCR, the coding regions (exons) and their flanking intron regions, which were subsequently sequenced. This dual approach made it possible to identify the mutation specific for the αs1-CasM allele. Extensive comparisons of these sequences with those of previously characterized alleles allowed the identification of additional polymorphic sites, the arrangements (haplotypes) of which strongly suggest an interallelic recombination (or a gene conversion) event at the origin of the αs1-CasM allele. This is, to our knowledge, the first hypothesis of a genomic recombination event to account for genetic polymorphism at a locus encoding a milk protein.
Correspondence to P. Martin, Laboratoire de Génétique biochimique et de Cytogénétique, INRA, Domaine de Vilvert, 78 352 Jouy-en-Josas, France. Fax: +33 1 34 65 24 78, Tel.: +33 1 34 65 25 82, E-mail: martin@jouy.inra.fr
Abbreviations: αs1-Cas, αs1-casein; UTLIEF, ultra-thin-layer isoelectric focusing; LC/ES/MS, liquid chromatography/electrospray/mass spectrometry; ACRS-PCR, amplified created restriction site-PCR.
*Present address: INSERM E9925, Interactions de l’épithélium intestinal avec le système immunitaire, Faculté Necker-Enfants Malades, 156, rue de Vaugirard, 75 743 Paris Cedex 15, France.
(Received 29 August 2001, revised 17 December 2001, accepted 9 January 2002)
Eur. J. Biochem. 269, 1293–1303 (2002) © FEBS 2002
MATERIALS AND METHODS
Animals
A total of 147 individual milk samples from Montefalcone goats, a breed localized in southern Italy (Molise region), were analysed. Peripheral blood (15–30 mL) was collected from eight goats and two bucks and subsequently used for DNA extraction.
Casein preparation
Whole casein was prepared by acid precipitation of
individual skimmed milk as described by Aschaffenburg &
Drewry [17].
Gel electrophoresis
Vertical disc PAGE at pH 8.6, preparation of casein samples and polyclonal antibodies against αs1-Cas, and immunoblotting experiments were performed as described elsewhere [18].
Preparation of polyacrylamide gel ultra-thin layers (0.25 mm) and isoelectric focusing (UTLIEF) were carried out as recommended by EEC Regulation no. 690/92 [19]. The pH gradient in the range 2.5–6.5 was obtained by mixing Ampholine (Pharmacia LKB) 2.5–5, 4.5–5.4, and 4–6.5 in the volume ratio 1.6 : 1.4 : 1.
2D gel electrophoresis (PAGE in the first dimension followed by UTLIEF in the second) has been described elsewhere [18].
Enzymatic hydrolyses
Trypsin (Boehringer Mannheim) hydrolysis was carried out in 0.4% NH4HCO3, pH 8.5, at 37 °C for 4 h, at a substrate/enzyme ratio of 50 : 1 (w/w). Dephosphorylation with calf intestine alkaline phosphatase (Boehringer Mannheim) was performed in the same buffer by using 1 mU enzyme/mg casein at 37 °C for 18 h; these conditions have been previously shown to produce complete dephosphorylation of the sample [20]. Reactions were stopped by freeze-drying.
Liquid chromatography/mass spectrometry analysis of proteins and peptides
The whole caprine casein samples were fractionated by the procedure of Jaubert and Martin [21], modified by Ferranti et al. [22].
Liquid chromatography/electrospray/mass spectrometry (LC/ES/MS) was performed using an HP1100 modular system connected on-line to a Platform (Micromass) single quadrupole mass spectrometer. The selectively precipitated casein phosphopeptides were fractionated by RP-HPLC on a 214TP54, 5 µm Vydac C18, 250 × 2.1 mm internal diameter column (Vydac, Hesperia, CA, USA). Solvent A was 0.3 mL trifluoroacetic acid per L water. Solvent B was 0.2 mL trifluoroacetic acid per L acetonitrile. Samples (500 µg) were dissolved in 200 µL water and injected on to the HPLC column equilibrated in solvent A. A linear gradient from 0% to 37% B was applied at a flow rate of 0.5 mL·min⁻¹ over 60 min. The column effluent was split 1 : 25 to give a flow rate of ≈ 4 µL·min⁻¹ into the electrospray nebulizer. The bulk of the flow was run through the detector for peak collection as measured by following A220. The ES-mass spectra were scanned from m/z 1800 to 400 at a scan cycle of 5 s per scan. The source temperature was 120 °C and the orifice voltage 40 V. Mass values were reported as average masses. Signals recorded in the mass spectra of peptides were associated with the corresponding tryptic peptides on the basis of the molecular mass, taking into account the enzyme specificity and the reported amino-acid sequence of αs1-Cas from different species. Quantitative analysis of components was performed by integration of the multiply charged ions of the single species [22].
Sequence analysis
Automated Edman degradation was performed using an Applied Biosystems model 477A Protein Sequencer with on-line phenylthiohydantoinyl amino acid (Pth-Xaa)-HPLC analyzer. Phosphorylated peptides were modified by the procedure of Ferranti et al. [20].
Genomic DNA preparation
Goat genomic DNA was prepared from leucocytes isolated from the plasma fraction of EDTA-anticoagulated peripheral blood samples, as described previously [23,24].
Oligonucleotides
Intronic primers used either for amplification from genomic DNA or for sequencing of amplified DNA fragments were provided by Genosys Biotechnologies Inc. (Cambridge, UK) and Primm (Milano, Italy). Their sequences are given in Table 1, together with those used for genotyping.
PCR conditions
In vitro amplification was performed with the thermostable DNA polymerase of Thermus aquaticus (Taq polymerase) using either a 480 or a 2400 thermal cycler (PerkinElmer), essentially as described [25]. A typical 50-µL reaction mixture consisted of 5 µL 10× PCR buffer (500 mM KCl, 100 mM Tris/HCl, pH 9.0, 1% Triton X-100), 3 µL 25 mM MgCl2, 2.5 µL 5 mM dNTPs mixture, 0.5 µL (25 pmol) each primer, 2 µL template DNA, and 0.25 µL (1.25 U) Taq polymerase (Promega). To avoid evaporation (with the 480 thermal cycler), the mixture was covered with 70 µL mineral oil. After an initial denaturing step of 5 min (or 10 min) at 94 °C, the reaction mixture was subjected to the following three-step cycle, which was repeated 35 times: denaturation for 30 s (or 1 min) at 94 °C, annealing for 30 s (or 2 min) at 47–60 °C, and extension for 30–60 s (or 3 min) at 72 °C, using the 2400 (or 480) thermal cycler. To estimate the concentration of PCR products, 5 µL of each reaction mixture was analysed by electrophoresis, in the presence of ethidium bromide (0.5 µL·mL⁻¹), in a 2% SeaKem (FMC) or Gibco BRL Life Technologies agarose slab gel in Tris/borate/EDTA (8.9 mM Tris, 8.9 mM boric acid, 0.2 mM EDTA, pH 8.0) buffer.
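As a quick check on the reaction mixture described above, the final concentrations in the 50-µL reaction can be worked out from the listed stock concentrations and volumes. The sketch below is illustrative only: the function and variable names are invented here, and the water volume is an assumption (the text does not say how the reaction was brought to 50 µL).

```python
# Final concentrations in the 50-µL PCR mixture, derived from the
# stock concentrations and pipetted volumes given in the text.
# Names below are illustrative, not taken from the paper.
REACTION_VOLUME_UL = 50.0


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Dilution rule C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


buffer_fold = final_concentration(10.0, 5.0)   # 10x buffer, 5 µL
kcl_mM = final_concentration(500.0, 5.0)       # KCl carried by the buffer
tris_mM = final_concentration(100.0, 5.0)      # Tris/HCl carried by the buffer
mgcl2_mM = final_concentration(25.0, 3.0)      # 25 mM MgCl2, 3 µL
dntp_mM = final_concentration(5.0, 2.5)        # 5 mM dNTP mixture, 2.5 µL
primer_uM = 25.0 / REACTION_VOLUME_UL          # 25 pmol in 50 µL; pmol/µL == µM

# Assumption: the remaining volume is water (not stated in the text).
listed_volumes_ul = [5.0, 3.0, 2.5, 0.5, 0.5, 2.0, 0.25]
water_ul = REACTION_VOLUME_UL - sum(listed_volumes_ul)

if __name__ == "__main__":
    print(f"buffer {buffer_fold:g}x, KCl {kcl_mM:g} mM, Tris {tris_mM:g} mM")
    print(f"MgCl2 {mgcl2_mM:g} mM, dNTPs {dntp_mM:g} mM, primers {primer_uM:g} µM each")
    print(f"volume left for water (assumed): {water_ul:g} µL")
```

Under these figures the reaction contains 1× buffer, 1.5 mM MgCl2, 0.25 mM dNTP mixture and 0.5 µM of each primer.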
For genotyping of the αs1-CasM allele by the amplified created restriction site (ACRS)-PCR procedure [26], experimental conditions were essentially the same as those described above, except that the primer concentration was 50 pmol per 50 µL reaction mix and the agarose slab gel used to visualize the PCR products was 4% (2% Gibco-BRL and 2% high-resolution agarose FMC).
Sequencing of amplified genomic DNA fragments
PCR products were either directly sequenced or sequenced after cloning (fragments amplified between primers C9U and C9L) into SmaI-digested pUC18 plasmid vector, using fluorescent Cycle Sequencing (AmpliTaq FS, Dye Terminator Cycle Sequencing Kit; PerkinElmer) with an ABI 377A or an ABI 310 DNA sequencer.
RESULTS
PAGE analysis and immunoblotting of whole casein
Figure 1A shows the typical electrophoretic patterns yielded, in polyacrylamide gel at pH 8.6, by the new αs1-Cas phenotype, subsequently shown to be a heterozygous M/F (M being the new variant), in comparison with two reference phenotypes AA (lane 1) and FF (lane 2). This new phenotype is characterized, under these conditions, by the presence of a protein band with a slower mobility (lane 3, *) occurring within the αs complex. As αs1-Cas and αs2-Cas overlap in the same zone of the gel, the αs1-Cas composition of each phenotype was analysed by immunostaining after transfer to nitrocellulose (NC) paper with specific polyclonal antibodies raised against αs1-Cas; the result is shown in Fig. 1B. The new αs1-Cas phenotype (M/F) comprises at least five components (a, b, c, d, and e). Two of these (a and c) appear to be shared with variants A and F, while components e and d seem to be in common with the A variant. Therefore, band b represents the only component specific to the M variant. The intensities of the bands in the MF pattern indicate that variant M is a ‘strong’ variant like variants A, B and C, i.e. it has a high level of expression. However, as the intensities of three apparently homologous components (a, c, and e) in the AA and MF profiles were different, further heterogeneity of the PAGE components may be suspected.

Table 1. Primers used in the present study. Each pair of primers amplifies the target exon and its flanking regions (from 60 to 200 nucleotides upstream and downstream). Primers ending with U (upper) and L (lower) are positioned 5′ and 3′ from the target exon, respectively. Given the small size of introns 4 and 10, primers C45U/C45L and C1011U/C1011L were designed to amplify together exons 4 and 5 and exons 10 and 11, respectively. Sequencing of exon 7 was performed starting from a genomic DNA fragment produced by amplification between C7U and C8L. Primers in italics were used in the genotyping of allele M.
C1U      5′ GAG AGG AAC TGA ACA GAA CAT TG 3′
C1L      5′ CAA CTG CGT ATT AGT GAA GAA TG 3′
C2U      5′ AAT CAA ATT TTA TTA TAA GAC C 3′
C2L      5′ AAT AGC TAA TTA GAG ACC AT 3′
C3U      5′ GGT GTC AAA TTT AGC TGT TAA A 3′
C3L      5′ GCC CTC TTC TCT AAA AAG GTT T 3′
C4U      5′ AAT GGA GAA TTT GTG TTC AA 3′
C45U     5′ TGA CTG TGT TTT TCA CTT CT 3′
C45L     5′ GCT TTG TTA ATT CTG CAG TA 3′
C6U      5′ CCT TTT CCA GAA GTG TTT AGA AAG 3′
C6L      5′ CAT ACC ACC TTA ATT TTC GTA TT 3′
C7U      5′ CAT GAA GCA ATA TAT CTG CTC C 3′
C7L      5′ TGG TCA ACA TAC ATG TTG CAT C 3′
C8U      5′ CTT CAG TTA GCC TGG TAG GTA 3′
C8L      5′ TGG CAC AAC ATT GTA CAT TCT TGG G 3′
C9U      5′ GTA TGG AAG TGT GGA ATA GTT T 3′
C9L      5′ GGA CAC CAC AGA TAT CCA ATA G 3′
C1011U   5′ CAT AAA ACT AAC AAT ACA TGT 3′
C1011L   5′ TAG CAG ATA TTG AAA AGG AG 3′
C12U     5′ CCA GTG AAT ATT CAG GAC TGA T 3′
C12L     5′ AGG CTC TAG CAT GAT TTG ATG T 3′
C13U     5′ GCA TTT TTA TTT TGA ATG TAA A 3′
C13L     5′ TAG TTC AAA TGC ACA TCT TAT 3′
C14U     5′ GGC AGA GAA TAC GTT TAT ACT AA 3′
C14L     5′ TCT CAG ATT GAC TAC TAC AAC TT 3′
C15U     5′ CAT GAA AAG CAT TTC AAA AA 3′
C15L     5′ TAA AAA ACA GTG GTT ACC AA 3′
C16U     5′ CTA AAG AGT ACA CTA TCC TCA C 3′
C16L     5′ TTG CTG TGG TTG CCT ATC CTA 3′
C17U     5′ TGA TTT CTC ATA CAC TGT TG 3′
C17L     5′ TTG ATA AGG CAA CAA TAT GC 3′
C18U     5′ GTC CCA ACT TGA AAT CCT GAT C 3′
C18L     5′ CAA GTT TAT AGT CTA CAC GTT GTA C 3′
C19U     5′ CTT AGC ATC TTC CAT GGC TTG ATC 3′
C19L     5′ ATA CAC ACA AAC TCA CAA GG 3′
MWU      5′ CAA CAT ATT TTA AAT AAA ATT GAC AAT 3′
C9LM*    5′ ATA AAA ATG GTA TAC CTC ACT TGT*C 3′
C9UM1    5′ TAA CAA TGA TTC TCT TTC TTT TAG 3′
C9LM1    5′ AAT CTT TAT TTT GTC TCT GAC AA 3′

Fig. 1. Disc-PAGE at pH 8.6 of individual whole caprine casein samples containing different αs1-Cas variants AA, FF and MF. Phenotypes are indicated at the top of each lane. Staining was with (A) Coomassie Brilliant Blue and (B) polyclonal antibodies against αs1-Cas. a–e identify αs1-Cas bands of the MF sample in order of increasing mobility towards the anode.
To understand the high degree of heterogeneity observed with goat αs1-Cas and to try to explain the difference in band intensities, further electrophoretic experiments were carried out, including UTLIEF analysis and 2D electrophoresis followed by staining with polyclonal antibodies. In UTLIEF (results not shown) the αs1-CasM/F phenotype comprised at least seven major components, two of which were in common with variant F. Using 2D electrophoresis (Fig. 2), at least two main spots surrounded by a number of minor components differing in their pI were found in each PAGE band. This large microheterogeneity, which also occurs for other casein phenotypes (results not shown), may be attributable to nonallelic αs1-Cas forms generated by defective mRNA splicing and to differently phosphorylated αs1-Cas chains, as reported by Ferranti et al. [20].
MS and sequence analyses
To determine the molecular mass of the new variant (M), whole caseins of individual milks of the phenotypes A/A, F/F, and M/F were subjected to HPLC separation (Fig. 3). The retention time of variant M was shorter than that of the A variant, while the relative percentage was the same. The HPLC fractions were analysed by ES/MS, and the molecular masses of αs1-CasA, B, and F were in agreement with the expected masses [7,9]. The molecular mass determined by ES/MS of the αs1-Cas components occurring in the sample containing M/F variants was 23 134/23 214/23 294 Da (Fig. 4). After alkaline phosphatase hydrolysis, the molecular mass of the three main peaks shifted to the single value of 22 734 Da, indicating the occurrence of three αs1-Cas species carrying five, six, and seven phosphate groups, respectively. A set of small HPLC peaks eluted before the main αs1-Cas peak gave a molecular mass of 18 715/18 795/18 875 Da (18 555 Da after alkaline phosphatase hydrolysis), corresponding to that expected for the F variant. This result is the first evidence for the heterozygous status (M/F) of the individual goat milk analysed.
In addition, the HPLC profile confirms that the M variant is abundantly expressed. Thus, as previously mentioned, we were working with a mixture of two unresolvable variants, one of which (M) accounts for more than 80% of the whole αs1-Cas. This overrepresentation of the M variant allowed us to continue the molecular characterization with this material.

Fig. 2. 2D electrophoretic analysis of a whole casein sample prepared from the milk of a single goat, heterozygous M/F at the αs1-Cas locus. Disc-PAGE was performed in the first dimension followed by UTLIEF in the second dimension. The UTLIEF pattern in the pH range 2.5–6.5 is shown on the left. Staining was with polyclonal antibodies raised against αs1-Cas.

Fig. 3. RP-HPLC analysis of casein fractions from goats of different genotypes F/F (A), M/F (B) and A/A (C) at the αs1-Cas locus.
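The phosphate counts quoted for the M/F sample follow from the mass difference between each native species and its dephosphorylated form. The sketch below reproduces that arithmetic, assuming an average mass increment of 79.98 Da per phosphate group (HPO3); the function name is illustrative, and applying the same calculation to the minor F-variant peaks is our own extension of the reasoning, not a count stated in the text.

```python
# Number of phosphate groups inferred from ES/MS masses before and
# after alkaline phosphatase treatment.
# Assumption: each phosphate adds 79.98 Da (average mass of HPO3).
PHOSPHATE_DA = 79.98


def phosphate_count(native_da, dephosphorylated_da):
    """Round the mass shift to the nearest whole number of phosphates."""
    return round((native_da - dephosphorylated_da) / PHOSPHATE_DA)


# Masses reported in the text (Da).
m_counts = [phosphate_count(m, 22734) for m in (23134, 23214, 23294)]
f_counts = [phosphate_count(m, 18555) for m in (18715, 18795, 18875)]

if __name__ == "__main__":
    print("main (M) peaks:", m_counts)
    print("minor (F) peaks:", f_counts)
```

For the three main peaks this gives five, six and seven phosphate groups, matching the values stated above.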
The αs1-Cas fraction containing the M variant was digested with trypsin, and the resulting peptide mixture analysed by LC/ES/MS (Fig. 5). The peptide sequence determined for the M variant was identical with that yielded by variant B2 (from the published sequence [6,7]) except for two substitutions located in peptide 62–79. MS and automated sequence analysis actually demonstrated that peptide 62–79 (molecular mass 1833 Da and sequence AGSSLSSEEIVPNSAQQK, where S indicates a phosphorylated serine residue) contains the two substitutions Ser66→Leu and Glu77→Gln, as compared with the B2 variant. The Ser→Leu substitution at position 66 first makes this site unphosphorylatable and secondly impairs the phosphorylation of Ser64 in the M variant. The sequence determined is consistent with the molecular mass measured for the native protein. The phosphorylated residues are therefore Ser46, 48, 65, 67, 68 (fully), and Ser41 and Ser115 (partly), which result in proteins with five, six and seven phosphates/mol, explaining the heterogeneity of phosphorylation observed for the native protein by ES/MS analysis (Fig. 4). Finally, peptide E96QLLR100, diagnostic of the F variant, was present among the peptides identified by Edman degradation after tryptic digestion and RP-HPLC fractionation, confirming the heterozygous status (M/F) of the sample analysed.
Experimental strategy designed to analyse the new αs1-Cas variant at the nucleotide level
To determine the coding sequence of a gene, there are at least two possible strategies: it can be analysed at the genomic level or at the messenger level. The most straightforward option is undoubtedly mRNA extraction to construct a cDNA molecule. The structure of the coding region is then readily obtained by sequencing the cDNA.
In our situation, however, such a strategy was not possible. Given the low number of animals in the population, it was not possible to slaughter individuals of interest. In addition, as the animals were from private flocks bred in mountain meadows, it was not possible to take mammary tissue biopsy samples under appropriate hygienic conditions.
To overcome this, we tried to extract mammary mRNA from milk somatic cells, using the technique first described by Martin et al. [27]. Unfortunately, we could not obtain enough material to synthesize cDNA. Moreover, as expected from the phenotypic analysis (at the protein level), the few animals yielding the αs1-CasM variant in their milk were exclusively heterozygous M/F at the αs1-Cas locus. Analysis of their transcripts would therefore have been rather difficult because of the occurrence of at least nine different forms of messenger arising from the F allele [9]. Finally, to integrate this new allele into the phylogenetic tree proposed by Grosclaude & Martin [6], we also needed to obtain information about relevant noncoding regions in which specific and informative mutations are localized.
For these reasons, we decided to analyse the sequence of the M allele at the genomic level. After amplification of each exon and its intron-flanking regions, amplified genomic DNA fragments were sequenced. Knowledge of the structural organization of the goat gene encoding αs1-Cas [9] made this strategy possible. In addition, the complete sequence of the bovine gene [28] was also available and showed that the two genes display the same organization (number and size of exons) and 95% similarity at the exon sequence level. As goats and cattle are phylogenetically close and known intron sequences in the goat show strong similarity to their bovine counterparts, we designed primers upstream and downstream of each exon to amplify and analyse genomic regions including flanking intron sequences, starting from both the bovine and the goat sequences.

Fig. 4. Deconvoluted electrospray mass spectrum of caprine αs1-Cas M variant.

Fig. 5. LC/ES/MS analysis of the tryptic digest of the αs1-CasM variant. The purified protein was digested with appropriate concentrations of trypsin (see Materials and methods). The peptide mixture was analyzed using a Vydac C18 column (250 × 2.1 mm, 5 µm), on-line with a Platform mass spectrometer, as described in Materials and methods. The peak of the variant peptide is indicated by an arrow.
Analysis of the exon sequences at the genomic level
As the samples analysed were from goats that were heterozygous (M/F) at the αs1-Cas locus, to discriminate between the two alleles and therefore determine the 19 exon sequences coming from the M allele, sequence data were compared with those from a homozygous F/F goat genomic DNA sample. All the sequences yielded were unambiguously determined except that corresponding to the PCR fragment encompassing the ninth exon, in which a single nucleotide deletion has been shown to occur in the F allele [9]. This makes the sequence chromatogram unreadable from that point for the DNA template amplified from the heterozygous M/F sample. To overcome this problem, the amplified fragment was cloned. Of the 10 clones sequenced, four displayed a typical F exon-9 sequence, and five showed the same sequence, which was different from that of the F allele, with a 33-nucleotide exon 9. Taken together, the exonic sequence data allowed us to construct a sequence corresponding to the complete cDNA of the M allele. This sequence is given in Fig. 6, where it is compared with that of alleles F and A.
Only four polymorphic nucleotides were identified, three of which yielded amino-acid substitutions: (a) the transition T→C on the second nucleotide of the third codon within exon 4, leading to a Leu→Pro substitution at position 16 of the peptide chain, as compared with the A variant; (b) a transversion G→C on the first nucleotide of the last exon-10 codon, leading to a Glu→Gln substitution at position 77 of the peptide chain, as compared with the F variant; (c) the deoxycytidyl phosphate residue at position 23 in the 9th exon of the A allele, which is deleted in the F allele, is mutated to T in the M allele, giving rise to a Ser→Leu substitution.
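The nucleotide changes listed above can be checked against the standard genetic code. The sketch below translates the exon 9 codon before and after the mutation (TCG and TTG, as given in the abstract), classifies each substitution as a transition or a transversion, and maps nucleotide 23 of exon 9 to its codon. It assumes that exon 9 starts on a codon boundary, which is consistent with the 33-nucleotide exon and the "8th codon" mentioned in the text; all function names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}


def translate(codon):
    """One-letter amino-acid code for a DNA codon."""
    return CODON_TABLE[codon.upper()]


def substitution_type(ref, alt):
    """Transition = purine<->purine or pyrimidine<->pyrimidine."""
    if ref == alt:
        raise ValueError("reference and alternative bases are identical")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def codon_position(nucleotide_number):
    """Map a 1-based exon nucleotide to (codon number, position in codon).

    Assumption: the exon begins on a codon boundary.
    """
    index = nucleotide_number - 1
    return index // 3 + 1, index % 3 + 1


if __name__ == "__main__":
    print(translate("TCG"), "->", translate("TTG"))  # Ser -> Leu in exon 9
    print(codon_position(23))                        # codon 8, second base
    for ref, alt in (("T", "C"), ("G", "C"), ("C", "T")):
        print(ref, alt, substitution_type(ref, alt))
```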
Analysis of the intronic flanking sequences
The flanking intronic regions directly upstream and downstream of each exon were sequenced over 50–200 nucleotides, and the complete sequences of introns 4, 7, and 10 were determined for alleles A, F, and M. In this way, 20 further polymorphic sites were identified besides the four polymorphic exon nucleotides (Fig. 7). In addition, an RsaI restriction site was found between exon 6 and exon 8 of alleles F and M, which is lacking in the A allele, giving a total of 25 polymorphic sites useful for phylogenetic allele comparisons. Taking these data into account, it is worth noting that in the 5′ part of the gene, up to exon 8, the nucleotide combination (haplotype) observed for the M allele is identical with that shown by the F allele. In contrast, in its 3′ part, beyond exon 8, the haplotype of the M allele is identical with that of the A allele, except at the polymorphic site located in exon 9.
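The haplotype comparison described above amounts to scanning ordered polymorphic sites from 5′ to 3′ and finding where the M allele switches from matching F to matching A. A minimal sketch of that logic follows. The site values are hypothetical placeholders chosen only to mimic the pattern reported (they are not the data of Fig. 7), and the function name is invented for illustration.

```python
def locate_breakpoint(recombinant, parent_5p, parent_3p):
    """Return (last 5'-parent-like site, first 3'-parent-like site).

    Sites are given in 5'->3' order. Only sites where the two parents
    differ are informative; sites where the recombinant matches neither
    parent (e.g. a later point mutation) are skipped. Returns None if
    the pattern is not a single 5'-parent -> 3'-parent switch.
    """
    calls = []
    for i, (m, p5, p3) in enumerate(zip(recombinant, parent_5p, parent_3p)):
        if p5 == p3 or m not in (p5, p3):
            continue
        calls.append((i, "5" if m == p5 else "3"))
    labels = "".join(label for _, label in calls)
    n5 = labels.count("5")
    if n5 in (0, len(labels)) or labels != "5" * n5 + "3" * (len(labels) - n5):
        return None
    return calls[n5 - 1][0], calls[n5][0]


# Hypothetical alleles at seven ordered sites (placeholders, not real data).
# Site 4 stands for exon 9 nucleotide 23: deleted in F, C in A, T in M.
F = ["C", "G", "A", "T", "del", "G", "C"]
A = ["T", "C", "G", "C", "C", "A", "T"]
M = ["C", "G", "A", "C", "T", "A", "T"]

if __name__ == "__main__":
    print(locate_breakpoint(M, F, A))  # switch lies between sites 2 and 3
    print(locate_breakpoint(F, F, A))  # no switch: None
```

With real data, the returned pair of sites brackets the interval in which the recombination break point must lie, in the same way that the text brackets it around exon 8.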
In addition, intron 5 was completely sequenced starting from genomic DNA isolated from blood of two goats, genotyped as M/F and F/F at the αs1-Cas locus. Compared with the bovine sequence, a deletion spanning nucleotides 376 to 594 was observed for both goats. The deleted region in this intron did not match any known sequence in the EMBL databank. Subsequently, the existence of this deletion was confirmed by PCR for six goats of different αs1-Cas genotypes (A/A, F/F, M/F) from different Italian breeds (Montefalcone, Teramana, Garganica, Girgentana, and Sarda) and for six sheep of different Italian breeds (Comisana, Gentile di Puglia, and Valle del Belice). These results: (a) confirm the difference in size (≈ 200 bp) previously reported [23] between goat (≈ 450 bp) and cattle (641 bp) intron 5, and (b) show that the ovine intron 5 is also shorter than the cattle one. This could be expected, given the phylogenetic proximity between sheep and goats.
Genotyping of the M allele
The genotyping procedure designed consists of two steps. The first one is an ACRS-PCR technique [26], the principle of which is to create a TaqI restriction site (TCGA) by using a mismatching primer (C9LM*), which allows both the F and M alleles to be discriminated from all the others (Fig. 8, Step IA). These two alleles are subsequently distinguished after a second amplification, which allows discrimination between the alleles on the basis of the fragment sizes (Fig. 8, Step IIA).
In the first step, a 266-bp (265 bp for the F allele) DNA fragment is amplified between primers MWU and C9LM* with every allele. After digestion with TaqI, the 265/266-bp fragment is split into two fragments (240 and 26 bp) for each allele except the M and F alleles, for which no TaqI site has been created (Fig. 8, Step IB), because of mutations (deletion or substitution) occurring at position 23 in exon 9 (TTGA instead of TCGA).
To discriminate between the M and the F alleles, we took advantage of the presence of an 11-bp insertion in intron 9 of the F allele, which is lacking in the M allele. Thus, using two primers, C9UM1 (forward) and C9LM1 (reverse), located just upstream from exon 9 and 82 nucleotides downstream of the 11-bp insertion site, respectively, a 238-bp DNA fragment was yielded by PCR starting from the M allele, whereas the F allele gives a 248-bp fragment (Fig. 8, Step IIA).
The individuals analysed here, which allowed the M allele to be characterized, were heterozygous M/F. Consistent with our structural results, they gave the two fragments (238 and 248 bp), as shown for one of them in Fig. 8, Step IIB (lane 1). It is worth noting that the third band observed with this sample is due to the occurrence of a heteroduplex structure. This was confirmed by analysing an amplification product from the mix of samples F/F and X/X (Fig. 8, Step IIB, lane 4).
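The fragment sizes expected at each genotyping step can be tabulated from the figures given above. The sketch below does this for the two steps. It relies on one reading of the text that is an assumption on our part: the Step II amplicon spans exon 9, so the F allele carries both its 11-bp intron 9 insertion and its 1-nt exon 9 deletion (238 + 11 − 1 = 248 bp). Step II sizes for alleles other than M and F are not given in the text and are left undefined; all names are illustrative.

```python
# Expected band sizes (bp) for the two-step genotyping of the M allele.
STEP1_AMPLICON = 266          # MWU / C9LM* product for every allele but F
TAQI_PRODUCTS = (240, 26)     # after TaqI digestion when TCGA is created
STEP2_M = 238                 # C9UM1 / C9LM1 product on the M allele
F_INTRON9_INSERTION = 11
F_EXON9_DELETION = 1


def step1_bands(allele):
    """Bands after ACRS-PCR and TaqI digestion (Step I)."""
    if allele == "F":
        return (STEP1_AMPLICON - F_EXON9_DELETION,)  # 265 bp, uncut
    if allele == "M":
        return (STEP1_AMPLICON,)                     # 266 bp, uncut (TTGA)
    return TAQI_PRODUCTS                             # cut at TCGA


def step2_band(allele):
    """Band after the second amplification (Step II); None if not described."""
    if allele == "M":
        return STEP2_M
    if allele == "F":
        # Assumption: the amplicon includes the exon 9 deletion site.
        return STEP2_M + F_INTRON9_INSERTION - F_EXON9_DELETION
    return None


def genotype_pattern(allele_1, allele_2):
    """Sorted band sizes expected for a diploid genotype at both steps."""
    step1 = sorted(set(step1_bands(allele_1)) | set(step1_bands(allele_2)))
    step2 = sorted({b for b in (step2_band(allele_1), step2_band(allele_2)) if b})
    return step1, step2


if __name__ == "__main__":
    print(genotype_pattern("M", "F"))
    print(genotype_pattern("A", "A"))
```

The heteroduplex band seen for M/F heterozygotes is a gel artefact and is not modelled here.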
DISCUSSION
We report the identification and the molecular character-
ization ofanew allele, named M, occurring at the a
s1
-Cas
locus inthe goat. This novel allele, characte rized by the
transition CfiT at position 23 inthe 9th exon ofthe gene,
was foundinthe Montefalcone breed, at v ery l ow frequency
(< 2%) after phenotypic analysis of 147 individual m ilk
samples. All goats bearing the M variant were s hown to be
heterozygous (M/F and M/B).
Interestingly, the mutation s pecific forthe M allele affects
the same nucleotide as that which is deleted inthe F allele,
and shown to be responsibleforthe internal deletion of 37
amino-acid residues inthe F variant, as a consequence of t he
1298 C. Bevilacqua et al.(Eur. J. Biochem. 269) Ó FEBS 2002
skipping of three successive exons during the course of the
processing ofthe primary transcripts [9]. At the peptide
level, the CfiT transition, which leads to a Ser66 fi Leu66
substitution, gives rise to the loss of two ofthe five
phosphate groups clustered inthe multiple phosphorylation
site ofthe a
s1
-Cas. This loss explains t he lower ele ctro-
phoretic mobility ofthe M protein compared with the other
caprine a
s1
-Cas variants described so far. This situation is
similartothatobservedinsheep,withthea
s1
-CasD variant
(previously called the Welsh variant). Actually, this ovine
variant has only two phosphoserine residues instead of five
in the homologous region ofthe a
s1
-CasA and C variants
[22]. However, whereas the structural alteration inthe D
(Welsh) variantis associated with a reduction in milk casein
content [29,30], the M variant, like thegoat a
s1
-CasA and B
variants, must be considered a ‘strong’ variant, given the
intensity ofthe isoelectrofocusing bands and the surface of
the relevant peak in RP-HPLC.
Fig. 6. Nucleotide sequence of the expected αs1-CasM cDNA obtained by genomic exon sequencing analysis: comparison with its A and F allele
counterparts. Numbering begins with the first nucleotide of the first exon (up) and the first amino-acid residue of the mature M protein (down).
Dashes indicate nucleotides identical with those of the M allele. The stop codon is symbolized by ***. Numbers in vertical framed arrows indicate
the position of the introns. The boxes indicate amino-acid substitutions.
Unexpectedly, placing variant M in the phylogeny (Fig. 9)
proposed by Grosclaude and Martin [6] turned out to be rather
difficult. Indeed, a comparison of the different variants at the
peptide sequence level suggests a hybrid structure for the M
protein. Taking into account amino-acid combinations at the
polymorphic residues (haplophenotypes), the M variant, with a
proline and a glutamine residue at positions 16 and 77,
respectively, could be placed in both lineages (A and B) arising
from the putative ancestral protein B1. This possible dual
membership strongly suggests the involvement of a
recombination/gene conversion event between alleles from the two
lineages. This hypothesis was strengthened by genomic sequence
data. Although a mutation-driven convergence cannot be excluded,
an interallelic recombination/gene conversion event seems to be
the most plausible explanation. Intronic sequences of the A, F and
M alleles (Fig. 7) strongly support such a notion. Indeed, a
detailed comparative analysis at 25 polymorphic sites, including
23 single nucleotide polymorphisms, spanning a large part of the
transcription unit provides a haplotype formula allowing each
allele to be precisely characterized. Thus, the M allele
unequivocally appears to be a hybrid structure made of F-type
allele sequences in its 5′ part followed by A-type allele
sequences in its 3′ part (except the transition C→T at nucleotide
23 in the ninth exon). Following such a scheme, a recombination
event would have occurred around exon 8 (Fig. 7). However, the
genomic sequences analysed do not allow us to distinguish whether
allele M originated from a double (gene conversion) or a single
(recombination) cross over. Nevertheless, as the 10 kb separating
exon 8 from the end of the transcription unit contain no sequence
clues indicating a second cross over, it seems most likely that
the M allele originated in an interallelic recombination event.
Gene conversion events, which usually account for exchanges over
short sequence tracts [31], have been mainly described and
intensively investigated as mechanisms generating allelic
diversity in highly polymorphic genetic systems, such as the loci
encoding the class-II cell surface antigens of the major
histocompatibility complex in humans [32,33]. Both mechanisms have
also been thought to account for genetic disorders in humans, such
as sporadic Alzheimer disease cases [34] and diabetic pathology
involving the gene encoding insulin [35].
Simplified haplotype formulae strongly suggest that the allele
that provided the 3′ part of the recombinant allele (M) is the A
allele (Fig. 10). In contrast, one can wonder whether the donor
allele of the 5′ part is the F allele or another allele belonging
to the same B allelic lineage (excluding B1 and C), as they share
the same simplified haplotype formula, up to exon 8. To reach a
definite conclusion, the complete sequence of the 5′ region of
each allele would be required, because no differences have been
found in the available sequences (exons and intron-flanking
regions).
If our recombination hypothesis is correct, the break point
should be located between nucleotide 86 upstream and nucleotide 40
downstream from exon 8, and the cross over should have been
accompanied by a reciprocal exchange. One can therefore expect to
find the reciprocal recombinant allele among the alleles so far
described. The structural features of such a recombinant allele
should be an A-type sequence in the 5′ part followed by a B-type
(B2/B3/B4 or F) sequence in the 3′ part. The only allele found so
far gathering such characteristics is allele B1, with a B2
simplified haplotype formula in its 3′ part (Fig. 10). This
confirms our assumption and suggests that the M allele probably
results from an interallelic recombination event involving alleles
A and B2 whereas the reciprocal event might have given B1.
However, with a leucine residue at position 66, it is clear that
the M allele does not arise directly from the recombination event
between alleles A and B2. It probably is derived from an
intermediate hybrid allele (B2:A), putatively W, not yet
identified, which was subsequently mutated at nucleotide 23 of the
ninth exon.
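The proposed scenario can be sketched with the simplified haplotype formulae given in the caption of Fig. 10 for alleles B2 and A, written here with S(P) for phosphoserine. The sketch is a toy model under stated assumptions, not a reconstruction of the authors' analysis: it assumes that the second, third and fourth symbols of each formula correspond to residues 16, 66 and 77, and that the single cross over around exon 8 falls between the second and third symbols. The function and variable names are hypothetical.

```python
# Simplified haplotype formulae from the Fig. 10 caption, one symbol per box.
B2 = ["H", "P", "S(P)", "E", "R", "T"]
A = ["H", "L", "S(P)", "Q", "R", "T"]

# Assumption: the break point around exon 8 lies after the second symbol
# (residue 16) and before the third (residue 66, encoded by exon 9).
BREAK = 2


def single_crossover(five_prime_donor, three_prime_donor, breakpoint):
    """Join the 5' part of one formula to the 3' part of the other."""
    return five_prime_donor[:breakpoint] + three_prime_donor[breakpoint:]


# Putative intermediate allele W (B2:A) and its reciprocal product (A:B2),
# which the text proposes might correspond to allele B1.
W = single_crossover(B2, A, BREAK)
RECIPROCAL = single_crossover(A, B2, BREAK)

# Allele M: W carrying the additional Ser66 -> Leu substitution.
M = list(W)
M[2] = "L"

print("W         ", "".join(W))
print("reciprocal", "".join(RECIPROCAL))
print("M         ", "".join(M))
```

In this toy model W and M both carry proline at the second position and glutamine at the fourth, as described in the text for residues 16 and 77 of the M variant, while the reciprocal product combines an A-type 5′ part with a B2-type 3′ part.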
Because of its close similarity to its bovine and ovine
αs1-Cas counterparts, allele B1 was considered to be the
ancestral allele in the goat [6]. The results reported here
indicate that B1 might result from an interallelic recombination
between alleles A and B2, which can therefore be
Fig. 7. Polymorphisms occurring at 25 sites in the goat αs1-CasA, F and M alleles. The position of each polymorphic site is identified and
numbered relative to the nearest exon. Intronic nucleotides are preceded by a ‘–’ or ‘+’ when they are upstream or downstream, respectively
(e.g. –11/1 corresponds to the nucleotide located 11 nucleotides upstream from the first exon). Polymorphic sites in an exon sequence are
identified without a sign (e.g. 8/4 identifies the 8th nucleotide of the 4th exon). RsaI/6–8 indicates the loss (–) or gain (+) of an RsaI
restriction site within the DNA fragment spanning exon 6 to exon 8. Mutations specific for alleles M and F at position 23 in the 9th exon
are highlighted. The symbol Δ indicates the nucleotide deletion in allele F [6]. The hatched boxes, identified by i7-e8-i8, encompass the
putative recombination region.
Fig. 8. Genotyping the M allele at the αs1-Cas locus. Step I: ACRS-PCR using the primer pair MWU and C9LM* yields a 265/266-bp fragment,
whatever the allele. Amplicons are then submitted to restriction by TaqI (A). The TaqI restriction site (TCGA) created in exon 9 is indicated.
Nucleotides C and A* correspond to the mutation characteristic for allele M and the substitution introduced within the primer C9LM*, respectively.
Fragments generated are finally analysed by agarose gel (2% Metaphore + 2% agarose) electrophoresis (B). Lane 1, molecular mass marker
(pBR322 digested by HaeIII); lane 2, nondigested PCR product; lanes 3–5, homozygous X/X, heterozygous M/F and heterozygous X/F samples,
respectively, where X represents an allele different from F, B, E, and C. Sizes (in bp) of DNA fragments are given on the right of the gel.
Step II: AS-PCR to discriminate between alleles F and M. (A) Amplification between primers C9UM1 and C9LM1 generates DNA fragments of
characteristic size for the allele. (B) Agarose gel (2% Metaphore + 2% agarose) analysis of amplicons from heterozygous M/F (lane 1),
homozygous X/X (lane 2), homozygous F/F (lane 3), F/F + X/X mix (lane 4), with X different from F, B, E, and C. Lane 5 shows a molecular mass
marker (pBR322, HaeIII digested). Sizes (in bp) of DNA fragments are given on the left of the gel.
considered representatives of two ancestral allelic lineages.
The reciprocal proposal, i.e. B1 and W are parental alleles, the
recombinant products of which are A and B2, cannot be ruled out
(Fig. 10). The latter proposal is, however, less plausible, given
the low frequencies at which alleles B1 and M have been found in
the goat populations analysed so far. It is worth noting that both
alleles are characteristic of local breeds, Poitevine (France) and
Montefalcone (Italy), respectively. Nevertheless, whatever the
hypothesis retained, the existence of two ancestral allelic
lineages seems to be the most likely scenario. Thus, interallelic
recombination between two alleles may be responsible for the
generation of four possible allelic lineages (represented by A,
B2, B1, and W), one of which (W/M) is revealed by this work. The
high polymorphism of the goat αs1-Cas system provides further
evidence that allelic diversity can arise from multiple pathways,
including shuffling of polymorphic sequences generated by point
mutations, through interallelic recombination events.
Fig. 9. Phylogeny proposed by Grosclaude and Martin [6] for the
αs1-Cas alleles and differences between the corresponding variants. The
phylogenetic tree proposed is based on the existence of a single
ancestral allele (B1), which was considered to be the original one given
its close similarity to its ovine and bovine αs1-Cas counterparts.
Fig. 10. A new phylogenetic tree integrating the possible interallelic recombination between two allelic lineages. The four alleles (B2, A, B1, and W)
putatively involved in the recombination event are schematically represented as a chain of six boxes (mimicking exons) on which are indicated
polymorphic amino-acid residues and their position in the peptide chain. A simplified haplotype formula is thus provided (e.g. HPS(P)ERT and
HLS(P)QRT for alleles B2 and A, respectively, where S(P) denotes phosphoserine). The RsaI polymorphic restriction site and insertions occurring, respectively, between exons 6 and 8
and within intron 9 are indicated. Alleles deriving from these four ‘potentially recombinant’ alleles (boxed) are circled. Arrows indicate a possible
pathway of evolution to alleles associated with high (black) or reduced (red) amounts of casein synthesized. The M allele is derived from allele W by
a single nucleotide transition C→T (nucleotide 23/exon 9) leading to the occurrence of a leucine residue (allele M) instead of the Ser (putative allele
W) in the multiple phosphorylation site of αs1-Cas. The new phylogeny has been enriched with three novel variants (H, I and L) reported in 1997 by
Chianese et al. [14].
[...]... 249, 1–7
21. Jaubert, A. & Martin, P. (1992) Reverse-phase HPLC analysis of a goat casein. Identification of αs1- and αs2-casein genetic variants. Lait 72, 235–247.
22. Ferranti, P., Malorni, A., Nitti, G., Laezza, P., Pizzano, R., Chianese, L. & Addeo, F. (1995) Primary structure of ovine αs1-caseins: localization of phosphorylation sites and characterization of genetic variants A, C and D. J. Dairy Res. 62, 281–296...
... G., Mahé, M.F., Grosclaude, F. & Ribadeau-Dumas, B. (1989) Sequence of caprine αs1-casein and characterization of those of its genetic variants which are synthesized at a high level, αs1-CnA, B and C. Protein Seq. Data Anal. 2, 181–188.
8. Brignon, G., Mahé, M.F., Ribadeau-Dumas, B., Mercier, J.C. & Grosclaude, F. (1990) Two of the three genetic variants of goat αs1-casein which are synthesized at a reduced...
... au fromage: le polymorphisme de la caséine αs1, ses effets, son évolution. INRA Prod. Anim. 7, 3–19.
11. Jordana, J., Sanchez, A., Jansa, M., Mahé, M.F. & Grosclaude, F. (1991) Estudio comparativo de razas caprinas españolas en relacion a las variantes de la caseina alphas1. Inf. Tec. Econ. Agrar. 11, 598–600.
12. Ramunno, L., Rando, A., Di Gregorio, P., Massari, M. & Masina, P. (1991) Struttura genetica di alcune...
... popolazioni caprine allevate in Italia al locus della caseina αs1. IX Congresso Naz. ASPA, 579–589.
13. Grosclaude, F., Mahé, M.F., Brignon, G., Di Stasio, L. & Jeunet, R. (1987) A Mendelian polymorphism underlying quantitative variations of goat αs1-casein. Genet. Select. Evol. 19, 399–412.
14. Chianese, L., Ferranti, P., Garro, G., Mauriello, R. & Addeo, F. (1997) Occurrence of three novel αs1-casein variants in...
variants in goat milk. IDF Seminar ‘Milk Protein Polymorphism’, pp. 259–267. International Dairy Federation, Bruxelles, Belgium.
15. Chianese, L., Garro, G., Ferranti, P., Speranza, L.M., Pilla, F., Bevilacqua, C. & Pietrolà, E. (1998) Occurence of new variant of goat αs1- and αs2-casein in a breed of Southern Italy. 25th International Dairy Congress, 82nd annual sessions.
16. Bevilacqua, C., Veltri, C. & Pilla, P... (1999) Genotyping of αs1-casein locus in Montefalcone goat population. XIII Congress ASPA, Piacenza, 179–181.
17. Aschaffenburg, R. & Drewry, J. (1959) New procedure for the routine determination of the various non-casein proteins of milk. 15th Int. Dairy Congr. London 3, 1631–1637.
18. Chianese, L., Mauriello, R., Intorcia, N., Moio, L. & Addeo, F. (1992) New αs2-casein variant from caprine milk. J. Dairy Res. 59,...
... P., Womack, J., Schmutz, S., Fries, R. & Gallagher, D.S. (1996) Standardization of cattle karyotype nomenclature: report of committee for the standardization of the cattle karyotype. Cytogenet. Cell Genet. 74, 259–261.
6. Grosclaude, F. & Martin, P. (1997) Casein polymorphisms in the goat. IDF Seminar ‘Milk Protein Polymorphism II’, pp. 241–253. International Dairy Federation, Palmerston North, New Zealand.
7...
... Martin, P., Mahé, M.F., Levéziel, H. & Mercier, J.C. (1990) Restriction fragment length polymorphism identification of the goat αs1-casein alleles. A potential tool in selection of individuals carrying alleles associated with a high level protein synthesis. Anim. Genet. 21, 341–351.
24. Sambrook, J., Fritsch, E.F. & Maniatis, T. (1989) Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor Laboratory...
restriction analysis of the bovine casein genes. Nucleic Acids Res. 18, 6829–6833.
3. Threadgill, D.W. & Womack, J.E. (1990) Genomic analysis of the major bovine milk protein genes. Nucleic Acids Res. 18, 6935–6942.
4. Hayes, H., Petit, E., Bouniol, C. & Popescu, P. (1993) Localization of the αs2-casein gene (CASAS2) to the homologous cattle, sheep, and goat chromosomes 4 by in situ hybridization. Cytogenet. Cell...
... 299–305
19. Chianese, L., Garro, G., Ferranti, P., Malorni, A., Addeo, F., Rabasco, A. & Molina Pons, P. (1995) Discrete phosphorylation generates the electrophoretic heterogeneity of ovine β-casein. J. Dairy Res. 62, 89–100.
20. Ferranti, P., Addeo, F., Malorni, A., Chianese, L., Leroux, C. & Martin, P. (1997) Differential splicing of pre-messenger RNA produces multiple forms of mature caprine αs1-casein. Eur. J. Biochem. .
CTT AGC ATC TTC CAT GGC TTG ATC 3′
C19L 5′ ATA CAC ACA AAC TCA CAA GG 3′
MWU 5′ CAA CAT ATT TTA AAT AAA ATT GAC AAT 3′
C9LM* 5′ ATA AAA ATG GTA TAC CTC ACT.
5′ CAA CTG CGT ATT AGT GAA GAA TG 3′
C2U 5′ AAT CAA ATT TTA TTA TAA GAC C 3′
C2L 5′ AAT AGC TAA TTA GAG ACC AT 3′
C3U 5′ GGT GTC AAA TTT AGC TGT TAA A 3′
C3L