Solubility-dependent structural formation of a 25-residue,
natively unfolded protein, induced by addition of a
seven-residue peptide fragment
Mitsugu Araki and Atsuo Tamura
Graduate School of Science, Kobe University, Nada, Japan
In order to elucidate the architectural principle of pro-
tein structure, the relationship between protein
sequence and tertiary structure has been studied from
various perspectives. Factors determining the mechanism
of structural stability have long been explored,
mostly by studying the stability and folding kinetics of
natural and mutated proteins [1–3]. Recently, compu-
tational protein designs have become advanced and
provide new insights into the factors determining pro-
tein structure, stability and folding [4], i.e. the redesign
of naturally occurring proteins [5–8] and the de novo
design of novel structures [9,10]. These studies suggest
that each well-packed structure is stabilized by a number
of intra- or intermolecular interactions, invoked by
the appropriate alignment of amino acid residues, and
the number of proteins with well-packed structures
seems extremely small compared with all possible
sequences (primary sequence space). In such cases,
how have the existing proteins attained their well-
packed structures? If the majority of the possible
Keywords
folding; intrinsically unstructured protein;
protein stability; self-association; solubility
Correspondence
A. Tamura, Graduate School of Science,
Kobe University, Nada, Kobe 657-8501,
Japan
Fax/Tel: +81 78 803 5692
E-mail: tamuatsu@kobe-u.ac.jp
Database
The atomic coordinates have been
deposited in the Protein Data Bank (PDB
ID code 2KFQ for FP1)
(Received 26 September 2008, revised 10
February 2009, accepted 12 February 2009)
doi:10.1111/j.1742-4658.2009.06961.x
To elucidate the architectural principle of protein structure, we focused on
sequestration from solvent, which is a common characteristic of folding
and self-associative precipitation. Because protein solubility can be
regarded as a basis for the potential ability to sequester from solvent, we
assume that poorly soluble proteins tend not only to precipitate, but also
to form solution structures. To examine this, the solubility of a 25-residue,
natively unfolded protein, modified from a zinc-finger domain of transcription
factor Sp1, was disturbed by adding a seven-residue hydrophobic peptide
fragment to the C-terminus. NMR and ultracentrifuge measurements
of the resulting sequence showed that a dissolved species forms an α-helical
structure in a 15–20 molecule oligomer. To elucidate the mechanism by
which the structure forms, we prepared two variants in which the added
fragments are less hydrophobic; the structural stabilities were then mea-
sured at various pH values. A fairly good correlation was observed
between stability and hydration potential, whereas a much stronger correla-
tion was observed between stability and solubility, indicating that the
stability is more strongly dependent on the ability to precipitate than on
dehydration. These results show that, among poorly soluble protein mole-
cules, dissolved species can be transformed from the solvent-exposed
unfolded state into a loosely packed structure via intermolecular inter-
actions. Because decreasing the protein solubility does not require the
primary sequence to have a sophisticated design, such a protein structure
might form readily and frequently, compared with the well-packed
structure found in native proteins.
Abbreviations
FP, final protein; IP, initial protein; ΔG_dissol, dissolution free energy; ΔG_f, folding free energy; ΔG_hyd, hydration potential.
FEBS Journal 276 (2009) 2336–2347 © 2009 The Authors Journal compilation © 2009 FEBS
sequences result in the unfolded state, the probability
of natural proteins exploring and evolving well-packed
structures would be extremely limited. It is thus
assumed that moderately structured states needed for
decent biological function might arise frequently. This
assumption is supported by protein-folding studies,
which shed light on the early evolution of natural proteins.
First, in a 74-residue library based on a binary
patterning of polar or nonpolar residues, most proteins
formed fluctuating structures reminiscent of molten
globule intermediates [11]. Second, using the lattice
model, folding simulations implied that a randomly
chosen sequence of amino acids frequently encodes a
globular conformation [12]. This simulation is based
on the concept that hydrophobicity is a driving force
in protein folding, in which a protein excludes part of
the molecule from the solvent water in a geometry-
specific manner [13].
In this study, we attempted to identify a crucial factor
in the formation of a moderately structured state
by focusing on the fact that sequestering from a
solvent is a common characteristic of folding and self-
associative precipitation or aggregation, the latter
frequently occurring during the handling of natural
and artificial proteins. Because the tendency of a protein
molecule to precipitate is represented by its solubility,
this physicochemical property can be regarded
as a criterion for the potential ability to squeeze out
solvent. In the case of a poorly soluble protein, which
disfavors exposure to solvent, it is likely that the dis-
solved species tend to form solution structures seques-
tered from the solvent. In order to confirm this, we
attempted to transform a 25-residue, soluble unfolded
protein (the initial protein; IP) into a structured pro-
tein (the final protein; FP) by altering the solubility.
First, the amino acid sequence of the IP was determined
on the basis of a zinc-finger domain of transcription
factor Sp1, which folds into a well-defined
structure consisting of a β hairpin and an α helix upon
binding to Zn²⁺; it is unfolded in the absence of the
metal [14]. Next, the solubility of the IP was decreased
by the addition of seven hydrophobic amino acids to
the C-terminus; we anticipated that the added frag-
ment would induce long-range interactions between
amino acids separated in the primary sequence [15]. As
a result, NMR, CD and ultracentrifuge measurements
showed that a dissolved species of the resulting
sequence, FP1, takes the form of an α-helical structure
in a 15–20 molecule oligomer, without addition of
Zn²⁺. We thus scrutinized the dependence of the structural
stability on protein solubility or hydration potential
using two variants of FP1 and by varying the
pH. A strong correlation between the stability and
solubility elucidated the mechanism of formation of the
loosely packed structure, induced by the association
with other copies of the same chain.
Results
Sequence and structural property of IP
Among several candidates for IP, which needs to be
unstructured in the native condition, we chose the
third zinc-finger domain Sp1f3 of transcription factor
Sp1, with two histidines (His21, His25) and two cysteines
(Cys5, Cys8), to bind coordinately to Zn²⁺ [14].
Sp1f3 is known to fold into a well-defined structure
upon binding to Zn²⁺, and is unfolded in the absence
of the metal. To suppress the excessively high solubility
of Sp1f3, which is caused by the high frequency of
ionizable amino acid residues (Table 1), residues 26–29
(Gln-Asn-Lys-Lys) in the C-terminal region were
removed and Lys1, Lys2, Glu7 and His17 were
replaced by alanine or tyrosine. In addition, His25,
which binds to Zn²⁺ in Sp1f3, was replaced by alanine
to suppress any possible interactions with trace
amounts of metal ions in solution, resulting in the
sequence of IP given in Table 1. In NOESY and TOCSY
spectra of 3 mM IP at pH 3.0, most of the NOE
peaks overlapped with TOCSY cross-peaks, indicating
that these NOE peaks are intraresidual. The remaining
NOE peaks, identified as non-intraresidual, came from
the sequential signals, CαH–NH, CβH–NH, NH–NH
and those related to CδH of prolines. In addition, far-UV
CD spectra of IP, which are typical of unfolded
proteins, did not change in the concentration range
0.4–2.9 mM (Fig. 1). All of these NMR and CD analyses
show that IP remains unfolded up to a concentration
of 3 mM.
Sequence and structural properties of FP
We attempted to add a peptide fragment to the C-terminus
of the IP, anticipating that a solution structure
might be formed throughout the length of the mole-
cule. The number of additional residues was limited to
six, because contiguous hydrophobic residues in a pro-
tein resulted in a reduced yield in peptide synthesis.
Table 1. Sequences of Sp1f3, IP and FPs.
Protein  1–10        11–20       21–30       31–32
Sp1f3    KKFACPECPK  RFMRSDHLSK  HIKTHQNKK
IP       YAFACPACPK  RFMRSDALSK  HIKTA
FP1      YAFACPACPK  RFMRSDALSK  HIKTAFIVVA  LG
FP2      YAFACPACPK  RFMRSDALSK  HIKTAYIVVA  LG
FP3      YAFACPACPK  RFMRSDALSK  HIKTAYISVA  LG
Among the various hydrophobic scales reported, we
used the hydration potential from the gaseous to the
aqueous phase (ΔG_hyd) of amino acids [16,17], because
it is quantitatively related to each side-chain and main-chain
component. We chose Gly, Pro, Leu, Ile, Val,
Ala, Phe, Cys and Met, which have notably larger
hydration potentials (greater than −2 kcal·mol⁻¹)
than those of any other amino acids (less than
−5 kcal·mol⁻¹), as candidates for amino acid residues
in the fragment. Next, the sequence (X1, X2, …, X6)
of the extra region was chosen to be complementary
[18] to P6, C5, A4, F3, A2 and Y1 in the N-terminal
region, assuming that the interactions between the
N- and C-terminal regions are important for folding of
the whole molecule. The resulting sequence of the extra
region in the final protein, FP1, became FIVVAL
(Table 1). In a NOESY spectrum of 3 mM FP1 at
pH 3.0, a number of NOE peaks, including long-range
NOEs, i.e. Y1CδH–I27CγH and A2CβH–V29CβH, and
medium-range NOEs, i.e. cross-peaks of dαN(i,i+2)
and dαβ(i,i+3), were observed in addition to intraresidue
and sequential NOEs. However, in the case of
0.9 mM FP1, mostly intraresidue and sequential NOE peaks,
as detected in the case of IP, were observed. In a
NOESY spectrum of 1.5 mM FP1, the long- and medium-range
NOEs observed at 3.0 mM FP1 were partially
observed. In addition to the NMR analyses, far-UV
CD spectra of FP1 showed that the shape is dependent
on the protein concentration (Fig. 1A). The [θ] value
at 222 nm for 0.4 mM FP1 was approximately −2000,
which is close to that for 0.4 mM IP, whereas it
became more negative, to approximately −4000, with
an increase in the protein concentration to 3 mM
(Fig. 1B). All of these NMR and CD analyses show
that the amount of secondary and tertiary structure in
FP1 increases with the protein concentration.
Self-association of FP1
We examined the degree of protein association at various
concentrations by measuring the sedimentation
equilibrium (Fig. 2A) and sedimentation velocity
(Fig. 2B). At 0.9 mM FP1, sedimentation equilibrium
measurements showed that most plots of the apparent
molecular mass are distributed over the range
0–10 000 Da, which corresponds to a molecular mass
for a monomeric or dimeric form of FP1 of 3500 or
7000 Da, respectively, although a few plots are
> 10 000. At 1.5 mM FP1, although a large number of
plots are distributed over the range 0–10 000 Da, the
number of plots above 10 000 becomes noticeably
higher. At 3.0 mM, most plots are in the range
50 000–70 000, which corresponds to a molecular mass
for an oligomer of 15–20 molecules. Sedimentation
velocity measurements were also performed by changing
the FP1 concentration. At 3.0 mM, the distribution
of the sedimentation coefficients showed two main
peaks, one at 4.4 and the other at 4.6 (Fig. 2B). These
peaks could be assigned to oligomers of 15–20 molecules,
because it was shown that most FP1 molecules
form this type of oligomer at 3 mM, according to the
equilibrium measurements. At 2.3 mM, the distribution
showed a distinct peak at 1, which is the smallest
observed sedimentation coefficient, in addition to the
main peak at 4.5. The sedimentation coefficient of
monomeric FP1 can be estimated by using the equation
correlating the sedimentation coefficient (S) and the
Fig. 1. Protein concentration dependence of far-UV CD spectra of IP and FP1. (A) Spectra of FP1. (B) [θ] values at 222 nm for IP and FP1. [θ] is molar ellipticity per residue.
molecular mass (M): $S = M(1 - \rho v_s)D/RT$, where ρ is
the density of a solution, v_s is the partial specific volume
of the solute, D is the diffusion constant of the solute,
R is the gas constant and T is absolute temperature. S
was calculated to be 0.4 by using M = 3500 g·mol⁻¹,
ρ = 1 g·cm⁻³, v_s = 0.7 cm³·g⁻¹, which is a typical
value for native proteins [19], R = 8.3 J·K⁻¹·mol⁻¹,
T = 293 K and D = 9.3 × 10⁻¹¹ m²·s⁻¹, obtained
using pulsed-field gradient NMR spectroscopy, as
described previously [20,21]. Therefore, the peak at 1
can be assigned to the monomer or small oligomer.
These sedimentation equilibrium and velocity measure-
ments indicate that the amount of monomeric FP1
decreases with an increase in the concentration, whereas
the number of 15–20 molecule oligomers increases.
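The estimate of the monomer sedimentation coefficient can be checked numerically. The following is a minimal sketch, assuming that the quoted sedimentation coefficients are expressed in Svedberg units (1 S = 10⁻¹³ s) and that all quantities are first converted to SI units; the function name and argument names are illustrative and do not come from the original work.

```python
# Sketch: sedimentation coefficient from S = M (1 - rho * v_s) D / (R T).
# Assumption: the values quoted in the text are in Svedberg units.

SVEDBERG = 1e-13  # seconds per Svedberg unit


def sedimentation_coefficient(molar_mass, rho, v_s, diffusion, temperature,
                              gas_constant=8.3):
    """Return the sedimentation coefficient in Svedberg units (SI inputs)."""
    buoyancy = 1.0 - rho * v_s  # dimensionless buoyancy factor
    s_seconds = molar_mass * buoyancy * diffusion / (gas_constant * temperature)
    return s_seconds / SVEDBERG


# Monomeric FP1 with the parameters given in the text, converted to SI units.
s_monomer = sedimentation_coefficient(
    molar_mass=3.5,      # 3500 g/mol expressed in kg/mol
    rho=1000.0,          # 1 g/cm^3 expressed in kg/m^3
    v_s=0.7e-3,          # 0.7 cm^3/g expressed in m^3/kg
    diffusion=9.3e-11,   # m^2/s, from pulsed-field gradient NMR
    temperature=293.0,   # K
)
print(f"Estimated S for monomeric FP1: {s_monomer:.2f} S")
```

With these inputs the estimate is about 0.4 S, in line with the value stated above, so a peak at 1 lies between the monomer estimate and the oligomer peaks near 4.5.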
pH dependence of solubility and stability
The peptide fragment added to the C-terminus of IP in
FP1 consists of hydrophobic amino acid residues,
which have notably large hydration potentials, ΔG_hyd.
In order to identify a determining factor in the structural
formation, we prepared two variants whose
hydrophobicity is lower than that of FP1: FP2 (Phe26Tyr)
and FP3 (Phe26Tyr and Val28Ser) (Table 1). The
hydration potential of FP2 or FP3 can be calculated to
be 5.4 or 12.5 kcal·mol⁻¹, respectively, lower than that
of FP1, based on ΔG_hyd derived from model compounds
[16]. The solubility of these mutants was measured
at various pH values (Fig. 3A). The individual
solubility of FPs increases gradually with a decrease in
pH in the range 6.5–7.3, presumably because of an
increment in the net charges caused by protonation of
the imidazole group in His21 and anionic sulfhydryls in
Cys5 and Cys8 (Fig. 3B). At all these pH values, the
solubility of FP2 is as high as that of FP1, and that of
FP3 is clearly the highest. Plots of the solubility at pH
< 6.4 were omitted because they were severely scat-
tered owing to the steep slope. Experimentally obtained
plots for each FP were fitted using Eqn (5), where r_p
values for the FPs were set to 384 Å, because the
amino acid compositions of the FPs were almost identical.
This r_p value was obtained by fitting FP3, for
which the error in the determination of r_p is smallest
among FPs. Values of $(\mu_{p(s)} - \mu^{\circ\prime}_{p(sol)})/RT$ obtained by
fitting FP1, FP2 and FP3 were −22.8 ± 0.1,
−22.5 ± 0.1 and −20.5 ± 0.0, respectively.
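The net-charge argument used above can be illustrated with a short calculation. This is a minimal sketch, assuming independent ionizable sites that follow the Henderson-Hasselbalch relation and using the pK values listed in the legend to Fig. 3 [22]; it is not the authors' Eqn (5), and the function name `net_charge` is hypothetical.

```python
# Sketch: pH-dependent net charge of FP1 from independent titratable sites.
# pK values are those listed in the legend to Fig. 3 [22]; Glu is absent from
# that list and from the FP sequences, so it is not handled here.

PK_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.0}  # protonated form carries +1
PK_NEGATIVE = {"Y": 10.9, "C": 8.3, "D": 3.9}   # deprotonated form carries -1
PK_N_TERMINUS = 9.1
PK_C_TERMINUS = 2.4

FP1 = "YAFACPACPKRFMRSDALSKHIKTAFIVVALG"  # sequence from Table 1


def net_charge(sequence, ph):
    """Return the Henderson-Hasselbalch net charge of a peptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PK_N_TERMINUS))    # alpha-NH3+ terminus
    charge -= 1.0 / (1.0 + 10 ** (PK_C_TERMINUS - ph))   # alpha-COO- terminus
    for residue in sequence:
        if residue in PK_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PK_POSITIVE[residue]))
        elif residue in PK_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PK_NEGATIVE[residue] - ph))
    return charge


for ph in (3.0, 5.0, 7.0):
    print(f"pH {ph:.1f}: net charge of FP1 = {net_charge(FP1, ph):+.2f}")
```

Under these assumptions the net positive charge grows as the pH is lowered, which is the trend invoked to explain the increase in solubility at lower pH.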
The pH dependence of the structural stabilities
represented by NOE intensities for 3 mM FP1 is given
in Fig. 4A,B. With an increase in pH, integrated intensities
of long-range NOE peaks (Fig. 4A), as well as
short- and medium-range NOE peaks (Fig. 4B),
increased. The increment was also confirmed for FP2
and FP3 at 3 mM (data not shown).
Discussion
Solution structures of FPs
Complete assignment of the proton chemical-shift
resonances was achieved for FPs, excluding the amide
Fig. 2. Sedimentation equilibrium and sedimentation velocity measurements of FP1. (A) The distribution of apparent molecular mass (Mw_app) against the location in the cell, obtained from sedimentation equilibrium measurements at protein concentrations of 0.9 (black), 1.5 (blue) and 3.0 mM (red). Apparent molecular mass was calculated at respective points in the cell, i.e. the higher the A_250, the closer to the bottom of the cell. (B) Distribution of sedimentation coefficients obtained from sedimentation velocity measurements at protein concentrations of 2.3 (blue) and 3.0 mM (red).
protons of the N-termini, which were not detected
because of rapid exchange with solvent. Some of the
NOE peaks in the NOESY spectra of 3 mM FPs, however,
overlapped and could not be identified separately.
Torsion angle restraints, obtained from DQF-COSY,
and distance restraints, obtained from clearly separated
NOE peaks at a mixing time of 200 ms, are given in
Table 2 for 3 mM FP1 at pH 3.0. Because the sedimentation
equilibrium and velocity measurements of FP1
showed that most of the dissolved species form oligomers
consisting of 15–20 monomers at a protein concentration
of 3 mM, it is likely that some distance
restraints arise from intermolecular interactions
caused by the added fragment. However, the ratio of
long-range NOEs related to the added fragment to the
total of long-range NOEs is 73%, which is notably
higher than that of intraresidue (21%), short-range
(30%) and medium-range (28%) NOEs, showing that
long-range interactions are generated mainly by the
added fragment. We deduced that long-range distance
restraints are caused by intermolecular interactions,
whereas other distance restraints are produced intramolecularly.
The final structural calculation of an FP1
molecule was performed using a total of 342 intraresidue,
short-range and medium-range distance restraints
and 25 backbone φ dihedral angle restraints (Table 2).
The resulting r.m.s.d. from the mean structure for the
backbone atoms is 2.79 ± 0.71 Å, which is worse than
that derived from typical native proteins (< 0.5 Å)
because long-range distance restraints were not
included in the structural calculation. A stereoview of
the 10 best FP1 structures (Fig. 5A) shows that the
backbone residues Phe12–Ile22 adopt an α helix. By
contrast, in the Sp1f3–Zn²⁺ complex, the backbone
residues Asp16–Gln26 form an α helix or a 3₁₀ helix,
whereas Phe12–Ser15 forms a turn between the second
β strand and the helix. Intermolecular interactions
were drawn on the lowest energy structures of FP1,
assuming that long-range restraints excluded in the
Fig. 3. pH dependence of the solubilities of FPs. (A) Experimentally obtained plots of the natural logarithm of the solubility, S (mol·L⁻¹), in the pH range 6.4–7.4. (B) Solubilities calculated using r_p = 384 Å and $(\mu_{p(s)} - \mu^{\circ\prime}_{p(sol)})/RT$ values of −22.8 ± 0.1, −22.5 ± 0.1 and −20.5 ± 0.0 for FP1, FP2 and FP3, respectively, in the pH range 2.5–6.0. Errors in ln S_calc for FPs are < 0.1. (Inset) Net charge curves of FPs calculated using pK values for the amino acid side chains, α-COOH and α-NH₃⁺ termini: Tyr = 10.9, Cys = 8.3, Lys = 10.8, Arg = 12.5, Asp = 3.9, His = 6.0; NH₃⁺ of the N-terminus = 9.1, and COOH of the C-terminus = 2.4 [22].
Table 2. Structural statistics for the 10 lowest energy structures of FP1.
Number of distance restraints
  Intraresidue                              150
  Short-range (|i − j| = 1 residue)         118
  Medium-range (|i − j| = 2–4 residues)     74
  Long-range (|i − j| > 4 residues)         (52)^a
Number of torsion angle restraints
  φ                                         25
Geometric statistics
  r.m.s.d. from the mean structure (Å)
    Backbone atoms (residues 3–30)          2.79 ± 0.71
    All heavy atoms (residues 3–30)         3.51 ± 0.81
Ramachandran analysis
  Most favored regions (%)                  42.1
  Additional allowed regions (%)            37.5
  Generously allowed regions (%)            18.9
  Disallowed regions (%)                    1.4
^a Long-range distance restraints were not used in the structure calculation (see text).
structural calculation are induced intermolecularly
(Fig. 5B). The long-range interactions can be divided
into three categories: (a) interactions between the
N-terminus (Tyr1–Cys8) and the C-terminus (His21–
Ala30), (b) interactions between the N-terminus and
the middle (Arg11–Arg14), and (c) interactions
between the middle and the C-terminus. The N- and
C-termini contain mainly hydrophobic amino acids,
and the middle also includes the hydrophobic amino
acids, Phe12 and Met13. Because most of the long-
range NOEs were assigned to these hydrophobic resi-
dues, the intermolecular interactions are presumably
hydrophobic. CD analysis and structural determination
showed that the amount of α helix increases with an
increase in FP1 concentration. Therefore, it is likely
that the α helix, whose constituent interactions are
mainly intramolecular, is induced by the hydrophobic
interactions between FP1 molecules. It should also
be noted that the structural specificity is apparently
low and the structure is therefore loosely packed,
because the proton chemical shifts are poorly dispersed
compared with those observed in typical native
proteins, despite the appearance of a number of NOE
peaks.
Physicochemical factors that determine the
structural stability
Here, we scrutinize the dependence of structural stability
on protein solubility or hydration potential to elucidate
the determining factor in structural formation.
First, the stability was derived from the fraction of
the structured molecules at each pH for 3 mM FPs by
using six, clearly separated, short- and medium-range
NOE peaks, in which distances related to NOE inten-
sities were set to the best structure of FP1 [15]. The
fractions obtained are shown in Fig. 4C. Fractions of
FP1 and FP2, which are close to 0 at pH 2.5, increase
to 0.4 as pH is increased to 4.2. The structural
stabilities could not be calculated at pH values > 4.2
because of an increase in the aggregation rate. By con-
trast, the FP3 fraction is close to 0 at pH 3.8, and
increases to 0.4 as pH increases to 5.6. The folding
free energies (ΔG_f) can be calculated using the
fractions (Table 3) according to the following simple
scheme: $15\mathrm{D} \rightleftharpoons \mathrm{N}_{15}$, because ultracentrifuge measurements
suggested that, up to 3 mM, the dissolved species
contain mainly the monomer and 15–20 molecule
oligomer.
Fig. 4. pH dependence of the structural stabilities of FPs. (A) Integrated intensities of long-range NOE peaks of FP1. (B) Integrated intensities of short- and medium-range NOE peaks of FP1. (C) Fractions of structured molecules of FP1 (red), FP2 (blue) and FP3 (green).
Second, the hydration potential (ΔG_hyd) was
evaluated at each pH value as follows: ΔG_hyd values
for the FPs were calculated using the hydration potentials
of the amino acid side chains and the backbone
[16,17]. In addition, because the individual hydration
potential for ionizable side chains, α-COOH and
α-NH₃⁺ termini depends on the pH of the solution,
i.e. a protonated cation or deprotonated anion is
stabilized or destabilized, respectively, with a decrease
in pH [16], pH dependence was taken into account
using pK values (Table 3) [22]. Third, solubility was
represented as the dissolution free energy of solute p
(ΔG_dissol), which can be calculated using
$\Delta G_{\mathrm{dissol}} = -RT \ln S_p$, where R is the gas constant, T is absolute
temperature and S_p is the solubility of solute
p. ΔG_dissol reflects the free energy of transfer of solute
p from the solid phase to aqueous solution. For a
protein as a solute, solubility is known to depend on
polarity, hydrophobicity and net charge [23–25], the
latter of which increases positively with a decrease in
pH. The solubility at each pH value for the FPs was
calculated by taking these factors into account and
using Eqn (5) (Fig. 3B and Table 3).
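The two free energies defined above can be evaluated with a few lines of code. The following is a minimal sketch under stated assumptions: the temperature is taken as 293 K (the value quoted for the sedimentation estimate; the NMR temperature is not given in this section), and the fraction of structured molecules f at total concentration C is mapped to [N15] = fC/15 and [D] = (1 − f)C. That mapping and the function names are illustrative assumptions, and the sketch is not intended to reproduce the tabulated values exactly.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def dissolution_free_energy(solubility_molar, temperature=293.0):
    """Delta G_dissol = -RT ln S_p in kcal/mol (S_p in mol/L)."""
    return -R_KCAL * temperature * math.log(solubility_molar)


def folding_free_energy(fraction_structured, total_molar, n=15, temperature=293.0):
    """Delta G_f = -RT ln([N_n] / [D]^n) in kcal/mol for the scheme nD <-> N_n.

    Assumption: a fraction f of all chains sits in n-mers, so that
    [N_n] = f * C / n and [D] = (1 - f) * C.
    """
    oligomer = fraction_structured * total_molar / n     # [N_n]
    monomer = (1.0 - fraction_structured) * total_molar  # [D]
    log_ratio = math.log(oligomer) - n * math.log(monomer)
    return -R_KCAL * temperature * log_ratio


# Example inputs: a solubility of 10 uM, and 3 mM protein with 40% structured.
dg_dissol = dissolution_free_energy(10e-6)
dg_fold = folding_free_energy(0.4, 3e-3)
print(f"dG_dissol = {dg_dissol:.2f} kcal/mol, dG_f = {dg_fold:.1f} kcal/mol")
```

Because [D] enters to the 15th power, ΔG_f in this scheme is large in magnitude and very sensitive to the monomer concentration, which is why modest changes in the structured fraction shift it by several kcal·mol⁻¹.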
Plots of ΔG_f, obtained using NOE peaks as
described above, against ΔG_hyd show a fairly good
correlation (r = −0.70; Fig. 6A), i.e. the less hydrated
the protein, the more stable the solution structure.
However, plots of ΔG_f against ΔG_dissol show a much
stronger correlation (r = −0.86; Fig. 6B), i.e. the more
insoluble the protein, the more stable the structure.
These results indicate that the structural stability is
more strongly dependent on the precipitation capabil-
ity than on the dehydration capability. This means
that, even if it becomes more hydrated, a protein that
prefers precipitation retains the stable structure, as
shown in the case of Phe26Tyr replacement. These pre-
cipitable proteins might tend to form the solution
structure because both structure formation and precipi-
tation require the self-association of protein molecules
to sequester from solvent.
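The comparison of the two correlations can be repeated from the values in Table 3. The sketch below assumes that a plain Pearson coefficient over all 17 tabulated points (FP1, FP2 and FP3 pooled) is what the quoted r values refer to; the helper `pearson` is written out here for illustration and is not taken from the original work.

```python
import math

# (dG_f, dG_dissol, dG_hyd) in kcal/mol, transcribed from Table 3.
TABLE3 = [
    (-44.4, -3.86, -500.4), (-46.5, -3.24, -498.9), (-49.6, -2.59, -497.2),
    (-55.1, -1.90, -495.4), (-54.1, -1.10, -493.4), (-67.4, 0.169, -490.4),
    (-59.2, 0.545, -489.5),                                   # FP1
    (-47.9, -2.93, -502.9), (-48.1, -2.40, -501.5), (-50.6, -1.96, -500.4),
    (-49.5, -0.180, -496.1), (-58.6, 0.740, -493.9),          # FP2
    (-49.0, -1.34, -503.1), (-62.0, -0.342, -500.6), (-57.3, 0.315, -498.7),
    (-59.6, 0.958, -496.2), (-67.0, 2.43, -490.7),            # FP3
]


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


dg_f = [row[0] for row in TABLE3]
r_dissol = pearson([row[1] for row in TABLE3], dg_f)
r_hyd = pearson([row[2] for row in TABLE3], dg_f)
print(f"r(dG_f, dG_hyd) = {r_hyd:.2f}, r(dG_f, dG_dissol) = {r_dissol:.2f}")
```

Pooling all points in this way gives coefficients close to the quoted r = −0.70 and r = −0.86, with the solubility-based index correlating more strongly than the hydration potential.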
Mode of formation of loosely packed structure
through intermolecular interactions
When the hydrophobic peptide fragment FIVVALG
was added to the C-terminus of unstructured
IP consisting of 25 residues, formation of the overall
protein structure was induced in FP1, with a drastic
Fig. 5. NMR structures of FP1. (A) Backbone traces of the 10 best structures. Backbones of residues 12–22, which adopt an α helix, are drawn in red. (B) Schematic diagram of long-range interactions between FP1 molecules. Long-range interactions are represented by blue lines between the lowest energy structures of FP1. The side chains of Tyr1, Phe3, Ala4, Cys5, Pro6, Ala7, Cys8, Arg11, Phe12, Met13, Arg14, His21, Ile22, Lys23, Ala25, Phe26, Ile27, Val28, Val29 and Ala30, which are related to the long-range interactions, are indicated in green. In addition, NOE peaks including aromatic protons of Phe3 and Phe12 in FP1 are not clearly separated because chemical shifts of aromatic protons of Phe3 and Phe12 are close to those of Phe26. Therefore, long-range interactions including aromatic protons of Phe3 and Phe12 in FP2, which is the variant Phe26Tyr of FP1, are added.
decrease in solubility, i.e. the solubility of IP was
> 2.2 mM at pH 7.1 (data not shown), whereas that
of FP1 was 10 µM (Fig. 3A). This loosely packed
structure is maintained by intermolecular interactions,
indicating that the added peptide fragment confers
the ability to form the protein structure by having
low-specificity interactions. How much hydrophobicity
is needed in the added fragment to form this struc-
ture? Phe26Tyr replacement in the added fragment
resulted in a similar stability in FP2, while keeping
the same solubility. However, the additional Val28Ser
replacement led to a drastic decrease in the stability
of FP3, which showed higher solubility than FP2,
indicating that hydrophilic replacement of two of
seven hydrophobic residues in the added fragment
deprives the whole protein of the ability to form
structures. In fact, a decrease of 1.5 kcal·mol⁻¹ in
the dissolution free energy by the replacements
Phe26Tyr and Val28Ser resulted in a decrease of
5 kcal·mol⁻¹ in structural stability (Fig. 6B). These
results demonstrate that, among poorly soluble pro-
teins, dissolved species tend to be transformed from
the solvent-exposed unfolded state into a loosely
packed solution structure through intermolecular
interactions.
The mechanism of structural formation through
intermolecular interactions is similar to that of intrinsi-
cally unstructured proteins, which are devoid of
the well-defined secondary and tertiary structure in
Fig. 6. Relationships between structural stability and hydrophobic indices. The correlation between the folding free energy (ΔG_f) and (A) the hydration potential (ΔG_hyd), defined as the free energy of transfer of a protein solute molecule from the gaseous phase into water, or (B) the dissolution free energy ($\Delta G_{\mathrm{dissol}} = -RT \ln S$), for FP1 (red), FP2 (blue) and FP3 (green). Lines represent linear fits with correlation coefficients of −0.70 and −0.86, respectively. To calculate ΔG_hyd of FPs, hydration potentials of 18 amino acid side chains, excluding Pro and Arg, were taken from the values measured by Wolfenden et al. [16]. Those of Pro and Arg side chains and the backbone are taken from the values measured by Privalov et al. [17]. pK values of the amino acid side chains, α-COOH and α-NH₃⁺ termini are taken from the values in the legend to Fig. 3 [22].
Table 3. pH dependence of folding free energy (ΔG_f), dissolution free energy (ΔG_dissol) and hydration potential (ΔG_hyd) of FPs. Errors in pH and ΔG_dissol are < 0.02 and 0.1, respectively. The folding free energy was calculated according to $\Delta G_f = -RT \ln([\mathrm{N}_{15}]/[\mathrm{D}]^{15})$, where [N15] and [D] are the concentrations of the structured oligomer and unfolded monomer, respectively. An error in the ΔG_f value for FP3 at pH 5.65 could not be obtained because only the spectrum at a mixing time of 300 ms was analyzed owing to an increase in the aggregation rate.
Protein  pH    ΔG_f (kcal·mol⁻¹)  ΔG_dissol (kcal·mol⁻¹)  ΔG_hyd (kcal·mol⁻¹)
FP1      2.67  −44.4 ± 0.2        −3.86                   −500.4
         2.85  −46.5 ± 0.2        −3.24                   −498.9
         3.05  −49.6 ± 0.3        −2.59                   −497.2
         3.27  −55.1 ± 0.4        −1.90                   −495.4
         3.52  −54.1 ± 0.3        −1.10                   −493.4
         3.91  −67.4 ± 2.7        0.169                   −490.4
         4.03  −59.2 ± 0.8        0.545                   −489.5
FP2      3.00  −47.9 ± 0.3        −2.93                   −502.9
         3.17  −48.1 ± 0.3        −2.40                   −501.5
         3.31  −50.6 ± 0.4        −1.96                   −500.4
         3.86  −49.5 ± 0.3        −0.180                  −496.1
         4.16  −58.6 ± 1.0        0.740                   −493.9
FP3      3.87  −49.0 ± 0.4        −1.34                   −503.1
         4.20  −62.0 ± 0.6        −0.342                  −500.6
         4.47  −57.3 ± 0.4        0.315                   −498.7
         4.82  −59.6 ± 1.3        0.958                   −496.2
         5.65  −67.0              2.43                    −490.7
isolation, but adopt relatively rigid conformations
upon binding a specific molecular partner of ligands or
substrates [26–29]. FPs are likewise unstructured
in isolation, as in the case of intrinsically
unstructured proteins, whereas a helix (Phe12–Ile22) is
induced with an increase in the concentration. One reason
that a local conformation consistent with a helix is
formed could be attributed to the potential ability
[30,31] and/or local structural preference [32] in the
sequence, because the zinc-finger domain Sp1f3, initially
chosen as a basis for the FPs, also forms a tertiary
structure containing a helix (Asp16–Gln26) upon
binding to Zn²⁺ [14]. We show that the local structure,
stabilized originally by coordinate bonds to the metal
ion, could be induced by interactions with other copies
of the same chain, after the addition of a proper
hydrophobic segment responsible for the decrease in
solubility. Furthermore, this inductive mechanism indicates
that hydrophobicity could be regarded as a driving
force in the structural formation of FPs, as
observed in protein folding in general [13]. By using a
simple model of short, self-avoiding flexible chains on
lattices, in which the only energetic feature of the
sequence is the hydrophobic interaction, protein fold-
ing simulations imply a significant probability that a
random sequence of amino acids will encode a globu-
lar conformation, in general, and a particular native
structure, in specific [12]. The globular conformation is
interpreted as being like a molten globule, stabilized
by intramolecular hydrophobic interactions [12,33].
However, our results show that the unfolded protein
can be transformed into the structured assembly by
altering the solubility. Because decreasing the protein
solubility does not require a sophisticated design for
the primary sequence, it is implied that the loosely
packed structures with intermolecular interactions
shown in FPs may arise readily and frequently com-
pared with well-packed structures. The existence of this
moderately structured state might serve as an interme-
diate stage in the search for the well-packed structures
of natural proteins in the vast primary sequence space.
In addition, the high occurrence ofa primary sequence
that prefers self-association seems closely connected to
the inherent tendency of natural proteins to aggregate
and form potentially harmful deposits such as amyloid
fibrils [34–38].
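The lattice argument cited above [12] can be illustrated with a toy hydrophobic-polar (HP) chain. The sketch below is not the simulation of ref. [12]; it is a minimal illustration that assumes a 2D square lattice, a two-letter alphabet ('H' hydrophobic, 'P' polar) and an energy of -1 per non-bonded H-H contact. All function names are hypothetical.

```python
# Toy HP lattice model: exhaustive enumeration of self-avoiding chains.
# Assumptions (not from the paper): 2D square lattice, energy = -1 per
# non-bonded H-H nearest-neighbour contact, no other interactions.

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def conformations(n):
    """Yield every self-avoiding walk of n beads that starts at the origin."""
    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = (0, 0)
    yield from extend([start], {start})


def hh_contacts(seq, path):
    """Count H-H lattice contacts between beads that are not chain neighbours."""
    index_at = {pos: i for i, pos in enumerate(path)}
    count = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Look only in +x and +y so that each contact pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                count += 1
    return count


def min_energy(seq):
    """Lowest energy over all conformations (minus the maximum contact count)."""
    return -max(hh_contacts(seq, p) for p in conformations(len(seq)))


if __name__ == "__main__":
    for seq in ("HPPH", "HHPP", "PPPP"):
        print(seq, min_energy(seq))
```

In this toy model a sequence gains stabilizing energy only when hydrophobic beads that are distant along the chain can be brought into contact, which is the sense in which hydrophobicity alone can select compact conformations.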
Materials and methods
Protein synthesis and purification
Proteins were synthesized using the Pioneer Peptide Synthesis System
(PerSeptive Biosystems, Foster City, CA, USA) with Fmoc solid-phase chemistry,
and were cleaved from the resin with a solution containing 82.5%
trifluoroacetic acid, 5% H2O, 5% thioanisole, 2.5% 1,2-ethanedithiol and 0.8 M
phenol (v/v/v/v/v). Individual proteins were purified by reverse-phase HPLC
(acetonitrile/H2O/0.1% trifluoroacetic acid). Protein identity was confirmed by
a laser desorption ToF/MS apparatus, AXIMA-CFR (Shimadzu, Kyoto, Japan).
Protein samples for all studies were lyophilized and stored under anaerobic
conditions. These purified proteins were handled under a N2 atmosphere in
buffers deoxygenated with N2 to prevent cysteine oxidation.
CD measurements
Spectra were acquired at 20 °C on a Jasco J-720 CD spectropolarimeter with
0.1, 0.2, 1, 5 and 10 mm pathlength cuvettes on 0.004 to 3 mM protein samples.
After each protein was dissolved in a buffer containing 25 mM acetic acid,
2–4 mM NaOH and 50 mM NaCl in 90% H2O and 10% D2O, the solution was adjusted to
pH 3.0 (± 0.1) with NaOH or HCl. Protein concentrations were determined
spectroscopically by measuring the amount of protein sulfhydryls with Ellman's
reagent [39].
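The Ellman-based concentration determination amounts to a Beer-Lambert calculation on the released TNB anion. The sketch below is an illustration only: the extinction coefficient (14 150 M^-1 cm^-1 at 412 nm), the dilution factor, the absorbance reading and the number of thiols per chain are assumed values that the text does not state, and the function name is hypothetical.

```python
# Beer-Lambert estimate of protein concentration from an Ellman's assay.
# Assumed (not given in the text): molar absorption coefficient of the TNB
# anion at 412 nm, a commonly quoted literature value.
EPSILON_TNB_412 = 14150.0  # M^-1 cm^-1


def protein_conc_from_ellman(a412, thiols_per_chain, path_cm=1.0,
                             dilution=1.0, epsilon=EPSILON_TNB_412):
    """Return the protein concentration (M) in the undiluted sample.

    a412             -- blank-corrected absorbance at 412 nm
    thiols_per_chain -- free sulfhydryl groups per protein chain
    path_cm          -- optical path length in cm
    dilution         -- factor by which the sample was diluted for the assay
    """
    if thiols_per_chain <= 0:
        raise ValueError("thiols_per_chain must be positive")
    thiol_conc = a412 / (epsilon * path_cm)      # M of TNB = M of free thiol
    return thiol_conc * dilution / thiols_per_chain


if __name__ == "__main__":
    # Hypothetical reading: A412 = 0.283 after a 100-fold dilution,
    # two cysteines per chain.
    conc = protein_conc_from_ellman(a412=0.283, thiols_per_chain=2,
                                    dilution=100.0)
    print(f"{conc * 1e3:.2f} mM")
```

Dividing by the number of thiols per chain converts the sulfhydryl concentration into a chain concentration; this step assumes that all cysteines are reduced, which is why the samples are handled under N2.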
Ultracentrifuge measurements
Each protein sample was prepared as described above. Sedimentation velocity and
sedimentation equilibrium measurements were performed using a Beckman-Coulter
Optima XL-1 analytical ultracentrifuge (Fullerton, CA, USA) with an An-60 rotor
and two-channel charcoal-filled Epon cells at 20 °C and pH 3.0 (± 0.1).
Sedimentation equilibrium was measured at protein concentrations of 0.9, 1.5
and 3.0 mM; sedimentation velocity was measured at 2.3 and 3.0 mM. Data were
analyzed using UltraScan 6.01 (http://www.ultrascan.uthscsa.edu/).
NMR spectroscopy
NMR measurements were performed on a Bruker DMX-750 spectrometer at 20 °C on
0.9 to 3.0 mM protein samples. After each protein was dissolved in a buffer
containing 25 mM acetic acid, 0–4 mM NaOH and 50 mM NaCl in 90% H2O and 10%
D2O, the solution was adjusted to the target pH with NaOH or HCl. Pulsed-field
gradient NMR spectra were acquired at 20 °C and pH 3.0 on 0.5 mM FP1 in the
presence of 40 mM 1,4-dioxane; at this concentration, CD and sedimentation
equilibrium measurements suggested that FP1 was in the monomer state. All
chemical shifts were referenced to the sodium salt of trimethylsilylpropionate.
Pulsed-field gradient NMR spectroscopy, 2QF COSY, TOCSY (mixing times of 30 and
80 ms) and NOE spectroscopy experiments were performed, and water suppression
was achieved by selective presaturation or field-gradient pulses [40]. Proton
resonances were assigned using the sequential assignment procedure [41].
Fractions of structured molecules were obtained by analyzing NOESY spectra at
mixing times of 100, 150, 200, 250 and 300 ms for the 3.0 mM protein solution.
Structure calculations
Distance restraints were obtained by converting integrated NOE peak intensities
into distance upper limits, using the macro CALIBA in DYANA [42]. Standard
pseudo-atom distances were used where needed. Torsion angle constraints for
$\phi$ were determined from $^{3}J_{\mathrm{N}\alpha}$ and classified into three
categories, −120 ± 70°, −120 ± 50° and −120 ± 40°, corresponding to
$^{3}J_{\mathrm{N}\alpha}$ < 7.5, 7.5–8.5 and > 8.5 Hz, respectively. With a
cut-off of 0.2 Å for upper-bound NOE violations, 50 structures were generated
using DYANA, and the 10 lowest-energy structures were selected to represent the
3D structures. Ramachandran analysis was performed using PROCHECK [43].
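The three-way classification of torsion angle constraints can be written as a small lookup. This is a sketch rather than the authors' script: the function name is hypothetical, and the assignment of the exact boundary values 7.5 and 8.5 Hz to the middle class is an assumption, since the text does not say which side they fall on.

```python
def phi_restraint(j_hz):
    """Return (lower, upper) bounds on phi in degrees for a 3J coupling in Hz.

    Classes follow the text: -120 +/- 70 for J < 7.5 Hz, -120 +/- 50 for
    7.5-8.5 Hz and -120 +/- 40 for J > 8.5 Hz. Boundary values are assigned
    to the middle class (an assumption).
    """
    if j_hz < 7.5:
        half_width = 70.0
    elif j_hz <= 8.5:
        half_width = 50.0
    else:
        half_width = 40.0
    centre = -120.0
    return (centre - half_width, centre + half_width)


if __name__ == "__main__":
    for j in (6.0, 8.0, 9.2):
        low, high = phi_restraint(j)
        print(f"3J = {j:.1f} Hz -> phi in [{low:.0f}, {high:.0f}] degrees")
```

Larger couplings give a narrower window around −120°, so residues with the largest coupling constants contribute the tightest torsion restraints to the structure calculation.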
Solubility measurements
Solubility was estimated using saturated protein solutions as follows. Samples
of 200–400 µL protein suspensions in buffers containing 25 mM phosphate and
50 mM NaCl in 90% H2O and 10% D2O were mixed thoroughly by pipetting, then
incubated for 20 min at 25 °C. After the incubated samples were centrifuged at
17 000 g for 20 min at 25 °C to remove the precipitate, the pH and
concentration of the individual supernatant solutions were measured. Protein
concentrations were determined spectroscopically by measuring the amount of
sulfhydryls with Ellman's reagent [39].
Analysis of protein solubility
The chemical potential of a solute p in a real solution
($\mu_{p(\mathrm{sol})}$) is generally expressed by

$$\mu_{p(\mathrm{sol})} = \mu^{0}_{p(\mathrm{sol})} + RT \ln \gamma_p S_p \tag{1}$$

where $\mu^{0}_{p(\mathrm{sol})}$ is the chemical potential in the ideal
solution at a standard concentration of p, $R$ is the gas constant, $T$ is the
absolute temperature, $\gamma_p$ is the activity coefficient of p and $S_p$ is
the concentration of p. As a first approximation, if p is present as a
$p^{z+}$ ion impenetrable to the solvent, as for compact protein ions,
$\mu^{0}_{p(\mathrm{sol})}$ can be divided into the free energy of solvation
($\Delta G^{0}_{\mathrm{solv}}$), which depends on the valence of the ion, $Z$,
and a term independent of the charge of the ion,
$\mu^{0\prime}_{p(\mathrm{sol})}$:

$$\mu^{0}_{p(\mathrm{sol})} = \mu^{0\prime}_{p(\mathrm{sol})} + \Delta G^{0}_{\mathrm{solv},p} \tag{2}$$

$\Delta G^{0}_{\mathrm{solv}}$ in Eqn (2) is expressed by the Born equation:

$$\Delta G^{0}_{\mathrm{solv},p} = -\frac{Z^{2} e^{2} N_a}{8\pi\varepsilon_0 r_p}\left(1 - \frac{1}{\varepsilon_r}\right) \tag{3}$$

where $e$ is the charge of an electron, $N_a$ is Avogadro's number,
$\varepsilon_0$ is the electric constant, $\varepsilon_r$ is the relative
permittivity, and $r_p$ is the radius of the ion. In addition, $\gamma_p$ in
Eqn (1) is expressed by the extended Debye–Hückel law:

$$\log \gamma_p = -\frac{A Z^{2} \sqrt{I}}{1 + B r_p \sqrt{I}} \tag{4}$$

where $A$ and $B$ are constants, and $I$ is the ionic strength of the solution.
In a saturated solution, because $\mu_{p(\mathrm{sol})}$ is equal to the
chemical potential of p in the solid ($\mu_{p(\mathrm{s})}$), we can obtain an
equation for the solubility of p by using Eqns (1–4):

$$\ln S_p = \left(\frac{(\ln 10)\, A \sqrt{I}}{1 + B r_p \sqrt{I}} + \frac{C}{r_p}\right) Z^{2} + \frac{\mu_{p(\mathrm{s})} - \mu^{0\prime}_{p(\mathrm{sol})}}{RT} \tag{5}$$

where

$$C = \frac{e^{2} N_a}{8\pi\varepsilon_0 RT}\left(1 - \frac{1}{\varepsilon_r}\right) \tag{6}$$

Assuming that the dissolved protein is a spherical ion with a net charge of $Z$
and radius $r_p$ impenetrable to solvent [24], experimentally obtained plots of
individual protein solubility were fitted using Eqn (5) with $A = 0.512$
(L$^{1/2}$·mol$^{-1/2}$), $B = 0.329$ (Å$^{-1}$·L$^{1/2}$·mol$^{-1/2}$) and
$C = 281$ (Å) at 25 °C.
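Eqns (5) and (6) can be evaluated numerically as follows. This is a sketch under stated assumptions, not the fitting procedure used for the paper: the function names are hypothetical, the relative permittivity of water (78.4) and the example values of r_p, I and the charge-independent offset are illustrative choices that the text does not give, and the signs follow the standard Born and Debye–Hückel forms.

```python
import math

# Constants quoted in the text for 25 degrees C.
A_DH = 0.512    # L^1/2 mol^-1/2
B_DH = 0.329    # angstrom^-1 L^1/2 mol^-1/2
C_BORN = 281.0  # angstrom


def born_constant(temp_k=298.15, eps_r=78.4):
    """Evaluate C of Eqn (6) in angstrom from CODATA constants.

    eps_r = 78.4 (water near 25 degrees C) is an assumed value.
    """
    e = 1.602176634e-19      # C
    n_a = 6.02214076e23      # mol^-1
    eps0 = 8.8541878128e-12  # F m^-1
    r_gas = 8.314462618      # J mol^-1 K^-1
    c_metre = (e ** 2 * n_a / (8.0 * math.pi * eps0 * r_gas * temp_k)
               * (1.0 - 1.0 / eps_r))
    return c_metre * 1e10    # metre -> angstrom


def ln_solubility(z, r_p, ionic_strength, offset,
                  a=A_DH, b=B_DH, c=C_BORN):
    """Eqn (5): ln S_p for net charge z, ion radius r_p (angstrom) and
    ionic strength (mol/L). `offset` stands for the charge-independent
    term (mu_p(s) - mu0'_p(sol)) / RT, treated here as a free parameter.
    """
    sqrt_i = math.sqrt(ionic_strength)
    coeff = math.log(10.0) * a * sqrt_i / (1.0 + b * r_p * sqrt_i) + c / r_p
    return coeff * z ** 2 + offset


if __name__ == "__main__":
    print(f"C from constants: {born_constant():.1f} angstrom")
    # Hypothetical parameters, for illustration only.
    for z in (0, 1, 2, 3):
        print(z, round(ln_solubility(z, r_p=10.0, ionic_strength=0.1,
                                     offset=-9.0), 2))
```

With these assumed inputs `born_constant()` comes out at about 277 Å, within a few percent of the quoted 281 Å; the exact figure depends on the permittivity and temperature used. Because the coefficient of Z² is positive, the model places the solubility minimum at zero net charge and is symmetric in the sign of Z.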
Acknowledgements
We thank Miyo Sakai (Institute for Protein
Research, Osaka University) for the ultracentrifuge
measurements. This work was supported in part by
Grants-in-Aid for Science from the Ministry of
Education, Culture, Sports, Science and Technology
of Japan.
References
1 Carlsson U & Jonsson BH (1995) Folding of beta-sheet
proteins. Curr Opin Struct Biol 5, 482–487.
2 Chakrabartty A & Baldwin RL (1995) Stability of
alpha-helices. Adv Protein Chem 46, 141–176.
3 Jackson SE (1998) How do small single-domain
proteins fold? Fold Des 3, R81–R91.
4 Kuhlman B & Baker D (2004) Exploring folding free
energy landscapes using computational protein design.
Curr Opin Struct Biol 14, 89–95.
5 Ponder JW & Richards FM (1987) Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193, 775–791.
6 Dahiyat BI & Mayo SL (1997) De novo protein design: fully automated sequence selection. Science 278, 82–87.
7 Desjarlais JR & Handel TM (1995) De novo design of the hydrophobic cores of proteins. Protein Sci 4, 2006–2018.
8 Isogai Y, Ito Y, Ikeya T, Shiro Y & Ota M (2005) Design of lambda Cro fold: solution structure of a monomeric variant of the de novo protein. J Mol Biol 354, 801–814.
9 Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL & Baker D (2003) Design of a novel globular protein fold with atomic-level accuracy. Science 302, 1364–1368.
10 Harbury PB, Plecs JJ, Tidor B, Alber T & Kim PS (1998) ... with backbone freedom. Science 282, 1462–1467.
11 Roy S & Hecht MH (2000) Cooperative thermal denaturation of proteins designed by binary patterning of polar and nonpolar amino acids. Biochemistry 39, 4603–4607.
12 Lau KF & Dill KA (1990) Theory for protein mutability and biogenesis. Proc Natl Acad Sci USA 87, 638–642.
13 Dill KA (1990) Dominant forces in protein folding. Biochemistry 29, 7133–7155.
14 Narayan VA, Kriwacki ... & Caradonna JP (1997) Structures of zinc finger domains from transcription factor Sp1. Insights into sequence-specific protein–DNA recognition. J Biol Chem 272, 7801–7809.
15 Araki M & Tamura A (2007) Transformation of an alpha-helix peptide into a beta-hairpin induced by addition of a fragment results in creation of a coexisting state. Proteins 66, 860–868.
16 Wolfenden RV, Cullis PM & Southgate CC (1979) Water, ... protein folding, and the genetic code. Science 206, 575–577.
17 Privalov PL & Makhatadze GI (1993) Contribution of hydration to protein folding thermodynamics. II. The entropy and Gibbs' energy of hydration. J Mol Biol 232, 660–679.
18 Wouters MA & Curmi PM (1995) An analysis of side chain interactions and pair correlations within antiparallel beta-sheets: the differences between backbone hydrogen-bonded and non-hydrogen-bonded ... residue pairs. Proteins 22, 119–131.
19 Gohon Y, Pavlov G, Timmins P, Tribet C, Popot JL & Ebel C (2004) Partial specific volume and solvent interactions of amphipol A8-35. Anal Biochem 334, 318–334.
20 Stejskal EO & Tanner JE (1965) Spin diffusion measurements: spin echoes in the presence of a time-dependent field gradient. J Chem Phys 42, 288–292.
21 Dingley AJ, Mackay JP, Chapman BE, Morris MB, Kuchel PW, Hambly BD ...
[...]
Tanford C (1961) Physical Chemistry of Macromolecules. Wiley, New York, NY.
Shaw KL, Grimsley GR, Yakovlev GI, Makarov AA & Pace CN (2001) The effect of net charge on the solubility, activity, and stability of ribonuclease Sa. Protein Sci 10, 1206–1215.
Wright PE & Dyson HJ (1999) Intrinsically unstructured proteins: re-assessing the protein structure–function paradigm. J Mol Biol 293, 321–331.
Dunker AK, ...
[...]
... recognition by intrinsically unstructured proteins. J Mol Biol 338, 1015–1026.
Hoang TX, Marsella L, Trovato A, Seno F, Banavar JR & Maritan A (2006) Common attributes of native-state structures of proteins, disordered proteins, and amyloid. Proc Natl Acad Sci USA 103, 6883–6888.
Chikenji G, Fujitsuka Y & Takada S (2006) Shaping up the protein folding funnel by local interaction: lesson from a structure ...
[...]
40 Piotto M, Saudek V & Sklenar V (1992) Gradient-tailored excitation for single-quantum NMR spectroscopy of aqueous solutions. J Biomol NMR 2, 661–665.
41 Wuthrich K (1986) NMR of Proteins and Nucleic Acids. Wiley, New York, NY.
42 Guntert P, Mumenthaler C & Wuthrich K (1997) Torsion angle dynamics for NMR structure calculation with the new program DYANA. J Mol Biol 273, 283–298.
43 Laskowski RA, Rullmann JA, MacArthur MW, Kaptein R & Thornton JM (1996) AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J Biomol NMR 8, 477–486.