An integrated proteomics analysis of bone tissues in response to mechanical stimulation

Li et al. BMC Systems Biology 2011, 5(Suppl 3):S7
http://www.biomedcentral.com/1752-0509/5/S3/S7

RESEARCH Open Access

An integrated proteomics analysis of bone tissues in response to mechanical stimulation

Jiliang Li1, Fan Zhang2,3, Jake Y Chen2,3,4*

From BIOCOMP 2010 - The 2010 International Conference on Bioinformatics and Computational Biology, Las Vegas, NV, USA, 12-15 July 2010

Abstract

Bone cells can sense physical forces and convert mechanical stimulation into biochemical signals that lead to the expression of mechanically sensitive genes and proteins. However, it is still poorly understood how genes and proteins in bone cells are orchestrated to respond to mechanical stimulation. In this research, we applied integrated proteomics, statistical, and network biology techniques to study proteome-level changes in bone tissue cells in response to two different conditions, normal loading and fatigue loading. We harvested ulna midshafts and isolated proteins from control, loaded, and fatigue-loaded rats. Using a label-free liquid chromatography tandem mass spectrometry (LC-MS/MS) experimental proteomics technique, we derived a comprehensive list of 1,058 proteins that are differentially expressed among normal loading, fatigue loading, and controls. By carefully developing protein selection filters and statistical models, we were able to identify 42 proteins representing 21 rat genes that were significantly associated with bone cells’ response to quantitative changes between normal loading and fatigue loading conditions. We further applied network biology techniques by building a fatigue-loading-activated protein-protein interaction subnetwork involving the human-homolog counterparts of the 21 rat genes in a large connected network component. Our study shows that the combination of a decreased anti-apoptotic factor, Raf1, and an increased pro-apoptotic factor, PDCD8, results in a significant increase in the number of apoptotic osteocytes following fatigue loading. We
believe controlling osteoblast differentiation/proliferation and osteocyte apoptosis could be promising directions for developing future therapeutic solutions for related bone diseases.

Introduction

Bone tissues are sensitive to their mechanical environment [1]. It is well accepted that a reasonable level of mechanical stress on bones (known as normal loading) can enhance bone formation and maintain a healthy bone mass [2]. Prolonged absence of normal loading on bones, usually associated with extended physical inactivity due to injuries, can decrease bone formation and increase bone resorption, eventually leading to bone loss and disuse osteoporosis. When the level of mechanical stimulation exceeds the normal amount for an extended period of time, a stress condition known as fatigue loading can occur. In fatigue loading, microdamage such as small cracks in bone tissues may appear, triggering a cascade of bone remodeling processes that attempt to repair damaged bone tissues via sequential bone resorption and formation [3]. When fatigue loading conditions are not recognized and addressed early, the risks of bone injuries and bone diseases increase. Therefore, understanding the constituents and functions of the molecular repertoires involved in fatigue loading has been a central focus of study in the molecular biology of bone.

* Correspondence: jakechen@iupui.edu
Indiana University School of Informatics, Indianapolis, IN 46202, USA
Full list of author information is available at the end of the article

It remains unknown what all the mechanically sensitive genes and proteins in bone cells under mechanical stress are and how their differential expression is regulated [4]. Past research identified osteoblasts as being recruited to bone surfaces to form new bone in response to loading [5]. In fatigue loading conditions, the migration of osteoblasts to the bone surface is known to co-occur with migrations of osteoblast

© 2011 Li et al. This is an open
access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

progenitors and osteoblasts to damaged bone areas, thus activating the bone remodeling process and damage repair [6-11]. This process requires temporal coordination of osteoclasts and osteoblasts to repair damaged bone tissues. Therefore, osteoblast-associated genes were reported and presumed to be involved with different levels of mechanical stimulation signals [12]. Several biochemical studies have also suggested that anabolic mechanical stimulation may increase the expression of c-fos, osteopontin, COX-2, guanosine triphosphatases (GTPases), adenylate cyclase, phospholipase C (PLC), and mitogen-activated protein kinases (MAPKs), which can further lead to elevated expression of bone anabolic factors such as prostaglandins and nitric oxide (see reference [13] for a review).

In this work, we performed the first proteomic study of mechanical loading of bone tissues using the rat as an animal model. Prior to our study, large-scale functional genomics analyses of the activation of the bone remodeling process were performed in a few microarray studies [14,15]. While these earlier studies suggested that osteocyte apoptosis and Wnt signaling pathways were two critical biological processes involved, proper controls against normal loading conditions were not included in those experimental studies. It was not clear which mRNA-level changes observed in fatigue loading were shared with normal loading. Nor was it clear whether the biological processes observed at the mRNA expression level could overlook critical protein changes, since many recent studies revealed that large-scale gene expression and proteomics tend to complement (instead of
significantly overlap with) each other [16,17]. Elucidating proteomics-level changes, particularly when integrated with prior findings on genes and new models developed at the molecular signaling network/pathway level, can lead to new insights into bone mechanical stress and the development of novel molecular biomarkers.

Experimental procedures

Design of bone loading experiments using rat models

In order to study proteomics profile differences in living bone tissues, an ulnar axial compression loading system was chosen (see illustration in Figure 1). The system allows loading experimentation at different stress levels for animal models [6,10,11]. Female Sprague-Dawley rats (age: months; weight: 250-300 grams) were purchased from Harlan (Indianapolis, Indiana, USA). Animals were acclimatized for two weeks, housed in environmentally controlled rooms in the Laboratory Animal Resource Center (LARC) of Indiana University School of Medicine, and fed standard rat chow and water ad libitum. All the procedures performed in this study were in accordance with the Indiana University Animal Care and Use Committee Guideline.

Figure 1. An illustration of the ulnar axial compression loading system to study the effects of different levels of mechanical stress on bones in animal models.

Nine animals were divided randomly into three groups: control (CTRL), loading (L), and fatigue loading (FL). All the animals were anesthetized with an intraperitoneal injection of ketamine (60 mg/kg; Ketaset®, Fort Dodge Animal Health, Fort Dodge, IA) and xylazine (7.5 mg/kg; Sedazine®, Fort Dodge Animal Health, Fort Dodge, IA). The animals in the control group were sacrificed 96 hours post-injection without being subjected to mechanical loading. The right ulnae of the remaining animals were loaded or overloaded according to treatment group. The animals in the loading group were loaded with a peak force of 20 N for 360 cycles and then sacrificed 96 hours after the loading session. For the animals in the fatigue loading
group, one bout of loading with a peak force of 20 N at Hz was continued until 10-15% stiffness loss. The overloaded animals were also sacrificed 96 hours after the loading session. Load was applied using a load-controlled, electromagnetic loading device. The total number of loading cycles was adjusted through the connected load controller. Stiffness loss during the loading procedure was observed through continuous monitoring of the displacement of the arm on the loading device using a CCD Laser Displacement Sensor (LK Series, Keyence Corp., Osaka, Japan).

Liquid chromatography coupled tandem mass spectrometry proteomics analysis

The ulnae were dissected out immediately after the rats were sacrificed and cleaned of all muscle and connective tissue. The 5-mm proximal and distal ends of the ulnae were removed. The remaining ulna midshafts were snap frozen in liquid nitrogen and stored at -80°C until protein isolation. For total protein isolation, rat ulna midshafts were shattered and ground to a fine powder under liquid nitrogen using mortars and pestles. There were three groups (control, loading, and fatigue loading), three samples per group, and two HPLC injections per sample (Table 1).

Label-free protein identification and quantitative analysis were performed by the Protein Analysis and Research Center/Proteomics Core of Indiana University School of Medicine, co-located at Monarch Life Sciences, Inc., Indianapolis. For a thorough review of the principles and methods developed at Monarch, refer to the review by Wang et al. [18]. Protein identification was carried out using standard commercial-strength protocols and commercial software packages developed at Monarch, which have supported many scientific research case studies in areas including proteomics studies, biomarker discovery, and bioinformatics analysis, e.g., [19-21]. Briefly, tryptic
peptides were analyzed using a Thermo-Finnigan linear ion-trap mass spectrometer (LTQ) coupled with an HPLC system. Peptides were eluted with a gradient from to 45% acetonitrile developed over 120 minutes, and data were collected in the triple-play mode (MS scan, zoom scan, and MS/MS scan). The acquired raw peak list data were generated by XCalibur (version 2.0) using default parameters and further analyzed by an algorithm, also using default parameters, described by Higgs et al. [22]. MS database searches were performed against the combined protein data set from the International Protein Index (IPI; version 1.2) [23] and the non-redundant NCBI-nr human protein database (2005 version), which totaled 22,180 protein records. The resulting MS/MS data were searched using SEQUEST Cluster from Thermo Scientific (bundled with the BioWorks software suite version 2.70, based on the original SEQUEST algorithm [24]).

Table 1. The experimental design for proteomics analysis of bone loading in rat

Group   Samples   Replicates   Injection runs (subtotals)
CTRL    3         2            6
L       3         2            6
FL      3         2            6

The LC-MS/MS experiment consists of 3 groups × 3 samples × 2 replicates = 18 LC/MS injections run in random order. The three groups are: Controls (CTRL), Loaded (L), and Fatigue-Loaded (FL).

During the search, we set the number of missed cleavages permitted to be . We set the fixed modification to be iodoethanol on Cys and the variable modification to be oxidation on Met. The mass tolerance for precursor ions was set at Da and the mass tolerance for fragment ions was set at 0.7 Da. For novel proteins that could not be positively identified by SEQUEST, we used the de novo sequencing function of the BioWorks software to obtain peptide sequence information from the collision-induced dissociation (CID) spectra. Various data processing filters for protein identification were applied to keep only peptides with an XCorr score above 1.5 for singly charged peptides, 2.5 for doubly charged peptides, and 3.5 for triply charged peptides. These XCorr scores were set according to
linear discriminant analysis similar to that described in DTASelect (version 2.0) to control the false-positive rate below the 5% level. These empirical thresholds were validated on large data sets processed by Monarch under similar conditions and peptide identification parameters. The false-positive rates of these large-scale studies under the parameters used were estimated from the number and quality of spectral matches to the decoy database.

Protein quantification was also conducted using software developed at Monarch Life Sciences, Inc. First, all extracted ion chromatograms (XICs) were aligned by retention time. Each aligned peak was matched by precursor ion, charge state, fragment ions from MS/MS data, and retention time within a one-minute window. Then, after alignment, the area under the curve (AUC) for each individually aligned peak from each sample was measured, normalized, and compared for relative abundance, all as described in [22]. The normalization methods of Higgs et al. [22] were used, and the data were then transformed back to the original scale. Here, a linear mixed model generalized from individual ANOVA (analysis of variance) was used to quantify protein intensities and calculate statistical significance. In principle, the linear mixed model considers three types of effects when deriving protein intensities based on a weighted average of quantile-normalized peptide intensities: 1) the group effect, which refers to the fixed, non-random effects caused by the experimental conditions or treatments being compared; 2) the sample effect, which refers to the random effects (including those arising from sample preparation) from individual biological samples within a group; 3) the replicate effect, which refers to the random effects from replicate injections of the same sample preparation. Standard statistical data preprocessing techniques, including quantile normalization and randomization of measurement order, were applied first to eliminate technical bias due to
random variations from biological samples and their replicates. The model fitting was performed in the SAS software (version 9) using PROC MIXED. The REML method was used as the fitting mechanism and degrees of freedom were computed using the Satterthwaite method. The RANDOM statement was used to model the covariance, with the NOBOUND option in the PROC statement. The p-value estimates the proportion of times a change at least as large as the one observed would occur if in fact there were no real change. All the p-values were then transformed into q-values that estimate the false discovery rate (FDR) [25].

Homologous gene mapping of rat and human proteins

Due to the lack of protein-protein interaction data coverage in rat, we mapped all rat protein-encoding genes to their human gene homologs to take advantage of the large sets of protein interaction data available in human. The homologous gene mapping involved four steps. First, we extracted all the rat protein identifiers (IPI numbers and protein GI accessions) from the sequence annotation field of the proteomics search results. Second, we downloaded the rat IPI reference database version 1.2, which contains 38,873 sequence identifier mapping relationships among rat Swiss-Prot IDs, sequence accession numbers, and gene names. Third, we downloaded NCBI HomoloGene release 49.1 and filtered out genes from other organisms to include only rat and human. After applying the filter, 14,558 HomoloGene groups remained, which contain homology mapping relationships between 15,125 rat genes and 14,753 human genes. We defined a “homolog gene match” between a rat gene and a human gene as each pair found within the same HomoloGene group. In the fourth step, we mapped the matched human genes back to human proteins, using UniProt sequence annotation files. Note that the mapping from rat protein to human protein based on gene homology relationships
has the limitation of aggregating all alternatively spliced protein isoforms together. However, this is not a major concern, since the majority of protein-protein interaction data are collected from gene-level experiments and therefore do not offer isoform-level resolution anyway.

Method for selecting candidate significantly differentially expressed proteins

By candidate proteins, we refer to the list of proteins that satisfy the statistical protein-selection filters but still need further scrutiny before a subset of them can be confirmed as biologically relevant. It is tempting to control false positives using a high fold change (FC) threshold and a stringent q-value (false discovery rate adjusted p-value) threshold when we try to select candidate proteins that are differentially expressed with statistical rigor. For example, the following threshold filter (the F1 filter) was suggested by default by the proteomics analysis software to control possible false positives that may arise due to potential sources of variability (estimated to be up to 15%) from different samples and experimental errors:

F1: FC(x|i) ≥ 1.5 and q-value(x|i) < 0.05

While a stringent filter is generally necessary for proteomics experiments, protein expression level changes in proteomics experiments are generally expected to be smaller than those often observed in expression microarrays, because changes in signaling proteins or regulatory proteins are expected to be subtle in general. In addition, the problem with applying default filters directly is that these filters fail to take into account data that may be highly correlated in controlled comparative experiments with more than two conditions. In our case, we have three conditions: FL for fatigue loading, L for normal loading, and CTRL for normal controls. If we can observe a high degree of correlation between the results for FL vs CTRL and L vs CTRL, the FC requirement and the q-value requirement may both be relaxed to allow more interesting proteins that
change barely within the “twilight zone” of >10%, as long as these proteins can be further validated using additional computational or experimental techniques. Therefore, to complement the fold change filter in F1, we developed a second experimental filter (the F2 filter) that allows candidate proteins whose change is above 10% (FC ≥ 1.1) to show up when we compare two similar conditions, FL_vs_L (Fatigue Loading against Normal Loading), for which data for L_vs_CTRL (Normal Loading against Controls) and FL_vs_CTRL (Fatigue Loading against Controls) are also available:

F2: FC(x|FL_vs_L) ≥ 1.1 and q-value(x|FL_vs_CTRL) × q-value(x|L_vs_CTRL) < 0.0025 and p-value(x|FL_vs_CTRL) < 0.05 and p-value(x|L_vs_CTRL) < 0.05

In this F2 filter, in addition to relaxing the FC threshold, we also modified how the statistical q-value is applied. Here, we introduce a concept that we refer to as the triangulation property of comparable analysis. Briefly, this property is met if and only if the pairwise comparison results from three conditions, for example CTRL, L, and FL, are consistent among themselves. In other words, we say a triangulation property exists among CTRL-L-FL if and only if the proteins passing the FL_vs_CTRL and L_vs_CTRL q-value filters with FC changes of f1 and f2, respectively, are the same set of proteins that independently pass FL_vs_L with the same q-value filter and an FC threshold of f1/f2. In fact, no proteomics search software that we know of today guarantees such a triangulation property, due to inherent errors in the models that estimate the statistical significance of peptides and proteins. We understand that the q-value was derived from a more stringent statistical model licensed from Eli Lilly in the early years of proteomics (private communication with Dr Mu Wang, who provided the proteomics service for this experiment). Therefore, we developed an easy-to-understand
meta-analysis method, the q-value triangulation method, in the F2 filter, so that we can rely primarily on better-understood p-value statistics. In this method, we assume the p-value calculations of two independent experiments, FL_vs_CTRL and L_vs_CTRL, are generally reliable and therefore can be controlled at 0.05. The q-value triangulation calculation for FL_vs_L is done by multiplying the respective q-values for the FL_vs_CTRL and L_vs_CTRL comparisons, with the product controlled at the 0.05^2 = 0.0025 level. The p-values are taken from comparisons against the control samples, rather than from the direct FL vs L comparison, because comparing against the control samples with our statistical method can reduce baseline noise in proteomics data and detect weak patterns.

Normality probability plot calculation

To determine normality of the residual distribution, we use the normal probability plot to calculate the normal quantiles of all values in Residue (i), or Res_FL_L. The values and the normal quantiles are then plotted against each other. Normal quantiles are computed using the f-value, f_i, which is calculated as:

f_i = (i − 0.5) / n

where i is the index of the value and n is the number of values. The normal quantile, q(f), for a given f-value is the value for which P[X
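
The f-value calculation above can be illustrated with a short Python sketch. It is not the authors' code. It assumes that i is the 1-based rank of each value after sorting in ascending order, and that q(f) is the inverse cumulative distribution function of the standard normal distribution, as in a conventional normal probability plot. The function name and the sample residuals are hypothetical.

```python
from statistics import NormalDist


def normal_quantile_pairs(values):
    """Pair each sorted value with the normal quantile of f_i = (i - 0.5) / n."""
    ordered = sorted(values)        # assumption: i is the 1-based ascending rank
    n = len(ordered)
    std_normal = NormalDist()       # assumption: standard normal (mean 0, sd 1)
    pairs = []
    for i, value in enumerate(ordered, start=1):
        f_i = (i - 0.5) / n         # the f-value defined in the text
        pairs.append((value, std_normal.inv_cdf(f_i)))
    return pairs


# Hypothetical residuals, for illustration only.
residuals = [0.12, -0.30, 0.05, 0.41, -0.08]
for value, quantile in normal_quantile_pairs(residuals):
    print(f"{value:+.2f}  {quantile:+.3f}")
```

Plotting the first element of each pair against the second gives the normal probability plot; points lying close to a straight line suggest approximately normal residuals.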

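Returning to the candidate selection step, the F2 filter defined earlier can be written as a small predicate. The Python sketch below is only an illustration of the stated inequalities, not the authors' implementation. The record type, its field names, and the example values are hypothetical, and FC is taken as given (the text does not say how decreases are encoded).

```python
from dataclasses import dataclass


@dataclass
class ProteinStats:
    """Hypothetical per-protein record; field names are illustrative only."""
    fc_fl_vs_l: float      # fold change, FL vs L
    q_fl_vs_ctrl: float    # q-value, FL vs CTRL
    q_l_vs_ctrl: float     # q-value, L vs CTRL
    p_fl_vs_ctrl: float    # p-value, FL vs CTRL
    p_l_vs_ctrl: float     # p-value, L vs CTRL


def passes_f2(s: ProteinStats, fc_min: float = 1.1,
              q_product_max: float = 0.0025, p_max: float = 0.05) -> bool:
    """Apply the F2 filter: relaxed FC, q-value product, and both p-values."""
    return (
        s.fc_fl_vs_l >= fc_min
        and s.q_fl_vs_ctrl * s.q_l_vs_ctrl < q_product_max
        and s.p_fl_vs_ctrl < p_max
        and s.p_l_vs_ctrl < p_max
    )


# Hypothetical values: the L_vs_CTRL q-value (0.05) would fail a strict
# q < 0.05 cut on its own, but the product 0.04 * 0.05 is below 0.0025.
example = ProteinStats(1.2, 0.04, 0.05, 0.01, 0.02)
print(passes_f2(example))
```

The example shows the intended relaxation: a protein can pass on the q-value product even when one individual q-value sits at the conventional 0.05 boundary, provided both control comparisons have p-values below 0.05.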