pFind Studio: a computational solution for mass spectrometry-based proteomics
2015
Journal of Proteome Research, 2015. Tan, ZJ et al.
Univ Michigan, Dept Surg, Ann Arbor, MI 48109 USA.
ABSTRACT:Glycosylation has significant effects on protein function and cell metastasis, which are important in cancer progression. It is of great interest to identify site-specific glycosylation in search of potential cancer biomarkers. However, the abundance of glycopeptides is low compared to that of nonglycopeptides after trypsin digestion of serum samples, and the mass spectrometric signals of glycopeptides are often masked by coeluting nonglycopeptides due to low ionization efficiency. Selective enrichment of glycopeptides from complex serum samples is essential for mass spectrometry (MS)-based analysis. Herein, a strategy has been optimized using LCA enrichment to improve the identification of core-fucosylation (CF) sites in serum of pancreatic cancer patients. The optimized strategy was then applied to analyze CF glycopeptide sites in 13 sets of serum samples from pancreatic cancer, chronic pancreatitis, healthy controls, and a standard reference. In total, 630 core-fucosylation sites were identified from 322 CF proteins in pancreatic cancer patient serum using an Orbitrap Elite mass spectrometer. Further data analysis revealed that 8 CF peptides exhibited a significant difference between pancreatic cancer and other controls, which may be potential diagnostic biomarkers for pancreatic cancer.
Use: pFind
Journal of Proteomics, 2015. Zhang, L et al.
China Univ Geosci, Dept Biol Sci & Technol, Sch Environm Studies, Wuhan 430074, Peoples R China.
ABSTRACT:Androctonus bicolor is one of the most poisonous scorpion species in the world. However, little has been known about the venom composition of the scorpion. To better understand the molecular diversity and medical significance of the venom from the scorpion, we systematically analyzed the venom components by combining transcriptomic and proteomic surveys. Random sequencing of 1000 clones from a cDNA library prepared from the venom glands of the scorpion revealed that 70% of the total transcripts code for venom peptide precursors. Our efforts led to a discovery of 103 novel putative venom peptides. These peptides include NaTx-like, ICTx-like and CaTx-like peptides, putative antimicrobial peptides, defensin-like peptides, BPP-like peptides, BmKa2-like peptides, Kunitz-type toxins and some new-type venom peptides without disulfide bridges, as well as many new-type venom peptides that are cross-linked with one, two, three, five or six disulfide bridges, respectively. We also identified three peptides that are identical to known toxins from scorpions. The venom was also analyzed using a proteomic technique. The presence of a total of 16 different venom peptides was confirmed by LC-MS/MS analysis. The discovery of a wide range of new and new-type venom peptides highlights the unique diversity of the venom peptides from A. bicolor. These data also provide a series of novel templates for the development of therapeutic drugs for treating ion channel-associated diseases and infections caused by antibiotic-resistant pathogens, and offer molecular probes for the exploration of structures and functions of various ion channels.
Use: pFind
Proteomics, 2015. Sidoli, S et al.
Univ Penn, Dept Biochem & Biophys, Perelman Sch Med, Philadelphia, PA 19104 USA.
ABSTRACT:MS-based proteomics has become the most utilized tool to characterize histone PTMs. Since histones are highly enriched in lysine and arginine residues, lysine derivatization has been developed to prevent the generation of short peptides (<6 residues) during trypsin digestion. One of the most adopted protocols applies propionic anhydride for derivatization. However, the propionyl group is not sufficiently hydrophobic to fully retain the shortest histone peptides in RP LC, and such procedure also hampers the discovery of natural propionylation events. In this work we tested 12 commercially available anhydrides, selected based on their safety and hydrophobicity. Performance was evaluated in terms of yield of the reaction, MS/MS fragmentation efficiency, and drift in retention time using the following samples: (i) a synthetic unmodified histone H3 tail, (ii) synthetic modified histone peptides, and (iii) a histone extract from cell lysate. Results highlighted that seven of the selected anhydrides increased peptide retention time as compared to propionic, and several anhydrides such as benzoic and valeric led to high MS/MS spectra quality. However, propionic anhydride derivatization still resulted, in our opinion, as the best protocol to achieve high MS sensitivity and even ionization efficiency among the analyzed peptides.
Use: pFind; pBuild
Journal of Proteome Research, 2015. Zhang, Yao et al.
BGI Shenzhen, Shenzhen 518083, Peoples R China
ABSTRACT:Investigations of missing proteins (MPs) are being endorsed by many bioanalytical strategies. We proposed that proteogenomics of testis tissue was a feasible approach to identify more MPs because testis tissues have higher gene expression levels. Here we combined proteomics and transcriptomics to survey gene expression in human testis tissues from three post-mortem individuals. Proteins were extracted and separated with glycine- and tricine-SDS-PAGE. A total of 9597 protein groups were identified; of these, 166 protein groups were listed as MPs, including 138 groups (83.1%) with transcriptional evidence. A total of 2948 proteins are designated as MPs, and 5.6% of these were identified in this study. The high incidence of MPs in testis tissue indicates that this is a rich resource for MPs. Functional category analysis revealed that the biological processes that testis MPs are mainly involved in are sexual reproduction and spermatogenesis. Some of the MPs are potentially involved in tumorigenesis in other tissues. Therefore, this proteogenomics analysis of individual testis tissues provides convincing evidence of the discovery of MPs. All mass spectrometry data from this study have been deposited in the ProteomeXchange (data set identifier PXD002179).
Use: pFind
Journal of Proteome Research, 2015. Chen, Yang et al.
Jinan Univ, Coll Life Sci & Technol, Inst Life & Hlth Engn, Key Lab Funct Prot Res Guangdong Higher Educ Inst, Guangzhou 510632, Guangdong, Peoples R China
ABSTRACT:Finding protein evidence (PE) for protein coding genes is a primary task of the Phase I Chromosome-Centric Human Proteome Project (C-HPP). Currently, there are 2948 PE level 2-4 coding genes per neXtProt, which are deemed missing proteins in the human proteome. As most samples prepared and analyzed in the C-HPP framework were focusing on detergent soluble proteins, we posit that as a natural composition the cytoplasmic detergent-insoluble proteins (DIPs) represent a source of finding missing proteins. We optimized a workflow and separated cytoplasmic DIPs from three human lung and three human hepatoma cell lines via differential speed centrifugation. We verified that the detergent-soluble proteins (DSPs) could be sufficiently depleted and the cytoplasmic DIP isolation was partially reproducible with Spearman r > 0.70 according to two independent SILAC MS experiments. Through label-free MS, we identified 4524 and 4156 DIPs from lung and liver cells, respectively. Among them, a total of 23 missing proteins (22 PE2 and 1 PE4) were identified by MS, and 18 of them had translation evidence; in addition, six PE5 proteins were identified by MS, three with translation evidence. We showed that cytoplasmic DIPs were not an enrichment of transmembrane proteins and were chromosome-, cell type-, and tissue-specific. Furthermore, we demonstrated that DIPs were distinct from DSPs in terms of structural and physical chemical features. In conclusion, we have found 23 missing proteins and 6 PE5 proteins from the cytoplasmic insoluble proteome that is biologically and physical-chemically different from the soluble proteome, suggesting that cytoplasmic DIPs carry comprehensive and valuable information for finding PE of missing proteins. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001694.
Use: pFind
PLOS ONE, 2015. Moraes, I et al.
Univ Sao Paulo, Inst Quim, Dept Bioquim, BR-01498 Sao Paulo, Brazil.
ABSTRACT:Histones are the main structural components of the nucleosome, hence targets of many regulatory proteins that mediate processes involving changes in chromatin. The functional outcome of many pathways is "written" in the histones in the form of post-translational modifications that determine the final gene expression readout. As a result, modifications, alone or in combination, are important determinants of chromatin states. Histone modifications are accomplished by the addition of different chemical groups such as methyl, acetyl and phosphate. Thus, identifying and characterizing these modifications and the proteins related to them is the initial step to understanding the mechanisms of gene regulation and in the future may even provide tools for breeding programs. Several studies over the past years have contributed to increase our knowledge of epigenetic gene regulation in model organisms like Arabidopsis, yet this field remains relatively unexplored in crops. In this study we identified and initially characterized histones H3 and H4 in the monocot crop sugarcane. We discovered a number of histone genes by searching the sugarcane ESTs database. The proteins encoded correspond to canonical histones, and their variants. We also purified bulk histones and used them to map post-translational modifications in the histones H3 and H4 using mass spectrometry. Several modifications conserved in other plants, and also novel modified residues, were identified. In particular, we report O-acetylation of serine, threonine and tyrosine, a recently identified modification conserved in several eukaryotes. Additionally, the sub-nuclear localization of some well-studied modifications (i.e., H3K4me3, H3K9me2, H3K27me3, H3K9ac, H3T3ph) is described and compared to other plant species. To our knowledge, this is the first report of histones H3 and H4 as well as their post-translational modifications in sugarcane, and will provide a starting point for the study of chromatin regulation in this crop.
Use: pFind
Journal of Proteome Research, 2015. Sun, Han et al.
Shanghai Acad Sci & Technol, Shanghai Ctr Bioinformat Technol, 1278 Ke Yuan Rd, Shanghai 201203, Peoples R China
ABSTRACT:HeLa cell line, which was derived from cervical carcinoma, provides an ideal platform to study both the integration of human papillomavirus and the massive mutations occurring on the cancer cell genome. Proteogenomics is a field with the intersection of proteomics and genomics to perform gene annotation and identify gene mutation. In this work, we first identified the SNV/INDEL, structural variation (SV), and virus infection/integration events from RNA-Seq data of HeLa cell line; then, by applying proteogenomics strategy, we were able to detect some of the genomic events with the tandem mass spectrometry (MS/MS) data from the same sample. Furthermore, some of the mutated peptides were experimentally validated using multiple reaction monitoring technology. The integrated analysis of the RNA-Seq and MS/MS data not only renders the discovery of HeLa cell genome variations more credible but also illustrates a practical workflow for protein-coding mutation discovery in cancer-related studies.
Use: pFind
Journal of Proteome Research, 2015. Yang, Lijuan et al.
Fudan Univ, Inst Biomed Sci, Dept Cell Biol, Key Lab Epigenet, Shanghai 200032, Peoples R China
ABSTRACT:The chromosome-centric human proteome project (C-HPP) has made great progress of finding protein evidence (PE) for missing proteins (PE2-4 proteins defined by the neXtProt), which now becomes an increasingly challenging field. As a majority of samples tested in this field were from adult tissues/cells, the developmental stage specific or relevant proteins could be missed due to biological source availability. We posit that epigenetic interventions may help to partially bypass such a limitation by stimulating the expression of the "silenced" genes in adult cells, leading to the increased chance of finding missing proteins. In this study, we established in vitro human cell models to modify the histone acetylation, demethylation, and methylation with near physiological conditions. With mRNA-seq analysis, we found that histone modifications resulted in overall increases of expressed genes in an even distribution manner across different chromosomes. We identified 64 PE2-4 and six PE5 proteins by MaxQuant (FDR < 1% at both protein and peptide levels) and 44 PE2-4 and 7 PE5 proteins by Mascot (FDR < 1% at peptide level) searches, respectively. However, only 24 PE2-4 and five PE5 proteins in Mascot, and 12 PE2-4 and one PE5 protein in MaxQuant searches could, respectively, pass our stringent manual spectrum inspections. Collectively, 27 PE2-4 and five PE5 proteins were identified from the epigenetically modified cells; among them, 19 PE2-4 and three PE5 proteins passed FDR < 1% at both peptide and protein levels. Gene ontology analyses revealed that the PE2-4 proteins were significantly involved in development and spermatogenesis, although their chemical physical features had no statistical difference from the background. In addition, we presented an example of suspicious PE5 peptide spectrum matched with unusual AA substitutions related to post-translational modification. In conclusion, the epigenetically manipulated cell models should be a useful tool for finding missing proteins in C-HPP. The mass spectrometry data have been deposited to the iProx database (accession number: IPX00020200).
Use: pFind
Journal of Proteome Research, 2015. Su, N et al.
Beijing Proteome Res Ctr, 33 Sci Pk Rd, Beijing 102206, Peoples R China.
ABSTRACT:As part of the Chromosome-Centric Human Proteome Project (C-HPP) mission, laboratories all over the world have tried to map the entire missing proteins (MPs) since 2012. On the basis of the first and second Chinese Chromosome Proteome Database (CCPD 1.0 and 2.0) studies, we developed systematic enrichment strategies to identify MPs that fell into four classes: (1) low molecular weight (LMW) proteins, (2) membrane proteins, (3) proteins that contained various post-translational modifications (PTMs), and (4) nucleic acid-associated proteins. Of 8845 proteins identified in 7 data sets, 79 proteins were classified as MPs. Among data sets derived from different enrichment strategies, data sets for LMW and PTM yielded the most novel MPs. In addition, we found that some MPs were identified in multiple data sets, which implied that tandem enrichment methods might improve the ability to identify MPs. Moreover, low expression at the transcription level was the major cause of the "missing" of these MPs; however, MPs with higher expression level also evaded identification, most likely due to other characteristics such as LMW, high hydrophobicity and PTM. By combining a stringent manual check of the MS2 spectra with peptide synthesis verification, we confirmed 30 MPs (neXtProt PE2-PE4) and 6 potential MPs (neXtProt PE5) with authentic MS evidence. By integrating our large-scale data sets of CCPD 2.0, the number of identified proteins has increased considerably beyond simulation saturation. Here, we show that special enrichment strategies can break through the data saturation bottleneck, which could increase the efficiency of MP identification in future C-HPP studies. All 7 data sets have been uploaded to ProteomeXchange with the identifier PXD002255.
Use: pFind; pBuild
Plant and Soil, 2015. Qin, R et al.
S China Normal Univ, Sch Life Sci, Key Lab Ecol & Environm Sci Guangdong Higher Educ, Guangzhou 510631, Guangdong, Peoples R China.
ABSTRACT:In the present study, the effects of Cu (2.0 and 8.0 μM) on root growth of Allium cepa var. agrogarum L. were addressed and protein abundance levels were analyzed using proteomics combined with transcriptomics, in order to better understand the mechanism of Cu toxicity on plant root systems at the protein level and to provide valuable information for monitoring and forecasting the effects of exposure to Cu under real-scenario conditions. The methods comprised protein extraction; two-dimensional electrophoresis (2-DE) analysis; mass spectrometry analysis; establishment of the in-house database; and restriction enzyme map of the in-house database and protein identification. Root growth was dramatically inhibited after 12 h Cu treatment. By establishing an in-house database and using mass spectrometry analysis, 27 differentially abundant proteins were identified. These 27 proteins were involved in multiple biological processes including defensive response, transcription regulation and protein synthesis, cell wall synthesis, cell cycle and DNA replication, and other important functions. Our results provide new insights at the proteomic level into the Cu-induced responses, defensive responses and toxic effects, and provide new molecular markers of the early events of plant responses to Cu toxicity. Moreover, the establishment of an in-house database provides a big improvement for proteomics research on non-model plants.
Use: pFind