pFind Studio: a computational solution for mass spectrometry-based proteomics



2015




freeQuant: a mass spectrometry label-free quantification software tool for complex proteome analysis
TheScientificWorldJournal 2015. Deng, Ning et al. Department of Biomedical Engineering, Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou 310027, China
ABSTRACT:The study of complex proteomes places higher demands on mass spectrometry-based quantification methods. In this paper, we present a mass spectrometry label-free quantification tool for complex proteomes, called freeQuant, which effectively integrates quantification with functional analysis. freeQuant consists of two well-integrated modules: label-free quantification and functional analysis with biomedical knowledge. freeQuant supports label-free quantitative analysis that makes full use of tandem mass spectrometry (MS/MS) spectral count, protein sequence length, shared peptides, and ion intensity. It adopts spectral count for quantitative analysis and builds a new method for shared peptides to accurately evaluate the abundance of isoforms. For proteins with low abundance, MS/MS total ion count coupled with spectral count is included to ensure accurate protein quantification. Furthermore, freeQuant supports large-scale functional annotation of complex proteomes. Mitochondrial proteomes from the mouse heart, the mouse liver, and the human heart were used to evaluate the usability and performance of freeQuant. The evaluation showed that the quantitative algorithms implemented in freeQuant can improve the accuracy of quantification with a better dynamic range.
Use: pFind; pDeep
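
The abstract above names spectral count, protein sequence length, and shared peptides as inputs to label-free quantification. As a rough illustration only, the sketch below shows a generic length-normalized spectral counting scheme in which shared-peptide spectra are split in proportion to each protein's unique spectral count. This is not freeQuant's actual algorithm, which the abstract does not specify; the function name, argument layout, and the proportional split rule are all assumptions made for this example.

```python
def length_normalized_abundance(unique_counts, shared_counts, lengths):
    """Generic sketch (NOT freeQuant's method): relative protein abundance
    from spectral counts, with shared-peptide spectra distributed in
    proportion to unique spectral counts and normalized by sequence length.

    unique_counts: {protein: spectra from peptides unique to that protein}
    shared_counts: list of (tuple_of_proteins, spectra_from_shared_peptide)
    lengths:       {protein: sequence length in residues}
    """
    # Start from the spectra that map unambiguously to one protein.
    adjusted = {p: float(c) for p, c in unique_counts.items()}

    # Distribute each shared peptide's spectra among its candidate proteins.
    for proteins, count in shared_counts:
        total_unique = sum(unique_counts.get(p, 0) for p in proteins)
        for p in proteins:
            if total_unique > 0:
                share = unique_counts.get(p, 0) / total_unique
            else:
                # No unique evidence for any candidate: split evenly.
                share = 1.0 / len(proteins)
            adjusted[p] = adjusted.get(p, 0.0) + count * share

    # Normalize by length so long proteins are not favored, then rescale
    # so the abundances sum to 1 across the sample.
    per_residue = {p: adjusted[p] / lengths[p] for p in adjusted}
    total = sum(per_residue.values())
    if total == 0:
        return {p: 0.0 for p in per_residue}
    return {p: v / total for p, v in per_residue.items()}


if __name__ == "__main__":
    # Hypothetical isoforms A and B sharing one peptide with 8 spectra.
    abundances = length_normalized_abundance(
        unique_counts={"A": 30, "B": 10},
        shared_counts=[(("A", "B"), 8)],
        lengths={"A": 400, "B": 200},
    )
    for protein, value in sorted(abundances.items()):
        print(protein, round(value, 3))
```

In this toy input, isoform A receives three quarters of the shared spectra because it has three times the unique evidence of B. Dividing by sequence length then narrows the gap, since A is twice as long.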



Experimental validation of Bacillus anthracis A16R proteogenomics
Scientific Reports 2015. Gao, ZQ et al. Beijing Inst Radiat Med, Beijing Proteome Res Ctr, Natl Engn Res Ctr Prot Drugs, State Key Lab Prote, Natl Ctr Prot Sci, 27 Taiping Rd, Beijing 100850, Peoples R China.
ABSTRACT:Anthrax, caused by the pathogenic bacterium Bacillus anthracis, is a zoonosis that causes serious disease and is of significant concern as a biological warfare agent. Validating annotated genes and reannotating misannotated genes are important to understand its biology and mechanisms of pathogenicity. Proteomics studies are, to date, the best method for verifying and improving current annotations. To this end, the proteome of B. anthracis A16R was analyzed via one-dimensional gel electrophoresis followed by liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS). In total, we identified 3,712 proteins, including many regulatory and key functional proteins at relatively low abundance, representing the most complete proteome of B. anthracis to date. Interestingly, eight sequencing errors were detected by proteogenomic analysis and corrected by resequencing. More importantly, three unannotated peptide fragments were identified in this study and validated by synthetic peptide mass spectrum mapping and green fluorescent protein fusion experiments. These data not only give a more comprehensive understanding of B. anthracis A16R but also demonstrate the power of proteomics to improve genome annotations and determine true translational elements.
Use: pFind; pBuild



Appraisal of the missing proteins based on the mRNAs bound to ribosomes
Journal of Proteome Research 2015. Xu, Shaohang et al. BGI Shenzhen, Beishan Ind Zone, 11 Build, Shenzhen 518083, Peoples R China
ABSTRACT:Considering the technical limitations of mass spectrometry in protein identification, the mRNAs bound to ribosomes (RNC-mRNA) are assumed to reflect the mRNAs participating in the translational process. The RNC-mRNA data are reasoned to be useful for appraising the missing proteins. A set of multiomics data including free-mRNAs, RNC-mRNAs, and proteomes was acquired from three liver cancer cell lines. On the basis of the missing proteins in neXtProt (release 2014-09-19), the bioinformatics analysis was carried out in three phases: (1) finding how many neXtProt missing proteins have or do not have RNA-seq and/or MS/MS evidence, (2) analyzing specific physicochemical and biological properties of the missing proteins that lack both RNA-seq and MS/MS evidence, and (3) analyzing the combined properties of these missing proteins. A total of 1501 missing proteins were found by neither RNC-mRNA nor MS/MS in the three liver cancer cell lines. Of these missing proteins, some are expected to have higher hydrophobicity, unsuitable detection, or sensory functions as properties at the protein level, while some are predicted to have nonexpressing chromatin structures at the corresponding gene level. With further integrated analysis, we could attribute 93% of them (1391/1501) to these causal factors, which result in expression products that are scarcely detected by RNA-seq or MS/MS.
Use: pFind



Evaluation and Comparison of Aligners for De Novo Sequencing
International Journal of Advanced Computer Technology 2015. Simin Zhu et al. Yunnan Minzu University, Kunming
ABSTRACT:In high-throughput proteomics research based on tandem mass spectrometry, de novo sequencing provides a novel method to interpret MS/MS data without the help of a sequence database and to discover new organisms. In this paper, we systematically evaluate and compare the capability of mainstream de novo sequencing software using test data sets that have been correctly identified by Mascot and Sequest, so that we can intuitively find the optimal de novo sequencing software for protein identification.
Use: pNovo; pFind



CyanOmics: an integrated database of omics for the model cyanobacterium Synechococcus sp. PCC 7002
Database: The Journal of Biological Databases and Curation 2015. Yang, YH et al. Chinese Acad Sci, Inst Hydrobiol, Key Lab Algal Biol, Wuhan 430072, Peoples R China.
ABSTRACT:Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, as an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported. However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, Blast is included for sequence-based similarity searching and Cluster 3.0, as well as the R hclust function is provided for cluster analyses, to increase CyanOmics's usefulness. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further understanding of the transcriptional patterns, and proteomic profiling of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects.
Use: pFind



Convenient and precise strategy for mapping N-glycosylation sites using microwave-assisted acid hydrolysis and characteristic ions recognition
Analytical Chemistry 2015. Ma, C et al. Georgia State Univ, Ctr Diagnost & Therapeut, Atlanta, GA 30303 USA.
ABSTRACT:N-glycosylation is one of the most prevalent protein post-translational modifications (PTMs) and is involved in several biological processes. Alteration of N-glycosylation is associated with cellular malfunction and the development of disease. Thus, investigation of protein N-glycosylation is crucial for the diagnosis and treatment of disease. Currently, deglycosylation with peptide N-glycosidase F is the most commonly used technique in N-glycosylation analysis. Additionally, a common error in N-glycosylation site identification, resulting from protein chemical deamidation, has largely been ignored. In this study, we developed a convenient and precise approach for mapping N-glycosylation sites utilizing optimized TFA hydrolysis, ZIC-HILIC enrichment, and characteristic ions of N-acetylglucosamine (GlcNAc) from higher-energy collisional dissociation (HCD) fragmentation. Using this method, we identified a total of 257 N-glycosylation sites and 144 N-glycoproteins from healthy human serum. Compared to deglycosylation with endoglycosidase, this strategy is more convenient and efficient for large-scale identification of N-glycosylation sites and provides an important alternative approach for the study of N-glycoprotein function.
Use: pFind; pBuild



Nanobodies: site-specific labeling for super-resolution imaging, rapid epitope-mapping and native protein complex isolation
eLife 2015. Pleiner, T et al. Max Planck Inst Biophys Chem, Dept Cellular Logist, D-37077 Gottingen, Germany.
ABSTRACT:Nanobodies are single-domain antibodies of camelid origin. We generated nanobodies against the vertebrate nuclear pore complex (NPC) and used them in STORM imaging to locate individual NPC proteins with <2 nm epitope-label displacement. For this, we introduced cysteines at specific positions in the nanobody sequence and labeled the resulting proteins with fluorophore-maleimides. As nanobodies are normally stabilized by disulfide-bonded cysteines, this appears counterintuitive. Yet, our analysis showed that this caused no folding problems. Compared to traditional NHS ester-labeling of lysines, the cysteine-maleimide strategy resulted in far less background in fluorescence imaging, it better preserved epitope recognition and it is site-specific. We also devised a rapid epitope-mapping strategy, which relies on crosslinking mass spectrometry and the introduced ectopic cysteines. Finally, we used different anti-nucleoporin nanobodies to purify the major NPC building blocks - each in a single step, with native elution and, as demonstrated, in excellent quality for structural analysis by electron microscopy. The presented strategies are applicable to any nanobody and nanobody-target.
Use: pLink



The architecture of a eukaryotic replisome
Nature Structural & Molecular Biology 2015. Sun, JC et al. Rockefeller Univ, DNA Replicat Lab, 1230 York Ave, New York, NY 10021 USA.
ABSTRACT:At the eukaryotic DNA replication fork, it is widely believed that the Cdc45-Mcm2-7-GINS (CMG) helicase is positioned in front to unwind DNA and that DNA polymerases trail behind the helicase. Here we used single-particle EM to directly image a Saccharomyces cerevisiae replisome. Contrary to expectations, the leading-strand polymerase (Pol) epsilon is positioned ahead of CMG helicase, whereas Ctf4 and the lagging-strand Pol alpha-primase are behind the helicase. This unexpected architecture indicates that the leading-strand DNA travels a long distance before reaching Pol epsilon, first threading through the Mcm2-7 ring and then making a U-turn at the bottom and reaching Pol epsilon at the top of CMG. Our work reveals an unexpected configuration of the eukaryotic replisome, suggests possible reasons for this architecture and provides a basis for further structural and biochemical replisome studies.
Use: pLink; pXtract



Kojak: efficient analysis of chemically cross-linked protein complexes
Journal of Proteome Research 2015. Hoopmann, MR et al. Inst Syst Biol, 401 Terry Ave North, Seattle, WA 98109 USA.
ABSTRACT:Protein chemical cross-linking and mass spectrometry enable the analysis of protein-protein interactions and protein topologies; however, complicated cross-linked peptide spectra require specialized algorithms to identify interacting sites. The Kojak cross-linking software application is a new, efficient approach to identify cross-linked peptides, enabling large-scale analysis of protein-protein interactions by chemical cross-linking techniques. The algorithm integrates spectral processing and scoring schemes adopted from traditional database search algorithms and can identify cross-linked peptides using many different chemical cross-linkers with or without heavy isotope labels. Kojak was used to analyze both novel and existing data sets and was compared to existing cross-linking algorithms. The algorithm provided increased cross-link identifications over existing algorithms and, equally importantly, delivered the results in a fraction of the computational time. The Kojak algorithm is open-source, cross-platform, and freely available. This software provides both existing and new cross-linking researchers with an effective way to derive additional cross-link identifications from new or existing data sets. For new users, it provides a simple analytical resource resulting in more cross-link identifications than other methods.
Use: pLink



A strategy for dissecting the architectures of native macromolecular assemblies
Nature Methods 2015. Shi, Y et al. Rockefeller Univ, Lab Mass Spectrometry & Gaseous Ion Chem, 1230 York Ave, New York, NY 10021 USA.
ABSTRACT:It remains particularly problematic to define the structures of native macromolecular assemblies, which are often of low abundance. Here we present a strategy for isolating complexes at endogenous levels from GFP-tagged transgenic cell lines. Using cross-linking mass spectrometry, we extracted distance restraints that allowed us to model the complexes' molecular architectures.
Use: pLink