pFind Studio: a computational solution for mass spectrometry-based proteomics



2018




Multiproteases combined with high-pH reverse-phase separation strategy verified fourteen missing proteins in human testis tissue
Journal of Proteome Research, 2018. Sun, Jinshuai et al. Demo Lab Thermofisher Sci China, Shanghai 200120, Peoples R China; Hebei Univ, Coll Life Sci, Hebei Prov Key Lab Res & Applicat Microbial Diver, Baoding 071002, Hebei, Peoples R China; Beijing Inst Life, Beijing Proteome Res Ctr, Natl Ctr Prot Sci Beijing, State Key Lab Prote, Beijing 102206, Peoples R China; Wuhan Univ, Sch Pharmaceut Sci, Key Lab Combinat Biosynth & Drug Discovery, Minist Educ, Wuhan 430072, Hubei, Peoples R China; Sun Yat Sen Univ, Sch Life Sci, State Key Lab Biocontrol, Guangdong Key Lab Plant Resources, Guangzhou 510275, Guangdong, Peoples R China
ABSTRACT:Subsequent to conducting the Chromosome-Centric Human Proteome Project, we have focused on human testis-enriched missing proteins (MPs) since 2015. For protein coverage to be enhanced, a multiprotease strategy was used for separation of samples by 10% SDS-PAGE. For the separating efficiency to be improved, a high-pH reverse phase (RP) separation strategy was applied to fractionate complex samples in this study. A total of 11,558 proteins was identified, which is the largest proteome data set for single human tissue sample so far. On the basis of this large-scale data set, we verified 14 MPs (PE2) in neXtProt (2018-01) after spectrum quality analysis, isobaric post-translational modification, and single amino acid variant filtering, and synthesized peptide matching. Tissue expression analysis showed that 3 of 14 MPs were testis-specific proteins. Functional analysis showed that 10 of 14 MPs were closely related to liver tumor, liver carcinoma, and hepatocellular carcinoma. Another 100 MPs were listed as candidates but required additional verification information. All MS data sets have been deposited into the ProteomeXchange with the identifier PXD009737.
Use: pFind



N-linked glycopeptide identification based on open mass spectral library search
Biomed Research International, 2018. An, ZW et al. Chinese Acad Sci, Acad Math & Syst Sci, Natl Ctr Math & Interdisciplinary Sci, Key Lab Random Complex Struct & Data Sci, Beijing 100101, Peoples R China.
ABSTRACT:Confident characterization of intact glycopeptides is a challenging task in mass spectrometry-based glycoproteomics due to microheterogeneity of glycosylation, complexity of glycans, and insufficient fragmentation of peptide bones. Open mass spectral library search is a promising computational approach to peptide identification, but its potential in the identification of glycopeptides has not been fully explored. Here we present pMatchGlyco, a new spectral library search tool for intact N-linked glycopeptide identification using high-energy collisional dissociation (HCD) tandem mass spectrometry (MS/MS) data. In pMatchGlyco, (1) MS/MS spectra of deglycopeptides are used to create spectral library, (2) MS/MS spectra of glycopeptides are matched to the spectra in library in an open (precursor tolerant) manner and the glycans are inferred, and (3) a false discovery rate is estimated for top-scored matches above a threshold. The efficiency and reliability of pMatchGlyco were demonstrated on a data set of mixture sample of six standard glycoproteins and a complex glycoprotein data set generated from human cancer cell line OVCAR3.
Use: pFind; pParse; pGlyco
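
The open (precursor-tolerant) matching step in the abstract above can be illustrated with a minimal sketch. It assumes the glycan is inferred from the mass difference between the glycopeptide precursor and the matched library peptide, and that candidate glycans are described only by hexose and HexNAc counts. The function names, the candidate list, and the 0.02 Da tolerance are hypothetical and are not taken from pMatchGlyco; the deamidation shift of deglycosylated library peptides is ignored for simplicity.

```python
# Monoisotopic residue masses (Da) of two common monosaccharides.
HEX = 162.05282
HEXNAC = 203.07937


def glycan_mass(n_hex, n_hexnac):
    """Mass of a glycan composed of n_hex hexoses and n_hexnac HexNAcs."""
    return n_hex * HEX + n_hexnac * HEXNAC


def infer_glycans(precursor_mass, library_peptide_mass, candidates, tol_da=0.02):
    """Return candidate (n_hex, n_hexnac) compositions whose mass explains
    the precursor mass difference within tol_da (hypothetical tolerance)."""
    delta = precursor_mass - library_peptide_mass
    return [c for c in candidates if abs(glycan_mass(*c) - delta) <= tol_da]


# Hypothetical candidate compositions: (hexose count, HexNAc count).
CANDIDATES = [(3, 2), (5, 2), (5, 4)]
```

For example, a precursor that is heavier than its library peptide by the mass of Hex5HexNAc2 would be assigned the composition `(5, 2)` from the candidate list.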



Digging for missing proteins using low-molecular-weight protein enrichment and a mirror protease strategy
Journal of Proteome Research, 2018. He, CT et al. Sun Yat Sen Univ, Sch Life Sci, State Key Lab Biocontrol, Guangdong Key Lab Plant Resources, Guangzhou 510275, Guangdong, Peoples R China.
ABSTRACT:In 2012, the Chromosome-centric Human Proteome Project (C-HPP) launched an investigation for missing proteins (MPs) to complete the Human Proteome Project (HPP). The majority of the MPs were distributed in low-molecular-weight (LMW) ranges, especially from 0 to 40 kDa. LMW protein identification is challenging, owing to their short length, low abundance, and hydrophobicity. Furthermore, many sequences from trypsin digestion are unlikely to yield detectable peptides or a reasonable quality of MS2 spectrum. Therefore, we focused on small MPs by combining LMW protein enrichment and a pair of complementary proteases strategy with trypsin and LysargiNase for human testis samples. In-depth testis LMW protein profiling resulted in the identification of 4063 proteins, of which 2565 were LMW proteins and 1130 had pairs of peptides generated from both trypsin and LysargiNase. This provided additional mass spectral evidence of further verification of small MPs. Finally, two MPs were verified from the seven MP candidates. One of them, Q8N688, was verified with two series of continuous and complementary b/y-product ions from the pairs of spectra for tryptic and LysargiNase digested peptides after the "mirror spectrum" matching. This make the confident identification of the representative peptides for the target MPs. On the contrary, the two verified peptides for Q86WR6 were identified with the same strategy from the gel-separation and gel-elution samples, respectively. Although the other five MP candidates showed high-quality spectra, they could not be sufficiently distinguished as PE1s and require further verification. All MS data sets have been deposited in the ProteomeXchange with identifier PXD010093.
Use: pFind



Proteomics investigation of the changes in serum proteins after high-and low-flux hemodialysis
Renal Failure, 2018. Han, S et al. Chinese Acad Sci, Dalian Inst Chem Phys, Natl Chromatog Res & Anal Ctr, Key Lab Separat Sci Analyt Chem, 457 Zhongshan Rd, Dalian 116023, Peoples R China.
ABSTRACT:Purpose: This study aimed to use proteomics methods to investigate the changes in serum protein levels after high- and low-flux hemodialysis (HD). Methods: Before and after HD, serum samples were obtained from two selected patients who were treated with a Polyflux 140H high-flux dialyzer and a Polyflux 14L low-flux dialyzer during two continuous therapy sessions. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was performed to identify the proteins. Results: A total of 212 and 203 serum proteins were identified after high-flux and low-flux HD, respectively. After high-flux HD, 21 proteins increased, and 132 proteins decreased. After low-flux HD, 87 proteins increased, and 45 proteins decreased. High-flux HD led to a significantly greater reduction in protein levels than low-flux HD (0.73 +/- 0.13 vs. 0.84 +/- 0.18, p = .00). Among the increased and decreased proteins, the isoelectric point (pI) values mainly ranged from 5 to 7, and the molecular weights (Mws) were mostly smaller than 30 kDa. The serum proteins showed no difference in pI or Mw for high- and low-flux HD. Gene ontology (GO) analysis showed that the detected proteins were related to immune system processes and complement activation. Conclusions: Serum protein levels differentially changed after high- and low-flux HD. Long-term effects should be observed in future studies.
Use: pFind



Optimal settings of mass spectrometry open search strategy for higher confidence
Journal of Proteome Research, 2018. Li, DH et al. Jinan Univ, Inst Life & Hlth Engn, Coll Life Sci & Technol, Key Lab Funct Prot Res Guangdong Higher Educ Inst, Guangzhou 510632, Guangdong, Peoples R China.
ABSTRACT:In most proteome mass spectrometry experiments, more than half of the mass spectra cannot be identified, mainly because of various modifications. The open search strategy allows for a larger precursor tolerance to utilize more spectra, especially those with post-translational modifications; however, thorough quality control based on independent information is lacking. Here, we used the "Suspicious Discovery Rate (SDR)" based on translatome sequencing (RNC-seq) as an independent source to reference the proteome open search results in steady-state cells. We found that the open search strategy increased the spectra utilization with the cost of increased suspicious identifications that lack translation evidence. We further found that restricting the peptide FDR below 0.1% efficiently controlled the suspicious identifications of open search methods and thus enhanced the confidence of the peptide identification with modifications comparable to the level of the traditional narrow window search. We then demonstrated the successful and validated identification of 27 single amino acid variations from the spectra of two cell lines using the open search strategy without a predefined database. These results validated the proper use of open search methods for higher-quality proteome identifications with information on post-translational modifications and single amino acid polymorphisms.
Use: pFind
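
The 0.1% peptide FDR cutoff mentioned in the abstract above can be sketched as follows. This is only an illustration under the assumption of a conventional target-decoy estimate (decoys divided by targets above a score threshold); the abstract does not state how the FDR was computed, and the function name and input layout are hypothetical.

```python
def accepted_targets(psms, max_fdr=0.001):
    """Count target matches accepted at the given FDR.

    psms: iterable of (score, is_decoy) pairs; higher scores are better.
    FDR at each rank is estimated as decoys / targets (an assumption).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the deepest rank at which the estimated FDR is still acceptable.
        if targets and decoys / targets <= max_fdr:
            best = targets
    return best
```

Tightening `max_fdr` from 0.01 to 0.001 simply moves the accepted cutoff up the ranked list, which is the kind of trade-off between spectra utilization and confidence that the abstract describes.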



Facile Cu(ii)-mediated conjugation of thioesters and thioacids to peptides and proteins under mild conditions
Organic & Biomolecular Chemistry, 2018. Sun, Y et al. Wuhan Univ, Sch Pharmaceut Sci, Zhongnan Hosp, State Key Lab Virol, Wuhan 430071, Hubei, Peoples R China.
ABSTRACT:The bioconjugation of peptide derivatives such as polypeptides, peptide-based probes and proteins is a vibrant area in many scientific fields. However, reports on metal-mediated chemical methods towards native peptides especially non-engineering protein modification under mild conditions are still limited. Herein, we describe a novel Cu(ii)-mediated strategy for the conjugation of thioesters/thioacids to peptides under mild conditions with high functional group tolerance. Based on this strategy, polypeptides, even peptide-based fluorescent probes, can be efficiently constructed. Finally, the selective modification of lysine residues of native Ub with thioesters could be realized and complete conjugation of Ub could be achieved even under equivalent Cu(ii). These promising results could greatly expand Cu(ii)-mediated reaction strategies on chemical biology and molecular imaging.
Use: pFind



Deep learning-based MSMS spectra reduction in support of running multiple protein search engines on cloud
2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018. Maabreh, M et al. Western Michigan Univ, Dept Comp Sci, Kalamazoo, MI 49008 USA.
ABSTRACT:The diversity of the available protein search engines with respect to the utilized matching algorithms, the low overlap ratios among their results and the disparity of their coverage encourage the community of proteomics to utilize ensemble solutions of different search engines. The advancing in cloud computing technology and the availability of distributed processing clusters can also provide support to this task. However, data transferring and results' combining, in this case, could be the major bottleneck. The flood of billions of observed mass spectra, hundreds of Gigabytes or potentially Terabytes of data, could easily cause the congestions, increase the risk of failure, poor performance, add more computations' cost, and waste available resources. Therefore, in this study, we propose a deep learning model in order to mitigate the traffic over cloud network and, thus reduce the cost of cloud computing. The model, which depends on the top 50 intensities and their m/z values of each spectrum, removes any spectrum which is predicted not to pass the majority voting of the participated search engines. Our results using three search engines namely: pFind, Comet and X!Tandem, and four different datasets are promising and promote the investment in deep learning to solve such type of Big data problems.
Use: pFind
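
The two ingredients named in the abstract above, a fixed-size input built from the 50 most intense peaks and a majority vote across search engines, can be sketched as below. This is not the authors' model: the zero-padding of short spectra, the function names, and the strict-majority rule are assumptions made for illustration, and the deep learning classifier itself is omitted.

```python
def top_peaks(spectrum, k=50):
    """Return the k most intense (mz, intensity) peaks of a spectrum,
    padded with (0.0, 0.0) so the result always has length k."""
    top = sorted(spectrum, key=lambda p: p[1], reverse=True)[:k]
    top += [(0.0, 0.0)] * (k - len(top))
    return top


def passes_majority(votes):
    """True if more than half of the search engines identified the spectrum.

    votes: list of booleans, one per engine (e.g. pFind, Comet, X!Tandem).
    """
    return sum(votes) > len(votes) / 2
```

A spectrum labelled by `passes_majority` could serve as a training target, with `top_peaks` supplying the fixed-length features, so that spectra predicted to fail the vote are dropped before being sent over the network.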



Selective Enrichment and Quantification of N-Terminal Glycine Peptides via Sortase A Mediated Ligation
Analytical Chemistry, 2018. Cao, T et al. Fudan Univ, Shanghai Canc Ctr, Shanghai 200032, Peoples R China.
ABSTRACT:The identification and quantification of low-abundant proteins are always impeded by high-abundant proteins in proteomic analysis because of the extreme complexity of peptide mixtures and wide dynamic range of protein abundances. Here, we developed a novel approach to enrich and quantify N-terminal glycine peptides through sortase A mediated ligation. This strategy was based on the formation of a covalent bond between the sortase A recognition motif LPXTG and a N-terminal glycine residue. Also, the quantification was achieved by introducing isotopically labeled threonine in the motif LPXTG. In this strategy, both the enrichment of N-terminal glycine peptides and the stable isotope labeling were achieved in a single step. We applied this approach for the proteome analysis of MCF-7 cell line. It was demonstrated a significant reduction in sample complexity via highly selective and efficient enrichment of N-terminal glycine peptides, thereby detecting lots of less abundant proteins and enhancing proteome coverage. In comparison to the untreated sample, an increase of 34% of proteins was additionally identified. Furthermore, 97% of proteins were successfully quantified with high accuracy. In summary, this quantitative N-terminal glycine peptides enrichment strategy is expected for high-throughput qualitative and quantitative proteomic analysis as a complementary approach to conventional shotgun proteomics.
Use: pFind



Myeloid-derived suppressor cells inhibit T cell activation through nitrating LCK in mouse cancers
Proceedings of the National Academy of Sciences of the United States of America, 2018. Feng, S et al. Univ Notre Dame, Dept Biol Sci, Notre Dame, IN 46556 USA.
ABSTRACT:Potent immunosuppressive mechanisms within the tumor microenvironment contribute to the resistance of aggressive human cancers to immune checkpoint blockade (ICB) therapy. One of the main mechanisms for myeloid-derived suppressor cells (MDSCs) to induce T cell tolerance is through secretion of reactive nitrogen species (RNS), which nitrates tyrosine residues in proteins involved in T cell function. However, so far very few nitrated proteins have been identified. Here, using a transgenic mouse model of prostate cancer and a syngeneic cell line model of lung cancer, we applied a nitroproteomic approach based on chemical derivation of 3-nitrotyrosine and identified that lymphocyte-specific protein tyrosine kinase (LCK), an initiating tyrosine kinase in the T cell receptor signaling cascade, is nitrated at Tyr394 by MDSCs. LCK nitration inhibits T cell activation, leading to reduced interleukin 2 (IL2) production and proliferation. In human T cells with defective endogenous LCK, wild type, but not nitrated LCK, rescues IL2 production. In the mouse model of castration-resistant prostate cancer (CRPC) by prostate-specific deletion of Pten, p53, and Smad4, CRPC is resistant to an ICB therapy composed of antiprogrammed cell death 1 (PD1) and anticytotoxic-T lymphocyte-associated protein 4 (CTLA4) antibodies. However, we showed that ICB elicits strong anti-CRPC efficacy when combined with an RNS neutralizing agent. Together, these data identify a previously unknown mechanism of T cell inactivation by MDSC-induced protein nitration and illuminate a clinical path hypothesis for combining ICB with RNS-reducing agents in the treatment of CRPC.
Use: pFind



A pathogen-derived effector modulates host glucose metabolism by arginine GlcNAcylation of HIF-1α protein
PLOS Pathogens, 2018. Xu, CX et al. Chinese Acad Sci, Inst Hydrobiol, State Key Lab Freshwater Ecol & Biotechnol, Wuhan, Hubei, Peoples R China.
ABSTRACT:The essential role of pathogens in host metabolism is widely recognized, yet the mechanisms by which they affect host physiology remain to be fully defined. Here, we found that NleB, an enteropathogenic Escherichia coli (EPEC) type III secretion system effector known to possess N-acetylglucosamine (GlcNAc) transferase activity, GlcNAcylates HIF-1α, a master regulator of cellular O2 homeostasis. We determined that NleB-mediated GlcNAcylation at a conserved arginine 18 (Arg18) at the N-terminus of HIF-1α enhanced HIF-1α transcriptional activity, thereby inducing HIF-1α downstream gene expression to alter host glucose metabolism. The arginine transferase activity of NleB was required for its enhancement of HIF-1α transactivity and the subsequent effect on glucose metabolism in a mouse model of EPEC infection. In addition, HIF-1α acted as a mediator to transact NleB-mediated induction of glucose metabolism-associated gene expression under hypoxia. Thus, our results further show a causal link between pathogen infection and host glucose metabolism, and we propose a new mechanism by which this occurs.
Use: pFind