pFind Studio: a computational solution for mass spectrometry-based proteomics



2022




Discovery of 194 Unreported Conopeptides and Identification of a New Protein Disulfide Isomerase in Conus caracteristicus Using Integrated Transcriptomic and Proteomic Analysis
Frontiers in Marine Science, 2022. Han Zhang et al. Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
ABSTRACT:Current ConoServer database accumulates 8,134 conopeptides from 122 species of cone snail, which are pharmaceutically attractive marine resource. However, many more conopeptides remain to be discovered, and the enzymes involved in their synthesis and processing are unclear. In this report, firstly we screened and analyzed the differentially expressed genes (DEGs) between venom duct (VD) and venom bulb (VB) of C. caracteristicus, and obtained 3,289 transcripts using a comprehensive assembly strategy. Then using de novo deep transcriptome sequencing and analysis under a strict merit, we discovered 194 previously unreported conopeptide precursors in Conus caracteristicus. Meanwhile, 2 predicted conopeptides from Consort were verified using liquid chromatography-mass spectrometry/mass spectrometry (LC-MS/MS). Furthermore, we demonstrated that both VD and VB of C. caracteristicus secreted hundreds of different conotoxins, which showed a high diversity among individuals of the species. Finally, we identified a protein disulfide isomerase (PDI) gene, which, functioning for intramolecular disulfide-bond folding, was shared among C. caracteristicus, C. textile, and C. bartschi and was the first PDI identified with five thioredoxin domains. Our results provide novel insights and fuel further studies of the molecular evolution and function of the novel conotoxins.
Use: pFind



A hybrid spectral library and protein sequence database search strategy for bottom-up and top-down proteomic data analysis
Journal of Proteome Research, 2022. Dai, YL et al. Univ Wisconsin, Dept Chem, Madison, WI 53706 USA
ABSTRACT:Tandem mass spectrometry (MS/MS) is widely employed for the analysis of complex proteomic samples. While protein sequence database searching and spectral library searching are both well-established peptide identification methods, each has shortcomings. Protein sequence databases lack fragment peak intensity information, which can result in poor discrimination between correct and incorrect spectrum assignments. Spectral libraries usually contain fewer peptides than protein sequence databases, which limits the number of peptides that can be identified. Notably, few post-translationally modified peptides are represented in spectral libraries. This is because few search engines can both identify a broad spectrum of PTMs and create corresponding spectral libraries. Also, programs that generate spectral libraries using deep learning approaches are not yet able to accurately predict spectra for the vast majority of PTMs. Here, we address these limitations through use of a hybrid search strategy that combines protein sequence database and spectral library searches to improve identification success rates and sensitivity. This software uses Global PTM Discovery (G-PTM-D) to produce spectral libraries for a wide variety of different PTMs. These features, along with a new spectrum annotation and visualization tool, have been integrated into the freely available and open-source search engine MetaMorpheus.
Use: pFind; pDeep
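
The abstract above notes that sequence databases lack fragment peak intensity information, whereas spectral libraries carry it. The following is a minimal sketch of why intensities help, using a binned cosine similarity between an observed spectrum and a library spectrum. It is not MetaMorpheus's scoring function; the function names, the 1.0 m/z bin width, and all peak values are assumptions chosen for illustration.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(observed, library, bin_width=1.0):
    """Cosine of the angle between two binned intensity vectors (0 to 1)."""
    a = bin_spectrum(observed, bin_width)
    b = bin_spectrum(library, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical spectra: both candidates match every library m/z value,
# so an m/z-only comparison cannot tell them apart.
library = [(175.1, 100.0), (288.2, 50.0), (401.3, 25.0)]
similar_pattern = [(175.12, 95.0), (288.18, 55.0), (401.31, 20.0)]
reversed_pattern = [(175.12, 20.0), (288.18, 55.0), (401.31, 95.0)]

score_similar = cosine_similarity(similar_pattern, library)
score_reversed = cosine_similarity(reversed_pattern, library)
print(f"similar intensity pattern:  {score_similar:.3f}")
print(f"reversed intensity pattern: {score_reversed:.3f}")
```

Both hypothetical candidates share the same fragment m/z values, so only the intensity pattern separates them; that is the extra discrimination a library search contributes in a hybrid strategy.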



A protocol of using PTMiner for quality control and localization of protein modifications identified by open or closed search of tandem mass spectra
Biophysics Reports, 2022. Cheng, Zhiyuan et al. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
ABSTRACT:In recent years, an open search of tandem mass spectra has greatly promoted the detection of post-translational modifications (PTMs) in shotgun proteomics. However, post-processing of the results from open searches remains an unsatisfactorily resolved problem, which hinders the open search mode from wide practical use. PTMiner is a software tool based on dedicated statistical algorithms for reliable filtering, localization and annotation of the modifications (mass shifts) detected by open search. Furthermore, PTMiner also supports quality control and re-localization of modifications identified by the traditional closed search. In this protocol, we describe how to use PTMiner for the two search modes. Currently, the search engines supported by PTMiner include pFind, MSFragger, MaxQuant, Comet, MS-GF+ and SEQUEST.
Use: pFind; pDeep



Identification and mechanism of G protein-biased ligands for chemokine receptor CCR1
Nature Chemical Biology, 2022. Shao, ZH et al. Zhejiang Univ, MOE Frontier Sci Ctr Brain Res & Brain Machine In, Sch Med, Hangzhou, Peoples R China; Zhejiang Univ, Affiliated Hosp 2, Dept Pharmacol, Key Lab Resp Dis Zhejiang Prov,Sch Med, Hangzhou, Peoples R China; Zhejiang Univ, Dept Pathol, Sir Run Run Shaw Hosp, Sch Med, Hangzhou, Peoples R China; Zhejiang Univ, Affiliated Hosp 2, Dept Resp & Crit Care Med, Key Lab Resp Dis Zhejiang Prov,Sch Med, Hangzhou, Peoples R China; Zhejiang Prov Key Lab Immun & Inflammatory Dis, Hangzhou, Peoples R China; Zhejiang Univ, Key Lab Resp Dis Zhejiang Prov, Dept Resp & Crit Care Med, Affiliated Hosp 2,Sch Med, Hangzhou, Peoples R China; State Key Lab Resp Dis, Guangzhou, Peoples R China; Zhejiang Univ, Liangzhu Lab, Med Ctr, Hangzhou, Peoples R China; Zhejiang Univ, Int Inst Med, Sch Med, Affiliated Hosp 4, Yiwu, Peoples R China; Zhejiang Univ, Dept Biophys, Sir Run Run Shaw Hosp, Sch Med, Hangzhou, Peoples R China
ABSTRACT:Biased signaling of G protein-coupled receptors describes an ability of different ligands that preferentially activate an alternative downstream signaling pathway. In this work, we identified and characterized different N-terminal truncations of endogenous chemokine CCL15 as balanced or biased agonists targeting CCR1, and presented three cryogenic-electron microscopy structures of the CCR1-Gi complex in the ligand-free form or bound to different CCL15 truncations with a resolution of 2.6-2.9 Å, illustrating the structural basis of natural biased signaling that initiates an inflammation response. Complemented with pharmacological and computational studies, these structures revealed it was the conformational change of Tyr291 (Y291^7.43) in CCR1 that triggered its polar network rearrangement in the orthosteric binding pocket and allosterically regulated the activation of β-arrestin signaling. Our structure of CCL15-bound CCR1 also exhibited a critical site for ligand binding distinct from many other chemokine-receptor complexes, providing new insights into the mode of chemokine recognition.
Use: pFind



MStoCIRC: A powerful tool for downstream analysis of MS/MS data to predict translatable circRNAs
Frontiers in Molecular Biosciences, 2022. Cao, Zhou et al. Shaanxi Normal Univ, Coll Life Sci, Key Lab Minist Educ Med Plant Resource & Nat Pharm, Natl Engn Lab Resource Dev Endangered Crude Drugs, Xian, Peoples R China
ABSTRACT:CircRNAs are formed by a non-canonical splicing method and appear circular in nature. CircRNAs are widely distributed in organisms and have the features of time- and tissue-specific expressions. CircRNAs have attracted increasing interest from scientists because of their non-negligible effects on the growth and development of organisms. The translation capability of circRNAs is a novel and valuable direction in the functional research of circRNAs. To explore the translation potential of circRNAs, some progress has been made in both experimental identification and computational prediction. For computational prediction, both CircCode and CircPro are ribosome profiling-based software applications for predicting translatable circRNAs, and the online databases riboCIRC and TransCirc analyze as many pieces of evidence as possible and list the predicted translatable circRNAs of high confidence. Simultaneously, mass spectrometry in proteomics is often recognized as an efficient method to support the identification of protein and peptide sequences from diverse complex templates. However, few applications fully utilize mass spectrometry to predict translatable circRNAs. Therefore, this research aims to build up a scientific analysis pipeline with two salient features: 1) it starts with the data analysis of raw tandem mass spectrometry data; and 2) it also incorporates other translation evidence such as IRES. The pipeline has been packaged into an analysis tool called mass spectrometry to translatable circRNAs (MStoCIRC). MStoCIRC is mainly implemented by Python3 language programming and could be downloaded from GitHub (). The tool contains a main program and several small, independent function modules, making it more multifunctional. MStoCIRC can process data efficiently and has obtained hundreds of translatable circRNAs in humans and Arabidopsis thaliana.
Use: pFind



High-throughput proteomic sample preparation using pressure cycling technology
Nature Protocols, 2022. Cai, X et al. Westlake Univ, Sch Life Sci, Westlake Lab Life Sci & Biomed, Key Lab Struct Biol Zhejiang Prov, Hangzhou, Peoples R China; Westlake Inst Adv Study, Inst Basic Med Sci, Hangzhou, Peoples R China
ABSTRACT:High-throughput lysis and proteolytic digestion of biopsy-level tissue specimens is a major bottleneck for clinical proteomics. Here we describe a detailed protocol of pressure cycling technology (PCT)-assisted sample preparation for proteomic analysis of biopsy tissues. A piece of fresh frozen or formalin-fixed paraffin-embedded tissue weighing ~0.1-2 mg is placed in a 150 μL pressure-resistant tube called a PCT-MicroTube with proper lysis buffer. After closing with a PCT-MicroPestle, a batch of 16 PCT-MicroTubes are placed in a Barocycler, which imposes oscillating pressure to the samples from one atmosphere to up to ~3,000 times atmospheric pressure. The pressure cycling schemes are optimized for tissue lysis and protein digestion, and can be programmed in the Barocycler to allow reproducible, robust and efficient protein extraction and proteolysis digestion for mass spectrometry-based proteomics. This method allows effective preparation of not only fresh frozen and formalin-fixed paraffin-embedded tissue, but also cells, feces and tear strips. It takes ~3 h to process 16 samples in one batch. The resulting peptides can be analyzed by various mass spectrometry-based proteomics methods. We demonstrate the applications of this protocol with mouse kidney tissue and eight types of human tumors.
Use: pFind



Probing strigolactone perception mechanisms with rationally designed small-molecule agonists stimulating germination of root parasitic weeds
Nature Communications, 2022. Wang, DW et al. Hunan Univ, State Key Lab Chemo Biosensing & Chemometr, Hunan Prov Key Lab Plant Funct Genom & Dev Regula, Coll Biol, Changsha 410082, Peoples R China; Nankai Univ, Collaborat Innovat Ctr Chem Sci & Engn, Natl Pesticide Engn Res Ctr, Coll Chem,Dept Chem Biol, Tianjin 300071, Peoples R China; Univ Amsterdam, Swammerdam Inst Life Sci SILS, Sci Pk 904, NL-1098 XH Amsterdam, Netherlands; Nankai Univ, State Key Lab Elementoorgan Chem, Collaborat Innovat Ctr Chem Sci & Engn, Coll Chem,Natl Pesticide Engn Res Ctr, Tianjin 300071, Peoples R China
ABSTRACT:The development of potent strigolactone (SL) agonists as suicidal germination inducers could be a useful strategy for controlling root parasitic weeds, but uncertainty about the SL perception mechanism impedes real progress. Here we describe small-molecule agonists that efficiently stimulate Phelipanche aegyptiaca and Striga hermonthica germination in concentrations as low as 10⁻⁸ to 10⁻¹⁷ M. We show that full efficiency of synthetic SL agonists in triggering signaling through the Striga SL receptor, ShHTL7, depends on the receptor-catalyzed hydrolytic reaction of the agonists. Additionally, we reveal that the stereochemistry of synthetic SL analogs affects the hydrolytic ability of ShHTL7 by influencing the probability of the privileged conformations of ShHTL7. Importantly, an alternative ShHTL7-mediated hydrolysis mechanism, proceeding via nucleophilic attack of the NE2 atom of H246 to the 2'C of the D-ring, is reported. Together, our findings provide insight into SL hydrolysis and structure-perception mechanisms, and potent suicide germination stimulants, which would contribute to the elimination of the noxious parasitic weeds. Strigolactone agonists could potentially help control noxious weeds by promoting suicidal germination. Here the authors describe a series of small molecule agonists that stimulate germination via the Striga ShHTL7 receptor and show that stereochemistry and hydrolysis-independent signalling mediate potency.
Use: pFind



Characterization of protein unfolding by fast cross-linking mass spectrometry using di-ortho-phthalaldehyde cross-linkers
Nature Communications, 2022. Wang, JH et al. Natl Inst Biol Sci NIBS, Beijing 102206, Peoples R China; Tsinghua Univ, Tsinghua Inst Multidisciplinary Biomed Res, Beijing 102206, Peoples R China; Peking Univ, Coll Chem & Mol Engn, Peking Tsinghua Ctr Life Sci,Minist Educ, Beijing Natl Lab Mol Sci,Key Lab Bioorgan Chem &, Beijing 100871, Peoples R China; Chinese Acad Sci, Innovat Acad Precis Measurement Sci & Technol, Wuhan 430071, Peoples R China
ABSTRACT:Conformations sampled by a protein while it unfolds are difficult to visualize. Here, the authors develop di-ortho-phthalaldehyde cross-linkers for rapid chemical cross-linking mass spectrometry analysis and demonstrate that this method captures the conformations of protein unfolding intermediates. Chemical cross-linking of proteins coupled with mass spectrometry is widely used in protein structural analysis. In this study we develop a class of non-hydrolyzable amine-selective di-ortho-phthalaldehyde (DOPA) cross-linkers, one of which is called DOPA2. Cross-linking of proteins with DOPA2 is 60-120 times faster than that with the N-hydroxysuccinimide ester cross-linker DSS. Compared with DSS cross-links, DOPA2 cross-links show better agreement with the crystal structures of tested proteins. More importantly, DOPA2 has unique advantages when working at low pH, low temperature, or in the presence of denaturants. Using staphylococcal nuclease, bovine serum albumin, and bovine pancreatic ribonuclease A, we demonstrate that DOPA2 cross-linking provides abundant spatial information about the conformations of progressively denatured forms of these proteins. Furthermore, DOPA2 cross-linking allows time-course analysis of protein conformational changes during denaturant-induced unfolding.
Use: pFind; pLink



Deep coverage proteome analysis of hair shaft for forensic individual identification
Forensic Science International: Genetics, 2022. Wu, JL et al. Chinese Acad Sci, Natl Chromatog Res & Anal Ctr, Dalian Inst Chem Phys, CAS Key Lab Separat Sci Analyt Chem, 457 Zhongshan Rd, Dalian 116023, Peoples R China; Peoples Publ Secur Univ China, Grad Sch, 1 Muxidi Nanli, Beijing 100038, Peoples R China; Inst Forens Sci, Natl Engn Lab Forens Sci, Key Lab Forens Genet, Minist Publ Secur, 17 Muxidi Nanli, Beijing 100038, Peoples R China
ABSTRACT:Hair shaft is one of the most common biological evidence found at crime scenes. However, due to the biogenic degradation of nuclear DNA in hair shaft, it is difficult to achieve individual identification through routine DNA analysis. In contrast, the proteins in hair shaft are stable and contain genetic polymorphisms in the form of single amino acid polymorphisms (SAPs), translated from non-synonymous single nucleotide polymorphisms (nsSNPs) in the genome. However, the number of SAPs detected still cannot meet the requirements of practical applications. This paper developed a deep coverage proteome analysis method by combining a three-step sequential ionic liquid-based protein extraction and 2D-RPLC-MS/MS with high and low pH to identify both variant and reference SAPs from 2-cm-long hair shafts. We identified 632 ± 243 protein groups from 10 individuals, with the average number of SAPs reaching 167 ± 21/person. These were further used to calculate random match probabilities (RMPs), a widely accepted forensic statistical term for human identification. The RMPs ranged from 6.53 × 10⁻⁴ to 3.10 × 10⁻¹⁴ (median = 2.62 × 10⁻⁸) when calculated with frequency of matching nsSNP genotype data from exomes, and ranged from 2.62 × 10⁻³ to 2.07 × 10⁻¹⁰ (median = 4.88 × 10⁻⁶) with SAP genotype frequency. All these results indicate that the deep coverage proteomics method is beneficial for improving SAP-based forensic individual identification in hair shaft, with great potential in crime investigation.
Use: pFind
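
The random match probabilities reported in the hair-shaft abstract above are conventionally obtained with the product rule: the population frequencies of the observed genotypes are multiplied together, assuming the loci are independent. The sketch below illustrates only that general rule; it is not the authors' pipeline, the function name is hypothetical, and the frequencies are made-up values rather than data from the paper.

```python
from math import prod


def random_match_probability(genotype_frequencies):
    """Product-rule RMP: multiply per-locus genotype frequencies.

    Assumes the loci are statistically independent, which is an
    assumption of this sketch rather than a claim about the paper.
    """
    freqs = list(genotype_frequencies)
    if not freqs:
        raise ValueError("at least one genotype frequency is required")
    if any(not 0.0 < f <= 1.0 for f in freqs):
        raise ValueError("each frequency must lie in the interval (0, 1]")
    return prod(freqs)


# Hypothetical population frequencies for four SAP-inferred genotypes.
example_frequencies = [0.5, 0.25, 0.1, 0.8]
rmp = random_match_probability(example_frequencies)
print(f"RMP = {rmp:.3g} (about 1 in {1 / rmp:,.0f})")
```

Because every factor is at most 1, each additional informative SAP can only keep the RMP the same or lower it, which is why deeper proteome coverage improves discrimination between individuals.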