pFind Studio: a computational solution for mass spectrometry-based proteomics



2023




Molecular characterization of extracellular vesicles derived from follicular fluid of women with and without PCOS: integrating analysis of differential miRNAs and proteins reveals vital molecules involving in PCOS
Journal of Assisted Reproduction and Genetics, 2023. Yang, Yuqin et al. Nanjing Med Univ, Womens Hosp, Nanjing Matern & Child Hlth Care Hosp, Dept Reprod Med, Nanjing, Peoples R China
ABSTRACT:Purpose: To elucidate the characterization of extracellular vesicles (EVs) in the follicular fluid-derived extracellular vesicles (FF-EVs) and discover critical molecules and signaling pathways associating with the etiology and pathobiology of PCOS, the differentially expressed miRNAs (DEmiRNAs) and differentially expressed proteins profiles (DEPs) were initially explored and combinedly analyzed. Methods: First, the miRNA and protein expression profiles of FF-EVs in PCOS patients and control patients were compared by RNA-sequencing and tandem mass tagging (TMT) proteomic methods. Subsequently, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes were used to analyze the biological function of target genes of DEmiRNAs and DEPs. Finally, to discover the functional miRNA-target gene-protein interaction pairs involved in PCOS, DEmiRs target gene datasets and DEPs datasets were used integratedly. Results: A total of 6 DEmiRNAs and 32 DEPs were identified in FF-EVs in patients with PCOS. Bioinformatics analysis revealed that DEmiRNAs target genes are mainly involved in thiamine metabolism, insulin secretion, GnRH, and Apelin signaling pathway, which are closely related to the occurrence of PCOS. DEPs also closely related to hormone metabolism processes such as steroid hormone biosynthesis. In the analysis integrating DEmiRNAs target genes and DEPs, two molecules, GRAMD1B and STPLC2, attracted our attention that are closely associated with cholesterol transport and ceramide biosynthesis, respectively. Conclusion: Dysregulated miRNAs and proteins in FF-EVs, mainly involving in hormone metabolism, insulin secretion, neurotransmitters regulation, adipokine expression, and secretion, may be closely related to PCOS. The effects of GRAMD1B and STPLC2 on PCOS deserve further study.
Use: pFind



Fully integrated on-line strategy for highly sensitive proteome profiling of 10-500 mammalian cells
Analyst, 2023. Yang, Yun et al. Chinese Acad Sci, Inst Neurosci, CAS Ctr Excellence Brain Sci & Intelligence Techno, State Key Lab Neurosci, Shanghai 200031, Peoples R China; Hong Kong Univ Sci & Technol, Dept Chem & Biol Engn, Clear Water Bay, Kowloon, Hong Kong, Peoples R China; Southern Univ Sci & Technol, Sch Sci, Dept Chem, 1088 Xueyuan Ave, Shenzhen 518055, Peoples R China
ABSTRACT:Recent development in proteomic sample preparation using nanofluidic devices has made single-cell proteome profiling possible. However, these nanofluidic devices require special expertise and costly nanopipetting instruments. They are also specially designed for single cells and are not well-suited for profiling rare samples consisting of a few hundred mammalian cells, arguably a more common need that remains a great challenge. Herein, we developed an easy-to-use and scalable device for processing low-input samples, which combined the merits of previously reported rare cell proteomic reactor (RCPR) and mixed-mode simple and integrated spintip-based proteomics technology, as an alternative to nanofluidic devices. All steps of proteomics sample preparation, including protein preconcentration, impurity removal, reduction, alkylation, digestion, and desalting, were fully integrated in our workflow, and the device can be directly connected to online nanoLC-MS system after processing the rare samples. Using the developed 3-frit mixed-mode RCPR, we identified on average 946 +/- 158, 2 998 +/- 106, and 3 934 +/- 85 protein groups in data-dependent acquisition (DDA) mode from 10, 100, and 500 fluorescence-activated cell sorting (FACS)-sorted 293T cells, respectively. As an illustrative application of this technology, we performed a label-free proteome comparison of 500 FACS-sorted mouse cochlear hair cells of two different ages. On average, 2 595 +/- 230 and 2 042 +/- 120 protein groups were quantified in the juvenile and the adult samples in DDA mode, respectively, achieving dynamic ranges of over 6 orders of magnitude for both.
Use: pFind



VCF1 is an unconventional p97/VCP cofactor promoting recognition of ubiquitylated p97-UFD1-NPL4 substrates
2023. Mirsanaye, Ann Schirin et al. Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, DK-2200, Copenhagen, Denmark
ABSTRACT:
Use: pFind



DbyDeep: Exploration of MS-Detectable Peptides via Deep Learning
Analytical Chemistry, 2023. Juho Son et al. Hanyang Univ, Dept Comp Sci, Seoul 04763, South Korea; Hanyang Univ, Inst Artificial Intelligence Res, Seoul 04763, South Korea
ABSTRACT:Predicting peptide detectability is useful in a variety of mass spectrometry (MS)-based proteomics applications, particularly targeted proteomics. However, most machine learning-based computational methods have relied solely on information from the peptide itself, such as its amino acid sequences or physicochemical properties, despite the fact that peptides detected by MS are dependent on many factors, including protein sample preparation, digestion, separation, ionization, and precursor selection during MS experiments. DbyDeep (Detectability by Deep learning) is an innovative end-to-end LSTM network model for peptide detectability prediction that incorporates sequence contexts of peptides and their cleavage sites (by protease). Utilizing the cleavage site contexts could improve the performance of prediction, and DbyDeep outperformed existing methods in predicting peptides recognizable from multiple MS/MS data sets with diverse species and MS instruments. We argue for the necessity of a learning model that encompasses several contexts associated with peptide detection, as opposed to depending just on peptide sequences. There is a Python implementation of DbyDeep at https://github.com/BISCodeRepo/DbyDeep.
Use: pFind



Novel Proteoform Discovery by Precise Semi-De Novo Sequencing of Novel Junction Peptides
Analytical Chemistry, 2023. Zhitai Hao et al. Chinese Acad Med Sci & Peking Union Med Coll, Peking Union Med Coll Hosp, State Key Lab Complex Severe & Rare Dis, Dept Med Res Ctr, Beijing 100730, Peoples R China; Peking Univ, Acad Adv Interdisciplinary Studies, Peking Tsinghua Ctr Life Sci, Beijing 100871, Peoples R China; Peking Univ Hlth Sci Ctr, Ctr Precis Med Multi Res, Beijing 100191, Peoples R China
ABSTRACT:Alternative splicing allows a small number of human genes to encode large amounts of proteoforms that play essential roles in normal and disease physiology. Some low-abundance proteoforms may remain undiscovered due to limited detection and analysis capabilities. Peptides coencoded by novel exons and annotated exons separated by introns are called novel junction peptides, which are the key to identifying novel proteoforms. Traditional de novo sequencing does not take into account the specificity in the composition of the novel junction peptide and is therefore not as accurate. We first developed a novel de novo sequencing algorithm, CNovo, which outperformed the mainstream PEAKS and Novor in all six test sets. We then built on CNovo to develop a semi-de novo sequencing algorithm, SpliceNovo, specifically for identifying novel junction peptides. SpliceNovo identifies junction peptides with much higher accuracy than CNovo, CJunction, PEAKS, and Novor. Of course, it is also possible to replace the built-in CNovo in SpliceNovo with other more accurate de novo sequencing algorithms to further improve its performance. We also successfully identified and validated two novel proteoforms of the human EIF4G1 and ELAVL1 genes by SpliceNovo. Our results significantly improve the ability to discover novel proteoforms through de novo sequencing.
Use: pFind; pNovo



Proteome Landscapes of Human Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma
Molecular & Cellular Proteomics: MCP, 2023. Xiao Yi et al. Center for ProtTalks, Westlake Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
ABSTRACT:Liver cancer is among the top leading causes of cancer mortality worldwide. Particularly, hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (CCA) have been extensively investigated from the aspect of tumor biology. However, a comprehensive and systematic understanding of the molecular characteristics of HCC and CCA remains absent. Here, we characterized the proteome landscapes of HCC and CCA using the data-independent acquisition (DIA) mass spectrometry (MS) method. By comparing the quantitative proteomes of HCC and CCA, we found several differences between the two cancer types. In particular, we found an abnormal lipid metabolism in HCC and activated extracellular matrix-related pathways in CCA. We next developed a three-protein classifier to distinguish CCA from HCC, achieving an area under the curve (AUC) of 0.92, and an accuracy of 90% in an independent validation cohort of 51 patients. The distinct molecular characteristics of HCC and CCA presented in this study provide new insights into the tumor biology of these two major important primary liver cancers. Our findings may help develop more efficient diagnostic approaches and new targeted drug treatments.
Use: pFind



Characterization of natural peptides in Pheretima by integrating proteogenomics and label-free peptidomics
Journal of Pharmaceutical Analysis, 2023. Luo, Xiaoxiao et al. Shanghai Research Center for Modernization of Traditional Chinese Medicine, National Engineering Research Center of TCM Standardization Technology, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
ABSTRACT:
Use: pFind; pDeep



RNA polymerase drives ribonucleotide excision DNA repair in E. coli
Cell, 2023. Nudler, Evgeny et al. New York Univ, Dept Biochem & Mol Pharmacol, Grossman Sch Med, New York, NY 10016 USA; New York Univ, Howard Hughes Med Inst, Grossman Sch Med, New York, NY 10016 USA
ABSTRACT:Ribonuclease HII (RNaseHII) is the principal enzyme that removes misincorporated ribonucleoside monophosphates (rNMPs) from genomic DNA. Here, we present structural, biochemical, and genetic evidence demonstrating that ribonucleotide excision repair (RER) is directly coupled to transcription. Affinity pull-downs and mass-spectrometry-assisted mapping of in cellulo inter-protein cross-linking reveal the majority of RNaseHII molecules interacting with RNA polymerase (RNAP) in E. coli. Cryoelectron microscopy structures of RNaseHII bound to RNAP during elongation, with and without the target rNMP substrate, show specific protein-protein interactions that define the transcription-coupled RER (TC-RER) complex in engaged and unengaged states. The weakening of RNAP-RNaseHII interactions compromises RER in vivo. The structure-functional data support a model where RNaseHII scans DNA in one dimension in search for rNMPs while "riding" the RNAP. We further demonstrate that TC-RER accounts for a significant fraction of repair events, thereby establishing RNAP as a surveillance "vehicle" for detecting the most frequently occurring replication errors.
Use: pFind; pLink



Characterization of the natural peptidome of four leeches by integrated proteogenomics and pseudotargeted peptidomics
Analytical and Bioanalytical Chemistry, 2023. Jingmei Liao et al. Chinese Acad Sci, Shanghai Inst Mat Med, Shanghai Res Ctr Modernizat Tradit Chinese Med, Natl Engn Res Ctr TCM Standardizat Technol, Shanghai 201203, Peoples R China; Nanjing Univ Chinese Med, Sch Chinese Mat Med, Nanjing 210023, Jiangsu, Peoples R China; Univ Chinese Acad Sci, 19A Yuquan Rd, Beijing 100049, Peoples R China
ABSTRACT:Animal-derived drugs are an indispensable part of folk medicine worldwide. However, their chemical constituents are poorly approached, which leads to the low level of the quality standard system of animal-derived drugs and further causes a chaotic market. Natural peptides are ubiquitous throughout the organism, especially in animal-derived drugs. Thus, in this study, we used multi-source leeches, including Hirudo nipponica (HN), Whitmania pigra (WP), Whitmania acranulata (WA), and Poecilobdella manillensis (PM), as a model. A strategy integrating proteogenomics and novel pseudotargeted peptidomics was developed to characterize the natural peptide phenotype and screen for signature peptides of four leech species. First, natural peptides were sequenced against an in-house annotated protein database of closely related species constructed from RNA-seq data from the Sequence Read Archive (SRA) website, which is an open-sourced public archive resource. Second, a novel pseudotargeted peptidomics integrating peptide ion pair extraction and retention time transfer was established to achieve high coverage and quantitative accuracy of the natural peptides and to screen for signature peptides for species authentication. In all, 2323 natural peptides were identified from four leech species whose databases were poorly annotated. The strategy was shown to significantly improve peptide identification. In addition, 36 of 167 differential peptides screened by pseudotargeted proteomics were identified, and about one-third of them came from the leucine-rich repeat domain (LRR) proteins, which are widely distributed in organisms. Furthermore, six signature peptides were screened with good specificity and stability, and four of them were validated by synthetic standards. Finally, a dynamic multiple reaction monitoring (dMRM) method based on these signature peptides was established and revealed that one-half of the commercial samples and all of the Tongxinluo capsules were derived from WP. All in all, the strategy developed in this study was effective for natural peptide characterization and signature peptide screening, which could also be applied to other animal-derived drugs, especially for modelless species that are less studied in protein database annotation.
Use: pFind



Identification of novel smORFs and microprotein acting in response to rehydration of Nostoc flagelliforme
Proteomics, 2023. Peng, Zhao et al.
ABSTRACT:Nostoc flagelliforme, a terrestrial cyanobacterium spread throughout arid and semi-arid areas, has been long known for its outstanding adaptability to extremely dry conditions. This microorganism is able to recover biological activities within hours after months of anhydrobiosis state, attracting investigation through proteomic analysis. Except for canonical proteome, microproteins encoded by small ORFs (smORFs) have recently been regarded as indispensable participants in metabolic processes. However, the involvement of smORFs in N. flagelliforme remains unknown. Here we first constructed a smORF database in N. flagelliforme using bioinformatic prediction, resulting in 6072 novel smORFs. Then LC-MS/MS analysis was applied to identify expression patterns of microproteins and seek smORFs and their encoded microprotein playing a role during rehydration. In total, 18 novel microproteins were mined based on a smORF searching strategy combined with three proteomic assays, of which five were annotated as ribosomal proteins, one as RNA polymerase subunit, and one as acetohydroxy acid isomeroreductase. We also suggested the possible functions of smORFs according to their expression pattern and discovered two neighboring and homologous smORFs. All these results will expand our knowledge of smORFs-encoded microproteins and their relation to the stress response of extremophilic microorganisms.
Use: pFind; pDeep