pFind Studio: a computational solution for mass spectrometry-based proteomics
2024
Nature Communications 2024. Liu, Dan-Dan et al.
Life Sciences Institute, Department of Medical Oncology, The Second Affiliated Hospital of Zhejiang University School of Medicine, Zhejiang University, Hangzhou, Zhejiang, 310058, China
ABSTRACT: Latent bioreactive unnatural amino acids (Uaas) have been widely used in the development of covalent drugs and identification of protein interactors, such as proteins, DNA, RNA and carbohydrates. However, it is challenging to perform high-throughput identification of Uaa cross-linking products due to the complexities of protein samples and the data analysis processes. Enrichable Uaas can effectively reduce the complexities of protein samples and simplify data analysis, but few cross-linked peptides were identified from mammalian cell samples with these Uaas. Here we develop an enrichable and multiple amino acids reactive Uaa, eFSY, and demonstrate that eFSY is MS cleavable when eFSY-Lys and eFSY-His are the cross-linking products. An identification software, AixUaa, is developed to decipher eFSY mass cleavable data. We systematically identify direct interactomes of Thioredoxin 1 (Trx1) and Selenoprotein M (SELM) with eFSY and AixUaa.
Use: pFind; pLink
Analytical Chemistry 2024. Hu, Yingwei et al.
Department of Pathology, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21231, United States
ABSTRACT: Rapid development and wide adoption of mass spectrometry-based glycoproteomic technologies have empowered scientists to study proteins and protein glycosylation in complex samples on a large scale. This progress has also created unprecedented challenges for individual laboratories to store, manage, and analyze proteomic and glycoproteomic data, both in the cost for proprietary software and high-performance computing and in the long processing time that discourages on-the-fly changes of data processing settings required in explorative and discovery analysis. We developed an open-source, cloud computing-based pipeline, MS-PyCloud, with graphical user interface (GUI), for proteomic and glycoproteomic data analysis. The major components of this pipeline include data file integrity validation, MS/MS database search for spectral assignments to peptide sequences, false discovery rate estimation, protein inference, quantitation of global protein levels, and specific glycan-modified glycopeptides as well as other modification-specific peptides such as phosphorylation, acetylation, and ubiquitination. To ensure the transparency and reproducibility of data analysis, MS-PyCloud includes open-source software tools with comprehensive testing and versioning for spectrum assignments. Leveraging public cloud computing infrastructure via Amazon Web Services (AWS), MS-PyCloud scales seamlessly based on analysis demand to achieve fast and efficient performance. Application of the pipeline to the analysis of large-scale LC-MS/MS data sets demonstrated the effectiveness and high performance of MS-PyCloud. The software can be downloaded at https://github.com/huizhanglab-jhu/ms-pycloud.
Use: pFind
Nature Communications 2024. Ann Schirin Mirsanaye et al.
Center for Chromosome Stability, Department of Cellular and Molecular Medicine, University of Copenhagen, DK-2200, Copenhagen, Denmark
ABSTRACT: The hexameric AAA+ ATPase p97/VCP functions as an essential mediator of ubiquitin-dependent cellular processes, extracting ubiquitylated proteins from macromolecular complexes or membranes by catalyzing their unfolding. p97 is directed to ubiquitylated client proteins via multiple cofactors, most of which interact with the p97 N-domain. Here, we discover that FAM104A, a protein of unknown function also named VCF1 (VCP/p97 nuclear Cofactor Family member 1), acts as a p97 cofactor in human cells. Detailed structure-function studies reveal that VCF1 directly binds p97 via a conserved alpha-helical motif that recognizes the p97 N-domain with unusually high affinity, exceeding that of other cofactors. We show that VCF1 engages in joint p97 complex formation with the heterodimeric primary p97 cofactor UFD1-NPL4 and promotes p97-UFD1-NPL4-dependent proteasomal degradation of ubiquitylated substrates in cells. Mechanistically, VCF1 indirectly stimulates UFD1-NPL4 interactions with ubiquitin conjugates via its binding to p97 but has no intrinsic affinity for ubiquitin. Collectively, our findings establish VCF1 as an unconventional p97 cofactor that promotes p97-dependent protein turnover by facilitating p97-UFD1-NPL4 recruitment to ubiquitylated targets. p97/VCP, a nexus of the ubiquitin system, recognizes and unfolds ubiquitylated substrates via multiple cofactors. Here, the authors identify VCF1, a nuclear cofactor promoting p97 recruitment to, and proteasomal degradation of, ubiquitylated targets.
Use: pFind
Journal of Computer and Communications 2024. He, Changjiu et al.
School of Computer Science and Technology, Shandong University of Technology
ABSTRACT:
Use: pFind; pDeep; pParse
Journal of Proteomics 2024. Lunfei Zou et al.
Key Laboratory of Optoelectronic Chemical Materials and Devices, Ministry of Education, School of Optoelectronic Materials & Technology, Jianghan University, Wuhan 430056, Hubei, People's Republic of China
ABSTRACT: Investigating site-specific protein phosphorylation remains a challenging task. The present study introduces a two-step chemical derivatization method for accurate identification of phosphopeptides. Methylamine neutralizes carboxyl groups, thus reducing the adsorption of non-phosphorylated peptides during enrichment, while dimethylamine offers a cost-effective reagent for stable isotope labeling of phosphorylation sites. The derivatization improves the mass spectra obtained through liquid chromatography-tandem mass spectrometry. The product ions at m/z 58.07 and 64.10 Da, resulting from dimethylamine-d0 and dimethylamine-d6 labeled phosphorylation sites respectively, can serve as report ions. Derivatized phosphopeptides from casein demonstrate enhanced ionization and formation of product ions, yielding a significant increase in the number of identifiable peptides. When using the parallel reaction monitoring technique, it is possible to distinguish isomeric phosphopeptides with the same amino acid sequence but different phosphorylation sites. By employing a proteomic software and screening the report ions, we identified 29 endogenous phosphopeptides in 10 µL of human saliva with high reliability. These findings indicate that the two-step derivatization strategy has great potential in site-specific phosphorylation and large-scale phosphoproteomics research. Significance: There is a significant need to improve the accuracy of identifying phosphoproteins and phosphopeptides and analyzing them quantitatively. Several chemical derivatization techniques have been developed to label phosphorylation sites, thus enabling the identification and relative quantification of phosphopeptides. Nevertheless, these methods have limitations, such as incomplete conversion or the need for costly isotopic reagents. Building upon previous contributions, our study moves the field forward due to high efficiency in site-specific labeling, cost-effectiveness, improved sensitivity, and comprehensive product ion coverage. Using the two-step derivatization approach, we successfully identified 29 endogenous phosphopeptides in 10 µL of human saliva with high reliability. The outcomes underscore the possibility of the method for site-specific phosphorylation and large-scale phosphoproteomics investigations.
Use: pFind
2024. Humberto J. Ferreira et al.
Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
ABSTRACT: Circular RNAs (circRNAs) are covalently closed non-coding RNAs lacking the 5' cap and the poly-A tail. Nevertheless, it has been demonstrated that certain circRNAs can undergo active translation. Therefore, aberrantly expressed circRNAs in human cancers could be an unexplored source of tumor-specific antigens, potentially mediating anti-tumor T cell responses. This study presents an immunopeptidomics workflow with a specific focus on generating a circRNA-specific protein fasta reference. The main goal of this workflow is to streamline the process of identifying and validating human leukocyte antigen (HLA) bound peptides potentially originating from circRNAs. We increase the analytical stringency of our workflow by retaining peptides identified independently by two mass spectrometry search engines and/or by applying a group-specific FDR for canonical-derived and circRNA-derived peptides. A subset of circRNA-derived peptides specifically encoded by the region spanning the back-splice junction (BSJ) are validated with targeted MS, and with direct Sanger sequencing of the respective source transcripts. Our workflow identifies 54 unique BSJ-spanning circRNA-derived peptides in the immunopeptidome of melanoma and lung cancer samples. Our approach enlarges the catalog of source proteins that can be explored for immunotherapy.
Use: pFind
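Note: the two stringency filters named in the abstract above (consensus between two search engines, and a group-specific FDR) can be illustrated with a minimal sketch. This is not the authors' pipeline; the `PSM` record, the function names, the simple decoys/targets FDR estimate, and the toy scores are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (illustrative only)."""
    peptide: str
    score: float    # assumed: higher means a better match
    is_decoy: bool  # True if the match is to a decoy sequence
    group: str      # e.g. "canonical" or "circRNA"


def accept_at_fdr(psms, max_fdr=0.01):
    """Keep target peptides above the lowest score cutoff whose
    estimated FDR (decoys / targets) is still <= max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to retain
    for i, p in enumerate(ranked, start=1):
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = i
    return {p.peptide for p in ranked[:cutoff] if not p.is_decoy}


def group_specific_fdr(psms, max_fdr=0.01):
    """Apply the FDR filter separately within each peptide group."""
    groups = {}
    for p in psms:
        groups.setdefault(p.group, []).append(p)
    return {g: accept_at_fdr(members, max_fdr) for g, members in groups.items()}


def consensus(engine_a, engine_b):
    """Retain only peptides reported by both search engines."""
    return engine_a & engine_b


if __name__ == "__main__":
    toy = [
        PSM("A", 10, False, "canonical"),
        PSM("B", 9, False, "canonical"),
        PSM("C", 8, False, "canonical"),
        PSM("d1", 1, True, "canonical"),
        PSM("d2", 6, True, "circRNA"),
        PSM("X", 5, False, "circRNA"),
        PSM("Y", 4, False, "circRNA"),
    ]
    print("pooled:", sorted(accept_at_fdr(toy, 0.4)))
    print("per group:", group_specific_fdr(toy, 0.4))
```

In this toy data, pooling lets the many confident canonical matches dilute the decoy count, so the two circRNA peptides pass; filtering each group on its own rejects them. That is the sense in which a group-specific FDR is the more stringent choice for a small, error-prone peptide class.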
Nature Communications 2024. Hao Hu et al.
State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
ABSTRACT: Protein-modifying enzymes regulate the dynamics of myriad post-translational modification (PTM) substrates. Precise characterization of enzyme-substrate associations is essential for the molecular basis of cellular function and phenotype. Methods for direct capturing global substrates of protein-modifying enzymes in living cells are with many challenges, and yet largely unexplored. Here, we report a strategy to directly capture substrates of lysine-modifying enzymes via PTM-acceptor residue crosslinking in living cells, enabling global profiling of substrates of PTM-enzymes and validation of PTM-sites in a straightforward manner. By integrating enzymatic PTM-mechanisms, and genetically encoding residue-selective photo-crosslinker into PTM-enzymes, our strategy expands the substrate profiles of both bacterial and mammalian lysine acylation enzymes, including bacterial lysine acylases PatZ, YiaC, LplA, TmcA, and YjaB, as well as mammalian acyltransferases GCN5 and Tip60, leading to discovery of distinct yet functionally important substrates and acylation sites. The concept of direct capturing substrates of PTM-enzymes via residue crosslinking may extend to the other types of amino acid residues beyond lysine, which has the potential to facilitate the investigation of diverse types of PTMs and substrate-enzyme interactive proteomics.
Use: pFind; pLink
Computational and Structural Biotechnology Journal 2024. Anurag Raj et al.
G. N. Ramachandran Knowledge Centre for Genomics Informatics, CSIR Institute of Genomics and Integrative Biology, New Delhi, India
ABSTRACT: Variant peptides resulting from single nucleotide polymorphisms (SNPs) can lead to aberrant protein functions and have translational potential for disease diagnosis and personalized therapy. Variant peptides detected by proteogenomics are fraught with high number of false positives, but there is no uniform and comprehensive approach to assess variant quality across analysis pipelines. Despite class-specific FDR along with ad-hoc filters, the problem is far from solved. These protocols are typically manual and tedious, and thus not uniform across labs. We demonstrate that variant peptide rescoring, integrated with intensity, variant event information and search result features, allows better discrimination of correct variant peptides. Implemented into PgxSAVy - a tool for quality control of variant peptides, this method can tackle the high rate of false positives. PgxSAVy provides a rigorous framework for quality control and annotations of variant peptides on the basis of (i) variant quality, (ii) isobaric masses, and (iii) disease annotation. PgxSAVy demonstrated high accuracy by identifying true variants with 98.43% accuracy on simulated data. Large-scale proteogenomic reanalysis of ~2.8 million spectra (PXD004010 and PXD001468) resulted in 12,705 variant peptide spectrum matches (PSMs), of which PgxSAVy evaluated 3028 (23.8%), 1409 (11.1%) and 8268 (65.1%) as confident, semi-confident and doubtful respectively. PgxSAVy also annotates the variants based on their pathogenicity and provides support for assisted manual validation. The analysis of proteins carrying variants can provide fine granularity in discovering important pathways. PgxSAVy will advance personalized medicine by providing a comprehensive framework for quality control and prioritization of proteogenomics variants.
Use: pFind
Bioinformatics Advances 2024. Li, Jiancheng et al.
Department of Computer Science and Engineering, University of North Texas
ABSTRACT: Summary: Shotgun proteomics is widely used in many system biology studies to determine the global protein expression profiles of tissues, cultures, and microbiomes. Many non-distributed computer algorithms have been developed for users to process proteomics data on their local computers. However, the amount of data acquired in a typical proteomics study has grown rapidly in recent years, owing to the increasing throughput of mass spectrometry and the expanding scale of study designs. This presents a big data challenge for researchers to process proteomics data in a timely manner. To overcome this challenge, we developed a cloud-based parallel computing application to offer end-to-end proteomics data analysis software as a service (SaaS). A web interface was provided to users to upload mass spectrometry-based proteomics data, configure parameters, submit jobs, and monitor job status. The data processing was distributed across multiple nodes in a supercomputer to achieve scalability for large datasets. Our study demonstrated SaaS for proteomics as a viable solution for the community to scale up the data processing using cloud computing. Availability and implementation: This application is available online at https://sipros.oscer.ou.edu/ or https://sipros.unt.edu for free use. The source code is available at https://github.com/Biocomputing-Research-Group/CloudProteoAnalyzer under the GPL version 3.0 license.
Use: pFind; pDeep
Cell Metabolism 2024. Qianmin Ou et al.
Key Laboratory of Stem Cells and Tissue Engineering (Sun Yat-Sen University), Ministry of Education, Guangzhou 510080, China
ABSTRACT: Over 50 billion cells undergo apoptosis each day in an adult human to maintain immune homeostasis. Hydrogen sulfide (H2S) is also required to safeguard the function of immune response. However, it is unknown whether apoptosis regulates H2S production. Here, we show that apoptosis-deficient MRL/lpr (B6.MRL-Faslpr/J) and Bim-/- (B6.129S1-Bcl2l11tm1.1Ast/J) mice exhibit significantly reduced H2S levels along with aberrant differentiation of Th17 cells, which can be rescued by the additional H2S. Moreover, apoptotic cells and vesicles (apoVs) express key H2S-generating enzymes and generate a significant amount of H2S, indicating that apoptotic metabolism is an important source of H2S. Mechanistically, H2S sulfhydrates selenoprotein F (Sep15) to promote signal transducer and activator of transcription 1 (STAT1) phosphorylation and suppress STAT3 phosphorylation, leading to the inhibition of Th17 cell differentiation. Taken together, this study reveals a previously unknown role of apoptosis in maintaining H2S homeostasis and the unique role of H2S in regulating Th17 cell differentiation via sulfhydration of Sep15C38.
Use: pFind