pFind Studio: a computational solution for mass spectrometry-based proteomics



2021




Structure of cyanobacterial phycobilisome core revealed by structural modeling and chemical cross-linking
Science Advances 2021. Liu, HJ et al. Washington Univ, Dept Chem, St Louis, MO 63130 USA.
ABSTRACT: In cyanobacteria and red algae, the structural basis dictating efficient excitation energy transfer from the phycobilisome (PBS) antenna complex to the reaction centers remains unclear. The PBS has several peripheral rods and a central core that binds to the thylakoid membrane, allowing energy coupling with photosystem II (PSII) and PSI. Here, we have combined chemical cross-linking mass spectrometry with homology modeling to propose a tricylindrical cyanobacterial PBS core structure. Our model reveals a side-view crossover configuration of the two basal cylinders, consolidating the essential roles of the anchoring domains composed of the ApcE PB loop and ApcD, which facilitate the energy transfer to PSII and PSI, respectively. The uneven bottom surface of the PBS core contrasts with the flat reducing side of PSII. The extra space between two basal cylinders and PSII provides increased accessibility for regulatory elements, e.g., orange carotenoid protein, which are required for modulating photochemical activity.
Use: pLink



Clostridium perfringens suppressing activity in black soldier fly protein preparations
LWT - Food Science and Technology 2021. Dong, LY et al. Wageningen Res, Food & Biobased Res, Bornse Weilanden 9, NL-6708 WG Wageningen, Netherlands.
ABSTRACT: Clostridium perfringens is a commensal, but also an opportunistic pathogen that can lead to lethal diseases as a result of overgrowth when homeostasis is disrupted. The current course of treatment is antibiotics. However, with increasing antibiotic resistance alternatives are required. We investigated the antimicrobial capacity of digest from different black soldier fly- and mealworm-derived fractions towards C. perfringens by using in vitro models. Culturing C. perfringens with digest of insect-derived fractions showed that fractions containing black soldier fly larvae protein significantly (p < 0.05) inhibited the growth of C. perfringens. In relation to this effect, many small (<5 amino acids) anti-microbial peptides were identified. The impact on healthy microbiota was also investigated through 16S rRNA sequencing and SCFA secretion following exposure of healthy faecal-derived microbiota to digests. This revealed a small but significant (p < 0.05) reduction in abundance and diversity of microbiota, mainly a result of a strong reduction in Firmicutes (e.g. Enterobacter) and increased abundance of Proteobacteria (e.g. Klebsiella). These changes coincided with increased levels of acetic, propionic, and butyric acid secretion. The combined impact of black soldier fly larvae protein on these in vitro assays suggest it can be a promising additional tool to combat C. perfringens infection.
Use: pNovo



Peptide presentations of marsupial MHC class I visualize immune features of lower mammals paralleled with bats
Journal of Immunology 2021. Wang, PY et al. Wenzhou Med Univ, Sch Lab Med & Life Sci, Wenzhou, Peoples R China; Zhejiang Univ, Affiliated Hosp 1, Sch Med, State Key Lab Diag & Treatment Infect Dis, Hangzhou, Zhejiang, Peoples R China; Chinese Ctr Dis Control & Prevent, Natl Inst Viral Dis Control & Prevent, 155 Changbai Rd, Beijing 100101, Peoples R China.
ABSTRACT: Marsupials are one of three major mammalian lineages that include the placental eutherians and the egg-laying monotremes. The marsupial brushtail possum is an important protected species in the Australian forest ecosystem. Molecules encoded by the MHC genes are essential mediators of adaptive immune responses in virus-host interactions. Yet, nothing is known about the peptide presentation features of any marsupial MHC class I (MHC I). This study identified a series of possum MHC I Trvu-UB*01:01 binding peptides derived from wobbly possum disease virus (WPDV), a lethal virus of both captive and feral possum populations, and unveiled the structure of marsupial peptide/MHC I complex. Notably, we found the two brushtail possum-specific insertions, the 3-aa Ile(52)Glu(53)Arg(54) and 1-aa Arg(154) insertions are located in the Trvu-UB*01:01 peptide binding groove (PBG). The 3-aa insertion plays a pivotal role in maintaining the stability of the N terminus of Trvu-UB*01:01 PBG. This aspect of marsupial PBG is unexpectedly similar to the bat MHC I Ptal-N*01:01 and is shared with lower vertebrates from elasmobranch to monotreme, indicating an evolution hotspot that may have emerged from the pathogen-host interactions. Residue Arg(154) insertion, located in the α2 helix, is available for TCR recognition, and it has a particular influence on promoting the anchoring of peptide WPDV-12. These findings add significantly to our understanding of adaptive immunity in marsupials and its evolution in vertebrates. Our findings have the potential to impact the conservation of the protected species brushtail possum and other marsupial species.
Use: pNovo



Improved Identification of Small Open Reading Frames Encoded Peptides by Top-Down Proteomic Approaches and De Novo Sequencing
International Journal of Molecular Sciences 2021. Wang, B et al. Cent China Normal Univ, Sch Life Sci, Hubei Key Lab Genet Regulat & Integrat Biol, 152 Luoyu Rd, Wuhan 430079, Peoples R China.
ABSTRACT: Small open reading frames (sORFs) have translational potential to produce peptides that play essential roles in various biological processes. Nevertheless, many sORF-encoded peptides (SEPs) are still on the prediction level. Here, we construct a strategy to analyze SEPs by combining top-down and de novo sequencing to improve SEP identification and sequence coverage. With de novo sequencing, we identified 1682 peptides mapping to 2544 human sORFs, which were all first characterized in this work. Two-thirds of these new sORFs have reading frame shifts and use a non-ATG start codon. The top-down approach identified 241 human SEPs, with high sequence coverage. The average length of the peptides from the bottom-up database search was 19 amino acids (AA); from de novo sequencing, it was 9 AA; and from the top-down approach, it was 25 AA. The longer peptide positively boosts the sequence coverage, more efficiently distinguishing SEPs from the known gene coding sequence. Top-down has the advantage of identifying peptides with sequential K/R or high K/R content, which is unfavorable in the bottom-up approach. Our method can explore new coding sORFs and obtain highly accurate sequences of their SEPs, which can also benefit future function research.
Use: pNovo



Mapping Microproteins and ncRNA-Encoded Polypeptides in Different Mouse Tissues
Frontiers in Cell and Developmental Biology 2021. Pan, N et al. Cent China Normal Univ, Sch Life Sci, Hubei Key Lab Genet Regulat & Integrat Biol, Wuhan, Peoples R China.
ABSTRACT: Small open reading frame encoded peptides (SEPs), also called microproteins, play a vital role in biological processes. Plenty of their open reading frames are located within the non-coding RNA (ncRNA) range. Recent research has demonstrated that ncRNA-encoded polypeptides have essential functions and exist ubiquitously in various tissues. To better understand the role of microproteins, especially ncRNA-encoded proteins, expressed in different tissues, we profiled the proteomic characterization of five mouse tissues by mass spectrometry, including bottom-up, top-down, and de novo sequencing strategies. Bottom-up and top-down with database-dependent searches identified 811 microproteins in the OpenProt database. De novo sequencing identified 290 microproteins, including 12 ncRNA-encoded microproteins that were not found in current databases. In this study, we discovered 1,074 microproteins in total, including 270 ncRNA-encoded microproteins. From the annotation of these microproteins, we found that the brain contains the largest number of neuropeptides, while the spleen contains the most immunoassociated microproteins. This suggests that microproteins in different tissues have tissue-specific functions. These unannotated ncRNA-coded microproteins have predicted domains, such as the macrophage migration inhibitory factor domain and the Prefoldin domain. These results expand the mouse proteome and provide insight into the molecular biology of mouse tissues.
Use: pNovo



Computationally instrument-resolution-independent de novo peptide sequencing for high-resolution devices
Nature Machine Intelligence 2021. Qiao, R et al. Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON, Canada.
ABSTRACT: De novo peptide sequencing is the key technology for finding novel peptides from mass spectra. The overall quality of sequencing results depends on the de novo peptide sequencing algorithm as well as the quality of mass spectra. Over the past decade, the resolution and accuracy of mass spectrometers have improved by orders of magnitude and higher-resolution mass spectra have been generated. How to effectively take advantage of those high-resolution data without substantially increasing the computational complexity remains a challenge for de novo peptide sequencing tools. Here we present PointNovo, a neural network-based de novo peptide sequencing model that can robustly handle any resolution levels of mass spectrometry data while keeping the computational complexity unchanged. Our extensive experiment results show PointNovo outperforms existing de novo peptide sequencing tools by capitalizing on the ultra-high resolution of the latest mass spectrometers. Increased resolution of mass spectroscopy can provide better data for sequencing, but also increases the computational complexity of analysing the data. Qiao and colleagues present here a neural network-based method that processes sequencing data of any resolution while improving the accuracy of predicted sequences.
Use: pNovo



Reaction Tracking and High-Throughput Screening of Active Compounds in Combinatorial Chemistry by Tandem Mass Spectrometry Molecular Networking
Analytical Chemistry 2021. Chung, HH et al. Natl Taiwan Univ, Dept Chem, Taipei 10617, Taiwan.
ABSTRACT: Combinatorial synthesis has been widely used as an efficient strategy to screen for active compounds. Mass spectrometry is the method of choice in the identification of hits resulting from high-throughput screenings due to its high sensitivity, specificity, and speed. However, manual data processing of mass spectrometry data, especially for structurally diverse products in combinatorial chemistry, is extremely time-consuming and one of the bottlenecks in this process. In this study, we demonstrated the effectiveness of a tandem mass spectrometry molecular networking-based strategy for product identification, reaction dynamics monitoring, and active compound targeting in combinatorial synthesis. Molecular networking connects compounds with similar tandem mass spectra into a cluster and has been widely used in natural products analysis. We show that both the expected and side products can be readily characterized using molecular networking based on their mass spectrometry fragmentation patterns. Additionally, time-dependent molecular networking was integrated to track reaction dynamics to determine the optimal reaction time to maximize target product yields. We also present a proof-of-concept experiment that successfully identified and isolated active molecules from a dynamic combinatorial library. These results demonstrated the potential of using molecular networking for identifying, tracking, and high-throughput screening of active compounds in combinatorial synthesis.
Use: pNovo



Enhancing open modification searches via a combined approach facilitated by Ursgal
Journal of Proteome Research 2021. Schulze, S et al. Univ Penn, Dept Biol, Philadelphia, PA 19104 USA.
ABSTRACT: The identification of peptide sequences and their posttranslational modifications (PTMs) is a crucial step in the analysis of bottom-up proteomics data. The recent development of open modification search (OMS) engines allows virtually all PTMs to be searched for. This not only increases the number of spectra that can be matched to peptides but also greatly advances the understanding of the biological roles of PTMs through the identification, and the thereby facilitated quantification, of peptidoforms (peptide sequences and their potential PTMs). Whereas the benefits of combining results from multiple protein database search engines have been previously established, similar approaches for OMS results have been missing so far. Here we compare and combine results from three different OMS engines, demonstrating an increase in peptide spectrum matches of 8-18%. The unification of search results furthermore allows for the combined downstream processing of search results, including the mapping to potential PTMs. Finally, we test for the ability of OMS engines to identify glycosylated peptides. The implementation of these engines in the Python framework Ursgal facilitates the straightforward application of the OMS with unified parameters and results files, thereby enabling yet unmatched high-throughput, large-scale data analysis.
Use: pGlyco



StrucGP: de novo structural sequencing of site-specific N-glycan on glycoproteins using a modularization strategy
Nature Methods 2021. Shen, JC et al. Northwest Univ, Coll Life Sci, Xian, Peoples R China.
ABSTRACT: Precision mapping of glycans at structural and site-specific level is still one of the most challenging tasks in the glycobiology field. Here, we describe a modularization strategy for de novo interpretation of N-glycan structures on intact glycopeptides using tandem mass spectrometry. An algorithm named StrucGP is also developed to automate the interpretation process for large-scale analysis. By dividing an N-glycan into three modules and identifying each module using distinct patterns of Y ions or a combination of distinguishable B/Y ions, the method enables determination of detailed glycan structures on thousands of glycosites in mouse brain, which comprise four types of core structure and 17 branch structures with three glycan subtypes. Owing to the database-independent glycan mapping strategy, StrucGP also facilitates the identification of rare/new glycan structures. The approach will be greatly beneficial for in-depth structural and functional study of glycoproteins in the biomedical research. StrucGP offers a de novo glycan mapping method to determine detailed N-glycan structures at the site-specific level.
Use: pGlyco