pFind Studio: a computational solution for mass spectrometry-based proteomics
2020
Journal of Proteome Research, 2020. Zhang, YL et al.
BGI Shenzhen, BGI Genom, Shenzhen 518083, Guangdong, Peoples R China.
ABSTRACT:Since the Chromosome-Centric Human Proteome Project (C-HPP) was launched in 2010, many techniques have been adopted for the discovery of missing proteins (MPs). Because of these efforts, only 1481 MPs remained as of July 2020; however, by relying only on technique optimization, researchers have reached a bottleneck in MP discovery. Protein expression is tissue- or cell-type-dependent. The tissues of the human testis and brain have been reported to harbor a large number of tissue-specific genes and proteins; however, few studies have been performed on human brain tissue or cells to identify MPs. Herein a metastatic cell line derived from brain cancer, D283 Med, was used to search for MPs. With a traditional and simple shotgun workflow to separate the peptides into 20 fractions, 12 MPs containing at least two unique non-nested peptides (amino acid length >= 9) were identified in this cell line with a protein false discovery rate of <1%. Following the same experimental protocol, only one MP was found in a nonmetastatic brain cancer cell line, U-118 MG. Furthermore, 12 MPs were verified as having two non-nested unique peptides by matching them with corresponding chemically synthesized peptides through parallel reaction monitoring. These results clearly demonstrate that the appropriate selection of experimental materials, either tissues or cell lines, is imperative for MP discovery. The data obtained in this study are available via ProteomeXchange (PXD021482) and PeptideAtlas (PASS01627).
Use: pFind
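The identification criterion quoted in the abstract above (at least two unique, non-nested peptides of nine or more residues) can be illustrated with a minimal sketch. The function names are hypothetical, the peptides are assumed to have already been filtered for uniqueness to the protein, and this is not the authors' pipeline or any part of pFind.

```python
def is_nested(a: str, b: str) -> bool:
    """Return True if one peptide sequence is fully contained in the other."""
    return a in b or b in a


def has_two_non_nested(peptides, min_len=9):
    """Check for at least two peptides of length >= min_len that are not nested.

    Assumption: every peptide passed in already maps uniquely to the protein;
    uniqueness itself is not checked here.
    """
    eligible = sorted({p for p in peptides if len(p) >= min_len})
    return any(
        not is_nested(a, b)
        for i, a in enumerate(eligible)
        for b in eligible[i + 1:]
    )
```

Two peptides that partially overlap still count as non-nested in this sketch; only full containment of one sequence in the other is rejected.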
Journal of Mass Spectrometry, 2020. Huang, PW et al.
Southern Univ Sci & Technol, Dept Chem, Shenzhen 518055, Peoples R China.
ABSTRACT:Steady improvement in Orbitrap-based mass spectrometry (MS) technologies has greatly advanced the peptide sequencing speed and depth. In-depth analysis of the performance of state-of-the-art MS and optimization of key parameters can improve sequencing efficiency. In this study, we first systematically compared the performance of two popular data-dependent acquisition approaches, with Orbitrap as the first-stage (MS1) mass analyzer and the same Orbitrap (high-high approach) or ion trap (high-low approach) as the second-stage (MS2) mass analyzer, on the Orbitrap Fusion mass spectrometer. The high-high approach outperformed the high-low approach in terms of better saturation of the scan cycle and higher MS2 identification rate. However, regardless of the acquisition method, more than 60% of peptide features were still not targeted for an MS2 scan. We then systematically optimized the MS parameters using the high-high approach. Increasing the isolation window in the high-high approach could facilitate faster scan speed, but decreased the MS2 identification rate. In contrast, increasing the injection time of the MS2 scan could increase the identification rate but decrease scan speed and the number of identified MS2 spectra. Dynamic exclusion time should be set properly according to the chromatography peak width. Furthermore, we found that the Orbitrap analyzer, rather than the analytical column, was easily saturated with higher loading amount, thus limiting the dynamic range of MS1-based quantification. By using optimized parameters, 10 000 proteins and 110 000 unique peptides were identified by using 20 h of effective liquid chromatography (LC) gradient time. The study therefore illustrated the importance of synchronizing LC-MS precursor ion targeting, fragment ion detection, and chromatographic separation for highly efficient data-dependent proteomics.
Use: pFind; pParse
Cell Reports, 2020. Chui, AJ et al.
Mem Sloan Kettering Canc Ctr, Triinst PhD Program Chem Biol, New York, NY 10065 USA.
ABSTRACT:Several cytosolic pattern-recognition receptors (PRRs) form multiprotein complexes called canonical inflammasomes in response to intracellular danger signals. Canonical inflammasomes recruit and activate caspase-1 (CASP1), which in turn cleaves and activates inflammatory cytokines and gasdermin D (GSDMD), inducing pyroptotic cell death. Inhibitors of the dipeptidyl peptidases DPP8 and DPP9 (DPP8/9) activate both the human NLRP1 and CARD8 inflammasomes. NLRP1 and CARD8 have different N-terminal regions but have similar C-terminal regions that undergo autoproteolysis to generate two non-covalently associated fragments. Here, we show that DPP8/9 inhibition activates a proteasomal degradation pathway that targets disordered and misfolded proteins for destruction. CARD8's N terminus contains a disordered region of ~160 amino acids that is recognized and destroyed by this degradation pathway, thereby freeing its C-terminal fragment to activate CASP1 and induce pyroptosis. Thus, CARD8 serves as an alarm to signal the activation of a degradation pathway for disordered and misfolded proteins.
Use: pFind
Food Chemistry, 2020. Cerrato, A et al.
Sapienza Univ Roma, Dept Chem, Piazzale Aldo Moro 5, I-00185 Rome, Italy.
ABSTRACT:Native peptides from sea bass muscle were analyzed by two different approaches: medium-sized peptides by peptidomics analysis, whereas short peptides by suspect screening analysis employing an inclusion list of exact m/z values of all possible amino acid combinations (from 2 up to 4). The method was also extended to common post-translational modifications potentially interesting in food analysis, as well as non-proteolytic aminoacyl derivatives, which are well-known taste-active building blocks in pseudo-peptides. The medium-sized peptides were identified by de novo and combination of de novo and spectra matching to a protein sequence database, with up to 4077 peptides (2725 modified) identified by database search and 2665 peptides (223 modified) identified by de novo only; 102 short peptide sequences were identified (with 12 modified ones), and most of them had multiple reported bioactivities. The method can be extended to any peptide mixture, either endogenous or by protein hydrolysis, from other food matrices.
Use: pNovo; pFind
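The inclusion list described in the abstract above (exact m/z values for all amino acid combinations of 2 to 4 residues) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it covers only the 20 standard residues, unmodified, as singly protonated ions, and treats each peptide as an unordered composition because residue order does not change the mass. The names `RESIDUE_MASS`, `mz_singly_protonated`, and `build_inclusion_list` are hypothetical.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added for the free termini
PROTON = 1.007276   # mass of the charge-carrying proton


def mz_singly_protonated(composition):
    """m/z of the [M+H]+ ion for a peptide with the given residues."""
    return sum(RESIDUE_MASS[aa] for aa in composition) + WATER + PROTON


def build_inclusion_list(min_len=2, max_len=4):
    """Map each residue composition (sorted letters) to its [M+H]+ m/z."""
    entries = {}
    for n in range(min_len, max_len + 1):
        for combo in combinations_with_replacement(sorted(RESIDUE_MASS), n):
            entries["".join(combo)] = mz_singly_protonated(combo)
    return entries


inclusion = build_inclusion_list()
```

Because leucine and isoleucine are isobaric, and several compositions share the same elemental formula, many entries collapse onto identical m/z values; a practical list would be deduplicated by m/z within the instrument's mass tolerance.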
Nature Protocols, 2020. Li, QY et al.
Univ Calif Davis, Dept Chem, Davis, CA 95616 USA.
ABSTRACT:The glycocalyx comprises glycosylated proteins and lipids and forms the outermost layer of cells. It is involved in fundamental inter- and intracellular processes, including non-self-cell and self-cell recognition, cell signaling, cellular structure maintenance, and immune protection. Characterization of the glycocalyx is thus essential to understanding cell physiology and elucidating its role in promoting health and disease. This protocol describes how to comprehensively characterize the glycocalyx N-glycans and O-glycans of glycoproteins, as well as intact glycolipids in parallel, using the same enriched membrane fraction. Profiling of the glycans and the glycolipids is performed using nanoflow liquid chromatography-mass spectrometry (nanoLC-MS). Sample preparation, quantitative LC-tandem MS (LC-MS/MS) analysis, and data processing methods are provided. In addition, we discuss glycoproteomic analysis that yields the site-specific glycosylation of membrane proteins. To reduce the amount of sample needed, N-glycan, O-glycan, and glycolipid analyses are performed on the same enriched fraction, whereas glycoproteomic analysis is performed on a separate enriched fraction. The sample preparation process takes 2-3 d, whereas the time spent on instrumental and data analyses could vary from 1 to 5 d for different sample sizes. This workflow is applicable to both cell and tissue samples. Systematic changes in the glycocalyx associated with specific glycoforms and glycoconjugates can be monitored with quantitation using this protocol. The ability to quantitate individual glycoforms and glycoconjugates will find utility in a broad range of fundamental and applied clinical studies, including glycan-based biomarker discovery and therapeutics.
Use: pFind; pGlyco
Proteomics Clinical Applications, 2020. Sun, R et al.
Dalian Med Univ, Dept Gen Surg, Div Hepatobiliary & Pancreat Surg, Affiliated Hosp 2, Dalian 116023, Peoples R China.
ABSTRACT:Purpose: The extensive drug resistance of hepatocellular carcinoma (HCC) has become a major cause of chemotherapy failure. A deeper understanding of the drug resistance mechanism of tumor cells is very significant for improving the clinical prognosis of patients with HCC. Experimental Design: In this study, proteomic studies are performed on the composition of the 5-fluorouracil (5-Fu) resistant Bel/5Fu cell line and its parent Bel7402 cell line, using an ionic liquid-assisted protein extraction method that has the advantage of extracting plasma membrane proteins to a wider extent. Then the expression level and function of differentially expressed plasma membrane proteins are verified. Results: In total, 25 plasma membrane proteins are shown to be differentially expressed in Bel/5Fu compared with Bel7402. Western blot analysis results further confirmed that EPHX1, PLIN2, RAB27B, and SLC4A2 are upregulated in Bel/5Fu cells, in accordance with the proteomics data. Moreover, cell viability assay and clonogenic survival assay results demonstrated that EPHX1 is closely related to the chemoresistance of Bel/5Fu to 5-Fu. Conclusions and Clinical Relevance: Plasma membrane protein EPHX1 is closely related to the chemotherapy resistance of Bel/5Fu cells and can be used as a new drug target to improve the clinical prognosis of patients with HCC.
Use: pFind
Food Chemistry, 2020. Guo, XY et al.
China Agr Univ, Coll Food Sci & Nutr Engn, Beijing Adv Innovat Ctr Food Nutr & Human Hlth, 17 Qinghua East Rd, Beijing, Peoples R China.
ABSTRACT:Jug r 1, the major allergen of walnut, triggers severe allergic reactions through epitopes. Hence, research on an efficient strategy for analyzing the linear epitopes of Jug r 1 is necessary. In this work, bioinformatics analysis was used to predict the linear epitopes of Jug r 1. Overlapping peptide synthesis was used to map linear epitopes. In vitro simulated gastrointestinal digestion and HPLC-MS/MS were used to identify digestion-resistant peptides. The results showed that the six predicted linear epitopes were AA28-35, AA42-49, AA55-62, AA65-73, AA97-104, and AA109-121. AA16-30 and AA125-139 were identified by the sera of walnut allergic patients. The five digestion-resistant peptides were AA19-33, AA40-45, AA54-74, AA96-106, and AA117-137. The predicted results only included one of the linear epitopes identified by sera, while the digestion-resistant peptides covered all. Therefore, the digestion-resistant property of food allergens may be a promising direction for studying the linear epitopes of Jug r 1.
Use: pFind
Journal of the American Society for Mass Spectrometry, 2020. Zhang, N et al.
Peking Univ, Sch Pharmaceut Sci, Sch Basic Med Sci, Hlth Sci Ctr,Ctr Precis Med Multi Res, Beijing 100191, Peoples R China.
ABSTRACT:Multidimensional protein identification (MudPIT), developed in the Yates Laboratory 20 years ago, is regarded as a powerful tool for proteomics research. Due to its remarkable online separation advantages, MudPIT has been widely used to facilitate discoveries in the field of proteomics research. However, it has one major disadvantage: the process of eluting peptides during strong cation exchange introduces salts, of different concentrations, into the mass spectrometer. Considering the sensitivity of the new generation of high-resolution mass spectrometers, developing a new version of MudPIT that could eliminate the introduction of salts in the eluate would be a significant advancement to current technology. Herein, we developed a new, clean version of MudPIT called parallel channels-multidimensional protein identification technology (PC-MudPIT) to overcome this issue. In this design, the original biphasic trapping column was replaced by two parallel analytical column channels. We successfully averted the salt contamination yet retained all the other advantages of MudPIT. A total of 8161 and 7359 protein groups were identified from A549 whole cell lysate using PC-MudPIT and classic MudPIT, respectively. Moreover, we discovered the additional advantage that, in online mode, PC-MudPIT can also be used for an enrichment process of phosphopeptide identification. We identified a total of 11453 phosphopeptides using PC-MudPIT and 7729 phosphopeptides using offline TiO2 enrichment followed by classic MudPIT. These advances indicate the possibility of other innovative applications of PC-MudPIT technology in deep proteome exploration.
Use: pFind
Journal of Mass Spectrometry, 2020. Wu, Q et al.
Chinese Acad Sci, Dalian Inst Chem Phys, Natl Chromatog Res & Anal Ctr, CAS Key Lab Separat Sci Analyt Chem, Dalian 116023, Peoples R China.
ABSTRACT:Owing to the poor fragmentation efficiency caused by the lack of a positively charged basic group at the C-termini of peptides, the identification of nontryptic peptides in classical proteomics is known to be less efficient. Particularly, attaching positively charged basic groups to C-termini via chemical derivatizations is known to be able to enhance their fragmentation efficiency. In this study, we introduced a novel strategy, C-termini sequential amidation reaction (CSAR), to improve peptide fragmentation efficiency. By this strategy, C-terminal and side-chain carboxyl groups were firstly amidated by neutral methylamine (MA), and then C-terminal amide bonds were selectively deamidated through peptide amidase while side-chain amide bonds remained unchanged, followed by the secondary amidation of C-termini via basic agmatine (AG). We optimized the amidation reaction conditions to achieve the MA derivatization efficiency of >99% for side-chain carboxyl groups and AG derivatization efficiency of 80% for the hydrolytic C-termini. We applied CSAR strategy to identify bovine serum albumin (BSA) chymotryptic digests, resulting in the increased fragmentation efficiencies (improvement by 9-32%) and charge states (improvement by 39-52%) under single or multiple dissociation modes. The strategy described here might be a promising approach for the identification of peptides that suffered from poor fragmentation efficiency.
Use: pFind
Molecular & Cellular Proteomics, 2020. Netz, E et al.
Max Planck Inst Dev Biol, Biomol Interact, Tubingen, Germany.
ABSTRACT:Cross-linking MS (XL-MS) has been recognized as an effective source of information about protein structures and interactions. In contrast to regular peptide identification, XL-MS has to deal with a quadratic search space, where peptides from every protein could potentially be cross-linked to any other protein. To cope with this search space, most tools apply different heuristics for search space reduction. We introduce a new open-source XL-MS database search algorithm, OpenPepXL, which offers increased sensitivity compared with other tools. OpenPepXL searches the full search space of an XL-MS experiment without using heuristics to reduce it. Because of efficient data structures and built-in parallelization, OpenPepXL achieves excellent runtimes and can also be deployed on large compute clusters and cloud services while maintaining a slim memory footprint. We compared OpenPepXL to several other commonly used tools for identification of noncleavable labeled and label-free cross-linkers on a diverse set of XL-MS experiments. In our first comparison, we used a data set from a fraction of a cell lysate with a protein database of 128 targets and 128 decoys. At 5% FDR, OpenPepXL finds from 7% to over 50% more unique residue pairs (URPs) than other tools. On data sets with available high-resolution structures for cross-link validation, OpenPepXL reports from 7% to over 40% more structurally validated URPs than other tools. Additionally, we used a synthetic peptide data set that allows objective validation of cross-links without relying on structural information and found that OpenPepXL reports at least 12% more validated URPs than other tools. It has been built as part of the OpenMS suite of tools and supports Windows, macOS, and Linux operating systems. OpenPepXL also supports the MzIdentML 1.2 format for XL-MS identification results. It is freely available under a three-clause BSD license at .
Use: pLink
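The quadratic search space mentioned in the abstract above can be made concrete with a small sketch. It counts the unordered peptide pairs for a database of n peptides and then enumerates, exhaustively but with a binary search over sorted masses, every pair whose combined mass plus a cross-linker mass matches a precursor. This is only an illustration of the idea; it is not the algorithm used by OpenPepXL or pLink, and the function names, masses, and tolerance are hypothetical.

```python
from bisect import bisect_left, bisect_right


def n_candidate_pairs(n_peptides):
    """Number of unordered peptide pairs, including a peptide paired with itself."""
    return n_peptides * (n_peptides + 1) // 2


def matching_pairs(peptide_masses, precursor_mass, linker_mass, tol):
    """Return all (alpha, beta) mass pairs with alpha + beta + linker ~= precursor.

    Every alpha peptide is considered (no heuristic pre-filtering); the beta
    partner is located by binary search in the sorted mass list.
    """
    masses = sorted(peptide_masses)
    pairs = []
    for i, alpha in enumerate(masses):
        target = precursor_mass - linker_mass - alpha
        # Search only from index i onward so each unordered pair appears once.
        lo = bisect_left(masses, target - tol, i)
        hi = bisect_right(masses, target + tol, i)
        pairs.extend((alpha, masses[j]) for j in range(lo, hi))
    return pairs


# Hypothetical toy values: three peptide masses and a 138.0 Da linker.
pairs = matching_pairs([1000.0, 1500.0, 2000.0], 3138.0, 138.0, 0.01)
```

For a database of 100 000 peptides, `n_candidate_pairs` gives roughly 5 × 10^9 pairs, which is why restricting candidates by precursor mass, as in the loop above, matters even when no heuristic filtering is applied.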