pFind Studio: a computational solution for mass spectrometry-based proteomics



2013 or earlier




Determination of monoisotopic masses of chimera spectra from high-resolution mass spectrometric data by use of isotopic peak intensity ratio modeling
Rapid Communications in Mass Spectrometry, 2012. Niu, Ming et al. Natl Engn Res Ctr Prot Drugs, Beijing Inst Radiat Med, Beijing Proteome Res Ctr, State Key Lab Prote, 33 Life Sci Pk Rd, Beijing 102206, Peoples R China
ABSTRACT: RATIONALE: Chimera spectra make it challenging to identify proteins in complex mixtures by LC/MS/MS. Approximately half of the spectra collected are chimera spectra even when high-resolution tandem mass spectrometry is used. Chimera spectra are generated from the co-fragmentation of different co-eluting peptides, and it is often difficult to distinguish the monoisotopic precursors of these peptides from each other. METHODS: In this paper, we propose a peak intensity ratio-based monoisotopic peak determination algorithm (PIRMD) to distinguish different monoisotopic precursors of chimera spectra. Monoisotopic peaks in non-overlapping clusters are detected by the edge features of the isotopic peak intensity ratios. For multiple overlapping clusters grouped as one cluster, monoisotopic peaks can be detected by an advanced estimation of the similarity between the estimated and the experimental isotopic distribution based on the isotopic peak intensity ratios. RESULTS: High-resolution mass spectrometric datasets acquired from mixtures of 30 synthetic peptides and mixtures of 18 proteins were used to evaluate the efficiency and accuracy of PIRMD. The results indicate that PIRMD can recognize monoisotopic precursors from the chimera spectra containing non-overlapping and overlapping isotopic clusters. Compared to several published algorithms, PIRMD identifies approximately 2 to 14% more spectra and has fewer false positives. CONCLUSIONS: The results on standard datasets and actual samples demonstrated that PIRMD could notably improve the successful identification rates of the spectra by identifying more chimera spectra, and of the identified spectra, approximately 25% are chimera spectra. This novel algorithm will help to interpret spectra produced by shotgun strategy in proteomics. Copyright (C) 2012 John Wiley & Sons, Ltd.
Use: pFind; pBuild; pParse; pLabel



Glycosylation analysis of recombinant neutral protease I from Aspergillus oryzae expressed in Pichia pastoris
Biotechnology Letters, 2013. Lei, D et al. Nanchang Univ, State Key Lab Food Sci & Technol, Nanjing East Rd 235, Nanchang 330047, Jiangxi, Peoples R China.
ABSTRACT: Neutral protease I from Aspergillus oryzae 3.042 was expressed in Pichia pastoris and its N-glycosylation properties were analyzed. After purification on a nickel-affinity chromatography column, the recombinant neutral protease (rNPI) was confirmed to be N-glycosylated by periodic acid/Schiff's base staining and Endo H digestion. Moreover, the deglycosylated protein's molecular weight decreased to 43.3 kDa from 54.5 kDa, as analyzed by SDS-PAGE and MALDI-TOF-MS, and the hyperglycosylation extent was 21%. The N-glycosylation site of rNPI was analyzed by nano LC-MS/MS after digestion with trypsin and Glu-C, and the unique potential site Asn(41) of the mature peptide was found to be glycosylated. Homology modeling of the 3D structure of rNPI indicated that the attached N-glycans hardly affected the neutral protease's activity because of their great distance from the active site of the enzyme.
Use: pFind



Method for rapid protein identification in a large database
Biomed Research International, 2013. Wenli Zhang et al. Institute of Computing Technology, Chinese Academy of Sciences
ABSTRACT: Protein identification is an integral part of proteomics research. The available tools to identify proteins in tandem mass spectrometry experiments are not optimized to face current challenges in terms of identification scale and speed owing to the exponential growth of the protein database and the accelerated generation of mass spectrometry data, as well as the demand for nonspecific digestion and post-modifications in complex-sample identification. As a result, a rapid method is required to mitigate such complexity and computation challenges. This paper thus aims to present an open method to prevent enzyme and modification specificity on a large database. This paper designed and developed a distributed program to facilitate application to computer resources. With this optimization, nearly linear speedup and real-time support are achieved on a large database with nonspecific digestion, thus enabling testing with two classical large protein databases in a 20-blade cluster. This work aids in the discovery of more significant biological results, such as modification sites, and enables the identification of more complex samples, such as metaproteomics samples.
Use: pFind



Speeding up scoring module of mass spectrometry based protein identification by GPU
2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012. Li, Y et al. Hong Kong Baptist Univ, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China.
ABSTRACT: Database searching is a main method for protein identification in shotgun proteomics, and many research efforts are dedicated to improving its effectiveness. However, the efficiency of database searching is facing a serious challenge, due to the ever-faster growth of protein and peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications (PTMs). On the other hand, as general-purpose and high-performance parallel hardware, Graphics Processing Units (GPUs) develop continuously and provide another promising platform for parallelizing database searching-based protein identification. It becomes very important to study how to speed up database search engines by GPUs for protein identification. In this paper, we mainly utilize GPUs to accelerate the scoring module, which is the most time-consuming component. Specifically, we study two popular scoring methods: the spectral dot product-based method, which is used by X!Tandem, and the kernel spectral dot product, which is used by pFind.
Use: pFind
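
The plain spectral dot product named in the abstract above can be illustrated with a minimal CPU sketch. It is not the paper's GPU implementation and it does not cover the kernel spectral dot product; the function names, the 1.0 m/z bin width, and the peak lists are assumptions made only for illustration.

```python
from collections import defaultdict


def bin_peaks(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumed value)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def spectral_dot_product(experimental, theoretical, bin_width=1.0):
    """Dot product of two binned spectra; only bins present in both contribute."""
    exp_bins = bin_peaks(experimental, bin_width)
    theo_bins = bin_peaks(theoretical, bin_width)
    return sum(value * theo_bins[key]
               for key, value in exp_bins.items()
               if key in theo_bins)


# Made-up (m/z, intensity) pairs, for illustration only.
experimental = [(100.2, 10.0), (200.7, 5.0), (300.1, 2.0)]
theoretical = [(100.4, 1.0), (300.3, 1.0), (400.0, 1.0)]
score = spectral_dot_product(experimental, theoretical)
```

Each candidate peptide is scored independently of the others, which is the property that makes this kind of scoring a natural target for parallel hardware.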



Efficient discovery of abundant post-translational modifications and spectral pairs using peptide mass and retention time differences
BMC Bioinformatics, 2009. Fu, Yan et al. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
ABSTRACT: Background: Peptide identification via tandem mass spectrometry is the basic task of current proteomics research. Due to the complexity of mass spectra, the majority of mass spectra cannot be interpreted at present. The existence of unexpected or unknown protein post-translational modifications is a major reason. Results: This paper describes an efficient and sequence database-independent approach to detecting abundant post-translational modifications in high-accuracy peptide mass spectra. The approach is based on the observation that the spectra of a modified peptide and its unmodified counterpart are correlated with each other in their peptide masses and retention time. Frequently occurring peptide mass differences in a data set imply possible modifications, while small and consistent retention time differences provide orthogonal supporting evidence. We propose to use a bivariate Gaussian mixture model to discriminate modification-related spectral pairs from random ones. Due to the use of two-dimensional information, accurate modification masses and confident spectral pairs can be determined as well as the quantitative influences of modifications on peptide retention time. Conclusion: Experiments on two glycoprotein data sets demonstrate that our method can effectively detect abundant modifications and spectral pairs. By including the discovered modifications into database search or by propagating peptide assignments between paired spectra, an average of 10% more spectra are interpreted.
Use: pFind
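
The first idea in the abstract above, that frequently occurring mass differences between precursors with similar retention times hint at modifications, can be sketched as a simple counting step. This is only an illustration under stated assumptions: it replaces the paper's bivariate Gaussian mixture model with a hard retention-time cutoff, and the function name, the cutoff, the rounding precision, and the data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_mass_differences(precursors, max_rt_diff=5.0, precision=2):
    """Count rounded mass differences between precursor pairs with close retention times.

    precursors: list of (mass, retention_time) tuples.
    max_rt_diff and precision are assumed values, not taken from the paper.
    """
    counts = Counter()
    for (m1, rt1), (m2, rt2) in combinations(precursors, 2):
        if abs(rt1 - rt2) <= max_rt_diff:
            counts[round(abs(m1 - m2), precision)] += 1
    return counts


# Made-up (mass, retention time) values, for illustration only.
precursors = [
    (1000.00, 10.0), (1015.99, 11.0),
    (2000.00, 30.0), (2015.99, 31.5),
    (1500.00, 80.0),
]
top = frequent_mass_differences(precursors).most_common(1)
```

A mixture model, as the abstract describes, would weigh mass and retention-time evidence jointly instead of applying a fixed cutoff.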



A hydrophilic immobilized trypsin reactor with N-vinyl-2-pyrrolidinone modified polymer microparticles as matrix for highly efficient protein digestion with low peptide
Journal of Chromatography A, 2012. Jiang, H et al. 457 Zhongshan Rd, Dalian 116023, Peoples R China.
ABSTRACT: In this work, a novel kind of N-vinyl-2-pyrrolidinone (NVP) modified poly acrylic ester microspheres was prepared, followed by trypsin immobilization to prepare a hydrophilic immobilized enzyme reactor (IMER), to achieve highly efficient protein digestion with low peptide residue. The nonspecific adsorption of peptides on such an IMER was evaluated by the sequential digestion of bovine serum albumin (BSA) and myoglobin. Without NVP modification, both proteins could be identified after digestion by a 5 cm-length IMER, but 18 peptides of BSA were found in the digests of myoglobin, caused by the nonspecific adsorption of the matrix. With NVP modification, the hydrophilicity of the IMER was greatly improved: the sequence coverage of myoglobin increased from 63% to 73%, and no residual peptides from BSA were observed in the myoglobin digests. Although the sequence coverages of proteins obtained by the IMER were comparable to those obtained by in-solution digestion, the digestion time was shortened from 24 h to 1 min. By such an IMER, a protein mixture containing BSA, myoglobin, and cytochrome c (100, 1 and 0.01 µg/mL, respectively) was digested, and all proteins were unambiguously identified with higher sequence coverages than those achieved by in-solution digestion. Furthermore, the hydrophilic IMER was also off-line coupled to nano-RPLC-ESI-MS/MS for the analysis of proteins extracted from yeast. After 1.5 min digestion, 271 protein groups with at least 2 distinct peptides were identified, many more than those obtained by 24 h in-solution digestion (192 protein groups), indicating the great potential of such an IMER for proteome analysis. (C) 2012 Elsevier B.V. All rights reserved.
Use: pFind; pXtract



Biphasic microreactor for efficient membrane protein pretreatment with a combination of formic acid assisted solubilization, on-column pH adjustment, reduction
Analytical Chemistry, 2013. Zhao, Qun et al. Chinese Acad Sci, Dalian Inst Chem Phys, Key Lab Separat Sci Analyt Chem, Natl Chromatog R&A Ctr, Dalian 116023, Peoples R China
ABSTRACT: Combining good dissolving ability of formic acid (FA) for membrane proteins and excellent complementary retention behavior of proteins on strong cation exchange (SCX) and strong anion exchange (SAX) materials, a biphasic microreactor was established to pretreat membrane proteins at microgram and even nanogram levels. With membrane proteins solubilized by FA, all of the proteomics sample processing procedures, including protein preconcentration, pH adjustment, reduction, and alkylation, as well as tryptic digestion, were integrated into an "SCX-SAX" biphasic capillary column. To evaluate the performance of the developed microreactor, a mixture of bovine serum albumin, myoglobin, and cytochrome c was pretreated. Compared with the results obtained by the traditional in-solution process, the peptide recovery (93% vs 83%) and analysis throughput (3.5 vs 14 h) were obviously improved. The microreactor was further applied for the pretreatment of 14 µg of membrane proteins extracted from rat cerebellums, and 416 integral membrane proteins (IMPs) (43% of total protein groups) and 103 transmembrane peptides were identified by two-dimensional nanoliquid chromatography-electrospray ionization tandem mass spectrometry (2D nano-LC-ESI-MS/MS) in triplicate analysis. With the starting sample preparation amount decreased to as low as 50 ng, 64 IMPs and 17 transmembrane peptides were identified confidently, while those obtained by the traditional in-solution method were 10 and 1, respectively. All these results demonstrated that such an "SCX-SAX" based biphasic microreactor could offer a promising tool for the pretreatment of trace membrane proteins with high efficiency and throughput.
Use: pFind; pXtract; pBuild



NSI and NSMT: usages of MS/MS fragment ion intensity for sensitive differential proteome detection and accurate protein fold change calculation in relative label-free
Analyst, 2012. Wu, Q et al. Chinese Acad Sci, Dalian Inst Chem Phys, Key Lab Separat Sci Analyt Chem, Natl Chromatog Res & Anal Ctr, Dalian 116023, Peoples R China.
ABSTRACT: Although widely applied in the label-free quantification of proteomics, spectral count (SC)-based abundance measurements suffer from the narrow dynamic range of attainable ratios, leading to the serious underestimation of true protein abundance fold changes, especially when studying biological samples that exhibit very large fold changes in protein expression. MS/MS fragment ion intensity, as an alternative to SC, has recently gained acceptance as the abundance feature of protein in label-free proteomic studies. Herein, we implemented two formats of MS/MS fragment ion intensity, Spectral Index (SI) and Summed MS/MS TIC (SMT), to alleviate this particular deficiency arising from SC. Each replaces SC in the Normalized Spectral Abundance Factor (NSAF) formula, resulting in two algorithms, abbreviated as NSI and NSMT, respectively. The necessity of the normalization process was validated using a publicly available dataset. Furthermore, when applied to another well-characterized benchmark dataset, both NSI and NSMT showed improved overall accuracy over NSAF for the relative quantification of proteomes. Of the two, NSI enabled the sensitive detection of differentially expressed proteins, while NSMT ensured accurate calculation of protein abundance fold change. Therefore, the selective use of both algorithms might facilitate the screening and quantification of potential biomarkers on the proteome scale.
Use: pFind; pXtract; pBuild
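
The normalization described in the abstract above follows the standard NSAF form: a per-protein measure divided by protein length, then divided by the sum of those ratios over all proteins. A minimal sketch follows, assuming the measure is supplied per protein as a plain number (spectral count for NSAF, or an SI or SMT value for the NSI and NSMT variants); the function name and the example values are hypothetical, and how SI and SMT themselves are computed is not shown.

```python
def normalized_abundance(measure, length):
    """NSAF-style normalization: (measure / length) divided by the sum over all proteins.

    measure: dict of protein -> abundance measure (SC, SI, or SMT; assumed precomputed).
    length:  dict of protein -> protein length in residues.
    """
    ratios = {protein: measure[protein] / length[protein] for protein in measure}
    total = sum(ratios.values())
    return {protein: ratio / total for protein, ratio in ratios.items()}


# Made-up values, for illustration only.
spectral_counts = {"protein_A": 10, "protein_B": 30}
lengths = {"protein_A": 100, "protein_B": 100}
nsaf = normalized_abundance(spectral_counts, lengths)
```

Because the normalized values sum to 1 within a run, they can be compared across runs with different total signal.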



Thioredoxin and thioredoxin reductase control tissue factor activity by thiol redox-dependent mechanism
Journal of Biological Chemistry, 2013. Wang, P et al. Univ Chinese Acad Sci, Coll Life Sci, YuQuan Rd 19 A, Beijing 100049, Peoples R China.
ABSTRACT:Abnormally enhanced tissue factor (TF) activity is related to increased thrombosis risk in which oxidative stress plays a critical role. Human cytosolic thioredoxin (hTrx1) and thioredoxin reductase (TrxR), also secreted into circulation, have the power to protect against oxidative stress. However, the relationship between hTrx1/TrxR and TF remains unknown. Here we show reversible association of hTrx1 with TF in human serum and plasma samples. The association is dependent on hTrx1-Cys-73 that bridges TF-Cys-209 via a disulfide bond. hTrx1-Cys-73 is absolutely required for hTrx1 to interfere with FVIIa binding to purified and cell-surface TF, consequently suppressing TF-dependent procoagulant activity and proteinase-activated receptor-2 activation. Moreover, hTrx1/TrxR plays an important role in sensing the alterations of NADPH/NADP(+) states and transducing this redox-sensitive signal into changes in TF activity. With NADPH, hTrx1/TrxR readily facilitates the reduction of TF, causing a decrease in TF activity, whereas with NADP(+), hTrx1/TrxR promotes the oxidation of TF, leading to an increase in TF activity. By comparison, TF is more likely to favor the reduction by hTrx1-TrxR-NADPH. This reversible reduction-oxidation reaction occurs in the TF extracellular domain that contains partially opened Cys-49/-57 and Cys-186/-209 disulfide bonds. The cell-surface TF procoagulant activity is significantly increased after hTrx1-knockdown. The response of cell-surface TF procoagulant activity to H2O2 is efficiently suppressed through elevating cellular TrxR activity via selenium supplementation. Our data provide a novel mechanism for redox regulation of TF activity. By modifying Cys residues or regulating Cys redox states in TF extracellular domain, hTrx1/TrxR function as a safeguard against inappropriate TF activity.
Use: pFind



Secretory/releasing proteome-based identification of plasma biomarkers in HBV-associated hepatocellular carcinoma
Science China Life Sciences, 2013. Yang, L et al. Peking Union Med Coll, Canc Inst Hosp, Dept Abdominal Surg, Beijing 100021, Peoples R China.
ABSTRACT: For successful therapy, hepatocellular carcinoma (HCC) must be detected at an early stage. Herein, we used a proteomic approach to analyze the secretory/releasing proteome of HCC tissues to identify plasma biomarkers. Serum-free conditioned media (CM) were collected from primary cultures of cancerous tissues and surrounding noncancerous tissues. Proteomic analysis of the CM proteins permitted the identification of 1365 proteins. The enriched molecular functions and biological processes of the CM proteins, such as hydrolase activity and catabolic processes, were consistent with the liver being the most important metabolic organ. Moreover, 19% of the proteins were characterized as extracellular or membrane-bound. Secretory proteins involved in transforming growth factor-beta signaling pathways were then validated in plasma samples. Alpha-fetoprotein (AFP), metalloproteinase (MMP)1, osteopontin (OPN), and pregnancy-specific beta-1-glycoprotein (PSG)9 were significantly increased in HCC patients. The overall performance of MMP1 and OPN in the diagnosis of HCC remained greater than that of AFP. In addition, this study represents the first report of MMP1 as a biomarker with a higher sensitivity and specificity than AFP. Thus, this study provides a valuable resource of the HCC secretome with the potential to investigate serological biomarkers. MMP1 and OPN could be used as novel biomarkers for the early detection of HCC and to improve the sensitivity of biomarkers compared with AFP.
Use: pFind