pFind Studio: a computational solution for mass spectrometry-based proteomics
2016
Molecular & Cellular Proteomics 2016. Olsen, JB et al.
Lilly USA, Lilly Res Labs, Indianapolis, IN 46285 USA.
ABSTRACT:The significance of non-histone lysine methylation in cell biology and human disease is an emerging area of research exploration. The development of small molecule inhibitors that selectively and potently target enzymes that catalyze the addition of methyl-groups to lysine residues, such as the protein lysine mono-methyltransferase SMYD2, is an active area of drug discovery. Critical to the accurate assessment of biological function is the ability to identify target enzyme substrates and to define enzyme substrate specificity within the context of the cell. Here, using stable isotopic labeling with amino acids in cell culture (SILAC) coupled with immunoaffinity enrichment of mono-methyl-lysine (Kme1) peptides and mass spectrometry, we report a comprehensive, large-scale proteomic study of lysine mono-methylation, comprising a total of 1032 Kme1 sites in esophageal squamous cell carcinoma (ESCC) cells and 1861 Kme1 sites in ESCC cells overexpressing SMYD2. Among these Kme1 sites is a subset of 35 found to be potently down-regulated by both shRNA-mediated knockdown of SMYD2 and LLY-507, a selective small molecule inhibitor of SMYD2. In addition, we report specific protein sequence motifs enriched in Kme1 sites that are directly regulated by endogenous SMYD2 activity, revealing that SMYD2 substrate specificity is more diverse than expected. We further show direct activity of SMYD2 toward BTF3-K2, PDAP1-K126 as well as numerous sites within the repetitive units of two unique and exceptionally large proteins, AHNAK and AHNAK2. Collectively, our findings provide quantitative insights into the cellular activity and substrate recognition of SMYD2 as well as the global landscape and regulation of protein mono-methylation.
Use: pFind
Electrophoresis 2016. Li, SS et al.
Nankai Univ, Coll Pharm, State Key Lab Med Chem Biol, Tianjin, Peoples R China.
ABSTRACT:O-linked beta-N-acetylglucosamine (O-GlcNAc) is emerging as an essential protein post-translational modification in a range of organisms. It is involved in various cellular processes such as nutrient sensing, protein degradation, and gene expression, and is associated with many human diseases. Despite its importance, identifying O-GlcNAcylated proteins is a major challenge in proteomics. Here, using peracetylated N-azidoacetylglucosamine (Ac4GlcNAz) as a bioorthogonal chemical handle, we described a gel-based mass spectrometry method for the identification of proteins with O-GlcNAc modification in A549 cells. In addition, we made a labeling efficiency comparison between two modes of azide-alkyne bioorthogonal reactions in click chemistry: copper-catalyzed azide-alkyne cycloaddition (CuAAC) with Biotin-Diazo-Alkyne and strain-promoted azide-alkyne cycloaddition (SPAAC) with Biotin-DIBO-Alkyne. After conjugation with click chemistry in vitro and enrichment via streptavidin resin, proteins with O-GlcNAc modification were separated by SDS-PAGE and identified with mass spectrometry. Proteomics data analysis revealed that 229 putative O-GlcNAc modified proteins were identified with Biotin-Diazo-Alkyne conjugated sample and 188 proteins with Biotin-DIBO-Alkyne conjugated sample, among which 114 proteins were overlapping. Interestingly, 74 proteins identified from Biotin-Diazo-Alkyne conjugates and 46 verified proteins from Biotin-DIBO-Alkyne conjugates could be found in the O-GlcNAc modified proteins database dbOGAP (http://cbsb.lombardi.georgetown.edu/hulab/OGAP.html). These results suggested that CuAAC with Biotin-Diazo-Alkyne represented a more powerful method in proteomics with higher protein identification and better accuracy compared to SPAAC. The proteomics credibility was also confirmed by the molecular function and cell component gene ontology (GO).
Together, the method we reported here combining metabolic labeling, click chemistry, affinity-based enrichment, SDS-PAGE separation, and mass spectrometry, would be adaptable for other post-translationally modified proteins in proteomics.
Use: pFind; pBuild
Journal of Proteome Research 2016. Qiyao Li et al.
Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, WI, 53706;
ABSTRACT:A new global post-translational modification (PTM) discovery strategy, G-PTM-D, is described. A proteomics database containing UniProt-curated PTM information is supplemented with potential new modification types and sites discovered from a first-round search of mass spectrometry data with ultrawide precursor mass tolerance. A second-round search employing the supplemented database conducted with standard narrow mass tolerances yields deep coverage and a rich variety of peptide modifications with high confidence in complex unenriched samples. The G-PTM-D strategy represents a major advance to the previously reported G-PTM strategy and provides a powerful new capability to the proteomics research community.
Use: pFind
Molecular & Cellular Proteomics 2016. Wang, XS et al.
Univ Penn, Perelman Sch Med, Dept Biochem & Biophys, Epigenet Program, Room 9-124,3400 Civ Ctr Blvd,Bldg 421, Philadelphia, PA 19104 USA.
ABSTRACT:Over the past decades, protein O-GlcNAcylation has been found to play a fundamental role in cell cycle control, metabolism, transcriptional regulation, and cellular signaling. Nevertheless, quantitative approaches to determine in vivo GlcNAc dynamics at a large-scale are still not readily available. Here, we have developed an approach to isotopically label O-GlcNAc modifications on proteins by producing 13C-labeled UDP-GlcNAc from 13C6-glucose via the hexosamine biosynthetic pathway. This metabolic labeling was combined with quantitative mass spectrometry-based proteomics to determine protein O-GlcNAcylation turnover rates. First, an efficient enrichment method for O-GlcNAc peptides was developed with the use of phenylboronic acid solid-phase extraction and anhydrous DMSO. The near stoichiometry reaction between the diol of GlcNAc and boronic acid dramatically improved the enrichment efficiency. Additionally, our kinetic model for turnover rates integrates both metabolomic and proteomic data, which increase the accuracy of the turnover rate estimation. Other advantages of this metabolic labeling method include in vivo application, direct labeling of the O-GlcNAc sites and higher confidence for site identification. Concentrating only on nuclear localized GlcNAc modified proteins, we are able to identify 105 O-GlcNAc peptides on 42 proteins and determine turnover rates of 20 O-GlcNAc peptides from 14 proteins extracted from HeLa nuclei. In general, we found O-GlcNAcylation turnover rates are slower than those published for phosphorylation or acetylation. Nevertheless, the rates widely varied depending on both the protein and the residue modified. We believe this methodology can be broadly applied to reveal turnovers/dynamics of protein O-GlcNAcylation from different biological states and will provide more information on the significance of O-GlcNAcylation, enabling us to study the temporal dynamics of this critical modification for the first time.
Use: pXtract; pParse; pFind
Journal of Proteome Research 2016. Wei, W et al.
Beijing Inst Radiat Med, Natl Ctr Prot Sci Beijing, Beijing Proteome Res Ctr, Natl Engn Res Ctr Prot Drugs,State Key Lab Prote, Beijing 102206, Peoples R China.
ABSTRACT:Since 2012, missing proteins (MPs) investigation has been one of the critical missions of Chromosome-Centric Human Proteome Project (C-HPP) through various biochemical strategies. On the basis of our previous testis MPs study, faster scanning and higher resolution mass-spectrometry-based proteomics might be conducive to MPs exploration, especially for low-abundance proteins. In this study, Q-Exactive HF (HF) was used to survey proteins from the same testis tissues separated by two separating methods (tricine- and glycine-SDS-PAGE), as previously described. A total of 8526 proteins were identified, of which more low-abundance proteins were uniquely detected in HF data but not in our previous LTQ Orbitrap Velos (Velos) reanalysis data. Further transcriptomics analysis showed that these uniquely identified proteins by HF also had lower expression at the mRNA level. Of the 81 total identified MPs, 74 and 39 proteins were listed as MPs in HF and Velos data sets, respectively. Among the above MPs, 47 proteins (43 neXtProt PE2 and 4 PE3) were ranked as confirmed MPs after verifying with the stringent spectra match and isobaric and single amino acid variants filtering. Functional investigation of these 47 MPs revealed that 11 MPs were testis-specific proteins and 7 MPs were involved in spermatogenesis process. Therefore, we concluded that higher scanning speed and resolution of HF might be factors for improving the low-abundance MP identification in future C-HPP studies. All mass-spectrometry data from this study have been deposited in the ProteomeXchange with identifier PXD004092.
Use: pFind; pBuild
Journal of Proteome Research 2016. Choi, M et al.
Northeastern Univ, Boston, MA 02115 USA.
ABSTRACT:Detection of differentially abundant proteins in label-free quantitative shotgun liquid chromatography tandem mass spectrometry (LC-MS/MS) experiments requires a series of computational steps that identify and quantify LC-MS features. It also requires statistical analyses that distinguish systematic changes in abundance between conditions from artifacts of biological and technical variation. The 2015 study of the Proteome Informatics Research Group (iPRG) of the Association of Biomolecular Resource Facilities (ABRF) aimed to evaluate the effects of the statistical analysis on the accuracy of the results. The study used LC tandem mass spectra acquired from a controlled mixture, and made the data available to anonymous volunteer participants. The participants used methods of their choice to detect differentially abundant proteins, estimate the associated fold changes, and characterize the uncertainty of the results. The study found that multiple strategies (including the use of spectral counts versus peak intensities, and various software tools) could lead to accurate results, and that the performance was primarily determined by the analysts' expertise. This manuscript summarizes the outcome of the study, and provides representative examples of good computational and statistical practice. The data set generated as part of this study is publicly available.
Use: pFind; pQuant
Pharmacogenomics 2016. Liang, KH et al.
Chang Gung Mem Hosp, Liver Res Ctr, 5 Fu Sing St, Taoyuan, Taiwan.
ABSTRACT:Aim: Transcatheter arterial chemoembolization is currently the standard treatment in hepatocellular carcinoma patients with Barcelona Clinic Liver Cancer stage B. Genomic variants of GALNT14 were recently identified as effective predictors for chemotherapy responses in Barcelona Clinic Liver Cancer stage C patients. Methods: We investigated the prognosis predictive value of GALNT14 genotypes in 327 hepatocellular carcinoma patients treated by transcatheter arterial chemoembolization. Result: Cox proportional hazards model analysis showed that the genotype 'TT' was associated with shorter time-to-response (multivariate p < 0.001), time-to-complete-response (p = 0.004) and longer time-to-tumor progression (p < 0.001), compared with the genotype 'non-TT'. In patients with albumin <3.5 g/dl, genotype 'TT' was associated with longer overall survival (p = 0.027). Finally, genotype 'TT' correlated with higher cancer-to-noncancer ratios of GALNT14 protein levels, lower cancer-to-noncancer ratios of antiapoptotic cFLIP-S, and a clustered glycosylation pattern in the extracellular domain of death receptor 5. Conclusion: GALNT14 genotypes were significantly associated with clinical outcomes of transcatheter arterial chemoembolization. The differential status of extrinsic apoptotic signaling between cancerous and non-cancerous tissues might underlie the clinical association.
Use: pFind
Journal of Proteome Research 2016. Zhao, MZ et al.
Beijing Inst Radiat Med, Natl Engn Res Ctr Prot Drugs, Beijing Proteome Res Ctr, Natl Ctr Prot Sci Beijing,State Key Lab Prote, Beijing 102206, Peoples R China.
ABSTRACT:A membrane protein enrichment method composed of ultracentrifugation and detergent-based extraction was first developed based on MCF7 cell line. Then, in-solution digestion with detergents and eFASP (enhanced filter-aided sample preparation) with detergents were compared with the time-consuming in-gel digestion method. Among the in-solution digestion strategies, the eFASP combined with RapiGest identified 1125 membrane proteins. Similarly, the eFASP combined with sodium deoxycholate identified 1069 membrane proteins; however, the in-gel digestion characterized 1091 membrane proteins. Totally, with the five digestion methods, 1390 membrane proteins were identified with >= 1 unique peptides, among which 1345 membrane proteins contain unique peptides >= 2. This is the biggest membrane protein data set for MCF7 cell line and even breast cancer tissue samples. Interestingly, we identified 13 unique peptides belonging to 8 missing proteins (MPs). Finally, eight unique peptides were validated by synthesized peptides. Two proteins were confirmed as MPs, and another two proteins were candidate detections.
Use: pFind; pBuild
Journal of Proteomics 2016. Ma, C et al.
Georgia State Univ, Dept Chem, Ctr Diagnost & Therapeut, Atlanta, GA 30303 USA.
ABSTRACT:Core-fucosylation (CF) plays important roles in regulating biological processes in eukaryotes. Alterations of CF-glycosites or CF-glycans in bodily fluids correlate with cancer development. Therefore, global research of protein core-fucosylation with an emphasis on proteomics can explain pathogenic and metastasis mechanisms and aid in the discovery of new potential biomarkers for early clinical diagnosis. In this study, a precise and high throughput method was established to identify CF-glycosites from human plasma. We found that alternating HCD and ETD fragmentation (AHEF) can provide a complementary method to discover CF-glycosites. A total of 407 CF-glycosites among 267 CF-glycoproteins were identified in a mixed sample made from six normal human plasma samples. Among the 407 CF-glycosites, 10 are without the N-X-S/T/C consensus motif, representing 2.5% of the total number identified. All identified CF-glycopeptide results from HCD and ETD fragmentation were filtered with neutral loss peaks and characteristic ions of GlcNAc from HCD spectra, which assured the credibility of the results. This study provides an effective method for CF-glycosites identification and a valuable biomarker reference for clinical research. Biological significance: CF-glycosylation plays an important role in regulating biological processes in eukaryotes. Alterations of the glycosites and attached CF-glycans are frequently observed in various types of cancers. Thus, it is crucial to develop a strategy for mapping human CF-glycosylation. Here, we developed a complementary method via alternating HCD and ETD fragmentation (AHEF) to analyze CF-glycoproteins. This strategy reveals an excellent complementarity of HCD and ETD in the analysis of CF-glycoproteins, and provides a valuable biomarker reference for clinical research.
Use: pFind; pBuild