Nanoscale. 2017. Bo Jiang. et al. Dalian Institute of Chemical Physics
ABSTRACT: Although selective enrichment of glycopeptides from complex biological samples is indispensable for mass spectrometry (MS)-based glycoproteomics, it still remains a great challenge due to the low abundance of glycoproteins and suppression of non-glycopeptides. In this study, silver nanoparticle-functionalized magnetic graphene oxide nanocomposites (GO/Fe3O4/PEI/Ag) were synthesized. Silver nanoparticles were generated in situ on the surface of magnetic graphene oxide using polyethylenimine as a reducing and stabilizing agent. The resulting material was used as an adsorbent for selective enrichment of glycopeptides. GO/Fe3O4/PEI/Ag nanocomposites offered excellent enrichment ability, which was attributed to the synergistic effect of polyethylenimine and silver nanoparticles.
> The nanocomposites showed superior specificity for glycopeptides even when non-glycopeptides were 100 times more concentrated than glycopeptides. The nanocomposites displayed advantages including rapid adsorption (1 min), low detection limit (25 fmol), repeatability (6 times), and high recovery (77.8%). Using these nanocomposites, 91 different glycoproteins and 136 N-linked glycopeptides were identified from 20 μg of tryptically digested human serum proteins, demonstrating the superior performance of the nanocomposites for glycopeptide enrichment.
Scientific Reports. 2017. Susanna L. Lundström. et al. Karolinska Institutet
ABSTRACT: The human blood proteome is frequently assessed by protein abundance profiling using a combination of liquid chromatography and tandem mass spectrometry (LC-MS/MS). In traditional sequence database searches, many good-quality MS/MS data remain unassigned. Here we uncover the hidden part of the blood proteome via the novel SpotLight approach. This method combines de novo MS/MS sequencing of enriched antibodies and co-extracted proteins with subsequent label-free quantification of new and known peptides in both enriched and unfractionated samples. In a pilot study on differentiating early stages of Alzheimer's disease (AD) from Dementia with Lewy Bodies (DLB), at the peptide level the hidden proteome contributed almost as much information to patient stratification as the apparent proteome. Intriguingly, many of the new peptide sequences are attributable to antibody variable regions, and are potentially indicative of disease etiology. When the hidden and apparent proteomes are combined, the accuracy of differentiating AD (n = 97) and DLB (n = 47) increased from ≈85% to ≈95%. The low added burden of SpotLight proteome analysis makes it attractive for use in clinical settings.
Proteomics. 2016. Tyler J. Bechtel. et al. Boston College
ABSTRACT: This review provides a comprehensive overview of the functional roles of disulfide bonds and their relevance to human disease. The critical roles of disulfide bonds in protein structure stabilization and redox regulation of protein activity are addressed. Disulfide bonds are essential to the structural stability of many proteins within the secretory pathway and can exist as intramolecular or inter-domain disulfides. The proper formation of these bonds often relies on folding chaperones and oxidases such as members of the protein disulfide isomerase (PDI) family. Many of the PDI family members catalyze disulfide-bond formation, reduction and isomerization through redox-active disulfides and perturbed PDI activity is characteristic of carcinomas and neurodegenerative diseases. In addition to catalytic function in oxidoreductases, redox-active disulfides are also found on a diverse array of cellular proteins and act to regulate protein activity and localization in response to oxidative changes in the local environment.
> These redox-active disulfides are either dynamic intramolecular protein disulfides or mixed disulfides with small-molecule thiols generating glutathionylation and cysteinylation adducts. The oxidation and reduction of redox-active disulfides are mediated by cellular reactive oxygen species and activity of reductases, such as glutaredoxin and thioredoxin. Dysregulation of cellular redox conditions and resulting changes in mixed disulfide formation are directly linked to diseases such as cardiovascular disease and Parkinson's disease.
Journal of Proteome Research. 2016. Meena Choi. et al. Northeastern University
ABSTRACT: Detection of differentially abundant proteins in label-free quantitative shotgun liquid chromatography–tandem mass spectrometry (LC–MS/MS) experiments requires a series of computational steps that identify and quantify LC–MS features. It also requires statistical analyses that distinguish systematic changes in abundance between conditions from artifacts of biological and technical variation. The 2015 study of the Proteome Informatics Research Group (iPRG) of the Association of Biomolecular Resource Facilities (ABRF) aimed to evaluate the effects of the statistical analysis on the accuracy of the results.
> The study used LC–tandem mass spectra acquired from a controlled mixture, and made the data available to anonymous volunteer participants. The participants used methods of their choice to detect differentially abundant proteins, estimate the associated fold changes, and characterize the uncertainty of the results. The study found that multiple strategies (including the use of spectral counts versus peak intensities, and various software tools) could lead to accurate results, and that the performance was primarily determined by the analysts’ expertise. This manuscript summarizes the outcome of the study, and provides representative examples of good computational and statistical practice. The data set generated as part of this study is publicly available.
Journal of Proteome Research. 2016. Qiyao Li. et al. University of Wisconsin
ABSTRACT: A new global post-translational modification (PTM) discovery strategy, G-PTM-D, is described. A proteomics database containing UniProt-curated PTM information is supplemented with potential new modification types and sites discovered from a first-round search of mass spectrometry data with ultra-wide precursor mass tolerance. A second-round search employing the supplemented database conducted with standard narrow mass tolerances yields deep coverage and a rich variety of peptide modifications with high confidence in complex unenriched samples. The G-PTM-D strategy represents a major advance to the previously reported G-PTM strategy and provides a powerful new capability to the proteomics research community.
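The idea of reading candidate modifications off precursor mass differences found in a wide-tolerance first-round search can be illustrated with a minimal sketch. This is not the G-PTM-D implementation: the function names, the ±200 Da window, and the 0.01 Da matching tolerance are assumptions chosen for illustration, and the table holds only a few well-known monoisotopic modification masses.

```python
# Hypothetical illustration of mass-shift annotation; not the G-PTM-D code.
# Monoisotopic mass shifts (Da) of a few common modifications.
KNOWN_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
    "oxidation": 15.99491,
}


def within_wide_window(observed_mass, theoretical_mass, window_da=200.0):
    """First round: accept a peptide candidate if the precursor mass
    difference falls inside a very wide window (value is an assumption)."""
    return abs(observed_mass - theoretical_mass) <= window_da


def annotate_shift(observed_mass, theoretical_mass, tol_da=0.01):
    """Return the known modifications whose mass matches the difference
    between the observed and the unmodified theoretical precursor mass."""
    delta = observed_mass - theoretical_mass
    return [name for name, shift in KNOWN_SHIFTS.items()
            if abs(delta - shift) <= tol_da]


if __name__ == "__main__":
    # A precursor about 79.966 Da heavier than the unmodified peptide.
    print(annotate_shift(1080.4663, 1000.5))
```

Modification and site candidates collected this way would then be added to the search database before a second search with narrow tolerances, as the abstract describes.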
Current Protocols in Protein Science. 2016. Xing-Jun Cao. et al. University of Pennsylvania
ABSTRACT: Lysine methylation is a common protein post-translational modification dynamically mediated by protein lysine methyltransferases (PKMTs) and protein lysine demethylases (PKDMs). Beyond histone proteins, lysine methylation on non-histone proteins plays a substantial role in a variety of functions in cells and is closely associated with diseases such as cancer. A large body of evidence indicates that the dysregulation of some PKMTs leads to tumorigenesis via their non-histone substrates. However, most studies on other PKMTs have made slow progress owing to the lack of approaches for extensive screening of lysine methylation sites. Recently, a series of publications has reported large-scale analyses of protein lysine methylation. In this unit, we introduce a protocol for the global analysis of protein lysine methylation in cells by means of immunoaffinity enrichment and mass spectrometry.
Molecular Plant. 2016. Shuo-Lei Bu. et al. Hebei Normal University
ABSTRACT: ABA induces the phosphorylation of three basic helix-loop-helix (bHLH) transcription factors, called AKSs (ABA-responsive kinase substrates; AKS1, AKS2, and AKS3). The unphosphorylated AKSs facilitate stomatal opening through promoting the transcription of genes encoding inwardly rectifying K+ channels (Takahashi et al., 2013). AKS1 and AKS3 are also regulators of flowering (Ito et al., 2012). However, the kinases and phosphatases that directly control the phosphorylation status of AKSs in vivo have not been fully characterized. Here, our proteomic analyses provide evidence supporting that AKSs are phosphorylated by GSK3 kinases and dephosphorylated by protein phosphatase 2A (PP2A).
> PP2A is a ubiquitous and conserved serine/threonine phosphatase. Studies in mammals have shown that PP2A is one of the most important phosphatases for cellular regulation, with broad substrate specificity and diverse cellular functions. PP2A is a heterotrimeric complex composed of structural A, catalytic C, and regulatory B subunits. The A subunit is the scaffold required for the formation of the heterotrimeric complex, whereas the B subunit recruits specific substrates. The Arabidopsis genome encodes at least three A subunits, 17 B subunits, and five C subunits (Jonassen et al., 2011). Genetic studies have shown important function of PP2A in plant growth, development, and adaptation (Lillo et al., 2014), but its substrates have not been studied systematically. In addition, it is unknown whether each B subunit associates with different A subunit isoforms or non-selectively associates with all A and C subunits.
Cell. 2016. Javier Fernandez-Martinez. et al. The Rockefeller University
ABSTRACT: The last steps in mRNA export and remodeling are performed by the Nup82 complex, a large conserved assembly at the cytoplasmic face of the nuclear pore complex (NPC). By integrating diverse structural data, we have determined the molecular architecture of the native Nup82 complex at subnanometer precision. The complex consists of two compositionally identical multiprotein subunits that adopt different configurations. The Nup82 complex fits into the NPC through the outer ring Nup84 complex. Our map shows that this entire 14-MDa Nup82-Nup84 complex assembly positions the cytoplasmic mRNA export factor docking sites and messenger ribonucleoprotein (mRNP) remodeling machinery right over the NPC’s central channel rather than on distal cytoplasmic filaments, as previously supposed. We suggest that this configuration efficiently captures and remodels exporting mRNP particles immediately upon reaching the cytoplasmic side of the NPC.
EMBO reports. 2016. Dheva T Setiaputra. et al. The University of British Columbia
ABSTRACT: Elongator is a ~850 kDa protein complex involved in multiple processes from transcription to tRNA modification. Conserved from yeast to humans, Elongator is assembled from two copies of six unique subunits (Elp1 to Elp6). Despite the wealth of structural data on the individual subunits, the overall architecture and subunit organization of the full Elongator and the molecular mechanisms of how it exerts its multiple activities remain unclear. Using single‐particle electron microscopy (EM), we revealed that yeast Elongator adopts a bilobal architecture and an unexpected asymmetric subunit arrangement resulting from the hexameric Elp456 subassembly anchored to one of the two Elp123 lobes that form the structural scaffold. By integrating the EM data with available subunit crystal structures and restraints generated from cross‐linking coupled to mass spectrometry, we constructed a multiscale molecular model that showed the two Elp3, the main catalytic subunit, are located in two distinct environments. This work provides the first structural insights into Elongator and a framework to understand the molecular basis of its multifunctionality.
Journal of Proteomics. 2016. Verena Tinnefeld. et al. Leibniz-Institut für Analytische Wissenschaften
ABSTRACT: Chemical cross-linking of proteins is an emerging field with huge potential for the structural investigation of proteins and protein complexes. Owing to the often relatively low yield of cross-linking products, their identification in complex samples benefits from enrichment procedures prior to mass spectrometry analysis. So far, this is mainly accomplished by using biotin moieties in specific cross-linkers or by applying strong cation exchange chromatography (SCX) for a relatively crude enrichment. Here, we present a novel workflow to enrich cross-linked peptides by utilizing charge-based fractional diagonal chromatography (ChaFRADIC). Based on two-dimensional diagonal SCX separation, we could increase the number of identified cross-linked peptides for samples of different complexity: pure cross-linked BSA, cross-linked BSA spiked into a simple protein mixture and cross-linked BSA spiked into a HeLa lysate. We also compared XL-ChaFRADIC with size exclusion chromatography-based enrichment of cross-linked peptides. The XL-ChaFRADIC approach is straightforward, reproducible and independent of the cross-linking chemistry and cross-linker properties.
Nature. 2016. Shan Wu. et al. Carnegie Mellon University
ABSTRACT: Ribosome biogenesis is a highly complex process in eukaryotes, involving temporally and spatially regulated ribosomal protein (r-protein) binding and ribosomal RNA remodelling events in the nucleolus, nucleoplasm and cytoplasm. Hundreds of assembly factors, organized into sequential functional groups, facilitate and guide the maturation process into productive assembly branches in and across different cellular compartments. However, the precise mechanisms by which these assembly factors function are largely unknown. Here we use cryo-electron microscopy to characterize the structures of yeast nucleoplasmic pre-60S particles affinity-purified using the epitope-tagged assembly factor Nog2. Our data pinpoint the locations and determine the structures of over 20 assembly factors, which are enriched in two areas: an arc region extending from the central protuberance to the polypeptide tunnel exit, and the domain including the internal transcribed spacer 2 (ITS2) that separates 5.8S and 25S ribosomal RNAs.
> In particular, two regulatory GTPases, Nog2 and Nog1, act as hub proteins to interact with multiple, distant assembly factors and functional ribosomal RNA elements, manifesting their critical roles in structural remodelling checkpoints and nuclear export. Moreover, our snapshots of compositionally and structurally different pre-60S intermediates provide essential mechanistic details for three major remodelling events before nuclear export: rotation of the 5S ribonucleoprotein, construction of the active centre and ITS2 removal. The rich structural information in our structures provides a framework to dissect molecular roles of diverse assembly factors in eukaryotic ribosome assembly.
Analytical and Bioanalytical Chemistry. 2016. He Zhu. et al. Georgia State University
ABSTRACT: N-Glycosylation is one of the most prevalent protein post-translational modifications and is involved in many biological processes, such as protein folding, cellular communications, and signaling. Alteration of N-glycosylation is closely related to the pathogenesis of diseases. Thus, the investigation of protein N-glycosylation is crucial for the diagnosis and treatment of disease. In this research, we applied diethylaminoethanol (DEAE) Sepharose solid-phase extraction microcolumns for N-glycopeptide enrichment. This method integrated the advantages of Click Maltose and zwitterionic HILIC (ZIC-HILIC) and showed a relatively higher specificity for N-glycosylated peptides. This strategy was then applied to tryptic digests of normal human serum, followed by deglycosylation using peptide-N-glycosidase F (PNGase F) in H₂¹⁸O. Subsequent LC-MS/MS analysis allowed for the assignment of 219 N-glycosylation sites from 115 serum N-glycoproteins. This study provides an alternative approach for N-glycopeptide enrichment and the method employed is effective for large-scale N-glycosylation site identification. Graphical abstract: Proposed mechanism of glycopeptide enrichment using DEAE-Sepharose.
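Site assignment after PNGase F treatment in heavy water usually rests on two checks: the modified asparagine sits in the consensus sequon N-X-S/T (X not proline), and the residue carries the mass shift of deamidation plus ¹⁸O incorporation (about +2.9883 Da). The sketch below illustrates both checks. It is an assumption about how such filtering is commonly done, not the authors' pipeline, and the function names and tolerance are hypothetical.

```python
import re

# Deamidation (+0.984016 Da) plus 18O-for-16O exchange (+2.004246 Da).
O18_DEAMIDATION_SHIFT = 0.984016 + 2.004246

# N followed by any residue except P, then S or T (lookahead keeps
# overlapping sequons such as "NNTS" detectable).
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(peptide):
    """Return 0-based positions of asparagines inside an N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(peptide)]


def is_o18_labelled(observed_shift, tol_da=0.005):
    """True if a measured mass shift on Asn matches 18O-labelled deamidation.
    The tolerance is an arbitrary illustrative value."""
    return abs(observed_shift - O18_DEAMIDATION_SHIFT) <= tol_da


if __name__ == "__main__":
    print(find_sequons("ANVSKNPTNGT"))
    print(is_o18_labelled(2.9883))
```

Requiring the labelled shift rather than plain deamidation (+0.984 Da) helps separate enzymatic deglycosylation from spontaneous deamidation that occurs during sample handling.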
Rapid Communications in Mass Spectrometry. 2016. Acosta-Martin AE. et al. University of Geneva
ABSTRACT: Busulfan is a bifunctional alkyl sulfonate antineoplastic drug. This alkylating agent was described as forming covalent adducts on proteins. However, only limited data are available regarding the interaction of busulfan with proteins. Mass spectrometry and bioinformatics were used to identify busulfan adducts on human serum albumin and hemoglobin. Albumin and hemoglobin were incubated with busulfan or control compounds, digested with trypsin and analyzed by LC-MS/MS on a Thermo Fisher LTQ Orbitrap Velos Pro. MS data were used to generate spectral libraries of non-modified peptides and an open modification search was performed to identify potential adduct mass shifts and possible modification sites. Results were confirmed by a second database search including identified mass shifts and by visual inspection of annotated tandem mass spectra of adduct-carrying peptides.
> Five structures of busulfan adducts were detected and a chemical structure could be attributed to four of them. Two were primary adducts corresponding to busulfan monoalkylation and alkylation of two amino acid residues by a single busulfan molecule. Two others corresponded to secondary adducts generated during sample processing. Adducts were mainly detected on Asp, Glu, and His residues. These findings were confirmed by subsequent database searches and experiments with synthetic peptides. The combination of in vitro incubation of proteins with the drug of interest or control compounds, high-resolution mass spectrometry, and open modification search allowed confirming direct interaction of busulfan with proteins and characterizing resulting adducts. Our results also showed that careful analysis of the data is required to detect experimental artifacts.
Toxins. 2016. Ning Luan. et al. Kunming Institute of Zoology
ABSTRACT: Scorpion venom, an important source of natural compounds, is thought to contain many toxic peptides. Out of the two hundred proteins identified in Mesobuthus martensii (M. martensii), only a few peptide toxins have been found so far. Herein, a combined approach based upon RNA sequencing and liquid chromatography-tandem mass spectrometry (LC-MS/MS) was employed to explore the venom peptides in M. martensii. A total of 153 proteins were identified from the scorpion venom, 26 previously known and 127 newly identified. Of the novel toxins, 97 proteins exhibited sequence similarities to known toxins, and 30 had never been reported. Combining peptidomic and transcriptomic analyses, the peptide sequence of BmKKx1 was reannotated and four disulfide bridges were confirmed within it. In light of the comparison of conservation and variety of toxin amino acid sequences, highly conserved and variable regions were identified in 24 toxins belonging to two sodium channel and two potassium channel toxin families. Taken together, the peptidomic analysis of M. martensii identified numerous novel scorpion peptides, expanded our knowledge of venom diversity, and afforded a set of pharmaceutical candidates.
Nature. 2016. Jianping Wu. et al. Tsinghua University
ABSTRACT: The voltage-gated calcium (Cav) channels convert membrane electrical signals to intracellular Ca2+-mediated events. Among the ten subtypes of Cav channel in mammals, Cav1.1 is specified for the excitation-contraction coupling of skeletal muscles. Here we present the cryo-electron microscopy structure of the rabbit Cav1.1 complex at a nominal resolution of 3.6 Å. The inner gate of the ion-conductin Classification of the particles yielded two additional reconstructions that reveal pronounced displacement of β1a and adjacent elements in α1. The atomic model of the Cav1.1 complex establishes a foundation for mechanistic understanding of excitation-contraction coupling and provides a three-dimensional template for molecular interpretations of the functions and disease mechanisms of Cav and Nav channels.
Analytical Chemistry. 2016. Şule Yılmaz. et al. Ghent University
ABSTRACT: Chemical cross-linking coupled with mass spectrometry plays an important role in unravelling protein interactions, especially weak and transient ones. Moreover, cross-linking complements several structural determination approaches such as cryo-EM. Although several computational approaches are available for the annotation of spectra obtained from cross-linked peptides, there remains room for improvement. Here, we present Xilmass, a novel algorithm to identify cross-linked peptides that introduces two new concepts: (i) the cross-linked peptides are represented in the search database such that the cross-linking sites are explicitly encoded, and (ii) the scoring function derived from the Andromeda algorithm was adapted to score against a theoretical tandem mass spectrometry (MS/MS) spectrum that contains the peaks from all possible fragment ions of a cross-linked peptide pair.
> The performance of Xilmass was evaluated against the recently published Kojak and the popular pLink algorithms on a calmodulin-plectin complex data set, as well as three additional, published data sets. The results show that Xilmass typically had the highest number of identified distinct cross-linked sites and also the highest number of predicted cross-linked sites.
PNAS. 2016. Suzanne M. McDermott. et al. Center for Infectious Disease Research
ABSTRACT: Uridine insertion and deletion RNA editing generates functional mitochondrial mRNAs in Trypanosoma brucei. Editing is catalyzed by three distinct ∼20S editosomes that have a common set of 12 proteins, but are typified by mutually exclusive RNase III endonucleases with distinct cleavage specificities and unique partner proteins. Previous studies identified a network of protein-protein interactions among a subset of common editosome proteins, but interactions among the endonucleases and their partner proteins, and their interactions with common subunits were not identified. Here, chemical cross-linking and mass spectrometry, comparative structural modeling, and genetic and biochemical analyses were used to define the molecular architecture and subunit organization of purified editosomes. We identified intra- and interprotein cross-links for all editosome subunits that are fully consistent with editosome protein structures and previously identified interactions, which we validated by genetic and biochemical studies.
> The results were used to create a highly detailed map of editosome protein domain proximities, leading to identification of molecular interactions between subunits, insights into the functions of noncatalytic editosome proteins, and a global understanding of editosome architecture.
European Journal of Cell Biology. 2016. Anna Chan. et al. University of Freiburg
ABSTRACT: Peroxisomal matrix protein import is facilitated by cycling receptors that recognize their cargo proteins in the cytosol by peroxisomal targeting sequences (PTS). Subsequently, the assembled receptor-cargo complex is targeted to the peroxisomal membrane where it docks to the docking-complex as part of the peroxisomal translocation machinery. The docking-complex is composed of Pex13p, Pex14p an
> We identified the dynein light chain protein Dyn2p as an additional core component of the Pex14p/Pex17p-complex. Both Pex14p and Pex17p interact directly with Dyn2p, but in vivo, Pex17p turned out to be a prerequisite for an association of Dyn2p with Pex14p. Finally, like pex17Δ cells, dyn2Δ cells also lack the high molecular weight complex. As dyn2Δ cells also display reduced peroxisomal function, our data
Proteomics. 2016. Vladimir Gorshkov. et al. University of Southern Denmark Odense M
ABSTRACT: The impact of mixture spectra deconvolution on the performance of four popular de novo sequencing programs was tested using artificially constructed mixture spectra as well as experimental proteomics data. Mixture fragmentation spectra are recognized as a limitation in proteomics because they decrease the identification performance using database search engines. De novo sequencing approaches are expected to be even more sensitive to the reduction in mass spectrum quality resulting from peptide precursor co-isolation and thus prone to false identifications. The deconvolution approach matched complementary b-, y-ions to each precursor peptide mass, which allowed the creation of virtual spectra containing sequence specific fragment ions of each co-isolated peptide. Deconvolution processing resulted in equally efficient identification rates but increased the absolute number of correctly sequenced peptides.
> The improvement was in the range of 20–35% additional peptide identifications for a HeLa lysate sample. Some correct sequences were identified only using unprocessed spectra; however, the number of these was lower than those where improvement was obtained by mass spectral deconvolution. Tight candidate peptide score distribution and high sensitivity to small changes in the mass spectrum introduced by the employed deconvolution method could explain some of the missing peptide identifications.
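The complementary-ion idea behind this kind of deconvolution can be sketched briefly. For singly charged fragments, a b-ion and its complementary y-ion have m/z values that sum to the neutral peptide mass plus two proton masses, so peaks can be assigned to whichever co-isolated precursor they pair up for. The code below is a simplified illustration under that assumption (singly charged fragments only, fixed absolute tolerance). The function names are hypothetical and it is not the deconvolution tool evaluated in the study.

```python
PROTON = 1.007276  # proton mass in Da


def complementary_pairs(peaks, precursor_neutral_mass, tol_da=0.02):
    """Find pairs of singly charged fragment m/z values whose sum equals
    M + 2 * proton for the given neutral precursor mass M."""
    target = precursor_neutral_mass + 2 * PROTON
    mz = sorted(peaks)
    pairs = []
    lo, hi = 0, len(mz) - 1
    # Two-pointer sweep over the sorted peak list.
    while lo < hi:
        total = mz[lo] + mz[hi]
        if abs(total - target) <= tol_da:
            pairs.append((mz[lo], mz[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


def split_mixture(peaks, precursor_masses, tol_da=0.02):
    """Build one 'virtual spectrum' (list of complementary pairs) per
    co-isolated precursor mass."""
    return {m: complementary_pairs(peaks, m, tol_da) for m in precursor_masses}


if __name__ == "__main__":
    mixed = [300.0, 400.2, 450.1, 510.3, 551.914552, 702.014552, 801.814552]
    for mass, pairs in split_mixture(mixed, [1000.0, 1200.0]).items():
        print(mass, pairs)
```

Peaks that pair for neither precursor (510.3 in the example) stay unassigned, which is one way a deconvolved spectrum can end up cleaner than the mixture it came from.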
Protein & Cell. 2016. Shengliu Wang. et al. Institute of Biophysics, Chinese Academy of Sciences
ABSTRACT: Studies on coat protein I (COPI) have contributed to a basic understanding of how coat proteins generate vesicles to initiate intracellular transport. The core component of the COPI complex is coatomer, which is a multimeric complex that needs to be recruited from the cytosol to membrane in order to function in membrane bending and cargo sorting. Previous structural studies on the clathrin adaptors have found that membrane recruitment induces a large conformational change in promoting their role in cargo sorting. Here, pursuing negative-stain electron microscopy coupled with single-particle analyses, and also performing CXMS (chemical cross-linking coupled with mass spectrometry) for validation, we have reconstructed the structure of coatomer in its soluble form.
> When compared to the previously elucidated structure of coatomer in its membrane-bound form we do not observe a large conformational change. Thus, the result uncovers a key difference between how COPI versus clathrin coats are regulated by membrane recruitment.
Journal of Proteome Research. 2016. Jiahui Guo. et al. Jinan University
ABSTRACT: Identification of all phosphorylation forms of known proteins is a major goal of the Chromosome-Centric Human Proteome Project (C-HPP). Recent studies have found that certain phosphoproteins can be encapsulated in exosomes and function as key regulators in tumor microenvironment, but no deep coverage phosphoproteome of human exosomes has been reported to date, which makes the exosome a potential source for the new phosphosite discovery. In this study, we performed highly optimized MS analyses on the exosomal and cellular proteins isolated from human colorectal cancer SW620 cells. With stringent data quality control, 313 phosphoproteins with 1091 phosphosites were confidently identified from the SW620 exosome, from which 202 new phosphosites were detected.
> Exosomal phosphoproteins were significantly enriched in the 11q12.1–13.5 region of chromosome 11 and had a remarkably high level of tyrosine-phosphorylated proteins (6.4%), which were functionally relevant to ephrin signaling pathway-directed cytoskeleton remodeling. In conclusion, we here report the first high-coverage phosphoproteome of human cell-secreted exosomes, which leads to the identification of new phosphosites for C-HPP. Our findings provide insights into the exosomal phosphoprotein systems that help to understand the signaling language being delivered by exosomes in cell–cell communications. The mass spectrometry proteomics data have been deposited to the ProteomeXchange consortium with the data set identifier PXD004079, and iProX database (accession number: IPX00076800).
PLOS Pathogens. 2016. Dhana G. Gorasia. et al. The University of Melbourne
ABSTRACT: The type IX secretion system (T9SS) has been recently discovered and is specific to Bacteroidetes species. Porphyromonas gingivalis, a keystone pathogen for periodontitis, utilizes the T9SS to transport many proteins including the gingipain virulence factors across the outer membrane and attach them to the cell surface via a sortase-like mechanism. At least 11 proteins have been identified as components of the T9SS including PorK, PorL, PorM, PorN and PorP, however the precise roles of most of these proteins have not been elucidated and the structural organization of these components is unknown. In this study, we purified PorK and PorN complexes from P. gingivalis and using electron microscopy we have shown that PorN and the PorK lipoprotein interact to form a 50 nm diameter ring-shaped structure containing approximately 32–36 subunits of each protein.
> The formation of these rings was dependent on both PorK and PorN, but was independent of PorL, PorM and PorP. PorL and PorM were found to form a separate stable complex. PorK and PorN were protected from proteinase K cleavage when present in undisrupted cells, but were rapidly degraded when the cells were lysed, which together with bioinformatic analyses suggests that these proteins are exposed in the periplasm and anchored to the outer membrane via the PorK lipid. Chemical cross-linking and mass spectrometry analyses confirmed the interaction between PorK and PorN and further revealed that they interact with the PG0189 outer membrane protein. Furthermore, we established that PorN was required for the stable expression of PorK, PorL and PorM. Collectively, these results suggest that the ring-shaped PorK/N complex may form part of the secretion channel of the T9SS. This is the first report showing the structural organization of any T9SS component.
ACS Appl. Mater. Interfaces. 2016. Jianxi Liu. et al. Dalian Institute of Chemical Physics
ABSTRACT: Because of the low abundance of glycopeptide in natural biological samples, methods for efficient and selective enrichment of glycopeptides play a significant role in mass spectrometry (MS)-based glycoproteomics. In this study, a novel kind of zwitterionic hydrophilic interaction chromatography polymer particles, namely, poly(N,N-methylenebisacrylamide-co-methacrylic acid)@l-Cys (poly(MBAAm-co-MAA)@l-Cys), for the enrichment of glycopeptides was synthesized by a facile and efficient approach that combined distillation precipitation polymerization (DPP) and “thiol–ene” click reaction. In the DPP approach, residual vinyl groups explored outside the core with high density, then the functional ligand cysteine was immobilized onto the surface of core particles by highly efficient thiol–ene click reaction. Taking advantage of the unique structure of poly(MBAAm-co-MAA)@l-Cys, the resulting particles possess remarkable enrichment selectivity for glycopeptides from the tryptic digested human immunoglobulin G.
> The polymer particles were successfully employed for the analysis of human plasma, and 208 unique glycopeptides corresponding to 121 glycoproteins were reliably identified in three independent nano-LC-MS/MS runs. The selectivity of the poly(MBAAm-co-MAA)@l-Cys particles toward glycopeptides is ∼2 times that of the commercial beads. These results demonstrated that these particles had great potential for large-scale glycoproteomics research. Moreover, the strategy combining DPP and thiol–ene click chemistry might be a facile method to produce functional polymer particles for bioenrichment applications.
Journal of Proteome Research. 2016. Wei Wei. et al. Beijing Proteome Research Center
ABSTRACT: Since 2012, missing proteins (MPs) investigation has been one of the critical missions of Chromosome-Centric Human Proteome Project (C-HPP) through various biochemical strategies. On the basis of our previous testis MPs study, faster scanning and higher resolution mass-spectrometry-based proteomics might be conducive to MPs exploration, especially for low-abundance proteins. In this study, Q-Exactive HF (HF) was used to survey proteins from the same testis tissues separated by two separating methods (tricine- and glycine-SDS-PAGE), as previously described. A total of 8526 proteins were identified, of which more low-abundance proteins were uniquely detected in HF data but not in our previous LTQ Orbitrap Velos (Velos) reanalysis data. Further transcriptomics analysis showed that these uniquely identified proteins by HF also had lower expression at the mRNA level. Of the 81 total identified MPs, 74 and 39 proteins were listed as MPs in HF and Velos data sets, respectively.
> Among the above MPs, 47 proteins (43 neXtProt PE2 and 4 PE3) were ranked as confirmed MPs after verifying with the stringent spectra match and isobaric and single amino acid variants filtering. Functional investigation of these 47 MPs revealed that 11 MPs were testis-specific proteins and 7 MPs were involved in spermatogenesis process. Therefore, we concluded that higher scanning speed and resolution of HF might be factors for improving the low-abundance MP identification in future C-HPP studies. All mass-spectrometry data from this study have been deposited in the ProteomeXchange with identifier PXD004092.
Journal of Proteome Research. 2016. Mingzhi Zhao. et al. Beijing Proteome Research Center
ABSTRACT: A membrane protein enrichment method composed of ultracentrifugation and detergent-based extraction was first developed based on MCF7 cell line. Then, in-solution digestion with detergents and eFASP (enhanced filter-aided sample preparation) with detergents were compared with the time-consuming in-gel digestion method. Among the in-solution digestion strategies, the eFASP combined with RapiGest identified 1125 membrane proteins. Similarly, the eFASP combined with sodium deoxycholate identified 1069 membrane proteins; however, the in-gel digestion characterized 1091 membrane proteins. Totally, with the five digestion methods, 1390 membrane proteins were identified with ≥1 unique peptides, among which 1345 membrane proteins contain unique peptides ≥2.
> This is the biggest membrane protein data set for MCF7 cell line and even breast cancer tissue samples. Interestingly, we identified 13 unique peptides belonging to 8 missing proteins (MPs). Finally, eight unique peptides were validated by synthesized peptides. Two proteins were confirmed as MPs, and another two proteins were candidate detections.
Proteomics. 2016. Yan Yan. et al. University of Saskatchewan Saskatoon
ABSTRACT: In tandem mass spectrometry (MS/MS), there are several different fragmentation techniques possible including collision-induced dissociation (CID), higher-energy collisional dissociation (HCD), electron capture dissociation (ECD), and electron transfer dissociation (ETD). When using pairs of spectra for de novo peptide sequencing, the most popular methods are designed for CID (or HCD) and ECD (or ETD) spectra because of the complementarity between them. Less attention has been paid to the use of CID and HCD spectra pairs. In this study, a new de novo peptide sequencing method is proposed for these spectra pairs. This method includes a CID and HCD spectra merging criterion and a parent mass correction step, along with improvements to our previously proposed algorithm for sequencing merged spectra.
> Three pairs of spectral datasets were used to investigate and compare the performance of the proposed method with other existing methods designed for single spectrum (HCD or CID) sequencing. Experimental results showed that full length peptide sequencing accuracy was increased significantly by using spectra pairs in the proposed method, with the highest accuracy reaching 81.31%.
Nature Communications. 2016. Mirjam Hunziker. et al. The Rockefeller University
ABSTRACT: Early eukaryotic ribosome biogenesis involves large multi-protein complexes, which co-transcriptionally associate with pre-ribosomal RNA to form the small subunit processome. The precise mechanisms by which two of the largest multi-protein complexes—UtpA and UtpB—interact with nascent pre-ribosomal RNA are poorly understood. Here, we combined biochemical and structural biology approaches with ensembles of RNA–protein cross-linking data to elucidate the essential functions of both complexes. We show that UtpA contains a large composite RNA-binding site and captures the 5′ end of pre-ribosomal RNA. UtpB forms an extended structure that binds early pre-ribosomal intermediates in close proximity to architectural sites such as an RNA duplex formed by the 5′ ETS and U3 snoRNA as well as the 3′ boundary of the 18S rRNA. Both complexes therefore act as vital RNA chaperones to initiate eukaryotic ribosome assembly.
Nature Communications. 2016. Karthik V. Rajasekar. et al. University of Oxford
ABSTRACT: Redox-regulated effector systems that counteract oxidative stress are essential for all forms of life. Here we uncover a new paradigm for sensing oxidative stress centred on the hydrophobic core of a sensor protein. RsrA is an archetypal zinc-binding anti-sigma factor that responds to disulfide stress in the cytoplasm of Actinobacteria. We show that RsrA utilizes its hydrophobic core to bind the sigma factor σR preventing its association with RNA polymerase, and that zinc plays a central role in maintaining this high-affinity complex. Oxidation of RsrA is limited by the rate of zinc release, which weakens the RsrA–σR complex by accelerating its dissociation. The subsequent trigger disulfide, formed between specific combinations of RsrA’s three zinc-binding cysteines, precipitates structural collapse to a compact state where all σR-binding residues are sequestered back into its hydrophobic core, releasing σR to activate transcription of anti-oxidant genes.
Journal of Proteomics. 2016. Cheng Ma. et al. Georgia State University
ABSTRACT: Core-fucosylation (CF) plays important roles in regulating biological processes in eukaryotes. Alterations of CF-glycosites or CF-glycans in bodily fluids correlate with cancer development. Therefore, global research of protein core-fucosylation with an emphasis on proteomics can explain pathogenic and metastasis mechanisms and aid in the discovery of new potential biomarkers for early clinical diagnosis. In this study, a precise and high throughput method was established to identify CF-glycosites from human plasma. We found that alternating HCD and ETD fragmentation (AHEF) can provide a complementary method to discover CF-glycosites. A total of 407 CF-glycosites among 267 CF-glycoproteins were identified in a mixed sample made from six normal human plasma samples. Among the 407 CF-glycosites, 10 are without the N-X-S/T/C consensus motif, representing 2.5% of the total number identified. All identified CF-glycopeptide results from HCD and ETD fragmentation were filtered with neutral loss peaks and characteristic ions of GlcNAc from HCD spectra, which assured the credibility of the results. This study provides an effective method for CF-glycosites identification and a valuable biomarker reference for clinical research.
> CF-glycosylation plays an important role in regulating biological processes in eukaryotes. Alterations of the glycosites and attached CF-glycans are frequently observed in various types of cancers. Thus, it is crucial to develop a strategy for mapping human CF-glycosylation. Here, we developed a complementary method via alternating HCD and ETD fragmentation (AHEF) to analyze CF-glycoproteins. This strategy reveals an excellent complementarity of HCD and ETD in the analysis of CF-glycoproteins, and provides a valuable biomarker reference for clinical research.
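The N-X-S/T/C consensus motif mentioned in the abstract above can be checked with a few lines of code. The following is only a minimal sketch, not the authors' pipeline: the function names and the example sequence are hypothetical, and it assumes the usual convention that X is any residue except proline, which the abstract does not spell out.

```python
import re

# N-glycosylation sequon as named in the abstract: N-X-S/T/C.
# Assumption (not stated in the abstract): X is any residue except proline.
# The lookahead lets overlapping motifs be reported as well.
SEQUON = re.compile(r"(?=N[^P][STC])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T/C motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def percent_without_motif(total_sites, sites_without_motif):
    """Share of identified sites that fall outside the consensus motif, in percent."""
    return 100.0 * sites_without_motif / total_sites


if __name__ == "__main__":
    # Hypothetical peptide sequence, used only to exercise the function.
    print(find_sequons("MNGTANPSNACK"))
    # Counts reported in the abstract: 10 of 407 CF-glycosites lack the motif.
    print(round(percent_without_motif(407, 10), 1))
```

With the counts reported in the abstract (10 of 407 sites), the second function reproduces the stated figure of about 2.5%.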
NATURE STRUCTURAL & MOLECULAR BIOLOGY. 2016. Susan L Kloet. et al. Radboud University Nijmegen
ABSTRACT: Although the core subunits of Polycomb group (PcG) complexes are well characterized, little is known about the dynamics of these protein complexes during cellular differentiation. We used quantitative interaction proteomics and genome-wide profiling to study PcG proteins in mouse embryonic stem cells (ESCs) and neural progenitor cells (NPCs). We found that the stoichiometry and genome-wide binding of PRC1 and PRC2 were highly dynamic during neural differentiation. Intriguingly, we observed a downregulation and loss of PRC2 from chromatin marked with trimethylated histone H3 K27 (H3K27me3) during differentiation, whereas PRC1 was retained at these sites.
> Additionally, we found PRC1 at enhancer and promoter regions independently of PRC2 binding and H3K27me3. Finally, overexpression of NPC-specific PRC1 interactors in ESCs led to increased Ring1b binding to, and decreased expression of, NPC-enriched Ring1b-target genes. In summary, our integrative analyses uncovered dynamic PcG subcomplexes and their widespread colocalization with active chromatin marks during differentiation.
NATURE COMMUNICATIONS. 2016. Xiangdong Zheng. et al. Tsinghua University
ABSTRACT: Centrioles and cilia are microtubule-based structures, whose precise formation requires controlled cytoplasmic tubulin incorporation. How cytoplasmic tubulin is recognized for centriolar/ciliary-microtubule construction remains poorly understood. Centrosomal-P4.1-associated-protein (CPAP) binds tubulin via its PN2-3 domain. Here, we show that a C-terminal loop-helix in PN2-3 targets β-tubulin at the microtubule outer surface, while an N-terminal helical motif caps microtubule’s α-β surface of β-tubulin. Through this, PN2-3 forms a high-affinity complex with GTP-tubulin, crucial for defining numbers and lengths of centriolar/ciliary-microtubules.
> Surprisingly, two distinct mutations in PN2-3 exhibit opposite effects on centriolar/ciliary-microtubule lengths. CPAPF375A, with strongly reduced tubulin interaction, causes shorter centrioles and cilia exhibiting doublet- instead of triplet-microtubules. CPAPEE343RR that unmasks the β-tubulin polymerization surface displays slightly reduced tubulin-binding affinity inducing over-elongation of newly forming centriolar/ciliary-microtubules by enhanced dynamic release of its bound tubulin. Thus CPAP regulates delivery of its bound-tubulin to define the size of microtubule-based cellular structures using a ‘clutch-like’ mechanism.
Analytical and Bioanalytical Chemistry. 2016. Yurong Wen. et al. Ghent University
ABSTRACT: Toxin-antitoxin systems are genetic modules involved in a broad range of bacterial cellular processes including persistence, multidrug resistance and tolerance, biofilm formation, and pathogenesis. In type II toxin-antitoxin systems, both the toxin and antitoxin are proteins. In the prototypic Escherichia coli HipA-HipB module, the antitoxin HipB forms a complex with the protein kinase HipA and sequesters it in the nucleoid. HipA is then no longer able to phosphorylate glutamyl-tRNA-synthetase and this prevents the initiation of the forthcoming stringent response.
> Here we investigated the assembly of the Shewanella oneidensis MR-1 HipA-HipB complex using native electrospray ion mobility-mass spectrometry and chemical crosslinking combined with mass spectrometry. We revealed that the HipA autophosphorylation was accompanied by a large conformational change, and confirmed structural evidence that S. oneidensis MR-1 HipA-HipB assembly was distinct from the prototypic E. coli HipA-HipB complex.
Protein Science. 2016. Jason W. Schmidberger. et al. University of Sydney
ABSTRACT: The Nucleosome Remodeling and Deacetylase (NuRD) complex remodels the genome in the context of both gene transcription and DNA damage repair. It is essential for normal development and is distributed across multiple tissues in organisms ranging from mammals to nematode worms. In common with other chromatin-remodeling complexes, however, its molecular mechanism of action is not well understood and only limited structural information is available to show how the complex is assembled. As a step towards understanding the structure of the NuRD complex, we have characterized the interaction between two subunits: the metastasis associated protein MTA1 and the histone-binding protein RBBP4.
> We show that MTA1 can bind to two molecules of RBBP4 and present negative stain electron microscopy and chemical crosslinking data that allow us to build a low-resolution model of an MTA1-(RBBP4)2 subcomplex. These data build on our understanding of NuRD complex structure and move us closer towards an understanding of the biochemical basis for the activity of this complex.
Cell research. 2016. Jun-Jie Liu. et al. Tsinghua University
ABSTRACT: The eukaryotic multi-subunit RNA exosome complex plays crucial roles in 3′-to-5′ RNA processing and decay. Rrp6 and Ski7 are the major cofactors for the nuclear and cytoplasmic exosomes, respectively. In the cytoplasm, Ski7 helps the exosome to target mRNAs for degradation and turnover via a through-core pathway. However, the interaction between Ski7 and the exosome complex has remained unclear. The transaction of RNA substrates within the exosome is also elusive.
> In this work, we used single-particle cryo-electron microscopy to solve the structures of the Ski7-exosome complex in RNA-free and RNA-bound forms at resolutions of 4.2 Å and 5.8 Å, respectively. These structures reveal that the N-terminal domain of Ski7 adopts a structural arrangement and interacts with the exosome in a similar fashion to the C-terminal domain of nuclear Rrp6. Further structural analysis of exosomes with RNA substrates harboring 3′ overhangs of different length suggests a switch mechanism of RNA-induced exosome activation in the through-core pathway of RNA processing.
Molecular & Cellular Proteomics. 2016. Xiaoshi Wang. et al. University of Pennsylvania School of Medicine, United States
ABSTRACT: Over the past decades, protein O-GlcNAcylation has been found to play a fundamental role in cell cycle control, metabolism, transcriptional regulation, and cellular signaling. Nevertheless, quantitative approaches to determine in vivo GlcNAc dynamics at a large-scale are still not readily available. Here, we have developed an approach to isotopically label O-GlcNAc modifications on proteins by producing 13C-labeled UDP-GlcNAc from 13C6-glucose via the hexosamine biosynthetic pathway. This metabolic labeling was combined with quantitative mass spectrometry-based proteomics to determine protein O-GlcNAcylation turnover rates.
> First, an efficient enrichment method for O-GlcNAc peptides was developed with the use of phenylboronic acid solid-phase extraction and anhydrous DMSO. The near stoichiometry reaction between the diol of GlcNAc and boronic acid dramatically improved the enrichment efficiency. Additionally, our kinetic model for turnover rates integrates both metabolomic and proteomic data, which increase the accuracy of the turnover rate estimation. Other advantages of this metabolic labeling method include in vivo application, direct labeling of the O-GlcNAc sites and higher confidence for site identification. Concentrating only on nuclear localized GlcNAc modified proteins, we are able to identify 105 O-GlcNAc peptides on 42 proteins and determine turnover rates of 20 O-GlcNAc peptides from 14 proteins extracted from HeLa nuclei. In general, we found O-GlcNAcylation turnover rates are slower than those published for phosphorylation or acetylation. Nevertheless, the rates widely varied depending on both the protein and the residue modified. We believe this methodology can be broadly applied to reveal turnovers/dynamics of protein O-GlcNAcylation from different biological states and will provide more information on the significance of O-GlcNAcylation, enabling us to study the temporal dynamics of this critical modification for the first time.
Molecular Cancer Therapeutics. 2016. Jun-Mei Yi. et al. Shanghai Institute of Materia Medica
ABSTRACT: Multidrug resistance (MDR) is a major cause of tumor treatment failure; therefore, drugs that can avoid this outcome are urgently needed. We studied triptolide which directly kills MDR tumor cells with a high potency and a broad spectrum of cell death. Triptolide did not inhibit P-glycoprotein (P-gp) drug-efflux and reduced P-gp and mdr1 mRNA resulted from transcription inhibition. Transcription factors including c-Myc, SOX-2, OCT-4, and NANOG were not correlated with triptolide-induced cell killing but Rpb1, the largest subunit of RNA polymerase II, was critical in mediating triptolide's inhibition of MDR cells.
> Triptolide elicited antitumor and anti-MDR activity through a universal mechanism: by activating CDK7 by phosphorylating Thr170 in both parental and MDR cell lines and in SK-OV-3 cells. The CDK7 selective inhibitor BS-181 partially rescued cell killing induced by 72 h treatment of triptolide which may be due to partial rescue of Rpb1 degradation. We suggest that a precise phosphorylation site on Rpb1 (Ser1878) was phosphorylated by CDK7 in response to triptolide. In addition, XPB and p44, two transcription factor TFIIH subunits did not contribute to triptolide-driven Rpb1 degradation and cell killing although XPB was reported to covalently bind to triptolide. Several clinical trials are underway to test triptolide and its analogues for treating cancer and other diseases, so our data may help expand potential clinical uses of triptolide as well as offer a compound that overcomes tumor MDR. Future investigations into the primary molecular target(s) of triptolide responsible for Rpb1 degradation may suggest novel anti-MDR target(s) for therapeutic development.
The Journal of Biological Chemistry. 2016. Jie Tian. et al. Capital Normal University
ABSTRACT: The anaphase promoting complex/cyclosome (APC/C) orchestrates various aspects of the eukaryotic cell cycle. One of its co-activators, Cdh1, is subject to myriad post-translational modifications, such as phosphorylation and ubiquitination. Herein, we identify the O-linked N-acetylglucosamine (O-GlcNAc) modification that occurs on Cdh1. Cdh1 is O-GlcNAcylated in cultured cells and mouse brain extracts. Mass spectrometry identifies an O-GlcNAcylated peptide that neighbors a known phosphorylation site. Cell synchronization and mutation studies reveal that O-GlcNAcylation of Cdh1 may antagonize its phosphorylation.
> Our results thus reveal a pivotal role of O-GlcNAcylation in regulating APC/C activity.
Analytical Chemistry. 2016. Yuehe Ding. et al. National Institute of Biological Sciences
ABSTRACT: Chemical cross-linking of proteins coupled with mass spectrometry (CXMS) is a powerful tool to study protein folding and to map the interfaces between interacting proteins. The most commonly used cross-linkers in CXMS are BS3 and DSS, which have similar structures and generate the same linkages between pairs of lysine residues in spatial proximity. However, there are cases where no cross-linkable lysine pairs are present at certain regions of a protein or at the interface of two interacting proteins. In order to find the cross-linkers that can best complement the performance of BS3 and DSS, we tested seven additional cross-linkers that either have different spacer arm structures or that target different amino acids (BS2G, EGS, AMAS, GMBS, Sulfo-GMBS, EDC, and TFCS).
> Using BSA, aldolase, the yeast H/ACA protein complex, and E. coli 70S ribosomes, we showed that, in terms of providing structural information not obtained through the use of BS3 and DSS, EGS and Sulfo-GMBS worked better than the other cross-linkers that we tested. EGS generated a large number of cross-links not seen with the other amine-specific cross-linkers, possibly due to its hydrophilic spacer arm. We demonstrate that incorporating the cross-links contributed by the EGS and amine-sulfhydryl cross-linkers greatly increased the accuracy of Rosetta in docking the structure of the yeast H/ACA protein complex. Given the improved depth of useful information it can provide, we suggest that the multilinker CXMS approach should be used routinely when the amount of a sample permits.
eLife. 2016. Dan Tan. et al. National Institute of Biological Sciences
ABSTRACT: To improve chemical cross-linking of proteins coupled with mass spectrometry (CXMS), we developed a lysine-targeted enrichable cross-linker containing a biotin tag for affinity purification, a chemical cleavage site to separate cross-linked peptides away from biotin after enrichment, and a spacer arm that can be labeled with stable isotopes for quantitation. By locating the flexible proteins on the surface of 70S ribosome, we show that this trifunctional cross-linker is effective at attaining structural information not easily attainable by crystallography and electron microscopy.
> From a crude Rrp46 immunoprecipitate, it helped identify two direct binding partners of Rrp46 and 15 protein-protein interactions (PPIs) among the co-immunoprecipitated exosome subunits. Applying it to E. coli and C. elegans lysates, we identified 3130 and 893 inter-linked lysine pairs, representing 677 and 121 PPIs. Using a quantitative CXMS workflow we demonstrate that it can reveal changes in the reactivity of lysine residues due to protein-nucleic acid interaction.
Journal of Proteomics. 2016. Chien-Wen Hung. et al. Christian-Albrechts-Universität zu Kiel
ABSTRACT: Bone morphogenetic protein 1 (BMP-1) is an essential metalloproteinase to trigger extracellular matrix assembly and organogenesis. Previous structural studies on the refolded catalytic domain of BMP-1 produced in E. coli have suggested the existence of a rare vicinal disulfide linkage near the active site. To confirm that this was not an artifact of the refolding procedure, the full-length human BMP-1 produced in mammalian cells was investigated via sequence-dependent enzyme cleavage under native conditions followed by high mass accuracy and high resolution LC-MS/MS analysis to interrogate the post-translational modifications. Ten disulfide linkages of BMP-1, including the vicinal disulfide linkage C185-C186 could be unambiguously identified.
> Further, around 50% of this vicinal disulfide bond was found to be modified by N-ethylmaleimide (NEM), a cysteine protease inhibitor supplied when the BMP-1-containing medium was collected, suggesting that this bond was highly unstable. In the absence of NEM, BMP-1 has a higher tendency to form aggregates, but after aggregate removal, C185 and C186 are almost quantitatively engaged in the vicinal disulfide bond and BMP-1 activity remains unchanged. In addition, three consensus N-glycosylation sites at N142, N363, and N599 could be identified together with a previously unknown O-glycosylation site and an Asn-hydroxylation. An in-depth characterization of post-translational modifications of the full-length human BMP-1 produced in mammalian cells by MS was performed. A rare vicinal disulfide bond in the catalytic domain could be confirmed for the first time by mass spectrometry along with nine other proposed disulfide linkages of mature BMP-1. This vicinal disulfide bond can transiently open to form covalent adducts with the cysteine protease inhibitor (NEM) supplied in cell medium during protein harvesting. Further, we report a previously unknown O-glycosylation site and Asn-hydroxylation site, indicating a novel feature of BMP-1 in the EGF domain. The study clearly outlines the benefit of in-depth characterization of overexpressed proteins to deduce important protein modifications.
Cancer Research. 2016. Chih-Hang Anthony Tang. et al. The Wistar Institute
ABSTRACT: Endoplasmic reticulum (ER) stress responses through the IRE-1/XBP-1 pathway are required for the function of STING (TMEM173), an ER-resident transmembrane protein critical for cytoplasmic DNA sensing, IFN production, and cancer control. Here we show that the IRE-1/XBP-1 pathway functions downstream of STING and that STING agonists selectively trigger mitochondria-mediated apoptosis in normal and malignant B cells. Upon stimulation, STING was degraded less efficiently in B cells, implying that prolonged activation of STING can lead to apoptosis.
> Transient activation of the IRE-1/XBP-1 pathway partially protected agonist-stimulated malignant B cells from undergoing apoptosis. In Eμ-TCL1 mice with chronic lymphocytic leukemia, injection of the STING agonist 3'3'-cGAMP induced apoptosis and tumor regression. Similarly efficacious effects were elicited by 3'3'-cGAMP injection in syngeneic or immunodeficient mice grafted with multiple myeloma. Thus, in addition to their established ability to boost antitumoral immune responses, STING agonists can also directly eradicate malignant B cells.
Journal of Proteomics. 2016. Yehui Xiong. et al. Institute of Plant Protection
ABSTRACT: Lysine acetylation is a dynamic and reversible post-translational modification that plays an important role in the gene transcription regulation. Here, we report high quality proteome-scale data for lysine-acetylation (Kac) sites and Kac proteins in rice (Oryza sativa). A total of 1337 Kac sites in 716 Kac proteins with diverse biological functions and subcellular localizations were identified in rice seedlings. About 42% of the sites were predicted to be localized in the chloroplast. Seven putative acetylation motifs were detected. Phenylalanine, located in both the upstream and downstream of the Kac sites, is the most conserved amino acid surrounding the regions.
> In addition, protein interaction network analysis revealed that a variety of signaling pathways are modulated by protein acetylation. KEGG pathway category enrichment analysis indicated that glyoxylate and dicarboxylate metabolism, carbon metabolism, and photosynthesis pathways are significantly enriched. Our results provide an in-depth understanding of the acetylome in rice seedlings, and the method described here will facilitate the systematic study of how Kac functions in growth, development, and abiotic and biotic stress responses in rice and other plants. Rice is one of the most important crops for consumption and is a model monocot for research. In this study, we combined a highly sensitive immunoaffinity purification method (using pan anti-acetyl-lysine antibody-conjugated agarose for immunoaffinity enrichment of acetylated peptides) with high-resolution LC-MS/MS. In total, we identified 1337 Kac sites on 716 Kac proteins in rice cells. Bioinformatic analysis of the acetylome revealed that the acetylated proteins are involved in a variety of cellular functions and have diverse subcellular localizations. We also identified seven putative acetylation motifs in the acetylated proteins of rice. In addition, protein interaction network analysis revealed that a variety of signaling pathways were modulated by protein acetylation. KEGG pathway category enrichment analysis indicated that glyoxylate and dicarboxylate metabolism, carbon metabolism, and photosynthesis pathways were significantly enriched. To our knowledge, the number of Kac sites we identified was 23 times greater and the number of Kac proteins was 16 times greater than in a previous report. Our results provide an in-depth understanding of the acetylome in rice seedlings, and the method described here will facilitate the systematic study of how Kac functions in growth, development and responses to abiotic and biotic stresses in rice or other plants.
Analytical Chemistry. 2016. Yun Xiong. et al. Tianjin Institute of Industrial Biotechnology
ABSTRACT: Detection of proteins containing single amino acid polymorphisms (SAPs) encoded by nonsynonymous SNPs (nsSNPs) can aid researchers in studying the functional significance of protein variants. Most proteogenomic approaches for large-scale SAPs mapping require construction of a sample-specific database containing protein variants predicted from the next-generation sequencing (NGS) data. Searching shotgun proteomic data sets against these NGS-derived databases allowed for identification of SAP peptides, thus validating the proteome-level sequence variation.
> Contrary to the conventional approaches, our study presents a novel strategy for proteome-wide SAP detection without relying on sample-specific NGS data. By searching a deep-coverage proteomic data set from an industrial thermotolerant yeast strain using our strategy, we identified 337 putative SAPs compared to the reference genome. Among the SAP peptides identified with stringent criteria, 85.2% of SAP sites were validated using whole-genome sequencing data obtained for this organism, which indicates high accuracy of SAP identification with our strategy. More interestingly, for certain SAP peptides that cannot be predicted by genomic sequencing, we used synthetic peptide standards to verify expression of peptide variants in the proteome. Our study has provided a unique tool for proteogenomics to enable proteome-wide direct SAP identification and capture nongenetic protein variants not linked to nsSNPs.
Pharmacogenomics. 2016. Kung-Hao Liang. et al. Chang Gung Memorial Hospital
ABSTRACT: Transcatheter arterial chemoembolization is currently the standard treatment in hepatocellular carcinoma patients with Barcelona Clinic Liver Cancer stage B. Genomic variants of GALNT14 were recently identified as effective predictors for chemotherapy responses in Barcelona Clinic Liver Cancer stage C patients. We investigated the prognosis predictive value of GALNT14 genotypes in 327 hepatocellular carcinoma patients treated by transcatheter arterial chemoembolization.
> Cox proportional hazards model analysis showed that the genotype 'TT' was associated with shorter time-to-response (multivariate p < 0.001), time-to-complete-response (p = 0.004) and longer time-to-tumor progression (p < 0.001), compared with the genotype 'non-TT'. In patients with albumin <3.5 g/dl, genotype 'TT' was associated with longer overall survival (p = 0.027). Finally, genotype 'TT' correlated with higher cancer-to-noncancer ratios of GALNT14 protein levels, lower cancer-to-noncancer ratios of antiapoptotic cFLIP-S, and a clustered glycosylation pattern in the extracellular domain of death receptor. GALNT14 genotypes were significantly associated with clinical outcomes of transcatheter arterial chemoembolization. The differential status of extrinsic apoptotic signaling between cancerous and non-cancerous tissues might underlie the clinical association.
ELECTROPHORESIS. 2016. Shanshan Li. et al. Nankai University
ABSTRACT: O-linked β-N-acetylglucosamine (O-GlcNAc) is emerging as an essential protein posttranslational modification in a range of organisms. It is involved in various cellular processes such as nutrient sensing, protein degradation, gene expression and is associated with many human diseases. Despite its importance, identifying O-GlcNAcylated proteins is a major challenge in proteomics. Here, using peracetylated N-azidoacetylglucosamine (Ac4 GlcNAz) as a bioorthogonal chemical handle, we described a gel-based mass spectrometry method for the identification of proteins with O-GlcNAc modification in A549 cells.
>In addition, we made a labeling efficiency comparison between two modes of azide-alkyne biothogonal reactions in click chemistry: copper-catalyzed azide-alkyne cycloaddition (CuAAC) with Biotin-Diazo-Alkyne and stain-promoted azide-alkyne cycloaddition (SPAAC) with Biotin-DIBO-Alkyne. After conjugation with click chemistry in vitro and enrichment via streptavidin resin, proteins with O-GlcNAc modification were separated by SDS-PAGE and identified with mass spectrometry. Proteomics data analysis revealed that 229 putative O-GlcNAc modified proteins were identified with Biotin-Diazo-Alkyne conjugated sample and 188 proteins with Biotin-DIBO-Alkyne conjugated sample, among which 114 proteins were overlapping. Interestingly, 74 proteins identified from Biotin-Diazo-Alkyne conjugates and 46 verified proteins from Biotin-DIBO-Alkyne conjugates could be found in the O-GlcNAc modified proteins database dbOGAP (http://cbsb.lombardi.georgetown.edu/hulab/OGAP.html). These results suggested that CuAAC with Biotin-Diazo-Alkyne represented a more powerful method in proteomics with higher protein identification and better accuracy compared to SPAAC. The proteomics credibility was also confirmed by the molecular function and cell component gene ontology (GO). Together, the method we reported here combining metabolic labeling, click chemistry, affinity-based enrichment, SDS-PAGE separation and mass spectrometry, would be adaptable for other post-translationally modified proteins in proteomics. This article is protected by copyright. All rights reserved.
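The comparison between the two click-chemistry modes above rests on simple count arithmetic, which can be sketched as follows. This is an illustrative calculation only, using the numbers reported in the abstract; the function names are hypothetical, and treating the database-matched share as a rough proxy for accuracy is an assumption of this sketch, not a method described by the authors.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Split two identification lists into unique and combined counts."""
    return {
        "only_a": n_a - n_shared,          # found only with reagent A
        "only_b": n_b - n_shared,          # found only with reagent B
        "combined": n_a + n_b - n_shared,  # non-redundant total
    }


def percent_in_database(n_matched, n_identified):
    """Share of identified proteins already listed in a reference database."""
    return 100.0 * n_matched / n_identified


if __name__ == "__main__":
    # Counts from the abstract: 229 (Biotin-Diazo-Alkyne, CuAAC),
    # 188 (Biotin-DIBO-Alkyne, SPAAC), 114 shared.
    print(overlap_summary(229, 188, 114))
    # dbOGAP matches from the abstract: 74 of 229 and 46 of 188.
    print(round(percent_in_database(74, 229), 1))
    print(round(percent_in_database(46, 188), 1))
```

On these figures the CuAAC sample has both the larger number of identifications and the larger share of proteins already present in dbOGAP, which is consistent with the conclusion drawn in the abstract.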
MOLECULAR & CELLULAR PROTEOMICS. 2016. Jonathan B. Olsen. et al. Lilly Research Laboratories
ABSTRACT: The significance of non-histone lysine methylation in cell biology and human disease is an emerging area of research exploration. The development of small molecule inhibitors that selectively and potently target enzymes that catalyze the addition of methyl-groups to lysine residues, such as the protein lysine mono-methyltransferase SMYD2, is an active area of drug discovery. Critical to the accurate assessment of biological function is the ability to identify target enzyme substrates and to define enzyme substrate specificity within the context of the cell.
>Here, using stable isotopic labeling with amino acids in cell culture (SILAC) coupled with immunoaffinity enrichment of mono-methyl-lysine (Kme1) peptides and mass spectrometry, we report a comprehensive, large-scale proteomic study of lysine mono-methylation, comprising a total of 1032 Kme1 sites in esophageal squamous cell carcinoma (ESCC) cells and 1861 Kme1 sites in ESCC cells overexpressing SMYD2. Among these Kme1 sites is a subset of 35 found to be potently down-regulated by both shRNA-mediated knockdown of SMYD2 and LLY-507, a selective small molecule inhibitor of SMYD2. In addition, we report specific protein sequence motifs enriched in Kme1 sites that are directly regulated by endogenous SMYD2 activity, revealing that SMYD2 substrate specificity is more diverse than expected. We further show direct activity of SMYD2 toward BTF3-K2, PDAP1-K126 as well as numerous sites within the repetitive units of two unique and exceptionally large proteins, AHNAK and AHNAK2. Collectively, our findings provide quantitative insights into the cellular activity and substrate recognition of SMYD2 as well as the global landscape and regulation of protein mono-methylation.
Journal of Proteome Research. 2016. Arun Devabhaktuni. et al. Stanford University
ABSTRACT: Dependent on concise, predefined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results.
>Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics data sets and present a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. LADS improves sequencing accuracy on longer peptides relative to that of other algorithms and improves discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.
Nucleic Acids Research. 2015. Christian Trahan. et al. Université de Montréal
ABSTRACT: Proteomic and RNomic approaches have identified many components of different ribonucleoprotein particles (RNPs), yet still little is known about the organization and protein proximities within these heterogeneous and highly dynamic complexes. Here we describe a targeted cross-linking approach, which combines cross-linking from a known anchor site with affinity purification and mass spectrometry (MS) to identify the changing vicinity interactomes along RNP maturation pathways. Our method confines the reaction radius of a heterobifunctional cross-linker to a specific interaction surface, increasing the probability to capture low abundance conformations and transient vicinal interactors too infrequent for identification by traditional cross-linking-MS approaches, and determine protein proximities within RNPs.
> Applying the method to two conserved RNA-associated complexes in Saccharomyces cerevisiae, the mRNA export receptor Mex67:Mtr2 and the pre-ribosomal Nop7 subcomplex, we identified dynamic vicinal interactomes within those complexes and along their changing pathway milieu. Our results therefore show that this method provides a new tool to study the changing spatial organization of heterogeneous dynamic RNP complexes.
Journal of Proteome Research. 2015. Shaohang Xu. et al. BGI-Shenzhen
ABSTRACT: Considering the technical limitations of mass spectrometry in protein identification, the mRNAs bound to ribosomes (RNC-mRNA) are assumed to reflect the mRNAs participating in the translational process. The RNC-mRNA data are reasoned to be useful for appraising the missing proteins. A set of the multiomics data including free-mRNAs, RNC-mRNAs, and proteomes was acquired from three liver cancer cell lines. On the basis of the missing proteins in neXtProt (release 2014-09-19), the bioinformatics analysis was carried out in three phases: (1) finding how many neXtProt missing proteins have or do not have RNA-seq and/or MS/MS evidence, (2) analyzing specific physicochemical and biological properties of the missing proteins that lack both RNA-seq and MS/MS evidence, and (3) analyzing the combined properties of these missing proteins.
> A total of 1501 missing proteins were found by neither RNC-mRNA nor MS/MS in the three liver cancer cell lines. Of these missing proteins, some are expected to show higher hydrophobicity, unsuitability for detection, or sensory functions as properties at the protein level, while some are predicted to have nonexpressing chromatin structures at the corresponding gene level. With further integrated analysis, we could attribute 93% of them (1391/1501) to these causal factors, which result in expression products that are scarcely detected by RNA-seq or MS/MS.
Analytical Chemistry. 2015. Lin He. et al. The Scripps Research Institute
ABSTRACT: Extraction of data from the proprietary RAW files generated by Thermo Fisher mass spectrometers is the primary step for subsequent data analysis. High resolution and high mass accuracy data obtained by state-of-the-art mass spectrometers (e.g., Orbitraps) can significantly improve both peptide/protein identification and quantification. We developed RawConverter, a stand-alone software tool, to improve data extraction on RAW files from high-resolution Thermo Fisher mass spectrometers.
> RawConverter extracts full scan and MSn data from RAW files like its predecessor RawXtract; most importantly, it associates the accurate precursor mass-to-charge (m/z) value with the tandem mass spectrum. RawConverter accepts RAW data generated by either data-dependent acquisition (DDA) or data-independent acquisition (DIA). It generates output into MS1/MS2/MS3, MGF, or mzXML file formats, which fulfills the format requirements for most data identification and quantification tools. Using the tandem mass spectra extracted by RawConverter with corrected m/z values, 32.8%, 27.1%, and 84.1% more peptide spectra matches (PSMs) produce 17.4% (13.0%), 14.4% (11.5%), and 45.7% (36.2%) more peptide (protein) identifications than ProteoWizard, pXtract, and RawXtract, respectively. RawConverter is implemented in C# and is freely accessible at http://fields.scripps.edu/rawconv.
Analytica Chimica Acta. 2015. Yu Liang. et al. Dalian Institute of Chemical Physics
ABSTRACT: Poly(glycidyl methacrylate-co-poly(ethylene glycol) diacrylate) monoliths modified with gold nanoparticles, with advantages of enhanced reactive sites, good hydrophilicity and facile modification, were prepared as the matrix, followed by variable functionalization with cysteine and PNGase F for glycopeptide enrichment and on-line deglycosylation, respectively. With the cysteine-functionalized monolithic column, glycopeptides could be efficiently and selectively enriched with good reproducibility based on hydrophilic interaction chromatography (HILIC). Furthermore, the enrichment was notably achieved in a weakly alkaline environment, with 10 mM NH4HCO3 as the elution buffer, compatible with deglycosylation conditions.
> Therefore, the glycopeptides could be on-line deglycosylated with high efficiency and throughput by directly coupling the PNGase F functionalized monolithic column with the enrichment column during elution without the requirement of buffer exchange and pH adjustment. With this method, within only 70 min of pretreatment, 196 N-linked glycopeptides, corresponding to 122 glycoproteins, could be identified from 5 μg of human plasma with 14 high-abundance proteins removed, and the N-linked glycopeptides accounted for 81% of all identified peptides, achieving, to the best of our knowledge, the highest selectivity of HILIC-based methods. All the results demonstrated the high efficiency, selectivity and throughput of our proposed strategy for large-scale glycoproteome analysis.
MOLECULAR & CELLULAR PROTEOMICS. 2015. Yumiao Han. et al. University of Pennsylvania School of Medicine
ABSTRACT: Protein phosphorylation, one of the most common and important modifications of acute and reversible regulation of protein function, plays a dominant role in almost all cellular processes. These signaling events regulate cellular responses, including proliferation, differentiation, metabolism, survival, and apoptosis. Several studies have been successfully used to identify phosphorylated proteins and dynamic changes in phosphorylation status after stimulation. Nevertheless, it is still rather difficult to elucidate precise complex phosphorylation signaling pathways. In particular, how signal transduction pathways directly communicate from the outer cell surface through cytoplasmic space and then directly into chromatin networks to change the transcriptional and epigenetic landscape remains poorly understood.
> Here, we describe the optimization and comparison of methods based on thiophosphorylation affinity enrichment, which can be utilized to monitor phosphorylation signaling into chromatin by isolation of phosphoprotein-containing nucleosomes, a method we term phosphorylation-specific chromatin affinity purification (PS-ChAP). We utilized this PS-ChAP approach in combination with quantitative proteomics to identify changes in the phosphorylation status of chromatin-bound proteins on nucleosomes following perturbation of transcriptional processes. We also demonstrate that this method can be employed to map phosphoprotein signaling into chromatin-containing nucleosomes by identifying the genes on which those phosphorylated proteins are found via thiophosphate PS-ChAP-qPCR. Thus, our results show that PS-ChAP offers a new strategy for studying cellular signaling and chromatin biology, allowing us to directly and comprehensively investigate phosphorylation signaling into chromatin and to determine whether these pathways are involved in altering gene expression. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD002436.
Plant Soil. 2015. Rong Qin. et al. South China Normal University
ABSTRACT: In the present study, the effects of Cu (2.0 and 8.0 μM) on root growth of Allium cepa var. agrogarum L. were addressed and protein abundance levels were analyzed using proteomics combined with transcriptomics, in order to deepen the understanding of the mechanism of Cu toxicity on plant root systems at the protein level and to provide valuable information for monitoring and forecasting the effects of exposure to Cu under real-scenario conditions.
> Protein extraction; Two-dimensional electrophoresis (2-DE) analysis; Mass spectrometry analysis; Establishment of the in-house database; Restriction enzyme map of the in-house database and protein identification. Root growth was dramatically inhibited after 12 h Cu treatment. By establishing an in-house database and using mass spectrometry analysis, 27 differentially abundant proteins were identified. These 27 proteins were involved in multiple biological processes including defensive response, transcription regulation and protein synthesis, cell wall synthesis, cell cycle and DNA replication, and other important functions. Our results provide new insights at the proteomic level into the Cu-induced responses, defensive responses and toxic effects, and provide new molecular markers of the early events of plant responses to Cu toxicity. Moreover, the establishment of an in-house database provides a big improvement for proteomics research on non-model plants.
Journal of Proteome Research. 2015. David R. Barnidge. et al. Karolinska Institutet
ABSTRACT: In our previous work, we showed that electrospray ionization of intact polyclonal kappa and lambda light chains isolated from normal serum generates two distinct, Gaussian-shaped, molecular mass distributions representing the light-chain repertoire. During the analysis of a large (>100) patient sample set, we noticed a low-intensity molecular mass distribution with a mean of approximately 24 250 Da, roughly 800 Da higher than the typical kappa molecular-mass distribution mean of 23 450 Da. We also observed distinct clones in this region that did not appear to contain any typical post-translational modifications that would account for such a large mass shift.
> To determine the origin of the high molecular mass clones, we performed de novo bottom-up mass spectrometry on a purified IgM monoclonal light chain that had a calculated molecular mass of 24 275.03 Da. The entire sequence of the monoclonal light chain was determined using multienzyme digestion and de novo sequence-alignment software and was found to belong to the germline allele IGKV2-30. The alignment of kappa germline sequences revealed ten IGKV2 and one IGKV4 sequences that contained additional amino acids in their CDR1 region, creating the high-molecular-mass phenotype. We also performed an alignment of lambda germline sequences, which showed additional amino acids in the CDR2 region, and the FR3 region of functional germline sequences that result in a high-molecular-mass phenotype. The work presented here illustrates the ability of mass spectrometry to provide information on the diversity of light-chain molecular mass phenotypes in circulation, which reflects the germline sequences selected by the immunoglobulin-secreting B-cell population.
Scientific Reports. 2015. Veronika Haslbeck. et al. Technische Universität München
ABSTRACT: Protein phosphatase 5 is involved in the regulation of kinases and transcription factors. The dephosphorylation activity is modulated by the molecular chaperone Hsp90, which binds to the TPR-domain of protein phosphatase 5. This interaction is dependent on the C-terminal MEEVD motif of Hsp90. We show that C-terminal Hsp90 fragments differ in their regulation of the phosphatase activity, hinting at a more complex interaction. Also, hydrodynamic parameters from analytical ultracentrifugation and small-angle X-ray scattering data suggest a compact structure for the Hsp90-protein phosphatase 5 complexes.
> Using crosslinking experiments coupled with mass spectrometric analysis and structural modelling we identify sites, which link the middle/C-terminal domain interface of C. elegans Hsp90 to the phosphatase domain of the corresponding kinase. Studying the relevance of the domains of Hsp90 for turnover of native substrates we find that ternary complexes with the glucocorticoid receptor (GR) are cooperatively formed by full-length Hsp90 and PPH-5. Our data suggest that the direct stimulation of the phosphatase activity by C-terminal Hsp90 fragments leads to increased dephosphorylation rates. These are further modulated by the binding of clients to the N-terminal and middle domain of Hsp90 and their presentation to the phosphatase within the phosphatase-Hsp90 complex.
Molecular & Cellular Proteomics. 2015. Matthew M Makowski. et al. Radboud University Nijmegen
ABSTRACT: In recent years, cross-linking mass spectrometry has proven to be a robust and effective method of interrogating macromolecular protein complex topologies at peptide resolution. Traditionally, cross-linking mass spectrometry workflows have utilized homogenous complexes obtained through time-limiting reconstitution, tandem affinity purification, and conventional chromatography workflows. Here, we present cross-linking immunoprecipitation-MS (xIP-MS), a simple, rapid, and efficient method for structurally probing chromatin-associated protein complexes using small volumes of mammalian whole cell lysates, single affinity purification, and on-bead cross-linking followed by LC-MS/MS analysis.
> We first benchmarked xIP-MS using the structurally well-characterized phosphoribosyl pyrophosphate synthetase complex. We then applied xIP-MS to the chromatin-associated cohesin (SMC1A/3), XRCC5/6 (Ku70/86), and MCM complexes, and we provide novel structural and biological insights into their architectures and molecular function. Of note, we use xIP-MS to perform topological studies under cell cycle perturbations, showing that the xIP-MS protocol is sufficiently straightforward and efficient to allow comparative cross-linking experiments. This work, therefore, demonstrates that xIP-MS is a robust, flexible, and widely applicable methodology for interrogating chromatin-associated protein complex architectures.
Scientific Reports. 2015. Zhiqi Gao. et al. Beijing Institute of Biotechnology
ABSTRACT: Anthrax, caused by the pathogenic bacterium Bacillus anthracis, is a zoonosis that causes serious disease and is of significant concern as a biological warfare agent. Validating annotated genes and reannotating misannotated genes are important to understand its biology and mechanisms of pathogenicity. Proteomics studies are, to date, the best method for verifying and improving current annotations. To this end, the proteome of B. anthracis A16R was analyzed via one-dimensional gel electrophoresis followed by liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS).
>In total, we identified 3,712 proteins, including many regulatory and key functional proteins at relatively low abundance, representing the most complete proteome of B. anthracis to date. Interestingly, eight sequencing errors were detected by proteogenomic analysis and corrected by resequencing. More importantly, three unannotated peptide fragments were identified in this study and validated by synthetic peptide mass spectrum mapping and green fluorescent protein fusion experiments. These data not only give a more comprehensive understanding of B. anthracis A16R but also demonstrate the power of proteomics to improve genome annotations and determine true translational elements.
Journal of Proteome Research. 2015. Dhirendra Kumar. et al. CSIR-Institute of Genomics and Integrative Biology
ABSTRACT: The missing human proteome comprises predicted protein-coding genes with no credible protein level evidence detected so far and constitutes ∼18% of the human protein coding genes (neXtProt release 19/9/2014). The missing proteins may be of pharmacological interest as many of these are membrane receptors, thus requiring comprehensive characterization. In the present study, we explored various computational parameters, crucial during protein searches from tandem mass spectrometry (MS) data, for their impact on missing protein identification.
>Variables taken into consideration are differences in search database composition, shared peptides, semitryptic searches, post-translational modifications (PTMs), and transcriptome-guided proteogenomic searches. We used a multialgorithmic approach for protein detection from publicly available mass spectra from recent studies covering diverse human tissues and cell types. Using the aforementioned approaches, we successfully detected 24 missing proteins (22-PE2, 1-PE4, and 1-PE5). Most of these identifications could be attributed to differences in reference proteome databases, exemplifying use of a single standard database for human protein detection from MS data. Our results suggest that search strategies with modified parameters can be rewarding alternatives for extensive profiling of missing proteins. We conclude that using complementary spectral data searches incorporating different parameters like PTMs, against a comprehensive and compact search database, might lead to discoveries of the proteins attributed so far to the missing human proteome.
Journal of Proteomics Research. 2015. Lei Zhang. et al. China University of Geosciences (Wuhan)
ABSTRACT: Androctonus bicolor is one of the most poisonous scorpion species in the world. However, little is known about the venom composition of this scorpion. To better understand the molecular diversity and medical significance of its venom, we systematically analyzed the venom components by combining transcriptomic and proteomic surveys.
>Random sequencing of 1000 clones from a cDNA library prepared from the venom glands of the scorpion revealed that 70% of the total transcripts code for venom peptide precursors. Our efforts led to a discovery of 103 novel putative venom peptides. These peptides include NaTx-like, KTx-like and CaTx-like peptides, putative antimicrobial peptides, defensin-like peptides, BPP-like peptides, BmKa2-like peptides, Kunitz-type toxins and some new-type venom peptides without disulfide bridges, as well as many new-type venom peptides that are cross-linked with one, two, three, five or six disulfide bridges, respectively. We also identified three peptides that are identical to known toxins from scorpions. The venom was also analyzed using a proteomic technique. The presence of a total of 16 different venom peptides was confirmed by LC–MS/MS analysis. The discovery of a wide range of new and new-type venom peptides highlights the unique diversity of the venom peptides from A. bicolor. These data also provide a series of novel templates for the development of therapeutic drugs for treating ion channel-associated diseases and infections caused by antibiotic-resistant pathogens, and offer molecular probes for the exploration of structures and functions of various ion channels.
The Federation of European Biochemical Societies Journal. 2015. Susan L. et al. Radboud University Nijmegen
ABSTRACT: The nucleosome remodeling and deacetylase (NuRD) complex is an evolutionarily conserved chromatin-associated protein complex. Although the subunit composition of the mammalian complex is fairly well characterized, less is known about the stability and dynamics of these interactions. Furthermore, detailed information regarding protein–protein interaction surfaces within the complex is still largely lacking.
>Here, we show that the NuRD complex interacts with a number of substoichiometric zinc finger-containing proteins. Some of these interactions are salt-sensitive (ZNF512B and SALL4), whereas others (ZMYND8) are not. The stoichiometry of the core subunits is not affected by high salt concentrations, indicating that the core complex is stabilized by hydrophobic interactions. Interestingly, the RBBP4 and RBBP7 proteins are sensitive to high nonionic detergent concentrations during affinity purification. In a subunit exchange assay with stable isotope labeling by amino acids in cell culture (SILAC)-treated nuclear extracts, RBBP4 and RBBP7 were identified as dynamic core subunits of the NuRD complex, consistent with their proposed role as histone chaperones. Finally, using cross-linking MS, we have uncovered novel features of NuRD molecular architecture that complement our affinity purification-MS/MS data. Altogether, these findings extend our understanding of MBD3–NuRD structure and stability.
Journal of Proteome Research. 2015. Yao Zhang. et al. Beijing Institute of Genomics
ABSTRACT: Investigations of missing proteins (MPs) are being endorsed by many bioanalytical strategies. We proposed that proteogenomics of testis tissue was a feasible approach to identify more MPs because testis tissues have higher gene expression levels. Here we combined proteomics and transcriptomics to survey gene expression in human testis tissues from three post-mortem individuals.
>Proteins were extracted and separated with glycine- and tricine-SDS-PAGE. A total of 9597 protein groups were identified; of these, 166 protein groups were listed as MPs, including 138 groups (83.1%) with transcriptional evidence. A total of 2948 proteins are designated as MPs, and 5.6% of these were identified in this study. The high incidence of MPs in testis tissue indicates that this is a rich resource for MPs. Functional category analysis revealed that the biological processes that testis MPs are mainly involved in are sexual reproduction and spermatogenesis. Some of the MPs are potentially involved in tumorigenesis in other tissues. Therefore, this proteogenomics analysis of individual testis tissues provides convincing evidence of the discovery of MPs. All mass spectrometry data from this study have been deposited in the ProteomeXchange (data set identifier PXD002179).
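The two percentages quoted above follow directly from the reported counts. The following is a minimal sketch and not part of the published abstract: the helper name `percent` is hypothetical, and using the 166 MP protein groups as the numerator for the 5.6% figure is an assumption made here to reproduce the quoted value.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts quoted in the abstract.
MP_GROUPS_IDENTIFIED = 166      # protein groups listed as missing proteins (MPs)
MP_GROUPS_WITH_RNA = 138        # MP groups with transcriptional evidence
MPS_DESIGNATED_TOTAL = 2948     # proteins designated as MPs overall

# Share of identified MP groups supported by transcript evidence.
transcript_support_pct = percent(MP_GROUPS_WITH_RNA, MP_GROUPS_IDENTIFIED)

# Share of all designated MPs identified in the study (assumed numerator: 166).
identified_share_pct = percent(MP_GROUPS_IDENTIFIED, MPS_DESIGNATED_TOTAL)

print(transcript_support_pct, identified_share_pct)
```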
NATURE STRUCTURAL & MOLECULAR BIOLOGY. 2015. Inessa De. et al. Max Planck Institute
ABSTRACT: Aquarius is a multifunctional putative RNA helicase that binds precursor-mRNA introns at a defined position. Here we report the crystal structure of human Aquarius, revealing a central RNA helicase core and several unique accessory domains, including an ARM-repeat domain. We show that Aquarius is integrated into spliceosomes as part of a pentameric intron-binding complex (IBC) that, together with the ARM domain, cross-links to U2 snRNP proteins within activated spliceosomes; this suggests that the latter aid in positioning Aquarius on the intron.
>Aquarius's ARM domain is essential for IBC formation, thus indicating that it has a key protein–protein scaffolding role. Finally, we provide evidence that Aquarius is required for efficient precursor-mRNA splicing in vitro. Our findings highlight the remarkable structural adaptations of a helicase to achieve position-specific recruitment to a ribonucleoprotein complex and reveal a new building block of the human spliceosome.
Science. 2015. Yigong Shi. et al. Tsinghua University
ABSTRACT: Splicing of precursor messenger RNA (pre-mRNA) in yeast is executed by the spliceosome, which consists of five small nuclear ribonucleoproteins (snRNPs), NTC (nineteen complex), NTC-related proteins (NTR), and a number of associated enzymes and cofactors. Here, we report the three-dimensional structure of a Schizosaccharomyces pombe spliceosome at 3.6-angstrom resolution, revealed by means of single-particle cryogenic electron microscopy. This spliceosome contains U2 and U5 snRNPs, NTC, NTR, U6 small nuclear RNA, and an RNA intron lariat.
>The atomic model includes 10,574 amino acids from 37 proteins and four RNA molecules, with a combined molecular mass of approximately 1.3 megadaltons. Spp42 (Prp8 in Saccharomyces cerevisiae), the key protein component of the U5 snRNP, forms a central scaffold and anchors the catalytic center. Both the morphology and the placement of protein components appear to have evolved to facilitate the dynamic process of pre-mRNA splicing. Our near-atomic-resolution structure of a central spliceosome provides a molecular framework for mechanistic understanding of pre-mRNA splicing.
Journal of Proteome Research. 2015. Na Su. et al. Beijing Proteome Research Center
ABSTRACT: As part of the Chromosome-Centric Human Proteome Project (C-HPP) mission, laboratories all over the world have tried to map all of the missing proteins (MPs) since 2012. On the basis of the first and second Chinese Chromosome Proteome Database (CCPD 1.0 and 2.0) studies, we developed systematic enrichment strategies to identify MPs that fell into four classes: (1) low molecular weight (LMW) proteins, (2) membrane proteins, (3) proteins that contained various post-translational modifications (PTMs), and (4) nucleic acid-associated proteins. Of 8845 proteins identified in 7 data sets, 79 proteins were classified as MPs.
Among data sets derived from different enrichment strategies, data sets for LMW and PTM yielded the most novel MPs. In addition, we found that some MPs were identified in multiple data sets, which implied that tandem enrichment methods might improve the ability to identify MPs. Moreover, low expression at the transcription level was the major cause of the “missing” status of these MPs; however, MPs with higher expression levels also evaded identification, most likely due to other characteristics such as LMW, high hydrophobicity and PTM. By combining a stringent manual check of the MS2 spectra with peptide synthesis verification, we confirmed 30 MPs (neXtProt PE2 ∼ PE4) and 6 potential MPs (neXtProt PE5) with authentic MS evidence. By integrating our large-scale data sets of CCPD 2.0, the number of identified proteins has increased considerably beyond simulation saturation. Here, we show that special enrichment strategies can break through the data saturation bottleneck, which could increase the efficiency of MP identification in future C-HPP studies. All 7 data sets have been uploaded to ProteomeXchange with the identifier PXD002255.
Scientific Reports. 2015. Yejing Weng. et al. Dalian Institute of Chemical Physics
ABSTRACT: Due to the important roles of N-glycoproteins in various biological processes, global N-glycoproteome analysis has received much attention. However, with current strategies for N-glycoproteome profiling, peptides with glycosylated Asn at the N-terminus (PGANs), generated by protease digestion, can hardly be identified, due to the poor deglycosylation capacity of enzymes. Yet, theoretically, PGANs account for 10% of N-glycopeptides in typical tryptic digests.
Therefore, in this study, we developed a novel strategy to identify PGANs by releasing N-glycans through N-terminal site-selective succinylation-assisted enzymatic deglycosylation. The obtained PGAN information is beneficial not only for achieving deep-coverage analysis of glycoproteomes but also for discovering new biological functions of such modification.
Methods. 2015. Roland F. Rivera-Santiago. et al. The Wistar Institute
ABSTRACT: Structural mass spectrometry (MS) is a field with growing applicability for addressing complex biophysical questions regarding proteins and protein complexes. One of the major structural MS approaches involves the use of chemical cross-linking coupled with MS analysis (CX-MS) to identify proximal sites within macromolecules. Identified cross-linked sites can be used to probe novel protein–protein interactions or the derived distance constraints can be used to verify and refine molecular models.
This review focuses on recent advances of “zero-length” cross-linking. Zero-length cross-linking reagents do not add any atoms to the cross-linked species due to the lack of a spacer arm. This provides a major advantage in the form of providing more precise distance constraints as the cross-linkable groups must be within salt bridge distances in order to react. However, identification of cross-linked peptides using these reagents presents unique challenges. We discuss recent efforts by our group to minimize these challenges by using multiple cycles of LC–MS/MS analysis and software specifically developed and optimized for identification of zero-length cross-linked peptides. Representative data utilizing our current protocol are presented and discussed.
Talanta. 2015. Hao Jiang. et al. Dalian Institute of Chemical Physics
ABSTRACT: In this study, a novel kind of amide functionalized hydrophilic monolith was synthesized by the in situ photo-polymerization of N-vinyl-2-pyrrolidinone (NVP), acrylamide (AM), and N,N'-methylenebisacrylamide (MBA) in a UV transparent capillary, and successfully applied for hydrophilic interaction chromatography (HILIC) based enrichment of N-linked glycopeptides. With 2 μg of the tryptic digests of IgG as the sample, after enrichment, 18 glycopeptides could be identified by MALDI-TOF/TOF MS analysis.
Furthermore, with the mixture of BSA and IgG digests (10,000:1, m/m) as the sample, 6 N-linked glycopeptides were unambiguously identified after enrichment, indicating the high selectivity and good specificity of such material. Moreover, such a monolithic capillary column was also applied for the N-glycosylation sites profiling of 6 μg protein digests from HeLa cells and 1 μL human serum. In total, 530 and 262 unique N-glycosylated peptides were identified, respectively, corresponding to 282 and 124 N-glycoproteins, demonstrating its great potential for the large scale glycoproteomics analysis.
Chinese Chemical Letters. 2015. Simin Xia. et al. Dalian Institute of Chemical Physics
ABSTRACT: In this work, a novel kind of particulate capillary precolumns with double-end polymer monolithic frits has been developed. Firstly, the polymer monolithic frit at one end was prepared via photo-initiated polymerization of a mixture of lauryl methacrylate and ethyleneglycol dimethacrylate with 1-propanol and 1,4-butanediol as porogens and 2,2-dimethoxy-2-phenylacetophenone as a photo-initiator in UV transparent coating capillary (100 μm i.d.). Subsequently, C18 particles (5 μm, 100 Å) were packed into the capillary, and sealed with the polymer monolithic frit at another end.
To prevent the reaction of monomers and C18 particles, the packed C18 particles were masked during UV exposure. The loading capacity of such a precolumn was determined to be about 9 μg by frontal analysis with a synthetic peptide APGDRIYVHPF as a model sample. Furthermore, two parallel precolumns were incorporated into a two-dimensional nano-liquid chromatography (2D nano-LC) system with dual capillary trap columns for peptide trapping and concentration. Compared to 2D nano-LC system with a single trap column, such two dimensional separations could be operated simultaneously to improve the analysis throughput. All these results demonstrated that such capillary precolumns with double frits would be promising for high-throughput proteome analysis.
Journal of Proteome Research. 2015. Zhijing Tan. et al. University of Michigan
ABSTRACT: Glycosylation has significant effects on protein function and cell metastasis, which are important in cancer progression. It is of great interest to identify site-specific glycosylation in search of potential cancer biomarkers. However, the abundance of glycopeptides is low compared to that of nonglycopeptides after trypsin digestion of serum samples, and the mass spectrometric signals of glycopeptides are often masked by coeluting nonglycopeptides due to low ionization efficiency.
Selective enrichment of glycopeptides from complex serum samples is essential for mass spectrometry (MS)-based analysis. Herein, a strategy has been optimized using LCA enrichment to improve the identification of core-fucosylation (CF) sites in serum of pancreatic cancer patients. The optimized strategy was then applied to analyze CF glycopeptide sites in 13 sets of serum samples from pancreatic cancer, chronic pancreatitis, healthy controls, and a standard reference. In total, 630 core-fucosylation sites were identified from 322 CF proteins in pancreatic cancer patient serum using an Orbitrap Elite mass spectrometer. Further data analysis revealed that 8 CF peptides exhibited a significant difference between pancreatic cancer and other controls, which may be potential diagnostic biomarkers for pancreatic cancer.
Journal of Proteome Research. 2015. Michael R. Hoopmann. et al. Washington University
ABSTRACT: Protein chemical cross-linking and mass spectrometry enable the analysis of protein–protein interactions and protein topologies; however, complicated cross-linked peptide spectra require specialized algorithms to identify interacting sites. The Kojak cross-linking software application is a new, efficient approach to identify cross-linked peptides, enabling large-scale analysis of protein–protein interactions by chemical cross-linking techniques. The algorithm integrates spectral processing and scoring schemes adopted from traditional database search algorithms and can identify cross-linked peptides using many different chemical cross-linkers with or without heavy isotope labels.
Kojak was used to analyze both novel and existing data sets and was compared to existing cross-linking algorithms. The algorithm provided more cross-link identifications than existing algorithms and, equally important, delivered the results in a fraction of the computational time. The Kojak algorithm is open-source, cross-platform, and freely available. This software provides both existing and new cross-linking researchers alike an effective way to derive additional cross-link identifications from new or existing data sets. For new users, it provides a simple analytical resource resulting in more cross-link identifications than other methods.
Talanta. 2015. Simin Xia. et al. Dalian Institute of Chemical Physics
ABSTRACT: In this work, a novel integrated sample preparation device for SDS-assisted proteome analysis was developed, in which proteins dissolved in 4% (w/v) SDS were first diluted with 50% methanol, SDS was then removed online by a hollow fiber membrane interface (HFMI) with 50 mM ammonium bicarbonate (pH 8.0) as the exchange buffer, and the proteins were finally digested by an immobilized enzyme reactor (IMER). To evaluate the performance of the integrated device, bovine serum albumin (BSA) dissolved in 4% (w/v) SDS was analyzed as a model sample. With HFMI as an interface for SDS removal, BSA was identified with a sequence coverage of 61.0±1.0% (n=3), similar to that obtained by direct analysis of BSA digests without SDS (60.3±1.0%, n=3).
However, without SDS removal by HFMI, BSA could not be digested by the IMER and no peptides could be detected. In addition, the integrated device was applied to the analysis of SDS-extracted proteins from rat brain. Compared with filter-aided sample preparation (FASP), the numbers of identified protein groups and unique peptides increased by 12% and 39%, respectively, and the sample pretreatment time was shortened from 24 h to 4 h. All these results demonstrated that such an integrated sample preparation device would provide an alternative tool for SDS-assisted proteome analysis.
Journal of Proteome Research. 2015. Yang Chen. et al. Jinan University
ABSTRACT: Finding protein evidence (PE) for protein coding genes is a primary task of the Phase I Chromosome-Centric Human Proteome Project (C-HPP). Currently, there are 2948 PE level 2–4 coding genes per neXtProt, which are deemed missing proteins in the human proteome. As most samples prepared and analyzed in the C-HPP framework were focusing on detergent soluble proteins, we posit that as a natural composition the cytoplasmic detergent-insoluble proteins (DIPs) represent a source of finding missing proteins.
We optimized a workflow and separated cytoplasmic DIPs from three human lung and three human hepatoma cell lines via differential speed centrifugation. We verified that the detergent-soluble proteins (DSPs) could be sufficiently depleted and the cytoplasmic DIP isolation was partially reproducible with Spearman r > 0.70 according to two independent SILAC MS experiments. Through label-free MS, we identified 4524 and 4156 DIPs from lung and liver cells, respectively. Among them, a total of 23 missing proteins (22 PE2 and 1 PE4) were identified by MS, and 18 of them had translation evidence; in addition, six PE5 proteins were identified by MS, three with translation evidence. We showed that cytoplasmic DIPs were not an enrichment of transmembrane proteins and were chromosome-, cell type-, and tissue-specific. Furthermore, we demonstrated that DIPs were distinct from DSPs in terms of structural and physical–chemical features. In conclusion, we have found 23 missing proteins and 6 PE5 proteins from the cytoplasmic insoluble proteome that is biologically and physical-chemically different from the soluble proteome, suggesting that cytoplasmic DIPs carry comprehensive and valuable information for finding PE of missing proteins. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001694.
Proteomics. 2015. Jingwen Jiang. et al. Sichuan University
ABSTRACT: Cancer cells are characterized by higher levels of intracellular reactive oxygen species (ROS) due to metabolic aberrations. ROS are widely accepted as second messengers triggering pivotal signaling pathways involved in the process of cell metabolism, cell cycle, apoptosis, and autophagy. However, the underlying cellular mechanisms remain largely unknown. Recently, accumulating evidence has demonstrated that ROS initiate redox signaling through direct oxidative modification of the cysteines of key redox-sensitive proteins (termed redox sensors).
Uncovering the functional changes underlying redox regulation of redox sensors is urgently required, and the role of different redox sensors in distinct disease states still remains to be identified. To assist this, redox proteomics has been developed for the high-throughput screening of redox sensors, which will benefit the development of novel therapeutic strategies for cancer treatment. Highlighted here are recent advances in redox proteomics approaches and their applications in identifying redox sensors involved in tumor development.
Journal of Proteome Research. 2015. Lijuan Yang. et al. Jinan University
ABSTRACT: The chromosome-centric human proteome project (C-HPP) has made great progress in finding protein evidence (PE) for missing proteins (PE2–4 proteins as defined by neXtProt), which is becoming an increasingly challenging task. As a majority of samples tested in this field were from adult tissues/cells, the developmental stage specific or relevant proteins could be missed due to biological source availability. We posit that epigenetic interventions may help to partially bypass such a limitation by stimulating the expression of the “silenced” genes in adult cells, leading to the increased chance of finding missing proteins. In this study, we established in vitro human cell models to modify the histone acetylation, demethylation, and methylation with near physiological conditions.
With mRNA-seq analysis, we found that histone modifications resulted in overall increases of expressed genes, evenly distributed across different chromosomes. We identified 64 PE2–4 and six PE5 proteins by MaxQuant (FDR < 1% at both protein and peptide levels) and 44 PE2–4 and 7 PE5 proteins by Mascot (FDR < 1% at peptide level) searches, respectively. However, only 24 PE2–4 and five PE5 proteins in Mascot, and 12 PE2–4 and one PE5 proteins in MaxQuant searches, respectively, passed our stringent manual spectrum inspection. Collectively, 27 PE2–4 and five PE5 proteins were identified from the epigenetically modified cells; among them, 19 PE2–4 and three PE5 proteins passed FDR < 1% at both peptide and protein levels. Gene ontology analyses revealed that the PE2–4 proteins were significantly involved in development and spermatogenesis, although their chemical–physical features had no statistical difference from the background. In addition, we presented an example of suspicious PE5 peptide spectrum matched with unusual AA substitutions related to post-translational modification. In conclusion, the epigenetically manipulated cell models should be a useful tool for finding missing proteins in C-HPP. The mass spectrometry data have been deposited to the iProx database (accession number: IPX00020200).
International Journal of Advanced Computer Technology. 2015. Simin Zhu. et al. Yunnan Minzu University
ABSTRACT: In high-throughput proteomics research based on tandem mass spectrometry, de novo sequencing provides a method to interpret MS/MS data without the help of a sequence database and to discover new organisms. In this paper, we systematically evaluated and compared the capability of mainstream de novo sequencing software using test data sets that had been correctly identified by Mascot and Sequest, in order to find the optimal de novo sequencing software for protein identification.
The Journal of Biological Chemistry. 2015. Mart Reimund. et al. Tallinn University of Technology
ABSTRACT: GPIHBP1 is an endothelial membrane protein that transports lipoprotein lipase (LPL) from the subendothelial space to the luminal side of the capillary endothelium. Here, we provide evidence that two regions of GPIHBP1, the acidic N-terminal domain and the central Ly6 domain, interact with LPL as two distinct binding sites. This conclusion is based on comparative binding studies performed with a peptide corresponding to the N-terminal domain of GPIHBP1, the Ly6 domain of GPIHBP1, wild type GPIHBP1, and the Ly6 domain mutant GPIHBP1 Q114P.
Although LPL and the N-terminal domain formed a tight but short lived complex, characterized by fast on- and off-rates, the complex between LPL and the Ly6 domain formed more slowly and persisted for a longer time. Unlike the interaction of LPL with the Ly6 domain, the interaction of LPL with the N-terminal domain was significantly weakened by salt. The Q114P mutant bound LPL similarly to the N-terminal domain of GPIHBP1. Heparin dissociated LPL from the N-terminal domain, and partially from wild type GPIHBP1, but was unable to elute the enzyme from the Ly6 domain. When LPL was in complex with the acidic peptide corresponding to the N-terminal domain of GPIHBP1, the enzyme retained its affinity for the Ly6 domain. Furthermore, LPL that was bound to the N-terminal domain interacted with lipoproteins, whereas LPL bound to the Ly6 domain did not. In summary, our data suggest that the two domains of GPIHBP1 interact independently with LPL and that the functionality of LPL depends on its localization on GPIHBP1.
Molecular & Cellular Proteomics. 2015. Zuo-Fei Yuan. et al. University of Pennsylvania
ABSTRACT: Histone post-translational modifications (PTMs) contribute to chromatin function through their chemical properties which influence chromatin structure, and their ability to recruit chromatin interacting proteins. Nanoflow liquid chromatography coupled with high resolution tandem mass spectrometry (nanoLC-MS/MS) has emerged as the most suitable technology for global histone modification analysis due to the high sensitivity and the high mass accuracy that provide confident identification.
However, the histone nanoLC-MS/MS data analysis is even more challenging due to the large number and variety of isobaric histone peptides, and the high dynamic range of histone peptide abundances. Here, we introduce EpiProfile, a software tool that discriminates isobaric histone peptides using the distinguishing fragment ions in their tandem mass spectra and extracts the chromatographic area under the curve using previous knowledge about peptide retention time. The accuracy of EpiProfile was evaluated by analysis of mixtures containing different ratios of synthetic histone peptides. In addition to label-free quantification of histone peptides, EpiProfile is flexible, and can quantify different types of isotopically labeled histone peptides. EpiProfile is much more convenient when compared to manual quantification, filling the need of an automatic and freely available tool to quantify labeled and non-labeled modified histone peptides. In summary, EpiProfile is a valuable nanoLC-MS/MS based quantification tool for histone modifications, which can also be adapted to analyze non-histone protein samples.
Proteomics. 2015. Simone Sidoli. et al. University of Pennsylvania
ABSTRACT: MS-based proteomics has become the most utilized tool to characterize histone PTMs. Since histones are highly enriched in lysine and arginine residues, lysine derivatization has been developed to prevent the generation of short peptides (<6 residues) during trypsin digestion. One of the most adopted protocols applies propionic anhydride for derivatization. However, the propionyl group is not sufficiently hydrophobic to fully retain the shortest histone peptides in RP LC, and such procedure also hampers the discovery of natural propionylation events.
In this work we tested 12 commercially available anhydrides, selected based on their safety and hydrophobicity. Performance was evaluated in terms of yield of the reaction, MS/MS fragmentation efficiency, and drift in retention time using the following samples: (i) a synthetic unmodified histone H3 tail, (ii) synthetic modified histone peptides, and (iii) a histone extract from cell lysate. Results highlighted that seven of the selected anhydrides increased peptide retention time as compared to propionic, and several anhydrides such as benzoic and valeric led to high MS/MS spectra quality. However, propionic anhydride derivatization remained, in our opinion, the best protocol to achieve high MS sensitivity and even ionization efficiency among the analyzed peptides.
Biochimica et Biophysica Acta. 2015. Ruixue Sun. et al. Institute of Botany, Chinese Academy of Sciences
ABSTRACT: Minor antennae of photosystem (PS) II, located between the PSII core complex and the major antenna (LHCII), are important components for the structural and functional integrity of PSII supercomplexes. In order to study the functional significance of minor antennae in the energetic coupling between LHCII and the PSII core, characteristics of PSII–LHCII proteoliposomes, with or without minor antennae, were investigated.
Two types of PSII preparations containing different antenna compositions were isolated from pea: 1) the PSII preparation composed of the PSII core complex, all of the minor antennae, and a small amount of major antennae (MCC); and 2) the purified PSII dimeric core complexes without peripheral antennae (CC). They were incorporated, together with LHCII, into liposomes composed of thylakoid membrane lipids. The spectroscopic and functional characteristics were measured. 77 K fluorescence emission spectra revealed an increased spectral weight of fluorescence from PSII reaction center in the CC–LHCII proteoliposomes, implying energetic coupling between LHCII and CC in the proteoliposomes lacking minor antennae. This result was further confirmed by chlorophyll a fluorescence induction kinetics. The incorporation of LHCII together with CC markedly increased the antenna cross-section of the PSII core complex. The 2,6-dichlorophenolindophenol photoreduction measurement implied that the lack of minor antennae in PSII supercomplexes did not block the energy transfer from LHCII to the PSII core complex. In conclusion, it is possible that, in liposomes, LHCII transfers energy directly to the PSII core complex in the absence of minor antennae.
Database: The Journal of Biological Databases and Curation. 2015. Yaohua Yang. et al. Institute of Hydrobiology, Chinese Academy of Sciences
ABSTRACT: Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, as an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported.
However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, Blast is included for sequence-based similarity searching and Cluster 3.0, as well as the R hclust function is provided for cluster analyses, to increase CyanOmics's usefulness. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further understanding of the transcriptional patterns, and proteomic profiling of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects. Database URL: http://lag.ihb.ac.cn/cyanomics.
Analytical Chemistry. 2015. Cheng Ma. et al. Georgia State University
ABSTRACT: N-glycosylation is one of the most prevalent protein post-translational modifications (PTMs) and is involved in several biological processes. Alteration of N-glycosylation is associated with cellular malfunction and development of disease. Thus, investigation of protein N-glycosylation is crucial for diagnosis and treatment of disease. Currently, deglycosylation with peptide N-glycosidase F is the most commonly used technique in N-glycosylation analysis.
Additionally, a common error in N-glycosylation site identification, resulting from protein chemical deamidation, has largely been ignored. In this study, we developed a convenient and precise approach for mapping N-glycosylation sites utilizing optimized TFA hydrolysis, ZIC-HILIC enrichment, and characteristic ions of N-acetylglucosamine (GlcNAc) from higher-energy collisional dissociation (HCD) fragmentation. Using this method, we identified a total of 257 N-glycosylation sites and 144 N-glycoproteins from healthy human serum. Compared to deglycosylation with endoglycosidase, this strategy is more convenient and efficient for large-scale N-glycosylation site identification and provides an important alternative approach for the study of N-glycoprotein function.
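To make the deamidation problem concrete: chemical deamidation converts Asn to Asp, so a deamidated Asn on its own does not prove prior glycosylation. The sketch below shows a generic sanity check, not the method of this abstract: it keeps only candidate sites that fall in the consensus N-X-S/T sequon (X not Pro). The function names and the example sequence are hypothetical.

```python
import re

# Consensus N-glycosylation sequon: Asn, any residue except Pro, then Ser/Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(sequence):
    """Return 1-based positions of Asn residues inside an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


def filter_candidate_sites(sequence, deamidated_positions):
    """Keep only deamidated Asn positions (1-based) that lie in a sequon.

    Sites outside a sequon are more plausibly chemical deamidation
    artifacts than former glycosylation sites.
    """
    allowed = set(sequon_positions(sequence))
    return [pos for pos in deamidated_positions if pos in allowed]


if __name__ == "__main__":
    seq = "MNGTANPSLNKSC"  # hypothetical sequence
    print(sequon_positions(seq))
    print(filter_candidate_sites(seq, [2, 6, 10]))
```

A sequon filter reduces false positives but cannot remove them entirely, since Asn residues inside a sequon can also deamidate chemically.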
ABSTRACT: Trypsin has traditionally been used for enzymatic digestion during sample preparation in shotgun proteomics. The stringent specificity of trypsin is essential for accurate protein identification and quantification. But nonspecific trypsin cleavages are often observed in LC-MS/MS-based shotgun proteomics. To explore the extent of nonspecific trypsin cleavages, a series of biological systems including a standard protein mixture, Saccharomyces cerevisiae, human serum, human cancer cell lines and mouse brain were examined.
We found that nonspecific trypsin cleavages commonly occurred in various trypsin digested samples with high frequency. To control these nonspecific trypsin cleavages, we optimized fundamental parameters during sample preparation with mouse brain homogenates. These parameters included denaturing agents and protein storage time, trypsin type, enzyme-to-substrate ratio, as well as protein concentration during digestion. The optimized experimental conditions significantly decreased the ratio of partially tryptic peptides in total identifications from 28.4% to 2.8%. Furthermore, the optimized digestion protocol was applied to the study of N-glycoproteomics, and the proportions of partially tryptic peptides in enriched mixtures were also sharply reduced. Our work demonstrates the importance of controlling nonspecific trypsin cleavages in both shotgun proteomics and glycoproteomics and provides a better understanding and standardization for routine proteomics sample treatment.
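The proportion of partially tryptic peptides reported above can be computed once each identified peptide is classified by how many of its termini match trypsin specificity. The following is a minimal sketch under simplifying assumptions: cleavage is taken to occur after K or R with no proline exception, a peptide is located by its first occurrence in the protein, and "partially tryptic" means exactly one conforming terminus. The names and sequences are hypothetical.

```python
def tryptic_termini(protein, peptide):
    """Count peptide termini (0, 1 or 2) consistent with trypsin cleavage.

    Assumptions: trypsin cleaves C-terminal to K or R (no proline rule);
    protein N- and C-termini count as valid; first occurrence is used.
    """
    start = protein.find(peptide)
    if start < 0:
        raise ValueError(f"{peptide!r} not found in protein")
    end = start + len(peptide)
    n_term_ok = start == 0 or protein[start - 1] in "KR"
    c_term_ok = end == len(protein) or peptide[-1] in "KR"
    return int(n_term_ok) + int(c_term_ok)


def partially_tryptic_fraction(protein, peptides):
    """Fraction of peptides with exactly one tryptic terminus."""
    counts = [tryptic_termini(protein, p) for p in peptides]
    return sum(1 for c in counts if c == 1) / len(counts)


if __name__ == "__main__":
    protein = "MKTAYIAKQRQISFVK"  # hypothetical sequence
    peptides = ["TAYIAK", "AYIAK", "QISFVK", "YIA"]
    for p in peptides:
        print(p, tryptic_termini(protein, p))
    print(partially_tryptic_fraction(protein, peptides))
```

In practice search engines report this classification directly; the sketch only shows what the reported ratio measures.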
Nature Communications. 2015. Linlin Yang. et al. Shanghai Institute of Materia Medica, Chinese Academy of Sciences
ABSTRACT: Class B G protein-coupled receptors are composed of an extracellular domain (ECD) and a seven-transmembrane (7TM) domain, and their signalling is regulated by peptide hormones. Using a hybrid structural biology approach together with the ECD and 7TM domain crystal structures of the glucagon receptor (GCGR), we examine the relationship between full-length receptor conformation and peptide ligand binding.
Molecular dynamics (MD) and disulfide crosslinking studies suggest that apo-GCGR can adopt both an open and closed conformation associated with extensive contacts between the ECD and 7TM domain. The electron microscopy (EM) map of the full-length GCGR shows how a monoclonal antibody stabilizes the ECD and 7TM domain in an elongated conformation. Hydrogen/deuterium exchange (HDX) studies and MD simulations indicate that an open conformation is also stabilized by peptide ligand binding. The combined studies reveal the open/closed states of GCGR and suggest that glucagon binds to GCGR by a conformational selection mechanism.
Protein Science. 2015. Nha-Thi Nguyen-Huynh. et al. Université de Strasbourg
ABSTRACT: Understanding how proteins interact with each other to form transient or stable protein complexes is a key aspect of structural biology. In this study, we combined chemical cross-linking with mass spectrometry to determine the binding stoichiometry and map the protein-protein interaction network of a human SAGA HAT subcomplex. MALDI-MS equipped with high mass detection was used to follow the cross-linking reaction using bis[sulfosuccinimidyl] suberate (BS3) and confirm the heterotetrameric stoichiometry of the specific stabilized subcomplex. Cross-linking with isotopically labeled BS3 d0-d4 followed by trypsin digestion allowed the identification of intra- and intercross-linked peptides using two dedicated search engines: pLink and xQuest.
The identified interlinked peptides suggest a strong network of interaction between the GCN5, ADA2B and ADA3 subunits; SGF29 interacts with GCN5 and ADA3 but not with ADA2B. These restraint data were combined with molecular modeling, and a low-resolution interaction model for the human SAGA HAT subcomplex could be proposed, illustrating the potential of an integrative strategy using cross-linking and mass spectrometry for addressing the structural architecture of multiprotein complexes.
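Isotopically coded cross-linkers such as BS3 d0-d4 help because every cross-linked species appears as a light/heavy doublet separated by the mass of four deuterium-for-hydrogen substitutions. The sketch below shows one simple way such doublets could be flagged in a list of precursor m/z values; it is an illustration, not the algorithm used by pLink or xQuest, and the tolerance and peak values are hypothetical.

```python
# Mass difference between the d4 and d0 linker: four H replaced by D.
D4_SHIFT = 4 * (2.014102 - 1.007825)  # about 4.0251 Da


def find_isotope_doublets(mz_values, charge, tol=0.01):
    """Return (light, heavy) m/z pairs spaced by the d0/d4 shift.

    mz_values: precursor m/z values assumed to share the given charge.
    tol: absolute m/z tolerance on the expected spacing (an assumption).
    """
    spacing = D4_SHIFT / charge
    peaks = sorted(mz_values)
    pairs = []
    for i, light in enumerate(peaks):
        for heavy in peaks[i + 1:]:
            diff = heavy - light
            if diff > spacing + tol:
                break  # peaks are sorted, so later ones are even farther
            if abs(diff - spacing) <= tol:
                pairs.append((light, heavy))
    return pairs


if __name__ == "__main__":
    # Hypothetical 3+ precursors; only the first two form a doublet.
    peaks = [500.0, 501.3417, 650.2, 652.0]
    print(find_isotope_doublets(peaks, charge=3))
```

Restricting the database search to precursors that form such doublets narrows the candidates to species that actually contain the cross-linker.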
PLOS One. 2015. Izabel Moraes. et al. Universidade de São Paulo
ABSTRACT: Histones are the main structural components of the nucleosome, hence targets of many regulatory proteins that mediate processes involving changes in chromatin. The functional outcome of many pathways is "written" in the histones in the form of post-translational modifications that determine the final gene expression readout. As a result, modifications, alone or in combination, are important determinants of chromatin states. Histone modifications are accomplished by the addition of different chemical groups such as methyl, acetyl and phosphate.
Thus, identifying and characterizing these modifications and the proteins related to them is the initial step to understanding the mechanisms of gene regulation and in the future may even provide tools for breeding programs. Several studies over the past years have contributed to increase our knowledge of epigenetic gene regulation in model organisms like Arabidopsis, yet this field remains relatively unexplored in crops. In this study we identified and initially characterized histones H3 and H4 in the monocot crop sugarcane. We discovered a number of histone genes by searching the sugarcane ESTs database. The proteins encoded correspond to canonical histones, and their variants. We also purified bulk histones and used them to map post-translational modifications in the histones H3 and H4 using mass spectrometry. Several modifications conserved in other plants, and also novel modified residues, were identified. In particular, we report O-acetylation of serine, threonine and tyrosine, a recently identified modification conserved in several eukaryotes. Additionally, the sub-nuclear localization of some well-studied modifications (i.e., H3K4me3, H3K9me2, H3K27me3, H3K9ac, H3T3ph) is described and compared to other plant species. To our knowledge, this is the first report of histones H3 and H4 as well as their post-translational modifications in sugarcane, and will provide a starting point for the study of chromatin regulation in this crop.
Proteomics. 2015. Shen Zhang. et al. Dalian Institute of Chemical Physics
ABSTRACT: The isobaric peptide termini labeling (IPTL) method is a promising strategy in quantitative proteomics for its high accuracy, while the increased complexity of MS2 spectra originating from the paired b, y ions has an adverse effect on the identification and the coverage of quantification. Here, a paired ions scoring algorithm (PISA) based on Morpheus, a database searching algorithm specifically designed for high-resolution MS2 spectra, was proposed to address this issue.
PISA was first tested on two 1:1 mixed IPTL datasets, and increases in peptide to spectrum matchings, distinct peptides and protein groups compared to Morpheus itself and MASCOT were shown. Furthermore, the quantification is simultaneously performed and 100% quantification coverage is achieved by PISA since each of the identified peptide to spectrum matchings has several pairs of fragment ions which could be used for quantification. Then the PISA was applied to the relative quantification of human hepatocellular carcinoma cell lines with high and low metastatic potentials prepared by an IPTL strategy.
IEEE Transactions on NanoBioscience. 2015. Yan Yan. et al. University of Saskatchewan Saskatoon
ABSTRACT: With tandem mass spectrometry (MS/MS), spectra can be generated by various fragmentation techniques including collision-induced dissociation (CID), higher-energy collisional dissociation (HCD), electron capture dissociation (ECD), electron transfer dissociation (ETD) and so on. At the same time, de novo sequencing using multiple spectra from the same peptide generated by different fragmentation techniques is becoming popular in proteomics studies.
The focus of this study is the use of paired spectra from CID (or HCD) and ECD (or ETD) fragmentation because of the complementarity between them. We present a de novo peptide sequencing framework for multiple tandem mass spectra, and apply it to paired spectra sequencing problem. The performance of the framework on paired spectra is compared to another successful method named pNovo+. The results show that our proposed method outperforms pNovo+ in terms of full length peptide sequencing accuracy on three pairs of experimental datasets, with the accuracy increasing up to 13.6% compared to pNovo+.
Nature. 2014. Arun K. Shukla. et al. Stanford University School of Medicine
ABSTRACT: G-protein-coupled receptors (GPCRs) are critically regulated by β-arrestins, which not only desensitize G-protein signalling but also initiate a G-protein-independent wave of signalling. A recent surge of structural data on a number of GPCRs, including the β2 adrenergic receptor (β2AR)–G-protein complex, has provided novel insights into the structural basis of receptor activation.
However, complementary information has been lacking on the recruitment of β-arrestins to activated GPCRs, primarily owing to challenges in obtaining stable receptor–β-arrestin complexes for structural studies. Here we devised a strategy for forming and purifying a functional human β2AR–β-arrestin-1 complex that allowed us to visualize its architecture by single-particle negative-stain electron microscopy and to characterize the interactions between β2AR and β-arrestin 1 using hydrogen–deuterium exchange mass spectrometry (HDX-MS) and chemical crosslinking. Electron microscopy two-dimensional averages and three-dimensional reconstructions reveal bimodal binding of β-arrestin 1 to the β2AR, involving two separate sets of interactions, one with the phosphorylated carboxy terminus of the receptor and the other with its seven-transmembrane core. Areas of reduced HDX together with identification of crosslinked residues suggest engagement of the finger loop of β-arrestin 1 with the seven-transmembrane core of the receptor. In contrast, focal areas of raised HDX levels indicate regions of increased dynamics in both the N and C domains of β-arrestin 1 when coupled to the β2AR. A molecular model of the β2AR–β-arrestin signalling complex was made by docking activated β-arrestin 1 and β2AR crystal structures into the electron microscopy map densities with constraints provided by HDX-MS and crosslinking, allowing us to obtain valuable insights into the overall architecture of a receptor–arrestin complex. The dynamic and structural information presented here provides a framework for better understanding the basis of GPCR regulation by arrestins.
Molecular & Cellular Proteomics. 2014. Aiping Lu. et al. Tongji University
ABSTRACT: Conotoxins are peptide neurotoxins produced by predatory cone snails. They are mostly cysteine-rich short peptides with remarkable structural diversity. The conserved signal peptide sequences of their mRNA-encoded precursors have enabled the grouping of known conotoxins into a limited number of superfamilies. However, the conotoxins within each superfamily often present variable sequences, cysteine frameworks, and post-translational modifications.
To understand better how conotoxins are diversified, we performed a venomic study with C. flavidus, an uninvestigated vermivorous Conus species, by combining transcriptomic and proteomic analyses. In order to obtain the full-length conotoxin sequences, protease digestion was not performed with the venom extraction prior to spectra acquisition via tandem mass spectrometry (MS/MS). Because conotoxins are produced from mRNA-encoded precursors by means of proteolytic cleavage, nonspecific digestion of precursors was applied during the database search. Special attention was also paid in interpreting the MS/MS spectra. All together, these analyses identified 69 nonredundant cDNA sequences and 31 conotoxin components with confident MS/MS spectra. A new Q-superfamily was also identified. More importantly, this study revealed that conotoxin-encoding transcripts are diversified by hypermutation, fragment insertion/deletion, and mutation-induced premature termination, and that a single mRNA species can produce multiple toxin products through alternative post-translational modifications and alternative cleavages of the translated precursor. These multiple diversification strategies at different levels may explain, at least in part, the diversity of conotoxins, and provide the basis for further investigation.
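The nonspecific-digestion search mentioned above can be pictured with a minimal sketch: every contiguous stretch of a translated precursor is treated as a candidate mature peptide, subject to length bounds. The function name, the length limits, and the precursor sequence are hypothetical and are not taken from the study.

```python
def nonspecific_peptides(precursor, min_len=6, max_len=40):
    """Yield every contiguous subsequence whose length lies in [min_len, max_len].

    This mimics a 'no enzyme' search space: cleavage may occur at any residue.
    """
    n = len(precursor)
    for start in range(n):
        last_end = min(start + max_len, n)
        for end in range(start + min_len, last_end + 1):
            yield precursor[start:end]


# Hypothetical precursor sequence, for illustration only.
precursor = "MKLTCVVIVAVLLLTACQLITADDSRGTQKHRALRSTTKLSTSTRCKGKGA"
candidates = list(nonspecific_peptides(precursor, min_len=10, max_len=12))
print(len(candidates), "candidate peptides")
```

The candidate count grows roughly with precursor length times the width of the length window, which is why such searches are usually restricted to a small precursor database.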
Nature. 2014. Vitaly Epshtein. et al. New York University School of Medicine
ABSTRACT: UvrD helicase is required for nucleotide excision repair, although its role in this process is not well defined. Here we show that Escherichia coli UvrD binds RNA polymerase during transcription elongation and, using its helicase/translocase activity, forces RNA polymerase to slide backward along DNA. By inducing backtracking, UvrD exposes DNA lesions shielded by blocked RNA polymerase, allowing nucleotide excision repair enzymes to gain access to sites of damage.
Our results establish UvrD as a bona fide transcription elongation factor that contributes to genomic integrity by resolving conflicts between transcription and DNA repair complexes. Furthermore, we show that the elongation factor NusA cooperates with UvrD in coupling transcription to DNA repair by promoting backtracking and recruiting nucleotide excision repair enzymes to exposed lesions. Because backtracking is a shared feature of all cellular RNA polymerases, we propose that this mechanism enables RNA polymerases to function as global DNA damage scanners in bacteria and eukaryotes.
Molecular & Cellular Proteomics. 2014. Yan Fu. et al. Beijing Proteome Research Center
ABSTRACT: In shotgun proteomics, high-throughput mass spectrometry experiments and the subsequent data analysis produce thousands to millions of hypothetical peptide identifications. The common way to estimate the false discovery rate (FDR) of peptide identifications is the target-decoy database search strategy, which is efficient and accurate for large datasets. However, the legitimacy of the target-decoy strategy for protein-modification-centric studies has rarely been rigorously validated.
It is often the case that a global FDR is estimated for all peptide identifications including both modified and unmodified peptides, but that only a subgroup of identifications with a certain type of modification is focused on. As revealed recently, the subgroup FDR of modified peptide identifications can differ dramatically from the global FDR at the same score threshold, and thus the former, when it is of interest, should be separately estimated. However, rare modifications often result in a very small number of modified peptide identifications, which makes the direct separate FDR estimation inaccurate because of the inadequate sample size. This paper presents a method called the transferred FDR for accurately estimating the FDR of an arbitrary number of modified peptide identifications. Through flexible use of the empirical data from a target-decoy database search, a theoretical relationship between the subgroup FDR and the global FDR is made computable. Through this relationship, the subgroup FDR can be predicted from the global FDR, allowing one to avoid an inaccurate direct estimation from a limited amount of data. The effectiveness of the method is demonstrated with both simulated and real mass spectra.
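The difference between a global and a directly estimated subgroup FDR can be sketched as follows. This is only the standard target-decoy ratio (decoys divided by targets above a score threshold) applied to all identifications and to the modified subgroup; it does not reproduce the transferred FDR relationship itself, and the data layout and function name are assumptions.

```python
def fdr_at(psms, threshold, modified_only=False):
    """Estimate FDR as decoys / targets among identifications scoring >= threshold.

    psms: iterable of (score, is_decoy, is_modified) tuples (hypothetical layout).
    modified_only: restrict the estimate to the modified-peptide subgroup.
    """
    kept = [p for p in psms
            if p[0] >= threshold and (p[2] or not modified_only)]
    decoys = sum(1 for p in kept if p[1])
    targets = len(kept) - decoys
    return decoys / targets if targets else 0.0


# Toy identifications: (score, is_decoy, is_modified)
psms = [(10, False, False), (9, False, True), (8, True, True),
        (7, False, False), (6, True, False)]
print(fdr_at(psms, 7))                      # global estimate
print(fdr_at(psms, 7, modified_only=True))  # direct subgroup estimate
```

With only two modified identifications above the threshold, a single decoy moves the subgroup estimate from 0 to 1, which illustrates the small-sample problem described above.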
Molecular & Cellular Proteomics. 2014. Yi Shi. et al. The Rockefeller University
ABSTRACT: Most cellular processes are orchestrated by macromolecular complexes. However, structural elucidation of these endogenous complexes can be challenging because they frequently contain large numbers of proteins, are compositionally and morphologically heterogeneous, can be dynamic, and are often of low abundance in the cell. Here, we present a strategy for the structural characterization of such complexes that has at its center chemical cross-linking with mass spectrometric readout.
In this strategy, we isolate the endogenous complexes using a highly optimized sample preparation protocol and generate a comprehensive, high-quality cross-linking dataset using two complementary cross-linking reagents. We then determine the structure of the complex using a refined integrative method that combines the cross-linking data with information generated from other sources, including electron microscopy, X-ray crystallography, and comparative protein structure modeling. We applied this integrative strategy to determine the structure of the native Nup84 complex, a stable hetero-heptameric assembly (∼ 600 kDa), 16 copies of which form the outer rings of the 50-MDa nuclear pore complex (NPC) in budding yeast. The unprecedented detail of the Nup84 complex structure reveals previously unseen features in its pentameric structural hub and provides information on the conformational flexibility of the assembly. These additional details further support and augment the protocoatomer hypothesis, which proposes an evolutionary relationship between vesicle coating complexes and the NPC, and indicates a conserved mechanism by which the NPC is anchored in the nuclear envelope.
Journal of Proteomics. 2014. Ana O. Tiroli-Cepeda. et al. University of Campinas
ABSTRACT: Hsp70 cycles from an ATP-bound state, in which the affinity for unfolded polypeptides is low, to an ADP-bound state, in which the affinity for unfolded polypeptides is high, to assist with cell proteostasis. Such cycling also depends on co-chaperones because these proteins control both the Hsp70 ATPase activity and the delivery of unfolded polypeptide chains. Although it is very important, structural information on the entire protein is still scarce.
This work describes the first cloning of a cDNA predicted to code for a cytosolic Saccharum spp. (sugarcane) Hsp70, named SsHsp70 here, the purification of the recombinant protein and the characterization of its structural conformation in solution by chemical cross-linking coupled to mass spectrometry. The in vivo expression of SsHsp70 in sugarcane extracts was confirmed by Western blot. Recombinant SsHsp70 was monomeric, both ADP and ATP binding increased its stability and it was efficient in cooperating with co-chaperones: ATPase activity was stimulated by Hsp40s, and it aided the refolding of an unfolded polypeptide delivered by a member of the small Hsp family. The structural conformation results favor a model in which nucleotide-free SsHsp70 is highly dynamic and may fluctuate among different conformations that may resemble those in which nucleotide is bound. Validation of a sugarcane EST as a true mRNA that encodes a cytosolic Hsp70 (SsHsp70) as confirmed by in vivo expression and characterization of the structure and function of the recombinant protein. SsHsp70 was monomeric, both ADP and ATP binding increased its stability and it was efficient in interacting and cooperating with co-chaperones to enhance ATPase activity and refold unfolded proteins. The conformation of nucleotide-free SsHsp70 in solution was much more dynamic than suggested by crystal structures of other Hsp70s. This article is part of a Special Issue entitled: Environmental and structural proteomics.
Analytical Chemistry. 2014. Qichen Cao. et al. Beijing Institute of Radiation Medicine
ABSTRACT: Core fucosylation (CF) is a special glycosylation pattern of proteins that has a strong relationship with cancer. The Food and Drug Administration (FDA) has approved the core fucosylated α-fetoprotein as a biomarker for the early diagnosis of hepatocellular carcinoma (HCC). The technology for identifying core fucosylated proteins has significant practical value. The major method for core fucosylated glycoprotein/glycopeptide analysis is neutral loss-based MS(3) scanning under collision-induced dissociation (CID) by ion trap mass spectrometry.
However, due to the limited speed and low resolution of the MS(3) scan mode, it is difficult to achieve high-throughput, with only dozens of core fucosylated proteins identified in a single run. In this work, we developed a novel strategy for the identification of CF glycopeptides at a large scale, integrating the stepped fragmentation function, one novel feature of quadrupole-orbitrap mass spectrometry, with "glycan diagnostic ion"-based spectrum optimization. By using stepped fragmentation, we were able to obtain both highly accurate glycan and peptide information of a simplified CF glycopeptide in one spectrum. Moreover, the spectrum could be recorded with the same high speed as the conventional MS(2) scan. By using the "glycan diagnostic ion"-based spectrum refinement method, the efficiency of the CF glycopeptide discovery was significantly improved. We demonstrated the feasibility and reproducibility of our method by analyzing CF glycoproteomes of mouse liver tissue and HeLa cell samples spiked with standard CF glycoprotein. In total, 1364 and 856 CF glycopeptides belonging to 702 and 449 CF glycoproteins were identified, respectively, within a 78-min gradient analysis, which was approximately a 7-fold increase in the identification efficiency of CF glycopeptides compared to the currently used method. In this work, we took core fucosylated glycopeptides as a practical example to demonstrate the great potential of our novel method for use in glycoproteome analysis, and we also anticipate using the flexible novel method in other research fields.
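A diagnostic-ion check of the kind described above might look like the sketch below, which flags a spectrum as a glycopeptide candidate when at least one oxonium ion is present within a ppm tolerance. The ion list (commonly cited HexNAc-related oxonium ions), the tolerance, and the function name are assumptions; the abstract does not state which ions or thresholds the authors used.

```python
# Commonly cited singly charged oxonium ions (m/z); assumed here, not from the abstract.
DIAGNOSTIC_MZ = (138.0550, 204.0867, 366.1395)


def has_diagnostic_ion(peaks_mz, tol_ppm=20.0):
    """Return True if any peak matches a diagnostic m/z within tol_ppm."""
    for target in DIAGNOSTIC_MZ:
        tol = target * tol_ppm * 1e-6  # convert ppm to an absolute m/z window
        if any(abs(mz - target) <= tol for mz in peaks_mz):
            return True
    return False


print(has_diagnostic_ion([204.0870, 500.3]))  # peak close to the HexNAc oxonium ion
print(has_diagnostic_ion([204.2000, 500.3]))  # no peak within tolerance
```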
Biochimica et Biophysica Acta (BBA). 2014. Long Zhao. et al. Beijing Institute of Radiation Medicine
ABSTRACT: Hepatocyte nuclear factor-1 alpha (HNF1α) exerts important effects on gene expression in multiple tissues. Several studies have directly or indirectly supported the role of phosphorylation processes in the activity of HNF1α. However, the molecular mechanism of this phosphorylation remains largely unknown. Using microcapillary liquid chromatography MS/MS and biochemical assays, we identified a novel phosphorylation site in HNF1α at Ser249. We also found that the ATM protein kinase phosphorylated HNF1α at Ser249 in vitro in an ATM-dependent manner and that ATM inhibitor KU55933 treatment inhibited phosphorylation of HNF1α at Ser249 in vivo.
Coimmunoprecipitation assays confirmed the association between HNF1α and ATM. Moreover, ATM enhanced HNF1α transcriptional activity in a dose-dependent manner, whereas the ATM kinase-inactive mutant did not. The use of KU55933 confirmed our observation. Compared with wild-type HNF1α, a mutation in Ser249 resulted in a pronounced decrease in HNF1α transactivation, whereas no dominant-negative effect was observed. The HNF1αSer249 mutant also exhibited normal nuclear localization but decreased DNA-binding activity. Accordingly, the functional studies of HNF1αSer249 mutant revealed a defect in glucose metabolism. Our results suggested that ATM regulates the activity of HNF1α by phosphorylation of serine 249, particularly in glucose metabolism, which provides valuable insights into the undiscovered mechanisms of ATM in the regulation of glucose homeostasis.
Nature Structural & Molecular Biology. 2014. Murat A Cevher. et al. Rockefeller University
ABSTRACT: The evolutionarily conserved Mediator complex is a critical coactivator for RNA polymerase II (Pol II)-mediated transcription. Here we report the reconstitution of a functional 15-subunit human core Mediator complex and its characterization by functional assays and chemical cross-linking coupled to MS (CX-MS). Whereas the reconstituted head and middle modules can stably associate, basal and coactivator functions are acquired only after incorporation of MED14 into the bimodular complex.
This results from a dramatically enhanced ability of MED14-containing complexes to associate with Pol II. Altogether, our analyses identify MED14 as both an architectural and a functional backbone of the Mediator complex. We further establish a conditional requirement for metazoan-specific MED26 that becomes evident in the presence of heterologous nuclear factors. This general approach paves the way for systematic dissection of the multiple layers of functionality associated with the Mediator complex.
Biochemistry and Molecular Biology. 2014. Chalkley, RJ. et al. University of California
ABSTRACT: The proteome informatics research group of the Association of Biomolecular Resource Facilities conducted a study to assess the community's ability to detect and characterize peptides bearing a range of biologically occurring post-translational modifications when present in a complex peptide background. A data set derived from a mixture of synthetic peptides with biologically occurring modifications combined with a yeast whole cell lysate as background was distributed to a large group of researchers and their results were collectively analyzed.
The results from the twenty-four participants, who represented a broad spectrum of experience levels with this type of data analysis, produced several important observations. First, there is significantly more variability in the ability to assess whether a result is significant than there is to determine the correct answer. Second, labile post-translational modifications, particularly tyrosine sulfation, present a challenge for most researchers. Finally, for modification site localization there are many tools being employed, but researchers are currently unsure of the reliability of the results these programs are producing.
PNAS. 2014. Ming-kun Yang. et al. Institute of Hydrobiology, Chinese Academy of Sciences
ABSTRACT: We describe an integrated workflow for proteogenomic analysis and global profiling of posttranslational modifications (PTMs) in prokaryotes and use the model cyanobacterium Synechococcus sp. PCC 7002 (hereafter Synechococcus 7002) as a test case. We found more than 20 different kinds of PTMs, and a holistic view of PTM events in this organism grown under different conditions was obtained without specific enrichment strategies. Among 3,186 predicted protein-coding genes, 2,938 gene products (>92%) were identified.
We also identified 118 previously unidentified proteins and corrected 38 predicted gene-coding regions in the Synechococcus 7002 genome. This systematic analysis not only provides comprehensive information on protein profiles and the diversity of PTMs in Synechococcus 7002 but also provides some insights into photosynthetic pathways in cyanobacteria. The entire proteogenomics pipeline is applicable to any sequenced prokaryotic organism, and we suggest that it should become a standard part of genome annotation projects.
Journal of Chromatography A. 2014. Huiming Yuan. et al. Dalian Institute of Chemical Physics
ABSTRACT: In this work, a novel kind of organic-silica hybrid monolith based immobilized enzymatic reactor (IMER) was developed. The monolithic support was prepared by a single step "one-pot" strategy via the polycondensation of tetramethoxysilane and vinyltrimethoxysilane and in situ copolymerization of methacrylic acid and vinyl group on the precondensed siloxanes with ammonium persulfate as the thermal initiator. Subsequently, the monolith was activated by N-(3-dimethylaminopropyl)-N'-ethylcarbodiimide (EDC) and N-hydroxysuccinimide (NHS), followed by the modification of branched polyethylenimine (PEI) to improve the hydrophilicity.
Finally, after activation by EDC and NHS, trypsin was covalently immobilized onto the monolithic support. The performance of such a microreactor was evaluated by the in-sequence digestion of bovine serum albumin (BSA) and myoglobin, followed by MALDI-TOF-MS analysis. Compared to those obtained by traditional in-solution digestion, not only were higher sequence coverages obtained for BSA (74±1.4% vs. 59.5±2.7%, n=6) and myoglobin (93±3% vs. 81±4.5%, n=6), but the digestion time was also shortened from 24 h to 2.5 min, demonstrating the high digestion efficiency of such an IMER. The carry-over of these two proteins on the IMER was investigated, and peptides from BSA could not be found in the mass spectrum of myoglobin digests, which was attributed to the good hydrophilicity of the developed monolithic support. Moreover, the dynamic concentration range for protein digestion was shown to be four orders of magnitude, and the IMER could endure at least 7 days of consecutive usage. Furthermore, such an IMER was coupled with nano-RPLC-ESI/MS/MS for the analysis of extracted proteins from Escherichia coli. Compared with a formerly reported silica hybrid monolith based IMER and the traditional in-solution counterpart, the developed IMER identified a similar number of proteins but 7% and 25% more distinct peptides, respectively, which improves the reliability of protein identification. The IMER was further integrated online with a two-dimensional nano-HPLC-MS/MS system for the analysis of protein extracts from hepatocellular carcinoma (HCC) cells with low metastasis rate, and more than 3000 protein groups were identified, with only 46 proteins identified from the residues of the IMER. All these results demonstrated that such a hybrid monolith based IMER holds great promise for high-throughput and high-confidence proteome analysis.
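Sequence coverage, the figure of merit quoted above for BSA and myoglobin, is the percentage of protein residues covered by at least one identified peptide. A minimal sketch follows; the function name and the toy sequences are hypothetical, and overlapping peptides are counted only once per residue.

```python
def sequence_coverage(protein, peptides):
    """Return the percentage of residues in `protein` covered by any peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy example: two overlapping peptides cover residues 1-5 of a 10-residue protein.
print(sequence_coverage("ABCDEFGHIJ", ["ABC", "CDE"]))
```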
Journal of Proteome Research. 2014. Zhuo Chen. et al. Institute of Hydrobiology, Chinese Academy of Sciences
ABSTRACT: Protein phosphorylation on serine, threonine, and tyrosine (Ser/Thr/Tyr) is well established as a key regulatory posttranslational modification used in signal transduction to control cell growth, proliferation, and stress responses. However, little is known about its extent and function in diatoms. Phaeodactylum tricornutum is a unicellular marine diatom that has been used as a model organism for research on diatom molecular biology. Although more than 1000 protein kinases and phosphatases with specificity for Ser/Thr/Tyr residues have been predicted in P. tricornutum, no phosphorylation event has so far been revealed by classical biochemical approaches.
Here, we performed a global phosphoproteomic analysis combining protein/peptide fractionation, TiO2 enrichment, and LC–MS/MS analyses. In total, we identified 264 unique phosphopeptides, including 434 in vivo phosphorylated sites on 245 phosphoproteins. The phosphorylated proteins were implicated in the regulation of diverse biological processes, including signaling, metabolic pathways, and stress responses. Six identified phosphoproteins were further validated by Western blotting using phospho-specific antibodies. The functions of these proteins are discussed in the context of signal transduction networks in P. tricornutum. Our results advance the current understanding of diatom biology and will be useful for elucidating the phosphor-relay signaling networks in this model diatom.
Journal of Proteomics. 2014. Mingqiang Rong. et al. BGI-Shenzhen
ABSTRACT: Centipedes are among the oldest venomous arthropods that use toxins as weapons to capture prey. However, little attention has been paid to them, and only a few centipede toxins have been shown to act on ion channels. Therefore, more in-depth work is needed to understand the diversity of centipede venom. In the present study, we used peptidomics combined with a cDNA library to uncover the diversity of the centipede Scolopendra subspinipes mutilans L. Koch. In total, 192 peptides were identified by LC-MS/MS and 79 precursors were deduced from the cDNA library. Surprisingly, the signal peptides of centipede toxins were more complicated than those of any other animal toxins and even exhibited large differences among homologues.
Meanwhile, a large number of variants generated by alternative cleavage sites were detected in the mass spectra. Odd numbers of cysteines (3, 5, 7), which are seldom seen in peptide toxins, were found in the mature peptides. In addition, two novel cysteine frameworks (C-C-C-CCC, C-C-C-C-CC-CC) were identified among 16 different cysteine frameworks from centipede peptides. Only 29 precursors have clear targets, while the others may provide potential functional diversity for the centipede. These findings highlight the extensive diversity of centipede toxins and provide powerful tools for understanding the capture and defense weapons of centipedes. Peptide toxins from venomous animals have attracted increasing attention due to their extraordinary chemical and pharmacological diversity. Centipedes are among the most used traditional Chinese medicines, but little was known about their active components. The venom of Scolopendra subspinipes mutilans L. Koch is analyzed in depth for the first time in this work, and most of the peptides had never been discovered before. Interestingly, the number and arrangement of cysteines differed considerably from those of known peptide toxins such as spider or scorpion toxins. Moreover, only 29 peptides from this centipede venom were identified with known function. This suggests that our work is not only important for understanding the composition of centipede venom but also provides many valuable peptides with potential biological functions.
IEEE Transactions on NanoBioscience. 2014. Yan Yan. et al. University of Saskatchewan Saskatoon
ABSTRACT: In recent years, de novo peptide sequencing from mass spectrometry data has developed as one of the major peptide identification methods with the emergence of new instruments and advanced computational methods. However, there are still limitations to this method; for example, the typically used spectrum graph model cannot represent all the information and relationships inherent in tandem mass spectra (MS/MS spectra). Here, we present a new method named NovoHCD which applies a spectrum graph model with multiple types of edges (called a multi-edge graph), and integrates into it amino acid combination (AAC) information and peptide tags.
In addition, information on immonium ions observed particularly in higher-energy collisional dissociation (HCD) spectra is incorporated. Comparisons between NovoHCD and another successful de novo peptide sequencing method for HCD spectra, pNovo, were performed. Experiments were conducted on five HCD spectral datasets. Results show that NovoHCD outperforms pNovo in terms of full length peptide identification accuracy; specifically, the accuracy increases 13%-21% over the five datasets.
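For orientation, the conventional single-edge spectrum graph that the abstract contrasts with its multi-edge model can be sketched as follows: nodes are candidate prefix masses, and an edge joins two nodes whose mass difference matches an amino acid residue mass. This is not NovoHCD's multi-edge graph; the function name, the tolerance, and the restriction to five residues are assumptions made to keep the example short.

```python
# Monoisotopic residue masses (Da) for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "L": 113.08406,
}


def build_spectrum_graph(masses, tol=0.01):
    """Return edges (low_mass, high_mass, residue) between compatible nodes."""
    nodes = sorted(masses)
    edges = []
    for i, low in enumerate(nodes):
        for high in nodes[i + 1:]:
            diff = high - low
            for residue, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    edges.append((low, high, residue))
    return edges


# Toy prefix masses corresponding to the residues G then A.
edges = build_spectrum_graph([0.0, 57.02146, 128.05857])
print([residue for _, _, residue in edges])
```

A path through such a graph from mass zero to the full peptide mass spells out a candidate sequence; multi-edge variants add edges for residue pairs or tags between the same nodes.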
IEEE Transactions on NanoBioscience. 2014. Yan Yan. et al. University of Saskatchewan Saskatoon
ABSTRACT: De novo peptide sequencing using tandem mass spectrometry (MS/MS) data has become a major computational method for sequence identification in recent years. With the development of new instruments and technology, novel computational methods have emerged with enhanced performance. However, there are only a few methods focusing on ECD/ETD spectra, which mainly contain variants of c-ions and z-ions.
Here, a de novo sequencing method for ECD/ETD spectra, NovoExD, is presented. NovoExD applies a new form of spectrum graph with multiple edge types (called a GMET), considers multiple peptide tags, and integrates amino acid combination (AAC) and fragment ion charge information. Its performance is compared with another successful de novo sequencing method, pNovo+, which has an option for ECD/ETD spectra. Experiments conducted on three different datasets show that the average full length peptide identification accuracy of NovoExD is as high as 88.70%, and that NovoExD’s average accuracy is more than 20% greater on all datasets than that of pNovo+.
Journal of Proteome Research. 2014. Liwei Cao. et al. Dalian Institute of Chemical Physics
ABSTRACT: N-Glycosylation site analysis of baker's yeast Saccharomyces cerevisiae is of fundamental significance to elucidate the molecular mechanism of human congenital disorders of glycosylation (CDG). Here we present a mass spectrometry (MS)-based workflow for the profiling of N-glycosylated sites in S. cerevisiae proteins. In this workflow, proteolytic glycopeptides were enriched by using a hydrophilic material named Click TE-Cys to improve the glycopeptide selectivity and coverage.
To enhance the reliability of the identified results, the enriched glycopeptides were subjected to parallel deglycosylation by using two endoglycosidases (i.e., PNGase F and Endo Hf), respectively, prior to LC-MS/MS analysis. On the basis of the workflow, a total of 135 N-glycosylated sites including 6 known, 93 potential, and 36 novel sites were identified and mapped to 79 proteins. Among the novel-type sites, nine sites from eight proteins, which were simultaneously identified via PNGase F and Endo Hf deglycosylation, are believed to possess high confidence. The established workflow, together with the profile of N-glycosylated sites, will contribute to the improvement of S. cerevisiae model for revealing the pathogenesis of CDG.
ABSTRACT: The TORC1 signaling pathway plays a major role in the control of cell growth and response to stress. Here we demonstrate that the SEA complex physically interacts with TORC1 and is an important regulator of its activity. During nitrogen starvation, deletions of SEA complex components lead to Tor1 kinase delocalization, defects in autophagy, and vacuolar fragmentation. TORC1 inactivation, via nitrogen deprivation or rapamycin treatment, changes cellular levels of SEA complex members.
We used affinity purification and chemical cross-linking to generate the data for an integrative structure modeling approach, which produced a well-defined molecular architecture of the SEA complex and showed that the SEA complex comprises two regions that are structurally and functionally distinct. The SEA complex emerges as a platform that can coordinate both structural and enzymatic activities necessary for the effective functioning of the TORC1 pathway.
The Journal of Biological Chemistry. 2014. Wilson Wong. et al. Imperial College of Science, Technology and Medicine
ABSTRACT: Actin depolymerizing factor (ADF)/cofilins are essential regulators of actin turnover in eukaryotic cells. These multifunctional proteins facilitate both stabilization and severing of filamentous (F)-actin in a concentration-dependent manner. At high concentrations ADF/cofilins bind stably to F-actin longitudinally between two adjacent actin protomers forming what is called a decorative interaction. Low densities of ADF/cofilins, in contrast, result in the optimal severing of the filament.
To date, how these two contrasting modalities are achieved by the same protein remains uncertain. Here, we define the proximate amino acids between the actin filament and the malaria parasite ADF/cofilin, PfADF1 from Plasmodium falciparum. PfADF1 is unique among ADF/cofilins in being able to sever F-actin but do so without stable filament binding. Using chemical cross-linking and mass spectrometry (XL-MS) combined with structure reconstruction we describe a previously overlooked binding interface on the actin filament targeted by PfADF1. This site is distinct from the known binding site that defines decoration. Furthermore, total internal reflection fluorescence (TIRF) microscopy imaging of single actin filaments confirms that this novel low affinity site is required for F-actin severing. Exploring beyond malaria parasites, selective blocking of the decoration site with human cofilin (HsCOF1) using cytochalasin D increases its severing rate. HsCOF1 may therefore also use a decoration-independent site for filament severing. Thus our data suggest that a second, low affinity actin-binding site may be universally used by ADF/cofilins for actin filament severing.
Molecular & Cellular Proteomics. 2014. Michael J. Trnka. et al. University of California San Francisco
ABSTRACT: Chemical cross-linking mass spectrometry identifies interacting surfaces within a protein assembly through labeling with bifunctional reagents and identifying the covalently modified peptides. These yield distance constraints that provide a powerful means to model the three-dimensional structure of the assembly. Bioinformatic analysis of cross-linked data resulting from large protein assemblies is challenging because each cross-linked product contains two covalently linked peptides, each of which must be correctly identified from a complex matrix of potential confounders.
Protein Prospector addresses these issues through a complementary mass modification strategy in which each peptide is searched and identified separately. We demonstrate this strategy with an analysis of RNA polymerase II. False discovery rates (FDRs) are assessed via comparison of cross-linking data to crystal structure, as well as by using a decoy database strategy. Parameters that are most useful for positive identification of cross-linked spectra are explored. We find that fragmentation spectra generally contain more product ions from one of the two peptides constituting the cross-link. Hence, metrics reflecting the quality of the spectral match to the less confident peptide provide the most discriminatory power between correct and incorrect matches. A support vector machine model was built to further improve classification of cross-linked peptide hits. Furthermore, the frequency with which peptides cross-linked via common acylating reagents fragment to produce diagnostic, cross-linker-specific ions is assessed. The threshold for successful identification of the cross-linked peptide product depends upon the complexity of the sample under investigation. Protein Prospector, by focusing the reliability assessment on the least confident peptide, is better able to control the FDR for results as larger complexes and databases are analyzed. In addition, when FDR thresholds are calculated separately for intraprotein and interprotein results, a further improvement in the number of unique cross-links confidently identified is achieved. These improvements are demonstrated on two previously published cross-linking datasets.
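Two ideas from the abstract above, ranking a cross-link by its less confident peptide and estimating FDR separately for intraprotein and interprotein hits, can be illustrated with a small sketch. It is not Protein Prospector's scoring or its support vector machine model; the field names, the use of a simple minimum, and the decoy/target ratio are assumptions.

```python
def crosslink_score(score_a, score_b):
    """Score a cross-linked pair by its weaker peptide match."""
    return min(score_a, score_b)


def fdr_by_class(hits, threshold):
    """Estimate decoy/target FDR separately for intra- and interprotein hits.

    hits: list of dicts with keys score_a, score_b, is_decoy, is_interprotein
    (a hypothetical layout chosen for this example).
    """
    result = {}
    for label, flag in (("intraprotein", False), ("interprotein", True)):
        kept = [h for h in hits
                if h["is_interprotein"] == flag
                and crosslink_score(h["score_a"], h["score_b"]) >= threshold]
        decoys = sum(1 for h in kept if h["is_decoy"])
        targets = len(kept) - decoys
        result[label] = decoys / targets if targets else 0.0
    return result


hits = [
    {"score_a": 30, "score_b": 25, "is_decoy": False, "is_interprotein": False},
    {"score_a": 40, "score_b": 5, "is_decoy": False, "is_interprotein": False},
    {"score_a": 28, "score_b": 22, "is_decoy": False, "is_interprotein": True},
    {"score_a": 27, "score_b": 21, "is_decoy": True, "is_interprotein": True},
]
print(fdr_by_class(hits, threshold=20))
```

In the toy data the second hit has one strong and one weak peptide match, so it falls below the threshold even though its best score is the highest in the list.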
Journal of Proteome Research. 2014. David B Beck. et al. Department of Cell and Developmental Biology, Epigenetics Program
ABSTRACT: Accurate and sensitive detection of protein–protein and protein–RNA interactions is key to understanding their biological functions. Traditional methods to identify these interactions require cell lysis and biochemical manipulations that exclude cellular compartments that cannot be solubilized under mild conditions. Here, we introduce an in vivo proximity labeling (IPL) technology that employs an affinity tag combined with a photoactivatable probe to label polypeptides and RNAs in the vicinity of a protein of interest in vivo.
Using quantitative mass spectrometry and deep sequencing, we show that IPL correctly identifies known protein–protein and protein–RNA interactions in the nucleus of mammalian cells. Thus, IPL provides additional temporal and spatial information for the characterization of biological interactions in vivo.
Disease Markers. 2014. Ying Zhang. et al. Chinese Academy of Medical Sciences
ABSTRACT: This study aimed to create a large-scale database of laryngeal cancer-relevant secretory/releasing proteins and to discover candidate biomarkers. Primary tissue cultures were established using tumor tissues and matched normal mucosal tissues collected from four laryngeal cancer patients. Serum-free conditioned medium (CM) samples were collected. These samples were then sequentially processed by SDS-PAGE separation, trypsin digestion, and LC-MS/MS analysis.
The candidates in the database were validated by ELISA using plasma samples from laryngeal cancer patients, benign patients, and healthy individuals. Combining MS data from the tumor tissues and normal tissues, 982 proteins were identified in total; extracellular proteins and cell surface proteins accounted for 15.0% and 4.3%, respectively. According to stringent criteria, 49 proteins were selected as candidates worthy of further validation. Of these, human tissue kallikrein 6 (KLK6) was verified. The level of KLK6 was significantly increased in the plasma samples from the cancer cohort compared to the benign and healthy cohorts and moreover showed a slight decrease in the postoperative plasma samples in comparison to the preoperative plasma samples. Conclusions. This laryngeal cancer-derived protein database provides a promising repository of candidate blood biomarkers for laryngeal cancer. The diagnostic potential of KLK6 deserves further investigation.
Journal of Proteome Research. 2014. Chun Wai Manson Ma. et al. Hong Kong University of Science and Technology
ABSTRACT: Discovering novel post-translational modifications (PTMs) to proteins and detecting specific modification sites on proteins is one of the last frontiers of proteomics. At present, hunting for post-translational modifications remains challenging in widely practiced shotgun proteomics workflows due to the typically low abundance of modified peptides and the greatly inflated search space as more potential mass shifts are considered by the search engines.
Moreover, most popular search methods require that the user specifies the modification(s) for which to search; therefore, unexpected and novel PTMs will not be detected. Here a new algorithm is proposed to apply spectral library searching to the problem of open modification searches, namely, hunting for PTMs without prior knowledge of what PTMs are in the sample. The proposed tier-wise scoring method intelligently looks for unexpected PTMs by allowing mass-shifted peak matches but only when the number of matches found is deemed statistically significant. This allows the search engine to search for unexpected modifications while maintaining its ability to identify unmodified peptides effectively at the same time. The utility of the method is demonstrated using three different data sets, in which the numbers of spectrum identifications to both unmodified and modified peptides were substantially increased relative to a regular spectral library search as well as to another open modification spectral search method, pMatch.
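The idea of allowing mass-shifted peak matches can be illustrated with a minimal sketch. It is not the authors' tier-wise scoring: the function names, the 0.02 m/z tolerance, and the rule of trying the precursor mass difference as the only shift are assumptions made for illustration, and the statistical significance test described in the abstract is left out.

```python
from bisect import bisect_left

def has_peak(sorted_mz, target, tol):
    """True if any peak in the sorted list lies within +/- tol of target."""
    i = bisect_left(sorted_mz, target - tol)
    return i < len(sorted_mz) and sorted_mz[i] <= target + tol

def open_match(query_mz, library_mz, precursor_delta, tol=0.02):
    """Count library peaks matched directly and those matched only after a shift.

    precursor_delta is the query-minus-library precursor mass difference,
    taken here (as an assumption) to be the mass of one unknown modification.
    """
    query = sorted(query_mz)
    direct = shifted = 0
    for mz in library_mz:
        if has_peak(query, mz, tol):
            direct += 1      # fragment does not carry the modification
        elif has_peak(query, mz + precursor_delta, tol):
            shifted += 1     # fragment appears displaced by the modification mass
    return direct, shifted

# Hypothetical library spectrum and a query whose last two fragments are shifted.
library = [100.0, 200.0, 300.0, 400.0]
query = [100.0, 200.0, 315.995, 415.995]
direct, shifted = open_match(query, library, precursor_delta=15.995)
```

A real open modification search would accept the shifted matches only when their number is unlikely to arise by chance, which is the part of the method this sketch omits.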
Journal of Proteome Research. 2014. Zuo-Fei Yuan. et al. University of Pennsylvania
ABSTRACT: Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched at a time (for example, unmodified, individually modified, and multimodified), each search result is filtered at a false discovery rate of less than 1%, and the identifications of multiple search engines are combined to obtain confident results.
We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118.
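The filter-then-combine workflow described above can be sketched as follows. The abstract does not say how the false discovery rate is estimated or how the engines' results are merged, so this sketch assumes a target-decoy estimate (decoys divided by targets above a score cutoff) and a simple vote across engines; `filter_by_fdr`, `combine`, and `min_engines` are hypothetical names.

```python
from collections import Counter

def filter_by_fdr(psms, max_fdr=0.01):
    """Return target peptides accepted at an estimated FDR of at most max_fdr.

    psms is a list of (peptide, score, is_decoy) tuples; higher score is better.
    The FDR at each rank is estimated as decoys / targets (an assumption).
    """
    ranked = sorted(psms, key=lambda p: p[1], reverse=True)
    targets = decoys = cutoff = 0
    for rank, (_, _, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = rank  # deepest rank that still satisfies the threshold
    return {pep for pep, _, is_decoy in ranked[:cutoff] if not is_decoy}

def combine(results_per_engine, min_engines=2):
    """Keep peptides accepted by at least min_engines search engines."""
    votes = Counter(pep for accepted in results_per_engine for pep in accepted)
    return {pep for pep, n in votes.items() if n >= min_engines}

# Hypothetical results from two engines, each filtered separately.
engine_a = filter_by_fdr([("A", 10, False), ("B", 9, False), ("X", 8, True), ("C", 7, False)])
engine_b = filter_by_fdr([("B", 5, False), ("C", 4, False)])
confident = combine([engine_a, engine_b])
```

Filtering each search separately before merging keeps one engine's score scale from influencing another's threshold.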
Journal of Proteome Research. 2014. Sira Sriswasdi. et al. Wistar Institute
ABSTRACT: Chemical cross-linking coupled to mass spectrometry provides structural information that is useful for probing protein conformations and providing experimental support for molecular models. “Zero-length” cross-links have greater value for these applications than longer cross-links because they provide more stringent distance constraints. However, this method is less commonly utilized because it cannot take advantage of isotopic labels, MS-labile bonds, or enrichment tags to facilitate identification.
In this study, we combined label-free precursor ion quantitation and targeted tandem mass spectrometry with a new software tool, Zero-length Cross-link Miner (ZXMiner), to form a multitiered analysis strategy. A major, critical objective was to simultaneously achieve very high accuracy with essentially no false-positive cross-link identifications while maintaining a good depth of analysis. Our strategy was optimized on several proteins with known crystal structures. Comparison of ZXMiner with several existing cross-link analysis software tools showed that the other algorithms detected fewer true-positive cross-links and were far less accurate. Although prior use of zero-length cross-linking was typically restricted to small proteins, ZXMiner and the associated strategy enable facile analysis of very large protein complexes. This was demonstrated by identification of zero-length cross-links using purified 526 kDa spectrin heterodimers and intact red cell membranes and membrane skeletons.
Biochemical and Biophysical Research Communications. 2014. Adam B. Shapiro. et al. Infection Innovative Medicines Unit/Reagents
ABSTRACT: LptA is a soluble periplasmic component of the lipopolysaccharide (LPS) transport system of Gram-negative bacteria that transports newly synthesized LPS from the inner membrane to the outer leaflet of the outer membrane. LptA links the inner membrane components (LptBFGC) to the outer membrane components (LptDE), but it is uncertain whether LptA is a freely moving LPS shuttle or part of a stable trans-periplasm structure. Escherichia coli LptA forms highly polymerized head-to-tail oligomers in solution, but dimers in vivo.
We studied the oligomerization of purified Pseudomonas aeruginosa LptA. Size-exclusion chromatography showed that P. aeruginosa LptA, unlike E. coli LptA, is a dimer over a wide range of concentrations. Chemical crosslinking with bis(sulfosuccinimidyl) suberate confirmed that dimers were the predominant species even at sub-micromolar LptA concentrations, which was unaffected by LPS binding. Mass spectrometry of crosslinked dimers showed that crosslinks occurred between the N-terminal α-amino group and either Lys-172 or Lys-173 near the C-terminus. These results support a hypothetical structure for the dimer of isolated P. aeruginosa LptA in which the N-terminus of one monomer is in close proximity to the C-terminus of the other, and the same surface of each monomer forms the interface between them, preventing further oligomerization.
Shotgun Proteomics: Methods and Protocols, Methods in Molecular Biology. 2014. Andy Christoforou. et al. University of Cambridge
ABSTRACT: Protein subcellular localization is a fundamental feature of posttranslational functional regulation. Traditional microscopy based approaches to study protein localization are typically of limited throughput, and dependent on the availability of antibodies with high specificity and sensitivity, or fluorescent fusion proteins. In this chapter we describe how Localization of Organelle Proteins by Isotope Tagging (LOPIT), a mass spectrometry based workflow coupling biochemical fractionation and iTRAQ™ 8-plex quantification, can be applied for the high-throughput characterization of protein localization in a mammalian cell culture line.
Talanta. 2014. Bo Jiang. et al. Dalian Institute of Chemical Physics
ABSTRACT: In this study, dendrimer-grafted graphene oxide nanosheets (dGO) were prepared by covalent reaction. The successful synthesis of dGO was confirmed by Fourier-transform infrared spectroscopy, Raman spectroscopy, thermogravimetric analysis, and zeta potential measurements. Taking advantage of its large surface area, excellent biocompatibility, and abundant functional groups, dGO provided an ideal substrate for trypsin immobilization. Trypsin-linked dGO was synthesized through covalent bonding using glutaraldehyde as the coupling agent.
The amount of trypsin immobilized on dGO nanosheets was calculated to be about 649±20 mg/g. The activity of immobilized trypsin could be maintained for over 10 days at 4 °C. On-plate proteolysis could be performed without removing trypsin-linked dGO, because dGO did not interfere with matrix-assisted laser desorption ionization time-of-flight tandem mass spectrometry analysis. With this immobilized enzymatic reactor, standard proteins could be efficiently digested within 15 min, with sequence coverages comparable to or better than those obtained by conventional overnight in-solution digestion. Furthermore, trypsin-linked dGO showed high sensitivity when applied to trace sample analysis. All these results demonstrated that the developed dGO-based enzymatic reactor might provide a promising tool for high-throughput proteome identification.
Journal of Molecular Cell Biology. 2014. Zhi-Duan Su. et al. Shanghai Institutes for Biological Sciences
ABSTRACT: The detection of single amino-acid variants (SAVs) usually depends on single-nucleotide polymorphism (SNP) databases. Here, we describe a novel method that discovers SAVs at the proteome level independently of SNP data. Using a mass spectrometry-based de novo sequencing algorithm, peptide candidates are identified and compared with a theoretical protein database to generate SAVs under a pairing strategy, followed by database re-searching to control the false discovery rate.
In human brain tissues, we can confidently identify known and novel protein variants with diverse origins. Combined with DNA/RNA sequencing, we verify SAVs derived from DNA mutations, RNA alternative splicing, and unknown post-transcriptional mechanisms. Furthermore, quantitative analysis in human brain tissues reveals several tissue-specific differential expressions of SAVs. This approach provides a novel access to high-throughput detection of protein variants, which may offer the potential for clinical biomarker discovery and mechanistic research.
International Federation for Information Processing. 2014. Chao Pan. et al. Zhejiang University
ABSTRACT: Label-free quantitative proteomics based on mass spectrometry plays an essential role in large-scale analysis of complex proteomes. Meanwhile, quantitative proteomics is not only a way for data processing, but also an important approach for exploring protein functions and interactions in a large-scale manner. An effective method combining quantitation and qualification should be built.
To systematically overcome this challenge, we proposed a new label-free quantitative method using spectral counting. In the proposed method, the count of shared peptides was considered as an optimized factor to accurately appraise the abundance of isoforms in complex proteomes. Large-scale functional annotations for complex proteomes were extracted by g:Profiler and were assigned to functional clusters. To test the method, three groups of mitochondrial proteins (a mouse heart mitochondrial dataset, a mouse liver mitochondrial dataset, and a human heart mitochondrial dataset) were selected for analysis. According to the biochemical properties of mitochondrial proteins, all functional annotations were assigned to various signalling pathways or functional clusters. We conclude that the shared-peptide strategy avoids inaccurate and overestimated results for low-abundance isoforms and thereby improves accuracy, and that quantitative proteomics coupled with biomedical knowledge can help elucidate the functions and relationships in complex proteomes and provide a new method for large-scale comparative or disease proteomics.
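One way to fold shared-peptide counts into isoform-level spectral counts is sketched below. The abstract does not give the weighting formula, so this sketch assumes that each shared count is split among the isoforms it maps to in proportion to their unique-peptide counts, with an even split when none of them has unique evidence. The function name and the input layout are hypothetical.

```python
def adjusted_counts(unique_counts, shared):
    """Distribute shared-peptide spectral counts across isoforms.

    unique_counts: {isoform: spectral count from peptides unique to it}
    shared: list of (isoforms, count) pairs, where isoforms is a tuple of
            the isoform names a shared peptide maps to.
    Assumption: a shared count is apportioned by unique-count share.
    """
    adjusted = {iso: float(count) for iso, count in unique_counts.items()}
    for isoforms, count in shared:
        total_unique = sum(unique_counts.get(iso, 0) for iso in isoforms)
        for iso in isoforms:
            if total_unique > 0:
                weight = unique_counts.get(iso, 0) / total_unique
            else:
                weight = 1 / len(isoforms)  # no unique evidence: split evenly
            adjusted[iso] = adjusted.get(iso, 0.0) + count * weight
    return adjusted

# Hypothetical example: 8 shared spectra split 3:1 between two isoforms.
counts = adjusted_counts({"iso1": 6, "iso2": 2}, [(("iso1", "iso2"), 8)])
```

Under this assumption a low-abundance isoform receives only a small share of the shared spectra, which is the kind of overestimation the abstract says the method avoids.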
Molecular & Cellular Proteomics. 2014. Jian Wang. et al. University of California
ABSTRACT: The combination of chemical cross-linking and mass spectrometry has recently been shown to constitute a powerful tool for studying protein–protein interactions and elucidating the structure of large protein complexes. However, computational methods for interpreting the complex MS/MS spectra from linked peptides are still in their infancy, making the high-throughput application of this approach largely impractical.
Because of the lack of large annotated datasets, most current approaches do not capture the specific fragmentation patterns of linked peptides and therefore are not optimal for the identification of cross-linked peptides. Here we propose a generic approach to address this problem and demonstrate it using disulfide-bridged peptide libraries to (i) efficiently generate large mass spectral reference data for linked peptides at a low cost and (ii) automatically train an algorithm that can efficiently and accurately identify linked peptides from MS/MS spectra. We show that using this approach we were able to identify thousands of MS/MS spectra from disulfide-bridged peptides through comparison with proteome-scale sequence databases and significantly improve the sensitivity of cross-linked peptide identification. This allowed us to identify 60% more direct pairwise interactions between the protein subunits in the 20S proteasome complex than existing tools on cross-linking studies of the proteasome complexes. The basic framework of this approach and the MS/MS reference dataset generated should be valuable resources for the future development of new tools for the identification of linked peptides.
The EMBO Journal. 2014. Yan Han. et al. Institute for Systems Biology, Seattle
ABSTRACT: The conserved transcription coactivator SAGA is comprised of several modules that are involved in activator binding, TBP binding, histone acetylation (HAT) and deubiquitination (DUB). Crosslinking and mass spectrometry, together with genetic and biochemical analyses, were used to determine the molecular architecture of the SAGA‐TBP complex. We find that the SAGA Taf and Taf‐like subunits form a TFIID‐like core complex at the center of SAGA that makes extensive interactions with all other SAGA modules.
SAGA‐TBP binding involves a network of interactions between subunits Spt3, Spt8, Spt20, and Spt7. The HAT and DUB modules are in close proximity, and the DUB module modestly stimulates HAT function. The large activator‐binding subunit Tra1 primarily connects to the TFIID‐like core via its FAT domain. These combined results were used to derive a model for the arrangement of the SAGA subunits and its interactions with TBP. Our results provide new insight into SAGA function in gene regulation, its structural similarity with TFIID, and functional interactions between the SAGA modules.
Nature Structural & Molecular Biology. 2014. Bruce A Knutson. et al. Fred Hutchinson Cancer Research Center, Seattle
ABSTRACT: Core Factor (CF) is a conserved RNA polymerase (Pol) I general transcription factor comprising Rrn6, Rrn11 and the TFIIB-related subunit Rrn7. CF binds TATA-binding protein (TBP), Pol I and the regulatory factors Rrn3 and upstream activation factor. We used chemical cross-linking–MS to determine the molecular architecture of CF and its interactions with TBP. The CF subunits assemble through an interconnected network of interactions between five structural domains that are conserved in orthologous subunits of the human Pol I factor SL1.
We validated the cross-linking–derived model through a series of genetic and biochemical assays. Our combined results show the architecture of CF and the functions of the CF subunits in assembly of the complex. We extend these findings to model how CF assembles into the Pol I preinitiation complex, providing new insight into the roles of CF, TBP and Rrn3.
Molecular & Cellular Proteomics. 2014. Fengying Liu. et al. Institute of Hydrobiology, Chinese Academy of Sciences
ABSTRACT: The lysine acetylation of proteins is a reversible post-translational modification that plays a critical regulatory role in both eukaryotes and prokaryotes. Mycobacterium tuberculosis is a facultative intracellular pathogen and the causative agent of tuberculosis. Increasing evidence shows that lysine acetylation may play an important role in the pathogenesis of M. tuberculosis.
However, only a few acetylated proteins of M. tuberculosis are known, presenting a major obstacle to understanding the functional roles of reversible lysine acetylation in this pathogen. We performed a global acetylome analysis of M. tuberculosis H37Ra by combining protein/peptide prefractionation, antibody enrichment, and LC-MS/MS. In total, we identified 226 acetylation sites in 137 proteins of M. tuberculosis H37Ra. The identified acetylated proteins were functionally categorized into an interaction map and shown to be involved in various biological processes. Consistent with previous reports, a large proportion of the acetylation sites were present on proteins involved in glycolysis/gluconeogenesis, the citrate cycle, and fatty acid metabolism. A NAD+-dependent deacetylase (MRA_1161) deletion mutant of M. tuberculosis H37Ra was constructed and its characterization showed a different colony morphology, reduced biofilm formation, and increased tolerance of heat stress. Interestingly, lysine acetylation was found, for the first time, to block the immunogenicity of a peptide derived from a known immunogen, HspX, suggesting that lysine acetylation plays a regulatory role in immunogenicity. Our data provide the first global survey of lysine acetylation in M. tuberculosis. The dataset should be an important resource for the functional analysis of lysine acetylation in M. tuberculosis and facilitate the clarification of the entire metabolic networks of this life-threatening pathogen.
ABSTRACT: Tandem mass spectrometry-based database searching is currently the main method for protein identification in shotgun proteomics. The explosive growth of protein and peptide databases, which is a result of genome translations, enzymatic digestions, and post-translational modifications (PTMs), is making computational efficiency in database searching a serious challenge.
Profile analysis shows that most search engines spend 50%-90% of their total time on the scoring module, and that the spectrum dot product (SDP) based scoring module is the most widely used. As general-purpose, high-performance parallel hardware, graphics processing units (GPUs) are promising platforms for speeding up database searches in the protein identification process. We designed and implemented a parallel SDP-based scoring module on GPUs that makes efficient use of GPU registers, constant memory and shared memory. Compared with the CPU-based version, we achieved a 30 to 60 times speedup using a single GPU. We also implemented our algorithm on a GPU cluster and achieved an approximately favorable speedup. Our GPU-based SDP algorithm can significantly improve the speed of the scoring module in mass spectrometry-based protein identification. The algorithm can be easily implemented in many database search engines such as X!Tandem, SEQUEST, and pFind. A software tool implementing this algorithm is available at http://www.comp.hkbu.edu.hk/~youli/ProteinByGPU.html
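For readers unfamiliar with SDP scoring, a plain CPU reference in Python is sketched below. It is not the GPU implementation from the abstract: the unit-width binning, the cosine normalization, and the function names are assumptions chosen to keep the example short.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is an iterable of (mz, intensity) pairs; returns {bin index: intensity}.
    """
    binned = {}
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        binned[index] = binned.get(index, 0.0) + intensity
    return binned

def spectrum_dot_product(experimental, theoretical, bin_width=1.0):
    """Normalized dot product between two binned spectra (0.0 to 1.0)."""
    a = bin_spectrum(experimental, bin_width)
    b = bin_spectrum(theoretical, bin_width)
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical spectra: one candidate shares both peaks, the other shares none.
observed = [(100.2, 3.0), (200.7, 4.0)]
score_same = spectrum_dot_product(observed, [(100.4, 3.0), (200.1, 4.0)])
score_none = spectrum_dot_product(observed, [(150.0, 1.0)])
```

Each candidate peptide is scored independently of the others, which is why this step lends itself to the parallel hardware discussed in the abstract.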
Analytical Chemistry. 2014. Qichen Cao. et al. Beijing Institute of Radiation Medicine
ABSTRACT: Core fucosylation (CF) is a special glycosylation pattern of proteins that has a strong relationship with cancer. The Food and Drug Administration (FDA) has approved the core fucosylated α-fetoprotein as a biomarker for the early diagnosis of hepatocellular carcinoma (HCC). The technology for identifying core fucosylated proteins has significant practical value.
The major method for core fucosylated glycoprotein/glycopeptide analysis is neutral loss-based MS(3) scanning under collision-induced dissociation (CID) by ion trap mass spectrometry. However, due to the limited speed and low resolution of the MS(3) scan mode, it is difficult to achieve high-throughput, with only dozens of core fucosylated proteins identified in a single run. In this work, we developed a novel strategy for the identification of CF glycopeptides at a large scale, integrating the stepped fragmentation function, one novel feature of quadrupole-orbitrap mass spectrometry, with "glycan diagnostic ion"-based spectrum optimization. By using stepped fragmentation, we were able to obtain both highly accurate glycan and peptide information of a simplified CF glycopeptide in one spectrum. Moreover, the spectrum could be recorded with the same high speed as the conventional MS(2) scan. By using the "glycan diagnostic ion"-based spectrum refinement method, the efficiency of the CF glycopeptide discovery was significantly improved. We demonstrated the feasibility and reproducibility of our method by analyzing CF glycoproteomes of mouse liver tissue and HeLa cell samples spiked with standard CF glycoprotein. In total, 1364 and 856 CF glycopeptides belonging to 702 and 449 CF glycoproteins were identified, respectively, within a 78-min gradient analysis, which was approximately a 7-fold increase in the identification efficiency of CF glycopeptides compared to the currently used method. In this work, we took core fucosylated glycopeptides as a practical example to demonstrate the great potential of our novel method for use in glycoproteome analysis, and we also anticipate using the flexible novel method in other research fields.
Journal of Proteomics. 2014. Cheng Ma. et al. Georgia State University
ABSTRACT: The core fucosylation (CF) of N-glycoproteins plays important roles in regulating protein functions during biological development, and it has also been shown to be up-regulated in several high metastasis cancer cell lines. Therefore, global profiling and quantitative characterization of CF-glycoproteins may reveal potent biomarkers for clinical applications.
However, due to the complex fragmentation pattern of CF-glycopeptides, accurately identifying CF-glycosylation sites via mass spectrometry with high throughput remains a formidable challenge. In this study, we established a precise CF-glycosylation site identification strategy with UHPLC LTQ-Orbitrap Elite under low- and high-normalized collision energy (LHNCE) conditions. To demonstrate the feasibility of LHNCE, the CF-glycopeptides of target proteins in clinical plasma samples were applied and compared as a preliminary demonstration and resulted in the assignment of 357 unique CF-glycosylation sites from 209 CF-glycoproteins. In this study, the largest human plasma CF-glycosylation site database was constructed, and at least three-fold more CF-sites were identified compared to previously published studies. The results further demonstrated that LHNCE provides an important approach for CF-glycoprotein function studies and biomarker screening in cancer research.
Analytical Chemistry. 2014. Qun Zhao. et al. Dalian Institute of Chemical Physics
ABSTRACT: Because integral membrane proteins (IMPs) are extremely hydrophobic, their analysis is a great challenge. Although various additives have been applied to improve the solubility of IMPs, they still suffer from low solubilization efficiency, incompatibility with trypsin digestion, or interference with MS detection. Herein, a systematic study of the effect of ionic liquid structure on membrane protein solubilization and trypsin biocompatibility was performed, based on which 1-dodecyl-3-methylimidazolium chloride (C12Im-Cl) was selected for the sample preparation of IMPs.
Compared with other commonly used additives, such as sodium dodecyl sulfate (SDS), Rapigest, and methanol, C12Im-Cl showed the best performance. In addition, with a strong cation exchange trap column, it could be easily removed after trypsin digestion, which not only was beneficial to avoid protein precipitation during digestion but also had no adverse effect on LC-MS-based separation and detection. Such a C12Im-Cl-assisted sample preparation method was further applied to the membrane proteome analysis of rat brain. Compared with the SDS-assisted method, the numbers of identified IMPs and hydrophobic peptides improved 1.4-fold and 3.5-fold, respectively (251 vs 178, and 982 vs 279). All these results demonstrated that the C12Im-Cl-assisted sample preparation method holds great promise for large-scale membrane proteome profiling.
Bioinformatics. 2013. Kyowon Jeong. et al. University of California
ABSTRACT: Mass spectrometry (MS) instruments and experimental protocols are rapidly advancing, but de novo peptide sequencing algorithms to analyze tandem mass (MS/MS) spectra are lagging behind. Although existing de novo sequencing tools perform well on certain types of spectra [e.g. Collision Induced Dissociation (CID) spectra of tryptic peptides], their performance often deteriorates on other types of spectra, such as Electron Transfer Dissociation (ETD), Higher-energy Collisional Dissociation (HCD) spectra or spectra of non-tryptic digests.
Thus, rather than developing a new algorithm for each type of spectra, we develop a universal de novo sequencing algorithm called UniNovo that works well for all types of spectra or even for spectral pairs (e.g. CID/ETD spectral pairs). UniNovo uses an improved scoring function that captures the dependencies between different ion types, where such dependencies are learned automatically using a modified offset frequency function. The performance of UniNovo is compared with PepNovo+, PEAKS and pNovo using various types of spectra. The results show that the performance of UniNovo is superior to other tools for ETD spectra and superior or comparable with others for CID and HCD spectra. UniNovo also estimates the probability that each reported reconstruction is correct, using simple statistics that are readily obtained from a small training dataset. We demonstrate that the estimation is accurate for all tested types of spectra (including CID, HCD, ETD, CID/ETD and HCD/ETD spectra of trypsin, LysC or AspN digested peptides).
The Journal of Biological Chemistry. 2013. Pei Wang. et al. University of Chinese Academy of Sciences
ABSTRACT: Abnormally enhanced tissue factor (TF) activity is related to increased thrombosis risk in which oxidative stress plays a critical role. Human cytosolic thioredoxin (hTrx1) and thioredoxin reductase (TrxR), also secreted into circulation, have the power to protect against oxidative stress. However, the relationship between hTrx1/TrxR and TF remains unknown. Here we show reversible association of hTrx1 with TF in human serum and plasma samples.
The association is dependent on hTrx1-Cys-73 that bridges TF-Cys-209 via a disulfide bond. hTrx1-Cys-73 is absolutely required for hTrx1 to interfere with FVIIa binding to purified and cell-surface TF, consequently suppressing TF-dependent procoagulant activity and proteinase-activated receptor-2 activation. Moreover, hTrx1/TrxR plays an important role in sensing the alterations of NADPH/NADP+ states and transducing this redox-sensitive signal into changes in TF activity. With NADPH, hTrx1/TrxR readily facilitates the reduction of TF, causing a decrease in TF activity, whereas with NADP+, hTrx1/TrxR promotes the oxidation of TF, leading to an increase in TF activity. By comparison, TF is more likely to favor the reduction by hTrx1-TrxR-NADPH. This reversible reduction-oxidation reaction occurs in the TF extracellular domain that contains partially opened Cys-49/-57 and Cys-186/-209 disulfide bonds. The cell-surface TF procoagulant activity is significantly increased after hTrx1-knockdown. The response of cell-surface TF procoagulant activity to H2O2 is efficiently suppressed through elevating cellular TrxR activity via selenium supplementation. Our data provide a novel mechanism for redox regulation of TF activity. By modifying Cys residues or regulating Cys redox states in TF extracellular domain, hTrx1/TrxR function as a safeguard against inappropriate TF activity.
Science China Life Sciences, Springer. 2013. Yang Lei. et al. Chinese Academy of Medical Sciences
ABSTRACT: For successful therapy, hepatocellular carcinoma (HCC) must be detected at an early stage. Herein, we used a proteomic approach to analyze the secretory/releasing proteome of HCC tissues to identify plasma biomarkers. Serum-free conditioned media (CM) were collected from primary cultures of cancerous tissues and surrounding noncancerous tissues. Proteomic analysis of the CM proteins permitted the identification of 1365 proteins.
The enriched molecular functions and biological processes of the CM proteins, such as hydrolase activity and catabolic processes, were consistent with the liver being the most important metabolic organ. Moreover, 19% of the proteins were characterized as extracellular or membrane-bound. Secretory proteins involved in transforming growth factor-β signaling pathways were then validated in plasma samples. Alpha-fetoprotein (AFP), matrix metalloproteinase (MMP)1, osteopontin (OPN), and pregnancy-specific beta-1-glycoprotein (PSG)9 were significantly increased in HCC patients. The overall performance of MMP1 and OPN in the diagnosis of HCC remained greater than that of AFP. In addition, this study represents the first report of MMP1 as a biomarker with a higher sensitivity and specificity than AFP. Thus, this study provides a valuable resource of the HCC secretome with the potential to investigate serological biomarkers. MMP1 and OPN could be used as novel biomarkers for the early detection of HCC and to improve the sensitivity of biomarkers compared with AFP.
Electrophoresis. 2013. Cheng Ma. et al. Beijing Institute of Radiation Medicine
ABSTRACT: N-linked glycosylation is an important protein posttranslational modification that is involved in numerous biological processes. Different methods, including chemical reaction and affinity interaction, have been developed to enrich glycosylated peptides or proteins from biological systems. However, due to the common occurrence of low glycosite occupancy in proteins and the low efficiency of enrichment approaches, only a small fraction of protein glycosites have been reported.
In this study, we combined glycopeptide enrichment strategies for broad analysis of human serum N-glycoproteins, using a tandem enrichment method that couples lectin affinity capture with HILIC. This strategy was applied to profile the human serum N-linked glycoproteome, and it resulted in 32 and 14% more N-glycosites than could be identified with the common lectin affinity capture or HILIC approaches, respectively. With an additional dimension of glycopeptide separation using high-pH reversed-phase liquid chromatography or off-gel electrophoresis, the number of identified glycosites was increased by 3.1-fold and 1.8-fold, respectively. These results demonstrate that tandem enrichment methods, especially when followed by high-pH reversed-phase prefractionation, can greatly improve the power of N-glycoproteome analysis. In total, 615 N-glycosites from 312 glycoproteins (protein group) were mapped using high-accuracy mass spectrometry.
BioMed Research International. 2013. Wenli Zhang. et al. Institute of Computing Technology, Chinese Academy of Sciences
ABSTRACT: Protein identification is an integral part of proteomics research. The available tools to identify proteins in tandem mass spectrometry experiments are not optimized to face current challenges in terms of identification scale and speed owing to the exponential growth of the protein database and the accelerated generation of mass spectrometry data, as well as the demand for nonspecific digestion and post-modifications in complex-sample identification.
As a result, a rapid method is required to mitigate such complexity and computation challenges. This paper thus aims to present an open method to prevent enzyme and modification specificity on a large database. This paper designed and developed a distributed program to facilitate application to computer resources. With this optimization, nearly linear speedup and real-time support are achieved on a large database with nonspecific digestion, thus enabling testing with two classical large protein databases in a 20-blade cluster. This work aids in the discovery of more significant biological results, such as modification sites, and enables the identification of more complex samples, such as metaproteomics samples.
Analytical Chemistry. 2013. Yuan Zhou. et al. Dalian Institute of Chemical Physics
ABSTRACT: Discovering differentially expressed proteins in various biological samples requires proteome quantification methods with accuracy, precision, and wide dynamic range. This study describes a mass defect-based pseudo-isobaric dimethyl labeling (pIDL) method based on the subtle mass defect differences between 12C/13C and 1H/2H. Lys-C protein digests were labeled with CD2O/13CD2O and reduced with NaCNBD3/NaCNBH3 as heavy and light isotopologues, respectively.
The fragment ion pairs with mass differences of 5.84 mDa were resolved by high-resolution tandem mass spectrometry (MS/MS) and used for quantification. The pIDL method described here resulted in highly accurate and precise quantification results with approximately 100-fold dynamic range. Furthermore, the pIDL method was extended to 4-plex proteome quantification and applied to the quantitative analysis of proteomes from Hca-P and Hca-F, two mouse hepatocarcinoma ascites syngeneic cell lines with low and high lymph node metastasis rates.
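The 5.84 mDa spacing quoted above can be checked from standard isotope masses. The sketch below is only an illustration, not the authors' code: it assumes each labeled amine carries two methyl groups, and that each methyl differs between the two isotopologues by one 2H-for-1H swap against one 13C-for-12C swap. The function name and the `n_swaps` parameter are hypothetical.

```python
# Monoisotopic masses in Da (standard atomic mass values, rounded).
MASS_1H = 1.00782503
MASS_2H = 2.01410178
MASS_12C = 12.0
MASS_13C = 13.00335484


def pidl_mass_difference_mda(n_swaps: int = 2) -> float:
    """Mass difference (mDa) between the two pseudo-isobaric labels.

    n_swaps is an assumption: the number of positions where one label
    carries 2H (instead of 1H) and the other carries 13C (instead of 12C).
    """
    delta_h = MASS_2H - MASS_1H    # mass gained by one 1H -> 2H swap
    delta_c = MASS_13C - MASS_12C  # mass gained by one 12C -> 13C swap
    return n_swaps * (delta_h - delta_c) * 1000.0


if __name__ == "__main__":
    print(f"{pidl_mass_difference_mda():.2f} mDa")
```

Under these assumptions a single swap pair contributes about 2.92 mDa, so two pairs give the reported value, which is why high-resolution MS/MS is needed to separate the fragment ion pairs.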
Proteome science. 2013. James P Cleveland. et al. University of South Carolina
ABSTRACT: Independent of the approach used, the ability to correctly interpret tandem MS data depends on the quality of the original spectra. Even in the case of the highest quality spectra, the majority of spectral peaks can not be reliably interpreted. The accuracy of sequencing algorithms can be improved by filtering out such 'noise' peaks. Preprocessing MS/MS spectra to select informative ion peaks increases accuracy and reduces the processing time.
Intuitively, the mix of informative versus non-informative peaks has a direct effect on the quality and size of the resulting candidate peptide search space. As the number of selected peaks increases, the corresponding search space increases exponentially. If we select too few peaks then the ion-ladder interpretation of the spectrum will contain gaps that can only be explained by permutations of combinations of amino acids. This will result in a larger candidate peptide search space and poorer quality candidates. The dependency that peptide sequencing accuracy has on an initial peak selection regime makes this preprocessing step a crucial facet of any approach, whether de novo or not, to MS/MS spectra interpretation. We have developed a novel approach to address this problem. Our approach uses a staged neural network to model ion fragmentation patterns and estimate the posterior probability of each ion type. Our method improves upon other preprocessing techniques and shows a significant reduction in the search space for candidate peptides without sacrificing candidate peptide quality.
Biotechnology Letters. 2013. Da Lei. et al. Nanchang University
ABSTRACT: Neutral protease I from Aspergillus oryzae 3.042 was expressed in Pichia pastoris and its N-glycosylation properties were analyzed. After purification on a nickel-affinity chromatography column, the recombinant neutral protease (rNPI) was confirmed to be N-glycosylated by periodic acid/Schiff’s base staining and Endo H digestion. Moreover, the deglycosylated protein’s molecular weight decreased from 54.5 kDa to 43.3 kDa, as analyzed by SDS-PAGE and MALDI–TOF–MS, and the hyperglycosylation extent was 21%.
The N-glycosylation site of rNPI was analyzed by nano LC–MS/MS after digesting by trypsin and Glu-C, and the unique potential site Asn41 of mature peptide was found to be glycosylated. Homology modeling of the 3D structure of rNPI indicated that the attached N-glycans hardly affected neutral protease’s activity due to the great distance away from the active site of the enzyme.
Journal of proteome. 2013. Ming-kun Yang. et al. Institute of Hydrobiology, Chinese Academy of Sciences
ABSTRACT: Increasing evidence shows that protein phosphorylation on serine (Ser), threonine (Thr), and tyrosine (Tyr) residues is one of the major post-translational modifications in the bacteria, involved in regulating a myriad of physiological processes. Cyanobacteria are one of the largest groups of bacteria and are the only prokaryotes capable of oxygenic photosynthesis. Many cyanobacteria strains contain unusually high numbers of protein kinases and phosphatases with specificity on Ser, Thr, and Tyr residues.
However, only a few dozen phosphorylation sites in cyanobacteria are known, presenting a major obstacle for further understanding the regulatory roles of reversible phosphorylation in this group of bacteria. In this study, we carried out a global and site-specific phosphoproteomic analysis on the model cyanobacterium Synechococcus sp. PCC 7002. In total, 280 phosphopeptides and 410 phosphorylation sites from 245 Synechococcus sp. PCC 7002 proteins were identified through the combined use of protein/peptide prefractionation, TiO2 enrichment, and LC–MS/MS analysis. The identified phosphoproteins were functionally categorized into an interaction map and found to be involved in various biological processes such as two-component signaling pathway and photosynthesis. Our data provide the first global survey of phosphorylation in cyanobacteria by using a phosphoproteomic approach and suggest a wide-ranging regulatory scope of this modification. The provided data set may help reveal the physiological functions underlying Ser/Thr/Tyr phosphorylation and facilitate the elucidation of the entire signaling networks in cyanobacteria.
2013 ICME INTERNATIONAL CONFERENCE ON COMPLEX MEDICAL ENGINEERING (CME). 2013. Rui Su. et al. Beijing Institute of Technology
ABSTRACT: Evidence shows that non-enzymatic glycation plays an important role in the development of diabetes, diabetes-related complications and neurodegenerative diseases. Mass spectrometric methods have been used for non-enzymatic glycation of proteins. At present, CID remains the major fragmentation method for peptide sequencing. In this study, we utilized synthesized peptide models to study fragmentation spectra of glycated peptides, which can be easily studied using ESI-MS.
The MS/MS fragmentation characteristics of glycated peptides will help large-scale glycated proteomics analyses of complex plasma and tissue samples.
Journal of Structural and Functional Genomics. 2013. Eric D. Merkley. et al. Biological Sciences Division, Pacific Northwest
ABSTRACT: Multiprotein complexes, rather than individual proteins, make up a large part of the biological macromolecular machinery of a cell. Understanding the structure and organization of these complexes is critical to understanding cellular function. Chemical cross-linking coupled with mass spectrometry is emerging as a complementary technique to traditional structural biology methods and can provide low-resolution structural information for a multitude of purposes, such as distance constraints in computational modeling of protein complexes.
In this review, we discuss the experimental considerations for successful application of chemical cross-linking-mass spectrometry in biological studies and highlight three examples of such studies from the recent literature. These examples (as well as many others) illustrate the utility of a chemical cross-linking-mass spectrometry approach in facilitating structural analysis of large and challenging complexes.
Anal Bioanal Chem. 2013. Xinyuan Zhao. et al. Beijing Institute of Radiation Medicine
ABSTRACT: Glycosylation is an important posttranslational modification of proteins and plays a crucial role in both cellular functions and secretory pathways. Sialic acids (SAs), a family of nine-carbon-containing acidic monosaccharides, often terminate the glycan structures of cell surface molecules and secreted glycoproteins and perform an important role in many biological processes. Hence, a more profound profiling of the sialylated glycoproteomics may improve our knowledge of this modification and its effects on protein functions.
Here, we systematically investigated different strategies to enrich the SA proteins in human plasma using a newly developed technology that utilizes titanium dioxide for sialylated N-glycoproteomics profiling by mass spectrometry. Our results showed that using a combination of a filter-aided sample preparation method, TiO2 chromatography, multiple enzyme digestion, and two-dimensional reversed-phase peptide fractionation led to a more profound profiling of the SA proteome. In total, 982 glycosylation sites in 413 proteins were identified, among which 37.8% were newly identified, to establish the largest database of sialic acid containing proteins from human plasma.
Analytical Chemistry. 2013. Qun Zhao. et al. Dalian Institute of Chemical Physics
ABSTRACT: Combining good dissolving ability of formic acid (FA) for membrane proteins and excellent complementary retention behavior of proteins on strong cation exchange (SCX) and strong anion exchange (SAX) materials, a biphasic microreactor was established to pretreat membrane proteins at microgram and even nanogram levels. With membrane proteins solubilized by FA, all of the proteomics sample processing procedures, including protein preconcentration, pH adjustment, reduction, and alkylation, as well as tryptic digestion, were integrated into an “SCX-SAX” biphasic capillary column.
To evaluate the performance of the developed microreactor, a mixture of bovine serum albumin, myoglobin, and cytochrome c was pretreated. Compared with the results obtained by the traditional in-solution process, the peptide recovery (93% vs 83%) and analysis throughput (3.5 vs 14 h) were obviously improved. The microreactor was further applied for the pretreatment of 14 μg of membrane proteins extracted from rat cerebellums, and 416 integral membrane proteins (IMPs) (43% of total protein groups) and 103 transmembrane peptides were identified by two-dimensional nanoliquid chromatography-electrospray ionization tandem mass spectrometry (2D nano-LC-ESI-MS/MS) in triplicate analysis. With the starting sample preparation amount decreased to as low as 50 ng, 64 IMPs and 17 transmembrane peptides were identified confidently, while those obtained by the traditional in-solution method were 10 and 1, respectively. All these results demonstrated that such an “SCX-SAX” based biphasic microreactor could offer a promising tool for the pretreatment of trace membrane proteins with high efficiency and throughput.
IEEE International Conference on Bioinformatics and Biomedicine. 2013. Yan Yan. et al. University of Saskatchewan Saskatoon
ABSTRACT: In recent years, de novo peptide sequencing from mass spectrometry data has developed as one of the major peptide identification methods with the emergence of new instruments and advanced computational methods. However, there are still limitations to this method; for example, the typically used spectrum graph model cannot represent all the information and relationships inherent in tandem mass spectra (MS/MS spectra). Here, we present a new spectrum graph model with multiple types of edges (called a multi-edge graph), and integrate amino acid combination (AAC) information and peptide tags into it for peptide sequencing.
In addition, information about immonium ions, observed particularly in higher-energy collisional dissociation (HCD) spectra, is incorporated. Comparisons between the proposed method and another successful de novo peptide sequencing method for HCD spectra, pNovo, were performed. Experiments were conducted on four HCD spectral datasets. Results show that the proposed method outperforms pNovo in terms of full-length peptide identification accuracy; specifically, the accuracy increases by 7%-13% across all four datasets.
Proteomics. 2012. Ying Zhang. et al. Chinese Academy of Medical Sciences
ABSTRACT: Ovarian cancer is the most lethal gynecological malignancy worldwide, and early detection of this disease using serum or plasma biomarkers may improve its clinical outcome. In the present study, a large scale protein database derived from ovarian cancer was created to enable tumor marker discovery.
First, primary organ cultures were established with the tumor tissues and corresponding normal tissues obtained from six ovarian cancer patients, and the serum-free conditioned medium (CM) samples were collected for proteomic analysis. The total proteins from the CM sample were separated by SDS-PAGE, digested with trypsin and then analyzed by LC-MS/MS. Combining data from the tumor tissues and the normal tissues, 1129 proteins were identified in total, of which those categorized as "extracellular proteins" and "plasma membrane proteins" accounted for 21.4% and 16.9%, respectively. For validation, three secretory proteins (NID1, TIMP2, and VCAN) involved in "organ development"-associated subnetwork, showed significant differences between their levels in the circulating plasma samples from ovarian cancer patients and healthy women. In conclusion, this ovarian cancer-derived protein database provides a credible repertoire of potential biomarkers in blood for this malignant disease, and deserves mining further.
High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS). 2012. You Li. et al. Hong Kong Baptist University Kowloon Tong
ABSTRACT: Database searching is a main method for protein identification in shotgun proteomics, and many research efforts are dedicated to improving its effectiveness. However, the efficiency of database searching faces a serious challenge due to the ever-faster growth of protein and peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications (PTMs).
On the other hand, as general-purpose, high-performance parallel hardware, Graphics Processing Units (GPUs) continue to develop and provide another promising platform for parallelizing database-search-based protein identification. It is therefore important to study how GPUs can speed up database search engines for protein identification. In this paper, we mainly utilize GPUs to accelerate the scoring module, which is the most time-consuming component. Specifically, we study two popular scoring methods: the spectral dot product based method, which is used by X!Tandem, and the kernel spectral dot product, which is used by pFind.
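As a rough illustration of the scoring step mentioned above, the following is a minimal CPU sketch of a binned spectral dot product between an experimental and a theoretical spectrum. It is a simplified textbook form, not the actual X!Tandem or pFind scoring code and not a GPU kernel; the function names, the `(m/z, intensity)` tuple representation, and the 1.0 Da bin width are all assumptions made for the example.

```python
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin index -> intensity)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def spectral_dot_product(experimental, theoretical, bin_width=1.0):
    """Dot product of two binned spectra; only bins present in both contribute."""
    exp_bins = bin_spectrum(experimental, bin_width)
    theo_bins = bin_spectrum(theoretical, bin_width)
    return sum(v * theo_bins[k] for k, v in exp_bins.items() if k in theo_bins)


if __name__ == "__main__":
    observed = [(100.2, 10.0), (200.7, 5.0), (300.1, 2.0)]
    predicted = [(100.4, 1.0), (300.9, 1.0), (400.0, 1.0)]
    print(spectral_dot_product(observed, predicted))
```

In a sketch like this, each spectrum-candidate pair is scored independently of every other pair, which is the kind of workload that tends to map well onto data-parallel hardware.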
Analyst. 2012. Qi Wu. et al. Dalian Institute of Chemical Physics
ABSTRACT: Although widely applied in the label-free quantification of proteomics, spectral count (SC)-based abundance measurements suffer from the narrow dynamic range of attainable ratios, leading to the serious underestimation of true protein abundance fold changes, especially when studying biological samples that exhibit very large fold changes in protein expression. MS/MS fragment ion intensity, as an alternative to SC, has recently gained acceptance as the abundance feature of protein in label-free proteomic studies.
Herein, we implemented two formats of MS/MS fragment ion intensity, Spectral Index (SI) and Summed MS/MS TIC (SMT), to alleviate this particular deficiency arising from SC. Each replaced SC in the Normalized Spectral Abundance Factor (NSAF) formula, resulting in two algorithms, abbreviated as NSI and NSMT, respectively. The necessity of the normalization process was validated using a publicly available dataset. Furthermore, when applied to another well-characterized benchmark dataset, both NSI and NSMT showed improved overall accuracy over NSAF for the relative quantification of proteomes. Of the two, NSI enabled the sensitive detection of differentially expressed proteins, while NSMT ensured accurate calculation of protein abundance fold changes. Therefore, the selective use of both algorithms might facilitate the screening and quantification of potential biomarkers on the proteome scale.
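The normalization described above can be sketched as follows, assuming the commonly used NSAF form in which each protein's abundance feature is divided by its length and then by the sum of those ratios over all proteins in the run. Passing spectral counts gives NSAF; passing an SI or SMT value per protein would give the NSI or NSMT analogue. The function name and the dictionary inputs are hypothetical, and how SI and SMT themselves are computed is not shown.

```python
def normalized_abundance_factor(feature, length):
    """Length-normalized, run-normalized abundance per protein.

    feature: protein -> abundance feature (spectral count, SI, or SMT)
    length:  protein -> protein length (e.g. number of residues)
    """
    ratios = {protein: feature[protein] / length[protein] for protein in feature}
    total = sum(ratios.values())
    if total == 0:
        raise ValueError("all abundance features are zero")
    return {protein: ratio / total for protein, ratio in ratios.items()}


if __name__ == "__main__":
    counts = {"A": 10, "B": 30}
    lengths = {"A": 100, "B": 100}
    print(normalized_abundance_factor(counts, lengths))
```

Because the values in one run sum to 1, they can be compared across runs with different total signal, which is the purpose of the normalization step.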
PNAS. 2012. Yanmei Zhao. et al. Institute of Biophysics, Chinese Academy of Sciences
ABSTRACT: Spermiogenesis is a series of poorly understood morphological, physiological and biochemical processes that occur during the transition of immotile spermatids into motile, fertilization-competent spermatozoa. Here, we identified a Serpin (serine protease inhibitor) family protein (As_SRP-1) that is secreted from spermatids during nematode Ascaris suum spermiogenesis (also called sperm activation) and we showed that As_SRP-1 has two major functions.
First, As_SRP-1 functions in cis to support major sperm protein (MSP)-based cytoskeletal assembly in the spermatid that releases it, thereby facilitating sperm motility acquisition. Second, As_SRP-1 released from an activated sperm inhibits, in trans, the activation of surrounding spermatids by inhibiting vas deferens-derived As_TRY-5, a trypsin-like serine protease necessary for sperm activation. Because vesicular exocytosis is necessary to create fertilization-competent sperm in many animal species, components released during this process might be more important modulators of the physiology and behavior of surrounding sperm than was previously appreciated.
Analytical chemistry. 2012. Huiming Yuan. et al. Dalian Institute of Chemical Physics
ABSTRACT: An online integrated platform for proteome profiling was established, with the combination of protein separation by microreversed phase liquid chromatography (μRPLC), online acetonitrile (ACN) removal, and pH adjustment by a hollow fiber membrane interface (HFMI), online digestion by an immobilized enzymatic microreactor (IMER), as well as peptide separation and proteins identification by μRPLC or nano-RPLC-electrospray ionization tandem mass spectrometry (μRPLC-ESI-MS/MS).
To evaluate the performance of such a platform, a three-protein mixture with mass ranging from 5 to 500 ng was analyzed automatically. Compared to the offline counterpart, although similar protein sequence coverages were obtained by the integrated platform, the signal intensity of total ion chromatogram was improved by almost 4 times. In addition, such an integrated platform was further applied for the analysis of extracted proteins from rat brain. Compared to the results obtained by offline counterpart and traditional MudPIT approach under similar conditions, by the integrated platform, the identified protein group number was comparable, but the analysis time was shortened to less than half of that taken by the traditional approaches. All these results demonstrated that our developed integrated platform might offer a promising tool for high-throughput and large-scale profiling of proteomes.
Molecular BioSystems. 2012. Liqi Xie. et al. Fudan University
ABSTRACT: Electron transfer dissociation (ETD) is a useful and complementary activation method for peptide fragmentation in mass spectrometry. However, ETD spectra typically receive a relatively low score in the identifications of 2+ ions. To overcome this challenge, we, for the first time, systematically interrogated the benefits of combining ion charge enhancing methods (dimethylation, guanidination, m-nitrobenzyl alcohol (m-NBA) or Lys-C digestion) and differential search algorithms (Mascot, Sequest, OMSSA, pFind and X!Tandem).
A simple sample (BSA) and a complex sample (AMJ2 cell lysate) were selected in benchmark tests. Clearly distinct outcomes were observed with different experimental protocols. In the analysis of AMJ2 cell lines, X!Tandem and pFind revealed 92.65% of identified spectra; m-NBA adduction led to a 5-10% increase in average charge state and the most significant increase in the number of successful identifications, and Lys-C treatment generated peptides carrying mostly triple charges. Based on the complementary identification results, we suggest that a combination of m-NBA and Lys-C strategies accompanied by X!Tandem and pFind can greatly improve ETD identification.
Journal of Chromatography A. 2012. Bo Jiang. et al. Dalian Institute of Chemical Physics
ABSTRACT: In this paper, magnetic Fe3O4 nanoparticle-modified graphene oxide nanocomposites (GO–CO–NH–Fe3O4) were prepared by covalent bonding, via the reaction between the amino groups of functionalized Fe3O4 and the carboxylic groups of GO, and confirmed by Fourier-transform infrared spectra, Raman spectroscopy, and transmission electron microscopy. With GO–CO–NH–Fe3O4 as a novel substrate, trypsin was immobilized via π–π stacking and hydrogen bonding interaction, and the binding capacity of trypsin reached as high as 0.275 mg/mg.
Since GO–CO–NH–Fe3O4 worked not only as a support for enzyme immobilization, but also as an excellent microwave irradiation absorber, the digestion efficiency could be further improved with microwave assistance. By such an immobilized enzymatic reactor (IMER), standard proteins could be efficiently digested within 15 s, with sequence coverages comparable to or better than those obtained by conventional in-solution digestion (12 h). Since trypsin was immobilized under mild conditions, the enzymatic activity of the IMER was preserved for at least a month. In addition, due to the good hydrophilicity of GO, no peptide residue was observed in the sequential digestion of bovine serum albumin and myoglobin. To further confirm the efficiency of such an IMER for proteome analysis, it was applied to digest proteins extracted from rat liver, followed by nanoRPLC–ESI-MS/MS analysis. With only 5 min of microwave-assisted digestion, a total of 456 protein groups were identified in 3 parallel runs, comparable to the number obtained by 12 h in-solution digestion, indicating the great potential of IMERs with GO–CO–NH–Fe3O4 as the support for high-throughput proteome study.
Rapid Communications in Mass Spectrometry. 2012. Ming Niu. et al. Beijing Institute of Radiation Medicine
ABSTRACT: Chimera spectra make it challenging to identify proteins in complex mixtures by LC/MS/MS. Approximately half of the spectra collected are chimera spectra even when high-resolution tandem mass spectrometry is used. Chimera spectra are generated from the co-fragmentation of different co-eluting peptides, and it is often difficult to distinguish monoisotopic precursors of these peptides from each other. In this paper, we propose a peak intensity ratio-based monoisotopic peak determination algorithm (PIRMD) to distinguish different monoisotopic precursors of chimera spectra.
Monoisotopic peaks in non-overlapping clusters are detected by the edge features of the isotopic peak intensity ratios. For multiple overlapping clusters grouped as one cluster, monoisotopic peaks can be detected by an advanced estimation of the similarity between the estimated and the experimental isotopic distribution based on the isotopic peak intensity ratios. High-resolution mass spectrometric datasets acquired from mixtures of 30 synthetic peptides and mixtures of 18 proteins were used to evaluate the efficiency and accuracy of PIRMD. The results indicate that PIRMD can recognize monoisotopic precursors from the chimera spectra containing non-overlapping and overlapping isotopic clusters. Compared to several published algorithms, PIRMD identifies approximately 2-14% more spectra and has fewer false positives. The results on standard datasets and actual samples demonstrated that PIRMD could notably improve the successful identification rates of the spectra by identifying more chimera spectra, and of the identified spectra, approximately 25% are chimera spectra. This novel algorithm will help to interpret spectra produced by shotgun strategy in proteomics.
Journal of Chromatography A. 2012. Hao Jiang. et al. Dalian Institute of Chemical Physics
ABSTRACT: In this work, a novel kind of N-vinyl-2-pyrrolidinone (NVP) modified poly acrylic ester microspheres was prepared, followed by trypsin immobilization to prepare a hydrophilic immobilized enzyme reactor (IMER), to achieve highly efficient protein digestion with low peptide residue. The nonspecific adsorption of peptides on such an IMER was evaluated by the in sequence digestion of bovine serum albumin (BSA) and myoglobin. Without NVP modification, both proteins could be identified after digestion by a 5 cm-length IMER, but 18 peptides of BSA were found in the digests of myoglobin caused by the nonspecific adsorption of the matrix.
With NVP modification, the hydrophilicity of the IMER was greatly improved, so that not only did the sequence coverage of myoglobin increase from 63% to 73%, but also no residual peptides from BSA were observed in myoglobin digests. Although the sequence coverages of proteins obtained by the IMER were comparable to those obtained by in-solution digestion, the digestion time was shortened from 24 h to 1 min. By such an IMER, a protein mixture containing BSA, myoglobin, and cytochrome c (100, 1 and 0.01 μg/mL, respectively) was digested, and all proteins were unambiguously identified with higher sequence coverages than those achieved by in-solution digestion. Furthermore, the hydrophilic IMER was also off-line coupled to nano-RPLC-ESI-MS/MS for the analysis of proteins extracted from yeast. After 1.5 min digestion, 271 protein groups with at least 2 distinct peptides were identified, many more than those obtained by 24 h in-solution digestion (192 protein groups), indicating the great potential of such an IMER for proteome analysis.
Journal of Proteome Research. 2011. Alberto Paradela. et al. Departamento de Biotecnología Microbiana
ABSTRACT: The RcsC, RcsD, and RcsB proteins compose a system used by enteric bacteria to sense envelope stress. Signal transmission occurs from the sensor RcsC to the transcriptional regulator RcsB. Accessory proteins, such as IgaA, are known to adjust the response level. In a previous transcriptomic study, we uncovered 85 genes differentially expressed in Salmonella enterica serovar Typhimurium igaA mutants. Here, we extended these observations to proteomics by performing differential isotope-coded protein labeling (ICPL) followed by liquid chromatography-electrospray ionization tandem mass spectrometry. Five-hundred five proteins were identified and quantified, with 75 of them displaying significant changes in response to alterations in the RcsCDB system.
Divergent expression at the RNA and protein level was observed for the metabolic genes pckA and metE, involved in gluconeogenesis and methionine synthesis, respectively. When analyzed in diverse environmental conditions, including the intracellular niche of eukaryotic cells, inverse regulation was more evident for metE and in bacteria growing in defined minimal medium or to stationary phase. The RcsCDB system was also shown to repress the synthesis of the small RNA FnrS, previously reported to modulate metE expression. Collectively, these findings provide new insights into post-transcriptional regulatory mechanisms involving the RcsCDB system and its control over metabolic functions.
Skeletal Muscle. 2011. Suzanne E Mate. et al. George Washington University
ABSTRACT: During development, the branchial mesoderm of Torpedo californica transdifferentiates into an electric organ capable of generating high voltage discharges to stun fish. The organ contains a high density of cholinergic synapses and has served as a biochemical model for the membrane specialization of myofibers, the neuromuscular junction (NMJ). We studied the genome and proteome of the electric organ to gain insight into its composition, to determine if there is concordance with skeletal muscle and the NMJ, and to identify novel synaptic proteins. Of 435 proteins identified, 300 mapped to Torpedo cDNA sequences with ≥2 peptides. We identified 14 uncharacterized proteins in the electric organ that are known to play a role in acetylcholine receptor clustering or signal transduction.
In addition, two human open reading frames, C1orf123 and C6orf130, showed high sequence similarity to electric organ proteins. Our profile lists several proteins that are highly expressed in skeletal muscle or are muscle specific. Synaptic proteins such as acetylcholinesterase, acetylcholine receptor subunits, and rapsyn were present in the electric organ proteome but absent in the skeletal muscle proteome. Our integrated genomic and proteomic analysis supports research describing a muscle-like profile of the organ. We show that it is a repository of NMJ proteins but we present limitations on its use as a comprehensive model of the NMJ. Finally, we identified several proteins that may become candidates for signaling proteins not previously characterized as components of the NMJ.
Analytical chemistry. 2011. Yan Zhao. et al. Beijing Institute of Radiation Medicine
ABSTRACT: Glycosylation modifications of proteins have been attracting increasing attention due to their roles in the physiological and pathological processes of the cell. Core fucosylation (CF), one special type of glycan structure in glycoproteins, has been linked with tumorigenesis. The study of protein glycosylation has been hindered by the technical challenges caused by the microheterogeneity of glycan modifications. In commonly used methods, sugar chains on the peptide were released using endoglycosidase, and the glycan and peptides were analyzed separately with mass spectrometry. Although mass spectrometric analysis can be performed easily in this way, an increase in false positives when assigning glycosites was inevitable.
Our earlier research demonstrated a strategy combining Endo F3-catalyzed partial deglycosylation with MS3 (MS/MS/MS) scanning triggered by the neutral loss of a fucose to precisely identify CF proteins on a large scale. In this research, fragmentations of partially deglycosylated glycopeptides were studied using a triple quadrupole mass spectrometer, and a quantification method that coupled our published identification strategy with multiple reaction monitoring-mass spectrometry (MRM-MS) analysis was developed to obtain site-specific quantification information of core fucosylated peptides. To illustrate the feasibility of the quantification method, the CF peptides of target proteins in clinical serum were quantified and compared as a preliminary demonstration.
PLOS ONE. 2011. Amit Kumar Yadav. et al. Institute of Genomics and Integrative Biology
ABSTRACT: Plasma is the most easily accessible source for biomarker discovery in clinical proteomics. However, identifying potential biomarkers from plasma is a challenge given the large dynamic range of proteins. The potential biomarkers in plasma are generally present at very low abundance levels and hence identification of these low abundance proteins necessitates the depletion of highly abundant proteins. Sample pre-fractionation using immuno-depletion of high abundance proteins using multi-affinity removal system (MARS) has been a popular method to deplete multiple high abundance proteins.
However, depletion of these abundant proteins can result in concomitant removal of low abundant proteins. Although there are some reports suggesting the removal of non-targeted proteins, the predominant view is that the number of such proteins is small. In this study, we identified proteins that are removed along with the targeted high abundant proteins. Three plasma samples were depleted using each of the three MARS (Hu-6, Hu-14 and Proteoprep 20) cartridges. The affinity bound fractions were subjected to gelC-MS using an LTQ-Orbitrap instrument. Using four database search algorithms including MassWiz (developed in house), we selected the peptides identified at <1% FDR. Peptides identified by at least two algorithms were selected for protein identification. After this rigorous bioinformatics analysis, we identified 101 proteins with high confidence. Thus, we believe that for biomarker discovery and proper quantitation of proteins, it might be better to study both bound and depleted fractions from any MARS depleted plasma sample.
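The "identified by at least two algorithms" rule above can be sketched as a simple consensus filter. This is an illustrative sketch rather than the authors' pipeline: it assumes each search engine's peptide list has already been filtered at <1% FDR, and the function name, engine labels, and peptide sequences are hypothetical.

```python
from collections import Counter


def consensus_peptides(engine_hits, min_engines=2):
    """Return peptides reported by at least `min_engines` search engines.

    engine_hits: engine name -> iterable of peptide sequences that passed
                 that engine's own FDR threshold (assumed done upstream).
    """
    counts = Counter()
    for peptides in engine_hits.values():
        counts.update(set(peptides))  # count each peptide once per engine
    return {peptide for peptide, n in counts.items() if n >= min_engines}


if __name__ == "__main__":
    hits = {
        "engine_a": ["PEPTIDEK", "AAAK"],
        "engine_b": ["PEPTIDEK"],
        "engine_c": ["AAAK", "CCCR"],
        "engine_d": [],
    }
    print(sorted(consensus_peptides(hits)))
```

Requiring agreement between engines trades some sensitivity for confidence, since a peptide seen by only one engine is dropped even if it passed that engine's FDR cut.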