pFind Studio: a computational solution for mass spectrometry-based proteomics



2023




Expression of a Siglec-Fc Protein and Its Characterization
BIOLOGY-BASEL 2023. Kaijun Chi et al. Jiangnan Univ, Sch Biotechnol, Key Lab Carbohydrate Chem & Biotechnol, Minist Educ, Wuxi 214122, Peoples R China; Chinese Acad Sci, Inst Proc Engn, State Key Lab Biochem Engn, Beijing 100190, Peoples R China
ABSTRACT:Simple Summary: The Siglec-Fc protein, a fusion protein combining Siglec with the Fc part of a human antibody, is a promising sialic acid-Siglec axis-targeted agent for cancer treatment and is widely used for Siglec ligand discovery. The recombinant Siglec-Fc fusion protein has been expressed in different cell systems. However, its characteristics have not been investigated in detail. In this study, HEK293 and CHO cell lines were used to express the Siglec9-Fc protein, and their adaptability for production was compared. We optimized culture conditions and compared the glycosylation, yield, dimerization and sialic acid binding activity of the Siglec9-Fc protein produced in HEK293 and CHO. Using purified recombinant protein, we further analyzed the distribution of Siglec9 ligands on cancer cell lines, as well as bladder cancer tissue, and revealed the potential ligands. Our findings provide support for the selection of Siglec9-Fc protein expression systems and detection of related Siglec9 ligands. The emerging importance of the Siglec-sialic acid axis in human disease, especially cancer, has necessitated the identification of ligands for Siglecs. Recombinant Siglec-Fc fusion proteins have been widely used as ligand detectors, and also as sialic acid-targeted antibody-like proteins for cancer treatment. However, the heterogeneous properties of the Siglec-Fc fusion proteins prepared from various expression systems have not been fully elucidated. In this study, we selected HEK293 and CHO cells for producing Siglec9-Fc and further evaluated the properties of the products. The protein yield in CHO (8.23 mg/L) was slightly higher than that in HEK293 (7.46 mg/L). The Siglec9-Fc possesses five N-glycosylation sites and one of them is located in its Fc domain, which is important for the quality control of protein production and also the immunogenicity of Siglec-Fc. Our glyco-analysis confirmed that the recombinant protein from HEK293 received more fucosylation, while CHO showed more sialylation. Both products revealed a high dimerization ratio and sialic acid binding activity, which was confirmed by the staining of cancer cell lines and bladder cancer tissue. Finally, our Siglec9-Fc product was used to analyze the potential ligands on cancer cell lines.
Use: pGlyco



Glycopeptide database search and de novo sequencing with PEAKS GlycanFinder enable highly sensitive glycoproteomics
Nature Communications 2023. Sun, Weiping et al. Bioinformatics Solutions Inc., Waterloo, Ontario, Canada
ABSTRACT:Here we present GlycanFinder, a database search and de novo sequencing tool for the analysis of intact glycopeptides from mass spectrometry data. GlycanFinder integrates peptide-based and glycan-based search strategies to address the challenge of complex fragmentation of glycopeptides. A deep learning model is designed to capture glycan tree structures and their fragment ions for de novo sequencing of glycans that do not exist in the database. We performed extensive analyses to validate the false discovery rates (FDRs) at both peptide and glycan levels and to evaluate GlycanFinder based on comprehensive benchmarks from previous community-based studies. Our results show that GlycanFinder achieved comparable performance to other leading glycoproteomics software in terms of both FDR control and the number of identifications. Moreover, GlycanFinder was also able to identify glycopeptides not found in existing databases. Finally, we conducted a mass spectrometry experiment for antibody N-linked glycosylation profiling that could distinguish isomeric peptides and glycans in four immunoglobulin G subclasses, which had been a challenging problem for previous studies.
Use: pGlyco; pDeep



Development and validation of a method for analyzing the sialylated glycopeptides of recombinant erythropoietin in urine using LC-HRMS
Scientific Reports 2023. Yoondam Seo et al. Doping Control Center, Korea Institute of Science and Technology, Hwarang-ro 14-gil, Seongbuk-gu, Seoul, 02792, Republic of Korea
ABSTRACT:Erythropoietin (EPO) is a glycoprotein hormone that stimulates red blood cell production. It is produced naturally in the body and is used to treat patients with anemia. Recombinant EPO (rEPO) is used illicitly in sports to improve performance by increasing the blood's capacity to carry oxygen. The World Anti-Doping Agency has therefore prohibited the use of rEPO. In this study, we developed a bottom-up mass spectrometric method for profiling the site-specific N-glycosylation of rEPO. We revealed that intact glycopeptides have a site-specific tetra-sialic glycan structure. Using this structure as an exogenous marker, we developed a method for use in doping studies. The profiling of rEPO N-glycopeptides revealed the presence of tri- and tetra-sialylated N-glycopeptides. By selecting a peptide with a tetra-sialic acid structure as the target, its limit of detection (LOD) was estimated to be < 500 pg/mL. Furthermore, we confirmed the detection of the target rEPO glycopeptide using three other rEPO products. We additionally validated the linearity, carryover, selectivity, matrix effect, LOD, and intraday precision of this method. To the best of our knowledge, this is the first report of a doping analysis using liquid chromatography/mass spectrometry-based detection of the rEPO glycopeptide with a tetra-sialic acid structure in human urine samples.
Use: pGlyco



High-coverage four-dimensional data-independent acquisition proteomics and phosphoproteomics enabled by deep learning-driven multidimensional predictions
Analytical Chemistry 2023. Moran Chen et al. Wuhan Univ, Inst Adv Studies, Wuhan 430072, Hubei, Peoples R China
ABSTRACT:Four-dimensional (4D) data-independent acquisition (DIA)-based proteomics is a promising technology. However, its full performance is restricted by the time-consuming building and limited coverage of a project-specific experimental library. Herein, we developed a versatile multifunctional deep learning model Deep4D based on self-attention that could predict the collisional cross section, retention time, fragment ion intensity, and charge state with high accuracies for both the unmodified and phosphorylated peptides and thus established the complete workflows for high-coverage 4D DIA proteomics and phosphoproteomics based on multidimensional predictions. A 4D predicted library containing approximately 2 million peptides was established that could realize experimental library-free DIA analysis, and 33% more proteins were identified than using an experimental library of single-shot measurement in the example of HeLa cells. These results show the great value of the convenient high-coverage 4D DIA proteomics methods.
Use: pDeep



Test-Time Training for Deep MS/MS Spectrum Prediction Improves Peptide Identification
Journal of Proteome Research 2023. Jianbai Ye et al. MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, Anhui 230026, China
ABSTRACT:
Use: pDeep



AIomics: exploring more of the proteome using mass spectral libraries extended by artificial intelligence
Journal of Proteome Research 2023. Lewis Y. Geer et al. Natl Inst Stand & Technol, Mass Spectrometry Data Ctr, Biomol Measurement Div, Gaithersburg, MD 20899 USA
ABSTRACT:The unbounded permutations of biological molecules, including proteins and their constituent peptides, present a dilemma in identifying the components of complex biosamples. Sequence search algorithms used to identify peptide spectra can be expanded to cover larger classes of molecules, including more modifications, isoforms, and atypical cleavage, but at the cost of false positives or false negatives due to the simplified spectra they compute from sequence records. Spectral library searching can help solve this issue by precisely matching experimental spectra to library spectra with excellent sensitivity and specificity. However, compiling spectral libraries that span entire proteomes is pragmatically difficult. Neural networks that predict complete spectra containing a full range of annotated and unannotated ions can be used to replace these simplified spectra with libraries of fully predicted spectra, including modified peptides. Using such a network, we created predicted spectral libraries that were used to rescore matches from a sequence search done over a large search space, including a large number of modifications. Rescoring improved the separation of true and false hits by 82%, yielding an 8% increase in peptide identifications, including a 21% increase in nonspecifically cleaved peptides and a 17% increase in phosphopeptides.
Use: pDeep



Machine learning-based peptide-spectrum match rescoring opens up the immunopeptidome
2023. Charlotte Adams et al. Laboratory of Protein Science, Proteomics and Epigenetic Signaling (PPES), Department of Biomedical Sciences, University of Antwerp, Antwerp,
ABSTRACT:
Use: pDeep



Dynamic localization of the chromosomal passenger complex is controlled by the orphan kinesins KIN-A and KIN-B in the kinetoplastid parasite Trypanosoma brucei
eLife 2023. Ballmer, Daniel et al. Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, United Kingdom; The Wellcome Centre for Cell Biology, Institute of Cell Biology, School of Biological Sciences, Edinburgh, EH9 3BF, United Kingdom
ABSTRACT:
Use: pFind; pDeep; pLink