pFind Studio: a computational solution for mass spectrometry-based proteomics



2022




EGLN1 prolyl hydroxylation of hypoxia-induced transcription factor HIF1 is repressed by SET7-catalyzed lysine methylation
Journal of Biological Chemistry 2022. Tang, JH et al. Univ Chinese Acad Sci, Beijing, Peoples R China; Chinese Acad Sci, Innovat Seed Design, Wuhan, Peoples R China; Hubei Hongshan Lab, Wuhan, Peoples R China; Chinese Acad Sci, Inst Hydrobiol, State Key Lab Freshwater Ecol & Biotechnol, Wuhan, Peoples R China
ABSTRACT:Egg-laying defective nine 1 (EGLN1) functions as an oxygen sensor to catalyze prolyl hydroxylation of the transcription factor hypoxia-inducible factor-1 alpha under normoxia conditions, leading to its proteasomal degradation. Thus, EGLN1 plays a central role in the hypoxia-inducible factor-mediated hypoxia signaling pathway; however, the posttranslational modifications that control EGLN1 function remain largely unknown. Here, we identified that a lysine monomethylase, SET7, catalyzes EGLN1 methylation on lysine 297, resulting in the repression of EGLN1 activity in catalyzing prolyl hydroxylation of hypoxia-inducible factor-1 alpha. Notably, we demonstrate that the methylation mimic mutant of EGLN1 loses the capability to suppress the hypoxia signaling pathway, leading to the enhancement of cell proliferation and the oxygen consumption rate. Collectively, our data identify a novel modification of EGLN1 that is critical for inhibiting its enzymatic activity and which may benefit cellular adaptation to conditions of hypoxia.
Use: pFind



Targeting tumor endothelial hyperglycolysis enhances immunotherapy through remodeling tumor microenvironment
Acta Pharmaceutica Sinica B 2022. Shan, YL et al. China Pharmaceut Univ, Key Lab Drug Metab & Pharmacokinet, State Key Lab Nat Med, Nanjing 210009, Peoples R China; Nanjing Med Univ, Dept Hematol & Oncol, Affiliated Huaian 1 Peoples Hosp, Huaian 223300, Peoples R China
ABSTRACT:Vascular abnormality is a hallmark of most solid tumors and facilitates immune evasion. Targeting the abnormal metabolism of tumor endothelial cells (TECs) may provide an opportunity to improve the outcome of immunotherapy. Here, in comparison to vascular endothelial cells from adjacent peritumoral tissues in patients with colorectal cancer (CRC), TECs presented enhanced glycolysis with higher glyceraldehyde-3-phosphate dehydrogenase (GAPDH) expression. Then an unbiased screening identified that osimertinib could modify the GAPDH and thus inhibit its activity in TECs. Low-dose osimertinib treatment caused tumor regression with vascular normalization and increased infiltration of immune effector cells in tumor, which was due to the reduced secretion of lactate from TECs by osimertinib through the inhibition of GAPDH. Moreover, osimertinib and anti-PD-1 blockade synergistically retarded tumor growth. This study provides a potential strategy to enhance immunotherapy by targeting the abnormal metabolism of TECs.
Use: pFind



Accurate discrimination of leucine and isoleucine residues by combining continuous digestion with multiple MS3 spectra integration in protein sequence
Talanta 2022. Zhang, Weijie et al. Chinese Acad Sci, Dalian Inst Chem Phys, Natl Chromatog R&A Ctr, CAS Key Lab Separat Sci Analyt Chem, Dalian 116023, Liaoning, Peoples R China
ABSTRACT:Protein de novo sequencing based on tandem mass spectrometry is a crucial technology that enables the identification of peptides without searching databases and assembling unknown sequence proteins, especially for monoclonal antibodies (mAbs). However, the discrimination of leucine (Leu) and isoleucine (Ile) residues in the target protein sequence is still challenging. Herein, we developed an accurate method by continuous digestion with MS3-based fragmentation and multiple spectra integration (evaluated by combined verification score, CVS) to distinguish Leu and Ile residues. Continuous digestion promotes the diversity of peptides in order to expose more Leu and Ile at the N-terminal. CVS integrates multiple MS3 spectra to reduce the interference from noise and co-fragmented ions and improve accuracy. This method successfully resolved all 75 Leu/Ile in bovine serum albumin, especially 3 consecutive Leu/Ile. We further applied the method to analyze trastuzumab and 67 out of the 68 Leu/Ile from the light chain and heavy chain were accurately discriminated, demonstrating the great potential in mAbs sequencing.
Use: pFind



Mirror proteases of Ac-Trypsin and Ac-LysargiNase precisely improve novel event identifications in Mycolicibacterium smegmatis MC2 155 by proteogenomic analysis
Frontiers in Microbiology 2022. Songhao Jiang et al. Guangzhou Univ Chinese Med, Clin Med Coll 2, Guangzhou Higher Educ Mega Ctr, Guangzhou, Peoples R China; Chinese Acad Med Sci, Inst Life, Beijing Proteome Res Ctr, Natl Ctr Prot Sci Beijing,State Key Lab Prote,Res, Beijing, Peoples R China; Chinese Acad Med Sci & Peking Union Med Coll, Inst Med Biotechnol, Res Unit Prote & Res & Dev New Drug, Beijing, Peoples R China; Hebei Univ, Sch Life Sci, Key Lab Microbial Divers Res & Applicat Hebei, Baoding, Peoples R China
ABSTRACT:Accurate identification of novel peptides remains challenging because of the lack of evaluation criteria in large-scale proteogenomic studies. Mirror proteases of trypsin and lysargiNase can generate complementary b/y ion series, providing the opportunity to efficiently assess authentic novel peptides in experiments other than filter potential targets by different false discovery rates (FDRs) ranking. In this study, a pair of in-house developed acetylated mirror proteases, Ac-Trypsin and Ac-LysargiNase, were used in Mycolicibacterium smegmatis MC2 155 for proteogenomic analysis. The mirror proteases accurately identified 368 novel peptides, exhibiting 75-80% b and y ion coverages against 65-68% y or b ion coverages of Ac-Trypsin (38.9% b and 68.3% y) or Ac-LysargiNase (65.5% b and 39.6% y) as annotated peptides from M. smegmatis MC2 155. The complementary b and y ion series largely increased the reliability of overlapped sequences derived from novel peptides. Among these novel peptides, 311 peptides were annotated in other public M. smegmatis strains, and 57 novel peptides with more continuous b and y pairs were obtained for further analysis after spectral quality assessment. This enabled mirror proteases to successfully correct six annotated proteins' N-termini and detect 17 new coding open reading frames (ORFs). We believe that mirror proteases will be an effective strategy for novel peptide detection in both prokaryotic and eukaryotic proteogenomics.
Use: pFind



Deep N-terminomics of Mycobacterium tuberculosis H37Rv extensively correct annotated encoding genes
Genomics 2022. Shi, JH et al. Chinese Acad Med Sci, Beijing Proteome Res Ctr, Natl Ctr Prot Sci Beijing,State Key Lab Proteom, Res Unit Prote & Res & Dev New Drug,Inst Lifeom, Beijing 102206, Peoples R China
ABSTRACT:Mycobacterium tuberculosis (MTB) is a severe causing agent of tuberculosis (TB). Although H37Rv, the type strain of M. tuberculosis was sequenced in 1998, annotation errors of encoding genes have been frequently reported in hundreds of papers. This phenomenon is particularly severe at the 5' end of the genes. Here, we applied a TMPP [(N-Succinimidyloxycarbonylmethyl) tris (2,4,6-trimethoxyphenyl) phosphonium bromide] labeling combined with StageTip separating strategy on M. tuberculosis H37Rv to characterize the N-terminal start sites of its annotated encoding genes. Totally, 1047 proteins were identified with 2058 TMPP labeled N-terminal peptides from all the 2625 mass spectrometer (MS) sequenced proteins. Comparative genomics analysis allowed the reannotation of 43 proteins' N-termini in H37Rv and 762 proteins in Mycobacteriaceae. All revised N-termini start sites were distributed in 5'-UTR of annotated genes due to over-annotation of previous N-terminal initiation codon, especially the ATG. In addition, we identified and verified a novel gene Rv1078A in +3 frame different from the annotated gene Rv1078 in +2 frame. Altogether, our findings contribute to the better understanding of N-terminal of H37Rv and other species from Mycobacteriaceae that can assist future studies on biological study.
Use: pFind



Ac-LysargiNase efficiently helps genome reannotation of Mycolicibacterium smegmatis MC2 155
Journal of Proteomics 2022. Zhu, HM et al. Chinese Acad Med Sci, Inst Life, Beijing Proteome Res Ctr,State Key Lab Prote, Natl Ctr Prot Sci Beijing,Res Unit Prote & Res &, Beijing 102206, Peoples R China
ABSTRACT:Accurate genome annotation, the foundation of life science research in the genome era, is hampered by limited known gene models, nonstandard start codons, and the limited homology of annotated genes in other organisms. LysargiNase mirrors trypsin at the cleavage sites, providing the opportunity to identify peptides other than tryptic peptides. In this study, we used an in-house developed acetylated LysargiNase (Ac-LysargiNase) with higher activity and stability in non-pathogenic Mycolicibacterium smegmatis MC2 155 to supplement the widely used trypsin in proteomic studies. We identified 27,582 peptides from 3844 annotated proteins and 332 novel genome search-specific peptides (GSSPs). Among these GSSPs, 88 peptides were annotated in another M. smegmatis genome database, and 41 were verified as novel peptides by predicted theoretical spectra and their corresponding 15N-labeling spectra. Further analysis revealed that 17 verified GSSPs corrected the N-terminus of the 13 annotated genes. The other 24 verified GSSPs helped identify 17 novel open reading frames (ORFs) missed in previously annotated M. smegmatis genomes. Among these novel ORFs, four relatively small proteins with amino acid residues less than 100 and three were precisely identified with C-terminal peptides. Ac-LysargiNase helps with genome reannotation by identifying new genes and events in proteogenomic studies. Significance: Correct genomic annotation is vital in the field of life sciences. The nonstandard start codons seriously affect the confirmation of the translation initiation sites (TISs) of an open reading frame (ORF), and unknown structural genes are easily missed in automated gene prediction. Although proteogenomics presents new avenues for validating gene expression and gene structure refinement based on conventional tryptic peptides, determining the TISs and potential encoding genes is complicated. Thus, validation of TISs and encoding ORFs is crucial and urgent. Therefore, we recommend Ac-LysargiNase, a mirror enzyme of trypsin that can identify additional novel peptides for N-terminal correction and ORF identification.
Use: pFind



A novel proteogenomic integration strategy expands the breadth of neo-epitope sources
Cancers 2022. Xiang, Haitao et al. BGI Shenzhen, Shenzhen 518103, Peoples R China; Guangdong Prov Key Lab Human Dis Genom, Shenzhen Key Lab Genom, Shenzhen 518083, Peoples R China; BGI, Shenzhen 518083, Peoples R China
ABSTRACT:Simple Summary: Tumor-specific antigens are ideal targets for cancer immunotherapy. Mass spectrometry, which is the main method that directly identifies neo-epitopes presented on tumor cells, focuses mainly on peptides derived from annotated protein-coding exomes. However, non-canonical peptides arising from alterations at genomic, transcriptional, and posttranslational levels have been identified in several pioneering studies, making it necessary to develop an integrated proteogenomic approach that can comprehensively identify neoantigens derived from all genomic regions. Our novel strategy combining database searches with a de novo peptide sequencing method accurately identified multiple types of non-canonical peptides in the colorectal cancer cell line, HCT116. This practical proteogenomic strategy can be applied to neoantigen discovery in clinical tumor samples, improving cancer immunotherapy. Tumor-specific antigens can activate T cell-based antitumor immune responses and are ideal targets for cancer immunotherapy. However, their identification is still challenging. Although mass spectrometry can directly identify human leukocyte antigen (HLA) binding peptides in tumor cells, it focuses on tumor-specific antigens derived from annotated protein-coding regions constituting only 1.5% of the genome. We developed a novel proteogenomic integration strategy to expand the breadth of tumor-specific epitopes derived from all genomic regions. Using the colorectal cancer cell line HCT116 as a model, we accurately identified 10,737 HLA-presented peptides, 1293 of which were non-canonical peptides that traditional database searches could not identify. Moreover, we found eight tumor neo-epitopes derived from somatic mutations, four of which were not previously reported. Our findings suggest that this new proteogenomic approach holds great promise for increasing the number of tumor-specific antigen candidates, potentially enlarging the tumor target pool and improving cancer immunotherapy.
Use: pFind



De-sialylation of glycopeptides by acid treatment: enhancing sialic acid removal without reducing the identification
Analytical Methods 2022. Dong, Wenbo et al. Northwest Univ, Coll Life Sci, Xian 710069, Shaanxi, Peoples R China
ABSTRACT:Sialic acid, a common terminal monosaccharide on many glycoconjugates, plays essential roles in many biological processes such as immune responses, pathogen recognition, and cancer development. For various purposes, sialic acids may need to be removed from glycopeptides or glycans, mainly using enzymatical or chemical approaches. In this study, we found that most commonly used chemical methods couldn't completely remove sialic acids from glycopeptides. Although the de-sialylation efficiency could be further enhanced by increasing the treatment time or acid concentration, the undesirable side reactions on the peptide portion would decrease glycopeptide identification. By adding the deamidation on carbamidomethyl-cysteine (C), asparagine (N), and glutamine (Q) residues as a variable modification during database search, most of the unidentified spectra could be recovered. This optional acid-treatment and database search method for the complete removal of sialic acids without losing much spectral identification should be quite useful for many glycomic and glycoproteomic studies.
Use: pFind



Quantitative model suggests both intrinsic and contextual features contribute to the transcript coding ability determination in cells
Briefings in Bioinformatics 2022. Kang, Yu-Jian et al. Peking Univ, Biomed Pioneering Innovat Ctr BIOPIC, Beijing Adv Innovat Ctr Genom ICG, Ctr Bioinformat CBI,Sch Life Sci, Beijing 100871, Peoples R China; Peking Univ, State Key Lab Prot & Plant Gene Res, Sch Life Sci, Beijing 100871, Peoples R China
ABSTRACT:Gene transcription and protein translation are two key steps of the 'central dogma.' It is still a major challenge to quantitatively deconvolute factors contributing to the coding ability of transcripts in mammals. Here, we propose ribosome calculator (RiboCalc) for quantitatively modeling the coding ability of RNAs in human genome. In addition to effectively predicting the experimentally confirmed coding abundance via sequence and transcription features with high accuracy, RiboCalc provides interpretable parameters with biological information. Large-scale analysis further revealed a number of transcripts with a variety of coding ability for distinct types of cells (i.e. context-dependent coding transcripts), suggesting that, contrary to conventional wisdom, a transcript's coding ability should be modeled as a continuous spectrum with a context-dependent nature.
Use: pFind



Angiotensin-converting enzyme genotype-specific immune response contributes to the susceptibility of COVID-19: a nested case-control study
Frontiers in Pharmacology 2022. Gong, Pengyun et al. Hubei Ctr Dis Control & Prevent, Wuhan, Peoples R China; Beihang Univ, Sch Biol Sci & Med Engn, Beijing, Peoples R China; Capital Med Univ, Beijing YouAn Hosp, Dept Radiol, Beijing, Peoples R China; Beihang Univ, Sch Engn Med, Beijing, Peoples R China
ABSTRACT:Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of coronavirus disease 2019 (COVID-19), which has resulted in a global pandemic. Methodology: We used a two-step polymerase chain reaction to detect the ACE genotype and ELISA kits to detect the cytokine factor. We also used proteomics to identify the immune pathway related to the ACE protein expression. Result: In this study, we found that the angiotensin-converting enzyme (ACE) deletion polymorphism was associated with the susceptibility to COVID-19 in a risk-dependent manner among the Chinese population. D/D genotype distributions were higher in the COVID-19 disease group than in the control group (D/D odds ratio is 3.87 for mild (p value < 0.0001), 2.59 for moderate (p value = 0.0002), and 4.05 for severe symptoms (p value < 0.0001), logic regression analysis). Moreover, genotype-specific cytokine storms and immune responses were found enriched in patients with the ACE deletion polymorphism, suggesting the contribution to the susceptibility to COVID-19. Finally, we identified the immune pathway such as the complement system related to the ACE protein expression of patients by lung and plasma proteomics. Conclusion: Our results demonstrated that it is very important to consider gene polymorphisms in the population to discover a host-based COVID-19 vaccine and drug design for preventive and precision medicine.
Use: pFind