pFind Studio: a computational solution for mass spectrometry-based proteomics

Introduction

pSite is a quality control tool that supports both amino acid confidence evaluation in de novo peptide sequencing and modification site localization. De novo peptide sequencing deduces peptide sequences directly from MS/MS data without using any database, so it can be used to find novel peptides, e.g., those carrying mutations or unexpected modifications. However, one major problem remains: how to control the false discovery rate (FDR) in de novo peptide sequencing. Because de novo sequencing often reports sequences that are only partly correct, controlling the false amino acid rate (FAR) is more practical than controlling the FDR of full-length peptides. pSite evaluates the confidence of each amino acid in a de novo sequencing result, and the same method can also be applied to a second problem: modification site localization.

pSite has the following features:

1. It can report the confidence of each amino acid in de novo peptide sequencing results.

2. It can support the results of almost all de novo sequencing software, e.g., pNovo, PEAKS and Novor. All you need to know is the format of the input files, which is described in the readme.txt file.

3. It can localize modification sites, e.g., phosphorylation sites, and report the confidence of each localized site.

4. It can estimate the false amino acid rate (FAR) of amino acids and the false localization rate (FLR) of modification sites.

5. It can run multiple processes in parallel to improve the search speed.

6. It has a friendly search interface.

Figure 1. An example of confidence evaluation of amino acids by enumerating competitive partial sequences. Assume that the correct peptide sequence is AQPSK and that the confidence of the first residue, A, is to be evaluated. All subsequences whose masses equal the mass of AQPS (383.18 Da) within a given mass tolerance, e.g., 20 ppm, are enumerated, e.g., QAPS, QPAS, QSAP, …, TQPG. Any enumerated subsequence that contains the amino acid under evaluation preceded by the same summed prefix mass as in the original sequence is removed, such as APQS, APSQ and ASPQ, because it still supports that residue rather than competing with it. Note that the enumerated subsequences do not have to be of length 4 (the length of the original subsequence AQPS). For example, GATGP and GAAPS are also valid subsequences because their masses are also equal to 383.18 Da. The original subsequence is then replaced by each enumerated subsequence to generate competitive peptides: QAPSK, QPASK, QSAPK, …, TQPGK. The score, i.e., the number of matched peaks, is 7 for the original sequence AQPSK and varies from 2 to 6 for the competitive sequences.
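
The enumeration step described in Figure 1 can be illustrated with a minimal Python sketch. This is not pSite's implementation: the function names (enumerate_by_mass, keeps_residue, competitive_peptides) and the start/end window arguments are hypothetical, the residue masses are standard monoisotopic values with I folded into L, modifications are ignored, and the peak-matching score is left out because it requires a spectrum.

```python
# Standard monoisotopic residue masses in Da. I and L are isobaric, so only L is kept.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def seq_mass(seq):
    """Summed residue mass of a sequence (no water or proton added)."""
    return sum(RESIDUE_MASS[aa] for aa in seq)


def enumerate_by_mass(target, tol_ppm=20.0):
    """Return every residue sequence, of any length, whose mass is within tol_ppm of target."""
    tol = target * tol_ppm * 1e-6
    lightest = min(RESIDUE_MASS.values())
    found = []

    def extend(prefix, mass):
        if abs(mass - target) <= tol:
            found.append(prefix)
        if mass + lightest > target + tol:
            return  # no residue can be appended without overshooting
        for aa, m in RESIDUE_MASS.items():
            if mass + m <= target + tol:
                extend(prefix + aa, mass + m)

    extend("", 0.0)
    return found


def keeps_residue(candidate, prefix_mass, residue, tol):
    """True if candidate has `residue` preceded by the same summed prefix mass."""
    mass = 0.0
    for aa in candidate:
        if aa == residue and abs(mass - prefix_mass) <= tol:
            return True
        mass += RESIDUE_MASS[aa]
    return False


def competitive_peptides(peptide, start, end, index, tol_ppm=20.0):
    """Replace peptide[start:end] with every equal-mass subsequence that
    does not reproduce the residue at `index` at the same prefix mass."""
    peptide = peptide.replace("I", "L")
    target = seq_mass(peptide[start:end])
    tol = target * tol_ppm * 1e-6
    prefix_mass = seq_mass(peptide[start:index])
    residue = peptide[index]
    return [
        peptide[:start] + cand + peptide[end:]
        for cand in enumerate_by_mass(target, tol_ppm)
        if not keeps_residue(cand, prefix_mass, residue, tol)
    ]


if __name__ == "__main__":
    competitors = competitive_peptides("AQPSK", start=0, end=4, index=0)
    print(len(competitors), "competitive peptides, e.g.", sorted(competitors)[:5])
```

With the window AQPS and the first residue under evaluation, candidates such as QAPS, TQPG, GATGP and GAAPS are kept, while APQS, APSQ and ASPQ are dropped because they begin with A. In a full pipeline, each competitive peptide would then be scored against the spectrum and compared with the score of the original sequence.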




Supplemental Files

pSite is currently free to use. Download pSite.

Notice: Please read the pSite Software License Agreement carefully before downloading and using the software. Please fill in the registration form and email it to pnovo@ict.ac.cn to get the registration key.

If you have any questions about pSite, please contact pnovo@ict.ac.cn.



Publications

Journal of Proteome Research, 2017. [abstract]

pSite: Amino Acid Confidence Evaluation for Quality Control of De Novo Peptide Sequencing and Modification Site Localization.

Hao Yang, Hao Chi, Wen-Jing Zhou, Wen-Feng Zeng, Chao Liu, Rui-Min Wang, Zhao-Wei Wang, Xiu-Nan Niu, Zhen-Lin Chen, and Si-Min He.