Proteomics - National Center for Research Resources

Software and Tools

Protein Digestion Simulator Basic

Protein Digestion Simulator Basic reads a text file containing protein or peptide sequences (FASTA format or delimited text) and writes the data to a tab-delimited file. It can optionally digest the input sequences using trypsin or partial trypsin rules, and can add predicted normalized elution time (NET) values for the resulting peptides. It can also validate a FASTA file against a set of rules that identify common formatting errors. Lastly, the program can calculate the number of uniquely identifiable peptides in an input file using mass alone, or both mass and NET, given user-defined tolerances (see Peptide Uniqueness Options below).
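As an illustration of rule-based digestion, the following Python sketch splits a sequence at cleavage sites. It is not the program's own code. It assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P); the function name `digest` and its parameters, including `max_missed` for missed cleavages, are hypothetical. Partial trypsin rules are not covered.

```python
def digest(sequence, cleave_after="KR", no_cleave_before="P", max_missed=0):
    """Split a sequence at enzyme cleavage sites (illustrative sketch only).

    Assumption: a site is cut after any residue in `cleave_after` unless the
    following residue is in `no_cleave_before`.
    """
    if not sequence:
        return []

    # Collect cut positions, always including both ends of the sequence.
    cuts = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in cleave_after and sequence[i + 1] not in no_cleave_before:
            cuts.append(i + 1)
    cuts.append(len(sequence))

    # Join adjacent fragments to allow up to `max_missed` missed cleavages.
    peptides = []
    for start in range(len(cuts) - 1):
        last = min(start + 2 + max_missed, len(cuts))
        for end in range(start + 1, last):
            peptides.append(sequence[cuts[start]:cuts[end]])
    return peptides


if __name__ == "__main__":
    print(digest("MKRPAKDE"))
    # A rule that cleaves after aspartic acid (D), with no blocking residue.
    print(digest("GADLK", cleave_after="D", no_cleave_before=""))
```

Passing the cleavage residues as parameters keeps the same function usable for other enzymes that cut after a fixed set of residues.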

Download Software Tool Download Source Code  

Version: v2.1.2952
Requirements: Microsoft .NET Framework 1.1
Date Updated: January 31, 2008
File Size (Software Tool): 313 KB (ZIP)
Registration Required: No
File Size (Source Code): 973 KB (ZIP)
Developers: Matthew Monroe
Comments
  • Added a new enzyme: "Acetic Acid Hydrolysis" cleaves after aspartic acid (D)
See also the complete Revision History


Protein Digestion Simulator Basic Feature Tour

Find out what is in Protein Digestion Simulator Basic. Take the feature tour to learn about some of the top features of this product.

File Format Options

Reads a FASTA file or delimited text file containing protein or peptide sequences and writes the data to a tab-delimited file. FASTA files can also be validated against a set of rules that identify common formatting errors.
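The conversion from FASTA to tab-delimited text can be sketched in Python as follows. This is a minimal illustration, not the program's implementation. The column layout (Protein, Description, Sequence) and the function names `read_fasta` and `fasta_to_tab` are assumptions.

```python
import io


def read_fasta(handle):
    """Yield (name, description, sequence) for each entry in a FASTA stream."""
    name, description, chunks = None, "", []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                yield name, description, "".join(chunks)
            # The first token after '>' is the name; the rest is the description.
            header = line[1:].split(None, 1)
            name = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif line.strip() and name is not None:
            chunks.append(line.strip())
    if name is not None:
        yield name, description, "".join(chunks)


def fasta_to_tab(in_handle, out_handle):
    """Write one tab-delimited row per protein (hypothetical column layout)."""
    out_handle.write("Protein\tDescription\tSequence\n")
    for name, description, sequence in read_fasta(in_handle):
        out_handle.write(f"{name}\t{description}\t{sequence}\n")


if __name__ == "__main__":
    source = io.StringIO(">P1 first protein\nMKT\nAAG\n>P2\nGGK\n")
    target = io.StringIO()
    fasta_to_tab(source, target)
    print(target.getvalue())
```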

 

Parse Digest File Options

Reads a FASTA file and creates a new FASTA file in which all of the protein sequences are reversed or randomized. The new file can be the same length as the original file, or can include just a subset of it.
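A minimal Python sketch of generating reversed or randomized sequences is shown below. It is an illustration under stated assumptions, not the program's code: the function name `make_decoys`, the `reversed_` and `scrambled_` name prefixes, the `fraction` parameter, and random sampling of the subset are all hypothetical choices.

```python
import random


def make_decoys(proteins, mode="reverse", fraction=1.0, seed=0):
    """Return (name, sequence) pairs with reversed or shuffled sequences.

    proteins: list of (name, sequence) tuples.
    fraction: share of the input entries to keep (assumption: the subset is
              drawn at random; the program may select entries differently).
    """
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    if fraction < 1.0:
        count = int(len(proteins) * fraction)
        proteins = rng.sample(proteins, count)

    decoys = []
    for name, sequence in proteins:
        if mode == "reverse":
            decoy, prefix = sequence[::-1], "reversed_"
        elif mode == "scramble":
            residues = list(sequence)
            rng.shuffle(residues)  # same composition, random order
            decoy, prefix = "".join(residues), "scrambled_"
        else:
            raise ValueError(f"unknown mode: {mode}")
        decoys.append((prefix + name, decoy))
    return decoys


if __name__ == "__main__":
    print(make_decoys([("P1", "MKTAAG")]))
    print(make_decoys([("P1", "MKTAAG")], mode="scramble"))
```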

 

FASTA File Validation

Reads a FASTA file and checks for formatting errors, including duplicate protein names, duplicate protein sequences, long protein names, long protein residue lines, invalid residues, spaces in inappropriate places, and protein entries without a description.
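Several of these checks can be sketched in Python as shown below. This is an illustration only, not the program's rule set. The length limits, the restriction to the 20 standard amino acid letters, the message wording, and the function name `validate_fasta` are all assumptions.

```python
# Assumption: only the 20 standard amino acid letters count as valid residues.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def validate_fasta(lines, max_name_length=34, max_line_length=120):
    """Return a list of problem messages for FASTA text given as lines.

    The two length limits are hypothetical defaults chosen for illustration.
    """
    problems = []
    seen_names = set()
    seen_sequences = {}  # sequence -> first protein name that used it
    name, residues = None, []

    def check_sequence(entry_name, entry_residues):
        if entry_name is None:
            return
        sequence = "".join(entry_residues)
        if sequence in seen_sequences:
            problems.append(
                f"{entry_name}: same sequence as {seen_sequences[sequence]}")
        else:
            seen_sequences[sequence] = entry_name

    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            check_sequence(name, residues)
            residues = []
            if line[1:2].isspace():
                problems.append(f"line {number}: space after '>'")
            parts = line[1:].split(None, 1)
            name = parts[0] if parts else ""
            if name in seen_names:
                problems.append(f"line {number}: duplicate protein name {name}")
            seen_names.add(name)
            if len(name) > max_name_length:
                problems.append(f"line {number}: protein name too long")
            if len(parts) < 2:
                problems.append(f"line {number}: protein {name} has no description")
        elif line.strip():
            if len(line) > max_line_length:
                problems.append(f"line {number}: residue line too long")
            if " " in line.strip():
                problems.append(f"line {number}: space inside residue line")
            text = line.replace(" ", "").upper()
            bad = sorted(set(text) - VALID_RESIDUES)
            if bad:
                problems.append(f"line {number}: invalid residues {''.join(bad)}")
            residues.append(text)
    check_sequence(name, residues)
    return problems


if __name__ == "__main__":
    sample = [">P1 kinase", "MKT", ">P1 kinase copy", "MKT", ">P2", "MXZ1"]
    for message in validate_fasta(sample):
        print(message)
```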

 

Peptide Uniqueness Options

Calculates the number of uniquely identifiable peptides within the input file (digested or undigested), using mass alone, or both mass and NET, within user-defined tolerances. The predicted NET values are computed using the NET Prediction DLL included with the NET Prediction Utility.

Reference: A.D. Norbeck, M.E. Monroe, J.N. Adkins, K.K. Anderson, D.S. Daly, and R.D. Smith, "The utility of accurate mass and LC elution time information in the analysis of complex proteomes," Journal of the American Society for Mass Spectrometry (2005) 16, 1239-1249.
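The idea of counting uniquely identifiable peptides can be sketched in Python as follows. This is a simplified illustration, not the program's algorithm or the method of the reference above. It assumes that masses and NET values are already available for each peptide, that a peptide is unique when no peptide with a different sequence lies within both tolerances, and that the mass tolerance is given in ppm. The function name `count_unique` and the default tolerance are hypothetical.

```python
def count_unique(peptides, mass_tol_ppm=5.0, net_tol=None):
    """Count peptides that no other sequence matches within the tolerances.

    peptides: list of (sequence, mass, net) tuples.
    net_tol:  None compares by mass only; otherwise a conflicting peptide
              must also lie within this NET difference.
    """
    ordered = sorted(peptides, key=lambda item: item[1])
    unique = 0
    for i, (sequence, mass, net) in enumerate(ordered):
        window = mass * mass_tol_ppm / 1e6  # ppm converted to a mass window
        conflict = False
        # Scan neighbours in sorted mass order, downward and then upward.
        for step in (-1, 1):
            j = i + step
            while 0 <= j < len(ordered) and abs(ordered[j][1] - mass) <= window:
                other_sequence, _, other_net = ordered[j]
                same_net = net_tol is None or abs(other_net - net) <= net_tol
                if other_sequence != sequence and same_net:
                    conflict = True
                    break
                j += step
            if conflict:
                break
        if not conflict:
            unique += 1
    return unique


if __name__ == "__main__":
    # Made-up masses and NET values, for illustration only.
    example = [
        ("PEPTIDEA", 1000.0000, 0.20),
        ("PEPTIDEB", 1000.0020, 0.60),
        ("PEPTIDEC", 1500.0000, 0.40),
    ]
    print(count_unique(example))                # mass only
    print(count_unique(example, net_tol=0.05))  # mass and NET
```

In the made-up example, the first two peptides fall within the same mass window but are far apart in NET, so adding the NET constraint separates them.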


 



Acknowledgment

All publications that result from the use of this software should include this statement:

Portions of this research were supported by the NIH National Center for Research Resources (Grant RR018522), the W.R. Wiley Environmental Molecular Science Laboratory (a national scientific user facility sponsored by the U.S. Department of Energy's Office of Biological and Environmental Research and located at PNNL), and the National Institute of Allergy and Infectious Diseases (NIH/DHHS through interagency agreement Y1-AI-4894-01). PNNL is operated by Battelle Memorial Institute for the U.S. Department of Energy under contract DE-AC05-76RL0 1830.