Learning about SEQUEST

The program that we use to identify proteins is called SEQUEST.  When I first started working in the lab, I ran spectra through this program by changing various parameters and then watching it work, without really understanding the process it uses to come up with the best peptide/protein matches for the sample.  Now I understand that adjusting the program’s parameters helps produce the most accurate matches possible.

SEQUEST goes about protein identification in several steps.  The program receives a file of tandem mass spectra for each part of the run.  The software evaluates each tandem spectrum in terms of its fragmentation and pulls candidate peptides from the sequence database whose masses fall within the specified range around the precursor mass.  For each candidate, it generates a theoretical spectrum from the fragment ions predicted by the peptide’s sequence.  These theoretical spectra are then compared to the observed spectrum, and the closest match is reported as the peptide.  The user can adjust the parameters so that a wider variety of peptides is compared to the observed spectrum.
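To make these steps concrete for myself, here is a minimal Python sketch of the same idea: filter candidate peptides by mass, predict fragment ions for each one, and keep the candidate that agrees best with the observed peaks.  This is not SEQUEST’s actual code.  The function names and tolerances are my own assumptions, and the score is a simple count of shared peaks standing in for the real scoring.  The residue masses are standard monoisotopic values, and only singly charged b and y ions are predicted.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def select_candidates(peptides, precursor_mass, tolerance):
    """Keep only peptides whose mass lies within the window around the precursor."""
    return [p for p in peptides if abs(peptide_mass(p) - precursor_mass) <= tolerance]


def theoretical_ions(seq):
    """Predicted m/z values of singly charged b and y ions for a sequence."""
    ions = []
    for i in range(1, len(seq)):
        prefix = sum(RESIDUE_MASS[aa] for aa in seq[:i])
        suffix = sum(RESIDUE_MASS[aa] for aa in seq[i:])
        ions.append(prefix + PROTON)          # b ion (N-terminal fragment)
        ions.append(suffix + WATER + PROTON)  # y ion (C-terminal fragment)
    return ions


def shared_peak_score(observed_mz, seq, fragment_tol=0.5):
    """Count predicted ions that have an observed peak within fragment_tol.

    A simplified stand-in for SEQUEST's scoring, chosen for readability.
    """
    return sum(
        1
        for ion in theoretical_ions(seq)
        if any(abs(ion - mz) <= fragment_tol for mz in observed_mz)
    )


def best_match(observed_mz, precursor_mass, peptides, tolerance=1.0):
    """Return (peptide, score) for the best-scoring candidate, or None."""
    candidates = select_candidates(peptides, precursor_mass, tolerance)
    if not candidates:
        return None
    score, peptide = max((shared_peak_score(observed_mz, p), p) for p in candidates)
    return peptide, score


# Toy example: the "observed" peaks are simply the predicted ions of GASP.
observed = theoretical_ions("GASP")
print(best_match(observed, peptide_mass("GASP"), ["GASP", "PSAG", "WWWW"]))
```

In the toy example, GASP and PSAG have the same mass, so both pass the mass filter, and only the fragment ions can tell them apart.  Widening `tolerance` lets more candidates through, which is the kind of parameter change described above.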

Included in the supplementary information provided in the report for each sample is an Xcorr value.  This cross-correlation score measures how closely the theoretical spectrum of the matched peptide agrees with the observed spectrum, so it serves as a measure of confidence in SEQUEST’s peptide match.  Peptides whose Xcorr falls above a chosen threshold are treated as confident matches, and they help establish the identity of the entire protein.  Something that we have found curious and want to learn more about is that for all of our samples so far, each identified protein has a single peptide with a confident Xcorr value, while all of its additional peptides have an Xcorr below the confidence threshold.
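One way we might check this pattern systematically is to tally, for each protein, how many of its peptides clear the threshold.  The sketch below assumes the report has already been parsed into (protein, peptide, Xcorr) tuples.  The cutoff of 2.0, the function names, and the example rows are all hypothetical placeholders, not values from our data or from SEQUEST.

```python
from collections import defaultdict

XCORR_THRESHOLD = 2.0  # hypothetical cutoff; substitute the lab's actual threshold


def confident_peptide_counts(matches, threshold=XCORR_THRESHOLD):
    """Count total and above-threshold peptides for each protein.

    matches is an iterable of (protein, peptide, xcorr) tuples.
    """
    counts = defaultdict(lambda: {"confident": 0, "total": 0})
    for protein, _peptide, xcorr in matches:
        counts[protein]["total"] += 1
        if xcorr >= threshold:
            counts[protein]["confident"] += 1
    return dict(counts)


def single_hit_proteins(matches, threshold=XCORR_THRESHOLD):
    """Proteins with several peptides but exactly one above the threshold."""
    counts = confident_peptide_counts(matches, threshold)
    return sorted(
        protein
        for protein, c in counts.items()
        if c["confident"] == 1 and c["total"] > 1
    )


# Made-up rows for illustration only; these are not real results.
example_matches = [
    ("protein_A", "PEPTIDEK", 3.1),
    ("protein_A", "SAMPLER", 1.2),
    ("protein_B", "ANOTHERK", 2.5),
    ("protein_B", "THIRDPEPK", 2.8),
]
print(single_hit_proteins(example_matches))
```

If every protein in a real report showed up in this list, that would confirm the pattern we have been seeing and give us a concrete starting point for asking why.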