Detects peptide pairs in LC-MS data and determines their relative abundance.
FeatureFinderMultiplex is a tool for the fully automated analysis of quantitative proteomics data. It detects pairs of isotopic envelopes with fixed m/z separation. It requires no prior sequence identification of the peptides. In what follows we outline the algorithm.
Algorithm
The algorithm is divided into three parts: filtering, clustering and linear fitting, see Fig. (d), (e) and (f). In the following discussion let us consider a particular mass spectrum at retention time 1350 s, see Fig. (a). It contains a peptide of mass 1492 Da and its 6 Da heavier labelled counterpart. Both are doubly charged in this instance. Their isotopic envelopes therefore appear at m/z 746 and m/z 749 in the spectrum. The isotopic peaks within each envelope are separated by 0.5 m/z units. The spectrum was recorded at discrete m/z positions. In order to read accurate intensities at arbitrary m/z we fit a spline to the data, see Fig. (b).
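The m/z positions in this example follow from simple arithmetic on the mass shift and the charge. The sketch below uses the rounded values from the text (isotopic spacing of 1/charge); the function name `pattern_offsets` is hypothetical and not part of the tool.

```python
def pattern_offsets(mass_shift_da, charge, isotopes):
    """Return the m/z offsets of all isotopic peaks of a light/heavy pair,
    relative to the mono-isotopic peak of the light peptide."""
    spacing = 1.0 / charge               # isotopic peak spacing (0.5 for charge 2)
    separation = mass_shift_da / charge  # distance between light and heavy envelope
    light = [k * spacing for k in range(isotopes)]
    heavy = [separation + k * spacing for k in range(isotopes)]
    return light + heavy


# Doubly charged peptide pair with a 6 Da label, three isotopic peaks each.
offsets = pattern_offsets(mass_shift_da=6.0, charge=2, isotopes=3)
print(offsets)  # [0.0, 0.5, 1.0, 3.0, 3.5, 4.0]
```

For the spectrum discussed here, the light envelope starts at m/z 746, so the heavy envelope starts at 746 + 3 = 749.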
We would like to search for such peptide pairs in our LC-MS data set. As a warm-up let us consider a standard intensity cut-off filter, see Fig. (c). Scanning through the entire m/z range (red dot), only data points with intensities above a certain threshold pass the filter. Unlike such a local filter, the filter used in our algorithm takes intensities at a range of m/z positions into account, see Fig. (d). A data point (red dot) passes if
- all six intensities at m/z, m/z+0.5, m/z+1, m/z+3, m/z+3.5 and m/z+4 lie above a certain threshold,
- the intensity profiles in neighbourhoods around all six m/z positions show a good correlation and
- the relative intensity ratios within a peptide agree up to a factor with the ratios of a theoretical averagine model.
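The first of these conditions can be sketched as follows. This is a simplified illustration, not the tool's implementation: it uses linear interpolation instead of a spline fit, it omits the correlation and averagine checks, and the names `intensity_at` and `passes_cutoff` as well as the synthetic spectrum are assumptions made for the example.

```python
from bisect import bisect_left


def intensity_at(mz_values, intensities, mz):
    """Linearly interpolate the intensity at an arbitrary m/z (0 outside the range)."""
    if mz < mz_values[0] or mz > mz_values[-1]:
        return 0.0
    i = bisect_left(mz_values, mz)
    if mz_values[i] == mz:
        return intensities[i]
    x0, x1 = mz_values[i - 1], mz_values[i]
    y0, y1 = intensities[i - 1], intensities[i]
    return y0 + (y1 - y0) * (mz - x0) / (x1 - x0)


def passes_cutoff(mz_values, intensities, mz, offsets, cutoff):
    """Condition 1 only: every position of the pattern must reach the cutoff."""
    return all(
        intensity_at(mz_values, intensities, mz + d) >= cutoff for d in offsets
    )


# Synthetic spectrum: two envelopes at m/z 746 and 749, zero elsewhere.
peaks = {746.0: 5000.0, 746.5: 4000.0, 747.0: 2000.0,
         749.0: 2500.0, 749.5: 2000.0, 750.0: 1200.0}
mz_values = [745.0 + 0.25 * i for i in range(25)]  # 745.0 to 751.0
intensities = [peaks.get(mz, 0.0) for mz in mz_values]
offsets = [0.0, 0.5, 1.0, 3.0, 3.5, 4.0]

print(passes_cutoff(mz_values, intensities, 746.0, offsets, 1000.0))  # True
print(passes_cutoff(mz_values, intensities, 746.5, offsets, 1000.0))  # False
```

Only the mono-isotopic position of the light peptide passes. A local cut-off filter would instead accept all six peaks individually.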
Let us now filter not only a single spectrum but all spectra in our data set. Data points that pass the filter form clusters in the t-m/z plane, see Fig. (e). Each cluster corresponds to the mono-isotopic mass trace of the lightest peptide of a SILAC pattern. We now use hierarchical clustering methods to assign each data point to a specific cluster. The optimum number of clusters is determined by maximizing the silhouette width of the partitioning. Each data point in a cluster corresponds to three pairs of intensities (at [m/z, m/z+3], [m/z+0.5, m/z+3.5] and [m/z+1, m/z+4]). A plot of all intensity pairs in a cluster shows a clear linear correlation, see Fig. (f). Using linear regression we can determine the relative amounts of labelled and unlabelled peptides in the sample.
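The final fitting step can be illustrated with a least-squares fit over the intensity pairs of one cluster. In this minimal sketch the line is constrained to pass through the origin, which is our assumption; the text only states that linear regression is used. The function name `ratio_from_pairs` and the example intensities are hypothetical.

```python
def ratio_from_pairs(pairs):
    """Least-squares slope of heavy vs. light intensity for a line through the origin."""
    sxy = sum(light * heavy for light, heavy in pairs)
    sxx = sum(light * light for light, _ in pairs)
    if sxx == 0:
        raise ValueError("all light intensities are zero")
    return sxy / sxx


# (light, heavy) intensity pairs, e.g. read at [m/z, m/z+3], [m/z+0.5, m/z+3.5], ...
pairs = [(5000.0, 2500.0), (4000.0, 2000.0), (2000.0, 1000.0)]
print(ratio_from_pairs(pairs))  # 0.5
```

A slope of 0.5 would mean that the labelled peptide is half as abundant as the unlabelled one.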
The command line parameters of this tool are:
FeatureFinderMultiplex -- Determination of peak ratios in LC-MS data
Version: 2.0.0 Aug 25 2015, 00:02:58, Revision: GIT-NOTFOUND
Usage:
FeatureFinderMultiplex <options>
This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed description or use the --helphelp option.
Options (mandatory options marked with '*'):
-in <file>* LC-MS dataset in centroid or profile mode (valid formats: 'mzML')
-out <file> Set of all identified peptide groups (i.e. peptide pairs or triplets or singlets or ..). The m/z-RT positions correspond to the lightest peptide in each group. (valid formats: 'consensusXML')
Common TOPP options:
-ini <file> Use the given TOPP INI file
-threads <n> Sets the number of threads allowed to be used by the TOPP tool (default: '1')
-write_ini <file> Writes the default configuration file
--help Shows options
--helphelp Shows all options (including advanced)
The following configuration subsections are valid:
- algorithm Parameters for the algorithm.
- labels Isotopic labels that can be specified in section 'algorithm:labels'.
You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
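A typical invocation can be assembled from the options listed above. The sketch below only builds the argument list; the file names are placeholders and the helper `build_command` is hypothetical.

```python
import shlex


def build_command(in_file, out_file, threads=1, ini_file=None):
    """Assemble a FeatureFinderMultiplex call from the documented options."""
    cmd = ["FeatureFinderMultiplex", "-in", in_file, "-out", out_file,
           "-threads", str(threads)]
    if ini_file is not None:
        cmd += ["-ini", ini_file]  # optional TOPP INI file
    return cmd


cmd = build_command("sample.mzML", "sample.consensusXML", threads=4)
print(shlex.join(cmd))
# FeatureFinderMultiplex -in sample.mzML -out sample.consensusXML -threads 4
```

To actually run the tool, the list can be passed to `subprocess.run(cmd, check=True)`, provided FeatureFinderMultiplex is installed and on the PATH.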
INI file documentation of this tool:
+ FeatureFinderMultiplex: Determination of peak ratios in LC-MS data
version (default: '2.0.0')
Version of the tool that generated this parameters file.
++ 1: Instance '1' section for 'FeatureFinderMultiplex'
in
LC-MS dataset in centroid or profile mode (input file, valid formats: '*.mzML')
out
Set of all identified peptide groups (i.e. peptide pairs or triplets or singlets or ..). The m/z-RT positions correspond to the lightest peptide in each group. (output file, valid formats: '*.consensusXML')
out_features
Optional output file containing the individual peptide features in 'out'. (output file, valid formats: '*.featureXML')
out_mzq
Optional output file of MzQuantML. (output file, valid formats: '*.mzq')
log
Name of log file (created only when specified)
debug (default: '0')
Sets the debug level
threads (default: '1')
Sets the number of threads allowed to be used by the TOPP tool
no_progress (default: 'false')
Disables progress logging to command line (valid values: 'true', 'false')
force (default: 'false')
Overwrite tool specific checks. (valid values: 'true', 'false')
test (default: 'false')
Enables the test mode (needed for internal use only) (valid values: 'true', 'false')
+++ algorithm: Parameters for the algorithm.
labels (default: '[][Lys8,Arg10]')
Labels used for labelling the samples. [...] specifies the labels for a single sample. For example
[][Lys8,Arg10] ... SILAC
[][Lys4,Arg6][Lys8,Arg10] ... triple-SILAC
[Dimethyl0][Dimethyl6] ... Dimethyl
[Dimethyl0][Dimethyl4][Dimethyl8] ... triple Dimethyl
[ICPL0][ICPL4][ICPL6][ICPL10] ... ICPL
charge (default: '1:4')
Range of charge states in the sample, i.e. min charge : max charge.
isotopes_per_peptide (default: '3:6')
Range of isotopes per peptide in the sample. For example 3:6, if isotopic peptide patterns in the sample consist of either three, four, five or six isotopic peaks.
rt_typical (default: '40')
Typical retention time [s] over which a characteristic peptide elutes. (This is not an upper bound. Peptides that elute for longer will be reported.) (valid range: 0:∞)
rt_min (default: '2')
Lower bound for the retention time [s]. (Any peptides seen for a shorter time period are not reported.) (valid range: 0:∞)
mz_tolerance (default: '6')
m/z tolerance for search of peak patterns. (valid range: 0:∞)
mz_unit (default: 'ppm')
Unit of the 'mz_tolerance' parameter. (valid values: 'Da', 'ppm')
intensity_cutoff (default: '1000')
Lower bound for the intensity of isotopic peaks. (valid range: 0:∞)
peptide_similarity (default: '0.5')
Two peptides in a multiplet are expected to have the same isotopic pattern. This parameter is a lower bound on their similarity. (valid range: -1:1)
averagine_similarity (default: '0.4')
The isotopic pattern of a peptide should resemble the averagine model at this m/z position. This parameter is a lower bound on similarity between measured isotopic pattern and the averagine model. (valid range: -1:1)
averagine_similarity_scaling (default: '0.75')
Let x denote this scaling factor, and p the averagine similarity parameter. For the detection of single peptides, the averagine parameter p is replaced by p' = p + x(1-p), i.e. x = 0 -> p' = p and x = 1 -> p' = 1. (For knock_out = true, peptide doublets and singlets are detected simultaneously. For singlets, the peptide similarity filter is irrelevant. In order to compensate for this 'missing filter', the averagine parameter p is replaced by the more restrictive p' when searching for singlets.) (valid range: 0:1)
missed_cleavages (default: '0')
Maximum number of missed cleavages due to incomplete digestion. (valid range: 0:∞)
knock_out (default: 'false')
Is it likely that knock-outs are present? (Supported for doublex, triplex and quadruplex experiments only.) (valid values: 'true', 'false')
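The effect of 'averagine_similarity_scaling' is easiest to see with numbers. The sketch below evaluates p' = p + x(1-p) for the default values listed in this section; the function name is hypothetical and the range checks simply mirror the documented parameter restrictions.

```python
def singlet_averagine_threshold(p, x):
    """Return p' = p + x * (1 - p), the stricter averagine bound used for singlets."""
    if not -1.0 <= p <= 1.0:
        raise ValueError("averagine_similarity must lie in -1:1")
    if not 0.0 <= x <= 1.0:
        raise ValueError("averagine_similarity_scaling must lie in 0:1")
    return p + x * (1.0 - p)


# Defaults: averagine_similarity = 0.4, averagine_similarity_scaling = 0.75
p_prime = singlet_averagine_threshold(0.4, 0.75)
print(round(p_prime, 6))  # 0.85
```

With the defaults, singlets must therefore reach an averagine similarity of about 0.85 rather than 0.4.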
+++ labels: Isotopic labels that can be specified in section 'algorithm:labels'.
Arg6 (default: '6.0201290268')
Label:13C(6) | C(-6) 13C(6) | unimod #188 (valid range: 0:∞)
Arg10 (default: '10.0082686')
Label:13C(6)15N(4) | C(-6) 13C(6) N(-4) 15N(4) | unimod #267 (valid range: 0:∞)
Lys4 (default: '4.0251069836')
Label:2H(4) | H(-4) 2H(4) | unimod #481 (valid range: 0:∞)
Lys6 (default: '6.0201290268')
Label:13C(6) | C(-6) 13C(6) | unimod #188 (valid range: 0:∞)
Lys8 (default: '8.0141988132')
Label:13C(6)15N(2) | C(-6) 13C(6) N(-2) 15N(2) | unimod #259 (valid range: 0:∞)
Dimethyl0 (default: '28.0313')
Dimethyl | H(4) C(2) | unimod #36 (valid range: 0:∞)
Dimethyl4 (default: '32.056407')
Dimethyl:2H(4) | 2H(4) C(2) | unimod #199 (valid range: 0:∞)
Dimethyl6 (default: '34.063117')
Dimethyl:2H(4)13C(2) | 2H(4) 13C(2) | unimod #510 (valid range: 0:∞)
Dimethyl8 (default: '36.07567')
Dimethyl:2H(6)13C(2) | H(-2) 2H(6) 13C(2) | unimod #330 (valid range: 0:∞)
ICPL0 (default: '105.021464')
ICPL | H(3) C(6) N O | unimod #365 (valid range: 0:∞)
ICPL4 (default: '109.046571')
ICPL:2H(4) | H(-1) 2H(4) C(6) N O | unimod #687 (valid range: 0:∞)
ICPL6 (default: '111.041593')
ICPL:13C(6) | H(3) 13C(6) N O | unimod #364 (valid range: 0:∞)
ICPL10 (default: '115.0667')
ICPL:13C(6)2H(4) | H(-1) 2H(4) 13C(6) N O | unimod #866 (valid range: 0:∞)
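To connect the 'labels' string with the label masses listed above, the following sketch parses a label specification and computes the resulting m/z shift. It is an illustration only: the helpers `parse_labels` and `mz_shift` are hypothetical, the dictionary copies just the SILAC masses from this section, and the tool's own parsing may differ.

```python
import re

# Label mass shifts in Da, copied from the 'labels' section.
LABEL_MASS = {
    "Arg6": 6.0201290268,
    "Arg10": 10.0082686,
    "Lys4": 4.0251069836,
    "Lys6": 6.0201290268,
    "Lys8": 8.0141988132,
}


def parse_labels(spec):
    """Split e.g. '[][Lys8,Arg10]' into one list of label names per sample."""
    groups = re.findall(r"\[([^\[\]]*)\]", spec)
    return [[name.strip() for name in g.split(",") if name.strip()] for g in groups]


def mz_shift(label, count, charge):
    """m/z shift caused by `count` labelled residues at the given charge."""
    return count * LABEL_MASS[label] / charge


print(parse_labels("[][Lys8,Arg10]"))  # [[], ['Lys8', 'Arg10']]
print(mz_shift("Lys8", 1, 2))          # about 4.007 for one Lys8 at charge 2
```

The first, empty bracket stands for the unlabelled sample, the second for the sample carrying Lys8 and Arg10.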