1. GS Run Processor
1.4 Filters
1.4.3 Signal Processing Adjustable Filter Parameters
Some of the quality filters may be turned off or adjusted to control the stringency of the output.
This section provides instructions for customizing the filter parameters for signal processing.
• Identify the Data Processing folder (‘D_’) of the sequencing Run whose reads are to be re-filtered, or the Run folder (‘R_’) of a sequencing Run whose reads will be filtered using the customized filter parameters.
• Generate a template file in the Run or data directory by typing the following command, from within the fullProcessing folder:
gsRunProcessor --template=filterOnly > filterTemplate.xml
This will create a template file for shotgun library processing in the current directory. There are additional templates for Paired End and Amplicons experiments, which can be generated by using the template arguments “filterOnlyPairedEnd” and “filterOnlyAmplicons” in the command above. The XML output filename (to the right of the “>” symbol) can be any valid string ending with ‘.xml’.
• Open the file with a text editor. An easy-to-use text editor called “nedit” is present on the Genome Sequencer FLX Instrument; on the GS Junior Attendant PC, a similar program called “gedit” is present. To use nedit (or gedit) to edit the file, type the following command:
nedit filterTemplate.xml
• Within the template, scroll down to the <qualityFilter> and <baseCaller> sections (Figure 7 and Figure 8 below). All of the user-adjustable Quality Filter parameters that govern the filtering of high quality reads for the sequencing Run are adjusted in these sections. Both sections contain some adjustable parameters by default. Additional parameters can be added to the <qualityFilter> section. As changes are made, note that:
o Increasing a stringency setting will reduce the number of filtered high quality reads by eliminating the lowest quality reads from among the previously filtered high quality reads. This may increase overall accuracy of the filtered high quality reads.
o Decreasing a stringency setting will increase the number of passed filter high quality reads by allowing lower quality reads to pass through the quality filter. This may reduce overall accuracy of the filtered high quality reads.
• After the changes have been made, save and exit the text editor.
• Change directory to the parent Run folder directory:
cd ..
• Launch the processing job specifying the modified filter parameter script. To launch a filter only signal processing job use the following command structure:
runAnalysisFilter --pipe=/path_To_filterTemplate.xml /path_To_DATA_DIRECTORY
If the job is successfully launched, a new Data Processing directory with the name convention D_..filterTemplate will be created in the Run directory.
Software v. 2.5p1, August 2010
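The two command lines used in the procedure above can also be assembled programmatically, for example when several Runs are to be re-filtered. The following is a minimal sketch only: the helper names template_command and launch_command are hypothetical, the commands are built as text and not executed, and only the command names and arguments quoted in this section are used.

```python
import shlex

# Template arguments named in this section of the manual.
TEMPLATE_ARGS = ("filterOnly", "filterOnlyPairedEnd", "filterOnlyAmplicons")


def template_command(template="filterOnly", output="filterTemplate.xml"):
    """Return the shell command line that writes a filter template to `output`.

    Hypothetical helper: it only builds the text of the command.
    """
    if template not in TEMPLATE_ARGS:
        raise ValueError(f"unknown template argument: {template}")
    if not output.endswith(".xml"):
        raise ValueError("output filename must end with '.xml'")
    # The '>' redirection means this line has to be run by a shell.
    return f"gsRunProcessor --template={template} > {shlex.quote(output)}"


def launch_command(template_path, data_directory):
    """Return the argument list for a filter-only signal processing job."""
    return ["runAnalysisFilter", f"--pipe={template_path}", data_directory]


print(template_command("filterOnlyAmplicons", "ampliconFilter.xml"))
print(" ".join(launch_command("/path_To_filterTemplate.xml", "/path_To_DATA_DIRECTORY")))
```

The template command is returned as a single string because it relies on shell redirection, whereas the launch command is returned as an argument list that could be passed to a process launcher without a shell.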
Figure 7: Signal processing filter adjustment XML code for Shotgun library processing (comments removed)
The quality filter parameters are modifiable. Their defaults and recommended modification values are described below. For more information regarding Amplicon filter parameters, please refer to Application Note APN-10002: Amplicon Sequencing; Experimental Designs, Guidelines and Tips, available at http://454.com/my454.
doDotCheck Checks for reads with too many negative flows.
Value                  Effect
True – Default All     Dots filtering enabled
False                  Dots filtering disabled
doMixedCheck Checks for reads with too many positive flows.
Value                  Effect
True – Default All     Mixed filtering enabled
False                  Mixed filtering disabled
doClassifierCheck Checks if reads start with a valid 4-base ‘key’ sequence (Key Pass).
Value                  Effect
True – Default All     Key Pass filtering enabled
False                  Key Pass filtering disabled
doShortSignalCheck A signal intensity filter that trims reads that lose signal ‘crispness’ near the end.
Value                  Effect
True – Default All     Signal intensity filtering enabled
False                  Signal intensity filtering disabled
doPrimerTrimming Trims the end of a read when it matches a 454 Sequencing System Adaptor sequence.
Value                  Effect
True – Default All     Primer trimming enabled
False                  Primer trimming disabled
doValleyFilterTrimBack Enables or disables the TrimBack Valley Filter.
Value                                 Effect
True – Default Shotgun, Paired End    Read trimming enabled
False – Default Amplicon              Read trimming disabled
minConsensusSignal The minimum average intensity of the positive key flows to be considered as a potential well. Acceptable values are any integer >= 1.
Value                                   Effect
20 - Default Shotgun, full processing
60 - Default otherwise
> Default                               Increasing this value can eliminate dim candidate wells with poor signal quality. This can yield more high-quality bases and Reads from fewer candidates. However, if the value is too high, too many good candidates are discarded and throughput declines.
useBicubic
Images are upsampled during image processing, doubling their size in each dimension. This parameter impacts whether bicubic or bilinear upsampling is performed for this step.
Value                  Effect
True - Default All     Bicubic upsampling is performed
False                  Bilinear upsampling is performed
vfScanLimit Controls how much of the read is scanned with the valley filter. Acceptable values are any integer > 0.
Value                      Effect
< Default                  Amplicon Runs will have more reads, but bases called from the un-filtered end of the flowgram may have higher error rates.
4,096 - Default Shotgun
700 - Default Amplicon
> Default                  Amplicon Runs will have fewer reads, but they will tend to have higher accuracy.
vfScanAllFlows Modifies the meaning of the valley flow parameters.
Value                        Effect
True                         The ratio of vfBadFlowThreshold to vfLastFlowToTest is taken as a score threshold that is applied over the entire read.
False - Default Shotgun      For amplicon runs, a maximum of vfBadFlowThreshold valley flows can be seen within the first vfLastFlowToTest nucleotide flows.
tiOnly - Default Amplicon    Equivalent to true for both GS Junior and Genome Sequencer FLX Titanium runs.
flxOnly                      Equivalent to true for Genome Sequencer FLX Standard runs only.
vfLastFlowToTest
Represents the number of flows over which the reads are monitored for the presence of valley flows. The maximum value for this parameter is equal to the number of nucleotide flows in the Run (most stringent), and the minimum value is “0” (to turn off the Valley Filter). This parameter’s behavior has been largely superseded by the vfScanLimit parameter.
Value   Effect
400     higher stringency
320     Default
168     lower stringency
vfBadFlowThreshold
If vfScanAllFlows is false, then this is literally the number of flows that can fall within a valley in the valley filter before the read is rejected (for amplicons) or trimmed (for shotgun reads). In the more common case where vfScanAllFlows is true, this value forms the numerator of the valley filter score. In general this parameter should not need to be adjusted; the preferred method to adjust the valley filter threshold is the vfTrimBackScaleFactor parameter.
Value   Effect
2       higher stringency
4       Default
6       lower stringency
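The counting rule that applies when vfScanAllFlows is false can be illustrated with a short sketch. This is not the software's implementation: the function name and the list of per-flow valley flags are hypothetical, and how a flow is classified as a valley flow is not described in this section, so the flags are taken as given.

```python
def exceeds_valley_count(valley_flags, bad_flow_threshold=4, last_flow_to_test=320):
    """Return True if more than `bad_flow_threshold` valley flows occur
    within the first `last_flow_to_test` flows of a read.

    `valley_flags` is a hypothetical list of booleans, one per nucleotide
    flow, marking flows that fall within a valley. Defaults mirror the
    documented defaults of vfBadFlowThreshold and vfLastFlowToTest.
    """
    tested = valley_flags[:last_flow_to_test]  # an empty slice when the limit is 0
    valley_count = sum(1 for flag in tested if flag)
    return valley_count > bad_flow_threshold


# Five valley flows early in the read exceed the default threshold of 4.
flags = [True] * 5 + [False] * 395
print(exceeds_valley_count(flags))
```

Under this reading, a read that exceeds the count would be rejected (amplicons) or trimmed (shotgun reads), and setting vfLastFlowToTest to 0 leaves no flows to test.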
vfTrimBackScaleFactor
If vfScanAllFlows is true, then flows are scored against the flow valleys using an exponential function. The sum of scores is multiplied by vfTrimBackScaleFactor before being compared to the maximum allowable score ratio formed from vfBadFlowThreshold and vfLastFlowToTest. If the scaled score exceeds the maximum, then the read is trimmed (shotgun) or discarded (amplicon). Acceptable values are floating point numbers >=0.
Value                     Effect
1.6 - Default Shotgun
3.0 - Default Amplicon
> Default                 Increases the stringency of the valley filter. This yields higher accuracy, but also shorter read lengths and fewer high-quality reads.
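The comparison described for vfTrimBackScaleFactor can be sketched as follows. The sketch takes the description literally and rests on several assumptions: the per-flow scores are hypothetical inputs (the exponential scoring function is not given in this section), no normalization beyond what the text states is applied, and the function name is invented for illustration.

```python
def scaled_score_exceeds(flow_scores, scale_factor=1.6,
                         bad_flow_threshold=4, last_flow_to_test=320):
    """Return True if the scaled sum of per-flow scores exceeds the
    ratio vfBadFlowThreshold / vfLastFlowToTest.

    `flow_scores` is a hypothetical list of per-flow valley scores.
    Defaults mirror the documented shotgun defaults.
    """
    if last_flow_to_test <= 0:
        raise ValueError("last_flow_to_test must be positive to form a ratio")
    max_ratio = bad_flow_threshold / last_flow_to_test  # 4 / 320 = 0.0125
    scaled_score = sum(flow_scores) * scale_factor
    return scaled_score > max_ratio


scores = [0.005]  # illustrative value only
print(scaled_score_exceeds(scores, scale_factor=1.6))  # shotgun default
print(scaled_score_exceeds(scores, scale_factor=3.0))  # amplicon default
```

With the same scores, a larger scale factor pushes the scaled score over the ratio sooner, which matches the statement that values above the default increase the stringency of the valley filter.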
vfUseRollingWindows This parameter estimates the valley thresholds between the 0, 1, 2, and 3-mers dynamically across the run, using a window of some number of flows, rather than using one set of thresholds for the entire run.
Value             Effect
True - Default    Use rolling windows enabled. This generally makes the valley filter less stringent, yielding more high-quality reads and bases, but slightly shorter average read lengths.
False             Use rolling windows disabled
vfMaxFailedPercent
The percentage of “valley” flows in a read which cause the read to be rejected entirely. This parameter can be used to tune the failed percent. A range between “90” and “100” is recommended.
Value   Effect
100     Default
90      higher stringency
vfTrimBackMinimumLength
When a read is trimmed such that it would be shorter than the value of this parameter, the read is rejected entirely. Users may set this parameter to a lower number if they are attempting to sequence very short reads.
Value   Effect
>84     higher stringency
84      Default
<84     lower stringency
minLength Represents the minimum length of reads acceptable after all quality filtering steps.
Value   Effect
50      Default
84      higher stringency (old default)
useAmpliconsPrimers Only search for the sequencing primers appropriate for Amplicon experiments.
Value Effect
True - default Enabled
False Disabled
useCorrectionGlobalLimit Enables or disables a global limit on the SFF flowgram correction limit.
Value Effect
True - default Enabled
False Disabled
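Several of the parameters above have stated acceptable or recommended value ranges. A minimal sketch of checking edited values against those ranges before launching a job is given below; the function name and the dictionary layout are hypothetical, and only the ranges stated in this section are checked.

```python
def _is_int(value):
    """True for integers, excluding booleans (which Python treats as ints)."""
    return isinstance(value, int) and not isinstance(value, bool)


def check_filter_values(params):
    """Return a list of messages for values outside the documented ranges.

    `params` is a hypothetical dict of parameter name -> numeric value.
    Parameters that are absent from the dict are not checked.
    """
    problems = []
    value = params.get("minConsensusSignal")
    if value is not None and not (_is_int(value) and value >= 1):
        problems.append("minConsensusSignal must be an integer >= 1")
    value = params.get("vfScanLimit")
    if value is not None and not (_is_int(value) and value > 0):
        problems.append("vfScanLimit must be an integer > 0")
    value = params.get("vfLastFlowToTest")
    if value is not None and not (_is_int(value) and value >= 0):
        problems.append("vfLastFlowToTest must be an integer >= 0")
    value = params.get("vfTrimBackScaleFactor")
    if value is not None and not (isinstance(value, (int, float))
                                  and not isinstance(value, bool)
                                  and value >= 0):
        problems.append("vfTrimBackScaleFactor must be a number >= 0")
    value = params.get("vfMaxFailedPercent")
    if value is not None and not (90 <= value <= 100):
        problems.append("vfMaxFailedPercent is outside the recommended 90-100 range")
    return problems


print(check_filter_values({"minConsensusSignal": 0, "vfMaxFailedPercent": 80}))
```

The upper limit of vfLastFlowToTest (the number of nucleotide flows in the Run) is not checked here because it depends on the Run itself.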
While it is possible to turn off the read rejecting and signal intensity filters (Key Pass, Dot, Mixed and Short Signal), this will result in output of sub-standard quality data that is NOT recommended for use in subsequent data analysis. For example, to turn off Dots and Mixed filter, add the following lines under the <qualityFilter> section of the Shotgun Processing template (Figure 7).
<doDotCheck>false</doDotCheck>
<doMixedCheck>false</doMixedCheck>
<doClassifierCheck>false</doClassifierCheck>
<doShortSignalCheck>false</doShortSignalCheck>
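Adding or changing elements under the <qualityFilter> section can also be scripted instead of done in a text editor. The following sketch uses Python's standard xml.etree.ElementTree module; the helper name is hypothetical, and the sample document is a stand-in with an invented root element, not the actual template layout, which is not reproduced in this section.

```python
import xml.etree.ElementTree as ET


def set_quality_filter_params(root, params):
    """Set each name/value pair as a child element of <qualityFilter>,
    adding the element when it is not already present."""
    section = next(root.iter("qualityFilter"), None)
    if section is None:
        raise ValueError("no <qualityFilter> section found")
    for name, value in params.items():
        element = section.find(name)
        if element is None:
            element = ET.SubElement(section, name)
        element.text = value
    return section


# Hypothetical stand-in for a template; <pipeline> is an invented root name.
SAMPLE = (
    "<pipeline><qualityFilter>"
    "<doDotCheck>true</doDotCheck>"
    "</qualityFilter></pipeline>"
)

root = ET.fromstring(SAMPLE)
set_quality_filter_params(root, {"doDotCheck": "false", "doMixedCheck": "false"})
print(ET.tostring(root, encoding="unicode"))
```

Note that ElementTree discards XML comments when parsing with its default settings, so a template rewritten this way would lose any explanatory comments it contained.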