3. GS Run Browser
3.8 The Control DNA Tab
The Control DNA tab (Figure 40) displays accuracy results for the Control DNA Beads that were spiked into the sequencing reaction, when available (see the Sequencing Method Manual for details about this procedure). Specifically, the Control DNA tab reports the percentage of Control DNA reads that match their reference sequence at 95%, 98%, and 100% accuracy, in each region of the PicoTiterPlate Device. It also allows a user to view the “consensus flowgram” for the reads from each Control DNA sequence. (A consensus flowgram is the flowgram constructed by averaging, flow by flow, the signals of the reads that matched the reference sequence, for the selected region or the entire PicoTiterPlate Device.)
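The three accuracy levels reported by this tab can be illustrated with a short Python sketch. This is not the GS Run Browser implementation: the function names are hypothetical, the reads are assumed to have the key already removed, and the comparison is a plain position-by-position check with no alignment, so insertions and deletions are not handled the way the actual software may handle them.

```python
def count_mismatches(read, reference, n_bases):
    """Return (mismatches, bases compared) over the first n_bases.

    Assumption: position-by-position comparison without alignment.
    Bases missing from a read shorter than the window count as mismatches.
    """
    ref = reference[:n_bases]
    qry = read[:n_bases]
    mismatches = sum(1 for r, q in zip(ref, qry) if r != q)
    mismatches += max(0, len(ref) - len(qry))
    return mismatches, len(ref)


def accuracy_summary(reads, reference, n_bases):
    """Percent of reads matching the reference at 100%, >=98% and >=95%."""
    if n_bases <= 0 or not reference:
        raise ValueError("n_bases and reference must be non-empty")
    counts = {"100%": 0, ">=98%": 0, ">=95%": 0}
    for read in reads:
        mismatches, compared = count_mismatches(read, reference, n_bases)
        # Integer arithmetic avoids floating-point edge cases at the thresholds.
        if mismatches == 0:
            counts["100%"] += 1
        if mismatches * 100 <= 2 * compared:
            counts[">=98%"] += 1
        if mismatches * 100 <= 5 * compared:
            counts[">=95%"] += 1
    total = len(reads)
    if total == 0:
        return {level: 0.0 for level in counts}
    return {level: 100.0 * n / total for level, n in counts.items()}
```

Calling `accuracy_summary` once per region, and once on the reads of all regions pooled together, gives one way to obtain per-region values alongside a plate-wide value.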
For the Control DNA tab data to be displayed, the region.cwf files (or, for the GS FLX standard chemistry, files 454RunTimeMetricsAll.txt and 454QualityFilterMetrics.txt) must be present in the Analysis sub-directory of the sequencing Run. For GS FLX standard chemistry data sets, the region.wells files must also exist for the consensus flowgrams to be available.
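As a rough illustration of the file requirements above, the following Python sketch checks an Analysis sub-directory for the needed files. The function names are hypothetical, and the glob patterns `*.cwf` and `*.wells` are an assumption about how the per-region files are named; adjust them to the actual file names of the Run.

```python
from pathlib import Path

# File names taken from the text above (GS FLX standard chemistry only).
STANDARD_CHEMISTRY_METRICS = (
    "454RunTimeMetricsAll.txt",
    "454QualityFilterMetrics.txt",
)


def control_dna_data_available(analysis_dir, standard_chemistry=False):
    """True if the files needed by the Control DNA tab appear to be present."""
    d = Path(analysis_dir)
    # Assumption: per-region files can be found with a "*.cwf" pattern.
    has_cwf = any(d.glob("*.cwf"))
    has_metrics = all((d / name).is_file() for name in STANDARD_CHEMISTRY_METRICS)
    return has_cwf or (standard_chemistry and has_metrics)


def consensus_flowgrams_available(analysis_dir, standard_chemistry=False):
    """True if consensus flowgrams should be available as well."""
    d = Path(analysis_dir)
    if not control_dna_data_available(d, standard_chemistry):
        return False
    if standard_chemistry:
        # Assumption: per-region wells files can be found with "*.wells".
        return any(d.glob("*.wells"))
    return True
```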
3.8.1 Control DNA Tab Features and Functionalities
The Control DNA tab provides the features and functionalities described below (see Figure 40):
• The Options area (on the top left area of the tab):
o Control DNA: choice of data to display:
All Control DNA sequences included in the Run
Any of the specific Control DNA sequences included in the Run
Unrecognized reads, i.e. reads which begin with the Control DNA sequencing key (GACT, CATG or ATGC; see Section 7.1 for details) but do not match any of the corresponding Control DNA reference sequences
o Base Pairs: choice of length over which to calculate match, i.e. show % match from the first base after the key up to this nucleotide in the reads. Only the options relevant to the read length of the data set are displayed.
o A button to display the consensus flowgram for the selected Control DNA sequence in a tri-flowgram viewer (Figure 41). See Section 3.8.2 for a complete description of the Control DNA consensus flowgram view. This button is available only if the current selection is one of the Control DNA sequences; it is grayed out if “all” or “unrecognized” is selected.
Figure 41: The “Display Consensus Flowgram” button of the Control DNA tab, available when one of the Control DNA sequences is selected
• Navigation and data capture buttons: Except for the “Display Consensus Flowgram” button described above, all the buttons on this tab have common functions, allowing a user to save the data as a text file or snapshot image. See Section 3.3.4 for a description of the navigation and data capture button functions.
Software v. 2.5p1, August 2010
• Plot display:
o Graphic display of the % match data for all key passed Control DNA reads, per the options selected, for the individual regions of the PicoTiterPlate Device and as an aggregate average.
o This data is presented at 3 levels of accuracy:
100%: The percent of key passed Control DNA reads that exactly matched the first N-Nucleotide cycles of the corresponding known reference sequence.
≥ 98%: The percent of key passed Control DNA reads that had no more than 2% base calling differences in the first N-Nucleotide cycles compared to their known reference sequence.
≥ 95%: The percent of key passed Control DNA reads that had no more than 5% base calling differences in the first N-Nucleotide cycles compared to their known reference sequence.
o The plot shares the common scrolling and zooming functions of the other GS Run Browser plots. See Section 3.3.3 for a description of the plot functions.
o When the mouse pointer is over the plot, the usual mouse tracker shows the x-axis value and the % of reads among the selected Control DNA sequences that matched their respective reference sequences at 100%, 98%, and 95% accuracy over the selected length, in the region under the pointer or averaged over the entire PicoTiterPlate Device.
• Summary: This area displays the number of Raw Wells and Control DNA reads for the Control DNA species specified; it also displays the percentage of matches between these reads and their reference sequence at the read length specified, for each region of the PicoTiterPlate Device and as an aggregate average.
3.8.2 Control DNA Consensus Flowgrams
Clicking the “Open the flowgram” button (in the Options area of the Control DNA tab), when one of the Control DNA sequences is selected, brings up the consensus flowgram for that sequence (Figure 42). A consensus flowgram is the flowgram constructed by averaging, for each nucleotide flow, the read flowgram signals of the reads identified as that reference sequence.
This is presented as part of a tri-flowgram, along with the “ideal” flowgram for that Control DNA sequence and the “difference” flowgram, in a manner similar to the tri-flowgram of a Control DNA well, described before (Section 3.5.3).
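The averaging described above can be sketched in a few lines of Python. The function names and the data layout (a mapping from region number to a list of read flowgrams, each a list of per-flow signals) are assumptions made for illustration; they are not the format used by the software.

```python
def consensus_flowgram(read_flowgrams):
    """Average the read flowgram signals flow by flow."""
    if not read_flowgrams:
        raise ValueError("at least one read flowgram is required")
    n_flows = len(read_flowgrams[0])
    if any(len(fg) != n_flows for fg in read_flowgrams):
        raise ValueError("all read flowgrams must have the same number of flows")
    n_reads = len(read_flowgrams)
    return [sum(fg[i] for fg in read_flowgrams) / n_reads for i in range(n_flows)]


def consensus_by_region(flowgrams_by_region):
    """Consensus per region, plus "All" computed over every read on the plate.

    Note that "All" averages the pooled reads; it is not the average of the
    per-region consensus flowgrams, so regions with more reads weigh more.
    """
    result = {
        region: consensus_flowgram(flowgrams)
        for region, flowgrams in flowgrams_by_region.items()
    }
    pooled = [fg for flowgrams in flowgrams_by_region.values() for fg in flowgrams]
    result["All"] = consensus_flowgram(pooled)
    return result
```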
Figure 42: The tri-flowgram view of the AVTF90 Control DNA sequence, showing the consensus flowgram for both regions of the PicoTiterPlate Device
3.8.2.1 Control DNA Consensus Flowgram Features and Functionalities
The tri-flowgram view of a Control DNA consensus sequence is similar to that of an individual Control DNA read (see Section 3.5.3) and is described below.
• The Options area provides the following choices:
o Consensus Region: choice of what PicoTiterPlate Device region data to display:
All – Show the consensus flowgram formed by averaging all the read flowgrams across the entire PicoTiterPlate Device, for the Control DNA sequence specified.
Individual region – Show the consensus flowgram of the reads from an individual region of the PicoTiterPlate Device, for the Control DNA sequence specified.
o Style: choice of Bars, Lines, or Lollipop plot styles. In all cases, the reagent flowed in each step is color-coded, per the legend. The lines and lollipop styles are narrower than the bar style, and will allow viewing of more of a Run without scrolling.
• Navigation and data capture buttons:
o All the buttons in this window have common functions: setting and adjusting the zoom level of the plot display or allowing a user to save the data as a text file or snapshot image. See Section 3.3.4 for a description of the navigation and data capture button functions.
o The text file and snapshot image buttons will save a file containing the data or view for all three plots.
o The bottom plot contains separate zooming buttons because the scrolling and zooming of the plots are tied together except for the y-axis of the bottom plot (see below).
• Plot display:
o The top plot displays the idealized flowgram for the Control DNA sequence selected, generated by taking the known nucleotide reference sequence and converting each homopolymer stretch into a corresponding signal (e.g. “AAA” becomes a signal of 3.0 in the next A flow). The middle plot displays the consensus flowgram calculated as a flow-by-flow average of all the reads matching this Control DNA sequence in the region(s) selected; and the bottom plot displays the flow-by-flow differences between the top two plots.
o The plot shares the common scrolling and zooming functions of the GS Run Browser plots. See Section 3.3.3 for a description of the plot functions.
o As with other tri-flowgrams, the scrolling and zooming of the three plots are tied together. All three plots scroll and zoom together along the x-axis, and the top two plots scroll and zoom together along the y-axis as well (the bottom “difference” plot scrolls and zooms separately along the y-axis).
o Since the bottom plot zooms separately, it has its own zoom buttons.
o The horizontal bars between the plots allow for adjusting the heights of the three plots.
o When the mouse pointer is over the plot, the mouse tracker shows the flow number, the reagent flowed, and the N-mer count for the flow under the pointer.
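The construction of the top and bottom plots described above can be sketched in Python as follows. The function names are hypothetical, the cyclic flow order "TACG" is an assumption that should be replaced by the flow order of the actual Run, and the sign of the difference (consensus minus ideal) is likewise an assumption, since the text only states that the bottom plot shows the differences between the top two plots.

```python
def ideal_flowgram(reference, flow_order="TACG"):
    """Convert a reference sequence into idealized per-flow signals.

    Each homopolymer stretch becomes a signal equal to its length in the
    next flow of that nucleotide; all other flows get a signal of 0.0.
    """
    unknown = set(reference) - set(flow_order)
    if unknown:
        raise ValueError("bases not in flow order: %s" % sorted(unknown))
    signals = []
    pos = 0
    flow = 0
    while pos < len(reference):
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # Consume the homopolymer stretch incorporated by this flow, if any.
        while pos < len(reference) and reference[pos] == nucleotide:
            run += 1
            pos += 1
        signals.append(float(run))
        flow += 1
    return signals


def difference_flowgram(ideal, consensus):
    """Flow-by-flow difference over the flows present in both flowgrams.

    Assumption: the difference is taken as consensus minus ideal.
    """
    return [c - i for i, c in zip(ideal, consensus)]
```

For example, with the assumed flow order, the sequence "GACTAAAC" yields a signal of 3.0 in the A flow that incorporates the "AAA" stretch, and a signal of 0.0 in every flow whose nucleotide does not match the next base.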