3.4 High-throughput identification of direct protein-DNA interactions us-
3.5.1 Image Based Method For Identification of Positive Results In
One of the major challenges after performing the experiment is to correctly identify
positive and negative results. Previously, positive interactions were found by
counting the number of colonies on the plate where mated yeast had been growing
for a number of days after the cleaning process. When the plates have been
adequately cleaned and there is no auto-activation present, it is relatively easy to
pick out strong positive results, since they appear as densely growing colonies.
However, in a practical setting this is often not the case. Figure 3.9 shows a more
representative example where positive results have to be identified. Initially, scoring
was introduced to differentiate weak from strong interactions on a scale from one
(weak) to ten (strong). However, such scoring was not always consistent between
operators or over time. Variability can result from growth conditions in the incubator
and on the plate (e.g. 3AT concentrations), and also depends on the individual scoring
the results. Therefore, consistency is a major factor in analysing colony growth and the
subsequent categorisation as a ‘positive’ or ‘negative’ result. In order to eliminate
this variability, an alternative method for the determination of positive interactions
has been devised. Photographs of the final plates are always taken and kept for
reference purposes. These pictures, combined with some image analysis
and statistical techniques, could potentially be used to consistently identify positive
interactions, which is especially important when considering weak interactions.
(a) Y1H-146 fragment. (b) Y1H-147 fragment.
Figure 3.9: A typical pairwise plate of one promoter fragment vs 95 TFs (control
at H12) on an SD-LTH agar plate. It is difficult to identify all positive interactions
visually due to auto-activation.
The automated scoring and analysis technique SpotOn has recently been used to map
human (Reece-Hoyes et al., 2011a) and worm (Reece-Hoyes et al., 2011b) TFs to bait
DNA sequences in a high-throughput manner. The method used in these two studies and
the new method presented in this chapter differ in a number of important aspects.
Firstly, SpotOn uses colour information obtained from the blue colonies produced by
the lacZ reporter gene, whereas only grey scale information is available from the
study presented here. The presence of colour information means that SpotOn is able to
home in on individual colonies, as opposed to combining all colonies within the same
spot/well, as is done in the method here. Secondly, although both methods use intensity
information (colour or grey scale) to determine positive interactions, they apply
different statistical techniques to it. SpotOn uses negative control spots with empty
reporter constructs as a normalisation factor, and a Z-score is derived from the
normalised samples, taking into account growth variation in rows/columns and in the
negative controls. Higher Z-scores correspond to higher-confidence positive results.
The new method, on the other hand, compares negative controls against test spots
using the Kolmogorov-Smirnov two-sample test statistic.
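The comparison of a test spot against a negative control can be illustrated with a
minimal sketch of the two-sided, two-sample Kolmogorov-Smirnov test applied to pixel
intensities. This is not the implementation used in this work: the function name
ks_two_sample, the use of raw pixel lists as input and the asymptotic P-value
approximation are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def ks_two_sample(control, test):
    """Return (D, p) for a two-sided two-sample Kolmogorov-Smirnov test.

    D is the largest vertical gap between the two empirical CDFs.
    p comes from the asymptotic Kolmogorov distribution, so it is only
    an approximation, especially for small samples (an assumption here).
    """
    a, b = sorted(control), sorted(test)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    # Both CDFs are step functions, so the gap can only peak at an observed value.
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / n  # fraction of control pixels <= x
        cdf_b = bisect_right(b, x) / m  # fraction of test pixels <= x
        d = max(d, abs(cdf_a - cdf_b))

    # Asymptotic tail: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and Q(lam) is already ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


# Hypothetical grey-level samples: a dim control spot and a brighter test spot.
control_pixels = [40, 42, 45, 47, 50, 52, 55, 58, 60, 61]
test_pixels = [90, 95, 100, 104, 110, 115, 120, 126, 130, 140]
d_stat, p_value = ks_two_sample(control_pixels, test_pixels)
```

A small P-value indicates that the intensity distribution of the test spot differs
from that of the control. Note that the two-sided test does not by itself say which
of the two spots is brighter.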
Spot Detection
A number of different approaches were tried to automatically identify the circular
spots where the yeast was spotted, for example, using Sobel filters to outline
regular circular structures. However, no adequate settings were found. An alternative
approach was to use the image recognition library OpenCV, which provides tools for
feature detection, for example using Haar wavelets. In order to use feature detection,
positive and negative sets containing true positive and true negative examples need
to be prepared, and feature detection can then be trained using these sets. The larger
the training sets, the better feature detection becomes. However, constructing a large
enough set from the existing data would have been as time consuming as picking out
wells by hand. Moreover, since most of the data used for training could not then have
been used for detection, the success of the feature detection would have been
difficult to assess. Therefore, ImageJ coupled with a microarray plugin, which allows
round spots to be drawn easily on a large scale, was chosen as a relatively
high-throughput way of outlining mated spots. Once the spots are outlined, the same
plugin outputs statistics about each spot, including a histogram of the intensity
values and intermediate statistics such as the mean, variance and standard deviation
of the intensities. The histograms are used for cumulative distribution computations
and in further downstream analysis.
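The step from a spot histogram to a cumulative distribution can be sketched as
follows. The sketch assumes a hypothetical input layout in which counts[i] is the
number of pixels with grey level i, and it uses the population variance. Neither
assumption is taken from the plugin, and the function names are illustrative only.

```python
import math
from itertools import accumulate


def histogram_stats(counts):
    """Mean, population variance and standard deviation of binned intensities.

    counts[i] is assumed to be the number of pixels with grey level i.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    mean = sum(level * c for level, c in enumerate(counts)) / total
    variance = sum(c * (level - mean) ** 2 for level, c in enumerate(counts)) / total
    return mean, variance, math.sqrt(variance)


def cumulative_distribution(counts):
    """Empirical CDF: fraction of pixels at or below each grey level."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [running / total for running in accumulate(counts)]


def max_cdf_gap(counts_a, counts_b):
    """Largest vertical distance between two CDFs defined on the same bins."""
    if len(counts_a) != len(counts_b):
        raise ValueError("histograms must share the same bins")
    cdf_a = cumulative_distribution(counts_a)
    cdf_b = cumulative_distribution(counts_b)
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))


# Toy 4-level histograms; an 8-bit grey scale image would have 256 levels.
control_hist = [6, 3, 1, 0]  # mostly dark pixels
test_hist = [1, 2, 3, 4]     # shifted towards brighter pixels
gap = max_cdf_gap(control_hist, test_hist)
```

The largest gap between the two cumulative distributions is the quantity on which
the Kolmogorov-Smirnov two-sample statistic is based, so histograms exported per
spot are sufficient input for that comparison.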
Positive Interaction Identification Among Auto-Activation
The ability to identify positive interactions amongst the noise present in fragments
that have high levels of auto-activation is one of the major advantages of using
automatic identification and classification. For example, the Y1H 177 fragment
consistently auto-activated during the screen, and appropriate levels of 3AT that
maintained a high fraction of positive results were difficult to determine (Figure
3.10). The method developed here allows even noisy spots to be compared with
negative controls, since both should contain roughly the same amount of growth due
to auto-activation, while positive results should contain additional growing
yeast due to the positive interaction of the TF with the promoter fragment. This
distinction can be difficult to make using the traditional by-eye method.
The new method was applied to the images for this fragment, which were found
to contain a few significant interactions that had previously been discarded by the
traditional technique.
Limitations Of Automatic Approach
The new method has been shown to be able to distinguish between positive and
negative results (Figure 3.6), as well as allowing the automatic determination of
all potential positive results on a single plate (Figure 3.7). However, one of the
major challenges that hindered successful prediction of positive interactions was the
presence of bubbles in the agar plates. When bubbles form during plate
pouring and do not dissipate, they leave a circular indentation in the
plate. When the picture of the plate is taken after the incubation period with
upper white light (see Methods), the edges of the circular indentations produce
brighter outlines, as light is refracted internally more around these edges than inside
the indentations, giving a “halo” effect. Because the analysis relies on the
premise that brighter pixels represent growing yeast, this halo effect
introduces artificially bright pixels. Spots with the halo effect therefore produce
lower P-values, which can appear statistically significant. This effect can be
corrected by using appropriate spot recognition, as mentioned above. Automatic
classifiers can be trained to recognise bubble indentations and flag them
appropriately, provided that an appropriate training set exists. During the
development of the new method no such training set existed, so it was not possible
to train an automatic classifier.
However, following the pairwise Y1H results presented here, there are now over 6500
examples of spots containing positive, negative and auto-activating results, which
can be used to train an automatic classifier and to design a pipeline that
would take images of plates containing yeast and output unbiased estimates of
positive and negative interactions on the plate. This method could also be extended
to any pairwise Yeast ’n’-Hybrid screen. Additionally, the classifier could
be trained for use with library screens using images from the library screens
presented here, although spots are not as well defined in the library images as in
the pairwise screen.
Figure 3.10: Pairwise screen of the Y1H 177 fragment against a mini-library of TFs.
All spots, with the exception of C07, have some level of growth due to auto-activation
in yeast. The negative control is located at H12.
In document
Computational and experimental analysis of plant promoters : identifying functional elements
(Page 128-132)