Image Based Method For Identification of Positive Results In

3.4 High-throughput identification of direct protein-DNA interactions us-

(a) Y1H-146 fragment. (b) Y1H-147 fragment.

3.5.1 Image Based Method For Identification of Positive Results In

One of the main challenges after performing the experiment is to correctly identify
positive and negative results. Previously, positive interactions were found by
counting the number of colonies on the plate where mated yeast had been growing
for a number of days after the cleaning process. When the plates have been
adequately cleaned and no auto-activation is present, it is relatively easy to pick
out strong positive results, since they appear as densely growing colonies. In
practice, however, this is often not the case. Figure 3.9 shows a more
representative example in which positive results have to be identified. Initially,
scoring was introduced to differentiate weak and strong interactions on a scale
from one (weak) to ten (strong). However, such scoring was not always consistent
between operators or over time. Variability can result from growth conditions in
the incubator and on the plate (e.g. 3AT concentrations), and also depends on the
individual scoring the results. Consistency is therefore a major factor in
analysing colony growth and in the subsequent categorisation of a result as
'positive' or 'negative'. In order to eliminate this variability, an alternative
method for the determination of positive interactions has been devised.
Photographs of the final plates are always taken and kept for reference purposes.
Combined with image analysis and statistical techniques, these pictures could
potentially be used to identify positive interactions consistently, which is
especially important when considering weak interactions.

Figure 3.9: A typical pairwise plate of one promoter fragment vs 95 TFs (control
at H12) on an SD-LTH agar plate. It is difficult to identify all positive
interactions visually due to auto-activation.

The automated scoring and analysis technique SpotOn has recently been used to map
human (Reece-Hoyes et al., 2011a) and worm (Reece-Hoyes et al., 2011b) TFs to bait
DNA sequences in a high-throughput manner. The method used in these two studies
and the new method presented in this chapter differ in a number of important
aspects. Firstly, SpotOn uses colour information obtained from the blue colonies
produced by the lacZ reporter gene, whereas only grey scale information is
available in the study presented here. The presence of colour information means
that SpotOn is able to home in on individual colonies, whereas the method used
here combines all colonies within the same spot/well. Secondly, although both
methods use intensities (colour and grey scale, respectively) to determine
positive interactions, they apply different statistical techniques to the
intensity information. SpotOn uses negative control spots with empty reporter
constructs as a normalisation factor, and a Z-score is derived from the normalised
samples, taking into account growth variation in row/column and negative controls.
Higher Z-scores correspond to higher-confidence positive results. The new method,
on the other hand, compares negative controls against test spots using the
Kolmogorov-Smirnov two-sample test statistic.
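
As a rough illustration of the two-sample comparison, the following Python sketch
computes the Kolmogorov-Smirnov statistic D (the largest gap between two empirical
cumulative distributions) and an approximate p-value. The function name
ks_two_sample, the use of raw pixel intensity lists as input, and the asymptotic
p-value approximation are assumptions made for this example; they are not taken
from the analysis code used in this chapter.

```python
import math
from bisect import bisect_right


def ks_two_sample(control, test):
    """Return (D, p) for a two-sided two-sample Kolmogorov-Smirnov test.

    control, test: sequences of grey scale intensities (hypothetical inputs).
    The p-value uses the standard asymptotic approximation, so it is only
    indicative for small samples.
    """
    a, b = sorted(control), sorted(test)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    # D is the largest absolute difference between the two empirical CDFs.
    # It is enough to evaluate both CDFs at every observed value.
    d = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / n
        cdf_b = bisect_right(b, x) / m
        d = max(d, abs(cdf_a - cdf_b))

    # Asymptotic Kolmogorov distribution with a small-sample correction.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The series below converges poorly here; the p-value is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)
```

A small D with a large p-value suggests that a test spot is indistinguishable from
the negative control, whereas a large D with a small p-value marks the spot as a
candidate positive interaction.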

Spot Detection

A number of different approaches were tried to automatically identify the circular
spots where the yeast was spotted, for example using Sobel filters to outline
regular circular structures. However, no adequate settings were found. An
alternative approach was to use the image recognition library OpenCV, which
provides routines for feature detection, for example using Haar wavelets. In order
to use feature detection, positive and negative sets containing true positive and
true negative examples need to be prepared, and the feature detector is trained on
these sets. The larger the training sets, the better feature detection becomes.
However, constructing a large enough set from the existing data would have been as
time consuming as picking out wells by hand. Moreover, most of the data used for
training could not then have been used for detection, so the success of the
feature detection would have been difficult to assess. Therefore, ImageJ coupled
with a microarray plugin, which allows round spots to be drawn easily on a large
scale, was used as a relatively high-throughput way of outlining mated spots. Once
the spots are outlined, the same plugin outputs statistics about each spot,
including a histogram of the intensity values and summary statistics such as the
mean, variance and standard deviation of the intensities. The histograms are used
for cumulative distribution computations and in further downstream analysis.
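
To make the step from histogram to cumulative distribution concrete, the sketch
below derives the mean, variance, standard deviation and cumulative distribution
from a grey scale histogram. The function name histogram_stats and the input
layout (counts[i] is the number of pixels with grey level i) are assumptions made
for this example; the actual output format of the ImageJ plugin may differ, and
the plugin may report a sample rather than a population variance.

```python
import math


def histogram_stats(counts):
    """Summarise a grey scale histogram of one spot.

    counts[i] is assumed to hold the number of pixels with grey level i.
    Returns the mean, population variance, standard deviation and the
    cumulative distribution (fraction of pixels with grey level <= i).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")

    mean = sum(level * c for level, c in enumerate(counts)) / total
    variance = sum(c * (level - mean) ** 2 for level, c in enumerate(counts)) / total

    # Running sum of the counts, normalised by the number of pixels.
    cdf = []
    running = 0
    for c in counts:
        running += c
        cdf.append(running / total)

    return {
        "mean": mean,
        "variance": variance,
        "std": math.sqrt(variance),
        "cdf": cdf,
    }
```

For a toy histogram [1, 2, 1], the function returns a mean of 1.0, a variance of
0.5 and the cumulative distribution [0.25, 0.75, 1.0].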

Positive Interaction Identification Among Auto-Activation

The ability to identify positive interactions amongst the noise present in
fragments that have high levels of auto-activation is one of the major advantages
of automatic identification and classification. For example, the Y1H 177 fragment
consistently auto-activated during the screen, and appropriate levels of 3AT that
maintained a high fraction of positive results were difficult to determine
(Figure 3.10). The method developed here allows even noisy spots to be compared
with negative controls, since both should contain roughly the same amount of
growth due to auto-activation. Positive results should, in addition, contain more
growing yeast due to the positive interaction of the TF with the promoter
fragment. This difference can be difficult to detect using the traditional, by
eye, method.

The new method was applied to the images for this fragment and identified a few
significant interactions that had previously been discarded by the traditional
technique.
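
The idea that a positive spot shows growth over and above the auto-activation
background seen in the negative control can be sketched as a one-sided comparison
of cumulative distributions. This is an illustrative variant and not necessarily
the test used in this chapter, which is described only as a two-sample
Kolmogorov-Smirnov test. The function names, the equal-length histogram inputs,
and the assumption that more growth appears as brighter grey levels are all
hypothetical.

```python
def cdf_from_histogram(counts):
    """Cumulative fraction of pixels at or below each grey level."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf


def excess_growth_statistic(control_counts, test_counts):
    """One-sided KS-type statistic for 'test is brighter than control'.

    Assumes that yeast growth appears as brighter pixels. If the test spot
    is shifted towards brighter grey levels, its cumulative distribution
    lies below that of the control, so the largest value of
    F_control - F_test is positive. The result is 0.0 when the test spot
    is never brighter than the control in this sense.
    """
    if len(control_counts) != len(test_counts):
        raise ValueError("histograms must cover the same grey levels")
    f_control = cdf_from_histogram(control_counts)
    f_test = cdf_from_histogram(test_counts)
    largest_gap = max(fc - ft for fc, ft in zip(f_control, f_test))
    return max(0.0, largest_gap)
```

Because both histograms include the same auto-activation background, only the
shift beyond that background contributes to the statistic; a noisy spot that
merely matches the control scores close to zero.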

Limitations Of Automatic Approach

The new methods have been shown to be able to distinguish between positive and
negative results (Figure 3.6), and to allow the automatic determination of all
potential positive results on a single plate (Figure 3.7). However, one of the