
3.4 High-throughput identification of direct protein-DNA interactions us-

3.5.1 Image Based Method For Identification of Positive Results In

One of the big challenges after performing the experiment is to correctly identify

positive and negative results. Previously, the positive interactions were found by

counting the number of colonies on the plate where mated yeast has been growing

for a number of days after the cleaning process. When the plates have been ade-

quately cleaned and there is no auto-activation present, it is relatively easy to pick

out strong positive results, since they appear as densely growing colonies. However,

in a practical setting this is often not the case. Figure 3.9 shows a more
representative example where positive results have to be identified. Initially,
scoring was introduced to differentiate weak and strong interactions on a scale
from one to ten, respectively. However, such scoring was not always consistent
between operators or over time. Variability can result from growth conditions in
the incubator and on the plate (e.g. 3AT concentrations), and can also depend on
the individual scoring the results. Therefore, consistency is a major factor in
analysing colony growth and the subsequent categorisation as a ‘positive’ or
‘negative’ result. In order to eliminate this variability, an alternative method
for the determination of positive interactions has been devised. Photographs of
the final plates are always taken and kept for reference purposes. However, these
pictures, combined with image analysis and statistical techniques, could
potentially be used to identify positive interactions consistently, which is
especially important when considering weak interactions.

(a) Y1H-146 fragment. (b) Y1H-147 fragment.

Figure 3.9: A typical pairwise plate of one promoter fragment vs 95 TFs (control
at H12) on an SD-LTH agar plate. It is difficult to identify all positive
interactions visually due to auto-activation.

The automated scoring and analysis technique SpotOn has recently been used to
map human (Reece-Hoyes et al., 2011a) and worm (Reece-Hoyes et al., 2011b) TFs
to bait DNA sequences in a high-throughput manner. The method used in these two
studies and the new method presented in this chapter differ in a number of
important aspects. Firstly, SpotOn uses colour information obtained from the
blue colonies produced by the lacZ reporter gene, whereas only grey scale
information is available from the study presented here. The presence of colour
information means that SpotOn is able to home in on individual colonies, whereas
the method here combines all colonies within the same spot/well. Secondly,
although both methods use intensities (colour and grey scale, respectively) to
determine positive interactions, they apply different statistical techniques to
the intensity information. SpotOn uses negative control spots with empty
reporter constructs as a normalisation factor, and a Z-score is derived from the
normalised samples, taking into account growth variation in row/column and
negative controls. Higher Z-scores correspond to higher-confidence positive
results. The new method, on the other hand, compares negative controls against
test spots using the Kolmogorov-Smirnov two-sample test statistic.
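The comparison of a negative control spot against a test spot can be sketched as
follows. This is a minimal illustration rather than the implementation used in
this work: the function name `ks_two_sample` is hypothetical, the inputs are
assumed to be lists of grey scale pixel intensities from the two spots, and the
P-value uses the standard asymptotic approximation to the Kolmogorov
distribution, which is an assumption made here.

```python
import math
from bisect import bisect_right


def ks_two_sample(control, test):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov comparison.

    D is the largest absolute gap between the two empirical CDFs.
    p is an asymptotic approximation (an assumption of this sketch).
    """
    a, b = sorted(control), sorted(test)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    # Evaluate both empirical CDFs at every observed intensity value.
    d = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / n
        cdf_b = bisect_right(b, x) / m
        d = max(d, abs(cdf_a - cdf_b))

    # Effective sample size and scaled statistic.
    ne = n * m / (n + m)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        # The series below converges slowly here; the tail probability is ~1.
        return d, 1.0

    # Kolmogorov tail probability: 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lam^2)
    p = 2.0 * sum(
        (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    # Illustrative intensities only, not measured data.
    control_pixels = [10, 12, 11, 13, 12, 11, 10, 12]
    test_pixels = [40, 42, 39, 45, 41, 44, 43, 40]
    print(ks_two_sample(control_pixels, test_pixels))
```

A small P-value indicates that the intensity distribution of the test spot
differs from that of the negative control; the test itself does not indicate the
direction of the difference.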

Spot Detection

A number of different approaches were tried to automatically identify the
circular spots where the yeast was spotted, for example, using Sobel filters to
outline regular circular structures. However, no adequate settings were found.
An alternative approach was to use the image recognition library OpenCV, which
provides tools for feature detection, for example using Haar wavelets. In order
to use feature detection, positive and negative sets containing true positive
and true negative examples need to be prepared, and the detector is then trained
using these sets. The larger the training sets, the better feature detection
becomes. However, constructing a large enough set from the existing data would
have been as time consuming as picking out wells by hand. Moreover, most of the
data used for training could not then have been used for detection, which means
that the success of the feature detection would have been difficult to assess.
Therefore, ImageJ coupled with a microarray plugin, which allows round spots to
be drawn easily on a large scale, was used as a relatively high-throughput way
of outlining mated spots. Once the spots are outlined, the same plugin outputs
statistics about each spot, including a histogram of the intensity values and
summary statistics such as the mean, variance and standard deviation of
intensities. The histograms are used for cumulative distribution computations
and in further downstream analysis.
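The step from a per-spot histogram to summary statistics and a cumulative
distribution can be sketched as below. This is an illustrative sketch only: the
function names are hypothetical, the histogram is assumed to be a list in which
entry i holds the number of pixels with grey level i, and the variance is
computed as a population variance, which may differ from the convention used by
the plugin.

```python
def histogram_stats(counts):
    """Mean, variance and standard deviation of grey levels in a histogram.

    counts[i] is the number of pixels with grey level i (assumed layout).
    The variance is the population variance (an assumption of this sketch).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram contains no pixels")
    mean = sum(level * c for level, c in enumerate(counts)) / total
    var = sum(c * (level - mean) ** 2 for level, c in enumerate(counts)) / total
    return mean, var, var ** 0.5


def histogram_cdf(counts):
    """Empirical cumulative distribution: fraction of pixels at or below each level."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram contains no pixels")
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf


if __name__ == "__main__":
    # Illustrative 8-level histogram, not measured data.
    spot_histogram = [5, 20, 40, 20, 10, 3, 1, 1]
    print(histogram_stats(spot_histogram))
    print(histogram_cdf(spot_histogram))
```

Working from histograms rather than raw pixels keeps the downstream comparison
independent of the image format, since only the per-spot counts are needed.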

Positive Interaction Identification Among Auto-Activation

The ability to identify positive interactions amongst the noise present in
fragments with high levels of auto-activation is one of the major advantages of
automatic identification and classification. For example, the Y1H 177 fragment
consistently auto-activated during the screen, but appropriate levels of 3AT
that maintained a high fraction of positive results were difficult to determine
(Figure 3.10). The method developed here allows even noisy spots to be compared
with negative controls, since both should contain roughly the same amount of
growth due to auto-activation, while positive results should contain additional
growth due to the interaction of the TF with the promoter fragment. This
distinction can be difficult to make using the traditional by-eye method.

When the new method was applied to the images for this fragment, a few
significant interactions were found that had previously been discarded by the
traditional technique.
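The idea of comparing every spot on an auto-activating plate with the negative
control can be sketched as follows. This is a hedged illustration, not the
pipeline used in this work: the function `call_positives`, the dictionary layout
(well name mapped to a grey level histogram), the significance level and the
additional requirement that a hit be brighter on average than the control are
all assumptions introduced here. The critical value is the standard large-sample
threshold for the two-sample Kolmogorov-Smirnov statistic, and it treats pixels
as independent observations, which is a simplification.

```python
import math


def cdf_from_hist(counts):
    """Fraction of pixels at or below each grey level."""
    total = sum(counts)
    out, running = [], 0
    for c in counts:
        running += c
        out.append(running / total)
    return out


def hist_mean(counts):
    """Mean grey level of a histogram where counts[i] is the pixel count at level i."""
    return sum(level * c for level, c in enumerate(counts)) / sum(counts)


def call_positives(plate, control_well="H12", alpha=0.05):
    """Return wells whose intensity distribution differs from the control
    and whose mean intensity is higher than the control's.

    plate maps well name -> histogram; all histograms share the same bins.
    """
    control = plate[control_well]
    n = sum(control)
    control_cdf = cdf_from_hist(control)
    control_mean = hist_mean(control)
    # Large-sample critical coefficient, e.g. about 1.358 for alpha = 0.05.
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))

    hits = []
    for well, hist in plate.items():
        if well == control_well:
            continue
        m = sum(hist)
        d = max(abs(a - b) for a, b in zip(control_cdf, cdf_from_hist(hist)))
        critical = c_alpha * math.sqrt((n + m) / (n * m))
        # Require a significant difference AND more bright pixels than control.
        if d > critical and hist_mean(hist) > control_mean:
            hits.append(well)
    return sorted(hits)


if __name__ == "__main__":
    # Illustrative 4-level histograms, not measured data.
    plate = {
        "H12": [50, 40, 10, 0],   # negative control with background growth
        "A01": [50, 40, 10, 0],   # same as control: auto-activation only
        "A02": [10, 20, 40, 30],  # brighter than control: candidate positive
        "C07": [100, 0, 0, 0],    # no growth at all
    }
    print(call_positives(plate))
```

Because the control carries the same auto-activation background as the test
spots, a spot is only reported when its growth exceeds that background, which is
the property the by-eye method struggles to judge.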

Limitations Of Automatic Approach

The new method has been shown to distinguish between positive and negative
results (Figure 3.6) and to allow the automatic determination of all potential
positive results on a single plate (Figure 3.7). However, one of the major
challenges that hindered successful prediction of positive interactions is the
presence of bubbles in the agar plates. When bubbles form during plate pouring
and do not dissipate, they leave a circular indentation in the plate. When the
picture of the plate is taken after the incubation period with upper white light
(see Methods), the edges of the circular indentations produce brighter outlines,
as more light is refracted internally around these edges than inside the
indentations, giving a “halo” effect. Because the analysis relies on the premise
that brighter pixels represent growing yeast, this “halo” effect introduces
artificially bright pixels. As a result, spots with the “halo” effect produce
lower P-values that can still be statistically significant. This effect can be
corrected by using appropriate spot recognition, as mentioned above. Automatic
classifiers can be trained to recognise bubble indentations and flag them
appropriately, provided that an appropriate training set exists. During the
development of the new method no such training set existed, and it was therefore
not possible to train an automatic classifier.

However, following the pairwise Y1H results presented here, there are over 6500
examples of spots containing positive, negative and auto-activating results.
These can be used to train an automatic classifier and to design a pipeline that
takes images of plates containing yeast and outputs unbiased estimates for
positive and negative interactions on the plate. This method can also be
extended to any pairwise Yeast ‘n’-Hybrid screen. Additionally, the classifier
could be trained for use with library screens using images from the library
screens presented here, although spots are not as well defined in the library
images as in the pairwise screen.

Figure 3.10: Pairwise screen of the Y1H 177 fragment against a mini-library of
TFs. All spots, with the exception of C07, have some level of growth due to
auto-activation in yeast. The negative control is located at H12.