Overview of the Method

Figure 2.1 is a flow chart of the approach. The entire surface of the guest molecule and the binding region of the host are represented as a series of slices of the protein surface. Each slice details the van der Waals surface of the protein over a 32Å x 32Å region. The surfaces of the proteins are smoothed and represented as a contour map. A soft potential, based loosely on the Lennard-Jones potential, is used to check each possible pair of maps for surface complementarity. This comparison process is computationally intensive and so a parallel architecture computer is used. The matching process produces a large number of possible orientations, which must be clustered to produce a manageable dataset. After clustering, several hundred docking orientations remain. Each one of these orientations represents a sterically plausible way of docking the host and guest molecules which is significantly different from the other orientations in the set. To reduce the number of docking orientations further, constraints are applied. The main docking procedure concentrates solely upon steric information and so the constraints used are based on other information that may be available. These are electrostatic complementarity, epitope information, and a single imprecise distance constraint.
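
The comparison of one pair of maps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not give the functional form of the soft potential, so `soft_score`, its parameters `contact` and `clash_penalty`, and the helper names `pair_score` and `best_placement` are all hypothetical. The sketch rewards near contact, applies a finite (soft) penalty to overlap, and keeps the best translation and height for a pair of integer height maps.

```python
# Illustrative only: the scoring form and every name here are assumptions,
# not the DAPMatch potential.

def soft_score(gap, contact=1, clash_penalty=3):
    """Score one grid element from the gap between the two surfaces."""
    if gap < 0:                      # surfaces overlap: finite (soft) penalty
        return -clash_penalty * (-gap)
    if gap <= contact:               # surfaces touch or nearly touch: reward
        return 1
    return 0                         # too far apart to contribute


def pair_score(host, guest, dx, dy, r):
    """Sum the soft score over the overlapping elements of two height maps.

    host and guest are square lists of integer heights measured from their
    own planes. The planes face each other a distance r apart, so the gap at
    an element is r - host_height - guest_height.
    """
    n = len(host)
    total = 0
    for i in range(n):
        for j in range(n):
            gi, gj = i + dx, j + dy
            if 0 <= gi < n and 0 <= gj < n:
                total += soft_score(r - host[i][j] - guest[gi][gj])
    return total


def best_placement(host, guest, shifts, heights):
    """Vary dx, dy and r exhaustively and keep the best-scoring placement."""
    best = None
    for dx in shifts:
        for dy in shifts:
            for r in heights:
                s = pair_score(host, guest, dx, dy, r)
                if best is None or s > best[0]:
                    best = (s, dx, dy, r)
    return best


# A bump on the host fits a matching pit on the guest.
host = [[1, 1, 1], [1, 3, 1], [1, 1, 1]]
guest = [[3, 3, 3], [3, 1, 3], [3, 3, 3]]
print(best_placement(host, guest, shifts=[-1, 0, 1], heights=[3, 4]))  # (9, 0, 0, 4)
```

In the toy example the unshifted placement at height 4 brings all nine elements into contact, whereas height 3 makes every element overlap and is penalised. The real search repeats this kind of comparison for every pair of host and guest maps and for every rotation, which is why the text describes it as computationally intensive.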

Start
Split Antibody binding region into 64 slices
Split Antigen into 432 slices
Produce rotations of Antibody maps by varying ω
For all Antigen, Antibody pairs
Vary translation dx, dy
Vary height r
Apply potential
Finished?
Write best to disc
Finished?
Cluster
Apply constraints
End

Figure 2.1

A flow chart of the whole DAPMatch algorithm. The section enclosed in a heavy-lined box is the computationally intensive

The main docking procedure and the soft potential used were developed and tested on the HyHEL-10 system. The constraints were benchmarked using both the HyHEL-10 and D1.3 systems. Once suitable parameters for the potential and subsequent constraints had been found, the method was applied to HyHEL-5, the D1.3 model and the enzyme/inhibitor systems without any changes. This approach ensured that the method was not optimised to produce the correct results.

2.5. Implementation

Simulations of the docking problem are often computer intensive. The more orientations examined, and the more complete the treatment of the discrimination function, the more computer time is required. To avoid this problem a massively parallel architecture machine, the DAP (AMT Ltd, Reading RG6 1AZ), was used. The DAP (Distributed Array of Processors) is a 64x64 grid of closely coupled, simple processors (Figure 2.2). Each processor carries out the same instruction, but on different data. The DAP is therefore a single instruction, multiple data stream (SIMD) machine. The DAP is connected by a fast input/output (I/O) bus to a conventional architecture computer, called the host. This computer controls the execution of programs by the DAP and performs all the necessary file I/O for both computers. The DAP is programmed in a highly machine-specific form of FORTRAN which allows parallel instructions to be written in an understandable and compact form. DAPFORTRAN has the full set of FORTRAN mathematical routines, and many more which are useful in the parallel context of the machine, but unfortunately lacks any high level I/O functions.
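
The SIMD idea can be illustrated with a toy model in Python. This is a sketch, not DAPFORTRAN: the helper names `simd_apply` and `shift_north` are hypothetical, a 4x4 grid stands in for the 64x64 array, and filling the vacated edge row with a constant is an assumption about boundary handling. It shows one operation applied at every grid position at once, and data moving between nearest neighbours.

```python
# Toy model of SIMD execution on a grid; a small N stands in for the DAP's 64.
N = 4


def simd_apply(op, *planes):
    """Apply the same operation at every grid position (one 'instruction')."""
    return [[op(*(p[i][j] for p in planes)) for j in range(N)] for i in range(N)]


def shift_north(plane, fill=0):
    """Every element takes the value of its southern neighbour.

    Values move one row towards row 0; the vacated last row is set to `fill`.
    """
    return plane[1:] + [[fill] * N]


a = [[i * N + j for j in range(N)] for i in range(N)]   # 0..15 laid out by row
b = [[1] * N for _ in range(N)]                          # a plane of ones

summed = simd_apply(lambda x, y: x + y, a, b)   # one add, applied everywhere
shifted = shift_north(a)                         # one nearest-neighbour move
```

On a serial machine `simd_apply` still loops over the elements; on a processor array each element would be handled by its own processor in the same step.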

[Diagram labels: simple processor elements (P.E.) linked to neighbouring elements by fast connections; processor layer; 64]

Figure 2.2

The parallel architecture of the DAP computer, a 64x64 grid of simple processor elements (P.E.s). Each processor has fast connections to its four nearest neighbours (as shown in the inset). All instructions are carried out at the processor layer; the required data is brought to this layer from the array memory.90

The DAP is best suited to problems which can readily be expressed in a parallel algorithm, which involve mainly integer arithmetic and which entail little I/O. It had already been decided to investigate the extent to which the steric matching of surfaces could be used to solve the docking problem. The simplest way of mapping a protein surface onto the DAP architecture was to take a planar slice through the protein, divide the plane into a 64x64 grid, and at each grid point find the height of the protein surface above the plane (Figure 2.3). For convenience these heights were taken to be integers in the range 0 to 63. Since the surface slices taken were 64 elements square it was convenient to map a single element onto each DAP processor. This allowed the energy summation to be carried out simultaneously for each of the 4096 elements. This simple mapping of the problem onto the DAP architecture allowed large speed improvements. A search could be completed within 2 days on the DAP, whereas it would have taken a Sun SPARC II around 100 days.
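
The height-map representation described above can be sketched as follows. This is a minimal illustration, assuming that a 32 Å slice is sampled by the 64 grid points (0.5 Å per element) and that one integer height step is also 0.5 Å; neither scale is stated explicitly in the text. The function name `height_map`, the sphere tuples, and the constants are hypothetical, and the smoothing step is omitted.

```python
import math

GRID = 64          # elements per side, one per DAP processor (from the text)
SPACING = 0.5      # Å per element: assumes a 32 Å slice spans 64 elements
HEIGHT_UNIT = 0.5  # Å per integer height step: an assumption
MAX_HEIGHT = 63    # heights are stored as integers 0..63 (from the text)


def height_map(spheres):
    """Return a GRID x GRID list of integer surface heights above the plane z = 0.

    spheres is a list of (x, y, z, radius) tuples in Å, with x and y measured
    from one corner of the slice. Each sphere stands for an atom with its
    van der Waals radius.
    """
    grid = [[0] * GRID for _ in range(GRID)]
    for i in range(GRID):
        for j in range(GRID):
            px, py = i * SPACING, j * SPACING
            top = 0.0
            for x, y, z, rad in spheres:
                d2 = (px - x) ** 2 + (py - y) ** 2
                if d2 <= rad * rad:
                    # Highest point of this sphere directly above (px, py).
                    top = max(top, z + math.sqrt(rad * rad - d2))
            level = int(round(top / HEIGHT_UNIT))
            grid[i][j] = max(0, min(MAX_HEIGHT, level))   # clamp to 0..63
    return grid


# One atom of radius 2 Å centred 5 Å above the middle of the slice.
hm = height_map([(16.0, 16.0, 5.0, 2.0)])
print(hm[32][32])  # 14, i.e. 7 Å at 0.5 Å per step
```

Each of the 64 x 64 = 4096 entries is an independent small integer, which is what allows one element to be assigned to each processor.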

Whilst the DAPMatch steric search program was being developed the DAP at the ICRF was connected to a SUN 3. The SUN 3 was relatively slow and the clustering and constraint algorithms could not be implemented on it. These processes were also quite I/O intensive and so it was not convenient to implement them on the DAP. Instead the pre-processing and post-processing were implemented on a SUN SPARC II. This meant that large intermediate files had to be generated and passed between the computer systems. The DAP is now hosted by a SUN SPARC II. This allows the DAPMatch program to be more highly integrated, reducing the need for intermediate files and making the package easier to use.
