Figure 2.1 is a flow chart of the approach. The entire surface of
the guest molecule and the binding region of the host are
represented as a series of slices of the protein surface. Each slice
details the van der Waals surface of the protein over a 32 Å x 32 Å
region. The surfaces of the proteins are smoothed and represented as
a contour map. A soft potential, based loosely on the Lennard-Jones
potential, is used to check each possible pair of maps for surface
complementarity. This comparison process is computationally
intensive and so a parallel architecture computer is used. The
matching process produces a large number of possible orientations,
which must be clustered to produce a manageable dataset. After
clustering, several hundred docking orientations remain. Each of
these orientations represents a sterically plausible way of docking
the host and guest molecules which is significantly different from the
other orientations in the set. To reduce the number of docking
orientations further, constraints are applied. The main docking
procedure concentrates solely upon steric information, and so the
constraints used are based on other information that may be
available. These are electrostatic complementarity, epitope
information, and a single imprecise distance constraint.
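The clustering step can be illustrated with a small sketch. The text does not say which clustering method was used, so the greedy "leader" scheme below is a stand-in chosen for brevity, not the DAPMatch algorithm. The function name `cluster_orientations`, the `max_dist` threshold, and the reduction of each orientation to a single (x, y, z) point are all assumptions made for illustration.

```python
import math

def cluster_orientations(hits, max_dist):
    """Greedy leader clustering (illustrative stand-in, not the thesis method).

    hits: list of (score, (x, y, z)) tuples; a lower score is taken to be better.
    Returns one representative per cluster, best-scoring first.
    """
    representatives = []
    # Visit hits from best to worst so each cluster is led by its best member.
    for score, pos in sorted(hits, key=lambda h: h[0]):
        # Keep a hit only if it is far from every representative kept so far.
        if all(math.dist(pos, rep_pos) > max_dist for _, rep_pos in representatives):
            representatives.append((score, pos))
    return representatives
```

With this scheme two nearby hits collapse into the better-scoring one, which matches the requirement that each surviving orientation be significantly different from the others in the set.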
Start
Split Antibody binding region into 64 slices
Split Antigen into 432 slices
Produce rotations of Antibody maps by varying ω
For all Antigen, Antibody pairs
Vary translation dx, dy
Vary height r
Apply potential
Finished?
Write best to disc
Finished?
Cluster
Apply constraints
End

Figure 2.1
A flow chart of the whole DAPMatch algorithm. The section
enclosed in a heavy-lined box is the computationally intensive
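The inner search in Figure 2.1 (vary translation, vary height, apply potential, keep the best) can be sketched as follows. This is a minimal illustration under stated assumptions: the exact form of the soft potential is not given here, so `soft_score` is a hypothetical capped penalty, and the names `score_pair` and `best_placement`, the contact range, and the cap value are all invented for the example. The maps are small integer height grids, with the guest map assumed to be already flipped to face the host.

```python
def soft_score(gap):
    """Hypothetical soft potential for one grid cell (not the DAPMatch function)."""
    if gap < 0:
        return min(-gap, 3)  # capped clash penalty, so small overlaps are tolerated
    if gap <= 1:
        return -1            # surfaces in contact: favourable
    return 0                 # too far apart to interact

def score_pair(host, guest, dx, dy, r):
    """Sum the per-cell potential with the guest shifted by (dx, dy) at height r."""
    n = len(host)
    total = 0
    for i in range(n):
        for j in range(n):
            gi, gj = i - dx, j - dy
            if 0 <= gi < n and 0 <= gj < n:  # only cells where the maps overlap
                total += soft_score(r - host[i][j] - guest[gi][gj])
    return total

def best_placement(host, guest, shifts, heights):
    """Exhaustive search over translations and heights; lowest score wins."""
    best = None
    for dx in shifts:
        for dy in shifts:
            for r in heights:
                s = score_pair(host, guest, dx, dy, r)
                if best is None or s < best[0]:
                    best = (s, dx, dy, r)
    return best
```

For two perfectly complementary 2x2 maps, `best_placement` finds the unshifted placement at the height where every cell is just in contact. The double loop in `score_pair` is the part that a SIMD machine would evaluate for all cells at once.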
The main docking procedure and the soft potential used were
developed and tested on the HyHEL-10 system. The constraints were
benchmarked using both the HyHEL-10 and D1.3 systems.
Once suitable parameters for the potential and subsequent constraints
had been found, the method was applied to HyHEL-5, the D1.3 model
and the enzyme/inhibitor systems without any changes. This
approach ensured that the method was not optimised to produce the
correct results.
2.5. Implementation
Simulations of the docking problem are often computer
intensive. The more orientations examined, and the more complete
the treatment of the discrimination function, the more computer time
is required. To avoid this problem a massively parallel architecture
machine, the DAP (AMT Ltd, Reading RG6 1AZ), was used. The DAP
(Distributed Array of Processors) is a 64x64 grid of closely coupled,
simple processors (Figure 2.2). Each processor carries out the same
instruction, but on different data. The DAP is therefore a single
instruction, multiple data stream (SIMD) machine. The DAP is
connected by a fast input/output (I/O) bus to a conventional
architecture computer, called the host. This computer controls the
execution of programs by the DAP and performs all the necessary file
I/O for both computers. The DAP is programmed in a highly
machine-specific form of FORTRAN which allows parallel instructions
to be written in an understandable and compact form. DAPFORTRAN
has the full set of FORTRAN mathematical routines, and many more
which are useful in the parallel context of the machine, but
unfortunately lacks any high level I/O functions.
Figure 2.2
The parallel architecture of the DAP computer, a 64x64 grid of
simple processor elements (P.E.s). Each processor has fast connections
to its four nearest neighbours (as shown in the inset). All instructions
are carried out at the processor layer; the required data is brought to
this layer from the array memory.
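The two ideas in Figure 2.2, one instruction applied to every element and data exchange with nearest neighbours, can be imitated in ordinary Python. The sketch below is only an analogy: it runs sequentially, and the helper names `simd_apply` and `shift_east` are invented for this example rather than taken from DAPFORTRAN.

```python
def simd_apply(op, plane_a, plane_b):
    """Apply the same operation to every pair of corresponding elements,
    as if each grid element had its own processor."""
    return [[op(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(plane_a, plane_b)]

def shift_east(plane, fill=0):
    """Each element takes the value held by its western neighbour.
    Elements on the western edge receive `fill` (a non-cyclic shift)."""
    return [[fill] + row[:-1] for row in plane]
```

For example, `simd_apply(lambda a, b: a + b, p, q)` adds two planes element by element, and `shift_east(p)` moves every value one column along, which is the kind of step a nearest-neighbour connection makes cheap.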
The DAP is best suited to problems which can readily be
expressed in a parallel algorithm, which involve mainly integer
arithmetic and which entail little I/O. It had already been decided to
investigate the extent to which the steric matching of surfaces could
be used to solve the docking problem. The simplest way of mapping
a protein surface onto the DAP architecture was to take a planar slice
through the protein, divide the plane into a 64x64 grid, and at each
grid point find the height of the protein surface above the plane
(Figure 2.3). For convenience these heights were taken to be integers
in the range 0 to 63. Since the surface slices taken were 64 elements
square it was convenient to map a single element onto each DAP
processor. This allowed the energy summation to be carried out
simultaneously for each of the 4096 elements. This simple mapping
of the problem onto the DAP architecture allowed large speed
improvements. A search could be completed within 2 days on the
DAP, whereas it would have taken a SUN SPARC II around 100 days,
a speed-up of roughly 50-fold.
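The height-map representation described above can be sketched in a few lines. This is an illustration under assumptions, not the DAPMatch pre-processing code: atoms are treated as hard spheres, the slice plane is placed at z = 0, the 64x64 grid is assumed to span the 32 Å x 32 Å slice (0.5 Å spacing), and the scaling of heights to the range 0 to 63 through `max_height` is a guess. The function name `height_map` and all of its parameters are hypothetical.

```python
import math

def height_map(atoms, n=64, size=32.0, levels=64, max_height=32.0):
    """Integer height of a sphere-based surface above the plane z = 0.

    atoms: list of (x, y, z, radius) in angstroms.
    Returns an n x n grid of integers in the range 0 .. levels - 1.
    """
    step = size / n
    grid = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Sample at the centre of each grid cell.
            px, py = (i + 0.5) * step, (j + 0.5) * step
            top = 0.0
            for x, y, z, rad in atoms:
                d2 = (px - x) ** 2 + (py - y) ** 2
                if d2 <= rad * rad:
                    # Top of this sphere directly above the sample point.
                    top = max(top, z + math.sqrt(rad * rad - d2))
            # Quantise to an integer level and clamp to the allowed range.
            level = int(top / max_height * (levels - 1))
            grid[i][j] = max(0, min(levels - 1, level))
    return grid
```

Each of the 4096 entries produced this way corresponds to one grid element, which is what makes the one-element-per-processor mapping natural.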
Whilst the DAPMatch steric search program was being
developed the DAP at the ICRF was connected to a SUN 3. The SUN 3
was relatively slow and the clustering and constraint algorithms
could not be implemented on it. These processes were also quite I/O
intensive and so it was not convenient to implement them on the
DAP. Instead the pre-processing and post-processing were
implemented on a SUN SPARC II. This meant that large intermediate
files had to be generated and passed between the computer systems.
The DAP is now hosted by a SUN SPARC II. This allows the DAPMatch
program to be more highly integrated, reducing the need for
intermediate files and making the package easier to use.