2.2.3 Structural Comparison

2.2.3.1 Structure-Structure Comparison

The COCOPLOT toolkit generates a measure of structural similarity based on the number of equivalent, i.e. overlapping, contacts between two protein structures. Inter-residue contacts buried in the protein core not only provide a description of tertiary protein structure but also contribute to the stability of the overall fold. As a result, these contacts are often highly conserved during evolution.

During evolution a structure can be affected not only by point mutations, where one residue changes identity to another, but also by indels of sometimes large fragments of amino acid sequence (see section 1.2.4.1). Therefore, any method that examines these distant structural relationships must take these indels into account by allowing gaps to be introduced into the alignment. For structure-structure comparisons the SSAP algorithm (Orengo & Taylor, 1996) was used to generate a structural alignment between the two proteins. To generate the pairwise contact overlap score, the contact map of each structure was first generated, and the SSAP alignment was then used to identify equivalent positions in the pairwise alignment where both structures have an inter-residue contact. The contact overlap score, $S_{\text{structure-structure}}$, is given by the number of overlapping contacts as a percentage of the larger of the two structures' contact counts (see equation 2.3).

$$S_{\text{structure-structure}} = \frac{C_{\text{overlap}}}{C_{\max}} \times 100 \qquad (2.3)$$

Where

$C_{\text{overlap}}$ = Number of overlapping contacts between structures (I) and (J)

$C_{\max}$ = $\max(\text{Contacts}_I, \text{Contacts}_J)$
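The calculation in equation 2.3 can be sketched in a few lines of Python. This is a minimal illustration rather than the COCOPLOT implementation: the function name, the representation of a contact map as a set of residue-index pairs, and the representation of the alignment as a list of index pairs with `None` marking a gap are all assumptions made for the example.

```python
from typing import Iterable, Optional, Set, Tuple

Contact = Tuple[int, int]  # residue index pair (i, j) with i < j


def contact_overlap_score(
    contacts_i: Set[Contact],
    contacts_j: Set[Contact],
    alignment: Iterable[Tuple[Optional[int], Optional[int]]],
) -> float:
    """Overlapping contacts as a percentage of the larger contact count."""
    # Map residues of structure I to their aligned partners in structure J,
    # skipping alignment columns that contain a gap (None).
    i_to_j = {a: b for a, b in alignment if a is not None and b is not None}

    overlap = 0
    for a, b in contacts_i:
        # A contact can only overlap if both of its residues are aligned.
        if a in i_to_j and b in i_to_j:
            mapped = tuple(sorted((i_to_j[a], i_to_j[b])))
            if mapped in contacts_j:
                overlap += 1

    c_max = max(len(contacts_i), len(contacts_j))
    return 100.0 * overlap / c_max if c_max else 0.0


# Hypothetical toy data: residue 2 of structure I is aligned to a gap.
contacts_i = {(0, 9), (1, 10), (2, 12)}
contacts_j = {(0, 8), (1, 9)}
alignment = [(0, 0), (1, 1), (2, None)] + [(k, k - 1) for k in range(3, 13)]

score = contact_overlap_score(contacts_i, contacts_j, alignment)
print(f"{score:.2f}")
```

In the toy data two of the three contacts in structure I map onto contacts in structure J, and the third is lost because one of its residues falls in a gap, so the score is two overlapping contacts out of a maximum of three.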

Figure 2.6 shows the comparison contact map between two actin-binding proteins (PDB codes 2vik and 1svq) based on an SSAP structural alignment. These two proteins are structurally similar (SSAP score of 83), yet evolutionarily distant (17% sequence identity). In the comparison contact map, the contacts in the first structure (2vik) are shown as grey dots, the contacts in the second structure (1svq) are shown as black dots and the overlapping contacts are shown as red dots. The minimum sequence distance of 8 residues can be seen as the yellow band on the main diagonal. This is imposed to avoid including frequently occurring contact patterns between residues close in sequence, since these patterns (typical of secondary structures) are common to both related and unrelated structures.
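The effect of the minimum sequence distance can be illustrated with a small contact-map sketch. The contact definition is not given in this section, so the choices here are assumptions made purely for illustration: one representative coordinate per residue, a distance cutoff of 8.0 Å, and a filter that keeps pairs separated by at least `min_seq_sep` positions. The function name is hypothetical.

```python
import math
from typing import Sequence, Set, Tuple

Point = Tuple[float, float, float]


def contact_map(
    coords: Sequence[Point],
    cutoff: float = 8.0,      # assumed distance cutoff in angstroms
    min_seq_sep: int = 8,     # minimum sequence distance between residues
) -> Set[Tuple[int, int]]:
    """Return residue pairs (i, j), i < j, that are close in space but
    at least min_seq_sep positions apart in sequence."""
    contacts = set()
    n = len(coords)
    for i in range(n):
        # Starting j at i + min_seq_sep skips the band along the main
        # diagonal, where local secondary-structure contacts dominate.
        for j in range(i + min_seq_sep, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


# Hypothetical toy chain: 12 residues spaced 1 angstrom apart on a line.
line = [(float(k), 0.0, 0.0) for k in range(12)]

filtered = contact_map(line)                   # sequence-local pairs excluded
unfiltered = contact_map(line, min_seq_sep=1)  # every close pair kept
print(len(filtered), len(unfiltered))
```

Without the filter, most of the contacts in the toy chain come from residues that are near neighbours in sequence, which is the kind of uninformative signal the diagonal band is meant to remove.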

Chapter 2. Inter-Residue Contacts for Structural Comparison

[Figure 2.6: comparison contact map (image not reproduced). Legend: no contact; minimum sequence distance; contact (protein 1); contact (protein 2); contact (overlap). Protein 1: 2vik00, CATH 3.40.20.10, length 126, 297 contacts. Protein 2: 1svq00, CATH 3.40.20.10, length 94, 206 contacts.]

Figure 2.6: Pairwise structure-structure comparison by overlapping contact maps. The alignment between the structures is shown by a secondary structure schematic (alpha-helices in magenta, beta-strands in yellow) with the contacts of the first structure shown as black dots, the contacts of the second structure as grey dots and overlapping contacts as red dots. The values seen in the bottom-left box relate to CONALIGN parameters (see section 2.3).

2.2.3.2 Structure-Template Comparison

To compare the similarity in contact maps between a query structure and a 3D template, a structural alignment must again be performed to identify the set of equivalent positions. This structural alignment was achieved using the CORALIGN program from the CORA suite (Orengo, 1999), which aligns a single protein structure to the consensus structural template generated from a CORA multiple structure alignment. The COCOPLOT program is then used to generate a COM for the template and a contact map for the single structure. The contact overlap score, $S_{\text{structure-template}}$, is then calculated from the number of contacts that occur between equivalent positions in the alignment, expressed as a percentage of the larger contact count (see equation 2.4).

$$S_{\text{structure-template}} = \frac{C_{\text{overlap}}}{C_{\max}} \times 100 \qquad (2.4)$$

Where

$C_{\text{overlap}}$ = Number of overlapping contacts between contacts in structure (I) and consensus contacts in structural template (J)

$C_{\max}$ = $\max(\text{Contacts}_I, \text{Consensus Contacts}_J)$
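Equation 2.4 can be sketched for the template case as follows. This is an illustrative sketch only, not the COCOPLOT or CORALIGN code: it assumes the consensus contacts are already available as pairs of template alignment positions, and that the alignment is supplied as a dictionary from query residue index to template position. The function and variable names are hypothetical.

```python
from typing import Dict, Set, Tuple

Contact = Tuple[int, int]


def template_overlap_score(
    query_contacts: Set[Contact],
    consensus_contacts: Set[Contact],
    query_to_template: Dict[int, int],
) -> float:
    """Query contacts that coincide with consensus contacts of the
    template, as a percentage of the larger contact count."""
    overlap = 0
    for a, b in query_contacts:
        # Query residues absent from the mapping are unaligned (gapped).
        if a in query_to_template and b in query_to_template:
            pair = tuple(sorted((query_to_template[a], query_to_template[b])))
            if pair in consensus_contacts:
                overlap += 1

    c_max = max(len(query_contacts), len(consensus_contacts))
    return 100.0 * overlap / c_max if c_max else 0.0


# Hypothetical toy data: query residues 0-14 align to template positions 1-15.
query_contacts = {(0, 10), (2, 14), (5, 20)}
consensus_contacts = {(1, 11), (3, 15), (4, 18), (6, 25)}
query_to_template = {k: k + 1 for k in range(15)}

template_score = template_overlap_score(
    query_contacts, consensus_contacts, query_to_template
)
print(template_score)
```

Here the template has more contacts than the query, so the denominator is the consensus contact count, and a query with few contacts cannot reach a high score against a contact-rich template.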