
3. A New MOE-Script for Improved Visualization of PLS-QSAR-Equations

3.4. Concepts of Visualization

3.4.1. Colour Scheme and Object Visualization

When using a graphical representation and assigning an object a special colour, this colour can contain information. Colouring atoms and bonds, for example, has already been used by various chemical software packages, e.g. SYBYL HQSAR [11].

Colour        From       To
red                      -0.2415
red_orange    -0.2415    -0.1449
orange        -0.1449    -0.0966
white         -0.0966     0.1018
yellow         0.1018     0.1526
green_blue     0.1526     0.2544
green          0.2544

SYBYL offers the HQSAR package introduced earlier, including fragment contribution prediction (see section 2.1, [11]). It breaks the contributions down into seven discrete colour codes (see table 3.1), ranging from red for bad contributions via white for neutral ones to green for good ones. Between white and green, yellow and green-blue serve as intermediate values, so a contribution that is in principle positive is marked yellow. Yellow, however, is commonly read as “Watch out!” or “Be careful!”. Furthermore, yellow has a negative connotation in both folk tradition and Goethe’s theory of colours [15]. Green-blue, in turn, is seen as the “coldest” of all colours and is therefore equally unsuited to signalling positive contributions. The MIBALS add-on package for MOE ([54], see also section 2.2), on the other hand, assigns only two colours, grey and red. The amount of contribution is then visualized by the size of so-called wirespheres, wireframe models of spheres placed over the specific atom [34, Graphics Object Functions]. Red is used for negative and grey for positive contributions, which is not very intuitive either, especially as grey is already used for the carbon atoms of the molecule.
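The discrete classification described above can be illustrated with a short sketch. It is written in Python purely for illustration and is not SYBYL or MOE code; the function name hqsar_colour is hypothetical, and the handling of values that lie exactly on a class boundary is an assumption, since the table does not state which side is inclusive.

```python
from bisect import bisect_right

# Upper bounds of the first six classes, taken from the colour table.
BOUNDS = [-0.2415, -0.1449, -0.0966, 0.1018, 0.1526, 0.2544]
COLOURS = ["red", "red_orange", "orange", "white",
           "yellow", "green_blue", "green"]


def hqsar_colour(contribution):
    """Return the discrete colour class for one atomic contribution.

    Assumption: a value exactly on a boundary falls into the upper class.
    """
    return COLOURS[bisect_right(BOUNDS, contribution)]


if __name__ == "__main__":
    for value in (-0.3, -0.1, 0.0, 0.12, 0.3):
        print(value, hqsar_colour(value))
```

The sketch makes the criticism concrete: a contribution of 0.12, although positive, is reported as yellow.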

Implementing a Good Colour Scheme

Our script therefore provides an easily interpretable, intuitive colour scheme ranging from red for negative contributions via white for neutral atoms to green for positive ones. We skip intermediate colours and do not sort the values into discrete classes as SYBYL does, but use a continuous scheme that assigns pastel shades to intermediate values and an intense red or green to the extreme values. MOE supports 24-bit RGB colours and thus has a scale of 256 steps (1 byte) each for red (R), green (G) and blue (B); the resulting colour is obtained by additively mixing the three components. It is represented by one integer value with the high byte coding for red [34, Color Widget]:

$$\mathrm{code} = \mathrm{red} \cdot 256^{2} + \mathrm{green} \cdot 256 + \mathrm{blue}$$
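As a quick check of this encoding, the following Python sketch packs three channel values into one integer and splits it again. Python is used only for illustration; the helper names pack_rgb and unpack_rgb are hypothetical and are not part of MOE.

```python
def pack_rgb(red, green, blue):
    """Combine three 8-bit channels into one integer, red in the high byte."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must lie in 0..255")
    return red * 256**2 + green * 256 + blue


def unpack_rgb(code):
    """Split a packed integer back into its (red, green, blue) channels."""
    return (code // 256**2) % 256, (code // 256) % 256, code % 256


if __name__ == "__main__":
    white = pack_rgb(255, 255, 255)
    print(hex(white), unpack_rgb(white))
```

Because 256 squared equals 16 to the fourth power, this is the same weighting that a hexadecimal colour literal such as 0xC8FFC8 expresses.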

As the focus is set on one molecule, the colour scheme is scaled over the molecule’s own values, so that its largest absolute contribution is shown in an extreme green or red. The colour is calculated using the local function calculateColors, shown in listing 3.7.

 1  local function calculateColors[atoms,values]
 2      local maximum = max (abs values);
 3      local multiplier = 200/maximum;
 4      local colors = zero atoms;
 5
 6      local posvalues = values | values > 0;  // assumed: this line is missing from the listing as extracted
 7      local posgreenvalues = mput[zero posvalues,igen(length posvalues),255];
 8      local posredvalues = ceil (200-multiplier*posvalues);
 9      (colors | values > 0) = pow[16,4]*posredvalues + pow[16,2]*posgreenvalues + posredvalues;
10
11      //similar calculation for negative values skipped
12
13      (colors | values == 0) = pow[16,4]*255 + pow[16,2]*255 + 255;
14
15      return colors;
16  endfunction

Listing 3.7: Determining object colours in dependence on values

In listing 3.7, atoms is an array of atoms for which the colours have to be calculated in dependence on values. For scaling, the maximum absolute value and the resulting multiplier are determined in lines 2 and 3; the product multiplier*value is thus 200 for the maximum value and 100 for half of it. For all (in this case) positive values, the red, green and blue components have to be determined. To get the pastel scale, we set the green component to its maximum (255, line 7). The red and blue components then depend on the value (line 8), giving a gradation between green (RGB code 0,255,0) and (near) white (200,255,200). In line 9 we subsequently calculate the integer code for MOE as shown above. Zero values are set to white (line 13), then the colour array is returned.
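For readers without access to MOE, the scaling in listing 3.7 can be reproduced with the following Python sketch. It is an illustrative port, not the script itself: the name calculate_colors is hypothetical, the branch for negative values is assumed to mirror the positive one (the listing skips it), and the guard against an all-zero input is an addition that the listing does not show.

```python
import math


def calculate_colors(values):
    """Return one packed RGB integer per contribution value."""
    maximum = max((abs(v) for v in values), default=0.0)
    # Added guard (not in the listing): avoid dividing by zero.
    multiplier = 200.0 / maximum if maximum > 0 else 0.0
    colors = []
    for v in values:
        # 0 for the largest magnitude, close to 200 for values near zero.
        faded = math.ceil(200 - multiplier * abs(v))
        if v > 0:
            red, green, blue = faded, 255, faded   # pastel to full green
        elif v < 0:
            red, green, blue = 255, faded, faded   # assumed mirror for red
        else:
            red, green, blue = 255, 255, 255       # neutral atoms are white
        colors.append(red * 256**2 + green * 256 + blue)
    return colors


if __name__ == "__main__":
    print([hex(c) for c in calculate_colors([1.0, 0.5, 0.0, -1.0])])
```

Since the multiplier is derived from the largest absolute value of the input, the same contribution can receive different shades in different molecules.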

When using the parallel view mode, the colours are likewise scaled only over each molecule itself; an intense green in one molecule therefore cannot be compared directly with an intense green in another molecule displayed simultaneously.

Activity by Colours

As shown in the SYBYL HQSAR package, one of the easiest ways to signal contribution is to simply colour the atom according to its value. For this purpose MOE offers the aSetRGB [atoms, color] function, which defines a freely assignable RGB colour for atoms. At that point, the RGB colour of an atom is defined but not yet shown, as the package offers various colour modes for atoms. By default, atoms are displayed with the colour assigned to their element type, ’element’. To change to RGB, the command aSetColorBy [atoms,’a:rgb’] has to be issued for the corresponding atoms. The colouring itself is done using the function colorAtoms[].

“Colouring the atom” is misleading in terms of MOE’s graphical capabilities, as the atom comprises not only the core point but also half of every bond the atom takes part in. On the one hand, this makes the assignment clearly visible, as the colour is not confined to a single point. On the other hand, it is impossible to assign a different colour to the bond itself.

An inherent disadvantage of this method is that the magnitude of the value is conveyed only by the colour, which sometimes makes it difficult to distinguish between only slightly different values. As the original colour is overwritten, the element colour coding of the molecule is hidden, which might lead to problems, although chemists typically know the molecules they use for QSAR very well. Additionally, the atoms’ elements can be indicated by using the element text feature of MOE.

The main advantage of this method is the simple calculation, which is fast even for a large number of atoms, as there are no additional graphical objects to draw. Furthermore, the graphical representation is self-explanatory and easy to understand. The textual representation of the value remains clearly visible, as no additional objects can hide it. An example of the colour visualization is shown in figure 3.3a.

(a) Activity by colour. (b) Wirespheres mark the activity. Scale factor 0.5.

Figure 3.3: The two possible visualization options for contributions to activity. In both cases, GPV0265 and GPV0049 (see appendix B) are displayed. The MLCS is not shown.

Wirespheres as Visual Marker for Activity

In contrast to simple colouring, which uses pre-existing objects such as bonds or atoms, the MIBALS approach ([54], see also section 2.2) draws new objects, so-called graphic object spheres, in order to visualize its model fragment scores. These are wireframe models of spheres whose radii depend on the score of each fragment. The object size is therefore the only means of estimating the value of the score, as the colouring is just red or grey. The function wiresphereAtoms[] implements the wiresphere creation procedure in our script.

The graphical object has to be defined using GCreate; then a vertex object is set up as a wiresphere model, for which MOE provides a dedicated graphics function, G_WireSphere. It accepts four parameters: apart from colour and position, a radius for the sphere and a quality parameter, which can be set from “poor” (0) to “excessive” (4) or even higher [34, Graphics Object Functions]. We chose 2 as a compromise. If speed problems occur with wirespheres, setting this parameter to 1 might be a good approach. The sphere itself is then created by GVertex, which combines the generic graphics object with the parameters set up by G_WireSphere.

As the spheres’ radii are based on the absolute values of the contributions, their sizes can differ extremely. Very large spheres can even be bigger than the whole molecule, so that the centres of the spheres, which coincide with the atoms, are difficult to locate; this makes assignment and interpretation difficult and may hide the textual value representation. Very small spheres make comparisons impossible. To overcome this problem, we implement a scaling feature, wirespherefactor. This factor resizes all visible spheres according to a numeric value entered by the user, the default being two. It is accessible via the graphical user interface on the “Options” tab and is only active if “Display Mode” is switched to “Wirespheres” (see section 3.5.1).
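The effect of such a scaling factor can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says that the radii are based on the absolute contribution and that the factor resizes all spheres, but it does not give the formula, so the simple product used here and the function name wiresphere_radii are hypothetical.

```python
def wiresphere_radii(values, factor=2.0):
    """Return one sphere radius per atom.

    Assumption: radius = factor * |contribution|; the script's exact
    formula is not given in the text.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [factor * abs(v) for v in values]


if __name__ == "__main__":
    contributions = [0.5, -1.5, 0.0]
    print(wiresphere_radii(contributions))               # default factor
    print(wiresphere_radii(contributions, factor=0.5))   # smaller spheres
```

Under this assumption the factor changes all radii by the same ratio, so the relative sizes of the spheres within one view are preserved while the overall size is adapted to the molecule.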

Apart from slow creation when displaying more than one molecule (see section 3.4.5), the wiresphere model offers many advantages. As the original molecule remains visible, the elements can still be identified and other features implemented in MOE can still be used. Additionally, the spheres double-code the values: on the one hand, the colour equals that of the bond colouring; on the other hand, the size of the spheres is adapted to the value. For an example, refer to figure 3.3b.