Elementary steps of the reductive amination reaction. a) The reaction is performed if the required primary amine (4) and carboxylic acid (2) are recognized. b) The oxygen (2) is removed from the carboxylic group. c) The hybridisation of carboxylic carbon (1) is changed to sp3. d) A new bond is formed between the two fragments (4 and 1) with the specified bond length and dihedral angles.
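The four elementary steps in the caption can be read as a small graph-editing routine. The following is a minimal sketch under stated assumptions, not the SynSPROUT implementation: the `Atom` and `Fragment` classes, the function `join_fragments`, and the bond length value used in the example are all hypothetical, atom labels are assumed to be unique across both fragments, and the dihedral angles are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    label: int                   # atom label as used in the caption (1, 2, 4, ...)
    element: str
    hybridisation: str = "sp3"


@dataclass
class Fragment:
    atoms: dict = field(default_factory=dict)   # label -> Atom
    bonds: set = field(default_factory=set)     # frozenset({label_a, label_b})

    def add_atom(self, atom):
        self.atoms[atom.label] = atom

    def add_bond(self, a, b):
        self.bonds.add(frozenset((a, b)))


def join_fragments(amine, acid, n_label, c_label, o_label, bond_length):
    """Apply steps a) to d); return the joined fragment and the new bond length."""
    # a) Recognise the required groups before doing anything else.
    n_atom = amine.atoms.get(n_label)
    if n_atom is None or n_atom.element != "N":
        raise ValueError("amine nitrogen not recognised")
    if (c_label not in acid.atoms or o_label not in acid.atoms
            or frozenset((c_label, o_label)) not in acid.bonds):
        raise ValueError("carboxylic group not recognised")

    product = Fragment()
    # b) Copy every atom and bond except the oxygen that is removed.
    for atom in list(amine.atoms.values()) + list(acid.atoms.values()):
        if atom.label != o_label:
            product.add_atom(Atom(atom.label, atom.element, atom.hybridisation))
    for bond in amine.bonds | acid.bonds:
        if o_label not in bond:
            product.bonds.add(bond)

    # c) Change the hybridisation of the carboxylic carbon to sp3.
    product.atoms[c_label].hybridisation = "sp3"

    # d) Form the new bond between the two fragments.
    product.add_bond(n_label, c_label)
    geometry = {frozenset((n_label, c_label)): bond_length}
    return product, geometry


# Example with the caption's labels; 1.47 is an illustrative bond length only.
amine = Fragment()
amine.add_atom(Atom(4, "N"))
amine.add_atom(Atom(5, "C"))
amine.add_bond(4, 5)

acid = Fragment()
acid.add_atom(Atom(1, "C", "sp2"))
acid.add_atom(Atom(2, "O", "sp2"))
acid.add_atom(Atom(3, "C"))
acid.add_bond(1, 2)
acid.add_bond(1, 3)

product, geometry = join_fragments(amine, acid, 4, 1, 2, 1.47)
```

The input fragments are copied rather than modified, so the same library fragment can be reused in later joining steps.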

3.3.2 New fragment library

A new template library can be built as described earlier in section 3.2.5.1 under ‘Template library manager’ (page 40). A knowledge base for fragments can be generated in the same way as the normal template library. The library requires 3-D information and conformations for each fragment. If fragments generated by other programs lack this information, it can be generated while importing MOL/SD files into the new library, for example with the CORINA and ROTATE programs. Imported fragments are analysed against the Synthetic Knowledge Base to detect all of their functional groups. Fragments are then inserted into the library groups according to their detected functionality.
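The grouping step described above can be illustrated with a short sketch. It assumes a toy knowledge base that maps a functional group name to a SMILES substring; this naive string test is only a stand-in for real substructure matching, and the names `KNOWLEDGE_BASE`, `detect_groups` and `build_library` are hypothetical rather than part of SynSPROUT.

```python
from collections import defaultdict

# Hypothetical stand-in for the Synthetic Knowledge Base:
# functional group name -> naive SMILES substring (NOT real substructure matching).
KNOWLEDGE_BASE = {
    "carboxylic acid": "C(=O)O",
    "amine": "N",
}


def detect_groups(smiles, knowledge_base=KNOWLEDGE_BASE):
    """Return the names of all functional groups whose pattern occurs in the SMILES."""
    return [name for name, pattern in knowledge_base.items() if pattern in smiles]


def build_library(fragments, knowledge_base=KNOWLEDGE_BASE):
    """Insert each (name, smiles) fragment into every library group it matches."""
    library = defaultdict(list)
    for name, smiles in fragments:
        groups = detect_groups(smiles, knowledge_base)
        # Fragments without a detected group are kept apart instead of dropped.
        for group in groups or ["unclassified"]:
            library[group].append(name)
    return dict(library)


fragments = [
    ("glycine", "NCC(=O)O"),
    ("ethylamine", "CCN"),
    ("acetic acid", "CC(=O)O"),
    ("propane", "CCC"),
]
library = build_library(fragments)
```

A fragment with several functional groups, such as glycine here, appears in more than one library group, which matches the idea that each functional group is a possible connection point.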

3.3.3 Differences between Classic and SynSPROUT

Structure generation with SynSPROUT is similar to the Classic SPROUT program in the CANGAROO and HIPPO modules. The only difference is in the ELEFANT module, where the docked templates come from the new fragment library. The bi-directional sequential structure generation in the SPIDER module also works as in Classic SPROUT; the main difference is that only synthetic joining is allowed. In every step of the structure generation phase the partial skeleton can grow from its functional group by extending it with all selected fragments which have a corresponding functional group according to the synthetic rules. Some user-defined parameters also differ from Classic SPROUT, such as the maximum number of acceptor/donor atoms in the skeleton, maximum number of stereo centres, maximum number of 5- and 6-membered aromatic rings, maximum number of synthetic joins, molecular weight and nearest functional group tolerance. The program provides the user with a ‘Synthetic Rules and Functional Group Selection’ list, in which the desired synthetic reactions can be chosen. This list is built from the synthetic knowledge base attached to the job. Spacer templates are selected from a fragment library after every stepwise connection. Since the functional groups are the connection points for further growth, it is advisable to select only fragments which have at least two functional groups. There are also new options in the SynSPROUT menu system: some new options have been added to the ‘Skeleton’ menu and a completely new ‘Synthetic’ display option has been added.
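A single growth step under these constraints might look like the sketch below. It is an illustration only: the `Fragment` class, the rule set `SYNTHETIC_RULES`, the function names and the limit values are assumptions, only two of the user-defined parameters (molecular weight and number of synthetic joins) are modelled, and the weight check simply adds fragment weights without accounting for atoms lost in the reaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    groups: tuple      # names of the functional groups on the fragment
    weight: float      # approximate molecular weight in g/mol


# Hypothetical synthetic rules: unordered pairs of groups that may be joined.
SYNTHETIC_RULES = {frozenset(("primary amine", "carboxylic acid"))}


def can_join(group_a, group_b, rules=SYNTHETIC_RULES):
    """True if a selected synthetic rule connects the two functional groups."""
    return frozenset((group_a, group_b)) in rules


def growth_candidates(open_group, skeleton_weight, joins_so_far, library,
                      max_weight, max_joins, rules=SYNTHETIC_RULES):
    """Return the fragments that may extend the skeleton at its open group."""
    if joins_so_far >= max_joins:
        return []
    candidates = []
    for fragment in library:
        # Keep only fragments with at least two groups so growth can continue.
        if len(fragment.groups) < 2:
            continue
        if skeleton_weight + fragment.weight > max_weight:
            continue
        if any(can_join(open_group, group, rules) for group in fragment.groups):
            candidates.append(fragment)
    return candidates


library = [
    Fragment("glycine", ("primary amine", "carboxylic acid"), 75.07),
    Fragment("acetic acid", ("carboxylic acid",), 60.05),
    Fragment("4-aminobenzoic acid", ("primary amine", "carboxylic acid"), 137.14),
]
names = [f.name for f in growth_candidates(
    "primary amine", 150.0, 1, library, max_weight=250.0, max_joins=5)]
```

In this example acetic acid is rejected because it has a single functional group and 4-aminobenzoic acid because it would exceed the weight limit, so only glycine remains.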

3.4 Further modelling applications

3.4.1 Moloc

Moloc74 is an interactive modelling program for molecular structure calculations. It allows the user to create and display chemical structures, start and monitor a variety of calculations, and analyse structures and the results of calculations. A wide variety of evaluations is possible, from geometric properties to energetic quantities. The program uses the MAB force field,75,76 which is based on a simple and fast method to calculate charge distributions in organic molecules.

Specially designed tools are implemented for purposes such as model building in protein X-ray crystallography, pharmacophore modelling and pharmacophore diversity analysis. In this study Moloc was used for ligand minimisations inside the receptor to find the best possible contacts between the generated ligands and the protein.

3.4.2 MacroModel

The MacroModel (v6.0)77 program is basic molecular modelling software with a large selection of force fields. In addition to common force fields such as AMBER, MM2, MM3 and Amber94, MacroModel includes MMFF (Merck Molecular Force Field), MMFFs, OPLS and OPLS-AA. The program offers five different conformational analysis algorithms, including the most common one, Monte Carlo Multiple Minimum (MCMM). It also has advanced methods for molecular dynamics and free energy calculations as well as for solvation calculations. The program is well suited to general-purpose molecular mechanics for small and medium-sized organic molecules. It also has effective utilities for exploring proteins and protein-ligand complexes.

3.4.3 AutoDock

The AutoDock (v3.0)78 program is designed to predict how small molecules bind to a receptor of known 3D structure. The software is used for modelling the binding of flexible small molecules, such as drug molecules, to receptor proteins. AutoDock consists of three different modules: AutoDock performs the docking of the ligand to a set of grids describing the target protein; AutoGrid pre-calculates these grids; and AutoTors sets up which bonds will be treated as rotatable in the ligand. The program's search methods include Monte Carlo simulated annealing (SA), local search (LS), a genetic algorithm (GA) and a hybrid GA-LS method, also called the Lamarckian Genetic Algorithm (LGA).
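The idea behind the hybrid GA-LS method can be shown on a toy problem. The sketch below is not AutoDock code: the quadratic `energy` function stands in for a docking energy, a simple random-perturbation hill climb replaces AutoDock's own local search, and all names and parameter values are assumptions. What makes it Lamarckian is that the coordinates improved by local search are written back into the individual before selection.

```python
import random


def energy(x):
    """Toy stand-in for a docking energy; its minimum of 0 lies at the origin."""
    return sum(v * v for v in x)


def local_search(x, rng, steps=20, step_size=0.1):
    """Random-perturbation hill climb; never returns a worse point than x."""
    best, best_e = list(x), energy(x)
    for _ in range(steps):
        trial = [v + rng.gauss(0.0, step_size) for v in best]
        trial_e = energy(trial)
        if trial_e < best_e:
            best, best_e = trial, trial_e
    return best


def lamarckian_ga(dim=3, pop_size=20, generations=30, seed=0):
    """Hybrid GA-LS on the toy energy; dim must be at least 2 for crossover."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    history = [min(energy(x) for x in pop)]
    for _ in range(generations):
        # Lamarckian step: the locally optimised coordinates replace the genotype.
        pop = [local_search(x, rng) for x in pop]
        pop.sort(key=energy)
        parents = pop[: pop_size // 2]          # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)         # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(dim)              # mutate one coordinate
            child[i] += rng.gauss(0.0, 0.5)
            children.append(child)
        pop = parents + children
        history.append(min(energy(x) for x in pop))
    return min(pop, key=energy), history


best, history = lamarckian_ga()
```

Because the better half of the population is carried over unchanged and local search never accepts a worse point, the best energy recorded in `history` cannot increase from one generation to the next.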

3.4.4 eHiTS®

The Electronic High Throughput Screening (eHiTS)48,79 program can dock flexible structures to target receptors. The program uses virtual high throughput screening methods to search for active molecules in compound libraries. The eHiTS program performs accurate flexible ligand docking at high speed. For each candidate structure, the system generates all major docking modes that are compatible with the steric and chemical constraints of the target cavity. The solutions can be used as a starting point for more involved energy minimisation studies to predict more exact binding modes and affinities. The program uses novel systematic algorithms for docking simulations.

3.4.5 SPA-Docking

This docking method has also been developed at ICAMS, University of Leeds, and it is based on a novel simulated annealing minimisation algorithm called Systematic Population Annealing (SPA).73 The algorithm is a combination of simulated annealing, evolutionary and local search methods. This docking method has been developed for the SPROUT program, and it uses the HIPPO module. In this docking method the ligand is flexible but the receptor is kept rigid during the process. Rotatable bonds of the ligand are allowed to vary while docking. The program carries out a global energy minimisation for the ligand. All conformations of the ligand are scored using a novel empirical scoring function which contains terms describing van der Waals, hydrogen bonding, metal ion bonding, hydrophobic contact, rotatable bond entropy and dihedral strain energy. The empirical coefficients are derived from receptor-ligand complexes in the Brookhaven Protein Data Bank.
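An empirical scoring function of this kind is often written as a weighted sum of its terms. The sketch below assumes that form; the text does not state the functional form used in SPA-Docking, so the linear combination, the convention that a lower score is better, the coefficient values and all names are assumptions made for illustration.

```python
# The six terms named in the text; the identifiers are hypothetical.
TERMS = (
    "van_der_waals",
    "hydrogen_bonding",
    "metal_ion_bonding",
    "hydrophobic_contact",
    "rotatable_bond_entropy",
    "dihedral_strain",
)


def score(term_values, coefficients):
    """Weighted sum of the six terms (assumed linear form)."""
    missing = [t for t in TERMS if t not in term_values or t not in coefficients]
    if missing:
        raise KeyError(f"missing terms: {missing}")
    return sum(coefficients[t] * term_values[t] for t in TERMS)


def best_conformation(conformations, coefficients):
    """Return the conformation with the lowest score (lower assumed to be better)."""
    return min(conformations, key=lambda c: score(c["terms"], coefficients))


# Illustrative coefficients and term values only; not fitted to any data.
coefficients = {t: 1.0 for t in TERMS}
conformations = [
    {"name": "conf_a", "terms": {
        "van_der_waals": -2.0, "hydrogen_bonding": -1.5, "metal_ion_bonding": 0.0,
        "hydrophobic_contact": -0.5, "rotatable_bond_entropy": 0.25,
        "dihedral_strain": 0.5}},
    {"name": "conf_b", "terms": {
        "van_der_waals": -1.0, "hydrogen_bonding": -0.5, "metal_ion_bonding": 0.0,
        "hydrophobic_contact": -0.25, "rotatable_bond_entropy": 0.25,
        "dihedral_strain": 1.0}},
]
best = best_conformation(conformations, coefficients)
```

In a fitted scoring function the coefficients would come from regression against known receptor-ligand complexes, which is the role the text assigns to the Protein Data Bank structures.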

3.4.6 CAESA

The CAESA46,47 (Computer Aided Estimation of Synthetic Accessibility) program attempts to overcome the synthetic feasibility problem by scoring and ranking structures according to an estimate of their synthetic accessibility. The CAESA program is a rule-based expert system, which assesses the synthetic accessibility of a molecule by analysing its structural complexity arising from stereochemistry, topology and functional groups.7 CAESA version 2.4 was available for use during this study.

The program analyses each target structure on the basis of information included in various knowledge bases, which describe chemical and synthetic knowledge, and databases of available starting materials. Molecular fragments are described within the system’s knowledge bases using the PATRAN46 linear notation, and potential starting materials for synthesis are selected from large databases of available compounds such as Aldrich, Acros and Lancaster as well as in-house structure databases.31 First, a set of retrosynthetic rules is used to perform a retrosynthetic analysis of the target structure. After the analysis is complete, the selected starting materials are scored and ranked according to their physical coverage of the target, wastage and synthetic proximity. Potential synthetic routes are established between the starting material compounds and the target structure. Additionally, any part of the target structure that is not covered by any starting material undergoes a complexity analysis.
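The ranking of starting materials can be sketched with simple set arithmetic. The definitions below are assumptions made for illustration and are not taken from CAESA: coverage is taken as the fraction of target atoms matched by a starting material, wastage as the fraction of the starting material's atoms that are not used, and the combined score as coverage minus wastage. Synthetic proximity is left out, and all function names are hypothetical.

```python
def coverage(target_atoms, mapped_atoms):
    """Fraction of the target's atoms matched by the starting material (assumed definition)."""
    return len(mapped_atoms & target_atoms) / len(target_atoms)


def wastage(material_size, mapped_atoms):
    """Fraction of the starting material's atoms that do not end up in the target."""
    return (material_size - len(mapped_atoms)) / material_size


def rank_materials(target_atoms, materials):
    """Sort starting materials by coverage minus wastage, best first (assumed score)."""
    def combined(item):
        name, (mapped, size) = item
        return coverage(target_atoms, mapped) - wastage(size, mapped)
    return [name for name, _ in sorted(materials.items(), key=combined, reverse=True)]


def uncovered_atoms(target_atoms, materials):
    """Target atoms matched by no starting material; these would go to complexity analysis."""
    covered = set()
    for mapped, _size in materials.values():
        covered |= mapped
    return target_atoms - covered


# Toy target with ten atoms; each material is (mapped target atoms, material atom count).
target = set(range(10))
materials = {
    "A": ({0, 1, 2, 3, 4, 5}, 6),
    "B": ({4, 5, 6, 7}, 8),
}
ranking = rank_materials(target, materials)
leftover = uncovered_atoms(target, materials)
```

Material A covers more of the target and wastes none of its own atoms, so it ranks first; atoms 8 and 9 are covered by neither material and are the part that would need a complexity analysis.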

4. REVIEW OF STEROID HORMONES AND HYDROXYSTEROID DEHYDROGENASES

4.1 Structure of steroid hormones

The structure of steroid hormones is related to the cyclopentanoperhydrophenanthrene nucleus (Figure 30a) and they are divided into classes based on the number of carbons in their structure (Figure 30b). The nomenclature of steroids is heterogeneous and trivial names are commonly used. The abbreviations, trivial names and IUPAC names of the steroid hormones discussed in this work are collected in Table 1. Sex hormones can be distinguished by the carbon number: C-21 being progestational or adrenal steroids, C-19 being androgens and C-18 being estrogens.
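The carbon-number rule stated above amounts to a small lookup. The sketch below encodes only the three classes given in the text; the names `STEROID_CLASSES` and `classify_by_carbon_number` are hypothetical, and any other carbon count is simply reported as unclassified.

```python
# Classes as stated in the text: C-21, C-19 and C-18 steroids.
STEROID_CLASSES = {
    21: "progestational or adrenal steroid",
    19: "androgen",
    18: "estrogen",
}


def classify_by_carbon_number(n_carbons):
    """Return the sex hormone class for a steroid with the given number of carbons."""
    return STEROID_CLASSES.get(n_carbons, "unclassified")


# Estradiol has an estrane (C-18) skeleton, 5α-dihydrotestosterone an androstane (C-19) one.
estradiol_class = classify_by_carbon_number(18)
dht_class = classify_by_carbon_number(19)
```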

Figure 30. a) Steroid hormone skeletons are related to cyclopentanoperhydrophenanthrene structure. b) Numbering of the steroid backbone.

Table 1. Nomenclature of some of the steroid hormones discussed in this chapter.

Trivial name               Abbreviation   IUPAC name
Estradiol 1                E2             estra-1,3,5(10)-triene-3,17β-diol
Equilin 2                  EQU            3-hydroxyestra-1,3,5(10),7-tetraen-17-one
5α-Dihydrotestosterone 3   DHT            17β-hydroxy-5α-androstan-3-one
