Examples from Industrial Practice in Lead Development
Wolfgang Muster
Areas
• Computer-Aided Molecular Modeling (CAMM) *
• Absorption, Distribution, Metabolism and Excretion (ADME)
– Physicochemical Properties
• Predictive Toxicology
• Genotoxicity/Carcinogenicity
• Phospholipidosis
• Genotoxic impurities
* Alternative terms applied to this area:
• Computer-Aided Drug Design (CADD)
• Computational Drug Design (CDD)
• Computer-Aided Molecular Design (CAMD)
• Rational Drug Design
• In silico Drug Design
• Computer-Aided Rational Drug Design
• Computer-Aided Drug Discovery and Development (CADDD)
• Cheminformatics and Molecular Modeling
Areas
[Figure: Drug Candidate, with the labels ADME Properties, Safety Profile and Efficacy, and the approaches In silico ADME, Predictive Toxicology and Better target validation]
Failure reasons: Safety 42%, Efficacy 28%, Business 22%, ADME 8%
Sustainable Pharmacy, Osnabrück 2008
Stahl et al. (2006) Drug Discover Today 11(7-8): 326-33. Kapetanovic (2008) Chem-Biol Interactions 171: 165-76.
Remark: 3D receptor modeling for the prediction of potential side effects is presently being devised (Vedani et al.)
Computer-Aided Molecular Modeling (CAMM)
* nature of known ligands, homology to related targets,
the size, polarity and shape of binding sites in known target 3D structures,
knowledge of the key amino acids modulating selective binding or functional activity
Stahl et al. (2006) Drug Discover Today 11(7-8): 326-33.
Computer-Aided Molecular Modeling (CAMM)
Various computational methods, such as:
• Virtual screening
Many computational techniques are available to compile focused compound sets, with most of them falling under the umbrella term ‘virtual screening’.
• Fragment-based screening
Fragment screening is an additional ‘focused screening’ technique: small libraries of several hundred to several thousand low molecular weight substances are screened by direct-binding methods in combination with X-ray crystallography.
• Chemogenomics search strategies
(for target classes without structure information, especially for G-protein-coupled receptors) Multidimensional similarity paradigm: ligand structure similarity, target sequence similarity and similarity of biological effects are combined. Biological similarity is determined in terms of affinity fingerprints of compounds against a set of targets.
• Classic structure-based design (QSARs)
Fostel, J. Predictive ADME-Tox 2005
Stahl et al. (2006) Drug Discover Today 11(7-8): 326-33.
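The notion of biological similarity via affinity fingerprints can be illustrated with a minimal sketch. It assumes each compound is described by a list of affinities (for example pKi values) against the same ordered target panel, and it uses cosine similarity as the comparison measure. The measure, the function name and all values are assumptions for illustration; the slides do not specify how the fingerprints are compared.

```python
import math


def affinity_similarity(fp_a, fp_b):
    """Cosine similarity between two affinity fingerprints.

    Each fingerprint is a list of affinities measured against the same
    ordered panel of targets (assumption: higher value = stronger binding).
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must cover the same target panel")
    dot = sum(a * b for a, b in zip(fp_a, fp_b))
    norm_a = math.sqrt(sum(a * a for a in fp_a))
    norm_b = math.sqrt(sum(b * b for b in fp_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero fingerprint carries no profile information
    return dot / (norm_a * norm_b)


# Hypothetical pKi values against a three-target panel (illustrative only).
compound_x = [8.1, 5.0, 6.2]
compound_y = [7.9, 5.2, 6.0]
compound_z = [4.0, 8.5, 4.1]

print(affinity_similarity(compound_x, compound_y))  # similar profiles
print(affinity_similarity(compound_x, compound_z))  # dissimilar profiles
```

In a chemogenomics setting this score would be one of several similarity dimensions, next to ligand structure similarity and target sequence similarity.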
Predictive ADME – Molecular properties
• Optimization of chemical series (quality of leads)
• All activities of promising compound classes should focus on multiple ADME–Tox-related parameters in parallel to activity and selectivity
• Results of commercially available tools for calculating physicochemical properties and ADME-related parameters have to be interpreted with great care
• The use of generic models can only be recommended if they have been validated for a particular project; results of new compounds outside of the training sets can be misleading (ionization constants, lipophilicity and solubility)
• Shift in optimization strategy, use of measured values calls for high quality, fast and standardized assays (100–500 compounds per week)
• Generally, the aim of a local model is to rank compounds and not to predict the absolute magnitude of an in vivo or in vitro effect
• Allows project teams to abandon the classic paradigm of sequential filtering in more complex and expensive models (continuous model building; in vivo spot checks)
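Since a local model is meant to rank compounds rather than to predict absolute values, one way to judge it is a rank correlation between predicted and measured values. The sketch below computes Spearman's rho with the textbook formula for data without ties. The function names and the numbers are hypothetical and only serve as an illustration.

```python
def ranks(values):
    """Return the 1-based rank of each value (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(predicted, measured):
    """Spearman rank correlation, valid when neither list contains ties."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    d2 = sum((p - m) ** 2 for p, m in zip(ranks(predicted), ranks(measured)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical predicted vs. measured values for four compounds.
predicted = [1.2, 2.9, 2.1, 3.8]
measured = [0.4, 1.5, 1.6, 2.7]
print(spearman_rho(predicted, measured))
```

In this made-up example the predictions are systematically too high, yet the ranking is largely preserved, which is the property a local model is expected to deliver.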
Use of in silico tools within toxicology:
• In silico prediction of toxic effects at early development stages – before drug candidate selection
• Hypothesis generation for structural mechanisms of action
• In later stages: first assessment of impurities, degradation products, side products, metabolites, ... e.g. structural evaluation of synthesis schemes
In silico prediction systems – Summary table
System name | Short description | Predicted endpoints
Classical QSAR approaches | Correlate structural or property descriptors of compounds with biological activities | QSARs for various endpoints published
DEREK for Windows | Knowledge(rule)-based expert system | M/C/SS/I and more (>40)
MCASE (CASE, CASETOX) | Machine-learning approach to identify molecular fragments with a high probability of being associated with an observed biological activity | Available modules: M/C/T/I/H/MTD/BD/AT and more
OncoLogic | Knowledge-based expert system, mimicking the decision logic of human experts | C
MDL QSAR | QSAR modeling system to establish structure-property relationships, create new calculators and generate new compound libraries | M/C/hERG inhib/AT/LD50
lazar | Derives predictions from toxicity data by searching the database for compounds that are similar with respect to a given toxic activity | M/C/H/ET
TOPKAT | TOPKAT employs cross-validated QSTR models for assessing various measures of toxicity; each module consists of a specific database | Available modules: M/C/T/LD50/SS/I/ET and more
ToxScope | ToxScope correlates toxicity information with structural features of chemical libraries, and creates a data mining system | M/C/I/H/T and more
HazardExpert | Knowledge(rule)-based expert system | M/C/I/SS/IT/NT
COMPACT | COMPACT is a procedure for the rapid identification of potential carcinogenicity or toxicities mediated by CYP450s | C and P450-mediated toxicities
PASS | Based on the comparison of new structures with structures of well-known biological activity profiles by using MNA structure descriptors | Multiple endpoints
Cerius2 | Molecular modeling software with an ADME/Tox tool package; provides computational models for the prediction of ADME properties | ADME/H
Tox Boxes | Modules generated by a machine-learning approach implemented in a fragment-based Advanced Algorithm Builder (AAB) | M/AT/C/LD50 and more
MetaDrug | Assessment of toxicity by generating networks around proteins and genes (toxicogenomics platform) | >40 QSAR models for ADME/Tox properties
DICAS | Cascade model with the capability to mine for local correlations in datasets with large number of attributes | C
CADD | Computer-aided drug design (CADD) by multi-dimensional QSARs applied to toxicity-relevant targets | Receptor- and CYP450-mediated toxicities, ED
CSGeno Tox | QSTR-based package employing electrotopological state indexes, connectivity indexes and shape indices | M
Admensa Interactive | QSAR-based system primarily for ADME optimization | CT
PreADMET | Calculation of important descriptors and neural network for the construction of prediction system | M/C
BfR Decision Support System | Rule-based system using physicochemical properties and substructures | I and corrosion
M=Mutagenicity, C=Carcinogenicity, SS=Skin Sensitisation, I=Irritancy, H=Hepatotoxicity, T=Teratogenicity, MTD=Maximum Tolerated Dose, LD50=Median Lethal Dose, BD=Biodegradation, AT=Acute Toxicity, ET=Environmental Toxicities, IT=Immunotoxicity, NT=Neurotoxicity, CT=Cardiotoxicity, ED=Endocrine disruption, ADME=Absorption Distribution Metabolism Excretion, QSTR=Quantitative Structure Toxicity Relationship, MNA=Multilevel Neighborhoods of Atoms
Muster et al. (2008) Drug Discovery Today 13/7-8, 303-310.
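The similarity-search idea listed for lazar in the summary table (predicting from database compounds that resemble the query) can be sketched as a nearest-neighbour vote. This is not lazar's actual algorithm: the Tanimoto coefficient on feature sets, the choice of k and the toy database are assumptions made purely for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of structural features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def neighbour_prediction(query, database, k=3):
    """Similarity-weighted fraction of toxic compounds among the k nearest.

    database: list of (feature_set, is_toxic) pairs.
    Returns None when no neighbour shares any feature with the query.
    """
    scored = sorted(
        ((tanimoto(query, feats), toxic) for feats, toxic in database),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None  # query lies outside the database's chemical space
    return sum(sim for sim, toxic in scored if toxic) / total


# Toy database with abstract feature labels (hypothetical).
database = [
    ({"a", "b", "c", "d"}, True),
    ({"a", "b"}, True),
    ({"c", "x", "y"}, False),
    ({"x", "y"}, False),
]
print(neighbour_prediction({"a", "b", "c"}, database))
```

Returning None for a query without similar neighbours makes the "not covered" case explicit instead of reporting it as non-toxic.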
DEREK for Windows (DfW)
Deductive Estimation of Risk from Existing Knowledge
DfW is a knowledge-based expert system for the qualitative prediction of toxicity. DfW is not a database system but a rulebase system. Each rule describes the relationship between a structural feature (toxicophore) and its associated toxicity.
• Genotoxicity endpoint represented by 139 rules* (51 chromosomal damage*)
• Carcinogenicity endpoint represented by 54 rules*
• Irritation (skin, eye and respiratory tract) (33 rules*)
• Sensitisation (skin and respiratory tract) (76 rules*)
• Thyroid toxicity, hERG channel inhibition, oestrogenicity, photo-induced effects, neurotoxicity, teratogenicity: less well covered
• Negative in DfW means: really negative or not covered!
* DfW V9.0.0
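How a rulebase differs from a database can be shown with a toy sketch in which each rule links a structural feature (toxicophore) to an endpoint. The three rules and the feature labels below are illustrative placeholders and are not taken from DfW. The sketch also assumes that structural features have already been perceived from the structure, which in a real system requires substructure matching.

```python
# Hypothetical rulebase: toxicophore -> associated toxicity endpoint.
# These entries are illustrative only and are NOT rules from DfW.
RULES = {
    "aromatic nitro": "mutagenicity",
    "alkyl sulfonate ester": "mutagenicity",
    "alpha,beta-unsaturated carbonyl": "skin sensitisation",
}


def fired_alerts(structural_features):
    """Return (toxicophore, endpoint) pairs for every rule that matches."""
    return sorted((f, RULES[f]) for f in structural_features if f in RULES)


def report(structural_features):
    """Qualitative report; the absence of an alert is not proof of safety."""
    alerts = fired_alerts(structural_features)
    if not alerts:
        return "no alert: really negative or not covered by the rulebase"
    return "; ".join(f"{endpoint} alert ({feature})" for feature, endpoint in alerts)


print(report({"aromatic nitro", "amide"}))
print(report({"amide", "ether"}))
```

The second call shows why a negative result has to be read with care: the rulebase simply contains no rule for those features.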
MultiCASE (MCASE)
Multiple Computer Automated Structure Evaluation
MCASE tries to predict toxicity on the basis of discrete structural fragments found to be statistically relevant to specific biological activity (biophores).
The differences between active and inactive molecules are investigated with the help of a so-called ‘learning dataset’, to deduce the attributes or substructures (so-called biophores) responsible for activity. From the frequency with which a particular biophore is identified in all active and all inactive molecules, one can calculate the probability with which this fragment is associated with biological activity.
• Ames modules for each strain +/- rat or hamster S9 available
• Four carcinogenicity modules incl. proprietary data male/female rats and mice
• Modules and the underlying database have been developed with FDA
• High prediction accuracy of the MCASE modules (mainly based on the unique dataset)
• Teratogenicity/Developmental toxicity/Male fertility/Behavioral toxicity in different species (49 modules)
• Hepatotoxicity in humans (14 modules)
• GSH adduct formation (in-house) rat and human microsomes
• Further available modules: antibacterial (pharm), ADME, cytotoxicity, ecotoxicity, skin/eye irritations, allergies, enzyme inhibition, biodegradation, bioaccumulation
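The frequency argument behind biophores can be made concrete with a small sketch. It estimates the probability that a fragment is associated with activity as the share of active molecules among all learning-set molecules containing that fragment. This is a deliberate simplification; the statistics actually used by MCASE are not described here, and the fragment labels and learning set are hypothetical.

```python
def biophore_probability(fragment, learning_set):
    """Fraction of fragment-containing molecules that are active.

    learning_set: list of (fragments, is_active) pairs, where fragments is
    the set of fragment labels found in one molecule.
    Returns None if the fragment never occurs in the learning set.
    """
    active = sum(1 for frags, is_active in learning_set
                 if fragment in frags and is_active)
    inactive = sum(1 for frags, is_active in learning_set
                   if fragment in frags and not is_active)
    total = active + inactive
    if total == 0:
        return None  # fragment unknown to the learning set
    return active / total


# Hypothetical learning set: fragment labels per molecule, activity flag.
learning_set = [
    ({"F1", "F2"}, True),
    ({"F1", "F3"}, True),
    ({"F1", "F4"}, False),
    ({"F2", "F4"}, False),
]
print(biophore_probability("F1", learning_set))  # 2 of 3 carriers are active
print(biophore_probability("F9", learning_set))  # fragment not in the set
```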
DEREK for Windows (DfW)
DEREK is a knowledge-based expert system for the qualitative prediction of toxicity. DEREK is not a database system but a rulebase system. Each rule describes the relationship between a structural feature (toxicophore) and its associated toxicity.
METEOR
Meteor is a computer program that helps scientists who need
information about the metabolic fate of chemicals. The program
uses expert knowledge rules in metabolism to predict the
metabolic fate of chemicals and the predictions are presented
in metabolic trees. The only information needed by the program
to make its prediction is the molecular structure of the chemical.
VITIC Toxicology Database
Vitic is a chemically intelligent toxicology database, which can
recognise and search for similarities in chemical structures.
Vitic is especially useful in (Quantitative) Structure-Activity
Relationship (QSAR) modelling.
MultiCASE (MCASE)
MCASE tries to predict toxicity on the basis of discrete structural fragments found to be statistically relevant to specific biological activity (biophores).
In silico phospholipidosis tool (CAFCA)
In-house tool that predicts the amphiphilic properties of charged small molecules, expressed in terms of the free energy of amphiphilicity (ΔΔGAM). Amphiphilic compounds have the potential to accumulate in lipid bilayers, interfering with phospholipid metabolism and turnover and thereby causing adverse effects.
In silico phototoxicity prediction
Phototoxicity prediction based on chemical structure or chemical structure in combination with measured UV spectra
Further endpoints in development
Promising results with local models with the potential to be generally applicable (e.g. prediction of hERG channel inhibition, GSH adduct formation)
Expert vs data-driven (QSAR) systems - Toxicology
• Local SARs (project-specific SARs) are based on 5 to at most 30 data points and can be evaluated by eye
• (Q)SAR systems will gain importance once HCS assays for more toxicological endpoints are validated and implemented
• (Q)SAR systems are normally not used for genotoxicity and/or carcinogenicity at Roche
• Commercial systems predict well and can be optimized
• Acceptance by regulatory authorities
• Established for other endpoints (e.g. phototoxicity, phospholipidosis, hERG assay)
• (Q)SAR systems might also be helpful if additional in vitro HCS parameters or cross-reactivities have been measured
Use of in silico genotoxicity prediction
[Figure: flow chart of the genotoxicity testing cascade. Recoverable labels: In silico (HTS), In vitro, In vivo; LI/LO, CCS, RDC1; DEREK / MCASE analysis; Crosscheck VITIC, METEOR, SciFinder, TOXNET; Ames micro; MNT in vitro; Ames GLP; ML/TK; HCA; Gene mutations; Chromosomal aberration; MNT in vivo; "one or both required for phase II"; tbd; Rodent cancer bioassay; optimize]
• On-the-fly Prediction/Classification: DEREK combined with MCASE
• Structural assessments of synthesis scheme, impurities, metabolites
The success of early genotoxicity screening
Year | Ames micro: number of positive (incl. weak pos. and inconclusive) compounds (b) | Full Ames (GLP): number of positive (incl. weak pos. and inconclusive) compounds (b)
1996 | - | 33 (48 %)
1997 | - | 25 (37 %)
1998 | 9 (15 %) | 11 (24 %)
1999 | 5 (11 %) | 9 (18 %)
2000 | 11 (11 %) | 5 (20 %)
2001 | 6 (7 %) | 3 (21 %)
2002 | 7 (9 %) | 1 (6 %)
2003 | 3 (3 %) | 0
2004 | 0 | 0
2005 | 3 (2 %) | 0
2006 | 3 (2 %) | 0
2007 | 2 (1 %) | 0
2008 (a) | 1 (2 %) | 0
(a) until March 2008
(b) expected mutagens, intermediates/reactants and positive results due to impurities excluded
[Table annotation: Start of routine in silico]
Phospholipidosis
• Drug-induced phospholipidosis is a reversible storage disorder characterized by accumulation of phospholipids within cells, i.e., in the lysosomes
• Caused by cationic amphiphilic drugs (CADs) and some cationic hydrophilic drugs (e.g. the aminoglycoside gentamicin)
• Drug-induced phospholipidosis is a generalized condition in humans and animals; it may occur in virtually any tissue and is characterized by accumulation of one or several classes of phospholipids within the cell
• Phospholipidosis may or may not be accompanied by organ toxicity; an association between the two has not been proven (except for gentamicin)
In silico classification of phospholipidosis potential
[Figure: chemical structures annotated "Cationic hydrophilic" and "Hydrophobic residues"; classification plot with the axes pKa and Free Energy of Amphiphilicity (ΔΔGAM)]
Negative: ΔΔGAM >= -6 kJ/mol; pKa < 6.3
Positive: ΔΔGAM < -6 kJ/mol; pKa >= 6.3
CAFCA (CAlculated Free energy of Charged Amphiphiles) Fischer, H. et al. (2000) Chimia 54, 640-645.
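The two thresholds of the classification plot can be written as a simple decision rule. The sketch below assumes that a compound is called positive only when both criteria are met (ΔΔGAM below -6 kJ/mol and pKa of at least 6.3) and negative otherwise. How the two criteria combine is an assumption, since only the threshold labels are given. The function name and input values are hypothetical, and the calculation of ΔΔGAM itself is not covered.

```python
DDG_AM_THRESHOLD_KJ_MOL = -6.0  # free energy of amphiphilicity cut-off
PKA_THRESHOLD = 6.3             # basic pKa cut-off


def phospholipidosis_class(ddg_am_kj_mol, pka):
    """Classify phospholipidosis potential from the two descriptors.

    Assumption: both criteria must hold for a positive call; every other
    combination is reported as negative.
    """
    if ddg_am_kj_mol < DDG_AM_THRESHOLD_KJ_MOL and pka >= PKA_THRESHOLD:
        return "positive"
    return "negative"


# Hypothetical descriptor values for three compounds.
print(phospholipidosis_class(-8.0, 9.0))  # amphiphilic and basic
print(phospholipidosis_class(-8.0, 5.5))  # amphiphilic but weakly basic
print(phospholipidosis_class(-3.0, 9.0))  # basic but not amphiphilic
```

Because either descriptor can move a compound out of the positive region, such a rule is useful for chemical optimization as well as for flagging.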
Techniques to detect phospholipidosis
In silico tool
• From in vivo findings to predictive in vitro assay to HT in silico tool
• Calculation for large data set possible
• Accessible on the Intranet - optimization of pKa value as well as amphiphilic properties
• Identification of “clear positive” chemical series rather than single molecules
• Useful in Lead Identification and early Lead Optimization (depends on the indication, potency/dose and duration of treatment)
• Overall predictability of the in silico tool is very high for the in vitro assay; in vitro test normally not conducted anymore
[Figure: Amiodarone]
In silico classification of genotoxic impurities
• In principle, any impurity that is present below the threshold of qualification (0.15%) need not be toxicologically “qualified” or “characterized” (ICH)
• For a drug with a daily intake of 1 g, this implies that a chronic intake of less than 1.5 mg of an impurity in that drug is considered toxicologically insignificant. However, ICH guidelines do indicate that “lower thresholds (for reporting, identification & qualification) can be appropriate if the impurity is unusually toxic”, but give no guidance on what this means or how to handle it
• Synthesis of APIs often involves reactive starting materials, intermediates or process steps; synthesis pathways frequently involve known or suspected genotoxic compounds
• Unknown/undetermined low levels of genotoxic impurities may be present (e.g. sulfonic acid esters)
• Issue not directly addressed in ICH guidelines -> new draft of the EMEA ’guideline on the limits of genotoxic impurities’ with a new concept
• Clinical developments were put on hold because the synthesis pathways contain intermediates with alerting structures; companies were requested either to show that the alerting intermediates are below 1 ppm in the drug or to provide data on genotoxicity
• Solution: use a generic TTC (Threshold of Toxicological Concern) based on historical experience with genotoxic carcinogens; staged TTC taking treatment duration into account
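The arithmetic behind these thresholds is easy to check: 0.15% of a 1 g daily dose corresponds to 1.5 mg of impurity per day, and 1 ppm in a 1 g dose corresponds to 1 µg per day. The sketch below also converts an acceptable daily intake into a concentration limit. The function names are hypothetical, and the intake value passed in the last call is an example input only, not a limit stated on the slide.

```python
def impurity_intake_mg(daily_dose_g, impurity_percent):
    """Daily impurity intake in mg for a given dose and impurity level (%)."""
    return daily_dose_g * 1000.0 * impurity_percent / 100.0


def intake_at_ppm_ug(daily_dose_g, impurity_ppm):
    """Daily impurity intake in µg; 1 ppm equals 1 µg impurity per g drug."""
    return daily_dose_g * impurity_ppm


def concentration_limit_ppm(acceptable_intake_ug_per_day, daily_dose_g):
    """Impurity limit in ppm that keeps intake at or below the given value."""
    return acceptable_intake_ug_per_day / daily_dose_g


print(impurity_intake_mg(1.0, 0.15))      # qualification threshold, 1 g dose
print(intake_at_ppm_ug(1.0, 1.0))         # 1 ppm in a 1 g dose
print(concentration_limit_ppm(1.5, 0.5))  # example intake, 0.5 g dose
```

The comparison shows the gap of roughly three orders of magnitude between the ordinary qualification threshold and the levels discussed for alerting intermediates.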
Step 1: Identify and classify structural alerts in parent compound and impurities
Step 2: Establish a qualification strategy
A: Limitation based on structural information, chemistry and analytical capabilities
B: Testing of “neat” impurity; limitation based on outcome
C: Testing of spiked material; limitation based on outcome
Step 3: Establish acceptable limits
Proposal of acceptable intake levels without appreciable risk based on dose, duration of use, indication and patient/volunteer population (staged TTC)
Class 1: Genotoxic Carcinogens
Class 2: Genotoxic, Carc unknown
Class 3: Alert – Unrelated to parent
Class 4: Alert – Related to parent
Class 5: No Alerts
[Figure: decision tree. Recoverable nodes and outcomes: Eliminate Impurity?; Threshold Mechanism? (No or unknown); Staged TTC; PDE (e.g. ICH Q3 appendix 2 reference); Impurity Genotoxic? (1); API Genotoxic? (2); Risk Assessment? (3); Control as an ordinary impurity; branch labels Yes, Yes/Not tested, No]
1 Either tested neat or spiked into API and tested up to 250 μg/plate
2 If API is positive, risk benefit analysis required
3 Quantitative risk assessment to determine ADI
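The five impurity classes can be expressed as a small classification function. The sketch below is one plausible reading of the class names and is an assumption rather than the exact logic of the decision tree: known genotoxicity takes precedence over structural alerts, and an alert counts as "related to parent" when the same alert is also present in the parent compound. All parameter names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class ImpurityClass(IntEnum):
    GENOTOXIC_CARCINOGEN = 1
    GENOTOXIC_CARC_UNKNOWN = 2
    ALERT_UNRELATED_TO_PARENT = 3
    ALERT_RELATED_TO_PARENT = 4
    NO_ALERTS = 5


def classify_impurity(genotoxic: Optional[bool],
                      carcinogenic: Optional[bool],
                      has_alert: bool,
                      alert_in_parent: bool) -> ImpurityClass:
    """Assign an impurity class; None means 'no experimental data'."""
    if genotoxic and carcinogenic:
        return ImpurityClass.GENOTOXIC_CARCINOGEN
    if genotoxic:
        # Assumption: anything short of known carcinogenicity lands here.
        return ImpurityClass.GENOTOXIC_CARC_UNKNOWN
    if has_alert:
        if alert_in_parent:
            return ImpurityClass.ALERT_RELATED_TO_PARENT
        return ImpurityClass.ALERT_UNRELATED_TO_PARENT
    return ImpurityClass.NO_ALERTS


# Untested intermediate with a structural alert that the parent lacks.
print(classify_impurity(None, None, has_alert=True, alert_in_parent=False))
```

The structural alert flags would typically come from an in silico assessment of the synthesis scheme, which is where the expert systems discussed earlier enter the workflow.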
In silico systems during drug development process
[Figure: timeline of the drug development process with the in silico systems applied along it]
Stages: Basic Research; Lead Identification; Lead Optimization; Preclinical Development; Clinical Development (Phase 1, Phase 2, Phase 3); Filing/Approval & Launch
Further label: Target identification, assessment and validation
In silico systems:
• CAMM
• ADME / MolecProp: clogP / PSA / cPAMPA / cpKa, Metabolic clearance
• PredTox 1: DEREK / MCASE / VITIC / METEOR
• PredTox 2: PL / Phototox
• Adequately predict complex toxicological endpoints (e.g. hepatotoxicity, cardiotoxicity, nephrotoxicity) – need for standardized high-quality data (Innovative Medicines Initiative)
• Design in silico tools to cope with the enormous amount of data generated by new techniques – HTS/HCS, omics, systems biology, biomarkers, etc.
• Establish closer link from preclinical to clinical development
Conclusions
• In silico systems are extensively used during the early phases of drug development until selection of the clinical candidate (e.g. 3D-modeling, expert systems, QSAR tools)
• Applying in silico and in vitro screening significantly reduced failures in early project phases, increased efficiency and improved the quality of clinical candidates
• The number of ADME-Tox in silico and (HTS) in vitro screens is rapidly increasing
• DEREK/MCASE and other commercially available systems predict toxicity endpoints such as mutagenicity, carcinogenicity, skin sensitisation and irritancy well; in-house optimization is essential for high performance
• Further endpoints are less well covered, mainly due to the lack of comprehensive, high-quality and standardized databases
• QSAR tools can be established, based on internal standardized datasets, e.g. phospholipidosis, phototoxicity, hERG channel inhibition, GSH adduct formation
• Challenge: how to adequately predict potential genotoxic impurities from the structures in a synthesis scheme; further regulations needed?
“Are you sure, Stan, that a pointy head and a
long beak is what makes them fly?”