Introduction to Molecular Docking
(a very brief and general overview)
This text is meant to give the reader a general overview of the concept of molecular docking. A basic understanding of molecular docking is presented in the simplest manner possible. Readers should note, though, that this text cannot replace a textbook on molecular modelling or an equivalent subject, and are advised to refer to such textbooks for more in-depth knowledge.
Objectives.
This text covers the following objectives:
1. Definition of molecular docking (what it's all about)
2. The importance of docking
3. Approaches to docking (methodologies)
4. Mechanics of docking (how to perform docking)
5. Applications of docking (where it is being used)
Introduction.
Molecular docking is a method for predicting the preferred orientation of one molecule relative to a second when they bind to each other to form a stable complex. Computers and software are used to predict or simulate the possible interactions between two molecules based on their three-dimensional structures. Using the software, the interactions can be viewed and analyzed to answer biologically important questions about a certain chemical or biological reaction. The analysis usually comes with 3D graphics that can be manipulated in several ways to explore in detail (at atomic resolution) the interactions between the atoms of the two interacting molecules.
This method can therefore be used not only to predict possible binders or inhibitors, but also to predict how strong the association between the molecules (called the binding affinity) can be. Knowing the binding strength (binding energy) is useful when you are comparing (ranking) a group of compounds or derivatives to determine which one is the best binder or inhibitor, that is, how strongly each compound will bind to the target.
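To make the link between binding energy and binding strength concrete, the short Python sketch below converts a predicted binding free energy into an approximate dissociation constant using the standard thermodynamic relation delta_G = R * T * ln(Kd). The function name and the example energies are illustrative assumptions, not output from any docking program, and docking scores are only rough estimates of the true binding free energy.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def kd_from_binding_energy(delta_g_kcal, temperature_k=298.15):
    """Approximate dissociation constant (mol/L) from a binding free energy.

    Uses delta_G = R * T * ln(Kd), with Kd relative to a 1 M standard state.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))

# Hypothetical predicted energies (kcal/mol) for three derivatives.
predicted = {"derivative_A": -7.2, "derivative_B": -9.8, "derivative_C": -5.1}

# More negative energy -> smaller Kd -> tighter binder.
for name, energy in sorted(predicted.items(), key=lambda item: item[1]):
    kd = kd_from_binding_energy(energy)
    print(f"{name}: {energy:6.1f} kcal/mol, Kd ~ {kd:.1e} M")
```

Because the relation is exponential, a difference of only a few kcal/mol between two derivatives corresponds to a large difference in predicted binding strength.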
Molecular surface of a protein
Prediction of the binding affinity is useful when you are synthesizing compounds, because you can predict the affinity of your desired compound towards a certain target (say a protein or DNA), with particular interest in stopping the function of an enzyme or protein or in blocking a certain reaction.
You can therefore save a lot of time and money by “experimenting” on the computer first, before actually going to the lab to make your compound. In addition, you can predict how a molecule interacts or reacts with another molecule, for example in a protein-protein interaction or in a specific biological reaction of interest, before conducting the (“wet”) experiment.
This method is also useful when you want to screen a number of compounds (this is called “virtual screening”), say from a natural product or from plants and herbs, to see whether your small molecules will have certain pharmacological effects on a particular protein or enzyme (for example, HIV protease). Large pharmaceutical companies in Europe and the US have been using this technique for some time in the discovery and development of new drugs.
There are two main types of molecular docking in practice: small molecule-protein docking (called “ligand-protein docking”) and protein-protein docking. As mentioned earlier, some researchers also dock small molecules to DNA or RNA; I believe this can be categorized as ligand-DNA/RNA docking.
Protein-protein docking involves two protein molecules that the computer program simulates binding to or interacting with one another. In this case, however, the molecules are basically treated as rigid compared with ligand-protein docking. By simulating certain protein-protein interactions in a specific biological reaction, you can gain some insight at the molecular level into how a certain mechanism takes place.
There are many docking programs available that can simulate the ligand-protein interaction with the ligand (small molecule) having full flexibility (rotation about its bonds) while the protein or receptor is either rigid or partially flexible (some amino acid side chains at the interaction site of the protein are allowed to rotate). The higher the flexibility, the more computational time is needed to simulate the interaction.
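A rough way to see why flexibility is costly: if each rotatable bond is sampled at a fixed angular step, the number of ligand conformations in an exhaustive scan grows exponentially with the number of bonds. The Python sketch below illustrates this. The 30-degree step and the function name are assumptions chosen for illustration, and real docking programs use smarter search strategies rather than exhaustive enumeration.

```python
def count_conformations(n_rotatable_bonds, step_degrees=30):
    """Number of torsion combinations in an exhaustive grid scan."""
    states_per_bond = 360 // step_degrees
    return states_per_bond ** n_rotatable_bonds

for n in (2, 5, 10):
    print(f"{n:2d} rotatable bonds -> {count_conformations(n):,} conformations")
```

Making receptor side chains flexible adds further rotatable bonds to the same product, which is why partially flexible receptors are noticeably more expensive than rigid ones.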
3D structure of a ligand with many torsions
Why is it important?
Because of its ability to predict binding interactions and orientation (in some cases with very high accuracy when compared with an existing crystal structure of the complex studied), docking is widely used in rational drug design, also known as structure-based drug design. Structure-based drug design means that we use three-dimensional structures to design new drugs with the help of computers and software.
Another good reason why many researchers are moving towards docking methods to complement their work is that some information is difficult to obtain experimentally. The ability of the computer to simulate interactions in atomic detail, together with the increasing power of computers (high-performance computing), can provide answers to difficult research problems that cannot be solved through conventional means. This is why many researchers have diverted some (if not all) of their attention to this technique.
Brief Theory
The theory behind molecular docking is based on the enzyme-substrate recognition process. The problem can be thought of as a “lock-and-key” concept. The orientation of the ligand (small molecule or substrate protein) is “fitted” to the receptor of interest using one of two approaches: matching techniques or simulation processes.
Certain programs are able to rank the affinity of compounds towards the receptor studied. The programs “adjust” the conformation of the ligand, and in some cases the conformation of the side chains in the receptor's binding site (the site of interaction), to see how well they fit each other. This is done by calculating the energy of each conformation of the complex and the energy of the molecules in their uncomplexed form (before binding).
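In its simplest form, and ignoring terms that individual programs treat differently (such as solvation and entropy), the comparison between the complexed and uncomplexed states can be written as shown here. The symbols are introduced for illustration and are not taken from a particular program.

```latex
\Delta E_{\text{bind}} = E_{\text{complex}} - \left( E_{\text{receptor}} + E_{\text{ligand}} \right)
```

A more negative value of \Delta E_{\text{bind}} indicates a more favorable predicted binding.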
Methods of Calculation
1st method:
The first method uses a matching technique. This technique looks for complementarity between the surfaces of the two molecules (their solvent-accessible surface areas) to see whether the surfaces match each other. This is called the shape complementarity method.
2nd method:
The second method uses simulation processes. This technique simulates the actual docking process by calculating the ligand-protein pairwise interaction energies, using empirical methods or molecular mechanics that take into account torsional energy between bonds, van der Waals energy, electrostatic energy, hydrogen bonding potential, atomic solvation energy, and so forth.
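As a minimal illustration of a pairwise interaction energy, the Python sketch below sums a Lennard-Jones (van der Waals) term and a Coulomb (electrostatic) term over all ligand-receptor atom pairs. The parameter values, atom coordinates and function names are made-up assumptions for illustration. Real force fields use atom-type-specific parameters and add the other terms listed above (torsions, hydrogen bonding, solvation).

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2), converts q1*q2/r to kcal/mol

def pair_energy(r, q1, q2, epsilon=0.15, sigma=3.4, dielectric=4.0):
    """Lennard-Jones 12-6 plus Coulomb energy (kcal/mol) for one atom pair.

    epsilon, sigma and dielectric are illustrative values, not taken
    from any specific force field.
    """
    lj = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    return lj + coulomb

def interaction_energy(ligand_atoms, receptor_atoms):
    """Sum pair energies over every ligand-receptor atom pair.

    Each atom is a tuple (x, y, z, partial_charge), coordinates in Angstrom.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for rx, ry, rz, rq in receptor_atoms:
            r = math.dist((lx, ly, lz), (rx, ry, rz))
            total += pair_energy(r, lq, rq)
    return total

# Hypothetical two-atom ligand and two-atom receptor fragment.
ligand = [(0.0, 0.0, 0.0, -0.4), (1.5, 0.0, 0.0, 0.1)]
receptor = [(0.0, 4.0, 0.0, 0.4), (1.5, 4.5, 0.0, -0.1)]
print(f"Interaction energy: {interaction_energy(ligand, receptor):.2f} kcal/mol")
```

The double loop makes the cost proportional to the number of ligand atoms times the number of receptor atoms, which is one reason the energy evaluation dominates the running time of a docking simulation.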
How Docking is Done.
Typically, you will need two input files: one for the ligand (small molecule) and another for the target (receptor). The ligand can be built from scratch using suitable software or taken from available databases.
Some software comes with a two-dimensional drawing tool whose output can be converted to a three-dimensional structure, and some software has three-dimensional drawing capabilities. The structure of the protein or receptor can either be downloaded from one of several databases, for example the Protein Data Bank (http://www.rcsb.org/), or built from a template using software with amino acid/biopolymer construction features (this is called homology modelling). The Protein Data Bank is a good repository holding a large collection of publicly available 3D structures determined by X-ray crystallography and NMR (nuclear magnetic resonance).
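As a hedged illustration of obtaining a receptor structure, the Python sketch below builds a download address for a Protein Data Bank entry and reads atom coordinates from the fixed-column ATOM records of a PDB-format file. The URL pattern reflects the RCSB file download service as I understand it and should be checked against the current RCSB documentation. The function names and the example entry ID 1HSG (an HIV protease structure) are illustrative choices.

```python
from urllib.request import urlretrieve

def pdb_download_url(pdb_id):
    """Assumed URL pattern of the RCSB file download service."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"

def download_structure(pdb_id, filename):
    """Download one entry to a local file (requires network access)."""
    urlretrieve(pdb_download_url(pdb_id), filename)
    return filename

def read_atom_coordinates(pdb_lines):
    """Extract (atom name, residue name, x, y, z) from ATOM/HETATM records.

    PDB files use fixed columns (1-based): atom name 13-16,
    residue name 18-20, x 31-38, y 39-46, z 47-54.
    """
    atoms = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()
            residue = line[17:20].strip()
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            atoms.append((name, residue, x, y, z))
    return atoms

# Example usage (not run here because it needs network access):
#   path = download_structure("1hsg", "1hsg.pdb")
#   with open(path) as handle:
#       atoms = read_atom_coordinates(handle)
```

In practice a dedicated structure-handling library would normally do the parsing; the manual version is shown only to make the file layout visible.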
You will need to check the structure you built (the receptor) to validate its quality before performing any docking. Programs for doing this can easily be found on the web.
So now you have the two inputs (in 3D) needed to perform your docking simulation: one for the ligand and the other for your target (receptor).
How reliable is a docking result?
There are many docking programs available on the net, commercial or open source. However, the success of a program depends mainly on two components: the search algorithm and the scoring function. Of course, it is only a computer: if you put “rubbish in”, you get “rubbish out”. You will need to check all your inputs and parameters carefully before accepting any results from the simulation. The resolution of the receptor structure used also plays an important part in the accuracy of your simulation. In general, the higher the resolution of your protein or receptor structure, the more accurate your result is likely to be.
The Search Algorithm.
The search algorithm is the process by which the possible conformations and orientations of the complex (the paired ligand and protein) within a space (the binding site of interest) are searched. If the ligand is flexible, the program calculates the energy for each rotation made about each rotatable bond it can find.
The same goes for the protein or receptor. For each rotation of an amino acid side chain in the binding site, the program calculates the energy involved. Each energy value calculated is presented with a “snapshot” of the pair.
In each snapshot, you will be able to see what kinds of interactions are involved and which atoms are in contact or in close proximity, forming interactions such as hydrogen bonds, hydrophobic interactions and many more. Each “snapshot” of the pair or complex is called a binding “pose”.
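The search step can be illustrated with a deliberately simple Python sketch: a greedy random search over ligand torsion angles that keeps any random change that lowers the energy. This is a stand-in technique chosen for brevity (docking programs typically use more elaborate strategies such as genetic algorithms or Monte Carlo methods), and `toy_energy` is a hypothetical placeholder for a real scoring function, not a physical model.

```python
import math
import random

def toy_energy(torsions):
    """Hypothetical energy: each torsion has minima at 60, 180 and 300 degrees."""
    return sum(1.0 + math.cos(3.0 * math.radians(t)) for t in torsions)

def greedy_random_search(n_torsions, n_steps=2000, max_change=30.0, seed=0):
    """Perturb one torsion at a time and keep the change if the energy drops."""
    rng = random.Random(seed)
    best_pose = [rng.uniform(0.0, 360.0) for _ in range(n_torsions)]
    best_energy = toy_energy(best_pose)
    for _ in range(n_steps):
        trial = list(best_pose)
        index = rng.randrange(n_torsions)
        trial[index] = (trial[index] + rng.uniform(-max_change, max_change)) % 360.0
        trial_energy = toy_energy(trial)
        if trial_energy < best_energy:
            best_pose, best_energy = trial, trial_energy
    return best_pose, best_energy

pose, energy = greedy_random_search(n_torsions=4)
print("Best torsions (degrees):", [round(t, 1) for t in pose])
print(f"Best energy: {energy:.3f}")
```

Each accepted trial corresponds to one “snapshot” in the sense used above. A purely greedy search can get stuck in a local minimum on a realistic energy surface, which is why production programs accept some uphill moves or run many independent searches.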
A binding pose of the docked ligand
The Scoring Function
The scoring function is the process by which the program takes a binding pose and gives it a number indicating how likely it is that the binding interaction is favorable. A molecular mechanics force field (a physics-based calculation) is typically used to estimate the energy of each pose. Every pose comes with an energy value, and the program ranks the poses accordingly (normally with the lowest energy first). The lower the energy of the pose, the more stable the complex will be, and the more likely it is that the binding will happen.
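The ranking step itself is simple once every pose has an energy. A minimal Python sketch, with made-up pose names and energies used purely for illustration:

```python
def rank_poses(pose_energies):
    """Return (pose_id, energy) pairs sorted from lowest to highest energy."""
    return sorted(pose_energies.items(), key=lambda item: item[1])

# Hypothetical energies (kcal/mol) for four poses of one ligand.
poses = {"pose_1": -6.3, "pose_2": -8.1, "pose_3": -4.9, "pose_4": -7.5}

for rank, (pose_id, energy) in enumerate(rank_poses(poses), start=1):
    print(f"{rank}. {pose_id}: {energy:.1f} kcal/mol")
```

The top-ranked pose is the program's best guess, not a guarantee, so it is worth inspecting the few best poses rather than only the first.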
Application of Docking
As mentioned earlier, most applications are in the field of drug design. Most drugs, as we know, are small organic molecules. Another application is predicting the activation or inhibition of a particular enzyme or protein (or in some cases DNA) of interest.
A ligand in the cavity/active site of a protein
In the process of identifying potential drug candidates from large databases of drugs or small molecules, many efforts have been made to identify molecules with a tendency to bind to a protein target of interest. This process is sometimes called virtual screening, hit identification or in-silico drug screening, in which thousands of drug candidates are screened rapidly using high-speed or high-performance computing facilities.
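At the level of bookkeeping, a virtual screen reduces each compound to its best pose and keeps the compounds that pass a cutoff. The Python sketch below shows this under stated assumptions: the compound names, energies, cutoff value and function name are all hypothetical, and a real screen would obtain the energies from a docking program.

```python
def screen_library(library_scores, energy_cutoff=-8.0, max_hits=10):
    """Keep compounds whose best (lowest) pose energy is at or below the cutoff.

    library_scores maps a compound name to a list of pose energies (kcal/mol).
    """
    best = {name: min(energies)
            for name, energies in library_scores.items() if energies}
    hits = [(name, energy) for name, energy in best.items()
            if energy <= energy_cutoff]
    return sorted(hits, key=lambda item: item[1])[:max_hits]

# Hypothetical docking results for a tiny compound library.
library = {
    "compound_001": [-6.2, -5.9],
    "compound_002": [-9.4, -8.8, -7.1],
    "compound_003": [-8.3],
    "compound_004": [-4.0, -3.5],
}

for name, energy in screen_library(library):
    print(f"{name}: best pose {energy:.1f} kcal/mol")
```

The compounds that survive this filter are the “hits” that would then be examined more closely or tested in the lab.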
Docking can also be used in the “lead optimization” process. This is a process in which a lead or drug candidate that has earlier shown binding affinity towards the protein is structurally modified (in the computer) to enhance its binding potential. The potency and/or selectivity of the drug candidate towards the protein can therefore be improved. Researchers have been using this technique as part of an effort to help minimize the problem of late failure in drug development.