All the presented sampling approaches use a common local minimization subroutine, which is independent of the global search strategy. Its main role is to account for the flexibility of side chains during the search. We explored and optimized this protocol in our previous work (Moghadasi et al., 2015a). It consists of the following steps. We first run a rigid-body energy minimization algorithm (Mirzaei et al., 2012), which locally minimizes the energy over the position and orientation of the ligand with respect to the receptor. We then run a side-chain positioning (SCP) algorithm (Moghadasi et al., 2013, 2015a) that solves a combinatorial optimization problem in order to repack the amino acid residues at the interface of the receptor-ligand complex. SCP thereby models the flexibility of the protein structures upon binding.
7.11 Monte Carlo Minimization Protocol
To better study the role of sampling in refinement, we used an in-house MCM-based off-grid refinement protocol, previously introduced in Chapter 5. The advantage of this implementation is that it shares the same local optimization and energy function with SSDU, which makes it possible to make direct comparisons and draw conclusions. Briefly, the protocol performs a local perturbation, slides the proteins into contact, and locally optimizes the structure as described above. The new conformation is then accepted or rejected using the Metropolis criterion. Figure 7·5 depicts a flowchart of the MCM-based refinement. MCM takes its input conformations from a PIPER cluster and produces an ensemble of predicted conformations as the refinement output set.
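The perturb, minimize, and accept cycle described above can be illustrated with a minimal sketch. It is not the in-house implementation: the conformation is reduced to a single number, the sliding-into-contact step is omitted, and `toy_energy`, `toy_minimize`, `toy_perturb`, the temperature `kT`, and the step count are all hypothetical stand-ins chosen only to make the loop runnable.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, rng):
    """Metropolis criterion: always accept downhill moves; accept uphill
    moves with probability exp(-(e_new - e_old) / kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)


def mcm_refine(x0, energy, perturb, local_minimize, n_steps=50, kT=1.0, seed=0):
    """Generic Monte Carlo minimization loop over an abstract conformation.
    Returns the list of (energy, conformation) pairs that were accepted."""
    rng = random.Random(seed)
    current = local_minimize(x0)
    e_current = energy(current)
    ensemble = [(e_current, current)]
    for _ in range(n_steps):
        # Perturb the current conformation, then minimize it locally.
        trial = local_minimize(perturb(current, rng))
        e_trial = energy(trial)
        if metropolis_accept(e_current, e_trial, kT, rng):
            current, e_current = trial, e_trial
            ensemble.append((e_current, current))
    return ensemble


# Hypothetical 1-D stand-ins for the docking energy and local optimizer.
def toy_energy(x):
    return (x * x - 1.0) ** 2 + 0.3 * x  # two wells, near x = -1 and x = +1


def toy_minimize(x, step=0.01, iters=200):
    for _ in range(iters):  # plain gradient descent on toy_energy
        x -= step * (4.0 * x * (x * x - 1.0) + 0.3)
    return x


def toy_perturb(x, rng):
    # Gaussian move, clamped so the toy gradient descent stays stable.
    return max(-2.0, min(2.0, x + rng.gauss(0.0, 0.8)))


ensemble = mcm_refine(1.0, toy_energy, toy_perturb, toy_minimize)
e_best, x_best = min(ensemble)
```

Because every trial is minimized before the acceptance test, the Metropolis criterion compares local minima rather than raw perturbed states, which is the defining feature of Monte Carlo minimization.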
7.11.1 Energy Function
In this work, we use a state-of-the-art, high-accuracy docking energy potential that is calculated as a weighted sum of force-field and knowledge-based energy terms (Gray et al., 2003; Andrusier et al., 2007; Pierce and Weng, 2007). We consider the following energy terms to compute the interaction free energy:
$$E = w_{VDW} E_{VDW} + w_{SOL} E_{SOL} + w_{COUL} E_{COUL} + w_{HB} E_{HB} + w_{DARS} E_{DARS} + w_{RP} E_{RP},$$
where $E_{VDW}$ is the Lennard-Jones potential, $E_{SOL}$ is an implicit solvation term (Schaefer and Karplus, 1996), $E_{COUL}$ is the Coulomb potential, $E_{HB}$ is a knowledge-based hydrogen-bonding term, and $E_{DARS}$ is a knowledge-based intermolecular potential derived from a non-redundant database of native protein-protein complexes, using the novel DARS (Decoys as Reference State) reference set (Chuang et al., 2008). To form the DARS reference set, a large decoy set of docked conformations is generated based on a shape-complementarity scoring function. The potential is then computed by observing the frequency of interactions in these decoys. The last term, $E_{RP}$, is a statistical energy term associated with a set of rotamers selected from the backbone-dependent rotamer library (Shapovalov and Dunbrack, 2011). The weights of the energy function are adopted from Gray et al. (2003).

Figure 7·5: The flowchart of the MCM-based off-grid refinement procedure.
[Flowchart: PIPER focused resampling feeds the MCM off-grid refinement. For each input conformation, the loop comprises random perturbation, rigid-body minimization, side-chain positioning, and a Metropolis accept (yes/no) test; the diagram carries a "5x" repeat label.]
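The weighted sum defined in this subsection can be written as a short sketch. The term names follow the subscripts of the equation; the numerical term values and weights below are placeholders invented for illustration and are not the weights of Gray et al. (2003).

```python
# Term labels taken from the subscripts of the energy function.
TERMS = ("VDW", "SOL", "COUL", "HB", "DARS", "RP")


def total_energy(terms, weights):
    """Return E = sum over t of weights[t] * terms[t] for the six terms."""
    missing = [t for t in TERMS if t not in terms or t not in weights]
    if missing:
        raise KeyError(f"missing energy terms or weights: {missing}")
    return sum(weights[t] * terms[t] for t in TERMS)


# Placeholder values (hypothetical, for illustration only).
example_terms = {"VDW": -2.0, "SOL": 1.0, "COUL": -0.5,
                 "HB": -1.0, "DARS": -3.0, "RP": 0.5}
example_weights = {"VDW": 1.0, "SOL": 0.5, "COUL": 0.5,
                   "HB": 1.0, "DARS": 1.0, "RP": 2.0}

example_E = total_energy(example_terms, example_weights)
```

Keeping the terms and weights in separate dictionaries makes it straightforward to evaluate the same conformation under several weight sets.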
7.11.2 Refinement dataset
We validate our algorithm on 34 cases from a protein docking benchmark containing Other Type (OT) complexes; we reported results on the same test set in our earlier work (Moghadasi et al., 2015a). The reason for considering OT complexes is that they exhibit multiple deep funnels in the vicinity of the native structure, which makes them particularly difficult cases for protein docking refinement. The choice of protein complexes in this benchmark was based on the number of near-native conformations that PIPER provides. Given 1000 input conformations from PIPER, we only consider the complexes whose initial number of near-native conformations is greater than 30. Since SSDU samples in the vicinity of the local minima of the sample points, when there are not enough conformations in the vicinity of the near-native structure, SSDU would pointlessly sample in a region remote from the native.
We start with a near-native cluster produced by global PIPER docking, i.e., a cluster whose geometric center is within 10 Å of the native structure. Additionally, our dataset contains a few structures whose closest-to-native cluster is within 12 Å of the native. A dense rotation set consisting of 250,000 uniformly distributed rotations was prepared. Rotations from this set were compared to the rotations of the 1000 structures retained during the initial PIPER docking. From the dense rotation set, we retained only the rotations that were within 5 degrees of any rotation in the original set of 1000. This procedure typically results in a subset of 2000 to 5000 rotations within the region of interest. These rotations are then used for local resampling of the region by the PIPER program. Translations were constrained to a 10 Å distance from the geometric center of the structure representing the cluster center. As in the initial docking, the local resampling was performed using three different sets of weights in the scoring function, resulting in three sets of orientations. Structures farther than 10 Å interface RMSD from the cluster center were removed.
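The rotation filtering step described above can be sketched as follows. This is a brute-force illustration under stated assumptions, not the procedure used with PIPER: rotations are represented as 3x3 matrices, the distance between two rotations is taken to be the angle of their relative rotation, and `rot_z`, `rotation_angle_deg`, and `filter_rotations` are hypothetical helper names.

```python
import math


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_angle_deg(r1, r2):
    """Angle of the relative rotation r1^T r2, in degrees.
    Uses trace(r1^T r2) = 1 + 2 cos(theta); the trace equals the
    element-wise (Frobenius) inner product of r1 and r2."""
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard round-off
    return math.degrees(math.acos(cos_theta))


def filter_rotations(dense, retained, cutoff_deg=5.0):
    """Keep each dense rotation lying within cutoff_deg of any retained one."""
    return [r for r in dense
            if any(rotation_angle_deg(r, q) <= cutoff_deg for q in retained)]


# Small illustration with rotations about a single axis.
dense_angles = (0, 3, 8, 20, 24, 40)
dense_set = [rot_z(a) for a in dense_angles]
retained_set = [rot_z(0), rot_z(22)]
kept = filter_rotations(dense_set, retained_set)
```

With 250,000 dense rotations and 1000 retained rotations, this pairwise comparison requires on the order of 2.5 x 10^8 angle evaluations, so a practical implementation would likely vectorize the computation or use a spatial index.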
From each of the three sets, we selected the 333 lowest-energy structures and then added one more structure to yield the 1000 conformations that form the initial PIPER cluster presented to the refinement protocol.
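The assembly of the initial cluster can be sketched as below. The text does not state how the one additional structure is chosen; the sketch assumes, purely for illustration, that it is the lowest-energy structure remaining across the three sets. The tuple layout `(energy, irmsd_to_center, label)` and the function name `build_initial_cluster` are likewise hypothetical.

```python
def build_initial_cluster(sets, per_set=333, total=1000, irmsd_cutoff=10.0):
    """Drop structures farther than irmsd_cutoff from the cluster center,
    take the per_set lowest-energy structures from each set, then top up to
    `total` with the lowest-energy leftovers (an assumed tie-break rule)."""
    chosen, leftovers = [], []
    for structures in sets:
        kept = sorted((s for s in structures if s[1] <= irmsd_cutoff),
                      key=lambda s: s[0])
        chosen.extend(kept[:per_set])
        leftovers.extend(kept[per_set:])
    leftovers.sort(key=lambda s: s[0])
    chosen.extend(leftovers[:max(0, total - len(chosen))])
    return chosen


# Scaled-down illustration: 2 per set plus 1 extra instead of 333 + 1.
set_a = [(-5.0, 1.0, "a1"), (-3.0, 2.0, "a2"), (-9.0, 12.0, "a3"), (-1.0, 3.0, "a4")]
set_b = [(-4.0, 1.0, "b1"), (-2.0, 5.0, "b2"), (-6.0, 9.0, "b3"), (0.0, 11.0, "b4")]
set_c = [(-7.0, 4.0, "c1"), (-8.0, 10.0, "c2"), (-1.5, 2.0, "c3"), (-10.0, 15.0, "c4")]

cluster = build_initial_cluster([set_a, set_b, set_c], per_set=2, total=7)
labels = [s[2] for s in cluster]
```

In the scaled-down example, the structures `a3`, `b4`, and `c4` are discarded by the interface RMSD cutoff even though some of them have low energies.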