7.1.2 Residue Independence
The second design decision, to treat rotamer assignment to residues independently while enforcing backbone continuity, was made after another design option proved infeasible. I describe the independent-residue scheme first.
Suppose the locally-flexible backbone segments for a protein each span k residues, and each have b backbone conformations, and suppose that each residue has s states per backbone conformation (or bs states total). In the independent residue scheme, each vertex in the graph represents a single residue. A state assigned to a single vertex represents both the backbone conformation and the side-chain conformation. Vertices that correspond to residues that are part of the same locally-flexible-backbone segment are constrained in their state assignments: they may only take on states such that these states belong to the same backbone conformation.
Each edge connecting a residue pair from different backbone segments would store a pair energy table with $b^2 s^2$ entries, and each edge connecting a residue pair from the same segment would store a pair energy table with $b s^2$ entries, since intra-segment pair energies are only meaningful for rotamer pairs that originate from the same backbone. For a pair of such segments, the independent-residue scheme would thus store at most $k^2 b^2 s^2 + k(k-1) b s^2$ pair energies ($k^2$ inter-segment residue pairs, plus $k(k-1)/2$ intra-segment residue pairs in each of the two segments); the scheme would likely store fewer than this number of energies since not all residue pairs for these segments necessarily interact.
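The count above can be checked with a short sketch. It is only an illustration under stated assumptions: the function name `stored_pair_energies`, the edge list, and the `segment_of` mapping are hypothetical and do not come from the actual interaction graph code.

```python
from itertools import combinations


def stored_pair_energies(edges, segment_of, b, s):
    """Count pair energies stored by the independent-residue scheme.

    edges      -- iterable of (i, j) residue pairs that interact (assumed input)
    segment_of -- dict mapping residue index -> segment label (assumed input)
    b, s       -- backbone conformations per segment, states per conformation
    """
    total = 0
    for i, j in edges:
        if segment_of[i] == segment_of[j]:
            # Same segment: both residues must share one backbone conformation.
            total += b * s * s
        else:
            # Different segments: every state pairs with every state.
            total += (b * s) ** 2
    return total


# Two segments of k = 2 residues each, with every residue pair interacting.
segment_of = {0: "A", 1: "A", 2: "B", 3: "B"}
all_edges = list(combinations(range(4), 2))
k, b, s = 2, 2, 3
upper_bound = k * k * b * b * s * s + k * (k - 1) * b * s * s
print(stored_pair_energies(all_edges, segment_of, b, s), upper_bound)
```

Dropping any non-interacting residue pair from `all_edges` lowers the total below the bound, which is the sense in which the bound is "at most".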
This scheme, wherein each residue is optimized separately, creates opportunities to break the backbone. Suppose residues i and i+1 move in concert and are assigned states such that they are both in backbone conformation A; in this state assignment, the backbone is not broken. Then if the annealer tried to perform a single state substitution at residue i that moved it from backbone conformation A to backbone conformation B, the annealer would break the backbone. The way to resolve this break would be for residue i+1 to undergo a simultaneous state substitution so that its new state also originates from backbone B. This means that simulated annealing has to change, since the system is no longer one in which single states are substituted at a time.
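A minimal sketch of such a coupled substitution follows. All names here (`coupled_substitution`, `backbone_is_continuous`, the `(backbone, rotamer)` tuple encoding) are hypothetical. Choosing the partner residue's new rotamer uniformly at random is an assumption made for illustration only; the text does not say how that rotamer is selected.

```python
import random


def coupled_substitution(assignment, segment, residue, new_backbone, s, rng):
    """Move `residue` to `new_backbone`, dragging its segment partners along.

    assignment maps residue index -> (backbone index, rotamer index).
    Partner rotamers are drawn at random (an illustrative assumption).
    """
    proposal = dict(assignment)
    for r in segment:
        old_backbone, _ = assignment[r]
        if r == residue or old_backbone != new_backbone:
            proposal[r] = (new_backbone, rng.randrange(s))
    return proposal


def backbone_is_continuous(assignment, segment):
    """True when every residue in the segment uses the same backbone."""
    return len({assignment[r][0] for r in segment}) == 1


rng = random.Random(0)
segment = [10, 11]                      # residues i and i+1
start = {10: (0, 3), 11: (0, 1)}        # both on backbone A (index 0)

naive = dict(start)
naive[10] = (1, 2)                      # single substitution: i moves to B alone

moved = coupled_substitution(start, segment, residue=10,
                             new_backbone=1, s=5, rng=rng)
print(backbone_is_continuous(naive, segment),
      backbone_is_continuous(moved, segment))
```

The single substitution leaves residues i and i+1 on different backbones, while the coupled move keeps the segment on one backbone.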
The alternative to the independent-residue scheme would have been to treat the entire moving backbone segment as a single super residue, where this super residue would be assigned one of many super rotamers. If a four-residue segment had b alternate backbone conformations and s rotamers on each residue in each backbone conformation, then the super residue composed of these four would have $b s^4$ super rotamers to choose from: each super rotamer would consist of a backbone conformation for the segment, and an assignment of rotamers to each of the residues in the segment.
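To make the structure of a super rotamer concrete, the sketch below decodes a super-rotamer index into a backbone conformation plus one rotamer per residue. The mixed-radix encoding and the function names are assumptions chosen for illustration; the text does not prescribe any particular numbering.

```python
def super_rotamer_count(b, s, k):
    """Number of super rotamers for a k-residue segment: b * s**k."""
    return b * s ** k


def decode_super_rotamer(index, b, s, k):
    """Split a super-rotamer index into (backbone, per-residue rotamers).

    Assumed encoding: index = backbone * s**k + base-s digits of the rotamers,
    with the first residue as the most significant digit.
    """
    if not 0 <= index < super_rotamer_count(b, s, k):
        raise IndexError("super-rotamer index out of range")
    backbone, rest = divmod(index, s ** k)
    rotamers = []
    for _ in range(k):
        rest, rotamer = divmod(rest, s)
        rotamers.append(rotamer)
    return backbone, tuple(reversed(rotamers))


b, s, k = 2, 3, 4
print(super_rotamer_count(b, s, k))
print(decode_super_rotamer(5, b, s, k))
```

Every index decodes to a distinct (backbone, rotamers) combination, so the count $b s^4$ for the four-residue case is exactly the number of such combinations.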
The advantage of the super-rotamer formulation is that simulated annealing would not need to change. Simulated annealing, instead of making a single rotamer substitution at a time, would make a single super-rotamer substitution at a time. Since each super rotamer is built from a single backbone conformation, there is no way to inadvertently break the backbone during a super-rotamer substitution.
The first obvious disadvantage of the super-rotamer scheme is that if the interaction graph to store super-rotamer-pair energies were implemented in the same way that the fixed-backbone interaction graph is implemented, then it would require too much memory. In fixed-backbone design, an edge connecting a pair of interacting residues with s states each stores $s^2$ rotamer-pair energies (unless they are quite distant, when the AminoAcidNeighborSparseMatrix stores fewer energies). Consider two super residues where each represents a segment of k residues with b backbone conformations and s rotamers per residue per conformation; each super residue has $b s^k$ super rotamers. If the edge between these two super residues were to store a single table, as edges do in the fixed-backbone design interaction graph, then it would store $b^2 s^{2k}$ energies. This memory requirement is dramatically larger than the number of rotamer-pair energies represented in the independent-residue scheme.
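The size of the gap can be illustrated numerically. The values k = 4, b = 3, s = 10 below are arbitrary illustrative choices, not figures from this work, and the function names are hypothetical.

```python
def single_table_entries(k, b, s):
    """Entries in one super-rotamer-pair table: (b * s**k) ** 2."""
    return (b * s ** k) ** 2


def independent_scheme_bound(k, b, s):
    """Upper bound for the independent-residue scheme on two segments."""
    return k * k * (b * s) ** 2 + k * (k - 1) * b * s * s


k, b, s = 4, 3, 10   # illustrative values only
table = single_table_entries(k, b, s)
bound = independent_scheme_bound(k, b, s)
print(table, bound, table // bound)
```

Even for these modest values the single table holds 900,000,000 entries against at most 18,000 for the independent-residue scheme, a factor of 50,000.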
The super-rotamer scheme should instead represent its pair energies the same way that the independent-residue scheme does. The super residue could still behave as if it were reading from gigantic tables of super-rotamer-pair energies. It would present the same interface to the annealer as before: it would tell the annealer how many states (super rotamers) each vertex (super residue) had, and the annealer would continue to make a single state (super-rotamer) substitution at a time. Behind the scenes, the graph would, for each super-rotamer substitution, retrieve many more energies than if the graph read from a set of tables holding super-rotamer-pair energies.
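One way to picture this behind-the-scenes lookup is the sketch below, which computes a super-rotamer-pair energy on demand by summing residue-level pair energies. The function name, the table layout keyed by `(backbone_i, rotamer_i, backbone_j, rotamer_j)`, and the example energies are all hypothetical; this is not the interface of the actual interaction graph.

```python
def super_pair_energy(state_a, state_b, residues_a, residues_b, pair_tables):
    """Sum residue-pair energies between two super residues.

    state_a, state_b -- (backbone index, tuple of rotamer indices), one
                        rotamer per residue in the segment
    pair_tables      -- dict mapping (i, j) -> dict keyed by
                        (backbone_i, rotamer_i, backbone_j, rotamer_j);
                        residue pairs absent from the dict do not interact
    """
    backbone_a, rotamers_a = state_a
    backbone_b, rotamers_b = state_b
    total = 0.0
    for i, rot_i in zip(residues_a, rotamers_a):
        for j, rot_j in zip(residues_b, rotamers_b):
            table = pair_tables.get((i, j))
            if table is not None:
                total += table[(backbone_a, rot_i, backbone_b, rot_j)]
    return total


# Toy data: a one-residue segment interacting with a two-residue segment.
pair_tables = {
    (0, 1): {(0, 1, 1, 0): 2.0},
    (0, 2): {(0, 1, 1, 2): -0.5},
}
energy = super_pair_energy((0, (1,)), (1, (0, 2)), [0], [1, 2], pair_tables)
print(energy)
```

A single super-rotamer-pair energy here costs up to $k^2$ residue-level lookups, which is the extra retrieval work described above.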
The true problem with the super-rotamer scheme is that if the segment becomes long, or the number of rotamers per residue becomes very large, then the number of super rotamers becomes unmanageably large. If a single segment contained five residues and allowed four different backbone conformations, and if each residue had 64 rotamers per backbone conformation, then the super residue would have $4 \cdot 64^5 = 2^{32}$, or roughly 4.3 billion, possible super rotamers. It would not be possible to represent which of the super rotamers to substitute on a single super residue with a 32-bit integer. Moreover, simulated annealing would take an extraordinarily large amount of time. Currently, simulated annealing on a fixed protein backbone considers tens of millions of state substitutions, considering each rotamer for substitution several hundred times. In order for simulated annealing to visit each super rotamer once in the example above, it would have to run 1000 times longer than fixed-backbone simulated annealing. The super-rotamer idea has severe practical limitations.
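The arithmetic for this example can be verified directly. The variable names are illustrative; the comparison against the signed 32-bit maximum is an assumption about which integer type would hold the state index.

```python
b, s, k = 4, 64, 5                 # backbones, rotamers per residue, residues
n_super = b * s ** k               # number of super rotamers, b * s**k

INT32_MAX = 2 ** 31 - 1            # largest signed 32-bit value
UINT32_MAX = 2 ** 32 - 1           # largest unsigned 32-bit value

print(n_super)
print(n_super > INT32_MAX, n_super > UINT32_MAX)
```

The count is exactly $2^{32}$, which exceeds the largest value of both the signed and the unsigned 32-bit integer types.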