5 Towards Development of KAT2A HAT Domain Tool Inhibitors
5.1.1 Phylogeny, Structure and Function
Protein acetylation was discovered in 1963, when Phillips identified acetyl groups in histones isolated from calf thymus.465 In 1964, Allfrey et al. showed that in isolated nuclei radiolabelled
acetate was rapidly incorporated into histones, independent of translation.466 Furthermore, it
demonstrated that histone acetylation decreased the effectiveness of this inhibition. They proposed that post-translational histone acetylation affords a ‘dynamic and reversible mechanism for "activation" as well as "repression" of RNA synthesis’.466 Unfortunately, after this
development, acetylation was largely neglected by researchers for 30 years467, with the
notable exception that, in 1978, sodium butyrate was discovered to inhibit histone deacetylase activity, causing histone hyperacetylation.468,469 In 1988, the connection between histone
acetylation and transcriptional activation was uncovered431 and there was growing recognition
that histones acetylated at specific residues mediate unique effects on gene expression.467 This
initiated a flood of research in the 1990s, affording many discoveries, including the pivotal revelations that GCN5 and RPD3 (reduced potassium dependency 3), both known transcription regulators, possess histone acetyltransferase470 and deacetylase471 activity, respectively. This
finally confirmed a causal link between histone acetylation and transcriptional regulation. It is now widely accepted that transcription can be regulated via acetylation of conserved lysine residues located in the amino-terminal domains of histones. Acetylation neutralises the positive charge of the lysine side chain, which is thought to weaken the interactions between the histones, DNA and regulatory proteins. This alters the nucleosome structure, producing the more open chromatin environment required for transcription.381 These acetylation marks are written by HATs, by
transfer of an acetyl group from acetyl-CoA, and erased by histone deacetylases (HDACs).
Figure 5.1. Phylogenetic Tree of HAT Families. Schematic showing HATs subdivided into their respective families.
Figure adapted from Arrowsmith et al., 2012.380
During the following five years, an array of additional HATs was identified467, including, for
example, TAF1472, p300 and CREBBP473. Phylogenetic trees have been devised by groups such as
within gene families.219 One such tree for the HAT domains is shown in Figure 5.1.380 Common
aliases of the HAT proteins are listed in Table 5.1. As shown, the domains can be separated into families. The MYST family (named after its founding members MOZ, YBF2/SAS3, SAS2 and TIP60) is the largest. The p300/CBP family comprises p300, CREBBP and ATAT1 (α-tubulin N-acetyltransferase 1), and KAT2A and KAT2B are in the GNAT (GCN5-related N-acetyltransferases) family. The other HATs are transcriptional co-activators or steroid receptor co-activators, which harbour acetyltransferase activity alongside their other functions.474 Interestingly, except at the
core region of the acetyl-CoA binding sites, there is little overall sequence conservation between these HAT families. This diversity should enable development of selective tool inhibitors. Within families, conservation is higher. As discussed, KAT2A and KAT2B have 70% sequence identity.435
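To make the notion of percentage sequence identity concrete, the following is a minimal Python sketch of how such a figure can be computed from a pairwise alignment. The function name `percent_identity`, the choice to exclude gapped columns from the denominator, and the two toy sequences are all assumptions made for illustration. They are not the KAT2A and KAT2B sequences, and the convention used by the cited source is not stated here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped columns of a
    pairwise alignment (one of several conventions in common use)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        # Skip any column in which either sequence has a gap.
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical, pre-aligned toy sequences (not real HAT domain sequences).
toy_a = "MKVLAAGIVE"
toy_b = "MKILSAG-VE"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different sources are only comparable when the denominator is known.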
Table 5.1. Alternative Names for Histone Acetyltransferases. HATs listed with families, subtypes and aliases.474
Family                            Subtype   Aliases
Cytoplasmic                       KAT1      HAT1
Cytoplasmic                       HAT4      NAA60
GNAT                              KAT2A     GCN5
GNAT                              KAT2B     PCAF
GNAT                              KAT9      ELP3
p300/CREBBP                       KAT3A     CREBBP, CBP
p300/CREBBP                       KAT3B     p300, EP300
p300/CREBBP                       ATAT1     MEC17
MYST                              KAT5      TIP60
MYST                              KAT6A     MYST3, MOZ
MYST                              KAT6B     MYST4, MORF
MYST                              KAT7      MYST2, HBO1
MYST                              KAT8      MYST1, MOF
Transcriptional Co-activators     KAT4      TAF1, TAFII250, TFIID1
Transcriptional Co-activators     KAT12     GTF3C4, TFIIIC90
Steroid Receptor Co-activators    KAT13A    NCOA1, SRC1
Steroid Receptor Co-activators    KAT13B    NCOA3, SRC3, AIB1, ACTR
Steroid Receptor Co-activators    KAT13C    NCOA2, SRC2, P600
Steroid Receptor Co-activators    KAT13D    CLOCK
The first HAT X-ray crystal structures resolved were Saccharomyces cerevisiae HAT1 (PDB ID: 1BOB)475, Saccharomyces cerevisiae KAT2A (PDB ID: 1YGH)476 and Homo sapiens KAT2B (PDB ID:
1CM0).477 The KAT2A HAT domain structure, which is well conserved between yeast and
humans, has a mixed α/β topology, comprising five α-helices and six β-strands with a globular fold (Figure 5.2).241 Overall, the domain resembles a vice. The core, which forms the base of the
vice, is composed of the three-stranded antiparallel β-sheet of β2, β3 and β4, the α-helix α3 and the strand-loop-helix containing β5 and α4. The N- and C-termini form the two sides of the vice. At the N-terminus, β1 contributes to the core β-sheet via hydrogen bonds with β2, while α1 and
α2 sit off to one side and above the core. At the C-terminus, the β6 strand of the loop-α5-loop-β6 substructure is associated with the core by hydrogen bonding with β5. The positioning of the core and the N- and C-termini creates a pronounced cleft, approximately 10 × 10 × 20 Å, suitable for substrate binding. While the core domain is structurally conserved among HATs, the terminal regions show no sequence homology with other acetyltransferases.476
Figure 5.2. KAT2A HAT Domain Topology. Structure of Homo sapiens GCN5 acetyltransferase domain (PDB ID:
1Z4R).241 Protein chain coloured from blue at N-terminus to red at C-terminus and α-helices and β-strands labelled.
HATs are bi-substrate enzymes, meaning that they bind and convert two substrates during catalysis. Theoretically, there are three catalytic mechanisms that bi-substrate enzymes can employ. Firstly, a random-order ternary complex mechanism, in which both substrates bind to the enzyme, in any order, to form a ternary complex and then the acetyl is transferred directly from the acetyl-CoA to the lysine. Secondly, a compulsory-order ternary complex mechanism, which again relies on formation of a ternary complex and direct transfer of the acetyl group, but the substrates must bind in a particular order. Finally, a ping-pong mechanism, in which acetyl-CoA binds first and the acetyl is transferred to an amino acid at the enzyme catalytic site, then subsequently the lysine substrate binds and is acetylated. All three of these mechanisms require a general base at the HAT catalytic site, such as a glutamic acid residue, to facilitate nucleophilic attack at the acetyl-CoA by deprotonating the lysine residue. In addition, the ping-pong mechanism requires a residue capable of accepting the acetyl group, such as a cysteine.474 There
is increasing evidence that GCN5 employs a compulsory-order ternary complex mechanism. A conserved glutamic acid residue (S. cerevisiae Glu173/Homo sapiens Glu582/Tetrahymena Glu122) located at the bottom of the cleft in the GCN5 catalytic site is implicated as the general base necessary for catalysis.476,478 Bi-substrate kinetic experiments indicated that both
substrates are required to bind before catalysis, forming a ternary complex in a sequential manner, where acetyl-CoA binds first, followed by the lysine substrate.479
Also in accordance with this, Rojas et al. reported a crystal structure of the HAT domain of
Tetrahymena GCN5 with coenzyme A and a histone H3 peptide bound in the cleft (PDB ID:
1QSN).480 They showed that histone H3 binding is dependent on structural contributions from
CoA, which reorients GCN5 upon binding. In addition, they proposed that the conserved general base, Glu122, which is responsible for extracting the proton from the substrate lysine, acts via a mediating water that shuttles the proton from the lysine to the glutamic acid (Figure 5.3).480
Figure 5.3. KAT2A HAT Domain Substrate Binding. Tetrahymena GCN5 acetyltransferase domain (PDB ID: 1QSN)480 illustrated with and without protein surface. KAT2A shown in yellow, CoA in green and histone H3 peptide in white, with heteroatoms highlighted. Conserved general base Glu122, lysine residue and mediating water depicted.
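The kinetic distinction between the ternary-complex and ping-pong mechanisms described above can be illustrated with the standard steady-state rate equations for two-substrate enzymes. The Python sketch below is illustrative only: the function names and all parameter values (`VMAX`, `K_A`, `K_B`, `K_IA`) are hypothetical, in arbitrary units, and are not measured constants for GCN5. Here A stands for acetyl-CoA and B for the lysine substrate.

```python
def ordered_ternary_rate(a, b, vmax, k_a, k_b, k_ia):
    """Steady-state rate for a compulsory-order ternary-complex mechanism."""
    return vmax * a * b / (k_ia * k_b + k_b * a + k_a * b + a * b)


def ping_pong_rate(a, b, vmax, k_a, k_b):
    """Steady-state rate for a ping-pong (substituted-enzyme) mechanism."""
    return vmax * a * b / (k_b * a + k_a * b + a * b)


def double_reciprocal_slope(rate_fn, b, a_low=1.0, a_high=10.0):
    """Slope of 1/v against 1/[A] at a fixed concentration b of substrate B.

    Both rate laws are exactly linear in 1/[A], so two points suffice.
    """
    x_low, x_high = 1.0 / a_low, 1.0 / a_high
    y_low, y_high = 1.0 / rate_fn(a_low, b), 1.0 / rate_fn(a_high, b)
    return (y_low - y_high) / (x_low - x_high)


# Hypothetical kinetic constants in arbitrary units (assumptions, not data).
VMAX, K_A, K_B, K_IA = 1.0, 2.0, 5.0, 3.0


def ordered(a, b):
    return ordered_ternary_rate(a, b, VMAX, K_A, K_B, K_IA)


def ping_pong(a, b):
    return ping_pong_rate(a, b, VMAX, K_A, K_B)


for b_conc in (1.0, 10.0):
    print(
        f"[B] = {b_conc:5.1f}  "
        f"ternary slope = {double_reciprocal_slope(ordered, b_conc):.2f}  "
        f"ping-pong slope = {double_reciprocal_slope(ping_pong, b_conc):.2f}"
    )
```

In this sketch the double-reciprocal slope for the ternary-complex model changes with the concentration of B, giving intersecting lines, whereas the ping-pong slope is independent of B, giving parallel lines. This initial-velocity pattern separates ternary-complex from ping-pong behaviour but does not by itself establish the order of substrate binding, which requires further experiments such as inhibition studies.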
The diverse cellular and physiological implications of lysine acetylation do not result solely from effects on histones. The number of non-histone proteins known to be subject to acetylation is growing rapidly. The first non-histone target discovered was tubulin481, followed some 10
years later by p53482, HIV-1 transcriptional regulator Tat483,484 and nuclear factor-κB (NF-κB).485
In 2006, Kim et al. conducted the first proteomic survey of protein acetylation. They used a screen combining immunoaffinity purification of the lysine-acetylated peptides with peptide identification by nano-HPLC/mass spectrometric analysis to identify 388 acetylation sites in 195 proteins.486 In 2009, Choudhary et al. extended this approach, incorporating immunoaffinity
purification, isoelectric focusing and high-resolution mass spectrometry. They identified 3600 lysine-acetylation sites in 1750 proteins.487 With such an extensive list of acetylated proteins
involved in an array of cellular processes, acetylation has finally been established as a globally important post-translational modification.467 This abundance of interacting species and
would suffer from adverse effects. However, it is likely that these risks would be outweighed by the benefits and would therefore be tolerable, given that currently approved cancer therapeutics leave considerable room for improvement.10 Irrespective of this, tool inhibitors of the KAT2A HAT
domain will be invaluable in enabling mechanistic biological investigation and disease validation.