Modern Methods of Protein Crystal Structure Determination and Structural Analysis of RNA Polymerase

T. Koval (1, 2), J. Dohnalek (2, 3), M. Dusek (2), L. Krasny (4)

1 Charles University in Prague, Faculty of Mathematics and Physics, Ke Karlovu 3, 121 16 Praha 2, Czech Republic.
2 Institute of Physics, Cukrovarnicka 10, 162 00 Praha 6, Czech Republic.
3 Institute of Macromolecular Chemistry AS CR, Heyrovskeho nam. 2, 162 06 Praha 6, Czech Republic.
4 Institute of Microbiology, Videnska 1083, 142 20 Praha 4, Czech Republic.

Abstract. X-ray structure analysis of macromolecules is an important tool used to study biological macromolecules. Here, we present a summary of crystallization and diffraction data collection methods that are applied in the process of structure determination. Then, structural characterisation of bacterial RNA polymerase, a multisubunit protein complex performing transcription of DNA into RNA, is reviewed. The aims and status of our experimental work focused on RNAp 3D structure determination are explained.

Introduction

X-ray protein structure analysis methods are relatively new and rapidly developing tools for determining the structure of biological macromolecules. The potential to provide a protein or nucleic acid structure at atomic resolution is one of the clear advantages of this method. The whole process, which leads to the final structure, consists of many steps and can last from weeks to years. Structure determination usually requires a few milligrams of protein at a concentration of the order of tens of mg ml⁻¹. Another part of the process is sample characterisation using dynamic light scattering (DLS), denaturing and native electrophoresis, mass spectrometry (such as MALDI, Matrix Assisted Laser Desorption Ionization), spectroscopic methods and other biochemical techniques. Crystallization follows, either after or in parallel with protein characterisation. After suitable crystals have been obtained, diffraction experiments take over. The last set of steps consists of the methods used to solve and refine the final structure.

In the first part of this communication, we summarize the basic theory along with modern approaches to protein crystallization and diffraction data collection. In the second part, we focus on structural characterisation of RNA polymerase and on our effort to determine the three-dimensional structure of RNAp from a gram-positive bacterium.

Crystallization

In order to apply X-ray diffraction analysis, a suitable single crystal of the studied material must be obtained. The process in which protein single crystals are formed is quite complicated and depends on many conditions, and the results cannot be predicted even with the use of modern quantum mechanical methods. Nevertheless, the main conditions influencing protein crystallization are known and thousands of successful crystallization results have been reported.

Crystallization principles and methods

In order to form crystals, protein molecules must overcome an energy barrier, as shown schematically in Figure 1. The nucleation phase lies at the top of this energy barrier, so nucleation is the critical step for further crystal growth.

Many crystallization methods have been designed to force the system into a state in which crystallization can occur. The basic principle of every method is to bring the system containing a protein slowly to a state of restricted solubility. At the beginning of the crystallization procedure, the system containing protein is in the undersaturated phase (phase diagram in Figure 2). Changes in the concentrations of protein and crystallizing agents lead the system to the nucleation phase. When crystals start to grow, the system moves to the metastable phase, in which the crystalline nuclei continue to grow but the process of nucleation stops.

Figure 1. Time dependence of the free energy of protein molecules during crystallization. Particular phases are labelled.

Figure 2. Protein crystallization phase diagram divided into four areas (phases): the undersaturated phase, where protein is dissolved; the metastable phase; the nucleation phase; and the precipitation phase, where protein is excluded from solution but in a disordered state.

One of the most common ways to lead the system through the phase diagram as described above is the vapour diffusion method. In this method, the system is composed of two parts. The first part is a reservoir containing the solution of crystallizing agents. Usually this is an aqueous solution containing salts, organic solvents, polymers, buffers, detergents and various additives. Salts, organic solvents and polymers are responsible for precipitation of protein; buffers maintain a certain pH level; detergents are sometimes used in small amounts to increase protein solubility; and additives are many different types of chemicals that can affect crystallization (for example, some additives adhere to the protein surface and help bind protein molecules together, while others have the opposite effect). The second part of the system is a small drop formed from protein solution mixed with the solution of crystallizing agents. The whole system is sealed so that it cannot exchange matter with its surroundings. In this system, molecules of solvent migrate from the drop to the reservoir, and so the concentration of protein and crystallizing agents in the drop increases. In the phase diagram, the system would be represented by a point moving up and to the right. When the system reaches the nucleation phase, crystals start growing and the concentration of protein in the solution decreases, but the concentration of crystallizing agents can still increase, so the point in the diagram moves down and to the right until it reaches the metastable phase. Some typical experimental setups are shown in Figure 3.
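The concentration change in a vapour diffusion drop can be illustrated with a small calculation. The sketch below is an idealisation and not part of the original work: it assumes that the protein stock contains no precipitant, that only water leaves the drop, that the reservoir is large enough for its concentration to stay constant, and that no protein has yet crystallized. The function and class names, and the example values (10 mg/ml protein, 2.0 M precipitant, 1:1 mixing), are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DropState:
    protein_mg_ml: float      # protein concentration in the drop
    precipitant_molar: float  # precipitant concentration in the drop
    volume_ul: float          # drop volume


def vapour_diffusion_states(protein_stock_mg_ml, protein_vol_ul,
                            reservoir_molar, reservoir_vol_ul):
    """Return (initial, equilibrium) DropState for an idealised drop.

    Assumes only water migrates and the reservoir concentration is constant.
    """
    initial_vol = protein_vol_ul + reservoir_vol_ul
    # Amounts in consistent (arbitrary) units: concentration * volume
    protein_amount = protein_stock_mg_ml * protein_vol_ul
    precipitant_amount = reservoir_molar * reservoir_vol_ul
    initial = DropState(protein_amount / initial_vol,
                        precipitant_amount / initial_vol,
                        initial_vol)
    # Water leaves until the drop precipitant concentration equals the reservoir's
    final_vol = precipitant_amount / reservoir_molar
    final = DropState(protein_amount / final_vol, reservoir_molar, final_vol)
    return initial, final


# Hypothetical 1:1 drop: 1 ul of 10 mg/ml protein + 1 ul of 2.0 M reservoir
initial, final = vapour_diffusion_states(10.0, 1.0, 2.0, 1.0)
print(initial)
print(final)
```

Under these assumptions a 1:1 drop shrinks to half its initial volume, so both protein and precipitant concentrations double, which corresponds to the movement "up and to the right" in the phase diagram before nucleation sets in.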

Diffusion through oil or gel is another commonly used type of crystallization setup. In the case of diffusion through oil, the drop is covered by silicone or paraffin oil, so molecules of solvent migrate through this barrier, which slows them down. In the case of diffusion through gel, the situation is different: not only the molecules of solvent but all types of molecules diffuse through the gel. Crystallizing agent molecules meet the molecules of protein in the gel and this may cause protein crystallization. Dialysis and batch methods are also sometimes used for protein crystallization.

A special approach, seeding, can be used to help protein molecules overcome the nucleation energy barrier. The principle lies in placing the system directly into the crystal growth (metastable) phase and adding crystalline nuclei to it. To do so, some crystals of the protein must already be available. These can be crushed, and either microscopic crystalline particles (microseeding) or small parts of crystals (macroseeding) can be used as crystallization nuclei.

Figure 3. Crystallization system for the vapour diffusion method and its most common arrangements: a) hanging drop, b) sitting drop, c) sandwich drop.

Screening and optimization

Screening for suitable conditions stands at the beginning of the crystallization process. The principle is to set up many different conditions (many different types of crystallizing agent solutions). One screening set usually contains 48 or 96 different conditions, and in principle a number of different sets can be used. In many cases, though, a single set covering a wide range of conditions is sufficient. The next step is scoring of the screening results. After a few days or weeks, the drops are inspected visually and the result for every crystallization solution is evaluated as well as possible. Based on the scoring results, optimization of the promising conditions follows. This optimization process involves changing the individual parameters of the conditions in order to obtain single crystals suitable for X-ray analysis. The main parameters which influence protein crystallization are given in Table 1.

Table 1. Summary of main conditions influencing protein crystallization.

| Variable or parameter | Influence on crystallization |
|---|---|
| Composition and concentration of precipitants | Nucleation rate, 1D-3D growth, size, crystal form and quality |
| pH and composition of buffer | Any parameters of crystals |
| Concentration of protein | Nucleation rate, 1D-3D growth, size |
| Detergents and additives | Any parameters of crystals |
| Temperature | Nucleation rate, time of growth, crystal form |
| Drop size | Time of growth, size |
| Precipitant : protein ratio | Size and quality |
| Used technique (method, set up, seeding,…) | Nucleation rate, time of growth, size and quality |
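The optimization step described above is often organised as a grid in which two parameters are varied systematically around a promising condition. The sketch below shows one possible way to lay out such a grid on a 24-well plate. It is an illustration only: the function name, the plate layout and the example values (pH 6.5 to 8.0, precipitant 14 to 24 %) are hypothetical and do not come from the screening results of this work.

```python
from itertools import product


def grid_screen(ph_values, precipitant_percent_values):
    """Map well labels (A1, A2, ...) to (pH, precipitant %) pairs.

    Rows (letters) vary pH, columns (numbers) vary precipitant concentration.
    """
    row_labels = "ABCDEFGH"
    if len(ph_values) > len(row_labels):
        raise ValueError("too many pH values for the available rows")
    wells = {}
    for (r, ph), (c, precipitant) in product(enumerate(ph_values),
                                             enumerate(precipitant_percent_values)):
        wells[f"{row_labels[r]}{c + 1}"] = (ph, precipitant)
    return wells


# Hypothetical fine grid around a hit at pH 7.0 and 20 % precipitant
plate = grid_screen([6.5, 7.0, 7.5, 8.0], [14, 16, 18, 20, 22, 24])
for well, (ph, precipitant) in plate.items():
    print(f"{well}: pH {ph}, precipitant {precipitant} %")
```

Other parameters from Table 1, such as protein concentration or temperature, could be varied in the same way by replacing one of the two axes.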

Diffraction data collection

X-ray structure analysis of protein single crystals has many unique characteristics and differs from analysis of “small” molecules. The main difference is in the number of atoms in a molecule and the size of the crystal unit cell. Unit cell dimensions of protein crystals are larger, so diffraction spots lie closer together on a diffraction image and the intensity of diffraction is much lower. In general, a protein diffraction image contains a much higher number of spots than one from a crystal of, e.g., a small organic molecule. Spot spacing also depends on wavelength, and therefore longer X-ray wavelengths are used in protein crystallography in order to separate the spots on a diffraction image.
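The dependence of spot spacing on unit cell size and wavelength can be estimated with a short calculation. The sketch below uses the small-angle approximation, in which neighbouring reflections along a cell axis of length a are separated on a flat detector at distance D by roughly D·λ/a. The function name and the example values (a 100 Å protein cell, a 10 Å small-molecule cell, a 100 mm detector distance) are assumptions for illustration; the wavelengths are the approximate Kα values for copper and molybdenum.

```python
CU_K_ALPHA_A = 1.5418  # approximate Cu K-alpha wavelength in angstroms
MO_K_ALPHA_A = 0.7107  # approximate Mo K-alpha wavelength in angstroms


def spot_spacing_mm(cell_length_A, wavelength_A, detector_distance_mm):
    """Approximate spacing of neighbouring spots near the beam centre.

    Small-angle approximation: spacing = D * lambda / a.
    """
    return detector_distance_mm * wavelength_A / cell_length_A


for label, cell in (("protein, a = 100 A", 100.0), ("small molecule, a = 10 A", 10.0)):
    for source, wavelength in (("Cu", CU_K_ALPHA_A), ("Mo", MO_K_ALPHA_A)):
        spacing = spot_spacing_mm(cell, wavelength, 100.0)
        print(f"{label}, {source}: {spacing:.2f} mm")
```

For the same cell and detector distance, the copper wavelength gives roughly twice the spot separation of molybdenum, which is why the longer wavelength is preferred for large unit cells.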

Experimental setup, crystal testing and data collection

For protein X-ray analysis in a “home” laboratory, one-axis oscillation in combination with a Cu X-ray source and an imaging plate or CCD detector is the most common experimental setup. Crystals are usually kept at cryogenic temperatures (to minimize radiation damage), but for some experiments crystals can be irradiated at room temperature. The first step of an experiment is crystal testing, which determines whether a crystal is suitable for diffraction data collection or even establishes the identity of the crystalline material (e.g. whether the crystals are salt rather than protein). In summary, protein crystals suitable for X-ray analysis must be of a proper size, with mosaicity not greatly exceeding 1° and preferably without twinning. Overlapping diffraction spots, ice diffraction and sometimes also a low diffraction limit are other common obstacles in protein diffraction data collection. These problems are mostly overcome by testing a number of different crystals of the same protein in order to find the most suitable sample for the experiment.

After the test for crystal diffraction, usually some information about space group, unit cell, diffraction limit, intensity and quality of diffraction image is known so that a data collection experiment can be properly planned. Exposure times, detector distance, and goniometer angles for data collection must be set in order to obtain the best possible completeness of the data with the highest possible diffraction limit and with adequate data redundancy. The fact that protein crystals are sensitive to X-ray radiation must be taken into consideration during experiment planning and sample condition should be regularly checked during data collection.
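Part of this planning can be sketched numerically. A commonly quoted rule of thumb for avoiding overlapping spots estimates the largest usable oscillation width per image as Δφ = (180/π)·(d/a) − η, where d is the diffraction limit, a is the longest cell dimension along the beam and η is the mosaicity. The sketch below applies this estimate and counts the images needed for a given rotation range. The function names and the example values (2.0 Å limit, 100 Å cell, 0.5° mosaicity, 60 s exposure) are hypothetical and serve only as an illustration.

```python
import math


def max_oscillation_deg(d_min_A, cell_length_A, mosaicity_deg):
    """Rule-of-thumb upper limit of the oscillation width per image (degrees)."""
    return math.degrees(d_min_A / cell_length_A) - mosaicity_deg


def plan_collection(total_range_deg, oscillation_deg, exposure_s):
    """Return (number of images, total exposure time in seconds)."""
    if oscillation_deg <= 0:
        raise ValueError("oscillation width must be positive")
    # round() guards against floating-point noise before taking the ceiling
    n_images = math.ceil(round(total_range_deg / oscillation_deg, 9))
    return n_images, n_images * exposure_s


limit = max_oscillation_deg(2.0, 100.0, 0.5)
n_images, total_time = plan_collection(180.0, 0.5, 60.0)
print(f"maximum oscillation width: {limit:.2f} deg")
print(f"{n_images} images, {total_time / 3600:.1f} h total exposure")
```

The total exposure time obtained this way gives a first indication of how much radiation the crystal must tolerate, which is one reason why the sample condition has to be monitored during the experiment.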

Advantages of synchrotron radiation

Use of synchrotron radiation in X-ray structure analysis of macromolecules has become common in the past two decades. High intensity and small divergence of the beam and a high level of radiation polarization are some of the advantages of synchrotron radiation. For protein structure solution (which in most cases cannot be done by direct methods), the tuneable wavelength of the X-ray beam is an unmatched advantage. Synchrotron radiation can also be used for data collection on large and complicated structures such as ribosomes or viral particles.

Experiments at the Institute of Physics, AS CR

For crystal testing and data collection from well diffracting crystals, an in-house X-ray diffractometer installed at the Institute of Physics in Prague is used. The Gemini Enhanced Ultra diffractometer with the Atlas CCD detector (Oxford Diffraction) is designed for “low molecular weight” crystallography as well as for macromolecular experiments and offers two different wavelengths of X-ray radiation (copper and molybdenum anodes). For protein diffraction, a Cu sealed tube with multi-layer optics (intensity comparable to a rotating anode) is used as the X-ray source, together with a sensitive single-chip CCD detector with a 135 mm front area diagonal. The combination of these devices gives recordable intensity of X-ray diffraction and good quality of the resulting data, at least comparable with rotating anode in-house protein diffractometers. Another significant advantage is the four-circle kappa goniometer, which allows almost any adjustment of crystal orientation in the beam during the experiment and also special scans of reciprocal space. A comparison of the one-axis oscillation platform and the four-circle kappa platform is shown in Figure 4.

Bacterial RNA polymerase

Ribonucleic acid polymerase (RNAp) is a multisubunit protein complex performing transcription of DNA into RNA in cells. From the biochemical point of view, RNAp is a nucleotidyl transferase that polymerizes ribonucleotides at the 3' end of a newly built RNA molecule. In bacteria, there is only one type of RNAp, which synthesizes all types of RNA (mRNA, tRNA, sRNA and rRNA). Structures of the bacterial core enzyme and of the holoenzyme have been determined only for T. thermophilus [5, 8-11] and T. aquaticus [3, 4, 6, 7] (both are gram-negative bacteria and extremophiles), but it can be assumed that the structure and function of RNAp from other types of bacteria share some of the fundamental features.

Structure and function

The conserved composition of the core enzyme is α2, β, β’, ω. The two α subunits (αI, αII, each of a molecular weight of about 40 kDa) have a structural function. They can also recognize regulatory factors, and the C-terminal domains of the α subunits bind to the promoter UP element (this interaction increases the affinity of RNAp to the promoter). The two largest subunits, β and β’ (~150 kDa each), bind to the α dimer and form the active site. The ω subunit (about 10 kDa) is a part of the core enzyme, too. The position of the ω subunit is clear from the structures [3]–[11], but its biological role is not fully understood. Gram-positive bacteria, unlike gram-negative bacteria, contain two additional subunits, ω2 and δ. The ω2 subunit (MW ~8 kDa) is still uncharacterized and virtually nothing is known about its position on RNAp and its role in transcription or in the cell. Also very little is known about the biological role of the δ subunit (MW 21 kDa). There is some evidence that the δ subunit enhances promoter selectivity [1], [2]. The holoenzyme contains yet another subunit, the σ factor. Many different σ factors exist in the cell. These factors provide the core RNAp with affinity to certain promoter regions (recognition of promoters). The σ factor for housekeeping genes is σ70 in E. coli (MW ~70 kDa) and σA in B. subtilis (MW ~43 kDa). The structure of the core enzyme or the holoenzyme from a gram-positive bacterium has not yet been determined, and so the exact structure and position of the σ factor and of the δ and ω2 subunits are not known. A summary of the known structures of bacterial RNAp to date is given in Table 2.

Figure 4. Comparison of the common platforms used for X-ray diffraction analysis of proteins: a) one-axis oscillation, b) four-circle kappa goniometer.

Table 2. Summary of structures of the core enzyme and holoenzyme of bacterial RNAp.

Only structures of some subunits or subunit fragments exist for RNAp from non-thermophilic bacteria: E. coli (G.-neg.): [12-14]; A. aeolicus (G.-neg.): structure of a part of σ subunit [15, 16]; B. subtilis (G.-pos.): part of α subunit [17]; M. tuberculosis (G.-pos.): structure of a part of σ subunit [18]; S. coelicolor (G.-pos.): structure of a part of σ subunit [19]. PDB ID is the access code of the Protein Data Bank record, Res. is resolution.

| PDB ID | Subunits | Res. [Å] | Source organism | Conclusions |
|---|---|---|---|---|
| 1hqm [3] | α1α2ββ’ω | 3.3 | T. aquaticus (G.-neg.) | Bact. RNAp sub. ω and euk. RNAp sub. RPB6 are seq., struct., and funct. homologs. |
| 1i6v [4] | α1α2ββ’ω | 3.3 | T. aquaticus (G.-neg.) | Structural mechanism for rifampicin inhibition of bacterial RNAp. |
| 1iw7 [5] | α1α2ββ’ωσ | 2.6 | T. thermophilus (G.-neg.) | Crystal structure of a bacterial RNAp holoenzyme. |
| 1l9u [6] | α1α2ββ’ωσ | 4.0 | T. aquaticus (G.-neg.) | Crystal structure of a bacterial RNAp holoenzyme. |
| 1ynj [7] | α1α2ββ’ω | 3.2 | T. aquaticus (G.-neg.) | Structural mechanism for sorangicin inhibition of bacterial RNAp. |
| 1ynn [7] | α1α2ββ’ω | 3.3 | T. aquaticus (G.-neg.) | Structural mechanism for rifampicin inhibition of bacterial RNAp. |
| 1zyr [8] | α1α2ββ’ωσ | 3.0 | T. thermophilus (G.-neg.) | Structural mechanism for streptolydigin inhibition of bacterial RNAp. |
| 2a68 [9] | α1α2ββ’ωσ | 2.5 | T. thermophilus (G.-neg.) | Structural mechanism for rifabutin inhibition of bacterial RNAp. |
| 2a69 [9] | α1α2ββ’ωσ | 2.5 | T. thermophilus (G.-neg.) | Structural mechanism for rifapentin inhibition of bacterial RNAp. |
| 2a6e [9] | α1α2ββ’ωσ | 2.8 | T. thermophilus (G.-neg.) | Crystal structure of a bacterial RNAp holoenzyme. |
| 2a6h [10] | α1α2ββ’ωσ | 2.4 | T. thermophilus (G.-neg.) | Structural mechanism for streptolydigin inhibition of bacterial RNAp. |
| 2cw0 [8] | α1α2ββ’ | 3.3 | T. thermophilus (G.-neg.) | Crystal structure of a bacterial RNAp holoenzyme. |
| 2o5i [11] | α1α2ββ’ω | 2.5 | T. thermophilus (G.-neg.) | Crystal structure of RNAp elongation complex. |
| 2o5j [11] | α1α2ββ’ω | 3.0 | T. thermophilus (G.-neg.) | Crystal structure of RNAp elong. complex with the NTP substrate. |

Transcription

The first phase of transcription is initiation. At the beginning, RNAp recognizes a promoter region (40 bp) via its σ factor and binds to it. After binding, some conformational changes occur, and RNAp unwinds the DNA to form the transcription bubble, which consists of a ~13 bp region of melted DNA. The second phase is elongation of the RNA transcript. RNAp adds new nucleotides to the 3’ end of the newly built RNA; the σ factor leaves the core enzyme during elongation in a stochastic manner. The last phase is called termination of transcription, and it is mediated either by a hairpin structure in the nascent RNA that dissociates RNAp from its template or by the Rho factor (helicase).

Figure 5. Composition and function of bacterial RNAp. a) Structure of the core enzyme from T. aquaticus [3]. b) Crystal structure of the RNAp elongation complex from T. thermophilus [11].

Status of the experimental work

Several research groups from three institutes are currently collaborating to determine the structure of the RNAp holoenzyme or its parts from a Gram-positive bacterium. We are interested in the position and function of the ω/ω2 and σ/δ subunits and also in structural differences in comparison with RNAp from Gram-negative bacteria.

The core enzyme of this bacterial RNAp has been partially characterised. The purified complex consists of appropriate domains as judged by SDS-PAGE, and initial results of our DLS experiments clearly indicate the presence of the whole complex of core RNAp in solution (particles with an average diameter of 23 nm). Initial crystallization screening experiments have provided interesting hits for further optimization of crystallization conditions. Establishment of a reliable crystallization protocol will not only be an essential prerequisite for successful determination of the structure, but will also provide a starting platform for structural analysis of various ligands interacting with this type of RNAp.
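For orientation, particle sizes reported by DLS are usually derived from the measured translational diffusion coefficient through the Stokes-Einstein relation, d = k_B·T / (3π·η·D), which treats the particle as a sphere. The sketch below applies this relation in both directions. It is only an illustration: the temperature (293.15 K) and viscosity (that of water at about 20 °C) are assumed values, not the conditions of the measurements reported here, and the function names are hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant
WATER_VISCOSITY_PA_S = 1.002e-3   # assumed: water at about 20 degrees C


def hydrodynamic_diameter_m(diffusion_m2_s, temperature_K=293.15,
                            viscosity_Pa_s=WATER_VISCOSITY_PA_S):
    """Stokes-Einstein diameter of a spherical particle."""
    return BOLTZMANN_J_PER_K * temperature_K / (
        3.0 * math.pi * viscosity_Pa_s * diffusion_m2_s)


def diffusion_coefficient_m2_s(diameter_m, temperature_K=293.15,
                               viscosity_Pa_s=WATER_VISCOSITY_PA_S):
    """Diffusion coefficient expected for a sphere of the given diameter."""
    return BOLTZMANN_J_PER_K * temperature_K / (
        3.0 * math.pi * viscosity_Pa_s * diameter_m)


# Diffusion coefficient that would correspond to a 23 nm particle
d_coeff = diffusion_coefficient_m2_s(23e-9)
print(f"D = {d_coeff:.3e} m^2/s")
print(f"diameter = {hydrodynamic_diameter_m(d_coeff) * 1e9:.1f} nm")
```

Because the relation assumes a sphere, the diameter obtained for an elongated or flexible complex is an effective hydrodynamic size and not a direct measure of its dimensions.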

Acknowledgments. The work on this project is supported by the Grant Agency of the Academy of Sciences of the Czech Republic, project No. IAA500500701 (to JD) and by the Ministry of Education, Youth and Sports, and by TRIOS, project No. 2B06065 (to LK).

References

[1] Achberger, E.C., Hilton, M.D., Whiteley, H.R. (1982) The effect of the delta subunit on the interaction of Bacillus subtilis RNA polymerase with bases in a SP82 early gene promoter. Nucleic Acids Res. 10: 2893–2910.

[2] Lampe, M., Binnie, C., Schmidt, R., Losick, R. (1988) Cloned gene encoding the delta subunit of Bacillus subtilis RNA polymerase. Gene 67: 13–19.

[3] Minakhin, L., Bhagat, S., Brunning, A., Campbell, E.A., Darst, S.A., Ebright, R.H., Severinov, K. (2001) Bacterial RNA polymerase subunit omega and eukaryotic RNA polymerase subunit RPB6 are sequence, structural, and functional homologs and promote RNA polymerase assembly. Proc.Natl.Acad.Sci.USA 98: 892–897.

[4] Campbell, E.A., Korzheva, N., Mustaev, A., Murakami, K., Nair, S., Goldfarb, A., Darst, S.A. (2001) Structural mechanism for rifampicin inhibition of bacterial RNA polymerase. Cell 104: 901–912.

[5] Vassylyev, D.G., Sekine, S., Laptenko, O., Lee, J., Vassylyeva, M.N., Borukhov, S., Yokoyama, S. (2002) Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 Å resolution. Nature 417: 712–719.

[6] Murakami, K.S., Masuda, S., Darst, S.A. (2002) Structural basis of transcription initiation: RNA polymerase holoenzyme at 4 A resolution. Science 296: 1280–1284.

[7] Campbell, E.A., Pavlova, O., Zenkin, N., Leon, F., Irschik, H., Jansen, R., Severinov, K., Darst, S.A. (2005) Structural, functional, and genetic analysis of sorangicin inhibition of bacterial RNA polymerase. EMBO J. 24: 674–682.

[8] Tuske, S., Sarafianos, S.G., Wang, X., Hudson, B., Sineva, E., Mukhopadhyay, J., Birktoft, J.J., Leroy, O., Ismail, S., Clark, A.D., Dharia, C., Napoli, A., Laptenko, O., Lee, J., Borukhov, S., Ebright, R.H., Arnold, E. (2005) Inhibition of bacterial RNA polymerase by streptolydigin: stabilization of a straight-bridge-helix active-center conformation. Cell 122: 541–552.

[9] Artsimovitch, I., Vassylyeva, M.N., Svetlov, D., Svetlov, V., Perederina, A., Igarashi, N., Matsugaki, N., Wakatsuki, S., Tahirov, T.H., Vassylyev, D.G. (2005) Allosteric modulation of the RNA polymerase catalytic reaction is an essential component of transcription control by rifamycins. Cell 122: 351–363.

[10] Temiakov, D., Zenkin, N., Vassylyeva, M.N., Perederina, A., Tahirov, T.H., Kashkina, E., Savkina, M., Zorov, S., Nikiforov, V., Igarashi, N., Matsugaki, N., Wakatsuki, S., Severinov, K., Vassylyev, D.G. (2005) Structural basis of transcription inhibition by antibiotic streptolydigin. Mol.Cell 19: 655–666.

[11] Vassylyev, D.G., Vassylyeva, M.N., Perederina, A., Tahirov, T.H., Artsimovitch, I. (2007) Structural basis for transcription elongation by bacterial RNA polymerase. Nature 448: 157–162.

[12] Zhang, G., Darst, S.A. (1998) Structure of the Escherichia coli RNA polymerase alpha subunit amino-terminal domain. Science 281: 262–266.

[13] Chlenov, M., Masuda, S., Murakami, K.S., Nikiforov, V., Darst, S.A., Mustaev, A. (2005) Structure and Function of Lineage-specific Sequence Insertions in the Bacterial RNA Polymerase beta' Subunit. J.Mol.Biol. 353: 138–154.

[14] Patikoglou, G.A., Westblade, L.F., Campbell, E.A., Lamour, V., Lane, W.J., Darst, S.A. (2007) Crystal structure of the Escherichia coli regulator of sigma70, Rsd, in complex with sigma70 domain 4. J.Mol.Biol. 372: 649–659.

[15] Sorenson, M.K., Ray, S.S., Darst, S.A. (2004) Crystal structure of the flagellar sigma/anti-sigma complex sigma(28)/FlgM reveals an intact sigma factor in an inactive conformation. Mol. Cell 14: 127–138.

[16] Doucleff, M., Malak, L.T., Pelton, J.G., Wemmer, D.E. (2005) The C-terminal RpoN domain of sigma54 forms an unpredicted helix-turn-helix motif similar to domains of sigma70. J.Biol.Chem. 280: 41530–41536.

[17] Newberry, K.J., Nakano, S., Zuber, P., Brennan, R.G. (2005) Crystal structure of the Bacillus subtilis anti-alpha, global transcriptional regulator, Spx, in complex with the alpha C-terminal domain of RNA polymerase. Proc.Natl.Acad.Sci.USA 102: 15839–15844.

[18] Thakur, K.G., Joshi, A.M., Gopal, B. (2007) Structural and biophysical studies on two promoter recognition domains of the extra-cytoplasmic function sigma factor sigma(C) from Mycobacterium tuberculosis. J.Biol.Chem. 282: 4711–4718.

[19] Li, W., Stevenson, C.E., Burton, N., Jakimowicz, P., Paget, M.S., Buttner, M.J., Lawson, D.M., Kleanthous, C. (2002) Identification and structure of the anti-sigma factor-binding domain of the disulphide-stress regulated sigma factor sigma(R) from Streptomyces coelicolor. J.Mol.Biol. 323: 225–236.