X-ray Structure Determination
March 2013
1 Tuesday, March 19, 13
Slides are skipped if I’m doing a PDB evaluation lecture too.
Who am I?
These are also perfectly out of phase, but they have different amplitudes.
You can work out what happens when you add 2 waves of the same frequency but different amplitudes and phases; it takes some geometry. You get a new wave of the same frequency, with a new amplitude and phase.
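The geometry can be sketched with complex numbers: each wave is treated as a phasor (an amplitude and a phase), and adding waves of the same frequency amounts to adding phasors. The function name `add_waves` and the example values are illustrative assumptions, not part of the lecture.

```python
import cmath
import math


def add_waves(a1, phi1, a2, phi2):
    """Add two waves of equal frequency, each given as (amplitude, phase in radians).

    Returns the amplitude and phase of the resulting wave.
    """
    total = cmath.rect(a1, phi1) + cmath.rect(a2, phi2)
    return abs(total), cmath.phase(total)


# Perfectly out of phase, different amplitudes (assumed example values).
amp, phase = add_waves(2.0, 0.0, 1.0, math.pi)
print(amp, phase)

# In phase, equal amplitudes.
amp_in_phase, _ = add_waves(1.0, 0.0, 1.0, 0.0)
print(amp_in_phase)
```

Out-of-phase waves of unequal amplitude only partly cancel; the sum keeps the phase of the stronger wave.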
Phase difference = 2π r·(s − s_0)/λ
When an electron is hit by an electromagnetic wave, it will oscillate with the frequency of the wave. The oscillating electron will become a source of secondary rays in all directions.
These secondary waves have the same frequency as the incident wave: Thomson scattering. The 2 waves have traveled different distances, hence the phase difference.
S = (s − s_0)/λ
Phase difference = 2π r·S
|S| = 2 sin(θ)/λ
[Diagram: incident direction s_0, scattered direction s, and the difference vector (s − s_0)]
Allow s to rotate: what is the surface of (s − s_0)?
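A minimal numerical sketch of the relations above, assuming s and s_0 are unit vectors separated by the scattering angle 2θ. The wavelength, angle and scatterer position are example values chosen for illustration only.

```python
import math


def scattering_vector(s0, s, wavelength):
    """Return S = (s - s0) / wavelength for unit vectors s0 (incident) and s (scattered)."""
    return tuple((b - a) / wavelength for a, b in zip(s0, s))


def phase_difference(r, S):
    """Return 2*pi * r.S in radians for a scatterer at position r."""
    return 2.0 * math.pi * sum(ri * Si for ri, Si in zip(r, S))


wavelength = 1.5418         # in Å; assumed example value (Cu K-alpha)
theta = math.radians(15.0)  # half the scattering angle; assumed example value
s0 = (0.0, 0.0, 1.0)
s = (math.sin(2 * theta), 0.0, math.cos(2 * theta))

S = scattering_vector(s0, s, wavelength)
S_length = math.sqrt(sum(c * c for c in S))

# The two printed values should agree: |S| = 2 sin(theta) / wavelength.
print(S_length, 2 * math.sin(theta) / wavelength)

# Phase difference for a scatterer 1 Å from the origin along x.
print(phase_difference((1.0, 0.0, 0.0), S))
```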
s_0 is fixed in direction.
Let |s| = |s_0| = 1/λ (so that S = s − s_0).
s sweeps over the surface of a sphere.
Scattering from a group of atoms
F(S) = Σ_j f_j exp(2πi r_j·S)
S is continuous
Group the electrons together into atoms: f_j is the contribution from each atom.
Group the atoms together into a molecule.
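The sum over atoms can be evaluated directly. This is a sketch under simplifying assumptions: the two-atom "molecule" is hypothetical, and each f_j is held constant, whereas real atomic scattering factors fall off with scattering angle.

```python
import cmath
import math


def structure_factor(atoms, S):
    """Evaluate F(S) = sum_j f_j * exp(2*pi*i * r_j.S).

    atoms: list of (f_j, (x, y, z)) pairs; S: (Sx, Sy, Sz) in reciprocal length units.
    """
    total = 0j
    for f_j, r_j in atoms:
        dot = sum(r * s for r, s in zip(r_j, S))
        total += f_j * cmath.exp(2j * math.pi * dot)
    return total


# Hypothetical two-atom molecule: (f_j, position in Å).
atoms = [(6.0, (0.0, 0.0, 0.0)), (8.0, (1.2, 0.0, 0.0))]

F0 = structure_factor(atoms, (0.0, 0.0, 0.0))   # all atoms scatter in phase at S = 0
F1 = structure_factor(atoms, (0.25, 0.0, 0.0))  # an arbitrary point in reciprocal space
print(abs(F0), abs(F1), cmath.phase(F1))
```

Because S is continuous, the function can be evaluated at any point in reciprocal space, not just on a lattice.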
What if there are 2?
F(S) = Σ_j f_j exp(2πi r_j·S)
F(S) = Σ_j f_j exp(2πi (r_j + δ/2)·S) + Σ_j f_j exp(2πi (r_j − δ/2)·S)
 = exp(πi δ·S) Σ_j f_j exp(2πi r_j·S) + exp(−πi δ·S) Σ_j f_j exp(2πi r_j·S)
 = (exp(πi δ·S) + exp(−πi δ·S)) Σ_j f_j exp(2πi r_j·S)
 = 2cos(π δ·S) Σ_j f_j exp(2πi r_j·S)
 = (sin(2π δ·S)/sin(π δ·S)) Σ_j f_j exp(2πi r_j·S)
What if there are N?
F(S) = Σ_j f_j exp(2πi r_j·S)
F(S) = ...
 = (sin(Nπ δ·S)/sin(π δ·S)) Σ_j f_j exp(2πi r_j·S)
Transform of one object: Σ_j f_j exp(2πi r_j·S)
Sampling function: sin(Nπ δ·S)/sin(π δ·S)
The transform of an object depends on the object (obviously).
Plot of sin(Ny)/sin(y)
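A quick numerical check of the sampling function, assuming the N copies sit at equal spacing δ, placed symmetrically about the origin, with y standing for π δ·S. The function names are illustrative.

```python
import cmath
import math


def lattice_sum(n, y):
    """Sum the phase factors of n equally spaced copies centred on the origin.

    Copy k sits at (k - (n - 1) / 2) * delta, so its phase factor is
    exp(2i * y * (k - (n - 1) / 2)) with y = pi * delta.S.
    """
    return sum(cmath.exp(2j * y * (k - (n - 1) / 2)) for k in range(n))


def sampling_function(n, y):
    """Closed form sin(n*y) / sin(y); undefined where sin(y) is zero."""
    return math.sin(n * y) / math.sin(y)


for n in (2, 5, 20):
    y = 0.3
    print(n, lattice_sum(n, y).real, sampling_function(n, y))

# Close to y = 0 the function approaches n, and the peak narrows as n grows.
print(sampling_function(20, 1e-6))
```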
Fourier Transform
Fourier Transform of single, multiple objects. Sampling as we make a 1D crystal.
This is a computer exercise that we do in the crystallography course.
So, we have an underlying transform that depends on a single object (a circle here), which then gets sampled at a spacing that depends on the separation of the objects. The shape of the sampling gets ‘sharper’ as we get more repeating objects.
Fourier Transform
Sampling as we make a 2D crystal
Crystallography is a mature science.
1830 32 point groups described
1850 the 14 (Bravais) lattices
1891 the derivation of the 230 space groups
1895 the discovery of X-rays
1912 X-ray diffraction
Why continue studying it?
Why continue studying it?
Nobel Prize
2012 Lefkowitz & Kobilka
2009 Ramakrishnan, Steitz & Yonath
2006 Kornberg
2003 MacKinnon
1997 Walker
1988 Deisenhofer, Huber, Michel
1985 Hauptman, Karle
1982 Klug
1976 Lipscomb
1964 Hodgkin
1962 Perutz, Kendrew
1962 Crick, Watson, Wilkins
X-ray crystallography is producing scientific results of great importance.
Robert Lefkowitz & Brian Kobilka - "for studies of G-protein-coupled receptors"
Ramakrishnan, Steitz & Yonath (RSY) - ribosome
Kornberg - molecular basis of eukaryotic transcription
MacKinnon - structural and mechanistic studies of ion channels
Walker - elucidation of the enzymatic mechanism underlying the synthesis of adenosine triphosphate (ATP)
Deisenhofer, Huber & Michel (DHM) - photosynthetic reaction centre
Hauptman & Karle (H&K) - direct methods
Klug - crystallographic electron microscopy and his structural elucidation of biologically important nucleic acid-protein complexes
Lipscomb - structure of boranes illuminating problems of chemical bonding
Hodgkin - for her determinations by X-ray techniques of the structures of important biochemical substances - penicillin
Perutz & Kendrew (PK) - hemoglobin & myoglobin
Crick, Watson & Wilkins (CWW) - DNA structure
Structure
Function
Applications:
Drug discovery
Designer enzymes
Structure is the key to Function. Applications are extensive.
The pictures are the active site of an enzyme, DXR, from M. tuberculosis (Henriksson et al., 2007). This enzyme is the target for an anti-malarial drug, fosmidomycin. Are there any designer enzymes that you use? Washing powder: temperature and substrates (cellulases, lipases, proteases) (Procter & Gamble).
STNV (PDB entry 2stv) was solved by Lars Liljas, Torsten Unge and me. It was the 3rd virus structure to be solved and the first to be refined, with the whole capsid in the asymmetric unit. Quite a job in 1982-3.
PDB entries by date (YYMMDD):
60046 entries on 090910
68840 entries on 101101
71516 entries on 110307
79521 entries on 120223
88325 entries on 130219
[Chart of entries by experimental method; surviving labels: Electron crystallography, Neutron diffraction, Fiber diffraction, FT-IR, SAXS; surviving percentages: 76%, 85%, 14%]
Still increasing: through 60k in autumn 2009, 70k at New Year 2011.
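The snapshot counts give a rough average growth rate. This sketch assumes the six-digit stamps on the slide are dates in YYMMDD form; the function name is illustrative.

```python
from datetime import date

# Snapshot counts from the slide, with the date stamps read as YYMMDD (an assumption).
snapshots = [
    (date(2009, 9, 10), 60046),
    (date(2010, 11, 1), 68840),
    (date(2011, 3, 7), 71516),
    (date(2012, 2, 23), 79521),
    (date(2013, 2, 19), 88325),
]


def entries_per_year(first, last):
    """Average number of new entries per year between two (date, count) snapshots."""
    (d0, n0), (d1, n1) = first, last
    return (n1 - n0) / (d1 - d0).days * 365.25


overall = entries_per_year(snapshots[0], snapshots[-1])
print(round(overall))

# Rates between consecutive snapshots.
for earlier, later in zip(snapshots, snapshots[1:]):
    print(earlier[0], later[0], round(entries_per_year(earlier, later)))
```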
PDB by country
[Ranked chart of depositions by country; only the rank labels 1-7 survive]
Sweden was beating Italy when I made this snapshot.
But Sweden is doomed to fall behind as the new superpowers ramp up their biotechnology.
How long before Sweden is overtaken by Singapore? Singapore has more crystallographers per sq. meter than any other country. These statistics are no longer easily available at the PDB.
Material
Crystals
Data Collection
Phasing
Model
Refinement
Publication
The Crystallographic Pipeline
DNA revolution ....
Nano-drops, robotics
CONSTRUCTS
AZ C-term
AZ N-term
DXR_1
DXR_2
DXR_4
DXR_3
DXR_5
DXR_6
No inhibitor so ....
cDNA CONSTRUCTS
Getting DXR crystals for drug discovery
The DXR_4 construct gave crystals, but the fosmidomycin inhibitor was not visible in the electron density map. However, we also saw no density for a few residues at the C-terminus. The next construct, DXR_6, deleted them; this material crystallized in a new space group, and we could see the anti-malarial drug bound to the M. tb enzyme.
Pictures show the manganese ion at the active site, with protein side-chains to the ion, Lys to the phosphonate of the drug, Trp from the ‘lid’, and the NADPH co-factor. The surface encloses a volume that potential new inhibitors could fill.
T-2 Phase diagram
Crystallization Phase Diagram
Make a new crystal
Enlarge a crystal - seeding
Crystal Phase Diagram.
Crystals are formed in the area labelled Nucleation
1 & 2 are vapour diffusion experiments; 2 is a crystal that has been added after equilibration. 3 is a microbatch experiment where a crystal is added to the sealed volume.
T-1 Vapor diffusion
Hanging drop, vapour diffusion experiment
Use robotic liquid handling.
Hanging drop experiment - this is not rocket science.
The precipitant is at the bottom of the beaker (& usually mixed with the drop containing the protein). Slowly, as things equilibrate, small crystals can form and grow.
Liquid handling systems have been developed to speed things up, and robotic-optical systems to monitor crystal growth.
Volumes ~300 nL
Since 2004, we have had a simple liquid handling system (Douglas Instruments Oryx) for setting up droplets. Why are small volumes good?
Mosquito robot
Crystallization Hotel & drop imager
Since end of 2012
Mosquito robot, 100 nL + 100 nL volumes. Disposable tips - no contamination. Fast: a 96 well plate takes 2 minutes.
Seeding and the phase diagram
No seeds added
Seeds added
Small crystals can also be used to ‘seed’ bigger ones: ‘Mighty oaks from little acorns grow’.
Protein Crystals
Good, Bad & Ugly results
Sometimes the drop has a mixture of classes, as in example 1 (top left), where we have very thin needles (BAD) and one big crystal that is suitable for crystallographic work (GOOD).
Home X-ray source
Radiation damage
For macromolecular crystals, crystallographers use copper anodes. Why?
The picture is the first 2D electronic detector we ever had, a multi-wire proportional counter made by an American company called Nicolet-Xentronics - Autumn 1987. Terese Bergfors and I sat on it for 2 months and solved our first structure with it on 13th Dec 1987; I missed Tonegawa’s Nobel lecture at the BMC.
The above crystal has had holes drilled through it by (more powerful) synchrotron x-ray beams.
It is not usually quite so dramatic: the crystal just loses the capacity to diffract the x-rays as the long range order breaks down. Why? Bonds broken, free radicals produced, ...
Synchrotron radiation
When high energy electrons or positrons travel around in a storage ring, they emit an enormous amount of x-radiation.
Unlike a home source, a synchrotron has a more continuous wavelength spread, so you can build stations (beam-lines) that are ‘tuneable’. The bottom pictures are insertion devices, wigglers or undulators, which are even more powerful.
The x-ray beam can be orders of magnitude more intense than home sources. Some rings are dedicated to producing just X-rays.
Brilliance & wavelength tuneability
Dedicated
ESRF
Paul Ellis (SLAC)
European Synchrotron Radiation Facility in Grenoble
Is there any point in producing such intense beams, since the crystals will just get destroyed faster? They would have been useless without freezing - liquid nitrogen temperatures.
A snap-shot of PDB depositions.
A DXR diffraction image, displayed with a computer program called Mosflm, which runs on your laptop but not your iPhone.
The x-ray camera is more or less the same as 30 years ago, but the detector is electronic, images are sucked into a computer and either processed interactively, or written to storage for taking home.
Crystals have symmetry within the 3D unit cell and this repeats through space to make up the complete lattice.
The detailed positioning of different symmetry elements that get applied to the building block of the crystal defines the so-called space group. Not all space groups are equally common.
The computer software makes suggestions for the possible space groups during the processing. N.B. macromolecules cannot be crystallized in space groups that have mirrors or inversions! Why?
The crystal building block does not necessarily correspond to a single molecule, it too can have symmetry.
BUT this symmetry is local - non-crystallographic symmetry - the insert is a ‘spherical’ virus (STNV) where the whole virus capsid is the so-called asymmetric unit. We collected the data at home, with film ...
Material
Crystals
Data Collection
Phasing
Model
Refinement
Publication
The Crystallographic Pipeline
Fourier Transform
FFT
FFT
Cat amplitudes, Horse phases?
The x-ray diffraction pattern is a sampled set of complex numbers, but we can only measure the amplitudes. How important are the phases?
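The cat/horse experiment can be imitated in one dimension with a small hand-written discrete Fourier transform. The two toy signals below (a box standing in for the cat, a ramp for the horse) and all function names are assumptions made for illustration; the slide itself uses 2D images and an FFT.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]


def idft(f):
    """Inverse of dft(); returns complex values."""
    n = len(f)
    return [sum(f[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]


def correlation(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


n = 16
cat = [1.0 if 4 <= j < 9 else 0.0 for j in range(n)]  # toy "cat": a box
horse = [float(j) for j in range(n)]                  # toy "horse": a ramp

F_cat, F_horse = dft(cat), dft(horse)

# Combine the cat amplitudes with the horse phases, term by term.
hybrid_F = [cmath.rect(abs(a), cmath.phase(b)) for a, b in zip(F_cat, F_horse)]
hybrid = idft(hybrid_F)
hybrid_real = [v.real for v in hybrid]

print("correlation with cat:  ", correlation(hybrid_real, cat))
print("correlation with horse:", correlation(hybrid_real, horse))
```

In experiments of this kind the reconstruction usually resembles the source of the phases more than the source of the amplitudes; the two printed correlations let you check whether that holds for these toy signals.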
Material
Crystals
Data Collection
Phasing
Model
Refinement
Publication
The Crystallographic Pipeline
MIR/MIRAS
MAD
SAD
MolRepl
Phasing may be via experimental techniques, or by the molecular replacement method.
• Max Perutz invented the MIR method.
• Wayne Hendrickson invented the SAD/MAD method.
Experimental methods work by introducing new scattering into the asymmetric unit, either from a heavy atom soaked into the crystal, or by modifying how a particular atom interacts with the x-rays (by changing the wavelength).
• Michael Rossmann, David Blow and Walter Hoppe are the fathers of the Molecular Replacement method. In this method, we use somebody else’s structure to solve ours.
Phasing Statistics Today
Snapshot May-July 2003
Paul Ellis, SLAC
Structural Genomics
MAD/SAD have taken over from MIR
Molecular Replacement Method
Unknown
Known
This illustrates how we use a known structure (top cat, say) to solve the structure of the unknown but related structure (an earless, tailless variant).
The success of the method depends on the similarity of the 2 objects. For macromolecules we need an RMS on CA atoms of < 1.5 Å, otherwise it gets hard.
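The RMS figure quoted above can be computed as follows. This is a minimal sketch which assumes the two structures are already superimposed and that the CA atoms are paired in order. The coordinates are invented for illustration; only the 1.5 Å threshold comes from the notes.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMS deviation in Å between two equally long lists of (x, y, z) CA positions.

    Assumes the structures are already superimposed and the atoms are paired in order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Invented coordinates: the "unknown" is the "known" shifted by 1 Å along y.
known = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
unknown = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

value = ca_rmsd(known, unknown)
print(value, "should be tractable" if value < 1.5 else "likely to be hard")
```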
Founding members of 2 protein families
Retinol Binding Protein (1984)
P2 Myelin Protein (1988)
Here are founding members of 2 families of proteins that I worked on.
Search for similarity to 1CBS with Dali
(cellular retinoic acid binding protein, scores Z-sorted to RMSD=2.0)
[Plot: RMSD axis from 0 to 3, with regions labelled "Easy to solve" and "Hard to solve"]
P2 myelin is #67
RBP is #314
~180 in family
Liisa Holm
Here is a DB search for a member of the P2 family, cRABP.
The family now has ~200 members, of which almost all have been solved by MR - like the family relationships linking the royal families of Europe in the 1800-1900’s.
The results for the search start to merge the 2 families, so RBP also pops up with other members of its family.
Dali is a structural similarity search program, available via a web server, developed by Prof. Liisa Holm.
Material
Crystals
Data Collection
Phasing
Model
Refinement
Publication
The Crystallographic Pipeline
The fun part of crystallography where you get to sample the fruits of your work. This is where I did some of the early research.
Myoglobin - the first protein structure to be solved & first atomic model, parts still exist in the Science Museum in London.
Bror Strandberg was a young Swedish post-doc who worked with John Kendrew at the end of the 1950s. Here he’s imitating an oxygen molecule. One can see the haem, and a long helix on the right.
Carbonic anhydrase, the first macromolecule to be solved in Sweden by Bror’s group. Bror came back to UU, and set up his group to do macromolecular crystallography.
Another young Swede went to Cambridge in 1961, Carl-Ivar Brändén, who returned, setting up his group at SLU. I was the first thing they shared, in 1979.
Siemens 4004
PDP-11/VG3400
Frodo
1978
VG3400
This is the 2nd thing Calle & Bror shared.
I started working on graphics in 1976 at MPI for Biochemistry in Munich
This was one of the first computer graphics systems one could buy that was powerful enough. Note the input devices. The big box is what we call a graphics card today.
The computer (DEC PDP-11) had 64 kilo words of memory, but only 32k could be accessed at a time. You can also see the disks to the left (20 cm radius), they held 1.5 M words
Cost?
This was a black and white vector drawing device. Chicken wire contours of maps.
Skeletons
An alternative map representation.
A piece of string through space
Jones & Thirup (1986)
A breakthrough paper.
Material
Crystals
Data Collection
Phasing
Model
Refinement
Publication
The Crystallographic Pipeline
Validation
The first model ALWAYS has errors or is incomplete.
Fixing these problems is called crystallographic refinement
Model Errors
What sorts of errors do crystallographers make?
Every sort that can be made!
It’s good to understand what can go wrong
Complete trace incorrect
2ry structure recognized, could be wrong direction
Incorrect connections between 2ry structure units
Rainbow colouring from N- to C-termini, red to blue. Left as published, right as corrected.
Secondary structure elements maintained
Beware if you have a 3 Å poorly phased map
Out of register error
Most common serious error
All low-resolution starting models will have this sort of error
Usually a local error; 2 errors can bring sequence back into register with the density
A structure has been published where most 2ry structure elements were out of register
Out of register errors
Out of register - most common serious error
[Structural alignment figure: one region labelled "OK", the remainder labelled "Rest are out of register"]
Main reasons for making errors
Phasing errors, Resolution (coupled)
Inexperience (can be treated)
Shortage of equipment is no longer a problem (Macbook is enough)
Most errors can be fixed during refinement, but keep the experimental map
Fourier Transform
Amplitudes
Phases
Current model
Perfect model
This illustrates how we use our current best model to phase, then look at maps to find errors or missing bits. In this case, the tail of a cat called Pelle (note he lacks ears too).
Why do we need refinement?
• The initial model tends to
  • Lack bits that should be there (missing domains, loops, side chains, ligands, water, …)
  • Contain bits that shouldn’t be there
  • Have bits that should be there but are in the wrong place
• In other words, the model contains random and systematic errors
• Hence: the initial model is almost always inaccurate and imprecise and incomplete
Model improvement
Initial model
Refinement
Quality control
Rebuilding
Final model
Biology
Retraction
Iterate these steps until the model is as complete as the data will allow and no more improvement can be obtained by further refinement (“convergence”)
Do papers have to be retracted? Some disastrous mistakes have been made.
Resolution
This illustrates that our cat becomes less clear as we get less data to make the reconstruction.
This is at the heart of resolution.
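The same effect can be shown in one dimension by discarding the high-frequency Fourier terms before reconstructing. The toy signal, the cut-off values and the function names below are assumptions for illustration; the slides use a 2D picture of a cat.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]


def truncated_reconstruction(x, k_max):
    """Rebuild x from only those Fourier terms whose frequency index is <= k_max."""
    n = len(x)
    F = dft(x)
    # min(k, n - k) is the frequency index of term k, treating k and n - k as a pair.
    kept = [F[k] if min(k, n - k) <= k_max else 0j for k in range(n)]
    return [sum(kept[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)).real / n
            for j in range(n)]


def rms_error(a, b):
    """Root-mean-square difference between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Toy "object": two sharp-edged features of different widths.
signal = [1.0 if 10 <= j < 14 or 20 <= j < 22 else 0.0 for j in range(32)]

cutoffs = (16, 8, 4, 2)
errors = [rms_error(signal, truncated_reconstruction(signal, k)) for k in cutoffs]
for k_max, err in zip(cutoffs, errors):
    print(k_max, err)
```

With every term kept the object is recovered; as the cut-off drops, the reconstruction error grows and the sharp edges blur.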
Resolution 1.8 Å
Resolution 2.5 Å
Resolution 3.0 Å
Resolution 3.5 Å
Not all PDB entries are created equal!
Resolution
Nor are the crystallographers.
Evaluating a PDB entry
Coordinates but no structure factors deposited
Coordinates and SF deposited
Coordinates are derived data; one needs SFs to calculate maps
Not all PDB entries are created equal.
Not all PDB entries have their diffraction data.
The coordinates are derived data!
A lovely structure!
Most users of the PDB are afraid of maps, so ...
Our attempt at bringing maps and crystallography to the non-specialist.
EDS (Kleywegt et al., 2004)
When do you stop?
Are you sure these are waters?
B-factors should indicate motion, but are also fudge factors
When do you stop adding waters?
Residue based goodness-of-fit to ED
Jones et al. (1991) - O
Residue-based indicators of goodness-of-fit to the map.
Almost all PDB entries have some bad bits in them.
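One widely used indicator of this kind is a real-space R-factor, which compares observed and model-calculated electron density on the grid points around each residue. The sketch below assumes the form Σ|ρ_obs − ρ_calc| / Σ|ρ_obs + ρ_calc|. The residue names and density values are made up, and real programs differ in how they choose and scale the grid points.

```python
def real_space_r(rho_obs, rho_calc):
    """Real-space R for one residue: sum|obs - calc| / sum|obs + calc| over its grid points."""
    numerator = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    denominator = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    if denominator == 0:
        raise ValueError("no density at the selected grid points")
    return numerator / denominator


# Made-up density samples (observed, calculated) for two hypothetical residues.
residues = {
    "Leu 12": ([0.9, 1.1, 1.0, 0.8], [1.0, 1.0, 1.0, 0.9]),  # model fits the map well
    "Lys 57": ([0.1, 0.0, 0.2, 0.1], [0.9, 1.0, 0.8, 1.0]),  # model sits in weak density
}

scores = {name: real_space_r(obs, calc) for name, (obs, calc) in residues.items()}
for name, score in scores.items():
    print(name, round(score, 3))
```

Low values indicate residues that agree with the map; high values flag the "bad bits" worth inspecting.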
RIP