• No results found

Phasing & Substructure Solution

N/A
N/A
Protected

Academic year: 2021

Share "Phasing & Substructure Solution"

Copied!
50
0
0

Loading.... (view fulltext now)

Full text

(1)

CCP4 Workshop Chicago 2011

Phasing & Substructure Solution

Tim Grüne

Dept. of Structural Chemistry, University of Göttingen June 2011

http://shelx.uni-ac.gwdg.de

(2)

Data Integration (Anomalous)

Differences SubstructureSolution Phasing & DensityModification RefinementBuilding & mosflm, xds, HKL2000,. . . xprep, shelxc, solve,. . . shelxd, SnB, HySS,. . .

shelxe, resolve, pi-rate,. . .

arp/warp, coot, ref-mac5,phenix, YOU!

Molecular Replacement

(3)

An electron density map is required to create a crystallographic model. The electron density is calculated from the complex structure factor F(hkl):

ρ(x, y, z) = 1 Vcell

X

h,k,l

|F(h, k, l)|eiφ(h,k,l)e−2πi(hx+ky+lz) (1)

Experiment measures: I(hkl) i.e. real numbers

Model Building requires: F(hkl) i.e. complex numbers:

(4)

Motivation: The Phase Problem

The phase problem is one of the critical steps in macromolecular crystallography:

A diffraction experiment only delivers the amplitude |F(hkl)|, but not the phase φ(hkl) of the structure factor. Without phases, a map cannot be calculated and no model can be built.

(5)

Solutions to the Phase Problem

The main methods to overcome the phase problem in macromolecular crystallography (i.e. to obtain initial

phases):

• (multi/ single) wavelength anomalous dispersion (MAD, SAD)

• (multiple/ single) isomorphous replacement (MIR, SIR)

• molecular replacement (MR)

and combinations thereof (SIRAS, MR-SAD, . . . ).

The first two methods are so-called experimental phasing methods. They require the solution of a substructure.

(6)

~a

~b ~c

The Substructure of a (crystal-) structure are the coordi-nates of a subset of atoms within the same unit cell.

(7)

~a

~b ~c

The Substructure of a (crystal-) structure are the coordi-nates of a subset of atoms within the same unit cell.

It can be any part of the actual structure. In the usual sense, substructure refers to the marker atoms that are used for phasing.

A real crystal with the properties of the substructure (atom distribution vs. unit cell dimensions) cannot exist: the atoms are too far apart for a stable crystal.

(8)

Small Molecule Crystallography

The substructure consists of a small number of atoms in a large unit cell.

In small molecule crystallography, the phase problem is usually easily solved:

A structure with not too many atoms (< 2000 non-hydrogen atoms) can be solved by means of ab initio methods

(9)

The Terms “Ab initio” and “Direct” Methods

ab initio Methods: phase determination directly from amplitudes, without prior knowledge of any atomic posi-tions. Includes direct methods and the Patterson method. The most widely used of all ab initio methods are:

Direct Methods: phase determination using probabilistic phase relations — usually the tangent formula (Nobel prize for H. A. Hauptman and J. Karle in Chemistry, 1985).

(10)

The Substructure

Central to phasing based on both anomalous dispersion and isomorphous replacement is the determination of the substructure.

The steps involve

1. Collect data set(s)

2. Create an artificial substructure data set

3. Determine the substructure coordinates with direct methods 4. Calculate phases (estimates) for the actual data

(11)

What about Sheldrick’s Rule?

The Sheldrick’s Rule says that direct methods only work when the data resolution is 1.2 Å or better. Macromolecules seldomly diffract to this resolution.

(12)

Sheldrick’s Rule

Sheldrick’s Rule refers the real structures and the 1.2 Å-limit is about the distance where single atoms can be resolved in the data.

The substructure is an artificial crystal and atom distances

within the substructure are well above the usual diffraction limits of macromolecules, even for 4 Å data.

(13)

Solving a Macromolecular Phase Problem with a Substructure

The substructure represents only a very small fraction of all atoms in the unit cell (one substructure atom per a few thousand atoms). Yet, knowing the coordinates of the substructure allows to solve the phase problem for the full structure.

(14)

The Independent Atom Model (IAM)

“Ordinary” crystallography (other than ultra-high resolution charge density studies) is based on the “Independent Atom Model”: The total structure factor F(hkl) of each reflection (hkl) is the sum the independent contributions

from each atom in the unit cell.

(x, y, z)

x y

(hkl) = (430)

φ = 2π(hx + ky + lz)

= rel. distance to Miller-plane

Re(F) Im(F)

(15)

The Independent Atom Model (IAM)

x y

(hkl) = (430)

The (vectorial) sum from each atom contribution yields the total scattering factor F(hkl). The experiment delivers only its modulus |F(hkl)| =

q

(16)

Isomorphous Replacement (1/2)

The Harker Construction represents one way to determine the phase of F(hkl) from an isomorphous replace-ment experireplace-ment. It requires two data sets:

q

Inat(hkl) qIder(hkl)

Native: lacking one or several atoms (the sub-structure)

Derivative: Same unit cell, same atoms plus substructure atoms

(17)

Isomorphous Replacement (2/2)

The substructure coordinates can derived from the difference data set (qIder(hkl) − qInat(hkl)). The substructure phases φ(hkl) can be calculated from the substructure coordinates.

The phases of the native (or derivative) data set can be derived from

measured native intensities Inat(hkl)

measured derivative intensities Ider(hkl)

(18)

Harker Construction (1/4)

Im(F(hkl))

Re(F(hkl))

(19)

Harker Construction (2/4)

q

Inat(hkl)

Draw the structure factor F(hkl) of the substructure.

Fder(hkl) points somewhere on the circle around the tip

of F(hkl) with radius qInat(hkl)

because

(20)

Harker Construction (3/4)

Draw the structure factor F(hkl) of the substructure.

Fder(hkl) points somewhere on the circle around the tip

of F(hkl) with radius qInat(hkl)

At the same time, Fder(hkl) also lies somewhere on the circle around the origin with radius qIder(hkl), because

(21)

Harker Construction (4/4)

The two intersections of the two circles are the two possible structure factors Fder(hkl).

How to select the correct one:

MIR a second Harker construction with |Fder-2(hkl)| has exactly one of the two possibilities in common with the first Harker construction.

SIR the mean phase of the two possibilities serve as starting point for density modification, but if the substructure coordinates are centrosymmetric the

two-fold ambiguity cannot always be resolved (this problem is specific to SIR).

(22)

Substructure Contribution to Data

The electron density ρ(x, y, z) can be calculated from the (complex) structure factors F(hkl). Inversely the structure factor amplitudes can be calculated once the atom content of the unit cell is known:

F(hkl) = atoms j X in unit cell fjhkl)e−Bj sin2θhkl λ2 e2πi(hxj+kyj+lzj) (2)

• fj atomic scattering factor: depends on atom type and scattering angle θ. Values tabulated in [3], Tab. 6.1.1.1 • Bj atom displacement parameter, ADP: reflects vibrational motion of atoms

• 2π(hxj + kyj + lzj) phase shift: relative distance from origin • (xj, yj, zj): fractional coordinates of jth atom

(23)

Extraction of Substructure Contribution

Experimental determination (i.e. other than by molecular replacement) of the contribution of the substructure to the measured intensities is achieved by

Anomalous Dispersion: Deviation from Friedel’s Law and comparison of

|F(hkl)| with |F(¯h¯k¯l)|

Isomorphous replacement: Changing the intensities without changing the unit cell and com-parison of

|Fnat| with |Fder|

The alteration is usually achieved by the introduction of some heavy atom type into the crystal, either by soaking or by co-crystallisation.

(24)

Anomalous Scattering

The atomic scattering factors fj(θ) describe the reaction of an atom’s electrons to X-rays.

They are different for each atom type (C, N, P,. . . ), but normally they are real numbers and do not vary signifi-cantly when changing the wavelength λ.

If, for one atom type, the wavelength λ matches the transition energy of one of the electron shells, anomalous scattering occurs.

This behaviour can be described by splitting fj(θ) into three parts:

fjanom = fj(θ) + fj0(λ) + ifj00(λ)

(25)

SAD and MAD the following “grouping” has turned out to be useful, with its phase diagram on the right (ADP

e−8π 2U

j

sin2θhkl

λ2 omitted for clarity):

F(hkl) = non-X substructure fµe2πihrµ | {z } FP + substructure X normal fνe2πihrν | {z } FA + substructure X anomalous fν0e2πihrν | {z } F0 Im(F) FP FA F0 iF00 F

(26)

Since the equation for (Eq. 2) is a “simple” sum, one can group it into sub-sums. In the case of SAD and MAD the following “grouping” has turned out to be useful, with its phase diagram on the right (ADP

e−8π 2U

j

sin2θhkl

λ2 omitted for clarity):

F(hkl) = non-X substructure fµe2πihrµ | {z } FP + substructure X normal fνe2πihrν | {z } FA + substructure X anomalous fν0e2πihrν | {z } F0 +i substructure X f00e2πihrν Im(F) FP FA F0 iF00 F

Normal presentation, because f0 usu-ally negative

(27)

Now compare F(hkl) with F(¯h¯k¯l) in the presence of anomalous scattering: F(¯h¯k¯l) = non-X substructure fµe2πi(−h)rµ | {z } FP− + substructure X normal fνe2πi(−h)rν | {z } FA− + substructure X anomalous fν0e2πi(−h)rν | {z } F0− substructure 00 2πi(−h)r Im(F) Re(F) FP+ FA+ F0+

Friedel’s Law is valid for the fµ, fν, and fν0 parts:

(28)

Now compare F(hkl) with F(¯h¯k¯l) in the presence of anomalous scattering: F(¯h¯k¯l) = non-X substructure fµe2πi(−h)rµ | {z } FP− + substructure X normal fνe2πi(−h)rν | {z } FA− + substructure X anomalous fν0e2πi(−h)rν | {z } F0− +i substructure X anomalous fν00e2πi(−h)rν Im(F) Re(F) FP+ FA+ F0+ iF00+ F+

The complex contribution of ifν00 violates Friedel’s Law:

FP

F−

iF00−

F−

(29)

Karle (1980) and Hendrickson, Smith, Sheriff (1985) published the following formula∗: |F+|2 = |FT|2 + a|FA|2 + b|FA||FT| + c|FA||FT|sinα |F−|2 = |FT|2 + a|FA|2 + b|FA||FT| − c|FA||FT|sinα Im(F) Re(F) α FT+ FA+ FP+ F+

with FT = FP + FA the non-anomalous contribution of structure and substructure.

(30)

Combining Theory and Experiment

The diffraction experiment measures the Bijvoet pairs |F+| and |F−| for many reflections. In order to “simulate” a small-molecule experiment for the substructure, we must know |FA|.

With the approximation 12(|F+| + |F−|) ≈ |FT| the difference between the two equations above yields

(31)

Status Quo – A Summary

• Our experiment measures |F+| = q

I(hkl) and |F−| = q

I(¯h¯k¯l).

• We are looking for (an estimate) of |FA| for all measured reflections.

These |FA| mimic a small-molecule data set from the substructure alone.

The small-molecule data set can be solved with direct methods, even at moderate resolution.

• With the help of Karle and Hendrickson, Smith, Sheriff, we already derived

|F+| − |F−| ≈ c|FA|sinα

(32)

The Factor

c

Before we know |FA|, we must “get rid of” c and sin α.

c = 2f00(λ)

f(θ(hkl)) can be calculated for every reflection provided f

00(λ).

During a MAD experiment, f00(λ) is usually measured by a fluorescence scan.

To avoid the dependency on f00(λ), the program shelxd [1] uses normalised structure factor amplitudes

(33)

The Angle

α

In the case of MAD, multi-wavelength anomalous dispersion, we can calculate sinα because there is one equation for each wavelength.

In the case of SAD, the program shelxd approximates |FA|sin(α) ≈ |FA|. Why is this justified?

Bijvoet pairs with a strong anomalous difference (

|F

+| − |F|

) have greater impact in direct methods. The

difference is large, however, when α is close to 90◦ or 270◦, i.e. when sin(α) ≈ ±1. This coarse approximation

(34)

The Angle

α

In the case of MAD, multi-wavelength anomalous dispersion, we can calculate sinα because there is one equation for each wavelength.

In the case of SAD, the program shelxd approximates |FA|sin(α) ≈ |FA|. Why is this justified?

Bijvoet pairs with a strong anomalous difference ( |F

+| − |F|

) have greater impact in direct methods. The

difference is large, however, when α is close to 90◦ or 270◦, i.e. when sin(α) ≈ ±1. This coarse approximation

(35)

In order to find the wavelength with the strongest response of the anomalous scatterer, the values of f0 and f00

are determined from a fluorescence scan.

−10 −8 −6 −4 −2 0 2 4 Signal [e] f’ and f’’ for Br λ = 1Å * 12398 eV/E Br f" Br f’

In order to get the strongest contrast in the differ-ent data sets, MAD experimdiffer-ents collect data at

1. maximum for f00 (peak wavelength) 2. minimum for f0 (inflection point)

(36)

Single isomorphous replacement requires two data sets:

1. Macromolecule in absence of heavy metal (Au, Pt, Hg, U,. . . ), called the native data set.

Measures |FP|.

2. Macromolecule in presence of heavy metal, called the deriva-tive data set.

Measures |F|.

|FA| can then be estimated from the difference |F| − |FP|

Im(F(hkl))

F

FA FP

(37)

SAD and MAD

vs.

SIR and MIR

SAD/ MAD SIR/ MIR

• |FA| from |F+| vs. |F−| (one data set) • |FA| from native vs. derivative (two data sets)

•signal improved by wavelength selection •signal improved by heavy atom (large atomic num-ber Z)

•requires tunable wavelength (synchrotron) •independent from wavelength (inhouse data)

(38)

SIRAS (1/2)

Isomorphous replacement is hardly used for phasing anymore:

• Incorporation of heavy atom into crystal often alters the unit cell ⇒ non-isomorphism between native and derivative ruins usability of data sets

(39)

SIRAS (2/2)

However, one often has a high-resolution data set from a native crystal and one (lower resolution) data set with anomalous signal.

They can be combined into a SIRAS experiment:

• Anomalous data set: SAD

• Anomalous data set as derivative vs. native: SIR

(40)
(41)

set from a crystal with exactly the same (large) unit cell as our actual macromolecule but with only very few atoms inside.

(42)

Normalised Structure Factors

Experience shows that direct methods produce better results if, instead of the normal structure factor F(hkl),

the normalised structure factor is used.

The normalised structure factor is calculated as E(hkl)2 = F(hkl) 2

hF(hkl)2/εi

ε is a statistical constant used for the proper treatment of centric and acentric reflections; the denominator F(hkl)2/ε as is averaged per resolution shell. It is calculated per resolution shell (≈ 20 shells over the whole resolution range).

(43)

Starting Point for Direct Methods: The Sayre Equation

In 1952, Sayre published what now has become known as the Sayre-Equation

F(h) = q(sin(θ)/λ) X

h0

Fh0Fhh0

This equation is exact for an “equal-atom-structure” (like the substructure generally is).

It requires, however, complete data including F(000), which is hidden by the beam stop, so per se the Sayre-equation is not very useful.

(44)

Tangent Formula (Karle & Hauptman, 1956)

The Sayre-equation serves to derive the tangent formula

tan(φh) ≈

P

h0 |Eh0Eh−h0|sin(φh0 + φh−h0)

P

h0 |Eh0Eh−h0| cos(φh0 + φh−h0)

which relates three reflections h, h0, and h − h0.

H. A. Karle & J. Hauptman were awarded the Nobel prize in Chemistry in 1985 for their work on the tangent formula (and many other contributions).

(45)

Solving the Substructure

Direct methods (and in particular shelxd) do the following:

1. Assign a random phase to each reflection. They will not fulfil the tangent formula.

2. Refine the phases using |FA| (or rather |EA|) to improve their fit to the tangent formula 3. Calculate a map, pick the strong peaks (as putative substructure)

(46)

Solving the Substructure

Steps 1-4 are repeated many times, each time with a new set of random phases.

Every such attempt is evaluated and the best attempt is kept as solution for the substructure.

The substructure is thus found by chance. It works even if the data quality is only medium — but may require more trials.

The cycling between steps 2-3, i.e. phase refinement in reciprocal space vs. peak picking in direct space, is called dual space recycling [4, section 16.1]

(47)

dual space

recycling

Real space: select atoms reciprocal space refine phases random atoms atoms consistent

with Patterson

repeat a few 100−1000 times

No

FFT and peak search

from atoms phases(map)

between E obs and E calc selection criterion: Correlation Coefficient

or (better)

(48)

e.g. with the Harker Construction. With those phases, an initial electron density map for the new structure can be calculated: I(hkl) |FA| (x, y, z)A

R

f

µ

φ(hkl)A φ(hkl) ρ(x, y, z)

(49)

Phasing the Rest

(50)

References

1. G. M. Sheldrick, A short history of SHELX, Acta Crystallogr. (2008), A64, pp.112–133

2. R. J. Morris & G. Bricogne, Sheldrick’s 1.2Å rule and beyond, Acta Crystallogr. (2003), D59, pp. 615–617 3. E. Prince (ed.), International Tables for Crystallography, Vol. C, Union of Crystallography

4. M. G. Rossmann & E. Arnold (eds.), International Tables for Crystallography, Vol. F, Union of Crystallogra-phy

References

Related documents

8. Name the particles which determine the mass of an atom. Write the charges on sub atomic particles. What are valence electrons? Give example. *Which kind of elements have tendency

divergence angle, probe mounting angle, pressure and temperature – was performed and applied to one example run with particles of intermediate size Honite 16 under specific

Consistent with the predominantly nuclear MRTF-A, the three aggressive lines (UACC-257, SK-Mel-147, and SK-Mel-103) have dramatically increased expression of these

The lower income segments, for instance, have shown a clear indication of a planned increase in spending in the categories of Confectionery, Household Cleaning, Packaged Food

Then card on table is picked up, initials on face are acknowledged, turned over and IT IS THE BLUE BACKED CARD, with the initials on back to be acknowledged as the original blue

When it comes to supply chain execution software—the applications that direct manufac- turing operations, oversee distribution processes like receiving and order fulfillment, and

In Rutherford’s experiment, most particles passed through the empty space in the ____________.. Calculate the average atomic mass

were  composed  of  a  unknown  negatively   charged  particle,  and  is  credited  with  the   discovery  and  identification  of  the  electron5. •Thomson  found