X-ray Crystallography

X-ray crystallography is the "bread and butter" of structural biology. It's our primary way to figure out the 3D arrangement of the tangle of atoms in a protein, and since (as biologists like to say) structure determines function, crystal structures are incredibly valuable for explaining biology.

To properly interpret crystal structures, I think it's important to understand the process of crystallography; at the very least, we ought to know its strengths, weaknesses, and caveats. On this page, I'll give an overview of the dark art of crystallography.

An Overview

In x-ray crystallography, protein units are coerced to pack neatly into a crystal, and then x-rays are diffracted off its periodic lattice. Since the diffraction pattern provides information about the electron density within the crystal, it lets us model the locations of the atoms within the protein crystal. There are a number of steps in the process:
  1. Obtain a purified solution of protein. Back in the day, people used to get protein directly from the native organism -- sperm whale myoglobin, for instance, used to be isolated from the skeletal muscles of dead whales. Nowadays with recombinant DNA, we can get the protein we want from specially engineered strains of bacteria. From the raw extract, we do some chromatography to separate out impurities to end up with a purified protein solution.
  2. Grow protein crystals. Once we have a pure enough protein sample, we put the protein in a crystal growth solution, and wait weeks to months for crystals to grow. Finding suitable conditions for crystal growth can be a grueling trial-and-error process, and the crystallographer often needs to tweak different ingredients in the buffer or modify the protein itself to successfully grow crystals.
  3. Cryoprotect and cryocool the crystals. To slow down radiation damage from X-rays, protein crystals are typically cooled to liquid nitrogen temperatures (77 Kelvin). Before cryocooling, we normally dip the crystals in a cryoprotectant (aka antifreeze) to prevent the formation of ice, which can damage the protein crystal structure and create unwanted "ice rings" in the diffraction pattern.
  4. Collect X-ray diffraction patterns. In this step, we mount the cryocooled crystal onto a device called a goniometer, and then we shoot an x-ray beam at the crystal; on the other side, a detector measures the intensity of the diffracted x-rays in different directions. By using the goniometer to rotate the crystal through various angles, we can collect a full dataset of diffraction patterns.
  5. Solve the "phase problem." At this point, we know the intensities of the spots on a diffraction pattern, but this is only part of the information needed to reconstruct the electron density inside the crystal. We still need the phase of each spot. If a similar crystal structure has been solved before, the phase problem can be solved on a computer using molecular replacement; if not, we need to perform a phasing experiment. There are a few ways to do this; most of them rely on the anomalous scattering of heavier atoms, which shifts the intensities of diffraction spots in a phase-dependent manner. These techniques have amusing names such as MAD and SAD (for Multiple/Single wavelength Anomalous Dispersion).
  6. Build a model.
  7. Refine the structure.

Protein Expression and Purification

Protein Crystal Growth

Cryoprotection

Interlude: Diffraction and Fourier Transforms

Before delving into the experimental details of collecting x-ray diffraction patterns, I want to explain why shooting x-rays at a crystal can help us figure out its structure.

As I've said earlier, the way x-rays bounce off the crystal provides information about the electron density within the crystal. We will use this information later to model in the atomic positions. More precisely, the diffraction pattern is a Fourier transform of the electron density, where each spot corresponds to a Fourier coefficient, and its intensity is the squared magnitude of that coefficient. By performing the appropriate inverse Fourier transform to "undo" the operation (which also requires the phase of each coefficient), we can back out the density of electrons in the crystal.
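To make the Fourier relationship concrete, here's a minimal sketch in Python with NumPy. It uses a made-up 1D "unit cell" with two Gaussian blobs of density (the positions and widths are arbitrary choices of mine, not real data). It shows that the recorded intensities are the squared magnitudes of the Fourier coefficients, and that inverting the transform only gives back the density when the phases are available too.

```python
import numpy as np

# Toy 1D "unit cell": electron density sampled on a grid (hypothetical values).
n = 64
x = np.arange(n) / n  # fractional coordinate in [0, 1)
rho = (np.exp(-((x - 0.30) ** 2) / (2 * 0.02 ** 2))
       + 0.5 * np.exp(-((x - 0.65) ** 2) / (2 * 0.03 ** 2)))

F = np.fft.fft(rho)          # complex Fourier coefficients of the density
intensity = np.abs(F) ** 2   # what a detector would record for each spot
phase = np.angle(F)          # what a detector does not record

# With amplitudes and phases, the inverse transform recovers the density.
rho_back = np.fft.ifft(np.sqrt(intensity) * np.exp(1j * phase)).real

# With amplitudes only (all phases set to zero), it does not.
rho_no_phase = np.fft.ifft(np.sqrt(intensity)).real

print(np.allclose(rho_back, rho))      # density recovered when phases are known
print(np.allclose(rho_no_phase, rho))  # density lost when phases are discarded
```

The zero-phase reconstruction isn't noise: it comes out as a symmetric function peaked at the origin, which looks nothing like the original two blobs.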

The whole reason that crystals are so useful for determining structure is that they have a periodic structure -- their atoms are organized neatly to repeat over and over again in the same way. It turns out that this periodic structure is intimately related to the mathematics of Fourier transforms and the physics of diffraction. So before we dive into the actual x-ray diffraction, it is well worth our time to develop the proper language to talk about repeating, symmetric lattices of atoms.

Structure of a crystal

Crystals have a very regular structure. If we zoom into a crystal, we will see that the atoms form a wallpaper-like pattern, where the same chunk repeats itself over and over again. Since there are so many repeating chunks, we like to use a mathematical trick and pretend that there are infinitely many.



This repeating chunk is known as a unit cell. So a crystal is made up of infinitely many regular copies of a unit cell. Notice that the unit cell of a crystal is not uniquely defined -- if we just slide over and carve out a different part of the wallpaper pattern, we still end up with the same crystal.

Just knowing the basic repeating chunk (the unit cell) is not enough to describe the full contents of a crystal. We also need to know how the unit cell is copied over and over again. If we draw a dot for each unit cell, we end up with a huge grid of discrete points called a Bravais lattice. The Bravais lattice tells us the way to translate (slide over) the unit cells to fully form the crystal; it forms a scaffold or a schematic of the shape of the crystal.



To mathematically describe the Bravais lattice, we draw lattice vectors between different points in the grid. This seems like a formidable task, since there are infinitely many lattice vectors, but the regular structure of the Bravais lattice helps us out here...

Motivate the basis vectors
Define a,b,c and the alpha beta gamma angles
Transformation between u,v,w and the real coordinates x,y,z
Classification of Bravais Lattices
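
As a rough illustration of the coordinate transformation listed above, here's a sketch that builds the matrix taking fractional coordinates (u, v, w) to Cartesian coordinates (x, y, z) from the cell lengths a, b, c and angles alpha, beta, gamma. The function names are my own, and I'm assuming one common convention: a lies along the x axis and b lies in the xy plane. Other conventions exist, so treat this as one possible choice.

```python
import numpy as np

def cell_matrix(a, b, c, alpha, beta, gamma):
    """Return a 3x3 matrix whose columns are the lattice vectors a, b, c.

    Lengths are in Å and angles in degrees. Assumed convention:
    a along x, b in the xy plane.
    """
    al, be, ga = np.radians([alpha, beta, gamma])
    ca, cb, cg, sg = np.cos(al), np.cos(be), np.cos(ga), np.sin(ga)
    volume = a * b * c * np.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)
    return np.array([
        [a,   b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0,    volume / (a * b * sg)],
    ])

def frac_to_cart(M, uvw):
    """Fractional (u, v, w) -> Cartesian (x, y, z)."""
    return M @ np.asarray(uvw, dtype=float)

def cart_to_frac(M, xyz):
    """Cartesian (x, y, z) -> fractional (u, v, w)."""
    return np.linalg.solve(M, np.asarray(xyz, dtype=float))

# Orthorhombic example: the cell center sits at roughly (5, 10, 15) Å.
M = cell_matrix(10.0, 20.0, 30.0, 90.0, 90.0, 90.0)
center = frac_to_cart(M, [0.5, 0.5, 0.5])
```

Every lattice vector is then an integer combination of the three columns of this matrix, which is why three basis vectors are enough to describe infinitely many lattice points.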

Lattice Planes and Miller Indices

The Reciprocal Lattice

Symmetry and space groups

Atoms and electron density maps

Now that we understand lattices of repeating points, we can fill in the lattices with the contents of unit cells.

In a protein crystal, the unit cells are composed of one (or more!) copies of a protein, which are made of many individual atoms linked together. In addition to the protein itself there may be other co-factors or ligands, as well as any other gunk from the crystallization cocktail that sticks to the protein or snuggles itself neatly between neighboring protein units. For the most part, these components are all well-ordered in the sense that their atoms consistently appear in the same location in all the unit cells in the crystal.

The other main component of a protein crystal is the bulk solvent in the "empty space" surrounding the protein units. In contrast to the well-defined atomic positions of the ordered part of a unit cell, bulk solvent is often disordered. The water molecules in these solvent channels can flow around freely, in a quasi-liquid manner, and there is no reason for solvent atoms to be in the same position in different unit cells. (This raises a funny philosophical question of whether a crystalline protein is really in a crystal phase if a significant portion of the unit cell does not actually repeat!)

The quantity probed by x-ray diffraction experiments is the electron density \(\rho(\vec{r})\), where \(\vec{r}\) is position within the unit cell.

Electron density as sum of Gaussians
Visualizing electron density
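
Here's a small sketch of the "sum of Gaussians" idea in one dimension. Everything specific in it is an assumption for illustration: the function name, the two atoms, their widths, and the choice to give each atom a single normalized Gaussian weighted by its electron count. Real scattering-factor models use several Gaussians per atom.

```python
import numpy as np

def density_1d(x, atoms, cell_length):
    """Electron density along a 1D periodic cell as a sum of Gaussians.

    atoms: list of (position in Å, number of electrons, Gaussian sigma in Å).
    """
    rho = np.zeros_like(x, dtype=float)
    for pos, electrons, sigma in atoms:
        # Distance to the nearest periodic image of the atom.
        d = (x - pos + cell_length / 2) % cell_length - cell_length / 2
        # Normalized Gaussian, so each atom integrates to its electron count.
        rho += electrons * np.exp(-d**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return rho

cell_length = 20.0  # Å, hypothetical
x = np.linspace(0.0, cell_length, 400, endpoint=False)
atoms = [(5.0, 6, 0.5), (6.5, 8, 0.5)]  # carbon-like and oxygen-like, hypothetical
rho = density_1d(x, atoms, cell_length)

# Integrating the density over the cell should give back the electron count.
total_electrons = rho.sum() * (cell_length / len(x))
```

Plotting rho against x gives two overlapping bumps, with the taller one at the atom that carries more electrons.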

Wiggly and Jiggly B-factors

A real protein crystal is far from the ideal infinite array of perfectly repeating unit cells. We need to account for these realities to properly model a real crystal...

Static disorder
Dynamic disorder
The net result of these different effects is that the position of an atomic nucleus varies over time and over different units within the crystal.
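
To put numbers on this smearing, here's a sketch using the standard isotropic relations: the B-factor is B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement of the atom, and an atom's contribution to a reflection at resolution d is damped in amplitude by exp(-B / (4d²)). The function names and the example values are mine.

```python
import numpy as np

def b_factor(mean_sq_disp):
    """Isotropic B-factor (Å^2) from mean-square displacement <u^2> (Å^2)."""
    return 8 * np.pi**2 * mean_sq_disp

def debye_waller(B, d):
    """Amplitude damping exp(-B sin^2(theta) / lambda^2) at resolution d (Å).

    Uses Bragg's law, sin(theta) / lambda = 1 / (2 d).
    """
    return np.exp(-B / (4 * d**2))

# An atom wobbling with an RMS displacement of 0.5 Å (<u^2> = 0.25 Å^2):
B = b_factor(0.25)

# The damping is much stronger for high-resolution (small d) reflections.
for d in (4.0, 2.0, 1.5):
    print(f"d = {d} Å: amplitude scaled by {debye_waller(B, d):.3f}")
```

This is one way to see why floppy parts of a structure stop contributing to the high-resolution spots.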

The Fourier Transform

Why care about reciprocal space
1D series expansion of function
Generalize to 3D
Convolution theorem and why spots instead of spread
What does FT of a protein crystal look like?
Examples for intuition

Diffraction and the Bragg condition

Motivation: probe periodic structures w/ waves. Interference!
X-rays are electromagnetic waves with atom-size wavelength.
Scattering off spherical center
Interference between multiple scattering centers
Why FT of crystal (Born approx?)
Geometry of reflected wave; implication for experiment

The Phase Problem

X-ray diffraction

The Phase Problem

Interlude: Structure Refinement

Before going into the details of how a crystallographer refines a structure, I want to explain why we refine crystal structures.

The whole refinement process is a game of statistics and optimization. We are given a set of experimental observations -- a set of reflection intensities from x-ray diffraction -- and we wish to use that data to build our best model -- a set of atomic positions, B-factors, etc.
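
As a toy version of this game, here's a sketch that "refines" a single parameter: the separation between two identical point atoms in a 1D cell. The setup is entirely hypothetical, the data are noise-free, and I'm using a plain least-squares residual with a brute-force scan as a stand-in for the likelihood-based targets and smarter optimizers that real refinement programs use.

```python
import numpy as np

def intensities(separation, h):
    """|F(h)|^2 for two unit point scatterers a given fractional distance apart."""
    positions = np.array([0.0, separation])
    F = np.exp(2j * np.pi * np.outer(h, positions)).sum(axis=1)
    return np.abs(F) ** 2

h = np.arange(1, 11)                      # reflection indices we "measured"
true_separation = 0.27                    # hidden truth used to fake the data
observed = intensities(true_separation, h)

# Scan candidate models and score each against the observations.
# Only separations up to 0.5 are scanned: s and 1 - s give identical intensities.
candidates = np.linspace(0.0, 0.5, 501)
residual = np.array([np.sum((observed - intensities(s, h)) ** 2) for s in candidates])
best = candidates[np.argmin(residual)]

print(f"best-fit separation: {best:.3f}")
```

Notice that the intensities only pin down the separation and not where the pair sits in the cell, which is the phase problem showing up again in miniature.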

Fitting Models to Data

Bayesian Statistics
Log likelihood as objective function
Regularization as setting priors
Cross-validation to tune parameters

Structure Refinement