PROTEIN SEQUENCING
First Sequence
• The first protein sequence was determined by Frederick Sanger in 1953: the amino acid sequence of bovine insulin
• Sanger was awarded the Nobel Prize in 1958
I. Strategy
• Determine number of polypeptide chains (subunits)
• Determine number of disulfide bonds (inter- and intra- chain)
• Determine the amino acid composition of each polypeptide chain
• If subunits are too large, fragment them into shorter polypeptide chains
• Sequence each fragment using the Edman degradation method
• Complete the sequence by comparing overlaps of different sets of fragments
II. End-group Analysis
• The number of chains can be determined by counting the distinct N- and C-terminal residues.
• N-terminal analysis
– Dansyl chloride
– Phenylisothiocyanate (PITC)/Edman reagent
– Aminopeptidase
• C-terminal analysis
– Carboxypeptidase
N-terminal Analysis with Dansyl Chloride
• Main reagent: 1-dimethylaminonaphthalene-5-sulfonyl chloride (dansyl chloride)
• The dansyl polypeptide chain is prepared
• Acidic hydrolysis liberates all amino acids and the N-terminal dansyl amino acid
• Amino acids are separated
• Fluorescence of the dansyl amino acid is detected
• The N-terminal amino acid is identified by comparison with standard dansylated amino acids
N-terminal Analysis Edman (Degradation)
• The N-terminal amino group makes a nucleophilic attack on phenyl isothiocyanate (PITC), the Edman reagent, under mild alkaline conditions (N-methylpiperidine/water/methanol)
• Formation of a phenylthiocarbamyl derivative (PTC-peptide)
• Anhydrous trifluoroacetic acid (TFA) is used to cleave the terminal amino acid in the form of a thiazolinone derivative, leaving the other peptide bonds intact
• The thiazolinone (TZ) derivative is extracted in an organic solvent (e.g. n-butyl chloride)
• The shortened peptide carries a new free amino terminus
• The TZ is extracted into an organic solvent and treated with an acid (25% TFA/water) to form the phenylthiohydantoin (PTH) derivative
• The PTH amino acid is detected by its UV absorption at 269 nm
• PTH amino acid is separated from the other components by chromatography or electrophoresis
• The terminal amino acid is identified according to retention time or mass
• This cycle can be repeated to identify all amino acids in short peptide chains (40-60 amino acids long)
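The repeated cycle can be mimicked in a few lines of Python. This is a toy sketch only: the function name `edman_cycles` and the example peptide are hypothetical, the peptide is a plain one-letter string, and every cycle is assumed to go to completion, which real chemistry does not do.

```python
def edman_cycles(peptide, max_cycles=None):
    """Yield (cycle, released_residue, remaining_peptide) for each cycle.

    Toy model: each cycle removes exactly the N-terminal residue
    (as its PTH derivative) and leaves the rest of the chain intact.
    """
    remaining = peptide
    cycle = 0
    while remaining and (max_cycles is None or cycle < max_cycles):
        cycle += 1
        released, remaining = remaining[0], remaining[1:]
        yield cycle, released, remaining


# Hypothetical example peptide, not taken from the slides.
for cycle, pth, rest in edman_cycles("GIVEQ"):
    print(f"cycle {cycle}: PTH-{pth}, remaining peptide: {rest or '(none)'}")
```

Each pass reports one residue and hands a peptide that is one residue shorter to the next pass, which is why the order of released PTH amino acids spells out the sequence from the N-terminus.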
Edman Degradation on Protein Sequencer
Perkin Elmer Applied Biosystems Model 494 Procise protein/peptide sequencer http://www.biotech.iastate.edu/facilities/protein/nsequence494.html
By-products of Edman Degradation
N- and C-terminal Analysis-Exopeptidase Method
• Exopeptidases cleave the terminal residue of a polypeptide chain
• Aminopeptidases cleave the N-terminal residues
• Carboxypeptidases cleave the C-terminal residues
• Aminopeptidases and carboxypeptidases are highly specific and are therefore of limited use: cleavage rates can be slow, and some amino acids resist cleavage
III. Disulfide Bond Cleavage
• Disulfides are reduced to thiols with dithiothreitol (DTT) or 2-mercaptoethanol
• Thiols are treated with alkylating agents (e.g. iodoacetic acid) to prevent re-oxidation during subsequent steps.
Protection of sulfhydryl groups
IV. Separation and Molecular Weight Determination of Subunits
• Traditional Methods
– SDS-PAGE, SEC, or RP-HPLC are used to separate the subunits after cleavage of disulfide bonds
– Mw standards and a calibration curve are used to determine the molecular weights
– The approximate number of amino acids can be estimated from the Mw of the subunit using 110 Da as the average mass of an amino acid residue
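The residue-count estimate is a single division by the average residue mass. A minimal sketch, assuming the 110 Da figure quoted above; the function name and the 22 kDa example mass are illustrative only.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # average residue mass quoted in the text


def estimate_residue_count(subunit_mw_da, avg_residue_da=AVERAGE_RESIDUE_MASS_DA):
    """Approximate number of residues in a subunit of the given mass (Da)."""
    if subunit_mw_da <= 0:
        raise ValueError("molecular weight must be positive")
    return round(subunit_mw_da / avg_residue_da)


# Illustrative value: a subunit running at about 22 kDa on SDS-PAGE.
print(estimate_residue_count(22_000))  # 200
```

The result is only as good as the Mw measurement and the assumption that the subunit has an average composition.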
V. Amino Acid composition
• Strategy:
– hydrolysis followed by separation and identification
• Acid catalyzed hydrolysis
– 6 M HCl/100-120°C/24 h (in an oxygen-free environment to prevent oxidation of SH groups)
– Some residues are degraded under these harsh conditions
• Base catalyzed hydrolysis
– 4 M NaOH/100°C/4-8 hours
– Arg, Cys, Ser and Thr are decomposed and other amino acids are deaminated and racemized
– Used mainly to determine Trp which is extensively degraded under acid catalyzed hydrolysis
• Enzymatic hydrolysis
– By exo- and endopeptidases
– A combination of endo- and exopeptidases must be used to hydrolyze all the peptide bonds
• Separation
– Individual amino acids in the hydrolyzed mixture can be separated by RP-HPLC or CE and identified according to retention time
• Increasing sensitivity
– Pre- or post-column derivatization is used to increase sensitivity
Derivatization with OPA and MCE
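To see what an acid hydrolysate is expected to report for a known sequence, the composition can be counted in silico. This is a simplified sketch under stated assumptions: Trp is treated as completely lost (the slides say it is extensively degraded), Asn and Gln are assumed to be recovered as Asp and Glu (this conversion is not stated in the slides), and partial losses of other residues are ignored. The function name and the example peptide are hypothetical.

```python
from collections import Counter


def acid_hydrolysate_composition(sequence):
    """Residue counts expected after acid hydrolysis (simplified model).

    Assumptions of this sketch: Asn and Gln are recovered as Asp and Glu
    (their side-chain amides are hydrolyzed), and Trp is lost entirely.
    Partial losses of other residues are ignored.
    """
    converted = {"N": "D", "Q": "E"}
    counts = Counter()
    for residue in sequence:
        if residue == "W":
            continue  # Trp is degraded under acid hydrolysis
        counts[converted.get(residue, residue)] += 1
    return dict(sorted(counts.items()))


# Hypothetical peptide used only for illustration.
print(acid_hydrolysate_composition("WNDQEGG"))  # {'D': 2, 'E': 2, 'G': 2}
```

The missing Trp in the output is the reason a separate base catalyzed hydrolysis is run when Trp content matters.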
VI. Cleavage of Specific Peptide Bonds
• Direct sequencing is applicable only to peptides that have up to about 50 residues.
• Problems which occur after lengthy reactions
– Incomplete reactions
– Accumulation of impurities from side reactions
• Solution: use enzymes to break down the polypeptide chain into shorter fragments
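The length limit follows from compounding: if each cycle is slightly incomplete, the fraction of chains still in step shrinks geometrically. A rough sketch, where the per-cycle yields of 0.95 and 0.98 are assumed values chosen for illustration and `cumulative_yield` is a hypothetical helper name.

```python
def cumulative_yield(repetitive_yield, cycles):
    """Fraction of chains still in phase after the given number of cycles."""
    if not 0.0 < repetitive_yield <= 1.0:
        raise ValueError("repetitive_yield must be in (0, 1]")
    return repetitive_yield ** cycles


# 0.95 and 0.98 are assumed per-cycle yields, chosen only for illustration.
for per_cycle in (0.95, 0.98):
    for n in (10, 25, 50):
        print(f"yield/cycle {per_cycle:.2f}, {n:2d} cycles: "
              f"{cumulative_yield(per_cycle, n):.1%} in phase")
```

Even a small per-cycle loss leaves well under half of the chains in phase by cycle 50 in this model, so the signal from the correct residue fades into the background from lagging chains.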
Enzymatic Fragmentation
• Trypsin
– Trypsin is the most commonly used proteolytic enzyme. It cleaves on the C-terminal side of positively charged amino acids (Arg and Lys) if the next residue is not a proline.
– It is highly specific
– Cleavage sites may be removed or added via derivatization to take advantage of the specificity of trypsin
– Reaction times can be adjusted to limit proteolysis if there are too many Arg and Lys residues
– Non-denaturing conditions can be used to limit proteolysis as well
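The cleavage rule stated above (after Arg or Lys, unless the next residue is Pro) is easy to apply in silico. A minimal sketch assuming complete digestion with no missed cleavages; the function name `tryptic_digest` and the example sequence are hypothetical.

```python
def tryptic_digest(sequence):
    """Split a sequence after every Lys/Arg that is not followed by Pro."""
    if not sequence:
        return []
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])  # C-terminal fragment
    return fragments


# Hypothetical sequence; note that the Arg-Pro bond is not cut.
print(tryptic_digest("AKGRPLKMR"))  # ['AK', 'GRPLK', 'MR']
```

Every fragment except possibly the last ends in Lys or Arg, which is a quick sanity check on real tryptic peptide data as well.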
Trypsin Digestion
Derivatization of Cys for Tryptic Digestion
Other Proteolytic Enzymes
• Endopeptidases
– Pepsin: cleaves at the amino side of Phe, Tyr, and Trp if the previous residue is not proline
– Chymotrypsin: cleaves at the carboxyl side of Phe, Trp, and Tyr if the next residue is not proline
– Endopeptidase GluC: cleaves at the carboxyl side of Glu
• Exopeptidases
– Leucine aminopeptidase: rapidly cleaves N-terminal leucine residues; does not cleave N-terminal proline
– Aminopeptidase M: cleaves all N-terminal residues
– Carboxypeptidase A: cleaves all C-terminal residues except Arg, Lys, and Pro
• Especially efficient for amino acids with bulky aliphatic and aromatic side chains
Chemical Fragmentation Methods
• Cyanogen bromide (CNBr) cleaves specifically on the C-terminal side of Met residues, converting the Met into a C-terminal homoserine lactone
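The effect of CNBr on a sequence can be sketched the same way as an enzymatic digest, with one extra step: the Met at each new C-terminus is relabeled as homoserine lactone. This is an illustrative model only. The label `Hsl`, the function name, and the example sequence are hypothetical, cleavage is assumed complete, and a Met at the very C-terminus is left unchanged as a simplification.

```python
def cnbr_cleave(sequence):
    """Cleave after every internal Met; mark the new C-terminus as 'Hsl'.

    Returns fragments as lists of residue labels so that the converted
    Met (homoserine lactone, written 'Hsl' here) stays visible.
    """
    fragments, current = [], []
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue == "M" and not is_last:
            current.append("Hsl")  # Met becomes homoserine lactone
            fragments.append(current)
            current = []
        else:
            # Simplification: a C-terminal Met is kept as-is.
            current.append(residue)
    if current:
        fragments.append(current)
    return fragments


# Hypothetical sequence with two internal Met residues.
for frag in cnbr_cleave("GAMLKMFW"):
    print("-".join(frag))
```

Because Met is a rare residue, this kind of cleavage typically gives a few large fragments, which complements the many small fragments from a tryptic digest.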
Sequence Determination
• Separate segments by chromatography or
electrophoresis and sequence fragments individually
• Edman degradation is the method of choice
– Fully automated systems which use the Edman degradation methods are available commercially (Sequenator)
• In the sequenator the protein is immobilized through bonding to a solid support or by adsorbing it onto an inert glass frit.
• Controlled amounts of reagents are injected by a pumping system
• The thiazolinone is transferred to a conversion chamber for hydrolysis to the PTH amino acid
• The final product, the PTH amino acid, is pumped into an HPLC column for on-line analysis
– 1 hour analysis time is possible for 50 amino acid residues
The solid-phase matrix-the Merrifield resin
Edman degradation
Ordering of Peptide Fragments
• Compare amino acid sequence of one set of peptide fragments with the sequence of a second set of fragments obtained using different cleavage points
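The overlap comparison can be illustrated with a small greedy merge: fragments from two different digests are joined wherever the end of one matches the start of another. Greedy merging is a simple stand-in chosen for this sketch, not a method given in the slides, and it can be misled by short or repeated overlaps. The function names, the `min_overlap` parameter, and the example fragments are all hypothetical.

```python
def overlap_length(left, right, min_overlap=1):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments, min_overlap=2):
    """Greedily merge fragments on their largest suffix/prefix overlaps."""
    # Drop duplicates and fragments fully contained in a longer fragment.
    unique = sorted(set(fragments), key=lambda f: (-len(f), f))
    contigs = []
    for frag in unique:
        if not any(frag in kept for kept in contigs):
            contigs.append(frag)

    while len(contigs) > 1:
        best = (0, None, None)
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no remaining overlap: the order is undetermined
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)
    return contigs


# Hypothetical fragment sets from two digests of the same peptide.
set_a = ["GAK", "VFLR", "MYSK"]   # cuts after Lys/Arg
set_b = ["GAKVF", "LRMY", "SK"]   # cuts after Phe/Tyr
print(assemble(set_a + set_b))  # ['GAKVFLRMYSK']
```

If more than one string is returned, the available overlaps were not sufficient to fix the order, and a third cleavage method would be needed.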
Determination of Disulfide Bond Position
• Digest polypeptide chain(s)
• Run a 2D gel of the fragment mixture using the same conditions in both dimensions
• After separation in the first dimension, the matrix is exposed to performic acid, which cleaves all disulfide bonds
• Separation in the second dimension is performed
– Fragments without S-S bonds will be positioned along the diagonal of the matrix
– Fragments linked by S-S bonds will produce off-diagonal spots
– The disulfide-linked fragments can be extracted from the gel and sequenced
Protein Sequencing by Mass Spectrometry
• Digest protein
• Obtain a MALDI-TOF mass spectrum of the digest
• Use an online database to match the fragment mass pattern with those in the database
• Obtain sequence of fragments by performing MS/MS
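The matching step can be sketched as a comparison between observed peaks and the calculated masses of candidate peptides. The sketch below assumes unmodified peptides observed as singly protonated ions, uses standard monoisotopic residue masses, and takes a tolerance of 0.2 Da as an arbitrary illustrative value. The function names are hypothetical, and the "observed" peaks are synthetic values generated from the calculated masses, not real data.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # singly protonated ion, [M+H]+


def peptide_mh(sequence):
    """Monoisotopic [M+H]+ of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in sequence) + WATER + PROTON


def match_peaks(observed_mz, peptides, tol_da=0.2):
    """Map each observed m/z to the peptides whose [M+H]+ lies within tol_da."""
    theoretical = {p: peptide_mh(p) for p in peptides}
    return {
        mz: [p for p, mass in theoretical.items() if abs(mass - mz) <= tol_da]
        for mz in observed_mz
    }


# Candidate peptides from a hypothetical database entry.
candidates = ["GAK", "VFLR", "MYSK"]
# Synthetic peaks: calculated masses with a small offset, plus one unmatched peak.
peaks = [round(peptide_mh(p) + 0.05, 4) for p in candidates] + [999.9]
for mz, hits in match_peaks(peaks, candidates).items():
    print(f"{mz:10.4f} -> {hits or 'no match'}")
```

Note that Leu and Ile have identical masses, so a mass match alone cannot distinguish them; this is one reason MS/MS fragmentation and database context are needed in addition to the peptide masses.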