3.5 CsgH Structure
3.5.2 Backbone Assignment CsgH-CTH
A double-labelled 15N 13C sample was expressed and purified using the same methodology as the 15N-labelled samples. The purified sample (400 µM) was used to record the 3D HNCACB, CBCA, HNCO and HN(CA)CO spectra required for backbone assignment at 292 K. The spectra were processed using NMRPipe (156) and rendered using CcpNmr Analysis 2.4.0 (153).
The chemical shifts of the backbone HN and N atoms are provided by a 15N 1H HSQC experiment. Initially 113 peaks were selected; later inspection identified a further 10 peaks overlapping these, giving a total of 123 peaks. The HNCACB experiment provides the Cα and Cβ chemical shifts associated with each peak in the HSQC (where present) and, importantly, also those of the preceding residue. Separating the carbon peaks of the current residue from those of the preceding residue is aided by the CBCA experiment, which shows only the Cα and Cβ atoms of the preceding residue. Similarly, the HN(CA)CO experiment provides the chemical shift of the backbone carbonyl carbon associated with a particular HN and N as well as that of the preceding residue, while the HNCO experiment shows only the carbonyl carbon of the preceding residue. Together, the chemical shifts and sequential information provided by these experiments, combined with knowledge of the protein sequence and the probable chemical shifts of each residue type (from the BMRB database [144]), make it possible to assign the chemical shifts to specific residues.
The chemical shifts were selected using CcpNmr Analysis 2.4.0 [188]. The program MARS [145] was used to automatically assign the chemical shifts to specific residues based on the pattern of Cα and Cβ chemical shifts, the chemical shifts of the preceding residue and the protein sequence. MARS was run with a tolerance of 0.25 ppm for C’ and 0.5 ppm for Cα and Cβ chemical shifts. The resulting assignments could be displayed and inspected in CcpNmr Analysis (Figure 3.53).
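The sequential matching that underlies this procedure can be illustrated with a minimal sketch. This is not the MARS algorithm, which optimises the assignment of all spin systems against the sequence; it only shows the pairwise test of whether one spin system could directly follow another, using the tolerances quoted above. The class, the function names and all chemical shift values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Tolerances quoted in the text for the MARS run (ppm).
TOL_CA_CB = 0.5
TOL_CO = 0.25


@dataclass
class SpinSystem:
    """Carbon shifts (ppm) linked to one HSQC peak; None marks a missing peak."""
    name: str
    ca: Optional[float] = None       # C-alpha of the residue itself (i)
    cb: Optional[float] = None       # C-beta of residue i
    co: Optional[float] = None       # carbonyl carbon of residue i
    ca_prev: Optional[float] = None  # C-alpha of the preceding residue (i-1)
    cb_prev: Optional[float] = None  # C-beta of residue i-1
    co_prev: Optional[float] = None  # carbonyl carbon of residue i-1


def _agree(a: Optional[float], b: Optional[float], tol: float) -> Optional[bool]:
    """True/False if both shifts are present, None if either is missing."""
    if a is None or b is None:
        return None
    return abs(a - b) <= tol


def can_follow(prev: SpinSystem, nxt: SpinSystem) -> bool:
    """Could `nxt` be the residue directly after `prev` in the sequence?"""
    checks = [
        _agree(prev.ca, nxt.ca_prev, TOL_CA_CB),
        _agree(prev.cb, nxt.cb_prev, TOL_CA_CB),
        _agree(prev.co, nxt.co_prev, TOL_CO),
    ]
    present = [c for c in checks if c is not None]
    # Require at least one comparable pair of shifts and no contradiction.
    return bool(present) and all(present)


def candidate_successors(prev: SpinSystem, systems: List[SpinSystem]) -> List[str]:
    """Names of all spin systems that could directly follow `prev`."""
    return [s.name for s in systems if s is not prev and can_follow(prev, s)]


# Invented shifts for three spin systems (not data from this work).
systems = [
    SpinSystem("ss1", ca=52.4, cb=19.1, co=177.8,
               ca_prev=55.0, cb_prev=42.3, co_prev=176.1),
    SpinSystem("ss2", ca=60.2, cb=38.5, co=175.0,
               ca_prev=52.6, cb_prev=19.3, co_prev=177.7),
    SpinSystem("ss3", ca=57.9, cb=30.2,
               ca_prev=52.5, cb_prev=20.0),
]

print(candidate_successors(systems[0], systems))
```

In this invented example ss3 matches the Cα shift of ss1 but is rejected on Cβ, which shows how overlap in a single dimension can be resolved by the other carbons. Missing peaks are skipped rather than treated as mismatches, so gaps such as those in the carbonyl spectra weaken a link without excluding it.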
The data were generally of good quality, although some amide peaks were heavily overlapped and some peaks were absent from the 3D experiments, particularly from the HN(CA)CO spectrum.
Figure 3.53: Backbone Assignment Strips. Strips from the NMR spectra used for backbone assignment, showing data from the HNCACB (positive blue, negative green), CBCA (red), HNCO (purple) and HN(CA)CO (orange) experiments. Peaks linking the strips are joined by black lines. The data are of good quality, but some gaps remain, such as between alanine 34 and leucine 33 in the carbonyl region.
While the bulk of the protein (~70 %) could be assigned quickly, some peaks were more difficult to assign, generally because of peak overlap, inaccurate chemical shifts or sequence similarity causing some ambiguity. Through cycles of inspection, addition of new peaks, adjustment of the chemical shifts and automated assignment by MARS, the assignments were gradually improved, overlapping peaks were distinguished and the backbone was assigned as completely as possible. The final backbone assignment (Figure 3.54) was ~99 % complete for the protein backbone excluding proline residues, the linker and the poly-histidine tag (95 % for the entire protein).
MARS uses the chemical shift values of the spin systems, the polypeptide sequence and the secondary structure prediction, and attempts to assign the full backbone in a way that satisfies the available data. The consistency with which a particular spin system is assigned to a particular residue is used as an estimate of confidence. The final MARS run assigned 97 residues with high confidence and 4 with medium confidence (Glu100, His101, His102, His103), while 5 were unassigned (Met1, Pro94, His104, His105, His106). Notably, most of the less confidently assigned and missing residues occur in the tag (residues 99-106). The histidine residues after His101 are likely to have similar chemical environments, so overlap probably makes them impossible to distinguish. Proline lacks a backbone amide proton and so does not produce a peak in these spectra; it is therefore an anticipated gap in the assignment. The N-terminal methionine residue is absent from the spectra because its -NH3+ group is in fast exchange with the solvent and so cannot be observed.
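As a simple consistency check, the MARS confidence counts reported above can be tallied against the whole-protein completeness figure. The sketch below uses only the numbers given in the text and assumes that the three categories together cover the whole construct (Met1 to His106). The ~99 % figure is not reproduced because the exact number of residues excluded as linker and tag is not stated here; the function and variable names are illustrative.

```python
# Counts reported in the text for the final MARS run.
mars_counts = {"high": 97, "medium": 4, "unassigned": 5}


def completeness(counts):
    """Return (assigned, total, percent assigned) from confidence counts."""
    total = sum(counts.values())
    assigned = total - counts["unassigned"]
    return assigned, total, 100.0 * assigned / total


assigned, total, percent = completeness(mars_counts)
print(f"{assigned}/{total} residues assigned ({percent:.1f} %)")
# prints: 101/106 residues assigned (95.3 %)
```

The total of 106 matches the numbering of the final tag residue (His106), and 101 of 106 agrees with the quoted 95 % for the entire protein.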
Figure 3.54: Backbone Assignments of 15N 1H HSQC. The 2D HSQC spectrum of CsgH used to solve the protein structure is shown with the assignments made using CcpNmr Analysis and MARS. The central regions of the spectrum are shown labelled in the separate boxes on the right for clarity. The spectrum is well dispersed and 95 % of the backbone amides have been assigned. The unassigned peaks are likely to be side chains or noise. Note that the 1H 15N peak for alanine 23 is folded, which explains its unusual chemical shifts.
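The effect of folding on an observed chemical shift can be sketched as follows. This assumes simple aliasing, in which the peak appears displaced from its true position by a whole number of spectral widths; the acquisition scheme is not stated here, and other schemes fold peaks by reflection instead. The observed shift and spectral width below are hypothetical placeholders, not values from this work.

```python
def unaliased_candidates(observed_ppm, spectral_width_ppm, max_wraps=1):
    """Possible true shifts of an aliased peak, assuming it is displaced
    by a whole number of spectral widths (simple aliasing)."""
    return [
        observed_ppm + n * spectral_width_ppm
        for n in range(-max_wraps, max_wraps + 1)
        if n != 0
    ]


# Hypothetical 15N values for illustration only.
candidates = unaliased_candidates(observed_ppm=112.0, spectral_width_ppm=30.0)
print(candidates)
```

The candidate that falls within the range expected for the residue type would then be taken as the true chemical shift.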
The assigned chemical shift values of the 15N, Hα, Cα, Cβ and C’ atoms of each residue can be used together with the sequence to predict the phi and psi angles and the secondary structure of the protein. The secondary structure predicted from the assigned chemical shifts using DANGLE (155) agreed well with that predicted from the protein sequence alone (PSIPRED [126]), with strands positioned almost exactly as predicted (Figure 3.55). This consistency supports the accuracy of the assignments. The predicted dihedral angles were useful for the structural calculations, although the final structure used dihedral angles predicted by TALOS+ [189].
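One way such a comparison could be quantified is by the per-residue agreement between the two predictions and by the positions of the predicted strands. The sketch below assumes each prediction has been reduced to a string with one letter per residue (E for strand, H for helix, C for coil, "-" where no prediction is available). The two strings are invented for illustration and are not the PSIPRED or DANGLE output for CsgH, and the functions are not part of either program.

```python
def agreement(pred_a, pred_b, skip="-"):
    """Fraction of residues with the same state in both predictions,
    ignoring positions where either prediction is missing."""
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must cover the same residues")
    pairs = [(a, b) for a, b in zip(pred_a, pred_b) if skip not in (a, b)]
    if not pairs:
        raise ValueError("no residues with both predictions available")
    return sum(a == b for a, b in pairs) / len(pairs)


def segments(pred, state="E"):
    """1-based inclusive (start, end) ranges of consecutive `state` residues."""
    found, start = [], None
    for i, s in enumerate(pred, start=1):
        if s == state and start is None:
            start = i
        elif s != state and start is not None:
            found.append((start, i - 1))
            start = None
    if start is not None:
        found.append((start, len(pred)))
    return found


# Invented example strings (16 residues each).
sequence_based = "CCEEEEECCCEEEECC"
shift_based = "-CCEEEECCCEEEEEC"

print(f"agreement: {agreement(sequence_based, shift_based):.2f}")
print("sequence-based strands:", segments(sequence_based))
print("shift-based strands:   ", segments(shift_based))
```

Residues without a shift-based prediction are excluded from the agreement rather than counted as mismatches, so the score reflects only positions where both methods make a call.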
Figure 3.55: Comparison of secondary structure predictions. Cartoon comparing the secondary structure of CsgH predicted from sequence only (PSIPRED [126]) with that predicted from the measured chemical shift data (DANGLE). The two are very similar, with strands in approximately the same positions. Note that the construct used for NMR lacks the presumably disordered N-terminal region, so this region is not included in the DANGLE prediction. Although the C-terminus of CsgH is shown as disordered in the DANGLE prediction, the prediction there is indeterminate, with some evidence of both helical and coiled conformations.