
3.3 SOUTHERN BLOT ANALYSIS

3.4 GENERATION OF LAMBDA LIBRARY BANKS

3.6.2 Nucleic acid sequencing strategy

The sequencing strategy used is shown in Figure 18. Restriction mapping of clones pQK3 and pQK9 showed that sequencing overlapping subcloned fragments of the inserts would be lengthy because of the limited number of convenient unique restriction endonuclease sites. To avoid this, the inserts were sequenced inwards from the T7 and T3 priming sites of the vector, and thereafter primers were generated as required. Both strands of the plasmid DNA were sequenced, and all sequencing reactions were repeated with the dITP termination mixes to ensure that any compressions were fully resolved.

Figure 17 Partial restriction maps of the inserts of clones pQK2, pQK3, pQK7 and pQK9. The partial restriction maps of the inserts of (A) pQK2, pQK3 and pQK7 and (B) pQK9 are shown. The restriction endonuclease sites determined and their distances in kilobases from the 5' EcoRI cloning site in the vector are indicated.
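Once sequence is available, a map of this kind can be cross-checked by searching for recognition sequences directly. The following is a minimal sketch only: the enzyme set is limited to a few commonly used enzymes, the function name `find_sites` and the toy sequence are hypothetical, and the maps in Figure 17 were determined experimentally rather than by this procedure.

```python
import re

# Recognition sequences (all palindromic, so one strand is sufficient).
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "NcoI": "CCATGG",
    "PstI": "CTGCAG",
    "KpnI": "GGTACC",
}

def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition sequence."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        # A lookahead is used so that overlapping occurrences are also reported.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={motif})", seq)]
    return hits

# Toy insert (hypothetical): EcoRI sites at both ends, PstI and NcoI inside.
toy_insert = "GAATTCAAACTGCAGTTTCCATGGAAAGAATTC"
site_map = find_sites(toy_insert)
```

Distances from the 5' EcoRI site follow by subtracting the position of the first EcoRI hit from each of the other positions.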

Figure 18 DNA sequencing strategy on pQK3 and pQK9. A schematic diagram of the EcoRI insert in (A) pQK3 and (B) pQK9 is shown. The arrows indicate the approximate length of sequence reads from the ends of the primers used. The sequenced regions encoding open reading frames are boxed. pQK3 contains a frameshift and this is indicated. The positions of relevant restriction sites are noted. A distance scale is given relative to the T3 and T7 priming sites in the vector.

[Figure 18 schematic: (A) pQK3 and (B) pQK9. Restriction sites shown include EcoRI, Xba I and Nco I; the ATG and TAA codons and the T3 and T7 priming sites are marked.]

3.6.3 Sequence analysis of pQK3

Analysis of the sequence of the insert of pQK3 by translation into amino acid sequence (Figure 19) showed that a putative long open reading frame encoding a polypeptide of 563 amino acids was disrupted by an insertion of AC after amino acid 67. The frameshift caused by this two base insertion leads to a stop codon after amino acid 104. This two base insertion was the only frameshift within the putative long open reading frame and was only revealed by sequencing using the dITP termination mixes. It is possible that this two base insertion was generated during the cloning procedure. However, mutations that lead to reduction or elimination of seed polypeptides have been found in several different plant species, including gene inactivation due to translational frameshift (Goldberg, 1989), and several ricin pseudogenes have been described (Tregear, 1990).
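The effect of a two base insertion on the reading frame can be illustrated with a short sketch. The sequences below are toy examples chosen for illustration, not the pQK3 sequence, and the helper `translate` is hypothetical; only the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon reached
            break
        protein.append(residue)
    return "".join(protein)

# Toy open reading frame (hypothetical): Met-Ala-Val-Lys-Trp-stop.
wild_type = "ATGGCTGTAAAATGGTAA"
# Insert AC after the second codon, mimicking a two base insertion.
mutant = wild_type[:6] + "AC" + wild_type[6:]

wild_type_protein = translate(wild_type)  # full-length toy product
mutant_protein = translate(mutant)        # truncated by an early stop codon
```

In this toy case the insertion shifts the frame so that a TAA triplet is read two codons downstream, truncating the product. The same logic underlies the premature stop described for pQK3.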

Figure 19 Nucleotide sequence of the insert of pQK3. The coding strand sequence of pQK3 is numbered (bold type) for reference. The deduced amino acid sequence of the open reading frame is shown in bold type above the corresponding nucleotide sequence and is numbered for reference. The frameshift is shown, with the amino acids deduced within this frame numbered in brackets.

CAATTCAATCTAGCATAAACTTTTTGCATAATCATCAAACTTATAAATTCTTAATATTTTACAACTTCCTAAGCCAAACATCACATATAC 90
AACAATGCATCCTAAACTATTTATCCTTCGTATGGAAAATATCAATTCGTGCTTAAATTCCTCTTAATAAAAATGCCTTTCTTCCCATAT 180
CAACAAAGACAATTTTAAAATTCTCCACTATTTACTTTTAAATACTTTATTAGAGGATTGAATTTACACCAAACAGATAGAATTATAATA 270
TCACrrCTATCCTTCTCACATAIATTAATTCTTACTCTCATTATTTATTATTTTCTTCTATTAATCTCTTATTTTCACTCCAAATCACCT 360
CATCAATACTCCTCTACAATTACAGACCCATTTCATTTGAAATTCTTTCTTAGAACAAGTTTTTAAGCATGACCTCACTTCTTAAGTACA 450
ACCGTTGAACATATAATGCAGTTGCAGAATGTTACAGGCCATAAAAACTTGCAGGTTTTCAAGATAATAACTAACTAGCAGTTATACTAA 540
ACTTTCCATATTTATCCAAACTCAATATTTTTTACCTCCTAGATATTAAATTACTCTAATTTTAATTTTAAAACATTTTAAATTCTCTCA 630
AAATTTAAACTGTCCATCCAAACACAGTAATACTAGCAATAATCCATATTCCTAGACCACAAATCAAACTACCAATAAACTCTCCTTATA 720
AACATTAGTTTTTTTATATAAACGTACAATCCAGATCCCTATAAACATTTGCCCTCGTAACATCATAAAAGATTCCCCTTCTITCGGTGC 810
TGACATTTCITTTGAAACGGCGATCTCTCTCTCTCTCTCTCTArATATATATATATATATATATATATATATATATATATCATGCTACCC 900
TTTCAGCTGAGCAAGACCAACCCCTGCCTCCCTAAACCGTCATCTrTATATATCATGCCCTCCCAACCATTCTTGCCAAAATTGTTAGCC 990
ACAGGTAAATTATGCACTCTGTCCAAATCAGACTAATCCTCCAAAAATAGGCTTTATTTTTATTCTCACCTCCTACCAATGCTCCAAGAT 1080
CCCTATGCCTATAAAAGCCACAGTCACTCCACAACTTTCAACCCACACCAAACATTCCACACCATCCTTTATTTTACTCCAACTCCGATC 1170
AACAAGCTTCCCTCTACGCTTACCTTCAGCTTTAAAATTTGAATTACTTCTTTCAGTTTTCATTACATCTTTTCCTAAAATTTGGTGCAA 1260
H D K T L KL L I L C L A W T C S P S A L R C A A 25
CtGCATCCACTCTCAAATCGACAAAACTTTGAAGCTACTGATTTTATCTCTTCCATCCACTTGTTCATTCTCTCCACTCAGATGTGCCCC 1350
I T y T P T A T H Q D Q PI K r T T B C A T SQ S T K Q r i 55
AAGAACCTATACTCCCCTAGCAACAAATCAGGACCAGCCCATTAAATTTACTACTCAACCTGCCACTTCACAAACCTACAAACACTTCAT 1440
« , L . Q . L . « e L I E „T . L C , « , « Q « ,» e. « N. ,» M :atacctgtgcttcgagatccaacaacagtggaagaaacaaatcca 1530
TCAAGCGCTTAGACAGAGACTAACAGCTCCCCTGATACACACGACATA !tc8aTATCTCCCACCATACCGA<

c8aACTcXaTCCTATTTCCTTCCTCATCCCCCAGCAtStCCAt3tA?CTACcVtTTCCCTCCCA?Cc2gCCGTACt8aCTTCCTTTTGAT(Ì7i£ G8TASrrATc8TGATctAGAGAGATGCCCTCATc2GA?AAGAGAGCAAATAA3TCTAG8cTTAc2ccécTTGA?ACATGCCAÌATCG A'iitó cÌfACGAA^TGGCCCCA^TAACCATGAACAAAAAGCTCCTAjcCTCATCGTGATAATCcÌAATGCCTT^AGAAGCACCTCGATACAGGT

5c‘i85i

AÌATéAAACCGGGTTGSTCTCAScATCCCAAjTGGTAjcGCCTTTcXACCACATCCTGCCATGTTAASTTTGCACAACAATTCCCATAAT^i^

cÌìct8ac8ac8tgttcXac2at8actcc2agatacttttccaaataatctcattttat8aa8tattaatcccc 2acctcttcttgtagat(

t3cttgt8acacccaa?acttccacttctaccattaatcctttttgtctccaatccgccaaatccaaaccgat8acctctcctaataaga ( ! ? i i

t8aA ^wLu«t2ac2gGTGCTCaScTACCATTATGAGc£aa2tGTGCGCATTCCTg8tCCACATCGGCTGtScTTCCATGTTCCT<25U

aaccaactctataacaatccccatcccatcataatgtIta!wtgcaaagacaagcttgaccagaacc2cc TGTCCACCTTGAAACTTCAC( lllh

aagaccataacgtccaaggSgaagggtttaaccaccaaaccttatgcttcacccgattcgatcgtca TAT ATC ATtStACCtÌÌGCCAGAA^ c8ACACcècACTTATTGCCAAAÌATCCGACAATG8AA?TATTATCAATA?AAACTSTGécTTCGCCTTCACTGCAAAAGCTAGACACAAT

‘H5A

c8cTATCAACTCATCCTGc2AA2cAACGAATATCTAATGc8Ac2cc8cTGGCGTACGc8cAATAACA?AACCCCTTTCGTAACTTCAATC<2 t U

g8tgggtatt8ggatctctgcatgc2agctc2gggaaataatgtgtggctggctgactgtcataaaaataaggagcagc2cc2atggcca( CTTTACA?AC A TCCCTC T AT ACCTTC AGTCC 2 A A A T AC aaacaactStttaa? TTATAAAGACCACAAAcXAGSATSTCCCAÌTCTCClfc^?^ ATCg8ttSca8caATcSaTGCCCTa8tc2aAGATGCTTGTTTAAAAATCACg8tACCATTTATAATt}ìaTATCACGACATGGTCATCCAT ' m

CTCAAACCCTSTCATc8AACTCTTAAAGACAÌAATACTTCATCCCTACCATc8TAAACCTAACc2AA{ATCGCTTACTTTCTfTTAATCC( ^ ? Ì
CTTCATCtCTGAGCAGACAAATCCTATCTGICTGTGTGTAATCTTTAAATAACTGATCCTATCCATCGATCTGCAAAAACACTCATGACC 3060
TTCATCCTGCCTTCTAATCTTTCTAATGTGCTTAAATAATAAATTTATATTTTACCAACACCCTACCCCATCAACTGATCAATAAACACA 3150
AACCAGATTGCTTCAACTCAAAATATATTTTAGAACCTAACTGCATAAAATCAAAACAATTCCAAACCTAAATCATAATTACACCCCGCC 3240
CATTTAATTCCAAAGCTACATCTTCATTCCTCCAATTAACAAATTATCTAGCAATTGGATATACAATATAATAAACTCAATATAACATCA 3330
ATATACCTAGTTCCTTCCCTTCAAACATACAAAATTAAATCATCATAACAACCTTCATCACAACAAAAGTTCAATTAAAACTTACCAAAA 3420
AAACAGCCCAAATTTCATGGCTTACCACTTAAGACTTACATAAACATTTTCCTTCACATCCTCGGCTACCTTCTACCATCAACCTAATCA 3510
GACACTITAGTTTTGATTCAATAATTCTTAAGAAAAACTTCTTACAATACTCAAACACCCCAATATATTCATACAATCTTACATAATTCA 3600
ATTC 3604

3.6.4 Sequence analysis of pQK9

Analysis of the sequence of the insert of pQK9 by translation into amino acid sequence showed that the insert contains a single long open reading frame encoding 563 amino acids (Figure 20). The amino acid sequence determined for the abrin-related probe is identical over its length to the sequence shown here.

The similarity between ricin and abrin suggested that the structure of the polypeptide encoded within pQK9 would be similar to that of preproricin. Based on a comparison with abrin C A-chain and ricin D, the polypeptide is an abrin-related preproprotein consisting of a leader of 34 amino acids, an A-chain of 251 amino acids, a linker of 14 amino acids and a B-chain of 263 amino acids.
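The segment boundaries implied by these lengths can be worked out under the numbering convention of Figure 20, in which the N-terminal residue of the mature A-chain is residue 1 and leader residues are numbered negatively. The sketch below is illustrative only: the function name `figure_numbering` is hypothetical and the leader is assumed to be the first segment. Note that the four lengths as quoted total 562 residues.

```python
# Segment lengths as quoted in the text, in N- to C-terminal order.
SEGMENTS = [("leader", 34), ("A-chain", 251), ("linker", 14), ("B-chain", 263)]

def figure_numbering(segments):
    """Return {name: (first, last)} with mature A-chain residue 1 as the origin.

    Assumes the first segment is the leader, which is numbered negatively
    and ends at -1; there is no residue 0.
    """
    ranges = {}
    position = -segments[0][1]  # number of the first leader residue
    for name, length in segments:
        first = position
        last = first + length - 1
        ranges[name] = (first, last)
        position = last + 1
        if position == 0:  # skip 0 when passing from leader to A-chain
            position = 1
    return ranges

ranges = figure_numbering(SEGMENTS)
total_residues = sum(length for _, length in SEGMENTS)
```

Under these assumptions the leader spans -34 to -1, the A-chain 1 to 251, the linker 252 to 265, and the B-chain begins at residue 266.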

1 Pst I 100
ACATTTGTCGCGTACATCATAAAAGATTCCTGCAGTACAATAATTTGCCTTCTTTCGGTGACACTTCCTTTGAGCTGGCTCTTACGTGACGTCAGCAAGA
GTGACCTAATTATAATTACTATCATCATAAATTATGCACTCIGTCCAAATGAGGCIAATCCTCCAAAAACGGG1TGATTTTTATTATGAGGTGCTACCAA
Kpn I 300
TGGTCCAAGATCCCTCTGCCTATAAAGTGATCCCACCACTTCCAAGCCACACCAAACATTGCACAGCATCCTTTATTTIACTGCAAGTGGGATCAACAAG
GTTCGCTCTACGGTTAGGTTCAGGTTTAAAATTTGAATTACTTCITTCAGTTIICATTACATCTITIGCTAAAATTTGGTGCAATTGCATGCAGTTTTAA
401 1226 A-chain...|

TTG ATG CTT TTT GTC TGC AAT CCG CCA AAT GCA AAC CAA TCA CCA CTA TTA ATA AGA TCA ATT GTC GAA GAA TCA
Leu Met Leu Phe Val Cys Asn Pro Pro Asn Ala Asn Gln Ser Pro Leu Leu Ile Arg Ser Ile Val Glu Glu Ser

266

...B-chain Fsp I

AAG ATT TGC AGC TCT CGT TAT G
Lys Ile Cys Ser Ser Arg Tyr G

Figure 20 Gene sequence of the preproprotein. The coding sequence of pQK9 is numbered above for reference, with relevant restriction sites noted. Potential control sequences are underlined. The deduced amino acid sequence is shown beneath the corresponding nucleotide sequence and is numbered below. The N-terminal residue of the mature A-chain is numbered 1 and residues 5' to residue 1 are numbered negatively. The 14 amino acid linker separating the C-terminus of the A-chain and the N-terminus of the B-chain is indicated.

3.7 ANALYSIS OF THE ABRIN-RELATED PREPROPROTEIN