3.2 DISCUSSION
3.2.2 Evaluation of the cDNA selection library
Contamination with non-specific sequences
The evaluation of the Y-linked, testis-expressed cDNA selection library, by hybridisation with a variety of probes and by sequencing, revealed that only a small proportion of the selected clones contained expressed cDNAs as inserts (731 recombinants, 16.2% of the library). The remaining recombinants (>83%) contained either vector sequences or genomic Y Chr repeat sequences. The expected level of such contamination was approximately 10% (Del Mastro and Lovett, 1997), and thus the proportion of contaminants in this library is unusually high.
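The arithmetic behind these proportions can be made explicit with a minimal sketch. Only the three input figures come from the text; the implied library size and the fold excess over the expected contamination level are derived values and should be read as rough inferences, not reported results.

```python
# Figures quoted in the text; everything derived from them is an inference.
CDNA_CLONES = 731        # recombinants carrying expressed cDNA inserts
CDNA_FRACTION = 0.162    # their stated share of the library
EXPECTED_CONTAM = 0.10   # contamination level expected from the literature


def implied_library_size(count: int, fraction: float) -> int:
    """Total number of recombinants implied by a count and its fractional share."""
    return round(count / fraction)


def contamination_fraction(cdna_fraction: float) -> float:
    """Share of the library that does not carry an expressed cDNA insert."""
    return 1.0 - cdna_fraction


library_size = implied_library_size(CDNA_CLONES, CDNA_FRACTION)
observed_contam = contamination_fraction(CDNA_FRACTION)
fold_excess = observed_contam / EXPECTED_CONTAM

print(f"implied library size: {library_size} recombinants")
print(f"observed contamination: {observed_contam:.1%}")
print(f"excess over expected level: {fold_excess:.1f}-fold")
```

Under these figures the observed contamination is roughly eight times the level expected from the literature.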
A fundamental factor affecting the outcome of the selection process is the quality of the starting materials. More specifically, the quality of the cDNA source is of great importance: a cDNA source that has a good length distribution and is low in ribosomal RNA has the potential to give excellent results. The quality and purity of the genomic template are also critical. The technique relies on capturing the biotinylated genomic target together with its hybridised cDNAs. It is therefore essential to obtain good incorporation of biotin within the genomic target, and experience has shown that cosmids give high levels of enrichment and low background. In addition, the binding capacity of the streptavidin-coated magnetic beads should be adequate to capture the biotinylated genomic fragments with their hybridised cDNAs. Finally, the relative amount of cDNA to genomic target is an important parameter in a selection experiment. In general, the genomic target should be in excess relative to the cDNA, which ensures that the lower-abundance cDNAs are efficiently selected.
The starting material used for the construction of the cDNA selection library was made under the supervision of Dr. K. Taylor and is described in Dr. J. Cameron's thesis. The total RNA was prepared using the LiCl/urea extraction method, and its yield was estimated spectrophotometrically to be 540 µg. The estimated amount of polyadenylated RNA was 10 µg, and it was isolated using Message Affinity Paper. Following the isolation of mRNA, first-strand cDNA was synthesised using oligo-dT primers. The yield of single-stranded cDNA used in the construction of the selection library was 0.57 µg. As a genomic target, 480 cosmids from the Lawrence Livermore library (plates 1-5) and 480 Y-cosmids from the Taylor et al. library (plates D, E, F, G and H) were used. These cosmids provided a coverage of ~51 Mb of DNA, which was equivalent to approximately 1 Y chromosome. The Lawrence Livermore library was prepared by flow sorting Y chromosomes from the somatic cell hybrid 1640-51 and, of the clones produced, 82% are estimated to be human, 13% are hamster and 5% are non
mouse somatic cell hybrid 3E7 and is thought to contain <10% contamination of human Chrs 1 and 12. Consequently, it is reasonable to assume that the starting material used for the construction of the cDNA selection library described in this thesis was checked sufficiently and was of good quality.
Vector, repetitive and ribosomal RNA contamination appears to be a common problem for most cDNA selection libraries, with the reported proportion of clones containing repeats and ribosomal RNAs varying from >5% to 60% (Table 3.9). There are fewer estimates of the extent of contamination by vector fragments. Interestingly, in a similar experiment conducted by Del Mastro et al. (1995) using Chr 5 cosmids and 4 different human tissues, vector contaminants were not abundant (0.4%), but the level of contamination from repeat elements (21%) was similar to that in the library described in this thesis (24%).
It is a widely held opinion that there are only a very low number of Y-linked, testis-expressed sequences. This feature, in combination with the high level of repetitive elements on the Y chromosome, could account for the relatively low proportion of clones that were genuine cDNAs. However, the Y-chromosome, testis-specific cDNA selection library constructed by Lahn and Page (1997) was found to have an overall level of non-cDNA sequences of >11%, close to the level regarded as acceptable and corresponding to the contamination level suggested by Lovett and Del Mastro (1997). Why do the present results and those of Lahn and Page (1997) differ? A partial explanation may be found in the different experimental methodologies used in the construction of the two libraries. For example, Lahn and Page used a genomic target that provided a 5-fold coverage of only the Y euchromatic region, whereas the cosmids used for the construction of the selection library described here provided only a 1-fold coverage of the Y chromosome. In addition, these cosmids were randomly selected and could include both the euchromatic and the highly repetitive heterochromatic regions of the Y chromosome. Another difference between the two protocols is that Lahn and Page carried out four rounds of cDNA selection, followed by two rounds of subtraction with human Cot-1 DNA. In contrast, in the method followed by Lovett and Del Mastro, the repeats present in the cDNA were blocked with Cot-1 DNA before the genomic target / cDNA hybridisation, and the cDNA selection process was carried out only twice.
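The practical consequence of 1-fold versus 5-fold coverage can be illustrated with a simple Poisson model of random clone placement. This model is an assumption introduced here for illustration and is not part of the original analysis; real cosmid libraries are not perfectly random, and the two coverage figures refer to different target regions (the whole Y chromosome versus the euchromatic region only).

```python
import math


def fraction_represented(coverage: float) -> float:
    """Expected fraction of the target present in at least one clone.

    Assumes clones are placed at random (Poisson model), so the chance
    that a given locus is missed entirely is exp(-coverage).
    """
    return 1.0 - math.exp(-coverage)


# Coverage values taken from the text; the labels are descriptive only.
for label, coverage in (("1-fold (library described here)", 1.0),
                        ("5-fold (Lahn and Page, 1997)", 5.0)):
    print(f"{label}: {fraction_represented(coverage):.1%} of the target represented")
```

Under this assumption, roughly a third of the target would be absent from a 1-fold genomic template, whereas a 5-fold template would be almost complete, which is consistent with some known Y-linked genes not being captured.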
Identified Y-linked sequences
Despite the high frequency of clones not containing cDNAs, the selection library described here was considered successful in isolating expressed genes. Of the 148 clones that were sequenced, 79 matched expressed sequences in the database, and many of those were shown to be of Y origin (48, including redundant clones).
Random sequencing identified only two known Y-linked genes, TSPY and RBMY, and these recombinants accounted for 8.9% of the sequenced cDNAs. Some known Y-linked genes, like DAZ and SRY, were not encountered amongst the 148 clones sequenced, but are expected to be amongst the remaining 583 clones, which were not sequenced. Lahn and Page (1997), who used 3,600 Y-linked cosmids as template rather than the 980 used in this experiment, identified 16 known genes, both Y-specific and pseudoautosomal, which corresponded to 84.1% of the cDNA-containing clones. The Lahn and Page genomic targets were selected to give five-fold coverage of the 30 Mb euchromatic region, which theoretically contained >95% of Y-linked STSs and would provide sufficient genomic material for capturing most of the Y-linked genes. In contrast, the 980 cosmid clones used to construct the cDNA library evaluated in this thesis were thought to provide only a one-fold coverage of the entire 50 Mb Y chromosome.
(1997), a large proportion (64%) of the expressed sequences identified in the present UCL study mapped to the Y chromosome and/or were expressed in testis and its associated structures.
Interestingly, a number of the expressed sequences identified in the UCL testis cDNA selection library appear to be either proto-oncogenes, for example m112d10, which matched the int-2 proto-oncogene, or sequences expressed in tumorigenic tissues, like m113d12, which is expressed in Ewing's sarcoma. The adult testis was obtained from a testis cancer patient (the cancerous parts were removed), which could explain the expression of tumour-related genes; however, proto-oncogene expression in testis is perhaps not surprising, given the high level of cell division, both mitotic and meiotic. Several proto-oncogenes, like int-1, c-myc, c-fos and c-jun, are involved in cell proliferation mechanisms and have been shown to be expressed at several developmental stages of germ cells (Propst et al., 1988; Kumar et al., 1993).
Another interesting feature of the cDNAs isolated from the UCL selection library is that some match ESTs that contain repetitive elements. Table 3.8 lists 14 such cDNAs, which correspond to 17.7% of the sequenced clones. It has been proposed that testis is a tissue in which high levels of repetitive sequences, like MERs, LINEs and SINEs, and retroposon-like elements with long terminal repeats are transcribed (Lankenau et al., 1994; Hendriksen et al., 1997; Casau et al., 1999). As a result, it is not unexpected that a proportion of the cDNAs would fall into this category. However, although Lahn and Page found repetitive sequences in their library, they did not specify whether these were amongst the ones mentioned above or were other Y-specific repeats.
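The denominator behind the 17.7% figure is not stated explicitly. A short check, using only counts quoted in this section, suggests it is the 79 clones that matched expressed sequences, not all 148 sequenced clones; this reading is an inference from the arithmetic, not a statement from the original analysis.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


SEQUENCED = 148          # clones sequenced (from the text)
EXPRESSED_MATCHES = 79   # sequenced clones matching expressed sequences (from the text)
REPEAT_ESTS = 14         # cDNAs matching repeat-containing ESTs (Table 3.8)

print("repeat-containing ESTs / expressed matches:", percent(REPEAT_ESTS, EXPRESSED_MATCHES))
print("repeat-containing ESTs / all sequenced:    ", percent(REPEAT_ESTS, SEQUENCED))
print("expressed matches / all sequenced:         ", percent(EXPRESSED_MATCHES, SEQUENCED))
```

Only the first ratio reproduces the quoted value, so percentages given "of the sequenced clones" in this section are probably best read relative to the 79 expressed-sequence matches.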
Some of the cDNA clones selected by Chr Y DNA from the testis library are puzzling. One example is cDNA clone m113b3 (Table 3.6), which shows 100% homology with a cDNA sequence obtained from Spirometra erinacei (plerocercoid stage) and with an EST from a laryngeal cancer cDNA library. A connection between those tissues and Spirometra (a tapeworm) is unlikely, unless the patient from whom the tissue samples were taken was infected with this parasite. Even so, it is difficult to explain why the Y Chr template should have selected this cDNA.