CHAPTER 5: Development of a microsatellite-based approach to the study of co-segregation in FH kindreds

Summary

The aim of this work was to follow the inheritance of the LDL-receptor (LDLR) gene using co-segregation analysis of highly informative markers in families with clinical FH in whom no mutation had been found by standard mutation detection methods (Chapter 3). Classical co-segregation analysis, based on a selection of RFLPs internal to the LDLR gene, has been used extensively for both research and diagnostic studies (Schuster et al. 1991; Miserez et al. 1993). However, these intragenic RFLPs are often uninformative because of low heterozygosity and high allelic association. As a first step in this study, diallelic markers in the LDLR gene were typed in the 40 CEPH (Centre d'Etude du Polymorphisme Humain) reference families. Because these diallelic markers lie within the gene, they have a very low recombination value with it. On the basis of recombination values between these markers and microsatellites external to the LDLR gene, and of a physical map of the region which became available prior to its publication (courtesy of Dr. Linda Ashworth), two microsatellites, D19S394 and D19S221, were chosen for use in co-segregation studies of FH families. Results from twenty-two FH families from Southampton and South West Hampshire are presented. No evidence of non-co-segregation of the FH phenotype and the LDLR gene was seen in 20 of the Southampton families (total lod score = 13.4). In two families, however, non-co-segregation was seen, with lod scores of -2.06 and -3.49. In both these cases, the hypercholesterolaemia was due to causes other than a mutation in the LDLR gene (familial defective apolipoprotein B (FDB) and familial combined hypercholesterolaemia (FCH)). Use of microsatellites for co-segregation analysis facilitates both the confirmation of FH in families where identification of a mutation has not been successful and the search for any families where FH does not co-segregate with the LDLR gene, and will enhance the repertoire of molecular diagnostic tools available for FH.

5.1. Introduction

FH is caused by a very wide range of LDLR gene mutations, probably several hundred in the UK (King-Underwood et al. 1991; Sun et al. 1992; Webb et al. 1992; Gudnason et al. 1993; Day et al. 1997b). Thus, it is difficult to undertake direct mutation characterisation for either research or diagnostics. The alternative is to use linkage markers for family studies, such as restriction fragment length polymorphisms (RFLPs) internal to the gene or closely linked polymorphisms flanking the gene. The main drawback of using RFLPs for linkage studies is that they often have a low polymorphism information content (PIC) value (maximum PIC = 0.375); hence either large families, or many RFLPs, must be used in the analysis. This makes the deduction of a haplotype using a combination of diallelic RFLPs difficult, and increases the total number of genotypes needed. Many co-segregation studies of the LDLR gene in FH have been undertaken using these intragenic RFLPs (Hegele et al. 1990; Rodningen et al. 1993; Blouin et al. 1995), but a microsatellite-based approach had not yet been explored.
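The 0.375 ceiling for a diallelic marker can be checked numerically. The following is a minimal sketch, assuming the PIC definition of Botstein et al. (1980) and Hardy-Weinberg proportions; the function names and the 17-allele example with equal frequencies are illustrative assumptions, not data or software from this study.

```python
from itertools import combinations


def heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (Botstein et al. 1980 definition, assumed here).

    PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2
    """
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return heterozygosity(freqs) - cross


# A diallelic marker is most informative when both alleles have frequency 0.5.
diallelic = [0.5, 0.5]

# Hypothetical multiallelic marker: 17 alleles, assumed equally frequent.
multiallelic = [1.0 / 17] * 17

print(f"diallelic PIC    = {pic(diallelic):.3f}")
print(f"multiallelic PIC = {pic(multiallelic):.3f}")
```

Under these assumptions the diallelic marker cannot exceed a PIC of 0.375 whatever its allele frequencies, whereas a marker with many alleles of moderate frequency approaches 1.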

Characteristics of markers ideal for use in co-segregation studies include high heterozygosity, such that each marker would be informative in a large number of families, ease of amplification and genotyping, and close proximity to the gene. There are two ways to identify the markers closest to the region of interest: 1) by using genetic maps and identifying close linkage by low recombination in families, the markers with the lowest recombination value being those closest to the gene; 2) by using physical maps. The CEPH families are a universal, fully validated resource for genetic mapping in which a microsatellite-based genetic map has already been developed (Buetow et al. 1994). These families have ideal structures for linkage analysis, with three generations, four grandparents and at least eight children. Researchers can obtain CEPH DNA in return for agreeing to submit their data to a common database.

5.2. Genetic mapping and genetic markers

If two loci are on different chromosomes, they will segregate independently of each other, so the chance that a particular pair of alleles at the two loci is passed on together is 50%. If, however, the two loci are on the same chromosome, and especially if they lie close to each other, they would be expected to segregate together. Meiotic recombination can disrupt this, but it will rarely separate two loci that lie very close to each other. Hence sets of alleles close to each other will tend, unless disrupted by recombination, to be inherited as a block or haplotype. The farther apart two loci are on a chromosome, the greater the chance that a cross-over will separate them; the probability of such a separation is known as the recombination fraction (θ, theta), and is a measure of the genetic distance between the two loci.

Genetic mapping aims to estimate the genetic distance between two loci by measuring how often two loci are separated by meiotic recombination. A requirement for genetic mapping is the availability of genetic markers. Genetic mapping is useful for the elucidation, through linkage, of disease causing genes. Marker-marker mapping is useful for the construction of skeleton and framework maps. These are high resolution maps of markers used for further genetic and physical mapping. To be used for genetic or disease mapping, markers must be inherited in a Mendelian fashion and must be polymorphic. They should be easily typed, easily scored, cheap and use readily available material such as blood or buccal cells.
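As an illustration of how counts of recombinant meioses are turned into a measure of linkage, the following minimal sketch computes a two-point lod score. It assumes the simplest textbook case of phase-known, fully informative meioses; the counts used are hypothetical, and the function names are illustrative rather than those of any linkage package.

```python
import math


def theta_hat(recombinants, meioses):
    """Maximum-likelihood estimate of the recombination fraction, capped at 0.5."""
    return min(recombinants / meioses, 0.5)


def lod(recombinants, meioses, theta):
    """Two-point lod score at a given theta (phase-known meioses assumed).

    Z(theta) = log10[ theta^R * (1 - theta)^(N - R) / 0.5^N ]
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    nonrecombinants = meioses - recombinants
    if recombinants > 0 and theta == 0.0:
        # A single recombinant excludes complete linkage.
        return float("-inf")
    log_likelihood = 0.0
    if recombinants > 0:
        log_likelihood += recombinants * math.log10(theta)
    if nonrecombinants > 0:
        log_likelihood += nonrecombinants * math.log10(1.0 - theta)
    return log_likelihood - meioses * math.log10(0.5)


# Hypothetical example: 1 recombinant observed in 10 informative meioses.
r, n = 1, 10
best = theta_hat(r, n)
print(f"theta_hat = {best:.2f}, lod = {lod(r, n, best):.2f}")
```

Each fully informative non-recombinant meiosis contributes log10(2), about 0.3, to the lod score at θ = 0, which is why highly heterozygous markers reach significant scores in far fewer meioses than diallelic ones.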

Human genetic markers were first used in the early part of the 20th century. Blood groups (Lutheran, ABO, SE etc., approximately 20 loci) were utilised as genetic markers from 1910 to the 1960s. These markers had certain disadvantages, such as the need for fresh blood or rare antisera; in addition, genotype could not be inferred from the phenotype in certain cases because of dominance. The physical localisation of these markers was also not known. Electrophoretic mobility variants of serum proteins and HLA tissue types were then used, but again these needed specialised assays and provided limited information, either because of low polymorphism values or because, as in the case of HLA tissue types, only linkage to 6p21.3 could be tested. The advent of recombinant DNA technology brought about the discovery and use of RFLPs (Botstein et al. 1980). RFLPs were first used as a tool for genetic analysis in 1974 for the localisation of temperature-sensitive adenovirus mutations on a physical map of restriction fragments. These biallelic markers, though extremely abundant in the human genome, suffer from the same constraints as the classic genetic markers: individual loci provide limited information, with the maximum possible PIC value being 0.375. DNA minisatellites were first discovered by Jeffreys et al. in 1985. These are hypervariable markers which share a common consensus sequence but differ in the number of repeats, often in the range of 9-24 repeats (from 0.1-20 kb). They are found on all chromosomes, but often cluster near telomeres.

DNA microsatellites are defined by the presence of short arrays of tandemly repeating units of 1-4 bp, and are dispersed throughout the genome. They are found in all eukaryotic organisms. A and T mononucleotide repeats are the most common, and together account for 10 Mb or 0.3% of the nuclear genome. By contrast, G and C repeats are less common. CA dinucleotide repeats (TG on the opposite strand) are the most frequently occurring dinucleotide repeat and account for 0.5% of the genome. CT/AG are the next most common repeats, followed by CG/GC repeats, which are rare (Brenner et al. 1993). This is because C residues flanked by a G at the 3' end (CpG) are prone to methylation and subsequent deamination, resulting in a TpG (CpA on the other strand). Trinucleotides and tetranucleotides are both one or two orders of magnitude less frequent than (CA)n repeats (Gastier et al. 1995).

Classical minisatellites may be difficult to handle by standard PCR protocols, especially for the larger repeating array sizes, as these alleles may fail to amplify (preferential amplification of the smaller allele). PCR amplification of these repeats may therefore give null alleles or false homozygotes. This can be resolved by typing using Southern blotting. In addition, minisatellites tend to be clustered in subtelomeric regions of chromosomes. The standard tools for linkage analysis are therefore the microsatellites, the bulk of which are (CA)n repeats. These are, however, prone to stutter bands or shadow bands due to replication slippage during PCR, which may make the results hard to read. Tri- and tetranucleotide repeats give clearer single bands and are gradually replacing dinucleotide repeats as the markers of choice. Compatible sets of microsatellite repeats are currently available which can be amplified together in a multiplex PCR reaction and run on the same gel. This is made possible by designing primers to give bands with allele sizes that do not overlap, and by labelling them with different fluorescent dyes.

5.3. Genetic and physical map distances

Genetic distance is not the same as physical distance, though the two can be correlated. Physical distance is as its name implies, the actual distance in DNA measured in kb or Mb, whereas genetic distance is defined by the recombination fraction. No matter how far the physical distance between two loci the recombination fraction never exceeds 50%.
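The 50% ceiling on the recombination fraction can be illustrated with a map function. The sketch below uses Haldane's map function, which assumes no crossover interference; this choice is an assumption made purely for illustration and is not a method used in this chapter. The function names are hypothetical.

```python
import math


def haldane_theta(distance_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def haldane_cm(theta):
    """Inverse of haldane_theta: map distance in cM for a recombination fraction."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must be at least 0 and below 0.5")
    return -50.0 * math.log(1.0 - 2.0 * theta)


# The recombination fraction tracks map distance closely for short intervals
# and levels off towards 0.5 for loci that are far apart.
for cm in (1, 10, 50, 200):
    print(f"{cm:>4} cM -> theta = {haldane_theta(cm):.4f}")
```

For short intervals θ and map distance are almost interchangeable (1 cM gives θ of about 0.01), whereas for widely separated loci θ approaches, but never exceeds, 0.5.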

Overall, recombination is more frequent in females than in males, and hence female genetic maps are longer than male genetic maps. However, for some specific intervals, recombination is higher in male meiosis, e.g. there is a male hot spot of recombination in the telomeric region of chromosome 21q, whereas females show low or no recombination events in this region (Blouin et al. 1995). The biological basis for this difference in recombination is not known (Ashley, 1994). A study of human sperm showed that inter-individual variability, as well as chromosomal variation, in recombination exists, especially for some of the larger chromosomal arms (Yu et al. 1996). There seems to be a strong genetic contribution to this inter-individual variability in humans (Robinson, 1996). Thus, recombination rates over a specific chromosome segment are controlled both by unlinked factors, such as genes involved in recombination/repair that can exert their effects in trans elsewhere in the genome, and by local effects attributable to sequence, position, or chromatin structure. In the human genetic maps currently available, any large gaps existing between markers may represent regions of enhanced recombination rather than segments devoid of microsatellites (Dib et al. 1996). Hence, the relationship between genetic and physical distance is not a constant one. In general, in a sex-averaged map, 1% recombination (1 cM) corresponds on average to approximately one million base pairs (1 Mb) (Li et al. 1988). This value is an average and may break down in specific chromosomal regions. Recombinational hot spots, for example, are found near both telomeres of chromosome 16, and a marked reduction in recombination is present in most areas encompassing the centromere and satellite heterochromatin (Doggett et al. 1995). A similar finding is seen in the pseudoautosomal region at the tip of the short arm of the X and Y chromosomes (Strachan and Read, 1996).
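As a rough worked example of the 1 cM per Mb rule of thumb, the sketch below converts the physical distances quoted for the two microsatellites in section 5.5.1 (150 kb for D19S394 and 1300 kb for D19S221) into approximate genetic distances. The function name and the default rate are assumptions; since the average may not hold on chromosome 19p, the figures are order-of-magnitude expectations only.

```python
def approx_cm_from_kb(distance_kb, cm_per_mb=1.0):
    """Approximate genetic distance (cM) from physical distance (kb).

    cm_per_mb is the assumed genome-wide average rate; local rates can differ
    substantially, so the result is only a first approximation.
    """
    return (distance_kb / 1000.0) * cm_per_mb


# Physical distances from the LDLR gene as quoted in section 5.5.1.
markers_kb = {"D19S394": 150, "D19S221": 1300}

for name, kb in markers_kb.items():
    cm = approx_cm_from_kb(kb)
    print(f"{name}: {kb} kb is roughly {cm:.2f} cM from the gene")
```

On this approximation the nearer marker would be expected to recombine with the gene in fewer than 1 in 500 meioses, which is consistent with choosing markers by physical proximity when family data are too sparse to resolve such small recombination fractions.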

In this chapter, the CEPH reference families as well as a physical map of the region were used to select markers for co-segregation studies on the basis of high heterozygosity and low recombination with the LDLR gene. Twenty-two FH heterozygous probands from Southampton and SW Hampshire were selected for co-segregation analysis on the basis of a clear-cut phenotype of FH and the availability of other contactable family members. Direct detection of mutations in the LDLR gene by SSCP analysis was undertaken simultaneously with expansion of the families and co-segregation analysis with the marker D19S394. Probands from another sample group from Utah, in which mutation detection had been carried out, were included in this group for co-segregation studies with highly informative microsatellite markers. Results for these probands and families from Utah will not be presented in this chapter, but are detailed in Chapter 7: Evidence of a third genetic locus causing FH.

5.4. Methods

5.4.1. Patient selection and isolation of DNA

Twenty-two clinically heterozygous FH probands from Southampton and SW Hampshire were selected and their families expanded. These probands were chosen from a set of 150 probands being screened by SSCP for mutations in the LDLR gene. Families were chosen according to ease of contact of family members. All family probands satisfied the diagnostic criteria of FH set out in section 2.10.1. DNA was isolated from either whole blood or mouthwash samples, and the appropriate DNA dilutions for storage were made as described in section 2.1.

5.4.2. LDLR gene polymorphisms in the CEPH families

CEPH DNA was kindly provided by the MRC Biochemical Genetics Unit, Galton Laboratory, Wolfson House, University College London. The DNA was examined for the following LDLR gene polymorphisms: Ava II (exon 13), TA repeat (3' UTR, exon 18) and microsatellite D19S394. The Pvu II polymorphism (intron 15, V. Gudnason, personal communication) was typed in a small number of CEPH individuals to confirm certain genotypes already in the CEPH database. Oligonucleotide sequences were as described in Tables 2.1 and 2.2.

5.4.2.1. Ava II RFLP, exon 13 LDLR gene

Approximately 500 samples from 40 CEPH families were amplified by PCR for exon 13 of the LDLR gene. PCR and cycling conditions were as given in section 2.4.1. Amplified products were digested with Ava II (section 2.4.1), run on a MADGE gel (section 2.5.2), and detected by staining with ethidium bromide as described in section 2.6.1.1.

5.4.2.2. TA repeat, exon 18 LDLR gene

Analysis of this triallelic polymorphism in exon 18 was done on approximately 500 CEPH individuals. PCR conditions were as described in section 2.4.3. Samples were analysed on a long denaturing gel (section 2.5.1), and detected by autoradiography (section 2.6.1.2).

5.4.2.3. Pvu II polymorphism, intron 15 LDLR gene

Twenty-two individuals from one CEPH family (1344) were typed for this intronic polymorphism. Until then, this polymorphism had been detected only by Southern blotting methods. PCR and digestion conditions are described in section 2.4.4.

5.4.3. PCR of microsatellite D19S394

PCR conditions were as described in section 2.2.4. A total of 109 individuals from 22 Southampton and SW Hampshire families were analysed for microsatellite D19S394. Fragment allele sizes were determined by fluorescent detection as described in section 2.7. The CEPH DNA from the 40 reference families was also typed for D19S394 and detected as above. Typing of the CEPH families for marker D19S394 was necessary even though this marker was already present in the CEPH database, as it had been typed in very few families and individuals, which would have led to underestimation of any recombination likely to be present.

5.4.4. LDLR gene polymorphism analysis

Prior to the analysis of FH families by co-segregation, two LDLR gene polymorphisms not in substantial linkage disequilibrium with each other (Miserez et al. 1993), Ava II in exon 13 and a TA repeat in the 3' untranslated region of exon 18, were typed in individuals from the 40 CEPH reference families as described above. The “link2sum” program (J. Attwood, unpublished) was used to tabulate the genotyping data by mating types, to compute gene frequencies and to test for Mendelian segregation and allelic exclusions. Gene frequencies were computed by simple counting in parents, and allelic exclusions were calculated for all individuals with at least one typed parent. The PIC value was also estimated for each marker. The CHROMPIC function of CRI-MAP (a program for carrying out likelihood analyses and for detecting and presenting chromosomal breakpoints) presented the data in the form of chromosomes (gametes) for every offspring in each pedigree. The chromosomes were shown with a set of closely linked and ordered markers with a phase likelihood estimate. In this way, all cross-overs in the offspring in each pedigree were shown under the most likely linkage phase, classified by male or female parent of origin. The presence of close double recombinants, or of parents transmitting recombinant chromosomes to different offspring, was an indication of genotype errors at the relevant loci. A two-point analysis was carried out between the LDLR gene and chromosome 19 markers extracted from the CEPH database, to identify those closest to the gene, i.e. those with the smallest genetic (and physical) distance.
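The counting and segregation checks described above can be sketched as follows. This is not the link2sum program, whose internals are not described here; it is a minimal illustration under the assumption that each genotype is recorded as a pair of allele labels, and all function names and genotypes are hypothetical.

```python
from collections import Counter


def allele_frequencies(parent_genotypes):
    """Estimate allele frequencies by simple counting over parental genotypes."""
    counts = Counter(allele for genotype in parent_genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(parent_genotypes):
    """Proportion of parents carrying two different alleles."""
    heterozygous = sum(1 for a, b in parent_genotypes if a != b)
    return heterozygous / len(parent_genotypes)


def mendelian_consistent(father, mother, child):
    """True if the child could have received one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


# Hypothetical parental genotypes at a three-allele marker.
parents = [(1, 2), (2, 2), (1, 3), (2, 3)]
print(allele_frequencies(parents))
print(observed_heterozygosity(parents))

# A child typed (1, 2) is excluded when the mother carries neither allele.
print(mendelian_consistent((1, 2), (3, 4), (1, 2)))
```

A failed consistency check flags a trio for re-typing; like the double recombinants described above, it points to a likely genotyping or sample error.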

To calculate the LOD score, individuals from FH families from Southampton and SW Hampshire were assigned affected/non-affected status on the basis of elevated cholesterol (total cholesterol cut-off value of > 6.7 mmol/l for those under 16 years old, and > 7.5 mmol/l for those over 16 years old). LOD scores (and non-parametric linkage (NPL) scores) were calculated for the 22 FH pedigrees by the GENEHUNTER program (Kruglyak et al. 1996) (Human Genome Mapping Project, Hinxton, Cambridge).
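The affection-status rule can be written as a short function. This is a minimal sketch: the function name is hypothetical, and because the cut-offs above do not say how an individual aged exactly 16 is classed, the sketch assumes the adult cut-off applies from age 16.

```python
def is_affected(total_cholesterol_mmol_l, age_years):
    """Assign affected status from total cholesterol and age.

    Cut-offs: > 6.7 mmol/l under 16 years, > 7.5 mmol/l otherwise.
    Assumption: individuals aged exactly 16 are given the adult cut-off.
    """
    cutoff = 6.7 if age_years < 16 else 7.5
    return total_cholesterol_mmol_l > cutoff


# Hypothetical family members: (total cholesterol in mmol/l, age in years).
members = [(7.0, 10), (7.0, 40), (9.2, 40)]
for cholesterol, age in members:
    status = "affected" if is_affected(cholesterol, age) else "non-affected"
    print(f"{cholesterol} mmol/l at {age} years: {status}")
```

The same cholesterol value can therefore give a different status in a child and in an adult, which matters when younger relatives are scored in a pedigree.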

5.5. Results

5.5.1. Genetic mapping of chromosome 19p

A physical map of the LDLR gene and the surrounding region is shown in Figure 5.1. From this map, D19S394 is at a distance of 150 kb telomeric to the gene (Ashworth et al. 1995), and exhibits extremely high heterozygosity (0.9) with at least 17 different allele sizes. The next closest microsatellite is D19S221, a dinucleotide repeat, about 1300 kb centromeric to the gene (Ashworth et al. 1995), with a heterozygosity of 0.87. Table 5.1 summarises the basic genotyping data. Approximately 500 individuals were typed for Ava II, the TA repeat, and D19S394, while information for D19S221 was already present in the database; D19S394 was also in the CEPH database, but had been typed in very few individuals and was re-typed for the whole set of 40 reference families. As shown, D19S394 had the highest heterozygosity value, and so the largest percentage of heterozygous parents. This was followed by D19S221 with a heterozygosity of 0.85. Table 5.2 shows a summary of the linkage analysis data of the LDLR gene polymorphisms. Recombination fractions (male and female) between the TA repeat and Pvu II and D19S394 were zero, with LOD scores of 65 and 38 respectively. For Ava II and D19S394, the male recombination fraction was 0.07 and the female 0.01. D19S221 had higher recombination values with the TA repeat.

Table 5.1. LDLR gene markers typed on CEPH families.

Marker             Ava II    TA repeat    D19S394    D19S221
Total typed        500       518          423        412
Heterozygosity*    0.49      0.55         0.91       0.85

*Heterozygosity calculated from parental genotypes. Data on marker D19S221 were obtained from CEPH version 8.1.

Pvu II TA repeat