
THE MOLECULAR GENETICS REVOLUTION

In the mid-1950s, only two “facts” were known about the human genome: humans were thought to have 48 chromosomes, and X-chromosome inactivation in humans was thought to occur by the same mechanism as had been observed in fruit flies. Both of these observations have since been proven to be in error. In the past few decades, there has been an explosion of knowledge about the human genome, largely attributable to advances in molecular biology.

Deoxyribonucleic Acid

Genes are the instructions required for building structural proteins, enzymes, and peptide hormones, and the complete set of genetic instructions for any organism is called its genome (Table 6.11). The human genome has 46 chromosomes: 22 pairs of autosomes and two sex chromosomes. The genome is made up of three billion base pairs and contains somewhere between 40,000 and 100,000 genes. The functions of approximately 10,000 human genes have been characterized, and the full human genome sequence is expected to be completed in 2003.

TABLE 6.11. The human genome

In 1944, Avery and colleagues demonstrated that DNA is the chemical that carries genetic instructions. Roughly equal parts of DNA and its supporting proteins make up the 46 chromosomes. If the strands of DNA in the nucleus of a single cell could be unwound and spliced together, the resulting DNA molecule would stretch more than 1.5 meters long, but it would be only about 2 nanometers (2 ten-millionths of a centimeter) wide.

The genetic code is spelled out with the four nitrogenous bases: adenine, thymine, cytosine, and guanine (Fig. 6.5). The purine and pyrimidine bases are arranged in a ladderlike, double helix arrangement that is very stable (i.e., theoretic dissociation constant = 10⁻²³). During cell division, DNA is duplicated with extremely high fidelity: each side of the molecular ladder serves as the template for synthesis of a new complementary strand.

FIG. 6.5. Genetic code. The DNA code consists of four characters and is read three characters at a time. It is translated into an RNA message, which instructs cells in how to assemble proteins from amino acid building blocks.

The human genome consists of at least 40,000 genes, but the genes comprise only one tenth of the encoded information. Most of the genome is of unknown function, but it probably codes for the proper spacing, alignment, and punctuation of the genetic instructions. About 99.8% of the DNA sequence is identical from one person to the next. Stated another way, there are many minor differences between any two persons; on average, there is a variation of one nucleotide for every 200 to 500 base pairs. When these sequence differences occur within genes, they can lead to genetic diseases or genetic variation. Most of the minor differences have no observable effect because they occur in the noncoding regions of the genome, regions of DNA that do not contain genes. These otherwise unimportant differences have been the basis of the current explosion of genetic knowledge, because much of our ability to study genes or diagnose genetic illness exploits differences (i.e., DNA sequence polymorphisms) in these regions to track or find neighboring genes.
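The figures in this paragraph can be checked with simple arithmetic. The sketch below assumes a genome of three billion base pairs, as stated earlier in the chapter, and treats sequence differences as evenly spaced, which is a simplification; the function names are illustrative only.

```python
GENOME_BP = 3_000_000_000  # three billion base pairs, as stated in the text


def spacing_from_identity(identity):
    """Average number of base pairs per variant nucleotide for a given
    fraction of identical sequence (e.g., 0.998 for 99.8%)."""
    return round(1 / (1 - identity))


def expected_differences(genome_bp, bp_per_variant):
    """Approximate number of nucleotide differences between two persons,
    assuming one variant every `bp_per_variant` base pairs."""
    return genome_bp // bp_per_variant


if __name__ == "__main__":
    print(spacing_from_identity(0.998))          # 500
    print(expected_differences(GENOME_BP, 500))  # 6000000
    print(expected_differences(GENOME_BP, 200))  # 15000000
```

Under these assumptions, 99.8% identity corresponds to one variant per 500 base pairs, and the quoted range of one variant per 200 to 500 base pairs implies roughly 6 to 15 million single-nucleotide differences between any two persons.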

The DNA sequence is read by cellular enzymes three bases at a time, and each triplet directs the positioning of a particular amino acid within the structure of a protein (see Fig. 6.5). The protein coding instructions are transmitted to the cellular machinery through messenger RNA, a transient, intermediary molecule that is similar to a single strand of DNA (Fig. 6.6). The RNA strand is transcribed from the DNA template in the nucleus and has an opposite or complementary genetic sequence.

Messenger RNA moves from the nucleus into the cytoplasm, where the protein manufacturing organelles build a protein. Analysis of messenger RNA molecules is extremely useful in the laboratory for detecting genes.
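The flow of information described above, from a DNA template to a complementary messenger RNA and then to amino acids read three bases at a time, can be sketched in a few lines. This is a minimal illustration, not a model of the cellular machinery: the codon table is a small subset of the standard genetic code, the example template sequence is invented, and introns, strand orientation, and regulatory signals are ignored.

```python
# Complementary base pairing used during transcription (RNA uses U, not T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Illustrative subset of the standard genetic code (not the full 64 codons).
CODONS = {
    "AUG": "Met", "CCA": "Pro", "GAA": "Glu", "UUU": "Phe", "GGC": "Gly",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(template_dna):
    """Return the messenger RNA complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_dna)


def translate(mrna):
    """Read the message three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    template = "TACGGTCTTATT"      # hypothetical template strand
    message = transcribe(template)
    print(message)                 # AUGCCAGAAUAA
    print(translate(message))      # ['Met', 'Pro', 'Glu']
```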

FIG. 6.6. Anatomy of a gene. Regulatory regions are present in the 5' region. Introns are spliced out of the final messenger RNA.

Several advances in molecular biology have enabled the molecular genetics revolution to take place. The first was the discovery of restriction enzymes, which are bacterial proteins that can cut DNA molecules at specific sites by recognizing the DNA sequence at those sites. Over 400 restriction enzymes have been discovered, many are commercially available, and about 25 are used commonly. Restriction fragment length polymorphisms (RFLPs) occur because of minor sequence changes (usually single base substitutions) that abolish or create a recognition site, altering the length of a digestion fragment. Restriction sites occur frequently, and several restriction sites can occur in the vicinity of any given gene. When the presence or absence of such a site varies among individuals, the resulting RFLPs become useful markers for linkage studies, diagnostic testing, and paternity testing (Fig. 6.7). RFLPs and other DNA polymorphisms provide the landmarks for genetic maps.

FIG. 6.7. Linkage study using restriction fragment length polymorphisms. Each lane represents the genotype of one family member. M, mother; F, father; D, daughter; S, son. In this example, the disease allele is associated with the upper band passed from the mother to the son.
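The way a single base substitution changes fragment lengths can be illustrated with a simulated digestion. The sketch below uses the recognition site of the restriction enzyme EcoRI (GAATTC, cut after the first G); the two allele sequences are invented for illustration, and only one DNA strand is modeled.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut `sequence` at every occurrence of `site` and return the
    fragment lengths. `cut_offset` is the position of the cut within
    the recognition site (1 for EcoRI, which cuts G^AATTC)."""
    lengths = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        lengths.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    lengths.append(len(sequence) - start)
    return lengths


# Hypothetical alleles: allele B differs from allele A by one base
# (GAATTC -> GAGTTC), which abolishes the second recognition site.
ALLELE_A = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 8
ALLELE_B = "A" * 10 + "GAATTC" + "C" * 20 + "GAGTTC" + "T" * 8

if __name__ == "__main__":
    print(digest(ALLELE_A))  # [11, 26, 13]
    print(digest(ALLELE_B))  # [11, 39]
```

On a gel, the two alleles would be distinguished by band size: allele A yields 26- and 13-base fragments where allele B yields a single 39-base fragment.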

Scientists have gained a greater understanding of how to manipulate the physical conditions, such as pH, salt concentration, and temperature, of in vitro DNA reactions. These skills, combined with the use of restriction enzymes, allowed the development of recombinant DNA, or new combinations of DNA engineered in the laboratory. Recombinant DNA technology has made possible the development of gene probes (pieces of DNA, usually radioactively labeled) that recognize and bind specifically to a homologous sequence in another sample of DNA. These technologies also underlie cloning, the copying of DNA segments in lower animals, and the manufacturing of human proteins using bacteria or cell cultures.
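The specificity of a gene probe comes from base pairing: a single-stranded probe binds where the sample contains its complementary sequence. The following sketch models this as an exact match to the probe's reverse complement. That is a simplification, since real hybridization tolerates some mismatch depending on the reaction conditions, and both sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the sequence of the complementary strand, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def probe_binding_sites(sample_strand, probe):
    """Return the start positions in `sample_strand` where the probe
    would pair perfectly (exact-match assumption)."""
    target = reverse_complement(probe)
    hits = []
    position = sample_strand.find(target)
    while position != -1:
        hits.append(position)
        position = sample_strand.find(target, position + 1)
    return hits


if __name__ == "__main__":
    sample = "TTGACCTGAAGGCTA"   # hypothetical single-stranded sample
    probe = "CCTTCAG"            # hypothetical probe
    print(reverse_complement(probe))           # CTGAAGG
    print(probe_binding_sites(sample, probe))  # [5]
```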

Various blotting technologies are used commonly to study DNA. With blotting, biologically relevant molecules undergo electrophoresis and are transferred to a stable membrane for repeated experiments. Blots are called Southern blots if DNA is being analyzed, Northern blots if RNA is being analyzed, and Western blots if proteins are being analyzed.

DNA testing is clinically applicable to many disorders and can be performed in one of several ways (Table 6.12). When the molecular basis of a disease is known, direct mutation testing can provide a yes or no answer on any DNA sample. For instance, hundreds of mutations have been discovered in cystic fibrosis (CF). A battery of mutations can be tested for using various methods such as dot blots, which are simple to interpret (Fig. 6.8).

TABLE 6.12. Common conditions for which DNA testing is available

FIG. 6.8. Direct mutation diagram. Direct detection of cystic fibrosis mutations using reverse dot blots. In this example, five mutations in exon 11 of the cystic fibrosis gene (G542X, S549N, G551D, R553X, R560T) are tested for using a simple YES/NO assay. Exon 11 is amplified using the polymerase chain reaction. The product of the reaction is labeled to allow its detection and is placed on a membrane. The membrane has been prepared with oligonucleotide probes, which detect either the normal or the abnormal sequence. A: results from a known cystic fibrosis carrier. B: results from a child with cystic fibrosis.

Similarly, fragile X syndrome is usually the result of an expansion of a triplet sequence within the gene. Normal persons usually have only 5 to 50 copies of this triplet repeat, but affected patients have hundreds or thousands of copies. Similar triplet expansions cause myotonic dystrophy, Huntington disease, and Kennedy disease. The region containing the triplet can be amplified using the polymerase chain reaction (PCR), which produces millions of copies of the small region of DNA from the X chromosome that contains the fragile X repeat. Specificity is achieved by directing the reaction using two complementary primers on either side of the region of interest.

Once amplified, the size of the product can be measured to evaluate the number of triplets, determining whether the mutation exists ( Fig. 6.9).

FIG. 6.9. Fragile X mutation detection. In patient A, a shorter polymerase chain reaction product corresponds with a smaller number of triplet repeats. Patient B exhibits an expanded number of repeats. Affected patients typically have hundreds or even thousands of copies of the triplet.
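The relationship between the size of the amplified product and the number of triplets is simple arithmetic: subtract the length of the non-repeat DNA between the primers and divide by three. In the sketch below, the 120-base flanking length and the 200-copy cutoff for an expanded allele are assumptions chosen for illustration; only the normal range of up to 50 copies comes from the text.

```python
def repeat_count(product_bp, flanking_bp):
    """Estimate the number of triplet repeats in a PCR product.
    `flanking_bp` is the length of non-repeat sequence between the
    primers, including the primers themselves (assay-specific)."""
    repeat_bp = product_bp - flanking_bp
    if repeat_bp < 0:
        raise ValueError("product is shorter than the flanking sequence")
    return repeat_bp // 3


def classify(repeats, normal_max=50, expanded_min=200):
    """Classify a repeat count. `normal_max` follows the text;
    `expanded_min` is an illustrative assumption."""
    if repeats <= normal_max:
        return "normal range"
    if repeats >= expanded_min:
        return "expanded"
    return "intermediate"


if __name__ == "__main__":
    FLANKING_BP = 120  # hypothetical value for this example
    for product_bp in (210, 1320):
        n = repeat_count(product_bp, FLANKING_BP)
        print(product_bp, n, classify(n))
    # 210 30 normal range
    # 1320 400 expanded
```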

For families with unusual mutations or with diseases for which the molecular basis is unknown, linkage testing can be performed. Linkage tests compare DNA polymorphisms close to the disease-causing gene in family members known to have or carry the disease with those of unaffected and at-risk family members. Indirect assessments can be made about whether at-risk persons have the disease allele. The accuracy of these predictions depends on the correct diagnosis and stated relationships of the family members, and on the genetic distance between the polymorphism tested and the disease allele. For some families, linkage testing can be uninformative (Fig. 6.10).

FIG. 6.10. Informativeness of linkage testing for cystic fibrosis. Marker KM-19 in kindred 18 is “not informative”: the disease alleles cannot be distinguished in the parents.
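The dependence of linkage accuracy on genetic distance can be made concrete. The sketch below uses Haldane's map function, a standard textbook model that assumes crossovers occur independently, to convert a distance in centimorgans into a recombination fraction. It then takes one minus that fraction as the chance that a single marker still travels with the disease allele through one meiosis. It assumes an informative marker, a correct diagnosis, and correct family relationships; it is not a substitute for a formal linkage analysis.

```python
import math


def recombination_fraction(distance_cm):
    """Haldane's map function: probability of recombination between two
    loci separated by `distance_cm` centimorgans."""
    morgans = distance_cm / 100
    return 0.5 * (1 - math.exp(-2 * morgans))


def prediction_accuracy(distance_cm):
    """Chance that marker and disease allele are inherited together in
    one meiosis (single informative marker, simplified model)."""
    return 1 - recombination_fraction(distance_cm)


if __name__ == "__main__":
    for distance in (1, 5, 20):
        print(distance, round(prediction_accuracy(distance), 3))
```

For short distances the recombination fraction is close to 1% per centimorgan, so a marker 1 centimorgan from the gene gives a misleading result in roughly 1 of every 100 meioses, and the error grows as the marker lies farther away.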

The Genome Project

The Human Genome Project promises to be the single most important project in biology; genetics and genomics are now the central sciences of medicine. An understanding of the relationship between genetic variation and disease risk will alter the future prevention and treatment of common illnesses.

The full human sequence will be completed in 2003, but a very accurate draft is available now for over 95% of the genome. Disease gene identifications that formerly required years of chromosome walking and jumping, cloning, physical mapping, sequencing, and sequence assembly can now be completed in weeks. Industrialized sequencing technologies developed for the genome project, including capillary electrophoresis and microarrays, are now widely used for genotyping and sequencing.

We also now have a tremendous catalog of individual sequence variation in humans. Tens of thousands of microsatellite markers are available for linkage analysis, and hundreds of thousands of single nucleotide polymorphisms (SNPs) are available for genetic association studies. The tools are now well developed for the functional studies needed to find which variations and mutations place individuals at risk for numerous medically important, genetically complex human diseases. The genome project has delivered improved cDNA resources, better predictive software, and additional knowledge about the non-protein coding regions of the genome. Remarkable technologies are commercially available for comprehensive analysis of gene expression in single cells, tissues, or whole organisms.

Advances in gene knockout technology, antisense technology, gene transfer, and gene transfection allow greater insights using appropriate model systems, including both cell culture and whole organisms. The complete sequences of the Escherichia coli, yeast, nematode, fruit fly, and mouse genomes provide important evolutionary clues to gene function and extend the range of experiments possible.

At the same time there have been parallel improvements in the technology for global protein analysis. Gene expression is played out at the protein level—elegant techniques are now available to examine spatial and temporal patterns of protein expression, protein-ligand interactions, and protein modifications.

Finally, the genome project occurred at the same time as the information technology revolution. Powerful bioinformatics and computational software is now available for gene discovery, expression profiling, understanding gene-environment interactions, and so on. Suffice it to say, better tools are now available for making advances in women's health care than ever before in human history.

At least 3% of the annual budget of the project is going to the Ethical, Legal, and Social Issues section of the enterprise. This amount of early attention to societal impact is unprecedented for a science and technology project. Grant-funded programs have examined privacy issues, genetic discrimination in insurance and employment, and the role of coercion. Genetic discoveries may challenge long-held beliefs about equality, predetermination, and free will as we learn about genes that have a major role in personality, creativity, intelligence, and mental illness. The safety, efficacy, and utility of new gene tests should be evaluated, especially before treatment is available.