
Research for Newborn Screening: Developing a National Framework

Jeffrey R. Botkin, MD, MPH

ABSTRACT. Newborn metabolic screening represents the largest application of genetic testing in medicine. As new technologies are developed, the number of conditions amenable to newborn screening (NBS) will continue to expand. Despite the scope of these programs, the evidence base for a number of NBS applications remains relatively weak. This article briefly reviews the evidence base for several conditions. The article then develops a proposal for a structured sequence of research protocols to evaluate potential applications for NBS before their formal implementation in public health programs. Such a framework for research will require collaboration between states and the federal government, a collaboration that is emerging through recent federal legislation and funding. Pediatrics 2005;116:862–871; newborn screening, ethics, research.

ABBREVIATIONS. NBS, newborn screening; PKU, phenylketonuria; MCAD, medium-chain acyl-coenzyme A dehydrogenase deficiency; MS/MS, tandem mass spectrometry; RCT, randomized, controlled trial; SCD, sickle cell disease; CF, cystic fibrosis.

Newborn metabolic screening is conducted for ~4 million infants per year and represents the largest single application of genetic testing in medicine. Newborn screening (NBS) programs traditionally are run by state public health departments, although there is an emerging commercial sector for the provision of these services. Screening for phenylketonuria (PKU) was initiated in the 1960s, and subsequently the number of conditions on the NBS panels increased considerably. However, there is a broad range among states in the number of conditions targeted, from 4 to >40. With the advent of new technology such as tandem mass spectrometry (MS/MS) and the recognition of the substantial variability between programs, an active national discussion has emerged to support states in bringing to children high-quality services that are effective and efficient.1

Unfortunately, there are significant barriers to conducting research on the efficacy of NBS programs. The basic question relevant to efficacy is whether morbidity and/or mortality rates are reduced for affected children identified through a universal screening program, compared with outcomes after clinical diagnosis or selective screening. Assessing the efficacy of universal screening requires a basis on which to make this comparison, with both short-term and long-term outcomes in mind. However, state departments of health often do not have funds to conduct evaluations of established programs beyond counts of true-positive, false-positive, and true-negative results and laboratory quality assessments. Programs typically do not make systematic attempts to identify affected children who had false-negative results or to evaluate formally the longer-term health benefits for affected children. Also, many of the conditions targeted in NBS programs are rare, meaning that most states identify only a few affected children with each condition per year. This makes outcome studies with sufficient statistical power through state-based projects virtually impossible in all except the largest states.
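The point about rare conditions can be made concrete with a rough calculation. The sketch below is illustrative only: it borrows the incidence of roughly 1 case per 60 000 neonates that this article cites for galactosemia, while the state birth counts and the target of 100 affected children are assumed round numbers, not data from any program. The function names are hypothetical.

```python
# Illustrative sketch: how many affected newborns a single state program
# might expect to identify per year for a rare condition.

def expected_cases_per_year(annual_births: int, incidence_denominator: int) -> float:
    """Expected affected newborns per year, given 1 case per `incidence_denominator` births."""
    return annual_births / incidence_denominator


def years_to_accrue(target_cases: int, annual_births: int, incidence_denominator: int) -> float:
    """Average number of years needed to accrue `target_cases` affected children."""
    return target_cases / expected_cases_per_year(annual_births, incidence_denominator)


if __name__ == "__main__":
    INCIDENCE_DENOMINATOR = 60_000  # ~1 per 60 000, as cited for galactosemia
    TARGET_CASES = 100              # assumed cohort size, chosen only for illustration

    # Assumed annual birth counts, not figures for any real state.
    for label, births in [("small state", 20_000),
                          ("mid-sized state", 100_000),
                          ("large state", 500_000)]:
        cases = expected_cases_per_year(births, INCIDENCE_DENOMINATOR)
        years = years_to_accrue(TARGET_CASES, births, INCIDENCE_DENOMINATOR)
        print(f"{label}: {cases:.2f} cases/year, about {years:.0f} years to reach {TARGET_CASES} cases")
```

Under these assumptions even a large state would need many years to assemble a cohort of this size, which is the practical motivation for multistate collaboration.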

A more fundamental barrier to research in NBS is ethical concern regarding the use of randomized, controlled trials (RCTs), which are usually considered the standard in research design. The ethical concern arises when an apparently clinically beneficial intervention for affected children is proposed as a component of a population-based screening program. It becomes ethically problematic to propose a control arm for a study in which screening is not provided to a segment of the population, although the efficacy of the screening approach is unproven. This is a question of scale; can we be confident that interventions that are effective on a smaller project scale will be effective when implemented on a population basis? To date, the only RCT of NBS in the United States is the Wisconsin cystic fibrosis (CF) project. The Wisconsin CF project has been valuable in addressing the efficacy of CF NBS but the project design, involving randomization, has been the focus of criticism in the lay press and ethical discussion in the professional literature.2,3 This project is discussed in more detail below. In the absence of randomized designs, research on NBS often is observational after implementation of screening, with either historical control data or control through comparisons with similar populations without screening.

These barriers to research are formidable. Despite the use of this technology for 4 million infants per year in the United States and many more internationally in the past 3 decades, the research basis remains relatively poor. The New York State Task Force on Life and the Law stated, in its 2000 publication on genetic testing, “In fact, only a minority of newborn screening tests that are currently performed have been demonstrated formally to have both clinical validity and utility.”4 Wilcken et al,5 in a 2003 publication, concluded more broadly, “Formal evidence of the clinical effectiveness of newborn screening is lacking.” Currently, many states are adopting MS/MS for NBS programs, despite uncertainties regarding the sensitivities and specificities of the tests and the natural history and treatability of many conditions identified. Of the ≥30 conditions detectable with MS/MS, medium-chain acyl-coenzyme A dehydrogenase deficiency (MCAD) shows the greatest promise in terms of screening efficacy for children. Nevertheless, Elliman et al6 observed in 2002, “Despite international experience of screening well over a million newborn infants [for MCAD] . . . there has been no report of a systematic follow-up of longer term outcome in affected infants detected by screening.” Therefore, although we know that MS/MS can detect affected children and that early intervention can be lifesaving, we remain uncertain about the nature and magnitude of the longer-term benefits of actual population screening programs.

From the Department of Pediatrics and Medical Ethics, University of Utah, Salt Lake City, Utah.

Accepted for publication Jan 12, 2005. doi:10.1542/peds.2004-2571. No conflict of interest declared.

The implication of these concerns is not only that some modalities of NBS may prove to be ineffective when evaluated formally. Experience over several decades and a body of observational research lend support to many of these programs. In this era of evidence-based medicine, however, a less-than-rigorous approach to research on these large, expensive, and important public health programs is no longer appropriate.7–9 The recent history of medicine illustrates how research on popular screening programs can reveal limited efficacy; hospital admission chest films10 and breast self-examinations11 are prominent examples. Furthermore, not only does research identify screening programs that are ineffective and/or harmful,12 but formal evaluation can identify aspects of valuable programs that reduce efficacy in critical ways. Therefore, the goals of research are not only to make policy decisions to adopt or forgo population screening but also to design programs to maximize benefits and to minimize harm. This article reviews several examples of NBS that illustrate the strengths and weaknesses of the empirical foundation for screening. A proposal to develop a national framework for research on NBS is then outlined.

EXAMPLES

Hemoglobinopathies

Screening for hemoglobinopathies is a component of NBS programs in 49 states and the District of Columbia. Sickle cell disease (SCD) is the primary condition of interest, although other hemoglobinopathies are also detected.13 SCD occurs most commonly in the United States among African Americans, with an incidence at birth of 1 case per 375 infants. Other population groups are affected with incidences of 1 case per 3000 Native American infants, 1 case per 20 000 Hispanic infants, and 1 case per 60 000 white infants. Young children with SCD are susceptible to systemic infections with Streptococcus pneumoniae at a rate of 8 episodes per 100 person-years, with a case fatality rate of ~35%.14 Early detection of SCD for an infant permits the prophylactic administration of penicillin to prevent pneumococcal infections, in addition to vaccination with Haemophilus influenzae type b and pneumococcal vaccines.

The seminal study demonstrating the efficacy of preventive therapy was published in 1986 by Gaston et al.15 This multicenter clinical trial randomized children <3 years of age with SCD to either penicillin or placebo. The study was terminated after 15 months of follow-up monitoring when results indicated substantial reductions in infection and mortality rates in the treatment group. These impressive results led to a federally sponsored consensus conference in 1987.16 The conference concluded, “The benefits of screening are so compelling that universal screening should be provided. State law should mandate the availability of these services while permitting parental refusal.” Furthermore, the conference concluded, “To be effective, neonatal screening must be part of a comprehensive program for the care of sickle cell patients and their families.”

The study by Gaston et al15 clearly demonstrated the efficacy of penicillin prophylaxis in reducing morbidity and mortality rates for young children with SCD who were monitored in a longitudinal research environment. However, a key question is whether the efficacy of a preventive treatment can be maintained when expanded to a population level as part of a routine public health program. It is worth emphasizing that the impressive benefits of penicillin prophylaxis demonstrated by Gaston et al15 were for children diagnosed clinically, not through NBS. Therefore, the benefits added by NBS are for the subsets of affected children who die or become seriously ill before a clinical diagnosis.

NBS is more than a test and an intervention; it must be viewed as a system involving a chain of decisions and actions from the heelstick of the infant through the laboratory, the health department, the primary care provider, and the parents to the effective delivery and maintenance of long-term treatment for the child. Any system is only as good as its weakest link, and the efficacy of all NBS programs is contingent on the integrity of this chain.


and they estimated that only 44% of parents were compliant. Only 25% of patients had received the pneumococcal vaccine. A recent study of children with SCD receiving Medicaid in Washington and Tennessee found that enough prophylactic antibiotic was dispensed to cover only 40% of the year-long study period.19 Teach et al20 found a penicillin compliance rate of 43% among children with SCD, as measured with urine assays. Other reports also illustrated compliance problems with the SCD prophylactic regimen.21

The implication of these data is that it is difficult to know the magnitude of the benefit of NBS for SCD. The general consensus in the literature is that mortality and morbidity rates for young children are decreased with NBS,22 but acquiring definitive data to draw this conclusion is challenging for several reasons. First, there has been no formal controlled trial of NBS for SCD. Comparison with historical mortality rates can provide useful information, but historical control data may be biased because of changes in health care with time. Second, the adverse outcomes preventable with screening for SCD occur for a minority of affected children, whether or not prophylactic interventions are used; therefore, it is difficult to identify the benefits of screening without carefully tracking large populations of affected children over time. The ability to track the health outcomes of a large cohort of children is not a feature of our health care system. Third, the almost-universal use of SCD NBS makes it impossible to compare otherwise comparable states that use and do not use NBS for this condition.

A recent Cochrane review identified no RCTs of NBS for SCD.23 The reviewers concluded, “There is however evidence of benefit from early commencement of treatment in SCD, which is made possible by screening in the neonatal period. . . . Information from a well designed prospective RCT of neonatal screening is desirable to make recommendations for practice. However such trials may now be considered unethical in view of the proven benefit of early prophylactic treatment with penicillin.”23

The conclusion here is that NBS for SCD probably is effective in saving many lives per year, but we do not have solid data to demonstrate this efficacy or to define the magnitude of the benefits. It is too late to conduct an efficacy trial of population screening, but additional work on enhancing compliance is warranted. This is a frustrating state of affairs for an intervention that has been adopted for virtually all infants born in the United States and its territories.

Galactosemia

NBS for galactosemia is performed in every state and the District of Columbia. This condition is attributable to a genetic defect in an enzyme responsible for breaking down sugars present in milk and occurs at a rate of ~1 case per 60 000 neonates. Affected infants appear normal at birth but within 2 weeks can develop vomiting, irritability, hepatomegaly, jaundice, and sepsis. In the absence of early detection, death in the neonatal period is thought to occur for ~20% to 30% of patients. Galactosemia among survivors is associated with developmental delays. Treatment consists of a diet low in lactose/galactose.

Enthusiasm for NBS for galactosemia developed in the 1960s and 1970s, with the identification of a valid test using dried blood spots. Clinical observations demonstrated that affected children experienced prompt resolution of symptoms with initiation of the appropriate diet. However, an important feature of galactosemia is that symptoms develop rapidly in the first 2 weeks of life, which means that the NBS system must be efficient to identify affected children before death or serious illness occurs. Evidence indicates that approximately two thirds of infants are symptomatic at the time of the report of a positive NBS result.13

As the technology developed to screen for this devastating condition, there was a strong push to initiate universal screening. Levy, an effective early advocate of NBS, wrote an article with Hammersen in 1978, in which they stated: “Galactosemia screening should be routine for all newborn infants. It is a disorder with definite and severe complications, but one in which the complications can be prevented with simple and inexpensive treatment.”24 Subsequently, outcome studies showed that the situation is more complicated. In a study of 350 affected children (mean age: 9 years) published in 1990, Waggoner et al25 compared the outcomes of children diagnosed before the advent of NBS on the basis of clinical symptoms alone and children diagnosed shortly after birth by virtue of having an affected sibling. In this context, early detection on the basis of family history is a surrogate for early detection through population screening. The children diagnosed on the basis of clinical symptoms had a mean age of diagnosis of 63 days, whereas those diagnosed on the basis of family history had a mean age of diagnosis of 1 day. If early detection and treatment are effective in reducing morbidity rates, then we would expect that children diagnosed at birth would have better outcomes than children diagnosed late on the basis of clinical symptoms. Unfortunately, the results reported by Waggoner et al25 showed no statistical differences in intellectual function between these groups. Waggoner et al25 concluded, “It is clear that current methods of treatment, even if carefully followed, do little to ameliorate the long-term complications which occur in the majority of cases regardless of when treatment was begun or how successfully galactose intake was restricted.” Other authors also raised concerns about our current understanding and treatment of galactosemia.26–28


10 years of life. This mortality rate compares with 7 unexplained infant deaths among 84 siblings of affected infants before the era of screening. With the assumption that 25% of the siblings were affected with galactosemia (an autosomal recessive condition), ~21 of the 84 siblings would have been affected with galactosemia. Therefore, a mortality rate of 7 (33%) of 21 affected siblings can be estimated. This evidence suggests a reduction in mortality rates from 33% to 15% with screening, although there is potential for historical bias as well as uncertainty about the affected status of the siblings. In addition, advances in neonatal care over the past 30 years might have produced a lower contemporary mortality rate among affected infants in the absence of screening. Comparable data from the United States are not available, but a reduction in mortality rates of this magnitude would result in ~12 fewer infant deaths resulting from galactosemia per year nationwide with screening, or ~3 lives saved per 1 million children screened. By comparison, the sixth leading cause of infant death in the United States in 2002 was injuries, with a rate of 235 deaths per 1 million children.30
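The arithmetic behind these estimates can be retraced in a few lines. The sketch below uses only figures stated in this article (84 siblings, 7 deaths, a 25% recurrence risk, mortality falling from 33% to 15%, ~4 million infants screened per year, and an incidence of ~1 case per 60 000 neonates); the function names are hypothetical, and the result is a back-of-the-envelope check rather than an epidemiologic model.

```python
# Back-of-the-envelope check of the galactosemia mortality estimates.

def estimated_affected_siblings(n_siblings: int, recurrence_risk: float = 0.25) -> float:
    """Expected number of affected siblings for an autosomal recessive condition."""
    return n_siblings * recurrence_risk


def lives_saved_per_year(annual_births: int, incidence_denominator: int,
                         mortality_without: float, mortality_with: float) -> float:
    """Deaths averted per year if screening lowers mortality among affected infants."""
    affected_per_year = annual_births / incidence_denominator
    return affected_per_year * (mortality_without - mortality_with)


if __name__ == "__main__":
    affected = estimated_affected_siblings(84)   # about 21 siblings
    historical_mortality = 7 / affected          # about one third

    saved = lives_saved_per_year(
        annual_births=4_000_000,       # ~4 million infants screened per year
        incidence_denominator=60_000,  # ~1 case per 60 000 neonates
        mortality_without=0.33,
        mortality_with=0.15,
    )
    per_million = saved / (4_000_000 / 1_000_000)

    print(f"Estimated affected siblings: {affected:.0f}")
    print(f"Estimated historical mortality: {historical_mortality:.0%}")
    print(f"Deaths averted per year: {saved:.0f}")
    print(f"Deaths averted per million screened: {per_million:.0f}")
```

The calculation is sensitive to the assumed mortality rates, which is consistent with the caution in the text about historical bias.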

This brief analysis suggests several conclusions. The early enthusiasm for the efficacy of NBS for galactosemia has not been supported by subsequent data, with respect to the preservation of cognitive function among affected children. These data on the relative efficacy of NBS were acquired 2 decades after some states initiated screening. Early intervention seems to reduce infant mortality rates for galactosemia, but the magnitude of this benefit remains uncertain. Some children still die as a result of galactosemia, despite NBS, and clinical diagnosis can be achieved in the absence of screening. Approaches other than universal NBS have been evaluated, with promising results.31 Again, the purpose of this discussion is not to suggest that NBS for galactosemia does not have value but to highlight the limited knowledge on which this enormous public health effort is based.

Neuroblastoma

Neuroblastoma is the most common extracranial tumor among young children, with an incidence of ~1 case per 7000 children.32 Better prognoses are associated with younger age and earlier stages of the disease. These features of the condition suggested that presymptomatic diagnosis and early treatment might improve the mortality rate. In addition, the tumor secretes a characteristic pattern of catecholamines, which enables detection through blood testing before the emergence of clinical symptoms. Enthusiasm for a screening approach to neuroblastoma led to the development of programs in Japan in the early 1970s. However, there was sufficient uncertainty about the efficacy of screening that 2 large screening trials were conducted, one in Germany by Schilling et al33 and the other in Canada by Woods et al.34

In the German study, almost 2.6 million children were screened for neuroblastoma in 6 of 16 German states from 1995 to 2000. There were 2.1 million children who served as control subjects in the other German states. The incidence and outcomes of neuroblastoma cases were compared between the screened and control populations over the same time period. In the Canadian study, 476 654 children were screened in Quebec Province between 1989 and 1994, and the results were compared with those for children in separate control populations in Ontario, Minnesota, Florida, and the Greater Delaware Valley.

The results of both studies demonstrated no benefit from population screening, in terms of mortality rates. Of particular interest was the finding that screening identified many more children than would have been predicted on the basis of the clinical incidence of the disease. This confirmed other observations that neuroblastomas can arise and then resolve spontaneously without producing symptoms. These children might be accurately labeled as having the condition, but they represent false-positive results in the sense that they are not destined to be ill with their neuroblastomas. However, children identified as having neuroblastomas are considered for treatment because physicians may not be able to discriminate between children who will become ill and those who have tumors that will resolve spontaneously. In this situation, screening may seem to lead to improved survival rates for children with neuroblastomas, compared with historical control subjects, but this is only because screening identifies a subset of asymptomatic children who would have fared well anyway.

To illustrate this point, imagine that there are 20 children in a population with neuroblastomas identified clinically. Assume treatment cures 10 children, and 10 children die as a result of their disease. Therefore, the cure rate is 50%. After the introduction of screening, 40 children with neuroblastomas are identified but, unbeknownst to the screeners, 20 cases would have resolved spontaneously. Forty children are treated for their cancer and 10 die, as observed previously. The apparent cure rate is now 75% (30 of 40), an improvement of 25 percentage points that might be falsely attributed to the benefits of the screening program.
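The same illustration can be written as a short calculation. This is a minimal sketch using the hypothetical counts from the paragraph above; the function name is an assumption introduced here for illustration.

```python
# Sketch of how overdiagnosis inflates the apparent cure rate
# even though the number of deaths does not change.

def apparent_cure_rate(detected: int, deaths: int) -> float:
    """Fraction of detected children who do not die of the disease."""
    return (detected - deaths) / detected


if __name__ == "__main__":
    # Before screening: 20 clinically identified cases, 10 deaths.
    before = apparent_cure_rate(detected=20, deaths=10)

    # After screening: 40 cases identified, 20 of which would have
    # resolved spontaneously; the same 10 children die.
    after = apparent_cure_rate(detected=40, deaths=10)

    print(f"Apparent cure rate before screening: {before:.0%}")
    print(f"Apparent cure rate after screening:  {after:.0%}")
    print("Deaths before and after screening: 10 and 10")
```

The death count is identical in both scenarios; only the denominator has grown, which is why cure rates per diagnosed case can mislead where mortality per population screened would not.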

This problem is directly relevant to screening for metabolic diseases, because metabolic conditions usually entail a spectrum of severity and the spectrum may include a proportion of subjects with “abnormal” biochemical test results who will never become sick with the disease.5,35 These neuroblastoma studies are excellent illustrations of the value of population-based research for assessment of the efficacy of screening approaches.

Another aspect of these studies worth mentioning is the use of separate but relevantly similar populations as control groups. Rather than randomize children within a region to screening versus clinical diagnosis, these studies screened an entire population and compared the outcomes with those for a comparable unscreened population during the same time period. This approach eliminates the problems with historical control data and avoids the complexities of randomizing children to 2 different groups within a population.


worth emphasizing is the ability to conduct large-scale, population-wide studies within a reasonable time frame. The German study required the collaboration of 6 of 16 states for the screening intervention and that of the remaining states for clinical data only as control populations. With uncommon conditions, no individual state could generate a sufficient number of cases to conduct such a study. Obviously this situation pertains to the United States, in which collaboration between multiple states would be essential to obtain a sufficient number of cases in a reasonable time with a population that is representative of the national population. The complexity of this interstate collaboration should not be underestimated but the obstacles should be confronted to generate high-quality data on population-based screening programs. These examples illustrate the need for a more consistent and comprehensive approach to evaluating screening tests and programs before implementation on a population-wide basis.

COLLABORATIVE RESEARCH AGENDA

A number of commentators, professional bodies, and state programs have developed criteria for deciding when a condition should be added to NBS programs.36–39 These criteria typically address the nature of the disease, the availability of a valid test, evidence for the benefits of screening, and the presence of all necessary service elements for a complete screening program. Here we are concerned primarily about the evidence for the benefits of screening. The criteria for what constitutes adequate evidence of benefit have not been established at the national level, leaving this determination up to individual state programs. The lack of established criteria and sufficient data on benefits is a central reason why there is substantial variation between states and countries regarding the conditions targeted in NBS programs.

We can imagine the confusion if drugs and devices were regulated and funded at the state level. Fortunately we have a national system of drug evaluation and approval through the Food and Drug Administration, by which drugs and devices proposed for human medical use are evaluated through a standard series of research protocols.40 Generally, human studies are pursued only after collection of data on safety in animals, when feasible. In phase I human studies, a small number of participants are involved, primarily for evaluation of safety and pharmacokinetic features. If the drug seems safe, then phase II studies involving up to several hundred participants are pursued to evaluate effectiveness. If these results are promising, then phase III studies are conducted with several hundred to thousands of individuals to assess safety, effectiveness, and dosage. Phase II studies may be performed with or without a control group, and phase III studies often use a randomized, double-blind, controlled protocol to maximize the quality of the data. With the results of these studies, the Food and Drug Administration is in a position to determine whether a drug should be licensed on a national basis for specific indications for specific population groups (such as adults or children). After approval, phase IV studies may be conducted for postmarketing evaluations of safety and efficacy in new or larger patient populations. The method is long, expensive, and by no means foolproof in terms of safety or efficacy, but it is a remarkably robust approach to the scientific assessment of drugs for medical applications.

A similar framework for the methodical evaluation of screening tests is necessary. The Institute of Medicine Committee on Assessing Genetic Risks concluded, in 1994, “The committee recommends the systematic development of basic data on the full range of genetic testing and screening services that is needed to provide a sound basis for policy development in the future.”38(p306) Other authors also support a standardized approach to genetic test evaluation.41 The following is a preliminary proposal for a framework to study NBS tests and NBS programs.

There are 3 basic questions for research to address. First, do early detection and treatment of affected infants or children reduce morbidity and/or mortality rates? Second, if early detection seems beneficial, does a population-based screening approach result in net benefits to affected children, compared with alternative methods of detection? Third, if there are net benefits from population screening, are these benefits sufficient to warrant the use of public health resources for this purpose? The proposed research framework is designed to answer these questions in sequence.

Does early detection produce better outcomes? There is strong public confidence in the ability of medical science to identify signs of future disease and to act decisively to save lives.42 Screening tests have become quite prevalent in medicine, including mammograms, Pap tests, digital rectal examinations, sigmoidoscopies, amniocentesis, and measurements of blood pressure and blood glucose, cholesterol, and prostate-specific antigen levels, to name only a few. Commercial providers are now prominently advertising full-body computed tomography to the public as a method for early detection of a host of potential problems.43

However, early detection is not beneficial if medicine does not have the ability to affect the course of the disease. This is more common than popularly thought. The US Preventive Services Task Force conducts exhaustive analyses of preventive measures. The US Preventive Services Task Force supports screening for breast cancer, colon cancer, and cervical cancer, but it does not advocate population screening for cancers of the prostate, bladder, pancreas, ovaries, or lung. These decisions are based in large measure on the absence of data indicating that early detection improves outcomes.44


severity. In these situations, it may be that the individuals who are most severely affected do not benefit from early detection, those with mild or subclinical disease may be harmed by unnecessary interventions, and those with intermediate severity can obtain benefit from early detection. If clinicians cannot discriminate between these degrees of severity at the time of diagnosis, then an affected individual may experience burdensome or harmful interventions as often as an improved outcome resulting from a screening program.

STAGE I RESEARCH

For the purposes of this discussion, stage I research refers to projects that seek to determine whether early detection and intervention can improve clinical outcomes. This kind of research can be performed in a variety of ways that do not require population screening. For genetic conditions (the majority of NBS conditions), significant information can be obtained by comparing the outcomes of second affected siblings versus first affected siblings when there are discrepancies in the time of diagnosis. A first affected sibling is diagnosed often only after clinical symptoms emerge and frequently much later, after parents have pursued a “diagnostic odyssey.” Once parents have been alerted to the risk for subsequent siblings, the second affected child can be diagnosed prenatally or in early infancy. If a proposed early treatment or preventive strategy is available, then a comparison of the outcomes for the first versus second (and subsequent) affected siblings provides evidence for the efficacy of the intervention. This approach can be used retrospectively, if an intervention is in use for the condition, or prospectively, through enrollment of sibling pairs at the time of diagnosis of the second affected child. The galactosemia study by Waggoner et al25 noted above is an example of this method.

A second option for stage I research is an RCT of the intervention among children diagnosed clinically. This approach is useful when the initial presentation of the condition is not devastating for the majority of children. Stated differently, it is more useful when only a subset of affected children experience the serious adverse outcome to be prevented. This is because investigators need to know which children are affected before they can be randomized and the children cannot have already experienced the adverse outcome at the time of randomization. A good example here is penicillin prophylaxis for children with SCD. As discussed above, the study by Gaston et al15 demonstrated that children with SCD fare much better with penicillin prophylaxis, and it is an excellent example of stage I research.

A third approach to stage I research is a small-scale screening project. If there is a high-risk group that can be targeted for screening to produce a sufficient number of affected children, then an RCT of screening for the proposed intervention can be conducted. However, most conditions considered appropriate for population-wide NBS are rare enough in the general population, and not strongly associated with a particular racial or ethnic group, that targeted screening is not feasible.

An approach that is not as useful for stage I research is the use of historical data comparing children identified at a younger versus older age. Particularly when there is an association between an earlier era and the later age of diagnosis, there are many factors that may bias the comparison. More specifically, an earlier age of diagnosis in more recent eras may occur in conjunction with many other improvements in care.

The purpose of stage I research is to provide definitive data on the efficacy of early intervention. The move to population-wide screening should be made only when there is solid evidence that early detection and intervention can lead to improved morbidity and/or mortality rates.

STAGE II RESEARCH

Stage II research addresses the second question in sequence. That is, does a population-based screening approach result in net benefits to affected children, compared with alternative methods of detection? The central point here is that improvements in clinical outcomes that are demonstrable through stage I research may not be achievable in population-wide programs. Conversely, the benefits of early detection may be brought to affected children through clinical detection schemes in the absence of population screening. After stage I research, the question is how best to bring the benefits of early detection to affected children.


and unscreened groups over the past 20 years. Although the magnitude and nature of the benefits of CF NBS remain controversial, the Wisconsin trial has been critical in providing data for policy development.48

There are a number of potential problems with a randomized, controlled design from methodologic and ethical perspectives. First, if screening itself is randomized, then it may be difficult or impossible to identify all cases in the unscreened group. This creates a significant potential for bias. In the unscreened group, those who come to medical attention by virtue of clinical symptoms, or who do so at a younger age (within the window of a research project), tend to be those who are more severely affected with the condition. In contrast, a screened population would include children across the full spectrum of severity, including those who are mildly affected and those who may never become ill with the condition. Given this difference in sensitivity for detection, a comparison of outcomes for the screened and unscreened populations would show improved outcomes for the screened group even if the intervention confers no benefit. This problem is similar to “length bias”49 and is primarily a concern for studies that calculate outcome data in terms of number of deaths per affected population. This is because the denominator is expanded through screening to include mildly affected individuals, thereby decreasing the apparent death rate, compared with a group composed of only severely affected individuals.

There are at least 2 ways to address this problem. The neuroblastoma studies measured their outcomes in terms of deaths per 100 000 population in the screened and unscreened populations. This eliminates the bias created by calculating death rates for the affected population. Another approach to eliminating this source of bias is to follow the approach used in the Wisconsin CF screening trial, in which blood samples were obtained for all infants but screening test results were disclosed on a randomized basis. As noted, the unscreened group had results reviewed and disclosed at 4 years of age. This allowed the research team to obtain outcome data for all affected members of the unscreened group as if they had been screened at the outset. Through this approach, subclinical cases could be identified in the unscreened group although they were never identified clinically.
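The denominator effect can be illustrated with a small calculation. The sketch below is not drawn from any of the cited studies; every number in it is hypothetical, and it assumes an intervention that confers no benefit, so that both arms have the same number of deaths. It compares deaths per identified affected child with deaths per 100 000 population.

```python
# All numbers below are hypothetical and chosen only to illustrate the denominator effect.
POPULATION = 1_000_000   # newborns in each study arm
SEVERE_CASES = 50        # severely affected children per arm
MILD_CASES = 50          # mildly affected children per arm, found only by screening
DEATHS = 10              # deaths per arm; the intervention is assumed to confer no benefit


def deaths_per_affected(deaths: int, identified_cases: int) -> float:
    """Death rate with identified affected children as the denominator."""
    return deaths / identified_cases


def deaths_per_100k(deaths: int, population: int) -> float:
    """Death rate with the whole study population as the denominator."""
    return deaths * 100_000 / population


# Screening identifies severe and mild cases; clinical detection finds only severe ones.
screened_identified = SEVERE_CASES + MILD_CASES
unscreened_identified = SEVERE_CASES

screened_case_rate = deaths_per_affected(DEATHS, screened_identified)
unscreened_case_rate = deaths_per_affected(DEATHS, unscreened_identified)
screened_population_rate = deaths_per_100k(DEATHS, POPULATION)
unscreened_population_rate = deaths_per_100k(DEATHS, POPULATION)

print(f"Deaths per affected child: screened {screened_case_rate:.2f}, "
      f"unscreened {unscreened_case_rate:.2f}")
print(f"Deaths per 100 000 births: screened {screened_population_rate:.1f}, "
      f"unscreened {unscreened_population_rate:.1f}")
```

Under these assumed numbers, the per-affected death rate appears to be halved in the screened arm although the intervention does nothing, whereas the population-based rate is identical in both arms.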

A second practical problem with RCTs arises from the low incidence of most conditions targeted by NBS. Because RCTs require at least 2 approximately equivalent groups for comparison and the groups must be of a size to allow determination with sufficient power that a significant difference exists in the outcome measure, trials must be quite large for most NBS conditions. This issue is discussed in greater detail below.
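A rough sense of the scale involved can be obtained from a standard normal-approximation formula for comparing 2 proportions. The sketch below is an illustration only: the outcome proportions, the incidence of 1 in 20 000 births, the significance level, and the power are all hypothetical assumptions, and the function names are invented for this example.

```python
import math
from statistics import NormalDist


def affected_children_per_arm(p_control: float, p_screened: float,
                              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of affected children needed per arm (2-proportion comparison)."""
    if p_control == p_screened:
        raise ValueError("the two outcome proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p_control * (1 - p_control) + p_screened * (1 - p_screened)
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_screened) ** 2
    return math.ceil(n)


def newborns_per_arm(affected_needed: int, births_per_case: int) -> int:
    """Newborns to enroll per arm so that the expected number of cases is reached."""
    return affected_needed * births_per_case


# Hypothetical scenario: adverse outcome in 20% of unscreened and 10% of screened
# affected children, for a condition occurring in 1 of every 20 000 births.
affected = affected_children_per_arm(p_control=0.20, p_screened=0.10)
newborns = newborns_per_arm(affected, births_per_case=20_000)

print(f"Affected children needed per arm: {affected}")
print(f"Newborns needed per arm: {newborns:,}")
```

With these assumed inputs, roughly 200 affected children are needed in each arm, which corresponds to several million newborns per arm for a condition of this rarity.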

The more fundamental challenges to the performance of RCTs in stage II research are ethical concerns. If early intervention has been shown to be effective in stage I research, is it ethical to randomize infants to an unscreened group?50 In addressing this question, a standard approach in research ethics asks whether there is equipoise between the 2 study groups.51,52 That is, is there genuine uncertainty in the professional community about whether an intervention under study is preferable to an alternative? If there is general consensus that one option is preferable to another on the basis of solid scientific evidence, then randomization is not ethically acceptable. Conversely, if there is legitimate uncertainty about the best approach, then randomization is acceptable.

The NBS context and the proposed stage I/stage II approach offer a different level of complexity than most questions over equipoise. If stage I research demonstrates benefit, then there is no longer equipoise with respect to earlier intervention versus later intervention. However, equipoise may still exist with respect to whether the benefits of earlier intervention can be achieved through the complex mechanism of a NBS program. Therefore, the “test article” is the program, ie, the method of delivering the key intervention, rather than the intervention itself.

Let us look at the issue as if the research were to address the efficacy of a delivery method for an intervention that we know to be effective. Would it be ethical to compare a particular delivery method with no delivery at all? This is analogous to comparing a placebo in a trial with an intervention of known efficacy. This is generally considered unethical, unless the risks to the placebo group are minor or there are other compelling scientific reasons to consider a placebo group.53 In the context of NBS, however, population-wide screening may be only one approach to early detection. For example, neonatal deaths resulting from congenital adrenal hyperplasia or galactosemia usually occur after symptoms have been present for several days. This symptomatic period offers the opportunity for clinical diagnosis. The more effective parents and the health care system are in recognizing and responding to characteristic symptoms in individual cases, the smaller the marginal benefit of a population screening approach would be. Therefore, for these conditions, screening is an alternative not to nothing (as with a placebo) but to the health care system that is designed to respond to sick infants. When there is no ability to detect an affected child before the time when permanent damage has been done, as with PKU and congenital hypothyroidism, then an unscreened group in a RCT would be analogous to a placebo group; this study design would not be appropriate. Therefore, depending on the condition, randomization need not be framed in terms of screening versus nothing but can be regarded as diagnosis through screening versus diagnosis through clinical care or selective screening.


galactosemia as rapidly as does population screening.

The conclusion is that stage II screening RCTs are ethical in the context of NBS when population screening is compared with a potentially effective method of delivering a timely clinical diagnosis or with a more selective screening approach. RCTs of screening for conditions similar in their presentation to congenital adrenal hyperplasia, CF, or SCD can be justified, particularly in conjunction with efforts to enhance provider education about early clinical detection. In contrast, for conditions for which there is no prospect of early clinical detection before significant morbidity or death, a stage II RCT would not be justified.

If a RCT is not deemed ethical or feasible, an alternative is a cohort design. A prospective cohort design compares the outcomes of 2 groups that differ by virtue of the intervention in question. In this context, a screened cohort of children is compared with an unscreened cohort with respect to morbidity and mortality rates over time. For NBS, cohorts could consist of whole state newborn populations or populations of multiple states. There are several significant advantages to a cohort design for stage II screening research. Logistically, it is easier to implement a screening program in a population in a uniform manner. From an ethical perspective, the cohort design avoids explicitly assigning children to an unscreened group when screening could have been made available. Of course, infants in the unscreened cohort are not provided screening, but this is already the situation in the absence of research. After a new screening modality is introduced, some states take years to consider or to implement the program, whereas others are more rapid adopters, providing the opportunity for a comparison of cohorts according to state.

There are 2 principal drawbacks to the cohort design. The first is the potential bias created by comparing populations that may differ with respect to a number of variables in addition to the variable in question (screening). State populations may differ with respect to factors such as socioeconomic status, racial mixture, disease prevalence, health care services, insurance coverage, and efficacy of the NBS programs. If differences in morbidity or mortality rates are found between cohorts, there may be residual concern that the explanation does not depend on screening. The second problem inherent in the cohort design in this context involves the ability to identify and to monitor affected individuals in the unscreened cohort. A screening program establishes the population prevalence at an individual level and creates the infrastructure for tracking. Without a screening program, there is unlikely to be a comprehensive registry of affected children. Furthermore, children who might have died at a young age as a result of the condition might not have been identified as affected or their condition might not have been recorded in a retrievable manner. Many affected children may be known to subspecialty physicians in regional referral centers, but these are likely to be more severely affected children.

This latter problem of defining the affected group in the unscreened population is a fundamental challenge. If a cohort design is used for stage II research, then the unscreened cohort must be evaluated as thoroughly as possible to identify affected children. One way to address this problem adds a retrospective component to the project. In many states, residual NBS samples are stored for variable lengths of time, from months to decades.54,55 If the analyte is stable with time, then stored NBS samples can be screened for the condition in question at a time when differences in morbidity or mortality rates between the screened and unscreened groups would be expected. Children identified as affected through retrospective screening of residual samples could be traced and their health status measured and compared with that of children identified prospectively through screening. Children who died before a diagnosis was made also could be identified through this approach. Furthermore, children who were mildly affected and never came to clinical recognition would be identified. Identification and tracking would not be 100% with this method, but this approach is likely to be much more comprehensive than other forms of identification. This approach would require the retention and availability of residual NBS samples. A discussion of the extent and content of parental information or permission for this kind of research would be important.3

For stage II research, the best approach from a scientific perspective is a RCT. However, this approach is likely to be expensive, and ethical concerns may be prominent. Nevertheless, a randomized design is justifiable in some circumstances. In other circumstances, a cohort design with retrospective screening of the initially unscreened cohort is most appropriate from both scientific and ethical perspectives.

STAGE III AND STAGE IV RESEARCH

Stage III research addresses the relative costs of a population-wide screening program. Stage II research may demonstrate benefits of screening, but decisions about implementation will be dependent on estimates of the costs necessary to achieve the benefits. Cost-benefit and cost-effectiveness analyses may be feasible with data obtained in stage II projects. An economic analysis may reveal that the benefits do not justify the costs of the program.


MS/MS compares favorably with other mass screening programs on a cost-benefit basis. In contrast, Pandor et al,58 in a systematic analysis in the United Kingdom, concluded that the evidence supports the use of MS/MS for PKU and MCAD but sufficient evidence for screening for other conditions is lacking. Both studies revealed the need for additional data to estimate actual costs and benefits. Despite the volume of literature on NBS for CF, Grosse et al48 noted that a full cost-effectiveness analysis has not been performed. Under the proposed research scheme, stage II projects could be designed to collect data on costs and benefits in a manner conducive to stage III economic analysis.
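One common summary measure in cost-effectiveness analysis, which this article does not itself specify, is the incremental cost-effectiveness ratio: the additional cost of a program divided by the additional health benefit it yields relative to a comparator. The sketch below assumes that stage II data supply costs and life-years for screening and for clinical detection; the function name and all figures are hypothetical.

```python
def incremental_cost_effectiveness_ratio(cost_screening: float,
                                         cost_comparator: float,
                                         effect_screening: float,
                                         effect_comparator: float) -> float:
    """Extra cost per extra unit of health effect (here, per life-year gained)."""
    delta_effect = effect_screening - effect_comparator
    if delta_effect == 0:
        raise ValueError("the strategies have identical effects; the ratio is undefined")
    return (cost_screening - cost_comparator) / delta_effect


# Hypothetical figures per 100 000 births, for illustration only.
icer = incremental_cost_effectiveness_ratio(
    cost_screening=2_000_000,   # program plus treatment costs with screening
    cost_comparator=500_000,    # costs with clinical detection alone
    effect_screening=150.0,     # life-years gained with screening
    effect_comparator=100.0,    # life-years gained with clinical detection
)

print(f"Incremental cost per life-year gained: {icer:,.0f}")
```

Whether a given ratio justifies implementation is a policy judgment that the calculation itself cannot settle.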

Stage IV research involves projects designed to evaluate established programs on an ongoing basis. To date, state NBS programs have a limited ability to conduct formal program evaluations and quality assurance activities. The American Academy of Pediatrics/Health Resources and Services Administration Newborn Screening Task Force1 and the Council of Regional Networks for Genetic Services59 place a strong emphasis on the funding and development of these activities. Effective programs require periodic evaluation because of changes in test technology, program organization, population demographic features, and health care resources.

COLLABORATION

Central to the ability to conduct stage II and stage III research is a population of sufficient size. Because of the low incidence of many conditions, multiple states must collaborate with a single protocol to achieve adequate statistical power to draw timely conclusions about the efficacy of a screening strategy. Traditionally, development of multistate research collaborations has been a significant challenge, and such collaborations have not been common in the NBS literature beyond survey projects. New federal initiatives may help foster larger-scale, multistate projects.

Title XXVI of the Children’s Health Act of 2000, Screening for Heritable Disorders, establishes a program to improve the ability of states to provide newborn and child screening. The Act “authorizes the Secretary to award grants to States, or a political subdivision of a State, or a consortium of two or more States, or political subdivisions of States to enhance, improve or expand the ability of States and local public health agencies to provide screening, counseling or health care services to newborns and children having or at risk for heritable disorders.”60 Furthermore, the Act “authorizes the Secretary to award grants to eligible entities to provide for the conduct of demonstration programs to evaluate the effectiveness of screening, counseling or health care services in reducing the morbidity and mortality caused by heritable disorders in newborns and children.”60 To assist in this process, the Secretary of Health and Human Services recently established the Advisory Committee on Heritable Disorders in Newborns and Children. The tasks of the committee are to provide advice and recommendations to the Secretary concerning the grants and projects authorized under the Act.

In addition, Title V of the Social Security Act provided funding for 2 new initiatives. A national coordinating center for NBS is being established, and regional genetic services and NBS collaborative systems have been created. Seven national regions have been created to “enhance and support the genetics and newborn screening capacity of States across the nation by undertaking a regional approach toward addressing the maldistribution of genetic resources. These grants are expected to improve the health of children and their families by promoting the translation of genetic medicine into public health and health care services.”61

These national priorities and funding opportunities represent an exciting development in the care of children. The state-level organization of NBS services is an accident of history but should not be a barrier to evidence-based analyses of the benefits and risks of these complex programs. The adoption of an accepted sequence of research protocols through multistate collaborations should greatly facilitate the translation of research into effective public health programs. Ultimately, there are no serious methodologic or ethical barriers to conducting stage I, stage II, and stage III research to demonstrate the efficacy of NBS modalities before the implementation of population-based programs.

ACKNOWLEDGMENTS

My thanks go to Mary Ann Bailey, PhD, and colleagues at the Hastings Center for supporting this work under grant 1 R01 HG02579.

REFERENCES

1. American Academy of Pediatrics/Health Resources and Services Administration Newborn Screening Task Force. Serving the family from birth to the medical home: newborn screening: a blueprint for the future: a call for a national agenda on state newborn screening programs. Pediatrics. 2000;106:389–422

2. Begley S. Research involving tests on newborns highlights need for stricter ethics. Wall Street Journal. May 3, 2002

3. Taylor HA, Wilfond BS. Ethical issues in newborn screening research: lessons from the Wisconsin cystic fibrosis trial. J Pediatr. 2004;145:292–296

4. New York State Task Force on Life and the Law. Genetic Testing and Screening in the Age of Genomic Medicine. Albany, NY: Health Education Services; 2000:143

5. Wilcken B, Wiley V, Hammond J, Carpenter K. Screening newborns for inborn errors of metabolism by tandem mass spectrometry. N Engl J Med. 2003;348:2304–2312

6. Elliman D, Dezateux C, Bedford HE. Newborn and childhood screening programmes: criteria, evidence, and current policy. Arch Dis Child. 2002;87:6

7. Wilfond B. Screening policy for cystic fibrosis: the role of evidence. Hastings Cent Rep. 1995;25:S21–S23

8. Grimes DA, Schulz KF. Uses and abuses of screening tests. Lancet. 2002;359:881–884

9. Miller AB. The ethics, the risks and the benefits of screening. Biomed Pharmacother. 1988;42:439–442

10. Mant D, Fowler G. Mass screening: theory and ethics. Br Med J. 1990;300:916–918

11. Kosters JP, Gotzsche PC. Regular self-examination or clinical examination for early detection of breast cancer. Cochrane Database Syst Rev. 2003;(2):CD003373

12. Feldman W. How serious are the adverse effects of screening? J Gen Intern Med. 1990;5:S50–S53


14. US Preventive Services Task Force. Screening for hemoglobinopathies, 1996. Available at: www.ahrq.gov/clinic/uspstf/uspshemo.htm. Accessed August 18, 2005

15. Gaston MH, Verter JI, Woods G, et al. Prophylaxis with oral penicillin in children with sickle cell anemia: a randomized trial. N Engl J Med. 1986;314:1593–1599

16. National Institutes of Health Consensus Conference. Newborn screening for sickle cell disease and other hemoglobinopathies. JAMA. 1987;258:1205–1209

17. Ramgoolam A, Steele R. Formulations of antibiotics for children in primary care: effects on compliance and efficacy. Paediatr Drugs. 2002;4:323–333

18. Centers for Disease Control and Prevention. Update: newborn screening for sickle cell disease: California, Illinois, and New York, 1998. MMWR Morb Mortal Wkly Rep. 2000;49:729–731

19. Sox CM, Cooper WO, Koepsell TD, DiGiuseppe DL, Christakis DA. Provision of pneumococcal prophylaxis for publicly insured children with sickle cell disease. JAMA. 2003;290:1057–1061

20. Teach SJ, Lillis KA, Grossi M. Compliance with penicillin prophylaxis in patients with sickle cell disease. Arch Pediatr Adolesc Med. 1998;152:274–278

21. Wurst KE, Sleath BL. Physician knowledge and adherence to prescribing antibiotic prophylaxis for sickle cell disease. Int J Qual Health Care. 2004;16:245–251

22. Quinn CT, Rogers ZR, Buchanan GR. Survival of children with sickle cell disease. Blood. 2004;103:4023–4027

23. Lees CM, Davies S, Dezateux C. Neonatal screening for sickle cells disease [Cochrane review]. In: The Cochrane Library. Issue 3. Chichester, United Kingdom: John Wiley & Sons; 2004

24. Levy H, Hammersen G. Newborn screening for galactosemia and other galactose metabolic defects. J Pediatr. 1978;92:871–877

25. Waggoner DD, Buist NR, Donnell GN. Long-term prognosis in galactosemia: results of survey of 350 cases. J Inherit Metab Dis. 1990;13:802–818

26. Gitzelmann R, Steinmann B. Galactosemia: how does long-term treatment change the outcome? Enzyme. 1984;32:37–46

27. Matalon R. Galactosemia: promise, frustration and challenge. J Am Coll Nutr. 1997;16:190–191

28. Widhalm K, Miranda de Cruz B, Koch M. Diet does not ensure normal development in galactosemia. J Am Coll Nutr. 1997;16:204–208

29. Badawi N, Cahalane SF, McDonald M, et al. Galactosaemia--a controversial disorder. Screening and outcome. Ireland 1972–1992. Ir Med J. 1996;89:16–17

30. National Center for Health Statistics. Infant deaths/mortality. Available at: www.cdc.gov/nchs/fastats/infmort.htm. Accessed August 18, 2005

31. Shah V, Friedman S, Moore AM, Platt BA, Feigenbaum AS. Selective screening for neonatal galactosemia: an alternative approach. Acta Paediatr. 2001;90:948–949

32. Castleberry RP. Biology and treatment of neuroblastoma. Pediatr Clin North Am. 1997;44:919–937

33. Schilling FH, Spix C, Berthold F, et al. Neuroblastoma screening at one year of age. N Engl J Med. 2002;346:1047–1053

34. Woods WG, Gao RN, Shuster JJ, et al. Screening of infants and mortality due to neuroblastoma. N Engl J Med. 2002;346:1041–1046

35. Refsum H, Fedriksen A, Meyer K, Ueland P, Kase BF. Birth prevalence of homocystinuria. J Pediatr. 2004;144:830–832

36. Wilson JMG, Jungner G. Principles and Practice of Screening for Disease. Geneva, Switzerland: World Health Organization; 1968

37. National Research Council, Committee for the Study of Inborn Errors of Metabolism. Genetic Screening: Programs, Principles, and Research. Washington, DC: National Academy of Sciences; 1975

38. Andrews LB, Fullarton JE, Holtzman NA, Motulsky AG, eds. Assessing Genetic Risks: Implications for Health and Social Policy. Washington, DC: National Academy of Sciences; 1994

39. National Institutes of Health. Promoting Safe and Effective Genetic Testing in the United States: Final Report of the Task Force on Genetic Testing. Bethesda, MD: National Institutes of Health; 1997

40. Food and Drug Administration. Testing drugs in people. Available at: www.fda.gov/fdac/special/newdrug/testing.html. Accessed August 18, 2005

41. Burke W, Atkins D, Gwinn M, et al. Genetic test evaluation: information needs of clinicians, policy makers, and the public. Am J Epidemiol. 2002;156:311–318

42. Russell LB. Educated Guesses: Making Policy About Medical Screening Tests. Berkeley, CA: University of California Press; 1994

43. AccuScan Health Imaging. Home page. Available at: www.accuscanonline.com/index2.html. Accessed August 18, 2005

44. US Preventive Services Task Force. Guide to clinical preventive services. Available at: www.ahcpr.gov/clinic/cps3dix.htm#cancer. Accessed August 18, 2005

45. National Institutes of Health, Consensus Development Panel. National Institutes of Health Consensus Development Conference Statement: phenylketonuria: screening and management, October 16–18, 2000. Pediatrics. 2001;108:972–982

46. Farrell PM, Kosorok MR, Rock MJ, et al. Early diagnosis of cystic fibrosis through neonatal screening prevents severe malnutrition and improves long-term growth. Pediatrics. 2001;107:1–13

47. Chatfield S, Owen G, Ryley HC, et al. Neonatal screening for cystic fibrosis in Wales and the West Midlands: clinical assessment after five years of screening. Arch Dis Child. 1991;66:29–33

48. Grosse SD, Boyle CA, Botkin JR, et al. Newborn screening for cystic fibrosis: evaluation of benefits and risks and recommendations for state newborn screening programs. MMWR Recomm Rep. 2004;53(RR-13):1–36

49. National Center for Biotechnology Information. Guide to Clinical Preventive Services. 2nd ed. Health Services Research Information Project; 1996. Available at: www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=hstat3.section.10513. Accessed August 18, 2005

50. Wilcken B. Ethical issues in newborn screening and the impact of new technologies. Eur J Pediatr. 2003;162:S62–S66

51. Freedman B. Equipoise and the ethics of clinical research. N Engl J Med. 1987;317:141–145

52. Miller PB, Weijer C. Rehabilitating equipoise. Kennedy Inst Ethics J. 2003;13:93–118

53. World Medical Association. Declaration of Helsinki. Available at: www.wma.net/e/policy/b3.htm. Accessed August 18, 2005

54. Mandl KD, Felt S, Larson C, Kohane IA. Newborn screening program practices in the United States: notification, research and consent. Pediatrics. 2002;109:269–273

55. Therrell BL, Hannon WH, Pass KA, et al. Guidelines for the retention, storage, and use of residual dried blood spot samples after newborn screening analysis: statement of the Council of Regional Networks for Genetic Services. Biochem Mol Med. 1996;57:116–124

56. Pollitt RJ. Newborn mass screening versus selective investigation: benefits and costs. J Inherit Metab Dis. 2001;24:299–302

57. Schoen EJ, Baker JC, Colby CJ, To TT. Cost-benefit analysis of universal tandem spectrometry for newborn screening. Pediatrics. 2002;110:781–786

58. Pandor A, Eastham J, Beverley C, Chilcott J, Paisley S. Clinical effectiveness and cost-effectiveness of neonatal screening for inborn errors of metabolism using tandem mass spectrometry: a systematic review. Health Technol Assess. 2004;8(12):1–121

59. Pass KA, Lane PA, Fernhoff PM, et al. US newborn screening system guidelines II: follow-up of children, diagnosis, management, and evaluation: statement of the Council of Regional Networks for Genetic Services (CORN). Pediatrics. 2000;137(suppl):S1–S46

60. Health Resources and Services Administration, Maternal and Child Health Bureau. Advisory Committee on Heritable Disorders and Genetic Diseases in Newborns and Children Charter. Available at: www.mchb.hrsa.gov/programs/genetics/committee/charter.htm. Accessed August 18, 2005


DOI: 10.1542/peds.2004-2571

2005;116;862

Pediatrics

Jeffrey R. Botkin

Research for Newborn Screening: Developing a National Framework



The online version of this article, along with updated information and services, is located on the World Wide Web at: http://pediatrics.aappublications.org/content/116/4/862

by the American Academy of Pediatrics. All rights reserved. Print ISSN: 1073-0397.
