4. Discussion
4.3. Genotype-phenotype correlations
This study has shown that there are significant correlations between APC mutation position and disease severity. These data confirm and extend previous reports of genotype-phenotype correlations. The associations are present when the APC mutations are grouped by APC functional domain. Within the groups, linear regression did not demonstrate any further relationship between APC mutation position and disease severity, except in two (20aa, excluded from further analysis, and Post-MCR). The power to detect a within-group relationship may have been low because some of the groups were small. Collectively, these data support the hypothesis that nearby mutations are equivalent and will produce similar disease severities (in the absence of other factors). Therefore, if patient A's and patient B's mutations are close to one another, their severities can be directly compared.
Three tiers of corrected count severity were identified. The MCR group represented the severest tier, with a geometric mean severity of 3459. The next tier was represented by Unk, Pre-arm and Post-MCR, with a geometric mean severity of 1118. The severity of this tier is modestly greater than that of the remaining groups (Arm+, β-cat and Pre-MCR), whose geometric mean severity was 654.
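The tier severities above are quoted as geometric means. As a minimal sketch of how such a value could be computed, the snippet below averages the counts on the log scale and back-transforms. The function name and the polyp counts are hypothetical and are not data from this study.

```python
import math

def geometric_mean(counts):
    """Geometric mean: exponentiate the arithmetic mean of the natural logs."""
    if not counts or any(c <= 0 for c in counts):
        raise ValueError("counts must be non-empty and strictly positive")
    return math.exp(sum(math.log(c) for c in counts) / len(counts))

# Hypothetical corrected polyp counts for one mutation group (illustration only).
example_counts = [100, 1000, 10000]

gm = geometric_mean(example_counts)
am = sum(example_counts) / len(example_counts)
print(round(gm), round(am))
```

For strongly right-skewed counts such as these, the geometric mean is pulled upwards far less by a single very large count than the arithmetic mean is.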
Analysis of genotype-phenotype associations is complex. I have assumed that all of the identified mutations produced stable truncated product; clearly this does not apply to the Unk group. None of the pedigrees with identifiable mutations came from gene regions associated with attenuated polyposis, bar one (a patient with nt 1129 ins AGTA, exon 9; severity: 1290 polyps). All of the pedigrees in this study produced classical colonic FAP, i.e. greater than 100 polyps. Even so, a small minority of the patients had very mild disease that in other contexts could be described as attenuated colonic polyposis. Five patients had uncorrected polyp counts of 100 or less (range 73-100 polyps). Two of the five came from the same pedigree, whose members were otherwise classical for FAP (mean pedigree severity: 488 polyps; range: 73-1133; n = 5; mutation 3249 ins t). The remaining patients were from families with clearly classical polyposis. This suggests that the severity distribution for classical FAP may partly explain the phenomenon of attenuated FAP, i.e. a small proportion of patients will have mild disease simply by chance. Furthermore, the conventional criterion of 100 polyps may be slightly too high. However, it does not explain attenuated disease clustering in families, because if attenuated FAP is purely a function of the severity distribution, then clustering within families should be infrequent. For example, the probability of finding two attenuated FAP members within a family of five affected members is very low: P(2 attenuated in a family of 5) = 0.013. This is, in itself, an argument for extralocus factors modifying disease severity, e.g. modifier genes.
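The probability quoted above can be checked with a short calculation. The sketch below takes the footnoted per-patient estimate, P(attenuated) = 4/103, and assumes that family members are independent draws from the same severity distribution, so that the number of attenuated members follows a binomial distribution. Reading the result as "exactly two of five" is an assumption; the function and variable names are illustrative only.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Per-patient probability of attenuated disease (< 100 polyps), from the footnote.
p_attenuated = 4 / 103

# Probability that exactly 2 of 5 affected family members are attenuated,
# assuming independence between family members.
prob = binomial_pmf(2, 5, p_attenuated)
print(round(prob, 3))  # 0.013
```

Under these assumptions the calculation reproduces the quoted value, which is why the observed clustering of attenuated disease within families is hard to attribute to chance alone.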
I chose to group APC mutations by functional domains rather than by exons. Because most mutations produce stable protein product, I felt it more biologically plausible to group by functional features of the gene. Data analysis under such a model is complex.
For example, feature dosing (β-catenin binding repeats) requires within-domain analysis. Regression analysis across the entire gene, i.e. as a single block, is inappropriate because some domains may ‘step-up’ disease severity as they are transgressed (e.g. the transition from Pre-MCR to MCR). Linear and multiple regression assume that the explanatory variable data are normally distributed and that severity transitions are graduated. The graduated transition is critical. A step change in severity represents a mathematical singularity; it cannot be accommodated using statistical algebra. These requirements are not fulfilled by APC gene mutations and FAP severity data as a whole (see graph 6), but may be in part (i.e. on a per sub-group basis).
The underlying APC mutation distribution is uniform with ‘hotspots’ at codons 1061 and 1309. Furthermore, because of variation in pedigree size, some APC mutations are more influential than they should be. This too is complex. For example, observations made on patients with identical germline mutations will theoretically give a better estimate of the severity for the mutation than an isolated value, assuming that the observations are independent. However, this may be confounded if the patients are related, because they may share genetic or environmental modifiers. For this reason, I grouped APC mutations and then performed ANOVA, which has the less limiting requirement of variance equality between groups (i.e. groups can be considered categorical variables rather than using APC mutation position, a quantitative variable).

P(attenuated) (i.e. < 100 polyps) = 4/103, after restriction to Arm+, β-cat and Pre-Arm (the mildest groups), because attenuated FAP is not a feature of the severe polyposis mutations. Therefore P(2 of 5 | mild severity group) follows from the binomial distribution: 0.013.
These data demonstrate that the effect of APC germline mutation can be modelled and accounted for. When this is done, the severity data for an individual can be decomposed into two parts: germline mutation effect and residual effect. The effects of modifier genes (if any) are to be found by examination of this residual. The data demonstrate that mutations that are near to each other can be grouped. The groups can be defined in terms of gene features. The comparison of the severity of colonic polyposis only requires knowledge of the germline APC mutation position. As long as the patient has undergone colectomy in early adult life, it is reasonable not to adjust for either gender or age at operation. These were the provisos that I relied on for my subsequent analyses.
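The decomposition into a germline mutation effect and a residual can be illustrated with a small sketch. It assumes that severity is analysed on a log scale (in keeping with the geometric means reported earlier, although the exact transformation is an assumption) and that the germline effect is estimated by the mean of each mutation group. The patient identifiers and polyp counts are hypothetical; only the group labels come from the text.

```python
import math
from collections import defaultdict

def decompose(records):
    """Split each log10 severity into a group (germline) effect and a residual.

    records: list of (patient_id, mutation_group, polyp_count) tuples.
    Returns {patient_id: (group_effect, residual)} on the log10 scale.
    """
    logs_by_group = defaultdict(list)
    for _, group, count in records:
        logs_by_group[group].append(math.log10(count))
    # Germline effect: mean log severity of the patient's mutation group.
    group_mean = {g: sum(v) / len(v) for g, v in logs_by_group.items()}
    return {
        pid: (group_mean[group], math.log10(count) - group_mean[group])
        for pid, group, count in records
    }

# Hypothetical patients and counts, for illustration only.
records = [
    ("A", "MCR", 1000), ("B", "MCR", 10000),
    ("C", "Pre-MCR", 100), ("D", "Pre-MCR", 1000),
]
parts = decompose(records)
for pid, (effect, residual) in parts.items():
    print(pid, round(effect, 2), round(residual, 2))
```

By construction the residuals sum to zero within each group. A patient with a large positive or negative residual has disease that is more or less severe than the mutation group alone would predict, which is where any modifier effect would be sought.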
In some respects ANOVA and multiple regression analysis are synonymous and have related properties. Draper and Smith (1981). Applied regression analysis. New York, Wiley.