
6 Discussion

6.1 Methodological considerations

Several methodological issues that might have influenced the results are discussed below.

6.1.1 Selection bias

Selection bias may be introduced when the enrolled women or their spouses differ from the non-participating men and women with respect to characteristics related to exposure and outcome. Selection bias can be a problem in studies where the exposure is known to the participants. Occupational studies are an example, since exposed men and women who have had problems conceiving might be overrepresented in the study population because they choose to participate to a higher degree than colleagues without the health problem in question (130). This will obviously hamper the generalizability of the observations to the general population. We do, however, believe that differential selection is not a major problem in this environmental study, where the participants were unaware of their exposure levels. Also, the participants were invited into the INUENDO study, whereas this thesis is based on exposure measurements from the CLEAR study. The participants were therefore ignorant of which xenobiotics were analyzed. Nevertheless, with regard to outcome, the non-participating men and women might differ from the enrolled, with a higher proportion of infertile couples in the participating population; this selection would, however, be non-differential since it is unrelated to exposure.

The participation rates were high in Greenland (90%) and Warsaw, Poland (68%), but in Kharkiv, Ukraine, only 26% of invited women participated in the study. The latter was a consequence of the recruitment procedure in Ukraine, where contact between the participants and the project team was handled by too many medical doctors. A sub-sample of 605 non-participating Ukrainian women was interviewed regarding demographic and reproductive information. Only age differed between participating and non-participating women, with the latter being slightly younger.

The men were consecutively enrolled after their wives or the couple in consensus had agreed to participate. The men who did not accept the invitation did not differ from the participating men with regard to their wives’ TTP (115).

Our study population is a selected population which differs from the general population, since no sterile couples were enrolled. Because enrollment took place over a period of three years rather than at a single point in time, subfertile couples as well as very fertile couples are well represented. Due to possible length bias, underrepresentation of subfertile cases is nevertheless expected.

6.1.2 Information bias

Systematic errors in measurements, classification of exposure status or outcome may cause information bias. Knowing whether the systematic error is differential or non-differential helps to determine the direction of the bias relative to the observed outcome of the study. Non-differential misclassification exists when the frequency of errors is approximately the same in the compared groups, often with an error towards the null hypothesis, i.e., it makes it more difficult to detect an effect of the exposure, even if it exists. Differential misclassification will occur if errors in classification of exposure status occur more frequently in one of the studied groups, and may result in the association being over- or underestimated.
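The attenuating effect of non-differential misclassification can be illustrated with a small worked example. The sketch below assumes a dichotomous exposure and applies the same sensitivity and specificity of exposure classification to cases and controls; all counts and error rates are hypothetical and are not taken from the study data.

```python
def observed_odds_ratio(a, b, c, d, sensitivity, specificity):
    """Expected odds ratio after non-differential exposure misclassification.

    a, b: truly exposed / truly unexposed cases
    c, d: truly exposed / truly unexposed controls
    The same sensitivity and specificity are applied to cases and controls,
    which is what makes the misclassification non-differential.
    """
    def reclassify(exposed, unexposed):
        # Expected counts after imperfect classification of exposure status
        seen_exposed = sensitivity * exposed + (1 - specificity) * unexposed
        seen_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
        return seen_exposed, seen_unexposed

    a_obs, b_obs = reclassify(a, b)
    c_obs, d_obs = reclassify(c, d)
    return (a_obs * d_obs) / (b_obs * c_obs)


# Hypothetical 2x2 table: 200/100 exposed/unexposed cases, 100/200 controls
true_or = observed_odds_ratio(200, 100, 100, 200, sensitivity=1.0, specificity=1.0)
biased_or = observed_odds_ratio(200, 100, 100, 200, sensitivity=0.8, specificity=0.9)
print(true_or, round(biased_or, 2))
```

In this hypothetical table the observed odds ratio lies between 1 and the true odds ratio, i.e. the error pulls the estimate towards the null hypothesis.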

Exposure and outcome assessment with regard to differential and non-differential misclassification are described in the sections below.

Exposure assessment

PFAS and HCB levels varied substantially between the three study countries. The levels of xenobiotics in serum were measured at the same laboratory by the same person; exposure levels would therefore not be expected to be biased because of inter-observer or inter-laboratory variation. Furthermore, in-house quality controls for HCB and PFASs (PFOS, PFOA, PFHxS, PFNA) were analyzed with each sample batch, with coefficients of variation (CV) of 11% and 5-9%, respectively (131). Samples from the three countries were analyzed in random order across time, so systematic differences related to season, equipment or batches are not expected.

In the Greenlandic study population, 116 out of 196 semen samples were collected up to one year after enrollment. Blood samples were drawn at enrollment. Analysis of xenobiotics with seasonal variation in blood levels, such as 5OH-MEHP, 5oxo-MEHP and 7OH-MMeOP, could therefore have been biased for these 116 men relative to the time of enrollment and fertilization. If relevant, the bias of the observed estimate would be towards the null hypothesis, since the misclassification of exposure would have been non-differential with respect to the outcome.

differential misclassification in the studies of phthalates (V and VI) since these have short half-lives (40;41). In Study V the time gap between the start of spermatogenesis for the collected semen samples and the blood sampling was approximately 90 days, corresponding to the duration of spermatogenesis. This could have caused non-differential exposure errors, as the exposure at the time of collection could have differed substantially from the exposure during spermatogenesis. In fact, all epidemiologic studies of reproductive toxic effects caused by DEHP and DiNP metabolites have collected sperm and blood/urine at the same time. The method has been validated by an exposure study of urine phthalates (132), though another exposure study discouraged it (133). The omnipresent PFASs and HCB accumulate in the body and have long half-lives; we therefore believe that the exposure level measured in one blood sample represents the exposure during spermatogenesis well. PFASs for which more than 30% of the samples were below the LOD were excluded from the analysis (116) in order to ensure that only omnipresent PFASs were included in the study.
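The exclusion rule for compounds with many values below the LOD can be sketched as follows. This is a minimal illustration only: the function name, the compound labels, the concentrations and the LOD values are hypothetical, and the thesis does not describe how the rule was implemented.

```python
def compounds_to_keep(measurements, lod, max_fraction_below=0.30):
    """Return compounds for which at most 30% of samples are below the LOD.

    measurements: dict mapping compound name -> list of concentrations
    lod: dict mapping compound name -> limit of detection (same unit)
    """
    kept = []
    for compound, values in measurements.items():
        n_below = sum(1 for v in values if v < lod[compound])
        # Exclude the compound when MORE than 30% of samples are below the LOD
        if n_below / len(values) <= max_fraction_below:
            kept.append(compound)
    return kept


# Hypothetical data: compound_A has 1 of 5 samples below the LOD (20%),
# compound_B has 2 of 5 samples below the LOD (40%)
example = {
    "compound_A": [0.05, 1.2, 2.4, 3.1, 0.9],
    "compound_B": [0.05, 0.02, 1.5, 0.8, 2.2],
}
example_lod = {"compound_A": 0.1, "compound_B": 0.1}
kept = compounds_to_keep(example, example_lod)
print(kept)
```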

The blood samples in the TTP study were drawn when the women were pregnant and not in the period when they tried to conceive, i.e. TTP was collected retrospectively. Especially in the Polish cohort we cannot exclude the possibility of non-differential misclassification, since blood samples were collected late in pregnancy (median gestational week 33), even though we adjusted for the gestational week of blood sampling. During pregnancy, physiological changes occur which might influence absorption, distribution and metabolism of phthalates. Absolute differences before and after pregnancy have been investigated and found to be relatively small (134). We therefore believe the misclassification to be moderate and non-differential, with bias towards the null for the outcome estimate.

Outcome assessment

Overall, we do not believe any differential misclassification to occur in the studies investigating seminal or hormonal characteristics since the participants were unaware of their exposure levels.

All semen samples were centrally analyzed for SCSA, TUNEL, Bcl-xL and Fas. The inter-day and inter-sample SCSA CVs of the %DFI were low, at only 6% and 1.5%, respectively. Similarly, the intra-laboratory CVs for TUNEL, Fas and Bcl-xL were 5%, 6% and 9%, respectively. Total-assay CVs for LH, FSH, estradiol, testosterone, SHBG and inhibin B were 2.6%, 2.9%, 8.1%, 2.8%, 5.5% and <7%, respectively. The semen samples were analyzed within each research country by one technician at each site. This could cause non-differential inter-laboratory variation. All local staff involved in the semen analysis had, however, participated in three quality control workshops prior to initiation of the analyses. This resulted in inter-observer CVs of 8.1% and 11.1% for sperm concentration and motility assessments, respectively (112). 95% of all semen analyses were initiated within 60 minutes, and the remaining 5% within 95 minutes, after ejaculation. Morphology was analyzed centrally by two technicians.

The relatively low CVs and the fact that the samples were analyzed in random order make the probability of assessment errors low and any such errors random.
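For reference, a coefficient of variation is the standard deviation of repeated measurements expressed as a percentage of their mean. The sketch below uses the sample standard deviation and hypothetical replicate readings of a control sample; the exact estimator used by the laboratories is not stated here and is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Coefficient of variation in percent: 100 * SD / mean.

    Uses the sample standard deviation (n - 1 denominator), which is an
    assumption; the source does not state which estimator was used.
    """
    return 100 * stdev(values) / mean(values)


# Hypothetical replicate readings of the same control sample
replicates = [9.5, 10.0, 10.5]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.1f}%")
```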

Semen quality can vary between ejaculates from the same man. With only one semen sample per man, it was not possible to account for this within-man variation (135). The solution is to base the statistics on a sufficiently large number of participants. Our study population consists of approximately 600 men, which gives sufficient statistical power to detect even small changes.

All studies could have been hampered by recorder bias. We do not believe this to be a problem since 1) a standardized questionnaire was used, 2) the interviewers were trained to adhere strictly to the question and answer format, and 3) the interviewers were blinded to the participant’s fertility status and exposure levels, as these were not known at the time of interview.

The women were pregnant at enrollment in the study; we therefore do not believe TTP to be hampered by recall bias, despite the retrospective approach in Study VI (42). To reduce misclassification of outcome, all accidental pregnancies were excluded, i.e. pregnancies among women who were not trying to become pregnant or who were using contraception. Couples with accidental pregnancies might be more fecund than the overall population. It could therefore be argued that they should be included in the analysis with a TTP of one or zero (136;137). However, they might also be sub-fertile and therefore refrain from using contraception in spite of not planning to get pregnant (138). In the interview, couples were not asked if they had received medical treatment for infertility. To take this into account, TTP values above 13 months were censored, since infertility treatment often begins after one year of regular intercourse without the use of contraception (137).
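The censoring rule can be expressed as a small data-preparation step for a time-to-event analysis. The sketch below is an assumption about how such a rule might be coded, not a description of the analysis actually performed: the function name is hypothetical, and setting the observed time of censored couples to the 13-month cutoff is one common convention.

```python
def censor_ttp(ttp_months, cutoff=13):
    """Convert reported TTP values into (observed_time, event) pairs.

    event = 1: pregnancy observed at or before the cutoff
    event = 0: observation censored at the cutoff (TTP above 13 months),
               assuming the censored time is set equal to the cutoff
    """
    pairs = []
    for ttp in ttp_months:
        if ttp > cutoff:
            pairs.append((cutoff, 0))  # censored: possible infertility treatment
        else:
            pairs.append((ttp, 1))     # pregnancy counted as an observed event
    return pairs


# Hypothetical TTP values in months
censored = censor_ttp([1, 6, 13, 20])
print(censored)
```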

6.1.3 Confounding

A confounder is a causal determinant of the outcome which is also associated with the exposure of interest, but without being on the causal pathway from exposure to outcome.

An advantage of this study is that the men and women were interviewed face-to-face, and we were thus able to obtain data on known or possible confounders previously described in the literature (139).

In Study I regarding apoptosis, the multiple models were adjusted for all demographic and clinical characteristics based on a priori assumptions from semen studies. We were not able to find sufficient literature on confounding variables related to both apoptotic markers and male fertility, since previous studies did not describe confounders (19;26-28;31;140). We do not believe this to have introduced over-adjustment, since the crude and adjusted estimates did not differ substantially.

We chose to adjust for cotinine in serum instead of smoking status (yes/no) to avoid social desirability bias, where participants answer “no” although they smoke on certain occasions; this could cause residual confounding. Cotinine, which is a biomarker of nicotine, gives a snapshot of the person’s smoking status and might therefore likewise give an imprecise picture of the truth, but passive smoking will be captured in the sample. If we adjusted for both co-variates we would over-adjust the association, because these two variables are highly correlated (r=-0.5, p<0.001). To take passive smoking into account and to avoid social desirability bias, we therefore chose to adjust for cotinine in serum.

TTP might provide a good approximation of fecundity in our large study population, but several determinants are important for the actual comparability of TTP and couple fecundity. Couples have to stop using birth control and start attempting pregnancy at an exact time-point with regular unprotected intercourse for the TTP to be a valid measure of fecundity (136). Cultural and behavioral differences regarding pregnancy planning might exist between the three study populations, which can affect the use of TTP as an approximation of couple fecundity. There is thus a risk of non-differential misclassification, where longer TTPs are more closely related to infrequent sexual intercourse than to actual reduced fecundity. We adjusted for the frequency of sexual intercourse to account for possible cultural or behavioral differences in pregnancy planning, though residual confounding might occur, especially regarding variables influencing the phthalate exposure.

Parity is known to influence the TTP (137). To take this into account we adjusted for parity and performed a sub-analysis restricted to first-time parents. When only primiparous women were investigated, the estimate changed direction towards a shorter TTP in first-time pregnant women exposed to DEHP metabolites.

Reverse causality may explain the association between e.g. phthalates and testosterone if men with low testosterone metabolize phthalates more slowly than men with normal testosterone. Likewise, if more fertile women with shorter TTP metabolize phthalates slowly, we will see high phthalate levels corresponding to shorter TTP. This is, however, not possible to investigate in a cross-sectional study and is thus speculative.

Multiple testing and risk of chance findings

The study population was large (Studies I, V and VI) compared to other semen studies; the results were therefore unlikely to be caused by low statistical power or type 2 error.

Using 0.05 as the critical significance level, the expected proportion of false positive (type 1 error) associations is 0.05, which means that for every 20 comparisons one result is expected to be significant because of random variation (20 x 0.05 = 1). In the three studies of xenobiotics and male reproductive function (Studies III-V), 278 adjusted analyses were made, from which 16 statistically significant associations emerged (Table 9), not including analyses of single phthalate metabolites. Taking the risk of multiple significance tests into account, approximately 14 of the significant associations could be chance findings (278 x 0.05 = 13.9). Results pointing in the same direction in pooled analysis as well as within each country are more likely not to be chance findings. It was not possible to make pooled analyses in Studies III and IV because of highly differentiated exposure levels of PFASs and HCB between countries. In these studies, effect estimates pointing in the same direction in all countries likewise indicate a lower risk of type 1 errors due to multiple testing. In none of the three studies was any association consistently significant in all countries. As reported in Table 9, only the association in Study V between Proxy-MiNP and testosterone was significant in both Greenland and Ukraine, as well as overall. Three associations had negative regression coefficients in all study sites (Proxy-MEHP and total sperm count or testosterone, and Proxy-MiNP and testosterone), and in Study III one association had positive regression coefficients in all countries (PFOA and SHBG). The significant association between PFOA and TUNEL in Greenland, as well as the three significant associations which emerged in the study of HCB, all pointed in different directions in the three study countries; we should therefore be cautious not to attribute too much importance to them (Table 9).
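The arithmetic behind the expected number of chance findings can be checked with a few lines. The sketch below only reproduces the multiplication described in the text, using the counts reported there; the variable names are illustrative, and it assumes that every tested null hypothesis is true, which gives an upper-end expectation.

```python
n_tests = 278        # adjusted analyses in Studies III-V (from the text)
alpha = 0.05         # critical significance level
n_significant = 16   # statistically significant associations (Table 9)

# Expected number of false positives if every null hypothesis were true
expected_false_positives = n_tests * alpha

print(f"Expected chance findings: {expected_false_positives:.1f}")
print(f"Observed significant associations: {n_significant}")
```

With roughly 14 chance findings expected against 16 observed, the count of significant results alone carries little information, which is why consistency across countries is weighed in the interpretation.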

As important as consistency across countries are biological plausibility and comparable findings in other similar studies. When this is taken into account, especially the results from the study investigating phthalates and male reproductive function might be plausible.

Overall, we cannot exclude the risk of residual confounding by co-variates we did not adjust for or co-variates too roughly classified. We have, however, no reason to assume this to be the case.

Selection bias is unlikely due to our study design. Likewise, we do not believe differential misclassification to be a problem in this study. Uncertainties in measurements of exposures and outcomes were relatively small and, if present, would bias towards the null hypothesis. The solution to non-differential misclassification is a large sample size, which we have with the 600 enrolled men.

Analyses based upon cross-sectional data do not formally allow for conclusions on causality of the observed associations, because the direction of cause and effect may be difficult to assess, but the results can generate hypotheses which can be investigated in longitudinal studies.

In document Effects on male reproductive function of phthalates and other environmental xenobiotics in humans (Page 36-41)