9. PHASE 2: RCT OUTCOME RESULTS

9.7. Sensitivity analyses based on follow-up timing

9.8.1. General considerations

The intervention had a positive effect on all indicators except satisfaction with care, for which no effect was expected. Effect sizes were reported as Risk Differences and ranged from about 5 to 12 percentage points. There are several potential reasons why we did not observe even larger effects. First, despite high adherence, a substantial number of intervention group couples did not attend all sessions. This is a risk inherent in all trials of behavioural-educational interventions that rely to a large degree on participant motivation. Low uptake essentially counts as cross-over from the intervention to the control group. In analysis by intention-to-treat, this inevitably leads to substantial dilution of effect estimates towards the null. Higher coverage than we were able to achieve might have led to larger effects.
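
The dilution argument above can be made concrete with a small calculation. The following is a minimal sketch under a deliberately simple model, not the analysis used in this study: it assumes that intervention-arm couples who do not take up the intervention have the same risk as the control arm, and that any control-arm participants exposed to the intervention (contamination, discussed later in this section) receive its full effect. The function name and all numbers are hypothetical.

```python
def diluted_risk_difference(rd_full_uptake, uptake, contamination=0.0):
    """Expected intention-to-treat risk difference under a simple dilution model.

    rd_full_uptake: risk difference if everyone in the intervention arm and
                    no one in the control arm received the intervention.
    uptake:         proportion of the intervention arm actually receiving it.
    contamination:  proportion of the control arm exposed to it.

    Assumption (hypothetical model): the effect is all-or-nothing per
    participant, so each arm's risk shifts in proportion to exposure.
    """
    for name, value in (("uptake", uptake), ("contamination", contamination)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    # Intervention arm risk: p0 + uptake * rd; control arm risk: p0 + contamination * rd.
    return rd_full_uptake * (uptake - contamination)


# Hypothetical illustration: a 15 percentage point effect under full uptake.
itt_with_partial_uptake = diluted_risk_difference(0.15, uptake=0.80)
itt_with_contamination = diluted_risk_difference(0.15, uptake=0.80, contamination=0.10)
print(f"ITT RD with 80% uptake: {itt_with_partial_uptake:.3f}")
print(f"ITT RD with 80% uptake and 10% contamination: {itt_with_contamination:.3f}")
```

Under these assumptions the intention-to-treat estimate shrinks in direct proportion to the gap in exposure between the two arms, which is why even modest shortfalls in uptake matter.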

Another consideration is that, because of time constraints, we commenced data collection almost immediately after health workers had been trained, after only a short pilot period. Several health workers, especially those who did not participate in our formal training workshops, were still learning on the job how the intervention sessions should be provided. Based on the information collected during monitoring and supervision, the smooth running of the study activities in each PHC definitely improved over the course of the implementation period. Had we been able to allow for a longer adaptation period, habit and practice might have improved performance and therefore resulted in higher intervention effects. This might be the case in a real-life scenario in which such an intervention were implemented over a longer period of time. Of course, there are also reasons why effectiveness might be lower in real-life situations, not least the absence of additional resources and less supervision and support. For further discussion of these issues see Subchapter 11.5.

Due to the nature of the intervention and the limited opportunities for blinding, there was also a small but real risk of contamination inherent in the choice of an individually-randomised study design. This would have essentially counted as treatment switching from the control to the intervention group, and may have played a role in limiting the size of the effects observed. In the case of certain indicators, our results showed a considerably different picture, both for the intervention and for the control arms, from the baseline levels that we identified based on the latest DHS and other studies. It is likely that there have been some increases in PNC attendance and FP uptake since the previous studies from which we extracted baseline levels (our findings do not suggest substantial contextual improvements in EBF). However, it is also possible that contamination between the study arms may have led to higher levels of PNC and FP use in the control group, compared to the general population.

Effective monitoring procedures were in place to reduce non-compliance with the assigned treatment and these were almost entirely successful in preventing the attendance of control-group participants in the intervention sessions. Therefore, any contamination is likely to have occurred during informal contact in the community between women and men from the two arms. It is possible that couples from different groups might have lived close to each other or come into contact under other circumstances which might have led to them discussing the study. The focus on men may have increased the likelihood of this happening, given that men spend more time out of their homes and may have wider social networks. In general, it would be useful for future studies to collect good quality observational data from the population in Bobo Dioulasso, in order to assess baseline levels for the most important MNH and FP outcomes, and to provide a benchmark for future intervention studies.

Another issue to reflect upon is the presence of interaction. Our results suggest that the effect on certain outcomes differed substantially by recruitment PHC. In particular, there was evidence of interaction for the use of effective modern contraception (primary outcome c.), any contraceptive use at 3 and 8 months (secondary outcome b.) and timely initiation of effective modern contraception (secondary outcome c.). It is interesting to consider why we observed evidence of effect modification for these outcomes and not for others. In particular, we found no evidence of effect modification for the use of LARC methods (secondary outcome a.). The available data seem to indicate that the intervention increased the uptake of long-acting methods in all PHCs, whereas the effect varied for other FP methods. However, it is useful to bear in mind the low power of the Likelihood Ratio Test for interaction: interaction may also have been present for other outcomes, without there being sufficient evidence to detect it.
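
The point about the low power of tests for interaction can be illustrated with a rough calculation. The sketch below does not reproduce the Likelihood Ratio Test used in the analysis; it swaps in a simpler two-sided Wald z-approximation on the risk-difference scale. Every number in it is hypothetical and chosen only for illustration: two PHCs, 150 couples per arm in each, a control-arm risk of 40%, and risk differences of 15 and 5 percentage points.

```python
from math import sqrt
from statistics import NormalDist


def rd_variance(p_int, p_ctl, n_per_arm):
    """Approximate variance of a risk difference between two equal-sized arms."""
    return p_int * (1 - p_int) / n_per_arm + p_ctl * (1 - p_ctl) / n_per_arm


def wald_power(effect, se, alpha=0.05):
    """Approximate power of a two-sided Wald z-test to detect `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect) / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical scenario (not study data).
n_per_arm = 150   # couples per arm within each PHC
p_ctl = 0.40      # control-arm risk, assumed equal in both PHCs
rd_phc_a = 0.15   # assumed risk difference in PHC A
rd_phc_b = 0.05   # assumed risk difference in PHC B

var_a = rd_variance(p_ctl + rd_phc_a, p_ctl, n_per_arm)
var_b = rd_variance(p_ctl + rd_phc_b, p_ctl, n_per_arm)

# Overall effect: average of the two PHC-specific risk differences.
overall_rd = (rd_phc_a + rd_phc_b) / 2
overall_se = sqrt((var_a + var_b) / 4)

# Interaction contrast: difference between the two PHC-specific risk differences.
interaction_rd = rd_phc_a - rd_phc_b
interaction_se = sqrt(var_a + var_b)

power_overall = wald_power(overall_rd, overall_se)
power_interaction = wald_power(interaction_rd, interaction_se)
print(f"Power for overall effect: {power_overall:.2f}")
print(f"Power for interaction:    {power_interaction:.2f}")
```

In this sketch the overall effect and the between-PHC difference are the same size (10 percentage points), but the standard error of the interaction contrast is twice that of the overall effect, so the interaction is detected far less often. This is the sense in which absence of evidence for interaction on other outcomes should not be read as evidence of its absence.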

In order to explore the reasons for the presence of interaction, I looked at whether there were any differences in baseline characteristics between the two study arms in each recruitment PHC, and identified some differences for six factors (type of marriage, ethnicity, school, work, parity, and prior use of contraception). These were identified through visual inspection, with no statistical testing due to the small numbers in many of the categories. For reference, the relevant table can be found in Appendix 1. I then ran the stratified analysis for each primary and secondary outcome with these six baseline characteristics added to the model. However, this additional adjustment did not change my findings with respect to interaction by recruitment PHC, suggesting that these differences do not explain the interaction seen. The results presented in this chapter are therefore from the simpler model. We must assume that the presence of interaction is due either to unmeasured population-level differences, or to differences inherent in the PHCs themselves (size, supplies, management structure, work ethic) or in the way they implemented the intervention (emphasis on certain health messages rather than others, leadership). The qualitative process evaluation explored some of these factors (see Chapter 10).

In relation to the validity of the data collected, it is important to point out that for certain secondary outcomes, namely relationship adjustment, timely initiation of contraception and satisfaction with routine care, the measures of effect used were not validated, and therefore the findings, while positive, must be interpreted with caution and considered exploratory. It is also important to consider the role of biases such as recall bias and social desirability/courtesy bias. It is possible that recall bias may have come into play in relation to events that occurred in the past, for example in the 8-month data concerning the timing of resumption of intercourse. However, it is unlikely that this bias would have had a differential effect by study arm. On the other hand, courtesy bias may have substantially affected responses related to satisfaction, and intervention group women may have been more likely to report desired breastfeeding practices in order to please the interviewer. This might have been compounded by the impossibility of blinding participants and data collectors to treatment allocation, a limitation which was inherent in the nature of the intervention. We do not know in what way knowledge of arm assignment may have influenced participant responses in the interviews, or interviewer interpretation of those responses.

As far as BF is concerned, however, these factors are unlikely to have substantially biased the results, given that the focus on BF was not exclusive, several health topics were covered during the intervention sessions, and the interviewers enquired about a wide range of health behaviours. Social desirability bias is unlikely to differ by study arm for this outcome, given that most women, including those in the control group, would have received BF advice before or after birth and therefore been aware of recommended practices. In addition, we observed high levels of validity in both arms in relation to reported FP use (see Subchapter 9.4.16), and it is possible that the same applies to BF and other outcomes. As described, efforts were made to reduce these biases by field testing questionnaires and training interviewers in interpersonal communication skills.