This section describes the statistical tools used in the analysis, namely the existing statistical methods used to estimate adult mortality in general and maternal mortality in particular, to analyse their determinants and to project their levels.
3.3.1 Chi-square
Consider a dependent qualitative variable $Y$ with $\eta_1$ categories ($Y_1, Y_2, \ldots, Y_j, \ldots, Y_{\eta_1}$) and a predictor $X$ with $\eta_2$ categories ($X_1, X_2, \ldots, X_i, \ldots, X_{\eta_2}$), observed on a sample of $n$ observations. The Chi-square test of independence allows us to reject or retain a hypothesis called the null hypothesis ($H_0$).
We define the hypotheses of the Chi-square test as follows:
$H_0$: The two variables are independent, that is, there is no association between them.
$H_1$: The two variables are statistically dependent, or associated.
$H_1$, the opposite of the null hypothesis, is called the alternative hypothesis. The observed value at a given cell $(I, J)$, denoted $O(I, J)$, is the number of people belonging to both categories $X_I$ and $Y_J$. The expected value at cell $(I, J)$, denoted $E(I, J)$, is the number of people who would be expected to belong to both categories $X_I$ and $Y_J$ if $H_0$ were true. In other words, the expected value at $(I, J)$ is the number of observations in the sample multiplied by the probability, under independence, that an observation belongs to both $X_I$ and $Y_J$; that is, $E(I, J) = n \cdot P(X_I) \cdot P(Y_J)$, with the two marginal probabilities estimated by the sample proportions.
By definition, the number of degrees of freedom is:
$$df = (\eta_1 - 1)(\eta_2 - 1) \qquad (3.7)$$
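The value of the Chi-squared test statistic is given by:
$$\chi^2 = \sum_{i=1}^{\eta_2} \sum_{j=1}^{\eta_1} \frac{\left(O(i,j) - E(i,j)\right)^2}{E(i,j)} \qquad (3.8)$$
The value of the Chi-squared statistic with Yates's continuity correction is given by:
$$\chi^2_y = \sum_{i=1}^{\eta_2} \sum_{j=1}^{\eta_1} \frac{\left(\left|O(i,j) - E(i,j)\right| - 0.5\right)^2}{E(i,j)} \qquad (3.9)$$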
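The computation of the expected values, the degrees of freedom and the two versions of the statistic can be sketched as follows. This is a minimal illustration in Python rather than the code used in the study; the function names and the 2 x 2 table are hypothetical, with rows standing for the categories of X and columns for the categories of Y.

```python
def expected_counts(observed):
    """E(i, j) = (row i total) * (column j total) / n, i.e. the counts expected under H0."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed, yates=False):
    """Return (statistic, degrees of freedom) for a contingency table of counts."""
    expected = expected_counts(observed)
    statistic = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            diff = abs(o - e)
            if yates:
                diff -= 0.5  # Yates's continuity correction
            statistic += diff ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical table of counts, for illustration only.
table = [[20, 30],
         [30, 20]]
print(chi_square(table))              # uncorrected statistic and df
print(chi_square(table, yates=True))  # statistic with Yates's correction and df
```

For this table every expected value is 25, so the uncorrected statistic is 4 x (5^2 / 25) = 4.0 and the corrected one is 4 x (4.5^2 / 25) = 3.24, both with one degree of freedom.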
The critical values are obtained from the chi-square distribution table according to the degrees of freedom and the level of significance. The calculated value of the Chi-square statistic can then be compared with the critical value in order to draw conclusions. The probability density function (pdf), denoted $f$, of the Chi-square distribution is:
$$f(x) = \frac{x^{\frac{\upsilon}{2} - 1}\, e^{-\frac{x}{2}}}{2^{\frac{\upsilon}{2}}\, \Gamma\left(\frac{\upsilon}{2}\right)} \qquad (3.10)$$
where $x > 0$ and $\upsilon$ is the mean of the distribution and represents the degrees of freedom of the Chi-square test of independence. The Gamma function is defined as follows:
$$\Gamma(x) = \int_0^{\infty} e^{-y}\, y^{x-1}\, dy \qquad (3.11)$$
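As an illustration of the density and of the Gamma function, the short Python sketch below evaluates the pdf with the standard library and checks numerically that the mean of the distribution is close to the degrees of freedom. The function names `chi2_pdf` and `numeric_mean`, the integration bound and the step count are assumptions made for this example only.

```python
import math


def chi2_pdf(x, nu):
    """Density of the Chi-square distribution with nu degrees of freedom."""
    if x <= 0:
        return 0.0  # the density is only defined for x > 0
    half = nu / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


def numeric_mean(nu, upper=100.0, steps=100_000):
    """Trapezoidal approximation of the integral of x * f(x) over (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(1, steps):
        x = k * h
        total += x * chi2_pdf(x, nu)
    # The integrand is 0 at x = 0; the end point at x = upper gets half weight.
    total += 0.5 * upper * chi2_pdf(upper, nu)
    return total * h


print(math.gamma(5))     # Gamma(5) = 4! = 24.0
print(chi2_pdf(2.0, 2))  # for nu = 2 the density reduces to 0.5 * exp(-x / 2)
print(numeric_mean(4))   # close to nu = 4
```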
Based on the pdf, we can build the chi-square distribution table of critical values $\breve{\chi}^2_{df}$ together with the associated probabilities (or levels of significance). Two variables almost always show some degree of association in a sample; what differs is the level of significance of that association. In this study, we chose α = 0.05 = 5%.
In other words, we accept only relationships significant at the 1 − α = 95% level or above. We consider any association significant at less than 95% to be very weak and therefore not acceptable, since it could be due to chance. In such a situation, we consider that there is no statistically significant association.
We then determine the p-value of the test. The p-value of the $\chi^2$ test is computed (or, more commonly, read from statistical probability tables) as the probability of a type I error incurred if $H_0$ is rejected on the basis of the calculated statistic:
p-value = P(Type I error)
= P(rejecting $H_0$ when $H_0$ is true)
= P(value of the test statistic is in CR when $H_0$ is true)
= P($\chi^2_{df} > \breve{\chi}^2_{df}$), with a one-tailed CR
where CR is the critical region (or rejection region, RR) in the Chi-square distribution plot, here taken to start at the calculated value of the statistic. Thus, the upper-tail probability corresponding to $\chi^2_{df} > \breve{\chi}^2_{df}$ is the p-value. The following conclusion is drawn from the p-value:
p-value < α ⟹ Reject $H_0$
p-value > α ⟹ Do not reject (accept) $H_0$
Rejecting $H_0$ means that the variables under test are statistically significantly associated, or dependent. Put another way, there is statistical evidence that the observed association between the two variables is not due to chance. Accepting $H_0$ means that the variables are not statistically significantly associated, that is, there is no statistical evidence of an association between the two variables tested. We can also say that the observed association may well be due to chance.
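The decision rule can be sketched in Python as below. This is an illustrative implementation, not the procedure used in the study: the upper-tail probability is obtained from the standard series expansion of the regularized lower incomplete gamma function, and the names `chi2_sf` and `decide`, as well as the two statistic values, are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with df degrees of freedom > x)."""
    if x <= 0:
        return 1.0
    a, z = df / 2, x / 2
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = total = 1.0
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a + 1))
    return max(0.0, 1.0 - lower)


def decide(statistic, df, alpha=0.05):
    """Return the p-value and the conclusion at significance level alpha."""
    p_value = chi2_sf(statistic, df)
    conclusion = "reject H0" if p_value < alpha else "do not reject H0"
    return p_value, conclusion


# Hypothetical statistics with one degree of freedom.
print(decide(4.0, 1))   # p-value just below 0.05, so H0 is rejected
print(decide(3.24, 1))  # p-value above 0.05, so H0 is not rejected
```

With one degree of freedom the 5% critical value is about 3.84, which is why a statistic of 4.0 leads to rejection while 3.24 does not.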
3.3.2 Logistic regression
This technique is applicable when the dependent variable is dichotomous, that is, when it has two categories, 0 and 1. Category 1 represents the event under study (the cases), while category 0 represents the opposite event, or control group. For each independent variable, one category is identified as the reference category, which serves as the basis for comparing the risk that people in the other categories face the event under study. In other words, logistic regression allows us to compare the risk of the event under study between each category and the reference category of a given independent variable in the model.
The equation of a multiple logistic regression can be presented as follows:
$$Y = E(Y) + \varepsilon_i \qquad (3.12)$$
where $Y \sim B(p)$ is a Bernoulli random variable with probability of success $p$. In logistic regression, the dependent variable $Y$ is a dummy variable which can take only two values, $Y = 1$ or $Y = 0$, where:
$P(Y = 1) = p$
$P(Y = 0) = 1 - p$
$p$ = probability that the dependent variable is 1, or proportion of people in category 1 of the dependent variable.
$1 - p$ = proportion of people in category 0 of the dependent variable.
The second equation of the logistic model can also be written as follows:
$$\mathrm{logit}(E(Y)) = \mathrm{logit}(p) = \beta_0 + \sum_{i} \beta_i X_i$$
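The above equation can be written in the form:
$$\mathrm{logit}(\hat{p}) = \log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = \beta_0 + \sum_{i} \beta_i X_i \qquad (3.16)$$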
In equation 3.16, consider a specific value of the index, $i = I$. The categorical variable $X_I$ may have many categories (possible values). For simplicity, let us treat $X_I$ as a (0,1) variable, with $X_I = 0$ as the reference category and $X_I = 1$ as the category compared with it.
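The regression coefficient $\beta_I$ can be determined as follows, where $\hat{p}_1$ and $\hat{p}_2$ are the probabilities of the event under study when $X_I$ takes the values $a$ and $a + 1$ respectively (here $a = 0$):
$$\mathrm{logit}(\hat{p}_2) = \beta_0 + \beta_I \times 1 + \sum_{i \neq I} \beta_i X_i \qquad (3.21)$$
$$\mathrm{logit}(\hat{p}_1) = \beta_0 + \beta_I \times 0 + \sum_{i \neq I} \beta_i X_i \qquad (3.22)$$
$$\mathrm{logit}(\hat{p}_2) - \mathrm{logit}(\hat{p}_1) = \beta_I \qquad (3.23)$$
$$\log(odds_2) - \log(odds_1) = \beta_I \qquad (3.24)$$
$$\log\left(\frac{odds_2}{odds_1}\right) = \beta_I \qquad (3.25)$$
$$\log(OR) = \beta_I \qquad (3.26)$$
$$OR = e^{\beta_I} \qquad (3.27)$$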
where OR is the odds ratio. In logistic regression, the exponential of the regression coefficient of a given variable $X_I$ is interpreted as the odds ratio between category 1 and the reference category 0 (Kleinbaum et al., 2002). It gives the ratio of the odds of the event between two categories of a given predictor X.
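The relationship between the coefficient and the odds ratio can be checked numerically with the short Python sketch below. The intercept, the coefficients and the function names are hypothetical values chosen for illustration and are not estimates from this study.

```python
import math


def logistic_probability(beta0, betas, xs):
    """Probability p such that logit(p) = beta0 + sum of beta_i * x_i."""
    linear_predictor = beta0 + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def odds(p):
    """Odds of the event: p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical model with two predictors; the first plays the role of X_I.
beta0 = -2.0
betas = [0.7, -0.3]

p1 = logistic_probability(beta0, betas, [0, 1])  # X_I = 0 (reference category)
p2 = logistic_probability(beta0, betas, [1, 1])  # X_I = 1, other predictor unchanged

odds_ratio = odds(p2) / odds(p1)
print(odds_ratio, math.exp(betas[0]))  # the two values agree up to rounding error
```

Because the other predictor is held fixed, its contribution cancels in the ratio, which is why the odds ratio depends only on the coefficient of $X_I$.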
The analysis of the logistic regression relies on a number of statistics that need to be clarified.
Deviance analysis of model fit
The deviance statistics are an alternative to $R^2$ for measuring model fit. The difference between the null deviance ($D^2_{Null}$) and the residual deviance ($D^2_{residual}$) follows a Chi-squared distribution whose degrees of freedom ($ddf$) equal the difference between the degrees of freedom of the null and residual deviances: $dD^2 \sim \chi^2_{ddf}$, where $dD^2 = D^2_{Null} - D^2_{residual}$ and $ddf = df_{Null} - df_{residual}$. The p-value $= P(\chi^2_{ddf} > dD^2)$ allows us to conclude on the significance (or fit) of the model (Sheather, 2009).
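A worked example may help to fix ideas. The deviances and degrees of freedom below are hypothetical values chosen for illustration, not results of this study; the closed form used for the tail probability is the one valid for a Chi-squared distribution with exactly 4 degrees of freedom, $P(\chi^2_4 > x) = e^{-x/2}(1 + x/2)$.

```latex
\begin{align*}
dD^2 &= D^2_{Null} - D^2_{residual} = 120.5 - 98.3 = 22.2 \\
ddf  &= df_{Null} - df_{residual} = 99 - 95 = 4 \\
P(\chi^2_{4} > 22.2) &= e^{-22.2/2}\left(1 + \tfrac{22.2}{2}\right)
                      = 12.1\, e^{-11.1} \approx 1.8 \times 10^{-4} < 0.05
\end{align*}
```

In this hypothetical case the p-value is far below 0.05, so the fitted model would be judged a significant improvement over the null model.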
Akaike information criterion (AIC).
In general, AIC = 2δ − 2 ln(ML), where δ is the number of parameters in the model and ML is the maximum value of the likelihood function. AIC is used to select the best model among a set of candidate models fitted to the same data: the model with the smallest AIC is preferred.
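The selection rule can be sketched in Python as follows. The model names, maximized log-likelihoods and parameter counts are hypothetical and serve only to illustrate the formula.

```python
def aic(log_likelihood, n_params):
    """AIC = 2 * (number of parameters) - 2 * ln(maximum likelihood)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "model_a": (-120.4, 3),
    "model_b": (-118.9, 5),
    "model_c": (-119.0, 4),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # the smallest AIC is preferred

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("preferred:", best)
```

Here model_b has the highest likelihood but also the most parameters, so the penalty term leads to model_c being preferred.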
Odds ratio (OR = exp(β))
The model output gives exp(β), the odds ratio (OR), together with its associated level of significance (p-value). For each category of each predictor except the reference category, there is an odds ratio and an associated significance level. When the p-value is less than 0.05, we comment on the odds ratio, which gives the ratio of the odds of the event between that category and the reference category. For a given category, when the odds ratio is greater than one (exp(β) > 1), we conclude that people in this category are, in terms of odds, exp(β) times more likely (or more at risk) to face the event under study (maternal death in our case) than those in the reference category.
When exp(β) < 1, we conclude that individuals in the category concerned are (1 − exp(β)) × 100 percent less likely, or less at risk, to experience the event under study than those in the reference category. The p-value associated with the odds ratio indicates the degree of certainty attached to the odds ratio value.
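The reading rules above can be summarized in a small Python helper. The function name, the wording of the returned sentences and the example values are assumptions made for illustration.

```python
def describe_odds_ratio(odds_ratio, p_value, alpha=0.05):
    """Turn an odds ratio and its p-value into the interpretation used in the text."""
    if p_value >= alpha:
        return "not significant at the chosen level"
    if odds_ratio > 1:
        return f"odds are {odds_ratio:.2f} times those of the reference category"
    if odds_ratio < 1:
        reduction = (1 - odds_ratio) * 100  # percentage reduction in the odds
        return f"odds are {reduction:.0f}% lower than in the reference category"
    return "odds equal those of the reference category"


# Hypothetical odds ratios and p-values.
print(describe_odds_ratio(2.5, 0.01))
print(describe_odds_ratio(0.6, 0.01))
print(describe_odds_ratio(1.8, 0.20))
```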
3.3.3 Time series: ARIMA(p,d,q)
The time series model used in this study is the AutoRegressive Integrated Moving Average (ARIMA) model. The ARIMA(p,d,q) model depends on three non-negative integer parameters, p, d and q. It combines three components, each with its own parameter: the AutoRegressive component (AR(p)), the Integrated component (I(d)) and the Moving Average component (MA(q)).
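ARIMA (p,d,q)
The general formula of ARIMA (p,d,q) is:
$$\phi(L)\,(1 - L)^d\, Y_t = \theta(L)\, \varepsilon_t$$
where $L$ is the lag operator, $\phi(L) = 1 - \sum_{i=1}^{p} \phi_i L^i$, $\theta(L) = 1 + \sum_{j=1}^{q} \theta_j L^j$ and $\varepsilon_t$ is the error term.
AR(p) or ARMA (p,0) or ARIMA (p,0,0)
$$\left(1 - \sum_{i=1}^{p} \phi_i L^i\right) Y_t = \varepsilon_t$$
MA(q) or ARMA (0,q) or ARIMA (0,0,q)
$$Y_t = \left(1 + \sum_{j=1}^{q} \theta_j L^j\right) \varepsilon_t$$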
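The two building blocks of the model, differencing and the ARMA recursion, can be sketched in Python as below. This is a minimal illustration under stated assumptions: the usual sign convention is assumed (autoregressive terms enter with a plus sign once moved to the right-hand side), pre-sample values are set to zero, no constant term is included, and the function names and numerical values are hypothetical.

```python
def difference(series, d=1):
    """Apply the differencing operator (1 - L)^d to a series."""
    out = list(series)
    for _ in range(d):
        out = [current - previous for previous, current in zip(out, out[1:])]
    return out


def arma_recursion(phi, theta, shocks):
    """Build y_t = sum_i phi_i * y_{t-i} + e_t + sum_j theta_j * e_{t-j}.

    Values before the start of the sample are treated as zero.
    """
    y = []
    for t, e in enumerate(shocks):
        ar = sum(p * y[t - i] for i, p in enumerate(phi, start=1) if t - i >= 0)
        ma = sum(q * shocks[t - j] for j, q in enumerate(theta, start=1) if t - j >= 0)
        y.append(ar + e + ma)
    return y


# Differencing a quadratic trend once leaves a linear trend, twice a constant.
print(difference([1, 4, 9, 16], d=1))
print(difference([1, 4, 9, 16], d=2))

# Response of a hypothetical ARMA(1,1) process to a single unit shock.
print(arma_recursion(phi=[0.5], theta=[0.2], shocks=[1.0, 0.0, 0.0]))
```

An ARIMA(p,d,q) model then amounts to applying `difference` d times and modelling the result with the ARMA recursion.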
3.4 SYNTHESIS AND PARTIAL CONCLUSION
Chapter one presented the research problem in terms of the social, economic and demographic importance of the subject. It also presented the national and international context in which the study is situated. Chapter two summarized previous research on the subject and highlighted its strengths as well as its shortcomings. This was followed by a delineation of the problem of the study, the gaps in previous research and the scientific importance of the problem. The originality of this study in its specific context was thereby established.
The chapter on methodology presented the data used, analytical methods and procedures of analysis.
Three data sets were used in this study: the 2006 census data, the 2010 DHS data and the 2010 EMOC data. The DHS and census data were collected from the population in households, while the EMOC data were collected from patients at health facilities. The target population of all analyses undertaken in this study is composed of two groups: maternal deaths and maternal survivors. The maternal deaths were the population of interest, whilst the survivors were used as controls. At the descriptive level of analysis, a Chi-squared test and a Wilcoxon-Mann-Whitney test were used. At the multivariate level, a logistic regression model was fitted at both national and regional scales. The analyses were carried out using the R software.
The study focused on assessing the level of maternal mortality provided by the 2006 census. The method developed during the 2006 census to adjust the observed information was assessed, and the findings were compared with existing estimates at national and regional levels. Forecasting maternal mortality levels from 2006 to 2050 was also among the objectives of the study. A mathematical model based on the ARIMA model, a component method based on the LiST model incorporated in the SPECTRUM software, and purpose-designed regression models were also used in the study.
MORTALITY
This chapter addresses the determinants of maternal mortality. It seeks to identify the socio-economic and demographic factors which significantly influence the phenomenon at national and regional scales.
To this end, both descriptive and multivariate approaches were used. Owing to the scarcity and deficiencies of data on maternal mortality, data sets from different sources were used in order to cover most of the factors found in the literature, to allow results to be cross-checked and compared, and to explore regional disparities in the phenomenon. Each section of this chapter presents the results of the analyses for one data set. The outcomes of the analyses of the 2006 census data, the Emergency Obstetric and Neonatal Care (EMOC) data and the 2010 Demographic and Health Survey (DHS) data are presented below, in that order. For each data set, a descriptive analysis and an inferential analysis were performed, as explained in chapter three on methodology.