
Source of data and characteristics of respondents

2.5 Data quality

Of concern in this thesis is the quality of the data to be used for the analysis, with regard to accuracy of age statements, dates of events such as marriages, births and deaths of children, the accuracy of the reporting of the number of children born, the definition of events such as marriage, the accurate determination of the beginning of exposure to premarital sexual intercourse, the occurrence of premarital births and conceptions, and the accuracy of reporting of the duration of the postpartum variables.

A comprehensive evaluation of the data quality with regard to the events listed above has been carried out in relevant chapters of the thesis. This has been supplemented by a review of evaluations carried out by the WFS, including those of Owusu (1984) of the complete GFS dataset, of Goldman, Rutstein and Singh (1985) of data quality of 41 WFS datasets, and Meekers (1991) of the dates of marriage and first births in West Africa. Each analytical chapter in the thesis examines an area of the quality of data relevant to the analysis in that particular chapter. The evaluation in this chapter is therefore restricted to knowledge of dates of events, accuracy of age statements, the concept of marriage in Ghana, and the reliability of dates of first marriage for the estimation of the first birth interval.

2.5.1 Knowledge of dates of events

Table 2.7 shows the percent distribution of women according to the method of date statement for specified events. It was only in about 52 percent of cases that respondents gave their exact dates of birth. In the remaining cases, respondents only stated the calendar year in which they were born or gave their date of birth as 'years ago'. Where a respondent did not know her date of birth, local, regional and national historical calendars as well as demographic information about her children were used to determine her age.

Table 2.7 also shows that just over 40 per cent of married women were able to give the exact date of their first union. About 35 percent reported the calendar year only, while about 24 percent reported their age at the event. Considering that only about one-half of all women knew their exact dates of birth, estimating date of first union by using a respondent's age at the event introduces another possible source of error. Considerably more women who had ever given birth to a child were able to provide the exact dates of birth of their first child (64.9 per cent), their penultimate birth (66.2 per cent) and last birth (78.6 per cent).

Table 2.7 Per cent distribution of women according to the method of date statement for specified events, GFS.

Method of date statement       Respondent's    Date of       Date of       Date of             Date of
                               date of birth   first union   first birth   penultimate birth   last birth

Exact date given                    52.1           40.4          64.9            66.2              78.6
Calendar year only                  27.2           35.4          19.7            19.4              12.1
Years ago                           20.7            -            15.3            14.4               9.0
Age of respondent at event           -             24.2           -               -                 -
Multiple sources                     -              -             0.1             -                 0.3
Not applicable (No event)            -              -             -               -                 -
Total                              100.0          100.0         100.0           100.0             100.0

Of major concern, however, was the large number of women who were unable to state their complete dates of birth, and the existence of digit preference in the age data, leading to heaping of the age distributions on digits ending in zero and five. The age distribution shown in Figure 2.1 illustrates this digit preference.

Figure 2.1: Single-age distribution of respondents in the GFS: All women (horizontal axis: age at survey in single years)

The heaping may be related to the large percentage of women who were unable to provide their complete dates of birth, or the exact dates of occurrence of other events such as their first marriage and the birth of their children. Table 2.7 illustrates this problem, while Table 2.8 presents Myers' index for selected socioeconomic groups. Table 2.7 shows that the dates of first marriage and of the birth of children are likely to be misplaced because over 40 per cent of women could not provide exact dates for the occurrence of these events.

Table 2.8 Myers' Index for total sample, and by ethnicity, education and rural-urban residence, GFS

Category                        Myers' Index

All women                           11.2

Ethnic groups
  Akan                              10.5
  Ga-Adangbe                        10.1
  Ewe                                7.2
  Mole-Dagbani                      17.3
  Other                             14.0

Educational groups
  No education                      14.6
  Primary                           11.0
  Middle                             4.7
  Higher                            13.4 (a)

Place of current residence
  Rural                             11.7
  Small urban                       10.6
  Large urban                        9.1

Notes: (a) the Myers' index for women with higher education is high because of the small sample size of this group, and the subsequent instability of their age distribution.
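The computation behind an index of this kind can be sketched as follows. This is a minimal illustration of one common textbook formulation of Myers' blended index, not necessarily the exact procedure used to produce Table 2.8. The function name `myers_index`, the default starting age of 10 and the 80-year window are assumptions made for the example; the thesis does not state the age range used for the GFS respondents.

```python
def myers_index(counts, start_age=10, window=80):
    """Myers' blended index of terminal-digit preference (0 to 90).

    counts    : dict mapping single year of age -> number of respondents
    start_age : first age of the first window (assumed default, not from the thesis)
    window    : window length in years; must be a multiple of 10 so that
                every terminal digit occurs equally often in each window
    """
    if window % 10 != 0:
        raise ValueError("window must be a multiple of 10")

    # Blend ten windows, each starting one year later than the last, so that
    # every terminal digit takes each position within the decade once.
    blended = [0.0] * 10
    for shift in range(10):
        lower = start_age + shift
        for age in range(lower, lower + window):
            blended[age % 10] += counts.get(age, 0)

    total = sum(blended)
    if total == 0:
        raise ValueError("no respondents in the age range")

    # Without digit preference each terminal digit holds 10 percent.
    shares = [100.0 * b / total for b in blended]
    # Half the sum of absolute deviations from 10 percent.
    return sum(abs(s - 10.0) for s in shares) / 2.0


# Illustrative (made-up) distributions, not GFS data.
uniform = {age: 1 for age in range(10, 99)}
print(myers_index(uniform))      # 0.0: no heaping
print(myers_index({30: 100}))    # 90.0: every age reported with the same final digit
```

Halving the sum of absolute deviations is what bounds the index at 90; some presentations report the unhalved sum instead, which doubles every value.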

The range of the Myers' index is from 0 to 90. The index is zero if there is no digit preference or heaping at all, and 90 if all reported ages ended in only one digit. The indices appear low and acceptable, though not entirely satisfactory, and as Figure 2.1 shows, there is considerable heaping on digits ending in zero and five. Estimates based on ages in single years would therefore be affected by this digit preference, and as much as possible, most of the analysis carried out in

Inability to state the exact dates of events such as marriages and births has serious implications for the estimation of birth intervals, because an interval is obtained by subtracting the date of the initial event from the date of the terminal event. At the WFS headquarters, dates were imputed for events for which respondents did not provide exact dates. The imputation procedure can have a non-negligible effect on the estimates that rely on the imputed dates. For example, Meekers (1991) uses random and midpoint imputation procedures to show that the length of the first birth interval, as well as the estimated proportions of negative birth intervals and premarital conceptions, can vary substantially depending on the imputation procedure adopted. Chidambaram and Pullum (1980) also used several imputation procedures to examine birth history data for Bangladesh, and concluded that the estimated recent fertility decline in Bangladesh was smaller under an imputation procedure that defined 'years ago' as completed years than under a procedure that defined 'years ago' as rounded years (Goldman et al., 1985:41).
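The sensitivity of the first birth interval to the imputation rule can be illustrated with a small sketch. It is not the WFS procedure or the one used by Meekers (1991). Dates are held as century-month codes (months counted from the start of 1900), June is assumed to stand for the middle of the year, and the 8-month cutoff for a premarital conception is a hypothetical value chosen for the example. The names `cmc`, `impute_cmc` and `classify` are likewise illustrative.

```python
import random


def cmc(year, month):
    """Century-month code: months counted from the start of 1900."""
    return (year - 1900) * 12 + month


def impute_cmc(year, method, rng=None):
    """Impute a month for an event reported by calendar year only."""
    if method == "midpoint":
        return cmc(year, 6)  # assumption: June represents mid-year
    if method == "random":
        rng = rng or random.Random()
        return cmc(year, rng.randint(1, 12))  # any month equally likely
    raise ValueError(f"unknown imputation method: {method}")


def classify(interval_months, cutoff=8):
    """Label a first birth interval; the cutoff is a hypothetical choice."""
    if interval_months < 0:
        return "premarital birth"
    if interval_months < cutoff:
        return "premarital conception"
    return "marital conception"


# Made-up respondent: first union reported as '1975' only,
# first birth reported exactly as March 1976.
birth = cmc(1976, 3)

midpoint_interval = birth - impute_cmc(1975, "midpoint")
print(midpoint_interval, classify(midpoint_interval))

rng = random.Random(0)  # seeded so the run is repeatable
for _ in range(5):
    interval = birth - impute_cmc(1975, "random", rng)
    print(interval, classify(interval))
```

For this respondent the midpoint rule always gives an interval of 9 months, whereas random imputation can place it anywhere from 3 to 14 months, so the same record may be counted as a premarital or a marital conception depending on the rule.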

Another serious problem is the fluid definition of marriage used in the GFS. A fluid definition means that marriage is not a clear-cut event for which an exact date of initiation can be given. To overcome this problem, respondents were asked for the date on which cohabitation commenced, but marriage and cohabitation do not necessarily coincide (Meekers, 1991:250). In matrilineal societies, for instance, and indeed even in the patrilineal Greater Accra region, married persons may each remain in their parents' homes and visit the other partner. Childbearing may thus commence even before the married couple begins to cohabit. Again, because the contracting of marriage is a process that may extend over several months, if not years, children may be born even before the marriage process is completed. In this situation, it is not always clear whether such children are to be considered premarital or not (Ware, 1977; Meekers, 1991:251).

The problems discussed above do not, however, render the data unusable. On the contrary, both the nuptiality and fertility data have been used extensively and profitably in scientific analysis. The fertility data in particular have been found to be useful, and the coverage of live births satisfactory (Goldman et al., 1985). Other studies that have used the dataset include those by McDonald (1984) on nuptiality and completed fertility in 34 countries, including Ghana; Casterline and others (1984) on the proximate determinants of fertility in 30 countries, including Ghana; Gaisie (1981 and 1984) on the proximate determinants of fertility in Ghana; and Guz and Hobcraft (1991) on breastfeeding and fertility in several African countries, including Ghana.

In view of these problems, however, care is taken not to exaggerate the precision of the estimates or of the measurement procedures used. As much as possible, age groups are used instead of single-year ages. Each chapter examines the aspects of data quality relevant to the analysis carried out in that chapter. Thus, although the GFS data present a number of data quality problems, these are unlikely to hinder the usability of the dataset; the problems outlined above are taken into account in the analysis and in the interpretation of the results.

Chapter 3