4. Chapter Four: Improving Quality of Ethnic Codes in HES
4.5 Coding Rate Methods
4.5.3 Discussion about the Coding Rate Methods
4.5.3.1 Why the Local Area-Sex-Ethnicity Coding Rate Method Failed
The assumption of this method, which allows the coding rates to vary across ethnic groups within each local area-sex group, is in principle superior to that of the local area-age-sex coding rate method. In practice, however, calculating the local area-sex-ethnicity coding rates is subject to the small number problem. The coding rates are calculated as the ratio of the observed number of cases to the expected number of cases for each ethnicity-sex group at local authority level. Because the ethnic minority population is usually small at local level, the expected numbers of cataract cases are smaller still, and one or two cataract cases occurring purely by chance can change the ratio substantially. The crude local authority-sex-ethnicity coding rates for ethnic minorities are therefore highly variable, prone to extreme values, and poorly estimated. Empirical Bayes estimation, also known as a shrinkage method, is employed to reduce this variation. Although empirical Bayes estimation largely reduces the variation of the coding rates, some extreme values still occur at local authority level for minority ethnic groups. For example, the table below shows the top 10 and bottom 10 coding rates for men from the Mixed ethnic group at local authority level. Normally, the coding rates are expected to have a mean value below 1 (100 per cent). However, the coding rates below are either extremely high or extremely low. When these extreme values are applied to the cardiovascular disease data, the estimated total number of cardiovascular disease cases is greatly deflated or inflated.
Local Authority   Coding Rate   Local Authority   Coding Rate
00HX              0.00013290    15UH              200.2661142
23UD              0.00021616    29UL              36.5685155
00HH              0.00023504    16UD              35.94320635
46UD              0.00027590    29UN              26.06724945
19UD              0.00029395    29UB              25.60222421
46UC              0.00031407    00BZ              24.8057228
46UB              0.00032591    15UF              24.21986951
23UC              0.00036927    15UC              23.76801994
19UJ              0.00066934    30UF              20.51121699
Table 4-1 Extreme values in local area-sex-ethnicity coding rates
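The small number problem described above can be illustrated with a minimal Python sketch. It computes crude coding rates as observed over expected cases and then shrinks them towards the pooled rate. The shrinkage step is an assumption: it uses a simple method-of-moments (global) empirical Bayes estimator as a stand-in, since the exact estimator is not specified here, and the function names and the input counts are hypothetical.

```python
def crude_rates(observed, expected):
    """Crude coding rate per group: observed cases / expected cases."""
    return [o / e for o, e in zip(observed, expected)]


def eb_shrink(observed, expected):
    """Shrink crude rates towards the pooled rate.

    Assumption: a method-of-moments empirical Bayes estimator stands in
    for whichever estimator was actually used in the analysis.
    """
    n = len(observed)
    total_expected = sum(expected)
    pooled = sum(observed) / total_expected  # overall rate across all groups
    crude = crude_rates(observed, expected)
    # Expected-weighted variance of the crude rates around the pooled rate.
    s2 = sum(e * (r - pooled) ** 2 for e, r in zip(expected, crude)) / total_expected
    # Estimated between-group variance, floored at zero.
    between = max(s2 - pooled / (total_expected / n), 0.0)
    shrunk = []
    for e, r in zip(expected, crude):
        # The weight approaches 0 when the expected count is small, so
        # unstable rates are pulled strongly towards the pooled rate.
        denom = between + pooled / e
        weight = between / denom if denom > 0 else 0.0
        shrunk.append(pooled + weight * (r - pooled))
    return shrunk


# Hypothetical counts: two tiny groups and two large ones.
observed = [2, 0, 450, 520]
expected = [0.01, 0.5, 500.0, 500.0]
for c, s in zip(crude_rates(observed, expected), eb_shrink(observed, expected)):
    print(f"crude={c:10.4f}  shrunk={s:8.4f}")
```

In this sketch, two chance cases against an expected count of 0.01 give a crude rate of about 200, and the shrinkage pulls that rate most of the way back towards the pooled value while leaving the large groups almost unchanged.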
Previous UK research has generally found that South Asians (including Indian, Pakistani and Bangladeshi people) have a higher risk of cardiovascular disease than the white population (Cappuccio, 1997, Nazroo, 1997, Nazroo, 2001, Aspinall and Jacobson, 2004), and that the relative risk is low for people born in the Caribbean and West Africa (Wild and McKeigue, 1997, Bardsley et al., 2000). However, when the local area-sex-ethnicity coding rates are used to adjust the cardiovascular disease data, the resulting ethnic inequalities in cardiovascular disease are largely inconsistent with these previous studies, as shown in the graphs below.
The graph below shows the standardised incidence ratios (SIRs) for ethnic groups based on the cardiovascular disease data adjusted by the crude local area-sex-ethnicity coding rates. The inconsistency is that every ethnicity-sex group except white men has a lower risk of cardiovascular disease than the general population, particularly South Asians.
[Chart: CVD Standardised Incidence Ratio (axis 0 to 300) for men and women in each ethnic group: White, Indian, Pakistani, Bangladeshi, Other Asian, Black Caribbean, Black Africa, Mixed, Chinese]
Figure 4-19 SIRs obtained under local area-sex-ethnicity coding rates method (without shrinkage)
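How an extreme coding rate propagates into an SIR can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the figures: it assumes that adjusted cases are obtained by dividing the ethnically coded cases by the coding rate, and that the SIR is scaled so that the general population equals 100. The function names and all counts are hypothetical.

```python
def adjusted_cases(coded_cases, coding_rate):
    """Assumed adjustment: scale coded cases up (or down) by the coding rate."""
    return coded_cases / coding_rate


def sir(observed_cases, expected_cases):
    """Standardised incidence ratio, with the general population at 100."""
    return 100.0 * observed_cases / expected_cases


# Hypothetical group: 3 coded cases, 5 cases expected from reference rates.
coded, expected = 3, 5.0
for rate in (0.6, 0.0003, 200.0):  # plausible, extremely low, extremely high
    cases = adjusted_cases(coded, rate)
    print(f"coding rate={rate:<8}  adjusted cases={cases:10.2f}  SIR={sir(cases, expected):12.2f}")
```

Under these assumptions a plausible coding rate returns an SIR near 100, whereas a coding rate of the order of 0.0003 multiplies the case count several thousand times and a rate of 200 reduces it to almost nothing, which matches the direction of the distortions seen in the adjusted data.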
The graph below shows the standardised incidence ratios for ethnic groups based on the cardiovascular disease data adjusted by the local area-sex-ethnicity coding rates that have been shrunk towards the national coding rate by empirical Bayes estimation. The inconsistency is that the SIRs of Indian men and women are close to or lower than that of the general population, and much lower than that of white people, while Black African people are found to be at the highest risk.
[Chart: CVD Standardised Incidence Ratio (axis 0 to 300) for men and women in each ethnic group: White, Indian, Pakistani, Bangladeshi, Other Asian, Black Caribbean, Black Africa, Mixed, Chinese]
Figure 4-20 SIRs obtained under local area-sex-ethnicity coding rates method (with shrinkage towards the national coding rate)
The graph below shows the standardised incidence ratios for ethnic groups based on the cardiovascular disease data adjusted by the local area-sex-ethnicity coding rates that have been shrunk towards the GOR level coding rate by empirical Bayes estimation. The inconsistency is that the Indian, Pakistani and Bangladeshi groups, which previous studies found to have a higher risk of cardiovascular disease, are found to have a lower or much lower risk than both the general population and the white population. Although there is no previous evidence about the relative risk of cardiovascular disease among people from the Mixed group, the standardised incidence ratios for both men and women from the Mixed group are found to be the highest, particularly for men, who have a value of about 2000 (excluded from the graph for presentation). This is clearly an overestimate.
[Chart: CVD Standardised Incidence Ratio (axis 0 to 300) for men and women in each ethnic group: White, Indian, Pakistani, Bangladeshi, Other Asian, Black Caribbean, Black Africa, Mixed, Chinese]
Figure 4-21 SIRs obtained under local area-sex-ethnicity coding rates method (with shrinkage towards the GOR coding rate)