Discrimination among three non-nested distributions is more difficult than between two, as was reported in the previous section. Standard hypothesis tests are generally not valid if the number of tested distributions exceeds two. On the other hand, an additional candidate will affect other criteria used in selecting the true distribution. The emphasis here is placed on discrimination among the two- or three-parameter gamma, Weibull and lognormal distributions. Monte Carlo experiments are used to examine the performance of the discrimination criteria as each distribution is taken to be true. The discrimination procedures might be expected to expose certain distributions as being good approximations to others over different parameter ranges, even when the distributions come from different parametric families.
One of the purposes of the experiments is to examine the outcomes arising from the situation where the data are generated from, say, the three-parameter gamma distribution, and discrimination is made between it and its two-parameter counterpart. Such considerations often arise in practice, and have special importance for fitting models to air quality data, as will be discussed later. For each experiment, selection will proceed among the two- and three-parameter gamma, Weibull and lognormal distributions. Instead of using the BLR test, the Kullback-Leibler (KL) information criterion based on the maximized log-likelihood values will be used. In this case, AIC and SIC will be equivalent to KL, since the AIC and SIC penalties are identical across candidates with the same number of parameters. The sample size is 365, which is quite large for practical purposes. As mentioned previously, GIC is primarily designed to discriminate among three or more distributions simultaneously, in which case the LR test is not applicable. Thus, GIC is expected to complement the existing discrimination criteria.
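The selection rule described above can be illustrated with a minimal sketch. It is not the procedure used to produce Table 4: it fits only the two-parameter gamma, Weibull and lognormal distributions, it uses profile likelihoods maximized by a golden-section search, and the function names, the seed, the unit scale and the search bracket for the shape parameter are all assumptions made for illustration. The data are drawn from a three-parameter gamma with shape 2 and location 1, as in the experiments; the scale value of 1 is assumed.

```python
import math
import random


def golden_max(f, lo, hi, tol=1e-8):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:  # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
        else:        # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
    return (a + b) / 2.0


# Assumed search bracket for the shape parameters (on the log scale).
LOG_SHAPE_LO, LOG_SHAPE_HI = math.log(0.05), math.log(200.0)


def loglik_gamma(x):
    """Maximized log-likelihood of the two-parameter gamma."""
    n = len(x)
    mean_x = sum(x) / n
    mean_log = sum(math.log(v) for v in x) / n

    def profile(log_k):
        k = math.exp(log_k)
        theta = mean_x / k  # scale MLE for a given shape k
        return n * ((k - 1.0) * mean_log - k - k * math.log(theta) - math.lgamma(k))

    return profile(golden_max(profile, LOG_SHAPE_LO, LOG_SHAPE_HI))


def loglik_weibull(x):
    """Maximized log-likelihood of the two-parameter Weibull."""
    n = len(x)
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / n

    def profile(log_c):
        c = math.exp(log_c)
        # log of mean(x**c), computed stably; the scale MLE satisfies s**c = mean(x**c)
        top = max(c * t for t in logs)
        log_mean_pow = top + math.log(sum(math.exp(c * t - top) for t in logs) / n)
        return n * (math.log(c) - log_mean_pow + (c - 1.0) * mean_log - 1.0)

    return profile(golden_max(profile, LOG_SHAPE_LO, LOG_SHAPE_HI))


def loglik_lognormal(x):
    """Maximized log-likelihood of the two-parameter lognormal (closed form)."""
    n = len(x)
    logs = [math.log(v) for v in x]
    mu = sum(logs) / n
    sigma = math.sqrt(sum((t - mu) ** 2 for t in logs) / n)
    return -sum(logs) - n * math.log(sigma) - 0.5 * n * math.log(2.0 * math.pi) - 0.5 * n


def max_loglik_two_param(x):
    return {
        "gamma": loglik_gamma(x),
        "weibull": loglik_weibull(x),
        "lognormal": loglik_lognormal(x),
    }


def rank_by_criteria(loglik, n, n_params=2):
    """Pick the preferred model under KL (max log-likelihood), AIC and SIC."""
    aic = {m: -2.0 * ll + 2.0 * n_params for m, ll in loglik.items()}
    sic = {m: -2.0 * ll + n_params * math.log(n) for m, ll in loglik.items()}
    return {
        "KL": max(loglik, key=loglik.get),
        "AIC": min(aic, key=aic.get),
        "SIC": min(sic, key=sic.get),
    }


# Illustrative run: three-parameter gamma data (shape 2, assumed scale 1, location 1).
rng = random.Random(12345)
sample = [1.0 + rng.gammavariate(2.0, 1.0) for _ in range(365)]
fits = max_loglik_two_param(sample)
print(fits)
print(rank_by_criteria(fits, len(sample)))
```

Because every candidate in this sketch has two parameters, the AIC and SIC penalties are the same constant for all of them, so all three criteria necessarily select the same distribution. Extending the comparison to the three-parameter versions would require a location estimate as well, which is not attempted here.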
The results of experiments using different criteria are shown in Table 4. The main points from the table are as follows.
(i) When the three-parameter gamma distribution is correct, all of the criteria in Table 4 perform poorly. The best is KL (or equivalently, AIC and SIC), although it is correct only 59.6 per cent of the time when the shape parameter is 2, and 52.6 per cent of the time when the shape parameter is increased to 6. FPE is reasonably close to KL. KS and CHI perform well in accepting the true null but they also accept the false three-parameter models frequently. UPE is the worst and selects the false three-parameter Weibull distribution 576 times in 1000 experiments. When the value of the shape parameter is 2, the three-parameter Weibull distribution has a rather high probability of 35.5 per cent of being the best fitting distribution, but when the shape parameter is increased to 6, the three-parameter lognormal distribution becomes a good approximation to the true distribution. Somewhat surprisingly, when the value of the shape parameter is 2, the two-parameter lognormal is the best fitting of the two-parameter distributions. Although the two-parameter gamma becomes the best when the value of the shape parameter is increased to 6, the two-parameter lognormal distribution is still a good approximation to the true three-parameter gamma distribution.
(ii) The performances of the discrimination criteria are improved substantially when the true distribution is the two-parameter gamma. KL, AIC and SIC are the best when the value of the shape parameter is 2, but FPE is best when the shape parameter is 6. KS and CHI have high probabilities of accepting the two- and three-parameter gamma and Weibull distributions when the values of the shape parameter are 2 and 6. UPE remains the worst. For the other three-parameter distributions, the ranks are similar to the case where the true distribution is the three-parameter gamma, which suggests that changing the value of the location parameter from 1 to 0 does not affect the results qualitatively.
(iii) Most of the criteria perform very well when the three-parameter Weibull distribution is true. In this case, UPE becomes the best when the value of the shape parameter equals 2 or 4. The high probabilities of accepting the true distribution indicate that both the three-parameter gamma and lognormal distributions are not good approximations to the three-parameter Weibull distribution. Once again, the two-parameter lognormal distribution is the best when the shape parameter is 2, but it is replaced by the two-parameter gamma distribution when the shape parameter is 4. These results suggest that the two-parameter member of the same family of distributions is not necessarily as good as a member of a non-nested family. The two-parameter lognormal distribution is the best approximation to the three-parameter Weibull distribution if the data are heavily skewed, as is often encountered in air pollution applications.
(iv) Similar patterns are observed when the true distribution is the two-parameter Weibull. Most of the criteria perform well in accepting the true distribution. Only KS and CHI have low power for rejecting the false two- and three-parameter gamma, and the three-parameter Weibull and lognormal distributions. UPE performs the best of the criteria.
(v) When the value of the shape parameter is 0.9 for the underlying three-parameter lognormal distribution, most of the criteria perform very well, with the exception of UPE. However, when the shape parameter is decreased to 0.5, power is reduced substantially. KS and CHI have significantly higher probabilities of rejecting the false distributions when the value of the shape parameter is 0.9, as compared with the cases when the true distributions are the two- or three-parameter gamma and Weibull distributions. However, the false three-parameter gamma distribution will be frequently accepted by the KS and CHI tests if the shape parameter is decreased to 0.5. UPE has the worst performance. It is interesting to note that, of the two-parameter distributions, the two-parameter lognormal distribution is the best approximation to the three-parameter lognormal. Based on these results, it is clear that, no matter which three-parameter distribution is true, the two-parameter lognormal will always be a good representation of the data if selection is restricted to two-parameter distributions and the sample data are quite skewed. These simulation results are consistent with the recommendations of many air pollution specialists (for example, Larsen (1971, 1974) and Benarie (1980)) that the two-parameter lognormal distribution is the best for fitting urban air pollutant concentrations. However, care should still be exercised because the two-parameter version might not be the best if alternative three-parameter distributions are under serious consideration.
(vi) When the true distribution is the two-parameter lognormal, most criteria apart from UPE perform very well. The improvement in rejecting the false models using the KS and CHI tests can also be seen when the value of the shape parameter is 0.9, as compared with the cases where the true distributions are the two- or three-parameter gamma and Weibull distributions. Their performance in rejecting false models deteriorates when the shape parameter is reduced to 0.5. Similarly, the result of selection among three-parameter distributions shows no significant change when the data are from the two-parameter lognormal distribution.
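The tendency of the KS test to accept a false model can be explored with a small Monte Carlo sketch. The details are assumptions for illustration, not the procedure behind Table 4: the data are drawn from a two-parameter gamma with shape 2 and an assumed scale of 1, a two-parameter lognormal is fitted by maximum likelihood to the same sample, and the fitted model is accepted when the KS statistic does not exceed the standard asymptotic 5 per cent critical value of 1.358 divided by the square root of n. That critical value is derived for a fully specified null, so with estimated parameters the test rejects less often than its nominal level suggests. The function names, seed and number of replications are hypothetical.

```python
import math
import random


def lognormal_cdf(v, mu, sigma):
    """CDF of the two-parameter lognormal at v > 0."""
    return 0.5 * (1.0 + math.erf((math.log(v) - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(x, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF of x and cdf."""
    xs = sorted(x)
    n = len(xs)
    d = 0.0
    for i, v in enumerate(xs, start=1):
        f = cdf(v)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_accepts_lognormal(x, crit_coeff=1.358):
    """Fit a lognormal by maximum likelihood and apply the asymptotic 5% KS rule.

    Assumption: the critical value ignores the fact that mu and sigma are
    estimated from the same data.
    """
    n = len(x)
    logs = [math.log(v) for v in x]
    mu = sum(logs) / n
    sigma = math.sqrt(sum((t - mu) ** 2 for t in logs) / n)
    d = ks_statistic(x, lambda v: lognormal_cdf(v, mu, sigma))
    return d, d <= crit_coeff / math.sqrt(n)


def acceptance_rate(reps=200, n=365, shape=2.0, seed=2024):
    """Share of gamma samples for which the (false) lognormal is accepted."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(reps):
        sample = [rng.gammavariate(shape, 1.0) for _ in range(n)]
        if ks_accepts_lognormal(sample)[1]:
            accepted += 1
    return accepted / reps


print(acceptance_rate())
```

Changing `shape` or swapping the generating distribution gives a rough feel for how the acceptance rate of a false model moves with the parameter values, in the spirit of the comparisons in items (i) to (vi).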
The results of the experiments show that the criteria used are not always consistent, especially for the underlying two- and three-parameter gamma distributions. The same problem is found when the three-parameter lognormal is the true distribution with the value of the shape parameter taken as 0.5. A major consequence of these findings is