184 | Chapter 10
Future research
Within-country comparisons of earlier EQ-5D-3L value sets with the more recently developed EQ-5D-5L value sets have begun. However, these comparisons have not considered that population preferences for EQ-5D-3L health states were, in many cases, collected more than 20 years ago [19]. In the specific case of Spain, the two studies were about 12 years apart and, as shown in chapters 2 and 9, the relative importance of the dimensions differed between them. One could argue that these differences were due only to differences in the instruments or in the valuation methods used. However, as discussed in chapter 9, these do not seem to have been the only factors, at least in Spain, where the economic crisis appears to have had an important impact on how the population views health problems.
As the economic crisis was worldwide and not confined to Spain, it is worth considering whether the Spanish case can be safely extrapolated to other countries.
However, another factor may partly explain the observed differences. Policies implemented to make life easier for physically disabled people may have influenced how the population views mobility-related problems, which could explain the reduction in the relative importance of the mobility dimension. Given this, and given the uncertainty about what actually drives changes in population health preferences, the development of a value set calls for a longitudinal approach rather than a cross-sectional one. In other words, valuation studies for a specific instrument should be repeated at least every 5-10 years, with the exact interval depending on the number of policy changes made in the country.
The remaining question is whether exactly the same method should be used again. In principle the answer should be “no”, as there is always room for improvement. For example, even when explained well, the C-TTO task remains complex for respondents to carry out.
One of the most difficult aspects of the task for respondents is to give a precise value “t” for their point of indifference when choosing between 10 years in an impaired state and t years in full health. This can lead respondents to exhibit behavioural imprecision (satisficing, circling, and wandering), which reduces the accuracy of their C-TTO responses. Satisficing is arguably the most problematic: once respondents start to feel that the choices are becoming difficult, they press the “A&B are about the same” button to terminate the task instead of continuing until they reach their “true” point of indifference. Satisficing can thus lead to inaccurate preferences being recorded, which may bias modelling results.
Although some respondents may struggle to determine their indifference point, they might be able to report an indifference range of values for a health state more accurately.
In addition to the behavioural imprecision in respondents’ data, the C-TTO task itself exhibits a degree of imprecision (i.e., task imprecision), as only year and half-year values of t are included in the iteration sequence, leading to 41 discrete values rather than a continuous range of values. Changing the termination rule in C-TTO (i.e., removing the obligation to provide an indifference point) may reduce behavioural and TTO task imprecision. I have proposed changing the current termination rule of the TTO by using the following two conditions: 1) a fixed number of steps in the iterative sequence, and 2) a specific interval of the range of values within which the true preference lies, e.g., when a respondent is circling between values spanning a width of 1 year. Hence, future C-TTO tasks should be interval-based rather than based on indifference points.
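The proposed interval-based termination rule can be illustrated with a minimal Python sketch. It is not the iteration sequence used in the valuation software. The function names, the plain bisection search, the default of six steps, and the simulated respondent are all hypothetical, and the two conditions are treated here as alternative stopping criteria, which is an assumption. The grid of 41 values assumes the usual mapping in which C-TTO values run from -1 to 1 in steps of 0.05; that mapping is not spelled out in this section.

```python
from typing import Callable, List, Tuple


def ctto_value_grid() -> List[float]:
    """Return the 41 discrete values from -1 to 1 in steps of 0.05.

    Assumption: half-year steps of t correspond to value steps of 0.05.
    """
    return [(i - 20) / 20 for i in range(41)]


def elicit_interval(
    prefers_full_health: Callable[[float], bool],
    low: float = 0.0,
    high: float = 10.0,
    max_steps: int = 6,
    target_width: float = 1.0,
) -> Tuple[float, float]:
    """Narrow an interval of t (years in full health) around indifference.

    prefers_full_health(t) should return True when the respondent prefers
    t years in full health over 10 years in the impaired state.
    The search stops after max_steps questions (condition 1) or once the
    interval is no wider than target_width years (condition 2).
    Only the better-than-dead range of t (0 to 10 years) is covered.
    """
    steps = 0
    while steps < max_steps and (high - low) > target_width:
        # Hypothetical search: bisect, then snap to the half-year grid.
        t = round((low + high) / 2 * 2) / 2
        if prefers_full_health(t):
            high = t  # indifference lies at or below t
        else:
            low = t  # indifference lies at or above t
        steps += 1
    return low, high


# Simulated respondent whose true indifference point is 6.3 years.
low, high = elicit_interval(lambda t: t > 6.3)
# For better-than-dead states the value is assumed to be t / 10.
print(f"t in [{low}, {high}] years, value in [{low / 10}, {high / 10}]")
```

In this sketch no “about the same” answer is required: the task returns a range that contains the simulated indifference point, which could then be analysed as interval-valued data.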
Conclusions
The studies presented in this thesis, together with similar work accomplished elsewhere, have resulted in a detailed valuation protocol for the EQ-5D-5L instrument, paired with a quality assurance procedure and novel analytical approaches. The updated protocol has enabled teams from all over the world to successfully establish EQ-5D-5L value sets.
Although TTO has been a preferred method for health state valuation for two decades, important new insights have been gained into how respondent behaviour and specific features of the valuation task together determine the precision of C-TTO responses. These insights have emphasized the importance of the interviewer’s role in C-TTO valuation and made it evident that no interview protocol or interview software is likely to be sufficient to guarantee proper interviewer competence, compliance, or engagement. This motivated the introduction of a QC process and of new modelling approaches. Looking back, one may wonder how widespread similar issues were in other valuation studies, and why they were not identified (and resolved) earlier. Most likely, similar issues have always been present but simply went unnoticed.
Given the changes in thinking about what works in valuation exercises, it is appropriate to recognize that, at least from a valuation perspective, the rigorous approach to EQ-5D-5L valuation studies inspires trust. However, the kind of insights that guide current valuation work did not exist when EQ-5D-3L was originally valued. It is wise to remain modest in claims about the quality of older value sets derived for instruments such as EQ-5D-3L, SF-6D, AQoL and HUI until they have been scrutinized again. Such research is warranted to provide users, stakeholders and society with recommendations on the use of these instruments in analysis and health care decision-making, to the benefit of all patients.
References
1. Oppe M, Devlin NJ, van Hout B, Krabbe PF, de Charro F. A program of methodological research to arrive at the new international EQ-5D-5L valuation protocol. Value Health. 2014 Jun;17(4):445-53.
2. Dolan P. Modeling valuations for EuroQol health states. Med Care 1997; 35(11):1095-1108.
3. Janssen BMF, Oppe M, Versteegh MM, Stolk EA. Introducing the composite time trade-off: a test of feasibility and face validity. Eur J Health Econ 2013;14 Suppl 1:S5-13. doi:10.1007/s10198-013-0503-2.
4. Devlin NJ, Tsuchiya A, Buckingham K, Tilling C. A uniform time trade off method for states better and worse than dead: feasibility study of the ‘lead time’ approach. Health Econ 2011;20:348–61.
5. Craig BM, Busschbach JJ. The episodic random utility model unifies time trade-off and discrete choice approaches in health state valuation. Popul Health Metr. 2009 Jan 13;7:3.
6. Oppe M, van Hout B. The optimal hybrid: experimental design and modeling of a combination of TTO and DCE. EuroQol Group Proceedings. 2010. [available at: http://eq-5dpublications.euroqol.org/download?id=0_53738&fileId=54152, accessed June 6, 2017]
7. Ramos-Goñi JM, Pinto-Prades JL, Oppe M, Cabasés JM, Serrano-Aguilar P, Rivero-Arias O. Valuation and Modeling of EQ-5D-5L Health States Using a Hybrid Approach. Med Care. 2014 Dec 17.
8. Bleichrodt H, Pinto JL. Loss aversion and scale compatibility in two-attribute trade-offs. J Math Psychol. 2002;46:315–337.
9. Hawkins SA. Information processing strategies in riskless preference reversals: the prominence effect. Organ Behav Hum Decis Proces. 1994;59:1–26.
10. Devlin NJ, Shah KK, Feng Y, Mulhern B, van Hout B. Valuing health-related quality of life: An EQ-5D-5L value set for England. Health Econ. 2017 Aug 22. doi: 10.1002/hec.3564. [Epub ahead of print]
11. Purba FD, Hunfeld JAM, Iskandarsyah A, Fitriana TS, Sadarjoen SS, Ramos-Goñi JM, Passchier J, Busschbach JJV. The Indonesian EQ-5D-5L Value Set. Pharmacoeconomics. 2017 Jul 10.
12. Wong ELY, Ramos-Goñi JM, Cheung AWL, Wong AYK, Rivero-Arias O. Assessing the Use of a Feedback Module to Model EQ-5D-5L Health States Values in Hong Kong. Patient. 2017 Oct 10. doi: 10.1007/s40271-017-0278-0. [Epub ahead of print]
13. Ludwig K, Graf von der Schulenburg J-M, Greiner W. German value set for the EQ-5D-5L. Pharmacoeconomics 2017. In press.
14. Amemiya T. Advanced Econometrics. Harvard University Press. Cambridge Massachusetts. ISBN: 0-674-00560-0.
15. Versteegh MM, Vermeulen KM, Evers SMAA, de Wit GA, Prenger R, Stolk EA. Dutch Tariff for the Five-Level Version of EQ-5D. Value Health 2016;19:343–52.
16. Luo N, Liu G, Li M, Guan H, Jin X, Rand-Hendriksen K. Estimating an EQ-5D-5L Value Set for China. Value Health. 2017 Apr;20(4):662-669. doi: 10.1016/j.jval.2016.11.016. Epub 2017 Feb 9.
17. Shah K, Rand-Hendriksen K, Ramos JM, Prause AJ, Stolk E. Improving the quality of data collected in EQ-5D-5L valuation studies: a summary of the EQ-VT research methodology program. 2014 EuroQol Proceedings. [available at: http://eq-5dpublications.euroqol.org/download?id=0_53918&fileId=54332, accessed June 6, 2017]
18. Al Sayah F, Johnson JA, Ohinmaa A, Xie F, Bansback N; Canadian EQ-5D-5L Valuation Study Group. Health literacy and logical inconsistencies in valuations of hypothetical health states: results from the Canadian EQ-5D-5L valuation study. Qual Life Res. 2017 Jan 25.
19. Hernandez-Alava M, Wailoo A, Grimm S, Pudney S, Gomes M, Sadique Z, Meads D, O’Dwyer J, Barton G, Irvine L. EQ-5D-5L versus EQ-5D-3L: The Impact on Cost-Effectiveness in the United Kingdom. Value Health 2017. In press.
Chapter 11
Summary
Recently, a new version of the standard EQ-5D questionnaire was introduced: the EQ-5D-5L. To make the EQ-5D-5L usable in economic evaluations, national valuations of the EQ-5D-5L health states must be established. It has been proposed that all countries doing so use the same standardized protocol for valuation studies, so that health state values can be compared between countries. The proposed protocol comprises two techniques for valuing health: the “Composite Time Trade-Off” (C-TTO) and “Discrete Choice Experiments” (DCE). This thesis describes the first experiences with this standardized protocol and the developments that have taken place since, focusing on the following three questions:
1. What problems can be expected when using the EQ-5D-5L valuation protocol to generate national value sets?
2. How can this protocol be improved?
3. How can the data obtained best be analysed to develop a value set?
CHAPTER 2 presents one of the first applications of this standardized protocol for EQ-5D-5L valuation. It proved feasible to derive health state values from the collected data, but the results were strongly affected by interviewer effects. The conclusion of this chapter is that better implementation of the protocol is needed to reduce interviewer effects. Answering the second question was a greater challenge, and additional research was therefore carried out under the umbrella of an international research programme. Using several DCE and C-TTO datasets, it was first examined how (i) the number of questions and (ii) the order in which they were asked affect the accuracy of the answers and the behaviour of respondents. CHAPTER 3 shows that C-TTO data are more sensitive to such effects than DCE data. Participants often gave their C-TTO response after answering only three questions in the iteration procedure. That is an efficient strategy, but it also means that the precision of the values is limited to what can be achieved in three steps. CHAPTER 4 tested whether introducing a ranking task