Construct validity of the interview Time Trade-Off and computer Time Trade-Off in patients with rheumatoid arthritis: A cross-sectional observational pilot study
© Buitinga et al.; licensee BioMed Central Ltd. 2012
Received: 13 October 2011
Accepted: 19 June 2012
Published: 25 June 2012
The Time Trade-Off (TTO) is a widely used instrument for valuing preference-based health-related quality of life (HRQoL). The TTO reveals preferences for one's own current health (‘utilities’) on a scale anchored between death (0) and perfect health (1). Limited information on the external validity of the TTO is available. The aim of this pilot study was to examine the construct validity of both an interview TTO and a computer-based TTO in patients with rheumatoid arthritis (RA).
Thirty patients visiting the outpatient rheumatology clinic participated. Construct validity was assessed by measuring convergent and discriminative validity. Convergent validity was assessed by calculating Spearman’s correlations between the utilities obtained from the TTOs and pain, general health (rating scales), health-related quality of life (SF-36 and SF-6D) and functional status (HAQ-DI). Discriminative power of both TTO measures was determined by comparing median utilities between worse and better health outcomes.
Correlations of both TTO measures with HRQoL, general health, pain and functional status were poor (absolute values ranging from .05 to .26). Both TTOs appeared to have no discriminative value among groups of RA patients who had a worse or better health status defined by six health outcome measures. About one-third of respondents were zero-traders on each of the TTO measures. After excluding zero-traders from analysis, the correlations improved considerably.
Both the interview TTO and computer TTO showed poor construct validity in RA patients when using measures of HRQoL, general health, pain and functional status as reference measures. Possibly, the validity of the TTO improves when using an anchor that is more realistic to RA patients than the anchor ‘death’.
The Time Trade-Off (TTO) is an instrument developed to assess the effects of treatments in cost-utility analyses (CUAs) by measuring changes in health-related quality of life (HRQoL) as valued directly by patients. The TTO reveals preferences for one's own current health (‘utilities’) on a scale anchored between death (0) and perfect health (1) by asking people how many life years they are willing to give up to become perfectly healthy. It is assumed that the more life years people are willing to trade off, the worse their health state is. The purpose of this measure is to capture the desirability of patients’ own health state, reflecting their HRQoL.
Traditionally, the TTO is administered by interview. The TTO can also be administered by questionnaire or computer. Furthermore, different methodological approaches to the TTO are used. This makes comparison between studies difficult. Differences in TTO procedures seem to influence utilities. For example, it has been found that utility scores are heavily influenced by the method of elicitation (ping-pong, titration). Furthermore, the mode of administration (interview/computer/questionnaire) or the way the TTO question is formulated can influence utilities. In addition, the size of the time frame that is used (e.g. fixed time period, life expectancy) has a great impact, since utilities are calculated as the proportion of the remaining lifetime sacrificed.
Few studies have examined psychometric properties of the interview TTO in rheumatoid arthritis (RA). The studies that reported on construct validity showed poor to moderate correlations between TTO and measures of HRQoL, functional status, disease activity and pain [4–6]. It was found that the TTO was only able to discriminate between worse and better disease-specific HRQoL using the RAQoL [4, 5], between worse and better outcomes on the dimensions ‘symptom’ and ‘role’ of the disease-specific AIMS-2, and between worse and better mental health using the RAND-36 mental component summary scale. Tijhuis et al. showed that the TTO was able to discriminate between worse and better pain, worse and better disease activity, and worse and better functional status. In contrast, Bejia et al. showed that the TTO was not able to discriminate between worse and better pain or worse and better disease activity.
Computer-based utility elicitation procedures to administer the TTO have been developed, for example iMPACT3 and U-Titer. Studies in a range of conditions have used such computer-based programmes to administer a TTO using different procedures [7, 9].
In this study, we report on preliminary results with respect to the construct validity of the TTO assessed in patients with RA using an interview TTO as well as a computer TTO, and using a standardised procedure for both TTOs. The first aim of this study was to examine convergent validity of the interview and computer TTO separately by correlating TTO utilities of both TTO measures with other patient-reported outcomes (PROs) in patients with RA. The second aim was to examine whether the interview and computer TTO were able to discriminate between worse and better patient-reported health outcomes.
Patients and study design
Thirty consecutive outpatients (aged 18–85) of our rheumatology clinic who were diagnosed with RA participated. People who did not understand the Dutch language were excluded.
All participants completed the TTO twice with an interval of 14 days. The order was randomised: the first TTO was either interview or computer-based, and the other mode followed at the next assessment. Measures of pain, general health, health-related quality of life and functional status were administered at the first TTO assessment. Informed consent was obtained from all participants. According to legislation in the Netherlands (WMO), approval by the ethical review board was not required.
The Time Trade-Off question used in this study was formulated as follows:
“Imagine that a new treatment became available which helped you to recover fully. A side-effect of this treatment, however, is that you will die sooner. Would you opt for this treatment?” A graphical aid was used to make the question clearer. When participants asked about the definition of being perfectly healthy, they were told to imagine being in perfect health without any disease or health-related complaints.
A lifetime perspective was adopted. Life expectancy calculations of the Dutch Central Bureau of Statistics (CBS) were used. The remaining life expectancy was calculated by subtracting the age of the participant from his or her expected age of dying according to the CBS. The bisection method was applied to reach the point at which participants did not prefer one of the two options: staying in their health state for the rest of their lives or being perfectly healthy for a shorter lifetime. Therefore, the trade-off started by setting the shorter life in perfect health at half of the remaining life expectancy. For example, a person with a remaining life expectancy of 20 years was first asked about his or her willingness to trade off 10 life years. If the person accepted the trade, a remaining life expectancy of five years in perfect health was presented. If the person did not accept the trade, a remaining life expectancy of 15 years was presented. This process continued until the patient was indifferent between his or her own current health state for the full remaining life expectancy and a shorter life in perfect health. Then, the TTO score was calculated by the formula: 1 − (number of life years given up / remaining expected life years).
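The bisection procedure and the scoring formula described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the function names, the `respond` callback (which returns `True` for accepting the trade, `False` for rejecting it and `None` for indifference) and the `max_steps` stopping rule are assumptions introduced here.

```python
from typing import Callable, Optional


def bisect_indifference(remaining_years: float,
                        respond: Callable[[float], Optional[bool]],
                        max_steps: int = 8) -> float:
    """Search for the number of years in perfect health at which the
    respondent is indifferent to living `remaining_years` in current health.

    `respond(offer)` is a hypothetical callback: True = accepts the shorter
    life in perfect health, False = rejects it, None = indifferent.
    `max_steps` is an assumed stopping rule, not taken from the source.
    """
    low, high = 0.0, remaining_years
    for _ in range(max_steps):
        offer = (low + high) / 2.0      # first offer is half the remaining years
        answer = respond(offer)
        if answer is None:              # indifference point reached
            return offer
        if answer:                      # accepted: present a shorter life next
            high = offer
        else:                           # rejected: present a longer life next
            low = offer
    return (low + high) / 2.0


def tto_utility(remaining_years: float, years_in_perfect_health: float) -> float:
    """TTO score = 1 - (life years given up / remaining expected life years)."""
    years_given_up = remaining_years - years_in_perfect_health
    return 1.0 - years_given_up / remaining_years


def simulated_respondent(indifference_point: float) -> Callable[[float], Optional[bool]]:
    """Build a made-up respondent with a known indifference point."""
    def respond(offer: float) -> Optional[bool]:
        if offer == indifference_point:
            return None
        return offer > indifference_point
    return respond


if __name__ == "__main__":
    years = bisect_indifference(20.0, simulated_respondent(15.0))
    print(years, tto_utility(20.0, years))
```

With a remaining life expectancy of 20 years, the first offer is 10 years in perfect health, followed by 5 or 15 years depending on the answer, as in the example in the text. A respondent who never accepts a trade (a zero-trader) is pushed towards a utility of 1.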
NRS pain and general health
Current severity of pain and current general health were both measured by a numerical rating scale (NRS), ranging from 0 (best) to 10 (worst).
Physical and mental health were measured by calculating the physical and mental component summary scores (PCS and MCS) of the SF-36 version 2, a generic descriptive instrument for measuring health-related quality of life on eight dimensions (mental functioning, physical functioning, bodily pain, vitality, role limitations due to physical problems, role limitations due to emotional problems, social functioning and general health). The scores range from 0 to 100, with a higher score indicating better health.
From the SF-36, SF-6D utility scores were derived, reflecting health state valuations of the general public. The utility scores range from 0 to 1, with a higher score indicating a better HRQoL.
The level of functional disability was assessed by the Health Assessment Questionnaire Disability Index (HAQ-DI), a self-report measure consisting of eight categories (dressing and grooming, arising, eating, walking, hygiene, reach, grip and common daily activities). The HAQ score ranges from 0 to 3, with a higher score indicating a worse functional status.
To examine the presence of an order effect between participants who started with the interview TTO or with the computer TTO, a Mann–Whitney U-test was performed.
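As an illustration of the test named above, the Mann–Whitney U statistic can be computed by counting, over all pairs, how often a value from one group exceeds a value from the other. The sketch below is not the SPSS routine used in the study. The function names are hypothetical, and the p-value uses a plain normal approximation without tie or continuity correction, which is an assumption made here for brevity.

```python
import math
from statistics import NormalDist


def mann_whitney_u(first, second):
    """U statistic for `first`: number of (a, b) pairs with a > b,
    counting ties as one half."""
    u = 0.0
    for a in first:
        for b in second:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def two_sided_p_normal(u, n1, n2):
    """Two-sided p-value from the normal approximation to U.
    Assumption: no tie correction and no continuity correction."""
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Made-up utilities for two hypothetical groups (not study data).
    interview_first = [0.75, 0.90, 1.00, 1.00, 0.85]
    computer_first = [0.80, 0.95, 1.00, 0.70, 0.88]
    u = mann_whitney_u(interview_first, computer_first)
    print(u, two_sided_p_normal(u, len(interview_first), len(computer_first)))
```

For samples as small as the one in this study, and with many tied utilities of 1.00, an exact or tie-corrected p-value from a statistics package would be preferable to this approximation.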
Construct validity was assessed by measuring convergent and discriminative validity. Convergent validity of the interview and computer version was assessed by calculating Spearman’s correlations between each of the TTOs and the NRS for pain and general health, SF-36, SF-6D and HAQ-DI. Moderate correlations (0.40-0.59) were expected: all measures (except for the SF-6D) are descriptive, and most instruments only capture one or some aspects of the construct quality of life. The SF-6D yields utilities, but these are derived from the general public. A sample of 29 participants was required to demonstrate a significant moderate Spearman’s correlation of 0.50 with an alpha of 0.05 (one-tailed) and a power (1-β) of 0.80. Discriminative power of the interview TTO and computer TTO was determined by comparing median utilities between worse and better pain, general health, HRQoL and functional status. To this end, the outcome measures were dichotomised at the median score. A worse health outcome was defined as a score ≤ the median value of the outcome measure, and a better health outcome as a score > the median value. Because the NRS (Pain and General Health) and HAQ-DI are scaled in the opposite direction, a worse health outcome on these instruments was defined as a score > the median value, and a better health outcome as a score ≤ the median value. The Mann–Whitney U-test was used to test significance. Data were analysed using SPSS version 16.0.
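The two analyses described above can be sketched with the Python standard library. This is an illustrative sketch, not the SPSS analysis performed in the study. All function names are hypothetical, the example data are made up, and the confidence-interval helper uses the Fisher z-transformation, which is an assumption: the text does not state how intervals around the correlations were obtained.

```python
import math
from statistics import NormalDist, median


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval via the Fisher z-transformation
    (an assumed method, see the lead-in)."""
    half_width = NormalDist().inv_cdf(0.5 + level / 2.0) / math.sqrt(n - 3)
    z = math.atanh(r)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def median_split(utilities, outcome, higher_is_worse):
    """Dichotomise `outcome` at its median and return the median utility of
    the (worse, better) groups. Set higher_is_worse=True for the NRS and
    HAQ-DI, False for SF-36 and SF-6D. Raises if a group is empty."""
    cut = median(outcome)
    above = [u for u, o in zip(utilities, outcome) if o > cut]
    at_or_below = [u for u, o in zip(utilities, outcome) if o <= cut]
    worse, better = (above, at_or_below) if higher_is_worse else (at_or_below, above)
    return median(worse), median(better)


if __name__ == "__main__":
    # Made-up values for four hypothetical respondents (not study data).
    utilities = [0.50, 0.60, 0.90, 1.00]
    pain_nrs = [8, 7, 2, 1]
    print(spearman(utilities, pain_nrs))
    print(fisher_ci(-0.10, 30))
    print(median_split(utilities, pain_nrs, higher_is_worse=True))
```

The `higher_is_worse` flag encodes the reversed scaling of the NRS and HAQ-DI, so that the "worse" group is always the one with the poorer health outcome.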
Demographic, clinical and psychosocial characteristics
Age (mean years ± SD): 58 ± 13
Disease duration (median years (IQR))
Marital status (%)
Educational level (%)
Work status (%)
Utilities (Median (IQR))
Pain (numerical rating scale) (median (IQR))
General Health (numerical rating scale) (median (IQR))
Descriptive health-related quality of life (SF36) (median (IQR))
Preference-based health related quality of life (SF-6D) (median (IQR))
Functional status (HAQ-DI) (median (IQR))
No effect of test order was found for either the interview TTO utility scores or the computer TTO utility scores (P = 0.37 and P = 0.73, respectively). Thus, utility scores did not differ significantly between patients who started with the interview TTO and those who started with the computer TTO.
Construct validity: Convergent and discriminative validity
Spearman’s correlations (95% confidence intervals) of interview TTO or computer TTO utilities with health outcome measures for the total sample and for the traders only
Outcome measure | Interview TTO, total sample (N = 30) | Interview TTO traders only (N = 20) | Computer TTO, total sample (N = 30) | Computer TTO traders only (N = 22)
Pain (NRS) ≈ | −0.10 (−0.45, 0.27) | −0.38 (−0.70, 0.08)* | −0.26 (−0.57, 0.11) | −0.47 (−0.74, −0.05)*
General Health (NRS) ≈ | −0.08 (−0.43, 0.29) | −0.42 (−0.73, 0.03)* | −0.05 (−0.40, 0.32) | −0.13 (−0.53, 0.30)
Descriptive HRQoL (SF-36) | 0.16 (−0.22, 0.49) | 0.45 (0.01, 0.74)* | 0.22 (−0.15, 0.54) | 0.35 (−0.08, 0.67)
Descriptive HRQoL (SF-36) | 0.24 (−0.14, 0.55) | 0.32 (−0.14, 0.67) | 0.22 (−0.15, 0.54) | 0.24 (−0.20, 0.60)
Preference-based HRQoL (SF-6D) | 0.18 (−0.21, 0.50) | 0.45 (−0.01, 0.73)* | 0.11 (−0.27, 0.45) | 0.20 (−0.24, 0.57)
Functional status (HAQ-DI) ≈ | −0.07 (−0.42, 0.29) | −0.20 (−0.59, 0.27) | −0.21 (−0.53, 0.17) | −0.38 (−0.69, 0.05)*
Discrimination of the interview and computer TTO between worse and better health outcomes a d
Interview TTO utilities (N = 30)
Computer TTO utilities (N = 30)
Worse health outcomes b c
Better health outcomes
Worse health outcomes b c
Better health outcomes
Pain (NRS) e
General Health (NRS)
0.88 (0.80 -1.00)
0.87 (0.80 -1.00)
This pilot study showed that the construct validity of both the interview TTO and computer TTO was poor in patients with RA when using measures of HRQoL, general health, pain and functional status as reference measures. After exclusion of zero-traders from analysis, the results improved. This finding was expected, because zero-traders did not have a significantly different health status compared with traders. Indications of the poor convergent validity of the TTO were also found in other studies in RA and studies in other diseases [4–6, 9, 15–17]. In most of these studies it was unclear how many participants were zero-traders and whether they were included or excluded. One study reported similar results when including or excluding zero-traders from analysis. In our study, we did not find the TTO to be discriminative for any of the health outcome measures used. Other studies found evidence for and against its discriminative ability [4, 5, 9, 16]. Contradictory findings were reported for pain and disease activity scores in patients with RA [4, 5] and for functional status scores in patients with cardiovascular disease [9, 16].
All these studies differed in the TTO procedure applied. This might explain the contradictory results regarding the discriminative ability of the TTO. Besides the mode of administration, studies differed in the time frame used (remaining life expectancy [4, 5, 16–18], a time frame dependent on age group, or not mentioned). Furthermore, some studies described the way in which people had to think about current health [16, 17] and/or about the anchors perfect health [4, 5, 16] and death, whereas other studies did not [6, 9, 18]. One study used a symptom-free anchor (‘no angina’) instead of ‘perfect health’. In many studies it was stated that a visual aid was used, although no further information was given about its representation [4–6]. In addition, many studies did not report the precise method of elicitation (e.g. ping-pong) [4–6, 9, 18].
In our study, the TTO procedure applied was precisely described, facilitating the comparison with other studies. Strengths of this study were the fact that we used two different TTO assessments and that we used a broad set of PROs in a homogeneous population consisting of RA patients. A limitation of this study was the use of a small convenience sample.
There are several possible explanations for the results of our study, irrespective of the TTO procedure used. First, the low correlation with the SF-6D, another preference-based instrument, can be partly explained by the difference in perspective used to obtain utilities. SF-6D utilities are derived from the general public, so these scores represent a societal perspective. TTO scores were directly calculated from the patients’ preferences, representing a patient perspective. Secondly, except for the SF-36 and SF-6D, the comparators used in this study only measure one aspect (e.g. functional status) of the construct quality of life. Furthermore, except for the SF-6D, the comparators are descriptive, which implies that valuations of health states are not assessed. With these measures patients are asked about their levels of impaired health or pain, whereas personal preferences toward their health state remain unrevealed. It is possible that people with the same health state report different utilities if they have different ‘aspirations’. Nease et al. illustrate this by the example that inability to walk ‘more than a city block’ does not have to be a limitation if someone does not desire to be active. Therefore, it would be worthwhile to examine in future studies whether it is better to validate the TTO against individualized measures of personal preferences, such as the SEIQOL [19, 20] or MACTAR. Thirdly, it has been found that preferences are prone to biases inherent to the nature of the TTO, such as loss aversion. Loss aversion can be observed when a choice has to be made between ‘remaining the status quo’ (remaining in the current health state) and ‘accepting an alternative to it’ (trading off life years for perfect health). In that case people will evaluate the advantages and disadvantages of the alternative in terms of losses and gains. The TTO asks people about their willingness to trade off life years (a loss) for optimal health (a gain).
Because ‘losses loom larger than gains’, people become reluctant to give up life years. This will result in higher utilities, as supported by findings of Van Osch et al. Furthermore, TTO utilities might be influenced by other factors that are unrelated to current health, such as family-related aspects, for example having children or seeing grandchildren grow up. Finally, the nature of the disease can influence utilities. Asking patients to trade off life years may feel unrealistic, because patients with RA do not perceive their disease as life-threatening. Therefore, people may be less willing or not willing at all to trade off life years. Our results are indicative of this: irrespective of health, a relatively large number of participants were not willing to trade any life year for perfect health. For chronic illnesses such as RA there may be more realistic health-related anchors, for example ‘becoming dependent on others’ and ‘having increased physical limitations’, which were reported by RA patients to worry them [25, 26]. It could be examined whether the validity of the TTO improves when the trade-off about dying earlier is replaced with other, more realistic (health-related) trade-offs. The use of a ‘chained’ TTO procedure could also improve the validity of the TTO. In a chained procedure, the health state of interest is not directly compared with death but indirectly with the aid of an intermediate anchor health state [27–29]. A limitation is that a chained procedure is more complex, because it adds an additional step to the valuation process, possibly leading to extra noise. Limited research has been performed on the chained TTO, and it has mainly been applied to temporary health states [28–30]. For chronic health states it has been shown that chained TTOs are systematically biased upwards (when the worst endpoint was varied) or downwards (when the best endpoint was varied), but that it is possible to correct for these biases.
However, the respondents were not patients, but healthy people and women at high risk for breast cancer. Research in chronically ill patients examining the validity of the chained TTO for chronic states is lacking.
In conclusion, both the standardised interview TTO and standardised computer TTO showed similarly poor results regarding construct validity when using measures of HRQoL, general health, pain and functional status as reference measures. Possibly, the validity of the TTO can be improved by replacing the anchor ‘death’ with an anchor that is more realistic to RA patients. Future studies in which direct patient-reported utilities are derived could start with the development of a TTO instrument using realistic anchors for RA patients. This instrument could be validated against individualized measures of personal preferences, such as the SEIQOL or MACTAR instrument.
We would like to thank the participants and the rheumatology department for their contribution to this study. We are also very grateful to André Brands, who programmed the computer TTO. This study was financially supported by a grant from the foundation: ‘Stichting Onderzoek en Zorgontwikkeling Reumacentrum Twente’.
1. Torrance GW, Thomas WH, Sackett DL: A utility maximization model for evaluation of health care programs. Health Serv Res. 1972, 7: 118-133.
2. Arnesen T, Trommald M: Are QALYs based on time trade-off comparable? A systematic review of TTO methodologies. Health Econ. 2005, 14: 39-53. 10.1002/hec.895.
3. Lenert LA, Cher DJ, Goldstein MK, Bergen MR, Garber A: The effect of search procedures on utility elicitations. Med Decis Making. 1998, 18: 76-83. 10.1177/0272989X9801800115.
4. Tijhuis GJ, Jansen SJT, Stiggelbout AM, Zwinderman AH, Hazes JMW, Vliet Vlieland TPM: Value of the time trade off method for measuring utilities in patients with rheumatoid arthritis. Ann Rheum Dis. 2000, 59: 892-897. 10.1136/ard.59.11.892.
5. Bejia I, Salem KB, Touzi M, Bergaoui N: Measuring utilities by the time trade-off method in Tunisian rheumatoid arthritis patients. Clin Rheumatol. 2006, 25: 38-41. 10.1007/s10067-005-1125-6.
6. Witney AG, Treharne GJ, Tavakoli M, Lyons AC, Vincent K, Scott DL, Kitas GD: The relationship of medical, demographic and psychosocial factors to direct and indirect health utility instruments in rheumatoid arthritis. Rheumatology. 2006, 45: 975-981. 10.1093/rheumatology/kel027.
7. Lenert LA, Sturley A, Watson ME: iMPACT3: Internet-based development and administration of utility elicitation protocols. Med Decis Making. 2002, 22: 464-474. 10.1177/0272989X02238296.
8. Sumner W, Nease R, Littenberg B: U-titer: a utility assessment tool. Proceedings of the Annual Symposium on Computer Application in Medical Care: 17–20 November 1991; Washington D.C. Edited by: Clayton PD. 1991, McGraw-Hill, Washington D.C, 701-705.
9. Nease RF, Kneeland T, O'Connor GT, Sumner W, Lumpkins C, Shaw L, Pryor D, Sox HC: Variation in patient utilities for outcomes of the management of chronic stable angina: Implications for clinical practice guidelines. J Am Med Assoc. 1995, 273: 1185-1190. 10.1001/jama.1995.03520390045031.
10. Centraal Bureau voor de Statistiek: Overlevingstafels. 2006, Available from: http://www.cbs.nl. Accessed 17 December 2007.
11. Buitinga L, Braakman-Jansen LMA, Taal E, Van de Laar MAFJ: A computer Time Trade-Off: a feasible and reliable alternative for the interview Time Trade-Off in rheumatoid arthritis. Clin Exp Rheumatol. 2011, 29: 783-789.
12. Ware JE, Sherbourne CD: The MOS 36-Item Short-Form Health Survey (SF-36): I. Conceptual Framework and Item Selection. Med Care. 1992, 30: 473-483. 10.1097/00005650-199206000-00002.
13. Brazier J, Roberts J, Deverill M: The estimation of a preference-based measure of health from the SF-36. J Health Econ. 2002, 21: 271-292. 10.1016/S0167-6296(01)00130-8.
14. Fries JF, Spitz P, Kraines RG, Holman HR: Measurement of patient outcome in arthritis. Arthritis Rheum. 1980, 23: 137-145. 10.1002/art.1780230202.
15. Maor Y, King M, Olmer L, Mozes B: A comparison of three measures: the time trade-off technique, global health-related quality of life and the SF-36 in dialysis patients. J Clin Epidemiol. 2001, 54: 565-570. 10.1016/S0895-4356(00)00338-3.
16. Lalonde L, Clarke AE, Joseph L, Mackenzie T, Grover SA: Comparing the psychometric properties of preference-based and nonpreference-based health-related quality of life in coronary heart disease. Qual Life Res. 1999, 8: 399-409. 10.1023/A:1008991816278.
17. Khanna D, Ahmed M, Furst DE, Ginsburg SS, Park GS, Hornung R, Tsevat J: Health values of patients with systemic sclerosis. Arthritis Rheum. 2007, 57: 86-93. 10.1002/art.22465.
18. Nease RF, Tsai R, Hynes LM, Littenberg B: Automated utility assessment of global health. Qual Life Res. 1996, 5: 175-182. 10.1007/BF00435983.
19. McGee HM, O'Boyle CA, Hickey A, O'Malley K, Joyce CRB: Assessing the quality of life of the individual: the SEIQoL with a healthy and a gastroenterology unit population. Psychol Med. 1991, 21: 749-759. 10.1017/S0033291700022388.
20. O'Boyle CA, McGee H, Hickey A, O'Malley K, Joyce CRB: Individual quality of life in patients undergoing hip replacement. Lancet. 1992, 339: 1088-1091. 10.1016/0140-6736(92)90673-Q.
21. Tugwell P, Bombardier C, Buchanan WW: The MACTAR patient preference disability questionnaire - An individualized functional priority approach for assessing improvement in physical disability in clinical trials in rheumatoid arthritis. J Rheumatol. 1987, 14: 446-451.
22. Kahneman D, Tversky A: Choices, values, and frames. Am Psychol. 1984, 39: 341-350.
23. Van Osch SMC, Wakker PP, Van Den Hout WB, Stiggelbout AM: Correcting biases in standard gamble and time tradeoff utilities. Med Decis Making. 2004, 24: 511-517. 10.1177/0272989X04268955.
24. Van Osch SMC: The Construction of Health State Utilities. 2007, PhD thesis. Leiden University Medical Center (LUMC), Leiden University, Department of Medical Decision Making.
25. Lempp H, Scott D, Kingsley G: The personal impact of rheumatoid arthritis on patients' identity: A qualitative study. Chronic Illn. 2006, 2: 109-120.
26. McPherson KM, Brander P, Taylor WJ, McNaughton HK: Living with arthritis – What is important?. Disabil Rehabil. 2001, 23: 706-721. 10.1080/09638280110049919.
27. Spencer A: The implications of linking questions within the SG and TTO methods. Health Econ. 2004, 13: 807-818. 10.1002/hec.863.
28. Locadia MA, Stalmeier PFM, Oort FJ, Prins MH, Spranger MAG, Bossuyt PMM: A comparison of 3 valuation methods for temporary health states in patients treated with oral anticoagulants. Med Decis Making. 2004, 24: 625-633. 10.1177/0272989X04271042.
29. Jansen SJT, Stiggelbout AM, Wakker PP, Vliet-Vlieland TPM, Leer JH, Nooy MA, Kievit J: Patients’ utilities for cancer treatments: A study of the chained procedure for the Standard Gamble and Time Tradeoff. Med Decis Making. 1998, 18: 391-399. 10.1177/0272989X9801800406.
30. Johnston K, Brown J, Gerard K, O’Hanlon M, Morton A: Valuing temporary and chronic health states associated with breast screening. Soc Sci Med. 1998, 47: 213-222. 10.1016/S0277-9536(98)00065-3.
31. Stalmeier PFM: Discrepancies between chained and classic utilities induced by anchoring with occasional adjustments. Med Decis Making. 2002, 22: 53-64.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2474/13/112/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.