Design
Participants acted as judges and matched the thirty-eight items from the DASH to the definitions of the impairment, activity limitations and participation restrictions constructs from the ICF model.
Participants
Twenty-four academics (one clinical, five industrial and thirteen health psychologists, and five health service researchers) from the Applied Psychology Research Group at the University of Aberdeen took part in the study as part of a seminar on the DCV method. The precise number of judges required for judgement tasks has yet to be established, but between 2 and 20 is regarded as adequate [16–18].
Materials
The definitions of the three ICF constructs, namely impairment, activity limitations and participation restrictions, were taken from the WHO and are given in Table 1. All 38 measurement items from the DASH were assessed. The DASH is a self-administered, region-specific outcome instrument developed as a measure of self-rated upper-extremity disability and symptoms, and has been identified as the most validated and easy-to-use measure of upper-extremity function [6]. The DASH consists of 30 core items and 8 optional items, which generate a disability score scaled from 0 (no disability) to 100.
Procedure
The detailed procedure for a DCV study has been published previously [13, 14, 19]. Briefly, for each DASH item, participants provided a Yes/No judgement of whether the item was a match to the theoretical definition of each ICF construct. Consequently, each participant provided 3 judgements for each of the 38 items, i.e. 114 judgements in total. In addition, participants gave a confidence rating for each judgement on an 11-point scale ranging from 0% to 100%, rising in 10% increments.
Statistical Analysis
Classification of items
Judgements were coded +1 for a match and -1 for no match. Each judgement was multiplied by its accompanying confidence rating, expressed as a proportion. Consequently, the weighted judgements ranged from -1 to +1. One-sample t-tests were used to classify each item into one of the 7 possible combinations of constructs, namely I, A, P, IA, IP, AP or IAP. An item was classified as being related to a construct if its mean weighted judgement against that construct was significantly greater than zero. Missing data, either a missing judgement or a missing confidence rating, were coded zero. The weighted judgement of that item by that judge was therefore zero and was entered as such into the one-sample t-test. Hochberg's correction was used to correct for multiple tests [20]. After applying Hochberg's correction, statistical significance was achieved with t-values that corresponded to a p value of ≤ 0.001.
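The classification steps described above can be illustrated with a minimal sketch. It is not the analysis code used in the study. The function names, the example ratings and the default alpha of 0.05 are assumptions made for illustration. Converting the t statistic into a p value (from the t distribution with n - 1 degrees of freedom) is left to a statistics package.

```python
import math
from statistics import mean, stdev


def weighted_judgement(match, confidence_pct):
    """Return the confidence-weighted judgement in the range -1 to +1.

    match: True (coded +1), False (coded -1) or None if missing.
    confidence_pct: 0 to 100 in steps of 10, or None if missing.
    Missing data are coded zero, as described in the text.
    """
    if match is None or confidence_pct is None:
        return 0.0
    code = 1 if match else -1
    return code * confidence_pct / 100.0


def one_sample_t(values, mu=0.0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)
    return (mean(values) - mu) / standard_error, n - 1


def hochberg_reject(p_values, alpha=0.05):
    """Hochberg step-up procedure: one reject flag per p value.

    With p values sorted in ascending order, find the largest rank k
    such that p_(k) <= alpha / (m - k + 1) and reject hypotheses 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for k in range(m, 0, -1):
        if p_values[order[k - 1]] <= alpha / (m - k + 1):
            for i in order[:k]:
                reject[i] = True
            break
    return reject


# Hypothetical ratings from six judges for one item against one construct:
# (match, confidence in percent); the last judge gave no judgement.
ratings = [(True, 90), (True, 80), (True, 100),
           (False, 20), (True, 70), (None, 50)]
weighted = [weighted_judgement(m, c) for m, c in ratings]
t_value, df = one_sample_t(weighted)
print(weighted, round(t_value, 2), df)
```

In the study design, each of the 38 items would yield three such t-tests (one per construct), and the resulting 114 p values would be passed together to the correction step.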
Inter-rater reliability
Intraclass correlation coefficients (ICC) and their 95% confidence intervals (95% C.I.) were used to assess agreement between judges across all 38 items and for each construct, i.e. I, A and P judgements. The ICC was calculated from the weighted judgements using a two-way mixed model with a consistency measure.
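As an illustration of the reliability calculation, the sketch below computes consistency-type ICCs from a two-way layout of items by judges. The text does not state whether single-measure or average-measure coefficients were reported, so both are returned. The function name and the example matrix are hypothetical, and confidence intervals are omitted.

```python
def icc_consistency(ratings):
    """Two-way, consistency-type ICC from an items-by-judges matrix.

    ratings: list of rows, one row per item, one score per judge.
    Returns (single-measure ICC, average-measure ICC).
    """
    n = len(ratings)       # number of items (targets)
    k = len(ratings[0])    # number of judges (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into items, judges and residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average


# Hypothetical weighted judgements: four items (rows) by three judges.
example = [
    [0.9, 0.8, 1.0],
    [-0.5, -0.7, -0.2],
    [0.3, 0.0, 0.6],
    [-1.0, -0.9, -0.6],
]
single_icc, average_icc = icc_consistency(example)
print(round(single_icc, 3), round(average_icc, 3))
```

Because the consistency definition ignores systematic differences between judges, a judge who is uniformly more confident than the others does not lower the coefficient.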