Participants
The current study analysed data collected by the European Project on OSteoArthritis (EPOSA), a population-based study of cohorts living in Germany, Italy, the Netherlands, Spain, Sweden, and the UK that recruited 2942 adults aged 65–85 years. All participants gave written informed consent; the study design and methodology are described in detail elsewhere [4]. The study design was approved by the appropriate local ethics committees (Germany: Universität Ulm Ethikkommission [312/08]; Italy: Comitato Etico Provinciale Treviso [XLIV-RSA/AULSS7]; the Netherlands: Medisch Ethische Toetsingscommissie Vrije Universiteit Amsterdam [2002/141]; Spain: Comité Ético de Investigación Clínica del Hospital Universitario La Paz Madrid [PI-1080]; Sweden: Till forskningsetikkommittén vid Karolinska Institutet Stockholm [00–132]; UK: Hertfordshire Research Ethics Committee [10/H0311/59]).
The project aimed to evaluate the participants once at baseline (between November 2010 and November 2011) and a second time 12–18 months later. During the assessment the participants underwent a clinical examination and were interviewed at home or in a health care centre by trained physicians and nurses using a standardized questionnaire.
Measures
Physical function, pain and stiffness of hand OA were assessed at baseline and 12–18 months later using the three subscales of the AUSCAN (15 items grouped into 3 scales: pain (5 items), stiffness (1 item), and physical function (9 items)) that utilized a 5-point Likert scale (responses ranged from none to extreme; 0 = none, 1 = mild, 2 = moderate, 3 = severe, and 4 = extreme) [1].
Physical function, pain and stiffness of hip and/or knee OA were measured at baseline and 12–18 months later using the three subscales of the WOMAC (24 items grouped into 3 scales: pain (5 items), stiffness (2 items), and physical function (17 items)) that utilized a 5-point Likert scale. Hip/knee pain and stiffness were defined as the maximum value of two joints [2, 3].
All the AUSCAN and WOMAC subscales were normalized from 0 to 100; higher scores indicate worse health status [1,2,3].
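The normalization can be illustrated with a minimal sketch. It assumes a simple linear rescaling of the summed item scores (each item scored 0 to 4) onto a 0–100 range; the exact formula is not given in the text, and the function name is hypothetical.

```python
def normalize_subscale(item_scores, max_item_score=4):
    """Rescale summed Likert item scores (0..max_item_score each) to 0-100.

    Assumption: linear rescaling of the raw sum; higher = worse health status.
    """
    if not item_scores:
        raise ValueError("at least one item score is required")
    if any(s < 0 or s > max_item_score for s in item_scores):
        raise ValueError("item score outside the Likert range")
    raw = sum(item_scores)                       # raw subscale sum
    max_raw = max_item_score * len(item_scores)  # highest possible sum
    return 100.0 * raw / max_raw


# Example: a 9-item physical function subscale with every item rated "moderate" (2)
print(normalize_subscale([2] * 9))
```

Under this assumption the same function serves all AUSCAN and WOMAC subscales, since only the number of items differs.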
Grip strength of both hands was measured twice at baseline and again 12–18 months later using a strain gauge dynamometer; the result (the highest values of the right and left hands, summed and divided by 2) is expressed in kilograms [26].
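The grip strength summary described above amounts to a small calculation, sketched here. The function name and the example readings are hypothetical.

```python
def grip_strength_kg(right_trials, left_trials):
    """Mean of the best right-hand and best left-hand readings, in kilograms."""
    best_right = max(right_trials)  # highest of the right-hand attempts
    best_left = max(left_trials)    # highest of the left-hand attempts
    return (best_right + best_left) / 2


# Two attempts per hand (illustrative values only)
print(grip_strength_kg([30.0, 32.0], [28.0, 27.0]))
```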
Walking speed was measured as the time, recorded in seconds, taken to walk a 3-m course marked out on the floor, with an additional 2 ft free of obstructions at both ends.
Anxiety and depression were evaluated at baseline and 12–18 months later using the Hospital Anxiety and Depression Scale (HADS), a 14-item self-assessment instrument that measures anxiety and depression on separate subscales [27]. Each subscale ranges from 0 to 21, and a score of 8 or higher on either subscale indicates an altered state.
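As a minimal sketch, the HADS cut-off can be applied to each subscale separately. The function and constant names are hypothetical; only the threshold of 8 and the 0–21 range come from the text.

```python
HADS_CUTOFF = 8      # scores of 8 or higher indicate an altered state
HADS_MAX_SCORE = 21  # each subscale ranges from 0 to 21


def hads_flags(anxiety_score, depression_score):
    """Return (anxiety_altered, depression_altered) for one participant."""
    for score in (anxiety_score, depression_score):
        if not 0 <= score <= HADS_MAX_SCORE:
            raise ValueError("HADS subscale score must lie between 0 and 21")
    return anxiety_score >= HADS_CUTOFF, depression_score >= HADS_CUTOFF


print(hads_flags(9, 4))
```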
Health-related QoL was measured at baseline and 12–18 months later using the 5-level EQ-5D, which consists of a descriptive system comprising five dimensions (mobility, self-care, usual activities, pain/discomfort, anxiety/depression), and the EQ VAS, a vertical visual analogue scale [28]. The EQ-5D scores were converted into a single index using the time trade-off (TTO) valuations from the general population of the UK; the index ranges from −0.594 to 1, where 1 indicates good or satisfying health. EQ VAS scores range between 0 and 100, with higher scores indicating better health.
Clinical diagnosis of OA was formulated on the basis of the participant’s medical history and a physical examination (only at baseline), utilising algorithms in accordance with the clinical criteria developed by the American College of Rheumatology (ACR) [29] and the recommendations of the European League Against Rheumatism [30].
Clinical hand OA (classified as present vs absent) was diagnosed using specific AUSCAN sections [1]: the cut-off in the algorithm was ≥3 for hand pain and ≥1 for stiffness. At least 2 of the following criteria were required for a diagnosis of hand OA: a) hard tissue enlargement of two or more joints, b) hard tissue enlargement of two or more distal inter-phalangeal joints, c) deformity of at least one hand joint. The criterion of swelling of the metacarpophalangeal joints was assessed only in the English and German participants.
Clinical hip/knee OA, defined as the presence of OA in at least one of these joints, was diagnosed on the basis of the outcome of specific WOMAC sections. Pain in the hip/knee on at least one side [2, 3] was evaluated during the physical examination using a cut-off ≥3. For a diagnosis of knee OA, at least two of the following were necessary: a) a morning stiffness score from mild to extreme; b) crepitus with active motion on at least one side at the physical examination; c) bone tenderness on at least one side at the physical examination; d) bone enlargement at the physical examination on at least one side; e) no palpable warmth of synovium at the physical examination in either knee. All of the following were needed for a positive hip OA assessment: a) pain in the hip on at least one side associated with restricted hip internal rotation at a physical examination; b) morning stiffness of the hip lasting < 60 min, evaluated using the stiffness section of the WOMAC with a score from mild to extreme.
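The hip/knee decision rules can be sketched as boolean functions. This is an illustration under stated assumptions, not the study's algorithm: it assumes that the pain cut-off (≥3) is a prerequisite for knee OA, that "mild to extreme" stiffness corresponds to a WOMAC item score of at least 1, and that the examination findings are already coded as booleans. All function and parameter names are hypothetical.

```python
def knee_oa(pain_score, stiffness_score, crepitus, bone_tenderness,
            bone_enlargement, no_palpable_warmth):
    """Knee OA: pain at or above the cut-off plus at least two of five criteria."""
    criteria = [
        stiffness_score >= 1,  # a) morning stiffness from mild to extreme
        crepitus,              # b) crepitus with active motion
        bone_tenderness,       # c) bone tenderness
        bone_enlargement,      # d) bone enlargement
        no_palpable_warmth,    # e) no palpable warmth of synovium
    ]
    # Assumption: the pain cut-off acts as a prerequisite for the diagnosis
    return pain_score >= 3 and sum(criteria) >= 2


def hip_oa(pain_with_restricted_internal_rotation, stiffness_score):
    """Hip OA: both criteria must be met."""
    return bool(pain_with_restricted_internal_rotation) and stiffness_score >= 1


def hip_knee_oa(knee_positive, hip_positive):
    """Clinical hip/knee OA: OA present in at least one of the two joints."""
    return knee_positive or hip_positive


knee = knee_oa(3, 2, True, False, False, True)
hip = hip_oa(False, 0)
print(hip_knee_oa(knee, hip))
```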
Statistical analysis
Data analyses and graphical presentations were carried out using SAS software (SAS System, SAS Institute Inc., Cary, NC), version 9.4. Data were analysed using a set of weights calculated per sex and per 5-year age class with respect to the 2010 Standard European Population [4].
Changes over time (in the 12–18 months between the baseline and follow-up evaluations) were evaluated as continuous variables using the non-parametric signed rank test. Spearman’s correlation was used to compare the changes in the AUSCAN and WOMAC physical function scales with the changes in the other variables; the Cronbach α coefficient was used to measure the scales’ reliability (internal consistency), with values of α ≥ 0.7 reflecting good reliability [31].
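The two coefficients named above can be illustrated with a short, dependency-free sketch. The analyses in the study were run in SAS; this Python version only shows the textbook definitions (Spearman's ρ as the Pearson correlation of average ranks, and Cronbach's α from item and total-score variances). It ignores the sampling weights, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cronbach_alpha(items):
    """items: one list of scores per item, each with one entry per participant."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per participant
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))
```

With real data, `spearman` would be called on the change in a physical function score and the change in a candidate variable, and `cronbach_alpha` on the item-level responses of one subscale.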
Only the data of participants with complete assessments, that is, those who had completed both the baseline and follow-up assessments, were included in the statistical analysis. The MCID was calculated from the changes in scores between baseline and follow-up. Because the MCID for subject-reported outcome measures may vary across populations and contexts, we followed the recommendation of Revicki et al. [32] and used multiple approaches to estimate the MCID in the AUSCAN and WOMAC physical function scores, triangulating on a single value or on a small range of values.
For the anchor-based estimation of the MCID, we used the receiver operating characteristic (ROC) curve on the change score in the anchor. The variables assessed as possible anchors for the AUSCAN hand OA physical function score were: the AUSCAN hand OA pain score, the AUSCAN hand OA stiffness score, grip strength, HADS anxiety, HADS depression, the EQ-5D-5L, and the EQ VAS. The variables evaluated as possible anchors for the WOMAC hip/knee OA physical function score were: the WOMAC hip/knee OA pain score, the WOMAC hip/knee OA stiffness score, the walking-test time, HADS anxiety, HADS depression, the EQ-5D-5L, and the EQ VAS.
An anchor should be chosen only if the change in the anchor correlates significantly with the change in the physical function score, with an absolute correlation coefficient ≥ 0.30 [31].
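The selection rule can be sketched as a simple filter over precomputed correlations. The significance level of 0.05 is an assumption (the text does not state it), and the anchor names and values in the example are purely illustrative, not study results.

```python
def eligible_anchors(correlations, alpha=0.05, min_abs_rho=0.30):
    """Keep anchors whose change correlates significantly and with |rho| >= 0.30.

    correlations: dict mapping anchor name -> (rho, p_value).
    alpha: assumed significance level (not specified in the text).
    """
    return [
        name
        for name, (rho, p_value) in correlations.items()
        if p_value < alpha and abs(rho) >= min_abs_rho
    ]


# Hypothetical correlations with the change in a physical function score
example = {
    "pain": (0.55, 0.001),
    "grip_strength": (-0.10, 0.20),
    "eq_vas": (-0.35, 0.01),
    "hads_anxiety": (0.40, 0.30),
}
print(eligible_anchors(example))
```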
A ROC curve was constructed for those participants showing stable or worsened anchor scores; the area under the curve (AUC) summarizes the instrument’s ability to distinguish between individuals who have and those who do not have a minimal clinically important difference in functionality. The criteria used to identify the optimal cut-off were: the Youden index (J) [33], the Euclidean distance (D), and the equality of sensitivity and specificity (S). The percentage of participants exceeding the MCID was estimated for each cut-off value.
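The three cut-off criteria can be illustrated with a minimal sketch. It assumes that a positive change in the physical function score means worsening (higher scores indicate worse status), that the anchor has been dichotomized into worsened versus stable, and that D is the distance from the ROC point to the ideal corner (sensitivity = 1, specificity = 1). Function names and the example data are hypothetical.

```python
from math import sqrt


def roc_cutoffs(change_scores, worsened):
    """Return the cut-offs chosen by the J, D and S criteria.

    change_scores: change in physical function (follow-up minus baseline).
    worsened: 1/True if the anchor worsened, 0/False if it stayed stable.
    """
    pos = [c for c, w in zip(change_scores, worsened) if w]
    neg = [c for c, w in zip(change_scores, worsened) if not w]
    if not pos or not neg:
        raise ValueError("both worsened and stable participants are required")

    rows = []
    for cut in sorted(set(change_scores)):
        sens = sum(c >= cut for c in pos) / len(pos)  # worsened and flagged
        spec = sum(c < cut for c in neg) / len(neg)   # stable and not flagged
        rows.append((cut, sens, spec))

    youden = max(rows, key=lambda r: r[1] + r[2] - 1)           # J: maximize
    euclid = min(rows, key=lambda r: sqrt((1 - r[1]) ** 2
                                          + (1 - r[2]) ** 2))   # D: minimize
    equal = min(rows, key=lambda r: abs(r[1] - r[2]))           # S: minimize
    return {"J": youden[0], "D": euclid[0], "S": equal[0]}


def percent_exceeding(change_scores, cutoff):
    """Percentage of participants whose change is at or above the cut-off."""
    return 100.0 * sum(c >= cutoff for c in change_scores) / len(change_scores)


changes = [0, 1, 2, 3, 8, 9, 10, 12]
anchor_worsened = [0, 0, 0, 0, 1, 1, 1, 1]
print(roc_cutoffs(changes, anchor_worsened))
print(percent_exceeding(changes, 8))
```

In this toy example the two groups separate perfectly, so all three criteria select the same cut-off; with real data they can disagree, which is why each criterion is reported separately.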
The following were considered for the distribution-based methods:
1. A standardized response mean (SRM) [34], representing very small (SRM2), moderate (SRM5) and large changes (SRM8) [35].
2. The standard error of measurement (SEM) [36] of the changes, considering a 63% confidence interval (CI) (SEM63), a 90% CI (SEM90), and a 95% CI (SEM95).
3. The Edwards-Nunnally (EN) method [37], at the 90% CI (EN90), and at the 95% CI (EN95).
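The three distribution-based quantities can be sketched from their commonly used definitions. Several points here are assumptions rather than details taken from the text: the SRM thresholds are taken as 0.2, 0.5 and 0.8 times the standard deviation of the change scores; the SEM is computed as SD × √(1 − reliability), with the reliability supplied by the caller (for example Cronbach's α); and the EN interval is centred on the regression-adjusted baseline score. The multipliers 1.645 and 1.96 are the usual two-sided normal values for 90% and 95%; the multiplier for the 63% level is left to the caller. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def srm(changes):
    """Standardized response mean: mean change divided by the SD of change."""
    return mean(changes) / stdev(changes)


def srm_threshold(changes, effect):
    """Change corresponding to a given SRM (e.g. 0.2, 0.5 or 0.8)."""
    return effect * stdev(changes)


def sem(sd, reliability):
    """Standard error of measurement from an SD and a reliability coefficient."""
    return sd * sqrt(1 - reliability)


def sem_threshold(sd, reliability, z):
    """SEM scaled by a normal multiplier (1.645 for 90%, 1.96 for 95%)."""
    return z * sem(sd, reliability)


def en_interval(baseline_score, baseline_mean, sd, reliability, z):
    """Edwards-Nunnally interval around the estimated true baseline score.

    A follow-up score outside this interval counts as a reliable change.
    """
    center = reliability * (baseline_score - baseline_mean) + baseline_mean
    half_width = z * sd * sqrt(1 - reliability)
    return center - half_width, center + half_width


# Illustrative values only
print(sem_threshold(10.0, 0.75, 1.96))
print(en_interval(60.0, 40.0, 10.0, 0.75, 1.645))
```

Because higher scores indicate worse function, a participant would be classified as worsened under the EN approach when the follow-up score lies above the upper bound of the interval.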
Anchor-based and distribution-based methods were used to determine the MCID, and on that basis the participants were divided into two categories: worsened or not worsened functionality at the second assessment point. Triangulation was used to examine the multiple values from the different approaches and converge on a single value; agreement was assessed with Cohen’s κ (range from −1 to 1, with 1 indicating perfect agreement).
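Agreement between two MCID approaches can be illustrated with a short sketch of Cohen's κ, computed from the worsened/not worsened classifications that each approach assigns to the same participants. The function name and example data are hypothetical.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two classifications of the same participants."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of participants classified identically
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from the marginal proportions
    labels = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if expected == 1:
        raise ValueError("kappa is undefined when both ratings are constant")
    return (observed - expected) / (1 - expected)


# 1 = worsened, 0 = not worsened, according to two different MCID estimates
by_anchor = [1, 1, 0, 0, 1, 0]
by_distribution = [1, 1, 0, 0, 0, 0]
print(cohen_kappa(by_anchor, by_distribution))
```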