This is a study of the construct validity and reliability of the 2MST. The research was carried out in the Adult Orthopedic Rehabilitation sector of Hospital Sarah (São Luís, MA, Brazil) from July 1, 2020 to January 30, 2021, and was approved by the institution’s research ethics committee (protocol number 3.962.645). All participants confirmed their participation by signing an informed consent form.
The sample size calculation assumed a confidence level of 0.95 and a confidence interval width of 0.30 for the intraclass correlation coefficient (ICC). The calculation was designed to detect adequate reliability (ICC = 0.75) according to the classification of Fleiss. This yielded an estimated sample size of 34 participants. To compensate for possible sample loss, a minimum sample size of 40 volunteers was adopted. The calculation followed the method described by Bonett.
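As an illustration of the sample size reasoning above, the following is a minimal sketch of one commonly used approximation for the number of participants needed to estimate an ICC with a given confidence interval width. The function name is hypothetical, and the number of measurements per participant (k = 2) is an assumption, since the text does not state the value used in the original calculation.

```python
import math
from statistics import NormalDist


def icc_sample_size(icc_planned: float, ci_width: float, k: int = 2,
                    confidence: float = 0.95) -> int:
    """Approximate n so that the CI for the ICC has the desired total width.

    icc_planned: ICC value the study is planned around (0.75 in the text).
    ci_width:    total width of the confidence interval (0.30 in the text).
    k:           measurements per participant (assumed to be 2 here).
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    numerator = 8 * z**2 * (1 - icc_planned) ** 2 * (1 + (k - 1) * icc_planned) ** 2
    denominator = k * (k - 1) * ci_width**2
    # Round up to the next whole participant.
    return math.ceil(numerator / denominator + 1)


n_required = icc_sample_size(icc_planned=0.75, ci_width=0.30, k=2)
print(n_required)
```

Under these assumptions the sketch returns 34, in line with the estimate reported above. A different value of k would give a different result.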
We included patients of both genders, aged 40 to 80 years, with a complaint of knee pain lasting more than 3 months and a diagnosis of knee OA made by an experienced orthopedist according to the criteria established by the American College of Rheumatology, with clinical evaluation and imaging. The criteria were presence of pain, presence of osteophytes and at least one of three characteristics (age over 50 years, crepitus and/or morning stiffness lasting less than 30 min). Only patients with grade 2 or 3 in the Kellgren and Lawrence classification were included in the study.
The non-inclusion criteria were: history of lower limb surgery; use of mobility aids; neurological disorder (sensory and/or motor); hip OA; use of a prosthesis or orthosis in the lower limbs; and cardiopulmonary disease or any other acute adverse health condition that could make it impossible to carry out the proposed tests. Patients who did not return for the retest within the stipulated period of 7 to 14 days were excluded.
Two physical therapist examiners, each with more than 10 years of experience, independently performed the 2MST measurements at two moments (test and retest). Each participant therefore completed a total of 4 test applications: two on the first day and two on the second day. Before data collection, the examiners underwent 1 month of training to standardize the execution of the tests.
To measure functional capacity with the 2MST, the examiner counted the maximum number of knee lifts that the individual performed in 2 min. Before the test, a mark was made on the wall at the height of the midpoint between the patella and the anterior superior iliac spine. The examiner counted the number of right knee elevations that reached this mark for patients who had pain associated with right knee OA and for patients with bilateral symptomatic knee OA. Left knee elevations were counted only in patients with symptoms exclusively in the left knee.
Two familiarization runs of 30 s each were performed before the test, with a 1-min rest interval between them. After 1 min of rest, the first examiner, standing beside the patient for safety in case of imbalance, applied the test for 2 min. The examiner gave a verbal command to start the test and verbal notices when 1 min had passed and when 30 s remained. After a 10-min rest, the second examiner performed the same procedure. The order of the examiners was defined by drawing lots before each application of the 2MST.
After a minimum interval of 7 days and a maximum of 14 days, the patients were evaluated with the 2MST again by the two examiners. The retest followed the same procedure as the test: it was performed at the same time of day, in the same environment, and without the patient having performed any type of physical exercise on the day of the assessment, in order to avoid fatigue.
All 2MST runs were recorded for review by the examiners. The tests were filmed for further analysis using an iPhone 8 (Apple, Cupertino, CA, USA) mounted on a universal telescopic tripod set at the height of the marking made on the wall. A third independent examiner counted the number of steps from the video recordings. This allowed an analysis of agreement, with video-based counting taken as the reference measure.
To determine construct validity through correlations, patients answered validated instruments, translated and adapted to Brazilian Portuguese, that are commonly used in patients diagnosed with knee OA. We used several questionnaires and scales to assess the pain of patients with knee OA within a biopsychosocial model, considering pain intensity, physical function, joint stiffness, catastrophizing and self-efficacy.
The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is a self-administered questionnaire designed specifically for individuals with knee or hip OA. It has been cross-culturally adapted and validated for Brazilian Portuguese. The questionnaire has three domains: pain, with 5 items; joint stiffness, with 2 items; and physical function, with 17 items. For each item, the patient has 5 response options (none, mild, moderate, strong, very strong). The pain domain score ranges from 0 to 20, the stiffness domain from 0 to 8 and the physical function domain from 0 to 68 points. Higher values indicate worse symptoms.
The Numerical Pain Rating Scale (NPRS) consists of a sequence of numbers from 0 to 10, where 0 represents “no pain” and 10 represents “the worst pain imaginable”. Individuals graded their pain on this scale, which is validated for Portuguese. Each patient answered the scale twice: once for pain intensity at rest and once for pain intensity during active knee movements.
The Pain-Related Catastrophizing Thoughts Scale (PCTS) was used to assess catastrophizing in relation to pain. It is composed of 9 items rated on a Likert scale from 0 to 5, anchored by the words “almost never” and “almost always”. The total score is the sum of the scores of the answered items divided by the number of items answered, so the minimum score is 0 and the maximum is 5. Higher scores indicate a greater presence of catastrophic thoughts. The scale was adapted and validated for Brazilian Portuguese.
The Pain Self-Efficacy Questionnaire (PSEQ) was developed to investigate the degree of confidence that patients with chronic pain have in their ability to perform daily activities or functions. It consists of 10 items, with response options ranging from 0 (“not at all confident”) to 6 (“completely confident”), giving a total score from 0 to 60. Higher scores indicate greater self-efficacy. This instrument is validated for Brazilian Portuguese.
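The scoring rules described for the WOMAC, PCTS and PSEQ can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the WOMAC item order (pain, then stiffness, then physical function) and the 0 to 4 coding of its response options are assumptions, and unanswered PCTS items are represented here by None.

```python
from typing import Dict, Optional, Sequence


def womac_domains(items: Sequence[int]) -> Dict[str, int]:
    """Sum 24 items scored 0-4 into pain (5), stiffness (2) and function (17)."""
    if len(items) != 24 or any(not 0 <= i <= 4 for i in items):
        raise ValueError("expected 24 items scored from 0 to 4")
    return {
        "pain": sum(items[:5]),        # range 0-20
        "stiffness": sum(items[5:7]),  # range 0-8
        "function": sum(items[7:]),    # range 0-68
    }


def pcts_score(items: Sequence[Optional[int]]) -> float:
    """Mean of the answered items (0-5 each); None marks an unanswered item."""
    answered = [i for i in items if i is not None]
    if not answered:
        raise ValueError("no items answered")
    return sum(answered) / len(answered)


def pseq_score(items: Sequence[int]) -> int:
    """Sum of 10 items scored 0-6, giving a total from 0 to 60."""
    if len(items) != 10 or any(not 0 <= i <= 6 for i in items):
        raise ValueError("expected 10 items scored from 0 to 6")
    return sum(items)
```

Dividing the PCTS sum by the number of answered items keeps the score on the 0 to 5 scale even when some items are left blank.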
To characterize the sample, quantitative data were described as mean and standard deviation (SD), and qualitative data as number and percentage. The intraclass correlation coefficient (ICC2,3) was used to determine intra- and inter-examiner reliability, together with its 95% confidence interval (CI), the standard error of measurement (SEM) and the minimal detectable difference (MDD). The ICC values were interpreted according to Fleiss: values below 0.40 indicated low reliability; between 0.40 and 0.75, moderate; between 0.75 and 0.90, high; and values greater than 0.90, excellent.
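The quantities derived from the ICC can be illustrated with a short sketch. The text does not state the formulas used, so the expressions below (SEM = SD × √(1 − ICC) and MDD = 1.96 × √2 × SEM) are the commonly used forms and are an assumption here, as is the handling of values that fall exactly on a category boundary. Function names are hypothetical.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement from the SD of the scores and the ICC."""
    return sd * math.sqrt(1 - icc)


def mdd(sem_value: float, z: float = 1.96) -> float:
    """Minimal detectable difference at the 95% confidence level."""
    return z * math.sqrt(2) * sem_value


def classify_icc(icc: float) -> str:
    """Reliability category following the cut-offs described in the text.

    Boundary values (0.40, 0.75, 0.90) are assigned by assumption.
    """
    if icc < 0.40:
        return "low"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "high"
    return "excellent"
```

For example, with an SD of 10 repetitions and an ICC of 0.84 (illustrative values, not study results), `sem(10, 0.84)` gives about 4 repetitions and `mdd` of that value gives about 11 repetitions.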
To determine construct validity, the Shapiro-Wilk normality test was initially applied. Because the data were not normally distributed, Spearman’s correlation coefficient (rho) was used to verify the magnitude of correlation between the 2MST and the NPRS, WOMAC, PCTS and PSEQ. As hypotheses for the magnitudes of correlation, we expected a correlation ≥0.50 between the 2MST and the physical function domain of the WOMAC (similar constructs) and a correlation ranging from 0.30 to 0.50 with the pain and joint stiffness domains of the WOMAC, the NPRS, the PCTS and the PSEQ (related but different constructs). Construct validity was considered adequate if at least 75% of the hypotheses defined a priori were confirmed.
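The hypothesis-testing approach above can be sketched in plain Python. This is an illustration under stated assumptions, not the analysis actually run in SPSS: the function names are hypothetical, ties receive average ranks, and the hypotheses are checked on the absolute value of rho, since a higher 2MST score and a higher (worse) questionnaire score would be expected to correlate negatively.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def hypothesis_confirmed(rho: float, lower: float, upper: float = 1.0) -> bool:
    """True when |rho| lies in the range defined a priori."""
    return lower <= abs(rho) <= upper


def hypotheses_supported(results: Sequence[bool], threshold: float = 0.75) -> bool:
    """True when the proportion of confirmed hypotheses reaches the threshold."""
    return sum(results) / len(results) >= threshold
```

With this sketch, the similar-construct hypothesis would be checked as `hypothesis_confirmed(rho, 0.50)` and each related-construct hypothesis as `hypothesis_confirmed(rho, 0.30, 0.50)`.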
The agreement between the face-to-face evaluations of the 2MST and the evaluation based on the video recording was analyzed using the Bland-Altman method, considering the 4 applications of the 2MST.
The analyses were performed in SPSS (version 17; SPSS Inc., Chicago, IL, USA), and a significance level of 5% was adopted.
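A Bland-Altman analysis of this kind reduces to the mean difference between the two counting methods (bias) and its 95% limits of agreement. The sketch below assumes the usual limits of bias ± 1.96 × SD of the differences; the function name is hypothetical and the counts are invented for illustration, not study data.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def bland_altman(method_a: Sequence[float], method_b: Sequence[float]) -> Dict[str, float]:
    """Bias and 95% limits of agreement between two paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("both methods need the same number of observations")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the paired differences
    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd_diff,
        "upper_loa": bias + 1.96 * sd_diff,
    }


# Illustrative counts only: face-to-face versus video-based counting.
in_person = [60, 72, 55, 80, 66]
video = [61, 71, 55, 82, 65]
agreement = bland_altman(in_person, video)
print(agreement)
```

A bias close to zero with narrow limits of agreement would indicate that face-to-face counting agrees closely with the video-based reference.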