In primary care, osteoarthritis (OA) is the most common diagnosis when patients over 40 years of age present with hip pain [1, 2]. A combination of radiographic signs and clinical findings is usually recommended for confirming the diagnosis. However, although approximately half of these patients demonstrate definite radiological signs of OA, radiographs are not recommended solely for confirming the diagnosis. The clinical examination is therefore of key importance. Clinical practice guidelines recommend assessing range of motion (ROM) and muscle strength when adult patients present with hip pain, and the two clinical signs documented to correlate with hip OA besides pain are reduced ROM [5–8] and reduced muscle strength [5, 8–11]. Reduced ROM is further documented as a clinical predictor of hip OA [2, 12], and in patients with mild symptomatic hip OA, specific ranges of reduced ROM are correlated with radiographic signs.
A number of studies have evaluated the reliability of ROM and muscle strength measurements in patients with hip OA and reported moderate to excellent reliability [6, 7, 14–19]. However, methodological issues raise questions about the external validity of these results. Equipment ill-suited for clinical practice has been used [7, 18], or the number of study subjects has been small, which limits the between-subject variation [6, 14, 16, 17]. In addition, inappropriate correlation coefficients have been reported [14, 15], or reliability coefficients have been reported alone without agreement parameters [15, 17, 19]. Reliability coefficients indicate a procedure's ability to discriminate between patients, whereas agreement parameters reflect the measurement error between repeated measurements [16, 17]. Therefore, when measurements are used to assess change over time, agreement parameters should be reported.
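The distinction between reliability and agreement can be illustrated with a minimal sketch. It assumes a two-way random-effects model with absolute agreement, ICC(2,1), as the reliability coefficient, and the standard error of measurement (SEM) with the derived smallest detectable change (SDC) as agreement parameters. The function name `icc_and_sem` and the ROM values are hypothetical and chosen only for illustration; they are not data or methods from this study.

```python
import math


def icc_and_sem(scores):
    """Return (ICC(2,1), SEM, SDC) for a patients-by-raters table.

    scores: list of rows, one row per patient, one value per rater.
    Assumes a complete table (every rater measured every patient).
    """
    n = len(scores)        # number of patients
    k = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-patient mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_err = ss_err / ((n - 1) * (k - 1))      # residual mean square

    # Reliability: ICC(2,1), absolute agreement, single measurement
    icc = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

    # Agreement: SEM from rater variance (floored at 0) plus residual variance
    var_rater = max((ms_cols - ms_err) / n, 0.0)
    sem = math.sqrt(var_rater + ms_err)
    sdc = 1.96 * math.sqrt(2) * sem            # smallest detectable change
    return icc, sem, sdc


# Hypothetical hip flexion ROM (degrees): identical rater errors are added to
# a homogeneous and to a heterogeneous group of four patients.
errors = [(2, -2), (-1, 3), (3, -1), (-2, 0)]
narrow = [[b + e1, b + e2] for b, (e1, e2) in zip([30, 32, 34, 36], errors)]
wide = [[b + e1, b + e2] for b, (e1, e2) in zip([10, 30, 50, 70], errors)]

icc_narrow, sem_narrow, sdc_narrow = icc_and_sem(narrow)
icc_wide, sem_wide, sdc_wide = icc_and_sem(wide)
print(f"narrow: ICC={icc_narrow:.2f}, SEM={sem_narrow:.2f}, SDC={sdc_narrow:.2f}")
print(f"wide:   ICC={icc_wide:.2f}, SEM={sem_wide:.2f}, SDC={sdc_wide:.2f}")
```

In this sketch the measurement error is the same in both groups, so the SEM and SDC are identical, while the ICC is much higher in the heterogeneous group. This is why a reliability coefficient reported alone, or estimated in a sample with little between-subject variation, says little about whether a measured change in an individual patient exceeds measurement error.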
Intra-rater reproducibility is commonly found to be higher than inter-rater reproducibility because between-rater variability is eliminated [21–23]. In clinical or research settings where only one rater performs the measurements, intra-rater reproducibility may be adequate. Inter-rater reproducibility, however, is essential when follow-up consultations on the same patient are performed by different clinicians or when clinicians have to agree on a diagnosis. Three studies have examined the inter-rater reliability of ROM measurements in patients with hip OA, but none reported agreement parameters [16, 17, 24]. One study reported the inter-rater reliability of muscle strength measurements in patients with hip OA, but it did not report agreement parameters. Only one study evaluating reproducibility among primary care clinicians has been identified.
Therefore, the primary purpose of this study was to assess the inter-rater reproducibility of passive ROM and muscle strength measurements in patients with unilateral hip OA among clinicians in both primary care and hospital secondary care. The secondary purpose was to assess the inter-rater reliability of the degree of clinical hip OA among the same clinicians based on findings of ROM and strength measurements.