As the percentage of older persons is increasing rapidly across the Western world, the prevalence of OA is expected to rise [2]. So far, OA has received little attention from clinicians and health care organisations. OA has mainly been studied clinically in selected patient populations. However, prevalence rates, course and consequences of OA in the general population are still largely unknown [7]. Worldwide, large variations exist in treatment guidelines and in the timing of joint replacement surgery [17, 18]. The EPOSA study was initiated to add to the insight into the correlates of OA and the role of differences in geography, socio-economic status, and health care policies between European countries. Since OA cannot be cured, insight into the physical, mental and social consequences of OA is important. Such consequences may be more important outcomes of treatment than disease severity. However, there is a gap in the literature on these consequences in the general population. The EPOSA study aims to bridge this gap. This information will help to improve guidelines for the treatment of OA and subsequently the quality of life of OA patients.
In this paper, we describe the harmonization procedures used in the European Project on Osteoarthritis (EPOSA). This study combines data from existing cohort studies across five European countries varying in climate, socio-economic status, lifestyle, and health care policies. Across the cohorts, different measurement instruments were used, and post harmonization algorithms were needed to merge the datasets and enable statistical analyses that allow testing of cross-country differences.
Due to the lack of standardization in the definition of OA, it was impossible to harmonize OA into a single variable. However, OA is believed to be a collection of disorders with shared features rather than a single disease entity, suggesting that it is appropriate to use several definitions of OA. Different definitions were constructed, based on three sources of information, i.e. self-report, clinical diagnosis and radiography. Higher rates of knee, hand and hip OA were found for the self-reported definitions than for the clinical and radiographic definitions. These findings suggest that the prevalence rates of OA are higher for less specific definitions (e.g. pain or self-report) than for more specific definitions (clinical judgement by a rheumatologist). The only exception is the ProVA study, in which the self-reported prevalence rates were lower than the clinical prevalence rates. This may be explained by the fact that we interpreted "possible clinical OA" as having OA, which may have led to an overestimation. If the "possible" category is interpreted as not having OA, the prevalence rates for clinical OA are lower than for self-reported OA (data not shown). In contrast to our findings, a recent review study showed that prevalence rates were higher when a radiographic OA definition was used than when symptomatic or self-reported OA definitions were used (Pereira, 2011). That review included population-based studies as well as hospital-based studies, which makes it difficult to compare its results with the results of our study. The review also showed the difficulty in drawing conclusions on pooled prevalence rates due to large differences in study design, definition of OA and measurement of OA.
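The sensitivity of a clinical prevalence rate to the coding of the "possible" category can be illustrated with a minimal sketch. The ratings below are hypothetical values chosen for illustration only; they are not EPOSA data, and the category labels "definite" and "none" are assumptions (only "possible" is taken from the text above).

```python
from typing import Sequence

# Hypothetical clinical ratings for eight participants; NOT EPOSA data.
# "definite" and "none" are assumed labels; "possible" follows the text.
ratings = ["definite", "possible", "none", "possible",
           "none", "definite", "none", "none"]


def prevalence(ratings: Sequence[str], possible_counts_as_oa: bool) -> float:
    """Proportion of participants classified as having clinical OA."""
    positive = {"definite", "possible"} if possible_counts_as_oa else {"definite"}
    cases = sum(1 for rating in ratings if rating in positive)
    return cases / len(ratings)


# Inclusive coding: "possible" is interpreted as having OA.
inclusive = prevalence(ratings, possible_counts_as_oa=True)   # 4 of 8 -> 0.5
# Strict coding: "possible" is interpreted as not having OA.
strict = prevalence(ratings, possible_counts_as_oa=False)     # 2 of 8 -> 0.25
```

Reporting both codings side by side makes explicit how much of a prevalence estimate depends on this single interpretive choice.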
A limitation of this study might be the lack of radiographic OA data in most cohorts (only available in ProVA and HCS), as radiographic OA is still seen as the standard case definition of OA in many epidemiological studies [19]. Also, the widely used American College of Rheumatology (ACR) criteria include both clinical assessment and radiography in the definition of knee, hip and hand OA [20–22]. These criteria were developed for patients who report to their doctor with pain. The EPOSA study, however, is completely population-based, including both OA patients (who do not always seek care) and healthy persons. Performing radiography in population-based studies is not common and often not feasible.
Despite the harmonization efforts, important differences remain in the interpretation of the OA definitions for each EPOSA cohort. The number of joints included in the non-specific OA definitions varied (two joints in HCS, three joints in ProVA, and all joints in LASA and Peñagrande), and the source of information varied as well. For example, clinical definitions were based on physical examinations carried out as part of the study in ProVA, on information available in medical records in LASA and Peñagrande, and on self-report of being diagnosed with OA by a physician in HCS and ActiFE; self-reported OA was based on the question "do you have OA?" in LASA and on reported pain or difficulty moving the joints in Peñagrande and ProVA. These differences in definitions hamper cross-country comparisons of prevalence rates. The interpretation of the definitions between countries is too diverse, and pre harmonization is needed to reliably study cross-national differences in prevalence rates of OA. In the literature, much effort has been devoted to developing a standard definition of OA for epidemiologic studies that includes symptoms, disability, and joint structural disease [23]. A difficulty in establishing a single definition is that, although there is some correlation between radiographic disease severity and both symptoms and disability, the relationships are not as strong as expected [24, 25]. The results of these studies suggest that OA is a collection of disorders with shared features rather than a single disease entity, resulting in different OA phenotypes. In our study we tried to harmonize different OA variables into three definitions (self-reported, clinical and radiographic OA), enabling the study of different OA phenotypes, depending on specific research questions.
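The cohort-specific sources of self-reported OA described above can be written down as explicit rules, which keeps the differences between cohorts visible in the harmonized variable. The following is a minimal sketch, not the algorithm actually used in EPOSA: the field names `has_oa`, `joint_pain` and `difficulty_moving` are hypothetical, and the handling of missing answers is an assumption.

```python
from typing import Callable, Dict, Optional

# One participant's answers; None marks a missing answer.
Record = Dict[str, Optional[bool]]


def three_valued_or(*values: Optional[bool]) -> Optional[bool]:
    """True if any answer is True; None if undecided due to missing answers; else False."""
    if any(value is True for value in values):
        return True
    if any(value is None for value in values):
        return None
    return False


def pain_or_difficulty(record: Record) -> Optional[bool]:
    """Rule for cohorts that asked about pain or difficulty moving the joints."""
    return three_valued_or(record.get("joint_pain"), record.get("difficulty_moving"))


# Field names are hypothetical; the cohort-to-rule mapping follows the text.
RULES: Dict[str, Callable[[Record], Optional[bool]]] = {
    "LASA": lambda record: record.get("has_oa"),  # "do you have OA?"
    "Peñagrande": pain_or_difficulty,
    "ProVA": pain_or_difficulty,
}


def harmonize_self_reported_oa(cohort: str, record: Record) -> Optional[bool]:
    """Return harmonized self-reported OA, or None if no rule exists for the cohort."""
    rule = RULES.get(cohort)
    return None if rule is None else rule(record)
```

Keeping the rules in one table documents that the harmonized variable does not measure the same thing in every cohort, which is the reason for caution in cross-country comparisons.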
Recently, recommendations for standardization of radiographic OA and symptomatic OA were given by researchers of the Translational Research in Europe Applied Technologies for Osteo-Arthritis (TREAT-OA) consortium, a large study on genetic and biochemical risk factors of OA [26]. Consensus was reached that radiographic knee OA should be defined with the original Kellgren & Lawrence score "definite osteophytes and possible joint space narrowing", which is in agreement with our study. Radiographic hip OA was defined as "at least definite joint space narrowing", and no consensus was reached for the definition of radiographic hand OA. It was also not possible to standardize symptomatic OA since all studies defined symptomatic OA differently. The authors suggest including pain, clinical assessment of OA as well as radiographic data in this definition [26]. Efforts in the development of knee OA definitions for use in epidemiological studies have led to the EULAR recommendations [27]; however, the information needed for these definitions (symptoms: pain, morning stiffness and functional limitations; signs: crepitus, restricted movement and bony enlargement) is not yet commonly measured in existing cohort studies. Until common definitions are available in cohort studies, post harmonization procedures, although not ideal, are the only available option for cross-cohort comparison of prevalence rates. Unfortunately, the cross-national EPOSA dataset is not well suited for research on the prevalence of OA. However, the dataset can be used to study associations between OA and risk factors or consequences in one or more cohorts that use a similar OA definition.
Post harmonization of instruments other than OA measures was also problematic. Although a common ADL limitations measure was constructed, it remains to be tested whether this measure is a valid and reliable instrument. Loss of information was especially great regarding social participation. Unfortunately, these data could not be harmonized because of heterogeneity of the social activity questionnaires across cohorts.
The measurement instruments for all variables were carefully selected by each of the cohorts, and validated instruments were used if available. However, it is unclear to what degree the harmonized variables are valid. Validating all constructed variables is not feasible. To compensate for this, we are publishing the harmonization guidelines to be transparent about the data harmonization. Future papers will discuss the effect of harmonization on the validity of the main variables they use.
Fortunately, many other variables were successfully harmonized; these include presence of other chronic diseases, anthropometric measures, physical performance, grip strength, pain, self-perceived health and hospitalization. Although the intention of the harmonization was to study the prevalence rates of OA across countries and to focus on personal and societal consequences of OA, we cannot pursue these research questions with the current harmonized OA data. On the other hand, it is entirely possible to study risk factors and consequences of OA in a subset of the cohorts (based on the research questions and OA definitions available in the cohorts) and to study other non-OA-related research questions using high-quality harmonized EPOSA data.
Post harmonization of epidemiologic studies has become more common in the past 20 years [28]. It is of great importance to address issues that arise when original data are being harmonized. When attempting post harmonization of data from existing cohort studies, the challenges described in this paper are likely to occur. Although the cohorts that participate in the EPOSA study use common data collection instruments, large differences in many variables still existed due to differences in wording and categories, differences in classifications or absence of data. When harmonization leads to too much loss of information, for example in social participation, analysis can be done per cohort. Overall estimates can be obtained by pooling the results. However, this approach hampers cross-national comparisons, and difficulty with the interpretation of the results remains. The results of this paper show the urgent need for more agreement on common data collection instruments in the design stage of cohort studies, rather than retrospective harmonization, to facilitate pooling of data and cross-national comparison.
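Pooling per-cohort results into an overall estimate can be done in several ways. The sketch below shows one common choice, fixed-effect inverse-variance weighting; this is an illustrative assumption and not necessarily the method used in EPOSA analyses. The function name `pool_fixed_effect` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def pool_fixed_effect(estimates: Sequence[float],
                      std_errors: Sequence[float]) -> Tuple[float, float]:
    """Pool per-cohort estimates using inverse-variance (fixed-effect) weights.

    Returns the pooled estimate and its standard error.
    """
    if len(estimates) == 0 or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate, and at least one cohort")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    # More precise cohorts (smaller standard error) receive more weight.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se
```

A fixed-effect model assumes that all cohorts estimate the same underlying quantity. Given the differences between cohorts described in this paper, a random-effects model may be more appropriate in practice.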
Despite these issues, the EPOSA dataset provides a unique opportunity to study various research questions in general populations of older persons across Europe. The cross-national nature of the study provides a large number of older persons in the analyses and large variation across cohorts, resulting in sufficient power to draw conclusions with respect to associations between variables. The harmonized dataset allows for analysis on the individual level, and stratified analyses allow for studying cross-national differences. In addition, direct replication of findings across countries is possible. Four of the cohorts participating in the EPOSA study provide follow-up data, enabling longitudinal data analyses.
Despite the extensive harmonization procedure, some variables, including OA, may continue to be difficult to compare across countries, and interpretation of findings may require particular attention. These aspects will be considered carefully by all involved investigators, and potential biases in the cross-national comparisons will be discussed in each future article.