Self-reported OA was measured by asking participants, “Do you have OA?” If participants answered “yes”, they were asked about the location of the OA. The sites were fingers, hand/wrist, elbows, shoulders, toes, feet, knee, hip, neck, and back. Self-reported knee, hip or hand OA was defined as present when the participant reported having OA in at least one of these sites: knee, hip or hand (fingers or hand/wrist).
Algorithms for clinical OA were based on the clinical classification criteria developed by the American College of Rheumatology (ACR). Algorithms were specified both for site-specific OA (knee, hip and hand, respectively) and for non-specific OA (any of these three joints).
The knee OA clinical diagnosis was based on both history and physical examination: pain in the knee, as evaluated by the Western Ontario and McMaster Universities OA Index (WOMAC) pain subscale score, plus any 3 of: over 50 years of age; morning stiffness lasting <30 minutes evaluated by the WOMAC stiffness subscale (score from ‘mild’ to ‘extreme’); crepitus on active motion in at least one side; bony tenderness in at least one side; bony enlargement in at least one side; no palpable warmth of synovium in both knees.
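The knee rule above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the field names are hypothetical, the pain cut-off of 3 is taken from the WOMAC pain definition given in this section, and the stiffness criterion is assumed to mean a WOMAC morning stiffness item score of 1 (‘mild’) or higher.

```python
from dataclasses import dataclass


@dataclass
class KneeExam:
    # All field names are hypothetical; they are not taken from the study database.
    womac_pain_score: int          # summed WOMAC pain subscale, 0-20
    age: int                       # years
    womac_morning_stiffness: int   # 0 (none) .. 4 (extreme)
    crepitus_any_side: bool
    bony_tenderness_any_side: bool
    bony_enlargement_any_side: bool
    warmth_either_knee: bool


def has_clinical_knee_oa(exam: KneeExam, pain_cutoff: int = 3) -> bool:
    """Knee pain plus at least 3 of the 6 additional criteria."""
    if exam.womac_pain_score < pain_cutoff:
        return False
    criteria = [
        exam.age > 50,
        exam.womac_morning_stiffness >= 1,   # assumed reading of 'mild' to 'extreme'
        exam.crepitus_any_side,
        exam.bony_tenderness_any_side,
        exam.bony_enlargement_any_side,
        not exam.warmth_either_knee,         # no palpable warmth in both knees
    ]
    return sum(criteria) >= 3
```

For example, a 70-year-old with a pain score of 5, mild morning stiffness, crepitus, and no warmth meets four criteria and is classified as having clinical knee OA.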
The hip OA clinical diagnosis was based on both history and physical examination: pain in the hip, as evaluated by the WOMAC pain subscale score, plus all of: pain associated with hip internal rotation in at least one side; morning stiffness lasting <60 minutes evaluated by the WOMAC stiffness subscale (score from ‘mild’ to ‘extreme’); and over 50 years of age.
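Unlike the knee rule, the hip rule requires every additional criterion to be met. A minimal sketch, with hypothetical parameter names and the same assumptions as for the other sites (pain cut-off of 3 on the WOMAC pain subscale; stiffness present when the WOMAC morning stiffness item is 1 or higher):

```python
def has_clinical_hip_oa(
    womac_pain_score: int,
    pain_on_internal_rotation_any_side: bool,
    womac_morning_stiffness: int,
    age: int,
    pain_cutoff: int = 3,
) -> bool:
    """Hip pain plus ALL of: rotation pain, morning stiffness, age over 50."""
    return bool(
        womac_pain_score >= pain_cutoff
        and pain_on_internal_rotation_any_side
        and womac_morning_stiffness >= 1   # assumed reading of 'mild' to 'extreme'
        and age > 50
    )
```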
The hand OA clinical diagnosis was based on both history and physical examination: pain, aching or stiffness of the hand, as evaluated by the Australian/Canadian OA Hand Index (AUSCAN) pain and stiffness subscales, plus any 2 of: hard tissue enlargement of 2 or more of the 2nd and 3rd distal interphalangeal (DIPs), 2nd and 3rd proximal interphalangeal (PIPs), 1st carpometacarpal (CMC) joints of at least one hand; hard tissue enlargement of 2 or more DIPs of at least one hand; deformity of at least 1 of the 2nd and 3rd DIPs, 2nd and 3rd PIPs, 1st CMC joints of at least one hand. Swelling of the metacarpophalangeal (MCP) joints, which is also included in the ACR classification criteria as a control to exclude rheumatoid arthritis, was only measured in the UK and Germany.
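The hand rule combines joint-level findings, so a sketch may help make the counting explicit. The joint labels (`"DIP2"`, `"CMC1"`, and so on) and function names are hypothetical, the symptom status is assumed to be computed beforehand from the AUSCAN, and each criterion is assumed to count when it is met on at least one hand. MCP swelling is left out because it was not measured in all countries.

```python
from typing import List, Set, Tuple

# The five "selected" joints named in the text (labels are hypothetical).
SELECTED_JOINTS = {"DIP2", "DIP3", "PIP2", "PIP3", "CMC1"}


def hand_criteria(enlarged: Set[str], deformed: Set[str]) -> Tuple[bool, bool, bool]:
    """Evaluate the three examination criteria for a single hand."""
    enlarged_selected = len(enlarged & SELECTED_JOINTS) >= 2
    enlarged_dips = sum(j.startswith("DIP") for j in enlarged) >= 2
    deformed_selected = len(deformed & SELECTED_JOINTS) >= 1
    return enlarged_selected, enlarged_dips, deformed_selected


def has_clinical_hand_oa(
    symptomatic: bool, hands: List[Tuple[Set[str], Set[str]]]
) -> bool:
    """Hand symptoms plus at least 2 of the 3 examination criteria.

    `hands` holds one (enlarged_joints, deformed_joints) pair per hand.
    """
    if not symptomatic:
        return False
    per_hand = [hand_criteria(enlarged, deformed) for enlarged, deformed in hands]
    # A criterion counts if it is met on at least one hand.
    met = [any(result[i] for result in per_hand) for i in range(3)]
    return sum(met) >= 2
```

Under this reading, enlargement of the 2nd and 3rd DIPs of one hand satisfies the first two criteria at once, which is enough for a symptomatic participant to be classified as having clinical hand OA.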
The WOMAC and AUSCAN are often used as indicators of pain or stiffness in the ACR classification criteria of OA. The WOMAC pain subscale contains five items with regard to pain in the knee or hip: during walking on a flat surface, descending or ascending stairs, at night in bed, when sitting or lying, when standing. The AUSCAN pain subscale contains five items relating to pain experienced performing certain hand functions (at rest, gripping, lifting, turning, and squeezing objects). Both the WOMAC and AUSCAN ask about pain experienced in the past 48 hours. The WOMAC and AUSCAN responses to the five items on pain are scaled on a five-point Likert scale ranging from none (0) to extreme pain (4). For both the WOMAC and AUSCAN, missing values were imputed according to the user manual [17, 18]. The scores were summed to get an overall pain score (range 0–20), and pain was defined by a score of 3 or more, also allowing inclusion of people with mild symptoms.
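The pain definition amounts to summing five item scores and applying a cut-off. A minimal sketch follows; the function names are hypothetical, and the imputation of missing values described in the user manuals is not reproduced here, so the code simply requires five complete item scores.

```python
from typing import Sequence


def pain_score(items: Sequence[int]) -> int:
    """Sum five pain items, each scored 0 (none) to 4 (extreme)."""
    if len(items) != 5 or any(item not in range(5) for item in items):
        raise ValueError("expected five item scores between 0 and 4")
    return sum(items)


def has_pain(items: Sequence[int], cutoff: int = 3) -> bool:
    """Pain is present when the summed score is 3 or more."""
    return pain_score(items) >= cutoff
```

A participant reporting mild pain (1) on three items and none on the other two scores 3 and therefore counts as having pain, which illustrates how the cut-off admits people with mild symptoms.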
Hip x-rays consisted of a standard AP pelvis view. Knees were imaged with AP and lateral radiographs. The former were weight bearing with the patella positioned centrally and the latter were standing or supine with attendant knee flexion. Of the 444 UK participants, 402 had knee radiographs and 394 had hip radiographs. Joints were not imaged if they had previously been replaced. Grading of radiographs was performed by two investigators independently, using the Kellgren-Lawrence scoring system.
The presence of joint replacements was assessed by asking participants if they had ever had joint replacement surgery. If participants answered “yes”, the location, year, and reason for the joint replacement were elicited.
Demographic data were collected on age, gender, education level and marital status. Education was measured by the highest level of education completed and categorized into “elementary school not completed”, “elementary school completed”, “vocational education/general secondary education”, and “college or university education”.
Marital status was assessed by asking whether the participants were single or had never been married, married or cohabiting, divorced, widowed, in a registered partnership, or living apart.
The WOMAC and AUSCAN Indices are tri-dimensional, disease-specific measures that assess the dimensions of pain, stiffness and physical function [17, 18]. The WOMAC Index consists of 24 questions (5 pain, 2 stiffness, 17 physical function) and the AUSCAN Index consists of 15 questions (5 pain, 1 stiffness, 9 physical function). A description of the WOMAC and AUSCAN pain subscales is given above (see paragraph on clinical OA). The WOMAC stiffness subscale contains two items regarding stiffness in the knee or hip: after first awakening in the morning, and after sitting, lying or resting later in the day. The AUSCAN stiffness subscale contains one item relating to stiffness in the hand after first awakening in the morning. Both the WOMAC and AUSCAN ask about stiffness experienced in the past 48 hours. The WOMAC and AUSCAN responses to the stiffness items are scaled on a five-point Likert scale ranging from none (0) to extreme stiffness (4).
The WOMAC physical function subscale contains seventeen items relating to difficulty with knee and/or hip function experienced in the previous 48 hours. The AUSCAN physical function subscale contains nine items relating to difficulty with hand function experienced in the previous 48 hours. The WOMAC and AUSCAN responses were scaled on a five-point Likert scale ranging from none (0) to extreme difficulty (4). For both the WOMAC and AUSCAN, missing values were imputed according to the user manual [17, 18]. For each WOMAC and AUSCAN dimension, subscale scores were normalized to a range of 0 to 100, so that the three subscales are equally weighted [17, 18].
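The text states only that subscale scores were normalized to 0–100. A minimal sketch, assuming a simple linear rescaling of the raw subscale sum by its maximum possible value (the function and constant names are hypothetical, and the user manuals may prescribe a different procedure):

```python
# Number of items per subscale, as given in the text.
SUBSCALE_ITEMS = {
    "womac": {"pain": 5, "stiffness": 2, "function": 17},
    "auscan": {"pain": 5, "stiffness": 1, "function": 9},
}

MAX_ITEM_SCORE = 4  # five-point Likert scale, 0 (none) to 4 (extreme)


def normalize_subscale(raw_sum: float, index: str, subscale: str) -> float:
    """Linearly rescale a raw subscale sum to the range 0-100 (assumed method)."""
    max_sum = SUBSCALE_ITEMS[index][subscale] * MAX_ITEM_SCORE
    if not 0 <= raw_sum <= max_sum:
        raise ValueError(f"raw sum must lie between 0 and {max_sum}")
    return 100.0 * raw_sum / max_sum
```

With this rescaling, a raw WOMAC physical function sum of 34 out of 68 and a raw WOMAC stiffness sum of 4 out of 8 both map to 50, which is what places the three subscales on a common footing.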
Number of chronic conditions was measured through self-reported presence of the following chronic diseases or symptoms that lasted for at least three months or diseases for which the participant had been treated or followed by a physician: chronic non-specific lung disease, cardiovascular diseases, peripheral artery diseases, stroke, diabetes, cancer, and osteoporosis. If participants answered “yes” then they were asked to specify which diseases or type. Chronic conditions were evaluated as the number of diseases and multimorbidity was defined as the occurrence of 2 or more coexisting conditions.
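The count and the multimorbidity flag can be sketched as follows. The dictionary keys are hypothetical labels for the seven conditions listed above, and a condition that is absent from the input is treated as not reported, which is an assumption rather than something stated in the text.

```python
from typing import Mapping

# Hypothetical keys for the seven self-reported conditions listed in the text.
CONDITIONS = (
    "chronic_lung_disease",
    "cardiovascular_disease",
    "peripheral_artery_disease",
    "stroke",
    "diabetes",
    "cancer",
    "osteoporosis",
)


def number_of_chronic_conditions(reported: Mapping[str, bool]) -> int:
    """Count the listed conditions the participant reported having."""
    return sum(bool(reported.get(condition, False)) for condition in CONDITIONS)


def has_multimorbidity(reported: Mapping[str, bool]) -> bool:
    """Multimorbidity: 2 or more coexisting conditions."""
    return number_of_chronic_conditions(reported) >= 2
```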
Lifestyle characteristics were measured by self-reports on smoking, alcohol and physical activity. Both current smoking status (never, former, current smoker) and smoking history (age when started smoking, age when stopped smoking) were assessed. Alcohol consumption was measured by frequency and amount over the past year. Physical activity was measured using the validated LASA Physical Activity Questionnaire (LAPAQ). Frequency and duration of activities over the past 2 weeks were asked for walking, cycling, gardening, light and heavy household work and a maximum of two sports. In order to calculate the daily activity, the frequency and duration were multiplied and subsequently divided by 14 days. A total physical activity score was calculated in minutes/day and kcal/day.
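The minutes/day calculation can be illustrated with a short sketch. The function names and the input layout are hypothetical; the kcal/day score is not attempted because the energy values per activity are not given in the text.

```python
from typing import Dict, Tuple


def minutes_per_day(times_in_two_weeks: int, minutes_per_session: float) -> float:
    """Frequency times duration, divided by the 14-day recall period."""
    return times_in_two_weeks * minutes_per_session / 14


def total_minutes_per_day(activities: Dict[str, Tuple[int, float]]) -> float:
    """Sum daily minutes over all activities.

    `activities` maps an activity name to (times in past 2 weeks, minutes per session).
    """
    return sum(minutes_per_day(freq, dur) for freq, dur in activities.values())


example = {"walking": (14, 30), "cycling": (7, 20)}
print(total_minutes_per_day(example))  # 40.0
```

Here, walking 14 times for 30 minutes contributes 30 minutes/day and cycling 7 times for 20 minutes contributes 10 minutes/day.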
The social characteristics were assessed using the Lubben’s Social Network Scale (LSNS), the Maastricht Social Participation Profile (MSPP), and the Participation scale. The six-item LSNS was developed specifically for older populations and assesses family (3 items) and friendship networks (3 items). Responses for questions measuring number of network contacts ranged from 0 (none) to 5 (nine or more). Total subscale scores were calculated ranging from 0 to 15. The MSPP measures frequency and diversity of actual social participation, and is based on definitions of social participation by older people with a chronic disease themselves. In this study two subscales of the MSPP were used: consumptive participation, such as attending an organised activity (6 items) and formal participation, such as volunteering work (3 items). The response categories range from 0 (‘not at all’) to 3 (‘more than twice a week’). A total score was calculated ranging from 0 to 21. The Participation scale (P-scale) measures participation restrictions in people with chronic (physical) impairments or disabilities. Five items were used (from the total of eighteen items), measuring the perception of participation in helping others, taking part in recreational/social activities, being socially active, visiting others in the community, and visiting public places in the neighbourhood. The scale has a two-tier question and response format. First, a participant is asked to indicate whether they participate more often (1), the same (2) or less often (3) in a particular aspect of participation compared to their peers. If people participate less often, they could indicate how great a problem this is to them, namely, no problem (1), small problem (2), medium problem (3) or large problem (5). The overall P-score is derived by summing the individual item scores. A higher score indicates a higher level of participation restriction.
Psychological characteristics and wellbeing
Cognitive function was measured by administering the Mini-Mental State Examination (MMSE) including 20 items. A total score of the MMSE was calculated ranging from 0 to 30, and cognitive impairment was defined by a score of 23 or less. Validated or translated versions of the MMSE were used in all centers [26–28].
Anxiety and depressive symptoms were evaluated by the Hospital Anxiety and Depression Scale (HADS). The HADS is a self-report questionnaire comprising 14 four-point Likert-scaled items, 7 for anxiety (HADS-A) and 7 for depression (HADS-D). The HADS measures levels of symptoms in the last week. HADS-A and HADS-D were used as categorical variables, with a cut-off of 8 or more (range 0–21) indicating the presence of anxiety or depression, respectively.
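The categorisation can be sketched as below. The function names are hypothetical, and the seven item scores per subscale are assumed to be already coded 0 to 3 in the scoring direction (any reverse-worded items handled beforehand), which is consistent with the stated range of 0–21.

```python
from typing import Sequence


def hads_subscale_score(items: Sequence[int]) -> int:
    """Sum seven HADS items, each assumed to be scored 0 to 3."""
    if len(items) != 7 or any(item not in range(4) for item in items):
        raise ValueError("expected seven item scores between 0 and 3")
    return sum(items)


def hads_case(items: Sequence[int], cutoff: int = 8) -> bool:
    """Symptoms are classed as present when the subscale score is 8 or more."""
    return hads_subscale_score(items) >= cutoff
```

The same function applies to HADS-A and HADS-D, each evaluated on its own seven items.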
Mastery was measured by means of the 7-item Pearlin Mastery Scale. The questionnaire consists of seven statements such as “I have little control over the things that happen to me.” Response categories range from 1 = strongly disagree to 5 = strongly agree. The summed items range from 7 to 35, but for ease of interpretation 7 is subtracted, so the final scale ranges from 0 to 28, with higher scores indicating more mastery.
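The rescaling is a simple shift of the summed score. A minimal sketch with a hypothetical function name, assuming the seven responses have already been coded so that a higher value means more mastery (the text does not describe how negatively worded statements were handled):

```python
from typing import Sequence


def mastery_score(items: Sequence[int]) -> int:
    """Sum seven items scored 1 to 5 and subtract 7, giving a 0-28 scale."""
    if len(items) != 7 or any(item not in range(1, 6) for item in items):
        raise ValueError("expected seven item scores between 1 and 5")
    return sum(items) - 7
```

The lowest possible responses (all 1) give 0 and the highest (all 5) give 28.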
Wellbeing was evaluated by the EuroQoL and self-rated health. The EuroQoL (EQ-5D) consists of five questions, each representing one domain: mobility, self-care, usual activities, pain/discomfort, anxiety/depression. The answer options differ between the questions, but can roughly be divided into: no problems, some problems, extreme problems. In addition, the participants were asked to assess their own health state on a visual analogue scale (EQ VAS). Self-rated health (SRH) was measured with the question: how is your health in general? Response categories are ‘very good’, ‘good’, ‘fair’, ‘bad’, or ‘very bad’.
Health care utilisation and physical environment
For health care utilisation, four types of health care services were assessed: hospitalization, primary care use, specialist services use and medication use. For hospitalization, participants were asked whether (no/yes) and how many times they had been hospitalized in the last year. For primary care use, the number of visits to the general practitioner or nurse, or visits received at home from these professionals, during the last month was assessed. Specialist care use was assessed by the number of visits in the last year to the rheumatologist, traumatologist, orthopaedic surgeon, physiotherapist, and podiatrist. Medication use was measured by asking participants which medication, prescribed by a doctor, they had used during the past two weeks. The following information was asked: the brand name of the drug, the dosage (expressed as quantity per tablet or per 100 ml, and dosage form) and the number of times used per day, per week, or as needed.
Use of home care services (formal and informal) was measured by asking whether the participant had received help at home in the last year. If the participant answered “yes”, they were asked from whom they received the help and what type of help it was (household or personal care).
Features of the physical environment were measured using the Home and Community Environment (HACE) instrument and weather sensitivity. The HACE is a standardized, self-report instrument designed to assess barriers and facilitators in several environmental domains. The following features of the neighbourhood environment from the HACE were examined: parks and walking areas that are easy to use; places to sit and rest at bus stops, in parks, or other places where people walk; and public transportation close to home. A question was also included on public facilities such as a daily supermarket, bus stop, post office, bank or community centre. Response categories were ‘a lot’, ‘some’, and ‘not at all’. When participants answered ‘a lot’ or ‘some’, they were asked whether they made use of the resources. Weather sensitivity was measured with the question: which weather condition(s) affects your pain the most? Multiple responses were allowed; the categories were damp/rainy, cold, hot, or no particular weather condition. In addition, participants were asked if they drive a car.