Data analysed in this study were derived from routine general practice in the UK. These data were sourced from a proprietary health data resource, The Health Improvement Network (THIN), a competitor to the General Practice Research Database (GPRD), a widely recognised source of UK GP data. THIN was selected because it contained more patients matching the initial selection criteria.
THIN data are collected in a non-interventional way from the daily record keeping of primary care physicians in the UK. The records are anonymised at the collection stage so that researchers have access only to an encrypted identifier for the physician's office and the patient. They provide a longitudinal medical record for each patient. Currently, the dataset consists of contributions from over 300 practices and data from approximately five million patients, of whom over 2.3 million are actively registered with the practices and can be prospectively followed. The remaining patients have historical data but have either left the practice or died. There are nearly 30 million patient years of computerised data in THIN. On average, patients have full data for six years and may have up to 15 years of observations. Data from THIN consist of four categories that detail the following: 1) subject demographic details; 2) medical history (diagnoses); 3) test results and additional health-related data such as smoking status; and 4) drug treatments. Ethical approval was granted by the Cambridge MREC.
Data were extracted for patients with a diagnosis of rheumatoid arthritis (RA) recorded at least twice on two different dates. A marker of 'newly diagnosed' was assigned if the patient had less than six months of case history prior to the first RA diagnostic code.
Of these patients, the first rheumatoid presentation (index event) was determined from the first of the following three events: either 1) occurrence of three consecutive monthly prescriptions for a non-steroidal anti-inflammatory drug (NSAID); or 2) recorded rheumatology referral, consultation or screening; or 3) an RA diagnosis. Cases were further selected by presence of at least one valid CRP measurement between the index date (the date of the index event) and study endpoint.
Finally, to be assured of following cases from their first rheumatoid presentation, a minimum wash-in period of six months between the date of first contact with the database and the index date was applied.
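The index-date and wash-in rules above can be illustrated with a minimal sketch. It is not the extraction code used in the study: the function names, the representation of each candidate event as an optional date, and the approximation of six months as 183 days are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional

# Assumption: "six months" is approximated as 183 days.
WASH_IN_DAYS = 183


def index_date(
    nsaid_date: Optional[date],
    referral_date: Optional[date],
    ra_diagnosis_date: Optional[date],
) -> Optional[date]:
    """Return the earliest of the three candidate index events, if any exist.

    nsaid_date is assumed to be the date on which the criterion of three
    consecutive monthly NSAID prescriptions was met (hypothetical input).
    """
    candidates = [
        d for d in (nsaid_date, referral_date, ra_diagnosis_date) if d is not None
    ]
    return min(candidates) if candidates else None


def passes_wash_in(first_contact: date, index: date) -> bool:
    """True if at least WASH_IN_DAYS separate first database contact and index date."""
    return (index - first_contact).days >= WASH_IN_DAYS
```

For example, a patient first seen on 1 January 2000 with an index date of 1 May 2001 would satisfy the wash-in rule under these assumptions, whereas one first seen on 1 March 2001 would not.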
The observational endpoint was the first recorded date of a major orthopaedic intervention, which included hip, knee, shoulder or elbow joint replacement, or cervical spine fusion, in common with the Early Rheumatoid Arthritis Study Group. Those who did not progress to TJR were censored at the date of last contact or death.
The key marker in this investigation, serum CRP concentration, is reported as provided in THIN. CRP observations were utilised only if they had a value >0 in order to be sure that this was a true observation and not a null value attributed 0 (zero) systematically or at data entry. Recorded total cholesterol (TC) observations were similarly treated. Mean CRP was calculated for all valid observations between index and study endpoint, as was mean body mass index (BMI) and mean blood pressure (systolic and diastolic).
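The treatment of zero or missing values described above can be sketched as follows. This is an illustrative helper only: the name `mean_valid` and the use of `None` for a missing record are assumptions, and the same filter is applied here to any measurement series for simplicity.

```python
from typing import Iterable, Optional


def mean_valid(values: Iterable[Optional[float]]) -> Optional[float]:
    """Mean of observations strictly greater than zero.

    Zero and missing (None) entries are treated as non-observations,
    mirroring the rule that only values > 0 count as true measurements.
    Returns None when no valid observation remains.
    """
    valid = [v for v in values if v is not None and v > 0]
    if not valid:
        return None
    return sum(valid) / len(valid)
```

Under this rule a series of 0, 4.0, 8.0 and a missing entry yields a mean of 6.0, since only the two positive values are counted.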
Cases where CRP change was recorded were included where at least two observations were recorded approximately one year apart (± 90 days in order to recruit sufficient cases). Given high intra-individual variability for CRP [15, 16], baseline CRP status was calculated as the average of all observations within 90 days of the first CRP observation. Similarly, the follow-up CRP level was defined as the mean of values occurring within a further 90 days of the second (1 year) observation. It was not possible to determine whether the assays were of high sensitivity; it was therefore decided to categorise CRP level according to the scientific statement recently issued by the American Heart Association and Centers for Disease Control and Prevention, a method used previously in large epidemiological studies. A low level of CRP was defined as 10 mg/L or less, whilst levels in excess of 10 mg/L were defined as high, based on published population norms. CRP change was then defined for the period as low → low (LL), low → high (LH), high → low (HL), and high → high (HH). Classification by the method used in a recent study of hospital laboratory data was not feasible in this instance as too few cases in the CRP change subgroup met the AHA criteria for 'normal' CRP (≤ 3 mg/L). Any case in the CRP change subgroup who had a recorded diagnosis of an acute ischaemic event during their CRP observation period (± 30 days) was excluded from analysis.
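The CRP change classification can be sketched as below. This is a hedged illustration rather than the study's implementation: the function name `crp_change`, the input format of (date, value) pairs, and the choice of the earliest observation falling 365 ± 90 days after the first as the "second" observation are assumptions, since the text does not state how that observation was selected when several qualified. The exclusion for acute ischaemic events is not modelled.

```python
from datetime import date
from typing import List, Optional, Tuple

HIGH_THRESHOLD = 10.0  # mg/L; values above this are "high", 10 or less is "low"
WINDOW_DAYS = 90       # averaging window and tolerance around one year
YEAR_DAYS = 365


def crp_change(observations: List[Tuple[date, Optional[float]]]) -> Optional[str]:
    """Classify CRP change as 'LL', 'LH', 'HL' or 'HH', or None if not classifiable."""
    # Keep only valid (> 0) observations, in date order.
    obs = sorted((d, v) for d, v in observations if v is not None and v > 0)
    if not obs:
        return None
    first_date = obs[0][0]

    # Baseline: mean of all values within 90 days of the first observation.
    baseline = [v for d, v in obs if (d - first_date).days <= WINDOW_DAYS]

    # Assumption: the "second" observation is the earliest one lying
    # 365 +/- 90 days after the first.
    second_dates = [
        d for d, _ in obs
        if YEAR_DAYS - WINDOW_DAYS <= (d - first_date).days <= YEAR_DAYS + WINDOW_DAYS
    ]
    if not second_dates:
        return None
    second_date = second_dates[0]

    # Follow-up: mean of values within a further 90 days of the second observation.
    follow_up = [v for d, v in obs if 0 <= (d - second_date).days <= WINDOW_DAYS]

    def level(values: List[float]) -> str:
        return "H" if sum(values) / len(values) > HIGH_THRESHOLD else "L"

    return level(baseline) + level(follow_up)
```

With baseline values of 4.0 and 8.0 mg/L (mean 6.0) and follow-up values of 20.0 and 30.0 mg/L (mean 25.0) roughly a year later, this sketch returns 'LH'.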
Survival was evaluated using Cox proportional hazards regression analysis (CPHM), using SPSS® v15, SPSS Inc, Chicago. Independent covariates of survival were tested using a manual forward inclusion method with a threshold significance of p = 0.05 as the criterion for inclusion in the final models. Time to event was measured from the date of the last reported CRP observation. Surviving cases were right-censored by their last known contact date.
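The analysis itself was run in SPSS; the sketch below only illustrates how a time-to-event record with right-censoring might be prepared for such a model. The function name, the use of days as the time unit, and the assumption that any intervention falls on or after the last CRP observation are illustrative and not taken from the study.

```python
from datetime import date
from typing import Optional, Tuple


def survival_record(
    last_crp_date: date,
    intervention_date: Optional[date],
    last_contact_or_death: date,
) -> Tuple[int, int]:
    """Return (duration_in_days, event_flag) for one case.

    event_flag is 1 when a major orthopaedic intervention was recorded and
    0 when the case is right-censored at last contact or death.
    Assumes any intervention falls on or after the last CRP observation.
    """
    if intervention_date is not None:
        return (intervention_date - last_crp_date).days, 1
    return (last_contact_or_death - last_crp_date).days, 0
```

Each case then contributes one duration and one event indicator, which is the input shape expected by Cox proportional hazards routines in most statistical packages.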