Study design
This study was carried out in several universities and companies in Beijing, China, and was approved by the Ethics Committee of Beijing Sport University (approval number: 2021070H). The translation and cross-cultural adaptation were conducted according to the procedures established by Beaton et al. [21]. After signing written informed consent, all participants independently completed the CHN-ProFitMap-neck. Subjects in the chronic neck pain (CNP) group completed the CHN-ProFitMap-neck, the VAS, the NDI and the SF-36, and provided demographic information (such as age, gender and education); subjects in the non-CNP group completed the CHN-ProFitMap-neck. After 1 week, 100 randomly chosen subjects in the CNP group completed the CHN-ProFitMap-neck again at the same time as the first test.
Translation and cultural adaptation
First, we contacted the developer of the ProFitMap-neck, Associate Professor Martin Björklund, via e-mail to obtain permission for translation and cross-cultural adaptation. The procedures established by Beaton et al. [21] were strictly followed (Fig. 1).
Stage I: forward translation (English to Chinese)
The English version of the ProFitMap-neck was translated into Chinese by two bilingual translators who were native Chinese speakers with English as a second language (Fig. 1). Translator 1 was a physiotherapist and translator 2 was a professor of English linguistics. The two Chinese versions of the ProFitMap-neck were named Translation 1 (T1) and Translation 2 (T2), respectively.
Stage II: synthesis
The two translations (T1 and T2) were synthesized, with all discrepancies discussed and resolved by the translators and the researchers of this study (Fig. 1). This process was documented, and the synthesized version was named Translation 12 (T-12).
Stage III: back translation (Chinese to English)
Two native English speakers with Chinese as a second language translated the T-12 back into English (Fig. 1). They had no background in medicine. The two English versions were named Back Translation 1 (BT1) and Back Translation 2 (BT2).
Stage IV: expert committee
The expert committee consisted of one methodologist, one health professional, one language professional, and all translators. They reviewed the original questionnaire and all translated versions, and produced a prefinal version of the CHN-ProFitMap-neck (Fig. 1).
Stage V: pilot test
Fifteen CNP patients were recruited. All participants completed the CHN-ProFitMap-neck and were interviewed about their comprehension of its items, answers, and instructions (Fig. 1). Any problem that arose in this process was documented and addressed.
Stages I to V could be repeated until no queries remained.
Participants
From September to December 2021, participants were recruited in Beijing, China, via posters and social media. Individuals aged 18 to 65 years who were able to communicate, read and write were included. Participants in the CNP group had experienced neck pain for more than 3 months. The exclusion criteria were: (1) acute neck pain, (2) a history of cervical fractures or surgery, (3) vestibular neurological or cardiovascular disease, and (4) pregnancy. The sample size should be 4 to 10 times the number of items, with a minimum of 100 [22]. In this study, the sample size was set at 5 times the number of items. Thus, 220 CNP patients and 100 subjects without neck pain were recruited.
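The sample size rule can be checked with a few lines of arithmetic. The sketch below assumes the 44 items of the ProFitMap-neck (26 symptom items and 18 functional limitation items, as described under Instruments); the function name and its defaults are illustrative and not part of the study protocol.

```python
def required_sample_size(n_items, ratio=5, minimum=100):
    """Return the larger of (ratio x number of items) and the minimum sample size.

    ratio: participants per item (the rule cited in the text allows 4 to 10).
    minimum: lower bound on the sample size (100 in the cited rule).
    """
    if not 4 <= ratio <= 10:
        raise ValueError("ratio is expected to lie between 4 and 10")
    return max(n_items * ratio, minimum)


# 26 symptom items + 18 functional limitation items
n_items = 26 + 18
print(required_sample_size(n_items))  # 220
```

With the ratio of 5 used in this study, the result matches the 220 CNP patients recruited.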
Instruments
Profile Fitness Mapping neck questionnaire (ProFitMap-neck)
The ProFitMap-neck consists of a symptom scale (26 items) and a functional limitation scale (18 items). The symptom scale involves a frequency index and an intensity index. The score in the symptom-frequency index ranges from 1 (never/very seldom) to 6 (very often/always). The score in the symptom-intensity index ranges from 7 (nothing/none at all) to 12 (almost unbearable/unbearable, all/maximally). The score in the functional limitation scale ranges from 1 (very good, no problem, very satisfying, very likely) to 6 (very bad, very difficult/impossible, very dissatisfying, very unlikely). The calculation of scores is described in the original ProFitMap-neck study [16]. The ProFitMap-neck has proved to be an instrument with good reliability and validity in the assessment of neck pain [16, 19, 20].
Visual analogue scale (VAS)
The VAS is a self-reported scale. It consists of a horizontal or vertical line (usually 100 mm) anchored by two verbal phrases at both ends to describe the pain status. One end implies “no pain”, and the other end implies “unbearable pain” [23, 24].
Neck disability index (NDI)
The NDI is one of the most widely used self-reported questionnaires to assess neck dysfunction [25]. The NDI has 10 items, including pain severity, personal care, lifting, reading, headache, concentration, working, driving, sleeping and leisure activities [26]. The score of each item ranges from 0 (no pain or dysfunction) to 5 (maximal pain or dysfunction) [25]. The Chinese version of NDI has proven to be a reliable, valid and responsive instrument to measure functional limitations in patients with neck pain [27].
The 36-item short form health survey (SF-36)
The SF-36 is a self-evaluated tool consisting of 36 items that can be divided into 8 dimensions, namely physical functioning (PF), role limitations due to physical problems (RP), bodily pain (BP), general health (GH), vitality (VT), social functioning (SF), role limitations due to emotional problems (RE) and mental health (MH), plus one single item on health transition [28]. Each item can be transformed into scores of 0 to 100, and higher scores indicate better function and health status [29]. The Chinese version of the SF-36 has been shown to have acceptable psychometric properties [28].
Statistical analyses
The statistical analyses were carried out using Statistical Package for the Social Sciences (SPSS) Version 23.0 (IBM Corp., Armonk, NY). The demographic and clinical data of the participants were described by means and standard deviations. P < 0.05 was considered significant in all statistical tests.
Measures of reliability
Internal consistency is the degree to which the items of an instrument are interrelated [30]. It was assessed by Cronbach’s α; a value of at least 0.7 was considered adequate [31]. Test-retest reliability indicates how reproducible the scores of an instrument are over time when it is used on the same patient, whose condition has not changed [30]. It was assessed by the intraclass correlation coefficient (ICC3A,1) with its 95% confidence interval, calculated in a two-way mixed effects model based on absolute agreement [32, 33]. An ICC value less than 0.5 indicates poor reliability, a value between 0.5 and 0.75 indicates moderate reliability, a value between 0.75 and 0.9 indicates good reliability and a value higher than 0.9 indicates excellent reliability [32]. The interval between repeated tests is often 1 or 2 weeks [34]. Measurement error was assessed by the standard error of measurement (SEM) and the smallest detectable change (SDC). A change in score greater than the SDC reflects “real” change beyond measurement error [34, 35].
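The reliability statistics named above can be sketched in plain Python. This is an illustration only, not the SPSS procedure used in the study. It makes two assumptions: the ICC is the single-measure, absolute-agreement form computed from two-way ANOVA mean squares, and the measurement error uses the commonly used formulas SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM. The text does not state which SEM and SDC formulas were applied, and all function names are hypothetical.

```python
import math
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha. `items` is a list of k lists, one per item,
    each holding the scores of the same n respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def icc_absolute_single(test, retest):
    """Single-measure, absolute-agreement ICC for two sessions,
    computed from two-way ANOVA mean squares."""
    rows = list(zip(test, retest))
    n, k = len(rows), 2
    grand = mean(list(test) + list(retest))
    ss_rows = k * sum((mean(r) - grand) ** 2 for r in rows)
    ss_cols = n * sum((m - grand) ** 2 for m in (mean(test), mean(retest)))
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-session mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def sem_and_sdc(sd, icc):
    """Assumed formulas: SEM = SD * sqrt(1 - ICC); SDC = 1.96 * sqrt(2) * SEM."""
    sem = sd * math.sqrt(1 - icc)
    return sem, 1.96 * math.sqrt(2) * sem


# Hypothetical data: a constant shift of 1 point between sessions lowers
# absolute agreement even though the ranking of subjects is unchanged.
print(icc_absolute_single([1, 2, 3, 4], [2, 3, 4, 5]))
print(sem_and_sdc(sd=10.0, icc=0.91))
```

The absolute-agreement form penalizes systematic differences between the two sessions, which a consistency-type ICC would ignore.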
Measures of validity
Content validity indicates the ability of an instrument to reflect the domain of interest and the conceptual definition of a construct [15]. Floor/ceiling effects can be used to describe content validity. If more than 15% of the subjects reach the lowest or highest possible score, their deterioration or improvement cannot be detected by the instrument; floor/ceiling effects are then considered present [36], indicating limited content validity [34].
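The 15% criterion can be illustrated with a short sketch. The score range of 0 to 100 and the example data are hypothetical, and the function name is illustrative.

```python
def floor_ceiling(scores, min_score, max_score, threshold=0.15):
    """Proportion of subjects at the lowest and highest possible score,
    and whether each proportion exceeds the threshold (15% by default)."""
    n = len(scores)
    floor = sum(1 for s in scores if s == min_score) / n
    ceiling = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_proportion": floor,
        "ceiling_proportion": ceiling,
        "floor_effect": floor > threshold,
        "ceiling_effect": ceiling > threshold,
    }


# Hypothetical scores on an assumed 0-100 range: 1 of 10 subjects at the minimum
result = floor_ceiling([0] + [50] * 9, min_score=0, max_score=100)
print(result["floor_effect"], result["ceiling_effect"])  # False False
```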
Construct validity indicates the ability of an instrument to measure the construct it was designed to measure [15]. In this study, construct validity was examined by exploratory factor analysis (EFA), convergent and discriminant validity, and known group validity [37]. EFA was used to explore the factor structure by principal component analysis with varimax rotation. Items with factor loadings lower than 0.3 on every factor were considered to have no significant correlation with any factor and were to be removed [38].
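The item removal rule can be sketched as a filter over a rotated loading matrix. The extraction and rotation themselves are not shown here. The item names and loading values are hypothetical, and the sketch assumes that the cutoff applies to the absolute value of each loading.

```python
def items_below_cutoff(loadings, cutoff=0.3):
    """Return the items whose absolute loading is below the cutoff on every factor.

    `loadings` maps an item name to its list of rotated factor loadings.
    """
    return [
        item
        for item, row in loadings.items()
        if max(abs(value) for value in row) < cutoff
    ]


# Hypothetical rotated loadings on two factors
example_loadings = {
    "item_01": [0.72, 0.10],
    "item_02": [0.21, -0.25],  # below 0.3 on both factors
    "item_03": [0.05, -0.64],
}
print(items_below_cutoff(example_loadings))  # ['item_02']
```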
If a “gold standard” is absent, convergent and discriminant validity can be assessed to verify the correlation between the assessed instrument and other existing and valid measures [37]. In this study, Spearman rank correlation analysis was used to calculate the correlation between the ProFitMap-neck and the NDI, the VAS, and the SF-36. A Spearman correlation value is considered small when it is between 0.1 and 0.3, moderate when it is between 0.3 and 0.5, and high when it is ≥0.5 [39]. We hypothesized moderate negative correlations between the CHN-ProFitMap-neck and the NDI and VAS, and moderate positive correlations between the CHN-ProFitMap-neck and the SF-36.
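As an illustration of the correlation analysis, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks and classifies its absolute value with the thresholds given above. The label "negligible" for values below 0.1 is an assumption, since the text does not name that range, and the example data are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def correlation_strength(rho):
    """Classify |rho|: small (0.1-0.3), moderate (0.3-0.5), high (>= 0.5)."""
    r = abs(rho)
    if r >= 0.5:
        return "high"
    if r >= 0.3:
        return "moderate"
    if r >= 0.1:
        return "small"
    return "negligible"  # assumed label; not defined in the text


# Hypothetical paired scores with a perfectly inverse ordering
rho = spearman_rho([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
print(rho, correlation_strength(rho))
```

The sign of the coefficient is interpreted separately from its strength, which is why the classification uses the absolute value.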
Known group validity indicates the ability of an instrument to discriminate between different groups [15]. If the scores of two groups differ significantly (P < 0.05) in an independent-samples t test, the instrument is able to detect the difference between the groups [37]. We hypothesized that the CHN-ProFitMap-neck scores in the CNP group would be significantly lower than those in the non-CNP group.
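The group comparison can be sketched with the pooled-variance (Student) form of the independent-samples t statistic. This sketch uses only the standard library, so the two-sided P value is a large-sample normal approximation and not the exact t-distribution value that a statistics package such as SPSS reports. That approximation, the function name and the example data are assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic, degrees of freedom, and an
    approximate two-sided P value (normal approximation)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (mean(group_a) - mean(group_b)) / se
    # Approximation only: reasonable for large samples, too small a P for small ones
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Hypothetical scores: first group lower on average than the second
t, df, p = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t, df)  # t is negative because the first group mean is lower
```

A negative t with P < 0.05 would be consistent with the hypothesis that the CNP group scores lower than the non-CNP group.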