BMC Musculoskeletal Disorders: Reliability and Validity Study of the Persian Modified Version of the MUSIC (Musculoskeletal Intervention Center) – Norrtalje Questionnaire

Background: Musculoskeletal disorders (MSDs) are a major health problem in the world. Self-reported questionnaires are a known method for estimating the prevalence of MSDs among the population. One of the studies concerning MSDs and their relation to work-related physical and psychosocial factors, as well as non-work-related factors, is the MUSIC-Norrtalje study in Sweden. In this study, the research group developed a questionnaire, which has been validated during its development process and is now considered a well-known instrument. The aim of this study is to validate the Persian version of this questionnaire.


Background
Musculoskeletal disorders (MSDs) are a major health problem in all countries [1]. They encompass a variety of conditions, including disorders of muscles, tendons and nerves. Although the underlying pathology of these conditions may differ and their diagnoses are unclear, the symptoms are often similar.
There is no "gold standard" measurement tool for estimating the prevalence of MSDs among the population. Statistics on the prevalence of work-related musculoskeletal disorders may vary from one reference source to another, primarily due to variations in outcome measures and diagnostic criteria. A common method for estimating the magnitude of the problem is self-reported data in questionnaires concerning episodes of pain [2].
The MUSIC-Norrtalje study from Sweden was performed to find health and risk factors for low back and neck/shoulder disorders in the general working population. In this study a questionnaire was designed to measure different outcomes of low back and neck/shoulder pain, as well as work-life exposure, lifestyle factors, social exposures, comorbidity, life events and psychosomatic complaints [6,7]. The development of the questionnaire and its reliability and validity have been studied and published in Sweden [8]. This questionnaire is a combination of questions and indexes tested and approved in other studies [9][10][11][12][13].
In Iran there is no epidemiological information about MSDs. However, on the basis of some informal reports and statistics it seems that these are among the most frequent work-related disorders.
To generate knowledge about MSDs and to meet the increasing demand for questionnaires in Iran, we decided to translate the MUSIC questionnaire into the Persian language and validate it in a large worker population in Iran Khodro Car Manufacturing Company.

Methods
Because of the multifactorial dimension of MSDs and the complexity of interplaying physical, psychosocial and lifestyle factors, we used the expert panel method. We established two expert groups (one in Sweden and one in Iran) from different areas including occupational medicine, epidemiology and psychology. Participants of the panel in Sweden included: one professor in occupational health (MD and PhD), one professor of psychology (PhD) and two Iranian physicians (MD and PhD student).
The panel in Iran included: two physicians, three psychologists (two at the master's and one at the PhD level), and one epidemiologist. The two groups communicated by email, and two members who were in both groups had regular meetings with the other members in Iran and Sweden.
In the first step (figure 1), based on expert panel discussion, a number of changes relating to Iranian culture, word flow and related issues were made.
The English version of the modified MUSIC questionnaire we worked on included 10 domains, each consisting of a few scales with a number of items.
In the second step, we translated the questionnaire into the Persian language. The expert panel in Iran discussed the pitfalls of the translated version and shared their results with the Swedish expert panel. Finally, in step three, both groups confirmed the translated version. Within these groups we used a consensus-building strategy that satisfied all of the members' concerns based on the disciplines they represented.
The changes made by the groups mainly concerned demographic items such as accommodation, commuting time, history of participation in the war with Iraq, and disability related to the war.
Thus the Persian version of the final questionnaire was prepared, and ready for work on its validity and reliability.

Validity detection
In the fourth step, we used the Focus Group Discussion (FGD) method to detect questionnaire face and content validity [14].
We conducted 3 discussion groups; each group consisted of 5-6 participants with different job titles including workers, office workers, expert workers (technicians and engineers) and managers.
The main objective of the group meetings was to confirm:
a- that people understood the concept of the questions.
b- that people understood the questions in the same way as the investigators did.
Each meeting lasted for 2-3 hours. One occupational physician and one psychologist were the fixed interviewers, and one psychologist or industrial nurse took it in turns to accompany them.
The focus group discussions were tape-recorded and notes were taken. The questions discussed were:
1- Which question is ambiguous in each domain, and how many participants agree with this?
2- Is the ambiguity related to the question stem, the answer, or both?
3- How can the question be changed to make it clearer?
4- Are there any other suggestions to improve the questionnaire?

Reliability detection
To detect the reliability of the questionnaire, we used the test-retest method. Forty participants, selected randomly and in proportion to their job titles and levels of education (secondary school to master's level), were asked to fill in the questionnaire twice, at a 3-week interval. Intraclass Correlation Coefficients (ICCs) for rating scales, and kappa coefficients for dichotomous answers and categorical data, were used for the analyses [15].
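As an illustration of the agreement statistic used for dichotomous and categorical answers, the following is a minimal sketch of Cohen's kappa for one test-retest item. The function name `cohen_kappa` and the yes/no answers are hypothetical and are not taken from the study data; the study's actual analysis software is not stated in the text.

```python
from collections import Counter


def cohen_kappa(test, retest):
    """Cohen's kappa for two paired lists of categorical answers."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty lists of equal length")
    n = len(test)
    # Observed agreement: share of subjects giving the same answer twice.
    p_obs = sum(a == b for a, b in zip(test, retest)) / n
    # Chance agreement, from the marginal answer frequencies on each occasion.
    m1, m2 = Counter(test), Counter(retest)
    p_exp = sum(m1[c] * m2[c] for c in set(m1) | set(m2)) / (n * n)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical yes/no answers from 10 subjects (illustration only).
first = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "yes"]
second = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "no", "yes"]
kappa = cohen_kappa(first, second)
print(round(kappa, 3))
```

In this made-up example, nine of ten subjects answer identically on both occasions, and chance agreement is 0.5, so kappa is well below the raw agreement of 0.9.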
To assess/rate the ICCs or kappa we used the following scoring system:
We analyzed all of the questions (items) in the groups referred to separately, one by one, but because of the large number of questions we report the results by domain.

Validity
In total, 16 people participated in 3 focus group discussion meetings. The MUSIC questionnaire consisted of 10 domains and 14 sub-domains (scales). The total number of items in each domain and the items where there was ambiguity are shown in table 1. Out of 297 items in all, 20 items (in 7 sub-domains) were ambiguous. All of the ambiguities were related to the stem of the questions, and the predicted answers were clear for the participants. Table 2 shows the ambiguous items and their frequency in each domain as declared by FGD participants, and the final decision made on them.
Although the stems of the following two questions (in the work-related psychosocial factors domain) were clear, 63% of participants expressed concern about answering each of them.
"-How many people have you seen being bullied during the last six months?
-Have you been subjected to bullying at the workplace during the last six months?"
Finally, 7 of 16 participants in the Focus Group Discussion meetings felt that the number of questions in the questionnaire was too large and should be reduced, and 8 (50%) did not agree to give their names on the questionnaire sheet.

Reliability
As questions about demographic and general working conditions were considered factual and stable over time, we did not determine their reliability.
In the general health, sleep and recovery domains the ICCs or kappa were more than 0.7 (acceptable).
In the musculoskeletal domain, the level of ICC or kappa was good or excellent (≥ 0.8) in all body regions.
In the physical working conditions scale, the range of ICCs or kappa related to each question varied from 0.3 to 0.9. In spite of this wide range, only one coefficient was considered unacceptable/poor.
In the psychosocial working conditions scale, the ICCs or kappa ranged from 0.2 to 0.9.
The unacceptable/poor coefficient was related to only one question.
Regarding dichotomous questions, all kappa coefficients were significant. Thus there was good agreement in test-retest answers.
In the reorganization scale, there were significant coefficients in all related items.
There were two questions in the household/spare time domain. The first one showed an acceptable reliability coefficient and the second was questionable.
In the lifestyle domain, one item in the exercise scale showed a questionable reliability coefficient.
In the psychosomatic domain, there were excellent reliability coefficients.
In the life events domain, all but one question showed a significant coefficient (Table 3).

Discussion
There is no "gold standard" measurement tool for estimating the prevalence of MSDs. Although self-reporting is usually considered a less reliable way to measure disease outcomes, MSDs are mainly self-reported conditions.
Pain is described in different ways and is of a complex nature; it is influenced by physical and psychosocial exposures, individual factors, personality and earlier experiences.
Depending on the tools used to measure outcome (sickness absence registration, physical examination), the type of complaint (self-reported pain or self-reported disabling pain), and the region of pain, the prevalence of MSDs varies widely.
Exposure assessments are another problem with regard to MSDs. For some physical and ergonomic factors, direct measurements and observations can be made. However, these are expensive and time-consuming methods. In epidemiological studies, a method that is often used for estimating the magnitude of the exposures is self-administered questionnaires. The validity and reliability of these methods have been compared in different studies [18][19][20][21][22].
For psychosocial exposures, external observations are harder to perform. The experience of the worker may be more relevant when examining the relationship between exposure and outcome. The lack of standardized exposure tools for assessing psychosocial risk factors that are relevant to work-related MSDs is considerable [23].
In Sweden, the MUSIC questionnaire has been used as a tool to study the relationship between MSDs and work-related factors. This questionnaire has been validated during its development process; it has been used in numerous publications and is considered a valid and relevant instrument.
In our study, the Persian version of the MUSIC questionnaire was developed and its validity and reliability were determined and described. It is necessary for study designers to consider features which improve subjects' reporting accuracy, including using familiar terms that are common in worksite discourse, and presenting guidelines which will help them to place their exposure in relation to that of others [24]. We used the above recommendations while translating the MUSIC questionnaire into the Persian language and in the expert panel method.
In the validity study using the Focus Group Discussion (FGD) method, we found that only 22 out of 297 questions were problematic. Of these, all but two (20 questions) had an unclear translation. For instance, after translation, the FGD participants did not understand the concept of household/spare time, so we added a description to clarify this in the translation.
The participants also mostly thought that physical activity in the workplace meant a kind of exercise. In the translated version we tried to make clear the distinction between physical exercise and physical work exposures. After this, none of the participants had any problems with the concept of the question.
Regarding company organizational culture, two questions related to work psychosocial factors (questions about bullying) were considered not to be applicable in the Persian version and were thus dropped.
Although in the original MUSIC questionnaire some questions asked about preceding exposures 20 years back in time, we restricted the time frame of the Persian version of the questionnaire to one year back in time. The reason for this was that the employees are mostly young people with short work experience and that a longer time frame, where applicable, would carry a high probability of recall bias (the results of the validity study confirmed this).
In the test-retest study, the reliability coefficient was relatively high in most items, and only 5 questions out of 297 had an ICC below 0.7 (table 3).
These 5 questions with low ICCs or kappa were dropped on the basis of expert opinion. We recommend that other investigators consider these results in their own research. The ICC is the ratio of the between-subjects variance to the total variance [25]. It is a measure of relative reliability and in some instances can produce misleadingly high levels of reliability (for example, if there is a large variance between subjects) [26]. Some researchers advise reporting the ICC together with other measures such as the SEM (standard error of measurement).
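The variance-ratio definition of the ICC, and its companion SEM, can be sketched as follows. This is an assumption-laden illustration, not the analysis used in the study: the text does not state which ICC form was computed, so a one-way random-effects ICC is assumed here, and the SEM is taken as the total standard deviation multiplied by the square root of (1 − ICC). The function name `icc_oneway` and the scores are hypothetical.

```python
import math


def icc_oneway(scores):
    """One-way random-effects ICC and SEM.

    scores: list of per-subject lists, each holding k repeated ratings
    (k = 2 for a test-retest design). Returns (icc, sem).
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 subjects with the same number (>= 2) of ratings")
    grand = sum(sum(row) for row in scores) / (n * k)
    means = [sum(row) / k for row in scores]
    # Mean squares between subjects and within subjects.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(scores, means) for x in row) / (n * (k - 1))
    # Variance components: between-subjects and total (between + within).
    var_between = (msb - msw) / k
    var_total = var_between + msw
    if var_total == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    icc = var_between / var_total
    # SEM = total SD * sqrt(1 - ICC); under this model it equals sqrt(msw).
    sem = math.sqrt(var_total) * math.sqrt(1 - icc)
    return icc, sem


# Hypothetical test and retest scores for 5 subjects (illustration only).
data = [[2, 3], [4, 4], [6, 5], [8, 8], [10, 9]]
icc, sem = icc_oneway(data)
print(round(icc, 3), round(sem, 3))
```

Because the subjects in this made-up sample differ widely from one another, the ICC is high even though several individuals change by a full scale point between occasions; the SEM, expressed in scale units, makes that absolute error visible.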
The magnitude of the kappa coefficient represents the proportion of agreement greater than that expected by chance, but other factors can also influence the magnitude of kappa, such as prevalence, bias, and non-independence of ratings [27].
The MUSIC questionnaire is an extensive questionnaire with different sections. Deleting a domain (scale) or sub-domain does not affect the validity of the questionnaire; whether to do so depends on the research group and the aims of using the questionnaire. In addition, as mentioned above, a large part of the questionnaire (129 questions) concerns different regions of the body. In practice, usually only a restricted set of body regions is studied, which decreases the number of questions.
In the general health domain of the MUSIC questionnaire there were 18 questions. To decrease the total number of questions, this domain can be substituted by the GHQ-12, whose validity and reliability have been studied by another group in Iran [28].
Memory is an unavoidable problem in re-test situations. Subjects may remember how they answered questions and attempt to reproduce those answers during the re-test. A 3-week interval between tests was chosen, in part to minimize overestimation and underestimation of reliability (due to the influence of memory or an actual change in work conditions). According to the research group and the human resources department, there were no modifications in job tasks, new interventions, organizational changes, or changes in production demands during the test-retest period.
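The influence of prevalence on kappa can be seen in a small numerical sketch. Both 2x2 test-retest tables below are hypothetical (they are not study data) and have the same observed agreement of 90%, yet the table in which almost everyone answers "yes" yields a much lower kappa. The function name `kappa_2x2` is an illustrative choice.

```python
def kappa_2x2(a, b, c, d):
    """Cohen's kappa from a 2x2 test-retest table.

    a: yes at test and yes at retest    b: yes at test, no at retest
    c: no at test, yes at retest        d: no at test and no at retest
    """
    n = a + b + c + d
    p_obs = (a + d) / n
    # Chance agreement from the row and column totals.
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)


# Same observed agreement (90 of 100), different prevalence of "yes".
balanced = kappa_2x2(45, 5, 5, 45)  # about half of the answers are "yes"
skewed = kappa_2x2(85, 5, 5, 5)     # 90% of the answers are "yes"
print(round(balanced, 3), round(skewed, 3))
```

With balanced answers the chance agreement is 0.5; with skewed answers it rises to 0.82, leaving little room for agreement beyond chance, which is why a low kappa on a rare or near-universal item does not by itself show that respondents answered inconsistently.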
In general, the results show that the Persian version of the questionnaire has a good conceptual structure and provides reliable information on workplace factors. This questionnaire could be considered a valuable and specific instrument to assess self-reported musculoskeletal pain and work-related physical and psychosocial exposures, as well as lifestyle factors.

Conclusion
In conclusion, the findings from the present study provide evidence that the Persian version of the MUSIC questionnaire is a reliable and valid instrument for measuring musculoskeletal pain and disorders, as well as work-related physical and psychosocial exposures and non-work-related factors.
The main problem was word flow in the translation and a few questions that participants were uncomfortable about answering. These problems were solved during the validity study.