Because MSDs are multifactorial and physical, psychosocial and lifestyle factors interact in complex ways, we used the expert panel method. We established two expert groups (one in Sweden and one in Iran) with members from different fields, including occupational medicine, epidemiology and psychology. The panel in Sweden included: one professor in occupational health (MD and PhD), one professor of psychology (PhD) and two Iranian physicians (MD and PhD student).
The panel in Iran included: two physicians, three psychologists (two at the master's and one at the PhD level), and one epidemiologist. The two groups communicated by email, and two members who were in both groups had regular meetings with the other members in Iran and Sweden.
In the first step (figure 1), based on the expert panel discussions, we made a number of changes relating to Iranian culture, wording and related issues.
The English version of the modified MUSIC questionnaire that we worked on included 10 domains, each consisting of a few scales with a number of items:
1- Demographic data (9 items)
2- General working conditions including extra work (14 items)
3- General health (18 items)
4- Sleep and recovery (24 items)
5- Musculoskeletal problems, with the scales of pain, disability, previous pain history, and clinical signs (5 scales, 129 items)
6- Working conditions, with the scales of physical working conditions, psychosocial working conditions and reorganization. In this domain, each scale consists of a few subscales, each with its own items. Overall, physical working conditions consists of 12 items, psychosocial working conditions of 44 items, and reorganization of 6 items.
7- Household/spare time (2 items)
8- Lifestyle factors (2 scales and 5 items)
9- Psychosomatic factors (17 items)
10- Life events (17 items)
In the second step, we translated the questionnaire into Persian. The expert panel in Iran discussed the pitfalls of the translated version and shared its results with the Swedish expert panel. Finally, in step three, both groups confirmed the translated version. Within these groups we used a consensus-building strategy that addressed all of the members' concerns from the perspective of the disciplines they represented.
The changes made by the groups mainly concerned demographic items such as accommodation, commuting time, history of participation in the war with Iraq, and disability related to the war.
The final Persian version of the questionnaire was thus prepared and ready for validity and reliability testing.
Validity detection
In the fourth step, we used the Focus Group Discussion (FGD) method to assess the face and content validity of the questionnaire [14].
We conducted 3 discussion groups; each group consisted of 5–6 participants with different job titles including workers, office workers, expert workers (technicians and engineers) and managers.
The main objective of the group meetings was to determine:
a- whether people understood the concept of the questions.
b- whether people understood the questions in the same way as the investigators did.
Each meeting lasted for 2–3 hours. One occupational physician and one psychologist were the fixed interviewers, and one psychologist or industrial nurse took it in turns to accompany them.
The focus group discussions were taped, and notes were taken. The questions discussed were:
1- Which question is ambiguous in each domain and how many participants agree with this?
2- Is ambiguity related to the question stem, or answer, or both?
3- How can the above question be changed to make it clearer?
4- Are there any other suggestions to improve the questionnaire?
Reliability detection
To assess the reliability of the questionnaire, we used the test-retest method. Forty participants, selected randomly and in proportion to their job titles and levels of education (secondary school to master's level), were asked to fill in the questionnaire at a 3-week interval. Intraclass Correlation Coefficients (ICCs) for rating-scale items, and kappa coefficients for dichotomous answers and categorical data, were used for the analyses [15].
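As an illustration of the two statistics named above, the following is a minimal sketch rather than the analysis code used in the study. It assumes Cohen's kappa for categorical answers and, because the text does not state which ICC form was used, the two-way random-effects, absolute-agreement, single-measure form often written ICC(2,1). The function names and the example answers are hypothetical.

```python
from collections import Counter


def cohen_kappa(test, retest):
    """Cohen's kappa for one categorical item answered on two occasions."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty answer lists of equal length")
    n = len(test)
    # Observed agreement: share of participants giving the same answer twice.
    p_observed = sum(a == b for a, b in zip(test, retest)) / n
    # Chance agreement from the marginal answer frequencies of each occasion.
    freq_test, freq_retest = Counter(test), Counter(retest)
    p_chance = sum(freq_test[c] * freq_retest[c] for c in freq_test) / (n * n)
    if p_chance == 1.0:
        return 1.0  # everyone gave the same single answer on both occasions
    return (p_observed - p_chance) / (1.0 - p_chance)


def icc_two_way_single(scores):
    """ICC(2,1) (assumed form): rows are participants, columns are occasions.

    Undefined (division by zero) when every score in the table is identical.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: participants, occasions, total.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical answers of ten participants to one yes/no item.
first = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
second = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "yes"]
kappa = cohen_kappa(first, second)

# Hypothetical rating-scale scores: (test, retest) per participant.
ratings = [(4, 4), (2, 3), (5, 5), (1, 2), (3, 3)]
icc = icc_two_way_single(ratings)
```

In this sketch each item is analyzed on its own, with one pair of answer lists per item, so item-level coefficients can later be summarized by domain.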
To rate the ICCs and kappa coefficients we used the following scoring system:
>0.9 excellent
>0.8 good
>0.7 acceptable
>0.6 questionable
>0.5 poor
<0.5 unacceptable [16, 17]
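The scoring system above can be written as a small lookup. This is a sketch with a hypothetical function name; since the listed thresholds do not say how a value exactly equal to a cut-off is rated, the code assumes that such a value falls into the lower band (so 0.5 itself is rated unacceptable).

```python
def rate_coefficient(value):
    """Map an ICC or kappa value to the verbal rating of the scoring system."""
    # Lower bounds, checked from the highest band downwards.
    bands = [
        (0.9, "excellent"),
        (0.8, "good"),
        (0.7, "acceptable"),
        (0.6, "questionable"),
        (0.5, "poor"),
    ]
    for lower_bound, label in bands:
        # Strict inequality: a value equal to a cut-off drops to the next
        # band (an assumption, not stated in the text).
        if value > lower_bound:
            return label
    return "unacceptable"


# Hypothetical coefficients for three items.
labels = [rate_coefficient(v) for v in (0.93, 0.74, 0.41)]
```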
We analyzed every question (item) in the groups referred to above separately, one by one, but because of the large number of questions we report the results by domain.