ABSTRACT
Objective
The American Orthopedic Foot and Ankle Society (AOFAS) Midfoot scale is one of the most popular outcome measures for evaluating midfoot pathologies. We aimed to obtain a valid and reliable Turkish translation of the AOFAS Midfoot scale.
Methods
Fifty-seven patients with midfoot pathologies were included; their mean age was 38.47±12.54 years. To appraise construct validity, correlations were calculated with the visual analog scale (VAS), the Turkish version of the Foot and Ankle Ability Measure (FAAM), and the 12-item Short Form Health Survey (SF-12).
Results
The AOFAS Midfoot-Turkish scale had adequate internal consistency (α=0.75) and test-retest reliability [intraclass correlation coefficient (ICC)2,1=0.86 for function, and ICC2,1=0.95 for total score]. The AOFAS Midfoot-Turkish scale total score had a moderate to strong correlation with VAS-activity, FAAM-ADL, FAAM-Sports, and PCS-12 (rho=-0.69, p=0.001; rho=0.88, p=0.001; r=0.86, p=0.001; and r=0.68, p=0.001, respectively). The lowest correlation was found between the AOFAS Midfoot-Turkish and the MCS-12 (rho=0.37, p=0.004).
Conclusion
The Turkish version of the AOFAS Midfoot scale is a reliable and valid outcome measurement instrument that can be used to evaluate Turkish-speaking individuals with various midfoot pathologies, especially Lisfranc injuries.
INTRODUCTION
Foot and ankle injuries are the most common musculoskeletal disorders that greatly affect patients’ quality of life (QoL) and functionality (1). Tarsometatarsal (Lisfranc) joint injuries are relatively rare (9-14 per 100,000 person-years), and missed diagnosis and inadequate treatment are common (2).
Many scales have been developed for academic or clinical purposes (3). The American Orthopedic Foot and Ankle Society (AOFAS) score is an extensively used clinical outcome measure designed specifically to assess the foot and ankle (4). It has four anatomical subdivisions, ordered from proximal to distal: ankle-hindfoot, midfoot, hallux, and lesser toes (5). It consists of objective and subjective items measuring pain, functionality, and alignment (5). The scales have been used for 30 years with unabated frequency (4).
The original scales are difficult to administer to non-English speakers. Different parts of the scales have been translated into various languages (6-9). Turkish versions of the AOFAS Hindfoot and Forefoot scales have been published (10, 11). In this study, we aimed to present a reliable, valid, cross-culturally adapted Turkish version of the AOFAS Midfoot Scale.
METHODS
Procedure
Fifty-seven patients with midfoot injuries who presented to the orthopedic clinic between July 2021 and July 2022 and met the inclusion criteria were included in the study. Informed consent was obtained from the participants. The local Bakırköy Dr. Sadi Konuk Training and Research Hospital Ethics Committee approved the study (decision no: 2021-13-02, date: 05.07.2021). The study was registered at ClinicalTrials.gov (NCT05246488).
The eligibility and exclusion criteria are presented in Figure 1. The sociodemographic and medical data of the patients were recorded. The sample of 57 patients exceeded the recommendation of at least five patients per item, corresponding to 8.14 patients for each of the seven items on the AOFAS Midfoot scale (12).
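The sample-size reasoning above is simple arithmetic and can be checked with a minimal sketch. The function name `patients_per_item` and the variable names are illustrative assumptions, not part of the study's analysis.

```python
def patients_per_item(n_patients: int, n_items: int) -> float:
    """Return the number of patients available per scale item."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return n_patients / n_items


# Values taken from the text: 57 patients, 7 items, minimum of 5 per item.
ratio = patients_per_item(57, 7)
meets_recommendation = ratio >= 5

print(f"{ratio:.2f} patients per item; recommendation met: {meets_recommendation}")
```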
Measures
AOFAS Midfoot Scale
The seven-item AOFAS Midfoot scale is a questionnaire specifically designed for the midfoot. The scale consists of three subsections (pain, function, and alignment), which account for 40, 45, and 15 points, respectively. The total score ranges from 0 to 100, with higher scores indicating better results (5).
Visual Analog Scale (VAS)
The VAS was used to evaluate pain subjectively. Patients used a 10-cm line, which ranged from no pain (0) to the most severe pain (10), to rate their pain levels at rest, during activity, and at night. The score was obtained by measuring the position of the patient's mark along the line with a ruler (13).
Foot and Ankle Ability Measure (FAAM)
The FAAM is a patient-reported outcome measure (PROM) used to evaluate region-specific physical function. The questionnaire is divided into two subscales: activities of daily living (FAAM-ADL, 21 items) and sports (FAAM-Sports, 8 items). Each item is rated on a 5-point Likert scale ranging from “none at all” to “unable to do”. The item scores for the FAAM-ADL subscale, which ranges from 0 to 84, and the FAAM-Sports subscale, which ranges from 0 to 32, were converted into percentage scores. Higher scores represent higher function (14).
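The conversion of raw FAAM subscale scores to percentages can be illustrated with a minimal sketch. It assumes every item was answered, so the maximum is the full subscale range given above (84 for FAAM-ADL, 32 for FAAM-Sports); handling of unanswered items is not covered. The names `FAAM_MAX` and `faam_percent` are hypothetical.

```python
# Maximum raw scores per subscale, as stated in the text.
FAAM_MAX = {"ADL": 84, "Sports": 32}


def faam_percent(raw_score: float, subscale: str) -> float:
    """Convert a raw FAAM subscale score to a percentage (0-100).

    Assumption: all items were answered, so the full maximum applies.
    """
    max_score = FAAM_MAX[subscale]
    if not 0 <= raw_score <= max_score:
        raise ValueError(f"raw_score must be between 0 and {max_score}")
    return 100.0 * raw_score / max_score


# Hypothetical raw scores, for illustration only.
adl_pct = faam_percent(63, "ADL")
sports_pct = faam_percent(16, "Sports")
print(adl_pct, sports_pct)
```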
Short Form-12 Health Survey (SF-12)
The SF-12 is a shorter form of the SF-36 questionnaire that evaluates perceived health-related QoL. The questionnaire comprises 12 items, with seven items focusing on the physical components (PCS-12) and five items addressing the mental components (MCS-12). For each component, scores range from 0 to 100, and higher scores indicate better QoL (15).
Study Protocol
Dr. Harold Kitaoka granted permission to translate the scale into Turkish. The cross-cultural adaptation was conducted in five phases following the Beaton guidelines (16). In the first phase, two translators translated the scale into Turkish. Both were native Turkish speakers and physiotherapists with 8 years of experience who acted as blinded, unbiased researchers. In the second phase, a bilingual individual compared and reviewed both translations. In the third phase, the Turkish version was back-translated into English by two native English speakers who also had a strong command of Turkish. In the fourth phase, a committee of four translators compared the back-translated version of the AOFAS Midfoot scale with the original English version. During the translation process, the translators noted that the term “blocks” is not used to describe distance in everyday Turkish. Akbaba et al. (10) replaced the term “blocks” with the phrase “200 meters” and included duration in the Turkish translation of the AOFAS ankle-hindfoot scale. Following this approach, “blocks” was replaced with “200 meters” and walking duration was added to the scale. The pre-final version of the AOFAS Midfoot Turkish (AOFAS Midfoot-T) scale was then prepared for field testing. In the final phase, the pre-final version was given to thirty eligible patients with midfoot injuries (Figure 2). After completing the form, patients were interviewed about any challenging questions or unfamiliar terminology.
Reliability was assessed in terms of measurement error, internal consistency, and test-retest reliability. Construct validity was evaluated using hypothesis testing, measuring the degree of correlation between the AOFAS Midfoot-T and VAS scores, as well as the Turkish versions of FAAM, PCS-12, and MCS-12. The first hypothesis was that the total AOFAS Midfoot-T score would have a strong positive correlation (correlation coefficient of 0.70 or greater) with the FAAM score because they measure similar constructs. Second, the total AOFAS Midfoot-T score was expected to have a moderate negative correlation (absolute correlation coefficient between 0.50 and 0.70) with VAS scores because they measure related but dissimilar constructs. Third, the total AOFAS Midfoot-T score was predicted to have a moderate positive correlation (correlation coefficient between 0.50 and 0.70) with the PCS-12 score because they measure related but dissimilar constructs. Lastly, the total AOFAS Midfoot-T score was anticipated to have a weak positive correlation (correlation coefficient between 0.30 and 0.50) with the MCS-12 score because they measure unrelated constructs.
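The four a priori hypotheses can be expressed as correlation bands and checked against the coefficients reported in the Results. The sketch below is illustrative only: the function name `classify`, the treatment of boundary values (0.50 counted as moderate, 0.70 as strong), and the "negligible" label for coefficients below 0.30 are assumptions, not the authors' procedure.

```python
def classify(r: float) -> str:
    """Label a correlation by strength and direction using the study's bands.

    Boundary handling is an assumption: 0.30 <= |r| < 0.50 weak,
    0.50 <= |r| < 0.70 moderate, |r| >= 0.70 strong.
    """
    magnitude = abs(r)
    if magnitude >= 0.70:
        strength = "strong"
    elif magnitude >= 0.50:
        strength = "moderate"
    elif magnitude >= 0.30:
        strength = "weak"
    else:
        strength = "negligible"  # label not used in the paper
    direction = "positive" if r >= 0 else "negative"
    return f"{strength} {direction}"


# (expected band, coefficient reported in the Results section)
hypotheses = {
    "FAAM-ADL": ("strong positive", 0.88),
    "FAAM-Sports": ("strong positive", 0.86),
    "VAS-rest": ("moderate negative", -0.60),
    "VAS-activity": ("moderate negative", -0.69),
    "VAS-night": ("moderate negative", -0.57),
    "PCS-12": ("moderate positive", 0.68),
    "MCS-12": ("weak positive", 0.37),
}

supported = {name: classify(r) == expected for name, (expected, r) in hypotheses.items()}
for name, ok in supported.items():
    print(f"{name}: {'supported' if ok else 'not supported'}")
```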
All patients completed the VAS, the validated Turkish versions of the FAAM and SF-12, and the AOFAS Midfoot-T scale. All patients successfully completed the subjective component of the AOFAS Midfoot-T scale. The clinician evaluated the quantitative component of the AOFAS Midfoot-T scale. The second assessment, in which patients completed the AOFAS Midfoot-T scale again, was performed within a week of the first evaluation to determine the test-retest reliability of the translated form. No intervention was administered during this period, to reduce the likelihood of clinical change. The reliability analysis was restricted to patients who indicated “no clinical change”.
Statistical Analysis
All statistical analyses were conducted using the Statistical Package for the Social Sciences version 20.0 (SPSS Inc., Chicago, IL, USA). The statistical significance level was p<0.05. The ICC was computed to assess test-retest reliability. An ICC exceeding 0.75 was deemed excellent (17). The internal consistency of the AOFAS Midfoot-T scale was assessed by calculating Cronbach’s alpha (α) coefficient for the initial completion of the scale. An α value ranging from 0.70 to 0.95 was considered to indicate acceptable reliability (17). The measurement error was evaluated using the standard error of measurement (SEM), calculated by multiplying the standard deviation of the scores by the square root of (1-ICC). The MDC95 was determined by multiplying the SEM by 1.96 and then by the square root of 2. Construct validity was investigated by testing the predetermined hypotheses using the Pearson correlation coefficient. The correlation strength was classified as weak when it was less than 0.50, moderate when it was between 0.50 and 0.70, and strong when it was greater than 0.70 (18). Floor (score 0-10) and ceiling (score 90-100) effects at the initial completion of the form were evaluated as the percentage of all patients who scored in the lowest or highest range of the questionnaire. A floor or ceiling effect was considered present when this percentage exceeded 15% (19).
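The SEM, MDC95, and floor/ceiling calculations described above can be sketched in a few lines. This is an illustration of the stated formulas, not the SPSS procedure used in the study; the function names and the example scores are hypothetical.

```python
import math
from typing import Sequence


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value: float) -> float:
    """Minimal detectable change at 95% confidence: SEM * 1.96 * sqrt(2)."""
    return sem_value * 1.96 * math.sqrt(2.0)


def floor_ceiling(scores: Sequence[float], floor_max: float = 10,
                  ceiling_min: float = 90, threshold_pct: float = 15.0):
    """Return (floor %, ceiling %, floor flagged, ceiling flagged).

    Floor is a score of 0-10 and ceiling a score of 90-100, as in the text;
    an effect is flagged when the percentage exceeds the 15% threshold.
    """
    n = len(scores)
    floor_pct = 100.0 * sum(s <= floor_max for s in scores) / n
    ceiling_pct = 100.0 * sum(s >= ceiling_min for s in scores) / n
    return floor_pct, ceiling_pct, floor_pct > threshold_pct, ceiling_pct > threshold_pct


# Hypothetical inputs, for illustration only (not study data).
example_sem = sem(sd=10.0, icc=0.75)
example_mdc = mdc95(example_sem)
example_scores = [95, 100, 92, 60, 75, 88, 40, 90, 55, 70]
floor_pct, ceiling_pct, has_floor, has_ceiling = floor_ceiling(example_scores)

print(round(example_sem, 2), round(example_mdc, 2))
print(floor_pct, ceiling_pct, has_floor, has_ceiling)
```

Applying `mdc95` to the reported function-subscale SEM of 2.24 gives approximately 6.2, in line with the reported MDC95 up to rounding.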
RESULTS
Translation and Cross-cultural Adaptation
There were no difficulties in the forward and backward translation, and the Turkish version was consistent with the original scale. However, the term “blocks” is not used to indicate distance in Turkish; thus, “blocks” was replaced with “200 meters” and walking duration was added to the scale. Preliminary tests indicated that patients understood all questions correctly. The time required to complete the AOFAS Midfoot-T scale was approximately 10-15 minutes. Fifty-seven patients with a mean age of 38.47±12.54 years participated in the first and second assessments. The sociodemographic and medical characteristics of the patients are presented in Table 1.
Reliability
The Turkish version’s internal consistency was adequate for the first administration, with an α of 0.75. Cronbach’s α for the function subscale was 0.84 for the first administration of the Turkish translation. The means and standard deviations for the first and second administrations of the Turkish version are given in Table 2. The ICC2,1 was 0.86 (0.76-0.91) for the function subscale and 0.95 (0.94-0.97) for the total score. The SEM and MDC95 were 2.24 and 6.20 for the function subscale and 8.20 and 22.66 for the total score of the AOFAS Midfoot-T scale, respectively.
Validity
The correlations of the AOFAS Midfoot-T total score with the FAAM-ADL and FAAM-Sports (r=0.88, p=0.001 and r=0.86, p=0.001, respectively; Hypothesis-1) met the a priori criterion of a strong positive relationship. In addition, the a priori criterion of a negative correlation was met for the VAS-rest, VAS-activity, and VAS-night (r=-0.60, p=0.001; r=-0.69, p=0.001; and r=-0.57, p=0.001, respectively; Hypothesis-2). There was a moderate positive correlation between the AOFAS Midfoot-T total score and PCS-12 (r=0.68, p=0.001; Hypothesis-3). There was a weak positive correlation between the AOFAS Midfoot-T total score and the MCS-12 subscale (r=0.37, p=0.004; Hypothesis-4) (Table 3). All hypotheses (100%) were supported, indicating good construct validity. The floor and ceiling effects and the total number of questions answered were the same in the test and retest examinations. In the first administration of the AOFAS Midfoot-T scale, the floor and ceiling effects were 0% and 47%, respectively.
DISCUSSION
This study aimed to present a culturally adapted, reliable, and valid Turkish translation of the AOFAS Midfoot scale for use in the evaluation of Turkish-speaking individuals with midfoot pathologies. The AOFAS Midfoot-T was found to have adequate test-retest reliability (ICC=0.95), internal consistency (Cronbach’s α coefficient=0.75), and validity. According to the current findings, the AOFAS Midfoot-T scale demonstrates a ceiling effect (47%) but no floor effect, and the MDC95 value for the total score of the translated version was 22.66. Changes smaller than this MDC95 value between consecutive administrations of the Turkish form may reflect measurement error rather than a real change in foot function.
The internal consistency results were consistent with those of the Persian version of the AOFAS Midfoot scale (α=0.75) (9) and of a study in patients with Lisfranc injury (α=0.75) (20). However, the α value was not specified in the original study (5). The ICC indicated excellent test-retest reliability between measurements administered 5 to 7 days apart for the Turkish version (ICC2,1=0.95). Similar to the present study, the Persian version of the scale (ICC2,1=0.96) showed excellent test-retest reliability (9). However, Ponkilainen et al. (20) did not specify test-retest reliability. In the study that presented the translations of all subscales together, the ICCs of the Arabic AOFAS Midfoot scale ranged from 0.405 to 0.542, and good structural validity was reported (8).
In the present study, the MDC95 values were 6.20 for the function subscale and 22.66 for the total score of the AOFAS Midfoot-T. Because the MDC95 was not calculated in other studies in the literature, the value for the AOFAS Midfoot-T scale could not be compared (5, 8, 9). On the other hand, a ceiling effect was confirmed both for the AOFAS Midfoot-T scale (47%) and in the study conducted in patients with Lisfranc injury (28%) (20).
Region-specific patient-reported outcome measures (PROMs), such as the FAAM (14), the VAS foot and ankle (21), and the European Foot and Ankle Society Score (22), may have psychometric properties that compensate for the uncertainty and loss of reliability that arise when the AOFAS scales are used alone to evaluate the results of foot and ankle pathologies.
The Patient-reported Outcomes Measurement Information System (PROMIS) is a series of person-focused measurements that evaluate and track health status. PROMs assess individuals’ QoL, functionality, and perception of their own health, thereby providing important clinical and scientific information (23). Consequently, orthopedic communities are increasingly using PROMIS. Although traditional imaging and physical examination findings are a priority for clinicians, they may not reflect patient satisfaction and functionality. There is a great need for reliable PROMs translated into multiple languages (24).
Richter et al. (21) noted that regardless of the popularity of the AOFAS scores, the scoring was not validated, resulting in problematic evaluation material in cases of incomplete responses to the survey. Malviya et al. (25) stated that the evaluations had limited accuracy due to insufficient response options for each component. Guyton exposed these theoretical limitations with statistical evidence (26).
Hunt and Hurwit (27) noted that among the different outcome measurement instruments they reviewed in the foot and ankle clinical literature, the AOFAS scales remained highly used compared with other validated scales. They also emphasized that although a change in philosophy is needed in the use of reliable scales, the most valid scale should be preferred in clinical practice (27). As we have emphasized, although there is a consensus that the AOFAS scales should be used less or should not be used alone, there is no consensus on which scale to use instead or in combination. Since the AOFAS subscales remain popular, we present the Turkish version in order to obtain clinically and academically reliable results.
Although the use of different PROMs is encouraged and the limitations of the use of AOFAS scores have been mentioned, a recent article advocated the use of completely patient-reported AOFAS. The completely patient-reported Dutch version of the AOFAS scale showed sufficient construct validity, internal consistency, test-retest reliability, and responsiveness and was suitable for use in research settings (28).
Despite all the aforementioned limitations, the AOFAS scoring system has been used in more than half of the studies examining midfoot injuries in recent years (29). Moreover, the AOFAS Midfoot score has been preferred as the primary scoring system in many studies on midfoot injuries (30).
The strength of the study is that the most common outcome scale for midfoot pathologies was translated into Turkish, with a sufficient sample including both genders and a wide range of pathologies (e.g. Lisfranc injury, navicular bone fracture, midfoot arthritis). The main criticisms of the original AOFAS scoring, such as not being validated, not containing sufficient response options, small changes in the answers causing a large difference in the total score, and the drawbacks of using the scales alone, are also the main limitations of our study.
CONCLUSION
The Turkish version of the AOFAS Midfoot Scale is semantically and linguistically sufficient to evaluate patient-reported outcomes both clinically and scientifically for Turkish-speaking individuals with all developmental or traumatic midfoot pathologies, especially Lisfranc injuries. Although the AOFAS scales are long-established and widely used, the PROMIS scales, which offer a holistic evaluation by combining objective and subjective patient assessments and are supported by consensus, should be preferred.