Authors

  1. Guo, Jia-Wen PhD, RN

Article Content

 

* Google Translate, a free, online, machine translation tool, has been studied in clinical and research settings for facilitating linguistic translation, and has potential to support translation of survey instruments.

 

* Google Translate provides a good starting point for translating study instruments even though it does not adequately address cultural aspects of translation.

 

A survey instrument that maintains its validity after translation is essential for cross-cultural and international collaborative researchers to compare outcomes of health interventions across countries, ethnicities, or racial groups. Hence, there is a great need to translate survey instruments into the language of the culture being studied. The development of a new instrument for a specific purpose and population is a complex and time-consuming process. Translating a well-developed instrument into another language can save the time and effort of developing a separate instrument.

 

CHALLENGES IN INSTRUMENT TRANSLATION

Instrument translation involves more than ensuring formal linguistic equivalence. Ensuring cultural equivalency of the original instrument1 complicates the translation process and is a challenge in survey instrument translation. First, an original survey instrument can be difficult to translate into another language with lexical equivalency. Most instruments have not been developed with the intent of future translatability; hence, language- and culture-related content in the original survey instrument may not be applicable to another language or culture. Second, sentences with a high word count and complex sentence structure in one language may be difficult to translate into another language. According to Brislin,2 an English sentence of fewer than 16 words with a simple grammatical structure is more easily translatable into another language. Nevertheless, ease of translation may vary depending on the grammatical characteristics of the other languages. Third, an easy-to-read sentence in one language does not guarantee that it will be similarly easy to read when translated into another language because of variations in syntax. Thus, ease of reading is not always consistent across languages for sentences of the same meaning. Therefore, it is strongly recommended to have well-qualified translators for forward and backward translations to ensure language equivalence during the translation process.1-4 This leads to the next common challenge: the difficulty of finding experienced and well-qualified translators within a limited project time.5 Qualified translators are crucial to high-quality translations, but they may not be easy to find. As such, fewer highly qualified translators may be more valuable than multiple less-qualified translators. Finally, the cost and time for multiple forward and backward translations pose a concern for research projects operating within limited budgets and time frames.

 

POTENTIAL OF USING A MACHINE TRANSLATOR, GOOGLE TRANSLATE, FOR TRANSLATING A SURVEY INSTRUMENT

Machine translation is an automatic way of translating from one language to another in real time by using statistical and linguistic knowledge-based models. Google Translate (Google, Mountain View, CA),6 one of the most popular services, is a free, online, machine translation tool supporting 90 languages and has the potential to facilitate instrument translation. Used appropriately, this tool may significantly reduce human effort and resource expenditure associated with translations and is able to generate translation results from one language to another quickly. However, there are several known issues with using Google Translate for research purposes.

 

First, low-accuracy translation may result. Google Translate often treats the original text as a sequence of words or phrases, and the output may generate the translated words or phrases in the same order as the original language,7-9 which may not be meaningful in the target language. Second, because of the phrase-based statistical model that Google Translate uses,10 an original sentence with a few words and a simple grammatical structure may result in better translation quality.7 Yet, little is known about which languages are more suitable for this sort of program. Moreover, the type of document being translated and the context in which words are used may influence the quality of translation11 because a word can carry multiple meanings depending on the context. For example, the word "blue" can mean a specific color or refer to a negative emotional mood state. Google Translate may not be able to translate the sentence, "I am blue," according to the appropriate context of use, which can result in an incorrect translation output.

 

Although these concerns exist, clinicians and researchers remain interested in how Google Translate can be useful in healthcare delivery and research.12-16 For example, Google Translate has been used to facilitate medical communications, such as treatment decisions, when there were no human translators available.14 However, it is rarely discussed how Google Translate can be used in survey instrument translation. Survey instruments tend to have a less complex sentence structure for readability and may be simpler to translate than journal articles. Google Translate can potentially be used for survey instrument translation, but little is known about the quality of documents translated by this program.

 

Mandarin, the official language in China and Taiwan, is the most widely used and commonly spoken language in the world17 and the third most popular language in the United States.18 Although most Mandarin used in China and Taiwan is similar, certain terms are not semantically equivalent. For example, the legume known as "peanut" in English is "" in China, whereas a "peanut" is referred to as "" in Taiwan. Conversely, the characters "" that mean "peanut" in Taiwan mean "potato" to people from China. In Taiwan, a "potato" is called "." These culture-specific differences add complexity to the process of translation. Google Translate translates "peanut" as "" and translates "" as "potato," implying that the program may have greater use for people from China.

 

Mandarin has been one of the most challenging languages for Google Translate, and the results are frequently cited as having an accuracy rate of less than 50%.12-14,16 Hence, the purpose of this case study is to assess the translation quality of Google Translate for a survey instrument translated from English to Mandarin.

 

DESCRIPTION OF THE TESTING MATERIAL

The Pain Care Quality Survey (PainCQ), with two subsurveys, the PainCQ-Interdisciplinary Care Survey and the PainCQ-Nursing Care Survey, is a validated survey for measuring the quality of pain care from the patient's perspective across interdisciplinary care and nursing care.19 The instrument was developed in the United States and published in US English.19,20 The PainCQ is a 33-item survey that includes 36 sentences (three items have two sentences).19 The readability, word counts, and grammatical complexity of the PainCQ items were analyzed.

 

Readability of each item was determined by calculating the Flesch Reading Ease score, using MS Word software (version 2013) (Microsoft, Redmond, WA), with scores ranging from 0 (low readability) to 100 (high readability). Written items earning a Flesch Reading Ease score of 65 or greater are considered to be consistent with plain English.21 For the mean (SD) of the PainCQ items, the Flesch Reading Ease score was 67.41% (22.22%; range, 26.40%-100%), which correlates to a reading level of between eighth and ninth grades. The word count per item ranged between 4 and 28 words, and the mean (SD) of the word count per item was 12.15 (5.75). The grammatical complexity of items was assessed by the number of clauses (Table 1); when an item did not contain a clause, it was scored 1, and when it included one clause, it was scored 2. An item that contained more than one clause was scored 3. A higher score indicates a higher grammatical complexity. The grammatical complexity of the PainCQ items had a mean (SD) of 1.52 (0.67), with three items that scored 3 and 11 that scored 2. This evaluation was performed by a native English speaker with a doctoral degree in psychology.
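The Flesch Reading Ease score used above is a fixed formula over average sentence length and average syllables per word. As a rough illustration, here is a minimal Python sketch; the vowel-group syllable counter is a crude assumption (MS Word's counter is proprietary), so scores will only approximate Word's output:

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels (y included).
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Flesch (1948): 206.835 - 1.015*(words/sentence) - 84.6*(syllables/word)
    return (206.835
            - 1.015 * len(words) / len(sentences)
            - 84.6 * syllables / len(words))

item = "My nurses asked me about my pain."  # a PainCQ-style item
score = flesch_reading_ease(item)
```

By the article's threshold, a score of 65 or greater would be considered consistent with plain English; the short item above scores well over that cutoff.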

  
Table 1. Evaluation Criteria for Grammatical Complexity, Intelligibility of Translation, Structural Accuracy, and Usefulness of Translation

PROCEDURES OF USING GOOGLE TRANSLATE

There were five steps in the use of Google Translate for this study:

 

Step 1: Visit Google Translate via https://translate.google.com/.

 

Step 2: Select English as the source language in the left window and Chinese (Traditional) as the target language in the right window (Figure 1).

  
Figure 1. Screenshot of using Google Translate for the English-to-Mandarin task (screenshot taken June 15, 2016).

Step 3: Paste all 33 English-version items into the left window for the source language.

 

Step 4: Check the translation results from the right window.

 

Step 5: Copy the translation outcome from the right window and paste it into an MS Word document.

 

 

The author followed the steps above to translate the PainCQ from English to Mandarin with Google Translate on November 12, 2013. The entire process took less than 3 minutes for all 33 PainCQ items. The overall impression was that Google Translate was fairly easy to access and use.
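The manual steps above can also be scripted. The sketch below uses `googletrans`, an unofficial third-party Python wrapper around the translate.google.com endpoint (an assumption: it is not a Google-supported API and may break without notice); the helper functions mirror steps 3 and 5, batching all items into one request and splitting the output back into items:

```python
# Hypothetical sketch: batch survey items through Google Translate.
def batch_items(items):
    # Step 3: paste all items at once, one per line, so a single
    # request covers the whole instrument.
    return "\n".join(items)

def split_output(translated: str):
    # Step 5: recover one translated line per original item.
    return translated.split("\n")

def translate_items(items, src="en", dest="zh-tw"):
    # `googletrans` is an unofficial library (assumption); requires network.
    from googletrans import Translator
    result = Translator().translate(batch_items(items), src=src, dest=dest)
    return split_output(result.text)
```

Line-per-item batching keeps the mapping between source and translated items unambiguous, which matters when pasting an entire 33-item instrument at once.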

 

FEEDBACK FROM A REVIEW OF OUTPUT FROM GOOGLE TRANSLATE

Three bilingual nurse researchers assessed the output of Google Translate in translating the PainCQ survey. All three had experience using translated study instruments or translating survey instruments from English to Mandarin. To properly assess the Mandarin output of Google Translate, the nurse researchers were recruited from both geographical regions: two from China and one from Taiwan.

 

The nurse researchers were first asked to translate the PainCQ items from English to Mandarin by themselves to familiarize themselves with the testing material, the PainCQ survey. Then, they were given the translated PainCQ items generated by Google Translate to review and compare the translation outcomes. Finally, an interview was conducted to gather feedback from the nurse researchers. During the interview, they were asked about the time they spent translating the PainCQ, the process of translation, and, most importantly, their opinion of the quality of the translation produced by Google Translate. Each interview lasted approximately 20 to 30 minutes, and notes were taken during the interview. The notes were summarized according to common themes that emerged from the data and are discussed in the following sections.

 

Human Translation Is Better Than Google Translate

Unsurprisingly, human translations were of much better quality than those generated by Google Translate. The Google Translate results were not satisfactory because the meaning of the translated output was often incomplete or lost.

 

One English Term Can Be Translated in Various Ways Into Mandarin

The nurse researchers pointed out that the word "pain" can be represented by multiple Mandarin terms. For example, the characters "" specifically indicate physical discomfort, while "" indicates mental or physical suffering. Although both terms represent the concept of pain, they are two different constructs easily discernible by human translators. When using Google Translate, the physical concept of pain, "," is produced when only the word itself is queried; however, when the word "pain" appears in a sentence, the other translation, "," is produced. Therefore, the translation result for the PainCQ was not fully satisfactory because the key concept, pain, was incorrectly interpreted.

 

Questions With a High Readability Level Could Be Difficult to Translate

Some items with simple grammar, few words, and a high English readability level were difficult for Google Translate to translate properly. For example, Google Translate rendered the item "My nurses asked me about my pain" into Mandarin with the same word order as the English: "" The verb "ask" translates into the Mandarin character "," which implies "an action of asking." The output in this example implies that the nurses only asked about pain, losing the original statement's implication of the nurse's obligation to monitor patient care with the potential for intervention. Hence, in this example, it was more appropriate to use the Mandarin term for "care" () instead of "ask" () to convey what the original English item intended.

 

The Passive Voice in English Sentences Could Be Difficult to Translate

The passive voice is commonly used in both English and Mandarin, but Google Translate did not always translate the passive voice sentences correctly. For example, "My pain was controlled" was translated as "my pain control" () by Google Translate. This output is an incomplete sentence in Mandarin and did not make sense to Mandarin Chinese speakers.

 

The Period (Punctuation) Matters in Google Translate

The same English sentence with and without a period yields two different outputs from Google Translate. Using the previous example, for the phrase "My pain was controlled." (with a period), the output was "", which translates as "my pain control." However, "My pain was controlled" (without the period) was translated as "" or "my suffering was controlled." Surprisingly, the output for the version without a period was syntactically correct in Mandarin. The passive voice was also rendered appropriately, but the translation of "pain" () in this output was not preferable, as it means "suffering." The term "," which means physical discomfort, is more appropriate for this context. This discovery suggests that users can input sentences with and without periods to explore options from the outputs of Google Translate.

 

Verb Tense Could Be Difficult to Translate

Unlike English, Mandarin verbs have only one tense; that is, verbs do not change form to show tense. A Mandarin sentence without a time designation implies the present tense. In Mandarin, a word or phrase relating to time (eg, now, yesterday, after 3 days) is added in the sentence to determine tense. Therefore, if an English sentence with a past tense verb does not contain a time designation, Google Translate cannot translate it into Mandarin with appropriate tense forms. For example, it translates both "I love you." and "I loved you." as "" in Mandarin. The English version of the PainCQ was constructed in past tense; however, Google Translate produced only the present tense.

 

Same Term May Be Used Differently Between Geographical Regions

In written form, people from China use simplified Chinese characters, while those from Taiwan and Hong Kong use traditional Chinese. The major difference between simplified Chinese and traditional Chinese is the number of strokes in the same character. For example, "learning" in simplified Chinese is written "" with six strokes and in traditional Chinese as "" with 16 strokes. Nevertheless, some characters are common to simplified Chinese and traditional Chinese: for example, "person" is written "" with two strokes in both. Google Translate offers two options for Mandarin translation, "Chinese (Simplified)" and "Chinese (Traditional)"; however, the translation output does not take into consideration cultural variations in language use.

 

Although the forms of Mandarin used in China and Taiwan vary,22 the output of Google Translate is more intuitive for Mandarin speakers from China. For example, Google Translate translated "follow-up" as "," which is commonly used in China but not in Taiwan, where people use "" instead. Both "" and "" convey the meaning of monitoring a treatment or intervention by further observing an individual or patient. However, people from China usually use "" in the context of secretly tracking or following someone. Mandarin speakers from both geographical regions understand that both terms indicate "follow-up," but people from Taiwan are more familiar with "," while those from China are used to "" when talking about following up on medical care.

 

Potential of Using Google Translate in Survey Instrument Translation

Although most of the Google Translate output was not satisfactory, the nurse researchers in this study strongly suggested using it as a starting point for translators.

 

Translating an instrument is time consuming, although most question items are fairly short sentences compared with those in other documents such as journal articles. The three nurse researchers took approximately 1 to 3 hours to translate the 33 PainCQ items to a satisfactory level of quality. Generally, they spent approximately 30 to 45 minutes generating the first translation draft of the PainCQ and then spent additional time improving the translation quality and making sure the outcomes matched the original English items. Therefore, using Google Translate to generate the first draft can reduce the time and cost of manually translating a survey instrument.

 

COMPARISON OF TRANSLATION QUALITY OF GOOGLE TRANSLATE IN TERMS OF READABILITY, WORD COUNTS, AND GRAMMATICAL COMPLEXITY

To further examine Google Translate outputs for the survey instrument translated from English to Mandarin, the relationships between translation quality and the readability level, word count, and grammatical complexity of each item were investigated. Because people from different geographical regions may speak and use Mandarin differently,22 two nurse researchers fluent in English and Mandarin, one from China and one from Taiwan, were purposively recruited to evaluate the quality of the Google Translate translations.

 

Measurements of Google Translate Translation Quality

The aspects of intelligibility, structural accuracy, and usefulness of translation were analyzed to measure translation quality. Following Anazawa et al's23,24 operational definitions, intelligibility of translation reflects whether the translation result can be understood and whether the meaning of the translated item is close to that of the target language (Mandarin); structural accuracy reflects the accuracy of the sentence structure in the target language; and usefulness of translation reflects whether the translation result is useful for the purpose of the task, which was to facilitate the translator's work. All three measurements of translation quality were reliable, with an interrater reliability (IRR) of greater than 0.75.23,24 The scores ranged from 1 (not intelligible at all, no structural accuracy, or not useful at all) to 5 (very intelligible, accurate translation, or very useful). A higher score indicates better translation quality.

 

Data analysis was conducted using IBM SPSS Statistics version 22 (IBM, Armonk, NY). Descriptive statistics were used to describe the characteristics of the continuous variables (ie, intelligibility, structural accuracy, and usefulness of translation; Flesch Reading Ease score; and word counts). The means of intelligibility, structural accuracy, and usefulness of translation from the two evaluators were used as the translation quality scores. Pearson r and Spearman ρ were used to assess the correlations between translation quality and readability level (measured by the Flesch Reading Ease test) and word count. Independent-sample t tests and Mann-Whitney U tests were used to assess differences in translation quality by grammatical complexity, which was recoded into two groups: no clause included (simple grammar) and at least one clause included (more complex grammar).
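For readers without SPSS, the two correlation coefficients used here are straightforward to compute. A minimal pure-Python sketch follows; the example values are hypothetical, not the study's data, and the rank function ignores ties for brevity:

```python
from statistics import mean

def pearson_r(x, y):
    # Pearson r: covariance divided by the product of standard deviations.
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    # Spearman rho: Pearson r computed on the ranks of the values.
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = float(rank)  # note: ties not averaged in this sketch
        return r
    return pearson_r(ranks(x), ranks(y))

# Hypothetical item-level values (word count vs. mean quality rating).
word_counts = [4, 8, 12, 16, 20, 28]
quality = [3.5, 3.0, 2.5, 3.0, 2.0, 2.5]
r = pearson_r(word_counts, quality)
```

A weak, nonsignificant r, as reported in the study, would indicate that longer items were not systematically harder for Google Translate.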

 

Translation Quality of Google Translate Not Related to Readability, Word Count, or Grammatical Complexity

The mean values of the intelligibility of translation, structural accuracy, and usefulness of translation, as calculated from the two raters, ranged between 2.59 and 3.11 (Table 2). By the criteria in Table 1, most PainCQ items were rated as "either only partially or somewhat intelligible," as containing "either one or two inaccurate translations," and as "neither useful nor not useful." The unsatisfactory results may stem from cultural and linguistic variations between the languages. For example, Google Translate could not correctly translate past-tense sentences from English to Mandarin because verbs do not change tense in Mandarin. Therefore, human-translator review is required to avoid misinterpretation or loss of information. These findings are similar to those of previous studies, which found that the translation outputs of Google Translate were inadequate.11,12,14-16

  
Table 2. Correlation Between the Quality of Translation and Word Counts and Readability

The correlations between translation quality and both readability and word count across the PainCQ items were weak (-0.27 and 0.31, respectively) and not statistically significant. Regarding grammatical complexity, there were approximately even numbers of items with no clauses (n = 16, 48.48%) and items with one clause or more (n = 17, 51.52%). No statistically significant difference in Google Translate translation quality was observed between these two groups (Table 3).

  
Table 3. Comparison of the Quality of Translation by Grammatical Complexity

Half of the PainCQ items had a simple grammatical structure, and the mean (SD) word count per item was 12.15 (5.75), which suggested that Google Translate might be expected to perform well in translating the PainCQ items from English to another language. Although previous publications indicated that Google Translate performed better on high-readability sentences, which often contain fewer words (<16) and a simple grammatical structure,7,9 this study found that the readability, word count, and grammatical complexity of the PainCQ items did not influence the translation quality in the Google Translate English-to-Mandarin task. That is, a sentence with high readability, fewer words, and less complexity may not always be easier to translate from one language to another. This finding was confirmed by the interview results.

 

Interrater Agreement and Interrater Reliability

To evaluate interrater agreement (IRA) and IRR, two evaluators individually reviewed all 33 Mandarin versions of the PainCQ items generated by Google Translate and rated them on intelligibility, structural accuracy, and usefulness of translation (Table 1). The IRA and IRR were evaluated based on the level of agreement between the two evaluators. According to Gisev et al,25(p331) IRA indices relate to the extent to which different raters assign the same precise value for each item being rated, and IRR indices relate to the extent to which raters can consistently distinguish between different items on a measurement scale. That is, IRA is sensitive to variation in rating scores, and IRR is sensitive to ranking order.25 A high IRA is indicated when individual evaluators provide identical rating scores for each subject or question; a high IRR is indicated when individual evaluators provide the same ranking order for each subject or question.

 

In this study, κ and the intraclass correlation coefficient (ICC) were used as IRA/IRR indices. Values of κ range between -1 and +1, where 1 indicates perfect agreement. Interpretation of κ values follows Landis and Koch26: values greater than 0.8 indicate almost perfect agreement, greater than 0.6 to 0.8 substantial agreement, greater than 0.4 to 0.6 moderate agreement, greater than 0.2 to 0.4 fair agreement, greater than 0.0 to 0.2 slight agreement, and less than 0.0 poor agreement. Values of the ICC range between 0 and 1, where 1 indicates perfect reliability. The following benchmarks suggested by Fleiss27 were used to interpret the ICC values: greater than 0.75 indicates excellent reliability, 0.40 to 0.75 fair to good reliability, and less than 0.40 poor reliability. κ values and ICCs were calculated using SPSS.
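For two raters, Cohen's kappa is simple to compute from the observed agreement and the agreement expected by chance. A minimal pure-Python sketch (illustrative only; the study computed its indices in SPSS):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    # Observed agreement: proportion of items with identical ratings.
    n = len(rater_a)
    po = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance-expected agreement from each rater's marginal distribution.
    ca, cb = Counter(rater_a), Counter(rater_b)
    pe = sum(ca[k] * cb[k] for k in ca) / (n * n)
    # Kappa: agreement beyond chance, scaled to the maximum possible.
    return (po - pe) / (1 - pe)

# Hypothetical 1-5 ratings for a few items from two raters.
rater_china = [3, 4, 2, 3, 5, 4]
rater_taiwan = [3, 3, 2, 2, 4, 4]
kappa = cohens_kappa(rater_china, rater_taiwan)
```

Because kappa demands exact score matches, two raters who rank items identically but use slightly different parts of the scale (as the China- and Taiwan-based evaluators did here) can show only moderate kappa alongside an excellent ICC.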

 

Results of Interrater Agreement and Interrater Reliability

The ICC ranged from 0.89 to 0.94, which indicated excellent reliability; the κ values ranged from 0.47 to 0.55, which indicated moderate agreement (Table 4). The moderate agreement could be due to the use of evaluators from different geographical regions. Perhaps because the output of Google Translate is more consistent with the Mandarin used in China, the evaluator from China generally scored intelligibility, structural accuracy, and usefulness of translation slightly higher than the Taiwanese evaluator did (Table 4). This finding echoes the earlier feedback on translating "follow-up," for which Google Translate favored the term commonly used in China but not in Taiwan.

  
Table 4. Interrater Reliability and Agreement

LIMITATIONS

This case study had several limitations. First, the method of assessing translation quality (intelligibility of translation, structural accuracy, and usefulness of translation) was subjective, especially in assessing the intelligibility of translation and usefulness of translation. For example, the interpretation of "only partially intelligible," "somewhat intelligible," and "almost intelligible" may be subjective for raters when evaluating intelligibility of the translation. Despite this, the measurements showed a good IRR greater than 0.75, as discussed in previous studies,23,24 suggesting that the raters were consistent in applying these scores.

 

Second, only a few bilingual nurse researchers were involved in this study because few qualified translators who met the criteria were available. However, the evaluators represented the two largest geographic regions that use Mandarin, China and Taiwan, and were able to provide effective feedback about the output of Google Translate. Triangulating the methods of assessing translation quality also helps to overcome this limitation. Because different geographical regions may use Mandarin differently, it is particularly important to include translators from both China and Taiwan in this type of study. Finally, the IRA/IRR indices in this study were acceptable.

 

Third, the testing material focuses on pain care, which limits the generalizability of the results to other topics. However, the interview feedback and the translation quality measurements may still generalize to English-to-Mandarin translation tasks, highlighting issues such as the lack of verb tense in Mandarin.

 

Finally, Google Translate allows users to submit more appropriate translations, which are then used to improve the program's translation quality. Its output may therefore have improved since the study was conducted, which limits the replicability of these results.

 

CONCLUSIONS

The findings from this case study show that Google Translate cannot replace human translators. The translation quality of Google Translate was not sufficient for translating survey instruments without human correction. Although not fully satisfactory, Google Translate can still facilitate the process of translating study instruments by quickly generating a translation draft. A firm understanding of the cultural equivalency of a language (eg, Mandarin, English, Spanish) is essential for quality in a translated survey instrument, and Google Translate does not currently provide output with such cultural equivalency. For future studies using Google Translate to translate survey instruments, researchers should take the translation process into account during instrument development to facilitate cross-cultural use and should allow adequate time to engage appropriate translators to correct the outputs generated by Google Translate.

 

Acknowledgment

The author gratefully acknowledges the following individuals for their contributions to this manuscript: Rumei Yang, MS; Meihong Ding, MS; Erin P. Johnson, PhD; and Susan L. Beck, PhD.

 

References

 

1. Beck CT, Bernal H, Froman RD. Methods to document semantic equivalence of a translated scale. Res Nurs Health. 2003;26(1): 64-73. [Context Link]

 

2. Brislin RW. The wording and translation of research instruments. In: Lonner WJ, Berry JW, eds. Field Methods in Cross-Cultural Research. Thousand Oaks, CA: Sage Publications, Inc; 1986: 137-164. [Context Link]

 

3. Sidani S, Guruge S, Miranda J, Ford-Gilboe M, Varcoe C. Cultural adaptation and translation of measures: an integrated method. Res Nurs Health. 2010;33(2): 133-143. [Context Link]

 

4. Santos HP Jr., Black AM, Sandelowski M. Timing of translation in cross-language qualitative research. Qual Health Res. 2015;25(1): 134-144. [Context Link]

 

5. Jones PS, Lee JW, Phillips LR, Zhang XE, Jaceldo KB. An adaptation of Brislin's translation model for cross-cultural research. Nurs Res. 2001;50(5): 300-304. [Context Link]

 

6. Google Translate. Your world. Now with 90 languages. https://translate.google.com/about/intl/en_ALL/languages.html. Accessed July 15, 2015. [Context Link]

 

7. Shen E. Comparison of online machine translation tools. http://www.tcworld.info/e-magazine/translation-and-localization/article/comparis. Accessed July 14, 2015. [Context Link]

 

8. Google Translate. Wikipedia. https://en.wikipedia.org/wiki/Google_Translate#cite_note-Google-1. Accessed July 16, 2015. [Context Link]

 

9. Li H, Graesser AC, Cai Z. Comparison of Google translation with human translation. Paper presented at: The Twenty-Seventh International Flairs Conference; 2014. [Context Link]

 

10. Koehn P, Och FJ, Marcu D. Statistical phrase-based translation. Paper presented at: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1; 2003. [Context Link]

 

11. Anazawa R, Ishikawa H, Park MJ, Kiuchi T. Online machine translation use with nursing literature: evaluation method and usability. Comput Inform Nurs. 2013;31(2): 59-65. [Context Link]

 

12. Balk EM, Chung M, Hadar N, et al. Accuracy of Data Extraction of Non-English Language Trials With Google Translate. Rockville, MD: 2012. [Context Link]

 

13. Taylor RM, Crichton N, Moult B, Gibson F. A prospective observational study of machine translation software to overcome the challenge of including ethnic diversity in healthcare research. Nurs Open. 2015;2(1): 14-23. [Context Link]

 

14. Patil S, Davies P. Use of Google Translate in medical communication: evaluation of accuracy. BMJ. 2014;349: g7392. [Context Link]

 

15. Borner N, Sponholz S, Konig K, Brodkorb S, Buhrer C, Roehr CC. Google translate is not sufficient to overcome language barriers in neonatal medicine. Klin Padiatr. 2013;225(7): 413-417. [Context Link]

 

16. Balk EM, Chung M, Chen ML, Trikalinos TA, Kong Win Chang L. AHRQ Methods for Effective Health Care. Assessing the Accuracy of Google Translate to Allow Data Extraction From Trials Published in Non-English Languages. Rockville, MD: Agency for Healthcare Research and Quality (US); 2013. [Context Link]

 

17. Wikipedia. List of languages by total number of speakers. https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers. Accessed July 31, 2015. [Context Link]

 

18. Hoeffel E, Rastogi S, Kim M, Shahid HT. Overview of Race and Hispanic Origin: 2010. 2010 Census Briefs. Washington, DC: United States Census Bureau; 2011. [Context Link]

 

19. Beck SL, Towsley GL, Pett MA, et al. Initial psychometric properties of the Pain Care Quality Survey (PainCQ). J Pain. 2010;11(12): 1311-1319. [Context Link]

 

20. Pett MA, Beck SL, Guo J, et al. Confirmatory factor analysis of the pain care quality surveys (PainCQ(c)). Health Serv Res. 2013;48(3): 1018-1038. [Context Link]

 

21. Flesch R. A new readability yardstick. J Appl Psychol. 1948;32(3): 221. [Context Link]

 

22. Zhang J, McBride-Chang C. Diversity in Chinese literacy acquisition. Writ Syst Res. 2011;3: 87-102. [Context Link]

 

23. Anazawa R, Ishikawa H, Takahiro K. Evaluation of online machine translation by nursing users. Comput Inform Nurs. 2013;31(8): 382-387. [Context Link]

 

24. Anazawa R, Ishikawa H, Park MJ, Kiuchi T. Preliminary study of online machine translation use of nursing literature: quality evaluation and perceived usability. BMC Res Notes. 2012;5: 635. [Context Link]

 

25. Gisev N, Bell JS, Chen TF. Interrater agreement and interrater reliability: key concepts, approaches, and applications. Res Social Adm Pharm. 2013;9(3): 330-338. [Context Link]

 

26. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33: 159-174. [Context Link]

 

27. Fleiss JL. Reliability of measurement. In: The Design and Analysis of Clinical Experiments. New York, NY: John Wiley & Sons, Inc; 1999: 1-32. [Context Link]