ORIGINAL RESEARCH article

Front. Psychol., 31 March 2022

Sec. Psychology of Language

Volume 13 - 2022 | https://doi.org/10.3389/fpsyg.2022.872025

Assessing Students’ Translation Competence: Integrating China’s Standards of English With Cognitive Diagnostic Assessment Approaches

  • 1. School of English Studies, Shanghai International Studies University, Shanghai, China

  • 2. School of Education, Shanghai International Studies University, Shanghai, China

Abstract

While translation competence assessment has played an increasingly facilitative role in translation teaching and learning, it has still failed to offer fine-grained diagnostic feedback grounded in reliable translation competence standards. As such, this study investigated the feasibility of providing diagnostic information about students’ translation competence by integrating China’s Standards of English (CSE) with cognitive diagnostic assessment (CDA) approaches. Under the descriptive parameter framework of the CSE translation scales, an attribute pool was established, from which seven attributes were identified based on students’ and experts’ think-aloud protocols. A checklist comprising 20 descriptors was developed from the CSE translation scales, with which 458 students’ translation responses were rated by five experts. In addition, a Q-matrix was constructed by seven experts. By comparing the diagnostic performance of four widely used cognitive diagnostic models (CDMs), the linear logistic model (LLM) was selected as the optimal model to generate fine-grained information about students’ translation strengths and weaknesses. Furthermore, relationships among translation competence attributes were uncovered, and diagnostic results were shown to differ across high- and low-proficiency groups. The findings can provide insights for translation teaching, learning and assessment.

Introduction

In translation studies, translation competence assessment is a widely discussed topic that has gained continuous scholarly attention (Adab, 2000; Orozco and Hurtado Albir, 2002; Way, 2008; Angelelli, 2009; Galán-Mañas and Hurtado Albir, 2015; Hurtado Albir, 2015). One remarkable advance over the past decades is that assessment is no longer merely a means of measurement; it has also served as a tool to facilitate translation teaching and learning. For example, the traditional practice of providing only a single total score has been supplemented with, or sometimes even replaced by, assessment instruments such as rubrics or scales that guide students’ autonomous learning or teachers’ feedback-giving (e.g., Presas, 2012; Galán-Mañas, 2016). However, on the one hand, most of these instruments, if not all, were self-made and lacked rigorous development and validation procedures. On the other hand, neither form of assessment could convey fine-grained diagnostic information about students’ translation competence.

Even though quite a few language proficiency scales have emerged since the last century, notably the Interagency Language Roundtable Scale (ILR), the Canadian Language Benchmarks (CLB), the Common European Framework of Reference for Languages (CEFR), and the American Council on the Teaching of Foreign Languages (ACTFL) guidelines, only the ILR included translation competence descriptors, and these were primarily designed for government language service management rather than for translation teaching and learning. Presumably because of the growing frequency of international exchanges and cooperation, recent years have begun to witness the emerging role of translation competence in language proficiency scales. For example, the Council of Europe added translation competence descriptors to the new version of the CEFR (Council of Europe, 2018). Also, China’s Standards of English (CSE; National Education Examinations Authority, 2018), the Chinese counterpart to the CEFR, included translation competence scales as one of its major constituents. Unlike the self-made instruments in previous studies, these two scales have undergone rigorous development processes and are therefore of higher quality for evaluating students’ translation competence. Take the CSE for example: guided by Bachman and Palmer’s (1996, 2010) communicative language ability (CLA) model while situated in the Chinese English teaching and learning context, the CSE project team spent more than 3 years collecting and validating descriptors from 160,000 Chinese EFL learners and teachers nationwide.

The second problem, through no fault of teachers’ own, lies in the inadequacy of previous measurement approaches for producing information on students’ mastery of each translation sub-competence or sub-skill. Hurtado Albir and Pavani (2018) tried to design a multidimensional assessment scheme to provide data about students’ different translation sub-competences; however, their proposal involved more than 20 tasks (one task per sub-competence), making it time-consuming and therefore unrealistic to implement in most assessment settings. Hence, a more practical approach is called for, one that can delve into specific information about students’ translation competence while using a moderate number of assessment tasks. Fortunately, a newer measurement paradigm, cognitive diagnostic assessment (CDA), meets such needs. Different from traditional measurement theories such as classical test theory (CTT) or item response theory (IRT), CDA can provide diagnostic feedback through fine-grained reporting of students’ skill mastery profiles (Tatsuoka, 1983; Shohamy, 1992; DiBello et al., 1995) “with a limited number of items/tasks available in a test form” (Sawaki et al., 2009, p. 192).

Given that scales such as the CSE can overcome the shortcomings of previous assessment instruments by providing reliable and valid translation competence descriptors, while CDA can outperform traditional measurement approaches in generating fine-grained information about a given competence with comparatively few tasks, it is our assumption that integrating the two is likely to provide solutions to the aforementioned problems. Focusing on the Chinese/English language pair, this study therefore aimed to investigate the feasibility of integrating the CSE with CDA approaches to provide diagnostic information about students’ translation competence.

Literature Review

Cognitive Diagnostic Assessment

“CDA is designed to measure specific knowledge structures and processing skills in students so as to provide information about their cognitive strengths and weaknesses” (Leighton and Gierl, 2007, p. 3). Generally speaking, six steps are involved in the application of CDA approaches, namely, describing assessment purposes, identifying attributes, constructing a Q-matrix, validating the Q-matrix empirically, evaluating the selected model, and communicating diagnostic feedback (Fan and Yan, 2020). At the very beginning, the question of “what to assess” (theoretical construct) should be answered. Next, by seeking guidance from various relevant sources such as test specifications, content domain theories, analysis of item content, think-aloud protocol analysis of examinees’ test-taking process, and other relevant research results (Buck and Tatsuoka, 1998; Leighton et al., 2004; Leighton and Gierl, 2007), the construct is operationalized into a set of cognitive attributes, namely, specific knowledge, skills or strategies involved in completing the test tasks (Birenbaum et al., 1993; Buck et al., 1997). Then, a Q-matrix, “specifying the relationship between attributes and test items” (Jang, 2009a, p. 211), is constructed by “content experts such as language assessment specialists, language teachers, and applied linguists” (Lee and Sawaki, 2009a, p. 170). Last, cognitive diagnostic models (CDMs) are compared and selected to conduct cognitive diagnostic analysis and then generate diagnostic feedback. 
So far, a vast array of CDMs has been developed, among which some typical ones are the linear logistic model (LLM; de la Torre and Douglas, 2004), the deterministic input, noisy “and” gate model (DINA; Junker and Sijtsma, 2001), the reduced reparameterized unified model (RRUM; Hartz, 2002), the deterministic input, noisy “or” gate model (DINO; Templin and Henson, 2006), the additive cognitive diagnostic model (ACDM; de la Torre, 2011) and the generalized DINA model framework (G-DINA; de la Torre, 2011). Varied as they are, all CDMs can be classified into compensatory, non-compensatory and saturated (general) models. Compensatory models (e.g., LLM, DINO, and ACDM) assume that mastery of one attribute can compensate for non-mastery of other attributes needed to answer an item correctly. By contrast, non-compensatory models (e.g., DINA and RRUM) assume that examinees must master all the attributes required by an item in order to produce a correct answer. Saturated models (e.g., G-DINA) allow for both compensatory and non-compensatory relationships within the same test; however, such CDMs employ more complex parameterization and thus require a larger sample size to provide accurate estimates (de la Torre and Lee, 2013).
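The contrast between the two condensation rules can be illustrated with a minimal sketch; the toy Q-matrix and mastery profile below are hypothetical and chosen only to show how the ideal (noise-free) responses diverge between DINA-style and DINO-style models:

```python
import numpy as np

# Hypothetical Q-matrix: 3 items x 2 attributes
# (1 = the item requires the attribute).
Q = np.array([[1, 0],
              [0, 1],
              [1, 1]])   # item 3 requires both attributes

# A hypothetical examinee's attribute mastery profile (1 = mastered).
alpha = np.array([1, 0])  # masters attribute 1 only

# Non-compensatory (DINA-style) ideal response:
# correct only if ALL required attributes are mastered.
eta_dina = np.all(alpha >= Q, axis=1).astype(int)

# Compensatory (DINO-style) ideal response:
# correct if AT LEAST ONE required attribute is mastered.
eta_dino = np.any((alpha * Q) == 1, axis=1).astype(int)

print(eta_dina)  # [1 0 0] -> item 3 fails: attribute 2 not mastered
print(eta_dino)  # [1 0 1] -> item 3 succeeds: attribute 1 compensates
```

The fitted models add slip and guessing (or attribute-level) parameters on top of these ideal responses, but the compensatory/non-compensatory distinction reduces to exactly this AND-versus-OR rule.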

Cognitive Diagnostic Assessment in Language Testing

Despite its long-standing popularity in the psychometric community, cognitive diagnostic research did not appear in language testing until the 1990s (Sheehan et al., 1993; Buck et al., 1997; Buck and Tatsuoka, 1998). Since then, a plethora of studies have been conducted, the majority devoted to exploring the feasibility and validity of retrofitting existing non-diagnostic language tests for cognitive diagnostic purposes. Specifically, these studies investigated the extent to which CDMs could fit the response data and then generate fine-grained information about test-takers’ language strengths and weaknesses. Taking a step forward, quite a few studies have explored the extent to which diagnostic results could help uncover the relationships among attributes (e.g., Chen and Chen, 2016a; Ravand and Robitzsch, 2018; Du and Ma, 2021) and could differ across proficiency groups (e.g., Kim, 2010, 2011; Fan and Yan, 2020). Among all those studies, a notable phenomenon is that the assessment of receptive language skills such as reading and listening (e.g., von Davier, 2008; Jang, 2009b; Lee and Sawaki, 2009b; Kim, 2015; Chen and Chen, 2016b; Yi, 2017; Aryadoust, 2021; Dong et al., 2021; Toprak and Cakir, 2021; Min and He, 2022) has gained much more attention than that of productive skills such as writing (Kim, 2010, 2011; Xie, 2017; Effatpanah et al., 2019; He et al., 2021). “One possible reason is that different test methods are used to assess these two types of skills” (He et al., 2021, p. 1). For the former, selected-response tests are often used, producing answers that can be scored 1 (right) or 0 (wrong) and can thus serve directly as the response data analyzed by CDMs. The latter, however, are usually assessed with constructed-response tasks, meaning that it is difficult to score the answers (e.g., an essay) as either 1 or 0.
To tackle this problem, Kim (2010, 2011) designed a diagnostic checklist composed of 35 writing ability descriptors and asked experts to assess examinees’ essays by rating each descriptor 1/0; the rating results then served as the writing response data analyzed by CDMs. Kim’s study provided methodological references for cognitive diagnostic research conducted with constructed-response tests, but the descriptors in the study needed further refinement in that they were sourced merely from a few teachers’ verbal reports. He et al. (2021) made an improvement in this regard by using descriptors selected from the CSE. One slight flaw in their study, however, was that only one type of writing activity was involved, which, to some degree, restricted the inferences about test-takers’ writing competence.

Another phenomenon worth noting is that while CDA approaches have been shown to be suitable and effective for language assessment, the performance of different CDMs varied. The G-DINA model turned out to be the most widely used or best-fitting model in diagnosing receptive language skills (e.g., Chen and Chen, 2016b; Ravand, 2016; Ravand and Robitzsch, 2018; Javidanmehr and Sarab, 2019) given that “it is sensitive to the integrative nature of comprehension skills” (Chen and Chen, 2016a, p. 1051). However, no optimal model has been found for writing competence diagnosis, owing to the limited number of relevant studies. Among the very few available, three models have shown the best fit to the data: RRUM (Kim, 2010, 2011; Xie, 2017), ACDM (Effatpanah et al., 2019), and LLM (He et al., 2021). To the best of our knowledge, cognitive diagnostic studies on translation competence are not yet available. To solve the problems in translation competence assessment mentioned earlier, it is imperative that this research gap be filled. But before that, the foremost task is to determine the construct of translation competence.

Construct of Translation Competence

The construct of translation competence has attracted continuous scholarly attention over the past decades (e.g., Cao, 1996; PACTE, 2003, 2005, 2008, 2015; Pym, 2003; Göpferich and Jääskeläinen, 2009), among which the empirical research results from PACTE have received considerable support. According to PACTE, translation competence comprises bilingual, extralinguistic, knowledge-of-translation, instrumental and strategic sub-competences, as well as psycho-physiological components. Based on PACTE’s translation competence model while taking into consideration research findings on Chinese/English translation competence (e.g., Yang, 2002, 2014), the translation team of the CSE project defined translation competence as “the written language transference ability demonstrated by language learners and users in the participation of intercultural and trans-lingual activities” (National Education Examinations Authority, 2018, p. 108). Following this construct, they established a descriptive parameter framework of the CSE translation scales composed of four major parameters, namely, translation activities, translation strategies, translation knowledge, and typical translation features (see Table 1). Under each parameter exist several subparameters, each corresponding to one subscale in the CSE translation scales. So far, subscales related to the first two parameters have already been unveiled. Each subscale has five proficiency levels, in which levels 5 and 6 are generally aligned with novice translation learners while levels 7, 8, and 9 are aligned with advanced ones. Among those parameters, translation activities can be helpful in selecting appropriate assessment tasks. As for the other three, operationalized as they are, not all of them are suitable for the intended purpose of cognitive diagnosis. In particular, typical translation features, as the name suggests, are more concerned with general criteria of translation quality than with specific knowledge or skills. In this case, further refinement of this parameter framework is needed.

TABLE 1

First-level parameter: Translation

Second-level parameter: Translation activities
  Third-level parameters: Description, Narration, Exposition, Argumentation, Instruction, Interaction

Second-level parameter: Translation strategies
  Third-level parameters: Planning, Execution, Appraising and compensation

Second-level parameter: Translation knowledge
  Third-level parameters: Theoretical knowledge, Practical knowledge, Professional knowledge

Second-level parameter: Typical translation features
  Third-level parameters: Accuracy, Completeness, Appropriateness, Fluency, Standardization

Descriptive parameter framework of CSE translation scales.

Refinement of the Parameter Framework of China’s Standards of English Translation Scales

Following the definition of cognitive attributes, it can be seen that both translation strategies and translation knowledge meet the requirement. In other words, if we intend to conduct cognitive diagnostic assessment of translation competence, subparameters such as the strategies of planning, execution, and appraising and compensation, as well as theoretical, practical and professional knowledge, are suitable to serve as cognitive attributes. By contrast, the subparameters of typical translation features are neither strategies nor knowledge and thus need to be refined. To this end, document analysis was conducted to search for subskills corresponding to those subparameters, i.e., accuracy, completeness, appropriateness, fluency and standardization. By perusing the translation competence descriptors in the CSE and referring to the syllabi of major English tests (with translation sections) and translation tests administered in China, including College English Test-Band 4 and 6 (CET 4 and 6), Test for English Majors-Band 8 (TEM 8), and China Accreditation Test for Translators and Interpreters Level 2 and 3 (CATTI 2 and 3), six major subskills were extracted, namely, conveying key information, conveying details with accurate wording, conforming to language norms, conforming to language habits, reproducing styles and optimizing logical structures. All these subskills can be classified under the five typical translation features: conveying key information, conveying details with accurate wording and reproducing styles fall into the categories of “accuracy” and “completeness,” while conforming to language norms, conforming to language habits and optimizing logical structures belong to “standardization,” “appropriateness,” and “fluency,” respectively. Together with the subparameters of translation strategies and knowledge, an attribute pool of translation competence was then established to serve as the framework of reference for attribute selection (see Table 2). As for the definitions of these attributes, the first six were taken from the translation team of the CSE project (Bai et al., 2018), while the last six were a synthesis of test syllabus content, CSE translation competence descriptors and the definitions of typical translation features.

TABLE 2

Using the strategy of planning: Making plans for the translation process and translation activities according to the purpose of translation.
Using the strategy of execution: Making use of translation techniques (e.g., amplification, omission, and conversion) to solve translation problems.
Using the strategy of appraising and compensation: Evaluating and monitoring the translation process and products so as to compensate for translation deficiencies in a timely manner by using appropriate translation compensation techniques.
Mastering theoretical knowledge: Understanding the nature, basic concepts, theories and history of translation.
Mastering practical knowledge: Applying theoretical knowledge of translation to specific translation practice activities.
Mastering professional knowledge: Mastering knowledge of translation industry norms, professional conduct, etc.
Conveying key information: Understanding and communicating main messages in a faithful, complete and correct manner without obvious wrong translation and loss of information.
Conveying details with accurate wording: Using appropriate words to express meaning in an accurate way.
Conforming to language norms: Conforming to relevant standards of the target language (grammatical rules, terminology, units of measurement, etc.).
Conforming to language habits: Conveying messages in accord with language and cultural habits of the target language.
Reproducing styles: Reproducing the original rhetorical and stylistic features.
Optimizing logical structures: Reproducing the original text structures in a logical and coherent way.

The attribute pool of translation competence.

So far, we have identified the problems existing in translation competence assessment, which also represent a research lacuna in cognitive diagnostic research in language testing. Moreover, we have found that the CSE can provide a scientific framework of reference for applying CDA approaches to translation competence assessment in such aspects as the selection of tasks and attributes, the development of diagnostic checklists, etc. This study, therefore, aimed to investigate the feasibility of integrating the CSE with CDA approaches to assess students’ translation competence. To this end, the study was guided by three research questions: (1) To what extent can CDMs fit the data and generate fine-grained information about examinees’ translation strengths and weaknesses? (2) To what extent can diagnostic results help uncover the relationships among translation competence attributes? (3) To what extent do diagnostic results differ across high and low translation proficiency groups?

Methodology

Instrument

As is common practice in current cognitive diagnostic studies, this study retrofitted a non-diagnostically designed test for cognitive diagnostic output. The translation responses were collected from a school-based English proficiency test (SEPT) administered at a leading foreign studies university in China. The SEPT, administered in paper-and-pencil format at the end of each semester, is designed for non-English-major undergraduates. In terms of content, the SEPT often covers language skills and knowledge such as listening, reading, writing, vocabulary and grammar. In particular, translation is only included in the SEPT targeted at juniors, presumably because the translation course is not open to them until the third school year. For the SEPT administered in January 2021, the two researchers were responsible for selecting materials for the translation section. Following the descriptive parameter framework of the CSE translation scales, we tried to cover as many translation activities as possible. Considering that our subjects were novice translation learners, short texts with simple language and familiar topics, drawn mainly from retired CET translation sections, were chosen with reference to descriptors from levels 5 and 6 of the CSE translation scales. After careful selection and discussion, the translation section of the SEPT was finalized; it consisted of four tasks covering five major types of activities, namely, description, narration, exposition, argumentation and instruction. The first two tasks were in English (111 words in total), designed to assess students’ English–Chinese translation competence, while the last two were in Chinese (171 words in total), used to evaluate students’ Chinese–English translation competence. Based on the requirement for translation speed in the CET (280–320 words per hour), a total of 60 min was allotted for the completion of the translation section. After the test, 510 translation responses were obtained, of which 458 were retained in this study after those with a considerable amount of incomplete answers (more than 15% of the original content left untranslated) were screened out.

Participants

Twelve juniors (10 females and 2 males) majoring in Chinese/English translation volunteered to help identify the attributes involved in this study. With an average age of 21, these students had studied translation for 2.5 years.

Seven domain experts (four males and three females) were recruited to help identify the attributes, develop the diagnostic checklist and construct the Q-matrix; five of them also participated in the rating process. We selected experts who (a) had acquired a master’s or doctoral degree in translation, (b) had hands-on experience as translators or translation teachers, and (c) had previous experience as scorers in large-scale language tests. With an average age of 30, the seven experts had an average of 5.2 years of translation experience. All of them had acquired a master’s degree in translation and had served as scorers in tests such as the TEM and CET. In addition, four of them were faculty members or Ph.D. candidates in the translation program, while the other three were professional translators.

Procedures

Somewhat different from the steps mentioned by Fan and Yan (2020) for diagnosing receptive language skills, the procedures in this study bore a strong resemblance to those used in diagnosing writing skills (Kim, 2010, 2011; Xie, 2017; Effatpanah et al., 2019; He et al., 2021), which included five major stages: identifying the attributes, developing the diagnostic checklist, constructing the Q-matrix, rating the translation responses and selecting the CDM.

Identifying the Attributes

In the literature review, we established an attribute pool of translation competence comprising 12 attributes. In order to identify those involved in this study and, at the same time, to check whether any other attributes existed, think-aloud protocols were conducted among students and experts. To be more specific, 12 juniors majoring in Chinese/English translation were asked to give retrospective verbal reports on the process of translating the four texts. Translation majors were chosen because “their proficiency level is relatively higher than that of the subjects and thereby can provide more comprehensive information about the subskills involved in the test” (Zhang and Shi, 2020, p. 81). In addition, seven experts were asked to verbalize how they evaluated 10 translation responses randomly chosen from the 458 responses. A total of 85 and 158 min of verbal reports were collected from students and experts, respectively, and transcribed into about 18,000 and 36,500 Chinese words, respectively.

By analyzing and coding the transcriptions, the two researchers consistently identified seven major attributes in this study: conveying key information (A1), conveying details with accurate wording (A2), conforming to language norms (A3), conforming to language habits (A4), reproducing styles (A5), optimizing logical structures (A6) and using the strategy of execution (A7). For the detailed transcriptions, please refer to the Supplementary Material. Because of limited space, only some typical quotes extracted from the verbal reports are presented in Table 3 to demonstrate how we identified the seven attributes. For example, expert No. 7 pointed out that “the student didn’t fully understand the original information and tended to translate the text based on their own interpretation, which caused a lot of wrong translations.” Since this observation was consistent with the definition of the attribute “conveying key information,” we identified A1 as one of the attributes in our study.

TABLE 3

A1
  E7: Obviously, this student didn’t fully understand the original information and tended to translate the text based on their own interpretation, which caused a lot of wrong translations.
  S7: I didn’t quite understand the sentence “we do not believe that climate change is a certainty” which is a little bit contradictory to common sense, so I was not sure whether my word-for-word translation was correct or not…
A2
  E6: Some of the words in the translated text were not accurate enough, for example, “affair” could not be used to replace the meaning of “marriage.”
  S12: It was challenging to choose an accurate and appropriate expression of “冲”, crash? flock? or rush? I couldn’t find the one that I felt satisfied with.
A3
  E5: Grammatical and spelling mistakes were common in this translation work such as the wrong spelling of “flew” or a lack of articles in front of singular nouns (house, car, etc.).
A4
  E2: The translated text was awkward and did not look like Chinese because all the passive voice in the original (English) text have been followed rigidly.
  S6: Given that verbs are used more frequently in Chinese than in English, I chose to convert nouns into verbs in my translation.
A5
  E3: This student failed to take into account the style of the original text (a notice posted up in the library) and therefore could not achieve pragmatic effects.
  S9: This text seemed to be very official, so I tried to use formal words to keep consistent with the tone in the original text.
A6
  E4: This student failed to reproduce the logical relationship existing in the original text such as the causal relationship in the third translation task.
  S2: I did not quite understand the logic within the second sentence, so I felt there must be some problems with my translation.
A7
  E1: The technique of “explicitation” should be used to further explain the word “needs” to help target readers better understand this message.
  S5: To make the English word “it” more specific in Chinese, I chose to use the technique of “conversion”.

Selected quotes from students’ and experts’ verbal reports.

“E,” “S” mean expert and student, respectively.

Developing the Diagnostic Checklist

Three steps were involved in developing the diagnostic checklist. First, based on the translation activities involved in the four translation tasks and the English proficiency levels of the examinees, a total of 46 descriptors, mainly from the six subscales (description, narration, exposition, argumentation, instruction, and the strategy of execution) at levels 5 and 6, were selected to form an initial descriptor pool. Then, by analyzing the content of the four translation tasks, the two researchers chose from the pool 22 descriptors that were of direct relevance to the tasks. Next, the seven experts discussed whether these 22 descriptors were suitable; the controversy over 2 descriptors remained unresolved, so they were deleted. Following He et al.’s (2021) suggestion, the researchers made some modifications to the remaining 20 descriptors where appropriate to tailor them to the specific translation tasks for the convenience of experts’ ratings. Table 4 shows the final diagnostic checklist of 20 descriptors, in which D1–D3, D4–D8, D9–D14, and D15–D20 belong to task 1, task 2, task 3, and task 4, respectively.

TABLE 4

Task One
  D1: Can translate short argumentative texts on common themes, conveying the arguments and reasoning.
  D2: Can accurately convey detailed information in the argumentative texts.
  D3: Can change wording and sentence structures based on the writing styles of the original, ensuring stylistic consistency between the original and the translation.
Task Two
  D4: Can translate notices and posters used in everyday life, conveying the key information.
  D5: Can accurately convey detailed information in the notice.
  D6: Can convert passive voice into active voice as needed in the notice.
  D7: Can change wording and sentence structures based on the writing styles of the original, ensuring stylistic consistency between the original and the translation.
  D8: Can add words or phrases implied in the original, making the translation coherent and intelligible.
Task Three
  D9: Can translate short popular science articles, conveying the key information.
  D10: Can convey detailed information in the exposition.
  D11: Can properly translate expressions for a series of nouns in accordance with the grammatical rules of the translation.
  D12: Can flexibly adjust the word order according to the way of expression in English.
  D13: Can flexibly use translation skills such as omission to remove repetitions in the original.
  D14: Can add conjunctions indicating logical connections implied in the original according to English sentence patterns.
Task Four
  D15: Can translate simple documentary texts, reproducing the courses of events.
  D16: Can accurately reproduce the original details in scenes of events and activities.
  D17: Can use transliteration to translate proper nouns, such as names of persons and places.
  D18: Can properly translate expressions for space and time in accordance with the grammatical rules of the translation.
  D19: Can choose proper sentence patterns to reproduce the tone in the original.
  D20: Can convert diverse Chinese clauses or sentences into English compound sentences, non-finite verb phrases, or prepositional phrases, making the translation compact and concise.

The diagnostic checklist of 20 descriptors.

Constructing the Q-Matrix

Seven experts were asked to identify the relationship between each descriptor and the seven attributes. For each descriptor, an attribute was coded "1" if the expert judged it to be measured and "0" otherwise. Pooling the experts' codings, we coded "1" for those descriptor–attribute relationships on which four or more experts reached consensus (Meng, 2013; Du and Ma, 2021) and "0" otherwise. In this way, the Q-matrix was constructed (see Table 5). As shown, 11 of the 20 descriptors were coded as measuring one attribute and the other nine as measuring two or more attributes. The number of times each attribute is measured ranges from 3 to 9, meeting the requirement that an attribute should be measured at least three times in order to produce stable results (Buck and Tatsuoka, 1998; Hartz, 2002).
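The consensus rule for aggregating expert codings can be sketched as follows. This is a minimal Python illustration with toy data, not the study's actual procedure or implementation; the function name and data are hypothetical.

```python
def build_q_matrix(expert_codings, threshold=4):
    """Aggregate experts' binary codings into a Q-matrix.

    expert_codings: one matrix per expert, indexed as [descriptor][attribute],
    with 1 meaning "this descriptor measures this attribute". A Q-matrix cell
    becomes 1 only when at least `threshold` experts agree.
    """
    n_items = len(expert_codings[0])
    n_attrs = len(expert_codings[0][0])
    q = [[0] * n_attrs for _ in range(n_items)]
    for i in range(n_items):
        for k in range(n_attrs):
            votes = sum(coding[i][k] for coding in expert_codings)
            q[i][k] = 1 if votes >= threshold else 0
    return q

# Toy illustration: 7 experts coding 2 descriptors against 3 attributes.
codings = [
    [[1, 0, 0], [0, 1, 1]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 0, 1], [0, 1, 1]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 0, 0], [0, 0, 1]],
    [[0, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 1, 1]],
]
q_matrix = build_q_matrix(codings)  # [[1, 0, 0], [0, 1, 1]]
```

With a threshold of four out of seven experts, stray single codings (e.g., only one expert linking descriptor 1 to attribute 2) are filtered out.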

TABLE 5

Descriptor              A1  A2  A3  A4  A5  A6  A7
D1                       1   0   0   0   0   0   0
D2                       0   1   0   0   0   0   0
D3                       0   0   0   0   1   0   1
D4                       1   0   0   0   0   0   0
D5                       0   1   0   0   0   0   0
D6                       0   0   0   1   0   0   1
D7                       0   0   0   1   1   0   1
D8                       0   1   0   0   0   0   1
D9                       1   0   0   0   0   0   0
D10                      0   1   0   0   0   0   0
D11                      0   0   1   0   0   0   0
D12                      0   0   0   1   0   1   1
D13                      0   0   1   1   0   1   1
D14                      0   0   0   1   0   1   1
D15                      1   0   0   0   0   0   0
D16                      0   1   0   0   0   0   0
D17                      0   0   1   0   0   0   1
D18                      0   0   1   0   0   0   0
D19                      0   0   0   0   1   0   0
D20                      0   0   0   1   0   1   1
Times of measurement     4   5   4   6   3   4   9

The Q-matrix.

Rating the Translation Responses

Five experts were involved in rating students' translation responses using the diagnostic checklist. For practical reasons, the work was divided as follows: for the 458 translation responses, one expert (rater 1) rated all 20 descriptors, while each of the other four experts was responsible for one translation task (i.e., rater 2 for D1–D3, rater 3 for D4–D8, rater 4 for D9–D14, and rater 5 for D15–D20). In this way, each descriptor was rated independently by two experts. The binary rating method was used: "1" was coded if the rater thought that a student's response generally met the criteria of the descriptor; otherwise, "0" was coded (Kim, 2010, 2011; Xie, 2017; Effatpanah et al., 2019; He et al., 2021). Prior to the rating, the researchers explained the specific details of each descriptor to the raters. After ensuring that each rater had fully understood the coding process and their responsibilities, we carried out a pilot rating, the results of which were fairly satisfactory given the high inter-rater agreement. The five raters then coded the 458 responses independently, with acceptable results: the percent exact agreement on each descriptor ranged from 95.63% to 100%. Disagreements were resolved through discussions between the experts, each lasting about 50 min on average, to determine the final results. After this process of training, rating, and discussion, a final table of the 458 candidates' 1/0 scores on the 20 descriptors was compiled to serve as the response data for this study. The examinees' total scores on the checklist (one point for each descriptor) generally followed a normal distribution (skewness = −0.31, kurtosis = 0.33) with a mean of 11.84 (out of 20) and a standard deviation of 2.67.
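The percent exact agreement statistic used above can be computed as follows. This is a simple sketch with made-up ratings; the function name is ours, not from the study.

```python
def percent_exact_agreement(ratings_a, ratings_b):
    """Percentage of responses on which two raters assigned the same 1/0 code."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("rating vectors must have the same length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)

# Hypothetical codes from two raters on one descriptor across ten responses.
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
agreement = percent_exact_agreement(rater_1, rater_2)  # 90.0
```

In the study this statistic was computed per descriptor over all 458 responses, yielding values between 95.63% and 100%.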

Selecting the Cognitive Diagnostic Models

To select the CDM with the best diagnostic performance, model fit statistics (absolute and relative model fit) need to be evaluated and compared. According to Chen et al. (2013), absolute model fit is examined by evaluating the transformed correlation (r) and log-odds ratio (l), comparing their adjusted p-values against the suggested threshold of 0.05. If the adjusted p-values of both r and l are higher than 0.05, the data are shown to fit the CDM. Relative fit helps select the optimal model by evaluating indices including Akaike's Information Criterion (AIC), Bayesian Information Criterion (BIC), and −2 log-likelihood (−2LL). "For each of these three statistics, the fitted model with the smallest value is selected among the set of competing models" (Chen et al., 2013, p. 127). However, as −2LL was shown to always select the saturated model such as G-DINA (Lei and Li, 2014), only AIC and BIC were taken into account in this study. Four CDMs (i.e., G-DINA, RRUM, ACDM, and LLM), the models most widely adopted in diagnosing language skills, were evaluated and compared using the GDINA package version 2.8.7 (Ma and de la Torre, 2020) in RStudio (Version 1.4.1717).
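AIC and BIC penalize the −2 log-likelihood by model complexity, which is why −2LL alone always favors the saturated model. The comparison logic can be sketched as below with purely illustrative values; the actual indices in this study were computed by the GDINA package in R.

```python
import math

def information_criteria(log_likelihood, n_params, n_examinees):
    """Return (AIC, BIC): AIC = -2LL + 2p, BIC = -2LL + p * ln(N)."""
    neg2ll = -2.0 * log_likelihood
    return neg2ll + 2 * n_params, neg2ll + n_params * math.log(n_examinees)

# Hypothetical fitted models: (name, log-likelihood, free parameters).
models = [("saturated", -4900.0, 130), ("reduced", -4910.0, 90)]
fits = {name: information_criteria(ll, p, 458) for name, ll, p in models}
best_by_bic = min(fits, key=lambda name: fits[name][1])  # "reduced"
```

Note how the saturated model wins on raw likelihood but loses once its extra parameters are penalized, mirroring why a reduced model such as LLM can beat G-DINA on AIC/BIC.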

Table 6 presents the model fit statistics. As demonstrated, all four CDMs fit the response data well, as the adjusted p-values were all higher than 0.05. In terms of relative model fit, LLM and G-DINA registered the smallest and largest values of AIC and BIC, respectively, meaning that LLM was the optimal CDM while G-DINA showed the poorest diagnostic performance among the four. One point worth attention is that the AIC and BIC values of ACDM and LLM were very close, with differences of less than 2 (AIC: 10023.99 vs. 10022.58; BIC: 10775.08 vs. 10773.67), which made it difficult to distinguish between the two models (Burnham and Anderson, 2004). In this case, classification accuracy (Pa), the indicator of reliability in the GDINA package, was employed to compare the two models. Table 7 shows classification accuracy at the test level (P_a) and at the attribute level (P_a A1–A7). Evidently, the Pa indices of LLM were all higher than those of ACDM, indicating that LLM had a higher probability of accurately classifying examinees into their true latent classes. Therefore, we selected LLM to generate diagnostic information about students' translation competence in this study.
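The tie-breaking logic applied here, falling back on classification accuracy when the AIC/BIC difference is negligible, can be sketched as follows. The function and tie margin are our own illustration; the BIC and Pa values are taken from Tables 6 and 7.

```python
def select_model(candidates, tie_margin=2.0):
    """Pick the model with the lowest BIC; if several models fall within
    `tie_margin` of the best BIC, prefer the highest test-level
    classification accuracy (Pa). candidates: list of (name, bic, pa)."""
    best_bic = min(bic for _, bic, _ in candidates)
    tied = [c for c in candidates if c[1] - best_bic <= tie_margin]
    return max(tied, key=lambda c: c[2])[0]

choice = select_model([("ACDM", 10775.08, 0.6441), ("LLM", 10773.67, 0.6929)])
# -> "LLM": the BIC gap (1.41) is within the margin, and LLM has higher Pa.
```

When no tie exists, the function reduces to plain BIC selection.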

TABLE 6

                  G-DINA     RRUM       ACDM       LLM
Adj. p-value (r)  0.0967     0.2903     0.0616     0.2750
Adj. p-value (l)  0.1607     0.8534     0.6026     0.4156
AIC               10060.1    10038.03   10023.99   10022.58
BIC               10939.12   10789.12   10775.08   10773.67

Model fit statistics.

TABLE 7

Classification accuracy   G-DINA    RRUM      ACDM      LLM
P_a                       0.5962    0.7181    0.6441    0.6929
P_a A1                    0.8977    0.9013    0.9143    0.9219
P_a A2                    0.8976    0.8933    0.8927    0.9085
P_a A3                    0.9948    0.8901    0.8281    0.8518
P_a A4                    0.8005    0.8457    0.8426    0.8677
P_a A5                    0.9999    0.9998    0.9998    0.9999
P_a A6                    0.8040    0.9988    0.9546    0.9997
P_a A7                    0.8358    0.9101    0.8905    0.9420

Classification accuracy Pa.

Results

RQ1: To What Extent Can Cognitive Diagnostic Models Fit the Data and Then Generate Fine-Grained Information About Examinees’ Translation Strengths and Weaknesses?

As explained above, all four widely-used CDMs fit the data well, and LLM turned out to be the best. According to its diagnostic results, students' translation strengths and weaknesses could be identified for the overall group as well as for individuals.

On one hand, the overall mastery profile of students' translation competence could be discovered through "attribute prevalence" (see Figure 1). As illustrated, students had the highest mastery probabilities for conveying key information (77.67%) and conveying details with accurate wording (74.03%) and the lowest for optimizing logical structures (39.52%) and using the strategy of execution (29.78%), suggesting that A1 and A2 were the easiest attributes and A6 and A7 the most difficult ones for the examinee group. The three remaining attributes, namely conforming to language norms (59.85%), conforming to language habits (64.65%), and reproducing styles (60.48%), were of similar difficulty in that students' mastery probabilities of the three were fairly close.
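Attribute prevalence is simply the proportion of examinees classified as masters of each attribute. A minimal sketch, with toy profiles and a hypothetical function name:

```python
def attribute_prevalence(mastery_profiles):
    """Percentage of examinees mastering each attribute, given an
    examinee x attribute matrix of 0/1 mastery classifications."""
    n = len(mastery_profiles)
    n_attrs = len(mastery_profiles[0])
    return [
        100.0 * sum(profile[k] for profile in mastery_profiles) / n
        for k in range(n_attrs)
    ]

# Toy data: four examinees, three attributes.
profiles = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0]]
prevalence = attribute_prevalence(profiles)  # [75.0, 75.0, 25.0]
```

In the study, the same column-wise proportions over the 458 estimated mastery profiles yield the percentages reported in Figure 1.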

FIGURE 1

On the other hand, each individual's strengths and weaknesses could be identified through the "person parameter estimation specifications" generated by LLM. For example, the two radar maps in Figure 2 depict two students' (student No. 15 on the left and student No. 26 on the right) mastery information on each attribute. Both students scored 13 on the diagnostic checklist, but their strengths and weaknesses were quite different. For student No. 15, A1, A2, and A5 were the strongest attributes and A6 the weakest; for student No. 26, A4 and A6 were the strongest attributes and A7 the weakest.

FIGURE 2

RQ2: To What Extent Can Diagnostic Results Help Discover the Relationships Among Translation Competence Attributes?

To discover the relationships among translation competence attributes, correlation analysis was first conducted based on examinees' person parameter estimations on each attribute. According to Table 8, A1, A2, A3, A4, and A5 were significantly correlated with each other (p < 0.01); in particular, the correlation coefficients among A1, A2, A3, and A4 ranged from 0.408 to 0.847, indicating moderate to strong positive relationships among these four attributes (Dancey and Reidy, 2017).
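The pairwise coefficients in Table 8 are ordinary Pearson correlations over examinees' attribute-level estimates. For reference, the computation is shown below with toy vectors, not the study's data:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two score vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy mastery-probability estimates on two attributes for five examinees.
a1 = [0.9, 0.8, 0.3, 0.6, 0.1]
a2 = [0.8, 0.9, 0.2, 0.7, 0.2]
r = pearson_r(a1, a2)  # strongly positive (about 0.95)
```

Coefficients near 0.8, like the observed 0.847 between A1 and A2, indicate that examinees strong on one attribute tend to be strong on the other.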

TABLE 8

      A1   A2        A3        A4        A5        A6        A7
A1    1    0.847**   0.677**   0.420**   0.309**   0.156**   0.046
A2         1         0.662**   0.414**   0.214**   0.043     0.031
A3                   1         0.408**   0.280**   0.131**   −0.196**
A4                             1         0.200**   −0.007    −0.107*
A5                                       1         0.071     −0.094*
A6                                                 1         0.028
A7                                                           1

Pearson correlations among translation competence attributes.

*p < 0.05, **p < 0.01.

According to Chen and Chen (2016a), relationships among attributes can be further revealed by analyzing the co-occurrence of attributes in dominant latent classes, which represent different skill mastery/non-mastery profiles. Table 9 displays the ten most dominant latent classes and their posterior probabilities. Each latent class consists of seven digits ("1" means mastery, "0" non-mastery), representing A1 to A7 from left to right. Apparently, "1111100" was the most common latent class among the examinee group, indicating that 16.93% of the examinees had generally mastered the first five attributes while failing to master A6 and A7, which concurs with the results of attribute prevalence. Another two major classes were "1111110" and "1111000", with proportions of 9.64% and 6.40%, respectively. The high proportions of these three latent classes signified frequent co-occurrence of the first four, five, and six attributes. Summing the posterior probabilities of their co-occurrence across the ten dominant latent classes (58.7% in total) yielded proportions of 39.85%, 29.91%, and 9.64%, respectively, suggesting that the most frequent co-occurrence existed among the first four attributes. This finding, together with the correlation results in Table 8, reasonably leads to the conclusion that A1, A2, A3, and A4 could form an attribute cluster. Considering the relatively lower frequencies of co-occurrence among the first five and six attributes, it would be more appropriate to infer that A5 bore some relationship with the "A1–A4" attribute cluster, and that A6, together with A5, could also interact with that cluster.
Among the remaining latent classes, similar relationships could be discovered; for example, the classes "1111101" (3.34%) and "1111011" (3.54%) indicated that A7 was more likely to connect itself with the "A1–A4" attribute cluster when accompanied by A5 or A6.
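The co-occurrence proportions quoted above can be reproduced mechanically from the Table 9 values; the helper function below is our own sketch of that summation.

```python
def cooccurrence_proportion(latent_classes, attribute_indices):
    """Sum the posterior probabilities (in %) of latent classes in which
    all of the given 0-based attributes are mastered ("1")."""
    total = sum(
        prob for pattern, prob in latent_classes
        if all(pattern[i] == "1" for i in attribute_indices)
    )
    return round(total, 2)

# The ten dominant latent classes and posterior probabilities from Table 9.
classes = [
    ("1111100", 16.93), ("1111110", 9.64), ("1111000", 6.40),
    ("0000000", 5.47), ("1110110", 3.97), ("1111011", 3.54),
    ("1111101", 3.34), ("1110100", 3.22), ("1100101", 3.16),
    ("1101111", 3.03),
]
first_four = cooccurrence_proportion(classes, range(4))  # 39.85 (A1-A4)
first_five = cooccurrence_proportion(classes, range(5))  # 29.91 (A1-A5)
first_six = cooccurrence_proportion(classes, range(6))   # 9.64  (A1-A6)
```

Running this recovers exactly the 39.85%, 29.91%, and 9.64% reported in the text.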

TABLE 9

Latent class   Posterior probability   Latent class   Posterior probability
1111100        16.93%                  1111011        3.54%
1111110        9.64%                   1111101        3.34%
1111000        6.40%                   1110100        3.22%
0000000        5.47%                   1100101        3.16%
1110110        3.97%                   1101111        3.03%

Ten dominant latent classes and posterior probabilities.

Taking the findings above into account, the relationships among translation competence attributes can be described by the structure in Figure 3. In this figure, the oval box refers to the attribute cluster, while the round boxes stand for attributes, with their sizes denoting attribute prevalence among the examinee group. In addition, solid lines represent noticeable associations, while dotted lines indicate that no obvious relationships were observed in the correlation and co-occurrence analyses. Such a structure can provide insights into the sequencing of skill training (see the Section "Discussion").

FIGURE 3

RQ3: To What Extent Do Diagnostic Results Differ Across High and Low Translation Proficiency Groups?

Given that “a diagnostically well-constructed model was assumed to produce skill profiles that had distinctively different characteristics across different proficiency levels” (Kim, 2011, p. 528), we chose to compare high and low proficiency groups with regard to their attribute prevalence and latent classes.

According to their scores on the diagnostic checklist, the top 33% (N = 151) and bottom 33% (N = 151) of the examinees formed the high-proficiency group (M = 14.61, SD = 1.35) and the low-proficiency group (M = 8.86, SD = 1.65), respectively (Bachman, 2004). An independent-samples t-test confirmed that the translation proficiency of the two groups was significantly different [t(300) = −33.13, p < 0.001]. The attribute prevalence of the two groups is presented in Figure 4. For the low-proficiency group, A1 (67.87%) and A2 (62.22%) were the easiest attributes while A4 (19.23%) and A6 (27.81%) were the most challenging. By contrast, for the high-proficiency group, A1 (96.58%) and A5 (80.10%) were the easiest attributes while A3 (45.03%) and A7 (43.19%) were the most difficult. Apart from the differences in their strengths and weaknesses, the mastery proportions of all but one attribute (A2) were higher in the high-proficiency group than in the low-proficiency group. The most obvious distinctions were observed for A4 and A5, with absolute differences of 43.76 and 42.35 percentage points. These findings indicate that the diagnostic results could generally distinguish the high and low translation proficiency groups.
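The extreme-group split follows common practice (Bachman, 2004) and can be sketched as below; the scores are hypothetical, and the function name is ours.

```python
def split_extreme_groups(scores, proportion=0.33):
    """Return (high, low): the top and bottom `proportion` of examinees
    ranked by total score (458 examinees * 0.33 -> 151 per group here)."""
    ranked = sorted(scores, reverse=True)
    n = round(len(ranked) * proportion)
    return ranked[:n], ranked[-n:]

# Toy checklist totals (out of 20) for twelve examinees.
scores = [18, 17, 16, 15, 14, 12, 11, 10, 9, 8, 7, 6]
high, low = split_extreme_groups(scores)  # four examinees per group
```

Applied to the 458 examinees, the same rule yields the two groups of 151 compared in Figure 4.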

FIGURE 4

Further evidence can be collected by analyzing the dominant latent classes of the two groups. As displayed in Table 10, nine of the ten dominant latent classes of the low-proficiency group included fewer than four mastered attributes (the exception being "1110100"), indicating that the majority of the examinees in this group mastered fewer than four attributes. By contrast, all ten classes at the high proficiency level included at least four mastered attributes, meaning that most, if not all, examinees in the high-proficiency group mastered at least four attributes. These results confirm that the diagnostic results could well distinguish different proficiency groups.
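The attribute counts behind this comparison can be checked directly from the Table 10 patterns:

```python
def mastered_count(latent_class):
    """Number of mastered attributes ("1" digits) in a latent-class pattern."""
    return latent_class.count("1")

# Dominant latent classes of the two groups (from Table 10).
low_group = ["1100000", "1010000", "1110100", "1110000", "0100001",
             "1100100", "0110000", "0000000", "1100010", "1000100"]
high_group = ["1101110", "1111100", "1101100", "1010110", "1111110",
              "1000111", "1110100", "1001110", "1011011", "1001011"]

exceptions = [c for c in low_group if mastered_count(c) >= 4]  # ["1110100"]
high_minimum = min(mastered_count(c) for c in high_group)      # 4
```

This confirms that "1110100" is the only low-group class with four mastered attributes, while every high-group class has at least four.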

TABLE 10

Low proficiency level                  High proficiency level
Latent class   Posterior probability   Latent class   Posterior probability
1100000        6.47%                   1101110        9.16%
1010000        5.30%                   1111100        7.81%
1110100        5.00%                   1101100        7.59%
1110000        4.76%                   1010110        5.94%
0100001        4.73%                   1111110        5.70%
1100100        4.62%                   1000111        5.61%
0110000        4.51%                   1110100        4.68%
0000000        4.27%                   1001110        4.10%
1100010        4.04%                   1011011        3.67%
1000100        3.89%                   1001011        3.57%

Ten dominant latent classes and posterior probabilities at low and high proficiency levels.

Discussion

In this study, we have explored the feasibility of assessing students' translation competence by integrating the CSE with CDA approaches. Among the four widely-used CDMs, LLM fitted the translation response data best and could help identify students' translation strengths and weaknesses at both the group and the individual level. Moreover, the diagnostic results helped reveal the relationships among translation competence attributes and differed across high and low translation proficiency groups. These findings lend support to the feasibility of such an approach for assessing students' translation competence; in addition, they provide implications for model selection, translation teaching and learning, etc.

Firstly, the satisfactory absolute model fit statistics of all four CDMs enhance the credibility of model selection results in previous studies (Kim, 2010, 2011; Chen and Chen, 2016b; Ravand, 2016; Xie, 2017; Ravand and Robitzsch, 2018; Effatpanah et al., 2019; Javidanmehr and Sarab, 2019; He et al., 2021). The selection of LLM in our study is consistent with that of He et al. (2021), which may be attributed to the fact that both studies were conducted under the CSE descriptive framework and both dealt with data obtained from constructed-response tests. Apart from LLM, both ACDM and RRUM fitted the response data better than the G-DINA model in our study, which shares certain similarities with some writing diagnostic studies (Kim, 2010, 2011; Xie, 2017; Effatpanah et al., 2019) in their preference for the former two CDMs. Another advantage of LLM, ACDM, and RRUM over the G-DINA model in our study was classification accuracy, which could be accounted for by the fact that only 458 examinees were included in our study, whereas the G-DINA model requires a large sample size to generate accurate estimates (de la Torre and Lee, 2013). Given this, it can be predicted that the G-DINA model may be more suitable for large-scale selected-response tests, while data from constructed-response tests with a small sample size may function better under models like LLM, ACDM, and RRUM. Such a prediction can provide references for model selection in future studies.

Secondly, by knowing students' strengths and weaknesses at the group level, teachers can carry out remedial teaching by adjusting their teaching design (content, method, etc.). In this study, A6 (optimizing logical structures) and A7 (using the strategy of execution) were shown to be the most challenging attributes for the examinees, so teachers should attach importance to these two subskills in their teaching. Also, by presenting the different mastery profiles of two students with the same score, the study clearly illustrated the limitations of the traditional way of assessment under classical test theory (CTT) and highlighted the advantage of CDA in uncovering in-depth information behind the same score. Such fine-grained information can offer each student personalized feedback, which in turn helps promote individualized learning.

Thirdly, person parameter estimations and dominant latent classes helped uncover the relationships among translation competence attributes. In this study, conveying key information (A1), conveying details with accurate wording (A2), conforming to language norms (A3), and conforming to language habits (A4) were found to form an attribute cluster. As these four subskills are all closely related to bilingual knowledge and competence, the cluster can be named the "linguistic attribute cluster." The formation of the linguistic attribute cluster is conducive to teaching by informing teachers that these four attributes can be taught concurrently. Given that A6 and A7 could only interact with the linguistic attribute cluster when accompanied by A5, they are not suitable to be taught prior to A5. In a similar vein, it is not advisable to teach A7 earlier than A6, considering that the former was connected with the linguistic attribute cluster only when it occurred along with A6. As analyzed above, the relationship structure of translation competence attributes was established with the linguistic attribute cluster as its basis, in that the last three attributes were either individually or collectively connected with the cluster. This, to some degree, echoes Yang's (2002) statement that "linguistic competence forms the major constituent element as well as the foundation of translation competence and the strengthening of the latter has to rely on the enhancement of the former" (p. 16). As the examinees are novice translation learners, the central role of the linguistic attribute cluster also lends evidence to Yang's (2014) experimental results that the effects of Chinese learners' language proficiency on their translation proficiency were particularly strong in the initial stage of learning translation.
In addition, the finding that A7 could be associated with the attribute cluster only when working with A5 or A6 corroborates the proposition that “translation strategy is, in essence, a problem-solving means used to achieve certain goals” (Bai et al., 2018, p. 104). For the examinee group in this study, translation strategy was used mostly for reproducing style (A5) or optimizing logical structures (A6).

Fourthly, the result that the high-proficiency group mastered more attributes than the low-proficiency one is consistent with Du and Ma's (2021) study. However, unlike previous studies in which high-proficiency groups performed better on all attributes than low-proficiency ones (Kim, 2010, 2011; Xie, 2017; Fan and Yan, 2020; He et al., 2021), there was one exceptional attribute in our study. This exception can be explained by a limitation of the sample, namely that the examinee group is generally homogeneous in translation learning experience and thus in overall translation proficiency. Since they are all novice translation learners who had systematically learned Chinese/English translation for only half a year, it is likely that there is no obvious distinction in their mastery of certain translation subskill(s) even if their total scores are significantly different. In this case, it is assumed that such an exception may no longer exist if more experienced examinees (e.g., advanced translation majors) are involved.

Limitations and Suggestions for Future Research

Notwithstanding the instructive findings and implications, several limitations exist. Firstly, constrained by the test format, attributes related to translation knowledge and the strategies of planning and appraising were not assessed. Secondly, in order to standardize the testing procedures and thus guarantee the quality of the response data, only 458 examinees from the same school were involved. In the future, techniques such as Translog and screen recording can be used to assess such subskills as using the strategies of planning, appraising, and compensation. In addition, the sample size can be enlarged to include translation majors; in this way, the relationship structures of translation competence attributes of non-translation majors and translation majors can be compared, so as to further investigate the differences between the two groups and provide insights for translation teaching targeted at each.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Statements

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

Ethics statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the participants' legal guardians/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions

HM: design of the study, data collection and analysis, and writing original draft. HC: data collection and analysis, revising the draft, and supervision. Both authors contributed to the article and approved the submitted version.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: National Social Science Foundation of China (17BYY101).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2022.872025/full#supplementary-material

References

  • 1

    AdabB. (2000). “Evaluating translation competence,” in Developing Translation Competence, edsSchäffnerC.AdabB. (Amsterdam: John Benjamins Publishing Company), 215228. 10.1075/btl.38.20ada

  • 2

    AngelelliC. (2009). “Using a rubric to assess translation ability: defining the construct,” in Testing and Assessment in Translation and Interpreting Studies: A Call for Dialogue between Research and Practice, edsAngelelliC.JacobsonH. (Amsterdam: John Benjamins Publishing Company), 1347. 10.1075/ata.xiv.03ang

  • 3

    AryadoustV. (2021). A cognitive diagnostic assessment study of the listening test of the singapore-cambridge general certificate of education O-Level: application of DINA, DINO, G-DINA, HO-DINA, and RRUM.Int. J. Listen.352952. 10.1080/10904018.2018.1500915

  • 4

    BachmanL. (2004). Statistical Analyses for Language Assessment.Cambridge: Cambridge University Press.

  • 5

    BachmanL. F.PalmerA. S. (1996). Language Testing in Practice: Designing and Developing Useful Language Tests.Oxford: Oxford University Press.

  • 6

    BachmanL. F.PalmerA. S. (2010). Language Assessment in Practice: Developing Language Assessments and Justifying Their Use in the Real World.Oxford: Oxford University Press.

  • 7

    BaiL.FengL.YanM. (2018). The translation proficiency scales of China’s standards of english: construct and principles (in Chinese).Modern Foreign Lang.41101110.

  • 8

    BirenbaumM.KellyA. E.TatsuokaK. K. (1993). Diagnosing knowledge states in algebra using the rule-space model.J. Res. Mathematics Educ.24442459. 10.1002/j.2333-8504.1992.tb01488.x

  • 9

    BuckG.TatsuokaK. (1998). Application of the rule-space procedure to language testing: examining attributes of a free response listening test.Lang. Test.15119157. 10.1177/026553229801500201

  • 10

    BuckG.TatsuokaK.KostinI. (1997). The subskills of reading: rule-space analysis of a multiple-choice test of second language reading comprehension.Lang. Learn.47423466. 10.1111/0023-8333.00016

  • 11

    BurnhamK. P.AndersonD. R. (2004). Multimodel inference: understanding AIC and BIC in model selection.Sociol. Methods Res.33261304. 10.1177/0049124104268644

  • 12

    CaoD. (1996). On translational language competence.Babel42231238.

  • 13

    CATTI Center (2020a). Syllabus for CATTI Level 2.China: CATTI Center.

  • 14

    CATTI Center (2020b). Syllabus for CATTI Level 3.China: CATTI Center.

  • 15

    ChenH.ChenJ. (2016a). Exploring reading comprehension skill relationships through the G-DINA model.Educ. Psychol.3610491064. 10.1080/01443410.2015.1076764

  • 16

    ChenH.ChenJ. (2016b). Retrofitting non-cognitive-diagnostic reading assessment under the generalized DINA model framework.Lang. Assess. Quarterly13218230. 10.1080/15434303.2016.1210610

  • 17

    ChenJ.de la TorreJ.ZhangZ. (2013). Relative and absolute fit evaluation in cognitive diagnosis modeling.J. Educ. Measurement50123140. 10.1111/j.1745-3984.2012.00185.x

  • 18

    Council of Europe (2018). Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Companion Volume with New Descriptors.Strasbourg: Council of Europe.

  • 19

    DanceyC. P.ReidyJ. (2017). Statistics Without Maths for Psychology, 7th Edn. New York, NY: Pearson Education Limited.

  • 20

    de la TorreJ. (2011). The generalized DINA model framework.Psychometrika76179199. 10.1007/s11336-011-9207-7

  • 21

    de la TorreJ.DouglasJ. A. (2004). Higher-order latent trait models for cognitive diagnosis.Psychometrika69333353. 10.1007/BF02295640

  • 22

    de la TorreJ.LeeY. (2013). Evaluating the wald test for item-level comparison of saturated and reduced models in cognitive diagnosis: evaluating the wald test for item-level comparison.J. Educ. Measurement50355373. 10.1111/jedm.12022

  • 23

    DiBelloL. V.StoutW. F.RoussosL. A. (1995). “Unified cognitive/psychometric diagnostic assessment likelihood-based classification techniques,” in Diagnostic Monitoring of Skill and Knowledge Acquisition, edsFrederiksenN.GlaserR.LesgoldA.ShaftoM. (Hillsdale, NJ: Lawrence Erlbaum), 361390. 10.4324/9780203052969-19

  • 24

    DongY.MaX.WangC.GaoX. (2021). An optimal choice of cognitive diagnostic model for second language listening comprehension test.Front. Psychol.12:608320. 10.3389/fpsyg.2021.608320

  • 25

    DuW.MaX. (2021). Exploring the inter-relationship among cognitive skills in L2 reading with the multi-CDM (in Chinese).Foreign Lang. Educ.424752.

  • 26

    EffatpanahF.BaghaeiP.BooriA. A. (2019). Diagnosing EFL learners’ writing ability: a diagnostic classification modeling analysis.Lang. Testing Asia9123. 10.1186/s40468-019-0090-y

  • 27

    FanT.YanX. (2020). Diagnosing English reading ability in Chinese senior high schools.Stud. Educ. Eval.67:100931. 10.1016/j.stueduc.2020.100931

  • 28

    Galán-MañasA. (2016). Learning portfolio in translator training: the tool of choice for competence development and assessment.Interpreter Translator Trainer10161182. 10.1080/1750399X.2015.1103108

  • 29

    Galán-MañasA.Hurtado AlbirA. (2015). Competence assessment procedures in translator training.Interpreter Translator Trainer96382. 10.1080/1750399X.2015.1010358

  • 30

    GöpferichS.JääskeläinenR. (2009). Process research into the development of translation competence: where are we, and where do we need to go?Across Lang. Cultures10169191. 10.1556/Acr.10.2009.2.1

  • 31

    HartzS. M. (2002). A Bayesian Framework for the Unified Model for Assessing Cognitive Abilities: Blending Theory with Practicality.Champaign, IL: University of Illinois at Urbana-Champaign. Unpublished doctoral dissertation.

  • 32

    HeL.JiangZ.MinS. (2021). Diagnosing writing ability using China’s standards of english language ability: application of cognitive diagnosis models.Assessing Writing50:100565. 10.1016/j.asw.2021.100565

  • 33

    Hurtado AlbirA. (2015). The acquisition of translation competence. competences, tasks, and assessment in translator training.Meta60256280. 10.7202/1032857ar

  • 34

    Hurtado AlbirA.PavaniS. (2018). An empirical study on multidimensional summative assessment in translation teaching.Interpreter Translator Trainer122547. 10.1080/1750399X.2017.1420131

  • 35

    JangE. E. (2009a). Demystifying a Q-Matrix for making diagnostic inferences about L2 reading skills.Lang. Assess. Quarterly6210238. 10.1080/15434300903071817

  • 36

    JangE. E. (2009b). Cognitive diagnostic assessment of L2 reading comprehension ability: validity arguments for Fusion Model application to LanguEdge assessment.Lang. Test.263173. 10.1177/0265532208097336

  • 37

    JavidanmehrZ.SarabM. R. A. (2019). Retrofitting nondiagnostic reading comprehension assessment: application of the G-DINA model to a high stakes reading comprehension test.Lang. Assessment Quarterly16294311. 10.1080/15434303.2019.1654479

  • 38

    JunkerB. W.SijtsmaK. (2001). Cognitive assessment models with few assumptions, and connections with non-parametric item response theory.Appl. Psychol. Measurement25258272. 10.1177/01466210122032064

  • 39

    KimA. Y. (2015). Exploring ways to provide diagnostic feedback with an ESL placement test: cognitive diagnostic assessment of L2 reading ability.Lang. Testing32227258. 10.1177/0265532214558457

  • 40

    KimY. H. (2010). An Argument-Based Validity Inquiry into the Empirically-Derived Descriptor Based Diagnostic (EDD) Assessment in ESL Academic Writing.Toronto, ON: University of Toronto. Unpublished doctoral dissertation.

  • 41

    KimY. H. (2011). Diagnosing EAP writing ability using the reduced reparameterized unified model.Lang. Testing28509541. 10.1177/0265532211400860

  • 42

    LeeY. W.SawakiY. (2009a). Cognitive diagnosis and Q-matrices in language assessment.Lang. Assessment Quarterly6169171. 10.1080/15434300903059598

  • 43

    LeeY. W.SawakiY. (2009b). Application of three cognitive diagnosis models to ESL reading and listening assessments.Lang. Assessment Quarterly6239263. 10.1080/15434300903079562

  • 44

Lei, P.-W., and Li, H. (2014). "Fit indices' performance in choosing cognitive diagnostic models and Q-matrices," in Paper Presented at the Annual Meeting of the National Council on Measurement in Education (NCME) (Philadelphia, PA). doi: 10.1177/0146621616647954

  • 45

Leighton, J. P., and Gierl, M. J. (2007). Cognitive Diagnostic Assessment for Education: Theory and Applications. Cambridge: Cambridge University Press.

  • 46

Leighton, J. P., Gierl, M. J., and Hunka, S. (2004). The attribute hierarchy method for cognitive assessment: a variation on Tatsuoka's rule-space approach. J. Educ. Meas. 41, 205–236. doi: 10.1111/j.1745-3984.2004.tb01163.x

  • 47

Ma, W., and de la Torre, J. (2020). GDINA: The Generalized DINA Model Framework. R package version 2.8.7. Available online at: https://CRAN.R-project.org/package=GDINA (accessed December 8, 2021).

  • 48

Meng, Y. (2013). Developing a Model of Cognitive Diagnostic Assessment for College EFL Listening. Unpublished doctoral dissertation. Shanghai: Shanghai International Studies University.

  • 49

Min, S., and He, L. (2022). Developing individualized feedback for listening assessment: combining standard setting and cognitive diagnostic assessment approaches. Lang. Test. 39, 90–116. doi: 10.1177/0265532221995475

  • 50

National Advisory Committee for Foreign Language Teaching (2004). Syllabus for Test for English Majors (Band 8). Shanghai: Shanghai Foreign Language Education Press.

  • 51

National College English Testing Committee (2016). Syllabus for College English Test-Band 4 (CET 4) and Band 6 (CET 6). Beijing: Foreign Language Teaching and Research Press.

  • 52

National Education Examinations Authority (2018). China's Standards of English Language Ability. Beijing: National Education Examinations Authority.

  • 53

Orozco, M., and Hurtado Albir, A. (2002). Measuring translation competence acquisition. Meta 47, 375–402. doi: 10.7202/008022ar

  • 54

PACTE (2003). "Building a translation competence model," in Triangulating Translation: Perspectives in Process-Oriented Research, ed. F. Alves (Amsterdam: John Benjamins Publishing Company), 43–66. doi: 10.1075/btl.45.06pac

  • 55

PACTE (2005). Investigating translation competence: conceptual and methodological issues. Meta 50, 609–619. doi: 10.7202/011004ar

  • 56

PACTE (2008). "First results of a translation competence experiment: knowledge of translation and efficacy of the translation process," in Translator and Interpreter Training: Issues, Methods and Debates, ed. J. Kearns (London: Continuum), 104–126.

  • 57

PACTE (2015). Results of PACTE's experimental research on the acquisition of translation competence: the acquisition of declarative and procedural knowledge in translation. The dynamic translation index. Transl. Spaces 4 (special issue: Translation as a Cognitive Activity), 29–53. doi: 10.1075/ts.4.1.02bee

  • 58

Presas, M. (2012). Training translators in the European higher education area. Interpreter Translator Trainer 6, 139–169. doi: 10.1080/13556509.2012.10798834

  • 59

Pym, A. (2003). Redefining translation competence in an electronic age. Meta 48, 481–497. doi: 10.7202/008533ar

  • 60

Ravand, H. (2016). Application of a cognitive diagnostic model to a high-stakes reading comprehension test. J. Psychoeduc. Assess. 34, 782–799. doi: 10.1177/0734282915623053

  • 61

Ravand, H., and Robitzsch, A. (2018). Cognitive diagnostic model of best choice: a study of reading comprehension. Educ. Psychol. 38, 1255–1277. doi: 10.1080/01443410.2018.1489524

  • 62

Sawaki, Y., Kim, H. J., and Gentile, C. (2009). Q-matrix construction: defining the link between constructs and test items in large-scale reading and listening comprehension assessments. Lang. Assess. Q. 6, 190–209. doi: 10.1080/15434300902801917

  • 63

Sheehan, K., Tatsuoka, K., and Lewis, C. (1993). A diagnostic classification model for document processing skills. ETS Res. Rep. Ser. 1993:41.

  • 64

Shohamy, E. (1992). Beyond performance testing: a diagnostic feedback testing model for assessing foreign language learning. Mod. Lang. J. 76, 513–521. doi: 10.1111/j.1540-4781.1992.tb05402.x

  • 65

Tatsuoka, K. (1983). Rule space: an approach for dealing with misconceptions based on item response theory. J. Educ. Meas. 20, 345–354. doi: 10.1111/j.1745-3984.1983.tb00212.x

  • 66

Templin, J. L., and Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychol. Methods 11, 287–305. doi: 10.1037/1082-989X.11.3.287

  • 67

Toprak, T. E., and Cakir, A. (2021). Examining the L2 reading comprehension ability of adult ELLs: developing a diagnostic test within the cognitive diagnostic assessment framework. Lang. Test. 38, 106–131. doi: 10.1177/0265532220941470

  • 68

von Davier, M. (2008). A general diagnostic model applied to language testing data. Br. J. Math. Stat. Psychol. 61, 287–307. doi: 10.1348/000711007X193957

  • 69

Way, C. (2008). "Systematic assessment of translator competence: in search of Achilles' heel," in Translator and Interpreter Training: Issues, Methods and Debates, ed. J. Kearns (London: Continuum International), 88–103.

  • 70

Xie, Q. (2017). Diagnosing university students' academic writing in English: is cognitive diagnostic modelling the way forward? Educ. Psychol. 37, 26–47. doi: 10.1080/01443410.2016.1202900

  • 71

Yang, X. (2002). On translational competence for C-E translation (in Chinese). Chinese Transl. J. 23, 18–21.

  • 72

Yang, Z. (2014). An empirical research on the relationship between English proficiency and C-E translation competence of Chinese students (in Chinese). Foreign Lang. Their Teach. 1, 54–59.

  • 73

Yi, Y. S. (2017). Probing the relative importance of different attributes in L2 reading and listening comprehension items: an application of cognitive diagnostic models. Lang. Test. 34, 337–355. doi: 10.1177/0265532216646141

  • 74

Zhang, H., and Shi, Y. (2020). The group and individualized diagnosis on EFL reading in university-based final test (in Chinese). Technol. Enhanc. Foreign Lang. Educ. 5, 79–85.

Keywords

translation competence assessment, cognitive diagnosis, China’s Standards of English, fine-grained diagnostic feedback, translation teaching and learning

Citation

Mei H and Chen H (2022) Assessing Students’ Translation Competence: Integrating China’s Standards of English With Cognitive Diagnostic Assessment Approaches. Front. Psychol. 13:872025. doi: 10.3389/fpsyg.2022.872025

Received

09 February 2022

Accepted

07 March 2022

Published

31 March 2022

Edited by

Alexander Robitzsch, IPN - Leibniz Institute for Science and Mathematics Education, Germany

Reviewed by

Shenghai Dai, Washington State University, United States; Hongli Li, Georgia State University, United States; Ann Cathrice George, Institut des Bundes für Qualitätssicherung im österreichischen Schulwesen (IQS), Austria; Lihong Song, Jiangxi Normal University, China

Copyright

*Correspondence: Huilin Chen,

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
