Text complexity assessment is one of the urgent problems of our time. Many modern texts, including classroom books and legislative acts, prove too difficult for their intended readers and therefore fail to meet readers’ needs. The same applies to legal, financial, and banking documents. Although the first methods of measuring text complexity were suggested over 70 years ago, the problem is far from solved. The diversity of languages, text types, genres, and audiences is a major challenge for researchers. Although the number of scientific publications keeps growing, their complex language and readers’ limited scientific acculturation lead many people to avoid these sources in favor of content shaped by commercial or political incentives rather than by accuracy and informational value. Scientists face the same difficulty when they read scientific documents from disciplines outside their own expertise. Text simplification aims to reduce these barriers. It is used in translation (pre-editing), localization, and technical writing. Simplified texts are also more accessible to non-native speakers, young readers, and people with reading disabilities or lower levels of education.
Excessively complicated texts are also harder to simplify automatically. The purpose of simplifying texts is twofold: to make them accessible to a wider or specific target audience, including readers with learning disabilities, and to facilitate further automatic processing of texts. Simplification can be achieved with different techniques, such as lexical substitution and syntactic paraphrasing. Deep neural networks offer hope for a breakthrough in assessing complexity and simplifying texts. The first results of applying deep learning to these tasks have already been obtained; they offer lessons and pave the way for further research. New ideas on assessing and reducing conceptual complexity also deserve attention.
This Research Topic focuses on modern machine learning approaches to these problems. We hope that it will contribute to the development of best practices. The final goal is to create an interdisciplinary community of researchers in information retrieval, data mining, natural language processing, linguistics, and didactics.
We are looking for contributions in the form of Review, Original Research, Brief Research Report, Perspective, Technology and Code, and other article types, in areas including, but not limited to:
• application of state-of-the-art neural architectures to text simplification and complexity assessment
• understanding which features neural networks extract from texts for text simplification and complexity assessment
• compiling corpora annotated with complexity labels for training and testing
• model evaluation and validation
• description of linguistic features relevant to the assessment of the difficulty of various classes of texts
• complex word identification
• evaluating how text complexity depends on subject area, text type, and genre
• text readability for foreign language learners
• complexity of web content
• text adaptation
• scientific multi-document summarization
• visualization as text simplification
• identification of difficulties preventing the simplification and summarization of texts
• metrics of text difficulty
• applications in education, law, etc.