The volume of chemical and biochemical research, made available via scientific publications and patents, is rapidly increasing. With such explosive growth, it is extremely challenging for scientists to keep up to date with all of the new discoveries and advancements, even within relatively focused disciplines. Thus, there has been a surge of interest in automated text mining tools that help scientists cope with this growth and extract knowledge from research texts efficiently and effectively. In chemistry and biochemistry, key information about the synthesis, properties, and mode of action of chemicals is critical for pharmaceutical and life sciences applications, yet it is often described only in natural language text.
Biochemical texts contain a wealth of information. In this Research Topic we aim to explore text mining methods that facilitate the analysis of unstructured natural language descriptions of chemicals and their interrelationships, and their transformation into actionable, structured knowledge. The complexity of biochemical texts creates several challenges to achieving these goals. First, the lexicon of biochemical texts usually consists of extensive domain-specific terminology, rendering resources developed for general language processing ineffective. Second, the scientific literature is usually written in a formal style, resulting in complex sentence structures. Finally, biochemical texts often couple chemical structure information with linguistic descriptions, so that critical information is also conveyed in images, figures, and tables. This article collection therefore calls for novel approaches that address these challenges and improve the effectiveness of text mining for biochemical data.
This Research Topic calls for research papers addressing natural language processing or text mining of chemical or biochemical texts, including scientific literature and patents. Research themes include, but are not limited to, information extraction tasks such as chemical or drug named entity recognition and the identification of relations between chemical entities; document summarization or classification; and the construction of knowledge bases or knowledge graphs from text.
We welcome methods that target the identification of crucial information in relevant texts, such as chemical entities and their properties, details of chemical reactions or synthesis, or interactions between chemicals, biological molecules, or genetic variation. We also welcome chemistry-based algorithms, tools, and methods targeting the identification of drug-drug interactions or drug repurposing evidence in biomedical text.
Research that addresses the particular linguistic characteristics of biochemical texts is also in scope. This includes resource development, such as annotated corpora or domain-specific terminologies, and methods for the constituent components of a chemical text mining system, such as specialized domain-specific tokenization or chemical structure analysis.