About this Research Topic
Shared semantic resources, such as thesauri, have long been used by librarians to index documents. Now that these resources are available as machine-readable knowledge bases, such as ontologies, they are of central importance in many text mining applications; their potential utility extends well beyond document indexing to areas such as the interpretation, management, and reuse of all kinds of data, including text.
Information extraction is an important step in text mining, in which methods based on semantic and knowledge resources are used to support text interpretation and to produce structured, machine-readable information. More broadly, domain-specific and general resources are used extensively in many fields of text mining, including question answering, summarization, simplification, query expansion, topic modelling, and sentiment/emotion analysis.
Conversely, text mining methods provide the means to build, adapt and enrich semantic/knowledge resources through the automatic analysis of document corpora. Semantic resources may include terminological and lexical resources, case frames, ontologies, knowledge bases, and annotated reference corpora.
This themed article collection aims to publish papers describing the coupling of semantic resources and text mining. We encourage the submission of papers covering a broad range of methodological, applied, and fundamental studies.
We solicit papers covering themes including, but not limited to, the following:
• creating and enriching ontologies with entities and relationships from text
• text annotation using semantic resources (e.g., OBIE, distant learning methods, transfer learning, weak supervision methods, zero-/few-shot learning)
• definitions and uses of semantic distances
• alignment or mapping between semantic resources using text corpus analysis
• populating or curating databases and knowledge bases through text mining
• entity linking or normalization of multi-source data, using semantic resources for data integration
• making textual data more findable, accessible, interoperable, and reusable by using reference semantic resources (FAIR principles)
• creation of reference corpora with semantic annotations derived from semantic resources for training machine learning methods and for the evaluation of methods
• how semantic resources may be exploited to develop new criteria, measures, and practices in shared tasks
• tools and workbenches for semantic annotation and for developing semantic resources
• platforms for creating NLP workflows
• applications such as curation of information and ontology building for specific domains
• user-in-the-loop approaches, including the user's role in active learning, information curation, crowdsourcing, feedback provision, and the collaborative design of semantic resources (this may be related to Human-Machine Interfaces, explainability, interpretability, and ethics)
• domain adaptation and transfer learning using semantic resources for the adaptation of text mining to new tasks and domains or, conversely, extending semantic resources by analyzing various corpora
• knowledge discovery and fact-checking: evaluation of the novelty and plausibility of textual information with respect to knowledge bases and ontologies
• formal semantic representation, including philosophical considerations regarding the relationship between text corpora and formal semantic representation; examples include formal representations that capture the meaning of words using corpus analysis, and the role of context in meaning.
Keywords: semantic resources, text mining, text annotation, distant learning methods, transfer learning, mapping of semantic resources, reference corpora, NLP
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.