AUTHOR=French Leon, Liu Po, Marais Olivia, Koreman Tianna, Tseng Lucia, Lai Artemis, Pavlidis Paul
TITLE=Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application
JOURNAL=Frontiers in Neuroinformatics
VOLUME=9
YEAR=2015
URL=https://www.frontiersin.org/journals/neuroinformatics/articles/10.3389/fninf.2015.00013
DOI=10.3389/fninf.2015.00013
ISSN=1662-5196
ABSTRACT=

We describe the WhiteText project and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the more than 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/.
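
For readers who want a single summary number for the reported classification result, the sketch below combines the stated 51% precision and 67% recall into an F1 score (their harmonic mean). The abstract itself does not report an F1 value; the function name `f1_score` is a hypothetical helper for illustration, not part of the WhiteText software.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (illustrative helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract: 67% recall at 51% precision.
precision = 0.51
recall = 0.67

print(f"F1 = {f1_score(precision, recall):.3f}")  # prints "F1 = 0.579"
```

This is only a derived figure under the standard F1 definition; how the authors themselves summarize performance is not stated here.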