Markup: A Web-Based Annotation Tool Powered by Active Learning

Across various domains, such as health and social care, law, news, and social media, there are increasing quantities of unstructured texts being produced. These potential data sources often contain rich information that could be used for domain-specific and research purposes. However, the unstructur...

Bibliographic Details
Main Authors: Samuel Dobbie (Author), Huw Strafford (Author), W. Owen Pickrell (Author), Beata Fonferko-Shadrach (Author), Carys Jones (Author), Ashley Akbari (Author), Simon Thompson (Author), Arron Lacey (Author)
Format: Book
Published: Frontiers Media S.A., 2021-07-01.
Online Access: Connect to this object online.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_c35952c7ca36406ebd94d45aa86d3f95
042 |a dc 
100 1 0 |a Samuel Dobbie  |e author 
700 1 0 |a Huw Strafford  |e author 
700 1 0 |a W. Owen Pickrell  |e author 
700 1 0 |a Beata Fonferko-Shadrach  |e author 
700 1 0 |a Carys Jones  |e author 
700 1 0 |a Ashley Akbari  |e author 
700 1 0 |a Simon Thompson  |e author 
700 1 0 |a Arron Lacey  |e author 
245 0 0 |a Markup: A Web-Based Annotation Tool Powered by Active Learning 
260 |b Frontiers Media S.A.,   |c 2021-07-01T00:00:00Z. 
500 |a 2673-253X 
500 |a 10.3389/fdgth.2021.598916 
520 |a Across various domains, such as health and social care, law, news, and social media, increasing quantities of unstructured text are being produced. These potential data sources often contain rich information that could be used for domain-specific and research purposes. However, the unstructured nature of free-text data poses a significant challenge for its utilisation, because domain experts must intervene manually, and substantially, to label the embedded information. Annotation tools can assist with this process by providing functionality that enables the accurate capture and transformation of unstructured texts into structured annotations, which can be used individually or as part of larger Natural Language Processing (NLP) pipelines. We present Markup (https://www.getmarkup.com/), an open-source, web-based annotation tool that is undergoing continued development for use across all domains. Markup incorporates NLP and Active Learning (AL) technologies to enable rapid and accurate annotation using custom user configurations, predictive annotation suggestions, and automated mapping suggestions to both domain-specific ontologies, such as the Unified Medical Language System (UMLS), and custom, user-defined ontologies. We demonstrate a real-world use case in which Markup was used in a healthcare setting to capture structured information from unstructured clinic letters, and the resulting annotations were used to build and test NLP applications.
546 |a EN 
690 |a natural language processing 
690 |a active learning 
690 |a unstructured text 
690 |a annotation 
690 |a sequence-to-sequence learning 
690 |a Medicine 
690 |a R 
690 |a Public aspects of medicine 
690 |a RA1-1270 
690 |a Electronic computers. Computer science 
690 |a QA75.5-76.95 
655 7 |a article  |2 local 
786 0 |n Frontiers in Digital Health, Vol 3 (2021) 
787 0 |n https://www.frontiersin.org/articles/10.3389/fdgth.2021.598916/full 
787 0 |n https://doaj.org/toc/2673-253X 
856 4 1 |u https://doaj.org/article/c35952c7ca36406ebd94d45aa86d3f95  |z Connect to this object online.