Memory-based Named Entity Recognition in Tweets

Antal Van den Bosch, Toine Bogers

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

Abstract

We present a memory-based named entity recognition system that participated in the MSM-2013 Concept Extraction Challenge. The system enriches the training set of annotated tweets with part-of-speech tags and seedlist information, and then generates a sequential memory-based tagger composed of separate modules for known and unknown words. Two taggers are trained: one on the original capitalized data, and one on a lowercased version of the training data. Only the named entities predicted by both taggers (the intersection of their predictions) are kept as the final output.
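
The intersection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes both taggers emit BIO-style tags over the same token sequence (the abstract does not state the tag encoding), and the function names `extract_entities` and `intersect_predictions`, as well as the example tags, are hypothetical.

```python
from typing import List, Set, Tuple

# (start token index, end token index (exclusive), entity type)
Entity = Tuple[int, int, str]


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect entity spans from a BIO tag sequence (assumed encoding)."""
    entities: Set[Entity] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues_span = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues_span:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def intersect_predictions(cased_tags: List[str],
                          lowercased_tags: List[str]) -> Set[Entity]:
    """Keep only entities on which both taggers agree in span and type."""
    return extract_entities(cased_tags) & extract_entities(lowercased_tags)


if __name__ == "__main__":
    # Hypothetical outputs of the two taggers for one five-token tweet.
    cased = ["B-PER", "I-PER", "O", "B-LOC", "O"]
    lowercased = ["B-PER", "I-PER", "O", "O", "B-ORG"]
    print(intersect_predictions(cased, lowercased))  # {(0, 2, 'PER')}
```

In this sketch an entity survives only if both taggers agree on its exact boundaries and type, which favors precision over recall. Whether the original system matches entities in exactly this way is an assumption.
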
Original language: English
Title of host publication: MSM 2013 : Proceedings of the 3rd WWW Workshop on Making Sense of Microposts
Editors: Amparo Cano, Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie
Number of pages: 4
Publisher: CEUR Workshop Proceedings
Publication date: 13 May 2013
Pages: 40-43
Publication status: Published - 13 May 2013
Externally published: Yes
Event: 3rd workshop on 'Making Sense of Microposts' - Rio de Janeiro, Brazil
Duration: 13 May 2013 → …
http://oak.dcs.shef.ac.uk/msm2013/

Conference

Conference: 3rd workshop on 'Making Sense of Microposts'
Country/Territory: Brazil
City: Rio de Janeiro
Period: 13/05/2013 → …
Internet address: http://oak.dcs.shef.ac.uk/msm2013/
Series: CEUR Workshop Proceedings
Volume: 1019
ISSN: 1613-0073

Keywords

  • memory-based named entity recognition
  • concept extraction
  • natural language processing
  • sequential tagger
  • annotated tweets
