Abstract
We present a memory-based named entity recognition system that participated in the MSM-2013 Concept Extraction Challenge. The system expands the training set of annotated tweets with part-of-speech tags and seedlist information, and then generates a sequential memory-based tagger composed of separate modules for known and unknown words. Two taggers are trained: one on the original capitalized data, and one on a lowercased version of the training data. Only the named entities predicted by both taggers (the intersection of their predictions) are kept as the final output.
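The intersection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the tag encoding or the matching criterion, so the following assumes BIO-style tags and counts an entity as shared only when both taggers agree on its exact token span and type. The function names and the `PER`/`LOC`/`ORG` labels are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, exclusive end index, entity type); an assumed representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed encoding)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that ends the sequence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        # A "B-" tag, or a stray "I-" tag with no open entity, opens a span.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def intersect_predictions(cased: List[str], lowercased: List[str]) -> List[str]:
    """Keep only entities on which both taggers agree in span and type."""
    shared = bio_to_spans(cased) & bio_to_spans(lowercased)
    merged = ["O"] * len(cased)
    for start, end, etype in shared:
        merged[start] = "B-" + etype
        for i in range(start + 1, end):
            merged[i] = "I-" + etype
    return merged


# Hypothetical predictions of the two taggers for one five-token tweet.
cased_tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
lower_tags = ["B-PER", "I-PER", "O", "O", "B-ORG"]
final_tags = intersect_predictions(cased_tags, lower_tags)
```

Under these assumptions, an entity proposed by only one tagger is discarded, which trades recall for precision.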
Original language | English |
---|---|
Title of host publication | MSM 2013 : Proceedings of the 3rd WWW Workshop on Making Sense of Microposts |
Editors | Amparo Cano, Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie |
Number of pages | 4 |
Publisher | CEUR Workshop Proceedings |
Publication date | 13 May 2013 |
Pages | 40-43 |
Publication status | Published - 13 May 2013 |
Externally published | Yes |
Event | 3rd workshop on 'Making Sense of Microposts' - Rio de Janeiro, Brazil Duration: 13 May 2013 → … http://oak.dcs.shef.ac.uk/msm2013/ |
Conference
Conference | 3rd workshop on 'Making Sense of Microposts' |
---|---|
Country/Territory | Brazil |
City | Rio de Janeiro |
Period | 13/05/2013 → … |
Internet address | http://oak.dcs.shef.ac.uk/msm2013/ |
Series | CEUR Workshop Proceedings |
---|---|
Volume | 1019 |
ISSN | 1613-0073 |
Keywords
- memory-based named entity recognition
- concept extraction
- natural language processing
- sequential tagger
- annotated tweets