Abstract
We present a memory-based named entity recognition system that participated in the MSM-2013 Concept Extraction Challenge. The system expands the training set of annotated tweets with part-of-speech tags and seedlist information, and then generates a sequential memory-based tagger composed of separate modules for known and unknown words. Two taggers are trained: one on the original capitalized data, and one on a lowercased version of the training data. The intersection of named entities in the predictions of the two taggers is kept as the final output.
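The final combination step described in the abstract can be illustrated with a minimal sketch. It assumes that both taggers emit BIO-style tags (`B-PER`, `I-PER`, `O`) and that two entities count as the same only when their token span and type match exactly; neither assumption is stated in the abstract, and the function names `bio_to_entities` and `intersect_predictions` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# An entity is a (start_token, end_token_exclusive, type) triple.
# This representation is an assumption made for illustration only.
Entity = Tuple[int, int, str]


def bio_to_entities(tags: List[str]) -> Set[Entity]:
    """Collect entity spans from a BIO tag sequence such as B-PER, I-PER, O."""
    entities: Set[Entity] = set()
    start: Optional[int] = None
    etype: Optional[str] = None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def intersect_predictions(cased_tags: List[str],
                          lowercased_tags: List[str]) -> Set[Entity]:
    """Keep only entities predicted identically by both taggers."""
    return bio_to_entities(cased_tags) & bio_to_entities(lowercased_tags)


# Hypothetical predictions for a five-token tweet.
cased = ["B-PER", "O", "O", "O", "O"]
lowercased = ["B-PER", "O", "B-LOC", "I-LOC", "I-LOC"]
final_entities = intersect_predictions(cased, lowercased)
```

In this sketch only the person entity at token 0 survives, because the location span is predicted by just one of the two taggers. Intersecting in this way trades recall for precision.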
| Original language | English |
|---|---|
| Title | MSM 2013 : Proceedings of the 3rd WWW Workshop on Making Sense of Microposts |
| Editors | Amparo Cano, Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie |
| Number of pages | 4 |
| Publisher | CEUR Workshop Proceedings |
| Publication date | 13 May 2013 |
| Pages | 40-43 |
| Status | Published - 13 May 2013 |
| Published externally | Yes |
| Event | 3rd workshop on 'Making Sense of Microposts' - Rio de Janeiro, Brazil. Duration: 13 May 2013 → … http://oak.dcs.shef.ac.uk/msm2013/ |
Conference
| Conference | 3rd workshop on 'Making Sense of Microposts' |
|---|---|
| Country/Territory | Brazil |
| City | Rio de Janeiro |
| Period | 13/05/2013 → … |
| Web address | |

| Name | CEUR Workshop Proceedings |
|---|---|
| Volume | 1019 |
| ISSN | 1613-0073 |
Keywords
- memory-based named entity recognition
- concept extraction
- natural language processing
- sequential tagger
- annotated tweets