ITU

Natural Language Processing

Organisational unit: Research Group

IT University of Copenhagen
Rued Langgaards Vej 7
DK-2300 Copenhagen S
Denmark

Organisation profile

Natural Language Processing (NLP) uses machine learning and other techniques to parse, analyse, translate and understand texts in human languages such as English or Danish. The work of ITU NLP researchers includes transfer learning, representation learning, analysis of clinical patient records, automatic summarization, corpus building, stance detection, fake news analysis, and more.

  1. 2021
  2. Published

    We Need to Consider Disagreement in Evaluation

    Basile, V., Fell, M., Fornaciari, T., Hovy, D., Paun, S., Plank, B., Poesio, M. & Uma, A., 2021, ACL-IJCNLP2021 Workshop on Benchmarking: Past, Present and Future. Association for Computational Linguistics, p. 15-21

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  3. Published

    Challenges in Annotating and Parsing Spoken, Code-switched, Frisian-Dutch Data

    Braggaar, A. & van der Goot, R., Apr 2021, Proceedings of the Second Workshop on Domain Adaptation for NLP. Association for Computational Linguistics, p. 50-58

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  4. Published

    Creating a Universal Dependencies Treebank of Spoken Frisian-Dutch Code-switched Data

    Braggaar, A. & van der Goot, R., 25 Sep 2021.

    Research output: Contribution to conference - NOT published in proceeding or journal; Conference abstract for conference; Research; peer-review

  5. Published

    Discriminating Between Similar Nordic Languages

    Haas, R. & Derczynski, L., 20 Apr 2021, Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects. Association for Computational Linguistics, p. 67–75

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  6. Published

    An IDR Framework of Opportunities and Barriers between HCI and NLP

    Inie, N. & Derczynski, L., 20 Apr 2021, Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing: HCINLP. Association for Computational Linguistics, p. 101-108

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  7. Published

    De-identification of Privacy-related Entities in Job Postings

    Jensen, K. N., Zhang, M. & Plank, B., 21 May 2021, Proceedings of the 23rd Nordic Conference on Computational Linguistics. Association for Computational Linguistics, p. 210-221 (Linköping Electronic Conference Proceedings; No. 21, Vol. 178).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  8. Published

    Set-to-Sequence Methods in Machine Learning: A Review

    Jurewicz, M. & Derczynski, L., 12 Aug 2021, In: The Journal of Artificial Intelligence Research. 71, p. 885-924

    Research output: Journal Article or Conference Article in Journal; Journal article; Research; peer-review

  9. Published

    Cross-Lingual Cross-Domain Nested Named Entity Evaluation on English Web Texts

    Plank, B., 2021, Findings of ACL 2021. Association for Computational Linguistics, p. 1808-1815

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  10. Published

    Abusive Language Recognition in Russian

    Saitov, K. & Derczynski, L., 20 Apr 2021, Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing. Association for Computational Linguistics, p. 20-25

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  11. Published

    Cross-lingual Multi-task Transfer for Zero-shot Task-oriented Dialog

    van der Goot, R., Stepanovic, M., Ramponi, A., Sharaf, I., Üstün, A., Imankulova, A., Khairunnisa, S. O., Komachi, M. & Plank, B., 25 Sep 2021.

    Research output: Contribution to conference - NOT published in proceeding or journal; Conference abstract for conference; Research; peer-review

  12. Published

    From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding

    van der Goot, R., Sharaf, I., Imankulova, A., Üstün, A., Stepanovic, M., Ramponi, A., Khairunnisa, S. O., Komachi, M. & Plank, B., 2021, Proceedings of NAACL. Association for Computational Linguistics

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  13. Published

    Lexical Normalization for Code-switched Data and its Effect on POS Tagging

    van der Goot, R. & Çetinoğlu, Ö., Apr 2021, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Association for Computational Linguistics, p. 2352-2365 13 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  14. Published

    Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP

    van der Goot, R., Üstün, A., Ramponi, A., Sharaf, I. & Plank, B., 2021, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. Association for Computational Linguistics, p. 176-197

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  15. Published

    On the Effectiveness of Dataset Embeddings in Mono-lingual, Multi-lingual and Zero-shot Conditions

    van der Goot, R., Üstün, A. & Plank, B., Apr 2021, Proceedings of the Second Workshop on Domain Adaptation for NLP: EACL 2021 workshop. Association for Computational Linguistics, p. 183–194

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  16. Published

    Annotating Online Misogyny

    Zeinert, P., Inie, N. & Derczynski, L., 3 Aug 2021, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, p. 3181–3197

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  17. 2020
  18. Published

    Sequence labelling and sequence classification with gaze: Novel uses of eye‐tracking data for Natural Language Processing

    Barrett, M. J. & Hollenstein, N., 5 Nov 2020, In: Language and Linguistics Compass. 14, 11, p. 1-16 16 p.

    Research output: Journal Article or Conference Article in Journal; Journal article; Research; peer-review

  19. Published

    Matching Theory and Data with Personal-ITY: What a Corpus of Italian YouTube Comments Reveals About Personality

    Bassignana, E., Nissim, M. & Patti, V., 2020, Proceedings of the Third Workshop on Computational Modeling of People's Opinions, Personality, and Emotion's in Social Media. Association for Computational Linguistics, p. 11-22

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  20. Published

    One of these words is not like the other: a reproduction of outlier identification using non-contextual word representations

    Brink Andersen, J., Bak Bertelsen, M., Hørby Schou, M., Ciosici, M. R. & Assent, I., Nov 2020, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing and the 10th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, 11 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  21. Published

    Accelerated High-Quality Mutual-Information Based Word Clustering

    Ciosici, M. R., Assent, I. & Derczynski, L., 1 May 2020, Proceedings of The 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association, p. 2484-2489 6 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  22. Published

    Synthetic Data for English Lexical Normalization: How Close Can We Get to Manually Annotated Data?

    Dekker, K. & van der Goot, R., May 2020, Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020). European Language Resources Association, p. 6300-6309

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  23. Published

    Detection and Resolution of Rumors and Misinformation with NLP

    Derczynski, L. & Zubiaga, A., Dec 2020, Proceedings of the 28th International Conference on Computational Linguistics: Tutorial Abstracts. Barcelona, Spain (Online): Association for Computational Linguistics, p. 22-26

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  24. Published

    The Rumour Mill: Making the Spread of Misinformation Explicit and Tangible

    Inie, N., Falk Olesen, J. & Derczynski, L., Apr 2020, The ACM CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  25. Published

    Buhscitu at SemEval-2020 Task 7: Assessing Humour in Edited News Headlines using Hand-Crafted Features and Online Knowledge Bases

    Jensen, K. N., Filrup Rasmussen, N., Wang, T., Placenti, M. & Plank, B., 2020, SemEval. Association for Computational Linguistics

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  26. Published

    FT Speech: Danish Parliament Speech Corpus

    Kirkedal, A. S., Stepanovic, M. & Plank, B., 2020, INTERSPEECH 2020. International Speech Communication Association (ISCA), (Annual Conference of the International Speech Communication Association).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  27. Published

    Mental health-related conversations on social media and crisis episodes: a time-series regression analysis

    Kolliakou, A., Bakolis, I., Chandran, D., Derczynski, L., Werbeloff, N., Osborn, D. P. J., Bontcheva, K. & Stewart, R., Feb 2020, In: Scientific Reports. 10, 1342.

    Research output: Journal Article or Conference Article in Journal; Journal article; Research; peer-review

  28. Published

    SHR++: An Interface for Morpho-syntactic annotation of Sanskrit Corpora

    Krishna, A., Vidhyut, S., Chawla, D., Sambhavi, S. & Goyal, P., Feb 2020, Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). Association for Computational Linguistics, p. 7069–7076

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  29. Published

    NLP North at WNUT-2020 Task 2: Pre-training versus Ensembling for Detection of Informative COVID-19 English Tweets

    Møller, A. G., van der Goot, R. & Plank, B., Nov 2020, Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020). Association for Computational Linguistics, p. 331-336

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  30. Published

    DAN+: Danish Nested Named Entities and Lexical Normalization

    Plank, B., Jensen, K. N. & van der Goot, R., Dec 2020, The 28th International Conference on Computational Linguistics. Association for Computational Linguistics, p. 6649–6662

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  31. Published

    Biomedical Event Extraction as Sequence Labeling

    Ramponi, A., van der Goot, R., Lombardo, R. & Plank, B., Nov 2020, Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  32. Published

    Cross-Domain Evaluation of Edge Detection for Biomedical Event Extraction

    Ramponi, A., Plank, B. & Lombardo, R., May 2020, Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020). European Language Resources Association, p. 1975 1982 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  33. Published

    Neural Unsupervised Domain Adaptation in NLP—A Survey

    Ramponi, A. & Plank, B., Dec 2020, The 28th International Conference on Computational Linguistics. Association for Computational Linguistics

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  34. Published

    Maintaining quality in FEVER annotation

    Schulte, H., Binau, J. & Derczynski, L., 9 Jul 2020, Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER). Association for Computational Linguistics, p. 42-46

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  35. Published

    Offensive Language and Hate Speech Detection for Danish

    Sigurbergsson, G. & Derczynski, L., 1 May 2020, Proceedings of the International Conference on Language Resources and Evaluation: LREC 2020. European Language Resources Association, p. 3498–3508

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  36. Published

    Norm It! Lexical Normalization for Italian and Its Downstream Effects for Dependency Parsing

    van der Goot, R., Ramponi, A., Caselli, T., Cafagna, M. & De Mattei, L., May 2020, Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020). France: European Language Resources Association (ELRA), p. 6272–6278

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  37. Published

    DaNewsroom: A Large-scale Danish Summarisation Dataset

    Varab, D. & Schluter, N., Apr 2020, Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). European Language Resources Association, p. 6731–6739

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  38. Published

    Directions in abusive language training data, a systematic review: Garbage in, garbage out

    Vidgen, B. & Derczynski, L., 28 Dec 2020, In: PLOS ONE. 15, 12, e0243300.

    Research output: Journal Article or Conference Article in Journal; Journal article; Research; peer-review

  39. Published

    SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)

    Zampieri, M., Nakov, P., Rosenthal, S., Atanasova, P., Karadzhov, G., Mubarak, H., Derczynski, L., Pitenis, Z. & Coltekin, C., Dec 2020, Proceedings of the Fourteenth Workshop on Semantic Evaluation. Barcelona (online): Association for Computational Linguistics, p. 1425-1447 23 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  40. 2019
  41. Published

    Quantifying the morphosyntactic content of Brown Clusters

    Ciosici, M., Derczynski, L. & Assent, I., Jun 2019, Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Vol. 1. p. 1541–1550 N19-1157

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  42. Published

    Bornholmsk Natural Language Processing: Resources and Tools

    Derczynski, L. & Kjeldsen, A. S., 2019, Proceedings of the Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 338–344 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  43. Published

    Misinformation on Twitter During the Danish National Election: A Case Study

    Derczynski, L., Albert-Lindqvist, T. O., Bendsen, M. V., Inie, N., Pedersen, J. E. & Pedersen, V. D., 4 Oct 2019, Proceedings of the conference on Truth and Trust Online.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  44. Published

    Offensive Language and Hate Speech Detection for Danish

    Derczynski, L., 2019.

    Research output: Contribution to conference - NOT published in proceeding or journal; Paper; Research

  45. Published

    Simple Natural Language Processing Tools for Danish

    Derczynski, L., 27 Jun 2019.

    Research output: Contribution to conference - NOT published in proceeding or journal; Paper; Research

  46. Published

    Joint Rumour Stance and Veracity Prediction

    Edelbo Lillie, A., Middelboe, E. R. & Derczynski, L., 2019, Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 208–221 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  47. Published

    SemEval-2019 Task 7: RumourEval 2019: Determining Rumour Veracity and Support for Rumours

    Gorrell, G., Kochkina, E., Liakata, M., Aker, A., Zubiaga, A., Bontcheva, K. & Derczynski, L., 7 Jun 2019, Proceedings of the 13th International Workshop on Semantic Evaluation: NAACL HLT 2019. Association for Computational Linguistics, p. 845-854

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  48. Published

    Measuring Catastrophic Forgetting in Visual Question Answering

    Greco, C., Plank, B., Fernandez, R. & Bernardi, R., 2019, Tenth International Workshop on Spoken Dialogue Systems Technology (IWSDS) 2019.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Communication

  49. Published

    Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering

    Greco, C., Plank, B., Fernandez, R. & Bernardi, R., Jul 2019, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence: Association for Computational Linguistics, p. 3601–3605

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  50. Published

    Proceedings of the 22nd Nordic Conference on Computational Linguistics

    Hartmann, M. (ed.) & Plank, B. (ed.), 1 Jun 2019, Turku, Finland: Linköping University Electronic Press. (NEALT (Northern European Association of Language Technology) Proceedings Series; No. 42).

    Research output: Book / Anthology / Report / Ph.D. thesis; Anthology; Research; peer-review

  51. Published

    The Lacunae of Danish Natural Language Processing

    Kirkedal, A. S., Plank, B., Derczynski, L. & Schluter, N., 2019, Proceedings of the Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 356–362 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  52. Published

    At a Glance: The Impact of Gaze Aggregation Views on Syntactic Tagging

    Klerke, S. & Plank, B., 2019, The First Workshop Beyond Vision and LANguage: inTEgrating Real-World kNowledge: EMNLP-IJCNLP Workshop. Hong Kong: Association for Computational Linguistics, p. 51–61

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  53. Published

    Political Stance in Danish

    Lehmann, R. & Derczynski, L., 2019, Proceedings of the Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 197–207 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  54. Published

    CiteTracked: A Longitudinal Dataset of Peer Reviews and Citations

    Plank, B. & van Dalen, R., 25 Jul 2019, 4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2019). Chandrasekaran, M. K. & Mayr, P. (eds.). urn:nbn:de:0074-2414-3 ed. CEUR Workshop Proceedings, Vol. 2414. p. 116-122 (CEUR Workshop Proceedings, Vol. 2414).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  55. Published

    Lexical Resources for Low-Resource PoS Tagging in Neural Times

    Plank, B. & Klerke, S., 2019, Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa’19). Association for Computational Linguistics, p. 25–34 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  56. Published

    MoRTy: Unsupervised Learning of Task-specialized Word Embeddings by Autoencoding

    Rethmeier, N. & Plank, B., 2019, Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019). Florence: Association for Computational Linguistics, p. 49-54

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  57. Published

    Recurrent models and lower bounds for projective syntactic decoding

    Schluter, N., 2019, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Vol. 1 (Long and Short Papers). p. 251-260 10 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  58. Published

    Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat

    Shekhar, R., Venkatesh, A., Baumgärtner, T., Bruni, E., Plank, B., Bernardi, R. & Fernández, R., Jun 2019, NAACL (North American Chapter of the Association for Computational Linguistics). Minneapolis: Association for Computational Linguistics

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  59. Published

    Normalisation of imprecise temporal expressions extracted from text

    Tissot, H., Roberts, A., Derczynski, L. & Didonet Del Fabro, M., 2019, In: Knowledge and Information Systems. p. 1-34

    Research output: Journal Article or Conference Article in Journal; Journal article; Research; peer-review

  60. Published

    An In-depth Analysis of the Effect of Lexical Normalization on the Dependency Parsing of Social Media

    van der Goot, R., Oct 2019, Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). Hong Kong, China: Association for Computational Linguistics, p. 115–120 5 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  61. 2018
  62. Published

    Stance Prediction for Russian: Data and Analysis

    Lozhnikov, N., Derczynski, L. & Mazzara, M., 2018, Proceedings of 6th International Conference in Software Engineering for Defence Applications: SEDA 2018. Springer, p. 176-186 (Advances in Intelligent Systems and Computing, Vol. 925).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  63. Published

    IUCM at SemEval-2018 Task 11: Similar-Topic Texts as a Comprehension Knowledge Source

    Reznikova, S. & Derczynski, L., 2018, Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018). Association for Computational Linguistics, p. 1068-1072

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review