ITU

Natural Language Processing

Organisational unit: Research Group

IT University of Copenhagen
Rued Langgaards Vej 7
DK-2300 Copenhagen S
Denmark

Contact information

Research Group Coordinator

Leon Derczynski (leod@itu.dk)

Organisation profile

ITU NLP conducts research in natural language processing, with a particular focus on deep learning approaches, NLP for Danish, information extraction, parsing, summarization, bias reduction, fake news detection, social media processing, and low-resource languages.


Machine understanding of natural language is a major AI challenge of our time. ITU's Natural Language Processing Group addresses this challenge in all core areas of Natural Language Processing, including:

• Dependency parsing. When understanding a sentence, we need to know who did what to whom – i.e., how the words relate to each other. Parsing is a process for understanding these relations. In dependency parsing, a sentence’s syntactic structure is described using the sentence’s words and a set of relations that connect the words. ITU NLP works on building parsing tools and improving parsing practices.

• Social media. Although there’s a lot of unimportant-seeming noise and chatter on social media, it’s actually very useful – not just for targeting politics and business strategy, but also for detecting virus outbreaks and earthquakes. The highly varied language on social media is difficult to process. We focus on techniques for processing this language and ways of using social media intelligence.

• Multilingual NLP. Lots of research is done on English – but there are approximately 7000 known living languages, spread across 128 language family groups. So it’s very important to get the state of the art to work in languages other than English. As well as covering many others, ITU NLP includes special focus on the Danish languages.

• Stance detection & fake news analysis. We can estimate how true or false an online claim is by measuring the reaction around it – the stance people take to it. At ITU NLP we continue work on veracity in digital media.

• Entity detection. Finding where people, organizations and places are mentioned in text is really important for many tasks – building automatic summaries, doing business intelligence, and so on. These mentions are called entities, and can also include things like names of drugs, genes, and products. Finding them accurately is tough, and is a theme of research at ITU NLP.

• Deep learning approaches. Language is tough to process, and so at ITU we use modern deep learning techniques to address this huge AI challenge. We’re interested in multi-task learning, transfer learning, efficient nets, and working with new and powerful toolkits, and have a selection of GPU resources for our research computing.

• Representation learning. It’s difficult to map the language of humans, with words, to the language of computers, with numbers. Finding a way of representing words using numbers can be done automatically, which is called representation learning. At ITU we’re particularly interested in learning multilingual representations, learning representations across different domains (a domain is a specific type of language, like news articles, conversation, doctor’s notes and so on), and distributional clustering.

  1. 2020
  2. Published

    Mental health-related conversations on social media and crisis episodes: a time-series regression analysis

    Kolliakou, A., Bakolis, I., Chandran, D., Derczynski, L., Werbeloff, N., Osborn, D. P. J., Bontcheva, K. & Stewart, R., Feb 2020, In: Scientific Reports. 10, 1342.

    Research output: Journal Article or Conference Article in Journal; Journal article; Research; peer-review

  3. 2019
  4. Published

    Misinformation on Twitter During the Danish National Election: A Case Study

    Derczynski, L., Albert-Lindqvist, T. O., Bendsen, M. V., Inie, N., Pedersen, J. E. & Pedersen, V. D., 4 Oct 2019, Proceedings of the conference on Truth and Trust Online.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  5. Published

    An In-depth Analysis of the Effect of Lexical Normalization on the Dependency Parsing of Social Media

    van der Goot, R., Oct 2019, Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019). Hong Kong, China: Association for Computational Linguistics, p. 115–120 5 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  6. Published

    CiteTracked: A Longitudinal Dataset of Peer Reviews and Citations

    Plank, B. & van Dalen, R., 25 Jul 2019, 4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2019). Chandrasekaran, M. K. & Mayr, P. (eds.). CEUR Workshop Proceedings, Vol. 2414, p. 116-122 (urn:nbn:de:0074-2414-3).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  7. Published

    Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering

    Greco, C., Plank, B., Fernandez, R. & Bernardi, R., Jul 2019, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence: Association for Computational Linguistics, p. 3601–3605

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  8. Published

    Simple Natural Language Processing Tools for Danish

    Derczynski, L., 27 Jun 2019.

    Research output: Contribution to conference - NOT published in proceeding or journal; Paper; Research

  9. Published

    SemEval-2019 Task 7: RumourEval 2019: Determining Rumour Veracity and Support for Rumours

    Gorrell, G., Kochkina, E., Liakata, M., Aker, A., Zubiaga, A., Bontcheva, K. & Derczynski, L., 7 Jun 2019, Proceedings of the 13th International Workshop on Semantic Evaluation: NAACL HLT 2019. Association for Computational Linguistics, p. 845-854

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  10. Published

    Proceedings of the 22nd Nordic Conference on Computational Linguistics

    Hartmann, M. (ed.) & Plank, B. (ed.), 1 Jun 2019, Turku, Finland: Linköping University Electronic Press. (NEALT (Northern European Association of Language Technology) Proceedings Series; No. 42).

    Research output: Book / Anthology / Report / Ph.D. thesis; Anthology; Research; peer-review

  11. Published

    Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat

    Shekhar, R., Venkatesh, A., Baumgärtner, T., Bruni, E., Plank, B., Bernardi, R. & Fernández, R., Jun 2019, NAACL (North American Chapter of the Association for Computational Linguistics). Minneapolis: Association for Computational Linguistics

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  12. Published

    Quantifying the morphosyntactic content of Brown Clusters

    Ciosici, M., Derczynski, L. & Assent, I., Jun 2019, Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Vol. 1. p. 1541–1550 N19-1157

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  13. Published

    At a Glance: The Impact of Gaze Aggregation Views on Syntactic Tagging

    Klerke, S. & Plank, B., 2019, The First Workshop Beyond Vision and LANguage: inTEgrating Real-World kNowledge : EMNLP-IJCNLP Workshop. Hong Kong: Association for Computational Linguistics, p. 51–61

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  14. Published

    Bornholmsk Natural Language Processing: Resources and Tools

    Derczynski, L. & Kjeldsen, A. S., 2019, Proceedings of the Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 338–344 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  15. Published

    Joint Rumour Stance and Veracity Prediction

    Edelbo Lillie, A., Middelboe, E. R. & Derczynski, L., 2019, Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 208–221 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  16. Published

    Lexical Resources for Low-Resource PoS Tagging in Neural Times

    Plank, B. & Klerke, S., 2019, Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa’19). Association for Computational Linguistics, p. 25–34 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  17. Published

    Measuring Catastrophic Forgetting in Visual Question Answering

    Greco, C., Plank, B., Fernandez, R. & Bernardi, R., 2019, Tenth International Workshop on Spoken Dialogue Systems Technology (IWSDS) 2019.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Communication

  18. Published

    MoRTy: Unsupervised Learning of Task-specialized Word Embeddings by Autoencoding

    Rethmeier, N. & Plank, B., 2019, Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019). Florence: Association for Computational Linguistics, p. 49-54

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  19. Published

    Normalisation of imprecise temporal expressions extracted from text

    Tissot, H., Roberts, A., Derczynski, L. & Didonet Del Fabro, M., 2019, In: Knowledge and Information Systems. p. 1-34

    Research output: Journal Article or Conference Article in Journal; Journal article; Research; peer-review

  20. Published

    Offensive Language and Hate Speech Detection for Danish

    Derczynski, L., 2019.

    Research output: Contribution to conference - NOT published in proceeding or journal; Paper; Research

  21. Published

    Political Stance in Danish

    Lehmann, R. & Derczynski, L., 2019, Proceedings of the Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 197–207 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  22. Published

    Recurrent models and lower bounds for projective syntactic decoding

    Schluter, N., 2019, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Vol. 1 (Long and Short Papers). p. 251-260 10 p.

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  23. Published

    The Lacunae of Danish Natural Language Processing

    Kirkedal, A. S., Plank, B., Derczynski, L. & Schluter, N., 2019, Proceedings of the Nordic Conference of Computational Linguistics (2019). Linköping University Electronic Press, p. 356–362 (NEALT (Northern European Association of Language Technology) Proceedings Series).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  24. 2018
  25. Published

    IUCM at SemEval-2018 Task 11: Similar-Topic Texts as a Comprehension Knowledge Source

    Reznikova, S. & Derczynski, L., 2018, Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018). Association for Computational Linguistics, p. 1068-1072

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

  26. Published

    Stance Prediction for Russian: Data and Analysis

    Lozhnikov, N., Derczynski, L. & Mazzara, M., 2018, Proceedings of 6th International Conference in Software Engineering for Defence Applications: SEDA 2018. Springer, p. 176-186 (Advances in Intelligent Systems and Computing, Vol. 925).

    Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review