Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

Prajjwal Bhargava, Aleksandr Drozd, Anna Rogers

    Research output: Conference Article in Proceeding or Book/Report chapter > Article in proceedings > Research > peer-review


    Much of the recent progress in NLU has been shown to stem from models learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) across a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with data subsampling and increased model size. We report 2 successful and 3 unsuccessful strategies, all of which provide insights into how Transformer-based models learn to generalize.
    Original language: English
    Title of host publication: Proceedings of the Second Workshop on Insights from Negative Results in NLP
    Number of pages: 11
    Place of Publication: Online and Punta Cana, Dominican Republic
    Publisher: Association for Computational Linguistics
    Publication date: 1 Nov 2021
    Publication status: Published - 1 Nov 2021


    • Natural Language Understanding
    • Generalization
    • BERT-based architectures
    • Adversarial robustness
    • Transformer models

