
Bleaching Text: Abstract Features for Cross-lingual Gender Prediction

  • Rob van der Goot
  • Nikola Ljubešić
  • Ian Matroos
  • Malvina Nissim
  • Barbara Plank

    Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

    Abstract

    Gender prediction has typically focused on lexical and social network features, yielding good performance, but making systems highly language-, topic-, and platform-dependent. Cross-lingual embeddings circumvent some of these limitations, but capture gender-specific style less.
    We propose an alternative: bleaching text, i.e., transforming lexical strings into more abstract features. This study provides evidence that such features allow for better transfer across languages. Moreover, we present a first study on the ability of humans to perform cross-lingual gender prediction. We find that human predictive power proves similar to that of our bleached models, and both perform better than lexical models.
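    As a rough illustration of what "transforming lexical strings into more abstract features" could look like, the sketch below replaces each token with a character-class shape and its length. The abstract does not specify which transformations are used, so the feature choices here, and the names `bleach_token` and `bleach`, are assumptions made for illustration only.

```python
def bleach_token(token: str) -> tuple[str, int]:
    """Map a token to (shape, length); an illustrative, assumed feature set."""
    classes = []
    for ch in token:
        if ch.isupper():
            c = "U"   # uppercase letter
        elif ch.islower():
            c = "L"   # lowercase letter
        elif ch.isdigit():
            c = "D"   # digit
        else:
            c = "X"   # punctuation, symbols, emoji, uncased letters
        # Collapse runs of the same class so the shape ignores word length.
        if not classes or classes[-1] != c:
            classes.append(c)
    return "".join(classes), len(token)


def bleach(text: str) -> list[tuple[str, int]]:
    """Bleach a whitespace-tokenized text, dropping the lexical content."""
    return [bleach_token(tok) for tok in text.split()]


if __name__ == "__main__":
    print(bleach("Hello world! 2018"))
    # [('UL', 5), ('LX', 6), ('D', 4)]
```

    Because the output no longer contains the words themselves, a classifier trained on such features in one language can in principle be applied to text in another, which is the kind of transfer the abstract describes.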
    Original language: English
    Title of host publication: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics
    Number of pages: 7
    Place of Publication: Melbourne
    Publisher: Association for Computational Linguistics
    Publication date: 2018
    Publication status: Published - 2018

    Keywords

    • Gender prediction
    • Cross-lingual embeddings
    • Text bleaching
    • Human prediction
    • Lexical features
