Subcharacter Information in Japanese Embeddings: When Is It Worth It?

Marzena Karpinska, Bofang Li, Anna Rogers, Aleksandr Drozd

    Publication: Conference article in proceedings or book/report chapter; conference contribution in proceedings; research; peer-reviewed

    Abstract

    Languages with logographic writing systems present a difficulty for traditional character-level models. Leveraging the subcharacter information was recently shown to be beneficial for a number of intrinsic and extrinsic tasks in Chinese. We examine whether the same strategies could be applied for Japanese, and contribute a new analogy dataset for this language.
    Original language: English
    Title: Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP
    Number of pages: 10
    Place of publication: Melbourne, Australia
    Publisher: Association for Computational Linguistics
    Publication date: 2018
    Pages: 28-37
    Status: Published - 2018

    Keywords

    • Logographic writing systems
    • Character-level models
    • Subcharacter information
    • Chinese language processing
    • Japanese language analogies
