Languages with logographic writing systems present a difficulty for traditional character-level models. Leveraging subcharacter information was recently shown to be beneficial for a number of intrinsic and extrinsic tasks in Chinese. We examine whether the same strategies can be applied to Japanese, and contribute a new analogy dataset for this language.
Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP
Association for Computational Linguistics
Published - 2018