Abstract
We present a novel corpus for personality prediction in Italian, containing a
larger number of authors and a different genre compared to previously available
resources. The corpus is built exploiting Distant Supervision, assigning Myers-
Briggs Type Indicator (MBTI) labels to YouTube comments, and can lend itself to
a variety of experiments. We report on preliminary experiments on Personal-ITY,
which can serve as a baseline for future work, showing that some types are easier
to predict than others, and discussing the perks of cross-dataset prediction.
Original language | English |
---|---|
Title of host publication | Seventh Italian Conference on Computational Linguistics |
Publisher | Association for Computational Linguistics |
Publication date | 2020 |
Publication status | Published - 2020 |
Externally published | Yes |
Event | Seventh Italian Conference on Computational Linguistics - Bologna, Italy; Duration: 1 Mar 2021 → 3 Mar 2021; http://clic2020.ilc.cnr.it/en/home-2/?ujci4tgnr78ybub/n28j8c0awe |
Conference
Conference | Seventh Italian Conference on Computational Linguistics |
---|---|
Country/Territory | Italy |
City | Bologna |
Period | 01/03/2021 → 03/03/2021 |