Abstract
We present Personal-ITY, a novel corpus for personality prediction in Italian. It
contains more authors than previously available resources and covers a different
genre. The corpus is built using distant supervision, assigning Myers-Briggs Type
Indicator (MBTI) labels to YouTube comments, and lends itself to a variety of
experiments. We report preliminary experiments on Personal-ITY, which can serve
as a baseline for future work, showing that some types are easier to predict
than others, and discussing the benefits of cross-dataset prediction.
| Original language | English |
|---|---|
| Title | Seventh Italian Conference on Computational Linguistics |
| Publisher | Association for Computational Linguistics |
| Publication date | 2020 |
| Status | Published - 2020 |
| Published externally | Yes |
| Event | The 7th Italian Conference on Computational Linguistics - Bologna, Italy. Duration: 1 Mar 2021 → 3 Mar 2021. Conference number: 7. http://clic2020.ilc.cnr.it/en/home-2/?ujci4tgnr78ybub/n28j8c0awe |
Conference
| Conference | The 7th Italian Conference on Computational Linguistics |
|---|---|
| Number | 7 |
| Country/Territory | Italy |
| City | Bologna |
| Period | 1 Mar 2021 → 3 Mar 2021 |
| Internet address | |
Keywords
- personality prediction
- Italian corpus
- Myers-Briggs Type Indicator
- YouTube comments
- distant supervision