Abstract
Large Language Models are expressive tools that enable complex tasks of text understanding within Computational Social Science. Their versatility, while beneficial, poses a barrier to establishing standardized best practices within the field. To clarify the value of different strategies, we present an overview of the performance of modern LLM-based classification methods on a benchmark of 23 social knowledge tasks. Our results point to three best practices: prioritize models with larger vocabulary and pre-training corpora; avoid simple zero-shot prompting in favor of AI-enhanced prompting; fine-tune on task-specific data, and consider more complex forms of instruction-tuning on multiple datasets only when training data is more abundant.
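The abstract contrasts simple zero-shot prompting with AI-enhanced prompting for social knowledge classification. As a rough illustration of that contrast, the sketch below builds both prompt styles for a toy politeness-classification task. The labels, label definitions, and prompt wording are illustrative assumptions only, not the paper's actual benchmark tasks or prompting setup.

```python
# Minimal sketch contrasting zero-shot and AI-enhanced prompting for a
# social-science classification task. The task, labels, and definitions
# are hypothetical stand-ins, not the paper's setup.

LABELS = ["polite", "impolite"]


def zero_shot_prompt(text: str) -> str:
    """Plain zero-shot prompt: task description and input only."""
    return (
        f"Classify the following message as one of {LABELS}.\n"
        f"Message: {text}\n"
        "Label:"
    )


def enhanced_prompt(text: str, label_definitions: dict[str, str]) -> str:
    """Enhanced prompt: adds label definitions (which could be produced by
    an auxiliary LLM call) and asks for a brief justification first."""
    defs = "\n".join(f"- {lab}: {desc}" for lab, desc in label_definitions.items())
    return (
        "You are annotating messages for politeness.\n"
        f"Label definitions:\n{defs}\n"
        "First give a one-sentence justification, then output the label.\n"
        f"Message: {text}\n"
        "Justification and label:"
    )


if __name__ == "__main__":
    msg = "Could you possibly send me the report when you have a moment?"
    # In an AI-enhanced pipeline these definitions would typically come
    # from a separate model call; they are hard-coded here for brevity.
    defs = {
        "polite": "uses hedges, gratitude, or indirect requests",
        "impolite": "uses commands, insults, or dismissive language",
    }
    print(zero_shot_prompt(msg))
    print(enhanced_prompt(msg, defs))
```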
| Original language | English |
|---|---|
| Title | Proceedings of the Third Workshop on Social Influence in Conversations (SICon 2025) |
| Publisher | Association for Computational Linguistics |
| Publication date | Jul 2025 |
| ISBN (Electronic) | 979-8-89176-266-4 |
| DOI | |
| Status | Published - Jul 2025 |
| Event | Proceedings of the Third Workshop on Social Influence in Conversations - Vienna, Austria. Duration: 31 Jul 2025 → 31 Jul 2025 |
Conference
| Conference | Proceedings of the Third Workshop on Social Influence in Conversations |
|---|---|
| Country/Territory | Austria |
| City | Vienna |
| Period | 31/07/2025 → 31/07/2025 |