Cultures of the AI paralinguistic in voice cloning tools.

Publication: Conference article in proceedings or book/report chapter; conference contribution in proceedings; research; peer-reviewed

Abstract

With AI-based voice cloning tools becoming more accessible to designers, we deem it imperative to understand their paralinguistic capabilities, limitations and cultures. Paralinguistics as a field of study is concerned with how you say something rather than what you say, and new AI-based statistical voice synthesis tools differ significantly from previous methods. As such, they require asking novel questions and provoking new thoughts. This paper contributes by analyzing and evaluating various voice cloning platforms by looking at how they describe their own ability to produce three different paralinguistic elements: laughter, stuttering and pacing. We focus on text-to-speech and hybrid approaches to voice cloning, and follow up our analyses by attempting to produce these three paralinguistic elements using the voice cloning platform ElevenLabs’ voice synthesis tools. Conclusively, we draw on our results to pose questions for further investigation into what kinds of AI paralinguistic cultures can and should be designed.
Original language: English
Title: Companion Publication of the 2024 ACM Designing Interactive Systems Conference (DIS '24 Companion)
Publication date: 3 Jun 2024
Pages: 249-252
Status: Published - 3 Jun 2024
Event: Conference on Designing Interactive Systems - IT University of Copenhagen, Copenhagen, Denmark
Duration: 1 Jul 2024 to 5 Jul 2024
Conference number: 19
https://dis.acm.org/2024/

Conference

Conference: Conference on Designing Interactive Systems
Number: 19
Location: IT University of Copenhagen
Country/Territory: Denmark
City: Copenhagen
Period: 01/07/2024 to 05/07/2024
