Beyond Words: Personality-Aware Multimodal Modeling of Face-to-Face Emotion

Publication: Theses › PhD thesis

Abstract

Emotions develop in layers: quick changes in valence and arousal form the core affect; when these are shaped by context and past experience, they become conscious feelings and are labeled as specific emotions. Language captures only parts of this process. More detailed cues are found in eye movements, pupil changes, facial action units (AUs), skin conductance, and EEG signals. Current emotion‑AI often overlooks these signals and ignores how personality shapes emotional responses, leading to models that are fragile and biased.
This thesis approaches emotion recognition as a multimodal, trait-aware task. Six studies range from controlled face viewing to simulated conversation. In 2 kHz eye-tracking data, the first fixations and microsaccades alone carry enough information for a simple multilayer perceptron (MLP) to predict later gaze patterns during face emotion perception, showing that very fast eye movements are tied to emotional processing.
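To make the early-gaze finding concrete, the sketch below (a minimal, illustrative PyTorch model with assumed feature names and dimensions, not the exact architecture reported in the thesis) shows how first-fixation and microsaccade statistics could be fed to a small MLP that predicts a summary of later gaze behaviour:

```python
# Minimal sketch (assumed architecture and feature layout, not the thesis's exact model):
# an MLP that maps early-gaze features (first fixations + microsaccades) to a summary
# of later gaze behaviour during face emotion perception.
import torch
import torch.nn as nn

class EarlyGazeMLP(nn.Module):
    def __init__(self, n_features: int = 16, n_targets: int = 4):
        super().__init__()
        # n_features: e.g. first-fixation location/duration and microsaccade
        # rate/amplitude statistics (hypothetical feature set)
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, n_targets),  # e.g. later dwell times on face regions
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Example usage with random stand-in data (batch of 32 trials).
model = EarlyGazeMLP()
early_features = torch.randn(32, 16)
predicted_later_gaze = model(early_features)
print(predicted_later_gaze.shape)  # torch.Size([32, 4])
```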
Expanding on this, the public AFFEC dataset (5,073 trials, 73 participants) includes gaze, facial AUs, 64-channel 256 Hz EEG, skin conductance, and Big-Five personality scores, with each trial rated for both felt and perceived valence–arousal. Temporal mimicry analysis shows that people mimic fear expressions fastest, with extraverts about 20 ms quicker. Adding personality traits to gaze+face models cuts cross-subject MSE by 18%.
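The trait-aware variant can be read as simple feature-level conditioning. The following sketch assumes the five Big-Five scores are concatenated onto the per-trial gaze+face feature vector before regressing onto valence–arousal; the concrete fusion scheme and feature set in the thesis may differ:

```python
# Sketch of trait-aware feature fusion (assumed feature names and shapes):
# concatenate per-trial Big-Five scores onto gaze+face features before
# regressing onto valence-arousal, so the model can adapt to personality.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_trials = 200
gaze_face = rng.normal(size=(n_trials, 40))   # gaze + facial-AU features (stand-in)
big_five = rng.normal(size=(n_trials, 5))     # O, C, E, A, N scores per participant
valence_arousal = rng.normal(size=(n_trials, 2))

X = np.concatenate([gaze_face, big_five], axis=1)  # trait-conditioned input
model = Ridge(alpha=1.0).fit(X, valence_arousal)
print(model.predict(X[:3]))
```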
The final model, MuMTAffect, a multitask transformer, reaches a macro-F1 of 0.55 (valence) and 0.59 (arousal) and R² > 0.94 for Big-Five personality prediction, while retaining at least 85% of its emotion accuracy even when one modality is missing.
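A schematic of such a multitask, modality-robust design is sketched below. The modality encoders, token dimensions, number of emotion classes, and the modality-dropout trick used here to mimic missing-modality robustness are illustrative assumptions, not the published MuMTAffect architecture:

```python
# Schematic multitask transformer in the spirit of MuMTAffect (illustrative only):
# each modality is embedded as a token, fused with self-attention, and the shared
# representation feeds separate heads for valence/arousal classes and Big-Five traits.
# Randomly zeroing modality tokens during training mimics missing-modality robustness.
import torch
import torch.nn as nn

class MultiTaskAffectModel(nn.Module):
    def __init__(self, modality_dims: dict, d_model: int = 128,
                 n_classes: int = 3, n_traits: int = 5):
        super().__init__()
        # One linear encoder per modality (gaze, face AUs, EEG, GSR); shapes are assumed.
        self.encoders = nn.ModuleDict(
            {name: nn.Linear(dim, d_model) for name, dim in modality_dims.items()}
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.valence_head = nn.Linear(d_model, n_classes)
        self.arousal_head = nn.Linear(d_model, n_classes)
        self.trait_head = nn.Linear(d_model, n_traits)

    def forward(self, inputs: dict, drop_prob: float = 0.0):
        tokens = []
        for name, enc in self.encoders.items():
            tok = enc(inputs[name]).unsqueeze(1)          # (batch, 1, d_model)
            if self.training and torch.rand(1).item() < drop_prob:
                tok = torch.zeros_like(tok)               # simulate a missing modality
            tokens.append(tok)
        fused = self.fusion(torch.cat(tokens, dim=1)).mean(dim=1)
        return self.valence_head(fused), self.arousal_head(fused), self.trait_head(fused)

# Example with stand-in feature sizes for the four modalities.
model = MultiTaskAffectModel({"gaze": 32, "face": 17, "eeg": 64, "gsr": 8})
batch = {"gaze": torch.randn(4, 32), "face": torch.randn(4, 17),
         "eeg": torch.randn(4, 64), "gsr": torch.randn(4, 8)}
valence_logits, arousal_logits, traits = model(batch)
```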

Key Contributions
i. Introduction of the Advancing Face-to-Face Emotion Communication (AFFEC) dataset, a comprehensive multimodal resource combining eye tracking, facial dynamics, electroencephalography (EEG), and galvanic skin response (GSR), enriched with both felt and perceived emotional labels and Big Five personality profiles.
ii. Identification of microsaccades and facial mimicry timing as early, trait-sensitive markers of emotional response, offering high temporal precision and clear interpretability.
iii. Integration of personality traits into the model architecture, resulting in improved generalization and predictive accuracy, particularly for high-variance traits such as Neuroticism.
iv. Development of the Multimodal Multitask Affective Transformer (MuMTAffect), a multimodal, multitask model that fuses physiological signals and personality data to jointly predict emotions and stable traits with strong overall performance.

Keywords: Affect, Emotion Recognition, Eye Tracking, Facial Action Units, Skin Conductance, EEG, Personality, Multimodal Fusion, Transformer Networks, Valence–Arousal
Original language: English
Qualification: PhD
Supervisor(s):
  • Burelli, Paolo, Principal supervisor
Award date: 9 Oct 2025
Electronic ISBNs: 978-87-7949-570-8
Status: Published - 2025
