Article

Image-based machine learning algorithm for lip-synchronization yields an early boost in cortical speech tracking

Day / Time: 21.03.2024, 11:00-11:40
Type: Poster
Information:

The posters are on display continuously from Tuesday to Thursday in the Glasgang, grouped by thematic context on the poster island indicated here (see session title). The poster forum at the stated time offers the opportunity to discuss the work with the authors.

Abstract: Animating virtual characters is challenging, in particular the implementation of natural lip movements during speech. We present an EEG-based evaluation of two previously introduced algorithms for coordinating virtual character lip motions with real-time speech input. Both algorithms were designed to yield a set number of blendshapes, which are predefined mesh deformations used to manipulate the surface of 3D models. Algorithm 1 (Llorach Tó et al., 2016) is a vocal-tract algorithm that follows a rule-based approach, computing the blendshapes from the energy in four frequency bins of the smoothed short-term power spectral density. Algorithm 2 is an image-based machine learning model (Visual-ML). We conducted a mobile EEG experiment in a controlled virtual environment (VE). A total of 20 participants were presented with audio-visual scenes in which one of six virtual characters at a time told unscripted stories, either with babble noise or with no background noise. Conditions were compared by means of cortical speech tracking. Preliminary results show an early boost in cortical speech tracking for the Visual-ML algorithm compared to the vocal-tract algorithm and to audio-only in the babble noise conditions. EEG results will be presented alongside demonstrations of both algorithms.
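
The abstract only outlines the rule-based vocal-tract algorithm, so the following is a minimal sketch of the general idea rather than the actual implementation from Llorach Tó et al. (2016): per audio frame, energies in four frequency bins of a smoothed short-term power spectral density are mapped to blendshape weights. The bin edges, blendshape names, smoothing scheme, and normalization below are illustrative assumptions.

import numpy as np
from scipy.signal import welch

# Hypothetical frequency bins (Hz) and blendshape names; the actual values
# used by Llorach To et al. (2016) and in this study may differ.
FREQ_BINS = [(0, 500), (500, 1000), (1000, 2000), (2000, 3000)]
BLENDSHAPES = ["kiss", "lips_closed", "mouth_open", "jaw_open"]  # placeholder names

def blendshape_weights(frame, fs=16000, smooth_state=None, alpha=0.5):
    """Map one short audio frame to blendshape weights (rule-based sketch).

    Estimates the power spectral density of the frame, sums the energy in
    four frequency bins, smooths it across frames, and normalizes the
    result to weights in [0, 1].
    """
    freqs, psd = welch(frame, fs=fs, nperseg=len(frame))
    energies = np.array([psd[(freqs >= lo) & (freqs < hi)].sum()
                         for lo, hi in FREQ_BINS])

    # Exponential smoothing over time (stand-in for the smoothing in the paper)
    if smooth_state is not None:
        energies = alpha * energies + (1 - alpha) * smooth_state

    total = energies.sum()
    weights = energies / total if total > 1e-12 else np.zeros_like(energies)
    return dict(zip(BLENDSHAPES, weights)), energies

In a real-time pipeline such a function would be called on successive short audio frames, with the returned weights driving the corresponding blendshapes of the virtual character on each video frame.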