Hearing Your Way Through Music Recordings: A Text Alignment and Synthesis Approach
Abstract:
Musical pieces exhibit a rich set of structures and events that can be easily extracted from musical scores but are sometimes missed while listening to the corresponding recordings. For instance, listeners may not know which section of a Beethoven sonata is currently being played, or they may fail to notice leitmotif occurrences in Wagner’s operas. In this contribution, we present an approach for enriching music recordings with synthesized spoken comments that provide useful information. In the first step, we extract this information from the musical score and use synchronization techniques to temporally align the text annotations with the physical time axis of a specific recording. In the second step, we use text-to-speech synthesis to convert the aligned text into a speech signal, which we then superimpose on the original music recording. In our study, we conduct experiments synthesizing text comments on structure, measure positions, harmonies, and leitmotif occurrences. The presented approach may be useful for musicologists, since commented recordings eliminate the need to look up additional information while listening. Furthermore, the approach could be helpful in music education, where students attempting to follow the score of a piece during playback could benefit from audible measure numbers.
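The two steps can be illustrated with a minimal sketch. It is not the implementation used in the study. It assumes that the synchronization step yields a sorted list of anchor pairs (measure position, time in seconds) and that the speech signal for a comment has already been produced by some text-to-speech engine. The names `measure_to_seconds`, `overlay`, `anchors`, and `speech_gain` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def measure_to_seconds(measure, anchors):
    """Map a score position (in measures) to recording time in seconds.

    `anchors` is an assumed output of a score-to-audio synchronization step:
    a list of (measure, seconds) pairs with strictly increasing measures.
    Positions between anchors are linearly interpolated.
    """
    if len(anchors) < 2:
        raise ValueError("need at least two anchor pairs")
    measures = [m for m, _ in anchors]
    if not measures[0] <= measure <= measures[-1]:
        raise ValueError("measure outside the aligned range")
    # Index of the anchor that closes the segment containing `measure`.
    i = min(bisect_right(measures, measure), len(anchors) - 1)
    (m0, t0), (m1, t1) = anchors[i - 1], anchors[i]
    return t0 + (measure - m0) * (t1 - t0) / (m1 - m0)


def overlay(music, speech, start_seconds, sample_rate, speech_gain=1.0):
    """Return a copy of `music` with `speech` added from `start_seconds` on.

    Both signals are lists of float samples at the same sample rate.
    Speech samples that would run past the end of the music are dropped.
    """
    mixed = list(music)
    offset = round(start_seconds * sample_rate)
    for k, sample in enumerate(speech):
        if offset + k >= len(mixed):
            break
        mixed[offset + k] += speech_gain * sample
    return mixed


# Toy example: three alignment anchors and a comment placed at measure 2.5.
anchors = [(1, 0.0), (2, 2.0), (3, 5.0)]
start = measure_to_seconds(2.5, anchors)
music = [0.0] * 40           # 10 seconds of silence at 4 samples per second
speech = [0.5, 0.5]          # stand-in for a synthesized spoken comment
commented = overlay(music, speech, start, sample_rate=4)
```

In practice the mixed signal would also need level control, for example lowering the music while a comment is spoken, so that the comments stay intelligible without clipping.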