Abstract
The application of inverse filtering techniques for
high-quality singing voice analysis/synthesis is discussed. In
the context of source-filter models, inverse filtering provides a
noninvasive method to extract the voice source, and thus to study
voice quality. Although this approach is widely used in speech
synthesis, it is not widely used for singing voice. Several studies
have shown that inverse filtering techniques fail for singing
voice, for reasons that remain unclear. To shed light on this
problem, we consider an additional feature of singing voice that
is not present in speech: vibrato.
Vibrato has traditionally been studied by sinusoidal modeling. As
an alternative, we introduce a novel noninteractive source-filter
model that incorporates the mechanisms of vibrato generation. This
model also allows the results of inverse filtering and of sinusoidal
modeling to be compared as they apply to singing voice rather than
to speech. In this way, the limitations of these conventional
techniques, described in the previous literature, are explained. Both
synthetic signals and singer recordings are used to validate and
compare the techniques presented in the paper.
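
As a purely illustrative aid, the source-filter idea and its inversion can be sketched in a few lines of Python. This is not the model proposed in the paper: it assumes a unit impulse train as a stand-in for the voice source, a sinusoidal vibrato applied to the fundamental frequency, and a single two-pole resonator as the vocal tract filter. All function names and parameter values (220 Hz fundamental, 5.5 Hz vibrato rate, 3% depth, 700 Hz resonance) are hypothetical choices made for the example. The sketch also inverts the filter using its known coefficients, whereas practical inverse filtering has to estimate them from the recorded signal.

```python
import math


def vibrato_f0(n, fs, f0=220.0, rate=5.5, depth=0.03):
    """Instantaneous fundamental frequency (Hz) at sample n.

    Assumed vibrato law: sinusoidal modulation of f0 with the given
    rate (Hz) and relative depth. All defaults are illustrative.
    """
    return f0 * (1.0 + depth * math.sin(2.0 * math.pi * rate * n / fs))


def impulse_train(num_samples, fs):
    """Stand-in voice source: a unit pulse each time the phase wraps."""
    source, phase = [], 0.0
    for n in range(num_samples):
        phase += vibrato_f0(n, fs) / fs
        if phase >= 1.0:
            phase -= 1.0
            source.append(1.0)
        else:
            source.append(0.0)
    return source


def resonator_coeffs(freq, bandwidth, fs):
    """Denominator [1, a1, a2] of a two-pole resonator (one formant)."""
    r = math.exp(-math.pi * bandwidth / fs)
    return [1.0, -2.0 * r * math.cos(2.0 * math.pi * freq / fs), r * r]


def all_pole_filter(x, a):
    """Synthesis filter 1/A(z): y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def inverse_filter(y, a):
    """Inverse filter A(z): e[n] = sum_k a[k] * y[n-k]."""
    e = []
    for n in range(len(y)):
        acc = 0.0
        for k in range(len(a)):
            if n - k >= 0:
                acc += a[k] * y[n - k]
        e.append(acc)
    return e


fs = 8000                                # sampling rate in Hz (assumed)
source = impulse_train(2000, fs)         # 0.25 s of vibrato-modulated pulses
a = resonator_coeffs(700.0, 100.0, fs)   # hypothetical single formant
voice = all_pole_filter(source, a)       # synthetic "sung" signal
recovered = inverse_filter(voice, a)     # source estimate via known A(z)
max_error = max(abs(s - r) for s, r in zip(source, recovered))
```

With the exact filter coefficients, the recovered source matches the original up to rounding error. The difficulty discussed in the paper arises in the realistic case, where the filter is unknown and must be estimated from the signal itself.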