Glottal Inverse Filtering

The vowel sounds in human speech can be modelled quite well using source-filter theory. In this model, a vowel sound has two main building blocks: an excitation signal created by the vocal folds flapping against each other, and a filter describing how that signal is modified as it travels through the vocal tract.
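To make the two building blocks concrete, here is a minimal Python sketch of a source-filter vowel under simplifying assumptions that are not part of the text above: the excitation is an idealized impulse train (real glottal pulses are smoother), the vocal tract is an all-pole filter with one resonance per formant, and the formant frequencies and bandwidths are illustrative values only. All function names are hypothetical.

```python
import numpy as np

def glottal_pulse_train(f0, fs, duration):
    """Crude excitation: one unit impulse per glottal cycle (an assumption)."""
    n = int(fs * duration)
    x = np.zeros(n)
    period = int(round(fs / f0))  # samples per glottal cycle
    x[::period] = 1.0
    return x

def formant_filter_coeffs(formants, bandwidths, fs):
    """Denominator of an all-pole filter: one conjugate pole pair per formant."""
    a = np.array([1.0])
    for f, bw in zip(formants, bandwidths):
        r = np.exp(-np.pi * bw / fs)   # pole radius set by the bandwidth
        theta = 2 * np.pi * f / fs     # pole angle set by the formant frequency
        a = np.convolve(a, [1.0, -2 * r * np.cos(theta), r * r])
    return a

def all_pole_filter(x, a):
    """Apply y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

fs = 8000  # sampling rate in Hz (illustrative)
excitation = glottal_pulse_train(f0=100, fs=fs, duration=0.1)
a = formant_filter_coeffs([700, 1200, 2600], [80, 90, 120], fs)
vowel = all_pole_filter(excitation, a)

# If the filter were known exactly, inversion would be easy:
# convolving the vowel with the coefficients `a` undoes the all-pole filter.
recovered = np.convolve(vowel, a)[: len(vowel)]
```

In this toy setting `recovered` matches `excitation` up to rounding error, which only works because the filter coefficients are known.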

From the point of view of inverse problems, it is an intriguing nonlinear problem to recover both the excitation signal and the filter from a microphone recording of a vowel sound. This task is called glottal inverse filtering (GIF). Here I present the background of GIF and some modern inversion methodology for the solution of the inverse problem.
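As a point of reference, the sketch below shows a classical baseline rather than the modern methodology referred to above: linear prediction (the autocorrelation method) estimates an all-pole filter from the signal alone, and applying the corresponding prediction-error filter yields an estimate of the excitation. The synthetic test signal, the single resonance at 700 Hz, and the function names are assumptions made for illustration.

```python
import numpy as np

def lpc_autocorrelation(signal, order):
    """Estimate all-pole coefficients a (with a[0] = 1) from the normal equations."""
    n = len(signal)
    full = np.correlate(signal, signal, mode="full")
    r = full[n - 1 : n + order]  # autocorrelation at lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    predictor = np.linalg.solve(R, r[1 : order + 1])
    return np.concatenate(([1.0], -predictor))

def inverse_filter(signal, a):
    """Apply the FIR prediction-error filter; the output estimates the excitation."""
    return np.convolve(signal, a)

# Synthetic "recording": impulse-train excitation through a single resonance.
fs = 8000
excitation = np.zeros(800)
excitation[::80] = 1.0  # 100 Hz pulse train
r_pole, theta = 0.97, 2 * np.pi * 700 / fs
true_a = np.array([1.0, -2 * r_pole * np.cos(theta), r_pole**2])
vowel = np.zeros_like(excitation)
for n in range(len(vowel)):
    vowel[n] = excitation[n]
    for k in (1, 2):
        if n >= k:
            vowel[n] -= true_a[k] * vowel[n - k]

# Only `vowel` is used from here on: both filter and excitation are estimated.
est_a = lpc_autocorrelation(vowel, order=2)
residual = inverse_filter(vowel, est_a)
```

Here `est_a` is only an approximation of `true_a`, since the estimate is biased by the periodic excitation. This coupling between the two unknowns is one reason the joint recovery is harder than it first appears.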

High-quality solutions of the GIF problem are useful for speech synthesis and for speech recognition.