The Tilt intonation theory gives a data-driven abstraction of fundamental frequency (F0) contours, while also retaining a method for reversing the abstraction to recreate the contours. Each intonation event (accent, boundary, silence, connection) is described by one or more continuous parameters, which are extracted from labelled contours. These parameters may then be used to regenerate an F0 contour. The goal of the experiment presented here is to predict the Tilt parameters well enough that natural intonation contours can be generated from the predicted parameters.
The four intonation events are each described by at least one parameter: starting F0. This is the fundamental frequency at the beginning of the event. Accents and boundaries are described by additional parameters. The amplitude parameter is the sum of two distances (in Hertz): the distance between the starting F0 and the F0 at the peak of the accent (the rise amplitude), and the distance between the peak and the F0 at the end of the event (the fall amplitude). Duration is the length (in seconds) of the intonation event. The fourth parameter is peak position, the distance (in seconds) from the start of the first vowel of the event to the peak of the F0 for that event. The final parameter is tilt. Tilt is the result of dividing the difference of the rise and fall amplitudes by the sum of the rise and fall amplitudes [8].
The tilt parameter ranges from -1 to 1, where -1 is a pure fall, 1 is a pure rise, and 0 indicates equal portions of rise and fall.
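The amplitude and tilt definitions above, and their inversion back to rise and fall amplitudes, can be illustrated with a minimal sketch. The function names and the three-point input (F0 at the start, peak, and end of an event) are assumptions made for illustration, not part of any Tilt toolkit; returning a tilt of 0 for a flat event with zero amplitude is likewise an assumed convention, since the ratio is undefined in that case. The inverse follows algebraically from the two definitions and does not by itself reconstruct a full contour shape.

```python
def tilt_from_f0(f0_start, f0_peak, f0_end):
    """Compute (amplitude, tilt) for one event from three F0 values in Hz.

    Hypothetical helper: rise and fall amplitudes are taken as distances
    (non-negative magnitudes), as in the description above.
    """
    rise = abs(f0_peak - f0_start)   # rise amplitude (Hz)
    fall = abs(f0_peak - f0_end)     # fall amplitude (Hz)
    amplitude = rise + fall          # amplitude parameter
    # Assumption: a flat event (amplitude 0) is assigned tilt 0.
    tilt = (rise - fall) / amplitude if amplitude > 0 else 0.0
    return amplitude, tilt


def rise_fall_from_tilt(amplitude, tilt):
    """Invert the definitions: recover (rise, fall) from amplitude and tilt."""
    rise = amplitude * (1 + tilt) / 2
    fall = amplitude * (1 - tilt) / 2
    return rise, fall


if __name__ == "__main__":
    # An accent rising 30 Hz and then falling 10 Hz.
    amp, tilt = tilt_from_f0(100.0, 130.0, 120.0)
    print(amp, tilt)
    print(rise_fall_from_tilt(amp, tilt))
```

A pure rise (peak at the end of the event) gives a tilt of 1, a pure fall (peak at the start) gives -1, and a symmetric rise-fall gives 0, matching the range described above.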
Figure 1: Tilt parameters