This course covers foundational and advanced signal and information processing methods for human speech and natural audio applications, such as telephone conversation and music playback. A motivating question: what information must be extracted from a human speech signal and transmitted through the channel for effective speech communication by phone, and how is this done? Topics include:
(1) Preliminaries of the associated digital signal processing methods, such as convolution, the Z-transform, the Fourier transform, and the power spectrum.
(2) A source-filter model: analysis-synthesis architecture.
(3) Source coding: scalar and vector quantization, redundancy removal, linear prediction, open-loop and closed-loop coding, coding noise buildup, coding noise shaping, coding gain.
(4) Speech and audio coding: vocoders, low-bit-rate and high-bit-rate codecs, perceptual audio coding, psychoacoustic principles.
(5) Speech and audio signal enhancement: minimum mean-square-error estimation, linear estimation for Gaussian distributions, Wiener filtering, power spectral subtraction methods, spectral band replication, etc.
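
As an illustration of the linear prediction and coding gain topics listed under item (3), the following is a minimal sketch and not course material. It assumes a synthetic second-order autoregressive signal as a stand-in for speech (the coefficients 1.3 and -0.6 are arbitrary choices for the example), estimates the predictor with the Levinson-Durbin recursion, and reports the prediction gain as the ratio of signal power to residual power. The function names `autocorr` and `levinson_durbin` are hypothetical helpers written for this example.

```python
import random

def autocorr(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations of linear prediction recursively.

    Returns (a, err), where the predictor is
        x_hat[n] = sum(a[k] * x[n - 1 - k] for k in range(order))
    and err is the prediction error power at the final order.
    """
    a = []
    err = r[0]
    for m in range(order):
        # Reflection coefficient for predictor order m + 1.
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        k_m = acc / err
        # Update the lower-order coefficients, then append the new one.
        a = [a[k] - k_m * a[m - 1 - k] for k in range(m)] + [k_m]
        err *= 1.0 - k_m * k_m
    return a, err

# Assumed test signal: x[n] = 1.3 x[n-1] - 0.6 x[n-2] + e[n], e[n] white Gaussian.
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(1.3 * x[-1] - 0.6 * x[-2] + rng.gauss(0.0, 1.0))
x = x[2:]  # drop the two zero initial conditions

r = autocorr(x, 2)
coeffs, err = levinson_durbin(r, 2)
gain = r[0] / err  # prediction gain: signal power over residual power

print("estimated predictor coefficients:", coeffs)
print("prediction gain (linear):", gain)
```

The estimated coefficients should land close to the values used to generate the signal, and a prediction gain above 1 indicates that the residual carries less power than the original signal, which is the redundancy that predictive coders exploit.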