Multitask Noisy Speech Enhancement System

- Speech band equalizer
- Dynamics processing
- Noise gate
- Signal level limiter
- Clipping restoration
- Noise reduction
- Noise whitening
- Blind deconvolution
- Spectrum analyser
- Time stretching
- Spectral expander
- Fourier corrector
- Neural network corrector
- Decorrelation
- Joint approximation
- Homomorphic approximation
- Reverberation
- Synchronisation
- Normalisation

Joint approximation

The joint approximation module enhances speech signal quality by smoothing the joint (complex) spectrum of the signal. The signal-to-noise ratio is expected to improve for voiced and plosive consonants. The module is designed for use in the final stage of the restoration process, after the signal has been processed by the other modules.

The joint approximation module uses the McAulay-Quatieri algorithm. It is assumed that a short fragment of the speech signal (ca. 46 ms) may be treated as deterministic, with a smoother phase than a stochastic signal. The joint signal spectrum is smoothed in order to match the phase spectrum of the distorted speech signal to the phase spectrum of the speech pattern (recorded in good acoustic conditions). The value of the approximated spectrum point is calculated as a linear combination of four adjacent spectrum points (two on each side of the approximated point). If $x_0$ is the approximated spectrum point and the adjacent spectrum points are $x_{-2}$, $x_{-1}$, $x_1$, $x_2$, then the approximated value is:

$$x_0 = \frac{w\,(x_{-1} + x_1) + (12 - w)\,(x_{-2} + x_2)}{24}$$

where $w$ is the smoothing factor (0-24). The value of the smoothing factor depends on the approximation method:

  • 16 for third-order polynomial approximation,
  • 12 for averaging of two adjacent spectrum points,
  • 0 for averaging of two spectrum points separated by one point,
  • 24 for linear forward or backward extrapolation.
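The four cases listed above are all consistent with a single weighted average of the inner and outer neighbour pairs, x0 = (w(x-1 + x1) + (12 - w)(x-2 + x2)) / 24. The sketch below applies that form to a complex spectrum. The formula is inferred from the listed cases rather than taken from the module itself, and the function name `joint_smooth` and the handling of the two edge bins on each side are illustrative assumptions.

```python
def joint_smooth(spectrum, w=16.0):
    """Smooth a complex spectrum using the four neighbours of each point.

    Assumed form, inferred from the four documented smoothing factors:
        x0 = (w * (x[-1] + x[+1]) + (12 - w) * (x[-2] + x[+2])) / 24
    """
    if not 0 <= w <= 24:
        raise ValueError("smoothing factor must lie in the range 0..24")

    x = [complex(v) for v in spectrum]
    out = list(x)  # assumption: the two bins at each edge are left unchanged

    for i in range(2, len(x) - 2):
        inner = x[i - 1] + x[i + 1]   # the two adjacent points
        outer = x[i - 2] + x[i + 2]   # the two points one position further out
        out[i] = (w * inner + (12 - w) * outer) / 24

    return out
```

With w = 12 the outer pair drops out and the result is the mean of the two adjacent points; with w = 0 only the outer pair remains; w = 16 gives the weights (-1, 4, 4, -1)/6 of a cubic fitted through the four neighbours; w = 24 is the mean of the forward and backward linear extrapolations.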

Local maxima in the speech signal spectrum are related to the harmonics of the glottal tone. Tracking of the spectrum maxima may be replaced by tracking of the cepstrum maximum in a limited frequency range, using a triangular weighting function, which reduces the risk of error when estimating the glottal tone frequency. The shape of the weighting function in the detector depends on the results of the cepstral maximum tracking procedure, which ensures the continuity of the processed signal spectrum.
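As a rough illustration of cepstral maximum tracking with a triangular weighting function, the sketch below estimates the glottal tone frequency of one frame. It is not the module's detector: centring the triangle on the previous estimate, the `half_width` of the triangle, the search limits `fmin` and `fmax`, and the function name `track_f0` are all assumptions made for this example.

```python
import numpy as np

def track_f0(frame, fs, prev_f0, fmin=60.0, fmax=400.0, half_width=40):
    """Estimate the glottal tone frequency of one frame from its cepstrum."""
    n = len(frame)
    spectrum = np.fft.rfft(frame * np.hanning(n))
    log_mag = np.log(np.abs(spectrum) + 1e-12)    # small floor avoids log(0)
    cepstrum = np.fft.irfft(log_mag, n)           # real cepstrum

    q = np.arange(n // 2)                         # quefrency in samples
    # Triangle centred on the previous pitch period, zero beyond half_width.
    weight = np.clip(1.0 - np.abs(q - fs / prev_f0) / half_width, 0.0, None)
    # Limit the search to quefrencies of plausible glottal tone frequencies.
    weight[(q < fs / fmax) | (q > fs / fmin)] = 0.0

    weighted = weight * cepstrum[: n // 2]
    if weighted.max() <= 0.0:
        return prev_f0                            # no usable peak: keep estimate
    return fs / int(np.argmax(weighted))
```

Feeding each frame's result back in as `prev_f0` for the next frame keeps the estimate from jumping between the true pitch period and its multiples, which is one way to obtain the continuity described above.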

The user may set the gain (0.1-4.0), the processing factor (0-100%) and the processing threshold (0-100%). Additional parameters (smoothing factor, time weighting window and overlap length) may be set in the advanced mode.
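For scripted use, the documented ranges could be checked before processing. The sketch below is hypothetical: the class name, field names and default values are assumptions, and only the ranges come from the text. The time weighting window and overlap length are omitted because no ranges are given for them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JointApproximationSettings:
    gain: float = 1.0                  # documented range 0.1 .. 4.0
    processing_factor: float = 100.0   # percent, 0 .. 100
    processing_threshold: float = 0.0  # percent, 0 .. 100
    smoothing_factor: float = 16.0     # advanced mode, 0 .. 24

    def __post_init__(self):
        limits = {
            "gain": (0.1, 4.0),
            "processing_factor": (0.0, 100.0),
            "processing_threshold": (0.0, 100.0),
            "smoothing_factor": (0.0, 24.0),
        }
        # Reject any value outside its documented range.
        for name, (low, high) in limits.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}..{high}")
```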