Multitask Noisy Speech Enhancement System


If multiple recordings of the same speech signal are to be played simultaneously, they need to be synchronised. A synchronisation module was therefore implemented in the Browser application. It uses the cross-correlation function, defined as:

where x1[n] and x2[n] are the two input signals. The lag m (in samples) at which the cross-correlation function reaches its maximum may be interpreted as an estimate of the time delay, that is, the time difference between the examined signals. The cross-correlation function is computed using the fast Fourier transform:
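A minimal sketch of this computation in Python with NumPy follows. It assumes the common convention r[m] = sum over n of x1[n]·x2[n + m]; the sign and normalisation conventions actually used by the module are not stated here. The function name `estimate_delay` is hypothetical and not part of the Browser application.

```python
import numpy as np

def estimate_delay(x1, x2):
    """Return (lag, peak): the lag m maximising r[m] = sum_n x1[n] * x2[n + m].

    A positive lag means that x2 is delayed with respect to x1.
    Assumed convention; the module's own definition may differ.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n = len(x1) + len(x2) - 1
    nfft = 1 << (n - 1).bit_length()          # zero-pad to avoid circular overlap
    spec1 = np.fft.rfft(x1, nfft)
    spec2 = np.fft.rfft(x2, nfft)
    r = np.fft.irfft(np.conj(spec1) * spec2, nfft)
    # Non-negative lags sit at the start of r, negative lags at its end.
    lags = np.concatenate((np.arange(0, len(x2)),
                           np.arange(-(len(x1) - 1), 0)))
    values = np.concatenate((r[:len(x2)], r[nfft - (len(x1) - 1):]))
    best = int(np.argmax(values))
    return int(lags[best]), float(values[best])
```

For example, if `x2` is a copy of `x1` preceded by five zero samples, the function returns a lag of 5.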

Each of the two sound files to be synchronised is divided into segments (the segment length is set in the module options). Only the first and the last segment of each file are examined. Each of these segments is divided into overlapping frames. The cross-correlation function is computed for pairs of frames: one frame from the start segment of the first file and one frame from the start segment of the second file. The largest cross-correlation maximum identifies the pair of most similar signal frames; this value also indicates the accuracy of the synchronisation. The same procedure is repeated for the end segments of the files.
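The frame-pair search described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the helper names are hypothetical, the score is normalised to the range -1 to 1 (the module's own scaling to a percentage is not specified), and `np.correlate` is used instead of the FFT for brevity.

```python
import numpy as np

def frames(segment, frame_len, step):
    """Yield (offset, frame) for the overlapping frames of one segment."""
    for start in range(0, len(segment) - frame_len + 1, step):
        yield start, segment[start:start + frame_len]

def best_frame_pair(seg1, seg2, frame_len, step):
    """Return (score, offset1, offset2, lag) for the most similar frame pair.

    score is the peak of the normalised cross-correlation (assumed measure),
    lag is the residual delay of the second frame relative to the first.
    """
    best = (-1.0, 0, 0, 0)
    for o1, f1 in frames(seg1, frame_len, step):
        for o2, f2 in frames(seg2, frame_len, step):
            norm = np.sqrt(np.dot(f1, f1) * np.dot(f2, f2))
            if norm == 0.0:
                continue                      # skip silent frames
            # r[i] corresponds to lag i - (frame_len - 1)
            r = np.correlate(f2, f1, mode="full") / norm
            k = int(np.argmax(r))
            if r[k] > best[0]:
                best = (float(r[k]), o1, o2, k - (frame_len - 1))
    return best
```

The total shift of the second segment relative to the first is then `offset2 + lag - offset1`. A smaller `step` examines more frame pairs, which corresponds to the slower, more accurate end of the trade-off.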

The second sound file is shifted on the timeline so that the most similar frames in the start and end segments are aligned. The time shifts computed for the start and end segments may differ. In that case, the length of the second signal has to be changed. This is done by resampling the second signal using linear interpolation. The resampling introduces a small amount of noise, but this does not noticeably degrade the signal quality.
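As a rough sketch of this step, the second signal can be warped with `np.interp`, which performs linear interpolation. The parameter names are assumptions introduced for illustration: `a1` and `b1` are the positions of the matched start and end frames in the first file, and `a2` and `b2` are the corresponding positions in the second file.

```python
import numpy as np

def align_second(x2, a1, b1, a2, b2, out_len):
    """Warp x2 so that its samples a2 and b2 land on output positions a1 and b1.

    When (b2 - a2) differs from (b1 - a1), the signal is stretched or
    compressed; otherwise this reduces to a plain time shift.
    """
    x2 = np.asarray(x2, dtype=float)
    rate = (b2 - a2) / (b1 - a1)              # x2 samples per output sample
    src = a2 + (np.arange(out_len) - a1) * rate
    # Positions outside x2 are filled with silence.
    return np.interp(src, np.arange(len(x2)), x2, left=0.0, right=0.0)
```

With `rate` equal to 1 the start and end shifts agree and only the offset changes; any other value changes the length of the second signal, as described above.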

The synchronisation module is started by selecting the synchronisation option in the Browser application menu. The files to be synchronised are selected by clicking the boxes that represent the sound files. The names of the selected files are displayed in a list. The first selected file is used as the reference for synchronisation, and the other files are synchronised to it. In the main window of the synchronisation module, the user may select the frame length and step size using the 'synchronisation factor' slider (the current values are displayed as a tooltip). The range of available values, from fast (coarse) to slow (accurate), as well as the length of the start and end signal sections, may be defined in the advanced options window.

The results of the synchronisation process for each file are presented in the module window; 100% indicates the most accurate synchronisation. If the results are above 60%, the synchronisation factor should be increased (the slider should be moved towards the 'accurate' option). If a synchronisation result is below 50%, the length of the start and end sections should be increased. Higher synchronisation factor values (near 'accurate') and longer start and end sections result in longer processing times.