
Phase-Aware Signal Processing in Speech Communication: Theory and Practice
Mowlaee, Pejman
Stahl, Johannes
Kulmer, Josef
Mayer, Florian
An overview of the challenging new topic of phase-aware signal processing.

Speech communication technology is a key factor in human-machine interaction, digital hearing aids, mobile telephony, and automatic speech/speaker recognition. With the proliferation of these applications, there is a growing need for advanced methodologies that can push the limits of conventional solutions, which rely on processing the signal magnitude spectrum. Single-Channel Phase-Aware Signal Processing in Speech Communication provides a comprehensive guide to phase signal processing. It reviews the history of phase importance in the literature, the basic problems in phase processing, and the fundamentals of phase estimation, together with several applications that demonstrate the usefulness of phase processing.

Key features:
- Analyzes recent advances demonstrating the positive impact of phase-based processing in pushing the limits of conventional methods.
- Offers unique coverage of the historical context and fundamentals of phase processing, and provides several examples in speech communication.
- Provides a detailed review of many references and discusses the existing signal processing techniques required to deal with phase information in different speech applications.
- Supplies various examples and MATLAB® implementations delivered within the PhaseLab toolbox.

Single-Channel Phase-Aware Signal Processing in Speech Communication is a valuable single-source reference for students, non-expert DSP engineers, academics and graduate students.
CONTENTS:

Preface
List of Symbols

PART I HISTORY, THEORY AND CONCEPTS

1 Introduction: Phase Processing, History (Pejman Mowlaee)
  1.1 Chapter Organization
  1.2 Conventional Speech Communication
  1.3 Historical Overview on Unimportance/Importance of Phase
  1.4 Importance of Phase in Speech Processing
    1.4.1 Speech Enhancement
    1.4.2 Speech Watermarking
    1.4.3 Speech Coding
    1.4.4 Artificial Bandwidth Extension
    1.4.5 Speech Synthesis
    1.4.6 Speech/Speaker Recognition
  1.5 Structure of the Book
  1.6 Experiments
    1.6.1 Experiment 1: Phase Unimportance in Speech Enhancement
    1.6.2 Experiment 2: Effects of Phase Modification
    1.6.3 Experiment 3: Mismatched Window
    1.6.4 Experiment 4: Phase Spectrum Compensation
  1.7 Summary
  References

2 Fundamentals of Phase-Based Signal Processing (Pejman Mowlaee)
  2.1 Chapter Organization
  2.2 STFT Phase: Background and Some Remarks
    2.2.1 Short-Time Fourier Transform
    2.2.2 Fourier Analysis of Speech: STFT Amplitude and Phase
  2.3 Phase Unwrapping
    2.3.1 Problem Definition
    2.3.2 Remarks on Phase Unwrapping
    2.3.3 Phase Unwrapping Solutions
  2.4 Useful Phase-Based Representations
    2.4.1 Group Delay (GD) Representations
    2.4.2 Instantaneous Frequency (IF)
    2.4.3 Baseband Phase Difference (BPD)
    2.4.4 Harmonic Phase Decomposition
    2.4.5 Phasegram: Unwrapped Harmonic Phase
    2.4.6 Relative Phase Shift (RPS)
    2.4.7 Phase Distortion (PD)
  2.5 Experiments
    2.5.1 Experiment 1: One-Dimensional Phase Unwrapping
    2.5.2 Experiment 2: Comparative Study of Phase Unwrapping Methods
    2.5.3 Experiment 3: Comparative Study on Group Delay Spectra
    2.5.4 Experiment 4: Circular Statistics of the Harmonic Phase
    2.5.5 Experiment 5: Circular Statistics of the Spectral Phase
    2.5.6 Experiment 6: Comparative Study of Phase Representations
  2.6 Summary
  References

3 Phase Estimation Fundamentals (Josef Kulmer and Pejman Mowlaee)
  3.1 Chapter Organization
  3.2 Phase Estimation Fundamentals
    3.2.1 Background and Fundamentals
    3.2.2 Key Examples: Phase Estimation Problem
    3.2.3 Phase Estimation
  3.3 Existing Solutions
    3.3.1 Iterative Signal Reconstruction
    3.3.2 Phase Reconstruction Across Time
    3.3.3 Phase Reconstruction Across Frequency
    3.3.4 Phase Randomization
    3.3.5 Geometry-Based Phase Estimation
    3.3.6 Least Squares (LS)
    3.3.7 Tempo-Spectral Smoothing of Unwrapped Phase
  3.4 Experiments
    3.4.1 Experiment 1: Monte Carlo Simulation Comparing ML and MAP
    3.4.2 Experiment 2: Monte Carlo Simulation on Window Impact
    3.4.3 Experiment 3: Phase Recovery Using the Griffin-Lim Algorithm
    3.4.4 Experiment 4: Phase Estimation for Speech Enhancement: A Comparative Study
  3.5 Summary
  References

PART II APPLICATIONS

4 Phase Processing for Single-Channel Speech Enhancement (Johannes Stahl and Pejman Mowlaee)
  4.1 Introduction and Chapter Organization
  4.2 Speech Enhancement in the STFT Domain: General Concepts
    4.2.1 A Priori SNR Estimation
    4.2.2 Noise PSD Estimation
  4.3 Conventional Speech Enhancement
    4.3.1 Statistical Model
    4.3.2 Short-Time Spectral Amplitude Estimation
  4.4 Phase-Sensitive Speech Enhancement
    4.4.1 Phase Estimation for Signal Reconstruction
    4.4.2 Spectral Amplitude Estimation Given the STFT-Phase
    4.4.3 Iterative Closed-Loop Phase-Aware Single-Channel Speech Enhancement
    4.4.4 Incorporating Voiced/Unvoiced Uncertainty
    4.4.5 Uncertainty in Prior Phase Information
    4.4.6 Stochastic-Deterministic MMSE-STFT Speech Enhancement
  4.5 Experiments
    4.5.1 Experiment 1: Proof-of-Concept
    4.5.2 Experiment 2: Consistency
    4.5.3 Experiment 3: Sensitivity Analysis
  4.6 Summary
  References

5 Phase Processing for Single-Channel Source Separation (Pejman Mowlaee and Florian Mayer)
  5.1 Chapter Organization
  5.2 Why Single-Channel Source Separation?
    5.2.1 Background
    5.2.2 Problem Formulation
  5.3 Conventional Single-Channel Source Separation (SCSS)
    5.3.1 Computational Auditory Scene Analysis (CASA)
    5.3.2 Model-Based SCSS
  5.4 Phase Processing for Single-Channel Source Separation
    5.4.1 Complex Matrix Factorization Methods
    5.4.2 Phase Importance for Signal Reconstruction
    5.4.3 Phase-Aware Time-Frequency Masks
    5.4.4 Phase Importance in Signal Interaction Model
  5.5 Experiments
    5.5.1 Experiment 1: Phase Estimation for Signal Reconstruction: Proof-of-Concept
    5.5.2 Experiment 2: Comparative Study on GLA-Based Phase Reconstruction Methods
    5.5.3 Experiment 3: Phase-Aware Time-Frequency Mask
    5.5.4 Experiment 4: Phase-Sensitive Interaction Functions
    5.5.5 Experiment 5: Complex Matrix Factorization
  5.6 Summary
  References

6 Phase-Aware Speech Quality Estimation (Pejman Mowlaee)
  6.1 Chapter Organization
  6.2 Introduction on Speech Quality Estimation
    6.2.1 General Definition of Speech Quality
    6.2.2 Speech Quality Estimators: Amplitude, Phase, or Both?
  6.3 Conventional Instrumental Metrics for Speech Quality Estimation
    6.3.1 Perceived Quality
    6.3.2 Speech Intelligibility
  6.4 Why Phase-Aware Metrics?
    6.4.1 Phase and Speech Intelligibility
    6.4.2 Phase and Perceived Quality
  6.5 New Phase-Aware Metrics
    6.5.1 Group Delay Deviation (GDD)
    6.5.2 Instantaneous Frequency Deviation (IFD)
    6.5.3 Unwrapped MSE (UnMSE)
    6.5.4 Phase Deviation (PD)
    6.5.5 UnHPSNR and UnRMSE
  6.6 Subjective Tests
    6.6.1 CCR Test
    6.6.2 MUSHRA Test
    6.6.3 Statistical Analysis
    6.6.4 Speech Intelligibility Test
    6.6.5 Evaluation of Speech Quality Measures
  6.7 Experiments
    6.7.1 Experiment 1: Impact of Phase Modifications on Speech Quality
    6.7.2 Experiment 2: Phase and Perceived Quality Estimation
    6.7.3 Experiment 3: Phase and Speech Intelligibility Estimation
    6.7.4 Experiment 4: Evaluating the Phase Estimation Accuracy
  6.8 Summary
  References

7 Conclusion and Future Outlook (Pejman Mowlaee)
  7.1 Chapter Organization
  7.2 Renaissance of Phase-Aware Signal Processing: Decline and Rise
  7.3 Directions for Future Research
    7.3.1 Involved Research Disciplines
    7.3.2 Related Research Disciplines
  7.4 Sum Up
  References

A MATLAB Toolbox
  A.1 Chapter Organization
  A.2 Phase-Lab Toolbox
    A.2.1 MATLAB® Codes
    A.2.2 Additional Material
  References

Index
- ISBN: 978-1-119-23881-2
- Publisher: Wiley-Blackwell
- Binding: Hardcover
- Pages: 256
- Publication date: 16 December 2016
- No. of volumes: 1
- Language: English