Speech in mobile and pervasive environments

Rajput, Nitendra
Nanavati, Amit Anil

97,96 € (VAT incl.)

This book provides a cross-disciplinary reference to speech in mobile and pervasive environments.

Speech in Mobile and Pervasive Environments addresses the issues related to speech processing on resource-constrained mobile devices. These include speech recognition in noisy environments, specialised hardware for speech recognition and synthesis, the use of context to enhance recognition and user experience, and the emerging software standards required for interoperability. This book takes a multi-disciplinary look at these matters, while offering an insight into the opportunities and challenges of speech processing in mobile environs. In developing regions, speech-on-mobile is set to play a momentous role, socially and economically; the authors discuss how voice-based solutions and applications offer a compelling and natural solution in this setting.

Key Features
  • Provides a holistic overview of all speech technology related topics in the context of mobility
  • Brings together the latest research in a logically connected way in a single volume
  • Covers hardware, embedded recognition and synthesis, distributed speech recognition, software technologies, contextual interfaces
  • Discusses multimodal dialogue systems and their evaluation
  • Introduces speech in mobile and pervasive environments for developing regions

This book provides a comprehensive overview for beginners and experts alike. It can be used as a textbook for advanced undergraduate and postgraduate students in electrical engineering and computer science. Students, practitioners or researchers in the areas of mobile computing, speech processing, voice applications, human-computer interfaces, and information and communication technologies will also find this reference insightful. For experts in the above domains, this book complements their strengths. In addition, the book will serve as a guide to practitioners working in telecom-related industries.
CONTENTS:

About the Series Editors
List of Contributors
Foreword
Preface
Acknowledgments

1 Introduction
  1.1 Application design
  1.2 Interaction modality
  1.3 Speech processing
  1.4 Evaluations

2 Mobile Speech Hardware: The Case for Custom Silicon
  2.1 Introduction
  2.2 Mobile hardware: Capabilities and limitations
    2.2.1 Looking inside a mobile device: Smartphone example
    2.2.2 Processing limitations
    2.2.3 Memory limitations
    2.2.4 Power limitations
    2.2.5 Silicon technology and mobile hardware
  2.3 Profiling existing software systems
    2.3.1 Speech recognition overview
    2.3.2 Profiling techniques summary
    2.3.3 Processing time breakdown
    2.3.4 Memory usage
    2.3.5 Power and energy breakdown
    2.3.6 Summary
  2.4 Recognizers for mobile hardware: Conventional approaches
    2.4.1 Reduced-resource embedded recognizers
    2.4.2 Network recognizers
    2.4.3 Distributed recognizers
    2.4.4 An alternative approach: Custom hardware
  2.5 Custom hardware for mobile speech recognition
    2.5.1 Motivation
    2.5.2 Hardware implementation: Feature extraction
    2.5.3 Hardware implementation: Feature scoring
    2.5.4 Hardware implementation: Search
    2.5.5 Hardware implementation: Performance and power evaluation
    2.5.6 Hardware implementation: Summary
  2.6 Conclusion
  Bibliography

3 Embedded Automatic Speech Recognition and Text-to-Speech Synthesis
  3.1 Automatic speech recognition
  3.2 Mathematical formulation
  3.3 Acoustic parameterization
    3.3.1 Landmark-based approach
  3.4 Acoustic modeling
    3.4.1 Unit selection
    3.4.2 Hidden Markov models
  3.5 Language modeling
  3.6 Modifications for embedded speech recognition
    3.6.1 Feature computation
    3.6.2 Likelihood computation
  3.7 Applications
    3.7.1 Car navigation systems
    3.7.2 Smart homes
    3.7.3 Interactive toys
    3.7.4 Smartphones
  3.8 Text-to-speech synthesis
  3.9 Text to speech in a nutshell
  3.10 Front end
  3.11 Back end
    3.11.1 Rule-based synthesis
    3.11.2 Data-driven synthesis
    3.11.3 Statistical parametric speech synthesis
  3.12 Embedded text-to-speech
  3.13 Evaluation
  3.14 Summary
  Bibliography

4 Distributed Speech Recognition
  4.1 Elements of distributed speech processing
  4.2 Front-end processing
    4.2.1 Device requirements
    4.2.2 Transmission issues in DSR
    4.2.3 Back-end processing
  4.3 ETSI standards
    4.3.1 Basic front-end standard ES 201 108
    4.3.2 Noise-robust front-end standard ES 202 050
    4.3.3 Tonal-language recognition standard ES 202 211
  4.4 Transfer protocol
    4.4.1 Signaling
    4.4.2 RTP payload format
  4.5 Energy-aware distributed speech recognition
  4.6 ESR, NSR, DSR
  Bibliography

5 Context in Conversation
  5.1 Context modeling and aggregation
    5.1.1 An example of composer specification
  5.2 Context-based speech applications: Conspeakuous
    5.2.1 Conspeakuous architecture
    5.2.2 B-Conspeakuous
    5.2.3 Learning as a source of context
    5.2.4 Implementation
    5.2.5 A tourist portal application
  5.3 Context-based speech applications: Responsive information architect
  5.4 Conclusion
  Bibliography

6 Software: Infrastructure, Standards, Technologies
  6.1 Introduction
  6.2 Mobile operating systems
  6.3 Voice over internet protocol
    6.3.1 Implications for mobile speech
    6.3.2 Sample speech applications
    6.3.3 Access channels
  6.4 Standards
  6.5 Standards: VXML
  6.6 Standards: VoiceFleXML
    6.6.1 Brief overview of speech-based systems
    6.6.2 System architecture
    6.6.3 System architecture: VoiceFleXML interpreter
    6.6.4 VoiceFleXML: Voice browser
    6.6.5 A prototype implementation
  6.7 SAMVAAD
    6.7.1 Background and problem setting
    6.7.2 Reorganization algorithms
    6.7.3 Minimizing the number of dialogs
    6.7.4 Hybrid call-flows
    6.7.5 Minimally altered call-flows
    6.7.6 Device-independent call-flow characterization
    6.7.7 SAMVAAD: Architecture, implementation and experiments
    6.7.8 Splitting dialog call-flows
  6.8 Conclusion
  6.9 Summary and future work
  Bibliography

7 Architecture of Mobile Speech-Based and Multimodal Dialog Systems
  7.1 Introduction
  7.2 Multimodal architectures
  7.3 Multimodal frameworks
  7.4 Multimodal mobile applications
    7.4.1 Mobile companion
    7.4.2 MUMS
    7.4.3 TravelMan
    7.4.4 Stopman
  7.5 Architectural models
    7.5.1 Client-server systems
    7.5.2 Dialog description systems
    7.5.3 Generic model for distributed mobile multimodal speech systems
  7.6 Distribution in the Stopman system
  7.7 Conclusions
  Bibliography

8 Evaluation of Mobile and Pervasive Speech Applications
  8.1 Introduction
    8.1.1 Spoken interaction
    8.1.2 Mobile-use context
    8.1.3 Speech and mobility
  8.2 Evaluation of mobile speech-based systems
    8.2.1 User interface evaluation methodology
    8.2.2 Technical evaluation of speech-based systems
    8.2.3 Usability evaluations
    8.2.4 Subjective metrics and objective metrics
    8.2.5 Laboratory and field studies
    8.2.6 Simulating mobility in the laboratory
    8.2.7 Studying social context
    8.2.8 Long- and short-term studies
    8.2.9 Validity
  8.3 Case studies
    8.3.1 STOPMAN evaluation
    8.3.2 TravelMan evaluation
    8.3.3 Discussion
  8.4 Theoretical measures for dialog call-flows
    8.4.1 Introduction
    8.4.2 Dialog call-flow characterization
    8.4.3 {m,q,a}-characterization
    8.4.4 {m,q,a}-complexity
    8.4.5 Call-flow analysis using {m,q,a}-complexity
  8.5 Conclusions
  Bibliography

9 Developing Regions
  9.1 Introduction
  9.2 Applications and studies
    9.2.1 VoiKiosk
    9.2.2 HealthLine
    9.2.3 The spoken web
    9.2.4 TapBack
  9.3 Systems
  9.4 Challenges
  Bibliography

Index

  • ISBN: 978-0-470-69435-0
  • Publisher: John Wiley & Sons
  • Binding: Hardcover
  • Pages: 312
  • Publication date: 10/02/2012
  • Number of volumes: 1
  • Language: English