Deep Learning in Visual Speech Recognition: A Review of Recent Developments and Performance Analysis
Abstract
Visual Speech Recognition (VSR) is especially important in situations where acoustic signals are distorted, for example, in noisy environments or for people with hearing loss. This review focuses on a central difficulty of VSR: many acoustically distinct phonemes look nearly identical on the lips. A viseme is a visual speech unit onto which several such phonemes map, so phonemes that share a viseme cannot be reliably distinguished from visual information alone. We discuss phoneme-to-viseme mapping and the effect of these visual ambiguities on VSR performance when acoustic information is degraded or absent. We then survey approaches for improving VSR accuracy, including data-driven machine learning and deep learning methods, fusion of the visual stream with other sensory inputs, and recognition systems that exploit linguistic context. We also review existing VSR systems and benchmarks, including LipNet and Lip Reading in the Wild (LRW), and their limitations in practical scenarios. Future directions include jointly exploiting visual and degraded acoustic signals, novel neural network architectures, speaker-adaptive (personalized) VSR systems, and improved real-time processing. The purpose of this review is to give a clear picture of the existing literature on the challenges and opportunities for improving VSR accuracy in low-quality acoustic environments so that better communication technologies can be developed.
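To illustrate why visemes are problematic, the sketch below shows a deliberately simplified many-to-one phoneme-to-viseme mapping. The viseme groupings and labels here are a common textbook-style example chosen for illustration, not the specific mapping analyzed in this review; real mappings cover the full phoneme inventory and vary across studies.

```python
# Illustrative (simplified) many-to-one phoneme-to-viseme mapping.
# The groupings below are assumptions for demonstration only.
PHONEME_TO_VISEME = {
    # Bilabials /p/, /b/, /m/ look nearly identical on the lips.
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    # Labiodentals /f/, /v/ share the same lip-teeth configuration.
    "f": "labiodental", "v": "labiodental",
    # Alveolars /t/, /d/, /n/ are articulated largely out of view.
    "t": "alveolar", "d": "alveolar", "n": "alveolar",
}

def visual_transcription(phonemes):
    """Collapse a phoneme sequence to the viseme sequence a lipreader observes."""
    return [PHONEME_TO_VISEME.get(p, p) for p in phonemes]

# "bat" and "mat" become visually indistinguishable:
print(visual_transcription(["b", "ae", "t"]))  # ['bilabial', 'ae', 'alveolar']
print(visual_transcription(["m", "ae", "t"]))  # ['bilabial', 'ae', 'alveolar']
```

Because distinct words collapse to the same viseme sequence, a purely visual recognizer must rely on linguistic context or additional modalities to resolve the ambiguity, which is the motivation for several of the methods surveyed in this review.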