Multimodal Models for Machine Perception of Speech
Machine perception refers to how computers interpret data in ways modeled on human sensory capabilities. AI-based systems that emulate human perception can be more robust and provide a better experience for users. This project will study multimodal models that combine audio and visual information, mimicking human perception, in order to better understand the relationship between sensory data capture, synthesis, and model learning. The research will explore open challenges in this space with a view to addressing real-world problems in the field of dysarthric speech production and perception.
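The abstract does not commit to a particular architecture, but the idea of combining audio and visual streams can be illustrated with a minimal feature-level fusion sketch. All names, dimensions, and the random weights below are illustrative assumptions (e.g. MFCC-sized audio features and a lip-region visual embedding), not part of the proposed research:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-frame feature sizes: 39-dim audio features (MFCC-like)
# and a 128-dim visual embedding of the speaker's lip region.
AUDIO_DIM, VISUAL_DIM, HIDDEN_DIM, N_CLASSES = 39, 128, 64, 10

# Randomly initialised weights stand in for parameters a real model would learn.
W_fuse = rng.standard_normal((AUDIO_DIM + VISUAL_DIM, HIDDEN_DIM)) * 0.01
W_out = rng.standard_normal((HIDDEN_DIM, N_CLASSES)) * 0.01

def fused_predict(audio_feat, visual_feat):
    """Concatenate audio and visual features (feature-level fusion),
    apply a ReLU hidden layer, and return softmax class probabilities."""
    fused = np.concatenate([audio_feat, visual_feat], axis=-1)
    hidden = np.maximum(fused @ W_fuse, 0.0)  # ReLU
    logits = hidden @ W_out
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

# A batch of 4 frames, one audio and one visual feature vector per frame.
audio = rng.standard_normal((4, AUDIO_DIM))
visual = rng.standard_normal((4, VISUAL_DIM))
probs = fused_predict(audio, visual)
print(probs.shape)  # one probability distribution per frame
```

In practice the two streams would come from separate learned encoders, and fusion could happen earlier or later in the pipeline; this sketch only shows the simplest concatenate-and-project variant.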