Voice AI

Research Topic

Emotional Voice Conversion

Emotional Voice Conversion is the process of changing the emotion expressed in a spoken utterance while preserving its linguistic content and speaker identity. Recent research combines deep learning with representation learning to achieve more natural and subtle emotional expression, and is expanding toward modeling how emotion varies over time, generating mixed emotions, and controlling prosody and intonation according to the target emotion.

Research Topic

Spoken Dialogue System

A Spoken Dialogue System is an AI system that enables natural human-computer interaction through speech by integrating Automatic Speech Recognition (ASR), natural language understanding, and spoken response generation. Recent advances combine LLMs with deep learning for speech to improve contextual reasoning, emotional adaptability, and real-time conversational fluency.
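
The integration described above can be sketched as a simple turn loop. This is only an illustrative sketch, not the architecture of any particular system: the class `SpokenDialogueSystem`, the `Turn` record, and the four `toy_*` functions are hypothetical stand-ins, and the "audio" here is just UTF-8 bytes so that the example runs without any speech models.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    """One completed exchange, kept so later turns can use context."""
    user_text: str
    intent: str
    reply: str


@dataclass
class SpokenDialogueSystem:
    """Hypothetical cascade: ASR -> NLU -> response generation -> speech output."""
    asr: Callable[[bytes], str]
    nlu: Callable[[str], str]
    respond: Callable[[str, str, List[Turn]], str]
    tts: Callable[[str], bytes]
    history: List[Turn] = field(default_factory=list)

    def step(self, audio: bytes) -> bytes:
        text = self.asr(audio)                          # speech -> text
        intent = self.nlu(text)                         # text -> intent label
        reply = self.respond(text, intent, self.history)  # context-aware reply
        self.history.append(Turn(text, intent, reply))
        return self.tts(reply)                          # text -> speech


# Toy stand-ins (assumptions for illustration only; real components
# would be trained ASR, NLU/LLM, and speech synthesis models).
def toy_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_nlu(text: str) -> str:
    return "greeting" if "hello" in text.lower() else "other"


def toy_respond(text: str, intent: str, history: List[Turn]) -> str:
    if intent == "greeting":
        return "Hello! How can I help?"
    return f"You said: {text}"


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


system = SpokenDialogueSystem(toy_asr, toy_nlu, toy_respond, toy_tts)
```

Calling `system.step(b"hello there")` runs one full turn and stores it in `system.history`. Passing the components in as callables is a design choice made for this sketch so that any stage, for example an LLM-based `respond`, could be swapped in without touching the loop.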

Model overview
Model structure