Emotional Voice Conversion
Emotional Voice Conversion (EVC) transforms the emotion expressed in an utterance into a different one while preserving its linguistic content and speaker identity. Recent research combines deep learning with representation learning to achieve more natural and subtle emotional expression, and is expanding toward modeling temporal variation in emotion, generating mixed emotions, and controlling prosody and intonation according to the target emotion.
Spoken Dialogue System
A Spoken Dialogue System is an AI system that enables natural human-computer interaction through speech by integrating Automatic Speech Recognition (ASR), natural language understanding, and spoken response generation. Recent advances combine large language models (LLMs) with deep learning for speech to enhance contextual reasoning, emotional adaptability, and real-time conversational fluency.
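As a rough illustration of how these three stages fit together, the sketch below wires them into a single turn-taking loop. It is a minimal sketch under stated assumptions, not a description of any particular system: `stub_asr`, `stub_nlu`, `stub_respond`, `stub_tts`, and `DialogueSystem` are hypothetical names, and each stub is a toy stand-in for a real model (for example, the ASR stub simply decodes bytes as text).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    user_text: str      # ASR output
    intent: str         # NLU output
    reply_text: str     # generated response text
    reply_audio: bytes  # synthesized response (placeholder bytes here)


def stub_asr(audio: bytes) -> str:
    # Hypothetical stand-in for a speech recognizer:
    # pretends the "audio" is UTF-8 text.
    return audio.decode("utf-8").strip().lower()


def stub_nlu(text: str) -> str:
    # Hypothetical keyword-based intent classifier.
    words = text.split()
    if "hello" in words or "hi" in words:
        return "greet"
    if "weather" in words:
        return "ask_weather"
    return "unknown"


def stub_respond(intent: str, history: List[Turn]) -> str:
    # Hypothetical response generator; a real system might call an LLM
    # here and condition on the dialogue history.
    if intent == "greet":
        return "Hello again." if history else "Hello! How can I help?"
    if intent == "ask_weather":
        return "I cannot check the weather in this sketch."
    return "Sorry, I did not understand that."


def stub_tts(text: str) -> bytes:
    # Hypothetical stand-in for speech synthesis: returns encoded text.
    return text.encode("utf-8")


@dataclass
class DialogueSystem:
    asr: Callable[[bytes], str] = stub_asr
    nlu: Callable[[str], str] = stub_nlu
    respond: Callable[[str, List[Turn]], str] = stub_respond
    tts: Callable[[str], bytes] = stub_tts
    history: List[Turn] = field(default_factory=list)

    def step(self, audio: bytes) -> Turn:
        """Run one user turn through ASR, NLU, response generation, TTS."""
        text = self.asr(audio)
        intent = self.nlu(text)
        reply = self.respond(intent, self.history)
        turn = Turn(text, intent, reply, self.tts(reply))
        self.history.append(turn)
        return turn
```

Because each stage is a plain callable, any stub can be swapped for a real component without changing the loop; the `history` list is where contextual reasoning across turns would draw its input.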