
A team from Duke Kunshan University has won awards at an international competition that tests speech recognition technology.
Led by Ming Li, an associate professor of electrical and computer engineering and head of the Speech and Multimodal Intelligent Information Processing (SMIIP) Lab at DKU, the team picked up awards in two tracks of the 2023 VoxCeleb Speaker Recognition Challenge: semi-supervised domain adaptation for speaker recognition, and speaker diarization.
“Deep learning-based speaker recognition technology is widely applicable in everyday scenarios such as smart homes, intelligent customer service, and smart cities,” said Li. “The DKU SMIIP team has been dedicated to research in this field and has won top international competitions in speaker recognition multiple times.”
Li worked on the competition-winning entries with Yuke Lin, Ming Cheng, Xiaoyi Qin, and Ze Li, all students from the DKU SMIIP Lab and part of a joint graduate training program between Wuhan University and Duke Kunshan. They also worked with MaShang Consumer Finance, a technology-driven financial company, which helped the team create a dataset and provided powerful computing resources during the competition.
The 2023 VoxCeleb Speaker Recognition Challenge aimed to test the ability of speech-recognition technology to understand real-world speech, using audio from YouTube interviews, news broadcasts, talk shows and debates, with background noise and laughter on some recordings.
This year’s competition, which was held in South Korea, attracted teams from international companies and research institutions including Microsoft, Samsung, Huawei, Sogou, Tencent, ByteDance, Johns Hopkins University, Ghent University, Brno University of Technology, Catalonia Institute of Technology, Yonsei University, the Institute of Acoustics of the Chinese Academy of Sciences, Shanghai Jiao Tong University, and Xiamen University.
The VoxSRC competition has been held five times, the first three organized by the Visual Geometry Group lab at the University of Oxford, in the United Kingdom, and the last two by the Korea Advanced Institute of Science & Technology, in South Korea.
Duke Kunshan University has a history of success in the competition, previously winning the speaker diarization track at VoxSRC 2022 and securing championship final places in both the self-supervised speaker recognition and speaker diarization categories at VoxSRC 2021.
“I look forward to the team achieving more success and contributing to improving people’s lives through technology,” said Li.