IEEE Fellow

Jonathan Le Roux, Ph.D.

Multi-Source Speech and Audio Processing

Advancing speech and audio signal processing technology, and its practical application, in complex real-world environments where multiple sound sources interfere with one another.

In complex acoustic environments where many people are talking at once, humans can focus on a specific speaker or topic of interest, a phenomenon known as the "cocktail party effect." Dr. Jonathan Le Roux has spent many years working to replicate this ability in machines, and has developed practical approaches, including deep clustering, that have brought about major progress in the field.
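
For readers curious about the core idea, the sketch below shows the deep clustering objective: a network embeds each time-frequency bin of the mixture spectrogram so that bins dominated by the same speaker end up close together. This is a minimal sketch in PyTorch; the function name, variable names, and shapes are illustrative assumptions, not MERL's actual code.

```python
import torch

def deep_clustering_loss(V: torch.Tensor, Y: torch.Tensor) -> torch.Tensor:
    """Deep clustering loss for one utterance (illustrative sketch).

    V: (N, D) unit-norm embeddings, one per time-frequency bin
       (N = time frames x frequency bins, D = embedding dimension).
    Y: (N, C) one-hot assignment of each bin to its dominant speaker.

    Minimizing ||V V^T - Y Y^T||_F^2 pulls together embeddings of bins
    dominated by the same speaker and pushes apart embeddings of bins
    dominated by different speakers.
    """
    # Expand the Frobenius norm so that the large N x N affinity
    # matrices V V^T and Y Y^T never need to be formed explicitly:
    # ||V V^T - Y Y^T||_F^2
    #   = ||V^T V||_F^2 - 2 ||V^T Y||_F^2 + ||Y^T Y||_F^2
    return (torch.norm(V.T @ V) ** 2
            - 2 * torch.norm(V.T @ Y) ** 2
            + torch.norm(Y.T @ Y) ** 2)
```

At test time, the learned embeddings are clustered (for example with k-means, with k set to the number of speakers), and each cluster defines a time-frequency mask that extracts one speaker from the mixture.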

He developed the first technology that could effectively separate two unknown speakers from a single-channel mixture of voices, and later extended the method to multichannel settings while remaining agnostic to the number of microphones and their geometric arrangement. These achievements represent significant milestones in the field of audio source separation. Building on this work, Jonathan also pioneered new approaches to multi-speaker automatic speech recognition (technology that recognizes speech from multiple speakers simultaneously) that use deep learning models throughout the entire processing pipeline. These technologies can be applied to a wide range of devices and applications, including smart speakers, teleconferencing systems, and mobile devices, contributing to improvements in quality of life.
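
To make the separation step concrete, here is a hypothetical inference sketch for the single-channel case: a trained embedding network (represented by the assumed callable `embed_net`) produces one embedding per time-frequency bin, k-means groups the bins by speaker, and each cluster becomes a binary mask used to reconstruct that speaker's waveform. All names, shapes, and parameters are assumptions for illustration, not MERL's implementation.

```python
import numpy as np
from scipy.signal import stft, istft
from sklearn.cluster import KMeans

def separate(mixture, sr, embed_net, n_speakers=2, n_fft=512):
    """Separate a single-channel mixture into n_speakers waveforms
    (hypothetical sketch; `embed_net` is an assumed trained model)."""
    # STFT of the mixture: X has shape (freqs, frames), complex-valued
    _, _, X = stft(mixture, fs=sr, nperseg=n_fft)
    freqs, frames = X.shape
    # One embedding per time-frequency bin, flattened row-major over
    # (freqs, frames); V is assumed to have shape (freqs * frames, D)
    V = embed_net(np.abs(X))
    # Group bins by dominant speaker, then reshape labels into a grid
    labels = KMeans(n_clusters=n_speakers, n_init=10).fit_predict(V)
    label_grid = labels.reshape(freqs, frames)
    # Keep only each speaker's bins and invert back to the time domain
    return [istft(np.where(label_grid == c, X, 0), fs=sr, nperseg=n_fft)[1]
            for c in range(n_speakers)]
```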

He was elevated to IEEE Fellow in 2024 in recognition of these technical contributions.

Reflecting on this honor, Jonathan said it reaffirmed for him that "MERL is a research powerhouse that punches above its weight despite its relatively small size, and this reinforces its well-earned reputation for excellence and innovation." He added, "Most importantly, the world-class caliber of my MERL colleagues in a wide range of fields is truly inspiring. I can walk a few steps and discuss scientific questions with a top expert on that particular topic, which often leads to cross-topic collaborations."

Jonathan and his team have set their sights on enabling machines to listen to, understand, and interact with their surroundings through acoustic signals and other data. He is passionate about advancing the modeling of acoustic environments through multimodal AI and generative AI.

This is his message to young researchers: "The most important thing for a long and successful research career is to keep having fun!"

Profile

Jonathan Le Roux earned his bachelor's and master's degrees in mathematics from the École Normale Supérieure in Paris, France. In 2009, he received his Ph.D. from the University of Tokyo and subsequently worked as a postdoctoral researcher at NTT Communication Science Laboratories. He joined MERL in 2011 and has led the Speech & Audio team since 2018. His specialty is speech and audio signal processing, with a particular focus on applying machine learning and deep learning to sound analysis, speech enhancement, and audio source separation.

Major Awards

  • MERL team wins the Generative Data Augmentation of Room Acoustics (GenDARA) 2025 Challenge
  • MERL team wins the Listener Acoustic Personalisation (LAP) 2024 Challenge
  • MERL team wins the Audio-Visual Speech Enhancement (AVSE) 2023 Challenge
  • MERL interns and researchers win ICASSP 2023 Best Student Paper Award
  • Joint CMU-MERL team wins DCASE2023 Challenge on Automated Audio Captioning
  • Best Poster Award and Best Video Award at the International Society for Music Information Retrieval Conference (ISMIR) 2020
  • Best Paper Award at the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
  • Best Student Paper Award at IEEE ICASSP 2018
  • MERL Speech team achieves world's 2nd best performance at the Third CHiME Speech Separation and Recognition Challenge (2015)
