Meta releases SeamlessM4T AI model for speech and text translation
Meta Platforms has released SeamlessM4T, an artificial intelligence model that can perform translations by recognizing different pronunciations and texts in more than 100 languages.
For now, SeamlessM4T is available to researchers and developers under an appropriate license. Subsequently, Facebook, Instagram, WhatsApp, Messenger and Threads will have these translation capabilities. Metadata from SeamlessAlign, the largest open source dataset for multimodal translation with 270,000 hours of learned speech, has also been published.
Note that last year, Meta developers released No Language Left Behind (NLLB), a text-to-text machine translation model that supports 200 languages and has since been integrated into Wikipedia as one of the translation service providers. Previously, they demonstrated a universal speech translator, which was the first direct speech-to-speech system for Southern Minh language (a dialect of Chinese). Another Meta language project is Massively Multilingual Speech, a speech recognition, identification and synthesis system for over 1100 languages. SeamlessM4T builds on the results of all these projects, providing multilingual and multimodal translation based on a single model built on a wide range of spoken data sources with state-of-the-art results. SeamlessM4T supports:
- Speech recognition for nearly 100 languages;
- Speech-to-text conversion for nearly 100 input and output languages;
- Speech-to-speech conversion, support for almost 100 input languages and 36 output languages;
- Text-to-speech translation for nearly 100 languages;
- Text-to-speech, support for almost 100 input languages and 35 output languages.
What's Your Reaction?