Meta Launches Program to Advance Speech and Translation AI

World map with pins inserted on different locations.

Meta has launched the Language Technology Partner Program, a new initiative to advance speech recognition and translation AI and help bridge language barriers. Developed in collaboration with UNESCO, the effort aims to improve AI's ability to process underserved languages, promoting inclusivity and digital communication.

The Language Technology Partner Program asks collaborators to provide extensive speech recordings, transcriptions, and written translations in diverse languages. Meta is seeking partners who can provide over 10 hours of speech data, large text corpora, and translated sentence sets. These contributions will support training AI-driven models for speech recognition and machine translation, with the end goal of open-sourcing these technologies for broad accessibility.

A girl writing on a board in different languages.
Credit: Leonardo Toshiro Okubo on Unsplash

One of the initial participants in the program is the Government of Nunavut, Canada, which is working with Meta to integrate Inuit languages such as Inuktitut and Inuinnaqtun into AI models. The program's alignment with UNESCO's International Decade of Indigenous Languages underscores the importance of preserving linguistic heritage through technology. Participants will also gain access to technical workshops run by Meta's AI researchers, offering insights into using open-source AI models for language development.

In addition to this initiative, Meta will provide an open-source machine translation benchmark for evaluating the performance of AI language models. The benchmark consists of sentences carefully selected by linguistic specialists and is currently available in seven languages. Developers and researchers are encouraged to submit translations to improve multilingual AI capabilities.

"I love you" painted in various languages on wood.
Credit: Hannah Wright on Unsplash

This latest effort builds on Meta's earlier work in AI-powered language technology. In 2022, the company announced the No Language Left Behind (NLLB) project, which extended neural machine translation to 200 languages. More recently, Meta launched the Massively Multilingual Speech (MMS) project, which scales audio transcription to over 1,100 languages and includes zero-shot speech recognition, allowing AI to transcribe languages it has not been explicitly trained on.

Meta describes these initiatives as steps toward creating AI systems that understand and respond to human needs regardless of language or cultural background. While the advancements are framed as a philanthropic effort, they also strengthen Meta's AI infrastructure, enhancing capabilities for products such as Meta AI and automated translation services on its platforms.