Fancy speaking French without learning a word? A new AI tool means you can

Meta's Seamless tool can take spoken or written communication and convert it almost instantly into another language

A live translation tool could soon enable people to hold conversations in real time despite speaking different languages.

Facebook parent company Meta has launched a tool called Seamless which is able to take spoken or written communication and convert it almost instantly into another language.

The technology works with more than 100 languages, and the Meta engineers behind it hope one day to create a real-life version of the Babel Fish from The Hitchhiker’s Guide to the Galaxy.

In the book, the Babel Fish lets its user instantly understand any language, but real-world technology has not yet matched that fictional capability.

Existing computer translation tools either handle only text, turning it from one language into another, or are slow to convert speech into audio in another language.

The latest version of Meta’s Seamless technology can turn speech from 101 languages into speech in 36 languages, and into text in 96 languages.

Data show the system translates 23 per cent more accurately in speech-to-speech tasks than other systems, according to a study published in the peer-reviewed journal Nature.

It is also 50 per cent more resilient to background noise than rival systems, such as OpenAI’s Whisper technology, the study reports.

The AI system, which was trained on almost half a million hours of translations, can translate words and sentences in one step.

Exactly how the translation tool would be used in modern technology remains unknown, but the scientists say it could be applied to podcasts, voice memos, audiobooks and more.

It is also possible that the Seamless live translation tool could be integrated into wearable devices, such as Meta’s smartglasses, made in partnership with Ray-Ban. These enable people to film and stream footage from their spectacles while also listening to audio that the glasses transmit to the ear via the skull.

The Seamless tool could, in theory, be built into this device to provide live translation as well.

“The world we live in has never been more interconnected – the global proliferation of the internet, mobile devices, communicative platforms and social media exposes individuals to more multilingual content than ever before,” the study authors write in their paper.

“The current social order places a demand on the world-readiness of a person, a measure of how competent a person is to take on the polyglot world.

“Initially developed in the context of language learning, world-readiness underscores the importance of being able to communicate in languages beyond our mother tongue for both instrumental (that is, employment or schooling) and cultural reasons (that is, to become a global citizen).

“That said, although we believe that language acquisition should remain a key mechanism for boosting our world-readiness, we acknowledge that doing so requires resources many people may not possess.”

They add that the Seamless technology could act as a “co-pilot” to help people have multilingual conversations.

The authors also say it could make more information accessible to the blind and illiterate, but they acknowledge the system performs better with some languages, genders and accents than with others.

“The performance of our system in translating slang or proper nouns may also be inconsistent across high and low-resource languages,” the Meta scientists say.

The scientists say that, in the quest for a perfect replica of the Babel Fish, it is crucial that any future speech-to-speech translation technology minimises errors, because the live nature of the translation means they cannot be revised or amended afterwards.

‘Augmentation device’

“We believe that Seamless-fuelled applications should best be viewed as an augmentation device that assists in translation rather than a tool that replaces the need for language learning or reliable human interpreters,” the authors write.

“This reminder is especially pertinent in high-stakes situations involving legal or medical decision-making.”

Tanel Alumäe, professor of speech processing at Tallinn University of Technology, who was not involved with the Meta project, said: “This speech-to-speech translation is particularly impressive because it involves an ‘end-to-end’ approach: the model can directly translate, for example, spoken English into spoken German, without first transcribing it into English text and translating it into German text.”

Dr Alumäe also praised Meta for making all the code and data from the project publicly available free of charge, allowing more researchers to use the technology.