OpenAI's ChatGPT starts speaking Welsh after glitch
ChatGPT users in the UK have been left puzzled after the chatbot started responding in Welsh to English-language queries. The unexpected glitch is the latest example of how the development of artificial intelligence systems can lead to unforeseen issues.
When English-speaking users interact with the bot using ChatGPT's new voice interface, they have been surprised to find it translating their questions into almost flawless Welsh.
Several users have encountered the issue, which was reported by the Financial Times, despite not understanding Welsh or living in or near Wales.
In February, users also complained about a bug where the bot would answer text questions in a mix of Spanish and English. The Welsh language has thrown a spanner in the works for large language models, which are known to "hallucinate" or produce nonsensical answers. This issue continues to plague generative AI systems despite years of development and billions of dollars invested.
OpenAI, backed by Microsoft and valued at $86bn earlier this year, is in a race with Google, Meta, and start-ups like Anthropic, Elon Musk's xAI, and Cohere to enhance its AI capabilities.
ChatGPT now supports multiple languages, including Icelandic, Georgian, and Macedonian. In June, the Welsh government announced a data partnership with OpenAI to improve how AI technologies function in the Welsh language.
However, OpenAI has acknowledged in a research paper on its speech recognition system that performance in Welsh is "much worse than expected". The company discovered that most of its supposedly Welsh training data for translation was "actually English audio" that had been mislabelled by the system.
Cambridgeshire entrepreneur Sarah Coward was taken aback when testing ChatGPT's new voice feature, powered by the GPT-4o model launched earlier in the year, as it unexpectedly replied in Welsh.
"I had no idea what language it was because it completely took me by surprise," Coward said. The puzzled entrepreneur asked the chatbot about its sudden switch of language, and ChatGPT replied that it believed she would be "more comfortable in that language".
OpenAI explained that the quirk lies in ChatGPT's voice transcription system, named Whisper. Speaking to the Financial Times, OpenAI admitted that the model occasionally gets confused and transcribes in a different language, with Welsh being the unexpected choice this time.
Users facing the so-called Welsh glitch have been advised to change their "Speech" setting from "auto-detect" to English, although OpenAI has stopped short of promising that this will fix the problem in every case.
"Everybody knows that ChatGPT and some large language model applications create hallucinations or inaccuracies in responses," commented Coward, whose company, In The Room, builds conversational AI-based interactions for brands.
"This is a demonstration of, to a certain extent, legitimate concern that companies should have in employing these types of technologies right now in any consumer-facing area," she added. "It could be quite damaging in terms of customer experience and ... trust."
In an OpenAI paper about its own speech recognition system, the company noted: "Welsh is an outlier with much worse than expected performance ... despite supposedly having 9,000 hours of translation data."
Further investigation revealed that "the majority of supposedly Welsh translation data is actually English audio" which had been "misclassified as Welsh by the language identification [AI] system".
When the Financial Times tested the system, ChatGPT misinterpreted an inquiry regarding cities in the UK, US and Asia as being about Wales. "It seems I misunderstood the language of your question and responded in Welsh by mistake," ChatGPT confessed when the error was pointed out. "I'll be more careful in the future."