Microsoft launches real-time translation tech
One of the great inventions in sci-fi literature was the Babel fish, created by Douglas Adams in The Hitchhiker’s Guide to the Galaxy. Apparently, “[…] if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language.”
Setting aside the commandeering of the term by a certain online service for its rather more paltry efforts at translation, simultaneous translation from language to language has remained one of the goals that computing has striven towards for many years.
The advent of artificial intelligence technology has undoubtedly brought the dream closer. Last week, Microsoft announced that work by its research teams on the type of deep learning that iteratively improves itself has been incorporated into a range of the company's services for public use.
The technology can translate between Chinese, German and English at present – with more languages to come.
Microsoft thinks AI can make people's lives better. A $25M initiative encourages developers to design products using #AI for disabled community. The company has been advancing speech-to-text to offer real-time translation during calls on Skype. via @CNET. https://t.co/0kQ1Ilqypo
— Dbrain (@dbrainio) May 17, 2018
The research teams in China and the US worked on the translation of Chinese general-interest news stories, using techniques such as translating a sentence from one language to the other and then back again, comparing the round-trip output with the original, and having the routines hone their results against the discrepancy.
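The round-trip idea can be sketched in a few lines. The translation functions below are hypothetical dict-backed stand-ins (a real system would call trained neural models); the point is the comparison step, in which a sentence is translated out and back and the reconstruction is scored against the original:

```python
import difflib

# Hypothetical stand-ins for real translation models; a production
# system would call trained neural networks here.
def translate_en_to_zh(text):
    lookup = {"hello world": "你好世界"}
    return lookup.get(text, text)

def translate_zh_to_en(text):
    lookup = {"你好世界": "hello world"}
    return lookup.get(text, text)

def round_trip_score(sentence):
    """Translate out and back, then score how much survived.

    In the training setup described, this reconstruction signal is
    fed back to both models so they improve together; here we only
    compute the score.
    """
    back = translate_zh_to_en(translate_en_to_zh(sentence))
    return difflib.SequenceMatcher(None, sentence, back).ratio()

print(round_trip_score("hello world"))  # a perfect round trip scores 1.0
```

A low score flags sentence pairs where meaning was lost in transit, which is exactly the signal the self-improving routines need.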
The researchers also used the type of self-improvement methods that humans use when they are learning a language. One such method is based on deliberation networks, which are similar to how people revise their own writing by going through it many times. Think of it like a series of proof-readings by an increasingly intelligent robot.
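The repeated proof-reading can be sketched as a loop that stops once a pass proposes no further edits. Everything here is a toy assumption: real deliberation networks use a second neural decoder that re-reads both the source sentence and the first draft, whereas the `first_pass` and `refine` functions below are hypothetical stubs:

```python
def first_pass(source):
    # Hypothetical rough draft from a first-pass decoder.
    return "the cat sit on on the mat"

def refine(source, draft):
    # Hypothetical second-pass decoder: each call repairs one error
    # found in the current draft.
    fixes = [("sit ", "sits "), ("on on", "on")]
    for bad, good in fixes:
        if bad in draft:
            return draft.replace(bad, good, 1)
    return draft

def deliberate(source, passes=3):
    # Revise the draft repeatedly, like successive proof-readings.
    draft = first_pass(source)
    for _ in range(passes):
        revised = refine(source, draft)
        if revised == draft:  # converged: no further edits proposed
            break
        draft = revised
    return draft

print(deliberate("die Katze sitzt auf der Matte"))
# → the cat sits on the mat
```

Each pass works from the previous pass's output, so later revisions can fix errors that earlier ones exposed.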
The company used human translators to cross-reference the quality of the algorithms’ results. When the system achieved parity with human translators in March 2018, the milestone was summarised by Xuedong Huang, head of Microsoft’s speech, natural language and machine translation group:
“Hitting human parity in a machine translation task is a dream that all of us have had, we just didn’t realize we’d be able to hit it so soon.”
Since then, the research group’s work has been adapted to Microsoft translation technologies that are available to developers in Azure Cognitive Services, which includes the Microsoft Translator app.
The same team has also updated Microsoft’s speech recognition systems, which have likewise reached human parity on speech-to-text conversion and can even process poor-quality, telephone-carried speech. This type of technology is expected to prove invaluable to call centers and customer care functions in the coming months and years.
There are two further systems which the company has previewed. One transcribes speech even when the speakers are not talking directly into a microphone, such as in a meeting room, and couples that with the simultaneous translation capability. The former capability is termed vision-enhanced far-field speech recognition and is available to developers via the Speech Devices SDK. The company envisages multilingual, face-to-face meetings being understood and parsed in and out of different languages at speeds which make a decent stab at continuity.
The other technology on preview is a text to speech translation system that, according to the company, “generates digital voices from text that are nearly indistinguishable from recordings of people.”
The speech and language research group premiered the combined technologies in September at Microsoft Ignite, as part of the Microsoft 365 offering (see from approximately 1:13:00):
With an increasing acceptance of all things digital, including digital assistants, the new capabilities will, no doubt, be coming soon to a customer care department near you.
29 February 2024