Machine translations lose human touch
Hyderabad: Translating between languages by machine sounds easier than it is. It is often thought of as merely a matter of replacing words in one language with the corresponding words in another. But it is far more complicated than that, because there can be many ways of saying the same thing. This is evident from the gaffes committed by Google Translate.
The popular American TV show hosted by Jimmy Fallon recently ran a segment mocking Google Translate. In it, the lyrics of an English-language song are converted to a different language and the output is then converted back to English. For example, if you translate the line “We will we will rock you” from the famous song into Telugu, the output is “Memu ninnu raksistamu”, which means “we will save you”.
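The round trip described above can be sketched in a few lines of Python. This is only an illustration: the two lookup tables are hypothetical stand-ins for a real translation system, filled with the single Telugu example quoted in this report, and the function names are invented for the sketch.

```python
# Toy phrase tables standing in for a real translation system.
# Hypothetical data: the only entries are the ones quoted in the article.
EN_TO_TE = {"we will rock you": "memu ninnu raksistamu"}
TE_TO_EN = {"memu ninnu raksistamu": "we will save you"}


def translate(text, table):
    """Look up a whole phrase; return it unchanged if the table has no entry."""
    return table.get(text.lower().strip(), text)


def round_trip(text, forward, backward):
    """Translate into the other language and then back again."""
    return translate(translate(text, forward), backward)


original = "We will rock you"
result = round_trip(original, EN_TO_TE, TE_TO_EN)
print(result)                      # we will save you
print(result == original.lower())  # False: the meaning drifted on the way back
```

Comparing the round-trip output with the original is a quick way to spot a translation that has drifted, though a mismatch alone does not say which direction introduced the error.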
Machine translation systems like Google Translate have come a long way, from IBM first translating 60 sentences from Russian to English in 1954 to Microsoft recently achieving human parity in translation from Chinese to English.
“We have a model for four Indic languages: Hindi, Bangla, Tamil and Urdu. We have gained at least 20 per cent improvement compared to the previously deployed models. This is significant in terms of end user experience,” a Microsoft spokesperson told this newspaper.
Companies previously used the Statistical Machine Translation technique, which struggled to make sense of words in their local context and in relation to other words. Very recently, giants like Google, Microsoft and Facebook have started relying on Neural Machine Translation, in which a large neural network, loosely modelled on the neurons of the brain, is built and trained.
With Neural Machine Translation, companies have achieved better translations, but the technology is still in its early days. Plain text is converted easily, but with idioms and jokes the translation loses the human touch. “Deep neural networks have large parameter spaces and need ample amounts of data in order to generalise adequately,” the Microsoft spokesperson said.
Machines still struggle to make sense of words in their local context and in relation to other words. “This is because while Indian languages are widely spoken (in terms of native speakers), most of these languages have very little or no parallel resources available to build a general-domain Machine Translation system. In the absence of readily available parallel corpora, comparable resources are often used to extract good quality parallel data from the web,” said Microsoft.
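As a rough illustration of how parallel sentences might be pulled out of comparable text, the Python sketch below pairs each English sentence with the Hindi sentence that shares the most dictionary matches. It is not Microsoft's method. The seed dictionary, the romanised Hindi sentences, the function names and the 0.5 threshold are all assumptions made up for this example.

```python
# Hypothetical seed dictionary (English -> romanised Hindi). A real system
# would rely on a far larger lexicon or on learned sentence representations.
SEED_LEXICON = {"this": "yah", "book": "kitab", "house": "ghar", "water": "pani"}


def overlap_score(en_sentence, hi_sentence, lexicon):
    """Share of dictionary-covered English words whose Hindi entry
    appears in the Hindi sentence (0.0 if no word is covered)."""
    hi_words = set(hi_sentence.lower().split())
    known = [w for w in en_sentence.lower().split() if w in lexicon]
    if not known:
        return 0.0
    hits = sum(1 for w in known if lexicon[w] in hi_words)
    return hits / len(known)


def mine_pairs(en_sentences, hi_sentences, lexicon, threshold=0.5):
    """For each English sentence keep its best-scoring Hindi sentence,
    but only if the score clears the threshold."""
    pairs = []
    for en in en_sentences:
        best = max(hi_sentences, key=lambda hi: overlap_score(en, hi, lexicon))
        if overlap_score(en, best, lexicon) >= threshold:
            pairs.append((en, best))
    return pairs


# Made-up "comparable" sentences: only one English line has a counterpart.
english = ["this is a book", "the weather is nice"]
hindi = ["mera ghar bada hai", "yah ek kitab hai"]
print(mine_pairs(english, hindi, SEED_LEXICON))
# [('this is a book', 'yah ek kitab hai')]
```

The threshold is what keeps unrelated sentences out of the mined data: set it too low and noise gets in, set it too high and few pairs survive, which is one reason scarce resources make this hard.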
Meanwhile, companies are taking the help of people to correct machine translations. The idiomatic “Call it a day”, which means “Stop working on something”, is translated into Hindi as “Ant karana”.
This update was suggested by Google Translate users, as is evident from the shield symbol that appears next to such translations. Microsoft Bing translates “Speak of the devil...” correctly as “Hum abhi isake bare men hee baat kar rahe the” (we were just talking about it).