“Sanskrit is the number one, most requested language on Google Translate, and we are finally adding it,” Isaac Caswell, senior software engineer, Google Research, told ET in an exclusive interview. “We are also adding the first languages of northeastern India, which is another rather underrepresented place.”
In addition to Sanskrit, the other Indian languages in the latest iteration of Google Translate are Assamese, Bhojpuri, Dogri, Konkani, Maithili, Mizo, and Meiteilon (Manipuri), bringing the total number of Indian languages supported by the service to 19.
The announcement was made at Google’s annual I/O conference, which began late Wednesday evening.
The latest update doesn’t cover all of India’s 22 scheduled languages, as the company had hoped, but Caswell said, “We’ve significantly narrowed the gap, at least for scheduled languages.”
All the languages added in this update are supported only in the text translation feature for now, but the company is working to add speech-to-text, camera mode and other features soon. “We are working on it, but they are not yet supported for all of these languages,” said Caswell.
Google is also working to iron out the problems with Indian language translations. “We feel that often the translations our models produce for Indian languages, when they make mistakes, are often archaic,” said Caswell.
The translations often use words that people don’t know or use regularly, he said. “We’re trying to understand (the problems) better and hopefully get our model to move towards more conversational output rather than this old-fashioned or contrived kind of thing. But we know there are other issues we’re trying to put our finger on more closely, too,” he said.
These are the first languages added using zero-shot machine translation, in which a machine learning model sees only monolingual text, meaning it learns to translate into another language without ever seeing an example.
“While this technology is impressive, it isn’t perfect. And we’ll continue to improve these models to deliver the same experience you’re used to with a Spanish or German translation, for example,” Caswell said in a blog post announcing the update.
The addition of the eight Indian languages is part of a larger update that added 24 languages to Google Translate, which now supports a total of 133 languages used worldwide.
More than 300 million people use the newly added languages: for example, Mizo is spoken by around 800,000 people in northeastern India and Lingala is spoken by over 45 million people across Central Africa. As part of the update, the indigenous languages of the Americas (Quechua, Guarani and Aymara) and an English dialect (Sierra Leonean Krio) have also been added to Google Translate.