Machine translation has been a dream of researchers since the 1950s. What started as an experiment by researchers at Georgetown University has grown into a multimillion-dollar pursuit by tech companies racing to grasp that golden ring of accurate, automated language translation.
While that dream has thus far been out of reach, companies like Google have made significant strides toward making it a reality. Just recently, Google added 13 new languages to its Translate application, bringing its total to 103 languages. With this latest addition, it can now translate for almost 99 percent of the world’s population (90 percent of the remaining languages are used by fewer than 1,000 speakers). While this is an accomplishment in its own right, the major problem remains: the translations the application provides are still quite inaccurate.
For the uninitiated, Google Translate works using statistical machine translation (SMT), in which computers analyze millions of existing translated documents from the web to “learn” vocabulary and look for patterns in a language. When asked to translate a new bit of text, the application picks the most statistically probable translation. That translated text is then stored in the application’s memory for future use.
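To make the idea of “most statistically probable translation” concrete, here is a minimal sketch in Python. It is not Google’s implementation. The tiny corpus, the phrase-level lookup, and the names `PARALLEL_CORPUS`, `build_phrase_table` and `translate` are all assumptions made for illustration. A real SMT system works with far larger data and more elaborate models, but the core step of counting how often a source phrase was translated each way and picking the most frequent option looks roughly like this:

```python
from collections import Counter, defaultdict

# Hypothetical toy "parallel corpus": (source phrase, human translation) pairs.
# A real system would mine millions of such pairs from translated documents.
PARALLEL_CORPUS = [
    ("bonjour", "hello"),
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("merci", "thank you"),
    ("merci", "thanks"),
    ("merci", "thank you"),
]


def build_phrase_table(pairs):
    """Count how often each source phrase was translated each way."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate(phrase, table):
    """Return (most frequent translation, its relative frequency), or None."""
    candidates = table.get(phrase)
    if not candidates:
        return None  # phrase never seen in the corpus
    total = sum(candidates.values())
    best, count = candidates.most_common(1)[0]
    return best, count / total


table = build_phrase_table(PARALLEL_CORPUS)
print(translate("bonjour", table))  # "hello" wins with 2 of 3 observations
print(translate("inconnu", table))  # None: no data for this phrase
```

Note that the output is only as good as the counts behind it: if the corpus contains more bad translations of a phrase than good ones, the bad one wins.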
In theory, this approach sounds like a plausible solution to the machine translation dilemma. In fact, it has proven more accurate and less time-consuming than rule-based machine translation, which relies on language and grammar rules that are manually coded into the system. However, one of the drawbacks of Google’s SMT process is that there is no way to prevent the system from analyzing and storing inaccurately translated text. Even its own translations, which are not entirely accurate, are sent back into the pool of texts from which it draws its examples, resulting in a form of cannibalization of its own accuracy.
With all of the frustration, time and money spent on trying (and failing) to come up with an accurate machine translation tool, it raises the question: Why? If Google Translate is free to use, what does Google have to gain even if it succeeds in providing high-quality machine translations? A lot, actually. As the internet’s role in every facet of our lives continues to grow, language remains one of the few roadblocks keeping businesses from truly reaching a global audience. The first tech company able to offer accurate, instantaneous machine translation of every language is guaranteed a spot on every desktop and smartphone in the world. That sort of exclusivity can be worth billions in ancillary marketing applications and advertising.
For the time being, Google’s dream of developing an accurate machine translator remains just that: a dream. While Google or a different tech company may one day discover the magic formula, the simple truth is that language is just too elusive a concept for current A.I. to process accurately and naturally. It requires that je ne sais quoi that only a human translator can provide, at least for now.