Patents & Machine Translation: Is 'Google Translate' the Future?

Brett Gallagher  September 8 2011 03:32:34 PM

In 2011, Google and the European Patent Office (EPO) signed a deal under which the EPO would provide Google with a large volume of patents and translated patent content in multiple languages to feed into Google's Translate program.

The idea behind the deal is that the large influx of translated patent content will improve Google Translate's ability to translate patent materials, providing a no-cost option to those seeking to use such content for research purposes. With the ever-rising costs of patent translations in the EU, and the lack of a final deal between EU member countries to simplify, centralize and consolidate a Europe-wide patent filing system, having such a system in place could reduce translation costs significantly.

How does 'Google Translate' work?

Google Translate is a statistical machine translation program rather than a rule-based one. Essentially, Google's statistical approach focuses on aligning sentences with their translations, and Google Translate has done this for hundreds of millions of sentence pairs. When a new text is to be translated, it is statistically matched against this huge database of pairings and a probable translation is calculated.
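To make the idea concrete, here is a minimal sketch of the statistical principle, not Google's actual implementation. It assumes a tiny, invented set of aligned German-English segments (`ALIGNED_PAIRS`) and hypothetical helper names (`build_table`, `translate_statistical`). Each known source segment is mapped to the translation it was most often paired with, along with a relative-frequency estimate of how probable that translation is.

```python
from collections import Counter, defaultdict

# Invented aligned pairs (source segment, human translation), for illustration only.
ALIGNED_PAIRS = [
    ("das verfahren", "the method"),
    ("das verfahren", "the method"),
    ("das verfahren", "the process"),
    ("die vorrichtung", "the device"),
    ("die vorrichtung", "the device"),
    ("die vorrichtung", "the apparatus"),
]


def build_table(pairs):
    """Count how often each source segment was paired with each translation."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate_statistical(segment, table):
    """Return (most frequent translation, its relative frequency).

    Returns (None, 0.0) when the segment never appeared in the training pairs.
    """
    candidates = table.get(segment)
    if not candidates:
        return None, 0.0
    best, count = candidates.most_common(1)[0]
    return best, count / sum(candidates.values())


table = build_table(ALIGNED_PAIRS)
best, probability = translate_statistical("das verfahren", table)
print(best)  # the method
```

A production engine works on far more data and scores sub-sentence fragments and word order as well, but the core step is the same: the output is whatever the training pairs make most probable, and a segment that never appeared in the data has no translation at all.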

The statistical approach has become popular because it makes large-scale, general-purpose engines cheaper to develop. Google has automated engine development so that it can take place dynamically.

The Flaws of Google Translate's Method

While Google Translate has certainly allowed for increased access to foreign language content, it still has some significant limitations. Google Translate's machine translation methodology frequently yields ridiculous translation errors, swapping in common terms for similar but nonequivalent terms, or even changing the meaning of a sentence altogether.

This is especially problematic in a field as specialized as patents. For example, a statistical machine translation tool like Google Translate is dependent on dialect. As a case in point, a statistical machine translation engine that has been trained on patents written in Mandarin Chinese as used in China cannot translate patents written in Mandarin Chinese as used in Taiwan very effectively. Even though the grammars are virtually identical and both are Mandarin, the scientific terminology used in Mainland China and Taiwan has evolved separately over the past several decades.
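The terminology gap can be illustrated with a small sketch. The term pairs below are our own illustrative examples of everyday technical vocabulary that differs between Mainland China and Taiwan; they are not drawn from any patent corpus, and `MAINLAND_TABLE` and `coverage` are hypothetical names. The sketch simply measures what fraction of terms an engine trained only on Mainland data would recognize.

```python
# Hypothetical term table learned only from Mainland Chinese training data.
MAINLAND_TABLE = {
    "激光": "laser",
    "软件": "software",
    "信息": "information",
}

# The same three concepts as commonly written in each region (illustrative).
# Note that the Taiwan terms use different words, not just a different script.
MAINLAND_TERMS = ["激光", "软件", "信息"]
TAIWAN_TERMS = ["雷射", "軟體", "資訊"]


def coverage(terms, table):
    """Fraction of terms for which the table holds a translation."""
    if not terms:
        return 0.0
    known = sum(1 for term in terms if term in table)
    return known / len(terms)


print(coverage(MAINLAND_TERMS, MAINLAND_TABLE))  # 1.0
print(coverage(TAIWAN_TERMS, MAINLAND_TABLE))    # 0.0
```

Real documents would fall somewhere between these two extremes, but the point stands: a statistical engine can only reproduce terminology it has seen, so the training data has to match the dialect of the text being translated.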

The "Symbolic" Machine Translation Method

The "symbolic" approach focuses on the grammatical structure of linguistic segments. For a text to be machine translated, it is first broken down into segments (sentences, phrases and words) based on certain grammatical rules. These segments are then translated individually and the target language text rebuilt from those parts, again following grammatical rules for the target language.
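The steps described above can be sketched in a few lines. This is a toy example under stated assumptions, not any vendor's engine: the four-word Spanish-English `LEXICON`, the part-of-speech tags, and the single reordering rule are all invented for illustration. The text is broken into tagged words, each word is translated individually, and the output is rebuilt using a target-language rule (in English, an adjective goes before the noun it modifies).

```python
# Hypothetical lexicon: source word -> (part of speech, English translation).
LEXICON = {
    "el": ("DET", "the"),
    "dispositivo": ("NOUN", "device"),
    "nuevo": ("ADJ", "new"),
    "funciona": ("VERB", "works"),
}


def segment_and_translate(sentence):
    """Split the sentence into words, tag each one, and translate it individually.

    Unknown words are tagged "UNK" and passed through unchanged.
    """
    tagged = []
    for word in sentence.lower().split():
        pos, translation = LEXICON.get(word, ("UNK", word))
        tagged.append((pos, translation))
    return tagged


def apply_target_rules(tagged):
    """Rebuild word order for English: move an adjective in front of its noun."""
    result = list(tagged)
    i = 0
    while i < len(result) - 1:
        if result[i][0] == "NOUN" and result[i + 1][0] == "ADJ":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result


def translate_symbolic(sentence):
    """Segment, translate word by word, then reorder by target grammar rules."""
    reordered = apply_target_rules(segment_and_translate(sentence))
    return " ".join(word for _, word in reordered)


print(translate_symbolic("el dispositivo nuevo funciona"))  # the new device works
```

Because the lexicon and rules are written by hand, they can be tuned for a particular document type, which is where the adaptability of this approach comes from.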

The advantage of the symbolic approach is that it can be better adapted to special projects and special kinds of texts, such as patents, as opposed to the statistical method that programs like 'Google Translate' utilize.

While programs that utilize this more specialized approach to patents are not free of charge like 'Google Translate', they are much more reliable in terms of quality of output, and certainly more affordable than full, certified translations. This is especially useful when large numbers of documents need to be reviewed but do not necessarily need to be of professional, legal-grade quality.

What the future holds for 'Google Translate' and its foray into the world of patents and patent translation remains to be seen. How quickly and effectively the influx of EPO documents into its system will yield positive results is still an unknown.

Written by David Evseeff & Alan Engel

=================================================================================

About STI, Inc.

STI has been a leading provider in the patent & IP translation industry for more than 30 years, helping numerous government and private sector clients to meet their translation, localization, and interpretation needs. STI integrates the best in human and technological resources to help our clients break down language barriers and face the demands of increasingly rapid globalization. With an effective, personalized, and proven system of project management, there is no project too big or too small.

STI is a Corporate Member of the American Translators Association (ATA) and a Charter Member of the Association of Language Companies (ALC), the only trade association in the U.S. devoted to the language industry. STI's President, Marla Schulman, is the Immediate Past President of the ALC.

STI is located in the Washington, D.C. metro area. To learn more, please contact STI at www.schreibernet.com, or call (301) 424-7737.


Copyright © Schreiber 2000-2011