The new translation tools are being implemented with some success in business applications, although I have previously expressed my skepticism about the effort. A Computerworld story online today illustrates how the new efforts are going. Most results to date have been sketchy and have still required human translators.
Ford Motor Co. has used “machine translation” software since 1998 and has translated 5 million automobile assembly instructions.
Ford uses Enterprise Global Server from Systran Software Inc., but this is just the beginning. English instructions are written by engineers and then parsed by a homegrown AI program into unambiguous, detailed directions such as “Attach bracket No. 423 using six half-inch bolts.” Each instruction is then stored as a record in a translation database.
Systran’s tool uses a reliable translation technique called rules-based translation. Such systems use bilingual dictionaries combined with electronic style guides containing usage and grammar rules. The commercial translators are then supplemented with assembly line application-specific glossaries from Ford.
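As a rough illustration of the rules-based idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the tiny English-to-Spanish dictionary, the glossary entry and the single word-order rule are invented, and nothing here reflects how Systran's product or Ford's glossaries actually work. It only shows a general dictionary, an application-specific glossary that overrides it, and one grammar rule.

```python
# Hypothetical data: a general bilingual dictionary and a small
# application-specific glossary whose entries take precedence.
GENERAL_DICTIONARY = {"the": "el", "red": "rojo", "bracket": "corchete"}
ASSEMBLY_GLOSSARY = {"bracket": "soporte"}  # assumed shop-floor term
ADJECTIVES = {"red"}  # toy stand-in for real part-of-speech information


def translate_phrase(words):
    """Translate a list of English words using one rule plus lookups."""
    # Grammar rule (simplified): Spanish adjectives usually follow the noun,
    # so swap each adjective with the word that comes after it.
    reordered = []
    i = 0
    while i < len(words):
        if words[i] in ADJECTIVES and i + 1 < len(words):
            reordered.extend([words[i + 1], words[i]])
            i += 2
        else:
            reordered.append(words[i])
            i += 1

    # Glossary entries override the general dictionary; unknown words
    # are passed through unchanged.
    lexicon = {**GENERAL_DICTIONARY, **ASSEMBLY_GLOSSARY}
    return [lexicon.get(word, word) for word in reordered]


print(" ".join(translate_phrase(["the", "red", "bracket"])))
```

The glossary is merged last so that the specialized term wins over the general one, which is the point of supplementing a commercial dictionary with in-house vocabulary.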
The glossaries are cumulative in that they are combined with “translation memories,” databases of previously translated text in the form of source and target sentence pairs. These memories are usually compiled over time by users. If the translation system (or a human) finds an exact match for the sentence it’s trying to translate, it just retrieves the corresponding sentence in the target language from the database. Near matches, or “fuzzy” matches, are flagged for review by a human translator.
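The lookup described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the `lookup` function, the 0.8 similarity threshold and the Spanish sentence are all invented for illustration, and commercial tools use their own matching methods rather than the standard library's `difflib`.

```python
from difflib import SequenceMatcher


def lookup(memory, sentence, fuzzy_threshold=0.8):
    """Return (status, translation, score) for a sentence.

    memory maps source sentences to previously approved translations.
    The threshold value is an arbitrary assumption for this sketch.
    """
    # Exact match: reuse the stored translation as it is.
    if sentence in memory:
        return ("exact", memory[sentence], 1.0)

    # Otherwise find the most similar stored source sentence.
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, source, sentence).ratio()
        if score > best_score:
            best_source, best_score = source, score

    # A near match is returned only as a suggestion for a human reviewer.
    if best_source is not None and best_score >= fuzzy_threshold:
        return ("fuzzy_needs_review", memory[best_source], best_score)
    return ("no_match", None, best_score)


# Illustrative memory with one sentence pair (the translation is made up).
memory = {
    "Attach bracket No. 423 using six half-inch bolts.":
        "Fije el soporte n.º 423 con seis pernos de media pulgada.",
}

print(lookup(memory, "Attach bracket No. 423 using six half-inch bolts.")[0])
print(lookup(memory, "Attach bracket No. 424 using six half-inch bolts.")[0])
print(lookup(memory, "Close the hood.")[0])
```

The second query differs from the stored sentence by a single digit, so it comes back as a fuzzy match: the old translation is offered, but a person still has to correct the part number.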
Statistical machine translation is a newer technique. It uses collections of documents and their translations to “train” software. Over time, these data-driven systems “learn” what makes a good translation and what doesn’t and then use probability and statistics to decide which of several possible translations of a given word or phrase is most likely correct based on context.
As a result, the systems develop their own rules and fine-tune them over time.
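To make the statistical idea concrete, here is a toy Python sketch. It is a deliberate simplification and not any vendor's method: the training pairs are invented, and the "model" is nothing more than relative-frequency counts of how often each target phrase was seen for a source phrase. Real systems also align words inside sentences and weigh the surrounding context.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count how often each target phrase translates each source phrase."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return counts


def best_translation(counts, source):
    """Return the most frequent translation and its estimated probability."""
    options = counts.get(source)
    if not options:
        return None, 0.0
    total = sum(options.values())
    target, seen = options.most_common(1)[0]
    return target, seen / total


# Invented training data: "bolt" is a fastener three times out of four.
pairs = [
    ("bolt", "perno"),
    ("bolt", "perno"),
    ("bolt", "perno"),
    ("bolt", "rayo"),
]

model = train(pairs)
print(best_translation(model, "bolt"))
```

Feeding in more example pairs shifts the counts, which is the sense in which such a system "learns" and fine-tunes itself as more data arrives.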
Google Inc. uses Systran’s rules-based software but is also developing its own statistical systems to translate to and from the more difficult non-Western languages, which differ significantly from Western and Romance languages.
Sites may link to Google’s translation service for free.
Other large companies also need and use translation tools. Microsoft Corp., for example, incorporates a rules-based natural-language parser in its Word software.
FedEx Corp. rolled out Trados GXT, a product of Maidenhead, England-based SDL International. It consists of translation memories integrated with an enterprise translation workflow system but has not obviated the need for human-based translation services.
In a newer development, increasingly sophisticated translation systems combine multiple methods. A statistical machine translation product from Language Weaver Inc. in Marina del Rey, Calif., can now be used with translation management software called WorldServer from Idiom Technologies Inc. Customers can tap into WorldServer to retrieve previously translated content from a translation memory or, when no matches are found, generate new translations through Language Weaver’s algorithms.
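The combined workflow amounts to "check the memory first, fall back to machine translation." A minimal Python sketch of that control flow follows. The function names and the stand-in engine are hypothetical; this is not the WorldServer or Language Weaver API, just the general pattern.

```python
def translate(sentence, memory, machine_translate):
    """Return (translation, origin), preferring the translation memory.

    memory maps source sentences to approved translations.
    machine_translate is any callable that produces a draft translation;
    here it stands in for a statistical engine.
    """
    if sentence in memory:
        return memory[sentence], "translation_memory"
    # No stored match: generate a new draft with the fallback engine.
    return machine_translate(sentence), "machine_translation"


def stub_engine(sentence):
    """Placeholder engine (an assumption) that just labels its output."""
    return "[MT draft] " + sentence


# Illustrative memory with one made-up sentence pair.
memory = {"Close the hood.": "Cierre el capó."}

print(translate("Close the hood.", memory, stub_engine))
print(translate("Open the trunk.", memory, stub_engine))
```

Returning the origin alongside the text lets a workflow system route machine-generated drafts to a human reviewer while reusing approved translations directly.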
At SRI International in Menlo Park, Calif., researchers are working with the U.S. Department of Defense to automate the translation of Arabic and Mandarin Chinese (structured and unstructured text as well as real-time speech) into English.
It's all Greek to me. Actually, I do know Greek, so if I can learn, perhaps machines can as well.