Good, Fast, Cheap: Translation Memory Systems Offer the Potential for All Three

by Deborah S. Ray and Eric J. Ray

Boss: Beginning in three months, we'll be making our products available to markets in Europe and Asia. We'll need all Getting Started Guides, Tutorials, User Documentation, Advanced System Reference Manuals, Online Help, and marketing materials translated into German, French, Spanish, Italian, and Japanese. I need to see a translation plan by Friday morning.

You: Consider it done!

For technical communicators exploring translation services, a relatively new technology can help provide consistency among translated documents, make the translation process more efficient, and make translation projects cost effective. Translation memory systems assist human translators by following along as a document is translated, creating a database of translated material and terminology, and allowing translators to access previously translated material easily. Using this technology, translators can translate, save, and reuse material, making the resulting translations highly consistent and the overall process more efficient and cost effective than working without this technology.


In this exploratory article, we explain the evolution toward translation memory systems, discuss why and when they're particularly useful for helping translate technical documentation, and offer guidelines for determining whether translation memory systems are appropriate for your translation needs.

Reprinted with permission from Technical Communication, the journal of the Society for Technical Communication. Originally published in May 1999 (V. 46, No. 2).


The Evolution Toward Translation Memory Systems


More than three decades ago, translation technologies emerged in scientific and technical fields, attempting to automate translation among Indo-European languages. Initially, translation technology development focused on machine translation systems, which work by translating basic words, complex words, and sentence structures from one language into another. At a basic level, these early systems could translate words, conjugate verbs, inflect articles and adjectives, and rearrange word order as necessary. And, as the technology advanced, machine translation systems could also recognize a variety of idioms and phrases and could translate verb tenses and articles more accurately.


The problems of machine translation systems
Researchers quickly discovered problems, however, in automating language translation. For example, early machine translation systems required enormous computing power, far more than was readily available to most users at that time. Because of this, machine translation technology was not widely available, practical, or affordable as a translation tool. So, for a long time, only companies that were researching the technologies or that could afford the mainframe equipment used these systems.


More importantly, despite technological advances and improvements, machine translation systems still could not adequately account for differences in language structure, differences in how languages handle singularity and plurality, ambiguity of pronoun antecedents, the variety of word contexts, and the majority of idioms and nuances inherent in all languages. The resulting machine translations were crude at best, often not even adequately producing the gist of a document.


Over the years, machine translation technology has improved somewhat and is much more widely available than it was even 10 years ago. Today, because of advances in computing technology, machine translation systems are available as software for average PCs and even available on Web sites, making the technology accessible to most companies needing or performing translation services. Figure 1, for example, shows a basic machine translation system that requires people only to enter text and to specify the originating and target languages.


In addition to "gisting" machine translation systems, such as the one in Figure 1 that provides users with the general idea of the material, other, more advanced machine translation systems are also available. These more advanced systems can provide an adequate starting point for professional translators and accommodate specialized vocabularies.


Today's machine translation technology, however, is still far from meeting the initial goal of automating translation. As the following example shows, the same factors that affected translation quality and accuracy thirty years ago still affect them today: language is simply too variable to be automatically translated with the quality and accuracy of the original. For example, the English sentence:


Technical communicators working in a networked environment should consider a new technology that can make developing, maintaining, and reusing structured documents easier than ever before.


is translated into German as:


Die technischen Verbindungen, die in einem vernetzten Klima arbeiten, sollten eine neue Technologie betrachten, die das Entwickeln, das Beibehalten und der strukturierten Dokumente ueberhaupt vorher wiederverwenden bilden kann leicht als.


which re-translated into English reads:


The technical connections, which operate in an interlaced climate, should regard a new technology, those developing, maintaining and the structured documents at all beforehand reuse to form can easily as.


Although word order, tense, and complex grammar pose problems for most machine translation systems, industry-specific terminology and context-dependent meanings, both prevalent in technical documentation, particularly hinder the translation accuracy in this example. In particular, "environment" is incorrectly translated here; however, if "environment" were used to describe a machine room or facility, then the translation (Klima, here) would be more accurate. Even more advanced machine translation systems cannot always definitively identify contexts and translate appropriately.


So, although current machine translations can give readers the overall gist of a document, provide a starting point or rough draft for translation, or automate translations of simplistic documents (like parts lists or specification sheets), they are still far from the automated, publication-quality translations hoped for years ago.


The advent of translation memory systems


Translation technologies continued to emerge, surging again in the 1980s when computing equipment and resources became more widely available. At that time, large translation agencies began using specialized, automated tools to help their translators work more efficiently and consistently. These tools assisted translators by storing previously-translated materials, providing databases of specialized terminology, and allowing translators to access this information for subsequent jobs. These specialized tools marked a different direction in the development of translation technologies: rather than automating the translation process, these technologies were intended to assist human translators, allowing them to work more efficiently, with less repetition, and with more consistency. And, with that, translation memory systems emerged.


Using a translation memory system, a translator can translate, save, and reuse translated segments. The translation memory system, which is usually integrated into a word processing program, follows along as a document is translated and saves translated sentences and passages. Then, when the translator comes across identical or similar material, the translation memory system lets the translator reuse the previously-translated material. As Figure 2 shows, the program allows translators to reuse previous translations, modify them as necessary, or choose from suggested translations.
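The save-and-reuse cycle described above can be illustrated with a minimal sketch. It is not how any particular commercial product works: the class name, the word-level similarity measure, the 0.75 cutoff, and the sample Spanish translation are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores source/target segment pairs."""

    def __init__(self, fuzzy_threshold=0.75):
        self.segments = {}  # source segment -> stored translation
        self.fuzzy_threshold = fuzzy_threshold  # assumed cutoff, not a standard

    def store(self, source, target):
        """Save a segment the translator has just translated."""
        self.segments[source] = target

    def lookup(self, source):
        """Return (match_type, suggested_translation, score) for a new segment."""
        if source in self.segments:
            return ("exact", self.segments[source], 1.0)
        best_source, best_score = None, 0.0
        for candidate in self.segments:
            # Compare word by word; real systems use more refined measures.
            score = SequenceMatcher(None, candidate.split(), source.split()).ratio()
            if score > best_score:
                best_source, best_score = candidate, score
        if best_source is not None and best_score >= self.fuzzy_threshold:
            # Similar material: offer the old translation for the translator to tweak.
            return ("fuzzy", self.segments[best_source], best_score)
        return ("none", None, best_score)


tm = TranslationMemory()
tm.store("Turn off the unit before opening the case.",
         "Apague el equipo antes de abrir la carcasa.")  # illustrative translation

print(tm.lookup("Turn off the unit before opening the case."))   # identical segment
print(tm.lookup("Turn off the unit before opening the cover."))  # similar segment
print(tm.lookup("Insert the disk into the drive."))              # unrelated segment
```

In this sketch the identical sentence is returned as is, the near-identical sentence is offered as a suggestion for the translator to modify, and the unrelated sentence falls back to translation from scratch.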


What's significant to technical communicators is that translation memory systems are particularly suitable for helping translators with highly-technical documentation, large documentation sets, multiple related documents, and documents with specialized vocabularies. Using this technology, translators can offer highly consistent, efficient, and cost-effective translations:


First, translation memory systems can help ensure that the translated documents are consistent, including common definitions, phrasings, terminology, and so on. This is particularly valuable when different translators are working on a single project and when translation jobs span many different documents. The database of translations and terminology that's developed throughout the process helps promote consistency within a documentation set, as well as consistency in subsequent translation projects.


Second, translation memory systems can help speed the overall translation process. Because these systems remember previously-translated material, they eliminate the need for re-translating boilerplate text, instructions that are identical across different documentation series, glossaries of terms common to an industry or documentation set, or warnings and cautions present throughout documentation. The translator can translate the material one time, then reuse or tweak it at subsequent instances.


Third, translation memory systems not only make the translation process consistent and efficient, but also cost effective for long-term translation needs. For example, given a documentation set, a translator might need to translate 100% of the first manual, but only 80% of the next manual, 75% of the next, and so on. Likewise, a warning message could be translated one time and reused throughout an entire documentation set or series of documents for a product. Additionally, because the translation memory system builds a database of terminology and phrasing specific to a product, company, or industry, it makes subsequent translation jobs even more cost effective because a portion of the material is already translated.
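The arithmetic behind the third point can be worked through in a few lines. The fractions come from the example above; the manual sizes are hypothetical, and the sketch ignores the (usually smaller) effort of reviewing reused material.

```python
# Share of each manual that still needs fresh translation (from the example above).
new_fraction = [1.00, 0.80, 0.75]

# Hypothetical manual sizes in words; these figures are illustrative only.
manual_words = [20_000, 20_000, 20_000]


def words_to_translate(sizes, fractions):
    """Words needing fresh translation in each manual."""
    return [round(size * fraction) for size, fraction in zip(sizes, fractions)]


per_manual = words_to_translate(manual_words, new_fraction)
total_with_memory = sum(per_manual)
total_without_memory = sum(manual_words)
savings = 1 - total_with_memory / total_without_memory

print(per_manual)
print(f"Fresh translation avoided across the set: {savings:.0%}")
```

Under these assumed sizes, the translator handles 51,000 words instead of 60,000, and the proportion avoided grows as more manuals and revisions draw on the same database.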

Considerations for Translation Memory Systems


Although translation memory systems can help ensure consistency, improve efficiency, and reduce costs, they are not ideal for all documents or situations. In particular, translation memory systems are less effective for stand-alone or short documents, for document sets that are produced using varied writing styles, or for documents that include multiple terminology contexts. In these cases, the translator would likely translate the material from scratch and not benefit at all from the memory aspects that these systems provide.


Additionally, these systems are not ideal for documents that contain what are called "fuzzy matches" in the source material, where non-identical text is used to convey identical material. One document, for example, might refer to a drop-down menu, and another to a select list. Although the translation memory system might suggest the same translation, it would most likely require the translator to re-translate the second instance or choose from suggested translations. Either way, source documents with many fuzzy matches are not ideal for translation through memory systems because the system does little to reduce the amount of work the human translator must do.


If you are exploring translation services, consider whether translation memory systems can help:

Consider the documents



  • How much material do you have? Translation memory systems are ideal for large documentation sets or multiple, related documents. For example, if your company produces a range of documents for a single product, plans to translate all documentation for current and future product versions, or the documentation is particularly lengthy, then translation memory systems might be appropriate.
  • How much of the material is reused among the documents? Translation memory systems are most effective for translating documents containing identical or similar passages. For example, you might have material in a Getting Started Guide that's reused and expanded on in a User Manual, or you might have introductory material, glossaries, warnings, examples, or instructions that are common to many documents you produce. Likewise, if your company produces closely-related products that share common features, translation memory systems can be particularly effective.
  • How consistent is the material? Translation memory systems are especially useful if the related documents are nearly identical in writing style, tone, vocabulary, and conventions. In particular, documents produced using a common template and following standard conventions would be more appropriate for translation memory systems than idiosyncratic documents would be.
  • How often will the material be updated? Translation memory systems are particularly effective for long-term translation needs (as long as style, tone, and material remain consistent). Many times, each subsequent revision requires less and less work and, therefore, is faster and more cost effective.
  • How is the updated material dispersed? Translation memory systems are ideal for translating updates that occur throughout a documentation set. If you plan to update documents by reusing the existing material and adding a 20-page "What's New" section at the end, you would likely just want to reuse the translated material and have a translator translate the new material from scratch. On the other hand, if the updated material includes new sentences integrated into paragraphs, tweaks here and there, or new paragraphs integrated into sections, translation memory systems might be effective because you cannot simply reuse the previous translation as a whole.

Consider the translator or translation agency



  • Will the translator be using a translation memory system? If you have translation needs that meet the criteria mentioned above, find out whether the translator or agency uses translation memory systems and what features they include. Depending on those features, your material may or may not be suitable for their particular systems, and the translator should help assess this.
  • Has the translator or agency translated documents about your products or within the industry? If the translator or agency has previously translated documents for your company or industry, you might already be able to benefit from databases of information and terminology. Find out what industries, products, and services the translator or agency specializes in, as well as the kinds of documents translated.
  • Can or will the translator or agency be using databases they may already have? If a translator or agency specializes in an industry or has translated documents for similar products, chances are they have a database of information and terminology that you can benefit from. Rather than having a translator start completely from scratch on your documents, the translator might have already-translated material to apply to your documents.
  • Is the translation memory system compatible with the document formats you use? Although most translation memory systems are compatible with common document formats (given some intermediary formats or conversion tools), verify that the translator or agency can readily accept and return the formats you use. In general, the more graphically oriented your tools (e.g., QuarkXPress or PageMaker), the more difficulty you're likely to encounter with document formats and layouts.
  • Who will retain ownership of any databases created during the translation of your documents? Database ownership will vary, depending on the translator or agency. If you plan to have documents translated in the future, consider obtaining (or retaining) ownership of the database or at least obtaining rights to use the material for other translation projects.
  • What format will the database be in? If you plan to retain ownership of the database and want the option of allowing another translation agency access to it for subsequent translation projects, find out whether the database format will be one that's commonly available and, thereby, likely exportable to other translation memory systems. Likewise, verify that a translator or agency can accept any translated material (such as company-specific terminology or glossaries) and use it in their translation memory system.

Consider the drawbacks



  • Establishing translation memory systems can be costly. In many cases, establishing the translation memory system for your documents, products, company, or industry takes time. Because of this, the initial translation costs may be high; however, subsequent translation projects will become more and more economical.
  • Translation memory systems are not a panacea. Although translation memory systems can make the translation process easier, faster, and more cost effective, they are not right for all documents or purposes. In fact, translation memory systems can be much more costly than other translation options if your translation needs do not meet the criteria suggested here. Consistency of material, translation job size, and long-term translation needs, as well as the specific translation memory system used, will all affect whether translation memory systems are right for your needs.

XML reaches translation memory systems


A new translation memory system database format is now available that should help translation databases become more portable. The TMX (Translation Memory eXchange) format is an XML-compliant markup language used to encode translation memory databases for exchange among products from different vendors. Although this format is not yet universally supported, it shows promise of helping the translation industry more easily exchange databases.
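To give a sense of what such an exchange file looks like, here is a small sketch that reads translation pairs from a TMX-style document. It assumes the element layout of later TMX versions (translation units in tu elements, language variants in tuv elements marked with xml:lang, and text in seg elements); the header values and the sample pair are placeholders, and the specification should be consulted before relying on any detail.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Sample file content; header attribute values are placeholders.
SAMPLE_TMX = """<?xml version="1.0"?>
<tmx version="1.4">
  <header creationtool="example-tool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Turn off the unit before opening the case.</seg></tuv>
      <tuv xml:lang="es"><seg>Apague el equipo antes de abrir la carcasa.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_tmx(text, source_lang, target_lang):
    """Return {source segment: target segment} for one language pair."""
    root = ET.fromstring(text)
    pairs = {}
    for unit in root.iter("tu"):
        # Collect the text of each language variant in this translation unit.
        by_lang = {
            variant.get(XML_LANG): variant.findtext("seg")
            for variant in unit.findall("tuv")
        }
        if source_lang in by_lang and target_lang in by_lang:
            pairs[by_lang[source_lang]] = by_lang[target_lang]
    return pairs


pairs = read_tmx(SAMPLE_TMX, "en", "es")
for source, target in pairs.items():
    print(source, "->", target)
```

Because the file is plain XML, a database exported by one tool can, in principle, be loaded by another with a reader along these lines.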


Conclusion


For technical communicators exploring translation services, translation memory systems can help translators provide consistent translations, both quickly and cost effectively in the long term. Translation memory systems are a newer translation technology that's used to assist human translators. Rather than actually doing translations, as machine translation systems do, translation memory systems follow along as a document is being translated, create a database of translations and terminology, and allow translators to reuse previously-translated material. Translation memory systems are ideal for helping translators process large documentation sets and multiple, related documents, as well as accommodating long-term translation needs. Because identical and similar material can be reused, the overall translation is highly consistent, efficient, and cost effective.

For Further Reading


General information



  • http://www.languagepartners.com/ Visit the Language Partners site for an overview of computer-assisted translation products, tools, and technologies.
  • http://www.lisa.org/ Visit the Localization Industry Standards Association site for more information about localization standards.
  • http://www.ile.com/ Visit the International Language Engineering site for background information about translations, as well as information about specific products and services.

Translation Memory Systems



Acknowledgment



We'd like to thank Bill Burns of International Language Engineering for his insights, suggestions, and feedback.


Figure 1.

Today's machine translation services are widely available and easy to use.

Figure 2.

Translation memory systems allow translators to reuse or tweak previously-translated material, making the translation process efficient and cost effective.

