Translation Memory: data protection/copyright issues

There is a growing tendency in the translation industry to ‘pool’ Translation Memories (TMs). This is partly an aspect of a broader shift towards crowdsourcing in translation and partly a result of the philosophy of online translation engines like Google Translate, which functions by collating an almost unimaginably vast corpus of translations. And, of course, a large TM is potentially a very valuable commercial asset.

It has been argued that sharing or even compiling a TM is not necessarily an infringement of copyright. In the US, copyright law states that a copy or a derivation must be at least substantially similar to the original version for its creation or use to constitute a violation. A short phrase or expression, particularly if it is already in general use, is therefore not copyrightable, and its reuse or adaptation is considered to be fair and acceptable. On this argument, because the segments into which a TM breaks down a document are generally only a few words long, creating or sharing them is not illegal. Furthermore, the argument runs, it should be impossible to reconstruct a text from segments pooled as part of a larger, shared TM.

Regardless of the legality or otherwise of the practice of pooling (or selling) TMs, most respectable Language Service Providers (LSPs) do not do so. There are various reasons for this.

Firstly, it could theoretically be possible to reconstruct the entire source text from the segments into which it is broken down. Some LSPs use a ‘scrambling’ algorithm to prevent this, but scrambling can only be applied once the original has been aligned.

Secondly, even one sentence could contain confidential information: ‘So, in brief, the secret recipe for X fizzy drink is…’

Thirdly, and most importantly, sharing TMs with other companies constitutes a gross violation of the confidentiality agreement which any reputable LSP enters into with its customers. It is of paramount importance that a relationship of trust exists between these two partners from the outset and at every stage in the translation process. No matter how ‘safe’ the LSP thinks it is to pool its TMs, or how much benefit it may derive from being part of such a pool, sharing this information is still considered bad practice, to say the least.
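To make the ‘scrambling’ mentioned in the first point more concrete, here is a minimal sketch in Python. It is not taken from any particular LSP or TM tool; the function name scramble_tm and the sample segments are hypothetical, and it assumes scrambling simply means shuffling the order of already aligned source/target pairs so that the original document order is lost.

```python
import random


def scramble_tm(units, rng=None):
    """Return a copy of aligned (source, target) units in random order.

    Hypothetical helper: each pair is kept intact, only the document
    order is discarded. The input list is left unchanged.
    """
    # SystemRandom draws from the operating system's entropy source,
    # so the resulting order cannot be reproduced from a seed.
    rng = rng or random.SystemRandom()
    scrambled = list(units)
    rng.shuffle(scrambled)
    return scrambled


# Invented sample data: three aligned English/German segments.
units = [
    ("Open the valve.", "Öffnen Sie das Ventil."),
    ("Close the lid.", "Schließen Sie den Deckel."),
    ("Wait five minutes.", "Warten Sie fünf Minuten."),
]

scrambled = scramble_tm(units)
for source, target in scrambled:
    print(source, "->", target)
```

The shuffle works on whole pairs, which is why alignment has to come first: shuffling source and target segments separately would destroy the correspondence that makes a TM useful. Note that shuffling only hides the order of the segments; it does nothing about confidential information contained within a single segment.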

An LSP should guarantee that all of its employees, as well as any freelance translators it works with, sign confidentiality agreements. Most LSPs also sign a similar agreement with new clients.

Occasionally, a customer might ask an LSP not to use a TM at all, usually because the documents to be translated are highly confidential. Naturally, it is the customer’s prerogative to request this, but it means that they will be unable to take advantage of the significant cumulative savings that a TM enables.
