Translation Memory, Copyright and Data Protection

Is your data, and copyright, safe in a TM?

There is a growing tendency in the translation industry to consolidate, or ‘merge’, Translation Memories (TMs). This is partly an aspect of a broader shift towards crowdsourcing in translation, since most translation engines function by collating an almost unimaginably vast body of translations. It is also partly a result of the philosophy of online translation engines like Google Translate: a large TM is potentially very valuable on the marketplace. While a substantial TM can provide enormous benefits for those with access to it, it can also raise questions regarding copyright and Data Protection. After all, these TMs also contain potentially valuable proprietary, and other client-specific, information.

TM and Copyright

It has been argued that sharing or even compiling a TM is not necessarily an infringement of copyright. US copyright law states that a copy or a derivation must be at least substantially similar to the original version for its creation or use to constitute a violation. A short phrase or expression, particularly if it is already in general use, is therefore not copyrightable, and its reuse or adaptation is considered to be fair and acceptable. Because the segments into which a TM breaks down a document are generally only a few words long, their creation or sharing is, on this argument, not illegal. Furthermore, it should be impossible to reconstruct a text from segments pooled as part of a larger, shared TM.

TM and Data Protection

Regardless of the legality or otherwise of pooling (or selling) TMs, most respectable Language Service Providers (LSPs) do not do so. There are various reasons for this.

Firstly, it could theoretically be possible to reconstruct the entire source text from the segments it is broken down into. Some LSPs use a ‘scrambling’ algorithm to make this impossible, but this can only be done once the original has been aligned.
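To make the idea of scrambling concrete, here is a minimal sketch in Python. It assumes that scrambling means discarding document order by shuffling the aligned source/target pairs and dropping exact repeats; the function name scramble_segments and the sample segments are hypothetical, and actual LSP tools may work quite differently.

```python
import random


def scramble_segments(pairs, seed=None):
    """Return aligned (source, target) pairs in random order.

    Hypothetical illustration only: exact duplicate pairs are dropped,
    then the remaining pairs are shuffled so that their position in the
    original document is no longer recorded.
    """
    unique = list(dict.fromkeys(pairs))  # drop repeats, keep first occurrence
    rng = random.Random(seed)            # a seed makes the shuffle repeatable
    rng.shuffle(unique)
    return unique


# Invented sample data: segments that have already been aligned.
aligned = [
    ("Open the valve.", "Öffnen Sie das Ventil."),
    ("Wait ten seconds.", "Warten Sie zehn Sekunden."),
    ("Close the valve.", "Schließen Sie das Ventil."),
    ("Open the valve.", "Öffnen Sie das Ventil."),
]

scrambled = scramble_segments(aligned, seed=7)
```

The function takes pairs rather than raw text because, as noted above, the source and target have to be aligned before anything can be shuffled. Shuffling only removes ordering; each segment still appears in full.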

Secondly, even one sentence could contain confidential information: “So, in brief, the secret recipe for X fizzy drink is…”

Thirdly, and most importantly, sharing TMs with other companies constitutes a gross violation of the confidentiality agreement which any reputable LSP enters into with its customers. It is of paramount importance that a relationship of trust exists between these two partners from the outset and at every stage in the translation process. No matter how ‘safe’ the LSP thinks it is to pool its TMs, or how much benefit it may derive from being part of such a pool, it is still considered to be bad practice, to say the least, to share this information.

Confidentiality Best Practices

An LSP should ensure that all of its employees, as well as any freelance translators it works with, sign confidentiality agreements. Most LSPs also sign a similar agreement with new clients.

Translation Memories should be client-specific, so that only the client benefits from, and has access to, this valuable resource. Client-specific TMs are the safest and simplest way to avoid any copyright and Data Protection issues.
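As a rough illustration of what client-specific means in practice, the sketch below keeps one memory per client and only ever looks up segments in that client's own memory. The class ClientTMStore, its methods, and the client names are hypothetical, and matching is simplified to exact lookups; real TM tools also offer fuzzy matching.

```python
from collections import defaultdict


class ClientTMStore:
    """Hypothetical store that holds a separate TM for each client."""

    def __init__(self):
        # client_id -> {source segment: target segment}
        self._memories = defaultdict(dict)

    def add(self, client_id, source, target):
        """Record a translated segment in this client's memory only."""
        self._memories[client_id][source] = target

    def lookup(self, client_id, source):
        """Return an exact match from this client's memory, or None."""
        return self._memories.get(client_id, {}).get(source)


store = ClientTMStore()
store.add("client_a", "Terms and conditions", "Conditions générales")

own_match = store.lookup("client_a", "Terms and conditions")
other_match = store.lookup("client_b", "Terms and conditions")
```

Here own_match holds client_a's stored translation, while other_match is None: a segment translated for one client is never offered to another, which is the separation described above.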

Occasionally, a customer might ask an LSP not to use a TM at all, usually because the documents to be translated are highly confidential. Naturally, it is the customer’s prerogative to request this, but it means that they will be unable to take advantage of the significant cumulative savings that a TM enables.
