
Translation memory: what is a segment?

What is a translation memory?

A translation memory (TM) is a set of software tools that breaks a source text down into units known as ‘segments’ and builds databases of equivalent segments in different languages.

What is a segment?

A segment is the basic semantic unit of a text. Although a segment could be an entire sentence, it is more usually a small group of words: ‘the red house’, for example, or ‘eighty-three’.

Once the TM has split the text into segments, the translator goes to work, translating the segments one by one. Once a segment has been translated, the TM ‘learns’ what it means, and the next time a text is put into that TM, it will search for any segments it has already learned. The TM learns that ‘the red house’ means ‘la casa roja’ in Spanish and, the next time that it comes across ‘the red house’ in an English to Spanish translation, it automatically suggests the translation it has already seen.
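The ‘learning’ step described above can be pictured as a simple lookup table. The following is only a minimal sketch, assuming a TM that stores one translation per source segment for a single language pair; the class and method names (`TranslationMemory`, `learn`, `lookup`) are hypothetical and not taken from any real TM product.

```python
class TranslationMemory:
    """Toy TM: maps source segments to translations for one language pair."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _normalise(segment):
        # Collapse whitespace and ignore case so trivial differences still match.
        return " ".join(segment.split()).casefold()

    def learn(self, source, target):
        # Store the translator's confirmed translation for this segment.
        self._entries[self._normalise(source)] = target

    def lookup(self, source):
        # Return the stored translation, or None if the segment is unknown.
        return self._entries.get(self._normalise(source))


tm = TranslationMemory()
tm.learn("the red house", "la casa roja")
print(tm.lookup("The red house"))   # la casa roja
print(tm.lookup("the blue house"))  # None
```

A real TM would also store the language pair and other metadata alongside each segment; a plain dictionary is used here only to keep the idea visible.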

We call this suggestion a ‘candidate’. Some TMs search only for identical candidates. Others also retrieve segments that are merely similar to segments in the source text. If a segment is similar but not identical to one the TM already knows, the TM flags it as a ‘fuzzy match’. A fuzzy matching algorithm calculates how similar the already-translated segment (the fuzzy match) is to the segment in the source text and indicates the result, typically with a colour code.
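To make the idea of a similarity score concrete, here is a rough sketch using `difflib.SequenceMatcher` from the Python standard library. Commercial TMs use their own scoring methods, so this is a stand-in technique rather than the algorithm any particular tool uses, and the thresholds and colours in `colour_band` are assumptions chosen purely for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 for two segments."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def colour_band(score):
    # Hypothetical thresholds and colours; real tools define their own.
    if score == 1.0:
        return "green"   # identical candidate
    if score >= 0.75:
        return "yellow"  # fuzzy match worth adapting
    return "red"         # too different to be useful


for candidate in ["the red house", "the red houses", "eighty-three"]:
    score = similarity("the red house", candidate)
    print(candidate, "->", colour_band(score))
```

With these assumed thresholds, the identical segment lands in the green band, the near-identical one in yellow, and the unrelated one in red.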

Having fed the source text into the TM, the translator then has various possible ways to deal with candidates, fuzzy or otherwise. In the case of an identical candidate, they will often have to do no more than check it before they click ‘accept’. A fuzzy candidate generally requires a closer analysis and some adjustment before it is accepted.

How does a translation memory work?

Let’s take an example. Imagine that the TM already recognised the segment ‘Dear Sir’; if you entered another document which contained the segment ‘Dear Sir/Madam’, it would suggest the translation that it had already learned for ‘Dear Sir’, indicating that its suggestion was a fuzzy match.

The translator would then decide whether to translate the new segment entirely from scratch, or adapt the TM’s suggestion. In this case, they would probably take the fuzzy match and add the relevant extra word.
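The ‘Dear Sir’ example can be sketched in a few lines. This is an illustration only: it uses `difflib.get_close_matches` from the Python standard library as a stand-in for a TM's own matching engine, the `suggest` helper and the 0.6 cutoff are assumptions, and the Spanish renderings are sample data.

```python
from difflib import SequenceMatcher, get_close_matches

# Illustrative TM contents; the Spanish renderings are examples only.
tm = {
    "Dear Sir": "Estimado señor",
    "Yours faithfully": "Atentamente",
}


def suggest(segment, memory, cutoff=0.6):
    """Return (translation, score, kind) for the closest stored segment, or None."""
    matches = get_close_matches(segment, list(memory), n=1, cutoff=cutoff)
    if not matches:
        return None  # nothing similar enough: translate from scratch
    best = matches[0]
    score = SequenceMatcher(None, segment, best).ratio()
    kind = "exact" if best == segment else "fuzzy"
    return memory[best], round(score, 2), kind


print(suggest("Dear Sir/Madam", tm))  # ('Estimado señor', 0.73, 'fuzzy')
```

The translator would then adapt the suggested text by hand; the sketch only retrieves the candidate and labels it as fuzzy.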


Segments which don’t appear in the TM

Segments which have no existing match in the TM must be translated ‘manually’. Once that is done, the freshly translated segment is stored in the TM and reused in future texts, or even later in the text at hand.

So the translator only has to translate the first occurrence of that segment. The TM will automatically suggest the match each time the segment occurs later in the same text.
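The reuse of a segment within a single text can be sketched as a loop that updates the TM as it goes. This is a simplified illustration: `translate_document` is a hypothetical helper, the `human` lookup stands in for the translator, and the sketch accepts every exact match automatically, whereas in practice the translator would still check each suggestion.

```python
def translate_document(segments, memory, translate_manually):
    """Translate segments in order, reusing the TM and updating it as we go."""
    output = []
    manual_count = 0
    for segment in segments:
        if segment not in memory:
            # No match yet: a human translates it once and the TM stores the result.
            memory[segment] = translate_manually(segment)
            manual_count += 1
        output.append(memory[segment])
    return output, manual_count


# Stand-in for the human translator (hypothetical lookup for this demo).
human = {"the red house": "la casa roja", "eighty-three": "ochenta y tres"}.get

memory = {}
doc = ["the red house", "eighty-three", "the red house"]
translated, manual = translate_document(doc, memory, human)
print(translated)  # ['la casa roja', 'ochenta y tres', 'la casa roja']
print(manual)      # 2
```

Only two of the three segments needed manual work, because the second occurrence of ‘the red house’ was served from the TM.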

Why should we check a Translation Memory?

Project managers should check newly translated segments before reusing them, for three reasons:

  • as part of the quality control process,
  • to ensure that mistakes are not buried inside the TM and repeated in future documents,
  • to update the Translation Memory.



We build a TM for each customer and use it exclusively with that client’s documents. The TM therefore quickly ‘learns’ the customer’s preferences.
