How to spell-check your TRADOS translation memories
Elvids
Introduction
Translation memories are, next to our brains, our most important asset. It is
crucial, then, that they are in good health and that they stay in good health.
This means, among other things, that target segments do not contain spelling
errors. Easier said than done: I have been using TRADOS for long enough now, and
the one and only way I found to do this job is to catch the errors in translated
documents and then use the maintenance function in the Workbench for
corrections. This of course is VERY frustrating: it is error-prone and you are
never sure you did everything and did it correctly. And above all, it is
labor-intensive: after half an hour your brain turns sunny side up and the hair
loss… well, believe me, it's not due to hormones.
No need to protract the agony, I said, and spent some time with Word… Here's a
straightforward method that involves nothing more than Word and of course the
Workbench. The time you spend with it will be 90% spell checking and 10% for
the rest. And not vice versa anymore.
Algorithm
The algorithm is a pretty straightforward bootstrapping procedure: a
two-language TM is equivalent to a translated, uncleaned Word file (i.e. one
containing all its source and target segments). Getting from source to target is
easy (we can pretranslate, can't we?), so what one needs for starters is a Word
file with nothing else but all the source segments that the TM contains! We
proceed as follows:
1. Export your TM to a text file
2. Create a Word file from the exported text file
3. Use the Word macro below to create a file of source segments
4. Pre-translate the result of step 3 to get the target version of the source segments
5. Spell-check the result of step 4
6. Update the translation memory with the result of step 5
Word macro
If you are new to Word Basic, you may first want to look at an introductory text
on how to handle the VBA environment. You don't need to know how to program,
just how to cut and paste the code below into a VBA environment and run it.
Here's the source:
Public Sub GetSourceSegments()
    ' change to the source language tag of your TM
    Const SRCTAG = "US>"
    Dim OtherDocument As Document
    Dim ThisDocument As Document
    Set ThisDocument = ActiveDocument
    ' new blank document that will receive the source segments
    Set OtherDocument = Documents.Add(, , wdNewBlankDocument)
    ' start searching from the top of the exported TM
    ThisDocument.Activate
    Selection.WholeStory
    Selection.HomeKey
    Dim rng As Range
    Set rng = RangeFromTo(SRCTAG, "<Seg")
    If rng Is Nothing Then
        MsgBox "source tag " & SRCTAG & " is missing in the text file"
        Exit Sub
    End If
    Do
        ' copy the segment text into the new document
        rng.Select
        OtherDocument.Activate
        Selection.TypeText rng.Text
        ' continue the search right after the segment just copied
        ThisDocument.Activate
        Selection.WholeStory
        Selection.Start = rng.End
        Set rng = RangeFromTo(SRCTAG, "<Seg")
    Loop Until rng Is Nothing
End Sub
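Private Function RangeFromTo(ByVal startstring$, ByVal endstring$) As Range
    ' Returns the range between the next startstring and the endstring that
    ' follows it, searching forward from the current selection.
    ' Returns Nothing if either string is not found.
    Dim rstart As Range, rend As Range
    Set RangeFromTo = Nothing
    With Selection.Find
        .Text = startstring
        .Forward = True
        .Wrap = wdFindStop          ' do not wrap around to earlier segments
        .MatchWildcards = False     ' "<" must be taken literally
        If Not .Execute Then Exit Function
    End With
    Set rstart = ActiveDocument.Range(Selection.Start, Selection.End)
    ' look for the end string right after the start string
    Selection.Start = rstart.End
    With Selection.Find
        .Text = endstring
        If Not .Execute Then Exit Function
    End With
    Set rend = ActiveDocument.Range(Selection.Start, Selection.End)
    Set RangeFromTo = ActiveDocument.Range(rstart.End, rend.Start)
End Function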
Conclusion
If your source segments have embedded formatting commands, for example:
Press <\cs6\f1\cf6\lang1024>Rock and Roll<\cs6\f1\cf6\lang1024> button to bring back Elvis.
…then some extension to the above code is necessary to get a clean version:
Press Rock and Roll button to bring back Elvis.
The pretranslation would then run without any hiccups. Still, I have not thought
it all the way through: is this one and the same segment for the Workbench, or
is it two different segments?…
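One possible shape for that extension is sketched below. It is only a sketch under an assumption: every embedded formatting command starts with "<\" and ends at the next ">", as in the example above. The function name StripFormattingTags is made up for illustration and is not part of the macro above.

```vba
' Hypothetical helper, not part of the original macro.
' Assumption: a formatting command starts with "<\" and ends at the next ">".
Public Function StripFormattingTags(ByVal s As String) As String
    Dim p As Long, q As Long
    p = InStr(1, s, "<\")
    Do While p > 0
        q = InStr(p, s, ">")
        If q = 0 Then Exit Do   ' unterminated command: leave the rest untouched
        ' cut out everything from "<\" up to and including ">"
        s = Left$(s, p - 1) & Mid$(s, q + 1)
        ' the text has shifted left, so resume the search at the same position
        p = InStr(p, s, "<\")
    Loop
    StripFormattingTags = s
End Function
```

If this fits your TM, one way to use it would be to pass rng.Text through the function before typing it into the new document, i.e. Selection.TypeText StripFormattingTags(rng.Text). Whether the Workbench then treats the cleaned text as the same segment is exactly the open question above.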
Some changes would also be necessary in case a translation unit contains more
than just the two (source and target) segments. Given the RangeFromTo primitive,
however, it should not be too difficult to expand it to these cases as well. One
last comment: I have a gut feeling it would be a very simple job in XSLT. Well,
as Moustache in Irma La Douce used to say: "But that's another story".
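For what it's worth, here is a rough idea of how the "more than two segments" case might be approached. It works on a plain string rather than on the Word selection, and it lets a segment end at whichever of several end markers comes first. The name SegmentsForTag and the choice of end markers are assumptions for illustration only; check what your own exported file actually looks like before relying on them.

```vba
' Hypothetical helper, not part of the original macro.
' Collects the text of every segment that follows langTag in tmText.
' A segment ends at the nearest of the end markers passed in.
Public Function SegmentsForTag(ByVal tmText As String, ByVal langTag As String, _
                               ParamArray endMarkers() As Variant) As Collection
    Dim result As New Collection
    Dim p As Long, q As Long
    Dim segStart As Long, segEnd As Long
    Dim i As Long

    p = InStr(1, tmText, langTag)
    Do While p > 0
        segStart = p + Len(langTag)
        ' find the closest end marker after the language tag
        segEnd = 0
        For i = LBound(endMarkers) To UBound(endMarkers)
            q = InStr(segStart, tmText, CStr(endMarkers(i)))
            If q > 0 And (segEnd = 0 Or q < segEnd) Then segEnd = q
        Next i
        If segEnd = 0 Then Exit Do   ' no end marker found: stop
        result.Add Mid$(tmText, segStart, segEnd - segStart)
        p = InStr(segEnd, tmText, langTag)
    Loop
    Set SegmentsForTag = result
End Function
```

A call might look like Set segs = SegmentsForTag(exportedText, "US>", "<Seg", "</TrU>"), where exportedText holds the exported TM and "</TrU>" is assumed to be the marker that closes a translation unit. As in the macro, each collected segment keeps whatever line break precedes the end marker.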