Revised March 1, 2003
As some of us have observed, Déjà Vu does not treat tab
characters in Word documents as sentence delimiters.
Instead, it converts them to formatting codes.
This can cause problems. For example, if I have a document
with numbered paragraphs, this is what one might look
like in DV:
13.{39}Maintenance and emergency personnel
or sometimes this:
{38}13.{39}Maintenance and emergency personnel
or maybe this:
{38}13.{39}{40}Maintenance and emergency personnel
I translate a number of documents from one customer, with
sentences that are repeated from previous source documents but
with different paragraph numbering (letters, numbers, bullets,
or none at all). Sometimes the paragraph numbering is
automatic, sometimes it is inserted manually. In this example,
I might have one or more of these source sentences in the MDB:
Maintenance and emergency personnel
{001}III.{002}Maintenance and emergency personnel
{001}-{002}Maintenance and emergency personnel
Sometimes DV will find a fuzzy match but then forget the codes
and the preceding character. Other times it won't make a good
match at all.
My preferred solution would be for DV to define the tab
character -- whether it appears in automatic or manual
numbering -- as a delimiter, and then handle the sentence
without any leading number/letter/bullet characters. After
all, these elements do not need translation.
However, despite requests from several DV users, Atril has been
unable to provide this feature. Evidently this task presents
a difficult programming challenge.
There are several workarounds. One is to replace the tab with
another character such as ~ that is then defined as a sentence
delimiter. Another is to insert a new paragraph (with the Enter
key) after each tab. Both of these require only a few seconds
of pre-editing and post-editing for each document.
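The first workaround can be scripted in Word. This is only a rough
sketch: the macro names are made up for illustration, it assumes that
~ does not otherwise appear in the document, and ~ still has to be
defined as a sentence delimiter in DV itself.

```vba
' Hypothetical pre-editing macro: turn every tab into ~ before import.
' Assumes ~ is not already used anywhere in the document.
Sub TabsToTilde()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .MatchWildcards = False
        ' ^t is Word's Find code for a tab character
        .Execute FindText:="^t", ReplaceWith:="~", _
                 Forward:=True, Wrap:=wdFindContinue, Replace:=wdReplaceAll
    End With
End Sub

' Hypothetical post-editing macro: restore the tabs after export.
Sub TildeToTabs()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .MatchWildcards = False
        .Execute FindText:="~", ReplaceWith:="^t", _
                 Forward:=True, Wrap:=wdFindContinue, Replace:=wdReplaceAll
    End With
End Sub
```

Run TabsToTilde before importing into DV, and TildeToTabs on the
exported translation.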
However, these methods don't work on tabs that are part of the
MS Word "list" formatting feature (automatic paragraph
numbering or bullets). The first sentences of such paragraphs
are still subject to the same problems described above.
Another approach is to convert all the automatically-generated
numbers and bullets to actual text and tabs. This is what I do
most of the time, because the automatically-generated features
are almost never required in a translation. The sequence is:
1) Make sure that there are no sequences of "Tab-Enter" in the
document. To do this, replace all "Tab-Enter" with just "Enter"
(using Search and Replace).
2) Convert all automatic bullets and numbering to ordinary text.
This requires a VBA command (Visual Basic for Applications),
which (as far as I know) must be added as a step in a macro:
ActiveDocument.ConvertNumbersToText
3) Replace all Tab characters with the sequence Tab-Enter.
4) Save the document, and import it into DV. All leading
bullets and section numbers will be isolated on separate rows
from the sentences that follow them.
5) Translate as usual and export. Open the translation in Word.
6) Replace all Tab-Enter sequences with a single Tab.
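The Word side of this sequence (steps 1 to 3 and step 6) could be
wrapped in two macros, roughly as sketched here. The names
ReplaceAllInDoc, PreEditForDV and PostEditFromDV are invented for
this example. Steps 4 and 5 (import into DV, translate, export) are
still done by hand.

```vba
' Hypothetical helper: replace every occurrence of oldText in the
' active document. Uses Word's Find codes: ^t = tab, ^p = paragraph mark.
Private Sub ReplaceAllInDoc(oldText As String, newText As String)
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .MatchWildcards = False
        .Execute FindText:=oldText, ReplaceWith:=newText, _
                 Forward:=True, Wrap:=wdFindContinue, Replace:=wdReplaceAll
    End With
End Sub

' Steps 1 to 3: run in Word before importing the document into DV.
Sub PreEditForDV()
    ReplaceAllInDoc "^t^p", "^p"           ' step 1: remove Tab-Enter sequences
    ActiveDocument.ConvertNumbersToText     ' step 2: numbering becomes plain text
    ReplaceAllInDoc "^t", "^t^p"           ' step 3: every Tab becomes Tab-Enter
End Sub

' Step 6: run in Word on the translation exported from DV.
Sub PostEditFromDV()
    ReplaceAllInDoc "^t^p", "^t"           ' Tab-Enter goes back to a single Tab
End Sub
```

Step 1 matters because step 6 turns every Tab-Enter back into a Tab.
Any Tab-Enter that was already in the source would lose its
paragraph break at that point.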
But what if your customer requires the translation to keep the
automatic numbering and bullets?
I searched for a long time for a way for DV to "see" ONLY
the sentence that comes after the leading number or bullet,
and ignore this character.
One possibility is to format the bullet or number itself, so
that it is hidden. This also hides the leading tab character.
Then, set the DV project to ignore "Hidden" text.
However, this is tedious and error-prone to do for all the
numbered paragraphs in a document. So I considered a macro,
to do the job for each numbered or bulleted paragraph.
Unfortunately, when I tried to record a macro, each of these
types of list produced a different macro: letters (A B C),
numbers (1 2 3), Roman numerals (I II III or i ii iii), bullet
characters (• - * etc.), and multi-level or "outline" numbering
(1.2.1, 1.3.4, etc.). And the "help" available in Visual Basic
and from Microsoft is not the best in the world.
This week I finally found a complete solution. The resulting
macro examines the first automatically numbered (or bulleted)
paragraph in the document, to see whether the bullet or
paragraph number character is hidden. It then sets *ALL* of
the numbered / bulleted paragraphs to the opposite setting.
While pre-editing the document in Word, I click a button on a
toolbar to hide the numbering. Then, when revising the
translation after export from DV, I click the same button
again to change it all back to visible.
This revised version keeps going and works correctly in the cases
where the first version halted with an error.
Here's the macro.
Steven Marzuola
= = = = = =
Sub ToggleHideParaNos()
' Revised Mar 1, 2003 by Steven Marzuola
' Toggle Hidden attribute for paragraph numbers (or bullets)
' that are automatically generated by Microsoft Word
' Contains error checking
    Dim myPar As Paragraph, a As Long
    Dim newValue As Boolean, newValDef As Boolean
    newValDef = False
    On Error GoTo myError
    For Each myPar In ActiveDocument.ListParagraphs
        a = myPar.Range.ListFormat.ListLevelNumber
        If Not newValDef Then
            ' First usable list paragraph: pick the opposite of its current state
            newValue = Not myPar.Range.ListFormat.ListTemplate.ListLevels(a).Font.Hidden
            newValDef = True
        End If
        ' Apply the same setting to every numbered / bulleted paragraph
        myPar.Range.ListFormat.ListTemplate.ListLevels(a).Font.Hidden = newValue
myContinue:
    Next
    Exit Sub
myError:
    ' Skip the paragraph that failed. Resume re-arms the error trap,
    ' so later errors are also skipped instead of halting the macro.
    Resume myContinue
End Sub
(geocities.com/marzolian)