Translation of bullet point in OOXML or ODF documents
Thread poster: Nicolas Gambardella
Nicolas Gambardella
United Kingdom
Member (2019)
English to French
Aug 11, 2019

Hello,

I think I found an issue with CafeTran Espresso (I do not want to use the term "bug", because I am not sure where the problem lies).

When parsing OOXML (MS Office) or ODF documents, CafeTran does not recognise custom bullet points as translatable text, so they cannot be translated. If we have:

Step 1) here is the first step
Step 2) here is the second step
Step 3) here is the third step

CE provides us with:

here is the first step
here is the second step
here is the third step

Resulting in the translation (here, in French):

Step 1) voici la première étape
Step 2) voici la deuxième étape
Step 3) voici la troisième étape

Whereas we also want to translate "Step" into "Étape" and get:

Étape 1) voici la première étape
Étape 2) voici la deuxième étape
Étape 3) voici la troisième étape

The bullets are actually written whole in the file "word/document.xml" of the DOCX archive:

Step 2) voici la deuxième étape

And in the file content.xml of the ODT archive:

Step 2) [...]

Should it be a bug report?
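
A minimal sketch for checking where the label text is stored, assuming the list label is defined in the usual WordprocessingML way, as a w:lvlText template (for example "Step %1)") in word/numbering.xml. That location is an assumption about how the file was authored, not something confirmed for the document above. The names list_label_templates and make_sample_docx are made up for this illustration, and the sample archive is a stand-in, not a complete DOCX.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, normally bound to the "w" prefix.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_label_templates(docx_bytes: bytes) -> list[str]:
    """Return every w:lvlText template found in word/numbering.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        if "word/numbering.xml" not in archive.namelist():
            return []  # no numbering part: nothing to report
        root = ET.fromstring(archive.read("word/numbering.xml"))
    return [el.get(f"{{{W}}}val", "") for el in root.iter(f"{{{W}}}lvlText")]


def make_sample_docx() -> bytes:
    """Build a tiny stand-in archive holding only a numbering part (hypothetical data)."""
    numbering = (
        f'<w:numbering xmlns:w="{W}"><w:abstractNum w:abstractNumId="0">'
        '<w:lvl w:ilvl="0"><w:numFmt w:val="decimal"/>'
        '<w:lvlText w:val="Step %1)"/></w:lvl>'
        '</w:abstractNum></w:numbering>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/numbering.xml", numbering)
    return buffer.getvalue()


if __name__ == "__main__":
    print(list_label_templates(make_sample_docx()))
```

If a real file returns templates containing words such as "Step", those words live in the list definition rather than in the paragraph text, which would explain why they never reach the segments.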


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Defined at document level? Aug 11, 2019

Maybe the preceding word ‘Step’ has been set in the list definition?

https://support.office.com/en-gb/article/define-new-bullets-numbers-and-multilevel-lists-6c06ef65-27ad-4893-80c9-0b944cb81f5f#style


 
Nicolas Gambardella
TOPIC STARTER
Yes, bullet point definition Aug 12, 2019

Yes, it is. That is the whole point. We must be able to translate the bullet points that are generated that way. At the moment, we have to do it afterwards, by opening the exported file.
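
The "do it afterwards" step could in principle be scripted instead of done by hand. The following is only a rough sketch under several assumptions: the labels are w:lvlText templates in word/numbering.xml, the part is UTF-8 and uses the customary w: prefix, and the replacement text needs no XML escaping. The function name translate_list_labels and the glossary argument are hypothetical. It edits the raw XML with a regular expression so that everything else in the archive is copied through unchanged.

```python
import io
import re
import zipfile

# Matches the w:val attribute of a w:lvlText element (assumes the usual "w:" prefix).
LVL_TEXT = re.compile(r'(<w:lvlText\b[^>]*\bw:val=")([^"]*)(")')


def translate_list_labels(docx_bytes: bytes, glossary: dict[str, str]) -> bytes:
    """Return a copy of the DOCX with glossary terms replaced inside list label templates."""

    def translate(match: re.Match) -> str:
        label = match.group(2)
        for source, target in glossary.items():
            label = label.replace(source, target)
        return match.group(1) + label + match.group(3)

    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/numbering.xml":
                # Only the numbering part is touched; all other parts are copied as-is.
                data = LVL_TEXT.sub(translate, data.decode("utf-8")).encode("utf-8")
            dst.writestr(item, data)
    return out.getvalue()
```

Calling translate_list_labels(data, {"Step": "Étape"}) on the bytes of an exported file would be the intended use. It should be tried on a copy first, since it has not been tested against real-world documents.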

 
Hans Lenting
Others do? Aug 12, 2019

Nicolas Gambardella Le Novere wrote:

Yes, it is. That is the whole point. We must be able to translate the bullet points that are generated that way. At the moment, we have to do it afterwards, by opening the exported file.


I'm not sure whether this can be expected of a CAT tool. Do other CAT tools support this? If so, you could open a support ticket and send an example document: https://cafetran.freshdesk.com/support/tickets/new


 
Nicolas Gambardella
TOPIC STARTER
Another consequence Oct 15, 2019

I just realized that the issue also affects itemized lists that use bullet symbols rather than words.

In English, the bullet symbols used in regular text (not fancy presentations) are ... well, bullets, as the name indicates. However, in French, they are dashes (tirets).

There is no way to turn the bullets into tirets if the bulleted list is correctly formatted in the original document (that is, if the author actually used a bullet-point list structure rather than building it by hand). Neither the bullets nor the dashes show up in the segments.

To answer a previous question: no, I do not know whether this happens in other CAT tools. Does anyone know if SDL Trados or Wordfast provide a way to translate bullets and list label words?


 
Jean Dimitriadis
English to French
Dashing the bullet Oct 15, 2019

If bullets are properly set, I’m not sure they should appear in the CAT tool.

However, there is a way to convert bullets to dashes in the text processing application itself.

In LibreOffice/OpenOffice, select all items in a list, then right-click > Bullets and Numbering > Bullets and Numbering (or, in the menu bar, Format > Bullets and Numbering). Then go to the last tab (Customize).

Under Number, choose Bullet. Then, under Character, click the Select button and choose the character to use.

For example, typing "Dash" will let you choose the en dash. Make sure the appropriate Font is being used. For an en dash, I'd probably stick to the body text font.

There must be a way to do this in Microsoft Word as well.

Best,

Jean

[Edited at 2019-10-15 11:20 GMT]
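
For ODF files, the same bullet-to-dash change can be sketched at the file level. In ODF, a bulleted list level is described by a text:list-level-style-bullet element whose text:bullet-char attribute holds the bullet character. The snippet below is an illustration only: the function name dash_bullets is made up, it assumes the customary text: prefix, and it works on the XML text of a styles part (content.xml or styles.xml, depending on where the list style was saved). It does not touch the font, which may also need changing, as noted above.

```python
import re

# Matches the text:bullet-char attribute of a bullet list level (assumes the "text:" prefix).
BULLET_CHAR = re.compile(
    r'(<text:list-level-style-bullet\b[^>]*\btext:bullet-char=")([^"]*)(")'
)


def dash_bullets(xml_text: str, dash: str = "\u2013") -> str:
    """Replace the bullet character of every bullet list level with an en dash."""
    return BULLET_CHAR.sub(lambda m: m.group(1) + dash + m.group(3), xml_text)
```

Bullet characters typed directly into paragraph text are left alone, since only the list style attribute is matched.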


 
Nicolas Gambardella
TOPIC STARTER
They should Oct 15, 2019

Hi Jean,

"If bullets are properly set, I’m not sure they should appear in the CAT tool."

Why? Why shouldn't they? They are part of the text, and require translation. See my examples with "step" and "étape". I would agree with you if this were a tool configuration issue, that is, if, when opening the same document, I saw "étape" and you saw "step". However, even a French version of LibreOffice or MS Word, completely localized, will display "step" if the loaded document contains "step". The bullet point word is contained in the ODF or OOXML archive.

Re: changing them in the word processing software, I know of course how to configure them in LibreOffice, MS Word, and even LaTeX. However:

1) Sometimes we do not have access to the actual document and only receive an XLIFF (of course, if the XLIFF did not contain the bullet text, there is nothing CTE could do anyway)

2) For some texts, I do not want to touch the original files at all. I translate many texts encoded in weird MS Word documents, with complex layouts, strange fonts etc. Opening, modifying and saving them with other software, even another version of MS Word, wrecks them, whereas if I read and write them with CTE there is no problem, because of the agnostic way the tags are treated.

I acknowledge that this is a technically difficult issue, since the bullet points are part of the styles, and therefore potentially stored in other parts of the ODF or OOXML archive, not with the text itself.

This is why I am curious to know if other CAT tools offer the possibility of modifying them.

Note that this situation is not limited to bullet points. We can configure section titles as well. I did not test it as I write this, but would CTE allow us to translate "chapter" into "chapitre" or "section" into "rubrique"?
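
On the point that such labels may be stored in other parts of the archive: a quick way to find out which part holds a given word is to scan every XML part for it. This is a rough sketch; the function name parts_containing is hypothetical, and it only does a plain substring search, so it will miss text that is split across several XML elements or written with character entities.

```python
import io
import zipfile


def parts_containing(archive_bytes: bytes, needle: str) -> list[str]:
    """List the XML parts of a DOCX or ODT archive whose raw text contains needle."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue  # skip images, thumbnails and other binary parts
            text = archive.read(name).decode("utf-8", errors="replace")
            if needle in text:
                hits.append(name)
    return hits
```

Running it with "Step", "Chapter" or "Section" on the bytes of a document would show whether the word sits in the body part or in a styles or numbering part.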


 
Jean Dimitriadis
Reasoning Oct 15, 2019

My reasoning is what you highlight in your last comment, namely the distinction between form/structure and content.

For example, in LibreOffice, one can select some text, make the font bigger, add some bold, and call the line a title. Or they can apply a heading style, with predefined settings (including font size and boldness). I usually prefer the latter.

However, I haven’t given this much thought and when I said "I’m not sure", I really meant it as a doubt, not as an opposing argument.


 
Nicolas Gambardella
TOPIC STARTER
Word processing software was not designed with translation in mind Oct 15, 2019

I think you are right: these are considered style properties.

And this is the crux of the matter. I do not think that the people designing those software tools ever spent much time thinking about translation. This applies to typesetting systems and also to the design of standard formats. I know that first hand, having been part of a group designing international standards to encode stuff (mathematical models, if you want to know). I still remember the arguments when we asked to move from US-ASCII (7 bits) to Unicode in ... 2002.

However, there is a bit of philosophy there, of course: what is style, and what is content? Take the example of fonts. A font is typically style, and people would say that changing the font does not change the content. Until we use Symbol or Dingbats ...

I think this should be the end of the debate for now.