MS WORD question - convert *.doc(x) to XML or editable txt with tags
Posted by: nicomigo
Germany
Jul 17, 2012

Hi,

I have been working with Trados for quite some time now, and I like the way it converts Word files into editable XML files (TTX). The thing is, I would like to have the same without Trados: to read a Word file with all the formatting converted to tags, edit it in a plain text editor such as Notepad or Notepad++, and convert it back afterwards with the formatting intact. Does such a tool exist? I have been searching for about an hour without much success.

Thank you in advance, kind community

Best regards,
Nicolas


 
Rolf Keller
Germany
Local time: 19:23
English to German
Word can export XML Jul 17, 2012

nicomigo wrote:

to be able to read a Word file with all the formatting converted to tags, edit it, for instance in Notepad or Notepad++ and reconvert it afterwards, but without Trados


Actually, a .docx file is a zip archive of XML files. So, rename the file to .zip, then unzip it. You'll find several XML files and folders; the main part of the original document is normally in word/document.xml.
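
If you would rather not rename and unzip by hand, the same thing can be done with a few lines of script. This is only a minimal sketch using Python's standard zipfile module; the function names are made up for illustration, and it assumes the main part sits at word/document.xml, which is where Word normally puts it.

```python
import zipfile

# Assumption: the main document part is at its conventional location.
MAIN_PART = "word/document.xml"


def list_parts(docx):
    """Return the names of all files stored inside the .docx archive."""
    with zipfile.ZipFile(docx) as archive:
        return archive.namelist()


def read_main_part(docx):
    """Return the main document XML as text.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        return archive.read(MAIN_PART).decode("utf-8")
```

For example, `print(read_main_part("report.docx"))` would dump the tagged text of a (hypothetical) report.docx to the console, ready to be saved and opened in Notepad++.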


 
Adam Łobatiuk  Identity Verified
Poland
Local time: 19:23
Member since 2009
English to Polish
DOCX Jul 17, 2012

DOCX files are in fact zip archives with a bunch of XML files inside. So you probably have the tool already.

 
Adam Łobatiuk  Identity Verified
Poland
Local time: 19:23
Member since 2009
English to Polish
Probably better still Jul 17, 2012

In newer Word versions, you can save documents as XML Word documents. They open in Word like regular documents, but are XML with some binary content added.

 
nicomigo
Germany
TOPIC STARTER
I'll try that :) Jul 18, 2012

Thank you for your answers, great! I had no idea a .docx was already a zip! I will try unzipping it and see what it looks like.

EDIT: I checked it out and it's fine, everything is in there. One could say that there is a lot that could be optimized in there, but it's editable in Notepad++.
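
For the way back (edit, then repack into a working .docx), something like the following should do. It is a rough sketch, not a tested tool: `rewrite_docx` is a name I made up, and it assumes the text to change lives in word/document.xml. It copies every file of the archive unchanged except that one part, which is passed through an edit function.

```python
import zipfile


def rewrite_docx(src, dst, edit, part="word/document.xml"):
    """Copy the archive `src` to `dst`, passing the XML text of `part`
    through the callable `edit`. All other files are copied as they are.

    `src` and `dst` may be file paths or binary file-like objects.
    """
    with zipfile.ZipFile(src) as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == part:
                # Only the chosen part is decoded, edited and re-encoded.
                data = edit(data.decode("utf-8")).encode("utf-8")
            # Reusing the original entry keeps its name and compression type.
            zout.writestr(item, data)
```

Usage would be along the lines of `rewrite_docx("in.docx", "out.docx", lambda xml: xml.replace("Hello", "Hallo"))`. One caveat: Word often splits a sentence across several `<w:t>` elements, so a plain search and replace will not always find the text in one piece.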

[Edited at 2012-07-19 07:06 GMT]


 
Rolf Keller
Germany
Local time: 19:23
English to German
docx versus simple XML Jul 18, 2012

Adam Łobatiuk wrote:

In newer Word versions, you can save documents as XML Word documents. They open in Word like regular documents, but are XML with some binary content added.


You are right. This works even with "old" Word 2003, provided the MS converters for newer versions are installed. These converters deliver .docx as well.

A .docx archive is more than just one file; for example, it contains the pictures (if any) as separate files. Sometimes this is very convenient.
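
As an illustration of that convenience, here is a small sketch that lists the picture files in a .docx. It assumes the pictures are stored under word/media/, which is where Word usually places them; `list_media` is just an illustrative name.

```python
import zipfile


def list_media(docx, prefix="word/media/"):
    """Return the names of the files stored under `prefix` in the archive.

    Assumption: embedded pictures live under word/media/.
    """
    with zipfile.ZipFile(docx) as archive:
        return [
            name
            for name in archive.namelist()
            if name.startswith(prefix) and not name.endswith("/")
        ]
```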


 
Dominique Pivard  Identity Verified
Local time: 20:23
Finnish to French
XLIFF? Jul 21, 2012

nicomigo wrote:
I have been working with Trados for quite some time now and I like the way Trados converts Word files into editable xml files (ttx). The thing is, I would like to have the same without Trados:

Why don't you convert your Word documents (or any other translatable file type, for that matter) to XLIFF? After all, XLIFF is a type of XML (your requirement) specifically designed to handle translatable documents, complete with tags for rendering formatting (again, your requirement). Of course, XLIFF isn't necessarily pretty when opened in a text editor, but then, neither is TTX.

There are a number of tools that will produce XLIFF files, including free ones.


 

