Max size of DV TM
Thread poster: Matthias Brombach
Matthias Brombach
Germany
Member (2007)
Dutch to German
May 8, 2012

Dear colleagues,

I wanted to ask about your experience: what is the maximum size of a DV TM? I recently tried to convert a Studio TM of more than 300 MB into a DV TM, but after hours (really!) of importing (via a .txt file created with Trados 2007) without any noticeable progress, I terminated the process. Converting Studio TMs of a "normal" size always goes fine, so I wonder whether one of you knows a trick for dealing with this.
Best regards,
Matthias

[Edited at 2012-05-08 09:01 GMT]


 
Klaus Herrmann
Germany
Member (2002)
English to German
Moin Matthias! May 8, 2012

I'm sure someone will come up with the exact TM size limit, but in the meantime, here's how I go about big TXT TMs:
- Most important, make sure to UNCHECK "delete duplicate TM entries" in the import options.
- Cut TXT into 5-6 smaller chunks of 50 k
- Do the import
- Buy two or three 6-packs of your favorite beer
- Open the DVX TM and use DVX's function to delete duplicates.
- Open the first bottle...
- By the time you're out of beer, your TM will be ready.

(In my experience, TXT is pretty reliable compared to the TMX import).

Gruß
Klaus
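
The "cut TXT into smaller chunks" step above could be scripted instead of done by hand in an editor. The following is only a sketch: it assumes that the Trados 2007 text export stores each translation unit as a block of lines ending with a `</TrU>` line, and the function name `split_units` and its parameters are made up for illustration.

```python
from typing import Iterable, Iterator, List


def split_units(lines: Iterable[str], units_per_chunk: int,
                end_marker: str = "</TrU>") -> Iterator[List[str]]:
    """Yield lists of lines, each holding at most units_per_chunk complete units.

    ASSUMPTION: a unit ends on a line whose stripped text equals end_marker.
    """
    if units_per_chunk < 1:
        raise ValueError("units_per_chunk must be at least 1")
    chunk: List[str] = []
    units = 0
    for line in lines:
        chunk.append(line)
        if line.strip() == end_marker:
            units += 1
            if units == units_per_chunk:
                yield chunk
                chunk, units = [], 0
    if chunk:
        # Remaining lines: the last, smaller chunk (or anything after the last marker).
        yield chunk
```

Each yielded list can be written to its own file with `writelines`, for example from `split_units(source_file, 50_000)`. The export has to be opened with the encoding it was actually written in, which this sketch does not try to detect.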


 
Matthias Brombach
Germany
Member (2007)
Dutch to German
THREAD STARTER
Unfortunately... May 8, 2012

...step 4 wouldn't be Alt beer (simply because you have to drink it where it comes from), but the other steps sound fine, although not very promising.

Moin Klaus,

thanks, I will try it overnight, simply because it's not yet time for step 6... ;-)

Best regards to Düsseldorf,

Matthias

[Edited at 2012-05-08 11:03 GMT]


 
David Turner
French to English
Not sure whether you're using DVX1 or DVX2... May 8, 2012

... but DVX2 will usually import .tmx TMs of that size in a matter of minutes.

 
Matthias Brombach
Germany
Member (2007)
Dutch to German
THREAD STARTER
Also .tmx files? May 8, 2012

Hi David,
so you skip the intermediate step of first creating a .txt file with Trados 2007?
I ask because when I import .tmx files into a DVX TM (I use the newest build), DVX doesn't recognize the language combinations, which is why I still import Studio TMs as .txt. But I think Klaus's hint, to first uncheck "delete duplicate TM entries", may help. I will try again later today. Thanks.
Best regards,
Matthias


 
Matthias Brombach
Germany
Member (2007)
Dutch to German
THREAD STARTER
Step 7 completed after approx. 2 h (255,000 segments, 1 Jever Pils), but... May 9, 2012

...the problems are still (nearly) the same:
Whereas the source TM in Studio runs very smoothly without exhausting my PC (2.1 GHz, 4 GB RAM), the converted TM really slows DVX2 down. Klaus, I know you as an experienced DVX user; do you still stick with it? And what PC configuration do you recommend for running AutoSuggest in DVX2? Even with smaller projects, TMs and term banks, performance is sometimes slow and you can hear the hard disk working, working, working... Or do you also tend to use Studio? Yesterday I was about to try Studio for the translation process itself (until now I have only used it to prepare the translated .sdlxliff files for my customer), but my version (Studio 2009 Freelance) doesn't offer AutoSuggest (creating your own AutoSuggest file, to be more precise). And only when SDL implements reasonable shortcuts for term handling (as DV has) will I think (but just think!) about working in Studio. Any suggestions? Thanks!

Best regards,
Matthias


 
MikeTrans
Germany
Italian to German
Size of TMs in DVX2... May 30, 2012

Hi Matthias,

I don't know the maximum sizes for TMs, but here is a reference for large DTBs:

EMEA, FR-DE, a medical DTB, 300,000+ segments = 1.08 GB
DGT, FR-DE, European Union, 333,000 segments = 1.47 GB

Response times are very fast once the first segment has been searched.

It's VERY important for performance to run a Repair of your DTBs and projects on a regular basis, after importing/deleting lots of segments or documents in your project, and especially before and after removing duplicates.

I have private DTBs containing chunk segments of the two Big Mammas above (for displaying concordances). They both contain 3,400,000+ segments with the same short translation "CHUNK_xxx".
For practical reasons, I've split them into 2 parts of 1.8 GB each. No problems whatsoever.

Greets,
Mike

[Edited at 2012-05-30 18:55 GMT]
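
As a rough back-of-the-envelope check, the two reference DTBs above work out to a few kilobytes per segment. The sketch below just redoes that division; it assumes that "GB" in the post means 2**30 bytes and ignores the "+" on the first segment count, so the results are ballpark figures at best. The function name is illustrative.

```python
GIB = 1024 ** 3  # ASSUMPTION: "GB" in the post means 2**30 bytes


def bytes_per_segment(size_gb: float, segments: int) -> float:
    """Average on-disk bytes per segment for a DTB of the given size."""
    return size_gb * GIB / segments


# Figures quoted in the post; the "+" on the EMEA segment count is ignored.
references = {
    "EMEA FR-DE": (1.08, 300_000),
    "DGT FR-DE": (1.47, 333_000),
}

for name, (size_gb, segments) in references.items():
    kib = bytes_per_segment(size_gb, segments) / 1024
    print(f"{name}: about {kib:.1f} KiB per segment")
```

The average depends heavily on segment length and tagging, so it should not be read as a fixed property of the DTB format.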


 
Matthias Brombach
Germany
Member (2007)
Dutch to German
THREAD STARTER
Maybe a batch repair process available? May 31, 2012

Hi Mike,

thanks, I get your point and will do it more frequently than I have so far. What a pity that there is no batch routine to run the repair process on all, or a selection of, projects, TMs and termbases.

Best regards,

Matthias


 
MikeTrans
Germany
Italian to German
Repair & Compact May 31, 2012

Matthias,

There are actually 2 options: Tools > Repair and Tools > Compact

In my experience, Compact is the one to use after removing lots of files from a project, or after deleting or importing a considerable number of segments in a TM.

After a power failure, a program crash, or duplicate removal, however, it's advisable to choose the Repair option: unlike Compact, it re-indexes the TM, which takes much longer.

Mike


 
Grzegorz Gryc
French to Polish
2 GB Jul 5, 2012

Matthias Brombach wrote:

I wanted to ask about your experience: what is the maximum size of a DV TM?

Theoretically 2 GB per file, as for MS Jet 4.0 databases.
Note that, as the DVX TM consists of several files, the complete set may reach several GB.
As the language-specific files are usually larger than the main one, I think something like 1.5 GB for the main file is realistic.
It's hard to say how many segments it may contain because it depends on the segment length; I suppose 1.5 million TUs is a good approximation.

I recently tried to convert a Studio TM of more than 300 MB into a DV TM, but after hours (really!) of importing (via a .txt file created with Trados 2007) without any noticeable progress, I terminated the process. Converting Studio TMs of a "normal" size always goes fine, so I wonder whether one of you knows a trick for dealing with this.


Generally, if you experience problems, split the file and compact the TM after importing each part.
This may also help if the Trados TM contains "tricky" segments, e.g. DVX may fail on extremely long segments with multiple tags.

Cheers
GG

[Edited at 2012-07-05 08:57 GMT]
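
For TMX exports, the "split the file" advice could be scripted along the following lines. This is a minimal sketch, not a tested converter: it relies only on the standard TMX element names (`tmx`, `header`, `body`, `tu`), splits by number of translation units rather than by file size, and loads the whole document into memory, which may be heavy for a file of several hundred MB. The function name `split_tmx` is made up for illustration.

```python
import copy
import xml.etree.ElementTree as ET
from typing import List


def split_tmx(tmx_data, tus_per_part: int) -> List[str]:
    """Split a TMX document (str or bytes) into smaller TMX documents.

    Each part keeps the original <tmx> attributes and a copy of <header>,
    and holds at most tus_per_part <tu> elements.
    """
    if tus_per_part < 1:
        raise ValueError("tus_per_part must be at least 1")
    root = ET.fromstring(tmx_data)
    header = root.find("header")
    body = root.find("body")
    if body is None:
        raise ValueError("no <body> element found")
    tus = body.findall("tu")
    parts: List[str] = []
    for start in range(0, len(tus), tus_per_part):
        part_root = ET.Element(root.tag, root.attrib)
        if header is not None:
            part_root.append(copy.deepcopy(header))
        part_body = ET.SubElement(part_root, "body")
        part_body.extend(tus[start:start + tus_per_part])
        # Serialized without XML declaration or DOCTYPE; add them when
        # writing the file if the importing tool expects them.
        parts.append(ET.tostring(part_root, encoding="unicode"))
    return parts
```

Each returned string can be saved as its own .tmx file and imported separately, compacting the TM after each part as suggested above.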


 
Matthias Brombach
Germany
Member (2007)
Dutch to German
THREAD STARTER
Thank you (dziękuję)... Jul 5, 2012

...for your answers; yes, I understand now that compacting on a regular basis and splitting the .tmx file before import would be wise. Do you perhaps also know how to export .tmx from Studio 2009 by size (in portions)? Also, when importing big .tmx files from Studio, the "language codes" do not appear in DVX2, so I am still forced to import .txt via an intermediate step in Trados 2007, which makes the whole process more time-consuming. My customer sends each project with the same Studio TM, only updated, which is why I would like to import it (or better: update it) in DVX2 with minimum effort.

Best regards,

Matthias


 
Grzegorz Gryc
French to Polish
Invalid TMX... Jul 5, 2012

Matthias Brombach wrote:

...for your answers; yes, I understand now that compacting on a regular basis and splitting the .tmx file before import would be wise. Do you perhaps also know how to export .tmx from Studio 2009 by size (in portions)?

You can use filters when exporting, but IMO it's faster to use a decent text editor and split 'em manually.

Also, when importing big .tmx files from Studio, the "language codes" do not appear in DVX2,

It happens when DVX considers the TMX malformed (invalid chars, invalid header, etc.).

The TMX is not always actually at fault; it may also be related to DVX filter errors, e.g. on extremely large segments with a gazillion tags, something like 300 words and 200 tags.
Segments of this kind can sometimes be found in incorrectly prepared DTP jobs, when the Trados segmentation rules are screwed up.

so I am still forced to import .txt via an intermediate step in Trados 2007, which makes the whole process more time-consuming. My customer sends each project with the same Studio TM, only updated, which is why I would like to import it (or better: update it) in DVX2 with minimum effort.

Yep, I understand...
As Studio is totally marginal for me (one job per year...), I haven't paid much attention, but it seems their TMX may contain invalid characters.
Try opening the TM file in Olifant and fixing the invalid chars before you import it into DVX.

Cheers
GG
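
As a scripted alternative to fixing the file by hand, the sketch below removes characters that XML 1.0 does not allow. It assumes that "invalid chars" here means characters outside the XML 1.0 `Char` production, either raw or written as numeric character references such as `&#x1F;`; the thread does not confirm that this is what trips up DVX. The function name is illustrative.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_INVALID_RAW = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)
# Numeric character references, decimal (&#31;) or hexadecimal (&#x1F;).
_CHAR_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def _is_valid_xml_char(cp: int) -> bool:
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def _drop_invalid_ref(match: "re.Match[str]") -> str:
    body = match.group(1)
    cp = int(body[1:], 16) if body.startswith("x") else int(body)
    # Keep references to valid characters untouched, drop the rest.
    return match.group(0) if _is_valid_xml_char(cp) else ""


def strip_invalid_xml_chars(text: str) -> str:
    """Remove raw characters and character references that XML 1.0 forbids."""
    text = _INVALID_RAW.sub("", text)
    return _CHAR_REF.sub(_drop_invalid_ref, text)
```

The cleaned text should be written to a new file so the original export stays available. An invalid header, the other cause mentioned above, is not addressed by this sketch.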


 

