Personalising machine translation by feeding in translation memories or bilingual files?
Thread poster: Mark Bossanyi
Mark Bossanyi
Bulgaria
Member (2008)
French to English
Jun 11, 2019

Does anyone know of a (neural?) machine translation system that I can personalise or train to translate in my own style by feeding in the translation memories or CAT tool bilingual (xliff) files that I have compiled over the years?

[Edited at 2019-06-11 07:02 GMT]

[Edited at 2019-06-11 07:14 GMT]


 
DZiW (X)
Ukraine
English to Russian
redundancy Jun 12, 2019

Mark, a modern neural network essentially brute-forces combinations of factors, weights and possible outcomes; the weights usually start out at random and are adjusted towards an approved solution. It therefore has to guess the unequal 'importance' weights of every characteristic and then rely on individual (constant, human-assisted) feedback, which is very limited and inefficient. In short, it cannot know for sure what is RIGHT for you in every case or context. Although hyped, the capabilities of "AI" are rather mediocre at present.

Indeed, you can use your own lexicon and complement CAT/MT engines such as PROMT with your TMs, but no program can properly mimic your preferences or peculiarities; it can only guess by trial and error. It is more reasonable to prefer specific rule-based MT systems that propagate your TM fragments.

Every client and audience requires a certain style. Furthermore, most translation equivalents have no single correct solution, whereas the machine wants a clear-cut answer, or simply uses the first 'best match', which may not always be the best for readers.


Therefore, you could take the traditional route and couple a CAT tool with MT, so that it finds the best match or reuses your own edited variants.
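
As a rough illustration of coupling a TM with MT, the sketch below looks up the closest TM entry first and only falls back to MT when no entry is similar enough. Everything in it is an assumption made for illustration: the function names, the 0.75 threshold, the use of Python's difflib for similarity (real CAT tools use their own fuzzy-match scoring), and the `mt` callable, which stands in for whatever engine you actually use.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Tuple


def best_tm_match(source: str, tm: Dict[str, str]) -> Tuple[str, float]:
    """Return the target of the most similar TM source, plus its similarity (0..1)."""
    best_target, best_score = "", 0.0
    for tm_source, tm_target in tm.items():
        # Case-insensitive character-level similarity; a stand-in for CAT fuzzy matching.
        score = SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()
        if score > best_score:
            best_target, best_score = tm_target, score
    return best_target, best_score


def translate(
    source: str,
    tm: Dict[str, str],
    mt: Callable[[str], str],
    threshold: float = 0.75,  # hypothetical cut-off, tune to taste
) -> Tuple[str, str]:
    """Reuse the translator's own TM entry when it is close enough, else call MT."""
    target, score = best_tm_match(source, tm)
    if score >= threshold:
        return target, "tm"
    return mt(source), "mt"
```

The second element of the result says where the translation came from, so TM hits (your own edited variants) can be told apart from raw MT that still needs post-editing.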


 
Mark Bossanyi
Bulgaria
Member (2008)
French to English
TOPIC STARTER
Thank you DZiW, Jun 12, 2019

Yes, I would not expect any system, neural or otherwise, to always get it exactly how I want it. I have used neural systems without any input from my TMs and quite often found the results very good. But being a bit of a perfectionist in terms of style and readability, I find myself spending too much time on post-editing, changing the order of clauses, etc., which can sometimes even make it take almost as long as translating the text from scratch.

But I gather from your reply that neural systems would be less responsive to my translation memories than rule-based machine translation systems. Do you think SDL Language Cloud could be a suitable option, for example? And importantly, I am wondering whether systems such as SDL Language Cloud and PROMT would be able to sequester my input for confidentiality reasons. Do you have any information on this?


 
Jean Dimitriadis
English to French
ModernMT Jun 12, 2019

There's ModernMT ( https://www.modernmt.eu/ ), but the cloud edition is expensive (€4 per thousand words).

I see it does EN>FR and FR>EN, but only EN>BG, not BG>EN.

Can be used directly in Matecat, maybe other CAT tools as well.

There's also a free and open source edition in case you want to get your hands dirty: https://github.com/modernmt/modernmt

[Edited at 2019-06-12 10:42 GMT]


 
Mark Bossanyi
Bulgaria
Member (2008)
French to English
TOPIC STARTER
Thank you Jean Dimitriadis, Jun 12, 2019

I will look into these options.

For ModernMT, is the €4 that you mention per 1000 words of translation output or per 1000 words of TM input?


 
Jean Dimitriadis
English to French
Cost Jun 12, 2019

There is no cost for uploading your memories, only for using their SaaS. It's €4 for 1,000 words of MT output.
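
To get a feel for that pricing, a small calculation may help. This is only arithmetic on the €4 per 1,000 words of MT output quoted above; the function name is made up for the example, and the rate may have changed since.

```python
def mt_cost_eur(output_words: int, rate_per_thousand: float = 4.0) -> float:
    """Estimated cost in euros for a given number of MT output words."""
    # Only MT output is billed; uploading translation memories costs nothing.
    return output_words * rate_per_thousand / 1000


print(mt_cost_eur(50_000))  # a 50,000-word month at the quoted rate
```

At that rate, 50,000 words of MT output would come to €200.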

From their website: State-of-the-art neural machine translation as a service that learns from your translation memories and corrections.

Edited to add the following:

If you try Matecat, there is a big caveat.

Please note that by default, the segments are saved in a public TM (MyMemory).

I'm pretty sure you want to avoid that, so here's what to do:

- Create a private TM resource: on the project creation page, click on Settings (alternatively, in the TM and glossary field, expand the drop-down menu and select Create resource).
- Click on the + New resource button in the dialog that opens. Optionally give the TM a name, then hit Confirm. You will see that the “MyMemory: Collaborative translation memory” resource is still Enabled for Lookup, but is no longer set to be Updated. That way, translated segments will only be stored in your private resources. You need to do this systematically for each new project.

After that, you are good to go.

[Edited at 2019-06-12 10:53 GMT]


 
Mark Bossanyi
Bulgaria
Member (2008)
French to English
TOPIC STARTER
Thanks again Jean, Jun 12, 2019

for your very helpful reply.

 
Yvonne Manuela Meissner
Netherlands
Member
Dutch to German
@Mark May 10, 2020

I ended up in this thread by chance. I have ModernMT in my MateCat. You can also use it in Trados, but you have to download an app from the SDL store.

BIG NEWS: for the rest of this month, because of Covid-19, you can use ModernMT for free... a very, very nice gesture!!!! Good for a trial run.


 
Milan Condak
English to Czech
Statistical MT for Windows May 11, 2020

Mark Bossanyi wrote:

Does anyone know of a (neural?) machine translation system that I can personalise or train to translate in my own style by feeding in the translation memories or CAT tool bilingual (xliff) files that I have compiled over the years?


Statistical MT for Windows from TMXs

Since November 2019, I have been using the Slate Desktop "engine build factory". The engine is a set of local files and no internet connection is required for translation.

Slate Desktop can build as many "engines" as you want, from your own TMXs or from external ones.
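
For anyone curious what an engine built "from TMXs" starts from, here is a minimal sketch (not how Slate Desktop itself works internally) that pulls source/target segment pairs out of TMX content using Python's standard library. The function name and the language-code handling are assumptions; it only relies on the standard TMX layout of tu, tuv and seg elements.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_pairs(tmx_text: str, src_lang: str, tgt_lang: str) -> List[Tuple[str, str]]:
    """Extract (source, target) segment pairs for one language pair from TMX text."""
    root = ET.fromstring(tmx_text)
    pairs: List[Tuple[str, str]] = []
    for tu in root.iter("tu"):
        segs: Dict[str, str] = {}
        for tuv in tu.iter("tuv"):
            # Newer TMX uses xml:lang, older files a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() keeps the text around inline tags and drops the tags.
                # Key by primary language, so "en-US" is stored under "en".
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if segs.get(src_lang) and segs.get(tgt_lang):
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring(tmx_text)`. The resulting pairs are the kind of parallel data a statistical or neural engine is trained on.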

Here is a presentation on a CS-BG engine:

www.condak.cz/nove/2020-03/02/cs/02.html

02 CS-BG engine for Slate Desktop

Machine translated content Google Translate:

Automated translation CS-BG

Use of TMX for SMT engine and TBX for glossary
  
01 European free language resources
02 CS-BG engine for Slate Desktop
03 CELEX and TMX
04 TMX and Slate Desktop in OmegaT
05 IATE, TBX and XLSX
06 Glossary
07 TM, Slate Connect and Glossary

Slate Desktop and Slate Connect work on Windows (or Linux) computers. More on the author's sites:

www.slate.rocks or www.slatedesktop.com

On my site see "build engin"

http://www.condak.cz/nove/

Happy translating,

Milan

[Edited at 2020-05-11 14:26 GMT]


 

