Preloading Machine Translation
Thread poster: Ian Kahn

Ian Kahn
United Kingdom
Local time: 08:27
German to English
Dec 4, 2019

Hey everybody,

Let's say I'm working on a project with no access to the internet.

Is there any way for OmegaT to "pre-load" machine translations (I use the Google Translate API) that I can then see while I'm translating each line?

Would be pretty helpful and seems pretty simple to do.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 09:27
Member (2006)
English to Afrikaans
SITE LOCALIZER
@Ian Dec 4, 2019

Ian Kahn wrote:
Is there any way for OmegaT to "pre-load" machine translations that I can then see while I'm translating each line?


It's simple enough, but it will take some work.

1. Extract all segments from OmegaT.
2. Translate those segments using Google Translate.
3. Align the extracted source text with the machine translated text to create a TMX memory.
4. Use that TMX memory in OmegaT in the /tm/ subfolder somewhere.

Ways to extract all segments from OmegaT:
- using a script (usually bundled with OmegaT) that writes source and/or target to a file.
- by using Ctrl+F, then selecting "regular expressions" and setting the number of results to 100 000, and then searching for ".".
- if it's all one file, by simply selecting all text in OmegaT and copy/pasting it to a plain text file.

To translate all segments using Google Translate, you can use the AutoIt script mentioned here:
https://www.proz.com/forum/cat_tools_technical_help/308360.html

You're going to have to google for how to align two files. There is e.g. LF Aligner.
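
If the extracted source file and the machine-translated file both have exactly one segment per line, in the same order, a full aligner may not even be needed: the lines can simply be paired up. Here is a minimal Python sketch under that assumption. The file names, the build_tmx and write_tmx helpers, and the "de"/"en" language codes are made up for illustration and are not part of OmegaT; adjust the codes to match the project's languages.

```python
import xml.etree.ElementTree as ET

# Attribute name for xml:lang as ElementTree expects it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(sources, targets, src_lang="de", tgt_lang="en"):
    """Pair source and target segments line by line into a TMX tree.

    Assumes both lists are in the same order, one segment per entry.
    """
    if len(sources) != len(targets):
        raise ValueError("source and target must have the same number of lines")

    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "line-pair-script",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")

    for src, tgt in zip(sources, targets):
        # Skip pairs where either side is blank.
        if not src.strip() or not tgt.strip():
            continue
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text.strip()

    return ET.ElementTree(tmx)


def write_tmx(source_path, target_path, tmx_path, src_lang="de", tgt_lang="en"):
    """Read two line-aligned text files and write the paired TMX file."""
    with open(source_path, encoding="utf-8") as f:
        sources = f.read().splitlines()
    with open(target_path, encoding="utf-8") as f:
        targets = f.read().splitlines()
    tree = build_tmx(sources, targets, src_lang, tgt_lang)
    tree.write(tmx_path, encoding="utf-8", xml_declaration=True)
```

Calling write_tmx("source.txt", "google.txt", "google-mt.tmx") would then produce a file to drop into the project's tm/ folder. If the machine translation merged or split any lines, the line counts will differ and the script stops with an error; in that case a real aligner is the safer route.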

I suggest you put the TMX in a subfolder called tm/penalty-10/ so that OmegaT lowers the match percentage of every match from the machine-translated TMX file by 10 points (a 100% match is then shown as 90%).

Finally, the DGT fork of OmegaT has some interesting features regarding machine translation, so check it out (it contains pretty much the same features as the official OmegaT, and there is no bad blood between the developers):
http://185.13.37.79/?q=node/31


 

Ian Kahn
United Kingdom
Local time: 08:27
German to English
TOPIC STARTER
Thanks Dec 4, 2019

Samuel Murray wrote:

It's simple enough, but it will take some work.



Thanks so much Samuel! That's really helpful!


 

tcordonniery
France
Local time: 09:27
Call pre-translation with DGT-OmegaT Dec 21, 2019

Samuel Murray wrote:
Finally, the DGT fork of OmegaT has some interesting features regarding machine translation, so check it out (it contains pretty much the same features as the official OmegaT, and there is no bad blood between the developers):
http://185.13.37.79/?q=node/31


Thanks, Samuel, for mentioning DGT-OmegaT.
Indeed, I have added some features that could help with this problem.

On the command line, I extended the pseudo-translation feature (in standard OmegaT it can only produce empty translations or translations identical to the source):
java -jar OmegaT.jar --console-pseudotranslatetmx --pseudotranslatetmx=/where/to/put/file.tmx --pseudotranslatetype=Google2
(technically, Google2 is expanded to org.omegat.core.machinetranslators.net.Google2Translate, which is the class that gets called)

This generates a TMX that you can put in the mt/ folder and then work offline. The translations appear in the MT pane, but labelled "Local" rather than "Google".

Another possibility is to use the menu "Edit => Search & Pre-translate" from the UI.


 

