JRC-Acquis, how to convert to tmx
Thread poster: Magdalena Kowalska

Magdalena Kowalska  Identity Verified
United Kingdom
Local time: 17:58
Polish to English
Dec 13, 2015

Hi,

I've downloaded the already aligned files from https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis. Now I need to convert those XML files to anything my memoQ can process, like CSV, if not directly to TMX.

How do I go about it? I've tried all online xml-csv converters I could find, but the files are too big for them to work.

Has anyone succeeded in using the JRC texts with their CAT tool?
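
For anyone with the same problem who is comfortable running a script: online converters choke on these files because they load everything at once, whereas a streaming parser reads them piece by piece. Below is a minimal Python sketch, not a tested converter for the real corpus. It assumes the aligned file stores each pair as a `<link>` element with `<s1>` (first language) and `<s2>` (second language) children; check a sample of your own file first, since the layout is an assumption here. The function name `alignment_to_tmx` and the header values are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def alignment_to_tmx(xml_path, tmx_path, src_lang="en", tgt_lang="pl"):
    """Stream aligned <link> pairs into a minimal TMX file; return the pair count."""
    count = 0
    with open(tmx_path, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n<tmx version="1.4">\n')
        out.write(
            '<header creationtool="jrc-sketch" creationtoolversion="0.1" '
            'segtype="sentence" o-tmf="xml" adminlang="en" '
            f'srclang="{src_lang}" datatype="plaintext"/>\n<body>\n'
        )
        # iterparse reads the input incrementally instead of loading it whole.
        for _event, elem in ET.iterparse(xml_path, events=("end",)):
            name = local_name(elem.tag)
            if name == "link":
                # Assumed layout: <link><s1>source</s1><s2>target</s2></link>
                parts = {local_name(c.tag): "".join(c.itertext()).strip() for c in elem}
                src, tgt = parts.get("s1"), parts.get("s2")
                if src and tgt:  # skip pairs where one side is empty
                    out.write(
                        f'<tu><tuv xml:lang="{src_lang}"><seg>{escape(src)}</seg></tuv>'
                        f'<tuv xml:lang="{tgt_lang}"><seg>{escape(tgt)}</seg></tuv></tu>\n'
                    )
                    count += 1
                elem.clear()  # free the text of the pair just written
            elif name == "linkGrp":
                elem.clear()  # drop finished document groups
        out.write("</body>\n</tmx>\n")
    return count
```

Calling `alignment_to_tmx("jrc-en-pl.xml", "jrc-en-pl.tmx", "en", "pl")` (file names are placeholders) would write one `<tu>` per aligned pair; the resulting TMX can then be imported into a TM in the usual way.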


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 16:58
Member (2009)
Dutch to English
two tips Dec 13, 2015

Magdalena Kowalska wrote:

Hi,

I've downloaded the already aligned files from https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis. Now I need to convert those XML files to anything my memoQ can process, like CSV, if not directly to TMX.

How do I go about it? I've tried all online xml-csv converters I could find, but the files are too big for them to work.

Has anyone succeeded in using the JRC texts with their CAT tool?


I suggest getting Andras Farkas’s collection. For a small fee, he will supply you with the ultimate EU collection as TMXs, or in any other format you might want: http://www.farkastranslations.com/eu_translation_memories.php

The best place to start if you want to get the DGT/JRC stuff directly from the EU is here:

https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory



 

Emma Goldsmith  Identity Verified
Spain
Local time: 17:58
Member (2010)
Spanish to English
And a third tip Dec 13, 2015

Dominique Pivard posted a useful video on the DGT TM here:

https://www.youtube.com/watch?v=GNj07W2ZqhQ


 

Blaž Košir
Belgium
Local time: 17:58
English to Slovenian
Try here Dec 13, 2015

Try here: http://www.ttmem.com/terminology/download-translation-memory/european-commission-translation-memory/

 

Magdalena Kowalska  Identity Verified
United Kingdom
Local time: 17:58
Polish to English
TOPIC STARTER
Thanks Dec 16, 2015

I actually did that already a few years ago: downloading, aligning with that tool, etc. I just wasn't sure it was still the same TM. It is worth adding the 2015 additions though, which I'm doing right now.

 

Milan Condak  Identity Verified
Local time: 17:58
English to Czech
Extract and split Dec 16, 2015

Magdalena Kowalska wrote:

How do I go about it? I've tried all online xml-csv converters I could find, but the files are too big for them to work.


Ready-made TMXs are available in the DGT multilingual Translation Memory.

Since November 2007, the European Commission's Directorate-General for Translation has made its multilingual Translation Memory (DGT-TM) publicly available:

https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory

How to produce bilingual extractions

The multilingual extraction has English as the source language. Users can extract any language pair as follows, using the extraction tool TMXtract:
For the Windows Operating System:
Download the TMXtract.jar file;
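
If running TMXtract is not an option, the same kind of bilingual extraction can be approximated with a short script. The Python sketch below is only an illustration of the idea, not a replacement for the official tool. It assumes a standard TMX layout (`<tu>` containing `<tuv>` elements with a `lang` or `xml:lang` attribute such as `EN-GB`, each holding a `<seg>`), matches languages on the part before the hyphen, and writes a tab-separated file. The names `iter_pairs` and `tmx_to_tsv` are made up for this example.

```python
import csv
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def iter_pairs(tmx_path, src, tgt):
    """Yield (source, target) texts for every TU that has both languages."""
    for _event, elem in ET.iterparse(tmx_path, events=("end",)):
        if elem.tag != "tu":
            continue
        segs = {}
        for tuv in elem.iter("tuv"):
            # Older TMX files use 'lang', newer ones 'xml:lang'; accept both.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src in segs and tgt in segs:
            yield segs[src], segs[tgt]
        elem.clear()  # release the unit once it has been handled


def tmx_to_tsv(tmx_path, tsv_path, src="en", tgt="pl"):
    """Write one 'source<TAB>target' row per bilingual unit; return the row count."""
    rows = 0
    with open(tsv_path, "w", encoding="utf-8", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        for pair in iter_pairs(tmx_path, src, tgt):
            writer.writerow(pair)
            rows += 1
    return rows
```

Because the file is parsed as a stream, size should matter much less than with online converters. Units that lack one of the two languages are skipped.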

After extraction I use Heartsome TMX Editor for merging and splitting of TMXs.

http://www.condak.cz/nove/2015-12/08/cs/04.html

Another solution: use a CAT tool with a server for TMs. Felix-cat is now open source, and a server is included.

Milan


 

CafeTran Training (X)
Netherlands
Local time: 17:58
DGT-Translation Memory: different generations Jun 7, 2017

Different generations of the DGT-Translation Memory can be downloaded (2007, 2011, etc.). Do these generations overlap, i.e. contain identical TUs?
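
One way to check this on your own downloads is to count the TUs that two TMX files share. The Python sketch below is a rough illustration under some assumptions: both files use the standard `<tu>`/`<tuv>`/`<seg>` layout, a TU counts as identical only when every language variant matches exactly, and the files are small enough for their TUs to fit in memory (bilingual extractions are a safer input than the full multilingual files). The function names are made up for this example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tu_keys(tmx_path):
    """Return the set of TUs in a TMX file, each as a sorted tuple of (lang, text)."""
    keys = set()
    for _event, elem in ET.iterparse(tmx_path, events=("end",)):
        if elem.tag != "tu":
            continue
        variants = []
        for tuv in elem.iter("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            text = "".join(seg.itertext()).strip() if seg is not None else ""
            variants.append((lang, text))
        keys.add(tuple(sorted(variants)))
        elem.clear()
    return keys


def overlap(path_a, path_b):
    """Return (shared, unique_in_a, unique_in_b) counts of distinct TUs."""
    a, b = tu_keys(path_a), tu_keys(path_b)
    return len(a & b), len(a - b), len(b - a)
```

A non-zero first number would mean the two generations do contain identical TUs, at least under this strict definition of "identical".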

 

