moving from Trados to OmegaT
Thread poster: Vojtěch Drábek
Vojtěch Drábek
Czech Republic
English to Czech
May 27, 2014

Hi,

I was using a very old version of Trados and decided to try OmegaT. I created a project for files I have already translated and added my translation memory to it, but OmegaT reports only a negligible percentage of exact matches on files that are already in the translation memory. I thought this was because of different fuzzy matching rules, but surely an exact match is an exact match; moreover, I cannot find these settings (such as how whitespace, punctuation or numbers are handled when calculating the fuzzy match percentage) anywhere in OmegaT. Has anyone else encountered this problem?

Thanks a lot,

Vojtěch Drábek


 
Samuel Murray  Identity Verified
Netherlands
Local time: 15:21
Member (2006)
English to Afrikaans
@Vojtěch May 27, 2014

Vojtěch Drábek wrote:
I was using a very old version of Trados and decided to try OmegaT, but I tried creating a project I have already translated and adding my translation memory to it and it reports only a negligible percentage of exact matches on files that are already in the translation memory.


1. In what format are the source files?
2. Can you remember if the file had lots and lots of tags in Trados?
3. Have you tried switching your project to "paragraph segmentation" just to see what happens?




[Edited at 2014-05-27 18:08 GMT]


 
Susan Welsh  Identity Verified
United States
Local time: 09:21
Russian to English
TM format? May 27, 2014

You know, I expect, that you have to use the .tmx format for the TM (not .ttx), unless you go via the Okapi plugin for Trados (see the Documentation on the OmegaT website).

 
Vojtěch Drábek
Czech Republic
English to Czech
TOPIC STARTER
moving from Trados to OmegaT May 27, 2014

Thanks for the replies. The files are HTML, converted from custom-format message files. I consider having to convert the files to HTML and back a weak point, but as OmegaT does not seem to support custom filters, I have set that aside for now. The files do not have many tags; in fact, I think there are very few, if any. If by paragraph segmentation you mean the checkbox in the project properties (sentence-level segmentation), it has no effect (and the segments are so small that it should not). The translation memory is in TMX format. I noticed a potential problem with normal vs. non-breaking spaces, but I do not know where to change the setting to treat the two kinds of space as equal. However, there are still segments that are exactly the same but are not inserted automatically.
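
One way to test the non-breaking-space suspicion before changing anything is to compare the source segments against the TMX outside OmegaT. The following is a minimal Python sketch, not a reproduction of OmegaT's own matching: the function names are hypothetical, the TMX is assumed to be well-formed XML with `xml:lang` (or the older `lang`) attributes on its `tuv` elements, and the list of segments to check is assumed to be collected by hand.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_source_segments(tmx_file, source_lang):
    """Return the set of source-language segment texts found in a TMX file."""
    segments = set()
    for tuv in ET.parse(tmx_file).getroot().iter("tuv"):
        # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
        lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
        if lang.lower().startswith(source_lang.lower()):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also picks up text nested inside inline tag elements.
                segments.add("".join(seg.itertext()))
    return segments


def collapse(text):
    """Treat every whitespace run, including U+00A0, as one plain space."""
    return " ".join(text.replace("\u00a0", " ").split())


def classify(segments, tm_segments):
    """Label each segment as an exact hit, a whitespace-only miss, or no match."""
    normalized_tm = {collapse(s) for s in tm_segments}
    result = {}
    for seg in segments:
        if seg in tm_segments:
            result[seg] = "exact"
        elif collapse(seg) in normalized_tm:
            result[seg] = "whitespace-only difference"
        else:
            result[seg] = "no match"
    return result
```

If most of the misses come back as "whitespace-only difference", the spaces are the likely culprit; segments reported as "no match" point to some other cause, such as tags or segmentation.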


 
Samuel Murray  Identity Verified
Netherlands
Local time: 15:21
Member (2006)
English to Afrikaans
Automatic insertion May 27, 2014

Vojtěch Drábek wrote:
The files are HTML, converted from custom format message files. This (having to convert the files to HTML and then back) I consider a weak point but as OmegaT does not seem to have the capability of custom filters, I set that aside for now.


That is indeed a weak point of OmegaT, though I suspect one could just say that OmegaT's intended user is not someone who would want to create custom filters unless they know Java. The tradition in OmegaT is that only developers create additional filters and then distribute them in the next release.

However, there are still segments that are exactly the same but are not inserted automatically.


Perhaps you know this, but if you want automatic insertion of 100% matches, the TMX file must be in the /tm/auto/ subfolder.
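
For anyone who wants to script that step, here is a small Python sketch that copies a TMX file into the tm/auto/ subfolder of a project. The function name and the example paths are hypothetical; the only thing taken from the advice above is the /tm/auto/ location.

```python
import shutil
from pathlib import Path


def install_auto_tm(tmx_path, project_dir):
    """Copy a TMX file into <project>/tm/auto/ and return the new path."""
    auto_dir = Path(project_dir) / "tm" / "auto"
    # Create tm/auto/ if the project does not have it yet.
    auto_dir.mkdir(parents=True, exist_ok=True)
    target = auto_dir / Path(tmx_path).name
    shutil.copy2(tmx_path, target)
    return target
```

A call might look like `install_auto_tm("legacy_trados.tmx", "MyProject")`, with both names standing in for your own files. You may need to reload the project afterwards for the new memory to be picked up.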


 
Vojtěch Drábek
Czech Republic
English to Czech
TOPIC STARTER
fuzzy match settings May 27, 2014


Samuel Murray wrote:

Vojtěch Drábek wrote:
The files are HTML, converted from custom format message files. This (having to convert the files to HTML and then back) I consider a weak point but as OmegaT does not seem to have the capability of custom filters, I set that aside for now.


That is indeed a weak point of OmegaT, though I suspect one could just say that OmegaT's intended user is not someone who would want to create custom filters unless they know Java. The tradition in OmegaT is that only developers create additional filters and then distribute them in the next release.

Well, I know some Java, but it would require digging into the code and I have no time for that right now. It is not a big problem at the moment, though.

However, there are still segments that are exactly the same but are not inserted automatically.


Perhaps you know this, but if you want automatic insertion of 100% matches, the TMX file must be in the /tm/auto/ subfolder.



I did not know that, thanks! That leaves non-breaking spaces as the probable cause of the problems. I need to find the setting that controls how spaces are treated (if there is one), or I will have to change all spaces in the source files.
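
If it does come down to changing the spaces in the source files, the replacement can be scripted. The Python sketch below is only one possible approach, with hypothetical function names: it assumes UTF-8 HTML files and treats the literal U+00A0 character and the `&nbsp;`, `&#160;` and `&#xA0;` entities as the things to replace. It keeps a `.bak` copy of each changed file, since a non-breaking space is sometimes intentional.

```python
import re
from pathlib import Path

NBSP = "\u00a0"
# Entity spellings of the non-breaking space that can appear in HTML.
NBSP_ENTITY = re.compile(r"&nbsp;|&#0*160;|&#x0*a0;", re.IGNORECASE)


def normalize_spaces(text):
    """Replace literal and entity-encoded non-breaking spaces with plain spaces."""
    text = NBSP_ENTITY.sub(" ", text)
    return text.replace(NBSP, " ")


def normalize_file(path, encoding="utf-8"):
    """Rewrite one file in place; return True if anything was changed."""
    path = Path(path)
    original = path.read_text(encoding=encoding)
    cleaned = normalize_spaces(original)
    if cleaned == original:
        return False
    # Keep the untouched text next to the file before overwriting it.
    backup = path.with_suffix(path.suffix + ".bak")
    backup.write_text(original, encoding=encoding)
    path.write_text(cleaned, encoding=encoding)
    return True
```

The same `normalize_spaces` function could be applied to the TMX instead of the source files, if it is the memory rather than the sources that contains the non-breaking spaces.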


 
Milan Condak  Identity Verified
Local time: 15:21
English to Czech
Segmentation and tags May 27, 2014

Vojtěch Drábek wrote:

...different fuzzy matching rules, but surely exact match...



I made a presentation in Czech; I hope it helps:

Less segments - more matches

Adding the abbreviation into CAT

http://www.condak.net/cat_other/omegat/2013-07-24/cs/00.html

Another method for better matching is to remove the tags.

For other filters, look at the Okapi tools; see

http://www.proz.com/forum/czech/265756-nástroje_okapi.html

Check Rainbow if your format is not supported.

Milan



[Edited at 2014-05-27 20:07 GMT]


 
Samuel Murray  Identity Verified
Netherlands
Local time: 15:21
Member (2006)
English to Afrikaans
Editing the source files May 27, 2014

Vojtěch Drábek wrote:
...or I will have to change all spaces in the source files.


Editing the source files is a common suggestion in the OmegaT world.


 
Didier Briel  Identity Verified
France
Local time: 15:21
English to French
Howto on editing/creating a file filter Jun 19, 2014

Vojtěch Drábek wrote:


The files are HTML, converted from custom format message files. This (having to convert the files to HTML and then back) I consider a weak point but as OmegaT does not seem to have the capability of custom filters, I set that aside for now.


That is indeed a weak point of OmegaT, though I suspect one could just say that OmegaT's intended user is not someone who would want to create custom filters unless they know Java. The tradition in OmegaT is that only developers create additional filters and then distribute them in the next release.


Well, I know some Java, but it would require digging into the code and I have no time for that right now. It is not a big problem at the moment, though.

There is a specific howto to edit or create a file filter:
http://www.omegat.org/en/howtos/new_filter.html
and to compile OmegaT:
http://www.omegat.org/en/howtos/compiling_from_source.html

Didier


 
Vojtěch Drábek
Czech Republic
English to Czech
TOPIC STARTER
custom file filter Jul 8, 2014

Didier Briel wrote:

There is a specific howto to edit or create a file filter:
http://www.omegat.org/en/howtos/new_filter.html
and to compile OmegaT:
http://www.omegat.org/en/howtos/compiling_from_source.html

Didier


Thanks, but that howto is not really about creating file filters; it is about creating one XML filter from another by changing some tags. Many of my files are not XML, so for now I transform them into a kind of HTML. That is not ideal: I have now run into a problem where OmegaT apparently does something to the text (some whitespace normalization) outside the filter, and I cannot find where.


 
Didier Briel  Identity Verified
France
Local time: 15:21
English to French
For HTML, spaces are handled in FilterVisitor.java Jul 8, 2014

Vojtěch Drábek wrote:

Didier Briel wrote:

There is a specific howto to edit or create a file filter:
http://www.omegat.org/en/howtos/new_filter.html
and to compile OmegaT:
http://www.omegat.org/en/howtos/compiling_from_source.html

Didier


Thanks, but that howto is not really about creating file filters; it is about creating one XML filter from another by changing some tags. Many of my files are not XML, so for now I transform them into a kind of HTML. That is not ideal: I have now run into a problem where OmegaT apparently does something to the text (some whitespace normalization) outside the filter, and I cannot find where.

We're already discussing this point on the OmegaT mailing list. However, for the record: for the HTML filter, spaces are handled in FilterVisitor.java. The howto for creating file filters applies partially. Duplicate the HTML filter under new names, and then you can make changes in the new filter.

Didier


 

