Term Recognition for Non-Space languages
Thread poster: Peter Ross
Peter Ross  Identity Verified
Australia
Thai to English
Apr 29, 2011

Hi

Does anyone know for sure which CAT programs support term recognition for non-space languages (languages that put spaces only between phrases or sentences, not between words)? Thai and Lao are examples.

(Such programs would need a special algorithm or lookup to find and match terms within phrases or sentences.)
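
The "lookup" idea can be sketched as a plain substring search over the unsegmented sentence. This is only an illustration, not how any particular CAT tool works; the function name `find_terms`, the two-entry glossary, and the sample sentence are assumptions made up for the example.

```python
def find_terms(sentence, glossary):
    """Return (start, end, term, translation) tuples for glossary terms in sentence.

    Longer terms are tried first, and a shorter term is skipped when it
    overlaps a span that a longer term has already claimed.
    """
    hits = []
    taken = [False] * len(sentence)
    for term in sorted(glossary, key=len, reverse=True):
        if not term:
            continue  # an empty term would match everywhere
        start = sentence.find(term)
        while start != -1:
            end = start + len(term)
            if not any(taken[start:end]):
                hits.append((start, end, term, glossary[term]))
                for i in range(start, end):
                    taken[i] = True
            start = sentence.find(term, start + 1)
    return sorted(hits)


# Hypothetical glossary and sentence ("I study Thai", written without spaces).
glossary = {"ภาษาไทย": "Thai language", "ภาษา": "language"}
sentence = "ฉันเรียนภาษาไทย"
hits = find_terms(sentence, glossary)
```

Preferring the longest term keeps "ภาษา" from being reported inside "ภาษาไทย". A pure substring search can still report a term that straddles two real words, which is why a word-aware algorithm is the more robust option.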

Thanks

Peter


 
Didier Briel  Identity Verified
France
English to French
OmegaT, for some languages Apr 29, 2011

Peter Ross wrote:
Does anyone know for sure which CAT programs support term recognition for non-space languages (languages that put spaces only between phrases or sentences, not between words)? Thai and Lao are examples.

(Such programs would need a special algorithm or lookup to find and match terms within phrases or sentences.)

OmegaT has a tokenizer plugin, which improves term recognition for such languages.

However, the tokenizers are the ones provided by Lucene, so only certain languages are covered.
I know Chinese and Japanese work.
A Thai tokenizer is available too, but I have had no feedback on it, so I don't know how well it works.
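
To illustrate why tokenizing helps, here is a minimal sketch of term matching on word tokens rather than raw characters. It is not OmegaT's or Lucene's code: the function `match_terms` is hypothetical, and the token lists are assumed to come from whatever tokenizer is available.

```python
def match_terms(sentence_tokens, term_tokens_by_term):
    """Return (position, term) for terms whose tokens form a contiguous run in the sentence."""
    found = []
    n = len(sentence_tokens)
    for term, term_tokens in term_tokens_by_term.items():
        m = len(term_tokens)
        if m == 0:
            continue
        for i in range(n - m + 1):
            # A term matches only when it lines up with whole tokens.
            if sentence_tokens[i:i + m] == term_tokens:
                found.append((i, term))
                break
    return sorted(found)


# Assumed tokenizer output for "ตากลม" read as "round eyes": ตา (eye) + กลม (round).
sentence_tokens = ["ตา", "กลม"]
# Hypothetical termbase, already tokenized: ตาก (to dry in the sun) and กลม (round).
terms = {"ตาก": ["ตาก"], "กลม": ["กลม"]}
matches = match_terms(sentence_tokens, terms)
```

The character sequence "ตาก" does occur in the raw string "ตากลม", so a substring search would flag it, but it does not coincide with any whole token here, so the token-based match reports only "กลม".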

Didier


 
Peter Ross  Identity Verified
Australia
Thai to English
TOPIC STARTER
Term Recognition for Non-Space languages - presegmentation May 10, 2011

Thanks Didier

Hats off to OmegaT and other free programs. The interface is not for the faint-hearted, so I'm sorry to say I didn't succeed in testing the Thai tokenizer.

I understand that when programs like Microsoft Word run in a localized configuration, they carry out a kind of background segmentation that lets words in non-space languages be recognized as such (for example, in Thai, clicking on "unsegmented" text will highlight a single Thai word). However, it seems that programs like Trados and Wordfast have yet to implement this kind of feature, so term recognition cannot work.

The alternative is to presegment text before translation. For Thai, there's a free program at http://pioneer.chula.ac.th/~awirote/index.html.
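
As a rough idea of what presegmentation involves, here is a sketch that splits text by greedy longest match against a word list and then joins the pieces with a delimiter. It is not the algorithm of the program linked above; the functions `segment` and `presegment` and the tiny lexicon are assumptions for illustration, and a real tool would need a much larger dictionary and better handling of ambiguity.

```python
def segment(text, lexicon):
    """Split text into tokens by greedy longest match against a word list."""
    max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def presegment(text, lexicon, delimiter=" "):
    """Insert a delimiter between the words found by segment()."""
    return delimiter.join(segment(text, lexicon))


# Hypothetical mini-lexicon; the sentence means "I study Thai".
lexicon = {"ฉัน", "เรียน", "ภาษา", "ไทย", "ภาษาไทย"}
text = "ฉันเรียนภาษาไทย"
tokens = segment(text, lexicon)
presegmented = presegment(text, lexicon)
```

Passing `delimiter="\u200b"` (a zero-width space) would leave the text looking unchanged on screen. Whether a given CAT tool treats that character as a word boundary is something to test, not something this sketch guarantees.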

Peter


 
Didier Briel  Identity Verified
France
English to French
It's easier with support May 10, 2011

Peter Ross wrote:
Hats off to OmegaT and other free programs. The interface is not for the faint-hearted, so I'm sorry to say I didn't succeed in testing the Thai tokenizer.


With the help of the Yahoo support group, it would be easier.

Didier


 
Selcuk Akyuz  Identity Verified
Türkiye
English to Turkish
Deja Vu X May 11, 2011

I ran a test with DVX: I entered some terms into the termbase, and DVX recognized them. You can download a 30-day trial version of DVX and test it yourself. See also the DVX group at http://tech.groups.yahoo.com/group/dejavu-l/

Selcuk


 

