https://www.proz.com/forum/omegat_support/283598-issue_with_inflections_in_glossary_and_memory.html&phpv_redirected=1

Issue with inflections in glossary and memory
Thread poster: nivaca
nivaca
nivaca
Colombia
Local time: 04:28
Mar 23, 2015

The glossary and translation memory don't work well with inflected languages such as Latin. For instance, if the source contains the word "dicendum" (the gerundive of "dico" ["I say"]) and I add it to the glossary as it appears, then other inflections of the word, such as "dicitur" or "dicamus", will not be recognised by the glossary.

It would be quite useful if OmegaT offered some way of dealing with inflections in the glossary. One simple way might be to allow "or" alternatives or regex in glossary entries, e.g. "dic[endum, atum]".
Is this possible as of today?
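
To make the regex idea concrete, here is a minimal Python sketch of what such a glossary entry could match. It only illustrates the proposal and is not an existing OmegaT feature; the pattern and the sample sentence are made up for the example. Note that in standard regex syntax the alternatives would be written as "dic(endum|itur|amus)", since square brackets denote a character class.

```python
import re

# Hypothetical glossary entry written as a regular expression.
# The listed endings are illustrative, not a complete paradigm of "dico".
pattern = re.compile(r"\bdic(endum|itur|amus|o)\b", re.IGNORECASE)

# Made-up source segment for the example.
segment = "Ad primum ergo dicendum quod hoc dicitur multipliciter."

# Collect every inflected form the pattern recognises in the segment.
matches = [m.group(0) for m in pattern.finditer(segment)]
print(matches)
```

The drawback of this approach is that every ending has to be listed by hand for every entry.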


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 05:28
Russian to English
It does Mar 23, 2015

The tokenizer function in OmegaT does that. Check the users' manual. How well it works for Latin I can't say, but it works for Russian and German.

Susan


 
nivaca
nivaca
Colombia
Local time: 04:28
TOPIC STARTER
Not for Latin. Mar 24, 2015

But there is no tokenizer for Latin, I'm afraid.

 
Didier Briel
Didier Briel  Identity Verified
France
Local time: 11:28
English to French
It relies on the Hunspell dictionary Mar 24, 2015

nivaca wrote:

But there is no tokenizer for Latin, I'm afraid.

For languages not covered by Lucene, the tokenizer is provided by Hunspell (you have to install the Hunspell dictionary corresponding to the source language).
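
As a rough illustration of why a stemming tokenizer helps here: if both the glossary term and the source tokens are reduced to a stem before comparison, inflected forms can hit the same entry. The Python sketch below uses a toy suffix list invented for this example; it is not how Hunspell or OmegaT actually stem Latin, and the names `toy_stem` and `lookup` are hypothetical.

```python
# Toy suffix list, assumed for illustration only.
# This is NOT a real Latin stemmer and not Hunspell's or OmegaT's logic.
SUFFIXES = ("endum", "amus", "itur", "o")

def toy_stem(word):
    """Lower-case the word and strip the first matching toy suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word

# Glossary entered with one inflected form only, as in the original question.
glossary = {"dicendum": "say"}

# Index the glossary by stem instead of by surface form.
stemmed_glossary = {toy_stem(term): gloss for term, gloss in glossary.items()}

def lookup(segment):
    """Return glossary hits for every token whose stem matches an entry."""
    hits = {}
    for token in segment.replace(",", " ").replace(".", " ").split():
        gloss = stemmed_glossary.get(toy_stem(token))
        if gloss is not None:
            hits[token] = gloss
    return hits

print(lookup("Sic dicitur, et sic dicamus."))
```

With a real stemmer the quality of the matches depends entirely on the affix information in the dictionary, which is why the choice of Hunspell dictionary matters.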

I tried and I couldn't get it to work. That might be because the Hunspell dictionary I installed doesn't contain the necessary information, or because it does accept some stemming, but not the one I tried.

I tried these two dictionaries:
http://rpmfind.net/linux/rpm2html/search.php?query=hunspell-la
http://extensions.openoffice.org/en/project/latin-spelling-and-hyphenation-dictionaries

You can find information on Hunspell stemming here:
http://manpages.ubuntu.com/manpages/dapper/man4/hunspell.4.html

Didier


 
nivaca
nivaca
Colombia
Local time: 04:28
TOPIC STARTER
Worked in Linux Mar 24, 2015

Didier,

Your recommendation of using Hunspell with a Latin dictionary worked fine on Linux (Xubuntu 14.10). The glossary now seems to handle inflections correctly.

However, it doesn't work for me on Mac OS X. (I installed Hunspell with Brew and used the very same dictionary.) I suppose there's still some fiddling to do to make it work.

Thanks.

Nicolas


 

