OmegaT 2.0 released
Thread poster: Vito Smolej
Vito Smolej
Vito Smolej
Germany
Local time: 00:55
Member (2004)
English to Slovenian
+ ...
SITE LOCALIZER
Oct 11, 2009

Dear All,

The 2.0 version of OmegaT is now released as a stable version, including a
revised manual.

Compared to the previous 1.8, 2.0 offers 39 functional enhancements.

The loading and indexing system has been completely rewritten, providing "on
demand" matching. As a result, load time should now be under a minute in
most cases. Memory consumption has also been reduced, making it possible to
load large projects (e.g., 300,000 words) together with large translation
memories (e.g., 63 MB, 20,000 entries). The on-demand computation is still
very fast, and the difference isn't usually noticeable.
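
For anyone wondering what "on demand" matching could look like, here is a minimal Python sketch. It is not OmegaT's code: the `LazyMatcher` class, its method names and the sample TM entries are all made up for illustration, and `difflib.SequenceMatcher` stands in for whatever similarity measure OmegaT really uses. The point is only that loading stores the TM, while matches are computed (and cached) the first time a segment is opened.

```python
from difflib import SequenceMatcher


class LazyMatcher:
    """Hypothetical on-demand matcher: no matching work happens at load time."""

    def __init__(self, tm_entries):
        # tm_entries: list of (source, target) pairs; loading just stores them.
        self.tm_entries = list(tm_entries)
        self._cache = {}
        self.computations = 0  # counts how often matching really ran

    def matches_for(self, segment, threshold=0.5):
        """Compute fuzzy matches for a segment the first time it is requested."""
        if segment not in self._cache:
            self.computations += 1
            scored = []
            for source, target in self.tm_entries:
                score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
                if score >= threshold:
                    scored.append((score, source, target))
            # Best match first.
            scored.sort(key=lambda item: item[0], reverse=True)
            self._cache[segment] = scored
        return self._cache[segment]


# Made-up sample data (English to French).
tm = [
    ("Open the file menu.", "Ouvrez le menu Fichier."),
    ("Close the file menu.", "Fermez le menu Fichier."),
    ("Save your work often.", "Enregistrez souvent votre travail."),
]
matcher = LazyMatcher(tm)  # "loading": nothing is matched yet
first = matcher.matches_for("Open the edit menu.")
again = matcher.matches_for("Open the edit menu.")  # served from the cache
```

The trade-off is the one described above: a small delay when a segment is first opened, in exchange for a much shorter load time.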

The Editor has been rewritten, providing enhanced features for RTL
languages.

Using OmegaT-tokenizers (http://sourceforge.net/projects/omegat-plugins),
OmegaT 2.0 can compute fuzzy matches and glossary matches based on stemming,
which can greatly improve matching in most languages. "Stop words" are also
ignored in fuzzy matches for a number of languages, further improving the
matches.
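
To illustrate why stemming and stop-word removal help, here is a rough Python sketch. It does not use the OmegaT-tokenizers code: the suffix rules, the stop-word list and the Jaccard score are simplifying assumptions chosen for illustration, and a real stemmer is far more careful.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}  # tiny illustrative list
SUFFIXES = ("ing", "ed", "s")  # toy rules, far cruder than a real stemmer


def toy_stem(word):
    """Strip one common English suffix; a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokens(text, stem=True, drop_stop_words=True):
    """Lower-case the text, split it into words, then optionally filter and stem."""
    words = re.findall(r"[a-z]+", text.lower())
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    if stem:
        words = [toy_stem(w) for w in words]
    return set(words)


def similarity(a, b, **options):
    """Jaccard overlap of the two token sets, between 0.0 and 1.0."""
    ta, tb = tokens(a, **options), tokens(b, **options)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


new_segment = "The translator opened the files"
tm_segment = "A translator opens a file"

# Raw words: only "translator" is shared, so the score is low.
plain = similarity(new_segment, tm_segment, stem=False, drop_stop_words=False)
# Stemmed, without stop words: both segments reduce to the same token set.
improved = similarity(new_segment, tm_segment)
```

With raw words the two segments barely overlap; after stemming and dropping "the" and "a" they are treated as the same set of tokens.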

OmegaT supports dictionaries in StarDict (http://stardict.sourceforge.net/)
format.

OmegaT can now fetch a machine translation of the current segment from
Google Translate.

There are new filters for QuarkXPress Copy Flow Gold (making it possible to
use OmegaT for DTP projects), SubRip subtitles (SRT), LaTeX, Android
resources and ResX resources. The PO filter now loads existing translations.

OmegaT is available as a Java Web Start application
(http://omegat.sourceforge.net/webstart.html), so it can be used without
any installation.

Stability has also been improved, with several important bug fixes.

As part of these enhancements, OmegaT now requires Java 1.5.

Compared with the previous 2.0.4 update 1, the new stable 2.0.5 contains a
revised manual and a command-line feature to generate pseudo-translated
TMX files.
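
For readers unfamiliar with the idea, here is a minimal Python sketch of what a pseudo-translated TMX can look like, assuming the simplest variant in which each target segment is just a copy of its source. This is not OmegaT's command-line feature (see the manual for the actual syntax): the function name, the tool name in the header and the sample segments are invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree's spelling of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pseudo_translated_tmx(segments, source_lang, target_lang):
    """Return TMX text in which every target segment is a copy of its source."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "pseudo-tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for text in segments:
        tu = ET.SubElement(body, "tu")
        for lang in (source_lang, target_lang):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text  # target is a copy of the source
    return ET.tostring(tmx, encoding="unicode")


xml_text = pseudo_translated_tmx(
    ["Open the file.", "Save your work."], "en", "de"
)
```

A file like this is mainly useful for testing a workflow end to end before any real translation exists.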

OmegaT 2.0.5 can be downloaded from
https://sourceforge.net/projects/omegat/files/

... as per broadcast by Didier

[Edited at 2009-10-11 13:05 GMT]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 00:55
Member (2006)
English to Afrikaans
+ ...
Some notes Oct 11, 2009

VitoSmolej wrote:
OmegaT supports dictionaries in StarDict (http://stardict.sourceforge.net/)
format.


Note that not all StarDict dictionaries work in OmegaT. Apparently there are several dialects (different subformats) of StarDict, and OmegaT works only with some of them. There is no way to tell which dictionaries will or will not work -- the only way to tell is to try to use it.

OmegaT is available as a Java Web Start application
(http://omegat.sourceforge.net/webstart.html), so it can be used without
any installation.


Just in case anyone isn't familiar with Java Web Start, well, it doesn't install OmegaT on your computer but it does download the entire program every time you want to use it. So this option wouldn't save you from having to download it -- it simply saves you from having to install it.

...the new stable 2.0.5 contains a ... command line feature to generate pseudo translated TMXs.


Do you happen to know where in the user manual this procedure would be described?

Samuel


 
Didier Briel
Didier Briel  Identity Verified
France
Local time: 00:55
English to French
+ ...
Java Web Start does download the program Oct 11, 2009

Samuel Murray wrote:

OmegaT is available as a Java Web Start application
(http://omegat.sourceforge.net/webstart.html), so it can be used without
any installation.


Just in case anyone isn't familiar with Java Web Start, well, it doesn't install OmegaT on your computer but it does download the entire program every time you want to use it. So this option wouldn't save you from having to download it -- it simply saves you from having to install it.

No, it downloads the program into a "Java cache". It only downloads it again if there are changes (thus providing automatic updates).


...the new stable 2.0.5 contains a ... command line feature to generate pseudo translated TMXs.


Do you happen to know where in the user manual this procedure would be described?


As described in changes.txt:
- Generate pseudo-translated tmx
(see documentation->translation memories->pseudo-translated memory)

Didier


 
FarkasAndras
FarkasAndras  Identity Verified
Local time: 00:55
English to Hungarian
+ ...
Size matters Oct 11, 2009

VitoSmolej wrote:

The loading and indexing system has been completely rewritten, providing "on
demand" matching. As a result, load time should now be under a minute in
most cases. Memory consumption has also been reduced, making it possible to
load large projects (e.g., 300,000 words) together with large translation
memories (e.g., 63 MB, 20,000 entries).


Am I the only one who finds this woefully inadequate?
I mean, even leaving aside large projects that generate large TMs, just the Acquis TM in itself is about 1 million TUs, and if you add a TM created from the Europarl corpus plus a bit of this and that, you can easily get to 10 times what OmegaT claims to be able to handle.
This is 2009; people are using large memories. If you overhaul your TM handling, you should make sure it can handle a million or so TUs.

Anyway, it's good to see OmegaT development continue.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 00:55
Member (2006)
English to Afrikaans
+ ...
Some ideas Oct 11, 2009

FarkasAndras wrote:
I mean, ... just the Acquis TM in itself is about 1 million TUs, and if you add a TM created from the Europarl corpus plus a bit of this and that you can easily get to 10 times what OmegaT claims to be able to handle. ... This is 2009, people are using large memories.


Personally, I think there comes a point at which a standalone TM program is no longer sufficient, and it becomes necessary for the program to connect to a TM server if the user wants to use large TMs.

I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it?


 
Laurent KRAULAND (X)
Laurent KRAULAND (X)  Identity Verified
France
Local time: 00:55
French to German
+ ...
Indeed ;) Oct 11, 2009

FarkasAndras wrote:


Anyway, it's good to see OmegaT development continue.


And thanks for the information, Vito!

Samuel Murray wrote:
I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it?


Why would anyone need to host such an oversized TM on a freelancer's computer/storage device anyway? Just wondering (it's not a question of capacity: I have 1 TB at my disposal)...

[Edited at 2009-10-11 21:03 GMT]


 
Vito Smolej
Vito Smolej
Germany
Local time: 00:55
Member (2004)
English to Slovenian
+ ...
TOPIC STARTER
SITE LOCALIZER
Documentation as PDF Oct 12, 2009

see my profile here:

http://www.proz.com/profile/91005 *

or use the URL

http://www.textnart.de/OmegaT.pdf

I would appreciate hearing about omissions, inconsistencies etc.

Regards

Vito

* - plugging it here, just to improve my PageRank (g)


 
Vito Smolej
Vito Smolej
Germany
Local time: 00:55
Member (2004)
English to Slovenian
+ ...
TOPIC STARTER
SITE LOCALIZER
re Acquis TM Oct 12, 2009

Samuel Murray wrote:
I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it?


Here are some actual numbers:

Size: 270 MB (just the DE-SL part)
Loading time: about 3 seconds on 2.0x OmegaT

Note that the Acquis material includes all the current languages (a nice XSLT script, anyone? ;), so, yes, it is huge. However, if you go for a single pair, it may still be huge ... but less huge (g).
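
In place of the XSLT script, here is a rough Python sketch of the same idea: pulling a single language pair out of a multilingual TM. It assumes the material is available as a multilingual TMX file, which may not be how the Acquis is actually distributed; the function name and the tiny sample document are invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree's spelling of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pair(tmx_text, source_lang, target_lang):
    """Return (source, target) text pairs for TUs that contain both languages."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        by_lang = {}
        for tuv in tu.findall("tuv"):
            # Older TMX files may use a plain "lang" attribute instead.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                by_lang[lang] = "".join(seg.itertext())
        if source_lang in by_lang and target_lang in by_lang:
            pairs.append((by_lang[source_lang], by_lang[target_lang]))
    return pairs


# Made-up miniature multilingual TMX; the second TU has no Slovenian variant.
sample = """<tmx version="1.4">
  <header creationtool="sample" creationtoolversion="0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Article 1</seg></tuv>
      <tuv xml:lang="de"><seg>Artikel 1</seg></tuv>
      <tuv xml:lang="sl"><seg>Člen 1</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Regulation</seg></tuv>
      <tuv xml:lang="de"><seg>Verordnung</seg></tuv>
    </tu>
  </body>
</tmx>"""

pairs = extract_pair(sample, "de", "sl")
```

For a file of several hundred MB, a streaming parser such as `ET.iterparse` would be a better fit than reading everything into memory as done here.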



[Edited at 2009-10-12 18:51 GMT]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 00:55
Member (2006)
English to Afrikaans
+ ...
Just three seconds? Oct 12, 2009

VitoSmolej wrote:
Here are some actual numbers:
Size: 270MB (just DE SL part)
Loading time: about 3 seconds on 2.0x OmegaT


Just three seconds and it will give a fuzzy match from a segment anywhere in the TM (including, say, the rear)???


 
Vito Smolej
Vito Smolej
Germany
Local time: 00:55
Member (2004)
English to Slovenian
+ ...
TOPIC STARTER
SITE LOCALIZER
Stand by for further news ... Oct 13, 2009

Samuel Murray wrote:
Just three seconds and it will give a fuzzy match from a segment anywhere in the TM (including, say, the rear)???

I'll do some more tests and report.

Quoting how long it takes to load of course says nothing about how fast it matches. But I think it would be the laugh of the year if the access time scaled anything but logarithmically with TM size. Users would have noticed this some time ago.
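
To put "logarithmically" into numbers, here is a small Python sketch that counts the steps of a plain binary search over sorted keys of growing size. It only models an exact lookup in a sorted index, which is an assumption made for illustration; it says nothing about how OmegaT's fuzzy matching is actually implemented, and the key format is invented.

```python
import math


def binary_search_steps(sorted_keys, key):
    """Return (found, steps) for a plain binary search over sorted keys."""
    low, high, steps = 0, len(sorted_keys) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_keys[mid] == key:
            return True, steps
        if sorted_keys[mid] < key:
            low = mid + 1
        else:
            high = mid - 1
    return False, steps


for size in (1_000, 20_000, 1_000_000):
    # Zero-padded keys are already in sorted order.
    keys = [f"segment {i:07d}" for i in range(size)]
    found, steps = binary_search_steps(keys, keys[-1])
    # The step count never exceeds ceil(log2(size + 1)).
    print(size, found, steps, math.ceil(math.log2(size + 1)))
```

Going from 20,000 to 1,000,000 entries multiplies the size by 50 but adds only a handful of steps, which is why a lookup of this kind would barely notice the difference.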

Of course, what I think may not match reality. So let's do some tests.

Regards

Vito


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 18:55
Russian to English
+ ...
OmegaT-tokenizers 0.2-2.0 released Oct 14, 2009

OmegaT-tokenizers has been updated to include Lucene 2.9.0. This is the plugin that enables "stemming" (to find inflected forms of words) in glossary and fuzzy matching, and "stop-word" filtering, which eliminates little words like "and" and "the" from TM fuzzy matching.

The following new tokenizers are available:
Arabic, Persian, SmartChinese, Turkish, Hungarian and Romanian.

OmegaT-tokenizers is available from
https://sourceforge.net/projects/omegat-plugins/

(This just in from Didier.)


 
Hakan Kiyici
Hakan Kiyici  Identity Verified
Türkiye
Local time: 01:55
Member (2009)
English to Turkish
+ ...
disappointed again Nov 24, 2010

I had installed OmegaT earlier. It did not work properly, so I gave up.

I read some articles about SubRip file types that recommended OmegaT, so I installed the latest version. It runs at 50% CPU and gets stuck. Incredibly slow at times.


 
Didier Briel
Didier Briel  Identity Verified
France
Local time: 00:55
English to French
+ ...
It is not a normal behaviour Nov 24, 2010

Hakan Kiyici wrote:

I had installed OmegaT earlier. It did not work properly, so I gave up.

I read some articles about SubRip file types that recommended OmegaT, so I installed the latest version. It runs at 50% CPU and gets stuck. Incredibly slow at times.

This is not normal behaviour.

What is your operating system?

What version of OmegaT did you install?

Didier


 

