OmegaT 2.0 released Thread poster: Vito Smolej
| Vito Smolej Germany Local time: 00:55 Member (2004) English to Slovenian + ... SITE LOCALIZER
Dear All, The 2.0 version of OmegaT is now released as a stable version, including a revised manual. Compared to the previous 1.8, 2.0 offers 39 functional enhancements. The loading and indexing system has been completely rewritten, providing "on demand" matching. As a result, load time should now be under a minute in most cases. Memory consumption has also been reduced, making it possible to load large projects (e.g., 300,000 words) together with large translation memories (e.g., 63 MB, 20,000 entries). The on-demand computation is still very fast, and the difference isn't usually noticeable. The Editor has been rewritten, providing enhanced features for RTL languages. Using OmegaT-tokenizers (http://sourceforge.net/projects/omegat-plugins), OmegaT 2.0 can compute fuzzy matches and glossary matches based on stemming, which can considerably improve matching in most languages. "Stop words" are also ignored in fuzzy matches for a number of languages, further improving the matches. OmegaT supports dictionaries in StarDict (http://stardict.sourceforge.net/) format. OmegaT can now fetch a machine translation of the current segment from Google Translate. There are new filters for QuarkXPress Copy Flow Gold (which makes OmegaT usable for DTP projects), SubRip subtitles (SRT), LaTeX, Android resources and ResX resources. The PO filter now loads existing translations. OmegaT is available as a Java Web Start application (http://omegat.sourceforge.net/webstart.html), so it can be used without any installation.
Stability has also been improved, with several important bug fixes. As part of these enhancements, OmegaT now requires Java 1.5. Compared with the previous 2.0.4 update 1, the new stable 2.0.5 contains a revised manual and a command line feature to generate pseudo translated TMXs. OmegaT 2.0.5 can be downloaded from https://sourceforge.net/projects/omegat/files/ ... as per broadcast by Didier
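Editor's illustration: the stemming and stop-word handling described in the announcement above can be sketched roughly as follows. This is only a toy under stated assumptions, not OmegaT's or Lucene's actual code: the stop-word list, the suffix list and the function names `stem`, `tokenize` and `fuzzy_score` are all hypothetical, and the score comes from Python's standard `difflib`, not from OmegaT's own matching algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical, tiny stop-word list; real tokenizers ship per-language lists.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is"}

# Hypothetical suffixes for a crude English stemmer (not Lucene's algorithm).
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(segment, use_stemming=True):
    """Lowercase, split on whitespace, drop stop words, optionally stem."""
    tokens = []
    for raw in segment.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if not word or word in STOP_WORDS:
            continue
        tokens.append(stem(word) if use_stemming else word)
    return tokens


def fuzzy_score(a, b, use_stemming=True):
    """Return a 0-100 similarity score computed over token sequences."""
    ta = tokenize(a, use_stemming)
    tb = tokenize(b, use_stemming)
    if not ta and not tb:
        return 100
    return round(100 * SequenceMatcher(None, ta, tb).ratio())


source = "The filter loads the existing translations."
candidate = "Filters load an existing translation"
print(fuzzy_score(source, candidate))                      # with stemming
print(fuzzy_score(source, candidate, use_stemming=False))  # stop words only
```

With stemming, both segments reduce to the same four tokens, so the pair scores as a full match. Without stemming only "existing" lines up, and the score drops sharply. That gap is the kind of improvement the announcement attributes to the tokenizers.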
[Edited at 2009-10-11 13:05 GMT] | | | Samuel Murray Netherlands Local time: 00:55 Member (2006) English to Afrikaans + ...
Note that not all StarDict dictionaries work in OmegaT. Apparently there are several dialects (different subformats) of StarDict, and OmegaT works only with some of them. There is no way to tell in advance which dictionaries will or will not work -- the only way to find out is to try them. Just in case anyone isn't familiar with Java Web Start, well, it doesn't install OmegaT on your computer but it does download the entire program every time you want to use it. So this option wouldn't save you from having to download it -- it simply saves you from having to install it. ...the new stable 2.0.5 contains a ... command line feature to generate pseudo translated TMXs. Do you happen to know where in the user manual this procedure would be described? Samuel | | | Didier Briel France Local time: 00:55 English to French + ... Java Web Start does download the program | Oct 11, 2009 |
Samuel Murray wrote: Just in case anyone isn't familiar with Java Web Start, well, it doesn't install OmegaT on your computer but it does download the entire program every time you want to use it. So this option wouldn't save you from having to download it -- it simply saves you from having to install it. No, it downloads the program into a "Java cache" and only downloads it again if there are changes (thus providing automatic updates). ...the new stable 2.0.5 contains a ... command line feature to generate pseudo translated TMXs. Do you happen to know where in the user manual this procedure would be described? As described in changes.txt: - Generate pseudo-translated tmx (see documentation->translation memories->pseudo-translated memory) Didier | | | Size matters | Oct 11, 2009 |
VitoSmolej wrote: The loading and indexing system has been completely rewritten, providing "on demand" matching. As a result, load time should now be under a minute in most cases. Memory consumption has also been reduced, making it possible to load large projects (e.g., 300,000 words) together with large translation memories (e.g., 63 MB, 20,000 entries). Am I the only one who finds this woefully inadequate? I mean, even leaving aside large projects that generate large TMs, just the Acquis TM in itself is about 1 million TUs, and if you add a TM created from the Europarl corpus plus a bit of this and that you can easily get to 10 times what OmegaT claims to be able to handle. This is 2009, people are using large memories. If you overhaul your TM handling solutions, you should make sure they can handle a million or so TUs. Anyway, it's good to see OmegaT development continue. | |
Samuel Murray Netherlands Local time: 00:55 Member (2006) English to Afrikaans + ...
FarkasAndras wrote: I mean, ... just the Acquis TM in itself is about 1 million TUs, and if you add a TM created from the Europarl corpus plus a bit of this and that you can easily get to 10 times what OmegaT claims to be able to handle. ... This is 2009, people are using large memories. Personally I think there comes a point at which a standalone TM program is no longer sufficient, and it becomes necessary for a program to connect to a TM server if the user wants to use large TMs. I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it? | | | Laurent KRAULAND (X) France Local time: 00:55 French to German + ...
FarkasAndras wrote: Anyway, it's good to see OmegaT development continue. And thanks for the information, Victor! Samuel Murray wrote: I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it? What would be the need to host such an oversized TM on a freelancer's computer/storage device anyway? Just wondering (not a question of capacity: I have 1 TB at my disposal)...
[Edited at 2009-10-11 21:03 GMT] | | | Vito Smolej Germany Local time: 00:55 Member (2004) English to Slovenian + ... TOPIC STARTER SITE LOCALIZER re Acquis TM | Oct 12, 2009 |
Samuel Murray wrote: I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it? Here's some actual numbers: Size: 270MB (just DE SL part). Loading time: about 3 seconds on 2.0x OmegaT. Note that the Acquis material includes all the current languages (a nice XSLT script, anyone? ;), so, yes, it is huge. However, if you go for a single pair, it may still be huge ... but less huge (g).
[Edited at 2009-10-12 18:51 GMT] | |
Samuel Murray Netherlands Local time: 00:55 Member (2006) English to Afrikaans + ... Just three seconds? | Oct 12, 2009 |
VitoSmolej wrote: Here's some actual numbers: Size: 270MB (just DE SL part) Loading time: about 3 seconds on 2.0x OmegaT Just three seconds and it will give a fuzzy match from a segment anywhere in the TM (including, say, the rear)??? | | | Vito Smolej Germany Local time: 00:55 Member (2004) English to Slovenian + ... TOPIC STARTER SITE LOCALIZER Stand by for further news ... | Oct 13, 2009 |
Samuel Murray wrote: Just three seconds and it will give a fuzzy match from a segment anywhere in the TM (including, say, the rear)??? I'll do some more tests and report. How long it takes to load says nothing, of course, about how fast it matches. But I think it would be the laugh of the year if the access time scaled anything but logarithmically with TM size. The users would have noticed this some time ago. Of course, what I think may not match reality. So let's do some tests. Regards Vito | | | Susan Welsh United States Local time: 18:55 Russian to English + ... OmegaT-tokenizers 0.2-2.0 released | Oct 14, 2009 |
OmegaT-tokenizers has been updated to include Lucene 2.9.0. This is the feature that enables glossary "stemming" (to find inflections of words) and "stop-word" filtering, which eliminates little words like "and" and "the" from TM fuzzy matching. The following new tokenizers are available: Arabic, Persian, SmartChinese, Turkish, Hungarian and Romanian. OmegaT-tokenizers is available from https://sourceforge.net/projects/omegat-plugins/ (This just in from Didier.) | | | Hakan Kiyici Türkiye Local time: 01:55 Member (2009) English to Turkish + ... disappointed again | Nov 24, 2010 |
I had installed OmegaT earlier. It did not work properly. I had given up. Then, in some articles about SubRip file types, OmegaT was recommended. I installed the latest version. It is working at 50% CPU and gets stuck. Incredibly slow at times. | |
Didier Briel France Local time: 00:55 English to French + ... It is not a normal behaviour | Nov 24, 2010 |
Hakan Kiyici wrote: I had installed OmegaT earlier. It did not work properly. I had given up. Then, in some articles about SubRip file types, OmegaT was recommended. I installed the latest version. It is working at 50% CPU and gets stuck. Incredibly slow at times. This is not normal behaviour. What is your operating system? What version of OmegaT did you install? Didier