java.lang.ArrayIndexOutOfBoundsException
Thread poster: Jon Babcock
Jon Babcock
Local time: 08:06
Chinese to English
Dec 1, 2011

Just started with OmegaT. Ver. 2.3.0_1 on Linux Mint Debian Edition (LMDE), with Chinese-English.
When I try to import a plain text, utf-8 encoded file of ZH-TW consisting of 86577 lines at an average of 6 or 7 Chinese characters per line, I get a java.lang.ArrayIndexOutOfBoundsException: -32768 error and the file won't load. Wonder if anyone has any suggestions what I might try. Thanks. Jon-
PS Java version is: Java(TM) SE Runtime Environment (build 1.6.0_26-b03)

[Edited at 2011-12-01 22:07 GMT]


 
Didier Briel
France
Local time: 16:06
English to French
Submit a bug report or subscribe to the user group Dec 2, 2011

Jon Babcock wrote:
Just started with OmegaT. Ver. 2.3.0_1 on Linux Mint Debian Edition (LMDE), with Chinese-English.
When I try to import a plain text, utf-8 encoded file of ZH-TW consisting of 86577 lines at an average of 6 or 7 Chinese characters per line, I get a java.lang.ArrayIndexOutOfBoundsException: -32768 error and the file won't load. Wonder if anyone has any suggestions what I might try.

I couldn't reproduce it with a text file with 345,000 lines.

So, you could submit a bug report on SourceForge, or you could subscribe to the OmegaT Yahoo support group, where the issue could be analyzed in detail.

Which setting are you using for the Text file filter (Options > File Filters > Text Files > Options)?

As a simple test, you could try splitting your file into three pieces and see whether you are able to load the three pieces separately.
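A minimal sketch of such a split, in Python, cutting only at line boundaries so that no line is broken in the middle. The function name `split_by_lines` and the `.partN` output naming are hypothetical choices for illustration and are not part of OmegaT:

```python
from pathlib import Path


def split_by_lines(src, parts=3, encoding="utf-8"):
    """Split a text file into up to `parts` files of roughly equal line count.

    Hypothetical helper for illustration; output files are written next to
    the source as <stem>.part1<suffix>, <stem>.part2<suffix>, and so on.
    """
    src = Path(src)
    # newline="" leaves the original line endings untouched when reading.
    with src.open("r", encoding=encoding, newline="") as f:
        lines = f.readlines()

    chunk = -(-len(lines) // parts)  # ceiling division
    outputs = []
    for i in range(parts):
        piece = lines[i * chunk:(i + 1) * chunk]
        if not piece:
            break  # fewer lines than requested parts
        out = src.with_name(f"{src.stem}.part{i + 1}{src.suffix}")
        # newline="" writes the line endings back exactly as they were read.
        with out.open("w", encoding=encoding, newline="") as f:
            f.writelines(piece)
        outputs.append(out)
    return outputs
```

Calling `split_by_lines("source.txt")` would write `source.part1.txt` to `source.part3.txt` into the same folder. If only one of the pieces fails to load, the problem is more likely tied to specific content than to overall file size.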

Didier


 
Jon Babcock
Local time: 08:06
Chinese to English
TOPIC STARTER
java.lang.ArrayIndexOutOfBoundsException [solved ?] Dec 2, 2011

Thanks, Didier.

While waiting for my post here to be approved, I discovered the OmegaT Yahoo group. I'll probably move subsequent questions about using OmegaT to that venue, as you suggest.

By installing the newer OmegaT-2.5.0_4 (both versions, with and without embedded Java) and by opening the Chinese 86577-line plain text file in LibreOffice and saving it as an .odt file, I was able to import it into the appropriate OmegaT project without a problem. In fact, the job consists of two large files; the second one is 98778 lines long. Once in .odt format, I was able to load both of these into the OmegaT project, producing about 185,355 segments in total. So I will use this later version of OmegaT with .odt source files henceforth.

Just now I tried again to load the text file that wouldn't load under OmegaT-2.3.0_1 and that originally prompted my post.
It loaded without a problem! In fact at some point while waiting for my post here to be approved yesterday I did change the Options > File Filters > Text Filter Options which you mention, from its default at Empty Lines to Line Breaks. Perhaps this was the solution, so simple after all!

Yesterday, I was indeed able to load much smaller utf8 text files of around 500 lines of Chinese into the project, but I didn't experiment enough to find the size at which OmegaT throws the ArrayIndexOutOfBoundsException error. And now I can't remember whether this was BEFORE or AFTER I had changed the Text Filter Options from Empty Lines to Line Breaks.

I'm just beginning to try OmegaT for Chinese-English again after I gave up on it when it was in its very early stages of development many years ago. This time it looks like it's going to work. Persistence pays.

Jon-

PS:
Problem Solved:
Going back to OmegaT-2.3.0_1, I just confirmed that the problem in my case resulted from the Empty Lines default setting for Options > File Filters > Text Filter Options. After changing this to Line Breaks, I am able to load huge Chinese plain text files without a problem.

Note that my text files come from saving each file as plain text in LibreOffice (3.3.3.1), running on Linux Mint Debian Edition (LMDE). The export dialog offers Character Set and Paragraph Break options, which I have set to UTF-8 and LF (line feed) respectively. In other words, I didn't choose CR & LF or CR for the paragraph breaks.

Also note that in the Options Segmentation Setup I have added Language "Chinese" with Language Pattern "ZH-TW", with Break/Exception checked, Pattern Before set to \n (i.e. newline), and Pattern After left empty. This approach assumes that the text files have already been parsed into lines, where each line will represent one segment in OmegaT. Of course there are many other ways to skin a cat, but this is working for me now.
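For files that do not come out of LibreOffice, the same UTF-8 and LF form can be produced with a short script. The following is only a sketch in Python; the function names are hypothetical and nothing here is part of OmegaT. It also counts blank lines, since a count of zero means the file contains no empty-line paragraph breaks at all, which is the kind of file that presumably needs the Line Breaks setting described above:

```python
def normalize_newlines(text):
    """Convert CR+LF and bare CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def count_blank_lines(text):
    """Count lines that are empty or contain only whitespace."""
    lines = normalize_newlines(text).splitlines()
    return sum(1 for line in lines if not line.strip())


def rewrite_as_utf8_lf(src, dst, encoding="utf-8"):
    """Rewrite `src` to `dst` as UTF-8 with LF endings.

    Returns the number of blank lines found. `encoding` is the encoding
    of the source file and is assumed to be known in advance.
    """
    # newline="" disables newline translation so CR and CR+LF are visible.
    with open(src, "r", encoding=encoding, newline="") as f:
        text = normalize_newlines(f.read())
    # newline="\n" writes LF as-is on every platform.
    with open(dst, "w", encoding="utf-8", newline="\n") as f:
        f.write(text)
    return count_blank_lines(text)
```

The script does not change the text itself, only the line endings, so each line still maps to one segment under the segmentation rule described above.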





[Edited at 2011-12-02 13:49 GMT]


 

