OmegaT 2.0.5 build 4 questions
Thread poster: Marc Baas
Marc Baas
United States
Dutch to English
Feb 18, 2011

Hi all,

I'm completely new to OmegaT, but not to CAT software.

Last night I downloaded the stable version to check it out and run some tests on a project I am currently working on, where I definitely need a tool to check what I have to translate and what I don't (it's a complicated project as well).

I noticed two things that are a bit confusing to me.

1) The segmentation that OmegaT does seems very, very fragmented to me. I always prefer to have segments of at least full sentences, because more often than not, due to different word orders in languages, one needs to know the full sentence to know how to translate it.
OmegaT, in my case, does not leave a single sentence intact (I tried several different files). Is there an 'easy' way (I don't have much time to spare at the moment) to tell OmegaT to leave my sentences the way they are and not fragment them?

2) The other thing I noticed is that commas disappear from my source text as a result of the segmentation, which (if I cannot manage to resolve this) would make the tool useless to me. Obviously it would take too much time to first edit with OmegaT and then compare source and target texts manually to find the missing commas, and that is after first comparing the OmegaT source text with the original one for missing punctuation.
I hope this is just a setting, but if it is a bug, I will have to keep an eye out for something else that does not change my source text when I load it into the tool.

Aside from these two fundamental issues, it does look like a very nice tool though.

One last point: did anyone try the latest version, and can you comment on the reliability of that one for practical use in translating?

Any comments are highly appreciated.

Marc


 
Didier Briel
France
English to French
Some things to check Feb 18, 2011

Marc Baas wrote:
1) The segmentation that OmegaT does seems very, very fragmented to me. I always prefer to have segments of at least full sentences, because more often than not, due to different word orders in languages, one needs to know the full sentence to know how to translate it.

By default, OmegaT segmentation rules are quite conservative.
So, if your text is split more finely than at the logical ends of sentences, the cause could be:
- Your source text. If it contains hard returns in the middle of sentences, OmegaT will treat the parts as different sentences. What is the format of your source text?
- Abbreviations. If your source text contains numerous abbreviations ending with '.', and these abbreviations are not in OmegaT's rules, then you have to add them.
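
To illustrate the abbreviation case, here is a minimal Python sketch. It is not OmegaT's actual code: the break rule (split after '.', '!' or '?' followed by whitespace) and the function name `split_sentences` are assumptions made for the example.

```python
import re

# Hypothetical break rule: a gap of whitespace that follows '.', '!' or '?'.
BREAK = re.compile(r"(?<=[.!?])\s+")

def split_sentences(text, abbreviations=()):
    """Split text at sentence ends, skipping breaks right after a listed abbreviation."""
    segments, start = [], 0
    for match in BREAK.finditer(text):
        before = text[start:match.start()]
        # Exception rule: do not break if the text so far ends with an abbreviation.
        if any(before.endswith(abbr) for abbr in abbreviations):
            continue
        segments.append(before)
        start = match.end()
    if start < len(text):
        segments.append(text[start:])
    return segments

text = "Give 5 mg i.v. twice daily. Monitor the patient."
print(split_sentences(text))
# ['Give 5 mg i.v.', 'twice daily.', 'Monitor the patient.']
print(split_sentences(text, abbreviations=("i.v.",)))
# ['Give 5 mg i.v. twice daily.', 'Monitor the patient.']
```

Without the exception, the sentence is cut after "i.v."; once the abbreviation is listed, it stays whole.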

OmegaT, in my case, does not leave a single sentence intact (I tried several different files). Is there an 'easy' way (I don't have much time to spare at the moment) to tell OmegaT to leave my sentences the way they are and not fragment them?

Yes, it's called paragraph mode. In Project > Properties, uncheck the Enable Sentence-Level Segmenting box.
If, after that, there is still some segmentation, it comes from your source text.
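
The difference between the two modes can be sketched roughly as follows. This is an illustration only, not how OmegaT is implemented: the `sentence_level` parameter is a hypothetical stand-in for the checkbox, and a plain newline stands in for a hard return in the source document.

```python
import re

def segment(text, sentence_level=True):
    """Split text into segments, per paragraph or per sentence within each paragraph."""
    segments = []
    for paragraph in text.split("\n"):  # every hard return starts a new paragraph
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if sentence_level:
            # Simplified sentence rule: break after '.', '!' or '?' plus whitespace.
            segments.extend(re.split(r"(?<=[.!?])\s+", paragraph))
        else:
            segments.append(paragraph)
    return segments

text = "Store below 25 C. Keep dry.\nDo not\nfreeze."
print(segment(text, sentence_level=True))
# ['Store below 25 C.', 'Keep dry.', 'Do not', 'freeze.']
print(segment(text, sentence_level=False))
# ['Store below 25 C. Keep dry.', 'Do not', 'freeze.']
```

In paragraph mode the first paragraph stays in one piece, but "Do not" and "freeze." are still separate because the break between them is in the source itself.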


2) The other thing I noticed is that commas disappear from my source text as a result of the segmentation,

By design, OmegaT doesn't remove text with segmentation, it only splits it.
The only text removed from the Editor (but *not* from the target text) is space between sentences.
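
As a rough sketch of that idea (again an assumption-laden illustration, not OmegaT's code; `split_keep_gaps` and `rejoin` are hypothetical names), a splitter can set aside the whitespace between sentences and put it back when the target is built, so nothing such as a comma is lost:

```python
import re

def split_keep_gaps(text):
    """Return (segments, gaps); gaps[i] is the whitespace that followed segments[i]."""
    # The capturing group makes re.split return the separators as well.
    parts = re.split(r"(?<=[.!?])(\s+)", text)
    return parts[0::2], parts[1::2]

def rejoin(segments, gaps):
    """Rebuild the full text by putting each gap back after its segment."""
    pieces = []
    for index, segment in enumerate(segments):
        pieces.append(segment)
        if index < len(gaps):
            pieces.append(gaps[index])
    return "".join(pieces)

source = "First, weigh the sample.  Then, record the value. Done"
segments, gaps = split_keep_gaps(source)
print(segments)
# ['First, weigh the sample.', 'Then, record the value.', 'Done']
print(rejoin(segments, gaps) == source)
# True
```

The segments shown for editing carry no inter-sentence space, yet rejoining them with the stored gaps reproduces the source exactly.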


which (if I cannot manage to resolve this) would make the tool useless to me. Obviously it would take too much time to first edit with OmegaT and then compare source and target texts manually to find the missing commas, and that is after first comparing the OmegaT source text with the original one for missing punctuation.
I hope this is just a setting, but if it is a bug, I will have to keep an eye out for something else that does not change my source text when I load it into the tool.

I've never heard of such a behaviour (removing commas, or any other text, for that matter).
Again, what is the format of your source documents?

One last point: did anyone try the latest version, and can you comment on the reliability of that one for practical use in translating?

Most people experienced with OmegaT use the "latest" version, which is generally at least as stable as the "standard" one.

Didier


 
Marc Baas
United States
Dutch to English
TOPIC STARTER
Thanks Feb 18, 2011

Thanks Didier for the very elaborate and detailed explanation. I very much appreciate your effort.

My source text is an MS Office document, which it seems to read fine, both the 2007 format and the older one. I tried to see if there are differences between that one and the .odt format (OpenOffice) but could not tell any differences in the tool itself.
So for that matter I was very pleasantly surprised by how easily it reads these formats (as opposed to some of the commercial tools around).

I will definitely follow your advice and check things there. Like I said before, I really lack the time at the moment to dive deep into the software. I just urgently needed a tool that did some checking for me. Which, by the way, it did perfectly.

What you point out, that it might be the source text, could well be the case. I'm working on a medical document that is literally crammed with abbreviations, formulas and such. So that could be the main culprit.

Thanks again, and I will also download the latest version and install that one, so I have the most current version.

Marc


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
My segadder script Feb 18, 2011

Didier Briel wrote:
Marc Baas wrote:
1) The segmentation that OmegaT does seems very, very fragmented to me.

So, if your text is split more finely than at the logical ends of sentences, the cause could be:
- Abbreviations. If your source text contains numerous abbreviations ending with '.', and these abbreviations are not in OmegaT's rules, then you have to add them.


If you need to add abbreviations to the segmentation rules quickly, and you're using MS Windows, you can use my segadder script, which partly automates the process:

http://leuce.com/tempfile/omtautoit/segadder.zip
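
For anyone not on Windows, the general idea of turning a list of abbreviations into exception rules can be sketched in Python. This is not how segadder works (I am not describing its internals here); it only assumes that an exception rule is a pair of regular expressions, one matched just before and one just after a candidate break. The names `exception_rules` and `is_exception` are hypothetical.

```python
import re

def exception_rules(abbreviations):
    """Build one (pattern_before, pattern_after) regex pair per abbreviation."""
    rules = []
    for abbr in abbreviations:
        # re.escape makes the dots literal, so 'i.v.' does not match 'iavb'.
        rules.append((re.escape(abbr), r"\s"))
    return rules

def is_exception(text, position, rules):
    """True if some rule forbids a sentence break at the given position."""
    for before, after in rules:
        if re.search(before + r"$", text[:position]) and re.match(after, text[position:]):
            return True
    return False

rules = exception_rules(["i.v.", "approx."])
for before, after in rules:
    print(before, after)

text = "Give 5 mg i.v. twice daily."
position = text.index("i.v.") + len("i.v.")
print(is_exception(text, position, rules))
# True
```

The printed pairs are the kind of patterns one would then enter by hand as exception rules; check them against the rules already present before adding them.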


 
Didier Briel
France
English to French
Formulas, drawings, etc., can break the text Feb 18, 2011

Marc Baas wrote:
My source text is an MS Office document, which it seems to read fine, both the 2007 format and the older one.

OmegaT can only read the "2007" (i.e., .docx) format, not the legacy (i.e., doc) one.
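
If you are unsure which format a given file really is (the extension can be misleading), a small Python check along these lines can tell them apart. It is a sketch resting on two general facts about the formats, not on anything OmegaT does: a .docx file is a ZIP archive containing `word/document.xml`, and a legacy .doc file starts with the OLE compound-file signature. The function name `detect_word_format` is made up for the example.

```python
import zipfile

# Signature at the start of OLE compound files, which legacy .doc files use.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def detect_word_format(path):
    """Return 'docx', 'doc' or 'unknown' based on file content, not extension."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header == OLE_MAGIC:
        return "doc"
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            if "word/document.xml" in archive.namelist():
                return "docx"
    return "unknown"
```

Calling `detect_word_format("report.doc")` on a file that was in fact saved in the 2007 format would return `'docx'`, which would explain why it loads.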

What you point out, that it might be the source text, could well be the case. I'm working on a medical document that is literally crammed with abbreviations, formulas and such. So that could be the main culprit.

Yes, if there are "things" in the middle of the text, they might well break it.

Didier


 

