Counting unique words in a project after inserting a TM
Thread poster: Michael Mestre
Michael Mestre
France
English to French
Apr 1, 2010

(My question concerns OmegaT 2.0.5, but it shouldn't be very different with other versions).

Dear colleagues,

I am translating several big documents that my client (an agency) is sending one by one.
I am only charging what I call "unique source words", that is the total number of words from unique segments.
My invoice does not include the words whose segments produce a 100% match after inserting the TMs from the previous documents (regardless of whether the formatting tags are identical or not).
The idea is to charge the same amount as what would have been counted if all the documents had been translated at the same time.
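
To make the billing rule concrete, here is a minimal Python sketch. It assumes each document is already available as a list of source segments and approximates word counting by splitting on whitespace, so the figures will not match OmegaT's own counts exactly. The function name `billable_words` and the sample segments are purely illustrative.

```python
def billable_words(documents):
    """Return one word count per document, counting each distinct
    segment only the first time it appears across all documents."""
    seen = set()
    totals = []
    for segments in documents:
        total = 0
        for segment in segments:
            key = " ".join(segment.split())  # collapse whitespace differences
            if key not in seen:
                seen.add(key)
                total += len(key.split())    # naive whitespace word count
        totals.append(total)
    return totals


# Hypothetical documents, delivered one after the other.
docs = [
    ["Press the red button.", "Close the cover.", "Press the red button."],
    ["Close the cover.", "Remove the battery pack."],
]
print(billable_words(docs))  # [7, 4]
```

The per-document totals add up to what a single count over all the documents together would give, which is the property the invoice relies on.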

OmegaT does provide a count of the unique source words in the project_stats.txt file, but that figure doesn't seem to take into account the external TMs.
My current workaround is to add the TM in TMX 1.4 format, then have OmegaT insert all the 100% matches automatically (Options > Editing behaviour > Insert the best fuzzy match, with the threshold set to 100%).
The problem is that the matches are not inserted automatically throughout the document; I have to press Enter on every segment until I reach the end of the document.
Holding Enter down does make the program insert matches, but it misses some of them, and a bug makes it paste the wrong match into several segments (it probably cannot keep track of an array index accurately in that situation).

My questions are:
- Is there a way to ask OmegaT to insert all these fuzzy matches in the whole document automatically, without any manual action from the user?
- Even better, is there a way to make OmegaT compute the number of unique words minus the 100% matches from the external TMs?

Thank you!


 
Didier Briel
France
English to French
Use 2.1.4 or merge your TM Apr 1, 2010

Michael Mestre wrote:

(My question concerns OmegaT 2.0.5, but it shouldn't be very different with other versions).


It is:
2.1.0 vs. 2.0.5

Implemented requests:

- Match Statistics
http://sourceforge.net/support/tracker.php?aid=2876216


My questions are:
- Is there a way to ask OmegaT to insert all these fuzzy matches in the whole document automatically, without any manual action from the user?

Not currently.


- Even better, is there a way to make OmegaT compute the number of unique words minus the 100% matches from the external TMs?

Use 2.1.4. Since it counts 100% matches from external TMs, you should get the figure you want by comparing that count with the count obtained without the external TMs.
Or merge your internal TM with the external one(s), using either TMXMerge or Olifant, just for counting purposes.
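
For counting purposes only, the subtraction can also be sketched outside OmegaT. The Python below is an illustration, not an OmegaT feature. It assumes the external TM is a TMX file with one `<seg>` per `<tuv>`, that formatting tags appear in the text as OmegaT-style placeholders such as `<f0>` or `<br1/>`, and that words are separated by whitespace. The names `tmx_sources` and `new_unique_words` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Assumption: formatting tags look like <f0>, </f0> or <br1/>.
TAG = re.compile(r"</?[a-zA-Z]+\d+/?>")


def normalize(text):
    """Drop formatting placeholders and collapse whitespace."""
    return " ".join(TAG.sub("", text).split())


def tmx_sources(tmx_xml, source_lang):
    """Collect the normalized source-language segments of a TMX string."""
    sources = set()
    for tuv in ET.fromstring(tmx_xml).iter("tuv"):
        lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
        seg = tuv.find("seg")
        if seg is not None and lang.lower().startswith(source_lang.lower()):
            sources.add(normalize("".join(seg.itertext())))
    return sources


def new_unique_words(project_segments, tm_sources):
    """Words in unique project segments that have no exact match in the TM."""
    unique = {normalize(s) for s in project_segments}
    return sum(len(s.split()) for s in unique - tm_sources)


# Hypothetical external TM and project segments.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="example" adminlang="EN-US" srclang="EN-US" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="EN-US"><seg>Close the cover.</seg></tuv>
      <tuv xml:lang="FR-FR"><seg>Fermez le capot.</seg></tuv>
    </tu>
  </body>
</tmx>"""

project = [
    "Close the <f0>cover</f0>.",   # exact TM match once tags are ignored
    "Remove the battery pack.",
    "Remove the battery pack.",    # repetition inside the project
]
print(new_unique_words(project, tmx_sources(SAMPLE_TMX, "en")))  # 4
```

Because the word count here is a plain whitespace split, the result should be treated as an estimate to cross-check against the statistics OmegaT itself reports.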

Didier


 
Michael Mestre
France
English to French
TOPIC STARTER
Thank you! Apr 1, 2010

Thank you, Didier, for this very quick answer.

Is the 2.1.x branch stable enough for intensive use in its current state?
I would love to try it soon.


 
Didier Briel
France
English to French
Beta mostly means there is no documentation Apr 1, 2010

Michael Mestre wrote:

Is the 2.1.x branch stable enough for intensive use in its current state?


From Marc Prior (The state of OmegaT):
Users are encouraged to try what the OmegaT team describes as the “beta version”; the development team takes great care to ensure that the code in this version is also stable (some of the commercial competition could learn some lessons from OmegaT in this regard), and the “beta” status refers primarily to the fact that the accompanying documentation is not up to date.

Didier


 

