Specific features of MT dictionaries
Thread poster: Oleg Vigodsky
Russian Federation
English to Russian
Apr 15, 2011

Dear colleagues,

I have used Machine Translation (MT) systems (PROMT) for many years and customize user MT dictionaries (mainly English into Russian, Telecom & IT). As far as I can see, such dictionaries have (or should have) many specific features:

1. It is worth coding some dictionary terms in the plural

For example, if "line number" is coded, it will be "translated" (Russian output) as "a number of a line". If the plural of this source term is not coded, "line numbers" will be output as "numbers of a line" (in most cases this is not OK). So I need to code the term "line numbers" as well (as "numbers of lines").

2. It is worth coding some dictionary terms that contain acronyms

For example, "backup CS". I coded it as "backup CS server".

3. Coding dictionary terms containing conjunctions (AND, OR)

For example, "backup and active CS". I coded it as "backup and active CS server".

4. Coding many 'evident' dictionary terms

For example, "line type". If this term is not coded, in some cases the system processes "type" as a verb.

5. It is worth coding some fixed expressions

For example, "it should be noted that", "if otherwise not specified"...

6. Coding word combinations containing a gerund

For example, "message listing". In most cases it should be output as "listing of messages".
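
To illustrate point 1, here is a minimal sketch of how one might scan a term list for singular entries whose plural form has not been coded yet. It is not a PROMT feature or API: the function names (`naive_plural`, `missing_plurals`), the plain Python dict used as a stand-in for the user dictionary, and the simple English suffix rules are all assumptions made for the example, and the translations are placeholder glosses.

```python
def naive_plural(term: str) -> str:
    """Pluralize the last word of an English term with simple suffix rules.

    Assumption: regular plurals only; irregular nouns are not handled.
    """
    head, _, last = term.rpartition(" ")
    if last.endswith(("s", "x", "z", "ch", "sh")):
        plural = last + "es"
    elif last.endswith("y") and len(last) > 1 and last[-2] not in "aeiou":
        plural = last[:-1] + "ies"
    else:
        plural = last + "s"
    return f"{head} {plural}" if head else plural


def missing_plurals(entries: dict[str, str]) -> list[str]:
    """Return plural source forms that have no dictionary entry of their own."""
    plural_of = {term: naive_plural(term) for term in entries}
    # Entries that are already the plural of another entry are skipped,
    # so "line numbers" does not produce a bogus "line numberses".
    known_plurals = set(plural_of.values()) & set(entries)
    return sorted(
        plural
        for term, plural in plural_of.items()
        if term not in known_plurals and plural not in entries
    )


# Hypothetical mini-dictionary: source term -> placeholder gloss of the output.
user_dictionary = {
    "line number": "number of a line",
    "line numbers": "numbers of lines",
    "line type": "type of a line",
}

print(missing_plurals(user_dictionary))  # ['line types']
```

The list it prints is only a set of candidates to review by hand; whether a given plural really needs its own entry still depends on how the MT system renders it.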


These are the features most evident to me. If such terms are coded correctly in the MT dictionary, the raw MT output is much better, and any further post-editing is simpler and less time-consuming.

On the other hand, such an MT dictionary becomes "serious" in size (much more detailed than any conventional electronic dictionary). For example, our main user MT dictionary contains almost 70,000 entries.

Can you comment on the features above? Is there anything to add, delete or change?

Thank you in advance.



Note: Please do not comment on MT itself (no need to stress how much you LOVE it).


 
Oleg Vigodsky
Russian Federation
English to Russian
TOPIC STARTER
Added No. 7 Apr 18, 2011

I would like to edit my post, but the editing feature has already timed out (for me).

7. Coding entries which contain dashes (and similar characters)

Suppose we need to code a "connection oriented" entry. Done. But I see that some writers use these forms:
- connection-oriented,
- connectionoriented

So I have to code all 3 entries with a single translation. This case is 'unreal' and 'stupid' for conventional (non-MT) dictionaries. But I need the MT output to be consistent: without such coding, the MT output will differ slightly.

There are other similar MT dictionary entries containing a comma, a dot, etc.
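
As a rough illustration of point 7, the three spellings can be generated from one entry instead of being typed by hand. This is only a sketch under my own assumptions: `spelling_variants` and `expand_entry` are hypothetical helper names, the result is a plain Python dict rather than anything PROMT-specific, and `"<translation>"` is a placeholder for the real target text. It only produces the all-spaces, all-hyphens and closed-up forms, not mixed ones.

```python
import re


def spelling_variants(term: str) -> list[str]:
    """Return the spaced, hyphenated and closed-up spellings of a term."""
    # Split on any run of whitespace and/or hyphens.
    words = [w for w in re.split(r"[\s\-]+", term.strip()) if w]
    variants = [" ".join(words), "-".join(words), "".join(words)]
    # dict.fromkeys removes duplicates (e.g. for one-word terms) but keeps order.
    return list(dict.fromkeys(variants))


def expand_entry(term: str, translation: str) -> dict[str, str]:
    """Map every spelling variant of the term to the same translation."""
    return {variant: translation for variant in spelling_variants(term)}


for source, target in expand_entry("connection oriented", "<translation>").items():
    print(f"{source} => {target}")
```

The same idea could be extended to the comma and dot cases by adding those characters to the split pattern, but each such rule should be checked against real source texts first.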