Software misnomers and how to use MT
Thread poster: Phil Hand
Phil Hand
Phil Hand  Identity Verified
China
Local time: 13:32
Chinese to English
Sep 18, 2012

Inspired by Lucy's post on MT, I was thinking about the software we use, and I realised how much of it is misnamed.

Example: I love spellcheck, it's a great tool. But it's got naff all to do with spelling - I'd never trust Microsoft with that. If I have a spelling doubt, I turn to a reputable dictionary. So what does spellcheck do? It's a typo catcher.

Grammar check: hated by many, I love it. It's another typo checker, but this time it catches the mistakes where you've accidentally made a real word instead of just letter salad.

Computer aided translation: What a lie, it doesn't aid me with the translation part at all! But I love my CAT tool, because it helps me maintain consistency. It should be called computer aided consistency.

Now, I would argue that MT falls into the same category. It doesn't really translate. It can't - translation means taking the meaning of text A and representing it in language B, and computers can't do meaning. What MT does is dictionary look-up on the whole of a text at once.

(Incidentally, I'd love to see a comparison of an MT tool against a tool that did exactly that - dictionary look-up on every word. I bet that the look-up tool would perform just as well as MT, because humans are really good at reading meaning into text, even when it's a bit scrambled.)

There's a great quote from the blog linked in Lucy's thread:
"I only use MT more or less as a pretty good, context-based dictionary."
http://patenttranslator.wordpress.com/2012/09/15/post-editing-of-machine-translations-is-a-good-definition-of-the-term-a-fools-errand/

Now, that sounds pretty smart to me, and it got me thinking.

Post-editing is one of the world's all time dumb ideas. (To quote the mad patent translator: "If I tried to massage MT with my editing to lick it into a shape that would resemble human translation, it would take me at least as long as retranslating because...MT post-editing that can add real value is in fact retranslating.") Sooner or later, this madness will pass, and we as an industry will want to start figuring out how to use MT to actually add value rather than subtract it.

So, how can we use MT? Or, what kind of dynamic look-up tool would you like to see?


 
Veronica Coquard
Veronica Coquard
France
Local time: 07:32
French to English
A personalised all-in-one look-up tool Sep 18, 2012

If I could create a dream tool, it would be a bit like the ProZ term search, but it would look up the term in all the on-line bilingual glossaries that I regularly use, allowing me to personalise the selection. Also, it would be "fuzzy" in the sense that it wouldn't come up empty-handed just because I'd left off an accent or used a different form of a verb, for example.
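
The "fuzzy" part of such a tool can be sketched in a few lines of Python. This is only an illustration of the idea, not how ProZ or any existing tool works: the glossary names, the entries and the `cutoff` value are all made up for the example, and real glossaries would be loaded from files or web services rather than hard-coded.

```python
import difflib
import unicodedata


def normalise(term: str) -> str:
    """Lower-case a term and strip accents, so 'arrêté' and 'arrete' compare equal."""
    decomposed = unicodedata.normalize("NFD", term.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fuzzy_lookup(query, glossaries, cutoff=0.75):
    """Search every selected glossary; return (glossary, source term, translation) tuples."""
    hits = []
    wanted = normalise(query)
    for name, entries in glossaries.items():
        # Map normalised headwords back to their original spelling.
        index = {normalise(src): src for src in entries}
        for key in difflib.get_close_matches(wanted, list(index), n=3, cutoff=cutoff):
            src = index[key]
            hits.append((name, src, entries[src]))
    return hits


# Hypothetical sample data, purely for illustration.
GLOSSARIES = {
    "legal": {"procès-verbal": "minutes", "arrêté": "order"},
    "medical": {"dépistage": "screening"},
}

# A missing accent still finds the entry.
print(fuzzy_lookup("arrete", GLOSSARIES))
# A different word form ("depister" instead of "dépistage") is close enough to match.
print(fuzzy_lookup("depister", GLOSSARIES))
```

Accent stripping handles the left-off accent, and the similarity cutoff handles nearby word forms. A lower cutoff finds more forms but also more noise, so the value would need tuning per language.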

I fully agree on the typo catcher misnomer! Funny.


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 06:32
Member (2009)
Dutch to English
@Veronica: Sep 18, 2012

What you describe sounds a little like IntelliWebSearch (http://www.intelliwebsearch.com/index.asp ) or Multifultor (http://www.proz.com/forum/translator_resources/205846-multifultor_is_the_new_intelliwebsearch.html ).

Michael


 
Oliver Pekelharing
Oliver Pekelharing  Identity Verified
Netherlands
Local time: 07:32
Dutch to English
@Michael Sep 19, 2012

Sorry for hijacking this thread for a wee second, but Michael, can Multifultor look up jurlex and/or evoterm? How about the Van Dale EW? Now that would be useful.

Regards,

Olly


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 07:32
Member (2006)
English to Afrikaans
Word for word vs statistical MT Sep 19, 2012

Phil Hand wrote:
(Incidentally, I'd love to see a comparison of an MT tool against a tool that did exactly that - dictionary look-up on every word. I bet that the look-up tool would perform just as well as MT, because humans are really good at reading meaning into text, even when it's a bit scrambled.)


No, Google Translate is infinitely better than word-for-word systems. I have seen some of those, and they are terrible. Even if the dictionary is highly subject-specific, they make terrible blunders. One tiny advantage of the word-for-word systems is that they typically use the same target word for the same source word, which Google Translate can't be relied on to do. This is why using an extensive glossary is crucial when using Google Translate, to ensure that you use the same terms throughout your translation, because Google Translate will use a number of synonyms instead.
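
The glossary check described here can be roughed out as a small script. This is a minimal sketch under stated assumptions: the segments and the one-entry glossary are invented for the example, and the matching is whole-word only, so inflected forms would slip through.

```python
import re


def find_inconsistencies(segments, glossary):
    """Return (segment number, source term, expected target term) for each miss."""
    problems = []
    for number, (source, target) in enumerate(segments, start=1):
        for src_term, tgt_term in glossary.items():
            in_source = re.search(rf"\b{re.escape(src_term)}\b", source, re.IGNORECASE)
            in_target = re.search(rf"\b{re.escape(tgt_term)}\b", target, re.IGNORECASE)
            # Flag the segment if the source term is there but the approved term is not.
            if in_source and not in_target:
                problems.append((number, src_term, tgt_term))
    return problems


# Hypothetical glossary and MT output, purely for illustration.
GLOSSARY = {"Arzneimittel": "medicine"}
SEGMENTS = [
    ("Das Arzneimittel ist verschreibungspflichtig.",
     "The medicine is available only on prescription."),
    ("Dieses Arzneimittel hilft Patienten.",
     "This drug helps patients."),
]

print(find_inconsistencies(SEGMENTS, GLOSSARY))
```

The second segment is flagged because the MT output switched to a synonym ("drug") for a term the glossary fixes as "medicine".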

Samuel


 
Tntranslations
Tntranslations
Local time: 08:32
@Phil Sep 19, 2012

Phil Hand wrote:

Now, I would argue that MT falls into the same category. It doesn't really translate. It can't - translation means taking the meaning of text A and representing it in language B, and computers can't do meaning. What MT does is dictionary look-up on the whole of a text at once.


I think there are many ways to achieve this transfer of meaning between languages that you mention. One is the kind of idealized translation, which only humans can do. This is where you extract the complete meaning and function of the source text by interpreting the text and combining your interpretation with facts about the world, stylistic knowledge etc. The other kind of translation that only humans can do is the kind where you break up the source sentence in your mind into little pieces, shake them about and produce a sentence that's perfectly natural in the target language. In these cases, the source and the target sentences don't have the sort of clear relation that a machine can replicate.

However, there's a third kind of transfer, where you look at a sentence and instantly know the template to use to translate it. I'm not talking solely about stereotypical examples such as "Push the button to so-and-so", but also about other more complex cases where there's a reliable relation between the templates used to express meanings in both languages. In these cases all the translator does is fill in the variables. This is the kind of task a machine excels at. The relation between the source and target sentences is stable, so the machine can extract it and learn what phrases to use to fill the variables (or the relation can be programmed, in case of rule-based MT).
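
The template-and-variables idea can be shown with a toy example. It is a deliberately simplified sketch, not a description of how Moses or any rule-based engine is built: the single English-to-German template and the phrase table are hypothetical, and a real system would learn or encode thousands of such relations.

```python
import re

# Hypothetical template pair: a source pattern with named slots and its target counterpart.
TEMPLATES = [
    (re.compile(r"^Push the (?P<control>.+) to (?P<action>.+)\.$"),
     "Drücken Sie {control}, um {action}."),
]

# Hypothetical phrase table used to fill the slots.
PHRASES = {
    "red button": "die rote Taste",
    "stop the motor": "den Motor anzuhalten",
}


def translate_by_template(sentence):
    """Return a translation if a template and all slot fillers are known, else None."""
    for pattern, target in TEMPLATES:
        match = pattern.match(sentence)
        if match is None:
            continue
        slots = match.groupdict()
        if all(value in PHRASES for value in slots.values()):
            return target.format(**{name: PHRASES[value] for name, value in slots.items()})
    return None


print(translate_by_template("Push the red button to stop the motor."))
print(translate_by_template("The weather is nice."))
```

When the sentence fits a known template, the output is reliable; when it does not, the function returns nothing rather than guessing, which is roughly the point at which the human translator takes over.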

Phil Hand wrote:
So, how can we use MT? Or, what kind of dynamic look-up tool would you like to see?


We should keep the MT running alongside the TM. If you see the source segment is something which could feasibly be translated by MT (you will develop an intuition for this), have a look at the MT suggestion. Quickly (take only a few seconds) check whether the MT suggestion is useful. If you are using customized MT with guaranteed correct terminology (for instance an MT engine created from a large translation memory with the open source Moses software), you might want to base your translation on the MT suggestion even if the structure is incorrect.


 
Veronica Coquard
Veronica Coquard
France
Local time: 07:32
French to English
@Michael Sep 19, 2012

Wow, thanks! I will definitely look into that!



 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 06:32
Member (2009)
Dutch to English
@Olly: Sep 19, 2012

Olly Pekelharing wrote:

Sorry for hijacking this thread for a wee second, but Michael, can Multifultor look up jurlex and/or evoterm? How about the Van Dale EW? Now that would be useful.

Regards,

Olly


Hi Olly,

Hmm. I actually haven't tried, mainly because I am now using IntelliWebSearch.

re: Van Dale: I know IntelliWebSearch works with my local NL-EN Van Dale. Multifultor should, but I never got around to trying.

re: JurLex/Evoterm: I'm not sure. (1) I just quickly tried JurLex (in Multifultor), which didn't work. I have a feeling that Kluwer's newfangled 17-step login procedure might be the problem there;) (2) Also just tried Evroterm (in both Multifultor and IWS): also doesn't work. I'll ask Graham. Maybe he can come up with something. I'll also ask in the IWS forum.

It would indeed be great if we could actually use ONE search tool to access everything!


 
Heike Behl, Ph.D.
Heike Behl, Ph.D.  Identity Verified
Ireland
Local time: 06:32
Member (2003)
English to German
Semantic knowledge in MT Sep 19, 2012

There is a lot of semantic and grammar knowledge in MT. That's why it's important to have user-definable dictionaries. In some systems, you might indeed have to put whichever translation would be the best in your context on top. But there are also numerous much more refined systems where appropriate semantic and grammatical tags allow the computer to choose translations with the correct meaning from a list of possible translations. I.e., if a user adds new words to the MT dictionary, they will also have to enter the correct tags for the MT system to work. The more knowledgeable that user is in regard to MT in general and to the specific MT system in particular, the better the translation results will be.

Also, a word-by-word translation will never do any syntactical transformation to change the source into an acceptable target syntax. Depending on the language combination, this could cause major problems.

There will be no grammatical inflections. How would you be able to distinguish different cases in German, for instance? With the flexible word order of German there would be no indication whether a particular noun would be subject, direct or indirect object. Genitives would be lost as well. Even the most benevolent reader wouldn't be able to read any meaning into that kind of word salad.

And then there's disambiguation, a very important thing in English, but in other languages as well. How would a word-lookup tool know whether a word like "drive" is a verb or a noun? It would be a random decision based on whichever definition comes first in the dictionary.
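
A bare word-lookup tool of the kind under discussion would look something like the sketch below. The mini-dictionary is hypothetical and tiny; the point is only that the tool has nothing to go on except the order of the dictionary entries.

```python
# Hypothetical mini-dictionary: each English word lists its German senses in dictionary order.
DICTIONARY = {
    "the": ["der"],
    "drive": ["Laufwerk", "fahren"],  # noun sense listed first, verb sense second
    "is": ["ist"],
    "full": ["voll"],
    "we": ["wir"],
    "home": ["nach Hause"],
}


def gloss(sentence):
    """Replace every word with the first sense listed; unknown words pass through unchanged."""
    words = sentence.lower().rstrip(".").split()
    return "/ ".join(DICTIONARY.get(word, [word])[0] for word in words)


print(gloss("The drive is full."))  # der/ Laufwerk/ ist/ voll
print(gloss("We drive home."))      # wir/ Laufwerk/ nach Hause
```

In the second sentence "drive" is a verb, but the tool still picks the noun because it comes first. In the first, the article is not adjusted to the noun's gender ("der Laufwerk" rather than "das Laufwerk"), since nothing is inflected.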

Serious MT systems are much, much more than just word-lookup tools. So, Phil, how much are you willing to bet?


I agree with you on post-editing MT, although some languages are more suitable in this regard than others.
But the usefulness of MT depends on the expectations people have.

People who just want a gist translation of a longer text will do very well with MT, particularly if it has been post-edited not to human translation standards, but just enough that the worst errors undermining basic understanding are fixed. There are cases where people need to review huge amounts of text to find a few that are of real interest to them. Those few texts they can have translated by humans. It would be prohibitively expensive to have everything translated by humans.

One big problem is that so many people who are not aware of what is involved in MT want to use it to get cheap, acceptable or even publishable translations from MT with "a little post-editing". This will in most cases not work.

A while ago, an agency I work for did some research regarding the usability of Google Translate for their business. They asked how long post-editing a specific text would take. I never heard about this project again, although the person doing the research seemed really convinced (at least initially) it would be a great idea.


 
Meta Arkadia
Meta Arkadia
Local time: 12:32
English to Indonesian
One for all Sep 19, 2012

Michael Beijer wrote:
It would indeed be great if we could actually use ONE search tool to access everything!

CafeTran is a CAT tool, not necessarily a search tool, but using CafeTran, I can access all resources in one go: My TMs and glossaries (unlimited number, with priorities set), as many Internet resources as I think I will need, and my local dictionaries (like Van Dale) from within CafeTran, no other software needed.

http://cafetran4mac.blogspot.com/2011/07/dictionaries-and-dictionaries.html
http://cafetran4mac.blogspot.com/2011/05/internet-resources.html
http://cafetran4mac.blogspot.com/2012/07/top-priority.html

All this works if you use a Mac. I don't know if there's an easy solution for searching local dictionaries if you run CafeTran under Windows or Linux/Ubuntu; the other search options are CafeTran's own.


 
Phil Hand
Phil Hand  Identity Verified
China
Local time: 13:32
Chinese to English
TOPIC STARTER
Thanks, very interesting ideas Sep 20, 2012

A lot of very interesting discussion here.

@TNT: That makes a lot of sense to me. Where the language is not "free", but follows preset formats, it does seem likely that MT could be very successful. And domain-specific MT (serving in part as a terminology base) could be very useful.

@Samuel and Heike:

OK, so I might lose my bet. But I'm going to take a little more convincing yet. I went and got a little text, very laboriously did dictionary look up on every word, and this is what I got:

Merck Serono - Life/ change/ through/ medicine

To the/ portfolio/ this/ area/ belong/ leading/ available only on prescription/ medicine/ how/ Erbitux®/ and/ Rebif®./ This/ come/ patient/ benefit,/ the/ on/ cancer/ respectively/ multiple sclerosis/ suffer/ are./ Above/ out/ offer/ Merck Serono/ drug/ on/ treatment/ of/ infertility,/ growth disorder,/ cardiovascular/ or/ metabolic/ disease/ on./ The/ focus/ the/ research/ and/ development/ lie/ open/ the/ therapeutic/ area/ oncology,/ neurodegenerative/ disease/ plus/ rheumatology.


It's pretty gross, but I would argue that you can just about get the gist of it.

Now, here's Google Translate (I know it may not be the best tool, but it's the easiest to get):

Merck Serono - Changing lives through medicine

The portfolio of this division includes leading prescription drugs such as Erbitux ® and Rebif ®. This will benefit patients who are ill with cancer or multiple sclerosis. In addition, Merck Serono drug in the treatment of infertility, growth disorders, cardiovascular or metabolic diseases. The focus of the research and development focuses on the therapeutic areas of oncology, neurodegenerative diseases and rheumatology.


Much smoother, I fully agree. But... But...
I'm not sure I'm getting any more semantic information from the GT version. It's easier to read, but I don't think I can trust any of the extra information. For example, the use of "this" at the beginning of sentence 2. In my "look up", it's just the wrong number. In the smoother-sounding GT version, I wonder, does it say "this" deliberately? Do they mean Rebif?

Text taken from this source: http://www.merck.de/de/unternehmen/unsere_maerkte/about_merck_serono.html


 
Rolf Keller
Rolf Keller
Germany
Local time: 07:32
English to German
Try the RECENT Multifultor Sep 20, 2012

Michael Beijer wrote:

just tried Evroterm (in both Multifultor and IWS): also doesn't work.


I've just tried http://evroterm.gov.si with Multifultor 1.2.0.0 - it worked.


http://www.proz.com/forum/translator_resources/230479-multifultor_the_totally_free_term_search_tool.html#1995820


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 06:32
Member (2009)
Dutch to English
@Rolf: Sep 20, 2012

Sorry, that was a typo. Olly typed 'Evoterm', and I changed it to 'Evroterm'.
We're actually talking about a paid online dictionary. In my case: http://translex.co.uk/GWIT.html

Similarly, 'JurLex' is this one: http://gatewaywoordenboeken.nl/juridisch-economisch-lexicon.html

The problem seems to be how to access these (password-protected) dictionaries using Multifultor or IntelliWebSearch...

Michael


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 07:32
Member (2006)
English to Afrikaans
@Phil Sep 20, 2012

Phil Hand wrote:
OK, so I might lose my bet. But I'm going to take a little more convincing yet. I went and got a little text, very laboriously did dictionary look up on every word, and this is what I got...


Afrikaans: 'n Begraafplaas word ook soms 'n kerkhof genoem, omdat begraafplase aanvanklik meestal in die hof (= tuin of erf) van 'n kerk aangelê is. Binne die begraafplase vind 'n mens soms 'n sogenaamde kerkhofhuis met begroetings- of afskeidsruimte waar daar kort voor die teraardebestelling aan die oorledene die laaste eer betoon word.

Google: A cemetery is sometimes called a graveyard, because cemeteries initially mostly in the court (= garden or yard) of a church laid out. Within the cemeteries one finds sometimes a called kerkhofhuis with greetings or goodbyes space where there shortly before the burial pay the last respects to the deceased.

Dictionary: A cemetery become also sometimes a graveyard called, because cemeteries initially mostly in the court ( = garden or plot) of a church have-flair-for is. Inside the cemeteries find a human sometimes a so-called graveyard-house with greetings- or separation-space where there short in-front the interment on the deceased the last honour show become.


 
Phil Hand
Phil Hand  Identity Verified
China
Local time: 13:32
Chinese to English
TOPIC STARTER
Nice illustration of my point, I think! Sep 20, 2012

Thanks, Samuel. Your example speaks to my point: I could just about understand the dictionary version, but I was missing some detail. The GT version was much smoother, and created the illusion that that detail was being provided; in fact, the detail contained errors.

"kerkhof...hof (= tuin of erf)"
"graveyard...court (= garden or yard)"

In a passage about words, getting the words right is pretty vital.

I honestly don't think I can get more *correct* information about the original text from the GT version than from the dictionary version. This isn't because the dictionary version is good, it's because people are really good at reading texts, and reading through errors in grammar and word order to the meaning beneath.

Added to that, when I read the GT version I'm getting some misleading detail added in by the MT system.

(Though I should admit "find a human" in the dictionary version is misleading, too. I think I'm less likely to be fooled by it, because the whole text is so disjointed, but it's an arguable point.)


 