Pages in topic:   [1 2] >
Understanding Google translate?
Thread poster: Jeff Whittaker
Jeff Whittaker  Identity Verified
United States
Local time: 07:23
Member (2002)
Spanish to English
+ ...
Oct 15, 2009

I was playing around with the Google translation tool and it seems to be doing some weird things. I understand that these tools are not very useful for anything more than getting the gist of what a document is about, but usually there is some logic as to WHY the computer makes a mistake (incorrect pronoun or synonym choice, etc., because the computer has no real-world experience).

What I have noticed with the recent incarnation of translation software (based on frequency statistics rather than a simple dictionary) is that many times the program seems to come up with a "translation" that, although ungrammatical, may seem to make sense to the monolingual person, but is completely wrong. It also sometimes just leaves a lot of words out for no reason.

1) Here is a simple example from Italian to English:
Papà parlava di lavoro e mamma si occupava dell'ospitalità.
[literally: Dad talked about business and Mom took care of the hospitality]

Google's translation:
Mom and Dad talking business took care of hospitality.

By what rule did Google combine both subjects together (especially since both verbs are singular)? Why did Google interpret this sentence in this manner?

2) Another example:
Tutte queste percezioni hanno contribuito a formarmi e me ne rendo conto nelle fasi decisionali.

Google's translation:
All these perceptions have contributed to and I realize in the early stages of decision making.

For some reason, Google omits the word "formarmi" ([contributed to] training/molding/shaping me).

3) È un’azienda innovativa e flessibile.

Google's translation: It is an innovative and flexible. [omits the word "company"] Why?

4) Sometimes a particular nuance is omitted:

Purtroppo non è più possibile un contatto diretto con tutti.
[literally: Unfortunately, direct contact with everyone "is no longer" possible.]

Google's translation:
Unfortunately you can not direct contact with everyone.

Scary to think that some of this stuff is being "edited" by monolinguals or without reading the source text.



[Edited at 2009-10-15 20:47 GMT]


 
Laurent KRAULAND (X)  Identity Verified
France
Local time: 12:23
French to German
+ ...
Skipping words? Oct 15, 2009

Jeff Whittaker wrote:
By what rule did Google combine both subjects together (especially since both verbs are singular)? Why did Google interpret this sentence in this manner?


Hi Jeff,
I played with GTT too and it returns quite the same results as the "simple" tool (translate.google.com), i.e. words unknown to the machine or even synonyms are simply skipped, although the overall result may make sense at some point. Other than that, I assume that GTT has a real problem with the order of words within a sentence. Combining these two would be enough to make "perfect" translations, which at the same time are untrue to the SL.

How about that as a first approach?

[Edited at 2009-10-15 18:25 GMT]


 
Karletto
English to Slovenian
+ ...
found out that.. Oct 15, 2009

it works like this:
Papà parlava di lavoro e
Mamma si occupava dell'ospitalità.

Notice I pressed Enter between "e" and "mamma".


It's a free translation tool, so what can you expect?

[Edited at 2009-10-15 18:30 GMT]



 
Jeff Whittaker  Identity Verified
United States
Local time: 07:23
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
Misleading Appearances? Oct 15, 2009

Laurent KRAULAND wrote:


I played with GTT too and it returns quite the same results as the "simple" tool (translate.google.com), i.e. words unknown to the machine or even synonyms are simply skipped, although the overall result may make sense at some point.


Yes, I noticed that too. Perhaps that is why it "seems" to do such a great job "translating" web pages from Arabic or Chinese to English. I always wondered why there are hardly ever any untranslated words or characters. It seems that Google may just skip over words it does not know and when it encounters a grammatical structure it does not understand, it seems to make something up or pull some kind of fuzzy match from cyberspace.


[Edited at 2009-10-15 18:44 GMT]


 
Quamrul Islam  Identity Verified
Local time: 17:23
Member (2009)
English to Bengali
+ ...
It's always so funny and unpredictable! Oct 15, 2009

Yes, Google translation is funny most of the time because it's unpredictable. As a simple test, I once gave it the input: हिन्दी लिखना कितना आसान है (= It's so easy to write Hindi).

Can you imagine what Google gave me as translation? It gave me, "It's so easy to write English".

Those who know both the input and output languages well would be surprised to find the inexplicable deviation, while those knowing only one of the languages would enjoy the magical output.


 
Jeff Whittaker  Identity Verified
United States
Local time: 07:23
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
Google translate Oct 15, 2009

Quamrul Islam wrote:
Can you imagine what Google gave me as translation? It gave me, "It's so easy to write English".


Very funny. 15 years ago, MT software would never make these kinds of mistakes or omissions. It seems that by solving one problem, they have created another and "thrown the baby out with the bathwater".

At first glance, the "translations" look complete and, as you said, "magical", but upon further examination, the magic quickly wears off...

[Edited at 2009-10-15 18:53 GMT]


 
Valerie35 (X)
Local time: 12:23
German to English
Google phrase Oct 15, 2009

I copied this phrase in (from above):

हिन्दी लिखना कितना आसान है

and got:

It's so easy to write Hindi

[Edited at 2009-10-15 19:48 GMT]


 
kimjasper  Identity Verified
Denmark
Local time: 12:23
Member (2006)
English to Danish
+ ...
The ultimate blooper Oct 15, 2009

This can happen if you only rely on machine translation and forget to hire a proofreader:
http://adweek.blogs.com/adfreak/2008/07/then-well-grab.html


 
Neil Coffey  Identity Verified
United Kingdom
Local time: 11:23
French to English
+ ...
Statistical process Oct 15, 2009

Most computer translation engines nowadays work on an essentially statistical process. They generally work from two "models":

(a) a translation model, which effectively says "given phrase X, there is a Y% chance that Z will be the translation" (with some refinements)
(b) a language model, which effectively says "this series of words has this likelihood of occurring in the target language"

Then, the "translation" process involves creating a (potentially large) number of candidate translations (naive guesses, if you like) and then finding the candidate that has maximum likelihood of being the translation given models (a) and (b).
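As a rough illustration of the selection step just described, here is a minimal Python sketch. The candidate sentences come from the first post in this thread; every probability is a made-up toy number, and the names `CANDIDATES`, `score` and `best_candidate` are hypothetical. It is not a claim about the actual scores or code used by Google, only a picture of how a fluent but wrong candidate could come out on top when the two models are combined.

```python
import math

# Toy numbers, invented purely for illustration (assumption, not real data).
# Each candidate maps to (translation-model probability, language-model probability):
#   translation model ~ how well the candidate matches the source phrases
#   language model    ~ how likely the word sequence is in the target language
CANDIDATES = {
    "Dad talked about work and Mom took care of hospitality.": (0.30, 0.002),
    "Mom and Dad talking business took care of hospitality.": (0.20, 0.004),
    "Dad spoke of work and Mom herself occupied of the hospitality.": (0.35, 0.0001),
}


def score(tm_prob, lm_prob):
    """Combine both models; logs avoid underflow when many factors are multiplied."""
    return math.log(tm_prob) + math.log(lm_prob)


def best_candidate(candidates):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda text: score(*candidates[text]))


if __name__ == "__main__":
    for text, (tm_prob, lm_prob) in CANDIDATES.items():
        print(f"{score(tm_prob, lm_prob):8.3f}  {text}")
    print("Chosen:", best_candidate(CANDIDATES))
```

With these invented numbers, the second candidate is chosen because its higher language-model probability outweighs its lower translation-model probability, even though it misrepresents the source.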

I touch on some of this in slightly more detail in an article I wrote a few months ago on the use of machine translation:

http://ezinearticles.com/?Machine-Translation---How-it-Works,-What-Users-Expect,-and-What-They-Get&id=2323365

Peter Norvig (Google's head of search quality and co-author of "Artificial Intelligence: A Modern Approach") also gave an informative talk a few years ago that illustrates the essential technique used by Google Translate. I don't have the URL to hand, but it's searchable on the Internet; I think it's available via his home page.


 
Jeff Whittaker  Identity Verified
United States
Local time: 07:23
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
Machine translation Oct 15, 2009

Thank you. I will definitely read it.


Neil Coffey wrote:

I touch on some of this in slightly more detail in an article I wrote a few months ago on the use of machine translation:

http://ezinearticles.com/?Machine-Translation---How-it-Works,-What-Users-Expect,-and-What-They-Get&id=2323365



 
Daniel Grau  Identity Verified
Argentina
Member (2008)
English to Spanish
Translate server error Oct 16, 2009

I think this is a Photoshop hoax. If you zoom in on the picture, you'll notice that the bottoms of the characters are actually parallel to the bottom of the picture, instead of being parallel to the frame of the sign.

Regards,

Daniel


 
Tim Drayton  Identity Verified
Cyprus
Local time: 13:23
Turkish to English
+ ...
My recent experience Oct 16, 2009

I was also playing with Google Translate a few days ago and decided to enter various three-word sentences in Turkish and see what English translation it would come up with. I believe that Google Translate works with blocks of three words at a time, so in theory it should perform very well on this test. Indeed it did for the most part, but when I entered "Cats love their owners" (Kediler sahiplerini severler), it came up with the translation "Owners love their cats", i.e. it had mixed up the subject and object of the sentence. Given that the object in the Turkish sentence carries a case marker, this is a very serious blooper.


 
Pablo Bouvier  Identity Verified
Local time: 12:23
German to Spanish
+ ...
Understanding Google translate? Oct 17, 2009

ValBerlin wrote:

I copied this phrase in (from above):

हिन्दी लिखना कितना आसान है

and got:

It's so easy to write Hindi

[Edited at 2009-10-15 19:48 GMT]


I did the same test from Hindi into Spanish and got nearly the same result:

Es tan fácil escribir Hindi (It is so easy to write Hindi)

In any case, "Hindi" has been translated correctly.


 
Mathilde Verbaas  Identity Verified
Czech Republic
Local time: 12:23
English to Dutch
+ ...
I think it sometimes uses English as an 'in between' language Oct 20, 2009

I used Google a lot to get the gist of Czech websites. I found out that Czech-English translations produce quite a good result (I can understand the basic meaning of the text), but Czech-Dutch translations are crap; sometimes the translation even has the opposite meaning of the original text! After playing around a bit, it seems that Google translates Czech texts first to English and then translates the English texts to Dutch.

Do other people have the same experience with other language pairs?
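To make the "in between language" idea concrete, here is a small Python sketch of how information can get lost when translating through a pivot. It assumes word-for-word lookup tables, which is a deliberate oversimplification, and the three dictionaries and both function names are hypothetical. It uses the fact that Czech distinguishes informal "ty" from formal "vy" and Dutch distinguishes "jij" from "u", while English has only "you".

```python
# Toy lexicons, invented for illustration; real systems use statistical models.
CS_TO_EN = {"ty": "you", "vy": "you"}  # both Czech forms collapse into "you"
EN_TO_NL = {"you": "jij"}              # English "you" has to pick one Dutch form
CS_TO_NL = {"ty": "jij", "vy": "u"}    # hypothetical direct Czech-Dutch lexicon


def pivot_translate(word):
    """Czech -> English -> Dutch: the formality distinction is lost at the pivot."""
    return EN_TO_NL[CS_TO_EN[word]]


def direct_translate(word):
    """Czech -> Dutch without a pivot: the distinction survives."""
    return CS_TO_NL[word]


if __name__ == "__main__":
    for word in ("ty", "vy"):
        print(word, "->", pivot_translate(word), "(pivot) vs",
              direct_translate(word), "(direct)")
```

In this toy setup the formal "vy" comes out as informal "jij" via the pivot, whereas the direct lexicon gives "u". Whether Google actually routes Czech-Dutch through English is the guess made in the post above, not something this sketch demonstrates.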


 
Jeff Allen  Identity Verified
France
Local time: 12:23
Multiple languages
+ ...
Google Translate and combined subjects Nov 21, 2009

Jeff Whittaker wrote:
By what rule did Google combine both subjects together (especially since both verbs are singular)? Why did Google interpret this sentence in this manner?


Laurent KRAULAND wrote:
I played with GTT too and it returns quite the same results as the "simple" tool (translate.google.com), i.e. words unknown to the machine or even synonyms are simply skipped, although the overall result may make sense at some point. Other than that, I assume that GTT has a real problem with the order of words within a sentence. Combining these two would be enough to make "perfect" translations, which at the same time are untrue to the SL.
How about that as a first approach?


Gregor Trebec wrote:
it works like this:
Papà parlava di lavoro e
Mamma si occupava dell'ospitalità.

notice i pressed enter between "e" and "mamma".

it's translation tool for free so what can you expect??


The combined subjects issue has been one of the key pain points for MT for a couple of decades, and it is what led to rules in controlled language authoring that tell writers to produce parallel and repeated structure for all phrases and clauses of a sentence, so that the MT engine can easily recognize and follow the parallelism.

One of the ways this was handled in controlled language authoring systems was to present the authors with sentences that have parallel and dependent structures, and to have them select the word upon which a phrase depended. I spent a couple of years doing that, teaching 150 technical authors how to do it and a few dozen technical translators how to work with it in a dozen target languages.
It does, however, require a software interface to make it possible.

Then additional work was done to add semantic rules and statistical modeling to pre-select for the authors. This was described in a post-graduate thesis by Eric Crestan in 2000. I was on the thesis committee. See abstract at this link:
http://www.mail-archive.com/[email protected]/msg00259.html

The Google MT team is certainly aware of this problem, which has plagued rule-based MT for a long time. My guess is that they might have added some rules to their stat-based MT system to make subject combination choices by default when certain thresholds are passed.


Jeff


 