Using Google Translate to process long texts without buying an API access key
Thread poster: Emal Ghamsharick

Emal Ghamsharick  Identity Verified
Germany
Local time: 21:06
English to German
+ ...
Nov 11, 2014

Using Google Translate to process long texts without buying an API access key

Do you translate long texts with many random strings, e.g., for software interfaces and websites? Want to pre-process parts of it, so you can just proofread it (“machine translation post-editing”) instead of typing it all?

Some computer-assisted translation (CAT) tools let you use Google’s Translate API. The service is not free. But you can hack it. First I’ll explain the theory, then a specific example.

More: http://noordertranslation.wordpress.com/2014/11/11/3/


 

Alex Lago  Identity Verified
Spain
Local time: 21:06
Member (2009)
English to Spanish
+ ...
You aren't using the right terms, your suggestion is not a hack, it is a workaround Nov 11, 2014

First of all, you should know that hacking a program is illegal, and advertising the fact that you hack programs could get you into trouble with the copyright owner and the authorities, so if I were you I would be careful about advertising hacks.

Second, what you have posted on your website is not in fact a hack but a workaround. You are not hacking into the Google API; you are showing people how to achieve the same result without using the API itself. That is called a workaround and is in fact legal, so no problem there.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 21:06
Member (2006)
English to Afrikaans
+ ...
Google Translate by alignment Nov 11, 2014

Emal Ghamsharick wrote:
Some computer-assisted translation (CAT) tools let you use Google’s Translate API. The service is not free. But you can hack it.


It looks as if you're explaining how to align a translation that was done on Google Translate. Yes, if you don't have a Google Translate API subscription (or if you find the API too cumbersome even if you have a subscription, as I have), you can create a TM by translating the source text with Google Translate and then aligning the translation to the source text.

In your case, you used Wordfast Anywhere to perform the segment extraction (not all CAT tools offer segment extraction, but I know that Wordfast Classic has it). You then used Excel to create the TM, but you can also use PlusTools to align a two-column table.

Finally, you seem to use the word "auto-propagate" to mean "propagate". If you have to copy/paste content into the columns manually, then it's not "auto" ... (-: ... but we know what you meant.

I find it odd that you use Wordfast Anywhere, if you assume that the user has Wordfast Pro, because Wordfast Pro's "PM Perspective" view contains both a segment extraction and a bilingual table creator, just like Wordfast Anywhere. I suppose the advantage of Wordfast Anywhere is that it works also for people who don't have Wordfast Pro.

Here's how I usually do it:

I open the file in Wordfast and then do a segment extraction. I save the extracted segment as aaa.txt. I then resave aaa.txt as bbb.txt and perform some preprocessing on it (e.g. remove sensitive information). Then, I translate bbb.txt in Google Translate (using the program QTranslate) and save the translation as ccc.txt. I then save ccc.txt as ddd.txt and perform some post-processing on it (e.g. fix the spacing errors that Google introduces). Then I align aaa.txt with ddd.txt using PlusTools, and when I'm happy with the result (usually it's perfectly aligned), I generate the TM. Finally, I change the translation units' user ID to something that tells me it's a Google translation. I create bbb.txt and ccc.txt to enable me to roll back to a previous state if I discover that I should have done something a little different.
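The last steps of this workflow (post-processing the translated file and pairing it with the source) can be sketched in Python. This is only an illustration under assumptions, not what PlusTools does: it assumes aaa.txt and ddd.txt hold one segment per line in the same order, the spacing fixes are guesses at typical errors, and the output is a plain tab-separated file rather than any particular tool's TM format. The function names and the GOOGLE_MT user ID are hypothetical.

```python
import re
from pathlib import Path


def fix_spacing(line: str) -> str:
    """Collapse runs of whitespace and drop a space before punctuation.

    Assumption: these are the kinds of spacing errors to repair. Some
    languages (e.g. French) want a space before "?" or ":", so adjust.
    """
    line = re.sub(r"\s+", " ", line).strip()
    return re.sub(r" ([,.;:!?])", r"\1", line)


def align(source_lines, target_lines, user_id="GOOGLE_MT"):
    """Pair segments one-to-one; refuse to guess if the counts differ."""
    if len(source_lines) != len(target_lines):
        raise ValueError(
            f"{len(source_lines)} source vs {len(target_lines)} target segments"
        )
    return [
        (user_id, src, fix_spacing(tgt))
        for src, tgt in zip(source_lines, target_lines)
    ]


def read_segments(path):
    """Read one segment per line from a UTF-8 text file."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def write_tab_delimited(units, path):
    """Write user ID, source and target as one tab-separated line per unit."""
    with open(path, "w", encoding="utf-8") as out:
        for user_id, src, tgt in units:
            out.write(f"{user_id}\t{src}\t{tgt}\n")
```

With the files in place, `write_tab_delimited(align(read_segments("aaa.txt"), read_segments("ddd.txt")), "google_tm.txt")` would produce the paired file. The length check stands in for the manual "am I happy with the alignment" step: if Google merged or split a line, the script stops instead of silently mispairing segments.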


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 20:06
Member (2009)
Dutch to English
+ ...
Come on, guys… Nov 11, 2014

Way too much trouble if you ask me. A subscription isn't exactly expensive.

Michael


 

Salam Alrawi  Identity Verified
United States
Local time: 14:06
English to Arabic
+ ...
Removing sensitive info Nov 11, 2014

Samuel Murray wrote:

Emal Ghamsharick wrote:
Some computer-assisted translation (CAT) tools let you use Google’s Translate API. The service is not free. But you can hack it.


It looks as if you're explaining how to align a translation that was done on Google Translate. Yes, if you don't have a Google Translate API subscription (or if you find the API too cumbersome even if you have a subscription, as I have), you can create a TM by translating the source text with Google Translate and then aligning the translation to the source text.

In your case, you used Wordfast Anywhere to perform the segment extraction (not all CAT tools offer segment extraction, but I know that Wordfast Classic has it). You then used Excel to create the TM, but you can also use PlusTools to align a two-column table.

Finally, you seem to use the word "auto-propagate" to mean "propagate". If you have to copy/paste content into the columns manually, then it's not "auto" ... (-: ... but we know what you meant.

I find it odd that you use Wordfast Anywhere, if you assume that the user has Wordfast Pro, because Wordfast Pro's "PM Perspective" view contains both a segment extraction and a bilingual table creator, just like Wordfast Anywhere. I suppose the advantage of Wordfast Anywhere is that it works also for people who don't have Wordfast Pro.

Here's how I usually do it:

I open the file in Wordfast and then do a segment extraction. I save the extracted segment as aaa.txt. I then resave aaa.txt as bbb.txt and perform some preprocessing on it (e.g. remove sensitive information). Then, I translate bbb.txt in Google Translate (using the program QTranslate) and save the translation as ccc.txt. I then save ccc.txt as ddd.txt and perform some post-processing on it (e.g. fix the spacing errors that Google introduces). Then I align aaa.txt with ddd.txt using PlusTools, and when I'm happy with the result (usually it's perfectly aligned), I generate the TM. Finally, I change the translation units' user ID to something that tells me it's a Google translation. I create bbb.txt and ccc.txt to enable me to roll back to a previous state if I discover that I should have done something a little different.



I really enjoyed your explanation. But I have one question for you if you don't mind answering. Why would you remove something (e.g. Sensitive information) from file bbb.txt? Will google pick it up or something?


 

Wolfgang Jörissen  Identity Verified
Belize
Dutch to German
+ ...
0,64 € Nov 11, 2014

That was my last bill for API usage. I am not a heavy user, but nevertheless, a couple of colleagues of mine use the same key. Even if I would have criminal energy within me (quod non), I see no reason for "hacking" that system.

 

Jorge Payan  Identity Verified
Colombia
Local time: 15:06
Member (2002)
German to Spanish
+ ...
EUR 3,8 per month Nov 11, 2014

That is how much I pay Dallas Cao for using his Google Translate for Translators (GT4T) tool (http://dallascao.com/en/gt4t). It sends the text to be translated to both Bing Translator and Google Translate (and even to mymemory.translated.net) and inserts the translation back into any target segment field provided in the TenT environment, Word, Excel, the Clipboard, etc.

It saves me from messing around with settings for each TenT I use and, being a Windows application, it works almost anywhere on the screen. Dallas takes care of paying whoever is owed for the API usage.

Frankly, I don't see the point of saving yourself such a meager amount per month. The investment I make is easily amortized after the first 15 minutes of paid work.

Saludos


 

Samuel Murray  Identity Verified
Netherlands
Local time: 21:06
Member (2006)
English to Afrikaans
+ ...
The API is slow, that's what Nov 11, 2014

Wolfgang Jörissen wrote:
I am not a heavy user, but nevertheless, a couple of colleagues of mine use the same key. Even if I would have criminal energy within me (quod non), I see no reason for "hacking" that system.


The API translates one segment at a time, which is a slow process, particularly with very short segments. Sometimes one wants more speed, and then using the API is not suitable.


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 20:06
Member (2009)
Dutch to English
+ ...
absolute insanity Nov 11, 2014

Samuel Murray wrote:

Here's how I usually do it:

I open the file in Wordfast and then do a segment extraction. I save the extracted segment as aaa.txt. I then resave aaa.txt as bbb.txt and perform some preprocessing on it (e.g. remove sensitive information). Then, I translate bbb.txt in Google Translate (using the program QTranslate) and save the translation as ccc.txt. I then save ccc.txt as ddd.txt and perform some post-processing on it (e.g. fix the spacing errors that Google introduces). Then I align aaa.txt with ddd.txt using PlusTools, and when I'm happy with the result (usually it's perfectly aligned), I generate the TM. Finally, I change the translation units' user ID to something that tells me it's a Google translation. I create bbb.txt and ccc.txt to enable me to roll back to a previous state if I discover that I should have done something a little different.



Wow, I just re-read your post. Why in god's name would you want to do all that before every job? What a complete waste of time. I just don't understand why you want to go through all those steps every time you start a job, just to end up with a crappy Google Translated TMX?! What about multi-file projects?

You say that the API is too slow, but how fast do you need it to be? I mean, don't you translate segment by segment, like everyone else? Or is it just to save money? You say "Sometimes one wants more speed", but why exactly? What is the benefit, other than getting something for free?

Also, I've been thinking about the various GT "facilitators" out there, and I have a sneaking suspicion they probably aren't paying Google anything. More likely they just coded a hack and invented a pricing schedule.

"3. Both Google Translate API and Bing translator API are now paid services, but you don’t need to pay Google or Microsoft directly. As a GT4T customer, you pay me and I pay Google and Microsoft." (http://gt4t.net/en/purchase/ )

Hmm.

How can GT4T offer set annual rates? It just doesn't make sense. If it were legit, wouldn't you expect the software to just have an input field for the Google Translate API key, like all the CAT tools?

Michael

[Edited at 2014-11-11 23:10 GMT]


 

Samuel Murray  Identity Verified
Netherlands
Local time: 21:06
Member (2006)
English to Afrikaans
+ ...
@Michael Nov 12, 2014

Michael Beijer wrote:
I just re-read your post. Why ... would you want to do all that before every job?


I don't do that before every job (I never said so, either). I was simply describing my method of accomplishing what the original poster accomplishes with his method. As to "why"... well, as stated in more than one post in this thread, allow me to repeat: it is sometimes better to get Google translations in bulk instead of piecemeal.

What a complete waste of time. I just don't understand why you want to go through all those steps ... just to end up with a crappy Google Translated TMX?!


(Well, yes, if one ends up with a TMX file, it would certainly be a waste of time... because then you'd have to convert the TMX back to something useful again.)

CAT hopping always takes extra time. Whether it is wasted time depends on how much time is saved in the end.

What about multi-file projects?


Perhaps your CAT tool can't handle multifile segment extraction, but that does not mean no CAT tools can. Wordfast Pro can do it. Wordfast Classic used to be able to do it. OmegaT can do it. With a bit of tinkering, even Trados 2009+ can do it.

And besides, if you prefer to use a CAT tool that can extract only one file at a time, well... ever heard of "merge"?

You say that the API is too slow, but how fast do you need it to be?


The original post was specifically about translating lists of words. The API causes pauses of up to a second between each segment. When translating lists, those pauses can be excruciating.

I mean, don't you translate segment by segment, like everyone else?


Translating segment by segment does not mean one can't do preparations for more than one segment at a time. After all, when you receive a file with lots of formatting problems, don't you start off by fixing the entire file, before you start translating even the first segment?

There is nothing strange about creating reference materials before you start translating.

You say "Sometimes one wants more speed", but why exactly?


What an odd question.

Remember, the "more speed" that we're talking about in this thread relates less to overall translation speed than to responsiveness of the user interface. The API causes pauses. Using a TM instead of the API effectively gets rid of the pauses. Granted, for some people those pauses are not annoying (or hardly noticeable, if their software pauses between segments anyway).
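To make the responsiveness point concrete, here is a rough Python sketch. It assumes the worst case described above (one pause of up to a second per segment) and models the TM as a simple in-memory exact-match table; real CAT tools also do fuzzy matching, which is not shown. The function names and the example segment count are made up for illustration.

```python
def total_pause_seconds(segment_count, pause_per_segment=1.0):
    """Worst-case waiting time if every segment triggers one API pause."""
    return segment_count * pause_per_segment


def build_tm(pairs):
    """Turn (source, target) pairs prepared in bulk into a lookup table."""
    return dict(pairs)


def suggest(tm, segment):
    """Exact-match lookup; returns None when the segment is not in the TM."""
    return tm.get(segment)


# Hypothetical list of 500 short strings: up to 500 seconds of pauses in total.
print(total_pause_seconds(500))

tm = build_tm([("Open file", "Datei öffnen"), ("Save as", "Speichern unter")])
print(suggest(tm, "Save as"))
```

The lookup answers from local memory, so there is no network round trip per segment; the cost of fetching the translations is paid once, up front, when the TM is built.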

I've been thinking about the various GT "facilitators" out there, and I have a sneaking suspicion they probably aren't paying Google anything.


Please keep suspicion-mongering to a separate thread where like-minded people can reply with either evidence or further suspicions. Besides, it has nothing to do with the current topic, as the original poster clearly uses Google's own service directly.


[Edited at 2014-11-12 09:26 GMT]


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 20:06
Member (2009)
Dutch to English
+ ...
Translating lists of words … with Google Translate? Nov 12, 2014

Samuel Murray wrote:

You say that the API is too slow, but how fast do you need it to be?


The original post was specifically about translating lists of words. The API causes pauses of up to a second between each segment. When translating lists, those pauses can be excruciating.



Translating lists of words with Google Translate? Wow, whoever tries such a thing deserves your tortuous workflow.

Michael


 

Samuel Murray  Identity Verified
Netherlands
Local time: 21:06
Member (2006)
English to Afrikaans
+ ...
Translating lists Nov 12, 2014

Michael Beijer wrote:
Samuel Murray wrote:
You say that the API is too slow, but how fast do you need it to be?

The original post was specifically about translating lists of words. The API causes pauses of up to a second between each segment. When translating lists, those pauses can be excruciating.

Translating lists of words with Google Translate? Wow, whoever tries such a thing deserves your tortuous workflow.


1. Google Translate is ideally suited to translating lists of words. How would you translate lists of words... with a dictionary next to the keyboard?

2. My workflow is completed in under a minute. If that is tortuous, then you really need to find another career.


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 20:06
Member (2009)
Dutch to English
+ ...
…with 29 dictionaries! Nov 12, 2014

Samuel Murray wrote:

1. Google Translate is ideally suited to translating lists of words. How would you translate lists of words... with a dictionary next to the keyboard?

Depends on the list of words.

I generally don't translate lists of words, and if I do, I get paid a lot more than my regular word rate, so yes, with a dictionary (or 15) on my desk, and another god knows how many on my computer, plus …
• Google Translate + Microsoft Translator in little boxes in CafeTran (via their respective APIs),
• my TMs and external databases inside CafeTran,
• my 40 million-TU TMLookup TMX database,
• my dtSearch/Copernic desktop search programs,
• IntelliWebSearch,
plus … plus … plus …
Samuel Murray wrote:
2. My workflow is completed in under a minute. If that is tortuous, then you really need to find another career.

Anything involving PlusTools is by definition tortuous. Isn't it from 1989?

Michael


 

Milan Condak  Identity Verified
Local time: 21:06
English to Czech
Instant translation Nov 12, 2014

Samuel Murray wrote:

1. Google Translate is ideally suited to translating lists of words.

2. My workflow is completed in under a minute.


My workflow takes between 1 and 10 minutes.

I created a presentation "Instant translation" four years ago.

There are 8 pages in Czech:

http://www.condak.net/machine_t/cs/instant/cs/00.html

03.html = 2010: hunalign, 2014: I use LF Aligner 4.05

05.html = Okamžitý překlad / Instant translation

Video (7 seconds) showing the translation process (less than 1 second for 137 segments)

I do not use this method anymore. For translation with Google Translate I use a desktop application for Windows (like Samuel, though not QTranslate).

For bulk editing of target TUs in a TMX I use Virtaal (spaces, mistranslations, ...), and for individual editing I get suggestions from several engines:

http://www.condak.net/machine_t/qtranslate/20131101/cs/03.html

Edited TMX is directly used in OmegaT.

Using a dictionary created from extracted words and GT in OmegaT:

http://www.condak.net/cat_other/omegat/2013-11-24/cs/03.html

Anything that is wrong, and any multi-word phrase, I add to the glossary manually.

Happy PEMT, in your own hands.

Milan


 

Samuel Murray  Identity Verified
Netherlands
Local time: 21:06
Member (2006)
English to Afrikaans
+ ...
@Salam Nov 14, 2014

Salam Alrawi wrote:
Why would you remove something (e.g. Sensitive information) from file bbb.txt? Will Google pick it up or something?


There are those who believe that Google might do that, yes.

But I often just want to make sure the TM I create is sanitised, and it's easier to sanitise it before sending it to Google Translate than afterwards. For example, Google Translate might translate "London, Bristol, Manchester" as "London, Bristol, Manchester", but it might also translate it as "London, Bristol, Edinburgh" (because Google isn't intelligent, and yes, Google can translate such strings in such odd ways), and if I try to remove "Manchester" from the TM, I would be stuck with Edinburgh in the TM.
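Sanitising before sending could look roughly like the sketch below. It is only an illustration: it assumes bbb.txt is handled as a list of lines and that the sensitive strings are known in advance, and the function name and the XXX placeholder are made up. Replacing in place, rather than deleting whole lines, keeps the segment count unchanged, which matters if the result is later aligned line by line with the source.

```python
import re


def mask_terms(lines, sensitive_terms, placeholder="XXX"):
    """Replace each sensitive term with a placeholder, keeping the line count.

    Pass placeholder="" to remove the terms outright. Whether an MT engine
    leaves a given placeholder untouched is an assumption to verify.
    """
    if not sensitive_terms:
        return list(lines)
    # Longest terms first, so "New York City" wins over "New York".
    ordered = sorted(sensitive_terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    return [pattern.sub(placeholder, line) for line in lines]


print(mask_terms(["London, Bristol, Manchester"], ["Manchester"]))
```

Because the term never reaches Google Translate, there is no translated (or mistranslated) counterpart to hunt for in the TM afterwards.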


 