Pages in topic:   < [1 2]
Creation of TM out of 2 TMs with different target languages
Thread poster: avsie (X)
Daniel Grau  Identity Verified
Argentina
Member (2008)
English to Spanish
How to do it [post-edited to show < & > symbols correctly] Sep 24, 2005

Marie-Claude Falardeau wrote:
We have 2 TMs: EN>SP and EN>IT. They pretty much have the same source segments (at least 75% of them, I'd say).
Now, we would like to have an IT>SP (or SP>IT) TM.


You can actually translate the tmx memory itself!

1) Open the EN>SP TMX file in Word. Do a global replacement with wildcards to ensure that none of the <codes> get translated:

\<*\>
to
tw4winExternal style

2) Do a similar global replacement, also with wildcards, to turn the target-language segments (ES-01, ES-AR, or whatever code you are using) into non-translatable text as well:

\<tuv lang="ES-01"*\</seg\>
to
tw4winExternal style

The result should now show something like this (where the [bracketed text] denotes tw4winExternal style):

[<tu creationdate="20020602T120103Z" creationid="+A!" usagecount="0">]
[ <tuv lang="EN-US">]
[ <seg>] NOTICE OF PRIVACY PRACTICES [</seg>]
[ </tuv>]
[ <tuv lang="ES-AR">]
[ <seg>NOTIFICACIÓN DE PRÁCTICAS DE PRIVACIDAD:</seg>]
[ </tuv>]
[</tu>]

Note that only "NOTICE OF PRIVACY PRACTICES" remains without the tw4winExternal style.
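
For anyone who wants to check outside Word what steps 1 and 2 leave translatable, here is a minimal Python sketch. It is not part of the Word/Trados procedure. The sample unit is the one from the example above, and `translatable_text` is a hypothetical helper name. Unlike the styling in Word, it has to handle the target-language `<tuv>` first, because stripping the tags first would remove the markers it looks for.

```python
import re

# Sample unit copied from the example above.
TU = '''<tu creationdate="20020602T120103Z" creationid="+A!" usagecount="0">
 <tuv lang="EN-US">
 <seg>NOTICE OF PRIVACY PRACTICES</seg>
 </tuv>
 <tuv lang="ES-AR">
 <seg>NOTIFICACIÓN DE PRÁCTICAS DE PRIVACIDAD:</seg>
 </tuv>
</tu>'''


def translatable_text(tmx, target_lang):
    """Return the text that remains after protecting tags and the target <tuv>."""
    # Step 2 equivalent: drop everything from the target <tuv ...> up to its </seg>.
    target = re.compile(
        r'<tuv lang="' + re.escape(target_lang) + r'".*?</seg>', re.DOTALL
    )
    without_target = target.sub('', tmx)
    # Step 1 equivalent: drop every remaining <...> tag.
    without_tags = re.sub(r'<[^>]*>', '\n', without_target)
    return [line.strip() for line in without_tags.splitlines() if line.strip()]


print(translatable_text(TU, "ES-AR"))
```

Only the English segment should come back, which is the same text that stays unstyled in Word.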

3) Submit the file for EN>IT autotranslation at 100% matches, making sure it copies the English if there is no 100% match. Only the matching segments will get translated. So, if the initial file had more or less this structure (again, [bracketed text] denotes tw4winExternal style):

English1 [Spanish1] (I'm not showing the intervening <codes>)
English2 [Spanish2]
English3 [Spanish3]
....

then after autotranslation you will end up with something like this
(where the {curly brackets} denote translated pairs):

{English1} {Italian1} [Spanish1]
{English2} {English2} [Spanish2]
{English3} {Italian3} [Spanish3]

Note that now all we have to do is get rid of English-English segments like the second one, which did not have a match.
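
The autotranslation pass in step 3 can be pictured as a plain lookup on the English source. This is a rough Python sketch of that idea only, not of what Trados does internally. The function name and the data are hypothetical, and it assumes the EN>IT memory has been loaded into a dictionary keyed by the exact English segment.

```python
def autotranslate(en_es_pairs, en_to_it):
    """Look up each English source in the EN>IT memory.

    When there is no exact match, the English source is copied into the
    target slot, mirroring the "copy source on no match" setting.
    """
    return [(en, en_to_it.get(en, en), es) for en, es in en_es_pairs]


# Hypothetical data standing in for the two memories.
en_es = [("English1", "Spanish1"), ("English2", "Spanish2"), ("English3", "Spanish3")]
en_it = {"English1": "Italian1", "English3": "Italian3"}

result = autotranslate(en_es, en_it)
for row in result:
    print(row)
```

The second row comes back as English, English, Spanish. That is the kind of row the next step removes.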

4) Do a global wildcard replacement to get rid of the non-matches:

\<tu [!\{]@\{0\>(*)\<\}0\{\>\1\<0\}*\</tu\>

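Test the above with a Find first. It should find only the English-English pairs, since the \1 requires the target to repeat the source exactly. Then replace with nothing (leave the "Replace with" box empty). If you copied the pattern correctly, this will delete all "{English} {English} [Spanish]" instances, leaving just the "{English} {Italian} [Spanish]" ones: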

{English1} {Italian1} [Spanish1]

{English3} {Italian3} [Spanish3]
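
To see how the backreference in step 4 works, here is an approximate Python equivalent of the Word wildcard pattern. Treat it as a sketch: the sample units are simplified and hypothetical (real units have the <tuv> and <seg> codes in between), and the match value 100 on the translated units is my assumption, since the pattern above only shows the 0 of a non-match.

```python
import re

# Approximate Python counterpart of the Word wildcard pattern in step 4.
NON_MATCH = re.compile(
    r'<tu [^{]+'          # start of a unit, up to (not including) the first "{"
    r'\{0>(.*?)<\}0\{>'   # source segment, followed by a match value of 0
    r'\1<0\}'             # target identical to the source (backreference)
    r'.*?</tu>',          # rest of the unit
    re.DOTALL,
)

# Simplified, hypothetical units; the match value 100 is an assumption.
SAMPLE = (
    '<tu id="1"> {0>English1<}100{>Italian1<0} [Spanish1]</tu>\n'
    '<tu id="2"> {0>English2<}0{>English2<0} [Spanish2]</tu>\n'
    '<tu id="3"> {0>English3<}100{>Italian3<0} [Spanish3]</tu>\n'
)

found = [m.group(1) for m in NON_MATCH.finditer(SAMPLE)]  # the "Find" test
cleaned = NON_MATCH.sub('', SAMPLE)                        # the replacement
print(found)
print(cleaned)
```

As in Word, it is worth looking at what the pattern finds before replacing anything.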



5) Run a non-wildcard search-and-replace several times to get rid of extra returns (we don't want more than two in a row):

^p^p^p
to
^p^p
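
The replacement has to be repeated in Word because each pass only shortens a run of returns. A regular expression with a quantifier does the same cleanup in one pass. A small Python sketch, assuming the text uses "\n" line endings (`collapse_blank_lines` is a hypothetical helper name):

```python
import re


def collapse_blank_lines(text):
    """Reduce any run of three or more newlines to exactly two."""
    return re.sub(r'\n{3,}', '\n\n', text)


print(repr(collapse_blank_lines("unit1\n\n\n\n\nunit3")))
```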

6) Finally, do a Trados cleanup of the file and save it as Unicode. Note that intermediate files should be saved in .DOC format, so as not to lose the styles and colors.

Regards,

Daniel

[Edited at 2005-09-25 23:56]


 
avsie (X)  Identity Verified
English to French
TOPIC STARTER
Interesting! Sep 24, 2005

Gee, thanks!

We didn't actually need it in the end (the project was cancelled), but if this situation comes up again, I'll surely give it a try!

Thanks!

Marie-Claude


 