XML translation in CAT tool
Thread poster: AchimT (X)
English
May 5, 2008

I'm putting this here (not into the Trados forum) as it is an XML related question - but I do run into this when translating in Trados. Here goes:

Sometimes (but not always), when I translate an XML file into French, for example, I cannot open the file in a browser on my system; it stops with an error message. The problem could be related to Unicode special characters (the accented characters). It seems the browser complains about everything that starts with a '&'.

I wonder, how can that be? And what can I do to avoid this issue?

I would also like to ask: are there other problems that translating in Trados can introduce into XML files and that could prevent the content management system that produced the XML file, or a browser, from reading the translated file?

One last question - do I still need to create an ini file in Trados if I have a DTD? Or can Trados 8 use the DTD directly as an ini file? And is the creation of the ini file as straightforward as it looks in the manual, or are there hidden traps?

Apologies if any of my questions could be answered by reading the manual - I've sincerely tried my best at finding these answers in the manual.

Thanks a lot in advance for all replies

Achim


 
Selcuk Akyuz
Türkiye
English to Turkish
A question May 5, 2008

AchimT wrote:

Sometimes (but not always), when I translate an XML file into French, for example, I cannot open the file in a browser on my system; it stops with an error message. The problem could be related to Unicode special characters (the accented characters). It seems the browser complains about everything that starts with a '&'.



Which browser program do you use, internet explorer or firefox?


 
AchimT (X)
English
TOPIC STARTER
Both May 6, 2008

I am using IE8 and Firefox 2.

My concern is, will the customer also have problems opening this in whichever software he is using, and how can I safely test and know this? I wonder, are the files I'm producing flawed, or are they ok the way Tag Editor exports them?

Strangely, sometimes - but not always - I can open a file that doesn't open on my system (in IE or Firefox) on another system. But not always.

So I wonder, is some of the XML code that TE exports flawed, or is this some setting or flaw on my system specifically?

Thanks for your help

Achim


 
Jan Sundström
Sweden
English to Swedish
Don't trust your browser May 6, 2008

AchimT wrote:

I am using IE8 and Firefox 2.

My concern is, will the customer also have problems opening this in whichever software he is using, and how can I safely test and know this? I wonder, are the files I'm producing flawed, or are they ok the way Tag Editor exports them?

Strangely, sometimes - but not always - I can open a file that doesn't open on my system (in IE or Firefox) on another system. But not always.

So I wonder, is some of the XML code that TE exports flawed, or is this some setting or flaw on my system specifically?

Thanks for your help

Achim


In general, since the nature of the XML language is eXtensible, you can't trust any web browser to parse your XML correctly.

And since the client might have a highly customized application reading his particular XML files, there is no "universal" program that can "guess" how the XML should be displayed on the client side.

My gut reaction is to always trust TagEditor and not to mess around with the XML files afterwards. Recent versions of Trados do support XML and can create ini files with or without a DTD, so just trust the built-in capabilities.

The only thing you can do after translation is to open the XML in an XML editor (like XML Spy). Even then, you will only be able to verify that the structure is correct, not that your special characters are displayed correctly. On the contrary, Unicode characters might look weird, but that's the way it should be. Don't be tempted to make "corrections" in the editor after Trados.
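For anyone without an XML editor at hand, the structural check described above can be roughly approximated with a few lines of Python. This is only a sketch: the `check_well_formed` helper and the sample strings are made up for illustration, and it tests well-formedness only, not validity against the client's DTD or how the client's application will display the text.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, else a short error description."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the problem.
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


# Hypothetical samples: the second one contains a bare ampersand.
good = "<note><to>AT&amp;T</to><body>Café crème</body></note>"
bad = "<note><to>AT&T</to></note>"

print(check_well_formed(good))  # None
print(check_well_formed(bad))   # a message that starts with "line 1, column ..."
```

For a file exported from TagEditor, `ET.parse("translated.xml")` (hypothetical file name) performs the same check directly on the file and honours the encoding declared at the top of it.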

Sorry, maybe that was not the answer that you wanted to hear, but this is how I work.

/J


 
Piotr Bienkowski
Poland
English to Polish
The & character starts an entity May 7, 2008

It seems the browser complains about everything that starts with a '&'.



Any browser, and indeed any tool capable of parsing and validating XML, will complain if it encounters the & character on its own.

The & character tells the browser or tool that an entity reference starts here, i.e. a series of characters standing for something that otherwise can't be represented directly, for example a letter that doesn't exist in the encoding of the XML file.

You can find the encoding information at the top of the file, in something like:

Code:
<?xml version="1.0" encoding="utf-8"?>



There can be something else instead of utf-8 in this part, for example windows-1252 or iso-8859-1 or even us-ascii.
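To show how the declared encoding decides which characters have to be written as references starting with &, here is a small Python sketch. The `serialize` helper and the sample text are assumptions for illustration only; this is not what Trados does internally, and the helper assumes the text contains no markup characters of its own.

```python
import xml.etree.ElementTree as ET


def serialize(text, encoding):
    """Build a tiny XML document as bytes in the given encoding.

    Characters the encoding cannot represent are written as numeric
    character references (for example &#233; for é).
    """
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    body = f"<p>{text}</p>"
    return (declaration + body).encode(encoding, errors="xmlcharrefreplace")


text = "Café crème"

# UTF-8 and ISO-8859-1 can store é and è directly; US-ASCII cannot.
print(serialize(text, "utf-8"))
print(serialize(text, "iso-8859-1"))
print(serialize(text, "us-ascii"))  # ends with b'<p>Caf&#233; cr&#232;me</p>'

# Whatever the encoding, a parser recovers the same text.
for enc in ("utf-8", "iso-8859-1", "us-ascii"):
    assert ET.fromstring(serialize(text, enc)).text == text
```

The bytes differ between the three files, but a conforming parser reads the same text from each, which is why accented characters that look odd in a plain text editor are not necessarily an error.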

My advice is to avoid using the & character and to type the equivalent of "and" in your target language.

But if you can't, because you have things like AT&T in your file, you can safely edit it in a Unicode-aware text editor, or an XML editor, and replace the standalone & character with the entity that stands for it. Because the & character always starts an entity in XML (and HTML) files, this character itself must be represented by an entity, if you want to display just an &. The entity for & is

Code:
&amp;
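If there are many such cases, the replacement can be sketched in Python instead of doing it by hand. The pattern and the `escape_bare_ampersands` helper below are my own illustration, not a Trados feature. The sketch treats anything shaped like `&name;`, `&#123;` or `&#x1F;` as an existing reference and leaves it alone, so text such as "AT&T;" would be left untouched by mistake, and CDATA sections are not handled.

```python
import re

# An ampersand that is NOT followed by "name;", "#digits;" or "#xhex;".
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text):
    """Replace standalone & with &amp;, keeping existing references intact."""
    return BARE_AMPERSAND.sub("&amp;", text)


print(escape_bare_ampersands("AT&T &amp; R&D, caf&#233;"))
# AT&amp;T &amp; R&amp;D, caf&#233;
```

Note that XML itself predefines only five named entities (`&amp;`, `&lt;`, `&gt;`, `&apos;` and `&quot;`). A named HTML entity such as `&eacute;` is not known to an XML parser unless the DTD declares it, which may be another reason a browser rejects a translated file.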





[Edited at 2008-05-07 06:14]


 
AchimT (X)
English
TOPIC STARTER
Thanks! May 7, 2008

Thanks a lot for the replies.

It had been my main concern that the customer might have similar problems in his systems if I have them, but from your replies I conclude that as a matter of fact, it may be the proprietary nature of the customer's application and the resulting extended XML code that makes it impossible for me to read it.

So the best approach is probably to just tell every customer asking for translation of XML files that we should test the export I can send him in a little test translation prior to doing the entire job.

Thanks a lot again for your help!

Achim


 

