Strange tags in source text
Thread poster: Paul Klassen
Paul Klassen
Canada
French to English
Mar 7, 2013

I have had this problem using Trados on Word files, but thought it was a Word thing. Now I'm translating a LaTeX file using OmegaT and am encountering the same thing: tags that seem to serve no purpose, inserted between each character in a paragraph. It only affects one or two paragraphs; the rest are normal. Please take a look:



Any idea what could be causing this, and how to get rid of them?

Thanks

Paul


 
Didier Briel
France
English to French
It might be real formatting Mar 7, 2013

Paul Klassen wrote:
I have had this problem using Trados on Word files, but thought it was a Word thing. Now I'm translating a LaTeX file using OmegaT and am encountering the same thing: tags that seem to serve no purpose, inserted between each character in a paragraph. It only affects one or two paragraphs; the rest are normal. Please take a look:



Any idea what could be causing this?

If it is not a bug (difficult to say without discussing the file in detail, which would be better done in the OmegaT Yahoo support group), it might be real formatting.
LaTeX can position characters very precisely.

and how to get rid of them?

The simplest approach would be to clean the source document, but that requires understanding LaTeX syntax well enough to know what to remove.

Didier


 
Paul Klassen
Canada
French to English
TOPIC STARTER
Please clarify Mar 8, 2013

Thanks for your prompt reply, Didier.

Since *.tex files are pure text, I wasn't sure what to make of your comments (i.e. what kind of formatting could be embedded in the text like that). But you inspired me to try some stuff.

First of all, in text editors that bit of text looks like this:

avec $\ell=\|y\|=(\sum_{p}y_{i}^{2})^{1/2}$ la norme euclidienne de $y$ et $m(\theta)$ une fonction de transformation des coordonnées polaires d'angles $\theta$ telle que $m(\theta)=y/\ell(y)$ et $\|m(\theta)\|=1$. $m(\theta)$ peut également s'exprimer comme suit :

That seems ok.

If I create another *.tex file containing only that paragraph and view it in TeXnicCenter (without running a build, as it has no front matter), the paragraph looks fine, but loading that file into OmegaT shows the same gibberish. If I load that file in a text editor (Notepad++) and then save it as a text file (*.txt), it looks normal in OmegaT (but saving that file back to *.tex does not eliminate the problem).

I'm really not sure whether this is an OmegaT problem, or what.

Thanks again,
Paul


 
Didier Briel
France
English to French
Probably a bug Mar 8, 2013

Paul Klassen wrote:
Since *.tex files are pure text,

They are not. That is, they are no more plain text than XML files are.
LaTeX files are formatted files, the issue being that we had to hard-code a parser, as there is no standard parser we could use.
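
To illustrate why a .tex file behaves like markup rather than plain text, here is a minimal Python sketch that splits a paragraph into ordinary text and inline math. It is not OmegaT's parser, and it says nothing about the cause of the problem in this thread; the names `INLINE_MATH` and `split_text_and_math` are hypothetical, and the pattern deliberately ignores escaped `\$` and display math.

```python
import re

# Inline math delimited by single dollar signs, e.g. $m(\theta)$.
# Simplifying assumption: no escaped \$ and no $$...$$ display math.
INLINE_MATH = re.compile(r"\$[^$]+\$")

def split_text_and_math(paragraph):
    """Return a list of (kind, fragment) pairs, where kind is 'text' or 'math'."""
    parts = []
    pos = 0
    for match in INLINE_MATH.finditer(paragraph):
        if match.start() > pos:
            # Plain text that precedes this math span.
            parts.append(("text", paragraph[pos:match.start()]))
        parts.append(("math", match.group()))
        pos = match.end()
    if pos < len(paragraph):
        # Trailing plain text after the last math span.
        parts.append(("text", paragraph[pos:]))
    return parts

sample = r"la norme euclidienne de $y$ et $m(\theta)$ une fonction"
for kind, fragment in split_text_and_math(sample):
    print(kind, repr(fragment))
```

A real filter has to make many more decisions of this kind (commands, environments, escapes), which is where a wrong interpretation can creep in.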

First of all, in text editors that bit of text looks like this:

avec $\ell=\|y\|=(\sum_{p}y_{i}^{2})^{1/2}$ la norme euclidienne de $y$ et $m(\theta)$ une fonction de transformation des coordonnées polaires d'angles $\theta$ telle que $m(\theta)=y/\ell(y)$ et $\|m(\theta)\|=1$. $m(\theta)$ peut également s'exprimer comme suit :

That seems ok.

Assuming nothing else explains the behaviour, it looks like a bug in OmegaT.

If I create another *.tex file including only that paragraph and view it in TeXnicCenter (without running build, as it has no front matter), the paragraph looks fine, but loading that file into OmegaT shows the same gibberish. If I load that file in a text editor (Notepad++) and then save it as a text file (*.txt)

No need to "save as". Renaming it from the desktop would do the same thing.

, it looks normal in OmegaT.

When the file is renamed to .txt, you are no longer using the LaTeX parser but the Text filter. That has pros and cons. If your LaTeX file is not heavily formatted, it might be simpler to translate it as a text file. You will see more LaTeX "tags", and you might get very odd line breaks, but there is no risk of a wrong interpretation.

(but saving that file back to *.tex does not eliminate the problem).

There's no reason it should, you are just renaming your file.
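
As a rough illustration of the renaming workaround discussed above, here is a minimal Python sketch that makes a .txt copy of a .tex file so that it can be loaded with the Text filter while the original stays untouched. `copy_as_txt` is a hypothetical helper name, not part of OmegaT; the file contents are copied byte for byte, and only the extension differs.

```python
from pathlib import Path
import shutil

def copy_as_txt(tex_path):
    """Copy a .tex file to a sibling file with a .txt extension.

    The bytes are copied unchanged; only the file name differs.
    Returns the path of the new .txt file.
    """
    src = Path(tex_path)
    dst = src.with_suffix(".txt")
    shutil.copyfile(src, dst)
    return dst

# Example call (hypothetical file name):
# copy_as_txt("chapter1.tex")  # creates chapter1.txt next to the original
```

After translating, the target file would need to be renamed back to .tex by hand before it is compiled.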

I'm really not sure whether this is an OmegaT problem, or what.

It's probably a bug.
You could open a bug report on SourceForge.

Didier


 
Paul Klassen
Canada
French to English
TOPIC STARTER
Thanks Mar 8, 2013

Thank you Didier. I've filed a report. I've been a Trados user for years, but am tired of the constant drain on my wallet. OmegaT seems to be well reviewed, so I thought I'd give it a try. This is my first project on this CAT.

 

