How to remove tags from the original text
Thread poster: Oscar Rivera
Oscar Rivera
Oscar Rivera
Hungary
Local time: 22:11
English to Spanish
Dec 10, 2014

Hello everyone,

So I made the move from 2.5 to 3.1 as there were some new features the former (2.5) lacked, e.g. search and replace (CTRL+K). However, 2.5 had an "advantage" over 3.1 in that I didn't have any problems with the original text containing tags. I just started a new project and within one paragraph, one single word has been broken down into syllables with tags. This has happened throughout the text in all the paragraphs.

I would like to eliminate these tags from the broken-up words in the original document, as they make it impossible to spot words right away. This is what I get: "decidido manifestarnos y actuar en el continente, involucrando a diversos actores sociales, y buscando concientizar a la sociedad."

I went to the user's manual and I couldn't find the "Remove tags" feature/option. I learned about CodeZapper but I'd like to remove the tags directly from OmegaT.

Thanks in advance.


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 16:11
Russian to English
I suspect... Dec 10, 2014

If you tried this particular file in 2.5, you would get the same thing because, as I understand it, it has to do with the .docx format. I use CodeZapper, which works fine. Alternatively, if you don't need ANY tags in the document, use the "Remove tags" setting at Project > Properties (Ctrl+E).
Someone else may have another solution for you.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 22:11
Member (2006)
English to Afrikaans
This is what you get Dec 11, 2014

Oscar Rivera wrote:
This is what I get: "decidido manifestarnos y actuar en el continente, involucrando a diversos actores sociales, y buscando concientizar a la sociedad."


Thanks to a bug in ProZ.com's forum software that hasn't been fixed in (oh, I think) ten years, that is not what you get. What you actually get is this:

"de<t20/>ci<t21/>di<t22/>do <t23/>ma<t24/>ni<t25/>fes<t26/>tar<t27/>nos<t28/> y ac<t29/>tuar en el con<t30/>ti<t31/>nen<t32/>te, in<t33/>vo<t34/>lu<t35/>cran<t36/>do a di<t37/>ver<t38/>sos ac<t39/>to<t40/>res so<t41/>cia<t42/>les, y bus<t43/>can<t44/>do con<t45/>cien<t46/>ti<t47/>zar a la so<t48/>cie<t49/>dad."
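
To check how such a segment reads without the inline tags, a short script can strip them. This is only a sketch: it assumes every tag is a standalone tag of the form `<t20/>`, as in the sample above, and the name `strip_inline_tags` is made up for illustration. It is not how OmegaT handles tags internally.

```python
import re

# Assumption: tags are standalone, e.g. <t20/>, as in the sample segment.
TAG_PATTERN = re.compile(r"<[a-z]+\d+/>")


def strip_inline_tags(segment: str) -> str:
    """Return the segment with every standalone tag removed."""
    return TAG_PATTERN.sub("", segment)


sample = "de<t20/>ci<t21/>di<t22/>do <t23/>ma<t24/>ni<t25/>fes<t26/>tar<t27/>nos<t28/> y ac<t29/>tuar"
print(strip_inline_tags(sample))
# prints: decidido manifestarnos y actuar
```

This only helps with reading the text; the tags still have to be placed somewhere in the translation unless they are cleaned out of the source file first.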

I went to the user's manual and I couldn't find the "Remove tags" feature/option.


"Remove tags" is in the Project > Properties dialog. But that will remove all tags, not just the useless ones. Are you sure the file would look better in an earlier version of OmegaT?

Samuel


 
esperantisto
esperantisto  Identity Verified
Local time: 23:11
Member (2006)
English to Russian
SITE LOCALIZER
Some advice Dec 11, 2014

What you show makes me think that the document was produced by OCRing or converting a PDF file. Anyway, read Marc Prior’s guide on DOCX compatibility. If your document contains no particular formatting, try clearing it (select the text and, as I remember, press Shift+Ctrl+O). Unlike removing tags in the OmegaT project properties dialog, resetting the formatting preserves certain elements, represented as tags, at their respective positions. Also, try a simple trick: convert the document to RTF and back to DOCX.
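
As background on why clearing the formatting can help: tags inside words usually mean the word processor has split the text into many small runs with slightly different formatting, and once the differences are gone, neighbouring runs can be merged. The sketch below models that idea with plain Python data only. `Run`, `merge_adjacent_runs` and `clear_formatting` are hypothetical names for illustration; the code does not read or write real DOCX files and is not what any of the tools mentioned here actually do.

```python
from typing import FrozenSet, List, NamedTuple


class Run(NamedTuple):
    """Simplified model of a formatting run: some text plus its formatting."""
    text: str
    formatting: FrozenSet[str]  # empty set means no direct formatting


def merge_adjacent_runs(runs: List[Run]) -> List[Run]:
    """Merge neighbouring runs that carry identical formatting."""
    merged: List[Run] = []
    for run in runs:
        if merged and merged[-1].formatting == run.formatting:
            merged[-1] = Run(merged[-1].text + run.text, run.formatting)
        else:
            merged.append(run)
    return merged


def clear_formatting(runs: List[Run]) -> List[Run]:
    """Drop all formatting, then merge the now-identical neighbours."""
    return merge_adjacent_runs([Run(r.text, frozenset()) for r in runs])


# Made-up formatting labels standing in for tiny per-syllable differences.
word = [
    Run("de", frozenset({"spacing-1"})),
    Run("ci", frozenset({"spacing-2"})),
    Run("di", frozenset({"spacing-1"})),
    Run("do", frozenset({"spacing-3"})),
]
print(len(merge_adjacent_runs(word)))  # 4 runs: every boundary differs
print(clear_formatting(word))          # a single run containing "decidido"
```

In this model, four runs mean three boundaries inside one word, and clearing the formatting leaves a single run with nothing to tag.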


 
Oscar Rivera
Oscar Rivera
Hungary
Local time: 22:11
English to Spanish
TOPIC STARTER
Belated thanks for the prompt help Dec 15, 2014

Susan Welsh wrote:

If you tried this particular file in 2.5, you would get the same thing because, as I understand it, it has to do with the .docx format. I use CodeZapper, which works fine. Alternatively, if you don't need ANY tags in the document, use the "Remove tags" setting at Project > Properties (Ctrl+E).
Someone else may have another solution for you.



Susan, I also think it had to do with the document itself, more specifically the .docx format. Thanks for the "Remove tags" advice. It worked and solved the problem. The target document without the tags looks exactly like the original source document.

Samuel Murray wrote:

Oscar Rivera wrote:
This is what I get: "decidido manifestarnos y actuar en el continente, involucrando a diversos actores sociales, y buscando concientizar a la sociedad."


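Thanks to a bug in ProZ.com's forum software that hasn't been fixed in (oh, I think) ten years, that is not what you get. What you actually get is this:

"de<t20/>ci<t21/>di<t22/>do <t23/>ma<t24/>ni<t25/>fes<t26/>tar<t27/>nos<t28/> y ac<t29/>tuar en el con<t30/>ti<t31/>nen<t32/>te, in<t33/>vo<t34/>lu<t35/>cran<t36/>do a di<t37/>ver<t38/>sos ac<t39/>to<t40/>res so<t41/>cia<t42/>les, y bus<t43/>can<t44/>do con<t45/>cien<t46/>ti<t47/>zar a la so<t48/>cie<t49/>dad."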

I went to the user's manual and I couldn't find the "Remove tags" feature/option.


"Remove tags" is in the Project > Properties dialog. But that will remove all tags, not just the useless ones. Are you sure the file would look better in an earlier version of OmegaT?

Samuel




Samuel, you're right. I am not sure if it'd look better in an earlier OmegaT version and the "remove tags" worked really well. Fortunately, the document hasn't been altered.

esperantisto wrote:

What you show makes me think that the document was produced by OCRing or converting a PDF file. Anyway, read Marc Prior’s guide on DOCX compatibility. If your document contains no particular formatting, try clearing it (select text and, as I remember, Shift+Ctrl+O). Unlike removing tags in the OmegaT project properties menu, resetting the formatting will save certain elements represented as tags at their respective positions. Also, try a simple trick: convert the document to RTF and back to DOCX.


Esperantisto, thanks for the advice. I'd seen the document before but I'd never gotten round to reading it.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 22:11
Member (2006)
English to Afrikaans
Remember, though, "Remove tags" doesn't actually remove tags Dec 15, 2014

Oscar Rivera wrote:
Thanks for the "remove tag" advice. It worked and solved the problem. The target document without the tags looks exactly as the original source document.


FWIW, as Marc mentioned in another thread, the "Remove tags" feature doesn't actually remove any tags -- it simply shoves all the tags of the segment to the end of the segment, without letting the translator see them. So, theoretically, if there is any loss of formatting, it should not affect more than just the one sentence that the formatting is in.
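
The behaviour described here can be pictured with a small sketch. It assumes standalone tags of the form `<t20/>` and uses a hypothetical helper, `move_tags_to_end`; it illustrates the description above and is not taken from OmegaT's source code.

```python
import re

# Assumption: tags are standalone, e.g. <t48/>.
TAG_PATTERN = re.compile(r"<[a-z]+\d+/>")


def move_tags_to_end(segment: str) -> str:
    """Return the visible text followed by all tags in their original order."""
    tags = TAG_PATTERN.findall(segment)
    text = TAG_PATTERN.sub("", segment)
    return text + "".join(tags)


print(move_tags_to_end("so<t48/>cie<t49/>dad."))
# prints: sociedad.<t48/><t49/>
```

Because every tag stays within its own segment, any formatting that shifts position can only shift within that one segment.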


 
Oscar Rivera
Oscar Rivera
Hungary
Local time: 22:11
English to Spanish
TOPIC STARTER
The tags at the end haven't altered the document so far. Dec 15, 2014

Samuel Murray wrote:

Oscar Rivera wrote:
Thanks for the "remove tag" advice. It worked and solved the problem. The target document without the tags looks exactly as the original source document.


FWIW, as Marc mentioned in another thread, the "Remove tags" feature doesn't actually remove any tags -- it simply shoves all the tags of the segment to the end of the segment, without letting the translator see them. So, theoretically, if there is any loss of formatting, it should not affect more than just the one sentence that the formatting is in.


Thanks once again, Samuel. Although the tags were pushed to the end of the segment, so far this hasn't altered the formatting.


 
Gabriel Torem
Gabriel Torem  Identity Verified
Argentina
Local time: 17:11
English to Spanish
Unwanted Tags in OmegaT: Aug 20, 2021

I found that using .odt files instead of .docx files automatically removes all unwanted tags in the middle of words.

Oscar Rivera wrote:

Hello everyone,

So I made the move from 2.5 to 3.1 as there were some new features the former (2.5) lacked, e.g. search and replace (CTRL+K). However, 2.5 had an "advantage" over 3.1 in that I didn't have any problems with the original text containing tags. I just started a new project and within one paragraph, one single word has been broken down into syllables with tags. This has happened throughout the text in all the paragraphs.

I would like to eliminate these tags from the broken-up words in the original document, as they make it impossible to spot words right away. This is what I get: "decidido manifestarnos y actuar en el continente, involucrando a diversos actores sociales, y buscando concientizar a la sociedad."

I went to the user's manual and I couldn't find the "Remove tags" feature/option. I learned about CodeZapper but I'd like to remove the tags directly from OmegaT.

Thanks in advance.


 

