Inserting tags on version 4.2.0
Thread poster: Alvaro Pavié
Alvaro Pavié
Alvaro Pavié
Chile
Local time: 23:48
English to Spanish
+ ...
May 31, 2019

Greetings!

I began using this CAT software about a month ago and so far I'm pretty happy with it. However, I'm having issues using tags on the latest version, since I have to remove most of them before working on a document (due to PDF to .docx conversion). How do I insert tags and make sure they will work once the translated document is created?

Thanks!


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 23:48
Russian to English
+ ...
You don't insert tags Jun 3, 2019

You don't insert tags in the target document; they have to be there in the source document, and then you repeat the same tags in the target document. If I understand you correctly, you have stripped out the formatting in your .docx when you converted the PDF. I'm not sure why you did that, but you have to format the source document the way you want it to look before you translate it, or you have to format the target document in Word (or whatever application) when you are finished with OmegaT.
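
To make the "repeat the same tags" rule concrete, here is a minimal Python sketch that compares the tags in a source segment with those in its translation. It is only an illustration, not how OmegaT validates tags internally. The tag shape (`<f0>`, `</f0>`, `<t1/>`) and the function names `extract_tags` and `missing_and_extra_tags` are assumptions made for this example.

```python
import re
from collections import Counter

# Assumed tag shape: <f0>, </f0>, <t1/> (letters followed by digits).
# Adjust the pattern if the tags in your project look different.
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")


def extract_tags(segment: str) -> list[str]:
    """Return the tags of a segment in the order they appear."""
    return TAG_PATTERN.findall(segment)


def missing_and_extra_tags(source: str, target: str) -> tuple[list[str], list[str]]:
    """Return (tags missing from the target, tags found only in the target)."""
    src = Counter(extract_tags(source))
    tgt = Counter(extract_tags(target))
    missing = sorted((src - tgt).elements())
    extra = sorted((tgt - src).elements())
    return missing, extra


source = "The word <f0>zeitgeist</f0> is German."
target = "La palabra <f0>zeitgeist es alemana."  # closing tag forgotten
print(missing_and_extra_tags(source, target))  # → (['</f0>'], [])
```

A tag reported as missing is one that still has to be repeated in the target segment before the translated document is created.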


[Edited at 2019-06-03 12:02 GMT]


 
Alvaro Pavié
Alvaro Pavié
Chile
Local time: 23:48
English to Spanish
+ ...
TOPIC STARTER
An easier way to format words? Jun 3, 2019

Susan Welsh wrote:

You don't insert tags in the target document; they have to be there in the source document, and then you repeat the same tags in the target document. If I understand you correctly, you have stripped out the formatting in your .docx when you converted the PDF. I'm not sure why you did that, but you have to format the source document the way you want it to look before you translate it, or you have to format the target document in Word (or whatever application) when you are finished with OmegaT.


[Edited at 2019-06-03 12:02 GMT]


I used TransTools to remove the tags from the .docx file (didn't actually receive the original PDFs, the documents were already converted when I got them) before using OmegaT, because otherwise I wouldn't have been able to translate due to the sheer amount of tags on the source segments.

So, if I want to italicize a foreign word on the target document, I have to do it manually after finishing the translation? Is there an easier way to do that?


 
Milan Condak
Milan Condak  Identity Verified
Local time: 05:48
English to Czech
Tagwipe and hide tags Jun 3, 2019

Alvaro Pavié wrote:

Greetings!

I began using this CAT software about a month ago and so far I'm pretty happy with it. However, I'm having issues using tags on the latest version, since I have to remove most of them before working on a document (due to PDF to .docx conversion). How do I insert tags and make sure they will work once the translated document is created?

Thanks!


Why? Too many tags! It is recommended to first reduce the tags with the Tagwipe script.

Tags diminishing machine translation results

http://www.condak.cz/nove/2019-03/31/en/00.html

Formatting

After this pretranslation you need to switch tags back and add them to translated texts.

How?

Ctrl+T; in some cases all tags can be added at once with Ctrl+Shift+T.
--
Hide the tags if you need to, and then add them back into the target segment.

Milan


 
Alvaro Pavié
Alvaro Pavié
Chile
Local time: 23:48
English to Spanish
+ ...
TOPIC STARTER
Is TransTools bad? Jun 3, 2019

Milan Condak wrote:

Why? Too many tags! It is recommended to first reduce the tags with the Tagwipe script.

Tags diminishing machine translation results

http://www.condak.cz/nove/2019-03/31/en/00.html

Formatting

After this pretranslation you need to switch tags back and add them to translated texts.

How?

Ctrl+T; in some cases all tags can be added at once with Ctrl+Shift+T.
--
Hide the tags if you need to, and then add them back into the target segment.

Milan


So, it isn't a good idea to use external methods of tag removal such as TransTools?


 
Milan Condak
Milan Condak  Identity Verified
Local time: 05:48
English to Czech
TransTools is a good tool, too Jun 4, 2019

Alvaro Pavié wrote:

Milan Condak wrote:

After this pretranslation you need to switch tags back and add them to translated texts.

How?

Ctrl+T; in some cases all tags can be added at once with Ctrl+Shift+T.
--
Hide the tags if you need to, and then add them back into the target segment.

Milan


So, it isn't a good idea to use external methods of tag removal such as TransTools?


Alvaro,

if you have it installed in MS Word, you can use it and see the result for yourself. I have been using DGT-OmegaT and DGT-Wizard. On my notebook with OmegaT (DGT-OmegaT) I have MS Word Start, in which Word plugins (Wordfast, PlusTools or TransTools) do not work.
OmegaT 4.2.0 includes the script "tagwipe.groovy".

Milan


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 23:48
Russian to English
+ ...
Codezapper Jun 4, 2019

I use Codezapper to clean up all .docx files before translation. I don't quite understand your problem, Alvaro, or why Milan says you have to "switch tags back and add them to translated texts." Codezapper (and I believe TransTools too, which I have used occasionally) gets rid of the junk tags, but not the real formatting tags, so there is nothing to "put back in."

If you want to send me your converted PDF for a test, feel free.

Keep in mind that you can also ask questions on the Yahoo list, which gets more traffic than here: https://groups.yahoo.com/neo/groups/OmegaT/info


 
Milan Condak
Milan Condak  Identity Verified
Local time: 05:48
English to Czech
Without Tagwipe, Codezapper Jun 4, 2019

Susan Welsh wrote:

or why Milan says you have to "switch tags back and add them to translated texts."



The two chapters of the OmegaT manual docs:

1. In installed OmegaT/docs/en/chapter.project.properties.html

"Remove Tags
When enabled, all the formatting tags are removed from source segments. This is especially useful when dealing with texts where inline formatting is not really useful (e.g., OCRed PDF, bad converted .odt or .docx, etc.)

In a normal case it should always be possible to open the target documents, as only inline tags are removed. Non-visible formatting (i.e., which doesn't appear as tags in the OmegaT editor) is retained in target documents."

2. In installed OmegaT/docs/en/chapter.project.properties.html

"Remove Tags
When enabled, all the formatting tags are removed from source segments. This is especially useful when dealing with texts where inline formatting is not really useful (e.g., OCRed PDF, bad converted .odt or .docx, etc.) In a normal case it should always be possible to open the target documents, as only inline tags are removed. Non-visible formatting (i.e., which doesn't appear as tags in the OmegaT editor) is retained in target documents."
---
My link:

http://www.condak.cz/nove/2019-03/31/en/00.html

shows a DOCX with an extremely large number of tags. After enabling "Remove tags" in Project properties, the tags are hidden, so MT or TMX can give hits or matches.

After pretranslation you need to "switch tags back" (wrong text), that is, uncheck the "Remove tags" box in Project properties, and then place the tags back in their proper positions in the target field of the open segment with Ctrl+T.

Milan


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 23:48
Russian to English
+ ...
to Milan Jun 4, 2019

Now you are talking about "Remove tags" within OmegaT, which indeed removes all tags -- but it is not the same thing as using Tagwipe or TransTools or Codezapper, which don't. If I understand you correctly. Anyway, hopefully Alvaro has enough information now to solve his problem.

 
Alvaro Pavié
Alvaro Pavié
Chile
Local time: 23:48
English to Spanish
+ ...
TOPIC STARTER
Using Tagwipe deleted the entire format of my source document! Jun 25, 2019

Susan Welsh wrote:

Now you are talking about "Remove tags" within OmegaT, which indeed removes all tags -- but it is not the same thing as using Tagwipe or TransTools or Codezapper, which don't. If I understand you correctly. Anyway, hopefully Alvaro has enough information now to solve his problem.


Well, I ended up trying Tagwipe (after first preparing the document since, once again, it was a Word file created from a PDF). After some attempts at learning how to run the script and its different levels of wiping (I went gradually from 1 to 8), I managed to remove the tags and marvelled at the sight of segments which were now tag-free. Then I proceeded to translate the document, and I'm almost finished.

Before going to bed and calling it a day, I went to the source folder of my project... Only to find that the document's format was completely gone! Wasn't the point of Tagwipe to remove only the junk tags without altering the original format of the file? I have no idea what to do now, everything is a mess and I don't know if I'll have the time to rearrange the whole thing manually. What can I do?


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 23:48
Russian to English
+ ...
Tagwipe Jun 25, 2019

Send Milan a private email via his profile, since he apparently has not seen this post. I've never used Tagwipe, but I find this very strange indeed.
Don't you have a copy of your source file, maybe in an email from your client? If you have the properly formatted source file, send it to me and I will put it through Codezapper, which for sure only removes junk tags.
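
For next time, one cheap safeguard is to copy the source file somewhere safe before running any tag-cleanup tool on it, since these tools modify the file in place. The following is a minimal Python sketch of that idea. The function name `backup_source`, the default folder name `source_backup` and the example file name are all hypothetical; none of them come from OmegaT, Tagwipe or Codezapper.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_source(path: str, backup_dir: str = "source_backup") -> Path:
    """Copy a source file into backup_dir with a timestamp in its name.

    Both the folder name and the naming scheme are arbitrary choices
    made for this example.
    """
    src = Path(path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}_{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # copy2 also keeps the file's timestamps
    return dest


# Example call (hypothetical file name), run before any tag cleanup:
# backup_source("source/article.docx")
```

If the cleanup removes more than expected, the untouched copy can simply be put back into the project's source folder.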



[Edited at 2019-06-25 12:49 GMT]


 
Alvaro Pavié
Alvaro Pavié
Chile
Local time: 23:48
English to Spanish
+ ...
TOPIC STARTER
I have the original file the client sent me. Jun 25, 2019

Susan Welsh wrote:

Send Milan a private email via his profile, since he apparently has not seen this post. I've never used Tagwipe, but I find this very strange indeed.
Don't you have a copy of your source file, maybe in an email from your client? If you have the properly formatted source file, send it to me and I will put it through Codezapper, which for sure only removes junk tags.



[Edited at 2019-06-25 12:49 GMT]


I do have a copy. However, what about confidentiality? Granted, it's just an article extracted from a management magazine regarding winning strategies for organizations, but is it ethical for me to send you the document without my client's consent? Wouldn't it be preferable for you to send me the Codezapper software? I'm aware that an email must be sent to its creator in order to obtain it, but I don't have enough time to wait for it. Anyways, I've also asked on the OmegaT Yahoo Group for answers.

I will try and email Milan, but I'm not sure what that will accomplish.


 
Milan Condak
Milan Condak  Identity Verified
Local time: 05:48
English to Czech
Changes in source document Jun 25, 2019

Alvaro Pavié wrote:

Before going to bed and calling it a day, I went to the source folder of my project... Only to find that the document's format was completely gone! Wasn't the point of Tagwipe to remove only the junk tags without altering the original format of the file?


Tools like Codezapper, Tagwipe and TransTools remove tags, which means they change the source DOCX.
After reducing the tags, the user has to open the source DOCX and check the layout of text and graphics.

Important tags remain in the DOCX and are visible in OmegaT. The user can hide them with the "Remove tags" feature in OmegaT. This feature does not change the source DOCX, and after switching "Remove tags" off in OmegaT, the user can insert the tags into the translation in the target segment with Ctrl+T. OmegaT takes the tag types and their order from the source DOCX.

If someone removes all tags from the source DOCX, what remains in the DOCX is plain text, and the target will look the same as the source.
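
The difference between reducing junk tags and removing all tags can be sketched with a toy model in Python. This is not the actual algorithm of Tagwipe, Codezapper or TransTools, and it does not read a real DOCX: the `Run` class and both functions are hypothetical and only model a paragraph as a list of text runs, each with a set of formatting properties.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Run:
    """Toy model of a text run: some text plus its formatting properties."""
    text: str
    fmt: frozenset


def merge_adjacent_runs(runs):
    """Merge neighbouring runs with identical formatting (the "junk tag" case)."""
    merged = []
    for fmt, group in groupby(runs, key=lambda r: r.fmt):
        merged.append(Run("".join(r.text for r in group), fmt))
    return merged


def strip_all_formatting(runs):
    """Collapse everything into one unformatted run: plain text only."""
    return [Run("".join(r.text for r in runs), frozenset())]


plain = frozenset()
italic = frozenset({"italic"})
runs = [
    Run("Win", plain), Run("ning ", plain),      # needless split after conversion
    Run("kaizen", italic),                       # real formatting: italic foreign word
    Run(" strat", plain), Run("egies", plain),   # another needless split
]

for run in merge_adjacent_runs(runs):
    print(repr(run.text), sorted(run.fmt))
```

In this model, merging takes the paragraph from five runs to three and the italic word survives, whereas `strip_all_formatting` leaves a single plain run, which corresponds to the "simple text" case described above.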

Milan


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 05:48
Member (2006)
English to Afrikaans
+ ...
@Alvaro Jun 26, 2019

Alvaro Pavié wrote:
I'm having issues using tags on the latest version, since I have to remove most of them before working on a document (due to PDF to .docx conversion). How do I insert tags and make sure they will work once the translated document is created?


I have followed the discussion here and in the OmegaT e-mail group, but allow me to add my opinion on this: as you know, OmegaT does not allow you to add character formatting during the translation process, so you have to add the formatting afterwards (in the final file in e.g. MS Word) or you have to add it beforehand (i.e. before you start the translation, you open the file in e.g. MS Word and prepare it with all the required formatting, and THEN you start the translation in OmegaT).

When I need to translate PDF files, I first convert the PDF file to plain text, then I recreate the layout manually in MS Word, and then I paste the plain text into the MS Word file, and change the character formatting so that the MS Word file looks more or less like the original PDF (inasmuch as it is necessary), and only then... only then do I start the translation in my CAT tool.

I know that some CAT tools allow you to add simple character formatting (bold, etc.) in the CAT tool itself, but OmegaT doesn't. But even in such CAT tools, it is a good idea to "prepare" the file before loading it into the CAT tool, because only the most basic formatting can be fixed inside the CAT tool.


[Edited at 2019-06-26 07:07 GMT]


 
Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 23:48
Russian to English
+ ...
confidentiality Jun 26, 2019

I just received your post, out of sequence and a day late, for some reason.
On confidentiality, that's up to you - I don't know what NDA you signed, although I don't see how a published magazine article could be considered confidential.
Codezapper is not free (although it's cheap). If you want a copy, you should order it from its creator.


 