What is the best way to translate a PDF document?
Thread poster: André Moreira

André Moreira  Identity Verified
Austria
Local time: 16:07
Member (Jul 2019)
English to Portuguese
+ ...
Oct 18

Greetings,
I realize PDF documents can cause more trouble to finish on the later stages of structure and format editing since the exported text often, if not always, comes with imperfections and a lack of precision in terms of paragraph positioning, fonts, image positioning, lines, etc.
What is the best way to avoid these problems, if there is one?
What is the best way to translate a PDF?
Thanks.

[Edited at 2019-10-18 22:16 GMT]


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 18:07
English to Russian
Tidy up the converted file Oct 18

1. OCR the PDF file.
2. Copy everything and paste it into Notepad.
3. Copy everything in Notepad and paste it back into MS Word.
4. Apply formatting and insert all pictures as per the original layout.
Steps 2 and 3 relieve you of the formatting-related headache (tag soup).
That's it.
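
For anyone who would rather script steps 2 and 3 than go through Notepad, here is a minimal sketch. It assumes the OCR output has already been saved as plain text; the function name `tidy_ocr_text` is made up for illustration and is not part of any OCR or CAT tool. It joins hard line breaks inside paragraphs and glues together words that were hyphenated at a line end, so the text segments better in a CAT tool.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Return plain text with one line per paragraph (hypothetical helper)."""
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at a line end: "for-\nmat" -> "format".
    # Caveat: this also joins genuine compounds such as "well-\nknown",
    # so the result still needs a human check.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines are treated as paragraph boundaries.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for para in paragraphs:
        # Collapse single line breaks and runs of spaces into one space.
        flat = " ".join(para.split())
        if flat:
            cleaned.append(flat)
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = "Trans-\nlation of a PDF\nfile is hard.\n\nSecond   para-\ngraph here.\n"
    print(tidy_ocr_text(sample))
```

Step 4 (reapplying formatting and pictures) still has to be done by hand in Word.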


 

Vadim Kadyrov  Identity Verified
Ukraine
Local time: 17:07
Member (2011)
English to Russian
+ ...
The best way Oct 19

is to never touch a PDF file for translation, since this format was never intended to be edited/translated/manipulated in any way. The format was invented to provide documents, information 'as is' - for reference, etc.

Still, the only way is to try to OCR a pdf file and then use a CAT tool to translate the corresponding doc file.
Of course, you will have to adjust formatting - for an extra fee, of course.

But this is a workaround, an imperfect way to do the impossible - i.e. try to translate something that was never intended to be translated, only be given as a reference.


 

Achmad Fuad Lubis  Identity Verified
Indonesia
Local time: 22:07
Member (2015)
English to Indonesian
+ ...
Using the Adobe Acrobat Distiller XI to convert a PDF document to other formats for translation Oct 21

Try this link for help: https://helpx.adobe.com/acrobat/11/using/exporting-pdfs-file-formats.html

 

Hamish Young  Identity Verified
New Zealand
Local time: 04:07
Member (2010)
Chinese to English
Use OCR software Oct 22

Almost everything I work on comes in PDF format, and I like it that way. However, clients seldom expect the format of the target text to match the source exactly. I would be checking first to see if this is actually a requirement of the job. If so, it is quite normal to add on an extra charge to cover time spent on formatting. Particularly if the PDF file contains unformattable text that you find difficult to work with, you should run the file through OCR software to extract the text, and this will also help with formatting, since many OCRs can reproduce the format of a PDF almost perfectly in Word. A good OCR tool is more useful than a CAT tool for PDF files.


 

Dan Lucas  Identity Verified
United Kingdom
Local time: 15:07
Member (2014)
Japanese to English
No need to choose one or the other Oct 22

Hamish Young wrote:
A good OCR tool is more useful than a CAT tool for PDF files.

I think I understand your sentiment, but they are not mutually exclusive.

If I use FineReader to perform OCR on a document, that doesn't mean that I cannot or should not use a CAT tool on the resulting (typically Word) file. They are equally useful.

Sometimes I cannot obtain a readable document for my CAT tool without OCR, but by the same token I wouldn't attempt a document of any size without my CAT tool even if a readable version of the source file were available.

Regards,
Dan


 

Samuel Murray  Identity Verified
Netherlands
Local time: 16:07
Member (2006)
English to Afrikaans
+ ...
Assume the worst, hope for the best Oct 23

André Moreira wrote:
I realize PDF documents can cause more trouble to finish on the later stages of structure and format editing since the exported text often comes with imperfections and lack of precision in terms of paragraph positioning, fonts, image positioning, lines etc. What is the best way to avoid these problems if there is any?


If you intend to recreate the entire file manually, then it's a matter of preference whether you want to recreate it in the source language first, and then translate it, or first translate it, and then recreate it in the target language.

If you intend to use e.g. OCR and simply fix the formatting errors, then I recommend that you fix those errors before you start translating, to ensure that the text is suitable for translating in a CAT tool. If you choose to "finish on the later stages of structure and format editing" only after the translation is done, then (a) the text won't be CAT tool friendly while you translate it and (b) it would be difficult to schedule your time since you can't predict how long it is going to take to fix those errors. If you fix errors before you start translating, then it's easier to determine halfway through the project how long it will take for the project to be completed.
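
To illustrate the scheduling point: once the formatting fixes are out of the way, the remaining time can be projected from the translation pace so far. The sketch below is only a straight-line estimate; the function name and the figures are invented for the example, and it assumes the rest of the text is about as hard as the part already done.

```python
def hours_remaining(words_total: int, words_done: int, hours_spent: float) -> float:
    """Project the hours still needed at the pace achieved so far."""
    if words_done <= 0 or hours_spent <= 0:
        raise ValueError("need some completed work to estimate a pace")
    pace = words_done / hours_spent  # words per hour so far
    return (words_total - words_done) / pace


if __name__ == "__main__":
    # Halfway through a hypothetical 6,000-word job after 5 hours of translating.
    print(hours_remaining(6000, 3000, 5.0))
```

If formatting repairs are left to the end, there is no comparable pace figure to project from, which is why the estimate becomes unreliable.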

When you get a PDF file, you should assume the worst: assume that you'll need to recreate everything from scratch. But then try the various ways to speed up the process, e.g. if you have OCR or PDF conversion tools, try them out to see if they produce usable results.

If the text can't be copied (i.e. it is non-editable), then sometimes you can OCR it, but sometimes it's faster to just type it or give it to a typist to type for you. Often, typists are also able to take care of much of the formatting as part of the typing service.
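
One rough way to check whether the text "can be copied" before deciding between OCR and typing is to look at how much text each page actually yields. The sketch below assumes you have already pulled the text out of each page with whatever PDF tool you use and have it as a list of strings; the function name and the 20-character threshold are arbitrary choices for illustration, not a standard.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 1-based numbers of pages whose extracted text is (almost) empty."""
    flagged = []
    for number, text in enumerate(page_texts, start=1):
        # Ignore whitespace; a scanned page usually yields nothing at all.
        visible = "".join((text or "").split())
        if len(visible) < min_chars:
            flagged.append(number)
    return flagged


if __name__ == "__main__":
    pages = [
        "This page has a proper text layer with plenty of words.",
        "",        # scanned image, no text layer
        "p. 3",    # only a page number was extracted
    ]
    print(pages_needing_ocr(pages))
```

Pages that get flagged are candidates for OCR or retyping; the others can usually go straight into conversion.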


 

