Per page pricing for scanned image PDF?
Thread poster: Nigel Lindup
Nigel Lindup
Switzerland
Local time: 13:33
French to English
+ ...
Aug 18, 2020

Dear colleagues,

Can anyone give me any guidance on how to quote for an 11-page Sp-Eng job from Venezuela, comprising a scanned image PDF, with no word count feasible (well I suppose one CAN always do a count!) and very fiddly, with several legal/banking/notary's forms, stamps, and certifications, and a formal contract? It looks as though it's for use in a court case, but I don't know.

It was suggested that I might give a price per page but being new to commercial translation I have no idea how to pitch it. Any advice would be very welcome.

Thanks,
Nigel


 
Kevin Fulton  Identity Verified
United States
Local time: 07:33
German to English
OCR to get source word count Aug 18, 2020

Most clients want to know upfront what a translation will cost and are unhappy with invoices based on a target word count.
Many translators use an optical character recognition (OCR) program to read scanned texts, to get a document that can be processed using a CAT tool, a word count for quotation purposes, or both.

Prior to discovering OCR, I used a crude method to estimate the source word count in a document. I'd decide on a unit of measure (in the US, an inch, but 3 cm or any other measure can work). I'd count the number of words in one inch of running text in the most densely-worded paragraph (e.g. 50), then multiply that by the number of inches of text on a page (e.g. 6), yielding approximately 300 words. On pages with less dense, "fiddly" text with numbers, bank information, seals, etc. I'd charge for half an average page. Then I'd add a "fudge factor" of 10–15%. I almost always came in under my estimate, and would generally split the difference with the customer if it were a significant amount.
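
The arithmetic of this rough estimate can be sketched in a few lines of Python. This is only an illustration: the function name, the split into 7 dense and 4 "fiddly" pages, and the choice of a 10% fudge factor are assumptions, while the 50 words per inch and 6 inches per page come from the example above.

```python
def estimate_source_words(words_per_inch, inches_per_page, dense_pages,
                          fiddly_pages, fudge_factor=0.10):
    """Rough source word count for a scanned document, for quoting purposes."""
    # Words on one densely worded page, e.g. 50 words/inch * 6 inches = 300.
    words_per_page = words_per_inch * inches_per_page
    # Fiddly pages (numbers, bank details, seals) count as half an average page.
    base = words_per_page * (dense_pages + 0.5 * fiddly_pages)
    # Add the safety margin (10-15% in the post) and round to whole words.
    return round(base * (1 + fudge_factor))


# Hypothetical 11-page job: 7 dense pages and 4 fiddly ones.
estimate = estimate_source_words(50, 6, dense_pages=7, fiddly_pages=4,
                                 fudge_factor=0.10)
print(estimate)
```

Multiplying the result by a per-word rate then gives a figure that can be quoted upfront.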


Tina Vonhof (X)
Canada
Local time: 05:33
Dutch to English
+ ...
Alternative Aug 18, 2020

I agree with Kevin's suggestion of using a conversion tool, but in this case, with many forms, stamps, etc., you could also consider a per-page charge. Estimate the time you would need and allow more time for the complicated pages than for the ones with regular text. With the complicated pages, it is the formatting that often takes the most time, and a word count does not allow for that. I usually explain that to the client up front - why some pages cost more than others.

Regardless of how you decide to do it in this case, it will be worth your while to be able to work with conversion tools. You may need to try a few to see which one you like best.
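
A time-based per-page charge could be worked out along these lines. This is a minimal sketch, not anyone's actual price list: the minutes per page and the hourly rate of 40 are purely hypothetical figures, and the function name is made up for the example.

```python
def per_page_quote(minutes_per_page, hourly_rate):
    """Price each page by its estimated working time; return (prices, total)."""
    prices = [round(minutes / 60 * hourly_rate, 2) for minutes in minutes_per_page]
    return prices, round(sum(prices), 2)


# Hypothetical estimate: plain text pages at 30 minutes each,
# form-heavy pages with stamps and tables at 60 minutes each.
minutes = [30, 30, 60, 60, 30]
prices, total = per_page_quote(minutes, hourly_rate=40)
print(prices, total)
```

Listing the price of each page separately makes it easy to show the client why some pages cost more than others.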


Tom in London
United Kingdom
Local time: 12:33
Member (2008)
Italian to English
Tell them upfront. Aug 18, 2020

Tell them you can't work with PDFs and ask them to kindly provide you with a conversion to .docx so that you can give them an estimate. Otherwise this project really isn't worth your while because even with a good (i.e. expensive) OCR conversion tool the results will be chaotic - and you will be blamed if your translation is equally chaotic. You might end up spending hours just trying, probably unsuccessfully, to correct the formatting - and that is not what you do. You're a translator - not a fixer of other people's messes! Trust me; I've been there and I know what those documents are like. After a number of very bad experiences (and a serious falling-out with one of my regular clients) I never accept those projects any more.




[Edited at 2020-08-18 19:05 GMT]


Dan Lucas  Identity Verified
United Kingdom
Local time: 12:33
Member (2014)
Japanese to English
Tread cautiously Aug 18, 2020

Tom in London wrote:
You might end up spending hours just trying, probably unsuccessfully, to correct the formatting - and that is not what you do.

Tom has the right of it. Probably better to avoid. It's hard to judge these things unless you have experience, and the process of acquiring sufficient useful experience is usually rather painful.

Still, I would start by throwing a relatively well-formed PDF into some OCR software to see what the result is like. If it's not too bad, and the job is appealing enough - in economic terms, say it's into four figures on an interesting subject - then I will quote for the translation and the layout/DTP separately. "This many characters, so this much for the translation, and 5 hours to fix the layout at 50 euro an hour" or something like that.

For a complex layout (stamps?) it may well be quicker to start entirely from scratch, but if you use CAT tools that still doesn't get you a machine-readable source text, so you'll have to work without CAT, try to use the OCR output, or type it out (!). If I really don't want the job but the client is keen for me to go ahead then I may quote a very steep price to either put them off or, if they accept it, ensure that it is lucrative.
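
An itemised quote of the kind described above, with translation and layout priced separately, might be computed as follows. The 5 hours at 50 euro an hour are the figures from the post; the character count, the per-character rate, and the function name are hypothetical.

```python
def itemised_quote(source_chars, rate_per_char, layout_hours, layout_rate_per_hour):
    """Quote the translation and the layout/DTP work as separate line items."""
    translation = round(source_chars * rate_per_char, 2)
    layout = round(layout_hours * layout_rate_per_hour, 2)
    return {
        "translation": translation,
        "layout": layout,
        "total": round(translation + layout, 2),
    }


# Hypothetical job: 20,000 source characters at 0.05 euro per character,
# plus 5 hours of layout work at 50 euro an hour.
quote = itemised_quote(source_chars=20000, rate_per_char=0.05,
                       layout_hours=5, layout_rate_per_hour=50)
print(quote)
```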

Regards,
Dan


jyuan_us  Identity Verified
United States
Local time: 07:33
Member (2005)
English to Chinese
+ ...
If you are translating a page of scanned PDF that has a few tables on it Aug 18, 2020

It could take you two or three times as long as translating a page with the same word count but no tables.

WS McCallum
New Zealand
Local time: 23:33
French to English
Quick manual count Aug 19, 2020

The quick way of doing this is to take the most word-packed line on each page as a sample line, count the words in that line, and then count the number of lines on that page, mentally adding up half lines, part lines, bits in stamps etc. together so that they equal a full line.

Sample line word count x number of "full" lines on the page = page word count.

Then repeat the process for each page.

PS: If in doubt, round up, and don't forget the expansion factor for your target language if that is the count you are quoting.

Using this method, it should take you less than 15 minutes to do a quote on that PDF.
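
For anyone who prefers to let the computer do the adding up, the method can be sketched in Python. The page figures and the expansion factor of 1.25 below are made up for illustration; rounding up to a whole word follows the "if in doubt, round up" advice.

```python
import math


def page_word_count(sample_line_words, full_lines):
    """Sample line word count x number of "full" lines = page word count."""
    return sample_line_words * full_lines


def document_word_count(pages, expansion_factor=1.0):
    """Sum the per-page counts, apply the expansion factor, and round up."""
    source_total = sum(page_word_count(words, lines) for words, lines in pages)
    return math.ceil(source_total * expansion_factor)


# Hypothetical three-page sample: (words in the sample line, "full" lines).
pages = [(12, 30), (10, 18), (14, 25)]
source_words = document_word_count(pages)
# Only apply an expansion factor when quoting on the target count.
target_words = document_word_count(pages, expansion_factor=1.25)
print(source_words, target_words)
```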



[Edited at 2020-08-19 08:59 GMT]

[Edited at 2020-08-19 09:22 GMT]


 
Tom in London
United Kingdom
Local time: 12:33
Member (2008)
Italian to English
Yes but Aug 19, 2020

WS McCallum wrote:

.....Using this method, it should take you less than 15 minutes to do a quote on that pdf.



Yes, but it is important that you should include in your estimate, as an extra, the (very considerable) charge for converting the PDF to Word (including any charge for purchasing new software), and for reformatting the text in the target language; also specifying the additional time this would require, i.e. a longer deadline for delivery of the translation.

If you do not include this you may find yourself accepting a job that will take you twice or three times as long as you thought (and a lot of grief trying to reproduce the appearance of the source text including different fonts you may have to look for and install, working with skewed official stamps, handwritten signatures, etc.)



[Edited at 2020-08-19 10:07 GMT]


WS McCallum
New Zealand
Local time: 23:33
French to English
Swings and roundabouts Aug 20, 2020

Having been doing translations since a time when documents were delivered on paper and you had to type them out, I don't consider this to be an additional burden.

It's just base-line translation: you read the text as you go and type it out on-screen. It's actually a lot quicker than mucking around with scanning each page with OCR software and trying to sort out the out-of-order gobbledygook generated.

If you are translating in the legal or medical sector, receiving these sorts of files is par for the course.


 
Nigel Lindup
Switzerland
Local time: 13:33
French to English
+ ...
TOPIC STARTER
Thanks! Here's another tip.... Aug 20, 2020

Many thanks to all for your advice and tips, which I will try to put to good use.

Here's another tip, from an Arabic translator colleague:

"...there is a very easy, though relatively unknown, solution [often used by text-processing units]. Upload your document to Google Drive, then open it with Google Docs from the Drive window on your browser. The resulting Google Docs file will contain both the original image and the converted text. You can simply copy-paste the text into Word and have your word count.

"The Google Docs OCR is very effective, rendering near perfect results, even with the Arabic script that is notorious for being un-OCR-able."

With my job it was by no means "near perfect" owing to the higgledy-piggledy presentation, but all the text was in fact there, albeit requiring a good deal of reformatting. Another colleague tried it with a straight Russian image PDF and was very impressed.


DZiW (X)
Ukraine
English to Russian
+ ...
NDA Aug 20, 2020

Nigel, using online solutions (even free ones like Wordfast Anywhere) usually breaks the non-disclosure agreement, and some elements still come out distorted or improperly rendered, which only makes the layout worse. I would go for a specialized tool like FineReader or one of those bundled with an OEM driver. Then again, recreating the document from scratch is also OK.

Besides, translation is about meaning (via words), so one shouldn't translate signatures/visas unless instructed to do so in accordance with the law.

So either negotiate it with the client or consider a bulk/fixed price, where a certificate costs some $30 per page no matter how many words it has or how much extra fuss or DTP it involves. Simply substantiate your rate so that your client can understand and accept it.

However, always know the market (your rivals) and remember your 'absolute bottom', the point at which you are ready to say a polite 'Goodbye' and leave.


A service, especially post-"quarantine", is rather cheap, whereas a turnkey solution as a product is quite valuable and expensive.


 
Tom in London
United Kingdom
Local time: 12:33
Member (2008)
Italian to English
Yes.. Aug 20, 2020

WS McCallum wrote:

Having been doing translations since a time when documents were delivered on paper and you had to type them out, I don't consider this to be an additional burden.

It's just base-line translation: you read the text as you go and type it out on-screen. It's actually a lot quicker than mucking around with scanning each page with OCR software and trying to sort out the out-of-order gobbledygook generated.

If you are translating in the legal or medical sector, receiving these sorts of files is par for the course.


Yes; I have occasionally circumvented this problem by using my dictation software to rewrite the whole document, in English, directly into a Word file. This, of course, does not preserve the original formatting.


 

