Word count glitch in Word 2007 - could it be exploited?
Thread poster: Jack Doughty

Jack Doughty  Identity Verified
United Kingdom
Local time: 11:31
Russian to English
+ ...
In memoriam
Jan 22, 2012

I had a job from an agency the other day. Let me say at once that I have no complaints against the agency: it is the only one that has ever paid me in advance, and no, it wasn't a cheque that could bounce. It was paid into Moneybookers, and I've checked the account; the money is indeed there.

But the point is, they sent me a Word file (Word 2007) of three pages, which they said was 1025 words. I thought it looked rather short, so I checked the word count, and it was 1025 words as they said. I returned the job, and they said there should be three more pages. They sent it to me again, this time in Word and in PDF. The Word file was the same as before, but the PDF had all the pages. I converted the PDF to a Word file using ABBYY FineReader and found that the complete file also had a word count of 1025 words. The Word file must have been wrong. I then did the rest of the job.

As I say, the agency is fine; it must just be some software problem. But could it be duplicated by an unscrupulous agency, which could then ask you to do a job of, say, 5000 words, and doctor the file so that it reports a word count of 5000 when it actually contains 6000, so that the unsuspecting translator might not notice and would do an extra 1000 words for free? Has anyone ever had such a problem?

[Edited at 2012-01-22 17:42 GMT]


 

Tony M
France
Local time: 12:31
Member
French to English
+ ...
Similar problems here Jan 22, 2012

Once again, I hasten to add that these comments are in no way a criticism of any of the agencies I work for, who have always been totally honest with me. But odd anomalies do sometimes arise and cause headaches:

1) I once had a similar situation, with a Word doc. and the matching PDF, where extracting the text (directly, not via OCR) led to a wordcount (still in Word) that was almost double that of the original doc.! In the end, we had no choice but to conclude that it was some quirk of the PDF file, since the .doc wordcount seemed the more plausible.

2) On another occasion, I ended up with 3 quite different wordcounts: 2 different results using PDF > OCR > .doc, on both formatted and 'text only', and a 3rd different result once the original Word .doc had been found.

3) I am currently having a tussle with a customer who sends me his Trados analyses (with blank TM) which yield a wordcount significantly lower than that under Word for the same .doc; the difference is sometimes as much as 10% or even more. And when I do the analysis in Wordfast, the wordcount is higher still.

We have not been able to find out where the error comes in; issues with text in text boxes and headers/footers still wouldn't be enough to explain the discrepancy. And while I am quite prepared to accept that my Wordfast analysis may for some reason be over-inflated, I don't see how it can be justified to rely on the Trados wordcount when it is so at odds with Word's own. On the usual smaller jobs, the difference wouldn't be worth worrying about, but as quite a high volume of work is involved here, the total amount at stake is actually significant.
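To illustrate how counts from different tools can drift apart on the same text, here is a minimal Python sketch. The three counting rules and the sample sentence are my own assumptions for illustration; they are not the actual rules used by Word, Trados or Wordfast, which the thread does not establish.

```python
import re

def count_whitespace(text):
    """Count tokens separated by whitespace."""
    return len(text.split())

def count_alnum_runs(text):
    """Count runs of letters/digits, so hyphens and slashes split a token."""
    return len(re.findall(r"[^\W_]+", text))

def count_without_numbers(text):
    """Count whitespace tokens, skipping any token that contains no letter."""
    return sum(1 for tok in text.split() if re.search(r"[^\W\d_]", tok))

# Hypothetical sample sentence with hyphens, a slash and standalone numbers.
sample = "The state-of-the-art pump/motor unit costs 1 250 EUR (see p. 12)."

print(count_whitespace(sample))       # 11
print(count_alnum_runs(sample))       # 15
print(count_without_numbers(sample))  # 8
```

Even on one short sentence the three rules disagree, so a spread of 10% or more over a long document full of numbers, hyphenated terms or slashes is not implausible from tokenisation differences alone.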

[I should just add that I am using Word 2003 under XP]

[Edited at 2012-01-22 17:23 GMT]


 

neilmac  Identity Verified
Spain
Local time: 12:31
Spanish to English
+ ...
Not sure if it helps, but Jan 23, 2012

This sort of thing happens to me quite often.

For example, the other day a direct client sent me a Word document which had obviously been recovered from another format (it may have been from Freehand, as they had asked me if I could translate in Freehand and I'd replied yes, but at twice the price of a Word doc due to all the fiddling about). Anyway, everything that had been an accented letter in Spanish was now a hieroglyphic or squiggle, which meant that even running a normal spellcheck would take ages.
They had helpfully sent me the original in PDF too, which I managed to convert into a more workable Word document. However, whereas their Word file came up as 6000 words, the same document converted from PDF into Word came up as 9000 words. I told the client, and we assumed it must be because the PDF-converted doc was counting the numbers etc., so we agreed that I would only bill them for the 6000 in their version.

Things like this have happened with PowerPoint too, and I find that different counting programs will often come up with different totals.

[Edited at 2012-01-23 19:58 GMT]


 

Daniel Grau  Identity Verified
Argentina
English to Spanish
Could it be the text boxes? Jan 24, 2012

PDF-to-Word converters tend to use text boxes. If your document has text boxes and you do Select All before counting, you will only get the count for the text in the main flow. To include text boxes in the count, the cursor must simply be placed in the text with nothing selected. Also, if you use the Word Count feature, make sure the box at the bottom of the dialog ("Include textboxes, footnotes and endnotes") is ticked.
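For anyone who wants to check independently how much text sits in text boxes, here is a rough Python sketch. It assumes the file is a .docx (the zipped XML format introduced with Word 2007), that the body lives in word/document.xml, and that text box content is wrapped in w:txbxContent elements. The function name docx_word_counts is my own, words are counted by simple whitespace splitting, and headers, footers and footnotes (stored in other parts of the package) are ignored. It will not read old binary .doc files, and it is not how Word itself counts.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
MC = "{http://schemas.openxmlformats.org/markup-compatibility/2006}"

def _collect(node, in_box, buf, out):
    """Walk the XML tree, gathering text per paragraph as (in_box, parts)."""
    if node.tag == MC + "Fallback":
        return  # newer files may store a duplicate fallback copy of each text box; skip it
    if node.tag == W + "txbxContent":
        in_box = True
    if node.tag == W + "p":
        buf = []  # start a new paragraph buffer
        out.append((in_box, buf))
    elif buf is not None:
        if node.tag == W + "t" and node.text:
            buf.append(node.text)
        elif node.tag in (W + "tab", W + "br"):
            buf.append(" ")  # tabs and line breaks separate words
    for child in node:
        _collect(child, in_box, buf, out)

def docx_word_counts(docx_file):
    """Return (main_flow_words, text_box_words) for a .docx path or file object."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    _collect(root, False, None, paragraphs)
    main = sum(len("".join(p).split()) for boxed, p in paragraphs if not boxed)
    boxes = sum(len("".join(p).split()) for boxed, p in paragraphs if boxed)
    return main, boxes
```

Calling docx_word_counts("job.docx") (a hypothetical file name) returns two numbers; if the second is large, the Select All count in Word will fall well short of the real total.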

 

