Factoring the ratio of codes to words into your pricing (Déjà Vu support)


Thread poster: Kevin Lossner

Kevin Lossner

Portugal
Local time: 02:45
German to English
+ ...

Oct 4, 2008

For the last few days I've been plodding slowly through a text of the type I normally blaze through with great speed, but in this case excessive idiotic formatting (the usual bold, italic and underline formats, but also lots of hidden text elements embedded in the middle of sentences as hints for hyperlinks to be added later) is slowing me down tremendously. These aren't "rogue codes" but are in fact serving a real purpose in the text.

This is really not a new phenomenon. Anyone who works with DV knows how much "code salad" can impact a job. But up until now I have never given any specific thought to using the "code density" (ratio of codes to words) as a factor in pricing. Has anyone done this?

Vito Smolej
Germany
Local time: 03:45
Member (2004)
English to Slovenian
+ ...

SITE LOCALIZER

I deeply sympathize

Oct 5, 2008

Texts of this kind are a real minefield. I had this kind of stuff to do last week. Quite a bundle, but "...just 5% no-matches...". Fact is, the matches of 80% and up were real killing fields.

A lot of the tags were just shifted: instead of the source structure T1 A T2 B T3 C, I had again and again cases like A T1 B T2 C T3 C in the pretranslated target. I suspect the agency lost parts of its TM and then fought its way through with half-cooked exports and imports. And I also suspect the client may have done his share too: in a non-negligible number of cases the tags were positioned correctly, but one, two or all of them were wrong. They represented pointers to images or whatnot, and it would take just one insertion or rename in some subdirectory for all the image name tags to be wrong...

Anyhow, my words per hour were half to one third of my usual rate. What could I do? I suffered through it and will check next time for suspiciously high "close-to-repeats". I also warned the agency that I would renegotiate their rates if and when it happens again.

Sigh...

I don't think I was of any help, Kev (g)

Regards

Vito

[Edited at 2008-10-05 16:21]

Kevin Lossner

Portugal
Local time: 02:45
German to English
+ ...

TOPIC STARTER

Nasty matches

Oct 5, 2008

I have run into enough trouble from high matches lately that I am growing ever less inclined to entertain any kind of scale for them. This applies to alleged 100% matches as well! I do a hell of a lot of Trados jobs in DV - lately a lot of them MS Word documents worked up in TagEditor from two clients who have recently stopped messing with Trados translations using MS Word macros for the most part. What I am seeing time and again are segments marked as 100% matches by TWB that are in fact not - the tags in the source and target differ in significant ways! The consequence is that I have to fiddle around to fix these segments, which often takes so much time that it's faster to copy the source to the target and start over. Grrrrrrr.

How does one recognize these minefields as such? I don't have a clear answer, but I think I'm going to have to start looking for one and tracking the statistics on some jobs. It's too bad the code count is for all codes; what really interests me are the mid-segment codes. The current document that is causing me to despair has a 10% ratio of codes to source words. I suspect this is rather high, but then many TE jobs with codes at the beginning and end of the segment will likely show similar stats, so I can't really depend on that ratio as an indicator.
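Editorial sketch: the distinction drawn above between the overall code count and the mid-segment codes can be computed from exported segment text. This is a minimal illustration, not a feature of DV or Trados. It assumes inline codes appear in the text as `{1}`, `{2}` and so on; the function names `code_stats` and `density` are made up for this example.

```python
import re

# Assumption: inline codes show up as {1}, {2}, ... in the segment text.
CODE = re.compile(r"\{\d+\}")
# Runs of codes at the very start or very end of a segment.
EDGE = re.compile(r"^(?:\s*\{\d+\})+|(?:\{\d+\}\s*)+$")


def code_stats(segment: str) -> dict:
    """Count words, all codes, and codes that sit inside the segment."""
    # Deleting the codes first keeps a split word like co{1}de as one word.
    words = len(CODE.sub("", segment).split())
    codes = len(CODE.findall(segment))
    # Drop leading/trailing codes; whatever codes remain are mid-segment.
    inner = EDGE.sub("", segment.strip())
    mid_codes = len(CODE.findall(inner))
    return {"words": words, "codes": codes, "mid_codes": mid_codes}


def density(segments) -> dict:
    """Ratio of codes to words over a whole job, overall and mid-segment only."""
    totals = {"words": 0, "codes": 0, "mid_codes": 0}
    for seg in segments:
        for key, value in code_stats(seg).items():
            totals[key] += value
    words = totals["words"] or 1  # avoid dividing by zero on empty jobs
    return {"all": totals["codes"] / words, "mid": totals["mid_codes"] / words}


if __name__ == "__main__":
    job = ["{1}Click {2}here{3} to continue.{4}", "{5}Plain heading{6}"]
    print(density(job))
```

In the two sample segments all six codes count toward the overall ratio, but only the two wrapped around "here" count as mid-segment, which is the figure that would separate a tag-salad job from one with codes only at segment boundaries.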

Andrea Kowalenko

Spain
Local time: 03:45
Member (2006)
Spanish to German
+ ...

Good news - no need to change codes in 100% matches from TE

Oct 5, 2008

Just some days ago, I read on the DéjàVu user group on Yahoo that it is not necessary to fiddle with 100% matches that come from TagEditor. It was mentioned that with ttx files you do not need to change codes in 100% matches if they come from TE, and that one can lock them to be able to run a code check without them. If you are interested, you might want to search the archives at the mentioned Yahoo group.

This applies apparently even if:

- there are codes in the target but not in the source
- there is a different number of codes in source and target
- there is the same number of codes in source and target, but the codes are not in the same order

I have not checked this out yet, but I thought this information to be very interesting.

This doesn't solve your actual problem with almost 100% matches, I know, but it is good news anyway, isn't it? At least for future jobs ... Although I imagine you would like to tear your hair out thinking about the time you spent on these codes in the past. Don't shoot the messenger!

Kind regards,

Andrea
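Editorial sketch: the three mismatch cases listed in this post can be told apart mechanically by comparing the code sequences of source and target. This is a rough illustration only. It assumes codes are written as `{1}`, `{2}` and so on, and `compare_codes` is a hypothetical helper, not a function of DV or TagEditor.

```python
import re
from collections import Counter

# Assumption: inline codes show up as {1}, {2}, ... in the segment text.
CODE = re.compile(r"\{\d+\}")


def compare_codes(source: str, target: str) -> str:
    """Classify how the codes of a target segment differ from its source."""
    src = CODE.findall(source)
    tgt = CODE.findall(target)
    if src == tgt:
        return "identical"
    if Counter(src) == Counter(tgt):
        # Same codes, same counts, so only the sequence differs.
        return "same codes, different order"
    if set(tgt) - set(src):
        return "target has codes missing from source"
    return "code counts differ"


if __name__ == "__main__":
    print(compare_codes("{1}A{2}B", "{2}X{1}Y"))
```

Note the limit of this check: it looks only at the sequence of codes, not at where they sit relative to the words. The shifted-tag case described earlier in the thread (T1 A T2 B versus A T1 B T2) would be reported as "identical", so catching that would need a positional comparison as well.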

Kevin Lossner

Portugal
Local time: 02:45
German to English
+ ...

TOPIC STARTER

That depends

Oct 5, 2008

kowalingua wrote:
Just some days ago, I read on the DéjàVu user group on Yahoo that it is not necessary to fiddle with 100% matches that come from TagEditor.

If you are referring to the fact that the codes need not always match in the source and target, that is correct, and that's a feature I take full advantage of when there are useless tags for optional hyphens and similar stuff in the source text. The problem I'm talking about here with 100% matches needing correction would likely apply even if the work were done purely in Trados. When I presegment a TTX using Workbench, some segments are marked as 100% matches and the target is populated with something which will give a result with different formatting or possibly cause problems in generating a target file. I haven't investigated every aspect of this phenomenon yet or checked to see whether exactly the same trouble occurs if working in TE or if a segment-by-segment working mode will show this content as fuzzy instead. That's an exercise that will have to wait until I have time to kill, which may be a while.

Vito Smolej
Germany
Local time: 03:45
Member (2004)
English to Slovenian
+ ...

SITE LOCALIZER

what is 100%, depends not just on the source and target

Oct 5, 2008

...When I presegment a TTX using Workbench, some segments are marked as 100% matches and the target is populated with something which will give a result with different formatting or possibly cause problems in generating a target file...

There must be some penalty parameter to take account of that ...

Kevin Lossner

Portugal
Local time: 02:45
German to English
+ ...

TOPIC STARTER

Penalty parameters

Oct 5, 2008

Vito Smolej wrote:
There must be some penalty parameter to take account of that ...

Likely so. The problem though is that while I may adjust the settings on my Trados installation to eliminate the false indication of 100% matches by applying appropriate penalties, there is less I can do about clients who run their own analyses and supply me with this information together with a PO. Sure, the ones I have the issue with would probably adjust the order in my favor, but it still involves time I don't really have at the moment. When I get a chance to document the problem very specifically, I will share that information with these agencies and ask that they adjust their calculations. It is to their advantage to do so as well, since this raises the bill for the end client.

Anne Bohy

France
Local time: 03:45
English to French

Trados has to improve

Oct 5, 2008

Aren't you mainly pointing out the fact that Trados calculations for fuzzy matches are really biased in some cases?
I too have come across cases where a completely different order of words was considered a nearly-perfect match... Or the nightmare of numbers not handled correctly by Trados but ignored in calculations.
SDL/Trados tend to consider the rules they apply for match-rating as their proprietary software. But these rules should be a common standard (who should define the standard: another interesting question). Only implementation should be proprietary.
Currently, all rules are to the advantage of the customer, and the disadvantage of the translator.
I do not know why we should (manually) do the extra work of checking everything. The tool has to improve and take these new source texts into account, that's all.

Kevin Lossner

Portugal
Local time: 02:45
German to English
+ ...

TOPIC STARTER

No argument from me!

Oct 5, 2008

bohy wrote:
Aren't you mainly pointing out the fact that Trados calculations for fuzzy matches are really biased in some cases?
I too have come across cases where a completely different order of words was considered a nearly-perfect match... Or the nightmare of numbers not handled correctly by Trados but ignored in calculations.

While the starting point for this discussion was actually the hassles involved with tags/codes in a text and how to deal with this in pricing, the issue of crappy Trados matching and the problems you cite are very relevant for many DV users, because many of us work with pre-segmented Trados jobs paid based on an analysis with that somewhat satanic software.

I think "biased" is a very generous term for you to use, but I'll adopt it, because it's more polite than most words I can think of except possibly "wrong".

However, we DV users have some similar issues with regard to numbers in segments, because these and certain punctuation elements at the end of a segment are often not handled correctly in matches/assemblies/propagation. Nothing is perfect unfortunately.

As I have gained more experience with various tools and seen the many exceptions to assumptions about "time saved" with matches, I am ever less inclined to consider them in my pricing. I do consider it reasonable and necessary to do so in some circumstances, but there are certain situations and possibly certain file types where traps occur frequently. When I deal with my end customers, I am very open about my use of TM, and I tell them that although in some circumstances the technology may enable me to give them a serious price break, it is to be viewed as a quality management tool which, in the wrong hands, carries its own quality risks with it (fuzzies mistaken for 100% matches or bad 100% matches just being two items on a long list). I always reserve the right to decide what discounts to apply if any when I see whether there was any real difference in the work necessary.

In the case of the present issue - the effect of tags/codes on one's efficiency and how to anticipate the problem and deal with it quantitatively in advance when quoting a price - I am still at a loss for a good approach.

David BUICK

Local time: 03:45
Member (2006)
French to English
+ ...

Time to change your pricing strategy?

Oct 5, 2008

I work for a number of translation agencies and I have never, never accepted a rate which involves set discounts for repetitions, let alone numbers of codes.

I consider my added value as a translator is precisely that I am not simply an operator of CAT software. CAT software just happens to be a tool I may choose to use to do my job.

If an agency tries to make me factor repetitions into my quote, I tell them that it is just as much if not more work spotting the 1% vital difference in a block with a 99% match as translating 99% new text with 1% the same (which I believe to be true). Of course, if there is a massive amount of 100% identical repetition I may offer them a lower rate per word, but there's no way I am ever going to get drawn into agreeing a price with set percentages for repetitions, codes, full stops or indeed anything other than the number of words. The way I see it this means more work for them, more work for me, and more room for time-consuming disagreements. At the end of the day either the price is right for me and them, or it's not.

Kevin Lossner

Portugal
Local time: 02:45
German to English
+ ...

TOPIC STARTER

Premium, not discount in this case

Oct 5, 2008

Eutychus wrote:
I work for a number of translation agencies and I have never, never accepted a rate which involves set discounts for repetitions, let alone numbers of codes.

You are totally right about those damned 1% differences in many cases. I have sometimes stared at a long segment for so long trying to discern differences that I could have translated the thing 5 times while looking at it. So I often just copy the source to target (if there are many tags/codes to deal with) and re-translate the whole thing.

My main consideration here doesn't really involve discounts, however. I am thinking more of what premium to charge in cases like the job I am working on right now, which can be rightly described as tag salad. Based on what it has done to my throughput in what is otherwise the subject I translate fastest, I should probably be charging a 300% premium. I will certainly be having a heart-to-heart discussion with the client about this tomorrow and inform them of the problems created by the insistence of their PR agency on formatting this content in a very unusual way. Now that I have become brutally aware of how this sort of thing impacts productivity, the scientist in me wants some nice way of quantifying the pain so I can predict the potential impact of such issues in the future and

(1) account for them in my pricing and
(2) in my scheduling as well.

I suppose in the end I'll take some fairly vanilla text with no formatting and start applying different levels of formatting (including hidden text inline commentaries which will end up as tags in TagEditor) to sections of it and then see how the resulting tag densities affect my translation efficiency. If I were writing an MA thesis for translation studies I suppose I could even spin this into the topic. But it's a pain in the neck when I have a lot of work piled up.
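Editorial sketch: one simple way to turn the timing experiment described above into a premium is to ask what surcharge keeps hourly income constant when throughput drops. The function `premium_for` and the sample figures below are hypothetical, invented for illustration; they are not measurements from this thread.

```python
def premium_for(baseline_wph: float, observed_wph: float) -> float:
    """Surcharge, as a fraction of the normal word rate, that keeps
    hourly income constant when throughput falls from baseline to observed."""
    if baseline_wph <= 0 or observed_wph <= 0:
        raise ValueError("throughput must be positive")
    return baseline_wph / observed_wph - 1.0


# Hypothetical timed samples: (codes per source word, words per hour).
# These numbers are made up purely to show the calculation.
hypothetical_samples = [(0.00, 600.0), (0.05, 400.0), (0.10, 150.0)]

baseline = hypothetical_samples[0][1]  # throughput on the unformatted text
table = [
    (code_density, round(premium_for(baseline, wph), 2))
    for code_density, wph in hypothetical_samples
]

if __name__ == "__main__":
    for code_density, premium in table:
        print(f"{code_density:.0%} codes per word -> {premium:.0%} premium")
```

Under this model, throughput at half the usual rate calls for a 100% premium, one third for 200%, and one quarter for 300%. The same ratio of baseline to observed throughput can be used to stretch a delivery estimate for scheduling.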

David Turner

Local time: 03:45
French to English
+ ...

CodeZapper

Oct 8, 2008

Don't think I've ever come across a "hint for a hyperlink".
Do you mean a hyperlink placemarker i.e. a greyed "I" symbol marking the place where the link jumps to? These can sometimes fall right in the middle of a word thereby splitting it with a co{1}de.
If so, you might like to try CodeZapper, a macro template for removing rogue code that I put together.
Among other things, it moves these placemarkers from the middle of words to the end of the paragraph.
You can download it from the Atril Forum or the Yahoo dejavu-l user group files section.
Atril have also made good progress on rogue codes. The latest beta versions of DVX contain far fewer of them and a public version should be released soon. The latest public version 303 was already quite a bit better than previous versions in this respect.

HTH,
David Turner

Kevin Lossner

Portugal
Local time: 02:45
German to English
+ ...

TOPIC STARTER

Hyperlink hints

Oct 8, 2008

David Turner wrote:
Don't think I've ever come across a "hint for a hyperlink".

And I hope you never do! These are literally "hints" or notes written in the middle of sentences I am supposed to translate and marked as red text (like a mention of the section to be linked and its section reference number in the "creation script" used by the PR agency doing the web site). They describe to the graphic artist doing the web design what is to be done. Since this red text was not supposed to be translated or counted, I made it hidden text, which caused it to be embedded in a code. Unfortunately there is nothing to be cleaned up there with CodeZapper or anything else. It is simply an idiotic way to create a source text, and I have warned the customer that a 300% premium will apply if this method is used in the future.

However, this project has brought the whole issue of how inline codes slow down one's translation to the fore. I intend to find a way of quantifying this effect for legitimate, unavoidable inline codes (usually for formatting, seldom for idiotic notes like I am currently dealing with) and to figure out a fair way to include them in my price calculations. I was curious if anyone had done something similar on a quantitative or qualitative basis up to now.

Rod Walters

Japan
Local time: 10:45
Japanese to English

Code Zapper for Powerpoint?

Jan 20, 2009

I've never been bothered by rogue codes with the Word / Workbench combination, but it can be hideous in the Powerpoint / TagEditor combination. Every single number and symbol has its own little tag. This is particularly a problem with double-byte languages where there are single-byte characters interspersed throughout the text.*

Do you know of anything like Code Zapper that can strip these tags out of Powerpoint?

* And Japanese presenters do love to use lotsa colours. Purple highlights are a favourite. How many times has my attention to tags slipped, only to find after cleanup that the text after a certain point is all purple and must be returned by hand to its former motley glory?
