We translators may be on the verge of dealing with more math surrounding our beloved words. Until now, the number that mattered most has been word count, because that is how we are paid. And fuzzy matches, of course—that magic algorithm that calculates how much partial help (and payment) we get.
We consider a segment short, medium, or long based on its word count. However, machine translation (MT) has its own ideas about length, even neural MT, for all the widespread praise of its fluency.
For segments of one or two words, there is little to no fluency. So is that where neural MT goes out the window? Maybe, or maybe not. We have to find out. Either way, all of a sudden, segment word counts have become metrics indicative of whether MT will translate fluently. Food for thought: how many words is still short? What is medium? What is long?
Another number to consider is edit distance. It may be coming to a translation productivity tool near you. That is because we will increasingly want to know how much MT is helping us do our job, and the answer to that (or a part of the answer) is how many changes we make to the suggestions we receive.
If the MT is really awesome in a 30-word sentence, we smile, make two changes, and the edit distance is low. If the MT requires reordering the words and fixing brands that were inaccurately translated, then we are putting in a lot more effort, and the edit distance will reflect that.
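For the curious, here is a minimal sketch of how such a number could be computed. It assumes a word-level Levenshtein distance (insertions, deletions, and substitutions of whole words) between the MT suggestion and the final translation. Real tools may count characters instead, or weight the operations differently, and the function names here are hypothetical.

```python
def word_edit_distance(suggestion: str, final: str) -> int:
    """Word-level Levenshtein distance between an MT suggestion and the final text."""
    src = suggestion.split()
    tgt = final.split()
    # previous[j] = distance between the source words seen so far and the first j target words
    previous = list(range(len(tgt) + 1))
    for i, src_word in enumerate(src, start=1):
        current = [i]
        for j, tgt_word in enumerate(tgt, start=1):
            cost = 0 if src_word == tgt_word else 1
            current.append(min(
                previous[j] + 1,         # delete a word from the suggestion
                current[j - 1] + 1,      # insert a word into the suggestion
                previous[j - 1] + cost,  # keep the word, or substitute it
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(suggestion: str, final: str) -> float:
    """Edit distance divided by the longer sentence's word count (0.0 = untouched)."""
    longest = max(len(suggestion.split()), len(final.split()))
    if longest == 0:
        return 0.0
    return word_edit_distance(suggestion, final) / longest
```

Under this assumption, two changed words in a 30-word sentence give a normalized distance of about 0.07, while a reordered sentence scores much higher, because every moved word counts as at least one edit.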
As much as I love edit distance, it is just a comparison between one initial “image” of the sentence and the final “image” that we created. But then there is adaptive technology, which keeps changing the suggestions as we move. When we work with adaptive technology, there isn’t really an initial “image” or “initial full sentence translation.” So, it is complicated.
Let’s explore some other numbers: what is the average number of words per sentence in our text?
If it is high—let’s say 16—it means longer and more fluent sentences, and maybe we should expect neural MT to lend us a hand with it. But if we are translating software strings or mobile content, life isn’t like that, is it? Our average word count per sentence is pretty low; our strings are mostly short (except for messages). So we may not get as much help from MT.
Maybe the average number of words per sentence would be an interesting number to take into account when we get a new project?
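As a rough illustration, the average could be computed like this. The sketch assumes the project arrives as a list of segments and that words are separated by whitespace, which is a simplification (it will not work for languages written without spaces). The function name is hypothetical.

```python
def average_words_per_segment(segments: list[str]) -> float:
    """Mean whitespace-delimited word count, ignoring empty segments."""
    counts = [len(segment.split()) for segment in segments if segment.strip()]
    if not counts:
        return 0.0
    return sum(counts) / len(counts)


# Hypothetical software strings: short, so the average is low.
ui_strings = ["Save", "Cancel", "Open file"]
print(average_words_per_segment(ui_strings))
```

A low result like this one would suggest a project of short strings, where we may get less help from MT than in running text.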
All of that said, these numbers would be auxiliary metrics. The word count that we use today may be nearing the end of its life. With adaptive technology and varying qualities of MT suggestions, it will become very difficult to associate all the possible variations of those suggestions with overall word count.
After all, fuzzy matches are based on translation memories, on a predetermined percentage of similarity to an entire segment previously translated. How would we use such a strict concept for all the different parameters when working with MT and adaptive technology?
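To make the "percentage of similarity to an entire segment" concrete, here is a small sketch. It uses the similarity ratio from Python's standard `difflib` module as a stand-in. That is an assumption for illustration only: translation tools use their own matching algorithms, and their percentages will differ. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_score(new_segment: str, tm_segment: str) -> float:
    """Similarity of two segments as a percentage, compared word by word."""
    new_words = new_segment.lower().split()
    tm_words = tm_segment.lower().split()
    matcher = SequenceMatcher(None, new_words, tm_words, autojunk=False)
    return 100.0 * matcher.ratio()


def best_match(new_segment: str, tm_sources: list[str]):
    """Return (score, source) for the closest translation memory entry, or None."""
    if not tm_sources:
        return None
    scored = ((fuzzy_score(new_segment, source), source) for source in tm_sources)
    return max(scored, key=lambda pair: pair[0])
```

The point of the sketch is that the score always compares one whole new segment with one whole stored segment, which is exactly the strict concept that is hard to carry over to suggestions that change while we type.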
Today, most of the time, the combination of MT and fuzzy matches constitutes a system that outputs some mixture of “fuzzy grid for 75 / 85% and above” and “discounts for MT suggestions below that threshold.” End-clients and agencies calculate edit distance to get an informed view of whether the MT is still helping us (or if it is helping too much). And that is the system: “fuzzy with some MT discount.” Edit distance is also great for detecting patterns that will improve MT output: if we’re making lots of changes, the MT may need some improvement.
But with adaptive technology and who knows what other forms of AI-powered augmented translation, how are we going to fit the fuzzy grid into this? We won’t. Translators will be paid based on time spent working. So everybody will change how they work and everybody will win. Dear translators, start thinking about how much your hour is worth. It is about time.