Post-editing (Machine Translation (MT))

Thread poster: Ade Indarta

Ade Indarta

Indonesia
Local time: 12:41
Member (2007)
English to Indonesian

Aug 8, 2010

I have just started reading about post-editing, and I found that most research on post-editing compares productivity between post-editing and human translation.
Shouldn't the comparison be between MT + post-editing and human translation + human editing?

I mean, the time required to (post-)edit MT and the time required to translate from scratch will of course be different. Editing takes less time than translation (assuming that the translation is good).

We should compare the time required to post-edit MT with the time required to edit HT, AND the quality resulting from both processes.

Jeff Allen

France
Local time: 07:41
Multiple languages

clarification of HT when compared with MT Postediting

Aug 8, 2010

Ade Indarta wrote:

I have just started reading about post-editing, and I found that most research on post-editing compares productivity between post-editing and human translation.
Shouldn't the comparison be between MT + post-editing and human translation + human editing?

I mean, the time required to (post-)edit MT and the time required to translate from scratch will of course be different. Editing takes less time than translation (assuming that the translation is good).

We should compare the time required to post-edit MT with the time required to edit HT, AND the quality resulting from both processes.

Hi Ade,

In most case studies comparing MT postediting (PE) with human translation (HT), HT refers to the Translate/Edit/Proof (TEP) cycle, as it is used by most people in the professional translation arena.

The goal has been to compare TEP with MT + postediting (and sometimes even 2 stages of PE).

However, in some cases the posteditor is in fact a key subject matter expert on the topic. Most of the postediting projects I have conducted were for content where I was the author of the source content (or participated in the content design in general), or was a trainer on the topic who trained authors or translators. There were even cases where I did a first-stage postedit of a piece of content, gave it to a 2nd posteditor, and then had to change some of the terminology back after doing a recheck, because I knew the subject matter better than the 2nd posteditor. This also happens in HT projects on a regular basis (often referred to as the red pen syndrome effect).

The general rule has been to compare the overall cycle without MT (just HT / TEP) against the overall cycle of using MT + one or more postediting stages.
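
As a rough illustration of that whole-cycle comparison, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Workflow` class, the `time_saved` helper, the stage names, and above all the hour figures, which are invented and do not come from any study.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """A translation workflow as a set of stages with hours spent on each."""
    name: str
    stage_hours: dict = field(default_factory=dict)

    def total_hours(self) -> float:
        # Whole-cycle effort: every stage counts, not just the first pass.
        return sum(self.stage_hours.values())


def time_saved(baseline: Workflow, candidate: Workflow) -> float:
    """Fraction of the baseline's total hours saved by the candidate."""
    return 1 - candidate.total_hours() / baseline.total_hours()


# Hypothetical figures, for illustration only (not from any study).
tep = Workflow("HT (TEP)", {"translate": 8.0, "edit": 3.0, "proof": 1.0})
mt_pe = Workflow("MT + PE", {"mt_setup": 0.5, "postedit_1": 5.0, "postedit_2": 2.5})

print(f"{tep.name}: {tep.total_hours():.1f} h")
print(f"{mt_pe.name}: {mt_pe.total_hours():.1f} h")
print(f"Time saved over the full cycle: {time_saved(tep, mt_pe):.0%}")
```

The sketch only covers time. The quality of the output of each cycle would have to be measured separately before the two workflows could really be compared.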

I hope that helps.

Jeff

Ade Indarta

TOPIC STARTER

HT Editing vs. MT Post-editing

Aug 8, 2010

Hi Jeff,

Thanks for your reply.

You mentioned that HT refers to the TEP cycle. TEP usually involves two to three translators, each responsible for one stage. I think this is common practice among translation service providers. However, some of the studies I have read seem to use only one translator to translate, and then compare that process with the post-editing process.

I often have to edit someone else's translation myself, and I am interested to know how post-editing differs from editing a (human) translation, not how it differs from translation.

I suspect the type of MT used will affect the post-editing process. Editing RBMT output will be easier, I suppose, since both the engine and the human start from the same kind of knowledge: linguistic knowledge.

SMT post-editing should be a bigger challenge; I do not know whether we can apply our linguistic knowledge in the editing method, considering that the engine does not use linguistic knowledge to translate.

Maybe the post-editor will have to use statistical principles too when editing SMT.

Ade

[Edited at 2010-08-08 13:07 GMT]

Jeff Allen

more about HT and TEP with respect to RBMT and SMT projects

Aug 8, 2010

Ade Indarta wrote:

You mentioned that HT refers to the TEP cycle. TEP usually involves two to three translators, each responsible for one stage. I think this is common practice among translation service providers. However, some of the studies I have read seem to use only one translator to translate, and then compare that process with the post-editing process.

I often have to edit someone else's translation myself, and I am interested to know how post-editing differs from editing a (human) translation, not how it differs from translation.

I suspect the type of MT used will affect the post-editing process. Editing RBMT output will be easier, I suppose, since both the engine and the human start from the same kind of knowledge: linguistic knowledge.

SMT post-editing should be a bigger challenge; I do not know whether we can apply our linguistic knowledge in the editing method, considering that the engine does not use linguistic knowledge to translate.

Maybe the post-editor will have to use statistical principles too when editing SMT.

Hi again Ade,

The TEP cycle principle is only one of several different types of best practice for ensuring quality. It is not a rule, but a guideline that can be followed. Others sometimes refer to this as the 4-eye principle, in which 2 additional sets of eyes see the work. It is not just used in the translation industry, but also in content verification in software testing in general.
In TEP, especially in projects, the tasks of the people at each of the 3 stages are different. The translator does research (unless there are separate terminologists) and translates, the translation editor revises the content, and the proofreader checks adherence to style guidelines and looks for other content/formatting/terminology consistency issues. This type of process is easier to find in projects run in-house at large institutions.

There is a difference between the principle of TEP and how it is managed in practice, which depends on the number of available participants, logistics, the volume of content to process, imposed turnaround times, etc. The translation industry constantly tends to be at the mercy of unplanned translation: the translation participants so often find themselves at the end of the cycle with no time left and must do everything by tomorrow, and translation salespeople often don't know how to educate the customer (or how to refuse ridiculous turnaround times). So it is a vicious circle of meeting impossible deadlines for a lot of the translation work that is done.
The result is that translation jobs/projects get around this problem by staffing the project with as many translators as possible, based on their availability, and having them participate in the different phases of the TEP cycle. Much also depends on how well organized the project is, whether there are in-house people involved, etc. Yet the need to have multiple translators playing the different roles is usually a sign of a project with a lot of volume to handle and tight turnaround times.

However, in smaller in-house teams and heavily outsourced projects, subject matter expertise is not as easy to find, so it is necessary to double up and triple up on the tasks with the same people, which can end up diluting their roles and making them do several different tasks. This is not a problem in itself; it simply depends on how well it is coordinated and managed.

We also need to see that the crowdsourcing approach provides yet another twist on how to manage translation projects.

Added to this is the question of whether the requirements call for a draft translation or a full high-quality TEP translation. Both of these exist in the translation industry, and have for years, but few translation professionals talk about the fact that they can and do perform draft translations when necessary. Yet this corresponds to one of the main types of MT process, known as MT content gisting + minimal postediting.

One of the biggest challenges for MT implementation projects over the years has been that most of those proposing MT solutions and services do not have experience in the TEP cycle. There are really few of us who had previously worked in the translation field and who learned how to quickly map and adapt the phases of an MT implementation onto existing translation cycle phases.
So one of the important checks to make, especially where there are high volumes and high quality expectations for translated content, is whether those MT solution and service providers have the background to understand the existing translation processes or not. The mistake that happened all too often is that translation buyers assumed that MT vendors and providers had this experience and knowledge.
Well, all of the forum discussions on this topic on LinkedIn over the past 6-8 months prove that this was missing, and that there is a significant gap in this knowledge and experience. Some MT providers state overtly that they are just now "learning" about the translation cycle, translation quality measurement processes and related issues.

Another thing to add to the mix is what I mentioned in my post above about subject matter expert (SME) translators, whose work is diluted by including non-specialized editors and proofreaders who change the vocabulary and other style patterns that the translation SME knows by heart. Then it is necessary to correct that again. Hence the need for translation style guidelines and glossaries, which can avoid introducing such inconsistencies downstream in the translation cycle.

As for the RBMT and (customizable) SMT cycles, the RBMT and RBMT-like providers have many more years of experience deploying their products into translation cycles with corporate customers.

The customizable SMT providers (and various flavors of hybrid solution providers), for the most part, are new to the field and are just beginning to implement their systems among customers. How much training material they have (and the quality of it) affects how well they can customize their systems and provide better translation output.
There are also different types of people involved in the customized SMT cycle, including computational linguists or other types of technicians for the training cycles.

Implementing RBMT without dictionary customization can result in poor quality too. It can lead to what a lot of people label as MT word-for-word translation, but this is really just the result of not having used the terminology-based features to customize the system and improve the quality, just as a terminology definition phase with a resulting reference glossary would be done for the HT cycle.

It just depends on how well the system is implemented and managed according to the output quality expectations defined and agreed upon with the customer (if this has been done upfront).

As for customizable SMT, many still describe it as being in its infancy. Some providers have more experience than others. It also depends on how customizable the system is, and on the extent to which its output can be overridden with linguistic knowledge and glossaries.

Anyone trying to implement a non-customizable SMT system in projects with any type of high-quality expectations will end up doing more work than needed. In one of my LISA Globalization articles years ago about who owns the quality issues in the localization cycle, I compared translation issues with software issues (known as bugs) in a software product lifecycle and mentioned how one can end up with REKT projects (a play on the word "wrecked"). It's all a matter of handling as much as possible upstream in the cycle, to avoid quality degradation later in the cycle, which becomes exponential in any project, whether it be mechanical engineering, software, translation, or anything else.

All SMT projects should avoid trying to find and hire translators with statistical computational knowledge. That should be a different type of person, usually a computational linguist, who performs complementary tasks in working with the team to enable the best possible quality output, especially with iterative customization cycles.

Hope that helps a bit more.

Jeff
