https://www.proz.com/forum/post_editing_machine_translation/330694-how_much_of_machine_translation_is_plagiarism.html&phpv_redirected=1

How much of machine translation is plagiarism?
Thread poster: Daniel Frisano
Daniel Frisano  Identity Verified
Italy
Local time: 07:12
Member (2008)
English to Italian
+ ...
Nov 23, 2018

We all know that taking the shortcut of MT means tapping into available bilingual resources.

Has it ever occurred to anyone that MT could be seen as plagiarizing someone else's work rather than creating one's own?


Jan Truper  Identity Verified
Germany
Local time: 07:12
Member (2016)
English to German
... Nov 23, 2018

Daniel Frisano wrote:
Has it ever occurred to anyone that MT could be seen as plagiarizing someone else's work rather than creating one's own?


It has occurred to me, but it doesn't bother me.

Firstly, when I use MT, I hardly ever use the hits the way they are presented, but as a sort of base to get the translation ball rolling.

Secondly, on more than one occasion, I have seen my own work pop up in Linguee (which is the foundation of DeepL), so I see it as a give and take.


Steve R.
United States
Russian to English
Copyright infringement Nov 23, 2018

I refer to copyright for the sake of analogy, not because I think plagiarism and copyright infringement are synonymous. To be sure, they are not, though in certain cases one can result in the other.

To my point...

In copyright infringement cases, the de minimis defense may be used in a pleading to assert that the alleged instance of copying is of so little significance that the court ought not to bother itself with ruling on the complaint.

It is a bit ironic, but not every case warrants the court's attention, even when the facts of the matter are not in dispute.

Perhaps a similar idea can apply here.

Just a thought.


 
Jeff Allen  Identity Verified
France
Local time: 07:12
Multiple languages
+ ...
MT systems vary in how they reuse bilingual content Nov 23, 2018

Daniel Frisano wrote:

We all know that taking the shortcut of MT means tapping into available bilingual resources.


There are several different types of MT systems. Each is trained on bilingual content in a different way and then uses that content to generate its MT output.

None of them (rule-based, statistics-based, hybrid, neural) works by loading up TMs and then copying and pasting exact matches from the TM content.

The older rule-based MT systems can be trained on TMs and will then reuse exact strings down to the term level, depending on whether the TM is enabled as a type of dictionary. Even then, they usually combine grammar rules with dictionaries of different types and can mix the results into parts of segments. The European Commission's rule-based system was known to have been heavily trained in this way to reuse TM segments. But that is not how the majority of commercial MT system providers intended their rule-based MT systems to work.

Statistics-based MT systems train on bilingual corpora, but only to build statistical models. This is very different.
Neural MT is an extension of the statistical approach.
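
To make that contrast concrete, here is a minimal Python sketch. It is not taken from any real MT system: the toy English to Italian corpus, the function names (`tm_lookup`, `train_dice`, `best_target_word`) and the choice of a Dice-coefficient word association are all assumptions made for illustration, and real statistical or neural systems are far more elaborate. The point is only that a TM lookup hands back a stored segment verbatim, while a statistical model keeps scores derived from the corpus rather than the segments themselves.

```python
from collections import Counter

# Hypothetical toy bilingual corpus, invented for this illustration.
CORPUS = [
    ("the house", "la casa"),
    ("the red house", "la casa rossa"),
    ("the red car", "la macchina rossa"),
]


def tm_lookup(source, corpus):
    """TM-style reuse: return a stored target only on an exact source match."""
    for src, tgt in corpus:
        if src == source:
            return tgt
    return None


def train_dice(corpus):
    """Build word association scores (Dice coefficient) from segment pairs.

    The returned model holds numbers per word pair, not the original segments.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        pair_count.update((s, t) for s in s_words for t in t_words)
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
    }


def best_target_word(word, scores):
    """Pick the target word most strongly associated with a source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    scores = train_dice(CORPUS)
    # "the car" never appears as a segment, so the TM has nothing to copy.
    print(tm_lookup("the car", CORPUS))  # None
    # The statistical model can still propose a word-by-word gloss.
    print([best_target_word(w, scores) for w in "the car".split()])  # ['la', 'macchina']
```

In this toy setup the only way to get a stored sentence back verbatim is the exact-match lookup. The scored model recombines what it learned, which is why copying only becomes a concern when the training data for some phrase is very thin.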

Daniel Frisano wrote:
Has it ever occurred to anyone that MT could be seen as plagiarizing someone else's work rather than creating one's own?


Example-based MT would be closest to doing this.
Rule-based MT does so partially, depending on whether TM content is reused as dictionaries.
Statistical and neural MT are the farthest from plagiarizing, unless of course the content they have been trained on is so unique that the content you provide is the only example available to train the model.


finnword1
United States
Local time: 01:12
English to Finnish
+ ...
none, in my opinion Nov 23, 2018

There are many translation companies requiring you to use (a/k/a plagiarize) their TM. I personally know a major US manufacturer that insists its technical writers copy and paste as much as possible, word for word, from its already published manuals, and wants its translators to plagiarize the existing TM it supplies. If an existing translation is good, why change it?

 
Daniel Frisano  Identity Verified
Italy
Local time: 07:12
Member (2008)
English to Italian
+ ...
TOPIC STARTER
Nov 26, 2018

MT ≠ TM

 
Tom in London
United Kingdom
Local time: 06:12
Member (2008)
Italian to English
The trick Nov 26, 2018

Daniel Frisano wrote:

We all know that taking the shortcut of MT means tapping into available bilingual resources.

Has it ever occurred to anyone that MT could be seen as plagiarizing someone else's work rather than creating one's own?


The trick is to NEVER click on the "suggest a better translation" option.


 
Tom in London
United Kingdom
Local time: 06:12
Member (2008)
Italian to English
It does bother me Nov 26, 2018

Jan Truper wrote:

It has occurred to me, but it doesn't bother me


Well, it does bother me, because making all these MT machines work better relies on gullible translators contributing their own work.

Otherwise known as "shooting yourself in the foot" or "putting yourself out of business".

[Edited at 2018-11-26 10:22 GMT]


 
Robert Rietvelt  Identity Verified
Local time: 07:12
Member (2006)
Spanish to Dutch
+ ...
Talking about copyright Nov 26, 2018

Jan Truper wrote:

Secondly, on more than one occasion, I have seen my own work pop up in Linguee (which is the foundation of DeepL), so I see it as a give and take.


I have seen my website pop up in Linguee.


 

