Slate Desktop: your personal MT engine (Machine Translation (MT))

Thread poster: Mohamed

Tom Hoar (X)
United States
Local time: 01:34
English

Re: CAT vendor support

Sep 21, 2015

Each CAT vendor has a different policy on MT connector support. We have the SDKs for OmegaT (open source), Trados, MemoQ and Across. We're waiting for cooperation from Transit, Wordfast and Deja Vu. I need to ask the CafeTran team, but I can't find an email link on their website. Hans, thank you for taking the lead and asking them for their input. I'll continue trying from my side.

Tom Hoar (X)
United States
Local time: 01:34
English

Re: OSX version

Sep 21, 2015

An OSX version is on our roadmap, but not as part of this campaign. The underlying Slate (Moses) technology compiles on OSX. In fact, one of the lead computational linguists on the research team develops exclusively on a Mac.

We've chosen to delay an OSX version because there's much more to productizing the technology for users than just compiling and making it work. When we launch our products, we want to have the resources to support our customers. It's not about taking your money. It's about giving you value for your investment. That includes our support when you need us.

My chief engineer and I talked about OSX support a lot! We decided that when we're ready to move on that, we'll have a true fundraising Indiegogo campaign to raise enough money to buy a top-end Mac machine. Then, we can do all the pre-release builds, testing and post-sales support. So, stay tuned.

Of course, if you want to voice your support with a contribution to the campaign, it's easier for us to hear your voice.


Patrick Porter
United States
Local time: 01:34
Spanish to English
+ ...

2 ways of getting more out of TMs

Sep 23, 2015

Meta Arkadia wrote:

Michael Beijer wrote: devising carefully constructed counterexamples is fun

Not carefully constructed, I'm afraid. I just took a rather old screenshot (no Recall as you undoubtedly noticed) from my archives to compare it with the MT of the Thieves of Mountain View (congrats, by the way) in an answer to Bernard. I wanted to show that an optimised use of MT technology can be very useful, whereas a "general" MT translation will most likely produce err, not so good results.

I do believe MT holds a promise, and I do think it can be useful for freelance translators. I only wonder if users of advanced CAT tools (DejaVu, CafeTran) that already implement MT technology will need it. But so far, I didn't get an answer to that question.

And by the way, tahoar, it's OS X here. Forever.

Cheers,

Hans

This CafeTran sounds interesting...the subsegment matching, etc. I'm going to have to take a look at it. Thanks for the example Hans. It's a good demonstration of the benefit of moving beyond mere fuzzy matching (taking your word for it, of course, because I don't read Dutch or German). For some time now there has been mention about various tools out there implementing this to some extent, but the example, despite being just an example, is compelling. In fact I see similar examples every day in my own work with my own MT engines.

As for how it differs from statistical MT, my guess would be that they use different algorithms (although that might be stating the obvious). It seems that both involve some sort of tokenization, breaking the text down into smaller units and recursively checking shorter and shorter phrases for previous translations. In stat MT there is a word-to-word alignment done as part of this step, and I imagine that the same thing would be needed in subsegment matching tools like CafeTran's. In any case, there might be differences in performance or scalability, two factors which may be important to some, but not so much to others. Also, stat MT can move beyond simply phrase-based models and can generate models based on data like part-of-speech, syntax, etc., although I'm not sure whether this offers a benefit to the production work of a professional translator. It would definitely be interesting to compare the differences, advantages and disadvantages.
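The "checking shorter and shorter phrases" idea can be sketched roughly as below. This is only an illustration under stated assumptions: it presumes a phrase table (source phrase mapped to target phrase) has already been extracted from aligned TM data, which is the hard part, and the names `PHRASES` and `assemble` are hypothetical. It is not how CafeTran or Moses actually do it.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy phrase table; a real one would come from word/phrase
# alignment over a translation memory.
PHRASES: Dict[str, str] = {
    "el proveedor": "the supplier",
    "entregará": "shall deliver",
    "las mercancías": "the goods",
}


def assemble(source: str, phrases: Dict[str, str],
             max_len: int = 5) -> List[Tuple[str, Optional[str]]]:
    """Greedy longest-match lookup of subsegments in a phrase table.

    At each position, try the longest phrase first and fall back to
    shorter ones; tokens with no known translation are returned with None.
    """
    tokens = source.lower().split()
    result: List[Tuple[str, Optional[str]]] = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            chunk = " ".join(tokens[i:i + n])
            if chunk in phrases:
                result.append((chunk, phrases[chunk]))
                i += n
                break
        else:
            # No phrase of any length matched at this position.
            result.append((tokens[i], None))
            i += 1
    return result
```

With the toy table above, "El proveedor entregará las mercancías puntualmente" yields three reused phrases and leaves "puntualmente" unmatched. A statistical engine would go further by scoring competing phrase candidates and reordering them with a language model, which this sketch does not attempt.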

Patrick Porter
United States
Local time: 01:34
Spanish to English
+ ...

agree to disagree

Sep 23, 2015

Bernhard Sulzer wrote:

Patrick Porter wrote:

...The general-purpose public engines are pretty good at doing a fair-to-middling job on giving you the gist of whatever you throw at it in any subject area.

I disagree. What they give you is like a roulette. Worse. Show me two or three longer sentences where any engine gives you the gist - you're saying you can trust it to tell you a minimum of correct information. Let's start with some legal paragraph. Or medical. That kind of gist is being sold as MT, ready for post-editing.

Hi Bernhard,
I've read many of your posts/comments over the years on Proz and can see that you are clearly an MT-skeptic, but also a concerned and admirable defender of the profession. There is probably a lot we would agree on, but it seems like MT isn't an area where we would find lots of common ground. To clarify my quote above, I meant to suggest a scenario where all that is needed is a vague idea of the meaning, probably by a member of the general public, not a professional translator who should already be able to read a source text and understand it, and not in things like legal translation and medical translation. However, my opinion is that even GT can be a good starting point for a professional while researching terminology. The reason is that many times I will look for reliable bilingual resources online via traditional search, and assuming that GT is trained on the same bilingual resources that the search engine crawls, it seems natural that it should also be a helpful tool.

Bernhard Sulzer wrote:
Not sure what's so new about this new engine. You can build your own TM (translation memory) already that you can trust and work with.

The idea behind tools like this is to use those same translation memories in a more powerful way than simply searching for exact matches and fuzzy matches for entire segments. Any sentence might contain shorter phrases or other units that might have been translated in other sentences. In my fields of work, I tend to see many of the same phrases, terms, very frequently. Termbases are one way to handle this, but they require a lot of time and manual curating.
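To make the contrast with whole-segment matching concrete, here is a minimal sketch of a fuzzy-match lookup. It assumes a token-level similarity from Python's `difflib` as a stand-in for a CAT tool's fuzzy percentage (real tools use their own formulas), and the names `fuzzy_score` and `best_match` and the 0.75 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def fuzzy_score(a: str, b: str) -> float:
    """Token-level similarity in [0, 1] between two segments."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def best_match(segment: str, tm: List[Tuple[str, str]],
               threshold: float = 0.75) -> Optional[Tuple[float, str, str]]:
    """Return (score, source, target) of the best TM entry at or above
    the threshold, or None if the whole segment is not similar enough."""
    best: Optional[Tuple[float, str, str]] = None
    for src, tgt in tm:
        score = fuzzy_score(segment, src)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, src, tgt)
    return best
```

A new segment such as "el comprador inspeccionará las mercancías" scores well below the threshold against a stored "el proveedor entregará las mercancías", so a whole-segment lookup returns nothing even though "las mercancías" has been translated before. That leftover is what subsegment and MT-based approaches try to reuse.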

Bernhard Sulzer wrote:
...You're going to be replacing the natural way of translating with constant checking and re-checking, cause you can never be sure that what you get is correct. Whatever engine or TM or whatever it's called you use, if it's not 100% the same text you had before, you will have to check, accept or edit or replace. But I get the impression we're being sold the ultimate translation machine built on our powerful own words that will just spit out the right stuff and save us lots of time....

I know this is what some are looking for...a perfectly reliable translation machine..and there seems to be a lot of hyping in the media, etc. about this mythical machine that is soon to come, and who knows? I'm no futurist, but what is clear is that there are technical tools available right now that have proven themselves to speed up my work and in fact even improve quality in situations where things like terminology consistency, etc. are important. For me, they don't have to be perfect to save time. But this might have to do with my particular working style, into which these kinds of tools fit nicely.

[Edited at 2015-09-23 18:49 GMT]

Meta Arkadia
Local time: 12:34
English to Indonesian
+ ...

Green

Sep 23, 2015

tahoar wrote:
...we'll have a true fundraising Indiegogo campaign to raise enough money to buy a top-end Mac machine.

I still use the cheapest 27" iMac, rotational HDD, model late 2009. And it's more than fast enough for my work, including using MT features in my CAT tool. Maybe I should consider starting developing so I can let others pay for a top-end Mac.

Cheers,

Hans

Bernhard Sulzer

United States
Local time: 01:34
English to German
+ ...

Tools versus human translators

Sep 23, 2015

Hi Patrick,

Thanks for your input.
Let me say just a few things.

If GT helps you, I'll accept that. I really don't use it, I prefer to do my own research if the need arises. No matter what it gives you, you have to be careful using it, but I'm sure you know that. I hardly ever tried it. I remember a project where I was under enormous time pressure and I used it but had to be very careful, it was way off on many results and involved heavy editing. In any case, usin... See more

Bernhard Sulzer wrote:
Not sure what's so new about this new engine. You can build your own TM (translation memory) already that you can trust and work with.

Patrick Porter wrote:
The idea behind tools like this is to use those same translation memories in a more powerful way than simply searching for exact matches and fuzzy matches for entire segments. Any sentence might contain shorter phrases or other units that might have been translated in other sentences. In my fields of work, I tend to see many of the same phrases, terms, very frequently. Termbases are one way to handle this, but they require a lot of time and manual curating.

If a tool is helpful, why not use it, as long as you don't have to pay enormous amounts of money. Tools should support us and help us do a better job, they shouldn't be a reason for clients/agencies to expect or even demand lower rates. This thinking is however very prevalent with some people supporting the development of better tools for translators. They would give anything to eliminate us completely, to make the tool into the translator. But we're far from being eliminated.

Patrick Porter wrote:
I know this is what some are looking for...a perfectly reliable translation machine..and there seems to be a lot of hyping in the media, etc. about this mythical machine that is soon to come, and who knows? I'm no futurist, but what is clear is that there are technical tools available right now that have proven themselves to speed up my work and in fact even improve quality in situations where things like terminology consistency, etc. are important. For me, they don't have to be perfect to save time. But this might have to do with my particular working style, into which these kinds of tools fit nicely.

I am not against tools, I use them myself. But MT shouldn't be advertised as a replacement for humans because it certainly is not. One main point is it doesn't "translate." But using term bases and TMs can be very useful, no doubt.
Gist or no gist, I only provide professional translations for people who want and need them. That's the result I sell, the quality text. I don't sell what my tools helped me with. I am sure you agree. No matter what tools I use, the quality I provide will come at an adequate price. That price doesn't just dwindle because we use CAT and other tools.

[Edited at 2015-09-24 03:43 GMT]

Meta Arkadia
Local time: 12:34
English to Indonesian
+ ...

I think...

Sep 24, 2015

Bernhard Sulzer wrote:
But MT shouldn't be advertised as a replacement for humans because it certainly is not. One main point is it doesn't "translate."

... most of us got that by now. However, "the rest of us" is trying to use machine translation technology to leverage our own TMs and relevant - and only relevant - other resources (like the DGT to very humanly translate EU BS) by either creating our own "machine," or by using MT technology in our CAT tools. This is what this subject is about.

Amen.

Hans

[Edited at 2015-09-24 00:07 GMT]

Tom Hoar (X)
United States
Local time: 01:34
English

subsegment auto-assemble vs statistical

Sep 24, 2015

I think this would be a great topic in the "CAT Tools Technical Help" forum because it transcends MT into the greater CAT tool experience. Maybe there they have a non-MT crowd with subsegment experience that can address how it works.

Tom Hoar (X)
United States
Local time: 01:34
English

It's about tools supporting human translators... not replacing them.

Sep 24, 2015

In another thread, I mentioned the 1966 report LANGUAGE AND MACHINES - COMPUTERS IN TRANSLATION AND LINGUISTICS by the Automatic Language Processing Advisory Committee (ALPAC). Here are links to the original PDF report and the Wikipedia article:

http://www.nap.edu/html/alpac_lm/ARC000005.pdf
https://en.wikipedia.org/wiki/ALPAC

This report defines machine translation as:

going by algorithm from machine-readable source text to useful target text, without recourse to human translation or editing.

The phrase "without recourse to human translation or editing" means to me that the intention to replace human translators is inseparable from MT. I think Bernhard's and others' comments rightfully reflect this perception based on this definition.

By inference, a system that includes recourse to human translators for review and editing is not MT (note I intentionally didn't use the term post-editing) -- even if a subsystem (algorithm) converts machine-readable source text to target text.

We specifically design Slate Desktop for human translators to proofread and edit its target language drafts. This claim, however, is not unique to Slate Desktop. Our unique claims are: 1) the ease with which translators can personalize the engine's performance, and 2) the privacy that comes from doing the work on your own PC.

Note that we don't make claims about the quality of Slate Desktop's draft output. Oddly enough, Slate Desktop has very little control over the quality of the drafts. The translator's expertise and TMs control quality. This is like how Photoshop has very little control over the quality of the graphics; that is mostly in the hands of the graphic designer. As translators use the tools, they learn how their choice of TMs to put into an engine affects quality. Our experience has been that a beginning translator with a good inventory of personal TMs will achieve results equal to or better than most Cloud-based engines without any of the associated privacy concerns.

There are way too many variables to discuss in one reply post, such as quality variation by language pair and bad habits from previous experience. However, if you look, you can find independent testimony to the effects of personalizing an engine.

Bernhard, I respect your skepticism and "Let's wait and see" stance. The door's always open for great exchanges like this. Thanks!

[Edited at 2015-09-24 07:48 GMT]

2nl (X)

Netherlands
Local time: 07:34

I'll check whether a connector for CafeTran has been added already

Sep 26, 2015

tahoar wrote:

Of course, if you want to voice your support with a contribution to the campaign, it's easier for us to hear your voice

Okay, but let me first check whether a connector/perk for CafeTran has been added.

2nl (X)

Netherlands
Local time: 07:34

MT cannot see context

Sep 26, 2015

Tom,

I've just watched your webinar on Slate Desktop and near the end you say 'Computers cannot see context'. Are you referring to 'real-life context' or to the context of previous segments in the current translation project?

Let me clarify that: In CafeTran I can instruct the auto-assembling feature to use a specific translation for a certain word (source term). (See: http://cafetran.wikidot.com/inserting-alternative-target-terms-via-the-context-menu)

Can Slate Desktop, or its underlying engine (Moses), put translations for terms that you have approved of on top of the stack, so that they will be used for the rest of the project?

Hans

[Edited at 2015-09-27 00:20 GMT]
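The behaviour asked about here, keeping an approved term "on top of the stack" for the rest of a project, can be sketched generically as follows. This is a hypothetical illustration only: the class `ProjectTerms` and its methods are invented for the example and do not describe how CafeTran, Slate Desktop or Moses handle approved terms.

```python
from typing import Dict, List


class ProjectTerms:
    """Project-level store of approved term translations (hypothetical)."""

    def __init__(self) -> None:
        self._pinned: Dict[str, str] = {}

    def approve(self, source_term: str, target_term: str) -> None:
        """Remember the translator's choice for the rest of the project."""
        self._pinned[source_term.lower()] = target_term

    def rank(self, source_term: str, candidates: List[str]) -> List[str]:
        """Return candidates with the approved translation first, if any.

        The approved term is placed first even when the engine did not
        propose it; the order of the other candidates is preserved.
        """
        pinned = self._pinned.get(source_term.lower())
        if pinned is None:
            return list(candidates)
        return [pinned] + [c for c in candidates if c != pinned]
```

For example, after `approve("schroef", "screw")`, a candidate list of "bolt", "screw", "propeller" for "schroef" comes back with "screw" first. The design choice is to re-rank after the engine has produced its candidates, so the engine itself needs no retraining.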

Milan Condak

Local time: 07:34
English to Czech

Apertium offline in OmegaT

Sep 26, 2015

2nl wrote:

Tom,

I've just watched your webinar on Slate Desktop ...
Hans

[Edited at 2015-09-26 14:37 GMT]

Hi,

I also watched the webinar on Slate Desktop. I am using OmegaT for HTML translation.

In OmegaT, several MT engines can be enabled at the same time.

I translate into Czech. I mostly have MyMemory (Google Translate) enabled in OmegaT. Using Slate Desktop is similar to using Apertium offline.

Apertium is an online MT service that covers some language pairs. Czech is not supported yet. It is possible to convert an MT engine into a *.jar file and run it offline in OmegaT (see the picture at the bottom of this page):

http://www.condak.net/cat_other/omegat/2013-09-17/cs/02.html

In September 2013 I tested the offline version of Apertium in OmegaT:

http://www.condak.net/cat_other/omegat/2013-09-17/cs/03.html

The *.jar file can also run without being integrated into OmegaT:

http://www.condak.net/cat_other/omegat/2013-09-17/cs/04.html

I hope the *.jar files are OS-independent, since they only need Java.

On the last page, 05.html, you can see that the Apertium *.jar feature is bidirectional (PL-CS and CS-PL).

My second remark is about reusing the ready-made pairs of files from the Opus project that are prepared for Moses. This base data does not need to be converted from TMX. Is that step skipped?

Milan,
the hobbyist

Michael Beijer

United Kingdom
Local time: 06:34
Member (2009)
Dutch to English
+ ...

Moses files on Opus site

Sep 26, 2015

Milan Condak wrote:

2nl wrote:

Tom,

I've just watched your webinar on Slate Desktop ...
Hans

[Edited at 2015-09-26 14:37 GMT]

Hi,

I also watched the webinar on Slate Desktop. I am using OmegaT for HTML translation.

[…]

My second remark is about reusing the ready-made pairs of files from the Opus project that are prepared for Moses. This base data does not need to be converted from TMX. Is that step skipped?

Milan,
the hobbyist

I'm also planning to use those (many, very large) Moses files offered on the Opus site.
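For readers wondering what the TMX conversion step involves: as I understand it, the Moses-format downloads are two plain-text files with one sentence per line, where line N of one file is the translation of line N of the other. A minimal sketch of getting from TMX to that layout with Python's standard library could look like this. The function names and file paths are hypothetical, and the sketch ignores inline tags, deduplication and cleaning, which real training data would need.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_pairs(tmx_text: str, src: str, tgt: str) -> List[Tuple[str, str]]:
    """Extract (source, target) segment pairs from a TMX document.

    Language codes are compared on their primary subtag only,
    so "en-US" matches "en".
    """
    root = ET.fromstring(tmx_text)
    pairs: List[Tuple[str, str]] = []
    for tu in root.iter("tu"):
        segs: Dict[str, str] = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Flatten inline markup and collapse whitespace so each
                # segment fits on exactly one line.
                text = " ".join("".join(seg.itertext()).split())
                segs[lang.split("-")[0]] = text
        if segs.get(src) and segs.get(tgt):
            pairs.append((segs[src], segs[tgt]))
    return pairs


def write_moses(pairs: List[Tuple[str, str]], src_path: str, tgt_path: str) -> None:
    """Write line-aligned source and target files, one segment per line."""
    with open(src_path, "w", encoding="utf-8") as fs, \
            open(tgt_path, "w", encoding="utf-8") as ft:
        for s, t in pairs:
            fs.write(s + "\n")
            ft.write(t + "\n")
```

Files that are already in the line-aligned layout would not need this step, which is presumably why the conversion from TMX can be skipped for them.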

Richard Hill

Mexico
Local time: 00:34
Member (2011)
Spanish to English

Post-editing

Sep 26, 2015

Mohamed wrote:

It's geared towards translators, not only big companies or LSPs.

Mohamed

[Edited at 2015-09-04 21:50 GMT]

While this software sounds interesting, I wonder if we'll see agencies using it with their sometimes massive arrays of TMs, and then posting the resulting texts as post-editing jobs, as is currently the case with MT.

Jeff Allen

France
Local time: 07:34
Multiple languages
+ ...

context in the framework of computing and MT systems

Sep 26, 2015

2nl wrote:

Tom,

I've just watched your webinar on Slate Desktop and near the end you say 'Computers cannot see context'. Are you referring to 'real-life context' or to the context of previous segments in the current translation project?

Hans

[Edited at 2015-09-26 14:37 GMT]

Hi Hans,
I didn't see the webinar. I can see how the statement "computers cannot see context" is ambiguous, as I have made the same statement to cover both of the interpretations that you mention. I made it in general about computer science topics for various software and systems, and also with regard to MT.

Statistical MT (SMT) has made it possible to extend beyond simple linear segment processing, but I would not say that this has resolved overall contextual translation processing needs.

1) See also this thread in the ProZ forums, where I explained a number of these items in 2005:
http://www.proz.com/forum/translator_resources/37742-is_this_the_future_automatic_simultaneous_translation_within_5_years-page2.html

In that thread, I refer to the doctoral thesis by Eric Crestan which introduced some very early results of semantic disambiguation of statistical processing with rule-based and knowledge-based (semantic framework) MT system implementations in a corporate tech doc and translation production environment. I was the president of his thesis defense committee in 2000.
http://www.mail-archive.com/[email protected]/msg00259.html

I had also worked onsite with writers and translators several years before, on the project that he was able to assist with for his doctoral work.

His thesis is now available for free download at this link:
http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=pdf&doi=10.1.1.130.2598

2) Some of the most in-depth work on anaphora and cataphora (these are terms used in linguistics referring to backward and forward context) has been done by Ruslan Mitkov and his team at the University of Wolverhampton in the UK. I remember funding one of his earliest projects on this for MT back in 1999-2001 while working as Technical Director at ELDA.
Just look up Mitkov anaphora resolution on the internet and you will come across various publications.

Jeff
