Déjà vu X2 Pro: pros and cons of Big Mama TM
Thread poster: Pavel Tsvetkov
Pavel Tsvetkov
Bulgaria
Member (2008)
English to Bulgarian

Moderator of this forum
Jul 26, 2012

Dear Colleagues,

The best strategy for the number of translation memories to use has often been discussed.

1) Should one use just one Big Mama TM?
2) Should TMs be client-based?
3)... or subject-based?

So far, as a Trados power user, I have used option 2). But with the new technologies now available in modern CAT tools (AutoWrite, for example), it also seems beneficial to have just one TM and maybe even just one termbase.

Please share your thoughts.

Kind Regards,
PTs


 
Victor Dewsbery
Germany
German to English
Reference Jul 26, 2012

Hi Pavel,

I have already made a few comments on this issue here:
http://language-mystery.blogspot.de/2012/01/12-facts-hints-and-ideas-on-databases.html

I hope the article is helpful.


 
Selcuk Akyuz
Türkiye
English to Turkish
Subject and Client codes and AW Jul 26, 2012

Hi Pavel,

You already know about subject and client codes in DVX projects, TMs and TBs. They are indeed very useful when translating: the order of the results in the AutoSearch window is partially based on this metadata.

But DeepMiner does not take these codes into consideration when making suggestions with the AutoWrite feature. In any case I prefer the Big Mama - Big Papa approach.

The Client attribute should be used with care. The client is not the agency but the end client. For example, both ABC Agency and XYZ Agency may send Microsoft-related jobs. Then your client is Microsoft.

So do not use

001 - ABC Agency
002 - XYZ Agency

OR

001 - Microsoft - ABC Agency
002 - Microsoft - XYZ Agency

BUT only

001 - Microsoft


On the other hand, you may work with only one agency that sends you different jobs, mainly fashion- and medicine-related but sometimes other projects. In that case you can use:

001 - ABC Agency - General
002 - ABC Agency - Fashion Projects
003 - ABC Agency - Medical Projects
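
The coding rule above (one code per end client, falling back to agency plus subject area when there is no named end client) could be sketched in Python roughly as follows. This is only an illustration: the `Job` record, the `client_label` and `assign_client_codes` helpers, and the automatic three-digit numbering are assumptions made for the example, not anything Déjà Vu itself provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Job:
    agency: str                 # who sent the job
    end_client: Optional[str]   # who the text is for; None if no end client is named
    area: str = "General"       # rough subject area, used only when end_client is None


def client_label(job):
    """Label a job by its end client, falling back to 'agency - area'."""
    if job.end_client:
        return job.end_client
    return f"{job.agency} - {job.area}"


def assign_client_codes(jobs):
    """Give each distinct label one three-digit code, in order of first appearance."""
    codes = {}
    for job in jobs:
        label = client_label(job)
        if label not in codes:
            codes[label] = f"{len(codes) + 1:03d}"
    return codes


# Hypothetical jobs: two agencies sending work for the same end client,
# plus one job where only the agency and the subject area are known.
jobs = [
    Job("ABC Agency", "Microsoft"),
    Job("XYZ Agency", "Microsoft"),
    Job("ABC Agency", None, "Fashion Projects"),
]
client_codes = assign_client_codes(jobs)
for label, code in client_codes.items():
    print(f"{code} - {label}")
```

Both Microsoft jobs end up under a single code regardless of which agency sent them, which is the point of the rule.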


Hope it is clear.

Selcuk


 
Olaf Reibedanz
Colombia
Member (2003)
English to German
Go for a combination of Big Mama + client specific TMs Jul 26, 2012

Hi Pavel,

Why don't you use both:
- One Big Mama TM (or 2 or 3 thematic TMs for very broad subjects)
- Individual TMs for specific clients

That's what I do and it works very well. What I use is:
- Dozens of client-specific TMs (one TM for each combination of end client/agency and language pair). For example: Translation agency XY_Barclays_ENG-GER
- 1 Finance Big Mama for all texts related to finance (which is my main area of specialisation).
- 1 General Big Mama for all other areas

When I do a project for a client in finance for whom I have already worked in the past, I usually first do a Pretranslate only with the client-specific TM, and after that, I may add the Finance Big Mama TM.

When, on the other hand, I have never worked for that client before, I use my Finance Big Mama right from the start (including during pretranslation).

And when I do a project for a new client (for whom I never worked before) in another area (outside finance), I use both the Finance Big Mama and the General Big Mama (just in case I get some useful entries, even if there may not be many of them) but with a higher minimum score under Tools>Options.
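
The three cases just described can be written down as a small decision function. This is a rough sketch only: the `TmPlan` record, the `plan_tms` function and the TM names are hypothetical, and the case of a known client outside finance is not described above, so the sketch refuses to guess at it.

```python
from dataclasses import dataclass, field

FINANCE_BIG_MAMA = "Finance Big Mama"
GENERAL_BIG_MAMA = "General Big Mama"


@dataclass
class TmPlan:
    from_start: list                                     # TMs used from the start, including pretranslation
    add_afterwards: list = field(default_factory=list)   # TMs that may be added after pretranslation
    raise_min_score: bool = False                        # use a higher minimum score under Tools>Options


def plan_tms(area, client_tm=None):
    """Pick TMs for a job. client_tm is the client-specific TM name, or None for a new client."""
    if area == "finance" and client_tm:
        # Known finance client: pretranslate with the client TM only, Big Mama later.
        return TmPlan(from_start=[client_tm], add_afterwards=[FINANCE_BIG_MAMA])
    if area == "finance":
        # New finance client: Finance Big Mama right from the start.
        return TmPlan(from_start=[FINANCE_BIG_MAMA])
    if client_tm is None:
        # New client outside finance: both Big Mamas, but with a stricter match threshold.
        return TmPlan(from_start=[FINANCE_BIG_MAMA, GENERAL_BIG_MAMA], raise_min_score=True)
    raise ValueError("known client outside finance: case not described in the post")


known = plan_tms("finance", "Translation agency XY_Barclays_ENG-GER")
new_finance = plan_tms("finance")
new_other = plan_tms("engineering")
print(known)
print(new_finance)
print(new_other)
```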

It is very nice to have a Big Mama (I sometimes even use it as a dictionary when answering Kudoz questions). However, in my opinion, if you ONLY work with Big Mama, you have several drawbacks:

1) You put apples and pears into one basket. For example, if you have a text from the engineering field, most of your entries will be useless for a financial context, and you will end up with useless entries that clog your AutoSearch or concordance search window.

Of course, if you only work in one area, like Finance, Law, Engineering, etc. this may be less of a problem, but I still like having client-specific TMs just to be able to ONLY work with the client-specific TM if I decide to do so. Besides, this also makes it easier to swap TMs with colleagues (in case you work together with others on the same project).

2) After a while your TM will reach the maximum size.

3) It may slow down the programme.

Therefore, for me, the combination Big Mama + client-specific TMs is the best solution that gives me the greatest flexibility.

Hope this helps!

Olaf


[Edited at 2012-07-26 21:20 GMT]


 
MikeTrans
Germany
Italian to German
The more context, the better, but computer performance reaches its limit Jul 27, 2012

Hi,

Personally, I use a single TM per language pair, for which I build a precise catalogue of all the attributes it contains: Domain, Subject, Project and Client. When searching for expressions, it's important to read the context with these attributes in mind. Also, with a catalogue (I keep it in TMX format, made with the TM manager Olifant) I have a precise idea of what my TM contains, and I can export only the relevant segments if needed.
With this system, even after 20 years of translations, the TM should stay in the region of 100,000-120,000 segments, assuming that you only allow *quality* content to be kept.
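
As an illustration of what such an attribute catalogue might look like, here is a minimal Python sketch that counts the attribute values found in a TMX file. It assumes the attributes are stored as `<prop>` elements on each `<tu>`, which TMX allows, but the property names used here (`x-client`, `x-subject`) and the sample data are made up; a real export from Déjà Vu or Olifant may name them differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up TMX sample; the prop type names are hypothetical.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="it" datatype="plaintext"/>
  <body>
    <tu>
      <prop type="x-client">Client A</prop>
      <prop type="x-subject">Medicine</prop>
      <tuv xml:lang="it"><seg>Buongiorno</seg></tuv>
      <tuv xml:lang="de"><seg>Guten Tag</seg></tuv>
    </tu>
    <tu>
      <prop type="x-client">Client A</prop>
      <prop type="x-subject">Fashion</prop>
      <tuv xml:lang="it"><seg>Grazie</seg></tuv>
      <tuv xml:lang="de"><seg>Danke</seg></tuv>
    </tu>
  </body>
</tmx>"""


def build_catalogue(tmx_text):
    """Count how many translation units carry each (attribute, value) pair."""
    root = ET.fromstring(tmx_text)
    catalogue = Counter()
    for tu in root.iter("tu"):
        for prop in tu.findall("prop"):
            catalogue[(prop.get("type"), (prop.text or "").strip())] += 1
    return catalogue


catalogue = build_catalogue(SAMPLE_TMX)
for (attribute, value), count in sorted(catalogue.items()):
    print(f"{attribute}: {value} ({count} segments)")
```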

For me, Big Mamas are TMs like the DGT (release 2007: 350,000 segments; release 2011: 1,500,000 segments!), EMEA (a pharmaceutical and medical DB of 350,000 segments, most of them broken because they were badly aligned) or a compilation of EuroParl (in about the same size range).
The problem with the DGT TM is that the content is not ordered by subject but is instead a series of processed documents: in one place there is the evaluation of a chemical product, and in the next lines a document about agricultural legislation, etc.

It would be very useful to extract the content by subject, but I have given up trying; it's almost impossible. What you can do is extract by keyword using X-Bench: the search box accepts regular expressions long enough for about 1,000 query terms. I first extract whatever appears more than 3 times in the project to be translated, and then I send those terms as a query to X-Bench. The result: I work with only 40% of the Big Mama's content, which is not a huge difference, though.
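
The keyword step can be approximated outside any CAT tool. The Python sketch below collects the words that occur more than 3 times in a project text and joins them into one alternation-style regular expression, capped at 1,000 terms. The function names, the minimum word length (used here to skip short function words) and the sample text are assumptions for the example, and whether X-Bench accepts exactly this regex syntax has not been checked here.

```python
import re
from collections import Counter


def frequent_terms(text, min_length=4, max_terms=1000):
    """Return words occurring more than 3 times, most frequent first."""
    words = re.findall(r"\w+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_length)
    # Take the most frequent candidates first, then keep those above the threshold.
    return [w for w, n in counts.most_common(max_terms) if n > 3]


def build_query(terms):
    """Join the terms into a single alternation pattern such as 'valve|pump'."""
    return "|".join(re.escape(t) for t in terms)


# Made-up project text: 'valve' occurs 4 times, 'pump' only 3 times.
project_text = "valve " * 4 + "pump " * 3 + "the " * 10
terms = frequent_terms(project_text)
query = build_query(terms)
print(terms)
print(query)
```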

Performance:
Trados Studio, MemoQ and DVX2 all handle TMs of this size at an acceptable speed, but the limit is reached if you load two of them, even read-only.

Termbases:
In DVX2, termbases are IMHO much more important than TMs: a large TB for the subject will help you assemble your sentences much more than any translation memory could. So my philosophy here is: one single TB per specific subject, regardless of size.

Greets,
Mike


 
Pavel Tsvetkov
Bulgaria
Member (2008)
English to Bulgarian

Moderator of this forum
TOPIC STARTER
Thank you... Jul 30, 2012

...for sharing your thoughts and opinions, everybody!

Kind Regards,
PTs


 

