You ask the tool to extract the terms from a selected text or project, and it basically checks for words that are repeated a set number of times (usually 3, but you can change this in the settings). You can adjust the number of characters in the selected words or phrases too. Once the tool has finished extracting, it gives you a list of 'candidate terms' which you can accept or reject. Once you've done that, you can click 'export to termbase' or something similar and all the terms you accepted go into a termbase of your choosing.
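For what it's worth, the repetition check described above can be sketched in a few lines of Python. This is only a rough illustration of the idea, not how memoQ or any other tool actually does it: the function name `extract_candidates`, the parameters `min_count` and `min_chars`, and the single-word tokenising are all my own assumptions.

```python
import re
from collections import Counter

def extract_candidates(text, min_count=3, min_chars=1):
    """Return (word, count) pairs for words repeated at least min_count times.

    Hypothetical helper: real tools also look at multi-word phrases.
    """
    # Lowercase the text and pull out runs of letters as "words".
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Count only words that meet the character-length setting.
    counts = Counter(w for w in words if len(w) >= min_chars)
    # Keep words that reach the repetition threshold, most frequent first.
    return [(w, n) for w, n in counts.most_common() if n >= min_count]

# Made-up sample sentence, just to show the call.
sample = "the valve and the pump and the flange and you check the valve"
print(extract_candidates(sample))
print(extract_candidates(sample, min_count=2, min_chars=4))
```

The list that comes back is what you would then accept or reject by hand before exporting.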
I think it's quite obvious from these criteria that you'll get lots of 'you', 'and', 'in' etc., as these are the words we use most... Which means you can spend ages going through the list. On the other hand, obscure things (which you might want in your termbase for next time, precisely because they're obscure and you won't remember them) won't be flagged as potential terms, because they aren't repeated the set number of times. Of course you can set the required number of repeats to lower than 3, but then you'll end up going through practically every single word and phrase in the text...
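To make that trade-off concrete, here is a small Python sketch. The sample text is made up, 'grommet' stands in for the obscure term that only turns up once, and `candidates` is a hypothetical helper of mine, not anything taken from a real extraction tool.

```python
import re
from collections import Counter

# Made-up sample text; "grommet" is the obscure term that occurs only once.
SAMPLE = (
    "You fit the seal and you check the seal and you tighten the bolt. "
    "Then you fit the grommet in the housing and you check the bolt."
)

def candidates(text, min_count):
    """Set of words repeated at least min_count times (hypothetical helper)."""
    counts = Counter(re.findall(r"[^\W\d_]+", text.lower()))
    return {w for w, n in counts.items() if n >= min_count}

at_three = candidates(SAMPLE, 3)  # the usual threshold
at_one = candidates(SAMPLE, 1)    # threshold lowered as far as it goes

print(sorted(at_three))                             # what survives at 3 repeats
print("grommet" in at_three, "grommet" in at_one)   # is the obscure term caught?
print(len(at_three), len(at_one))                   # how long each list is
```

With this sample, the threshold of 3 keeps only the everyday words and misses 'grommet', while the threshold of 1 catches it but hands you every distinct word in the text to wade through.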
Philippe Etienne wrote:
I haven't tried the MemoQ feature, but that's basically my opinion too.
From what I've experienced from toying around with Multiterm Extract/PhraseFinder or whatever SDL used to call them, I gather they may be helpful on large projects with ample deadlines and direct contact with an end customer: you create a glossary with recurring words and expressions of major importance before starting translation and submit it to the end customer for review and approval.
Then you have a reliable glossary before actually starting the translation, which can be handy to avoid post-translation term changes and time-wasting.
But these tools often require a lot of manual checking/validating, so I'd say that below 100k words, there is little point spending time on term extraction, and on-the-fly termbase-populating (how's that?) is usually the most time-effective and reliable method.
You can also use them to build glossaries from legacy material, like TMs or bilingual docs, where candidate target terms are also extracted and matched to source. With the same limitations.
To me it's mainly a tool for agencies or the proverbial "terminologists", but I understand that software companies want freelance translators to believe that it saves them time.
I think I'd lose the will to live if I had to go through a term-extraction list for a 100k job... Though I would contemplate aligning and then using LiveDocs in MemoQ with on-the-fly concordance instead (SDL must have it too: it's a corpus of bilingual or monolingual texts you select to work with in your project. It basically works like a TM without being a TM).
Someone else said something about the translator knowing best what should be in the termbase. I agree... Going by what I've seen in glossaries from agencies (I mentioned the word 'and' for a reason), my guess is they just click 'extract terms' and then 'export to termbase' without looking over the list.