TMLookup
Thread poster: FarkasAndras

FarkasAndras  Identity Verified
Local time: 18:10
English to Hungarian
+ ...
TOPIC STARTER
Yes it is Jan 5, 2016

Michael Beijer wrote:

Strange. I actually find those 895 hits in my screenshot very useful. It's far from "useless and time-consuming". In fact, that is exactly how I use TMLookup: if I can't think of a translation for a word when translating, I select it and press my TMLookup shortcut. The program immediately springs into action and shows me if the relevant term is present in my extremely large database consisting of TMXs. 895 hits is great, and just what I want in such a case.
However, sometimes, it would be useful to be able to filter on individual TMXs, and thus narrow down the results.

I'm also happy to get hundreds of hits from a db search. I look at the first dozen, and if there is a mix of two or three translations that all look acceptable for my context, I can do a quick numbers comparison: do dual-language searches with each of the possible translations and see which is more frequent. If translation A co-occurs with my source term 6 times, translation B 24 times and translation C 657 times, then C is the most likely candidate. If I got 5 hits in total with a distribution of 1-1-2 I wouldn't have much to go on.
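
To make the numbers comparison above concrete, here is a minimal Python sketch. It does not use TMLookup at all: the segment pairs, the placeholder strings `term`, `trans_a`, `trans_b`, `trans_c` and the function name `count_cooccurrences` are all made up for illustration, and plain substring matching is an assumption standing in for a real dual-language search.

```python
from collections import Counter


def count_cooccurrences(pairs, source_term, candidates):
    """Count segment pairs where source_term co-occurs with each candidate."""
    src = source_term.casefold()
    counts = Counter({cand: 0 for cand in candidates})
    for source, target in pairs:
        # Skip pairs whose source side does not contain the search term.
        if src not in source.casefold():
            continue
        tgt = target.casefold()
        for cand in candidates:
            if cand.casefold() in tgt:
                counts[cand] += 1
    return counts


# Hypothetical data reproducing the 6 / 24 / 657 distribution from the post.
pairs = (
    [("a term here", "trans_a here")] * 6
    + [("a term here", "trans_b here")] * 24
    + [("a term here", "trans_c here")] * 657
    + [("something else", "trans_c here")]  # ignored: source term is absent
)

counts = count_cooccurrences(pairs, "term", ["trans_a", "trans_b", "trans_c"])
best, best_count = counts.most_common(1)[0]
print(best, best_count)
```

With a 1-1-2 distribution the same function would still return a "winner", which is exactly why the raw counts matter more than the ranking.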
It's pretty easy to restrict searches by adding search terms as needed, and TMLookup even supports Boolean searches, so you can do things like:
train NOT rail NOT transport NOT passenger
or
"carry out" OR "implement" strategies
You can also add terms in both languages at the same time AND you can do regex searches. Honestly, I'm not sure what other possible options exist for refining searches. Fuzzy searches would improve the overall performance, but that would come at the cost of speed and precision, as well as increased complexity and probably DB size.
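
For anyone curious what Boolean searches of this kind look like at the database level, here is a rough sketch using Python's built-in `sqlite3` module. It assumes an SQLite build with the FTS5 extension compiled in, which is not necessarily what TMLookup's own database uses. The table name `tm`, the helper names and the sample segments are invented for the example. The first query is written with parentheses so that it does not rely on how chained NOTs are grouped.

```python
import sqlite3


def build_index(segments):
    """Create an in-memory FTS5 table holding one segment per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE tm USING fts5(source)")
    conn.executemany("INSERT INTO tm(source) VALUES (?)", [(s,) for s in segments])
    return conn


def search(conn, expression):
    """Return the segments matching a Boolean expression, in insertion order."""
    rows = conn.execute(
        "SELECT source FROM tm WHERE tm MATCH ? ORDER BY rowid", (expression,)
    )
    return [row[0] for row in rows]


# Hypothetical segments, not taken from any real TMX.
SEGMENTS = [
    "train the new staff",
    "the train uses the rail network",
    "passenger train timetable",
    "we carry out audits",
    "they implement strategies",
    "carry the bags out",
]

conn = build_index(SEGMENTS)

# Same intent as: train NOT rail NOT transport NOT passenger
no_rail = search(conn, "train NOT (rail OR transport OR passenger)")

# A quoted phrase only matches adjacent words, so "carry the bags out" is skipped.
either = search(conn, '"carry out" OR implement')

print(no_rail)
print(either)
```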


 

Meta Arkadia
Local time: 00:10
English to Indonesian
+ ...
The One Word Jan 5, 2016

FarkasAndras wrote:
I'm also happy to get hundreds of hits from a db search.


I'm happy when my CAT tool automatically inserts the right word for me.

Cheers,

Hans


 

FarkasAndras  Identity Verified
Local time: 18:10
English to Hungarian
+ ...
TOPIC STARTER
Really? Jan 5, 2016

Your CAT doesn't have a translator's professional knowledge and judgement, so most of the time it doesn't know what the correct word is. Hence the need for human research and, by extension, the need for translators.

 

Michael Beijer  Identity Verified
United Kingdom
Local time: 17:10
Member (2009)
Dutch to English
+ ...
your "One Word" is a myth (and a very harmful one at that) Jan 5, 2016

The One Word

Meta Arkadia wrote:

FarkasAndras wrote:
I'm also happy to get hundreds of hits from a db search.


I'm happy when my CAT tool automatically inserts the right word for me.

Cheers,

Hans


Personally, I think that that whole approach is complete and utter nonsense. A chimera if you will, and one that I have recently stopped chasing. I actually just wrote about this very same topic to a colleague this morning, which I will share here as it is relevant:

Email:
"My main aim for this year is to simplify all of my workflows: strip away absolutely everything that is not needed, so I can make as much money as possible. With this in mind, I have completely stopped adding thousands of little snippets to my various glossaries when translating, as I discovered that it was really slowing me down, and not really paying off since I do not rely on AA, but instead on a combination of Google Translate and dictation.

I do so many kinds of different jobs every week (patents, medical audits, contracts, marketing, education, annual reports, menus, websites, birthday cards, hot air balloons, space travel, logistics, travel brochures, etc. etc. etc.), that it is simply impossible to optimise an AA system to be of much use. On a more philosophical level, I was thinking the other day about how often over the last 20 years or so I have actually ever run across an identical sentence. And you know what? Hardly ever. It is therefore much faster for me to just translate from scratch, using the above-mentioned Google Translate and/or dictation. Also, just dictating the translation is pretty much always faster than looking at any fuzzy matches in little windows, no matter how well the CAT tool developer has managed to highlight the diffs.

Take, for instance, the 12,000-word marketing mumbo-jumbo PowerPoint file I am currently translating. It is basically just one gigantic collection of slogans, with lots of colours, and circles, and arrows, and creatively formatted BS. The client wants it back as a PowerPoint file, as they are going to use it in the boardroom immediately after receiving it from me. Now, I could open this in CafeTran (or any other grid-based CAT tool), but the amount of context I would lose in the process just doesn't make any sense. The text does not consist of "segments" (arbitrarily cut up at specific locations so my CAT tool can force my flowing, human text into its grid), but rather, meaningful units, divided across paragraphs, little text bubbles, marked with basic formatting, etc. In order to produce a well-flowing, good-looking translation, I really need to be able to see it in its original form. I am merging & joining around 40% of the "sentences", and often even move things around more than that, sometimes even adding my own headings. I am also extensively changing the formatting right and left, and it is so much easier and more fun to do so directly in PowerPoint (or MS Word), than by fiddling around with little tags. A grid-based CAT tool simply isn't designed to allow me to do this. They are great for certain things, but terrible at so many others.

I haven't entirely given up on grid-based CAT tools, but translating in the original documents (with the help of Felix, so that I do still have access to a TM) has been a real revelation over these last few weeks, and I really think that the quality of my translations has increased as a result. So, in conclusion, in my view, the number one thing that CAT tools should be investigating in 2016 and onwards is how to allow you to work in the original documents, while still offering you TM functionality, and basically just staying the hell out of your way."

Hans, I'm happy for you that CafeTran automatically inserts the right word for you. That's great. However, I think I will stick to my approach, which involves recreating a text that consists of larger units of meaning, which flows well and is pleasing to the ear. So much of the stuff translated these days is done in CAT tools, and it really shows in the final product: choppy, artificial-sounding target texts, which slavishly follow the structures of the source language and document. Sure, maybe you guys might save a millisecond or two with your robotic workflows, but I prefer producing quality.

Michael

PS: Totally agree with Andras: my clients pay me for my brain and my writing skills, not my ability to nerd around with software.

[Edited at 2016-01-05 15:19 GMT]

[Edited at 2016-01-05 17:52 GMT]


 
Post removed: This post was hidden by a moderator or staff member for the following reason: Pls use the specialized Cafetran forum. Thank you

Michael Beijer  Identity Verified
United Kingdom
Local time: 17:10
Member (2009)
Dutch to English
+ ...
Hmm, just had a closer look at your suggestion ... Jan 5, 2016

FarkasAndras wrote:

Just switch one of the search boxes to search the source field, and enter the name of the tmx. Leave your search term in the other box.
[…]


Hmm, just had a closer look at your suggestion, but this isn't quite what I'd need: I don't always necessarily know the name of the exact TMX that I want to "narrow down on". Actually, I would probably more often want to use this potential new feature to do the reverse.

E.g., I just ran a few searches today when translating and noticed that I was getting a lot of noise from a particular set of very large parent TMXs (specifically: OpenSubtitles2013_en-nl.tmx, which is huge, and contains a lot of garbage). I would like to use this feature in such cases to quickly remove these results. As far as I can tell, there would be two ways to quickly remove specific unwanted results:

1. By implementing some kind of negative filter
2. Or, by implementing a filter as I suggested originally, where I can just filter on a particular column, and thus just ignore the offending results by scrolling past them

What exactly does that new code of yours do?

Michael


 

FarkasAndras  Identity Verified
Local time: 18:10
English to Hungarian
+ ...
TOPIC STARTER
details Jan 5, 2016

Michael Beijer wrote:

Hmm, just had a closer look at your suggestion, but this isn't quite what I'd need: I don't always necessarily know the name of the exact TMX that I want to "narrow down on". Actually, I would probably more often want to use this potential new feature to do the reverse.

Well, if you get one hit from that TMX, then its name is in the table.

Michael Beijer wrote:
E.g., I just ran a few searches today when translating and noticed that I was getting a lot of noise from a particular set of very large parent TMXs (specifically: OpenSubtitles2013_en-nl.tmx, which is huge, and contains a lot of garbage). I would like to use this feature in such cases to quickly remove these results. As far as I can tell, there would be two ways to quickly remove specific unwanted results:

1. By implementing some kind of negative filter

Negative source filtering is a problem in the current program. You can do negative searches, but a negative term can't be the first in the search expression. So there is no practical way to do this unless all the TMXes you want results from start with a common word or letter. Then you can do tm NOT tm_full_of_junk.tmx (you could list all the letters of the alphabet but the search would probably take ages to run: a OR b OR c OR d ... NOT Opensubtitles).
As I said before this might be rectified in FTS5 but I wouldn't hold my breath.

Michael Beijer wrote:
2. Or, by implementing a filter as I suggested originally, where I can just filter on a particular column, and thus just ignore the offending results by scrolling past them

You can already filter on a particular column. With the search box. You just can't filter with a negative expression only. I'm getting severe déjà vu... But you seem to mean alphabetical sorting, which is a completely different concept. You can sort hits based on the last column. The setting is in the setup file so you have to manually edit the setup file and restart. Maybe I will add a menu option or some other kind of runtime switch. Clickable column headers would be the fanciest/most obvious but it would take a bit of fiddling and I doubt I will have the motivation to do it. I can add a menu entry in 15 minutes if there are no unexpected road bumps.

Michael Beijer wrote:
What exactly does that new code of yours do?

It adds two buttons alongside the highlight box. One hides all hits that fail to match a (regex-enabled) filter expression, the other hides all that do match it. So you can use it to hide all your opensubtitles hits. It looks at all the displayed columns indiscriminately.
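
The behaviour of the two buttons can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's actual Perl code: the function name `filter_hits`, the three-column row layout, the TMX name `EU_legal.tmx` and the case-insensitive matching are all assumptions.

```python
import re


def filter_hits(hits, pattern, keep_matching):
    """Keep rows that match the regex in any column, or rows that do not.

    keep_matching=True  hides every hit that fails to match the filter.
    keep_matching=False hides every hit that does match the filter.
    """
    rx = re.compile(pattern, re.IGNORECASE)  # case-insensitivity is an assumption

    def matches(row):
        # All displayed columns are checked indiscriminately.
        return any(rx.search(cell) for cell in row)

    return [row for row in hits if matches(row) == keep_matching]


# Hypothetical hits: (source text, target text, name of source TMX).
hits = [
    ("Get on the train!", "Stap in de trein!", "OpenSubtitles2013_en-nl.tmx"),
    ("rail transport policy", "beleid inzake spoorvervoer", "EU_legal.tmx"),
    ("train the staff", "het personeel opleiden", "EU_legal.tmx"),
]

without_subtitles = filter_hits(hits, r"opensubtitles", keep_matching=False)
only_subtitles = filter_hits(hits, r"opensubtitles", keep_matching=True)
print(len(without_subtitles), len(only_subtitles))
```

Because every column is checked, a filter word that also happens to occur in the segment text will hide those hits too.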


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 17:10
Member (2009)
Dutch to English
+ ...
oops Jan 5, 2016

FarkasAndras wrote:

Michael Beijer wrote:
2. Or, by implementing a filter as I suggested originally, where I can just filter on a particular column, and thus just ignore the offending results by scrolling past them

You can already filter on a particular column. With the search box. You just can't filter with a negative expression only. I'm getting severe déjà vu... But you seem to mean alphabetical sorting, which is a completely different concept. You can sort hits based on the last column. The setting is in the setup file so you have to manually edit the setup file and restart. Maybe I will add a menu option or some other kind of runtime switch. Clickable column headers would be the fanciest/most obvious but it would take a bit of fiddling and I doubt I will have the motivation to do it. I can add a menu entry in 15 minutes if there are no unexpected road bumps.


Oops, sorry for the confusion: I indeed meant sorting. Basically like you can do in Excel, via Data > Sort. I've gotten spoilt using Ron's CSV Editor, where you can alphabetically sort any column (in a CSV/tabbed txt file) merely by double-clicking its header.


 

FarkasAndras  Identity Verified
Local time: 18:10
English to Hungarian
+ ...
TOPIC STARTER
filter Jan 6, 2016

Meta Arkadia wrote:

Michael Beijer wrote:
895 hits is great, and just what I want in such a case.
However, sometimes, it would be useful to be able to filter on individual TMXs, and thus narrow down the results.


That's one thing, the other would be to repeatedly filter on things like "target language not like CONSUMER INTEREST" (don't know how to do this in TMLookup) to see alternatives. Can still be a lot of work, especially with 895 hits.

Cheers,

Hans


This post only showed up for me now. As I said before, current program versions can't do this unless you also enter a positive target language search term.
I.e. you can do
consumer NOT "consumer interest"
to get hits that contain consumer but not consumer interest, or even
consumer interest NOT "consumer interest"
to get hits that contain both terms but not as one phrase. Due to a limitation in SQLite, you can't do
NOT "consumer interest"
BTW I would be very surprised if db viewer for sqlite (or whatever the name is) had this functionality, as it is a limitation of the underlying SQLite code. They would have had to add their own secret sauce to make it work.
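
The three queries above can be tried out with Python's built-in `sqlite3` module. This sketch assumes an SQLite build with FTS5 compiled in, which may differ from the full-text engine inside TMLookup's database. FTS5 documents NOT as a binary operator, so the sketch expects the bare negative query to be rejected. The table name `tm`, the helper `match` and the sample rows are invented. The last query shows one generic SQL way to emulate a negative-only search by excluding row IDs. It is not something the program is known to do, and it has to scan the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE tm USING fts5(target)")
conn.executemany(
    "INSERT INTO tm(target) VALUES (?)",
    [
        ("consumer protection law",),
        ("the consumer interest is paramount",),
        ("interest of the consumer",),
        ("interest rates rose",),
    ],
)


def match(expression):
    """Run a full-text query; return None if SQLite rejects the expression."""
    try:
        rows = conn.execute(
            "SELECT target FROM tm WHERE tm MATCH ? ORDER BY rowid", (expression,)
        ).fetchall()
    except sqlite3.OperationalError:
        return None
    return [row[0] for row in rows]


with_positive = match('consumer NOT "consumer interest"')
both_not_phrase = match('consumer interest NOT "consumer interest"')
negative_only = match('NOT "consumer interest"')  # no positive term to start from

# Workaround sketch: take every row whose rowid is not among the phrase matches.
excluded = [
    row[0]
    for row in conn.execute(
        "SELECT target FROM tm WHERE rowid NOT IN "
        "(SELECT rowid FROM tm WHERE tm MATCH ?) ORDER BY rowid",
        ('"consumer interest"',),
    )
]

print(with_positive, both_not_phrase, negative_only, excluded, sep="\n")
```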

But fear not! Here is a new version with filter buttons:
https://dl.dropboxusercontent.com/u/16377950/TMLookup_1.51_win.zip
If you enter consumer interest in the highlight field and click -Filter, it will hide every hit that contains the phrase (in any column). BTW the highlight box is a regex field, while the main search boxes use Boolean expressions, so the two work differently. In the highlight box, " NOT and - are taken literally, . is an any-character wildcard, [] is for character sets etc. I will add a syntax description to the readme. (Interestingly /for a regex nerd/, the regex mode in the main search boxes also works slightly differently from the regex mode of the highlight field. The former is parsed by SQLite while the latter is parsed in my own Perl code. SQLite's regex parser doesn't recognize characters like á as letters, so some expressions don't work properly. The highlight box has better non-ASCII support.)
I'm still not sure if I will leave the filter buttons in, especially if FTS5 adds support for single negative search expressions.
I also added sorting options to the View menu.
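
The regex points above (literal NOT, the . wildcard, [] character sets, and the trouble with letters like á) can be illustrated with Python's `re` module. This is only an analogy: Python's engine stands in for both the Perl code and SQLite's regex parser, and the `re.ASCII` flag is used here to mimic an engine that does not treat accented characters as letters. The sample strings are made up.

```python
import re

# In a regex field, NOT is just three literal characters, not an operator.
literal_not = re.search(r"NOT", "do NOT enter") is not None

# "." matches any single character, "[...]" matches one character from a set.
wildcard = re.findall(r"c.t", "cat cot cut")
char_set = re.findall(r"gr[ae]y", "grey gray groy")

# Word matching with and without Unicode-aware letter classes.
sample = "állomás"
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)  # á is not a "letter" here
unicode_words = re.findall(r"\w+", sample)                # á counts as a letter

print(literal_not, wildcard, char_set)
print(ascii_words, unicode_words)
```

An ASCII-only engine breaks the word apart at every accented character, which is the kind of mismatch that makes some expressions fail on non-English text.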


[Edited at 2016-01-06 13:04 GMT]


 

FarkasAndras  Identity Verified
Local time: 18:10
English to Hungarian
+ ...
TOPIC STARTER
New version Jan 7, 2016

Fixed an import bug and added sdltm support:
https://dl.dropboxusercontent.com/u/16377950/TMLookup_1.52_win.zip


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 17:10
Member (2009)
Dutch to English
+ ...
Thanks! Jan 7, 2016

FarkasAndras wrote:

Fixed an import bug and added sdltm support:
https://dl.dropboxusercontent.com/u/16377950/TMLookup_1.52_win.zip


Love the new negative filter!

Any way to sort hits on name of source TMX via View > Sort hits?


 

FarkasAndras  Identity Verified
Local time: 18:10
English to Hungarian
+ ...
TOPIC STARTER
? Jan 7, 2016

Michael Beijer wrote:

FarkasAndras wrote:

Fixed an import bug and added sdltm support:
https://dl.dropboxusercontent.com/u/16377950/TMLookup_1.52_win.zip


Love the new negative filter!

Any way to sort hits on name of source TMX via View > Sort hits?

I don't get the question. If you entered the name of the source TMX as the source field and display the source field as the last column, that's what alphabetic sorting does, obviously. That's what it's for.
BTW if the source field is not the last in your display for some reason, you can specify an arbitrary column number in the setup file.
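
As a rough picture of what sorting on a chosen column amounts to, here is a short Python sketch. The function name `sort_hits`, the row layout, the TMX names other than OpenSubtitles2013_en-nl.tmx, and the idea that the "default" order is simply the order in which hits arrive are all assumptions for the example.

```python
def sort_hits(hits, column=-1, alphabetical=True):
    """Sort hits alphabetically on one column, or leave them in default order.

    column=-1 means the last displayed column, which is where the name of the
    source TMX is assumed to be shown.
    """
    if not alphabetical:
        return list(hits)
    # casefold() keeps upper- and lower-case file names together.
    return sorted(hits, key=lambda row: row[column].casefold())


# Hypothetical hits: (source text, target text, name of source TMX).
hits = [
    ("train", "target 1", "OpenSubtitles2013_en-nl.tmx"),
    ("train", "target 2", "eu_legal.tmx"),
    ("train", "target 3", "DGT.tmx"),
]

by_tmx = sort_hits(hits)               # alphabetical on the last column
by_target = sort_hits(hits, column=1)  # arbitrary column number
print([row[2] for row in by_tmx])
```

Once hits are grouped by TMX name like this, a noisy file ends up in one contiguous block that is easy to scroll past.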


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 17:10
Member (2009)
Dutch to English
+ ...
aha Jan 7, 2016

FarkasAndras wrote:

Michael Beijer wrote:

FarkasAndras wrote:

Fixed an import bug and added sdltm support:
https://dl.dropboxusercontent.com/u/16377950/TMLookup_1.52_win.zip


Love the new negative filter!

Any way to sort hits on name of source TMX via View > Sort hits?

I don't get the question. If you entered the name of the source TMX as the source field and display the source field as the last column, that's what alphabetic sorting does, obviously. That's what it's for.
BTW if the source field is not the last in your display for some reason, you can specify an arbitrary column number in the setup file.


Aha, I think I understand now (sorry, doing too many things at once). I assumed View > Sort hits > Alphabetical alphabetically sorted the results based on the source language column, rather than based on the TMX source column.


 

FarkasAndras  Identity Verified
Local time: 18:10
English to Hungarian
+ ...
TOPIC STARTER
1.53 Jan 10, 2016

https://dl.dropboxusercontent.com/u/16377950/TMLookup_1.53_win.zip

- Sort menu removed, sort buttons added to column headers (they toggle between the default sorting method and sorting alphabetically on that column)
- Readme expanded


 

Traducendo Co. Ltd
Malta
Local time: 18:10
Member (2008)
Spanish to Italian
+ ...
Offtopic but hopefully relevant Jan 11, 2016

Hi everyone and sorry for the offtopic.

One of my translators candidly decided to translate a huge file using Virtaal and now she's not able to export the file in target format (.docx) nor to provide me with a tmx.

I have been going through forums and websites, and I have checked the Virtaal help pages to find a solution, yet nothing has turned up.
This forum seems the closest to my issue.

Maybe one of you Virtaal users can help me get a .docx file from the .xliff the translator sent me?
Unfortunately Trados 2011 and 2014 don't seem to be able to open the xliff and my client is expecting the delivery within a few days.

Please let me know if there is a solution!!!

Thanks a lot


 