TMLookup
Thread poster: FarkasAndras

Danesh
Local time: 15:34
English to Farsi (Persian)
+ ...
THANKS Jun 25, 2016

FarkasAndras wrote:

You should use 1.54. It has quite a few new features.
I haven't updated the website in ages. I should get that sorted at last...


Dear András,
Thanks a million, and more power to your elbow!
Best,
Danesh


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
@András Dec 8, 2016

I'm having this stupid problem where TMXs I import keep getting imported with the languages the wrong way around:

Dutch gets imported as English
and English gets imported as Dutch

The two languages are clearly correctly labelled in the TMX, but when I do Edit > Import...

I get this:

[Screenshot: error message]

If I click away this error message, I get this:

[Screenshot]

it says

Column 1 = nl
Column 2 = en

In my TMX, srclang="nl"

and yet, when I try to import it, en and nl get reversed.

I know I am probably doing something very simple wrong, but what?

Michael

[Edited at 2016-12-08 23:26 GMT]


 

FarkasAndras  Identity Verified
Local time: 13:04
English to Hungarian
+ ...
TOPIC STARTER
who made that? Dec 9, 2016

The languages in the tmx are not recognized, hence the xx yy and the luck-of-the-draw import order. Probably some weird tmx file, most likely some "creative" tag formatting. Where's the file from? Email it to me or post it and I'll have a look.
As a workaround, add the language codes to the end of the filename, such as crappyTM_en-nl.tmx. If TMLookup can't ID the languages based on the file, it checks the file name. This also works with tabbed txt files.
BTW the srclang is irrelevant. The srclang entry has no real purpose. Same for adminlang. What matters for TMLookup is which language comes first in the actual content. It appears that this tmx has the en first and nl second.
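
A minimal sketch of the filename fallback described above, for illustration only; it is not TMLookup's actual code. The function names and the exact suffix pattern (two codes joined by a hyphen, preceded by "_" or ".") are assumptions.

```python
import re
from pathlib import Path

# Hypothetical pattern: two 2-3 letter codes joined by a hyphen at the very
# end of the file name (before the extension), preceded by "_" or ".".
_LANG_SUFFIX = re.compile(r"[._]([A-Za-z]{2,3})-([A-Za-z]{2,3})$")


def langs_from_filename(path):
    """Return (first_lang, second_lang) from a name like crappyTM_en-nl.tmx, or None."""
    match = _LANG_SUFFIX.search(Path(path).stem)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


def resolve_langs(detected, path):
    """Use the codes detected inside the file if there are any; otherwise try the filename."""
    return detected if detected else langs_from_filename(path)


print(langs_from_filename("crappyTM_en-nl.tmx"))
print(resolve_langs(None, "some glossary.nl-en.tmx"))
```

The order of the two codes in the filename is taken as the column order, so the name has to match the order of the languages in the file's content.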

BTW here's a new version while I'm at it: https://dl.dropboxusercontent.com/u/16377950/TMLookup_1.56_win.zip

The main new feature over 1.54 is that if you put search terms in both search boxes at the same time, you get search results quickly now. These searches were horrendously slow before due to an SQLite bug/lack of optimization. I looked into it and optimized the queries to get around the problem.

[Edited at 2016-12-09 10:28 GMT]


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
:-) Dec 9, 2016

The TMX was created by Déjà Vu X3. Just sent it to you.

Thanks for the new version!

Michael


 

FarkasAndras  Identity Verified
Local time: 13:04
English to Hungarian
+ ...
TOPIC STARTER
Creative tag formatting Dec 9, 2016

As expected, this is due to creative tag formatting. The tmx has the <tuv xml:lang="nl"> tag split between two lines and TMLookup expects it to be on one line. Again, like a previous issue, this is because TMLookup doesn't have a proper xml parser because I can't be bothered to learn how to implement one. So instead of wrangling with a horrible coding problem I'm left wrangling with somewhat less horrible troubleshooting problems every now and then. So it goes.

I could just look for xml:lang= without the tuv, but in principle some tmx files could have other elements where the language is specified with xml:lang=, not just the text itself. So then it could break on those. This can be solved in multiple ways of course, but none of them are trivial or appetizing to me. Implementing a proper parser is the least appetizing of all. So maybe I'll fix this... maybe not. Adding the language codes to the filename should work.
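
For comparison, a rough sketch of what reading the language order with a real XML parser could look like, using Python's standard xml.etree.ElementTree. This is an illustration under assumptions, not how TMLookup works: the function name and the sample TMX are made up. A parser does not care whether a tag is split across lines, and restricting the lookup to tuv elements sidesteps other elements that carry xml:lang.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample: the <tuv> tags are split across two lines, and a <note>
# also carries xml:lang, which must not be counted as a text language.
SAMPLE = """<tmx version="1.4"><header srclang="nl"/><body><tu>
<note xml:lang="fr">remarque</note>
<tuv
  xml:lang="en"><seg>Hello</seg></tuv>
<tuv
  xml:lang="nl"><seg>Hallo</seg></tuv>
</tu></body></tmx>"""


def tuv_language_order(tmx_text):
    """Return the language codes of <tuv> elements in order of first appearance."""
    root = ET.fromstring(tmx_text)
    order = []
    for tuv in root.iter("tuv"):
        # Fall back to a plain "lang" attribute, which some older files use.
        code = tuv.get(XML_LANG) or tuv.get("lang")
        if code and code not in order:
            order.append(code)
    return order


print(tuv_language_order(SAMPLE))  # ['en', 'nl']
```

Note that the header's srclang plays no part here; only the order of the tuv elements in the content is used. For very large files, a streaming approach such as ET.iterparse would likely be a better fit than loading the whole document.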


In the meantime, the new version of sqlite that will allow for somewhat fancier/faster text searches is trickling down the pipeline. It went through two stages and it is now one step away from where I can start fiddling with it. We'll see.


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
aha Dec 9, 2016

Indeed, I just had another look at it, and they sure got creative with the line breaks.

Until you fix it I'll just add the language codes to the filename, which seems to work fine. I might also mention to Atril support that their TMXs are a little weird.


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
@András Dec 9, 2016

btw, if it isn't already the case, can you make it so that appended language codes override anything inside the files?

meaning: if my TMX has "nl-BE" and "en-US", but it ends "...blah blah blah.nl-en.tmx", can you make it so it sees "nl" and "en"?

thx, M


 

FarkasAndras  Identity Verified
Local time: 13:04
English to Hungarian
+ ...
TOPIC STARTER
Well... Dec 9, 2016

To be fair, their tmx is perfectly valid. Perhaps a little unusual but fine. It's my "parsing" that is not up to scratch.
BTW, if you have a lot of tmxes to import, you could fix them by removing line breaks after <tuv. Sed and other tools can do mass replacements on multiple files.
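
If sed is not at hand, the same mass replacement can be sketched in Python. This is only an illustration: the function names, the output-folder approach and the UTF-8 default are assumptions, so check the encoding of your files and keep the originals.

```python
import re
from pathlib import Path

# "<tuv" followed by whitespace that contains at least one line break.
_SPLIT_TUV = re.compile(r"<tuv\s*\n\s*")


def join_split_tuv_tags(text):
    """Put '<tuv' and its attributes back on one line."""
    return _SPLIT_TUV.sub("<tuv ", text)


def fix_folder(src_dir, out_dir, encoding="utf-8"):
    """Write fixed copies of every .tmx in src_dir to out_dir; return changed file names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    changed = []
    for path in sorted(Path(src_dir).glob("*.tmx")):
        # newline="" keeps the file's own line endings untouched.
        with open(path, encoding=encoding, newline="") as handle:
            original = handle.read()
        fixed = join_split_tuv_tags(original)
        with open(out / path.name, "w", encoding=encoding, newline="") as handle:
            handle.write(fixed)
        if fixed != original:
            changed.append(path.name)
    return changed
```

Calling fix_folder("tmx_in", "tmx_out") (both folder names are placeholders) leaves the source files alone and writes the joined versions to the second folder.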

I don't want the codes in filenames to overrule langcodes read from inside the file... the latter tends to be more reliable I would think.
But in your specific example "nl-BE" should be read by TML as nl and "en-US" as en... it chops off the bit after the hyphen.
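
As a one-line illustration of that chopping (the function name is made up, and the lowercasing is an added assumption):

```python
def primary_lang(code):
    """Keep only the part before the first hyphen: 'nl-BE' becomes 'nl'."""
    return code.split("-", 1)[0].lower()


print(primary_lang("nl-BE"), primary_lang("en-US"))  # nl en
```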


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
OK, next question Jan 4, 2017

Can you make it so TMLookup shows line breaks? They display correctly in e.g. Heartsome:

[Two screenshots]

The reason I am asking is I am now also using TMLookup to store glossaries in, and would like to be able to structure entries a bit using line breaks.


 

FarkasAndras  Identity Verified
Local time: 13:04
English to Hungarian
+ ...
TOPIC STARTER
Line breaks? Jan 4, 2017

That's an odd one... Is it normal to have line breaks in tmx? I don't think I have ever seen a tmx with line breaks in the text. TMLookup converts tmx files to tabbed txt before importing, and obviously it nukes all line breaks in the process.
In any case it should be relatively easy to implement. Perhaps, perhaps, perhaps.
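
One possible way to keep line breaks through a tabbed txt step, sketched under assumptions (this is not TMLookup's code, and the function names are hypothetical): escape them when writing a cell and restore them when displaying it, so each entry still occupies a single line in the tabbed file.

```python
def escape_cell(text):
    """Make a cell safe for one line of tab-separated text."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Escape backslashes first so the escaping stays reversible.
    return text.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


_UNESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}


def unescape_cell(cell):
    """Reverse escape_cell, e.g. when showing a hit in the results list."""
    out = []
    chars = iter(cell)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes are passed through unchanged.
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


entry = "term\n- usage note\n- client preference"
stored = escape_cell(entry)
assert "\n" not in stored
assert unescape_cell(stored) == entry
```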


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
hmm Jan 4, 2017

Can't remember off the top of my head, but I think only a few CAT tools support line breaks in TMs/TMXs. I think Studio and Déjà Vu X3 do, and memoQ and CafeTran don't, for example. Would have to check though. Anyway, it would be super-cool for my purposes of using TMLookup for TMs (via F1) and TBs (via F2).


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
hmm, me again Jan 12, 2017

So I've been very happily using TMLookup to search through my zillions of glossaries and termbases while translating in Déjà Vu X3, and have a feature request, unless it's already possible.

While testing, I sometimes import small glossaries, but then want to delete them afterwards. I figured if I added a third column and gave it a specific name, there should be a way to delete all entries with this specific column name, such as "glossaryb6565 test". Is there already a way to do this?


 

FarkasAndras  Identity Verified
Local time: 13:04
English to Hungarian
+ ...
TOPIC STARTER
Sure Jan 12, 2017

This is already possible.
Untested procedure:
set one of the search boxes to the source column
run a search for glossaryb6565
check the number of hits and make sure it matches the size of the glossary you want to remove
click edit/delete hits from database

If your db is important to you, I warmly recommend having a relatively recent backup on hand just in case.
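
For anyone working on an SQLite file directly rather than through the menu, the same "check the hit count, then delete" logic can be sketched as below. Everything about the schema is hypothetical (a table called entries with a column called origin holding the glossary name); TMLookup's real database layout is not described in this thread, so treat this purely as an illustration and work on a backup copy.

```python
import sqlite3


def delete_by_origin(conn, label, expected_count):
    """Delete rows whose origin contains label, but only if the hit count matches."""
    (hits,) = conn.execute(
        "SELECT COUNT(*) FROM entries WHERE instr(origin, ?) > 0", (label,)
    ).fetchone()
    if hits != expected_count:
        raise ValueError(f"expected {expected_count} hits, found {hits}; nothing deleted")
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM entries WHERE instr(origin, ?) > 0", (label,))
    return hits


# Small in-memory demonstration with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (source TEXT, target TEXT, origin TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [
        ("huis", "house", "glossaryb6565 test"),
        ("boom", "tree", "glossaryb6565 test"),
        ("fiets", "bicycle", "main glossary"),
    ],
)
print(delete_by_origin(conn, "glossaryb6565", expected_count=2))  # 2
```

The count check plays the role of the "make sure it matches the size of the glossary" step: if the number of hits is off, nothing is deleted.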


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 12:04
Member (2009)
Dutch to English
+ ...
wow, I am senile already Jan 12, 2017

I actually already knew that. Well, I remembered I did, once I read your post. I've actually done it in the past and it works great.

Am having a whale of a time using TMLookup as my main glossary lookup tool at the moment. I of course also use termbases in Déjà Vu X3, but for all my massive/messy glossaries, TMLookup is great. And no need to worry about ever running out of space like with many of the ones built into CAT tools.


 

Zaki Jawich  Identity Verified
Turkey
Local time: 15:04
Arabic to English
+ ...
Right-to-Left languages? Jan 11, 2018

Thank you, Mr. Farkas, for this fantastic tool.

Can you make it so it supports Right-to-Left languages, such as Arabic?

When I insert an English-Arabic TMX, the tool reverses the Arabic text.

I'll give an example, in English, to clarify the matter:

It shows "you are How .Hi" instead of "Hi. How are you?"

Thank you in advance.
Zaki


 