SF on Linux, a warning
Thread poster: opolt

opolt  Identity Verified
Germany
Local time: 03:01
English to German
Sep 7, 2012

Hi all,

after much humming and hawing, I installed SF a few weeks ago on my Linux machine. I had tested the software some time ago and found it more or less ok, but this time around, it was a complete failure. Why?

Well, Swordfish's home page clearly states that it has been tested against Ubuntu 11.10 and openSUSE 12.1 only, and only in their 32-bit versions. And I didn't heed that advice. I run Fedora 17, the 64-bit version. I overlooked that warning, mainly because I was in the midst of a very big, very urgent project where I needed a workable CAT solution on Linux right away, and on top of that I had to deal with a health issue at the same time ... and I didn't do enough testing.

When the project was already somewhat advanced (and only after paying the license), I ran into all sorts of problems. Among them were serious display issues with tags, crashes after which the program wouldn't start anymore (I had to delete the databases, rebuilding them from scratch), and matches in most externally supplied TMs not showing up.

Now, I admit it was all my fault that I paid for the licence without reading the warning on SF's homepage, that I didn't do the required testing on my current Linux distro before that, etc., and I won't complain about that. I'm not asking for a refund; that's not why I'm writing this.

But to put it simply (as advice for those translators who are using Linux): don't try this at home, you will regret it. Use one of the tested distros or forget it (at least if you want to purchase the license). Obviously, Linux as a platform is not unified enough to allow for this kind of dabbling. In many cases, e.g. when installing Adobe's programs or other standard commercial apps, such as Skype, installing on an unsupported platform is not a problem at all. But in this case it seems it is.

I know that for the developer(s) it's hard to keep up with the ever-changing landscape of Linux distros. Though I would say that "classic GNOME" and 32bit are not really considered modern any more, and that the number of top distros to consider is at least 5 or 6 (in my book). But it is of course completely up to the maker to decide which platforms to support and which not.

However, it is my opinion that the maker(s) of SF should be more honest and straightforward about the platforms they support. In every major distro that I know of, there's a small text file under /etc with the sole purpose of indicating the name of the distribution and its release number. On Fedora, it's /etc/fedora-release, for instance. It should be easy to figure that out with any scripting language during the install process. Ditto for the CPU architecture: using lscpu is one way, grepping under /proc is another possibility. There's also "uname -a". As usual on Linux, almost all of this code can be had for free:
http://unix.stackexchange.com/questions/6345/how-can-i-get-distribution-name-and-version-number-in-a-simple-shell-script

But the installation routine shouldn't stop there. Instead, when it detects a platform that is considered unsupported, it should abort right away, or at the very minimum it should pop up a big, red warning to the user that he's going to run into problems. Failure to do either of these is somewhat dishonest, IMHO. I know that the Linux platform is a mess, but the tools are there to avoid the worst.
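
To make the suggestion concrete, here is a minimal sketch in Python of what such a check could look like. It is only an illustration, not Swordfish's actual installer. It assumes the distro ships /etc/os-release (many newer releases do; older ones would need /etc/fedora-release, lsb_release or similar instead). The SUPPORTED and SUPPORTED_ARCHS lists and all function names are made up for the example, and the exact ID strings are guesses.

```python
import platform
from pathlib import Path

# Hypothetical allow-list, based on the distros named as tested in this thread.
# The ID strings are assumptions; check what your distro really reports.
SUPPORTED = {("ubuntu", "11.10"), ("opensuse", "12.1")}
# Assumed names for 32-bit x86 as reported by platform.machine().
SUPPORTED_ARCHS = {"i386", "i686"}


def parse_os_release(text):
    """Parse KEY=value lines (os-release style) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip().strip('"').strip("'")
    return info


def detect_platform(os_release_path="/etc/os-release"):
    """Return (distro id, version, CPU architecture) for this machine."""
    path = Path(os_release_path)
    info = parse_os_release(path.read_text()) if path.is_file() else {}
    distro = info.get("ID", "unknown").lower()
    version = info.get("VERSION_ID", "unknown")
    return distro, version, platform.machine()


def check_platform(distro, version, arch):
    """Return a list of human-readable problems; empty means 'tested platform'."""
    problems = []
    if (distro, version) not in SUPPORTED:
        problems.append(f"{distro} {version} is not a tested distribution")
    if arch not in SUPPORTED_ARCHS:
        problems.append(f"{arch} is not a tested CPU architecture")
    return problems
```

An installer could call `check_platform(*detect_platform())` and abort, or show the big red warning, whenever the returned list is not empty.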


 

Rodolfo Raya  Identity Verified
Local time: 22:01
English to Spanish
Works in other distributions Sep 8, 2012

Hi,

Swordfish is tested on Ubuntu and openSUSE, but it also works on other distributions like Linux Mint. The program is tested in 32-bit but also works in 64-bit; that's why 64-bit installers are available.

There are too many Linux distributions. Some are good, some are bad. Swordfish works in some of them, not all.

A Swordfish license is independent of the OS. It can be used on Linux, Windows or Mac OS X. If Swordfish doesn't work in your current distribution, you can still use the license in a different one.

FWIW, the program was initially developed on Red Hat Linux. When Red Hat created Fedora as a test platform, an attempt was made to use Fedora, but the system was very unstable and development moved to SuSE Linux. When SuSE decided to create openSUSE as a test platform, history repeated itself and the system became unstable again. Development then moved to Windows and Mac OS X.

Regards,
Rodolfo


 

opolt  Identity Verified
Germany
Local time: 03:01
English to German
TOPIC STARTER
Thanks Rodolfo Sep 8, 2012

I appreciate your feedback and I'm really glad there is the option to transfer the license to another OS. I'm still undecided on that though; after all I have only recently switched to Fedora from Ubuntu (as many Linux users have).

I know supporting the different distributions out there is not easy, Rodolfo (some even say it's impossible and have given up on Linux entirely), and I'm not even asking you to support the one that I am using (though it's one of the more popular ones, and it's very stable these days). I think I can understand your plight as a developer.

However, I would like to reiterate what I said before, namely that it would be preferable and more honest, both for the users and for you on the support side, to be stricter and more upfront about which distros are supported and which are not (see my previous post). That would make things much easier for both parties, IMHO.

As an aside, I have noted other problems in Swordfish, above all with docx import, in that SF would sometimes recognize hard line breaks, and sometimes just ignore them, i.e. there was not even a space left between the words. There was no rule to that behaviour, just from looking at the Word file. But maybe that too was related to the platform issue.

Furthermore, I'd like to mention that I encountered quite a number of usability issues with the program; for instance, I personally found the way databases are assigned/selected quite confusing. Also, the Find/Replace dialog turned out to be somewhat unwieldy; there are some focus and display issues with it. Maybe that could be improved by making it dockable (along with other windows which are now pop-ups). But mainly, the menus appeared to offer quite a maze of options and actions. In particular, the "Edit" and "Tasks" menus contain very long lists of entries, making it difficult to find the right one. Given that a lot of horizontal space is available on today's screens, the entries could be spread out over other (new) menus, according to a different categorization. Or one could group them into submenus. That would make things a lot easier. These are just some ideas; unfortunately the related job was just too stressful to focus on the program itself.

[Edited at 2012-09-08 22:43 GMT]


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Does a good CAT tool need databases? Sep 27, 2012

opolt wrote:

When the project was already somewhat advanced (and only after paying the license), I ran into all sorts of problems. Among them were serious display issues with tags, crashes after which the program wouldn't start anymore (I had to delete the databases, rebuilding them from scratch), and matches in most externally supplied TMs not showing up.


When I bought my Swordfish license (after Rodolfo's remark that only wishes of paying users could be honoured) I had great expectations of Swordfish. The use of open standards was (is) very attractive.

What I did not find attractive at all was the use of closed (in the sense of: you have to use them) databases. There was no way to easily import a rather large term file into a terms database. I had to split it up into parts of 50,000 terms. I had to create 7 term bases to store my large term file. This took me several hours on a fast iMac. When you're in the habit of constantly optimising your central term file, this scenario isn't very attractive ...

After the slowish import I was very anxious to see Swordfish's performance in term recognition. It was very, very disappointing. It was very slow and many words weren't recognised. Rodolfo could not explain this and wrote me that my use of such a large term file was not a good idea. End of Swordfish, bye bye license ...

Later I found a CAT tool, also written in Java, that stores its terms in simple text files. Term recognition was (is) blazing fast and reliable. Hence my question: Does a good CAT tool really need databases, with today's powerful computers?
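
As a rough illustration of the plain-text approach (a sketch under assumptions, not how any particular CAT tool actually works): a tab-separated term file can be loaded into an in-memory dictionary, after which each lookup is a hash lookup rather than a database query. The file layout (source, tab, target), the function names and the `max_words` limit are all invented for the example.

```python
def load_terms(lines):
    """Build a dict from tab-separated 'source<TAB>target' lines."""
    terms = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or "\t" not in line:
            continue  # skip blank or malformed lines
        source, target = line.split("\t", 1)
        terms[source.strip().lower()] = target.strip()
    return terms


def recognise_terms(segment, terms, max_words=4):
    """Find known terms in a segment, preferring the longest match per position."""
    words = segment.lower().split()
    found = []
    for start in range(len(words)):
        # Try the longest word group first, then shorter ones.
        for length in range(max_words, 0, -1):
            candidate = " ".join(words[start:start + length]).strip(".,;:!?()")
            if candidate in terms:
                found.append((candidate, terms[candidate]))
                break
    return found
```

With a real file one would pass the open file object, e.g. `load_terms(open("terms.txt", encoding="utf-8"))`, where `terms.txt` is a placeholder name. Whether this scales to hundreds of thousands of entries as well as a tuned database is exactly the question being asked here; the sketch only shows that the basic mechanism is simple.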


 

Rodolfo Raya  Identity Verified
Local time: 22:01
English to Spanish
Databases Sep 27, 2012

Hans Lenting wrote:
Hence my question: Does a good CAT tool really need databases, with today's powerful computers?


Yes, databases are still required for handling large datasets.

Regards,
Rodolfo


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Just curious: how 'large' are large datasets? Sep 28, 2012

Rodolfo Raya wrote:

Hans Lenting wrote:
Hence my question: Does a good CAT tool really need databases, with today's powerful computers?


Yes, databases are still required for handling large datasets.

Regards,
Rodolfo


Thanks for your reply. And thanks for not waving away my arguments!

I was wondering: how large do these datasets have to be to make import in a database necessary?

And how well will these databases perform then?

Go figure: if a database with 600,000 term pairs already misbehaves 'in terms of' recognition speed and recognition correctness, how bad will a large database with TUs perform then?

Sorry, but I'm not convinced of the need of databases at all.

BTW: Swordfish shares with Wordfast Pro the record for slowest import of terms. CafeTran holds the record for the fastest (just because it doesn't actually import), followed by MemoQ, followed by Déjà Vu and Transit. This sounds like a nice benchmark video topic for Dominique.

A lot of optimisation work for you to do in this area, me thinks.


 

Rodolfo Raya  Identity Verified
Local time: 22:01
English to Spanish
Off topic Sep 28, 2012

Hans Lenting wrote:

I was wondering: how large do these datasets have to be to make import in a database necessary?



Databases are used mainly for TM data. With over 20 MB of data, finding TM matches is much faster using databases with fuzzy indexes; a linear scan in memory is too slow.
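
To illustrate the general idea of a fuzzy index versus a linear scan, here is a minimal Python sketch. It is not Swordfish's implementation; the trigram scheme, the `TrigramIndex` class and the 0.7 threshold are assumptions chosen for the example. The index maps character trigrams to segment ids, so only segments sharing at least one trigram with the query are scored, instead of every segment in the memory.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def trigrams(text):
    """Return the set of character trigrams of a padded, lower-cased string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class TrigramIndex:
    """Toy in-memory fuzzy index over translation units (hypothetical design)."""

    def __init__(self):
        self.segments = []                # list of (source, target) pairs
        self.postings = defaultdict(set)  # trigram -> ids of segments containing it

    def add(self, source, target):
        seg_id = len(self.segments)
        self.segments.append((source, target))
        for gram in trigrams(source):
            self.postings[gram].add(seg_id)

    def search(self, query, min_ratio=0.7):
        # Step 1: collect candidates that share at least one trigram with the query.
        candidates = set()
        for gram in trigrams(query):
            candidates |= self.postings.get(gram, set())
        # Step 2: run the expensive similarity measure on the candidates only.
        results = []
        for seg_id in candidates:
            source, target = self.segments[seg_id]
            ratio = SequenceMatcher(None, query.lower(), source.lower()).ratio()
            if ratio >= min_ratio:
                results.append((ratio, source, target))
        results.sort(key=lambda r: r[0], reverse=True)
        return results
```

A real database would keep the postings on disk and prune candidates more aggressively (for instance by the number of shared trigrams), but the principle is the same: the index narrows the search before any fuzzy comparison happens.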


And how well will these databases perform then?


Very well. That is why most CAT tools use them.


Go figure: if a database with 600,000 term pairs already misbehaves 'in terms of' recognition speed and recognition correctness, how bad will a large database with TUs perform then?


Databases are intended mainly for TM data and work very well.

The data you tried to use (I've seen it) doesn't qualify for terminology data. Using it in a database is a waste of time.

Regards,
Rodolfo


 

Gyula Erdész
Hungary
Local time: 03:01
Member (2005)
English to Hungarian
painfully slow import of larger TM, too Sep 28, 2012

Hans Lenting wrote:

Swordfish shares with Wordfast Pro the record for slowest import of terms.


Same bad performance with large TMs. The other day I tried to import a relatively large TM file (320,000 entries). After 40 minutes I stopped the process. Am I impatient?

Hans Lenting wrote:
A lot of optimisation work for you to do in this area, me thinks.


Totally agree with you, Hans.


 

Andrzej Kaznowski  Identity Verified
Poland
Local time: 03:01
Polish to English
Comments on Swordfish usability under Linux Sep 28, 2012

I have followed this discussion with interest and a little bemusement. I have been a heavy user of Swordfish for many years now (since the very first version, in fact) and have five licenses running primarily under Ubuntu. My bemusement stems from the fact that my experience with Swordfish has been nothing like that presented in the discussion. Swordfish has proved to be an incredibly productive tool for me and I can thoroughly recommend it. When installed on a supported OS there are no problems. Everything installs fine and works perfectly.

I don't want to get into all my reasons why I think Swordfish is so good (after all, everyone has their favourite CAT tool) but I would like to address a couple of the points raised in the discussion as a signal to others that the issues raised are, in my opinion, not typical or not applicable.

The first issue is the apparent difficulty with database management. For me, the solution Swordfish uses is logical and consistent. I agree that it is not immediately apparent what to do, but a little time with the software and it becomes clearer and more obvious why this particular approach was taken. The flexibility provided by Swordfish in database management is very impressive and has proved invaluable over the years. It might be more than a lone translator needs, and therefore appear over complex, but for more "heavy-weight" usage it really starts to shine.

On terminology management TMs: I also use terminology management in Swordfish, and the key to efficient usage is for the TMs to consist only of a strict term list. Term searches are then very fast and efficient. However, if a TM is used that contains a mixture of terms and whole sentences, this will lead to the problems described. There are other tricks that can be applied to improve speed with very big term lists, but these are not normally required. I have never needed to split even very large term TMs.

I also wish to address the clarity of the menus. I think this is an unfair criticism of Swordfish. CAT programs have a wealth of obscure and difficult-to-explain-in-a-word functions, so expecting menu options to be obvious and clear is perhaps a little over-optimistic. I have tried other Java-based CAT software and found the menu options so obscure that the programs are unusable without considerable time spent reading the manual. Fair enough - it is what manuals are for! I personally believe the menu options in Swordfish are clearer than most, and I believe they become obvious with just a little use. I personally disagree that Swordfish has "a maze of options and actions" - I call these a wealth of features, which is something I want in a CAT program.

One of the great "features" of Swordfish is the excellent interaction between the developers and the user community. Over the years I, and others, have submitted numerous feature requests and bug reports, and they have all been dealt with incredibly fast - in fact, virtually instantaneously. The points raised in this discussion will, I have no doubt, be taken into consideration by the developers (judging by the responses from Rodolfo, they are following this discussion closely). But for others reading this discussion, my advice is to follow the recommendations from the developers on the website and use the user community to submit feature requests/bugs.


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Quite on topic Sep 28, 2012

Rodolfo Raya wrote:

Databases are intended mainly for TM data and work very well.

The data you tried to use (I've seen it) doesn't qualify for terminology data. Using it in a database is a waste of time.


I had indeed sent you a copy of my large dataset (with a scrambled target side IIRC). You then advised me to import it in several parts. What you now are writing is very different ...

BTW: I think a lot of users nowadays put chunks (speech parts) into their term bases.


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Maxprograms' responsiveness Sep 28, 2012

Andrzej Kaznowski wrote:

One of the great "features" of Swordfish is the excellent interaction between the developers and the user community.


For what it's worth: my experience was quite the opposite of yours, I'm afraid. Rodolfo wrote to me that I was 'always asking', implying that I should stop requesting features.

I wish you good luck with Swordfish. My workflow obviously is very different from yours. Like you wrote: we all have different needs (perhaps depending on our language combinations and subject fields)?


 

