Exporting a large TM (500.000+), shortcut through MS Access
Thread poster: Wolfgang Jörissen
Wolfgang Jörissen  Identity Verified
Belize
Dutch to German
Nov 21, 2009

I have looked in the archives, but maybe not long enough. Could one of you describe again the procedure for extracting a large TM into another usable form (txt, tmx or whatever)? The export option DVX offers would probably take days or weeks.

 
Harry Bornemann  Identity Verified
Mexico
Local time: 20:05
English to German
One possible way Nov 21, 2009

I just did it this way:

The Source and Target are in the table Sentences and are indexed by different language codes. For me, 7 means German and 9 means English.

The complete texts are in the field Sentence of the table Sentences.
This field has the type Memo, so it can be very large, but cannot be used for sorting or indexing.
If you want to sort, group, or index it in Access, you would have to use the field Sentence2Index instead, which has the type Text and is restricted to 255 characters.
I chose to do this later outside Access, so I only need the fields ID, Lang, and Sentence here.

First I created a query with the fields ID, Lang, and Sentence based on the table Sentences, writing the Lang code 7 below the column Lang, to filter for the German text.

In SQL View it looks like this:

SELECT Sentences.ID, Sentences.Sentence, Sentences.Lang
FROM Sentences
WHERE (((Sentences.Lang)=7));

Then I created another query with the 9 instead of 7 for English:

SELECT Sentences.ID, Sentences.Sentence, Sentences.Lang
FROM Sentences
WHERE (((Sentences.Lang)=9));

Then I created the third and last query to combine the languages, based on the above two queries. I simply had to draw the ID field from one query over to the other to create the link (INNER JOIN). In SQL View it looks like this:

SELECT QueryLang9.Sentence AS English, QueryLang7.Sentence AS German
FROM QueryLang9 INNER JOIN QueryLang7 ON QueryLang9.ID = QueryLang7.ID;
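
The three saved queries above can also be written as a single self-join. Below is a minimal sketch in Python, using an in-memory SQLite table as a stand-in for the Access table. The table name, field names and the language codes 7 and 9 come from the post; the sample rows and the function name `fetch_pairs` are made up for illustration, and opening the real Access file from Python (which would need an ODBC driver) is not shown.

```python
import sqlite3

def fetch_pairs(conn, source_lang=9, target_lang=7):
    """Return (source, target) sentence pairs that share the same ID."""
    sql = """
        SELECT src.Sentence, tgt.Sentence
        FROM Sentences AS src
        INNER JOIN Sentences AS tgt ON src.ID = tgt.ID
        WHERE src.Lang = ? AND tgt.Lang = ?
        ORDER BY src.ID
    """
    return conn.execute(sql, (source_lang, target_lang)).fetchall()

# Stand-in data (assumption); a real TM would be read from the Access file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sentences (ID INTEGER, Lang INTEGER, Sentence TEXT)")
conn.executemany(
    "INSERT INTO Sentences VALUES (?, ?, ?)",
    [
        (1, 9, "Good morning"), (1, 7, "Guten Morgen"),
        (2, 9, "Thank you"), (2, 7, "Danke"),
        (3, 9, "No translation yet"),  # no German row, so the join drops it
    ],
)
pairs = fetch_pairs(conn)
```

As with the INNER JOIN in the post, rows that exist in only one language are left out of the result.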

Then I tried to export this query and realized that it uses a semicolon (;) as the separator instead of the tab that I need.
As a workaround I selected the whole table (about 20MB), copied it to the Clipboard and from there into Excel.

Finished.
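
If the pairs are available outside Access, a tab-separated file can also be written directly instead of going through the Clipboard. This is only a sketch using Python's standard `csv` module; the function name `write_tsv`, the column headings and the sample pairs are assumptions for illustration.

```python
import csv
import io

def write_tsv(pairs, out):
    """Write (source, target) pairs as tab-separated lines with a header row."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["English", "German"])
    writer.writerows(pairs)

# Sample pairs (assumption); in practice these would come from the query result.
sample = [("Good morning", "Guten Morgen"), ("Thank you", "Danke")]
buffer = io.StringIO()
write_tsv(sample, buffer)
tsv_text = buffer.getvalue()
```

To write a file instead of a string, pass a handle opened with `open(path, "w", encoding="utf-8", newline="")`. With the default settings, the `csv` writer puts quotes around any field that itself contains a tab, a line break or a quote character.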


 
Wolfgang Jörissen  Identity Verified
Belize
Dutch to German
TOPIC STARTER
Character limitation? Nov 22, 2009

Thanks Harry, I will give this a try.

Harry Bornemann wrote:

As a workaround I selected the whole table (about 20MB), copied it to the Clipboard and from there into Excel.



Wasn't there a 256 character limitation in Excel cells?


 
Harry Bornemann  Identity Verified
Mexico
Local time: 20:05
English to German
Depends.. Nov 22, 2009

Wolfgang Jörissen wrote:

Wasn't there a 256 character limitation in Excel cells?

It depends on the Excel version and the import method. Otherwise I would paste it into Notepad, or first refine it in Access to remove any duplicates and other junk.

If you want to export from Access directly into a text file, the simplest way would be to first import a text file of your desired format and use the Import Wizard to create and save an import/export profile, which can be specified and reused in an Access export macro.
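
The duplicate removal mentioned above could also be done after the export. A minimal sketch in Python, assuming the TM is already available as a list of (source, target) pairs; the function name `drop_duplicates` is hypothetical, and treating pairs with an empty side as junk is an assumption, not something the post specifies.

```python
def drop_duplicates(pairs):
    """Keep the first occurrence of each (source, target) pair, in order."""
    seen = set()
    unique = []
    for source, target in pairs:
        key = (source.strip(), target.strip())
        if not key[0] or not key[1]:
            continue  # assumption: a pair with an empty side is junk
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

# Sample pairs (assumption) with a whitespace variant, an empty source and a repeat.
sample = [
    ("Hello", "Hallo"),
    ("Hello ", "Hallo"),
    ("", "Leer"),
    ("Yes", "Ja"),
    ("Hello", "Hallo"),
]
cleaned = drop_duplicates(sample)
```

Pairs are compared after stripping leading and trailing whitespace, so "Hello " and "Hello" count as the same segment here.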


 
Javier Arrizabalaga
Local time: 04:05
Exporting a large TM (500.000+), shortcut through MS Access Nov 26, 2009

Wolfgang Jörissen wrote:

I have looked in the archives, but maybe not long enough. Could one of you describe again the procedure for extracting a large TM into another usable form (txt, tmx or whatever)? The export option DVX offers would probably take days or weeks.


Hi Wolfgang,

The DVX import process takes a long time because it has to fill in a lot of information for each entry in the TM, so that DVX can later give you the assembled rows faster when you translate your projects. The export process, however, should take less time than importing. You can probably get your exported TMX file in 10 or 15 minutes.

BR


 
Wolfgang Jörissen  Identity Verified
Belize
Dutch to German
TOPIC STARTER
Probably... not Nov 26, 2009

Dear jarriza,

Been there, done that. I tried all possible export formats (tmx, txt, Trados txt, Access...), but to no avail: after 7-8 hours of tying up the computer I gave up, and probably most of the other posters in this thread share my experience. So I think the Access workaround is the only way to go in my case.

Thanks anyway.


 
Javier Arrizabalaga
Local time: 04:05
Exporting a large TM (500.000+), shortcut through MS Access Nov 26, 2009

Wolfgang Jörissen wrote:

Dear jarriza,

Been there, done that. I tried all possible export formats (tmx, txt, Trados txt, Access...), but to no avail: after 7-8 hours of tying up the computer I gave up, and probably most of the other posters in this thread share my experience. So I think the Access workaround is the only way to go in my case.

Thanks anyway.


Which DVX version do you have? What are your computer specs?

BR


 
Selcuk Akyuz  Identity Verified
Türkiye
Local time: 05:05
English to Turkish
18 minutes for 1 million translation units Nov 26, 2009

Hi Wolfgang,

I just tested with a large TM (958,814 translation units); the export took only 18 minutes. It would take 10 minutes without the extra information (subject, client, project, file, row, is source, date, user).

So check only the required information, e.g. date and user.


The test was performed on a 3-year-old laptop (2 GB RAM) with DVX build 310.


 
Wolfgang Jörissen  Identity Verified
Belize
Dutch to German
TOPIC STARTER
310? Nov 26, 2009

Selcuk Akyuz wrote:

The test was performed on a 3-year-old laptop (2 GB RAM) with DVX build 310.



Hi Selcuk!

Have I missed something? I have build 303, which is the one available for download at the Atril site.


 
Selcuk Akyuz  Identity Verified
Türkiye
Local time: 05:05
English to Turkish
Build 310 Nov 26, 2009

Hi Wolfgang,

Build 310 is a beta version. Some users have been testing it for almost a year. I installed it last week and have not tested it in a real project yet. I don't think there will be any difference in export time compared to build 303.

Selcuk


 
Wolfgang Jörissen  Identity Verified
Belize
Dutch to German
TOPIC STARTER
Welcome Nov 26, 2009

Javier Arrizabalaga wrote:

Which DVX version do you have? What are your computer specs?

BR


First of all, great to have you here! Nice to welcome an Atril staff member to this forum. Stay around, we need you.

Well, if you really say so, I might try it again. Last time I tried, it was just too frustrating and I broke off. I know I can deactivate some of the export options, but in earlier (DV3) days, an export was also a sort of backup for me, which I performed regularly. "Project" and "file" are valuable pieces of information that I do want to have included.

Besides that, I share parts of my TM with a colleague who does not use DVX.

My computer specs: Dual Core (not sure which one), 4 GB RAM, XP, DVX build 303

[Edited at 2009-11-26 12:41 GMT]


 
Harry Bornemann  Identity Verified
Mexico
Local time: 20:05
English to German
TransferText Nov 26, 2009

Harry Bornemann wrote:

If you want to export from Access directly into a text file, the simplest way would be to first import a text file of your desired format and use the Import Wizard to create and save an import/export profile, which can be specified and reused in an Access export macro.

BTW, the relevant macro action is called TransferText in German; it looks like it could be the same in Dutch.


 
Javier Arrizabalaga
Local time: 04:05
Exporting a large TM (500.000+) Nov 26, 2009

Wolfgang Jörissen wrote:

Javier Arrizabalaga wrote:

Which DVX version do you have? What are your computer specs?

BR


First of all, great to have you here! Nice to welcome an Atril staff member to this forum. Stay around, we need you.


I will try to participate in ProZ often. I hope I can help you take advantage of all the DVX functions and features, and I will also try to solve your issues.


Well, if you really say so, I might try it again. Last time I tried, it was just too frustrating and I broke off. I know I can deactivate some of the export options, but in earlier (DV3) days, an export was also a sort of backup for me, which I performed regularly. "Project" and "file" are valuable pieces of information that I do want to have included.

Besides that, I share parts of my TM with a colleague who does not use DVX.

My computer specs: Dual Core (not sure which one), 4 GB RAM, XP, DVX build 303

[Edited at 2009-11-26 12:41 GMT]


With this computer your DVX TM should be exported in less than 15 minutes to a TMX file. If your colleague uses another CAT Tool, I think exporting to TMX is the best way to move a translation memory from one tool to another.
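
For segment pairs obtained outside DVX (for example through the Access queries earlier in the thread), a bare-bones TMX file can also be assembled by hand. The sketch below uses Python's standard `xml.etree.ElementTree`. The header values (`creationtool`, `o-tmf` and so on) are placeholders, the language codes "en" and "de" and the function name `build_tmx` are assumptions, and none of the DVX metadata such as project or file is carried over.

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serializes as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def build_tmx(pairs, source_lang="en", target_lang="de"):
    """Return a minimal TMX 1.4 document as a string, one <tu> per pair."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",   # placeholder value
        "creationtoolversion": "0.1",       # placeholder value
        "segtype": "sentence",
        "o-tmf": "unknown",                 # placeholder value
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, source), (target_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")

# Sample pairs (assumption); the ampersand is escaped by ElementTree.
sample = [("Good morning", "Guten Morgen"), ("R&D", "F&E")]
tmx_text = build_tmx(sample)
```

The returned string has no XML declaration; when saving it to disk, writing it as UTF-8 keeps umlauts and other non-ASCII characters intact.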


 
Grzegorz Gryc  Identity Verified
Local time: 04:05
French to Polish
Excel cell size Nov 26, 2009

Wolfgang Jörissen wrote:

Thanks Harry, I will give this a try.

Harry Bornemann wrote:

As a workaround I selected the whole table (about 20MB), copied it to the Clipboard and from there into Excel.


Wasn't there a 256 character limitation in Excel cells?


Probably many years ago.
That is the column width limit now.
The max. cell size is 32,767 characters (2^15-1), but only 1,024 are displayed directly in a cell.

Cheers
GG
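
Before pasting a TM into Excel, it may be worth checking whether any segment exceeds the cell size quoted above. A small sketch in Python, assuming the TM is a list of (source, target) pairs; the function name `too_long_for_excel` is hypothetical, and `len()` counts Unicode code points, which may not match exactly how Excel counts characters.

```python
EXCEL_CELL_LIMIT = 2**15 - 1  # 32,767 characters, the figure quoted in the post

def too_long_for_excel(pairs, limit=EXCEL_CELL_LIMIT):
    """Return the positions of pairs where either side exceeds the limit."""
    return [
        index
        for index, pair in enumerate(pairs)
        if any(len(text) > limit for text in pair)
    ]

# Sample pairs (assumption): only the second one has a side over the limit.
sample = [
    ("short", "kurz"),
    ("x" * 40000, "lang"),
    ("a", "b" * EXCEL_CELL_LIMIT),  # exactly at the limit, so not flagged
]
flagged = too_long_for_excel(sample)
```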


 
Grzegorz Gryc  Identity Verified
Local time: 04:05
French to Polish
310 or newer Nov 26, 2009

Selcuk Akyuz wrote:

Build 310 is a beta version. Some users have been testing it for almost one year. I have installed it last week, and did not test it in a real project. I don't think there will be any difference in export time compared to build 303.


A new official build will be published in December, AFAIK.
It was announced at the virtual Proz conference.
XLIFF, Transit, bells and whistles.

Cheers
GG


 

