Launch of TAUS/TDA imminent. The super cloud
Thread poster: Felipe Gútiez Velasco
RoyMarie
United States
A Web 2.0 perspective on translation in the future Jan 19, 2009

TAUS is an organization made up of localization people (mostly from IT companies) involved in "large" corporate translation projects of mostly very static information, e.g. manuals, documentation, web pages about products, software interfaces, etc. All of this is important to do but not really high profile in terms of the overall corporation. When was the last time you read the manual of a car you bought, or a mixer or a blender? Business line managers who run international business divisions in sales/marketing/production are the real power players here, and people involved with "localization" are generally much lower in stature and influence. The TAUS approach and modus operandi are very much rooted in a localization culture.

This is in contrast to an approach by somebody like Google who will quite possibly make tools and collaboration infrastructure available to facilitate data sharing and improve productivity for both professional and amateur translators.

In my observation, the forces really driving these sharing and efficiency initiatives are the core business imperatives that arise from an increasingly flat world and a global marketplace enabled by Web 2.0+ technologies. For example, I have children who routinely buy designer clothes from China at a fraction of what they cost at a local store. This may be a silly example, but we all know of many others that show the world is opening up in this way.

This direct contact among globally scattered buyers and sellers means that a lot more information needs to be translated to enable and facilitate trade. As countries across the world grow online populations, the thirst for information beyond shopping also grows. Why is the Indonesian Wikipedia only 30,000 pages when the English Wikipedia is 3M+ pages? 260M people live in Indonesia and most do not speak English.

There are many such discrepancies across the world. Knowledge is concentrated in a very few select languages, perhaps just English, German, French and Japanese, which is where the bulk (90%+) of the world's patents come from.

The following essays indicate the transformational impact translation can have on the world and the human condition:

Essay : The End Of The Language Barrier http://www.worldwidelexicon.org/original/344.html
http://www.ethanzuckerman.com/blog/the-polyglot-internet/
The Global Voices project is already a great example of people translating content that they want the world to see. http://globalvoicesonline.org/

It is not physically possible to convert the knowledge of the world with a purely human translation effort. Information is growing too rapidly. Automation is necessary, but automation alone cannot succeed; it needs competent human guidance to work. Global corporations now understand that they really need to make all kinds of information available to build loyal and satisfied global customer bases. Sharing general linguistic assets and intelligent man-machine collaboration can enable global enterprises to convert huge masses of knowledge content into many languages and truly make knowledge a universal resource. This does not mean high quality human translation falls by the wayside. There will always be a need for that, even if we actually do reach a point where MT can be used for some kinds of technical manuals.

So back to TAUS -- the approach is perhaps outdated. It is a traditional top-down, paternalistic, father-knows-best approach. Wikipedia would never have launched with that kind of thinking.

Google informed us that while there are 700,000 human translators who might be considered professionals across the globe, there are actually 600M+ competent bilingual people on the Internet who are capable of doing some translation or helping to clean up automated translation. Is it useful to get these people involved? How could this happen?

To my mind, the TAUS Data Association (TDA) does not make sense for the following reasons:

The TAUS approach has a predominantly localization focus, so it will not draw many professional translators, who are perhaps afraid of being marginalized, and it is too far off the beaten path in its focus to draw the 600M+ who could help with massive translation projects.

Even more specifically:

1) The technology platform for the pooled data is undefined, so it is very unclear what the benefit will be, given the very meager definitions of the platform that have been presented to date. There is also no clear definition or even discussion of the standards that could be the foundation for intelligent data collection and aggregation.

2) Anybody who has pooled TM from disparate sources or played with Open Source SMT (Moses) is aware that just pooling data does not automatically lead to benefits. A standardization and normalization process needs to take place to make the data equivalent and compatible. Initial TAUS tests have shown that there was very little benefit from just throwing data into a common bucket.

Again, this is not clearly recognized as an issue by TDA, and this should give ProZ.com reason to hesitate. I have worked with many differing pools of data with Moses SMT and understand that a significant amount of work needs to be invested in data cleaning, normalization and preparation for the leverage to be meaningful. There is very little awareness of this process at this point among the people who have joined, who all assume that more is better. In fact, it often is not.
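To make the cleaning and normalization point concrete, here is a minimal Python sketch of the kind of preparation pooled TM data might need before it is useful. It is only an illustration under stated assumptions: the function names (`normalize_segment`, `clean_pool`), the tag-stripping rule and the `max_length_ratio` threshold are hypothetical choices of mine, not anything TDA, TAUS or Moses prescribes, and real corpus preparation involves far more than this.

```python
import re
import unicodedata


def normalize_segment(text: str) -> str:
    """Return a canonical form of one segment (hypothetical helper)."""
    text = unicodedata.normalize("NFKC", text)  # unify Unicode variants
    text = re.sub(r"<[^>]+>", " ", text)        # drop inline markup tags
    text = re.sub(r"\s+", " ", text).strip()    # collapse whitespace
    return text


def clean_pool(pairs, max_length_ratio=3.0):
    """Normalize, filter and deduplicate (source, target) pairs.

    max_length_ratio is an assumed threshold for spotting misaligned pairs.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        src, tgt = normalize_segment(source), normalize_segment(target)
        if not src or not tgt:
            continue  # one side is empty after cleaning
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_length_ratio:
            continue  # lengths differ too much; likely misaligned
        key = (src.casefold(), tgt.casefold())
        if key in seen:
            continue  # same pair already contributed by another source
        seen.add(key)
        cleaned.append((src, tgt))
    return cleaned
```

Fed a pool where two contributors supplied the same sentence with different markup and casing, plus an empty target and a badly misaligned pair, this keeps a single clean pair. The point is simply that "more data" shrinks considerably once it is made equivalent and compatible.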

3) The costs are somewhat high for the relative value and will discourage the many smaller players who could also benefit and contribute. The business model encourages high volume contribution but has no mention of quality as a consideration for benefit. So it is possible that large amounts of crap will be collected. Old TM that has been lying around is "donated" to make a muddy soup that is low value to all.

4) Much of the data that will be made available can be downloaded without trouble from the contributors' websites anyway and could easily be aligned for a lower cost than that incurred by joining the association and buying the data access. Remember that most of the data they plan to put into the consortium is already on the website of the contributor. In the case of the EU, all the data can simply be downloaded as TMX files with no problem at all. So why would anybody want to pay and go to TDA to get it?

5) There is no outreach to the little man, the freelance translator, the 600M+ who under the right conditions could be encouraged to contribute 5 sentences each. The TDA is basically an old boys network. Web 2.0 is all about empowerment of the masses, engaging hundreds of thousands to change the world. Yes, only a few really contribute but the Wikipedia is a good example of open collaboration where the process really does produce usable quality. Tens of thousands do a little and maybe an elite 1000 is responsible for the huge bulk of the work. However, the Wikipedia is used by hundreds of millions. This is true for many social network based collaborations.

So if data sharing makes so much sense, why am I beating on the TDA?

Actually I think it is an admirable first effort that at least raises the possibility of shared action. The committee should be commended for coming up with the idea of sharing.

I think it is worth raising these issues as they may get attention and perhaps a few of these issues can even be addressed.

So what could ProZ.com do?

ProZ.com could be a real force in an initiative in which globally scattered experts collaborate to guide and manage intelligent data pooling. The KudoZ system is an example of how they could lead in building broadly leverageable linguistic assets. Maybe make this more open, package it, and find new ways to monetize the effort by selling subscriptions to corporations and LSPs: the ProZ.com Living Dictionary.

ProZ.com could work with a standards focused organization to develop a more useful data aggregation strategy that considers how the data could be normalized and standardized. In exchange for this, maybe contributing members get access to the super data at no cost and also can get consulting contracts with corporations who want to do this behind firewalls. I am sure that LISA and OSCAR would welcome a collaboration and ongoing dialogue. High quality and management could be monetized more easily.

ProZ.com could help members develop new kinds of professional services that focus on translation-related but not purely translation work, e.g. translation corpus development, linguistic consulting services, data normalization strategies, etc., and services from virtual expert panels that could, for example, advise global enterprises on the best way to convert a 100,000 page knowledge base into 10 languages, pulling in ProZ.com membership to help with the post-editing (for a fee of course).

ProZ.com could save the money they would spend going to TAUS meetings and develop educational programs that help members understand how to become more effective using basic NLP concepts, SMT, RbMT, corpus preparation, and efficient post-editing of MT. These are all technologies that are gaining momentum and that companies will use in the future. If the members understand these technologies, start getting involved and develop some expertise, they may find that there is plenty of work available as the world tries to move to a model where anything that exists in a source language should also exist in 30 other languages. I believe that world is coming soon.

ProZ.com could help members devise service business models based on the time and skill contributed, rather than the hated (3) cents per word model.

ProZ.com could make technology investments that would facilitate member engagement with these next generation technologies. Build a collaboration platform so that members can engage, experiment and develop expertise in a ProZ.com environment.

ProZ.com could form some technology partnerships with Web 2.0 companies so that their expert members could draw on the 600M+ online bilingual population as a resource. Managed crowdsourcing -- find ways to engage people to build up some linguistic assets that become critical for people to consider.

Having said all this, I am not a translator and so I cannot say I feel your pain. I see translation as a fundamentally human activity, one that will never be completely replaced by computers, but one where humans can use machines to leverage themselves. Like singers with microphones.

The coming wave of needed translation of enterprise and world knowledge is truly going to be too much for humans to handle alone, and ProZ.com should lead in a community collaboration approach to help change the world of translation and the role of translators, rather than follow this somewhat unclear TDA mission.

We are all faced with a real challenge today to grow and embrace the change that is coming. The polyglot internet is an opportunity for everybody in translation. If the leaders of the professional translation world do not step up and lead to enable and facilitate this, others will. Google has already made noises about making this possible. We all need to be thinking about how to get a million or ten million humans to contribute to a common data pool and effort that is valuable to hundreds of millions. TAUS should be commended for raising awareness of this issue, but change will come when thousands are engaged rather than an elite group of corporate professionals.