Medical Translation Insight: The "super cloud" of TM sharing? - ForeignExchange Translations

The "super cloud" of TM sharing?

TM sharing is all the rage.

Now TAUS, long a proponent of sharing translation memory ("TM") content across industry players, has released its TAUS Data Association (TDA) language data exchange portal.

As of today, TDA's web site lists 29 member companies, split roughly down the middle between translation suppliers and buyers. To be sure, there are some large companies on this list. The TAUS team's hard work to enlist members has clearly paid off.

At the same time, there remain serious questions regarding usefulness and intellectual property.

The former, usefulness, is limited right now but will undoubtedly improve with time. There are limited language combinations, industries and domains. Also, the content seems to be of, shall we say, "mixed quality".

A language search for "pop-up blocker" brought up different results for "computer hardware" and "computer software" industries. More disconcerting is the fact that the combined ten segments displayed are inconsistent, contain typos and non-printable characters, and some are clearly specific to the companies that provided them.

These same issues exist in traditional TMs as well. However, with traditional TMs, the user is more or less in control of the data. We generally have a sense of the source, quality, and appropriateness for different TMs and segments.

With a shared TM, we know almost nothing about the origins and quality of the data. It's hard to imagine how this would enhance the translation throughput or quality.

In part, that's because of the above-mentioned intellectual property concerns. While software companies might not be concerned about sharing their content, it's hard to imagine drug and device companies endorsing this practice.

Similarly, do the language service providers participating in this effort have the right to share TM segments with third parties? I can't speak for other translation companies, of course, but at ForeignExchange, we wouldn't do this even for clients where we explicitly own the TMs.

And why would we want to? Currently, there is no medical industry/domain, and the data quality would have to be substantially better before this would be worth considering.

Nonetheless, TDA does represent a substantial step forward in the translation business. It demonstrates that innovation is possible, that clients and suppliers can work together, and that it's possible to develop solutions, not just tweak services or add to tool features. And for all of that, congratulations to TAUS!



    Anonymous said...
    This post reads a little like – “good idea – but not in my backyard” and also shows that you just have not taken the five minutes needed to read the material. The image used for the post is taken from the press release, which clearly states that there are 45 founding members (not 29) and that there were 70 language pairs available at launch (now 80 after a month or so). This is not a limited number of language pairs. There are four companies quoted in the press release and four others on the website. Each explains their motivation for joining. On the same page as the press release, there is a link to a story on the benchmarking work done by the University of Leeds showing 16% increased leveraging (just one advantage and an initial benchmark).

    I can only assume you refer to limited languages because you did not sign up for free registration of the language search tool and so access was limited to 7 languages. I just tried “pop up blocker” for each of these 7 and found no results.

    Your post says that the repository gives “inconsistent” translations. Sure. TDA accumulates all the translated data from the global translation community. The reality is that different companies use different terminology. One of the unique values of TDA is that everyone can immediately find these variances in translations. The next release of the Language Search Engine will also show the frequency of the usage of different translations as well as the names of the data owners. This way, users get guidance on which translation they may prefer to use in any language. The Language Search Engine also gives part of speech information and in a future release will display synonyms and related terms. We are working with the Centre for Translation Studies of the University of Leeds to add these sophisticated linguistic features. We are confident that the Language Search Engine (which is a free public service) will help to make QA processes a lot more efficient and will also help over time to streamline and unify industry-specific terminology.

    The TMs shared in TDA contain errors. That’s true, and as your post rightly states, this is also true for TMs stored on companies’ own servers. However, the fact that companies start sharing TMs on an industry platform will help to improve quality. Firstly, TDA applies some basic checks, such as TMX compliance, an XML tag integrity check, and a check for missing translations. Secondly, there is the embarrassment that data providers and data owners will be exposed to when they share poor-quality TMs publicly. Such transparency improves quality (i.e. members are able to see exactly who the owner and provider of a TM is). Thirdly, TDA has introduced a web 2.0 peer review for translation quality. TMs get star ratings from other users of the platform.

    Legal teams representing each of the 45 founding members, who come from all sides of the industry, have reviewed and agreed on a watertight (and succinct) data sharing and pooling agreement, which takes account of intellectual property, inter alia. Language service providers are already asking their clients to share TMs in the TDA Language Exchange Portal.

    The cloud image used for the press release shows healthcare, medical instruments, and pharma and biotechnology as distinct industry categories. Sharing TMs of some published material would bring real benefits to all concerned. And for the reasons given above, quality will improve as a result. There are areas in all industries where it pays to cooperate as a complement to competing. Linguistic assets are one of these areas.

    This response is almost as long as the brochure explaining all aspects of TDA services, because of the low-quality analysis provided by the initial post. I would urge you and anyone reading this to take the few minutes needed to look at www.tausdata.org properly in order to get a good overview. And if you want to know more, you can sign up for a free fortnightly webinar, request the business plan, or email and/or call us.

    Rahzeb Choudhury said...
    Apologies....notice now that you looked up "pop-up blocker", which is there....the following paragraph then applies...

    Kirti said...
    I think the TAUS/TDA has very selectively released information about the "benefits" of sharing data. They report what they consider positive results but interestingly choose to either completely suppress or ignore any study that shows that the benefits of sharing may not be so great.

    Yes there are benefits when the data is used with care but there are also problems introduced when one is not careful.

    I was involved with a study where TAUS data was pooled for SMT. The benefits were not so clear and it showed that there are serious data normalization and standardization issues to address before benefits can accrue.

    It also indicates that the data needs to be cleaned before use.

    This is a link to a detailed 60-page report that describes some of the data quality issues and shows that even for SMT (the data hog) it does not always make sense to consolidate data. It clearly shows that MORE DATA IS NOT ALWAYS BETTER.

    Also, what most do not realize is that as much as 60% of the data can be obtained free in TMX form directly from the European Union. Jost's article also points out several other sources from which you can get TMs without the excessive membership fees and overhead that the TDA requires. See this link for a large variety of mostly free TM resources available on the web: http://translationjournal.net/journal/51pondering.htm

    It would be good to see that TDA has the integrity to post negative or even slightly less positive results as well as the ones that seem to be designed to attract more membership.

    Also since we are counting members - has anybody asked: If this is such a great deal why have you not been able to get more than 50 members after your publicizing these "benefits"?

    As an SMT developer I can say that it simply does not calculate favorably on a cost/benefit basis for us and the last thing we all need is to have one more membership to add to our fixed costs.

    Kirti Vashee
    Asia Online

    ForeignExchange Translations said...
    Thank you for the detailed comments, Rahzeb and Kirti!

    Just to clarify, Kirti's mention of "Jost's article" is in reference to our post When TMs jump the shark.

    Anonymous said...
    May I ask what the final purpose of this collection of TMs is? To offer it as a free service to anyone? To make companies believe that they can just have someone do a few searches and, voilà, a translation of their materials comes out in 5 languages? I am not clear on the purpose...


(c) Copyright 2010, ForeignExchange Translations, Inc.