Medical Translation Insight: Prepping messy files for TM usage - ForeignExchange Translations

Prepping messy files for TM usage

Translation memories are great - IF the TM databases are clean and IF the source files are well structured. Unfortunately, that is not always the case.

Luckily, Dave Turner has taken it upon himself to remedy this situation. He has developed two terrific tools and made them available free of charge.

The first is his Code Zapper macro, which can be used to remove rogue codes. For example, it moves place markers from the middle of words to the end of the paragraph.
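
To make the idea concrete, here is a minimal Python sketch of moving place markers out of the middle of words. It is not Dave Turner's macro and says nothing about how Code Zapper works internally. The `{1}`-style marker format and the function name `move_midword_tags` are assumptions made up for this illustration.

```python
import re

# Assumption: place markers look like {1}, {2}, ... (hypothetical format).
# A marker counts as "mid-word" when it has a word character on both sides.
MID_WORD_TAG = re.compile(r"(?<=\w)\{\d+\}(?=\w)")


def move_midword_tags(paragraph: str) -> str:
    """Move markers found inside words to the end of the paragraph."""
    # Collect the mid-word markers in their original order.
    moved = MID_WORD_TAG.findall(paragraph)
    # Remove them from the words, which rejoins the split word halves.
    cleaned = MID_WORD_TAG.sub("", paragraph)
    # Append the collected markers after the paragraph text.
    return cleaned + "".join(moved)


print(move_midword_tags("The trans{1}lation mem{2}ory is {3}clean."))
# prints: The translation memory is {3}clean.{1}{2}
```

The marker `{3}` stays where it is because it sits at a word boundary, so it does not break a word into two pieces that a TM system would fail to match.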

Dave then followed this up with Format Fixer, which:

  • deletes leading spaces and tabs inserted typewriter style to indent text, and sets the equivalent indent,
  • deletes excess spaces between words,
  • deletes excess paragraph marks and sets the equivalent vertical spacing,
  • attempts to correct frequent punctuation errors (for example, a space before a comma or inside a parenthesis),
  • tries to fix PDF converted files (removes hard and soft returns to make text wrap properly),
  • adds a space between a number and a letter, as in 20ohm, 10daN -> 20 ohm, 10 daN.
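
Several of the fixes listed above are easy to picture as plain-text substitutions. The Python sketch below shows the idea on plain strings. It is only an illustration, not Format Fixer itself, and it ignores word-processor formatting such as indents and vertical spacing. The function names `tidy_paragraph` and `unwrap_lines` and the exact rules are assumptions.

```python
import re


def tidy_paragraph(text: str) -> str:
    """Apply a few Format Fixer-style cleanups to one paragraph of plain text."""
    # Drop leading spaces and tabs used as a typewriter-style indent.
    text = re.sub(r"^[ \t]+", "", text)
    # Collapse runs of two or more spaces between words into one space.
    text = re.sub(r" {2,}", " ", text)
    # Remove a space before a comma and just inside parentheses.
    text = re.sub(r" +,", ",", text)
    text = re.sub(r"\( +", "(", text)
    text = re.sub(r" +\)", ")", text)
    # Add a space between a number and a letter that follows it directly.
    # The lookbehind skips digits that are part of a word, such as "H2O".
    text = re.sub(r"(?<![A-Za-z\d])(\d+(?:[.,]\d+)?)(?=[A-Za-z])", r"\1 ", text)
    return text


def unwrap_lines(text: str) -> str:
    """Join lines split by single returns, keeping blank-line paragraph breaks."""
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


print(tidy_paragraph("   The  resistor ( 20ohm ) , rated at 10daN"))
# prints: The resistor (20 ohm), rated at 10 daN
```

A simple number rule like this would also split ordinals and labels such as "1st" or "3D", so anything used on real files would need a list of exceptions.
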

Code Zapper and Format Fixer are both available free of charge on the Yahoo! Groups dejavu-l forum and can be used with any TM system. Check them out!

[Thanks to Kevin Lossner's blog for the tip!]

UPDATE 2010-01-02: Dave Turner has published an updated version of CodeZapper with improved PDFTidy and PDFFix routines. PDFTidy should do a better job of tidying up PDF-converted files before CodeZapper is run, and PDFFix should do a better job of eliminating stubborn rogue codes, especially in PDF-converted files.


(c) Copyright 2010, ForeignExchange Translations, Inc.