Identifying languages and countries
Written by ForeignExchange Translations on Monday, April 26, 2010
The translation business isn't exactly known for adhering to established norms. From quality definitions to translation memory discounting to service-delivery processes, every translation company pretty much does its own thing.
The same is true when it comes to language and country codes. Say, for instance, you want to identify German. Do you use DE, D, GE, DEU, or something else? Translation companies routinely use all of these and many more abbreviations.
Most of the time, it is easy enough to figure out what someone means, but translators and clients can still get confused. One company's "PO" abbreviation could refer to Polish and another's to Portuguese.
Fortunately, there exists an easy solution: ISO 639.
As of 2009, four international standards for identifying languages have been approved by the International Organization for Standardization:
- ISO 639-1 two-letter codes
- ISO 639-2 three-letter codes
- ISO 639-3, also three-letter codes, for all language categories identified by Ethnologue
- ISO 639-5 for language clusters and groups
Identifying languages is a good first step. But what if you want to distinguish, say, a Spanish translation for Peru from one intended for Mexico? Or what if a client requests a translation for "IN" - does that refer to Indonesia or India?
Luckily, there is a solution for this as well: ISO 3166 is the international standard for country codes. It lists 246 official short names and code elements as shown on this list of English country names and code elements. By combining a language code with a country code (conventionally written in uppercase), anybody can now distinguish the French spoken in Switzerland (fr_CH) from that of France (fr_FR) and Canada (fr_CA).
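For readers who keep language and country information in a spreadsheet or script, here is a minimal Python sketch of the idea. It is only an illustration, not an official tool: the function names `make_locale` and `describe` are made up for this example, and the two lookup tables hold just a handful of ISO 639-1 and ISO 3166-1 two-letter codes rather than the full lists. It accepts either casing and either separator, so fr_ch and fr-CH are treated as the same thing, and it rejects ad-hoc abbreviations such as "PO".

```python
import re

# Illustrative subsets only (assumption: a real project would load the
# complete ISO 639-1 and ISO 3166-1 code lists from a maintained source).
LANGUAGES = {
    "de": "German",
    "es": "Spanish",
    "fr": "French",
    "pl": "Polish",
    "pt": "Portuguese",
}
COUNTRIES = {
    "CA": "Canada",
    "CH": "Switzerland",
    "FR": "France",
    "ID": "Indonesia",
    "IN": "India",
    "MX": "Mexico",
    "PE": "Peru",
}


def make_locale(language, country):
    """Combine a two-letter language code and a two-letter country code.

    Language codes are normalized to lowercase and country codes to
    uppercase, which is the common convention (for example fr_CH).
    """
    lang = language.strip().lower()
    ctry = country.strip().upper()
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language code: {language!r}")
    if ctry not in COUNTRIES:
        raise ValueError(f"unknown country code: {country!r}")
    return f"{lang}_{ctry}"


def describe(tag):
    """Return (language name, country name) for a tag like 'fr_CH' or 'fr-ch'."""
    match = re.fullmatch(r"([A-Za-z]{2})[-_]([A-Za-z]{2})", tag.strip())
    if match is None:
        raise ValueError(f"not a language_COUNTRY tag: {tag!r}")
    normalized = make_locale(match.group(1), match.group(2))
    lang, ctry = normalized.split("_")
    return LANGUAGES[lang], COUNTRIES[ctry]


if __name__ == "__main__":
    print(make_locale("es", "PE"))   # Spanish for Peru
    print(make_locale("es", "MX"))   # Spanish for Mexico
    print(describe("fr-ch"))
```

With the standard codes, the "IN" question from above also has a single answer: IN is India, while Indonesia is ID.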
So, let's do ourselves and our clients a favor and use ISO 639 and ISO 3166 to identify languages and countries.
Thanks, Megan, for the idea!
For further reading, take a look at the following articles:
- International standards for date and time
- Wikipedia's entries for ISO 639 and ISO 3166
- LangID helps you identify unknown languages





Cf. http://www.netreport.fr/knowledgebase/UserHelp/11_Reference_Material/04_Net_Report_Local_IDs/00_Introducing_Locale_IDs.htm
Peter.wilms.van.kersbergen@medtronic.com
I have not given up hope of seeing the ISO abbreviations used properly on the list ... one day.
(via LinkedIn)
For example, in Mainland China, the Simplified script is used almost exclusively; however, there are two main spoken dialects (Mandarin and Cantonese). At the same time, both Taiwan and Hong Kong use the Traditional script, but the former speaks Mandarin, while the latter speaks Cantonese.
In general, I'm used to seeing "ZH-CN" for Mandarin Chinese, Simplified Characters, and "ZH-TW" for Mandarin Chinese, Traditional Characters. Presumably, "ZH-HK" would suffice for Cantonese/Traditional, but that leaves out the large population of Cantonese-speaking, simplified character-writing folks in the southern Chinese mainland. And that's all to say nothing of the other 12 (less common) main dialects of Chinese recognized by Ethnologue...
Fortunately, in the world of written translations, simplified characters can be read by any Mandarin, Cantonese, or Hakka speaker who has studied them, and the same goes for traditional characters. Audio content gets a little more dicey, but can typically be clarified with a client as necessary. All the same, it presents an interesting (and complex) case for ISO codes.