Internationalisation, Localisation

Label all dialects of a given language

(significantly updated 20 Oct 2018)

If you are an American developer, don’t label your dialect as simply English while referring to the British, Canadian and Australian localisations as British English, Canadian English and Australian English. I specifically mention American developers – and English settings – because a) this is an English-language document and b) I’ve noticed this tendency from US developers more often than I have from others. It’s not just English though; I’ve come across it with Portuguese and Chinese too. Label them all; you’re less likely to cause offence or annoyance to people who don’t speak or write the dialect you consider archetypical. Major software companies like Apple, Google and Microsoft typically do this right. For example, on most Google products and current versions of MacOS and Windows, you can’t just pick ‘English’; you’d have to pick a specific kind.

The unmarked-dialect problem can happen inadvertently when a company starts with their local dialect and later adds others. For example, Twitter has a generic ‘English’ (American) and ‘British English’ (added a few years ago). It would make more sense to add a label to the default English afterwards and call it English (US). Pinterest, Dropbox, Facebook and Apple retroactively added United States labels to their products after adding other English localisations.

Again, this isn’t just about English. If your software is available in both Brazilian and European/African Portuguese, don’t label the Brazilian version Português when you’ve labelled the European version Português europeu. Label them both, or only label Brazilian Portuguese. I’ve noticed the ‘local dialect as default’ habit with North and South American variants more than I’ve seen it with European ones, with the exception of French.

If you are going to leave a language unmarked, make it the originating country’s dialect (British English, European Portuguese, the French spoken in France, Simplified Chinese – but call them English, Portuguese, French and Chinese) and mark the local variants (US English, Brazilian Portuguese, Traditional Chinese, Canadian French, Swiss German). Better yet, label them all if you offer multiple versions of a language.

It’s not necessary to label a language if only one version of it is available unless

  • It’s likely to cause confusion, especially when the language is written in multiple scripts or there are mutual intelligibility problems between two subtypes of a language. This can arise with languages like Portuguese, Chinese and Serbian. If date formatting varies within a language, only offering one date format can also be confusing.
  • The lack of options reduces the usability of your site or application for writers and speakers of other subtypes of a given language. An example might be a website that only offers an English spellchecker that works in the United States, but nowhere else.
  • The ideal choice would be to add the missing options, but that’s not always practical; development teams have limited time, money and other resources. The next best thing is to recognise the fault and make it clear that you’re not just pretending that these other populations don’t exist.
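
As a rough sketch of the labelling rule above (in Python, with made-up name tables that are my own assumptions rather than anything a real product uses), a selector could mark every variant as soon as more than one variant of a language is offered, and leave a language unmarked when only one is available:

```python
from collections import defaultdict

# Hypothetical name tables, for illustration only.
LANGUAGE_NAMES = {"en": "English", "pt": "Português", "de": "Deutsch"}
REGION_NAMES = {
    ("en", "US"): "US", ("en", "GB"): "UK", ("en", "CA"): "Canada",
    ("pt", "BR"): "Brasil", ("pt", "PT"): "Portugal",
    ("de", "DE"): "Deutschland",
}


def selector_labels(locales):
    """Return a display label for each locale tag such as 'en-GB'."""
    regions_by_language = defaultdict(list)
    for tag in locales:
        language, _, region = tag.partition("-")
        regions_by_language[language].append(region)

    labels = {}
    for tag in locales:
        language, _, region = tag.partition("-")
        name = LANGUAGE_NAMES[language]
        if len(regions_by_language[language]) > 1:
            # Several variants offered: mark every one of them.
            labels[tag] = f"{name} ({REGION_NAMES[(language, region)]})"
        else:
            # Only one variant offered: the plain name is enough.
            labels[tag] = name
    return labels
```

This deliberately ignores the exceptions listed above (different scripts, spellcheckers and so on); a real selector would need a way to force a label for those cases.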

    The OS X spellchecking language selector: four versions of English, French, German, Spanish, Italian, Dutch, two versions of Portuguese, Danish, Swedish, Russian and Polish all labelled in their own languages.

    Fig. 1: Marking all dialects
    A good example from Apple’s spellchecking options in OS X. Every language is listed in its own language, despite the computer’s interface language being set to English (Dansk, Deutsch, Español, Italiano, Nederlands, Svenska, Français, Polski, Português, Русский) and all the regional variants for English and Portuguese are marked.

    Language selector, listing German, Spanish, French, Hebrew, Italian, Brazilian Portuguese, Russian and US English (with no other variant available).

    Fig. 2: Unnecessary dialect label for English

    Above is an example of what not to do with English. Since there are no other dialects of English included here, there’s no need to label it. Just label it as ‘English’ unless you have more than one version available or if there’s some feature of the application that would make it difficult for users of other English variants to use, like spellchecking. If I see English (US), I would expect to find, when clicking the list, other English variants, only to be disappointed. The Portuguese label is OK.


Make the relationship between location and language flexible

While you’ll find most people in a location may prefer the most commonly spoken local language (or languages) or dialect of a particular language, there may be exceptions: members of a minority language group, immigrants, expats, members of the armed forces, civil servants, language-learners or people who may simply prefer another language.

A lot of non-native English-speakers have grown used to setting sites to English-language options because of poor or confusing translations. Of course, your priority should be to improve your translations, but if your users want to see your English page despite them visiting from a country with a different primary language, let them see your English-language pages.

Some websites are not very flexible, but offer one or two common regional languages. Some only offer one. There are even websites that ostensibly sell to countries with multiple languages, but offer only one language (like websites that sell to Switzerland and have only a Swiss German option with no French or Italian available). Switzerland has four official languages – but according to Apple, eBay and PayPal there’s just one, German. Something similar happens with sites that target Belgian customers and only offer French, even though a slight majority of the population speak Flemish and there is a German-speaking minority.

In order to access eBay in a different language (or dialect) other than the one (or ones) associated with your country, you need to use a different country’s eBay site. You can’t visit a Spanish-language eBay site and have prices default to US Dollars or Pound Sterling.

There are lots of reasons why someone might want to have modular language and currency settings:

  • An Australian working in China: English (Australia), Chinese Yuan, China
  • A Chinese person working in Spain, paid in Euros: Chinese (Simplified), Euro, Spain
  • An American Japanese-learner in the Netherlands, who has a US bank account: Japanese, US Dollar, Netherlands
  • A Romanian working in Italy, being paid in Euros: Romanian (Romania), Euro, Italy
  • A German working in Singapore, being paid in Singapore Dollars: German (Germany), Singapore Dollar, Singapore
  • A British expat working temporarily in the US, being paid in pounds to a British bank account: English (UK), Pound Sterling, United States
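
One way to picture the combinations above is as three independent fields rather than one ‘country’ switch that drags language and currency along with it. The following is only a sketch in Python; the class and field names are my own invention, and the codes are the usual BCP 47, ISO 4217 and ISO 3166-1 ones:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Preferences:
    language: str  # BCP 47 tag, e.g. "en-GB"
    currency: str  # ISO 4217 code, e.g. "GBP"
    country: str   # ISO 3166-1 alpha-2 code, e.g. "US"


# Hypothetical defaults guessed from the visitor's location.
guessed = Preferences(language="en-US", currency="USD", country="US")

# The British expat in the US changes two fields; the country stays as it is.
expat = replace(guessed, language="en-GB", currency="GBP")
```

Because no field is derived from another, changing the language never silently changes the currency or the shipping country.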

An example of a site that allows for flexibility is Etsy. You can choose any country listed on the site, which is a separate setting from your language and currency. It also shows you which settings it has derived from your IP address and your computer’s language settings, and asks you to confirm them.

A screenshot of the language, location and currency selector on Etsy. It says English UK, US Dollar, United States.

In this screenshot, Etsy has cleverly detected my preferred settings: English (UK), US Dollar as a currency, and US as a location. As you can see, I’m not bound to choose English (US) or Spanish (if I’m lucky) or French or Chinese (if I’m really lucky). I can choose Dutch or Russian if I like. If I happened to have an overseas bank account I could change the currency from ‘US Dollar’ to ‘Euro’ (to give an example).

Websites and operating systems that don’t offer this facility, despite the existence of other language translations for the website, make lots of assumptions about the users and their habits. They may fit for some people but there will always be exceptions and allowing for that possibility may make your customers or readers happier because of your thoughtfulness. A way (for web apps) to base language preferences on what the user prefers, rather than what you think they might prefer, is to use the language their browser is set to in order to determine the display language.
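
Browsers send their language settings in the Accept-Language request header, so the browser-language approach can be sketched roughly as follows. This is a simplified Python illustration with function names of my own choosing, not a complete implementation of the header’s matching rules, and it assumes the site still lets the user override the result:

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0  # a tag without a q-value counts as fully preferred
        for param in parts[1:]:
            key, _, value = param.strip().partition("=")
            if key == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header, available, default):
    """Pick the first preferred tag the site offers, else the default."""
    offered = {tag.lower(): tag for tag in available}
    for tag in parse_accept_language(header):
        lowered = tag.lower()
        if lowered in offered:
            return offered[lowered]
        # Fall back from 'de-CH' to a plain 'de' if that is what is offered.
        base = lowered.split("-")[0]
        if base in offered:
            return offered[base]
    return default
```

A visitor in Switzerland whose browser asks for `de-CH,de;q=0.9,en;q=0.8` would get German here whatever their IP address suggests.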


We’re not all living in America

Despite what some developers seem to think, the USA does not constitute the entire world. This section is specifically targeted at American software companies and online businesses that market their services outside the US, but pay so little attention to internationalisation that their products are less useful for non-American users (though other sections can apply to them as well, and I’ve seen some non-American companies make the same mistakes).

  • Do not require a US ‘zip code’ (postcode) as a condition of signing up to your site. Say ‘postcode/zip code’ – and don’t make it mandatory for all countries, as some countries don’t use postcodes.
  • Do not require a ‘state’ if your site is meant for people in various countries. Not every country is divided into states (or provinces). Japan has prefectures. France has departments. Some countries just don’t have any administrative divisions that would be used in an address. Requiring states may put people in a state of frustration.
  • Do not put ‘United States’ at the top of the list of a country selector if you plan on having users from outside the US. If you plan on targeting users from English-speaking countries in general, you could put a group of English-speaking countries on top – like a list that has the US, UK, Canada, Australia, India, New Zealand and Nigeria at the top. A Spanish-language website could list countries like Spain, Mexico, Argentina, Guatemala and Colombia first.
  • Do not force the ‘Month/Day/Year’ date format on English-language users, especially if it’s written numerically. Most people in the UK, Australia, India, South Africa, New Zealand, Ireland, Nigeria and every other English-speaking country that is not the US, Canada, the Philippines or Belize will not want their dates to be formatted that way. All these countries use the Day/Month/Year format. Either make the date format flexible or detect the user’s date format based on their language settings, like GitHub.
  • If your website offers an English-language spellchecker and you plan to have users outside the US, you must, at the very least, offer British and Canadian English spellcheckers. Otherwise you will have very frustrated users who have words like ‘colour’ and ‘centre’ treated as mistakes when they’re not. There are open-source spellcheckers for all major English dialects. Include them or just get rid of your spellchecker. If you can only be bothered to offer an American spellchecker, you are being thoughtless. There is absolutely no excuse. The same applies to Portuguese and German – a German-language site may get users from Switzerland, and there are significant differences between Portuguese and Brazilian Portuguese. An example of this thoughtlessness is the Prezi site, which only offers an English (US) spellchecker. iWork for iCloud also has this problem.
  • Don’t make English (US) a hidden default. Base spellcheckers and other settings on what the user has chosen. Word processors should open documents in the user’s default language if there isn’t already a proofing language associated with the document. A hidden US default is particularly obnoxious when the user has set their operating system to another language entirely.
  • If you need bank account information, don’t ask for a ‘routing number’ if you plan on selling outside the US (or Canada, for that matter). The ‘routing number’ has other names in other English-speaking countries, like ‘sort code’. The name for a transactional bank account can vary too – it can be a ‘current account’ (UK), ‘checking account’ (US), ‘chequing account’ (Canada – note the spelling) or ‘cheque account’ (UK and Australia).
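
To illustrate the postcode and state points in the list above, here is a minimal Python sketch of per-country address rules. The rule table is illustrative only and would need checking for each country you actually sell to; the field names are my own assumptions:

```python
# Illustrative rules only; a real table would need checking per country.
ADDRESS_RULES = {
    "US": {"postcode_required": True, "region_label": "State", "region_required": True},
    "JP": {"postcode_required": True, "region_label": "Prefecture", "region_required": True},
    "HK": {"postcode_required": False, "region_label": None, "region_required": False},
}

# Unknown countries get the most permissive rules rather than the US ones.
DEFAULT_RULES = {"postcode_required": False, "region_label": "Region", "region_required": False}


def missing_fields(address):
    """Return the names of required fields that the address leaves empty."""
    rules = ADDRESS_RULES.get(address.get("country"), DEFAULT_RULES)
    missing = []
    if rules["postcode_required"] and not address.get("postcode"):
        missing.append("postcode")
    if rules["region_required"] and not address.get("region"):
        missing.append("region")
    return missing
```

The important design choice is the fallback: when the form knows nothing about a country, it asks for less, not for a US-style state and zip code.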

What not to do:

System Preferences

In OS X Yosemite, I can’t search for the option to ‘customise’, but I can search for ‘customize’. The common spelling in the UK, Australia, New Zealand and Ireland (and amongst many speakers in India, Nigeria and other countries where a variant of Commonwealth English is an administrative language) is ‘customise’. OS X should allow for both spellings. At least OS X is better than Prezi, though, who can’t even be bothered to include spellcheckers for English dialects other than American.

(Update, 20 October 2018: This particular Apple problem has been eliminated with the release of MacOS Mojave.)
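
Accepting both spellings in a settings search does not need much machinery. A minimal sketch in Python, assuming a hand-maintained list of spelling pairs (the list and the function names here are hypothetical):

```python
# Hypothetical list of spelling pairs; a real product would need a fuller one.
SPELLING_VARIANTS = {
    "customise": "customize",
    "colour": "color",
    "centre": "center",
}


def normalise(term):
    """Map a search term to one canonical spelling before matching."""
    term = term.lower().strip()
    return SPELLING_VARIANTS.get(term, term)


def search(query, options):
    """Return options whose keywords match the query in either spelling."""
    wanted = normalise(query)
    return [
        name
        for name, keywords in options.items()
        if wanted in {normalise(keyword) for keyword in keywords}
    ]
```

Both the query and the stored keywords go through the same normalisation, so it doesn’t matter which spelling the developers used internally.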

Further reading:
American Computing Assumption


Avoid using flags to designate languages

Web designers use flags as a convenient shorthand to stand for languages on multilingual websites. For example, a site available in French, English, Spanish and Japanese might show a French flag, a Union Jack (or an American flag), a Spanish flag and a Japanese flag. Unfortunately there are a lot of problems that the use of flags in language selectors invites, especially for languages that are used by several countries, like German, Spanish, English, French and Chinese.

There are lots of people who may be offended – or at the very least, perturbed – by an American flag to represent English: Canadians, British people, Australians and New Zealanders for example (and you may also offend Americans who recognise their dialect doesn’t represent the totality of the English-speaking world). A Union Jack for English may offend or confuse non-British, but English-speaking, users like Americans, South Africans, Canadians and Irish people. If you want to mark a specific dialect of a language, write something like Português do Brasil or Português brasileiro rather than using a Brazilian flag. And sometimes the flag doesn’t even match the dialect being used; examples include websites that have an English option that’s marked with the Union Jack, but the site itself is full of American spellings (or vice versa, where the American flag is used, but the English variant used is a Commonwealth version), or even worse, Brazilian Portuguese websites that use the flag of Portugal (which can confuse European or African Portuguese-speakers visiting the site and expecting European Portuguese).

Not to mention there are some countries that have multiple languages: India, China, Belgium, Nigeria, South Africa, Switzerland and Canada, for example. What language would you choose if you used an Indian flag? It could represent Hindi, Gujarati, Punjabi, English or Kannada, amongst others.

Bad:

  • German flag Deutsch – German is also spoken in Switzerland, Austria and Belgium.
  • Union Jack English – English is spoken in multiple countries, not just the UK. The Union Jack is preferable to the American flag, since English did originate in one of the UK’s constituent countries, not the US. But it’s still not a good choice.
  • American flag English – Using an American flag can be annoying or offensive to British, Canadian, Australian and other non-American speakers. English did not originate in the US.
  • Flag of England English – Using the flag of England may be confusing if you don’t recognise it.
  • Spanish flag Español – Using a Spanish flag may be confusing if you’re using a Latin American Spanish dialect. Also, Castilian Spanish is one of many languages spoken in Spain; others include Basque, Catalan and Galician.
  • Mexican flag Español – Spanish is spoken elsewhere, and Spanish did not originate in Mexico. Using a Mexican flag to represent Latin American Spanish is wrong. There are many other Spanish-speaking countries in Latin America, including Guatemala, Argentina, Honduras, El Salvador and Colombia.
  • Portuguese flag Português – Using a Portuguese flag can confuse or annoy Brazilian users if you’re using Brazilian Portuguese despite using the Portuguese flag. If the flag of Portugal is used to label Brazilian Portuguese, it may also mislead users who may be expecting European or African Portuguese because of the flag.
  • Brazilian flag Português – Potentially misleading or offensive to users from Portugal, Cape Verde or other countries that speak Portuguese. Portuguese did not originate in Brazil.

There are a few cases when there’s a one-to-one correspondence between language and the country associated with it, like Japanese, Turkish and Danish. But with large pluricentric languages like Chinese, English, Portuguese, French, Spanish and German, it’s not a good idea to have languages indicated with flags.

Countries like South Africa, India and China have large numbers of official or common languages. What language would a South African, Indian or Chinese flag represent? China alone contains nearly 300 languages. India’s constitution recognises 22 scheduled languages. South Africa has eleven official languages. Not to mention that Mandarin Chinese is a pluricentric language that’s spoken outside Mainland China, as is Cantonese.

Examples of what not to do seen in the wild:

Panic.com blog: English is labelled with an American flag on the blog page, and Japanese is labelled with a Japanese flag – and with its own name in kanji and an English translation. These flags are redundant.


I like Panic. I use their Coda product regularly. But it’s a shame to see them resorting to this frustrating trope, especially with the American flag used for English. Just label the languages without the flags – the names are already there. The flag is admittedly less of a problem for Japanese than it is for English, but I stand by the principle that it is never a good idea to label languages with flags.

This French lyrics site offers an English translation, which is indicated with a Union Jack. The word ‘English’ would be better and more recognisable – people visiting the site from the US, Canada or other countries may think the Union Jack button without an accompanying ‘English’ label means they’re being taken to a UK-centric site.

Screenshot of English language selector, represented by a circular Union Jack button

At oldapps.com, flags are used for languages. An American flag is used for English – again, English did not originate in the US. They didn’t use a Mexican flag for Spanish, did they? Even worse, Chinese and Spanish are labelled in English instead of in their own languages, though German says ‘Deutsch’ as it should. It’s not even consistent. I can say that this is the worst language selector I’ve ever seen.

Language selector with flags. Spanish and Chinese are labelled in English.

Don’t use flags for languages.

Here’s what you can do instead:
Languages

  • Deutsch (Deutschland)
  • Deutsch (Österreich)
  • Deutsch (Schweiz)
  • English (UK)
  • English (US)
  • English (Canada)
  • Español (España)
  • Español (Latinoamérica)
  • Français (France)
  • Français (Belgique)
  • Français (Canada)
  • Português (Brasil)
  • Português (Portugal)

Countries

  • The Union Jack. Click for UK orders
  • American flag Click for US orders

If you’d like to read a blog dedicated to the criticism of flags to indicate languages, you can visit Flags are not languages – an excellent resource for anyone developing multilingual websites.


Provide accurate translations

If you want to translate your software into other languages (or foreign dialects of the same language), please be sure your translations are accurate – otherwise your users might be confused, irritated or even offended. This should be self-explanatory advice, but for some developers – including large multi-national companies like Microsoft, Apple and Yahoo – it’s not.

Screenshot of the German-language version of Microsoft Exchange. 'Status' is mistranslated as 'State' or 'Bundesland'.

Image from here

In the German translation of Microsoft Exchange 2007, a particularly ridiculous error turned up in the user interface – ‘state’ (as in ‘status’) was translated as Bundesland or ‘federal state’. This is like a piece of English-language software asking for the ‘province’ of an internet connection. According to the website I found this image from, Microsoft may have been using an automatic translator from English to German, which is fraught with pitfalls. Don’t use automatic translation without having a fluent speaker read the translation to see if it’s accurate. Computerised translations can often be wrong, and your user base will let you know it. It takes longer, but it’s worth it. The example is from Microsoft, a billion-dollar company – surely they, of all people, have the resources to create an accurate German translation?

A lot of people end up using English-language software even though they’re not fluent English-speakers or readers, simply because the translations into their native language are that bad. If you plan on marketing outside your home country and have the resources, make the effort to provide good translations. Not perfunctory machine translations that haven’t been reviewed by native speakers or reasonably fluent second-language learners – good translations. It’s absolutely inexcusable that a billion-dollar company like Microsoft couldn’t proofread their software well enough to avoid asking their German-speaking users about the ‘federal state’ of a connection.
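
Part of the ‘state’ problem is that a bare English word gives a translator (human or machine) nothing to go on. Gettext-style localisation tools deal with this by attaching a context to each message. The following Python sketch imitates the idea with a plain dictionary; the catalogue contents and the function name are made up for illustration and are not Microsoft’s actual strings:

```python
# Hypothetical German catalogue keyed by (context, source text).
GERMAN = {
    ("connection status", "State"): "Status",
    ("postal address", "State"): "Bundesland",
}


def translate(catalogue, context, text):
    """Look up a translation by context; fall back to the source text."""
    return catalogue.get((context, text), text)
```

With the context attached, the same English word can be translated as ‘Status’ for a connection and ‘Bundesland’ for an address. It doesn’t replace review by a fluent speaker, but it gives that reviewer what they need to get it right.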

 


Use flexible date and time formats and avoid confusing users

(updated 16 Sep 2015 – information about LinkedIn and date formatting issues)

Some developers will hard-code the date and time format in their software. Sometimes this only applies to their English-language interfaces, but sometimes it extends further, which is even more thoughtless. Many American and anglophone Canadian developers tend to impose the ‘Month/Day/Year’ date format on everyone, regardless of what is popular elsewhere. Do not hard-code date and time formats. Either let the user choose the format (like Wikipedia) or detect it from the user’s computer settings (like GitHub).

There are three date formats, listed in approximate order of popularity:

  • Day/month/year – Long form: 21(st) January 2015 Short form: 21.01.2015 (most popular in Europe, Central America, non-English-speaking North America, South America, Africa and Western Asia, and used in a military context in the USA). Please note that commas are not used in this date format.
  • Year/month/day – Long form: 2015 January 21 Short form: 2015-01-21 (the ISO standard, most popular in East Asia, also used as a standard date format when sorting dates in computing)
  • Month/day/year – Long form: January 21(st), 2015 Short form: 01/21/2015 (popular in anglophone North America, particularly the USA; you may see the month/day/year format written in long form by other English-speakers, but not the confusing short form)

Of these three date formats, Day/Month/Year is the most popular. If you asked North American software developers, you’d think it would be Month/Day/Year, even though this date format is only used on a widespread basis in the USA and Belize, and by some people in English-speaking Canada.

The Month/Day/Year format can be confusing when written in short form, especially with a date like 03/04/2014 – are you referring to 3 April, or 4 March?
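
The ambiguity is easy to demonstrate. This small Python sketch (the function name is my own) tries both numeric readings of a slash-separated date and returns every valid one:

```python
from datetime import datetime


def possible_readings(text):
    """Return every distinct date a slash-separated numeric string could mean."""
    readings = set()
    for pattern in ("%d/%m/%Y", "%m/%d/%Y"):
        try:
            readings.add(datetime.strptime(text, pattern).date())
        except ValueError:
            pass  # this reading is impossible, e.g. there is no month 21
    return sorted(readings)
```

Any date whose day is 12 or lower, and different from the month number, comes back with two readings; software receiving such a string has no way of knowing which one the writer meant.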

Avoid using numerical dates, where most date-format confusions happen. 10/09/2004 can be 10 September 2004 or 9 October 2004. Write the month name out or abbreviate it using words, not numbers. ‘10 Sep 2004’ and ‘Sep 10 2004’ are unambiguous, unlike 10/09/04 and 09/10/04. You’ll also want to write the year out to avoid confusion. Christopher Heng has an excellent article about avoiding numerical dates on The Site Wizard if you’d like to read more.
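
Writing the month as a word takes very little code. Here is a minimal Python sketch, assuming the user’s preferred order is stored as a short setting such as "dmy" (the setting values and the function name are my own assumptions):

```python
from datetime import date

# English abbreviations are written out here so that the result does not
# depend on the server's locale settings.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_date(day, order="dmy"):
    """Format a date with an abbreviated month name and a four-digit year."""
    month = MONTHS[day.month - 1]
    if order == "dmy":
        return f"{day.day} {month} {day.year}"
    if order == "mdy":
        return f"{month} {day.day} {day.year}"
    if order == "ymd":
        return f"{day.year} {month} {day.day}"
    raise ValueError(f"unknown date order: {order}")
```

Whichever order the user picks, the output cannot be misread, because the month is never a number.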

Dropbox, a popular cloud-storage provider based in San Francisco, had a hard-coded Month/Day/Year date format (written numerically, to add to the confusion) on the English-language version until the end of 2013, after years of complaints from their international user base. Evernote and Firefox OS are currently the subject of similar complaints. There is a two-year-old thread from Dell users complaining about one of their products having a hard-coded M/D/Y date format if their language is set to English. It should not take five years for a website with an international user base to support date formats understandable by a wide variety of people. People on LinkedIn have been complaining about a default, unchangeable US date format (written numerically on certain parts of the site) for the last two years, with no change in sight. If I am able to change the date format from M/D/Y to D/M/Y on a cheap mobile phone, then I should be able to do it with a smartphone UI or a web app.

A screenshot showing some OS X System Updates on Apple's website. All the dates are written in a numerical Month Day Year format.

What not to do: Apple’s support website shows numerical Month/Day/Year dates for updates, even if you have a region chosen that should change the way dates appear. The first three dates could be ambiguous, especially at the end of the year. I tried this list of updates with both the US and UK localisations of the Apple sites and the date formats didn’t change.

Date separators can also use different punctuation – in English-speaking countries, there are a variety of different date separators, depending on region (the commonest are the slash/stroke and the full stop/period), but in German-speaking countries, people use the full stop.

Give the option for 24-hour time (0:00 to 23:59) and 12-hour time (12:00am to 11:59pm). Most non-English-speaking countries don’t use 12-hour time, and there are many English-speakers who prefer 24-hour time in user interfaces (it’s the standard in UK English regional settings on OS X, Windows and iOS).
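
Offering both clocks is a one-setting job. A minimal Python sketch, with a hypothetical `use_24_hour` preference:

```python
from datetime import time


def format_time(moment, use_24_hour=True):
    """Format a time of day on the 24-hour or the 12-hour clock."""
    if use_24_hour:
        return f"{moment.hour}:{moment.minute:02d}"
    hour = moment.hour % 12 or 12  # hours 0 and 12 are both shown as 12
    suffix = "am" if moment.hour < 12 else "pm"
    return f"{hour}:{moment.minute:02d}{suffix}"
```

The stored value is the same either way; only the display changes with the user’s preference.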

Examples of date format flexibility: Windows allows the user to choose any date order they like, even when using English (US) as their region format.

Examples of date format inflexibility: English-language OS X and iOS will only allow you to put the day in front of the month if you’re using a region format other than US or Canada. I ended up changing my region format to UK to get my preferred date order – I hate month/day/year.

Some flexibility, but room for improvement: Trello will detect 12-hour and 24-hour time formats, and will understand different inputted date formats, but the text still shows Month/Day/Year date formats. They’re not numeric, but some users may want the dates to be displayed as ‘12 October 2015’ instead of ‘October 12, 2015’. (Update, 16 May 2015: Trello now detects both date and time format; the day is now in front for me, which is in line with my date settings on OS X.)

Here is a good example of what to do from Wikipedia – there are no numerical dates except for year-month-day, which is unambiguous, and you have the choice between Day Month Year, Year Month Day and Month Day Year.

A date format selector on Wikipedia, with options for no preference, month day year, day month year and year month day.

Addition, 11 April 2015:

I’m convinced that the US ‘Month/Day/Year’ date format should be avoided in sites that will be used outside the US – and especially websites that will be translated into other languages. People including date formatting in scripts should make the default English format the ISO standard (Year/Month/Day), making regional preferences like MDY and DMY opt-in by specifying a language and dialect. I’ve seen too many websites where the American date format lurks as an artefact, even on localisations for other languages like French and German.
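
Here is roughly what I mean, as a Python sketch. The opt-in table and the function name are hypothetical; the point is only that the regional orders are something you ask for, and the ISO format is what you get otherwise:

```python
from datetime import date

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# Hypothetical opt-in table: only locales listed here get a regional order.
REGIONAL_ORDER = {"en-GB": "dmy", "en-AU": "dmy", "en-US": "mdy"}


def format_english_date(day, locale=None):
    """Format a date in English, using ISO 8601 unless a locale opts in."""
    order = REGIONAL_ORDER.get(locale)
    month = MONTH_NAMES[day.month - 1]
    if order == "dmy":
        return f"{day.day} {month} {day.year}"
    if order == "mdy":
        return f"{month} {day.day}, {day.year}"
    # No locale given, or one without a rule: fall back to year-month-day.
    return day.isoformat()
```

A script that forgets to pass a locale then produces 2015-01-21, which nobody can misread, instead of leaking an American date into a French or German page.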