What language should a login service present to a user?  The preferred language, of course.  The remaining exercise is to define the two terms language and preferred, and then to determine who prefers the language for the specific user.

A login service may use the following steps to determine what language to present, stopping the search at the first positive answer:

  1. Check if the user has chosen a language for the login page. End user choice is recorded in a cookie.
  2. Check if the referring service provider is requesting a preferred language
  3. Check if the browser has set an Accept-Language header
  4. Use a globally defined language as the default setting
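
The checklist above is a "first positive answer wins" chain, which can be sketched in a few lines of Python.  This is only an illustration: the names `SUPPORTED`, `DEFAULT_LANGUAGE`, `first_supported` and `choose_login_language` are hypothetical, as is the set of supported languages, and none of it is taken from Feide or simpleSAMLphp.

```python
from typing import Iterable, List, Optional, Set

# Hypothetical configuration: languages the login page is translated into,
# and the globally defined default (step 4).
SUPPORTED: Set[str] = {"en", "no", "nb", "nn", "se"}
DEFAULT_LANGUAGE = "en"


def first_supported(candidates: Iterable[Optional[str]],
                    supported: Set[str] = SUPPORTED,
                    default: str = DEFAULT_LANGUAGE) -> str:
    """Return the first candidate that is a supported language, else the default."""
    for candidate in candidates:
        if candidate and candidate.lower() in supported:
            return candidate.lower()
    return default


def choose_login_language(cookie_language: Optional[str],
                          sp_requested_language: Optional[str],
                          browser_languages: List[str]) -> str:
    """Apply the checklist order: cookie, service provider, browser, default."""
    candidates = [cookie_language, sp_requested_language, *browser_languages]
    return first_supported(candidates)
```

For example, `choose_login_language(None, "se", ["nb"])` picks the language requested by the service provider, because no cookie is set and the browser is only consulted after that.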

A web service (service provider) may run through a similar checklist:

  1. Check if the user has chosen a language for the site. End user choice is recorded in a cookie.
  2. Check if the preferredLanguage attribute has a value supported by your site
  3. Check if the browser has set an Accept-Language header
  4. Use a globally defined language as the default setting

Preferred language is usually preferred by the end user, but in Feide and other federations the home organization of the end user may also choose to enforce a specific language.  For elementary schools this could be a good choice, as children may add choices not supporting the educational goals.  In a situation with heated language discussions, this policy option could lead to further discussion and great opportunities for hands-on learning about political processes.

The Accept-Language header reflects the configuration of the browser or operating system.  Such configuration does not necessarily reflect the user's needs for web services, but may be optimized for something else, for instance keyboard layout (Danish English being a popular choice, and this is weird even if English is an Anglo-Saxon language and the Angles and Saxons hailed from Denmark).  Using the Accept-Language header from the browser without the other choices is not considered good practice, as commented in advice from W3C:

The HTTP Accept-Language header was originally only intended to specify the user’s language. However, since many applications need to know the locale of the user, common practice has used Accept-Language to determine this information. It is not a good idea to use the HTTP Accept-Language header alone to determine the locale of the user. If you use Accept-Language exclusively, you may handcuff the user into a set of choices not to his liking.
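
To use Accept-Language as one signal among several, the header first has to be turned into an ordered list of language tags.  The following is a minimal sketch of that parsing step, assuming the usual `tag;q=weight` syntax; the function name `parse_accept_language` is made up for this example, and a production service would more likely rely on its web framework's own parser.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return language tags from an Accept-Language header, highest weight first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full weight
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:  # q=0 means the language is explicitly not wanted
            # Negative weight sorts high weights first; position keeps header order on ties.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]
```

The resulting list is only a list of candidates: it should still be checked against the languages the site supports, and only after the user's own choice has been considered.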

The vocabulary for language codes is defined in several standards, among them ISO 639-1 (two-letter language codes) and ISO 639-2 (three-letter language codes).  Feide is changing the encoding from two-letter language codes to three-letter language codes to add support for the two Sami languages that do not have two-letter codes: Southern Sami and Lule Sami.  Support exists for English, Norwegian, Norwegian bokmål, Norwegian Nynorsk and Northern Sami.  If you ask me why a small group of people like the Norwegians (4.8 million) have two kinds of Norwegian (two really similar forms of the same language), you should prepare yourself for a long discussion on urban/rural, political power, nationalism and the construction of a national state.  Or you could just accept the way it is, and move on to why the Sami (30,000-70,000) have multiple distinct languages.
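
During such a transition a service may receive either kind of code, so it helps to normalize everything to the three-letter form.  The sketch below covers only the languages mentioned above, using their ISO 639-1 and ISO 639-2 codes; the table and the function `to_three_letter` are illustrative assumptions, not Feide's actual implementation.

```python
from typing import Optional

# ISO 639-1 (two-letter) to ISO 639-2 (three-letter) for the languages named above.
TWO_TO_THREE = {
    "en": "eng",  # English
    "no": "nor",  # Norwegian
    "nb": "nob",  # Norwegian bokmål
    "nn": "nno",  # Norwegian Nynorsk
    "se": "sme",  # Northern Sami
}

# Languages that have no two-letter code at all.
THREE_LETTER_ONLY = {
    "sma",  # Southern Sami
    "smj",  # Lule Sami
}


def to_three_letter(code: str) -> Optional[str]:
    """Normalize a two- or three-letter code to three letters, or None if unknown."""
    code = code.strip().lower()
    if code in TWO_TO_THREE:
        return TWO_TO_THREE[code]
    if code in TWO_TO_THREE.values() or code in THREE_LETTER_ONLY:
        return code
    return None
```

Returning `None` for unknown codes lets the caller fall through to the next step in its checklist instead of guessing.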

More practical advice on how to handle language processing in simpleSAMLphp is available.

Summing up: Language choices are complex.  Good luck!
