Overview [2]#

A Language-Tag is a Language Code that identifies a human language; it is most commonly used with HTTP and XML.

Language-Tags are a way to label digital resources with the human language they are written in. Software also uses them to express a user's language preferences.

They can express the language itself but also the writing system, the national variant and many other things.

Language-Tags and their subtags, including private use and extensions, are to be treated as Case-insensitive at all times: conventions exist for the capitalization of some of the subtags, but these MUST NOT be taken to carry meaning.
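
The following is a minimal sketch of what case-insensitivity means in practice. The helper names `tags_equal` and `conventional_case` are hypothetical, chosen for illustration. The second function applies the customary capitalization (lowercase language, Titlecase four-letter script, UPPERCASE two-letter region, lowercase for everything after a single-letter subtag) and is purely cosmetic.

```python
def tags_equal(a: str, b: str) -> bool:
    """Compare two language tags without regard to letter case."""
    return a.casefold() == b.casefold()


def conventional_case(tag: str) -> str:
    """Return the tag with the customary capitalization applied.

    Illustrative only: the result is equivalent to the input, because
    capitalization carries no meaning in a language tag.
    """
    parts = tag.split("-")
    out = []
    for i, part in enumerate(parts):
        if len(part) == 1:
            # A single-letter subtag introduces an extension or private-use
            # sequence; by convention everything from here on is lowercase.
            out.extend(p.lower() for p in parts[i:])
            break
        if i == 0:
            out.append(part.lower())      # language: lowercase
        elif len(part) == 4 and part.isalpha():
            out.append(part.title())      # script: Titlecase
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())      # region: UPPERCASE
        else:
            out.append(part.lower())      # anything else: lowercase
    return "-".join(out)
```

With this sketch, `tags_equal("en-us", "EN-US")` is true, and `conventional_case("AZ-LATN-ir")` gives back the customary spelling `az-Latn-IR`.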

A few examples of Language-Tags:

  • fr: French language
  • en-AU: English language, as written and spoken in Australia
  • en-US: English language, as written and spoken in the United States
  • az-Latn-IR: Azeri language, written in the Latin script, as used in Iran

They are specified in IETF RFC 5646.

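The Country Code part (the region subtag) normally uses the ISO 3166-1 alpha-2 codes; RFC 5646 also allows three-digit UN M.49 area codes such as 419 (Latin America and the Caribbean).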

A Language-Tag is made of one or more language-subtags separated by hyphens. The list of possible language-subtags is mostly copied directly from various ISO standards such as ISO 639.
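
As an illustration of this hyphen-separated structure, here is a deliberately simplified sketch that splits a tag into its language, optional script and optional region. The names `ParsedTag` and `parse_tag` are hypothetical. The sketch assumes the common language[-script][-region] shape, tells subtags apart by length only, does not check them against the IANA registry, and leaves everything else (extended language subtags, variants, extensions, private use) untouched in `rest`.

```python
from typing import NamedTuple, Optional, Tuple


class ParsedTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    rest: Tuple[str, ...]   # subtags this sketch does not interpret


def _is_alpha(s: str) -> bool:
    return s.isascii() and s.isalpha()


def parse_tag(tag: str) -> ParsedTag:
    """Split a language tag into language, script and region (simplified)."""
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = region = None
    i = 1
    # A four-letter subtag right after the language is taken as a script.
    if i < len(subtags) and len(subtags[i]) == 4 and _is_alpha(subtags[i]):
        script = subtags[i].title()
        i += 1
    # A two-letter or three-digit subtag is then taken as a region.
    if i < len(subtags) and (
        (len(subtags[i]) == 2 and _is_alpha(subtags[i]))
        or (len(subtags[i]) == 3 and subtags[i].isdigit())
    ):
        region = subtags[i].upper()
        i += 1
    return ParsedTag(language, script, region, tuple(subtags[i:]))
```

For example, `parse_tag("az-Latn-IR")` yields language `az`, script `Latn` and region `IR`, while `parse_tag("fr")` yields only the language. A production parser would follow the full grammar in RFC 5646 and validate each subtag against the registry.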

They are used in many formats and protocols, for instance in XML (through the xml:lang attribute) and in HTTP (the browser can tell the Web server which language the user prefers, in case the server has several versions of a resource).

The IANA Registry for language-subtag is at: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

Language-Tag HTTP Header Fields [1]#

The Accept-Language HTTP Request header advertises which languages the client is able to understand, and which locale variant is preferred. Using content negotiation, the server then selects one of the proposals, uses it, and informs the client of its choice with the Content-Language HTTP Response header. Browsers set this header according to their user interface language; a user can change it, but this rarely happens (and is discouraged, because an unusual value makes the browser easier to fingerprint).
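
The negotiation step can be sketched as follows. This is a simplified illustration, not a complete implementation: the function names `parse_accept_language` and `negotiate` are hypothetical, the header is assumed to carry optional `q` weights (a missing weight counts as 1.0), and matching is done by dropping trailing subtags one at a time until a supported tag is found, which is a reduced form of the lookup scheme described in RFC 4647. The `*` wildcard is not given any special treatment.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Return (tag, weight) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        q = 1.0                      # no q parameter means full weight
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0              # unreadable weight: treat as rejected
        prefs.append((tag.strip(), q))
    # sorted() is stable, so tags with equal weight keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(header: str, available: list[str], default: str) -> str:
    """Pick the best available language for an Accept-Language value."""
    by_lower = {t.lower(): t for t in available}   # tags are case-insensitive
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        candidate = tag.lower()
        while candidate:
            if candidate in by_lower:
                return by_lower[candidate]
            # Drop the last subtag: "fr-ch" falls back to "fr".
            candidate = candidate.rpartition("-")[0]
    return default
```

A server offering English and French could call `negotiate("fr-CH, fr;q=0.9, en;q=0.8", ["en", "fr"], "en")`, serve the French version, and report the chosen tag in the Content-Language response header.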

This header is a hint, to be used when the server has no other way of determining the language, such as a specific URL chosen by an explicit user decision. It is recommended that the server never override an explicit decision. The content of Accept-Language is often outside the user's control (for example when traveling and using an Internet cafe in a different country); the user may also want to visit a page in a language other than the locale of their user interface.
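
The precedence described above, explicit choice first and header only as a fallback, might look like the sketch below. It assumes a hypothetical site layout in which the first path segment names the language (for example /fr/about); the function name `language_for_request` and its parameters are likewise illustrative, and `header_language` stands for whatever tag was derived from Accept-Language.

```python
from typing import List, Optional


def language_for_request(
    path: str,
    header_language: Optional[str],
    available: List[str],
    default: str,
) -> str:
    """Choose a language: explicit URL first, header hint second."""
    by_lower = {t.lower(): t for t in available}

    # 1. An explicit language in the URL is the user's decision and wins.
    first_segment = path.lstrip("/").split("/", 1)[0].lower()
    if first_segment in by_lower:
        return by_lower[first_segment]

    # 2. Otherwise fall back to the hint taken from Accept-Language.
    if header_language and header_language.lower() in by_lower:
        return by_lower[header_language.lower()]

    # 3. Nothing usable: serve the site default.
    return default
```

Under these assumptions, a request for /fr/about is served in French even when the header prefers English, while a request for /about follows the header.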

RFC 3066 essentially allowed you to compose language tags that were either a language code on its own, a language code plus a country code, or one of a small number of specially registered values in the IANA language tag registry.

RFC 5646 caters for more types of subtag, and allows you to combine them in various ways. While this may appear to make life much more complicated, choosing language tags will generally continue to be a simple matter; where you need additional power, however, it will be available to you. In fact, for most people, RFC 5646 should actually make life simpler in a number of ways: for one thing, there is now only one place you need to look for valid subtags.

More Information#

There might be more information for this subject on one of the following: