Understanding the Locales on Debian GNU/Linux

Every computer system has its own configuration for the system language and the character encoding in use. Based on this configuration, error messages, the help system, and program feedback are displayed on screen.

On UNIX/Linux systems this setup is known as POSIX [7] locales and is standardized in IEEE Std 1003.1-2017 [3]. A locale can be set for the system as a whole and for individual user accounts, so every user can customize their own working environment. In this article we explain how to determine the current locale setup on Debian GNU/Linux, what each of its settings does, and how to adapt the system to your needs.

Note that this article is tailored to Debian GNU/Linux Release 10 “Buster”. Unless otherwise stated, the techniques described here also work for its derivatives such as Ubuntu or Linux Mint [8].

What is a locale?

Generally speaking, a locale is a set of values that reflect the nature and the conventions of a country, or a culture. Among others these values are stored as environment variables that represent the language, the character encoding, the date and time formatting, the default paper size, the country’s currency as well as the first day of the week.

As touched on before, there is a general setting known as the ‘default locale’, and a user-defined setting. The default locale works system-wide and is stored in the file /etc/default/locale. Listing 1 displays the default locale on a Debian GNU/Linux system using German as the main language, and UTF-8 (the 8-bit Unicode Transformation Format) as the character encoding [11].

Listing 1: The default locale on a German Debian GNU/Linux

$ cat /etc/default/locale
# File generated by update-locale
LANG="de_DE.UTF-8"
$

Please note that in contrast to Debian GNU/Linux, on some earlier Ubuntu versions the system-wide locale setup is stored at /etc/locale.conf.

The user-defined settings are stored in hidden files in your home directory, and the actual files that are evaluated depend on the login shell that you use [6]. The traditional Bourne shell (/bin/sh) [4] reads the two files /etc/profile and ~/.profile, whereas the Bourne-Again shell (Bash) (/bin/bash) [5] reads /etc/profile and ~/.bash_profile, falling back to ~/.bash_login and ~/.profile if ~/.bash_profile does not exist. If your login shell is Z shell (/bin/zsh) [9], the two files ~/.zprofile and ~/.zlogin are read, but not ~/.profile unless invoked in Bourne shell emulation mode [10].

Starting a shell in a terminal in an existing session results in an interactive, non-login shell. This may result in reading the following files – ~/.bashrc for Bash, and /etc/zshrc as well as ~/.zshrc for Z shell [6].
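As a minimal sketch of such a user-defined setting, the lines below could be added to ~/.profile (or to the startup file that matches your login shell). The locale names are only examples and are assumed to be already available on the system; the final echo is there purely to show the result.

```shell
# Example per-user override, e.g. for ~/.profile.
# Assumption: both locales have already been generated on this system.
export LANG=en_US.UTF-8      # general default: US English
export LC_TIME=de_DE.UTF-8   # only dates and times follow German conventions

# Show what the current shell will pass on to programs it starts.
echo "LANG=$LANG LC_TIME=$LC_TIME"
```

Because the variables are exported, every program started from this shell inherits them.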

Naming a locale

As explained here [12], the name of a locale follows a specific pattern. The pattern consists of language codes, character encoding, and the description of a selected variant.

A name starts with an ISO 639-1 lowercase two-letter language code [13], or an ISO 639-2 three-letter language code [14] if the language has no two-letter code. For example, it is de for German, fr for French, and cel for Celtic. The code is followed for many but not all languages by an underscore _ and by an ISO 3166 uppercase two-letter country code [15]. For example, this leads to de_CH for Swiss German, and fr_CA for a French-speaking system for a Canadian user likely to be located in Québec.

Optionally, the name continues with a dot . followed by the name of the character encoding, such as UTF-8 or ISO-8859-1, and with an @ sign followed by the name of a variant. For example, the name en_IE.UTF-8@euro describes the setup for an English system for Ireland with UTF-8 character encoding, and the Euro as the currency symbol.
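To make the naming pattern concrete, here is a small sketch that splits a locale name into its parts using plain POSIX shell parameter expansion. The function name split_locale and the labels it prints are made up for this illustration; they are not part of any standard tool.

```shell
# split_locale NAME: print the components of a locale name.
# Hypothetical helper for illustration; variables are global (POSIX sh has no "local").
split_locale() {
    name=$1
    modifier='' codeset='' territory=''
    # Strip the parts from right to left: @variant, .encoding, _COUNTRY
    case $name in *@*) modifier=${name#*@}; name=${name%%@*} ;; esac
    case $name in *.*) codeset=${name#*.}; name=${name%%.*} ;; esac
    case $name in *_*) territory=${name#*_}; name=${name%%_*} ;; esac
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$name" "$territory" "$codeset" "$modifier"
}

split_locale en_IE.UTF-8@euro
# prints: language=en territory=IE codeset=UTF-8 modifier=euro
```

Parts that are missing from the name, such as the encoding in de_CH, are simply printed as empty values.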

Commands and Tools

Only a few commands deal with locales. The first is locale, which only displays the current locale settings. The second is localectl, which can be used to query and change the system locale and keyboard layout settings. To activate a locale, the tools dpkg-reconfigure and locale-gen come into play – see the examples below.

Show the locale that is in use

Step one is to figure out the current locale on your system using the locale command as follows:

Listing 2: Show the current locale

$ locale
LANG=de_DE.UTF-8
LANGUAGE=
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
$

Please note that Linux distributions other than Debian GNU/Linux may use additional environment variables not listed above. The individual variables have the following meaning:

  • LANG: Determines the default locale in the absence of other locale related environment variables
  • LANGUAGE: List of fallback message translation languages
  • LC_CTYPE: Character classification and case conversion
  • LC_NUMERIC: Numeric formatting
  • LC_TIME: Date and time formats
  • LC_COLLATE: Collation (sort) order
  • LC_MONETARY: Monetary formatting
  • LC_MESSAGES: Format of interactive words and responses
  • LC_PAPER: Default paper size for region
  • LC_NAME: Name formats
  • LC_ADDRESS: Convention used for formatting of street or postal addresses
  • LC_TELEPHONE: Conventions used for representation of telephone numbers
  • LC_MEASUREMENT: Default measurement system used within the region
  • LC_IDENTIFICATION: Metadata about the locale information
  • LC_RESPONSE: Determines how responses (such as Yes and No) appear in the local language (not in use by Debian GNU/Linux but Ubuntu)
  • LC_ALL: Overrides all other locale variables (except LANGUAGE)
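The interplay of these variables follows a fixed order of precedence: LC_ALL wins over the individual LC_* variable, which in turn wins over LANG. The following sketch mimics that lookup for a single category. The function name effective_locale is hypothetical, and the final fallback to POSIX when nothing is set is an assumption about the default (the standard leaves it to the implementation).

```shell
# effective_locale CATEGORY: print the locale that applies to CATEGORY
# (e.g. LC_TIME), following the order LC_ALL > CATEGORY > LANG.
# Hypothetical helper for illustration only.
effective_locale() {
    category=$1
    # Read the value of the variable whose name is stored in $category.
    eval "value=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"
    elif [ -n "$value" ]; then
        echo "$value"
    elif [ -n "${LANG:-}" ]; then
        echo "$LANG"
    else
        echo "POSIX"   # assumed default when no variable is set
    fi
}

effective_locale LC_TIME
```

The locale command performs this resolution itself; in its output, quoted values are those derived from LANG or LC_ALL rather than set explicitly.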

List available locales

Next, you can list the available locales on your system using the locale command with its option -a, which is short for --all-locales:

Listing 3: Show available locales

$ locale -a
C
C.UTF-8
de_DE@euro
de_DE.utf8
en_US.utf8
POSIX
$

Listing 3 contains two locale settings for German (Germany) and one for English (US). The entries C and POSIX are synonymous and represent the default settings that are appropriate for data that is parsed by a computer program; C.UTF-8 is the same locale with UTF-8 as its character encoding. The locales shown in Listing 3 are the ones that have been generated on the system; the full list of locales that can be generated is stored in /usr/share/i18n/SUPPORTED.
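In scripts it can be handy to check whether a particular locale shows up in that list. A minimal sketch, assuming a made-up helper called has_locale that reads the list from standard input: note that locale -a prints the encoding in a normalized spelling (utf8 rather than UTF-8), as seen in Listing 3, so the name has to be given in that form.

```shell
# has_locale NAME: succeed if NAME is one of the lines on standard input.
# Hypothetical helper; -x matches whole lines, -F treats NAME literally.
has_locale() {
    grep -qxF -- "$1"
}

# Typical use on a real system:  locale -a | has_locale de_DE.utf8
# Here the list from Listing 3 is fed in directly for demonstration.
printf '%s\n' C C.UTF-8 de_DE@euro de_DE.utf8 en_US.utf8 POSIX |
    has_locale de_DE.utf8 && echo "de_DE.utf8 is available"
```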

Furthermore, adding the option -v (short for --verbose) to the call leads to a much more extensive output that includes the LC_IDENTIFICATION metadata about each locale. Figure 1 shows this for the call from Listing 3.

In order to see which locales already exist, and which ones need further help to be completed you may also have a look at the map of the Locale Helper Project [20]. Red markers clearly show which locales are unfinished. Figure 2 displays the locales for South Africa that look quite complete.

Show available character maps

The locale command comes with the option -m that is short for --charmaps. The output shows the available character maps, or character set description files [16]. Such a file is meant to “define characteristics for the coded character set and the encoding for the characters specified in Portable Character Set, and may define encoding for additional characters supported by the implementation” [16]. Listing 4 illustrates this with an extract of the entire list.

Listing 4: Character set description files

$ locale -m
ANSI_X3.110-1983
ANSI_X3.4-1968
ARMSCII-8
ASMO_449
BIG5
BIG5-HKSCS
…
$

Show the definitions of locale variables

Each variable used for a locale comes with its own definition. With the option -k (short for --keyword-name), the locale command displays the keywords of a variable together with their values. Listing 5 illustrates this for the variable LC_TELEPHONE as it is defined in a German environment: the international phone number format, the domestic phone format, the international selection code, the country code (international prefix), and the code set. See the Locale Helper Project [20] for a detailed description of the values.

Listing 5: The details of LC_TELEPHONE

$ locale -k LC_TELEPHONE
tel_int_fmt="+%c %a %l"
tel_dom_fmt="%A %l"
int_select="00"
int_prefix="49"
telephone-codeset="UTF-8"
$

Changing the current locale

The knowledge regarding the locale becomes necessary as soon as you run a system that comes with a different locale than you are used to – for example, on a Linux live system. Changing the locale can be done in two ways – reconfiguring the Debian locales package [19], and adding the required locale using the command locale-gen. For option one, running the following command opens a text-based configuration dialog shown in Figure 3:

# dpkg-reconfigure locales

Press the space bar to choose the desired locale(s) from the list shown in the dialog box, and choose “OK” to confirm your selection. The next dialog window offers you a list of locales that are available for the default locale. Select the desired one, and choose “OK”. Now, the corresponding locale files are generated, and the previously selected locale is set for your system.

For option two, generating the desired locale is done with the help of the command locale-gen. Listing 6 illustrates this for a French setup:

Listing 6: Generating a French locale

# locale-gen fr_FR.UTF-8
Generating locales...
  fr_FR.UTF-8... done
Generation complete.
#
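On Debian, locale-gen builds the locales whose entries are not commented out in /etc/locale.gen. As a rough sketch, a filter like the one below shows what that file would look like with one entry enabled, without modifying anything. The function name uncomment_locale is made up for this example, and the sample lines only imitate the usual "name charset" layout of the file.

```shell
# uncomment_locale NAME: read a locale.gen-style file on standard input and
# print it with the entry for NAME uncommented. Hypothetical helper.
# Note: NAME is used inside a regular expression, so a dot in it matches
# any character; that is harmless for ordinary locale names.
uncomment_locale() {
    sed "s/^#[[:space:]]*\($1[[:space:]]\)/\1/"
}

# Typical use on a real system:  uncomment_locale fr_FR.UTF-8 < /etc/locale.gen
# Here three sample lines are fed in for demonstration.
printf '%s\n' '# de_DE.UTF-8 UTF-8' '# fr_FR.UTF-8 UTF-8' '# fr_FR ISO-8859-1' |
    uncomment_locale fr_FR.UTF-8
```

Only the line for fr_FR.UTF-8 loses its leading comment sign; the other entries pass through unchanged.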

In order to use the previously generated locale as the default one, run the command in Listing 7 to set it up properly:

Listing 7: Manually setting the locale

# update-locale LANG=fr_FR.UTF-8

As soon as you open a new terminal session, or re-login to your system, the changes are activated.

Compile a locale definition file

The command localedef helps you to manually compile a locale definition file. In order to create a French setting, run the command as follows:

Listing 8: Compile a locale definition

# localedef -i fr_FR -f UTF-8 fr_FR.UTF-8

Conclusion

Understanding locales can take a while, as the setup is influenced by several factors. We explained how to figure out your current locale, and how to change it properly. Adapting the Linux system to your needs should be much easier for you from now on.

Links and References