@@ -34,16 +34,15 @@ The :mod:`locale` module defines the following exception and functions:
3434
3535 If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
3636 setting for the *category*. The available categories are listed in the data
37- description below. *locale* may be a string, or a pair,
38- language code and encoding. If it is a pair, it is converted to a locale
39- name using the locale aliasing engine. An empty string specifies the user's
37+ description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
38+ language code and encoding. An empty string specifies the user's
4039 default settings. If the modification of the locale fails, the exception
4140 :exc:`Error` is raised. If successful, the new locale setting is returned.
4241
43- The format of the *locale* and the language code strings is platform
44- dependent, but the forms ``language[_territory][.encoding][@modifier]``
45- and ``language[_territory]`` respectively are typically accepted on all
46- platforms.
42+ If *locale* is a pair, it is converted to a locale name using
43+ the locale aliasing engine.
44+ The language code has the same format as a :ref:`locale name <locale_name>`,
45+ but without encoding and ``@``-modifier.
4746 The language code and encoding can be ``None``.
4847
4948 If *locale* is omitted or ``None``, the current setting for *category* is
@@ -351,8 +350,8 @@ The :mod:`locale` module defines the following exception and functions:
351350 ``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
352351 ``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.
353352
354- The format of the language code is platform depended, but on Posix
355- platforms it usually looks like ``language[_territory]``.
353+ The language code has the same format as a :ref:`locale name <locale_name>`,
354+ but without encoding and ``@``-modifier.
356355 The language code and encoding may be ``None`` if their values cannot be
357356 determined.
358357 The "C" locale is represented as ``(None, None)``.
@@ -366,8 +365,8 @@ The :mod:`locale` module defines the following exception and functions:
366365 the language code and encoding. *category* may be one of the :const:`!LC_\*`
367366 values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
368367
369- The format of the language code is platform dependent, but on Posix
370- platforms it usually looks like ``language[_territory]``.
368+ The language code has the same format as a :ref:`locale name <locale_name>`,
369+ but without encoding and ``@``-modifier.
371370 The language code and encoding may be ``None`` if their values cannot be
372371 determined.
373372 The "C" locale is represented as ``(None, None)``.
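
The ``(None, None)`` representation described in this hunk can be checked with a minimal sketch. It assumes only that the "C" locale can be selected for :const:`LC_CTYPE`; the variable names ``previous`` and ``c_locale_pair`` are illustrative and do not come from the patch.

```python
import locale

# Remember the current LC_CTYPE setting (calling setlocale without a
# locale argument only queries it) so that it can be restored afterwards.
previous = locale.setlocale(locale.LC_CTYPE)

# Select the "C" locale for LC_CTYPE.
locale.setlocale(locale.LC_CTYPE, "C")

# getlocale() returns a (language code, encoding) pair; for the "C"
# locale both items are expected to be None, as the text above states.
language_code, encoding = locale.getlocale(locale.LC_CTYPE)
c_locale_pair = (language_code, encoding)
print(c_locale_pair)

# Put the earlier setting back.
locale.setlocale(locale.LC_CTYPE, previous)
```

The query-then-restore pattern keeps the example from leaving the process in a different locale than it started in.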
@@ -625,6 +624,59 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
625624part of a character class such as letter or whitespace.
626625
627626
627+ .. _locale_name:
628+
629+ Locale names
630+ ------------
631+
632+ The format of the locale name is platform dependent, and the set of supported
633+ locales can depend on the system configuration.
634+
635+ On Posix platforms, it usually has the format
636+
637+ .. productionlist:: locale_name
638+ : language ["_" territory] ["." charset] ["@" modifier]
639+
640+ where *language* is a two- or three-letter language code from `ISO 639`_,
641+ *territory* is a two-letter country or region code from ISO 3166,
642+ *charset* is a locale encoding, and *modifier* is a script name,
643+ a language subtag, a sort order identifier, or other locale modifier
644+ (e.g. "latin", "valencia", "stroke" and "euro").
645+
646+ On Windows, several formats are supported.
647+ A subset of `IETF BCP 47`_ tags:
648+
649+ .. productionlist:: locale_name
650+ : language ["-" script] ["-" territory] ["." charset]
651+ : language ["-" script] "-" territory "-" modifier
652+
653+ where *language* and *territory* have the same meaning as in Posix,
654+ *script* is a four-letter script code from `ISO 15924`_,
655+ and *modifier* is a language subtag, a sort order identifier
656+ or custom modifier (e.g. "valencia", "stroke" or "x-python").
657+ Both hyphen ("``-``") and underscore ("``_``") separators are supported.
658+ Only UTF-8 encoding is allowed for BCP 47 tags.
659+
660+ Windows also supports locale names in the format
661+
662+ .. productionlist:: locale_name
663+ : language ["_" territory] ["." charset]
664+
665+ where *language* and *territory* are long names, such as "English" and
666+ "United States", and *charset * is either a code page number (e.g. "1252")
667+ or UTF-8.
668+ Only the underscore separator is supported in this format.
669+
670+ The "C" locale is supported on all platforms.
671+
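
As an illustration of the Posix grammar above, the following sketch splits a locale name into its four parts with a regular expression. The helper ``split_posix_locale_name`` is hypothetical and not part of the :mod:`locale` module; it assumes the strict two- or three-letter language and two-letter territory forms described in the text and does not check that the locale is actually installed.

```python
import re

# Hypothetical helper, not part of the locale module. The pattern mirrors
#   language ["_" territory] ["." charset] ["@" modifier]
# under the assumption of a 2-3 letter language and a 2 letter territory.
_POSIX_LOCALE_NAME = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<charset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def split_posix_locale_name(name):
    """Return the parts of *name* as a dict, or None if it does not match.

    Parts that are absent from the name are reported as None. Names
    outside the grammar, such as "C", yield None.
    """
    match = _POSIX_LOCALE_NAME.match(name)
    if match is None:
        return None
    return match.groupdict()


parts = split_posix_locale_name("ca_ES.UTF-8@valencia")
print(parts["language"], parts["territory"], parts["charset"], parts["modifier"])
```

A name such as ``"sr_RS@latin"`` has no charset, so that entry is ``None`` while the modifier is still captured.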
672+ .. _ISO 639: https://www.iso.org/iso-639-language-code
673+ .. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
674+ .. _ISO 15924: https://www.unicode.org/iso15924/
675+
676+ .. https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02
677+ .. https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings
678+
679+
628680 .. _embedding-locale :
629681
630682For extension writers and programs that embed Python