Q & A for OpenI18N Locale Name Guideline [Informative]
=======================================================
[Last update: 2003-03-11]

Q01. What is a Standard Locale Name?

A01. A Standard Locale Name is the name that should be recognized by software supporting the locale. It is important to have a single name that software can use to access the same entity. The Standard Locale Name is not necessarily an elegant name, but it should be distinct and supported.

Q02. What is a standard value?

A02. A standard value is a value assigned to a part of a locale name. A Standard Locale Name consists of parts whose values are standard values.

Q03. What is a user/implementation-defined name?

A03. A user/implementation-defined name is a name that refers to a different entity from any of the Standard Locale Names or any constituent part of the Standard Locale Names. Such names may be necessary, for example, when an application-specific locale is created and installed on a system. User/implementation-defined names may also be used when users/implementors want to define locales that are slightly different from the definitions of the Standard Locale Names.

Q04. Why are territory and codeset not optional?

A04. It is difficult to decide what the default values for omitted parts should be. There are several implementation-defined methods to supply the omitted parts, whereas a Standard Locale Name should refer to the same entity whenever the same name is specified. Thus we decided not to make them optional. There are also reasons specific to each part. TERRITORY is mandatory because some locale categories, such as LC_MONETARY, mainly depend on the TERRITORY field. CODESET is mandatory because multiple incompatible codesets exist and it is necessary to retain data integrity.

Q05. Why are the standard values for codeset limited? There are many codeset names used in the world which are not included in the list of standard values in the guideline.

A05. What we specify are the Standard Locale Names.
The guideline discusses only the codeset names that are widely used as part of a locale name. For example, UTF-32 and UCS-4 are not included in the standard values because they are not used as parts of locale names. A name that is missing from the list of standard values can still be used on your system; it continues to be a valid name there. However, Standard Locale Names should be used when communication is necessary among different systems or software components.

Q06. Are there any examples of the Standard Locale Names?

A06. The following are a few examples of the Standard Locale Names:

    en_US.ISO-8859-1
    fr_CA.UTF-8
    ja_JP.EUC-JP

Q07. Can we not use the names that we have always used?

A07. You can continue to use the names that you are currently using. This guideline does not prohibit anyone from using non-standard names. Some systems may already support the Standard Locale Names. Some systems may need to implement the Standard Locale Names in addition to the existing names. Having a single set of standard names is the important first step. In the long run, it is desirable that names converge on the Standard Locale Names. For example, a user who is using zh_TW.big5 in the LANG environment variable can continue to use it. The mapping between standard locale names and user/implementation-defined locale names should be performed internally by the system.

Q08. Why is the IANA charset registry not used for the standard values for codeset?

A08. The IANA charset registry is just a collection of character set names (or encoding schemes). It does not have a consistent naming convention. Sometimes it is hard to know who registered a name and how. Therefore we decided not to adopt it. There are two major parties in the collection of locale names. One is based on the IANA charset registry and the other is derived from the standards activities in major commercial UNIXes.
Neither side has a consistent naming convention or a definite reason to claim superiority over the other. Our decision is to set a reasonable set of rules for the standard codeset values and to derive a set of standard values that follows them. If existing names are different from the standard values, an implementation-defined alias mechanism needs to be used to allow them to be used instead of the standard values. Such an alias mechanism is available for major software components such as glibc and the X Window System. Our codeset alias table provides mappings to the IANA registry.

Q09. Why is the locale name that most people in my country are using not specified in the standard values?

A09. Our Codeset Alias Table is not complete. We need your help to keep it accurate and up to date.

Q10. Is there any way to identify a specific version of a codeset?

A10. The CODESET field has an arbitrary number of subfields. A specific version can be specified within those subfields. By default, reference should be made to the latest version.

Q11. There is variability in the definition of codesets. Is there any plan to provide definitions of codesets?

A11. No. We believe the contents of a codeset will be obvious from the alias definitions. We do not provide the contents of locale definitions, either.

Q12. What shall an implementation do for the Standard Locale Names?

A12. Operating system distributions should provide locales with the Standard Locale Names so that application programs can use them. All the locale-sensitive components of the operating system, such as the Standard C Library (libc), the X Window System, the Java Runtime Environment, and so on, should run properly with the Standard Locale Names. Operating systems should also provide alias tables for each locale-sensitive component so that users can use any of the alias names the same way as the standard names.
If an application program has locale-sensitive components, such components should be able to work with the Standard Locale Names.

Q13. What kind of locale names are not recommended by the OpenI18N Naming Guideline?

A13. The composition of the locale name used to identify associated resources should conform to the naming guideline. Each component of the locale name (LANGUAGE, TERRITORY, CODESET and MODIFIERS) should be unique, self-identifying and conformant to the appropriate standards. Ambiguous and inconsistent names, for example japanese.euc, korean.euc or chinese.euc, are not recommended.

Q14. Where can I find ISO documents?

A14. Please visit the ISO web site http://www.iso.ch to obtain copies of ISO publications.

Q15. How can I register a new language code or a new country code?

A15. The following URLs are used by the maintenance agencies for the ISO language codes and country codes.

    The maintenance agency for ISO 639-2 (language codes):
    http://lcweb.loc.gov/standards/iso639-2/langhome.html

    The maintenance agency for ISO 3166 (country codes):
    http://www.din.de/gremien/nas/nabd/iso3166ma/

RFC 3066 (Tags for the Identification of Languages) discusses the registration of a language tag with IANA.

Q16. How can I request to add a codeset name which is not listed in the Codeset Alias Table?

A16. We will implement a registration process through FSG/LANANA.

Q17. Why is VISCII included in the Codeset Alias Table, when glibc does not support it as a locale?

A17. A note for VISCII users: some software components may have a problem implementing VISCII as a codeset for a locale. glibc is one of them. Each member of the basic character set defined in the C language standard needs to have the same value as a char and as a wchar_t. Codesets which are not compatible with ASCII may conflict with this requirement of the standard.

Q18. Why are numeric characters not allowed for STRING1 of CODESET?

A18.
There are a couple of conventions in use for the ISO/IEC 8859 series of coded character sets. One is ISO8859-1 and the other is ISO-8859-1. To keep this guideline unambiguous, we decided not to allow numeric characters in STRING1. Therefore ISO-8859-1 is recognized as a standard locale name and ISO8859-1 becomes a user/implementation-defined locale name.

Q19. Where will the standard locale names become visible to users?

A19. Standard locale names should be used as the locale names exchanged between software components, for example as the return value of the setlocale() or nl_langinfo() function. A user may also specify a standard locale name in locale-related environment variables, such as LANG, and as an input parameter to setlocale().

[END]
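
[Illustration, not part of the guideline] The STRING1 rule from A18 can be sketched in a few lines of Python. The sketch rests on two assumptions that this Q & A does not state: that a locale name follows the POSIX-style layout LANGUAGE_TERRITORY.CODESET[@MODIFIERS], and that the subfields of CODESET are separated by "-" with STRING1 being the first subfield. The function names codeset_of and string1_is_digit_free are hypothetical and exist only for this example.

```python
def codeset_of(locale_name):
    """Return the CODESET part of a locale name, or None if it has none.

    Assumption: the name is laid out as LANGUAGE_TERRITORY.CODESET[@MODIFIERS].
    """
    head = locale_name.split("@", 1)[0]  # drop any modifiers
    if "." not in head:
        return None  # no CODESET part present
    return head.split(".", 1)[1]


def string1_is_digit_free(codeset):
    """Check the A18 rule: STRING1 must not contain numeric characters.

    Assumption: subfields are separated by "-" and STRING1 is the first one.
    """
    string1 = codeset.split("-", 1)[0]
    return not any(ch.isdigit() for ch in string1)


# The two spellings discussed in A18.
for name in ("en_US.ISO-8859-1", "en_US.ISO8859-1"):
    codeset = codeset_of(name)
    print(name, string1_is_digit_free(codeset))
```

Under these assumptions, ISO-8859-1 passes the check because its first subfield is "ISO", while ISO8859-1 fails because its first subfield "ISO8859" contains digits. The check says nothing about whether a codeset is actually listed in the standard values; that still requires the guideline's list and the Codeset Alias Table.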