LI18NUX 2000
Globalization Specification
Version 1.0 with Amendment 4
Linux Internationalization Initiative (Li18nux)
Copyright © 2000 The Free Standards Group. All rights reserved.
This document specifies interfaces and functionalities that must be supported by operating systems to run internationalized application software. This document also includes recommendations for operating systems to ease development of internationalized application software.
This specification only lists internationalization aspects of each functionality provided by the conforming operating systems.
[POSIX.1]
ISO/IEC 9945-1:1996 Information technology — Portable Operating System Interface (POSIX) — Part 1: System Application Program Interface (API) [C Language]
[POSIX.2]
ISO/IEC 9945-2:1993 Information technology — Portable Operating System Interface (POSIX) — Part 2: Shell and Utilities
[ISO C]
ISO/IEC 9899:1990 Programming languages — C
ISO/IEC 9899:1990/Amd.1:1995 Programming languages — C Amendment 1: C Integrity
[ISO C 99]
ISO/IEC 9899:1999 Programming languages — C
[XCU5]
The Single UNIX Specification, Version 2
Commands and Utilities, Issue 5
(The Open Group CAE Specification C604)
[XBD5]
The Single UNIX Specification, Version 2
System Interface Definitions, Issue 5
(The Open Group CAE Specification C605)
[XSH5]
The Single UNIX Specification, Version 2
System Interfaces and Headers, Issue 5 (2 volumes)
(The Open Group CAE Specification C606)
[XCURSES4.2]
The Single UNIX Specification, Version 2
X/Open Curses (XCurses), Issue 4 Version 2
(The Open Group CAE Specification C610)
[ICU]
International Components for Unicode 2.0
http://oss.software.ibm.com/icu/
[ICU4J]
International Components for Unicode for Java 2.0
http://oss.software.ibm.com/icu4j/
[Perl 5.6]
Perl 5.6 (March 23, 2000)
http://www.perl.com/pub/n/Perl_5.6.0_is_out!
[Java]
Java 2 Platform, Standard Edition, v1.3 API Specification
http://java.sun.com/products/jdk/1.3/docs/api/index.html
[X11R6]
The X Window System, Version 11, Release 6
ftp://ftp.x.org/pub/R6.4/xc/doc/hardcopy/
[Unicode 3.0]
The Unicode Standard, Version 3.0
The Unicode Consortium, Addison Wesley Longman, ISBN 0-201-61633-5
[ISO 10646-1]
ISO/IEC 10646-1:2000 Information technology — Universal Multiple-Octet Coded Character Set (UCS) — Part 1: Architecture and Basic Multilingual Plane
[ISO 639]
ISO 639:1988 Code for the representation of names of languages
[ISO 3166-1]
ISO 3166-1:1997 Codes for the representation of names of countries and their subdivisions — Part 1: Country codes
[IANA-Charset-Registry]
IANA Registry of Character Sets
http://www.isi.edu/in-notes/iana/assignments/character-sets
[ISO 8859-1]
ISO/IEC 8859-1:1998 Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1
[ISO 8859-2]
ISO/IEC 8859-2:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2
[ISO 8859-5]
ISO/IEC 8859-5:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet
[ISO 8859-7]
ISO 8859-7:1987 Information processing — 8-bit single-byte coded graphic character sets — Part 7: Latin/Greek alphabet
[ISO 8859-9]
ISO/IEC 8859-9:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5
[ISO 8859-13]
ISO/IEC 8859-13:1998 Information technology — 8-bit single-byte coded graphic character sets — Part 13: Latin alphabet No. 7
[ISO 8859-15]
ISO/IEC 8859-15:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 15: Latin alphabet No. 9
For conformance purposes the following environments are defined:
Application Execution Environment is a minimum operating system environment that can run internationalized application software. The functionalities defined in this environment are mandatory and shall be present on all conforming implementations.
The following sections apply to the Application Execution Environment:
3. Base Libraries
4. Shells and Utilities
End User Environment is an operating system environment with user interface. It is assumed that End User Environment has a set of utilities for user interaction.
This environment includes all the interfaces and utilities provided by Application Execution Environment. Additional interfaces and utilities are defined for the following sub-environments:
Server environment is an operating system environment suitable for backend server purposes. Graphical user interfaces are not required in this environment.
The following sections apply to the Server Environment:
3. Base Libraries
4. Shells and Utilities
5. Programming Languages (with Software Development Options)
9. Network Servers
Desktop environment is an operating system environment suitable for end user interaction. Graphical user interface is required in this environment.
The following sections apply to the Desktop Environment:
3. Base Libraries
4. Shells and Utilities
5. Programming Languages (with Software Development Options)
6. Graphical User Interface
7. Input Methods
8. Output Methods
10. Internet Tools
If an interface or utility is defined as “supported in End User Environment”, that interface or utility shall be available in both Server and Desktop environments.
The following options can be supported in each environment:
If any of these options is supported, utilities, libraries and associated modules to develop internationalized software (such as compilers or interpreters) shall be provided.
In this version of the specification, the following options are available:
C Language Development Option
Java Language Development Option
Several levels are defined for conformance for each environment. These levels are defined as follows:
Level 1 is the baseline level of conformance. All conforming implementations shall provide this level of interfaces and utilities to conform to this specification. If no level is stated for a requirement in this specification, that requirement shall be considered Level 1.
Level 2 is a more advanced or extended level of conformance. Conforming implementations are encouraged to provide this level of interfaces and utilities, but it is not mandatory.
The following terms are used in this specification:
Implementation-defined
A value or behavior is implementation-defined when it is left to the implementation to define [and document] the corresponding requirements for correct application behavior.
May
With respect to implementations, the word “may” is to be interpreted as an optional feature that is not required in this specification but can be provided. With respect to application, the word “may” means that the feature is optional. The term “optional” has the same definition as “may”.
Shall
In this specification, the word “shall” is to be interpreted as a mandatory requirement on the implementation or on application, depending upon the context. The term “must” has the same definition as “shall”.
Should
With respect to implementations, the word “should” is to be interpreted as an implementation recommendation, but not a requirement. With respect to application, the word “should” is to be interpreted as recommended programming practice.
Supported
Certain facilities in this specification are optional. If a facility is supported, it behaves as specified by this specification.
If a facility is “supported” by an implementation, either the implementation must document how to obtain and install the facility, or the facility must be installed by the installer of the implementation, whether explicitly selected by the user or implicitly installed with other system components. If an implementation “supports” a facility, the distributor of the implementation shall commit that the facility can run on the implementation.
Unspecified
When a value or behavior is unspecified, the specification defines no portability requirements for a facility on an implementation even when faced with an application that uses the facility. An application that requires specific behavior in such an instance, rather than tolerating any behavior when using that facility, is not a portable application.
Provided
Certain facilities in this specification are mandatory and implemented in all conforming implementations.
Obsolescence
An indication that the subject statement or clause will be removed from a future revision of this standard.
character
A sequence of one or more bytes representing a single graphic symbol or control code.
This term corresponds to the ISO C standard term multibyte character (multi-byte character), where a single-byte character is a special case of a multi-byte character. Unlike the usage in the ISO C standard, character here has no necessary relationship with storage space, and byte is used when storage space is discussed.
[Single UNIX Specification, Version 2]
byte
An individually addressable unit of data storage that is equal to or larger than an octet, used to store a character or a portion of a character; see character.
A byte is composed of a contiguous sequence of bits, the number of which is implementation-dependent. The least significant bit is called the low-order bit; the most significant is called the high-order bit.
Note that this definition of byte deviates intentionally from the usage of byte in some international standards, where it is used as a synonym for octet (always eight bits). On a system based on the ISO/IEC 9945-2:1993 standard, a byte may be larger than eight bits so that it can be an integral portion of larger data objects that are not evenly divisible by eight bits (such as a 36-bit word that contains four 9-bit bytes).
[Single UNIX Specification, Version 2]
character set
A finite set of different characters used for the representation, organization or control of data.
[Single UNIX Specification, Version 2]
coded character set
A set of unambiguous rules that establishes a character set and the one-to-one relationship between each character of the set and its bit representation.
[Single UNIX Specification, Version 2]
codeset
The result of applying rules that map a numeric code value to each element of a character set. An element of a character set may be related to more than one numeric code value but the reverse is not true. However, for state-dependent encodings the relationship between numeric code values to elements of a character set may be further controlled by state information.
The character set may contain fewer elements than the total number of possible numeric code values; that is, some code values may be unassigned.
[Single UNIX Specification, Version 2]
internationalization
The provision within a computer program of the capability of making itself adaptable to the requirements of different native languages, local customs and coded character sets.
[Single UNIX Specification, Version 2]
globalization
A product development approach which ensures that software products are usable in the worldwide markets through a combination of internationalization and localization.
locale
The definition of the subset of a user's environment that depends on language and cultural conventions.
[Single UNIX Specification, Version 2]
localization
The process of establishing information within a computer system specific to the operation of particular native languages, local customs and coded character sets.
[Single UNIX Specification, Version 2]
local customs
The conventions of a geographical area or territory for such things as date, time and currency formats.
[Single UNIX Specification, Version 2]
portable filename character set
The set of characters from which portable filenames are constructed. For a filename to be portable across implementations conforming to this specification set and the ISO POSIX-1 standard, it must consist only of the following characters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 . _ -
The last three characters are the period, underscore and hyphen characters, respectively. The hyphen must not be used as the first character of a portable filename. Upper- and lower-case letters retain their unique identities between conforming implementations. In the case of a portable pathname, the slash character may also be used.
[Single UNIX Specification, Version 2]
file-system-safe character
A multibyte character that does not contain either 0x00 or 0x2F in any byte of its representation.
Input Method Engine
A part or module of an input method that implements language-specific or script-specific logic for composing a string from one or more sequences of events or strings. It can be independent of the windowing system, graphical user interface, or visual appearance.
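The definitions of the portable filename character set and of a file-system-safe character above can each be checked mechanically. The following is a minimal sketch, not part of the specification; the function names are hypothetical, and rejecting an empty name is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name consists only of portable filename characters and does
 * not start with a hyphen. Rejecting the empty string is an assumption
 * of this sketch; the definition above does not address it. */
bool is_portable_filename(const char *name)
{
    static const char allowed[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789._-";

    if (name[0] == '\0' || name[0] == '-')
        return false;
    /* strspn() returns the length of the leading run of allowed bytes;
     * the name is portable only if that run reaches the terminator. */
    return name[strspn(name, allowed)] == '\0';
}

/* True if no byte of the multibyte character mb[0..len-1] is 0x00 or
 * 0x2F (the slash). The caller is assumed to have determined the
 * character boundary already, for example with mbrtowc(). */
bool is_file_system_safe(const unsigned char *mb, size_t len)
{
    if (len == 0)
        return false;
    for (size_t i = 0; i < len; i++) {
        if (mb[i] == 0x00 || mb[i] == 0x2F)
            return false;
    }
    return true;
}
```

The second check matters for encodings in which a trailing byte of a multibyte character may coincide with the slash; UTF-8 never produces such a byte, so every UTF-8 character other than NUL and the slash itself passes.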
This chapter defines runtime library interfaces required to conform to this specification. Conforming implementations shall provide the C language APIs defined by [ISO C] and [POSIX.1]. In addition to the C language interface, conforming level 2 implementations shall provide interfaces for other programming languages.
Conforming implementations shall provide the internationalization functions listed in Table 3-1 and the headers listed in Table 3-2. The specifications of the functions and the definitions of the headers shall conform to [POSIX.1] and [ISO C].
In addition to the functions in Table 3-1, conforming implementations shall provide the wide character and wide string I/O functionality through the printf/scanf family of functions as specified in [ISO C].
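As an illustration of wide string formatting through the printf family, the sketch below formats a wide string and an integer with swprintf(). The helper name format_count is hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* Format "<label>: <count>" into a wide character buffer of n elements.
 * Returns the number of wide characters written (excluding the
 * terminator), or a negative value if the result does not fit. */
int format_count(wchar_t *buf, size_t n, const wchar_t *label, int count)
{
    /* %ls consumes a wide string argument. Unlike sprintf(), swprintf()
     * always takes the size of the destination buffer. */
    return swprintf(buf, n, L"%ls: %d", label, count);
}
```

A call such as format_count(buf, 32, L"files", 3) leaves the wide string L"files: 3" in buf.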
Table 3-1 C Language internationalization functions
| btowc() | fgetwc() | fgetws() | fputwc() | fputws() |
| fwide() | fwprintf() | fwscanf() | getwc() | getwchar() |
| iswalnum() | iswalpha() | iswcntrl() | iswctype() | iswdigit() |
| iswgraph() | iswlower() | iswprint() | iswpunct() | iswspace() |
| iswupper() | iswxdigit() | localeconv() | mblen() | mbrlen() |
| mbrtowc() | mbsinit() | mbsrtowcs() | mbstowcs() | mbtowc() |
| putwc() | putwchar() | setlocale() | strftime() | swprintf() |
| swscanf() | towctrans() | towlower() | towupper() | ungetwc() |
| vfwprintf() | vswprintf() | vwprintf() | wcrtomb() | wcscat() |
| wcschr() | wcscmp() | wcscoll() | wcscpy() | wcscspn() |
| wcsftime() | wcslen() | wcsncat() | wcsncmp() | wcsncpy() |
| wcspbrk() | wcsrchr() | wcsrtombs() | wcsspn() | wcsstr() |
| wcstod() | wcstok() | wcstol() | wcstombs() | wcstoul() |
| wcsxfrm() | wctob() | wctomb() | wctrans() | wctype() |
| wmemchr() | wmemcmp() | wmemcpy() | wmemmove() | wmemset() |
| wprintf() | wscanf() | | | |
Table 3-2 C language headers
| <locale.h> | <wchar.h> | <wctype.h> |
Note: Application programs should refer to limits in symbolic names, such as MB_CUR_MAX and MB_LEN_MAX, not the implementation-specific values directly.
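To show how the multibyte functions of Table 3-1 determine character boundaries without hard-coded limits, the following sketch counts characters (not bytes) in a multibyte string using mbrtowc(). The function name count_characters is hypothetical; the decoding follows whatever LC_CTYPE locale is in effect when it is called.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Count the characters in a multibyte string, decoded according to the
 * current LC_CTYPE locale. Returns (size_t)-1 if the string contains an
 * invalid or truncated multibyte sequence. */
size_t count_characters(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);    /* initial conversion state */
    while (remaining > 0) {
        /* Passing NULL as the first argument decodes without storing. */
        size_t n = mbrtowc(NULL, s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;          /* invalid or incomplete sequence */
        if (n == 0)
            break;                      /* null character reached */
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

A program that reads input in fixed-size pieces would size its carry-over buffer with MB_CUR_MAX or MB_LEN_MAX rather than a literal such as 4 or 6, in line with the note above.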
Conforming level 2 implementations shall provide the following functions. The specifications of the functions shall conform to [ISO C 99].
| wcstof() | wcstold() | wcstoll() | wcstoull() |
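As a brief illustration of these conversion functions, the sketch below parses a complete wide string as a decimal long long with wcstoll() and rejects trailing garbage and out-of-range values. The helper name parse_wide_integer is hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <wchar.h>

/* Parse the whole of text as a base-10 integer.
 * Returns true and stores the value in *out on success. */
bool parse_wide_integer(const wchar_t *text, long long *out)
{
    wchar_t *end;

    errno = 0;
    long long value = wcstoll(text, &end, 10);

    /* Fail if nothing was consumed, if characters remain after the
     * number, or if the value was out of range for long long. */
    if (end == text || *end != L'\0' || errno == ERANGE)
        return false;
    *out = value;
    return true;
}
```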
Conforming implementations shall provide the internationalization functions listed in Table 3-3 and the headers listed in Table 3-4. The specifications of the functions and the definitions of the headers shall conform to [XSH5].
Table 3-3 Additional C Language internationalization functions
| catclose() | catgets() | catopen() |
| iconv() | iconv_close() | iconv_open() |
| nl_langinfo() | strfmon() | strptime() |
| wcswidth() | wcwidth() | |
Table 3-4 Additional C language headers
| <iconv.h> | <langinfo.h> | <monetary.h> | <nl_types.h> |
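The codeset conversion functions of Table 3-3 are used in an open, convert, close sequence. The following is a minimal sketch that converts ISO 8859-1 text to UTF-8. The function name latin1_to_utf8 is hypothetical, and the codeset names "ISO-8859-1" and "UTF-8" are an assumption: the names accepted by iconv_open() are implementation-defined, so a real program should be prepared for the open to fail.

```c
#include <iconv.h>
#include <stddef.h>

/* Convert len bytes of ISO 8859-1 text to UTF-8.
 * Returns the number of bytes written to out, or (size_t)-1 if the
 * conversion is unsupported or out is too small. */
size_t latin1_to_utf8(const char *in, size_t len, char *out, size_t out_size)
{
    /* Argument order is (to-codeset, from-codeset). */
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv() takes char ** for the input but does not modify the
     * input bytes, so casting away const here is safe. */
    char *src = (char *)in;
    char *dst = out;
    size_t in_left = len;
    size_t out_left = out_size;

    size_t rc = iconv(cd, &src, &in_left, &dst, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_size - out_left;
}
```

Because ISO 8859-1 is not state-dependent, the sketch skips the final iconv() call with a null input pointer that state-dependent target encodings need in order to emit their reset sequence.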
Conforming implementations shall provide the message handling functions listed in Table 3-5 and the headers listed in Table 3-6, as specified in Annex C: Publicly Available Specifications.
Table 3-5 Additional message handling functions
| gettext() | dgettext() | textdomain() | bindtextdomain() |
| dcgettext() | ngettext() | dngettext() | dcngettext() |
| bind_textdomain_codeset() | | | |
Table 3-6 Additional message handling functions headers
| <libintl.h> |
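A typical use of the message handling functions is sketched below, assuming the behavior of the GNU C library implementation: when no catalog is found, ngettext() falls back to the English singular or plural form passed in. The helper names, and any domain name or catalog directory passed to init_messages, are hypothetical.

```c
#include <libintl.h>
#include <locale.h>
#include <stdbool.h>
#include <stdio.h>

/* Select the user's locale and the message catalog for this program.
 * Both arguments are supplied by the application. */
bool init_messages(const char *domain, const char *catalog_dir)
{
    setlocale(LC_ALL, "");      /* take the locale from the environment */
    if (bindtextdomain(domain, catalog_dir) == NULL)
        return false;
    return textdomain(domain) != NULL;
}

/* Format a count with the singular or plural form that fits it.
 * Returns the length of the formatted text, as snprintf() does. */
int format_file_count(char *buf, size_t n, unsigned long count)
{
    /* ngettext() picks the plural form for the current language; with
     * no catalog it returns the first string for 1, else the second. */
    const char *fmt = ngettext("%lu file", "%lu files", count);
    return snprintf(buf, n, fmt, count);
}
```

Passing the whole phrase to ngettext(), rather than appending an "s" in code, lets translators supply languages with more than two plural forms.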
Conforming level 1 implementations should support the POSIX regular expression functions listed in Table 3-7 and the header <regex.h>.
The specifications of the functions and the definition of the header should conform to [XSH5].
Table 3-7 POSIX regular expression functions
| regcomp() | regexec() | regerror() | regfree() |
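The functions of Table 3-7 follow a compile, execute, free pattern. The sketch below matches a string against an Extended Regular Expression; the helper name matches_ere is hypothetical, and treating a compile error as "no match" is a simplification of this sketch.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if text matches the extended regular expression pattern.
 * A pattern that fails to compile is reported as no match. */
bool matches_ere(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Character classes such as [[:alpha:]] are evaluated against the LC_CTYPE locale in effect when regcomp() is called, which is where internationalization enters. A fuller program would pass the regcomp() error code to regerror() to obtain a diagnostic message.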
Conforming implementations shall provide the application execution environment in which the internationalized applications (written by using the internationalization functions above) can behave appropriately depending on the value of environment variables, without requiring any change of the applications.
See Annex A: Environment Variables for the environment variables to which internationalization functions will refer.
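In a C program, this environment-driven behavior is requested with a single call to setlocale() with an empty locale name. A minimal sketch follows; the function name is hypothetical.

```c
#include <locale.h>
#include <stddef.h>

/* Adopt the locale named by the environment variables.
 * Returns the name of the locale now in effect. If the requested locale
 * is unavailable, the program keeps its previous locale (the "C" locale
 * at program start-up) and that name is returned instead. */
const char *init_locale_from_environment(void)
{
    /* The empty string means "consult the environment". */
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL)
        name = setlocale(LC_ALL, NULL);   /* query only, no change */
    return name;
}
```

After this call, functions such as strftime(), wcscoll(), and nl_langinfo() follow the user's settings without any further change to the application.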
Conforming implementations shall support the application execution environments specified in Annex B.
Conforming level 2 implementations shall define the _XOPEN_CURSES version test macro and provide the internationalized curses library functions specified in [XCURSES4.2].
Conforming level 2 implementations shall support the Java runtime environment [Java], International Components for Unicode [ICU], ICU for Java [ICU4J], and the Perl execution environment [Perl 5.6], including the Perl interpreter and modules.
The following Perl modules are related to internationalization:
(see http://www.perl.com/CPAN-local/modules/00modlist.long.html#Part2-ThePerl5M)
| Name | Description |
| I18N:: | |
| ::Charset | Character set names and aliases |
| ::Collate | Locale based comparisons |
| ::LangTags | compare & extract language tags (RFC1766) |
| ::WideMulti | Wide and multibyte character string |
| Locale:: | |
| ::Country | ISO 3166 two letter country codes |
| ::Date | Month/weekday names in various languages |
| ::Langinfo | The <langinfo.h> API |
| ::Language | ISO 639 two letter language codes |
| ::Msgcat | Access to XPG4 message catalog functions |
| ::PGetText | What GNU gettext does, written in pure perl |
| ::gettext | Multilanguage messages |
| Unicode:: | |
| ::String | String manipulation for Unicode strings |
| ::Map8 | Convert between most 8bit encodings |
GNU C library version 2.2
In the next version of this specification, conforming implementations may be required to provide POSIX regular expression functions and internationalized curses library functions.
This chapter defines the runtime environment required to support the traditional UNIX command interpreter, called the “shell”, and other basic utilities defined in [POSIX.2].
Shell implementation
Conforming level 1 implementations shall be able to use Portable Filename Character Set defined in [POSIX.2]. For filename globbing, conforming level 1 implementations shall provide the functionality defined in [POSIX.2], with the following exceptions: [refer to Annex F: A]
Range expression (such as [a-z]) can be based on code point order instead of collating element order.
Equivalence class expression (such as [=a=]) and multi-character collating element expression (such as [.ch.]) are optional.
Handling of a multi-character collating element is optional.
Conforming level 2 implementations shall be able to use file-system-safe characters as arguments and filenames.
Conforming level 2 implementations shall implement the globbing functionality of the shell as defined in [POSIX.2].
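The pattern matching notation that the shell uses for globbing is also exposed to C programs through fnmatch(), which makes it convenient for experimenting with the bracket expressions discussed above. This specification does not list fnmatch() among its required functions; its availability is an assumption of the sketch, and the helper name glob_matches is hypothetical.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Return true if filename matches the shell pattern, using the same
 * rules the shell applies to pathname expansion. */
bool glob_matches(const char *pattern, const char *filename)
{
    /* FNM_PATHNAME: a slash must be matched explicitly.
     * FNM_PERIOD:   a leading period must be matched explicitly. */
    return fnmatch(pattern, filename, FNM_PATHNAME | FNM_PERIOD) == 0;
}
```

In the "C" locale, a range such as [a-c] follows code point order. In other locales, the result of a range expression depends on whether the implementation uses collating element order, which is exactly the latitude that level 1 conformance permits.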
Conforming implementations shall provide a shell that supports the functionalities of “Bourne shell”, with internationalization capabilities defined above.
The utilities implementation
Conforming implementations shall provide the following utilities to generate and refer to locale definitions as specified in [XCU5]:
| locale | localedef |
Conforming implementations shall provide the following utilities to edit text files encoded in the supported codesets as specified in [XCU5].
Note: To edit text is to determine character boundaries correctly and perform operations such as insert, copy and delete characters based on the determined character boundaries. Input and output requirements are specified in 7. Input Methods and 8. Output Methods respectively.
| ed | ex | vi |
This specification has no requirements on date and time formatting functionality of shells and utilities.
Conforming implementations shall provide the following utilities to process text as specified in [XCU5].
| comm | diff | egrep | expand | fgrep | fold |
| grep | iconv | join | more | mailx | |
| nm (symbol sorting order) | od (floating point) | pr | printf | | |
| sed | sort | unexpand | uniq | wc | |
The mailx utility can be implemented as Mail. The more utility can be implemented as less.
On conforming level 2 implementations, utilities that process regular expressions shall support Basic Regular Expression (BRE) and Extended Regular Expression (ERE) as specified in [POSIX.2].
On conforming level 1 implementations, utilities that process regular expressions should support BRE and ERE as specified in [POSIX.2], with the following exceptions: [refer to Annex F: A]
Range expression (such as [a-z]) can be based on code point order instead of collating element order.
Equivalence class expression (such as [=a=]) and multi-character collating element expression (such as [.ch.]) are optional.
Handling of a multi-character collating element is optional.
The following utilities are relevant:
| egrep | grep | sed | awk |
Conforming implementations shall provide the following utilities to correctly handle filenames that use file-system-safe characters.
For filename globbing, conforming level 1 implementations shall provide the functionality defined in [POSIX.2], with the following exceptions: [refer to Annex F: A]
Range expression (such as [a-z]) can be based on code point order instead of collating element order.
Equivalence class expression (such as [=a=]) and multi-character collating element expression (such as [.ch.]) are optional.
Handling of a multi-character collating element is optional.
| cpio | find | ls | tar |
Conforming implementations shall support at least one text editor that can edit text encoded in UTF-8.
Note: To edit text is to determine character boundaries correctly and perform operations such as insert, copy and delete characters based on the determined character boundaries. Input and output requirements are specified in 7. Input Methods and 8. Output Methods respectively.
Conforming implementations shall support terminal emulators that can handle codesets for supported locales.
Conforming implementations should support terminal emulation for all supported locales, but an implementation may provide different terminal emulators for each locale.
Conforming implementations shall provide the following utilities to convert message catalog source files into message catalogs.
| gencat | msgfmt |
Conforming implementations with C Language Development Option shall provide the following utilities to create and update message catalog source files.
| msgmerge | xgettext |
Conforming implementations shall provide the following utility to handle localized messages.
gettext
Examples of level 1 implementation
GNU bash
GNU textutils
GNU shellutils
GNU fileutils
Terminal Emulators:
kterm and kon.
jfbterm, supporting CJK, working under frame buffer, output only.
rxvt, supporting CJK, working under X Window System.
Unicon available at:
http://turbolinux.com.cn/TLDN/chinese/project/unicon/
zhcon by Bluepoint Corp.:
cce (Console Terminal) available at:
http://programmer.lib.sjtu.edu.cn/cce/cce.html
XLinux console, supporting 12 languages:
Unicode fonts and tools for X11:
http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html
XFree86 4.0.1 (includes already the above):
http://www.zepler.org/~rwb197/xterm/
In a future version of this specification, the shell's handling of file-system-safe characters will become mandatory.
This chapter defines the requirements for various programming languages. Only programming languages with internationalization requirements are listed here. Note that the specifications defined by this chapter shall be provided by conforming implementations if the relevant Software Development Option is supported.
Conforming level 2 implementations with Software Development Options shall support the compiler or interpreter for the following languages:
C (if the implementation supports the C Language Development Option)
Java (if the implementation supports the Java Language Development Option)
Perl
Each programming language shall be internationalized as specified in the following specifications:
C language as specified in [ISO C]
Java language as specified in [Java]
Perl language as specified in [Perl 5.6]
Note: See 3. Base Libraries for the runtime environments of the Perl and Java languages.
The following implementation examples are available for these languages:
C: GNU Compiler Collection
http://www.gnu.org/software/gcc/gcc.html
C: Fortran & C Package (Linux)
Fujitsu Kyushu System Engineering Limited (in Japan)
Fujitsu C/C++ Express (Linux)
Fujitsu America Inc. (in US)
Perl:
http://www.perl.com/pub/n/Perl_5.6.0_is_out!
Java:
None
This chapter defines runtime library interfaces for graphical user interface (GUI). Conforming implementations shall provide the graphical user interface defined by the X Window System Version 11 Release 6 [X11R6].
Conforming implementations shall provide the API for the following functions:
Locale
setlocale()
XSupportsLocale()
XSetLocaleModifiers()
Internationalized Text Drawing
XCreateFontSet() — not recommended (use XOpenOM()/XCreateOC())
XFreeFontSet()
XFontsOfFontSet()
XBaseFontNameListOfFontSet()
XLocaleOfFontSet()
XContextDependentDrawing()
XExtentsOfFontSet()
XmbTextEscapement()
XwcTextEscapement()
XmbTextExtents()
XwcTextExtents()
XmbTextPerCharExtents()
XwcTextPerCharExtents()
XmbDrawString()
XwcDrawString()
XmbDrawImageString()
XwcDrawImageString()
XmbDrawText()
XwcDrawText()
X Output Methods—X11R6 Extension
XOpenOM()
XCloseOM()
XDisplayOfOM()
XLocaleOfOM()
XSetOMValues()
XGetOMValues()
XCreateOC()
XDestroyOC()
XOMOfOC()
XSetOCValues()
XGetOCValues()
Resource Management
XrmInitialize()
XrmLocaleOfDatabase()
XrmParseCommand()
XResourceManagerString()
XScreenResourceString()
XrmGetFileDatabase()
XrmGetStringDatabase()
XrmMergeDatabases()
XrmCombineDatabase()
XrmCombineFileDatabase()
XrmGetDatabase()
XrmSetDatabase()
XrmGetResource()
XrmEnumerateDatabase()
XrmPutResource()
XrmPutStringResource()
XrmPutLineResource()
XrmPutFileDatabase()
XrmDestroyDatabase()
XrmPermStringToQuark()
XrmQGetResource()
XrmQGetSearchList()
XrmQGetSearchResource()
XrmQPutResource()
XrmQPutStringResource()
XrmQuarkToString()
XrmStringToBindingQuarkList()
XrmStringToQuark()
XrmStringToQuarkList()
XrmUniqueQuark()
Inter-Client Communication
XmbTextListToTextProperty()
XwcTextListToTextProperty()
XmbTextPropertyToTextList()
XwcTextPropertyToTextList()
XFreeStringList()
XwcFreeStringList()
XmbSetWMProperties()
XSetWMProperties()
XSetWMName()
XSetWMIconName()
X Input Methods—Internationalized Text Input
XOpenIM()
XCloseIM()
XDisplayOfIM()
XLocaleOfIM()
XSetIMValues()
XGetIMValues()
XCreateIC()
XVaCreateNestedList()
XDestroyIC()
XIMOfIC()
XSetICValues()
XGetICValues()
XSetICFocus()
XUnsetICFocus()
XmbResetIC()
XwcResetIC()
XFilterEvent()
XmbLookupString()
XwcLookupString()
XRegisterIMInstantiateCallback()
XUnregisterIMInstantiateCallback()
Conforming level 2 implementations shall support the languages listed in Annex B. Conforming level 1 implementations need not support languages that require complex text layout (the applicable languages are marked in the table in Annex B).
The following implementation example is available for this category.
XFree86 4.0.1:
None
This chapter defines the requirements for graphic toolkits supported on top of the X Window System and the X Window System servers.
Graphic Toolkits
There are no requirements on the Graphic Toolkits in terms of internationalization.
X Window Servers
There is no requirement on the X Window Servers in terms of internationalization.
The following implementation examples are available for this category.
[Graphic Toolkits]
GTK+:
Qt:
http://www.troll.no/products/qt.html
[X Window Server which supports outline fonts]
X-TrueType Server (X-TT):
http://X-TT.dsl.gr.jp/index.html
XFree86 4.0.1:
In a future version of this specification, Unicode, BiDi (bidirectional text), and vertical writing will become requirements.
This chapter defines the requirements for text input used by the X Window System and other environments. Such a mechanism is needed to support non-Western languages (for example, Chinese, Japanese and Korean).
Conforming implementations shall provide means, i.e., Input Method(s), for users to input the characters specified in Annex B: Supported locales and codesets.
Conforming implementations shall provide X Input Method Server(s) which can connect with Input Method Engines of the supported locales. An Input Method Engine can be implemented as a separate process communicating with an X Input Method Server or can be integrated into the X Input Method Server.
Conforming implementations shall support Input Method Engines for the supported locales that can be connected with the above Input Method Server(s). Conforming implementations shall document which Input Method Engines are supported by the above X Input Method Server(s) and how users can obtain and install the Engines on the conforming implementations.
The X Input Method Server(s) should have a capability to switch Input Method Engines dynamically, but a conforming implementation may provide multiple Input Method Servers per locale.
Conforming level 1 implementations should provide an X Input Method Server which supports UTF-8 encoding and allows users to input the whole repertoire of [Unicode 3.0].
Conforming level 2 implementations shall provide an X Input Method Server which supports UTF-8 encoding and allows users to input the whole repertoire of [Unicode 3.0].
Note: User-friendly input operation is preferable, but it is acceptable to use non-user-friendly input operation, such as entering hexadecimal code points, to input not-so-frequently-used characters. Also note that the input requirement does not imply that the input characters are displayed correctly.
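The hexadecimal fallback mentioned in the note reduces to a small conversion from a typed code point to its UTF-8 byte sequence. The following is a rough sketch of that step only, not of an input method; the function name is hypothetical, and the four-byte form covers code points beyond the repertoire of [Unicode 3.0] as an assumption based on the general definition of UTF-8.

```c
#include <stddef.h>
#include <stdlib.h>

/* Encode the code point written as hexadecimal text (for example "20AC")
 * as UTF-8. Writes at most 4 bytes to out and returns the byte count,
 * or 0 if the text is not a valid code point. */
size_t hex_code_point_to_utf8(const char *hex, unsigned char out[4])
{
    char *end;
    unsigned long cp = strtoul(hex, &end, 16);

    if (end == hex || *end != '\0')
        return 0;                       /* empty or trailing garbage */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                       /* out of range or surrogate */

    if (cp < 0x80) {                    /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For example, the input "20AC" (EURO SIGN) yields the three bytes E2 82 AC.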
Conforming implementations may provide X Input Method Server(s) which support locale-specific character repertoires and locale-specific character encodings.
Every application that has X Window System based GUI and has a capability to accept character input from users should have the interface with the above X Input Method Server(s).
Conforming implementations should provide means for user to input characters specified in the supported locale through Console and TTY device interfaces.
X Input Method Server (Generic): IIIMF
X Input Method Servers (Japanese): kinput2, and Xwnmo.
X Input Method Servers (Chinese):
Chinput, supporting both GB and Big5
http://turbolinux.com.cn/~justiny/project-chinput.html
xcin, supporting both Big5 and GB
X Input Method Servers (Korean): ami, hanIM and byeoroo
Chinese Console:
supports CJK and Big5 display and input with a platform-independent input server
http://www.redflag-linux.com/news/open.htm
yh-3.1-opensource.tgz
In the next version of this specification, the recommendation of a single X Input Method Server which can switch Input Method Engines dynamically will become a mandatory requirement.
In the next version of this specification, the recommendation for conforming level 1 implementations regarding the X Input Method Server(s) which support UTF-8 encoding will become a mandatory requirement.
This chapter defines the requirements for text output used by the X Window System. Such a mechanism is needed to support languages that require complex text rendering.
Conforming implementations shall provide means, i.e., Output Method(s), for users to output the characters specified in Annex B: Supported locales and codesets.
Conforming implementations shall provide the X Output Method interface defined in chapter 13 of the X11R6 Xlib specification as a display primitive for the X Window System.
Conforming level 1 implementations should provide multibyte and wide character interfaces which cover the following collections of UCS implementation level 1 defined in [ISO 10646-1].
Conforming level 2 implementations shall provide multibyte and wide character interfaces which cover the following collections of UCS implementation level 1 defined in [ISO 10646-1].
Note: [ISO 10646-1] defines character blocks for subsetting purposes; these are called character collections. Such character collections are used here to indicate the minimum displayable subset.
| Collection number | Collection name | Code positions |
| 1 | BASIC LATIN | 0020-007E |
| 2 | LATIN-1 SUPPLEMENT | 00A0-00FF |
| 3 | LATIN EXTENDED-A | 0100-017F |
| 4 | LATIN EXTENDED-B | 0180-024F |
| 5 | IPA EXTENSIONS | 0250-02AF |
| 8 | BASIC GREEK | 0370-03CF |
| 9 | GREEK SYMBOLS AND COPTIC | 03D0-03FF |
| 10 | CYRILLIC | 0400-04FF |
| 11 | ARMENIAN | 0530-058F |
| 27 | BASIC GEORGIAN | 10D0-10FF |
| 30 | LATIN EXTENDED ADDITIONAL | 1E00-1EFF |
| 31 | GREEK EXTENDED | 1F00-1FFF |
| 32 | GENERAL PUNCTUATION | 2000-206F (only graphical characters) |
| 33 | SUPERSCRIPTS AND SUBSCRIPTS | 2070-209F |
| 34 | CURRENCY SYMBOLS | 20A0-20CF |
| 36 | LETTERLIKE SYMBOLS | 2100-214F |
| 37 | NUMBER FORMS | 2150-218F |
| 38 | ARROWS | 2190-21FF |
| 39 | MATHEMATICAL OPERATORS | 2200-22FF |
| 40 | MISCELLANEOUS TECHNICAL | 2300-23FF |
| 41 | CONTROL PICTURES | 2400-243F |
| 42 | OPTICAL CHARACTER RECOGNITION | 2440-245F |
| 44 | BOX DRAWING | 2500-257F |
| 45 | BLOCK ELEMENTS | 2580-259F |
| 46 | GEOMETRIC SHAPES | 25A0-25FF |
| 47 | MISCELLANEOUS SYMBOLS | 2600-26FF |
| 49 | CJK SYMBOLS AND PUNCTUATION | 3000-303F |
| 50 | HIRAGANA | 3040-309F |
| 51 | KATAKANA | 30A0-30FF |
| 52 | BOPOMOFO | 3100-312F |
| 54 | CJK MISCELLANEOUS | 3190-319F |
| 55 | ENCLOSED CJK LETTERS AND MONTHS | 3200-32FF |
| 56 | CJK COMPATIBILITY | 3300-33FF |
| 60 | CJK UNIFIED IDEOGRAPHS | 4E00-9FFF |
| 62 | CJK COMPATIBILITY IDEOGRAPHS | F900-FAFF |
| 66 | CJK COMPATIBILITY FORMS | FE30-FE4F |
| 69 | HALFWIDTH AND FULLWIDTH FORMS | FF00-FFEF |
| 71 | HANGUL EXTENDED | AC00-D7A3 |
| 76 | YI SYLLABLES | A000-A48F |
| 77 | YI RADICALS | A490-A4CF |
| 81 | CJK UNIFIED IDEOGRAPHS EXTENSION A | 3400-4DBF |
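As an illustration only, the collection table above can be turned into a simple coverage check. The sketch below is not part of this specification: the names COLLECTIONS, find_collection and uncovered are hypothetical, the ranges are copied from the table, and the "only graphical characters" restriction on collection 32 is ignored for simplicity.

```python
# Hypothetical helper: checks whether a code point falls inside one of the
# UCS collections listed in the table above. Ranges are inclusive.
COLLECTIONS = [
    (1, "BASIC LATIN", 0x0020, 0x007E),
    (2, "LATIN-1 SUPPLEMENT", 0x00A0, 0x00FF),
    (3, "LATIN EXTENDED-A", 0x0100, 0x017F),
    (4, "LATIN EXTENDED-B", 0x0180, 0x024F),
    (5, "IPA EXTENSIONS", 0x0250, 0x02AF),
    (8, "BASIC GREEK", 0x0370, 0x03CF),
    (9, "GREEK SYMBOLS AND COPTIC", 0x03D0, 0x03FF),
    (10, "CYRILLIC", 0x0400, 0x04FF),
    (11, "ARMENIAN", 0x0530, 0x058F),
    (27, "BASIC GEORGIAN", 0x10D0, 0x10FF),
    (30, "LATIN EXTENDED ADDITIONAL", 0x1E00, 0x1EFF),
    (31, "GREEK EXTENDED", 0x1F00, 0x1FFF),
    # Assumption: the "only graphical characters" restriction is not modelled.
    (32, "GENERAL PUNCTUATION", 0x2000, 0x206F),
    (33, "SUPERSCRIPTS AND SUBSCRIPTS", 0x2070, 0x209F),
    (34, "CURRENCY SYMBOLS", 0x20A0, 0x20CF),
    (36, "LETTERLIKE SYMBOLS", 0x2100, 0x214F),
    (37, "NUMBER FORMS", 0x2150, 0x218F),
    (38, "ARROWS", 0x2190, 0x21FF),
    (39, "MATHEMATICAL OPERATORS", 0x2200, 0x22FF),
    (40, "MISCELLANEOUS TECHNICAL", 0x2300, 0x23FF),
    (41, "CONTROL PICTURES", 0x2400, 0x243F),
    (42, "OPTICAL CHARACTER RECOGNITION", 0x2440, 0x245F),
    (44, "BOX DRAWING", 0x2500, 0x257F),
    (45, "BLOCK ELEMENTS", 0x2580, 0x259F),
    (46, "GEOMETRIC SHAPES", 0x25A0, 0x25FF),
    (47, "MISCELLANEOUS SYMBOLS", 0x2600, 0x26FF),
    (49, "CJK SYMBOLS AND PUNCTUATION", 0x3000, 0x303F),
    (50, "HIRAGANA", 0x3040, 0x309F),
    (51, "KATAKANA", 0x30A0, 0x30FF),
    (52, "BOPOMOFO", 0x3100, 0x312F),
    (54, "CJK MISCELLANEOUS", 0x3190, 0x319F),
    (55, "ENCLOSED CJK LETTERS AND MONTHS", 0x3200, 0x32FF),
    (56, "CJK COMPATIBILITY", 0x3300, 0x33FF),
    (60, "CJK UNIFIED IDEOGRAPHS", 0x4E00, 0x9FFF),
    (62, "CJK COMPATIBILITY IDEOGRAPHS", 0xF900, 0xFAFF),
    (66, "CJK COMPATIBILITY FORMS", 0xFE30, 0xFE4F),
    (69, "HALFWIDTH AND FULLWIDTH FORMS", 0xFF00, 0xFFEF),
    (71, "HANGUL EXTENDED", 0xAC00, 0xD7A3),
    (76, "YI SYLLABLES", 0xA000, 0xA48F),
    (77, "YI RADICALS", 0xA490, 0xA4CF),
    (81, "CJK UNIFIED IDEOGRAPHS EXTENSION A", 0x3400, 0x4DBF),
]


def find_collection(code_point):
    """Return (number, name) of the collection containing code_point, or None."""
    for number, name, first, last in COLLECTIONS:
        if first <= code_point <= last:
            return number, name
    return None


def uncovered(text):
    """Return the characters of text that fall outside every listed collection."""
    return [ch for ch in text if find_collection(ord(ch)) is None]


if __name__ == "__main__":
    for ch in "A\u00e9\u3042\u4e2d\u05d0":
        print(f"U+{ord(ch):04X}", find_collection(ord(ch)))
```

A check like this could be used to flag text that relies on characters outside the minimum displayable subset, for example Hebrew letters, which are not in any collection listed above.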
Conforming implementations should provide an X Output Method that supports the encoding schemes listed in Annex B.
Conforming implementations shall provide a terminal emulator on the X Window System that outputs characters in the supported locale.
Conforming implementations should provide a console or tty device interface that outputs characters in the supported locale.
X11R6.4 Xlib and IIIMXCF
xterm patches available at:
http://www.zepler.org/~rwb197/xterm/
None
This chapter defines the requirements for various network servers, such as file sharing servers and WWW servers.
The requirements on the following kinds of servers will be discussed in this section.
NetBIOS over TCP/IP
AppleTalk
Network File System
HTTP Server
This version of the specification has no requirements for the Network Servers.
None
In a future version of this specification, the requirements on the handling of names, e.g., filename, domain name, resource name, and user name, will be specified in this section.
This chapter defines the requirements for Internet client tools, such as WWW browsers and Mail User Agents (MUAs).
Conforming implementations shall make at least one codeset available per locale specified in Annex B.
The supported codeset should be registered in [IANA-Charset-Registry].
Conforming level 2 implementations of Web browsers and mail user agents shall be able to input and output the whole repertoire of [Unicode 3.0].
Note: Character output is restricted as specified in 8. Output Methods.
The following implementation examples are available for this category.
Mozilla
mutt
None
This chapter defines requirements related to printing, such as APIs, utilities and their behavior.
This version of the specification has no requirements for printing.
None
In a future version of this specification, requirements from the Printing subgroup of the Li18nux working group will be provided.
Conforming implementations shall provide the following environment variables that are relevant to the operation of internationalized interfaces or internationalized commands and utilities.
LANG
LC_ALL
LC_COLLATE
LC_CTYPE
LC_MESSAGES
LC_MONETARY
LC_NUMERIC
LC_TIME
NLSPATH
The usage and the semantics of these environment variables shall be the same as the description in “6.2 Internationalisation Variables” in [XBD5].
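The precedence among these variables, as described in [XBD5], can be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: resolve_category is a hypothetical helper, and the implementation-defined default locale is assumed here to be "C". NLSPATH is omitted because it affects message catalogue lookup rather than locale category selection.

```python
# Hypothetical sketch of the locale-variable precedence:
# LC_ALL overrides the per-category variable, which overrides LANG.
CATEGORIES = (
    "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
    "LC_MONETARY", "LC_NUMERIC", "LC_TIME",
)


def resolve_category(category, env, default="C"):
    """Return the locale name that governs `category` for the mapping `env`."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown locale category: {category}")
    # A variable that is unset or set to the empty string is skipped;
    # the first non-empty value in this order wins.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    # Assumption: the implementation-defined default is taken to be "C".
    return default


if __name__ == "__main__":
    example_env = {"LANG": "ja_JP", "LC_MESSAGES": "en_US"}
    for cat in CATEGORIES:
        print(cat, "->", resolve_category(cat, example_env))
```

With the example environment, only LC_MESSAGES is taken from its own variable; every other category falls back to LANG. Setting LC_ALL would override all of them.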
Conforming implementations shall be able to handle the following locales.
C
POSIX
Conforming implementations shall support the following locales.
Note 1: The language names come from ISO 639.
Note 2: To avoid political discussion, the region/country names used here do not strictly follow ISO 3166-1.
| Locale | Language | Region/Country | Conformance notes |
| af_ZA | Afrikaans | SOUTH AFRICA | [Support of this locale is level 2] |
| ar_AE | Arabic | UNITED ARAB EMIRATES | [Output method support is level 2] |
| ar_BH | | BAHRAIN | [Output method support is level 2] |
| ar_DZ | | ALGERIA | [Output method support is level 2] |
| ar_EG | | EGYPT | [Output method support is level 2] |
| ar_IN | | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| ar_IQ | | IRAQ | [Output method support is level 2] |
| ar_JO | | JORDAN | [Output method support is level 2] |
| ar_KW | | KUWAIT | [Output method support is level 2] |
| ar_LB | | LEBANON | [Output method support is level 2] |
| ar_LY | | LIBYAN ARAB JAMAHIRIYA | [Output method support is level 2] |
| ar_MA | | MOROCCO | [Output method support is level 2] |
| ar_OM | | OMAN | [Output method support is level 2] |
| ar_QA | | QATAR | [Output method support is level 2] |
| ar_SA | | SAUDI ARABIA | [Output method support is level 2] |
| ar_SD | | SUDAN | [Output method support is level 2] |
| ar_SY | | SYRIAN ARAB REPUBLIC | [Output method support is level 2] |
| ar_TN | | TUNISIA | [Output method support is level 2] |
| ar_YE | | YEMEN | [Output method support is level 2] |
| as_IN | Assamese | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| be_BY | Byelorussian | BELARUS | |
| bg_BG | Bulgarian | BULGARIA | |
| bn_IN | Bengali | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| ca_ES | Catalan | SPAIN | |
| cs_CZ | Czech | CZECH REPUBLIC | |
| da_DK | Danish | DENMARK | |
| de_AT | German | AUSTRIA | |
| de_BE | | BELGIUM | [Support of this locale is level 2] |
| de_CH | | SWITZERLAND | |
| de_DE | | GERMANY | |
| de_LU | | LUXEMBOURG | |
| el_GR | Greek | GREECE | |
| en_AU | English | AUSTRALIA | |
| en_BE | | BELGIUM | |
| en_BW | | BOTSWANA | [Support of this locale is level 2] |
| en_CA | | CANADA | |
| en_GB | | UNITED KINGDOM | |
| en_HK | | HONG KONG | [Support of this locale is level 2] |
| en_IE | | IRELAND | |
| en_IN | | INDIA | [Support of this locale is level 2] |
| en_NZ | | NEW ZEALAND | |
| en_PH | | PHILIPPINES | [Support of this locale is level 2] |
| en_SG | | SINGAPORE | [Support of this locale is level 2] |
| en_US | | UNITED STATES | |
| en_ZA | | SOUTH AFRICA | |
| en_ZW | | ZIMBABWE | [Support of this locale is level 2] |
| es_AR | Spanish | ARGENTINA | |
| es_BO | | BOLIVIA | |
| es_CL | | CHILE | |
| es_CO | | COLOMBIA | |
| es_CR | | COSTA RICA | |
| es_DO | | DOMINICAN REPUBLIC | |
| es_EC | | ECUADOR | |
| es_ES | | SPAIN | |
| es_GT | | GUATEMALA | |
| es_HN | | HONDURAS | |
| es_MX | | MEXICO | |
| es_NI | | NICARAGUA | |
| es_PA | | PANAMA | |
| es_PE | | PERU | |
| es_PR | | PUERTO RICO | |
| es_PY | | PARAGUAY | |
| es_SV | | REPUBLIC OF EL SALVADOR | |
| es_UY | | URUGUAY | |
| es_VE | | VENEZUELA | |
| et_EE | Estonian | ESTONIA | |
| eu_ES | Basque | SPAIN | [Support of this locale is level 2] |
| fa_IN | Persian | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| fa_IR | | IRAN, ISLAMIC REPUBLIC OF | [Support of this locale is level 2] [Output method support is level 2] |
| fi_FI | Finnish | FINLAND | |
| fo_FO | Faroese | FAROE ISLANDS | |
| fr_BE | French | BELGIUM | |
| fr_CA | | CANADA | |
| fr_CH | | SWITZERLAND | |
| fr_FR | | FRANCE | |
| fr_LU | | LUXEMBOURG | |
| ga_IE | Irish | IRELAND | |
| gl_ES | Galician | SPAIN | [Support of this locale is level 2] |
| gu_IN | Gujarati | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| gv_GB | Manx Gaelic | UNITED KINGDOM | [Support of this locale is level 2] |
| he_IL | Hebrew | ISRAEL | [Output method support is level 2] |
| hi_IN | Hindi | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| hr_HR | Croatian | CROATIA | |
| hu_HU | Hungarian | HUNGARY | |
| id_ID | Indonesian | INDONESIA | [Support of this locale is level 2] |
| is_IS | Icelandic | ICELAND | |
| it_CH | Italian | SWITZERLAND | |
| it_IT | | ITALY | |
| ja_JP | Japanese | JAPAN | |
| kl_GL | Greenlandic | GREENLAND | |
| kn_IN | Kannada | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| ko_KR | Korean | KOREA, REPUBLIC OF | |
| ks_IN | Kashmiri | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| kw_GB | Cornish | UNITED KINGDOM | [Support of this locale is level 2] |
| lt_LT | Lithuanian | LITHUANIA | |
| lv_LV | Latvian, Lettish | LATVIA | |
| mk_MK | Macedonian | MACEDONIA, THE FORMER YUGOSLAV REPUBLIC OF | |
| ml_IN | Malayalam | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| ms_MY | Malay | MALAYSIA | [Support of this locale is level 2] |
| nl_BE | Dutch | BELGIUM | |
| nl_NL | | NETHERLANDS | |
| no_NO | Norwegian | NORWAY | |
| or_IN | Oriya | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| pa_IN | Punjabi | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
| pl_PL | Polish | POLAND | |
| ps_IN | Pashto, Pushto | INDIA | [Support of this locale is level 2] [Output method support is level 2] |
pt_BR |
Portuguese |
BRAZIL |