LI18NUX 2000

Globalization Specification



Version 1.0 with Amendment 4



Linux Internationalization Initiative (Li18nux)

















Copyright © 2000 The Free Standards Group. All rights reserved.


1. Foreword

1.1 Scope

This document specifies interfaces and functionalities that must be supported by operating systems to run internationalized application software. This document also includes recommendations for operating systems to ease development of internationalized application software.

This specification only lists internationalization aspects of each functionality provided by the conforming operating systems.

1.2 Normative References

[POSIX.1]

ISO/IEC 9945-1:1996 Information technology — Portable Operating System Interface (POSIX) — Part 1: System Application Program Interface (API) [C Language]

[POSIX.2]

ISO/IEC 9945-2:1993 Information technology — Portable Operating System Interface (POSIX) — Part 2: Shell and Utilities

[ISO C]

ISO/IEC 9899:1990 Programming languages — C

ISO/IEC 9899:1990/Amd.1:1995 Programming languages — C Amendment 1: C Integrity


[ISO C 99]

ISO/IEC 9899:1999 Programming languages — C

[XCU5]

The Single UNIX Specification, Version 2

Commands and Utilities, Issue 5

(The Open Group CAE Specification C604)

[XBD5]

The Single UNIX Specification, Version 2

System Interface Definitions, Issue 5

(The Open Group CAE Specification C605)

[XSH5]

The Single UNIX Specification, Version 2


System Interfaces and Headers, Issue 5 (2 volumes)


(The Open Group CAE Specification C606)

[XCURSES4.2]

The Single UNIX Specification, Version 2

X/Open Curses (XCurses), Issue 4 Version 2

(The Open Group CAE Specification C610)

[ICU]

International Components for Unicode 2.0

http://oss.software.ibm.com/icu/

[ICU4J]

International Components for Unicode for Java 2.0

http://oss.software.ibm.com/icu4j/

[Perl 5.6]

Perl 5.6 (March 23, 2000)


http://www.perl.com/pub/n/Perl_5.6.0_is_out!

[Java]

Java 2 Platform, Standard Edition, v1.3 API Specification

http://java.sun.com/products/jdk/1.3/docs/api/index.html

[X11R6]

The X Window System, Version 11, Release 6

ftp://ftp.x.org/pub/R6.4/xc/doc/hardcopy/

[Unicode 3.0]

The Unicode Standard, Version 3.0

The Unicode Consortium, Addison Wesley Longman, ISBN 0-201-61633-5

[ISO 10646-1]

ISO/IEC 10646-1:2000 Information technology — Universal Multiple-Octet Coded Character Set (UCS) — Part 1: Architecture and Basic Multilingual Plane

[ISO 639]

ISO 639:1988 Code for the representation of names of languages

[ISO 3166-1]

ISO 3166-1:1997 Codes for the representation of names of countries and their subdivisions — Part 1: Country codes

[IANA-Charset-Registry]

IANA Registry of Character Sets

http://www.isi.edu/in-notes/iana/assignments/character-sets

[ISO 8859-1]

ISO/IEC 8859-1:1998 Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1

[ISO 8859-2]

ISO/IEC 8859-2:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2

[ISO 8859-5]

ISO/IEC 8859-5:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet

[ISO 8859-7]

ISO 8859-7:1987 Information processing — 8-bit single-byte coded graphic character sets — Part 7: Latin/Greek alphabet

[ISO 8859-9]

ISO/IEC 8859-9:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5

[ISO 8859-13]

ISO/IEC 8859-13:1998 Information technology — 8-bit single-byte coded graphic character sets — Part 13: Latin alphabet No. 7

[ISO 8859-15]

ISO/IEC 8859-15:1999 Information technology — 8-bit single-byte coded graphic character sets — Part 15: Latin alphabet No. 9


1.3 Conformance

1.3.1 Conforming Environments

For conformance purposes the following environments are defined:

(1) Application Execution Environment [Obsolescence]

Application Execution Environment is a minimum operating system environment that can run internationalized application software. The functionalities defined in this environment are mandatory and shall be present on all conforming implementations.

The following sections are applied to Application Execution Environment:

3. Base Libraries

4. Shells and Utilities


(2) End User Environment

End User Environment is an operating system environment with a user interface. It is assumed that End User Environment has a set of utilities for user interaction.

This environment includes all the interfaces and utilities provided by Application Execution Environment. Additional interfaces and utilities are defined for the following sub-environments:

(a) Server Environment [Obsolescence]

Server environment is an operating system environment suitable for backend server purposes. Graphical user interfaces are not required in this environment.

The following sections are applied to Server Environment:

3. Base Libraries

4. Shells and Utilities

5. Programming Languages (with Software Development Options)

9. Network Servers

(b) Desktop Environment

Desktop environment is an operating system environment suitable for end user interaction. A graphical user interface is required in this environment.

The following sections are applied to Desktop Environment:

3. Base Libraries

4. Shells and Utilities

5. Programming Languages (with Software Development Options)

6. Graphical User Interface

7. Input Methods

8. Output Methods

10. Internet Tools


If an interface or utility is defined as “supported in End User Environment”, that interface or utility shall be available in both Server and Desktop environments.

The following options can be supported in each environment:

(3) Software Development Options

If any of these options is supported, utilities, libraries and associated modules to develop internationalized software (such as compilers or interpreters) shall be provided.

In this version of the specification, the following options are available:

1.3.2 Conformance Levels

Several levels are defined for conformance for each environment. These levels are defined as follows:

(1) Level 1

Level 1 is the baseline level of conformance. All conforming implementations shall provide this level of interfaces and utilities to conform to this specification. If a level is not specified for a requirement in this specification, that requirement shall be considered Level 1.

(2) Level 2

Level 2 is a more advanced or extended level of conformance. Conforming implementations are encouraged to provide this level of interfaces and utilities, but it is not mandatory.

2. Terminology

2.1 Definition of Terms

The following terms are used in this specification:

Implementation-defined

A value or behavior is implementation-defined when it is left to the implementation to define [and document] the corresponding requirements for correct application behavior.

May

With respect to implementations, the word “may” is to be interpreted as an optional feature that is not required in this specification but can be provided. With respect to applications, the word “may” means that the feature is optional. The term “optional” has the same definition as “may”.

Shall

In this specification, the word “shall” is to be interpreted as a mandatory requirement on the implementation or on the application, depending upon the context. The term “must” has the same definition as “shall”.

Should

With respect to implementations, the word “should” is to be interpreted as an implementation recommendation, but not a requirement. With respect to applications, the word “should” is to be interpreted as recommended programming practice.

Supported

Certain facilities in this specification are optional. If a facility is supported, it behaves as specified by this specification.

If a facility is “supported” by an implementation, the implementation must either document how to obtain and install the facility, or install it through the implementation's installer, whether explicitly selected by the user or implicitly installed with other system components. If an implementation “supports” a facility, the distributor of the implementation shall commit that the facility can run on the implementation.

Unspecified

When a value or behavior is unspecified, the specification defines no portability requirements for a facility on an implementation even when faced with an application that uses the facility. An application that requires specific behavior in such an instance, rather than tolerating any behavior when using that facility, is not a portable application.

Provided

Certain facilities in this specification are mandatory and implemented in all conforming implementations.

Obsolescence

An indication that the subject statement or clause will be removed from a future revision of this standard.

2.2 General Terms

character

A sequence of one or more bytes representing a single graphic symbol or control code.

This term corresponds to the ISO C standard term multibyte character (multi-byte character), where a single-byte character is a special case of a multi-byte character. Unlike the usage in the ISO C standard, character here has no necessary relationship with storage space, and byte is used when storage space is discussed.

[Single UNIX Specification, Version 2]

byte

An individually addressable unit of data storage that is equal to or larger than an octet, used to store a character or a portion of a character; see character.

A byte is composed of a contiguous sequence of bits, the number of which is implementation-dependent. The least significant bit is called the low-order bit; the most significant is called the high-order bit.

Note that this definition of byte deviates intentionally from the usage of byte in some international standards, where it is used as a synonym for octet (always eight bits). On a system based on the ISO/IEC 9945-2:1993 standard, a byte may be larger than eight bits so that it can be an integral portion of larger data objects that are not evenly divisible by eight bits (such as a 36-bit word that contains four 9-bit bytes).

[Single UNIX Specification, Version 2]

character set

A finite set of different characters used for the representation, organization or control of data.

[Single UNIX Specification, Version 2]

coded character set

A set of unambiguous rules that establishes a character set and the one-to-one relationship between each character of the set and its bit representation.

[Single UNIX Specification, Version 2]

codeset

The result of applying rules that map a numeric code value to each element of a character set. An element of a character set may be related to more than one numeric code value but the reverse is not true. However, for state-dependent encodings the relationship between numeric code values to elements of a character set may be further controlled by state information.

The character set may contain fewer elements than the total number of possible numeric code values; that is, some code values may be unassigned.

[Single UNIX Specification, Version 2]

internationalization

The provision within a computer program of the capability of making itself adaptable to the requirements of different native languages, local customs and coded character sets.

[Single UNIX Specification, Version 2]

globalization

A product development approach which ensures that software products are usable in the worldwide markets through a combination of internationalization and localization.

locale

The definition of the subset of a user's environment that depends on language and cultural conventions.

[Single UNIX Specification, Version 2]

localization

The process of establishing information within a computer system specific to the operation of particular native languages, local customs and coded character sets.

[Single UNIX Specification, Version 2]

local customs

The conventions of a geographical area or territory for such things as date, time and currency formats.

[Single UNIX Specification, Version 2]

portable filename character set

The set of characters from which portable filenames are constructed. For a filename to be portable across implementations conforming to this specification set and the ISO POSIX-1 standard, it must consist only of the following characters:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

a b c d e f g h i j k l m n o p q r s t u v w x y z

0 1 2 3 4 5 6 7 8 9 . _ -


The last three characters are the period, underscore and hyphen characters, respectively. The hyphen must not be used as the first character of a portable filename. Upper- and lower-case letters retain their unique identities between conforming implementations. In the case of a portable pathname, the slash character may also be used.

[Single UNIX Specification, Version 2]

file-system-safe character

Multibyte character which does not contain either 0x00 or 0x2F in any byte of its representation.

Input Method Engine

A module or building block of an input method that implements the language-specific or script-specific logic for composing a string from one or more sequences of events or strings. It can be independent of the windowing system, graphical user interface, and visual appearance.
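The two filename-related definitions above (portable filename character set and file-system-safe character) can both be checked byte by byte. The following is a minimal sketch, not part of this specification; the function names is_portable_filename and is_file_system_safe are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if name consists only of characters from
 * the portable filename character set (A-Z, a-z, 0-9, period, underscore,
 * hyphen) and does not begin with a hyphen; returns 0 otherwise.
 * An empty or NULL name is rejected. */
int is_portable_filename(const char *name)
{
    static const char allowed[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789._-";

    if (name == NULL || name[0] == '\0' || name[0] == '-')
        return 0;
    /* strspn() gives the length of the leading run of allowed characters;
     * the name is portable only if that run reaches the terminator. */
    return name[strspn(name, allowed)] == '\0';
}

/* Hypothetical helper: returns 1 if none of the len bytes is 0x00 or 0x2F,
 * which is the byte-level condition in the definition of a
 * file-system-safe character; returns 0 otherwise. */
int is_file_system_safe(const unsigned char *bytes, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] == 0x00 || bytes[i] == 0x2F)
            return 0;
    }
    return 1;
}
```

The portable check deliberately uses an explicit character list instead of isalnum(), because the result of isalnum() depends on the current locale.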


3. Base Libraries

(1) Scope

This chapter defines runtime library interfaces required to conform to this specification. Conforming implementations shall provide the C language APIs defined by [ISO C] and [POSIX.1]. In addition to the C language interface, conforming level 2 implementations shall provide interfaces for other programming languages.

(2) Requirements

Conforming implementations shall provide the internationalization functions listed in Table 3-1 and the headers listed in Table 3-2. The specifications of the functions and the definitions of the headers shall conform to [POSIX.1] and [ISO C].

In addition to the functions in Table 3-1, conforming implementations shall provide the wide character and wide string I/O functionality through the printf/scanf family of functions as specified in [ISO C].

Table 3-1 C Language internationalization functions

btowc()

fgetwc()

fgetws()

fputwc()

fputws()

fwide()

fwprintf()

fwscanf()

getwc()

getwchar()

iswalnum()

iswalpha()

iswcntrl()

iswctype()

iswdigit()

iswgraph()

iswlower()

iswprint()

iswpunct()

iswspace()

iswupper()

iswxdigit()

localeconv()

mblen()

mbrlen()

mbrtowc()

mbsinit()

mbsrtowcs()

mbstowcs()

mbtowc()

putwc()

putwchar()

setlocale()

strftime()

swprintf()

swscanf()

towctrans()

towlower()

towupper()

ungetwc()

vfwprintf()

vswprintf()

vwprintf()

wcrtomb()

wcscat()

wcschr()

wcscmp()

wcscoll()

wcscpy()

wcscspn()

wcsftime()

wcslen()

wcsncat()

wcsncmp()

wcsncpy()

wcspbrk()

wcsrchr()

wcsrtombs()

wcsspn()

wcsstr()

wcstod()

wcstok()

wcstol()

wcstombs()

wcstoul()

wcsxfrm()

wctob()

wctomb()

wctrans()

wctype()

wmemchr()

wmemcmp()

wmemcpy()

wmemmove()

wmemset()

wprintf()

wscanf()





Table 3-2 C language headers

<locale.h>

<wchar.h>

<wctype.h>



Note: Application programs should refer to limits by their symbolic names, such as MB_CUR_MAX and MB_LEN_MAX, rather than using the implementation-specific values directly.
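As an illustration of the note above, the following sketch sizes a buffer with MB_LEN_MAX and bounds each conversion with MB_CUR_MAX instead of hard-coding a byte count. It uses mbrtowc() and wcrtomb() from Table 3-1. The function names count_characters and encoded_length are hypothetical, and the results depend on the LC_CTYPE locale in effect when they are called.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <stdlib.h>   /* MB_CUR_MAX */
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the characters (not bytes) in the multibyte
 * string s, interpreted in the current LC_CTYPE locale. Returns (size_t)-1
 * if s contains an invalid or incomplete multibyte sequence. */
size_t count_characters(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);   /* initial conversion state */
    while (remaining > 0) {
        /* Never look at more bytes than one character can occupy. */
        size_t limit = remaining < (size_t)MB_CUR_MAX
                           ? remaining : (size_t)MB_CUR_MAX;
        size_t n = mbrtowc(NULL, s, limit, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;
        if (n == 0)          /* null character: end of string */
            break;
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}

/* Hypothetical helper: returns the number of bytes needed to encode wc in
 * the current locale, or (size_t)-1 if wc cannot be represented. */
size_t encoded_length(wchar_t wc)
{
    char buf[MB_LEN_MAX];   /* large enough for any supported locale */
    mbstate_t state;

    memset(&state, 0, sizeof state);
    return wcrtomb(buf, wc, &state);
}
```

An application would normally call setlocale(LC_ALL, "") once at startup before using these helpers, so that the user's codeset is honored.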

Conforming level 2 implementations shall provide the following functions. The specifications of the functions shall conform to [ISO C 99].

wcstof()

wcstold()

wcstoll()

wcstoull()



Conforming implementations shall provide the internationalization functions listed in Table 3-3 and the headers listed in Table 3-4. The specifications of the functions and the definitions of the headers shall conform to [XSH5].

Table 3-3 Additional C Language internationalization functions

catclose()

catgets()

catopen()

iconv()

iconv_close()

iconv_open()

nl_langinfo()

strfmon()

strptime()

wcswidth()

wcwidth()



Table 3-4 Additional C language headers

<iconv.h>

<langinfo.h>

<monetary.h>

<nl_types.h>
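The codeset conversion interface in Table 3-3 is typically used as an open, convert, close sequence. The following sketch converts ISO 8859-1 text to UTF-8. The function name latin1_to_utf8 is hypothetical, and the codeset names "ISO-8859-1" and "UTF-8" are an assumption: the names accepted by iconv_open() are implementation-defined, although the GNU C library accepts both.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: converts in_len bytes of ISO 8859-1 text to UTF-8.
 * Writes at most out_size bytes to out and returns the number of bytes
 * written, or (size_t)-1 if the conversion cannot be opened or fails
 * (for example, because out is too small). */
size_t latin1_to_utf8(const char *in, size_t in_len,
                      char *out, size_t out_size)
{
    /* Note the argument order: target codeset first, source second. */
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    char *in_ptr = (char *)in;   /* glibc declares inbuf as char ** */
    char *out_ptr = out;
    size_t in_left = in_len;
    size_t out_left = out_size;
    size_t rc;

    if (cd == (iconv_t)-1)
        return (size_t)-1;
    rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_size - out_left;
}
```

The output is not null-terminated by iconv(); a caller that needs a C string should reserve one extra byte and add the terminator itself.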



Conforming implementations shall provide the message handling functions listed in Table 3-5 and the headers listed in Table 3-6, which are specified in Annex C: Publicly Available Specifications.

Table 3-5 Additional message handling functions

gettext()

dgettext()

textdomain()

bindtextdomain()


dcgettext()

ngettext()

dngettext()

dcngettext()


bind_textdomain_codeset()






Table 3-6 Additional message handling functions headers

<libintl.h>
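A typical application binds a message domain once at startup and then looks up each message at the point of use. The following is a minimal sketch of that pattern using functions from Table 3-5. The domain name "example-app", the catalog directory "/usr/share/locale", and the function names are hypothetical. When no catalog is found, ngettext() returns the untranslated singular or plural string.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical values, used only for illustration. */
#define EXAMPLE_DOMAIN    "example-app"
#define EXAMPLE_LOCALEDIR "/usr/share/locale"

/* Selects the locale from the environment and binds the message domain. */
void init_messages(void)
{
    setlocale(LC_ALL, "");
    bindtextdomain(EXAMPLE_DOMAIN, EXAMPLE_LOCALEDIR);
    /* Ask for translations to be returned in UTF-8. */
    bind_textdomain_codeset(EXAMPLE_DOMAIN, "UTF-8");
    textdomain(EXAMPLE_DOMAIN);
}

/* Formats a file count into buf, choosing the singular or plural form
 * according to the plural rules of the message catalog in use.
 * Returns the value returned by snprintf(). */
int format_file_count(char *buf, size_t size, unsigned long n)
{
    return snprintf(buf, size, ngettext("%lu file", "%lu files", n), n);
}
```

Using ngettext() rather than choosing between two gettext() calls lets the catalog supply languages with more than two plural forms.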


Conforming level 1 implementations should support the POSIX regular expression functions listed in Table 3-7 and the header <regex.h>.

The specifications of the functions and the definitions of the header should conform to [XSH5].

Table 3-7 POSIX regular expression functions

regcomp()

regexec()

regerror()

regfree()
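The four functions in Table 3-7 are normally used together as compile, execute, free. The following sketch wraps that sequence in a single hypothetical helper, regex_matches. Passing 0 as cflags selects Basic Regular Expressions and REG_EXTENDED selects Extended Regular Expressions. The interpretation of character classes such as [[:alpha:]] depends on the current locale.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches pattern, 0 if it does
 * not (or if matching itself fails), and -1 if the pattern cannot be
 * compiled. cflags is passed to regcomp(): 0 for BRE, REG_EXTENDED for ERE. */
int regex_matches(const char *pattern, const char *text, int cflags)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

A caller that needs a diagnostic for a pattern that fails to compile can keep the regcomp() return value and pass it to regerror().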


Conforming implementations shall provide the application execution environment in which the internationalized applications (written by using the internationalization functions above) can behave appropriately depending on the value of environment variables, without requiring any change of the applications.

See Annex A: Environment Variables for the environment variables to which internationalization functions will refer.
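An application opts in to this environment-driven behavior by calling setlocale() with an empty locale name, after which functions such as nl_langinfo() report the user's settings. The following is a minimal sketch; the function name select_locale_codeset is hypothetical, and the codeset name that is returned is implementation-defined.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: sets all locale categories to locale_name and
 * returns the name of the codeset of the resulting locale. An empty
 * locale_name ("") selects the locale from the environment variables.
 * Returns NULL if the requested locale is not available. */
const char *select_locale_codeset(const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

Calling select_locale_codeset("") at program startup is the usual case; passing an explicit name such as "C" is mainly useful for testing.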

Conforming implementations shall support the application execution environments specified in Annex B.

Conforming level 2 implementations shall define _XOPEN_CURSES version test macro and provide the internationalized curses library functions which are specified in [XCURSES4.2].

Conforming level 2 implementations shall support the Java runtime environment [Java], International Components for Unicode [ICU], ICU for Java [ICU4J], and the Perl execution environment [Perl 5.6], including the Perl interpreter and modules.

The following Perl modules are related to internationalization:

(see http://www.perl.com/CPAN-local/modules/00modlist.long.html#Part2-ThePerl5M)

Name               Description

I18N::
  ::Charset        Character set names and aliases
  ::Collate        Locale based comparisons
  ::LangTags       Compare and extract language tags (RFC1766)
  ::WideMulti      Wide and multibyte character string

Locale::
  ::Country        ISO 3166 two letter country codes
  ::Date           Month/weekday names in various languages
  ::Langinfo       The <langinfo.h> API
  ::Language       ISO 639 two letter language codes
  ::Msgcat         Access to XPG4 message catalog functions
  ::PGetText       What GNU gettext does, written in pure perl
  ::gettext        Multilanguage messages

Unicode::
  ::String         String manipulation for Unicode strings
  ::Map8           Convert between most 8bit encodings


(3) Implementation Examples

GNU C library version 2.2

(4) Future Direction

In the next version of this specification, conforming implementations may be required to provide POSIX regular expression functions and internationalized curses library functions.

4. Shells and Utilities

(1) Scope

This chapter defines the runtime environment required to support the traditional UNIX command interpreter, called the “shell”, and other basic utilities defined in [POSIX.2].

(2) Requirements

Conforming level 1 implementations shall be able to use Portable Filename Character Set defined in [POSIX.2]. For filename globbing, conforming level 1 implementations shall provide the functionality defined in [POSIX.2], with the following exceptions: [refer to Annex F: A]

Conforming level 2 implementations shall be able to use file-system-safe characters as arguments and filenames.

Conforming level 2 implementations shall implement the globbing functionality of the shell as defined in [POSIX.2].

Conforming implementations shall provide a shell that supports the functionalities of “Bourne shell”, with internationalization capabilities defined above.

(a) Locale

Conforming implementations shall provide the following utilities to generate and refer to locale definitions as specified in [XCU5]:

locale

localedef



(b) Text Editor

Conforming implementations shall provide the following utilities to edit text files encoded in the supported codesets as specified in [XCU5].

Note: To edit text is to determine character boundaries correctly and perform operations such as insert, copy and delete characters based on the determined character boundaries. Input and output requirements are specified in 7. Input Methods and 8. Output Methods respectively.

ed

ex

vi

(c) Date and Time formatting

This specification has no requirements on date and time formatting functionality of shells and utilities.

(d) Text Processing

Conforming implementations shall provide the following utilities to process text as specified in [XCU5].

comm

diff

egrep

expand

fgrep

fold

grep

iconv

join

more

mailx


nm (symbol sorting order)

od (floating point)

pr

printf

sed

sort

unexpand

uniq

wc


The mailx utility can be implemented as Mail. The more utility can be implemented as less.

(e) Regular Expressions

On conforming level 2 implementations, utilities that process regular expressions shall support Basic Regular Expression (BRE) and Extended Regular Expression (ERE) as specified in [POSIX.2].

On conforming level 1 implementations, utilities that process regular expressions should support BRE and ERE as specified in [POSIX.2], with the following exceptions: [refer to Annex F: A]

The following utilities are relevant:

egrep

grep

sed

awk


(f) Filename Handling

Conforming implementations shall provide the following utilities to correctly handle filenames that use file-system-safe characters.

For filename globbing, conforming level 1 implementations shall provide the functionality defined in [POSIX.2], with the following exceptions: [refer to Annex F: A]

cpio

find

ls

tar


(g) General Text Editor

Conforming implementations shall support at least one text editor that can edit text encoded in UTF-8.

Note: To edit text is to determine character boundaries correctly and perform operations such as insert, copy and delete characters based on the determined character boundaries. Input and output requirements are specified in 7. Input Methods and 8. Output Methods respectively.
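For UTF-8, character boundaries can be found from the byte values alone, because every continuation byte has the bit pattern 10xxxxxx. The following sketch shows this under the assumption that the input is already valid UTF-8; it performs no validation. The function names are hypothetical, and "character" here means a single coded character, not a user-perceived combination of base and combining characters.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of characters in len bytes of
 * UTF-8 text by counting the bytes that are not continuation bytes.
 * Assumes the input is valid UTF-8. */
size_t utf8_count_characters(const unsigned char *s, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)   /* not of the form 10xxxxxx */
            count++;
    }
    return count;
}

/* Hypothetical helper: returns the byte offset at which the character
 * containing byte offset pos begins, by stepping back over continuation
 * bytes. Assumes the input is valid UTF-8 and pos is within the buffer. */
size_t utf8_char_start(const unsigned char *s, size_t pos)
{
    while (pos > 0 && (s[pos] & 0xC0) == 0x80)
        pos--;
    return pos;
}
```

An editor that deletes or copies a character can use the second helper to move a cursor to a boundary before operating, so that a multibyte character is never split.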

(h) Terminal Emulator

Conforming implementations shall support terminal emulators that can handle codesets for supported locales.

Conforming implementations should support terminal emulation for all supported locales, but an implementation may provide different terminal emulators for each locale.

(i) Message catalogs

Conforming implementations shall provide the following utilities to convert message catalog source files into message catalogs.

gencat

msgfmt


Conforming implementations with C Language Development Option shall provide the following utilities to create and update message catalog source files.

msgmerge

xgettext



(j) Message Handling

Conforming implementations shall provide the following utility to handle localized messages.

gettext

(3) Implementation Examples

Examples of level 1 implementation

GNU bash

GNU textutils

GNU shellutils

GNU fileutils


Terminal Emulators:

kterm and kon.

jfbterm, supporting CJK, working under frame buffer, output only.

rxvt, supporting CJK, working under X Window System.

Unicon available at:

http://turbolinux.com.cn/TLDN/chinese/project/unicon/

zhcon by Bluepoint Corp.:

http://openunix.org/

cce (Console Terminal) available at:

http://programmer.lib.sjtu.edu.cn/cce/cce.html

XLinux console, supporting 12 languages:

http://www.xlinux.com.tw/


Unicode fonts and tools for X11:

http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html

XFree86 4.0.1 (already includes the above):

http://www.zepler.org/~rwb197/xterm/

(4) Future Direction

In a future version of this specification, shell’s function of handling file-system-safe characters will become mandatory.

5. Programming Languages

(1) Scope

This chapter defines the requirements for various programming languages. Only programming languages with internationalization requirements are listed here. Note that the specifications defined by this chapter shall be provided by conforming implementations if the relevant Software Development Option is supported.

(2) Requirements

Conforming level 2 implementations with Software Development Options shall support the compiler or interpreter for the following languages:

Each programming language shall be internationalized as specified in the following specifications:

Note: See 3. Base Libraries about runtime environment of Perl and Java languages.

(3) Implementation Examples

The following implementation examples are available for these languages:

C: GNU Compiler Collection

http://www.gnu.org/software/gcc/gcc.html

C: Fortran & C Package (Linux)

Fujitsu Kyushu System Engineering Limited (in Japan)

http://www.fqs.co.jp/fort-c/


Fujitsu C/C++ Express (Linux)

Fujitsu America Inc. (in US)

http://www.tools.fujitsu.com/

Perl:

http://www.perl.com/pub/n/Perl_5.6.0_is_out!

Java:

http://java.sun.com/

(4) Future Directions

None

6. Graphical User Interface

6.1 Graphic Libraries

(1) Scope

This chapter defines runtime library interfaces for graphical user interface (GUI). Conforming implementations shall provide the graphical user interface defined by the X Window System Version 11 Release 6 [X11R6].

(2) Requirements

Conforming implementations shall provide the API for the following functions:

setlocale()

XSupportsLocale()

XSetLocaleModifiers()


XCreateFontSet() — not recommended (use XOpenOM()/XCreateOC())

XFreeFontSet()

XFontsOfFontSet()

XBaseFontNameListOfFontSet()

XLocaleOfFontSet()

XContextDependentDrawing()

XExtentsOfFontSet()

XmbTextEscapement()

XwcTextEscapement()

XmbTextExtents()

XwcTextExtents()

XmbTextPerCharExtents()

XwcTextPerCharExtents()

XmbDrawString()

XwcDrawString()

XmbDrawImageString()

XwcDrawImageString()

XmbDrawText()

XwcDrawText()


XOpenOM()

XCloseOM()

XDisplayOfOM()

XLocaleOfOM()

XSetOMValues()

XGetOMValues()

XCreateOC()

XDestroyOC()

XOMOfOC()

XSetOCValues()

XGetOCValues()


XrmInitialize()

XrmLocaleOfDatabase()

XrmParseCommand()

XResourceManagerString()

XScreenResourceString()

XrmGetFileDatabase()

XrmGetStringDatabase()

XrmMergeDatabases()

XrmCombineDatabase()

XrmCombineFileDatabase()

XrmGetDatabase()

XrmSetDatabase()

XrmGetResource()

XrmEnumerateDatabase()

XrmPutResource()

XrmPutStringResource()

XrmPutLineResource()

XrmPutFileDatabase()

XrmDestroyDatabase()

XrmPermStringToQuark()

XrmQGetResource()

XrmQGetSearchList()

XrmQGetSearchResource()

XrmQPutResource()

XrmQPutStringResource()

XrmQuarkToString()

XrmStringToBindingQuarkList()

XrmStringToQuark()

XrmStringToQuarkList()

XrmUniqueQuark()


XmbTextListToTextProperty()

XwcTextListToTextProperty()

XmbTextPropertyToTextList()

XwcTextPropertyToTextList()

XFreeStringList()

XwcFreeStringList()

XmbSetWMProperties()

XSetWMProperties()

XSetWMName()

XSetWMIconName()

XOpenIM()

XCloseIM()

XDisplayOfIM()

XLocaleOfIM()

XSetIMValues()

XGetIMValues()

XCreateIC()

XVaCreateNestedList()

XDestroyIC()

XIMOfIC()

XSetICValues()

XGetICValues()

XSetICFocus()

XUnsetICFocus()

XmbResetIC()

XwcResetIC()

XFilterEvent()

XmbLookupString()

XwcLookupString()

XRegisterIMInstantiateCallback()

XUnregisterIMInstantiateCallback()


Conforming level 2 implementations shall support the languages listed in Annex B. Conforming level 1 implementations need not support languages that require complex text layout (the applicable languages are marked in the table in Annex B).

(3) Implementation Examples

The following implementation example is available for this category.

XFree86 4.0.1:

http://www.xfree86.org/

(4) Future Direction

None

6.2 Graphic Toolkits and X Window Servers

(1) Scope

This chapter defines the requirements for graphic toolkits supported on top of the X Window System and the X Window System servers.

(2) Requirements

There are no requirements on the Graphic Toolkits in terms of internationalization.

There is no requirement on the X Window Servers in terms of internationalization.

(3) Implementation Examples

The following implementation examples are available for this category.

[Graphic Toolkits]

GTK+:

http://www.gtk.org/

Qt:

http://www.troll.no/products/qt.html

[X Window Server which supports outline fonts]

X-TrueType Server (X-TT):

http://X-TT.dsl.gr.jp/index.html

XFree86 4.0.1:

http://www.xfree86.org/

(4) Future Directions

In a future version of this specification, Unicode, BiDi (bidirectional text), and vertical writing will become requirements.

7. Input Methods

(1) Scope

This chapter defines the requirements for text input used by the X Window System and other environments. Such a mechanism is needed to support non-Western languages (for example, Chinese, Japanese and Korean).

(2) Requirements

Conforming implementations shall provide means, i.e., Input Method(s), for users to input the characters specified in Annex B: Supported locales and codesets.

Conforming implementations shall provide X Input Method Server(s) which can connect with Input Method Engines of the supported locales. An Input Method Engine can be implemented as a separate process communicating with an X Input Method Server or can be integrated into the X Input Method Server.

Conforming implementations shall support Input Method Engines for the supported locales that can be connected with the above Input Method Server(s). Conforming implementations shall document which Input Method Engines are supported by the above X Input Method Server(s) and how users can obtain and install the Engines.

The X Input Method Server(s) should have a capability to switch Input Method Engines dynamically, but a conforming implementation may provide multiple Input Method Servers per locale.

Conforming level 1 implementations should provide an X Input Method Server which supports UTF-8 encoding and allows users to input the whole repertoire of [Unicode 3.0].

Conforming level 2 implementations shall provide an X Input Method Server which supports UTF-8 encoding and allows users to input the whole repertoire of [Unicode 3.0].

Note: User-friendly input operation is preferable, but it is acceptable to use non-user-friendly input operation, such as entering hexadecimal code points, to input not-so-frequently-used characters. Also note that the input requirement does not imply that the input characters are displayed correctly.

Conforming implementations may provide X Input Method Server(s) which support locale-specific character repertoires and locale-specific character encodings.

Every application that has an X Window System based GUI and has a capability to accept character input from users should interface with the above X Input Method Server(s).

Conforming implementations should provide means for users to input the characters specified in the supported locale through Console and TTY device interfaces.

(3) Implementation Examples

X Input Method Server (Generic): IIIMF

X Input Method Servers (Japanese): kinput2, and Xwnmo.

X Input Method Servers (Chinese):

Chinput, supporting both GB and Big5

http://turbolinux.com.cn/~justiny/project-chinput.html

xcin, supporting both Big5 and GB

http://xcin.linux.org.tw/

X Input Method Servers (Korean): ami, hanIM and byeoroo

Chinese Console:

supports CJK and Big5 display and input with a platform-independent input server

http://www.redflag-linux.com/news/open.htm

yh-3.1-opensource.tgz

(4) Future Direction

In the next version of this specification, the recommendation of a single X Input Method Server which can switch Input Method Engines dynamically will become a mandatory requirement.

In the next version of this specification, the recommendation for conforming level 1 implementations regarding the X Input Method Server(s) which support UTF-8 encoding will become a mandatory requirement.

8. Output Methods

(1) Scope

This chapter defines the requirements for text output used by the X Window System. Such a mechanism is needed to support languages that require complex text rendering.

(2) Requirements

Conforming implementations shall provide a means, i.e., Output Method(s), for users to output the characters specified in Annex B: Supported locales and codesets.

Conforming implementations shall provide the X Output Method interface defined in chapter 13 of the X11R6 Xlib specification as a display primitive for the X Window System.

Conforming level 1 implementations should provide multibyte and wide character interfaces that cover the following collections of UCS implementation level 1 defined in [ISO 10646-1].

Conforming level 2 implementations shall provide multibyte and wide character interfaces that cover the following collections of UCS implementation level 1 defined in [ISO 10646-1].

Note: [ISO 10646-1] defines blocks of characters for subsetting purposes, called character collections. These collections are used here to indicate the minimum displayable subset.

No.  Collection name                      Code positions

1    BASIC LATIN                          0020-007E
2    LATIN-1 SUPPLEMENT                   00A0-00FF
3    LATIN EXTENDED-A                     0100-017F
4    LATIN EXTENDED-B                     0180-024F
5    IPA EXTENSIONS                       0250-02AF
8    BASIC GREEK                          0370-03CF
9    GREEK SYMBOLS AND COPTIC             03D0-03FF
10   CYRILLIC                             0400-04FF
11   ARMENIAN                             0530-058F
27   BASIC GEORGIAN                       10D0-10FF
30   LATIN EXTENDED ADDITIONAL            1E00-1EFF
31   GREEK EXTENDED                       1F00-1FFF
32   GENERAL PUNCTUATION                  2000-206F (only graphical characters)
33   SUPERSCRIPTS AND SUBSCRIPTS          2070-209F
34   CURRENCY SYMBOLS                     20A0-20CF
36   LETTERLIKE SYMBOLS                   2100-214F
37   NUMBER FORMS                         2150-218F
38   ARROWS                               2190-21FF
39   MATHEMATICAL OPERATORS               2200-22FF
40   MISCELLANEOUS TECHNICAL              2300-23FF
41   CONTROL PICTURES                     2400-243F
42   OPTICAL CHARACTER RECOGNITION        2440-245F
44   BOX DRAWING                          2500-257F
45   BLOCK ELEMENTS                       2580-259F
46   GEOMETRIC SHAPES                     25A0-25FF
47   MISCELLANEOUS SYMBOLS                2600-26FF
49   CJK SYMBOLS AND PUNCTUATION          3000-303F
50   HIRAGANA                             3040-309F
51   KATAKANA                             30A0-30FF
52   BOPOMOFO                             3100-312F
54   CJK MISCELLANEOUS                    3190-319F
55   ENCLOSED CJK LETTERS AND MONTHS      3200-32FF
56   CJK COMPATIBILITY                    3300-33FF
60   CJK UNIFIED IDEOGRAPHS               4E00-9FFF
62   CJK COMPATIBILITY IDEOGRAPHS         F900-FAFF
66   CJK COMPATIBILITY FORMS              FE30-FE4F
69   HALFWIDTH AND FULLWIDTH FORMS        FF00-FFEF
71   HANGUL EXTENDED                      AC00-D7A3
76   YI SYLLABLES                         A000-A48F
77   YI RADICALS                          A490-A4CF
81   CJK UNIFIED IDEOGRAPHS EXTENSION A   3400-4DBF
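
The collections listed above lend themselves to a simple range table. The following C sketch shows one way an implementation might test whether a code point falls inside the minimum displayable subset. It is illustrative only: the type ucs_collection and the function find_collection are hypothetical names, only five of the rows are copied for brevity, and a complete table would also have to exclude the non-graphic characters of collection 32.

```c
#include <stddef.h>

/* One row of the collection table: number, name, first and last position. */
struct ucs_collection {
    int number;
    const char *name;
    unsigned long first;
    unsigned long last;
};

/* A few rows copied from the table; a real table would hold all of them. */
static const struct ucs_collection collections[] = {
    {  1, "BASIC LATIN",            0x0020UL, 0x007EUL },
    {  2, "LATIN-1 SUPPLEMENT",     0x00A0UL, 0x00FFUL },
    { 10, "CYRILLIC",               0x0400UL, 0x04FFUL },
    { 50, "HIRAGANA",               0x3040UL, 0x309FUL },
    { 60, "CJK UNIFIED IDEOGRAPHS", 0x4E00UL, 0x9FFFUL },
};

/* Return the collection containing cp, or NULL if no listed row covers it. */
const struct ucs_collection *find_collection(unsigned long cp)
{
    size_t i;

    for (i = 0; i < sizeof collections / sizeof collections[0]; i++) {
        if (cp >= collections[i].first && cp <= collections[i].last)
            return &collections[i];
    }
    return NULL;
}
```

A linear scan is adequate for a table of this size; a sorted table with binary search would be a natural refinement if the lookup were on a hot path.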


Conforming implementations should provide an X Output Method which supports the encoding schemes listed in Annex B.

Conforming implementations shall provide a terminal emulator on the X Window System that outputs characters in the supported locale.

Conforming implementations should provide a console or tty device interface that outputs characters in the supported locale.

(3) Implementation Examples

X11R6.4 Xlib and IIIMXCF

xterm patches available at:

http://www.zepler.org/~rwb197/xterm/

(4) Future Direction

None

9. Network Servers

(1) Scope

This chapter defines the requirements for various network servers, such as file sharing servers and WWW servers.

The requirements on the following kinds of servers will be discussed in this section.

(2) Requirements

This version of the specification has no requirements for the Network Servers.

(3) Implementation Examples

None

(4) Future Direction

In a future version of this specification, the requirements on the handling of names (e.g., file names, domain names, resource names, and user names) will be specified in this section.

10. Internet Tools

(1) Scope

This chapter defines the requirements for Internet client tools, such as WWW browsers and Mail User Agents (MUAs).

(2) Requirements

Conforming implementations shall make at least one codeset available per locale specified in Annex B.

Each supported codeset should be registered in [IANA-Charset-Registry].

Conforming level 2 implementations of Web browsers and mail user agents shall be able to input and output the whole repertoire of [Unicode 3.0].

Note: Character output is restricted as specified in chapter 8, Output Methods.
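
As a hedged illustration, a client tool might discover the codeset of a locale through the nl_langinfo interface of [XSH5] before labelling or converting text. The helper name codeset_of_locale below is hypothetical, and the sketch deliberately changes the LC_CTYPE category of the calling process as a side effect.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: select locale_name for LC_CTYPE and return the name
 * of its codeset, or NULL if the locale cannot be selected. The returned
 * string belongs to the C library and may be invalidated by later calls to
 * setlocale or nl_langinfo. */
const char *codeset_of_locale(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

The names returned are chosen by the implementation and are not guaranteed to be the preferred names in [IANA-Charset-Registry], so any mapping between the two is left to the application.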

(3) Implementation Examples

The following implementation examples are available for this category.

Mozilla

http://www.mozilla.org/

mutt

http://www.mutt.org/

(4) Future Direction

None

11. Printing

(1) Scope

This chapter defines requirements related to printing, such as APIs, utilities and their behavior.

(2) Requirements

This version of the specification has no requirements for printing.

(3) Implementation Examples

None

(4) Future Direction

In a future version of this specification, requirements from the Printing subgroup of the Li18nux working group will be provided.

Annex A (Normative): Environment Variables



Conforming implementations shall provide the following environment variables that are relevant to the operation of internationalized interfaces or internationalized commands and utilities.

LANG

LC_ALL

LC_COLLATE

LC_CTYPE

LC_MESSAGES

LC_MONETARY

LC_NUMERIC

LC_TIME

NLSPATH

The usage and the semantics of these environment variables shall be the same as the description in “6.2 Internationalisation Variables” in [XBD5].
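
The precedence that [XBD5] gives these variables (LC_ALL first, then the category variable, then LANG, with an empty value treated as unset) can be sketched in C as follows. The function names pick_locale and effective_locale are hypothetical; applications normally call setlocale(LC_ALL, "") and let the C library apply the same rules.

```c
#include <stddef.h>
#include <stdlib.h>

/* An unset (NULL) or empty value does not take part in the selection. */
static int is_set(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Pick the locale name for one category from the three candidate values,
 * highest precedence first. Returns NULL if none is set, in which case the
 * implementation's default locale applies. */
const char *pick_locale(const char *lc_all, const char *lc_category,
                        const char *lang)
{
    if (is_set(lc_all))
        return lc_all;
    if (is_set(lc_category))
        return lc_category;
    if (is_set(lang))
        return lang;
    return NULL;
}

/* Resolve a category variable such as "LC_TIME" from the environment.
 * Each variable is queried and examined before the next getenv call. */
const char *effective_locale(const char *category_var)
{
    const char *value;

    if ((value = pick_locale(getenv("LC_ALL"), NULL, NULL)) != NULL)
        return value;
    if ((value = pick_locale(NULL, getenv(category_var), NULL)) != NULL)
        return value;
    return pick_locale(NULL, NULL, getenv("LANG"));
}
```

With LC_ALL unset, LC_TIME=de_DE and LANG=en_US, effective_locale("LC_TIME") yields de_DE while effective_locale("LC_MESSAGES") yields en_US.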

Annex B (Normative): Supported locales and codesets

Conforming implementations shall be capable of handling the following locales.

C

POSIX
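
One informal way to check that a locale name from this annex can be handled is to ask setlocale to select it. The sketch below assumes a hosted C environment; the function name locale_is_available is hypothetical, and the check only shows that the name is accepted, not that every category is fully populated.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical probe: return 1 if the named locale can be selected for all
 * categories, 0 otherwise. The previous locale is restored afterwards. */
int locale_is_available(const char *name)
{
    const char *current = setlocale(LC_ALL, NULL);
    char *saved = NULL;
    int ok;

    /* Copy the current setting: a later setlocale call may overwrite it. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    ok = setlocale(LC_ALL, name) != NULL;

    if (saved != NULL) {
        setlocale(LC_ALL, saved);
        free(saved);
    }
    return ok;
}
```

The same probe can be applied to the other locale names listed in this annex, for example locale_is_available("ja_JP").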

Conforming implementations shall support the following locales.

Note 1: The language names come from ISO 639.

Note 2: To avoid political discussion, the region/country names used here do not strictly follow ISO 3166-1.

af_ZA

Afrikaans

SOUTH AFRICA

[Support of this locale is level 2]

ar_AE

Arabic

UNITED ARAB EMIRATES

[Output method support is level 2]

ar_BH


BAHRAIN

[Output method support is level 2]

ar_DZ


ALGERIA

[Output method support is level 2]

ar_EG


EGYPT

[Output method support is level 2]

ar_IN


INDIA

[Support of this locale is level 2] [Output method support is level 2]

ar_IQ


IRAQ

[Output method support is level 2]

ar_JO


JORDAN

[Output method support is level 2]

ar_KW


KUWAIT

[Output method support is level 2]

ar_LB


LEBANON

[Output method support is level 2]

ar_LY


LIBYAN ARAB JAMAHIRIYA

[Output method support is level 2]

ar_MA


MOROCCO

[Output method support is level 2]

ar_OM


OMAN

[Output method support is level 2]

ar_QA


QATAR

[Output method support is level 2]

ar_SA


SAUDI ARABIA

[Output method support is level 2]

ar_SD


SUDAN

[Output method support is level 2]

ar_SY


SYRIAN ARAB REPUBLIC

[Output method support is level 2]

ar_TN


TUNISIA

[Output method support is level 2]

ar_YE


YEMEN

[Output method support is level 2]

as_IN

Assamese

INDIA

[Support of this locale is level 2] [Output method support is level 2]

be_BY

Byelorussian

BELARUS


bg_BG

Bulgarian

BULGARIA


bn_IN

Bengali

INDIA

[Support of this locale is level 2] [Output method support is level 2]

ca_ES

Catalan

SPAIN


cs_CZ

Czech

CZECH REPUBLIC


da_DK

Danish

DENMARK


de_AT

German

AUSTRIA


de_BE


BELGIUM

[Support of this locale is level 2]

de_CH


SWITZERLAND


de_DE


GERMANY


de_LU


LUXEMBOURG


el_GR

Greek

GREECE


en_AU

English

AUSTRALIA


en_BE


BELGIUM


en_BW


BOTSWANA

[Support of this locale is level 2]

en_CA


CANADA


en_GB


UNITED KINGDOM


en_HK


HONG KONG

[Support of this locale is level 2]

en_IE


IRELAND


en_IN


INDIA

[Support of this locale is level 2]

en_NZ


NEW ZEALAND


en_PH


PHILIPPINES

[Support of this locale is level 2]

en_SG


SINGAPORE

[Support of this locale is level 2]

en_US


UNITED STATES


en_ZA


SOUTH AFRICA


en_ZW


ZIMBABWE

[Support of this locale is level 2]

es_AR

Spanish

ARGENTINA


es_BO


BOLIVIA


es_CL


CHILE


es_CO


COLOMBIA


es_CR


COSTA RICA


es_DO


DOMINICAN REPUBLIC


es_EC


ECUADOR


es_ES


SPAIN


es_GT


GUATEMALA


es_HN


HONDURAS


es_MX


MEXICO


es_NI


NICARAGUA


es_PA


PANAMA


es_PE


PERU


es_PR


PUERTO RICO


es_PY


PARAGUAY


es_SV


REPUBLIC OF EL SALVADOR


es_UY


URUGUAY


es_VE


VENEZUELA


et_EE

Estonian

ESTONIA


eu_ES

Basque

SPAIN

[Support of this locale is level 2]

fa_IN

Persian

INDIA

[Support of this locale is level 2] [Output method support is level 2]

fa_IR


IRAN, ISLAMIC REPUBLIC OF

[Support of this locale is level 2] [Output method support is level 2]

fi_FI

Finnish

FINLAND


fo_FO

Faroese

FAROE ISLANDS


fr_BE

French

BELGIUM


fr_CA


CANADA


fr_CH


SWITZERLAND


fr_FR


FRANCE


fr_LU


LUXEMBOURG


ga_IE

Irish

IRELAND


gl_ES

Galician

SPAIN

[Support of this locale is level 2]

gu_IN

Gujarati

INDIA

[Support of this locale is level 2] [Output method support is level 2]

gv_GB

Manx Gaelic

UNITED KINGDOM

[Support of this locale is level 2]

he_IL

Hebrew

ISRAEL

[Output method support is level 2]

hi_IN

Hindi

INDIA

[Support of this locale is level 2] [Output method support is level 2]

hr_HR

Croatian

CROATIA


hu_HU

Hungarian

HUNGARY


id_ID

Indonesian

INDONESIA

[Support of this locale is level 2]

is_IS

Icelandic

ICELAND


it_CH

Italian

SWITZERLAND


it_IT


ITALY


ja_JP

Japanese

JAPAN


kl_GL

Greenlandic

GREENLAND


kn_IN

Kannada

INDIA

[Support of this locale is level 2] [Output method support is level 2]

ko_KR

Korean

KOREA, REPUBLIC OF


ks_IN

Kashmiri

INDIA

[Support of this locale is level 2] [Output method support is level 2]

kw_GB

Cornish

UNITED KINGDOM

[Support of this locale is level 2]

lt_LT

Lithuanian

LITHUANIA


lv_LV

Latvian, Lettish

LATVIA


mk_MK

Macedonian

MACEDONIA, THE FORMER YUGOSLAV REPUBLIC OF


ml_IN

Malayalam

INDIA

[Support of this locale is level 2] [Output method support is level 2]

ms_MY

Malay

MALAYSIA

[Support of this locale is level 2]

nl_BE

Dutch

BELGIUM


nl_NL


NETHERLANDS


no_NO

Norwegian

NORWAY


or_IN

Oriya

INDIA

[Support of this locale is level 2] [Output method support is level 2]

pa_IN

Punjabi

INDIA

[Support of this locale is level 2] [Output method support is level 2]

pl_PL

Polish

POLAND


ps_IN

Pashto, Pushto

INDIA

[Support of this locale is level 2] [Output method support is level 2]

pt_BR

Portuguese

BRAZIL