HP OpenVMS Systems Documentation
This chapter describes typical features of international software and the features provided with the HP C Run-Time Library (RTL) that enable you to design and implement international software.
See the Reference Section for more detailed information on the
functions described in this chapter.
10.1 Internationalization Support
The HP C RTL has added capabilities to allow application
developers to create international software. The HP C RTL
obtains information about a language and a culture by reading this
information from locale files.
If you are using these HP C RTL capabilities, you must install a separate kit to provide these files to your system. See the appendix "Installing OpenVMS Internationalization data kit" in the OpenVMS Upgrade and Installation Guide.
On OpenVMS VAX systems, the save set VMSI18N0nn is provided on the same media as the OpenVMS operating system.
On OpenVMS Alpha systems, the save set is provided on the Layered Product CD and is named VMSI18N0nn or ALPVMSI18N0n_07nn.
To install this save set, follow the standard OpenVMS installation procedures using this save-set name as the name of the kit. There are several categories of locales that you can select to install. You can select as many locales as you need by answering the following prompts:
* Do you want European and US support? [YES]?
* Do you want Chinese GB18030 support (locale and Unicode converters) [YES]?
* Do you want Chinese support? [YES]?
* Do you want Japanese support? [YES]?
* Do you want Korean support? [YES]?
* Do you want Thai support? [YES]?
* Do you want the Unicode converters? [YES]?
This kit also has an Installation Verification Procedure that we
recommend you run to verify the correct installation of the kit.
10.1.2 Unicode Support
In OpenVMS Version 7.2, the HP C Run-Time Library added the Universal Unicode locale, which is distributed with the OpenVMS system, not with the VMSI18N0nn kit. The name of the Unicode locale is UTF8-20.
Like those locales shipped with the VMSI18N0nn kit, the Unicode locale is located at the standard location referred to by the SYS$I18N_LOCALE logical name.
The UTF8-20 Unicode locale is based on Version 2.0 of the Unicode standard. The locale uses UCS-4 as its wide-character encoding and UTF-8 as its multibyte character encoding.
The HP C RTL also includes converters that perform conversions between Unicode and any other supported character set. The expanded set of converters includes converters for the UCS-2, UCS-4, and UTF-8 Unicode encodings. The Unicode converters can be used by the ICONV CONVERT utility and by the iconv family of functions in the HP C Run-Time Library.
In OpenVMS Version 7.2, the HP C Run-Time Library added
Unicode character set converters for Microsoft Code Page 437.
10.2 Features of International Software
International software is software that can support multiple languages and cultures. An international program should be able to:
To meet the previous requirements, an application should not make any assumptions about the language, local customs, or the coded character set used. All this localization data should be defined separately from the program, and only bound to it at run time.
The rest of this chapter describes how you can create international
software using HP C.
10.3 Developing International Software Using HP C
The HP C environment provides the following facilities to create international software:
A locale consists of different categories, each of which determines one aspect of the international environment. Table 10-1 lists the categories in a locale and describes the information in each.
|Category|Description|
|LC_COLLATE|Contains information about collating sequences.|
|LC_CTYPE|Contains information about character classification.|
|LC_MESSAGES|Defines the answers that are expected in response to yes/no prompts.|
|LC_MONETARY|Contains monetary formatting information.|
|LC_NUMERIC|Contains information about formatting numbers.|
|LC_TIME|Contains time and date information.|
The locales provided reside in the directory defined by the SYS$I18N_LOCALE logical name. The file-naming convention for locales is:
An application sets up its international environment at run time by calling the setlocale function. The international environment is set up in one of two ways:
The syntax for the setlocale function is:
char *setlocale(int category, const char *locale)
If an application does not call the setlocale function, the default locale is the C locale. This allows such applications to call those functions that use information in the current locale.
If the setlocale function is called with "" as the locale argument, the function checks for a number of logical names to determine the locale name for the category specified.
In addition to the logical names defined by a user, there are a number of systemwide logical names, set up during system startup, that define the default international environment for all users on a system:
The setlocale function checks for user-defined logical names first, and
if these are not defined, it checks the system logical names.
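As a minimal sketch of the run-time setup described above, the following function asks setlocale to determine the locale from the user's environment and falls back to the C locale if that fails. The helper name init_locale is hypothetical and not part of the HP C RTL.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: initialize the international environment.
 * Returns the name of the locale now in effect for LC_ALL. */
const char *init_locale(void)
{
    /* "" asks setlocale to work out the locale name from the
     * environment (on OpenVMS, from the logical names). */
    const char *name = setlocale(LC_ALL, "");

    if (name == NULL) {
        /* The requested locale could not be loaded, so select the
         * C locale explicitly. */
        fprintf(stderr, "locale not available, using the C locale\n");
        name = setlocale(LC_ALL, "C");
    }
    return name;
}
```

An application would typically call init_locale once, early in main, before calling any function that depends on the current locale.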
10.6 Using Message Catalogs
An important requirement for international software is that it should be able to communicate with the user in the user's own language. The messaging system enables program messages to be created separately from the program source, and linked to the program at run time.
Messages are defined in a message text source file, and compiled into a message catalog using the GENCAT command. The message catalog is accessed by a program using the functions provided in the HP C RTL.
The functions provided to access the messages in a catalog are:
For information on generating message catalogs, see the GENCAT command
description in the OpenVMS system documentation.
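The following sketch shows one way a program might retrieve a message at run time, assuming the X/Open catalog functions catopen, catgets, and catclose declared in <nl_types.h>. The helper name fetch_message and its parameters are hypothetical. The message is copied into a caller-supplied buffer because the pointer returned by catgets may not remain valid after the catalog is closed.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical helper: copy message msg_id from set set_id of the
 * named catalog into buf. If the catalog cannot be opened or the
 * message is not found, the fallback text is copied instead. */
const char *fetch_message(const char *catalog, int set_id, int msg_id,
                          const char *fallback, char *buf, size_t buflen)
{
    nl_catd catd = catopen(catalog, 0);

    if (catd == (nl_catd) -1) {
        /* No catalog available: use the built-in default text. */
        snprintf(buf, buflen, "%s", fallback);
        return buf;
    }

    /* catgets returns the fallback string if the message is missing. */
    snprintf(buf, buflen, "%s", catgets(catd, set_id, msg_id, fallback));
    catclose(catd);
    return buf;
}
```

Keeping a default string in the source means the program still produces readable output on a system where no catalog has been installed.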
10.7 Handling Different Character Sets
The HP C RTL supports a number of state-independent codesets and codeset encoding schemes that contain the ASCII encoded Portable Character Set. It does not support state-dependent codesets. The codesets supported are:
The characters in a codeset are defined in a charmap file. The charmap
files supplied by HP are located in the directory defined by the
SYS$I18N_LOCALE logical name. The file type for a charmap file is .CMAP.
10.7.2 Converter Functions
The file-naming convention for codeset converters is:
Where fromcode is the name of the source codeset, and tocode is the name of the codeset to which characters are converted.
You can add codeset converters to a given system by installing the converter files in the directory pointed to by the logical name SYS$I18N_ICONV.
Codeset converter files can be implemented either as table-based conversion files or as algorithm-based converter files created as OpenVMS shareable images.
The following summarizes the necessary steps to create a table-based codeset converter file:
To create an algorithm-based codeset converter file implemented as a shareable image, follow these steps:
By default, SYS$I18N_ICONV is a search list in which the first directory, SYS$SYSROOT:[SYS$I18N.ICONV.USER], is meant for use as a site-specific repository for iconv codeset converters.
The number of codesets and locales installed varies from system to
system. Check the SYS$I18N directory tree for the codesets, converters,
and locales installed on your system.
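As an illustration of how a program might call a codeset converter, the following sketch uses the iconv_open, iconv, and iconv_close functions. The helper name convert_text is hypothetical, and the codeset names passed to it must match converters actually installed on the system. The prototype of iconv differs between platforms in whether the input pointer is const-qualified, so the sketch passes it through a void pointer.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert inlen bytes of `in`, encoded in
 * fromcode, to tocode. Returns the number of bytes written to `out`,
 * or (size_t) -1 if no converter exists or the conversion fails. */
size_t convert_text(const char *tocode, const char *fromcode,
                    const char *in, size_t inlen,
                    char *out, size_t outlen)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    const char *inp = in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outlen;
    size_t rc;

    if (cd == (iconv_t) -1)
        return (size_t) -1;     /* no converter for this pair */

    /* The void pointer sidesteps the char ** versus const char **
     * difference between iconv prototypes. */
    rc = iconv(cd, (void *) &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    return (rc == (size_t) -1) ? rc : outlen - outleft;
}
```

A caller might convert a Latin-1 string to UTF-8 by passing the corresponding codeset names for that system; the exact spelling of those names is an assumption that should be checked against the installed converter files.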
10.8 Handling Culture-Specific Information
Each locale contains the following cultural information:
You can extract some of this cultural information using the
nl_langinfo function and the localeconv function. See Section 10.8.1.
10.8.1 Extracting Cultural Information From a Locale
The nl_langinfo function returns a pointer to a string that contains an item of information obtained from the program's current locale. The information you can extract from the locale is:
The localeconv function returns a pointer to a data structure that
contains numeric formatting and monetary formatting data from the
LC_NUMERIC and LC_MONETARY categories.
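The following sketch shows how cultural information might be read from the current locale, assuming the nl_langinfo function with the CODESET item from <langinfo.h> and the localeconv function from <locale.h>. The helper names are hypothetical.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: name of the codeset used by the current locale. */
const char *current_codeset(void)
{
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: local currency symbol from LC_MONETARY.
 * In the C locale this is an empty string. */
const char *currency_symbol(void)
{
    const struct lconv *lc = localeconv();

    return lc->currency_symbol;
}
```

Both functions return pointers to storage that later calls or a change of locale can overwrite, so a program that needs the values for longer should copy them.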
10.8.2 Date and Time Formatting Functions
The functions that use the date and time information are:
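As a minimal sketch of locale-driven date formatting, the following assumes the standard strftime function. The helper name format_local_date is hypothetical. The %A and %x conversions take the weekday name and the date representation from the LC_TIME category of the current locale.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format the weekday name and the date in the
 * style of the current locale. Returns the number of characters
 * written, or 0 if the result does not fit in buf. */
size_t format_local_date(char *buf, size_t buflen, const struct tm *t)
{
    return strftime(buf, buflen, "%A %x", t);
}
```

Because the layout comes from the locale rather than from the program, the same call produces differently ordered and differently named dates once setlocale has selected another locale.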
10.8.3 Monetary Formatting Function
The strfmon function uses the monetary information in a locale to
convert a number of values into a string. The format of the string is
controlled by a format string.
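A minimal sketch of locale-driven monetary formatting with the strfmon function from <monetary.h> follows. The helper name format_amount is hypothetical. The %n conversion selects the national currency format of the current locale.

```c
#include <monetary.h>
#include <stddef.h>

/* Hypothetical helper: format amount using the national currency
 * format of the current locale. Returns the number of bytes written,
 * or -1 if the result does not fit in buf. */
long format_amount(char *buf, size_t buflen, double amount)
{
    return (long) strfmon(buf, buflen, "%n", amount);
}
```

The currency symbol, grouping, and number of fraction digits all come from the LC_MONETARY category, so the output depends on the locale in effect when the function is called.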
10.8.4 Numeric Formatting
The information in LC_NUMERIC is used by various functions. For example,
strtod, wcstod, and the print and scan functions determine the radix
character from the LC_NUMERIC category.
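To illustrate the effect of the radix character, the following sketch parses a number with the standard strtod function and reports whether the whole string was consumed. The helper name parse_number is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: parse text as a floating-point number using
 * the radix character of the current locale. Returns 1 if the whole
 * string was a valid number, 0 otherwise. */
int parse_number(const char *text, double *value)
{
    char *end;

    *value = strtod(text, &end);

    /* Success only if something was parsed and nothing is left over. */
    return end != text && *end == '\0';
}
```

In the C locale the radix character is a period, so "3.5" is accepted and "3,5" is rejected. In a locale whose LC_NUMERIC category defines a comma as the radix character, the opposite would be expected.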
10.9 Functions for Handling Wide Characters
A character can be represented by single-byte or multibyte values depending on the codeset. To make it easier to handle both single-byte and multibyte characters in the same way, the HP C RTL defines a wide-character data type, wchar_t. This data type can store characters that are represented by 1-, 2-, 3-, or 4-byte values.
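As a minimal sketch of working with wchar_t, the following converts a multibyte string to wide characters with the standard mbstowcs function, so that each character occupies one array element regardless of how many bytes it used in the source string. The helper name to_wide is hypothetical.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert the multibyte string mbs to wide
 * characters in dst, which has room for dstlen elements (dstlen must
 * be at least 1). Returns the number of wide characters stored, not
 * counting the terminator, or (size_t) -1 if mbs contains an invalid
 * multibyte sequence for the current locale. */
size_t to_wide(wchar_t *dst, size_t dstlen, const char *mbs)
{
    /* Leave room for the terminator; mbstowcs does not add one when
     * it fills the whole buffer. */
    size_t n = mbstowcs(dst, mbs, dstlen - 1);

    if (n == (size_t) -1)
        return n;

    dst[n] = L'\0';
    return n;
}
```

How the bytes of mbs are interpreted depends on the LC_CTYPE category of the current locale, so the locale should be set before the conversion is made.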
The functions provided to support wide characters are: