The localization functions and the header file <locale.h>
are used to
tailor significant portions of the C run-time environment to
national conventions for punctuation of
decimal numbers, currency,
and so on. This tailoring provides the capability for writing programs
that are portable across many countries. For example, some countries
separate the whole and fractional parts of a number with a comma
instead of a period. Alphabetization, currency notation, dates, and
times are all expressed differently in different countries. Each set
of definitions for culture-dependent issues is called a locale,
and each locale has a name that is a null-terminated string.
This chapter introduces fundamental concepts of localization and the basic
structure used in localization functions, and discusses the three
nonstandard locales supplied by the SAS/C Library. Descriptions of the
standard localization functions (including strcoll
and strxfrm
)
follow.
For details on how to create your own locales, see User-Added Locales .
"S370"
"POSIX"
"C"
exec
linkage, this is the same as the
"POSIX"
locale. For other programs, it is the same as the
"S370"
locale.
""
""
locale interpretation is controlled by several environment
variable settings. See the setlocale
function description for more
information.
A locale is divided into categories, which define different parts
of the whole locale. You can change one category without having to
change the entire locale. For example, you can change the way currency
is displayed without changing the expression of dates and time. The
categories of locale defined in <locale.h>
are as follows:
LC_ALL
LC_COLLATE
strcoll
and
strxfrm
work.
LC_CTYPE
isgraph
) work.
LC_CTYPE
also affects the multibyte functions (such as mblen
and wcstombs
) as well as the treatment of multibyte characters by
the formatted I/O functions (such as printf
and sprintf
) and
the string functions.
isdigit
and isxdigit
are not affected by LC_CTYPE
.
LC_MONETARY
LC_NUMERIC
LC_TIME
strftime
formats time values.
This category does not
affect the behavior of asctime
.
<locale.h>
defines the structure struct lconv
. The standard
members of this structure are explained in the following list. The
default "C"
locale values, as defined by the ANSI and ISO
C standards, are in
parentheses after the descriptions. A default value of CHAR_MAX
indicates the information is not available.
char *decimal_point
"."
)
char *thousands_sep
""
)
char *grouping
""
)
char *int_curr_symbol
""
)
char *currency_symbol
""
)
char *mon_decimal_point
""
)
char *mon_thousands_sep
""
)
char *mon_grouping
""
)
char *positive_sign
""
)
char *negative_sign
""
)
char int_frac_digits
CHAR_MAX
)
char frac_digits
CHAR_MAX
)
char p_cs_precedes
is set to 1 if the currency_symbol comes before a nonnegative monetary value; it is set to 0 if the currency_symbol comes after the value. (CHAR_MAX)
char p_sep_by_space
is set to 1 if the currency_symbol and a nonnegative monetary value are separated by a space; otherwise, it is set to 0. (CHAR_MAX)
char n_cs_precedes
is set to 1 if the currency_symbol comes before a negative monetary value; it is set to 0 if the currency_symbol comes after the value. (CHAR_MAX)
char n_sep_by_space
is set to 1 if the currency_symbol and a negative monetary value are separated by a space; otherwise, it is set to 0. (CHAR_MAX)
char p_sign_posn
positive_sign
for
a nonnegative monetary value. (CHAR_MAX
)
char n_sign_posn
negative_sign
for
a negative monetary value. (CHAR_MAX
)
grouping
and mon_grouping
are defined by the
following values:
CHAR_MAX
CHAR_MAX
is defined in
<limits.h>
as 255
.
0
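As a rough illustration of how a grouping string can be walked, the sketch below inserts separators into a string of digits. It assumes the ISO C reading of the elements: a positive element is the size of the current group, the terminating 0 means the last size repeats, and CHAR_MAX means no further grouping. The name group_digits and its parameters are hypothetical and are not part of the library.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy digits into out, inserting sep between the   */
/* groups described by grouping. Returns 0 on success, -1 if the result   */
/* does not fit.                                                          */
int group_digits(const char *digits, const char *grouping,
                 const char *sep, char *out, size_t out_size)
{
    char tmp[128];                       /* result is built right to left */
    size_t len = strlen(digits), sep_len = strlen(sep);
    size_t n = 0, i, k;
    int size = (*grouping == CHAR_MAX) ? 0 : *grouping;  /* 0: no grouping */
    int in_group = 0;                    /* digits in the current group   */

    for (i = len; i > 0; i--) {
        if (size > 0 && in_group == size) {
            /* Current group is full: emit the separator (reversed). */
            if (n + sep_len >= sizeof tmp)
                return -1;
            for (k = sep_len; k > 0; k--)
                tmp[n++] = sep[k - 1];
            in_group = 0;
            /* Move to the next element, or repeat the last one at the 0. */
            if (grouping[1] != '\0') {
                grouping++;
                size = (*grouping == CHAR_MAX) ? 0 : *grouping;
            }
        }
        if (n + 1 >= sizeof tmp)
            return -1;
        tmp[n++] = digits[i - 1];
        in_group++;
    }
    if (n + 1 > out_size)
        return -1;
    for (k = 0; k < n; k++)              /* reverse into the output buffer */
        out[k] = tmp[n - 1 - k];
    out[n] = '\0';
    return 0;
}
```

With a grouping of "\3" and a separator of ",", the digits 12345678 come out as 12,345,678; an empty grouping string leaves the digits unchanged.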
p_sign_posn
and n_sign_posn
is defined by the
following:
0
currency_symbol
.
1
currency_symbol
.
2
currency_symbol
.
3
currency_symbol
.
4
currency_symbol
.
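The following sketch shows one way the sign position and currency symbol members might be combined when formatting an amount. It assumes the ISO C meanings of the sign position values (0: parentheses surround the value and symbol; 1: sign before both; 2: sign after both; 3: sign immediately before the symbol; 4: sign immediately after the symbol) and treats the separation member as a simple 0 or 1. The function format_money and all of its parameter names are hypothetical, and snprintf is a C99 function that may not be available with older compilers.

```c
#include <stdio.h>

/* Hypothetical helper: value is an already formatted number such as      */
/* "1,234.56". Returns whatever snprintf returns for the final string.    */
int format_money(char *out, size_t out_size,
                 const char *value, const char *symbol, const char *sign,
                 int cs_precedes, int sep_by_space, int sign_posn)
{
    char sym[32];                        /* symbol, possibly with sign    */
    char core[96];                       /* symbol and value together     */
    const char *gap = sep_by_space ? " " : "";

    /* Positions 3 and 4 attach the sign directly to the currency symbol. */
    if (sign_posn == 3)
        snprintf(sym, sizeof sym, "%s%s", sign, symbol);
    else if (sign_posn == 4)
        snprintf(sym, sizeof sym, "%s%s", symbol, sign);
    else
        snprintf(sym, sizeof sym, "%s", symbol);

    /* Place the symbol before or after the value. */
    if (cs_precedes)
        snprintf(core, sizeof core, "%s%s%s", sym, gap, value);
    else
        snprintf(core, sizeof core, "%s%s%s", value, gap, sym);

    /* Positions 0, 1, and 2 apply to the value and symbol as a unit. */
    switch (sign_posn) {
    case 0:  return snprintf(out, out_size, "(%s)", core);
    case 1:  return snprintf(out, out_size, "%s%s", sign, core);
    case 2:  return snprintf(out, out_size, "%s%s", core, sign);
    default: return snprintf(out, out_size, "%s", core);
    }
}
```

In real use the arguments would come from the structure returned by localeconv, for example n_cs_precedes, n_sep_by_space, and n_sign_posn for a negative amount.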
"S370"
locale defines its categories as follows:
LC_COLLATE
strxfrm
to behave like strncpy
, except that it
returns the number of characters copied, not a pointer to a string.
LC_CTYPE
LC_MONETARY
LC_NUMERIC
LC_TIME
strftime
to return the same values for days of the week,
dates, and so on, as asctime
.
"POSIX"
locale defines its categories as follows:
LC_COLLATE
LC_CTYPE
LC_MONETARY
LC_NUMERIC
LC_TIME
strftime
to return the same values for days of the week,
dates, and so on, as asctime
.
"S370"
locales, the run-time library supplies
three other locales in source form or load form or both. The load
module names can be found in L$CLxxxx where xxxx is the name
of the locale. See your SAS Software Representative for SAS/C software
products for the location of the source code.
"DBCS"
LC_CTYPE
or LC_ALL
.
For category LC_COLLATE
, this locale enables strxfrm
and
strcoll
to transform and collate mixed double-byte character
strings. In all other aspects, the "DBCS"
locale is identical to the
"S370"
locale.
"SAMP"
LC_CTYPE
"S370"
locale.
LC_COLLATE
"S370"
locale.
LC_NUMERIC
LC_MONETARY
LC_TIME
"C"
locale conventions with a modified sample date routine used
with strftime
%x
format.
"DBEX"
LC_CTYPE
ctype
table pointer is NULL
.
LC_COLLATE
strxfrm
and strcoll
functions that use the table.
LC_NUMERIC
"S370"
locale is used.)
LC_MONETARY
"S370"
locale is used.)
LC_TIME
"S370"
locale is used.)
#include <locale.h>
struct lconv *localeconv(void);
localeconv
sets the elements of an object of type struct lconv
to
the appropriate values for the formatting of numeric objects, both monetary and
nonmonetary, with respect to the current locale.
localeconv
returns a pointer to a filled-in object of type
struct lconv
. The char *
members of this structure may point
to ""
, indicating that the value is not available or is of 0
length. The
char
members are nonnegative numbers and can be equal to CHAR_MAX
,
indicating the value is not available. See The Locale Structure lconv
for a discussion of each structure member returned by
localeconv
.
localeconv
overwrites the previous value of the
structure; if you need to reuse the previous value, be sure to save
it. The following code saves the value of the structure:
struct lconv save_lconv;
save_lconv = *(localeconv());
Also, calls to
setlocale
with categories LC_ALL
, LC_MONETARY
,
or LC_NUMERIC
may overwrite the contents of the structure returned by
localeconv
.
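Note that a structure assignment copies the char * members as pointers only. If the strings they point to might also change after a later setlocale call, a deeper copy is one option. The sketch below copies the three nonmonetary strings into caller-owned storage; the struct num_conv type, the function save_numeric_conv, and the 8-byte field sizes are assumptions made for illustration.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical caller-owned snapshot of the nonmonetary conventions. */
struct num_conv {
    char decimal_point[8];
    char thousands_sep[8];
    char grouping[8];
};

/* Copies the current values; returns 0 on success, -1 if a string is */
/* too long for the fixed-size fields above.                          */
int save_numeric_conv(struct num_conv *dst)
{
    const struct lconv *lc = localeconv();

    if (strlen(lc->decimal_point) >= sizeof dst->decimal_point ||
        strlen(lc->thousands_sep) >= sizeof dst->thousands_sep ||
        strlen(lc->grouping) >= sizeof dst->grouping)
        return -1;

    strcpy(dst->decimal_point, lc->decimal_point);
    strcpy(dst->thousands_sep, lc->thousands_sep);
    strcpy(dst->grouping, lc->grouping);
    return 0;
}
```

The snapshot stays valid across later calls to localeconv or setlocale because it no longer refers to storage owned by the library.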
frac_digits
indicates
how many digits are to the right of the decimal point character. If
frac_digits
is negative, the number is padded on the right, before
the decimal point character, with the same number of 0s as the absolute
value of frac_digits
. It is assumed that the digit grouping
string has at most only one element.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <locale.h>

size_t dec_fmt(char *buf, struct lconv *lc, int amt, int frac_digits);

int main(void)
{
   struct lconv *lc;                /* pointer to locale values */
   char buf[256];

   setlocale(LC_NUMERIC, "SAMP");   /* Use "SAMP" locale. */

      /* Call localeconv to obtain locale conventions. */
   lc = localeconv();

   dec_fmt(buf, lc, 12345678, 0);
   printf("The number is: 12,345,678. == %s\n", buf);

   dec_fmt(buf, lc, 12345678, 2);
   printf("The number is: 123,456.78 == %s\n", buf);

   dec_fmt(buf, lc, -12345678, 4);
   printf("The number is: -1,234.5678 == %s\n", buf);

   dec_fmt(buf, lc, 12345678, -5);
   printf("The number is: 1,234,567,800,000. == %s\n", buf);

   dec_fmt(buf, lc, 12345678, 10);
   printf("The number is: 0.0012345678 == %s\n", buf);

   return 0;
}

size_t dec_fmt(char *buf, struct lconv *lc, int amt, int frac_digits)
{
   char numstr[128];                /* number string */
   char *ns_ptr,                    /* number string pointer */
        *buf_start;                 /* output buffer start pointer */
   int ngrp,                        /* number of digits per group */
       ntocpy,                      /* digits to copy */
       non_frac;                    /* number of nonfractional digits */
   size_t ns_len;                   /* number string length */

   if (abs(frac_digits) > 100)      /* Return error if too big */
      return 0;

   buf_start = buf;
   sprintf(numstr, "%+-d", amt);    /* Get amount as number string */
   ns_ptr = numstr;                 /* Point to number string */
   ns_len = strlen(ns_ptr);         /* Get number string length */

   if (frac_digits < 0) {           /* zero pad left of decimal point */
      memset(ns_ptr + ns_len, '0', (size_t) -frac_digits);
      *(ns_ptr + ns_len - frac_digits) = '\0';
      ns_len -= frac_digits;        /* Add extra digit length. */
      frac_digits = 0;
   }

      /* zero pad right of decimal point */
   if ((non_frac = (int) ns_len - frac_digits - 1) < 0) {
         /* The field width for '*' must be an int. */
      sprintf(numstr, "%+0*d", (int) ns_len - non_frac + 1, amt);
      non_frac = 1;                 /* e.g., 0.000012345678 */
   }

   if (amt < 0)
      *buf++ = *ns_ptr;             /* Insert sign in buffer. */
   ns_ptr++;                        /* Skip +/- in number string. */

      /* Convert grouping to int. */
   if (!(ngrp = (int) *(lc->grouping)))
      ngrp = non_frac;              /* Use non_frac len if none. */

      /* Get number of digits to copy for first group. */
   if (!(ntocpy = non_frac % ngrp))
      ntocpy = ngrp;

   while (non_frac > 0) {           /* Separate groups of digits. */
      memcpy(buf, ns_ptr, ntocpy);  /* Copy digits. */
      ns_ptr += ntocpy;             /* Advance pointers. */
      buf += ntocpy;
      if (non_frac -= ntocpy) {     /* Insert separator and set */
         *buf++ = *(lc->thousands_sep); /* number of digits for other */
         ntocpy = ngrp;             /* groups. */
      }
   }

   *buf++ = *(lc->decimal_point);   /* Insert decimal point. */
   if (frac_digits > 0)             /* Copy fraction + '\0' */
      memcpy(buf, ns_ptr, frac_digits + 1);
   else
      *buf = '\0';                  /* Else just null-terminate. */

   return strlen(buf_start);        /* Return converted length. */
}
#include <locale.h>
char *setlocale(int category, const char *locale);
setlocale
selects the current locale or a portion thereof, as
specified by the category
and locale
arguments.
Here are the valid categories:
LC_ALL
LC_COLLATE
strcoll
and strxfrm
.
LC_CTYPE
LC_MONETARY
localeconv
.
LC_NUMERIC
localeconv
.
LC_TIME
strftime
.
locale
points to a locale_string
that has one of the
following formats:
xxxx
Here is an example:
setlocale(LC_ALL, "C");
xxxx
; category=yyyy
<;
category=zzzz
...>
Here is an example:
setlocale(LC_MONETARY, "C;LC_TIME=SAMP");
category=yyyy
<; category=zzzz
...>
Here is an example:
setlocale(LC_MONETARY, "LC_TIME=SAMP");
""
(null string)
NULL
(pointer)
Here is an example:
setlocale(LC_ALL, NULL);
In the locale_string
formats, xxxx
,
yyyy
, and zzzz
have the following meanings:
xxxx
specifies the name for the requested category.
yyyy
and zzzz
are overrides for mixed
categories.
x
, y
, and z
can
be uppercase alphabetic (A through Z),
numeric (0 through 9), $, @, or #.
LC_ALL
category is requested or the specific
category and the override match. For example, this statement sets only
the LC_MONETARY
category; the LC_TIME
override is ignored.
setlocale(LC_MONETARY,"C;LC_TIME=SAMP");
When
NULL
is specified, setlocale
returns the current
locale string in the same format as described earlier.
If the locale string is specified as ""
, the library consults a
number of environment variables to determine the appropriate settings.
If none of the environment variables are defined, the "C"
locale is
used.
The locale ""
is resolved in the following steps:
LC_ALL
is defined and not NULL, the
value of LC_ALL
is used as the locale.
LC_ALL
was specified, and there is an
environment variable with the same name as the category, the value of
that variable is used (if not NULL). For instance, if the category
was LC_COLLATE
, and the environment variable LC_COLLATE
is "DBEX"
,
then the LC_COLLATE
component of the "DBEX"
locale is used. Note that
if the category is LC_ALL
,
the effect is the same as if each other
category were set independently.
LANG
is defined and not NULL,
the value
of LANG
is used as the locale.
_LOCALE
is defined and not NULL, the
value of _LOCALE
is used as the locale.
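The resolution order above can be sketched as a simple "first value that is set wins" chain. This is only an illustration of the order, not the library's implementation: pick_locale and default_locale_for are hypothetical names, and treating an empty string the same as an undefined variable is an assumption. For the LC_ALL category, the sketch would be applied once per individual category.

```c
#include <stddef.h>
#include <stdlib.h>

/* A variable counts as set only if it exists and is not empty */
/* (the empty-string rule is an assumption of this sketch).    */
static int is_set(const char *v)
{
    return v != NULL && v[0] != '\0';
}

/* Hypothetical helper: choose among the candidate values in the */
/* documented order, falling back to the "C" locale.             */
const char *pick_locale(const char *lc_all, const char *category_value,
                        const char *lang, const char *locale_var)
{
    if (is_set(lc_all))         return lc_all;          /* LC_ALL       */
    if (is_set(category_value)) return category_value;  /* e.g. LC_TIME */
    if (is_set(lang))           return lang;            /* LANG         */
    if (is_set(locale_var))     return locale_var;      /* _LOCALE      */
    return "C";
}

/* Convenience wrapper that reads the real environment. */
const char *default_locale_for(const char *category_name)
{
    return pick_locale(getenv("LC_ALL"), getenv(category_name),
                       getenv("LANG"), getenv("_LOCALE"));
}
```

For example, with LC_COLLATE set to "DBEX" and LANG set to "SAMP", a request for the LC_COLLATE category resolves to "DBEX", while a request for LC_TIME resolves to "SAMP".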
setlocale
returns the string associated with the specified
category
value. If the value for locale
is not valid or
an error occurs when loading the requested locale, setlocale
returns
NULL
and the locale is not affected.
If locale
is specified as NULL
, setlocale
returns the
string associated with the category
for the current locale and the
locale is not affected.
When issuing setlocale
for a mixed locale
, if a category
override fails (for example, if the locale load module cannot be
found), then for that category the LC_ALL
category is used if it
is being set at the same time. Otherwise, a NULL
return is issued
and the category for which the locale override was requested remains
unchanged. The returned string indicates the actual mixed locale in effect.
setlocale
overwrites the previous value, so you
must save the current value before calling setlocale
again, if you
need the value later. This involves determining the string length for
the locale name, allocating storage space for it, and copying it. The following
code does all three:
char *name;
name = setlocale(LC_ALL, NULL);
name = strsave(name);
When you want to revert to the locale saved in
name
, use the
following code:
setlocale(LC_ALL, name);
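Where strsave is not available, the same three steps (measure, allocate, copy) can be written with standard functions. The helper names save_locale and restore_locale below are hypothetical; this is a minimal sketch rather than a library facility.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'ed copy of the current locale string for the */
/* category, or NULL on failure. The caller owns the copy.       */
char *save_locale(int category)
{
    const char *cur = setlocale(category, NULL);   /* query only */
    char *copy;

    if (cur == NULL)
        return NULL;
    copy = malloc(strlen(cur) + 1);
    if (copy != NULL)
        strcpy(copy, cur);
    return copy;
}

/* Reinstates a saved locale string and frees it.      */
/* Returns 0 on success, -1 if the locale was not set. */
int restore_locale(int category, char *saved)
{
    int ok = (saved != NULL && setlocale(category, saved) != NULL);

    free(saved);
    return ok ? 0 : -1;
}
```

A typical pattern is to call save_locale(LC_ALL) before a temporary setlocale change and restore_locale(LC_ALL, saved) afterward.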
"S370"
locale,
which is built-in, locale information
is kept in a load module created by compiling and linking
"S370"
source
code for the locale's localization data and routines. The
load module name is created by appending up to the first four uppercased
characters of the locale's name to the "L$CL"
prefix. For
example, if the locale name is MYLOCALE
, the load module name is
L$CLMYLO
.
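The naming rule can be sketched in a few lines. The function locale_module_name is a hypothetical helper written only to illustrate the rule; it is not a library routine.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Builds the load module name for a locale: the "L$CL" prefix followed */
/* by up to the first four characters of the name, uppercased.          */
/* out must have room for at least 9 characters.                        */
void locale_module_name(const char *locale_name, char out[9])
{
    size_t i;

    strcpy(out, "L$CL");
    for (i = 0; i < 4 && locale_name[i] != '\0'; i++)
        out[4 + i] = (char) toupper((unsigned char) locale_name[i]);
    out[4 + i] = '\0';
}
```

Because only the first four characters are used, two locale names that share a four-character prefix would map to the same load module name.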
The locale load module is loaded when needed (with loadm
) during
setlocale
processing. For the exact details and requirements on
specifying localization data and routines, see
User-Added Locales.
Also see loadm for load
module library search and location requirements.
Mixed locale strings can be specified. In this case, all requested
locales are loaded by a single call to setlocale
.
#include <locale.h>

int main(void)
{
   char *name;

      /* Set all portions of the current locale to        */
      /* the values defined by the built-in "C" locale.   */
   setlocale(LC_ALL, "C");

      /* Set the LC_COLLATE category to the value         */
      /* defined by the "DBCS" locale.                    */
   setlocale(LC_COLLATE, "DBCS");

      /* Set the LC_CTYPE category to the value           */
      /* defined by the "DBCS" locale.                    */
   setlocale(LC_CTYPE, "DBCS");

      /* Set the LC_NUMERIC category to the value         */
      /* defined by the "SAMP" locale.                    */
   setlocale(LC_NUMERIC, "SAMP");

      /* Return mixed locale string. */
   name = setlocale(LC_ALL, NULL);

   return 0;
}
The following string is pointed to by
name:
"C@;LC_COLLATE=DBCS;LC_CTYPE=DBCS;LC_NUMERIC=SAMP"
This string is interpreted as if the "C" locale (LC_ALL) is in effect except for LC_COLLATE and LC_CTYPE, which have the "DBCS" locale in effect, and LC_NUMERIC, which is using "SAMP".
The same results can be accomplished with a single call to
setlocale
:
#include <locale.h>

int main(void)
{
   char *name;

   name = setlocale(LC_ALL,
             "C;LC_COLLATE=DBCS;LC_CTYPE=DBCS;LC_NUMERIC=SAMP");

   return 0;
}
#include <string.h>
int strcoll(const char *str1, const char *str2);
strcoll
compares two character strings (str1
and
str2
)
using the collating sequence or routines (or both) defined by the
LC_COLLATE
category of the current locale. The return value has the
same relationship to 0
as str1
has to str2
.
If two strings
are equal up to the point where one of them terminates (that is, contains a null
character), the longer string is considered greater.
Note that
when the "POSIX"
locale is in effect,
either explicitly or by default,
strcoll
compares the strings according to the ASCII collating order rather
than the EBCDIC order.
strcoll
is one of the following:
0
0
str1
compares less than str2
.
0
str1
compares greater than str2
.
strcoll
. In the "S370
" locale, the strcoll
return value is the
same as if the strcmp
function were used to compare the strings.
strcoll
is not properly terminated, a
protection or addressing exception may occur.
strcoll
uses the following logic when comparing two strings:
strcoll
calls the locale's strcoll
function equivalent,
if available, and returns its value. See
LOCALE strcoll EQUIVALENT .
strcoll
calls the locale's strxfrm
function equivalent, if
available, to transform the strings for a character-by-character
comparison. See LOCALE strxfrm EQUIVALENT .
strcoll
calls the library's double-byte collation routine with a
standard double-byte collating sequence if the locale is a double-byte
locale as determined from the LC_COLLATE
category, no locale
strcoll
or strxfrm
function is available, and no collation table
is supplied.
strcoll
uses a collation table to compare the two strings (which
are then compared character by character) if the locale is a
single-byte locale and has a collation table available.
strcoll
calls the strcmp
function to compare the strings and
returns its value if none of the above are true.
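The single-byte collation-table step and the final strcmp fallback can be pictured with the sketch below. It is not the library's code: table_strcoll and make_fold_table are hypothetical names, the 256-entry table layout follows the description of collation tables elsewhere in this chapter, and the locale function and double-byte steps are left out.

```c
#include <ctype.h>
#include <string.h>

/* Compare two strings through a 256-entry collation table that maps */
/* each character code to its position in the collating sequence.    */
/* With no table, fall back to strcmp.                                */
int table_strcoll(const unsigned char *table, const char *s1, const char *s2)
{
    const unsigned char *p = (const unsigned char *) s1;
    const unsigned char *q = (const unsigned char *) s2;

    if (table == NULL)
        return strcmp(s1, s2);

    /* Skip characters that collate equally. */
    while (*p != '\0' && *q != '\0' && table[*p] == table[*q]) {
        p++;
        q++;
    }
    if (*p == '\0' && *q == '\0')
        return 0;
    if (*p == '\0')
        return -1;                /* shorter string compares less */
    if (*q == '\0')
        return 1;                 /* longer string compares greater */
    return (int) table[*p] - (int) table[*q];
}

/* Demo table: lowercase letters collate the same as uppercase letters. */
void make_fold_table(unsigned char table[256])
{
    int i;

    for (i = 0; i < 256; i++)
        table[i] = (unsigned char) toupper(i);
}
```

With the demo table, "abc" and "ABC" compare equal, which plain strcmp would not report.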
#include <locale.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
   char *s1, *s2, *lcn;
   int result;

      /* Obtain locale name. */
   lcn = setlocale(LC_COLLATE, NULL);

   s1 = " A B C D";
   s2 = " A C B";
   result = strcoll(s1, s2);

   if (result == 0)
      printf("%s = %s in the \"%s\" locale\n", s1, s2, lcn);
   else if (result < 0)
      printf("%s < %s in the \"%s\" locale\n", s1, s2, lcn);
   else
      printf("%s > %s in the \"%s\" locale\n", s1, s2, lcn);

   return 0;
}
#include <string.h>
size_t strxfrm(char *str1, const char *str2, size_t n);
strxfrm
transforms the string pointed to by str2
using the
collating sequence or routines (or both) defined by the LC_COLLATE
category of the current locale. The resulting string is placed into
the array pointed to by str1
.
The transformation is such that if the strcmp
function is applied
to two transformed strings, it returns greater than, equal to, or less
than 0
, corresponding to the result of the strcoll
function
applied directly to the two original strings.
No more than n
characters are placed into the resulting array
pointed to by str1
, including the terminating null character. If
n
is 0
, str1
is permitted to be NULL
. The results are
unpredictable if the strings pointed to by str1
and str2
overlap.
strxfrm
returns the length of the transformed string (not including the
terminating null character). If the value returned is n
or more, the
first n
characters are written to the array without null termination. In
the "S370
" locale, the behavior of strxfrm
is like that of strncpy
except that the number of characters copied to the output array is returned
instead of a pointer to the copied string.
str2
argument is not properly terminated or n
is
bigger than the output array pointed to by str1
, then a
protection or addressing exception may occur. The size of the output
array needed to hold the transformed string pointed to by str2
can
be determined by the following statement:
size_needed = 1 + strxfrm(NULL, str2, 0);
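A minimal sketch of this two-call pattern follows: the first call measures the transformed length, and the second fills a buffer of exactly that size. The function name xfrm_dup is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'ed transformed copy of src under the current */
/* LC_COLLATE setting, or NULL if allocation fails.               */
char *xfrm_dup(const char *src)
{
    size_t needed = strxfrm(NULL, src, 0) + 1;   /* +1 for the '\0' */
    char *dst = malloc(needed);

    if (dst != NULL)
        strxfrm(dst, src, needed);
    return dst;
}
```

Two strings transformed this way can then be compared repeatedly with strcmp, which is useful when the same keys are compared many times, as in a sort.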
strxfrm
uses the following logic when transforming a string:
strxfrm
calls the locale's strxfrm
function equivalent,
if available, to transform str2
and place the result in the
str1
array. See LOCALE strxfrm EQUIVALENT .
strxfrm
calls the library's double-byte strxfrm
collation
routine with a standard double-byte collating sequence if the locale is
a double-byte locale as determined from the LC_COLLATE
category,
no locale strxfrm
function is available, and no collation table is
supplied.
strxfrm
uses a collation table to transform the string if the
locale is a single-byte locale and has a collation table available.
strxfrm
invokes the equivalent of strncpy
to copy str2
to str1
if none of the above are true.
strcmp
yields the same result as
strcoll
when it is used to compare two strings transformed by
strxfrm
.
#include <locale.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
   char *str1, *str2,               /* input strings pointers */
        txf1[80], txf2[80];         /* transform arrays */
   int result_strcoll,
       result_strcmp;               /* compare results */

   str1 = " A B C D";
   str2 = " A C B";

   if ((strxfrm(txf1, str1, sizeof(txf1)) < sizeof(txf1)) &&
       (strxfrm(txf2, str2, sizeof(txf2)) < sizeof(txf2)))
      result_strcmp = strcmp(txf1, txf2);
   else
      exit(4);                      /* error exit if length is too big */

   result_strcoll = strcoll(str1, str2);   /* Get strcoll result. */

      /* Result must be 0 or result signs must be the same. */
   if ((result_strcmp == result_strcoll) ||
       (result_strcmp * result_strcoll > 0))
      exit(0);
   else
      exit(8);                      /* Else this is an error. */
}
"S370"
and "POSIX"
)
and the three nonstandard locales
supplied by the SAS/C Library by creating your own locales.
Following is a discussion of the
user-added
structures that correspond to the standard locale
structures, a listing of the header file <localeu.h>
and
an example locale ("SAMP"
), and
discussions of user-defined strcoll
and
strxfrm
functions.
setlocale
.
The header file
<localeu.h>
maps the various data
structures and routines required for a locale. You must
include it as well as
<locale.h>
when compiling a locale. Locale
source code should be compiled with the
RENT
or RENTEXT
compiler
option and link-edited RENT
(for re-entrant).
Each category for
setlocale
has a structure mapped within <localeu.h>
that
corresponds to it.
The following table describes the user-supplied
categories that correspond to the
categories for setlocale
.
Table 10.1 User-Supplied Locale Categories
Category Structure Name Description
LC_NUMERIC   _lc_numeric   LC_NUMERIC contains nonmonetary numeric formatting items; see the description of localeconv in Chapter 15, "Localization Functions."

LC_MONETARY   _lc_monetary   LC_MONETARY contains monetary formatting items; see the description of localeconv in Chapter 15.

LC_TIME   _lc_time   LC_TIME contains function pointers to locale-specific date and formatting routines, pointers to month and weekday names, and a.m. and p.m. designation; all are used with various strftime formats.

LC_CTYPE   _lc_ctype   LC_CTYPE contains a flag for enablement of DBCS processing and string recognition by other library routines, including recognition of multibyte characters in formatted I/O format strings such as those used in printf. LC_CTYPE also contains a pointer to the character type table that affects the behavior of the character type functions such as isalpha and tolower.

LC_COLLATE   _lc_collate   LC_COLLATE contains a mode flag indicating the processing mode and collation table pointer for strxfrm and strcoll. Mode flag meanings are as follows:
   0 indicates single-byte mode.
   1 indicates double-byte mode.
   >1 indicates multibyte mode.
For single-byte mode, the library's strxfrm and strcoll functions use the collation table if supplied. In double-byte and multibyte mode, any use of the table is strictly left up to the user-supplied routines. The library functions use a standard double-byte collation in double-byte mode when the collation table pointer is NULL. Included in this category are function pointers to locale-specific versions of strxfrm and strcoll. The locale-specific version of the strxfrm function requires an additional fourth parameter. This parameter is a pointer to a size_t variable where the number of characters consumed from the input string by the function is placed. Also, the value returned by the locale's strxfrm is the number of characters placed in the output array, not necessarily the total transformed string length.

LC_ALL   void *_lc_all[5]   LC_ALL is an array of pointers to the other _lc structures in the following order:
   [0] &_lc_collate
   [1] &_lc_ctype
   [2] &_lc_monetary
   [3] &_lc_numeric
   [4] &_lc_time

When the pointer to a structure that corresponds to a category is NULL, the name returned by setlocale reflects the new locale's name. However, it has the default "C" locale characteristics for that category. Similarly, if individual elements of a structure (pointers) are NULL or binary 0, that piece of the locale also exhibits "C" locale behavior.
<localeu.h>
header file required
for compiling a user-added locale.
/* This header file defines additions to the ANSI locale.h header  */
/* file that are required for compiling both user-added locale     */
/* value table load modules and several library functions. The     */
/* "C" defaults appear as comments for the _lc_numeric and         */
/* _lc_monetary categories.                                        */

static struct _lc_numeric {
   char *decimal_point;             /* "." */
   char *thousands_sep;             /* "" */
   char *grouping;                  /* "" */
};

static struct _lc_monetary {
   char *int_curr_symbol;           /* "" */
   char *currency_symbol;           /* "" */
   char *mon_decimal_point;         /* "" */
   char *mon_thousands_sep;         /* "" */
   char *mon_grouping;              /* "" */
   char *positive_sign;             /* "" */
   char *negative_sign;             /* "" */
   char int_frac_digits;            /* CHAR_MAX */
   char frac_digits;                /* CHAR_MAX */
   char p_cs_precedes;              /* CHAR_MAX */
   char p_sep_by_space;             /* CHAR_MAX */
   char n_cs_precedes;              /* CHAR_MAX */
   char n_sep_by_space;             /* CHAR_MAX */
   char p_sign_posn;                /* CHAR_MAX */
   char n_sign_posn;                /* CHAR_MAX */
};

static const struct _lc_time {
      /* locale's date and time conversion routine */
   char *(*_lct_datetime_conv)();
      /* address of locale's day conversion routine */
   char *(*_lct_date_conv)();
      /* address of locale's time conversion routine */
   char *(*_lct_time_conv)();
      /* address of weekday abbreviation table */
   char *_lct_wday_name[7];
      /* address of full weekday name table */
   char *_lct_weekday_name[7];
      /* address of month abbreviation table */
   char *_lct_mon_name[12];
      /* address of full month name table */
   char *_lct_month_name[12];
      /* locale's before-noon designation */
   char *_lct_am;
      /* locale's after-noon designation */
   char *_lct_pm;
};

#define SBCS 0                      /* single-byte character set */
#define DBCS 1                      /* double-byte character set */

static const struct _lc_collate {
      /* single-, double-, or multibyte character indicator */
   int _lcc_cmode;
      /* pointer to collation table */
   void *_lcc_colltab;
      /* pointer to user-added strcoll function */
   int (*_lcc_strcoll)();
      /* pointer to user-added strxfrm function */
   size_t (*_lcc_strxfrm)();
};

static const struct _lc_ctype {
      /* single-, double-, or multibyte character indicator */
   int _lcc_cmode;
      /* character type table pointer */
   void *_lcc_ctab;
};

/* If _lcc_cmode is set to DBCS, it only has an impact on the ANSI */
/* multibyte character handling functions, not on isalpha, and     */
/* so on. _lcc_ctab is for single-byte characters only, per the    */
/* ANSI ctype.h-allowed representation of "unsigned char," and     */
/* has no relation to _lcc_cmode.                                  */

static const void *_lc_all[5];      /* pointers to _lc struct */
                                    /* [0] - &_lc_collate     */
                                    /* [1] - &_lc_ctype       */
                                    /* [2] - &_lc_monetary    */
                                    /* [3] - &_lc_numeric     */
                                    /* [4] - &_lc_time        */
L$CLSAMP
("SAMP
") and L$CLDBEX
("DBEX
") are provided
in source form with the compiler and library to serve as
skeleton locales. You can easily modify these locales
to create new locales.
Ask your SAS Software Representative for SAS/C compiler products for
information about obtaining copies of these programs.
Here is an abbreviated listing of the "SAMP
" locale,
illustrating the data structures and routine formats required for a
locale. The L$CLDBEX
example
is a double-byte example locale (not shown) with
sample strcoll
and strxfrm
routines.
#title l$clsamp -- "SAMP" sample locale /* This is the "SAMP" locale value module table, which */ /* provides a skeleton example to modify for a */ /* particular locale. For those locales requiring */ /* double-byte character support, see the "DBEX" locale */ /* (L$DLDBEX) for examples of setting up a double-byte */ /* LC_CTYPE, LC_COLLATE, strcoll, and strxfrm. */ /* */ /* Any addresses of functions or tables not specified */ /* with a category use the "C" locale equivalent */ /* function or table. If a whole category is not specified */ /* and the locale is requested for that category with */ /* setlocale, effectively the "C" locale is used, although */ /* the locale string returned contains the locale's name. */ #include <stddef.h> #include <locale.h> #include <localeu.h> #include <dynam.h> #include <stdlib.h> #include <string.h> #include <time.h> #eject /* */ /* ENTRY: <=== _dynamn (externally visible) needs to be */ /* be compiled with SNAME l$clsam. */ /* */ /* USAGE: <=== Prototype call: _dynamn (dynamically */ /* loaded with the loadm function from the */ /* setlocale function). */ /* */ /* Needs to be compiled with RENT or RENText */ /* option and link-edited RENT. 
*/ /* */ /* For example, the following is a call made */ /* to setlocale: */ /* setlocale(LC_ALL, "SAMP"); */ /* The following code is executed within */ /* setlocale code to load L$CLSAMP and then */ /* call it: */ /* loadm(L$CLSAMP, &fp) */ /* fncptr = (char *** )(*fp)(); */ /* */ /* ARGUMENTS: <=== None */ /* */ /* RETURNS: <=== A pointer to an array of pointers */ /* */ /* static const void *lc_all_samp[5] = */ /* &collate, collate pointer */ /* &ctype, ctype pointer */ /* &monetary monetary pointer */ /* &numeric numeric pointer */ /* &time time format pointer */ /* */ /* */ /* END */ #eject /*-----------------COLLATION category-----------------------*/ static const unsigned char sbcs_collate_table_samp [256] = /* If a collation table is specified, that is, its address */ /* is nonzero, a locale strxfrm function is not coded, and */ /* the locale is not a multibyte (double-byte) locale, then */ /* the collation array must have 256 elements that */ /* translate any character's 8-bit representation to its */ /* proper place in the locale's collating sequence. */ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, /* 0x00-0f */ 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, /* 0x10-1f */ 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, /* 0x20-2f */ 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f, . . . . . . . . /* 0x30-ef */ . . . . . . . . . . . . . . . . 0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, /* 0xf0-ff */ 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff; #define SBCS 0 #define DBCS 1 static const struct _lc_collate lc_collate_samp = { SBCS, /* single-byte character mode */ sbcs_collate_table_samp, /* collation table address */ 0, /* locale strcoll collation */ /* function pointer */ 0 /* locale strxfrm transform */ /* function pointer */ ; /* See L$CLDBEX for DBCS example of strcoll and strxfrm functions. 
*/ #eject /*-----------------CTYPE category--------------------------*/ #define U 1 /* uppercase */ #define L 2 /* lowercase */ #define N 4 /* number */ #define W 8 /* white space */ #define P 16 /* punctuation */ #define S 32 /* blank */ #define AX 64 /* alpha extender */ #define X 128 /* hexadecimal */ static const unsigned char lc_ctab_samp[513] = { /* The character type table array, if coded, must contain */ /* 513 single char elements. The first element is the EOF */ /* representation (-1 or 0xff) followed by 256 elements */ /* that contain the char types for any 8-bit character */ /* returned by functions isalpha, isnumeric, and so on. */ /* The next 256 elements contain the mappings for the */ /* tolower and toupper string transformation functions. */ 0, /* -1 = EOF */ 0, /* 00 = nul */ 0, /* 01 = soh */ 0, /* 02 = stx */ 0, /* 03 = etx */ 0, /* 04 = sel */ W, /* 05 = ht */ 0, /* 06 = rnl */ 0, /* 07 = del */ 0, /* 08 = ge */ 0, /* 09 = sps */ 0, /* 0a = rpt */ W, /* 0B = vt */ W, /* 0C = ff */ W, /* 0D = cr */ 0, /* 0E = so */ 0, /* 0F = si */ 0, /* 10 = dle */ 0, /* 11 = dcl */ 0, /* 12 = dc2 */ 0, /* 13 = dc3 */ . . . . . . U, /* E6 = W */ U, /* E7 = X */ U, /* E8 = Y */ U, /* E9 = Z */ 0, /* EA */ 0, /* EB */ 0, /* EC */ 0, /* ED */ 0, /* EE */ 0, /* EF */ N|X, /* F0 = 0 */ N|X, /* F1 = 1 */ N|X, /* F2 = 2 */ N|X, /* F3 = 3 */ N|X, /* F4 = 4 */ N|X, /* F5 = 5 */ N|X, /* F6 = 6 */ N|X, /* F7 = 7 */ N|X, /* F8 = 8 */ N|X, /* F9 = 9 */ 0, /* FA */ 0, /* FB */ 0, /* FC */ 0, /* FD */ 0, /* FE */ 0, /* FF = eo */ /* Lower 257 bytes contain char types, next */ /* 256 contain the tolower and toupper */ /* character mappings. */ 0x00, /* 00 = nul */ 0x01, /* 01 = soh */ 0x02, /* 02 = stx */ 0x03, /* 03 = etx */ . . . . . . 
   0x7d, /* 7D = '                */
   0x7e, /* 7E = =                */
   0x7f, /* 7F = "                */
   0x80, /* 80                    */
   0xc1, /* 81 = a -> C1 = A      */
   0xc2, /* 82 = b -> C2 = B      */
   0xc3, /* 83 = c -> C3 = C      */
   0xc4, /* 84 = d -> C4 = D      */
   0xc5, /* 85 = e -> C5 = E      */
   0xc6, /* 86 = f -> C6 = F      */
   0xc7, /* 87 = g -> C7 = G      */
   0xc8, /* 88 = h -> C8 = H      */
   0xc9, /* 89 = i -> C9 = I      */
   .
   .
   .
   0xbc, /* BC                    */
   0xbd, /* BD = ] (close bracket) */
   0xbe, /* BE                    */
   0xbf, /* BF                    */
   0xc0, /* C0 = { (open brace)   */
   0x81, /* C1 = A -> 81 = a      */
   0x82, /* C2 = B -> 82 = b      */
   0x83, /* C3 = C -> 83 = c      */
   0x84, /* C4 = D -> 84 = d      */
   0x85, /* C5 = E -> 85 = e      */
   0x86, /* C6 = F -> 86 = f      */
   0x87, /* C7 = G -> 87 = g      */
   0x88, /* C8 = H -> 88 = h      */
   0x89, /* C9 = I -> 89 = i      */
   0xca, /* CA = shy              */
   0xcb, /* CB                    */
   0xcc, /* CC                    */
   0xcd, /* CD                    */
   0xce, /* CE                    */
   .
   .
   .
   0xf7, /* F7 = 7                */
   0xf8, /* F8 = 8                */
   0xf9, /* F9 = 9                */
   0xfa, /* FA                    */
   0xfb, /* FB                    */
   0xfc, /* FC                    */
   0xfd, /* FD                    */
   0xfe, /* FE                    */
   0xff  /* FF = eo               */
};

static const struct _lc_ctype lc_ctype_samp = {
   SBCS,          /* single-byte character mode */
   lc_ctab_samp   /* ctype table pointer        */
};

#eject
/*------------NUMERIC category--------------------*/
static const struct _lc_numeric lc_numeric_samp = {
   ".",    /* decimal_point */
   ",",    /* thousands_sep */
   "\3"    /* grouping      */
};

/*------------MONETARY category---------------------*/
static const struct _lc_monetary lc_monetary_samp = {
   "DOL",  /* int_curr_symbol   */
   "$",    /* currency_symbol   */
   ".",    /* mon_decimal_point */
   ",",    /* mon_thousands_sep */
   "\3",   /* mon_grouping      */
   "",     /* positive_sign     */
   "-",    /* negative_sign     */
   2,      /* int_frac_digits   */
   2,      /* frac_digits       */
   1,      /* p_cs_precedes     */
   0,      /* p_sep_by_space    */
   1,      /* n_cs_precedes     */
   0,      /* n_sep_by_space    */
   1,      /* p_sign_posn       */
   1       /* n_sign_posn       */
};

#eject
/*------------TIME category------------------------------*/
char *sampdcnv(struct tm *tp);

static const struct _lc_time lc_time_samp = {
   0,          /* pointer to date and time conversion */
               /* routine function pointer            */
   sampdcnv,   /* pointer to date conversion          */
               /* routine function pointer            */
   0,          /* pointer to time conversion          */
               /* routine function pointer            */

   /* weekday name abbreviations */
   "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat",

   /* weekday full names */
   "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday",
   "Friday", "Saturday",

   /* month name abbreviations */
   "Jan", "Feb", "Mar", "Apr", "May", "Jun",
   "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",

   /* month full names */
   "January", "February", "March", "April", "May", "June",
   "July", "August", "September", "October", "November",
   "December",

   "AM",   /* locale's "AM" equivalent */
   "PM"    /* locale's "PM" equivalent */
};

char *sampdcnv(struct tm *tp)
{
   /* SAMP date conversion routine              */
   /* Function returns the date in the form:    */
   /*    wkd mon dd 'yy                         */
   /* for example, Thu Oct 10 '85.              */
   char *time_format;

   time_format = asctime(tp);
   memcpy(time_format + 11, " '", 2);
   memcpy(time_format + 13, time_format + 22, 2);
   *(time_format + 15) = '\0';
   return time_format;
}  /* End sampdcnv. */

#eject
/* ALL category - array of pointers to category structures */
static const void *lc_all_samp[5] = {
   &lc_collate_samp,    /* pointer to collate category  */
   &lc_ctype_samp,      /* pointer to ctype category    */
   &lc_monetary_samp,   /* pointer to monetary category */
   &lc_numeric_samp,    /* pointer to numeric samp      */
   &lc_time_samp        /* pointer to time samp         */
};

/*-----------------Return category pointers---------------*/
void *_dynamn()   /* executable entry point */
{
   return (void *)&lc_all_samp;   /* Return address of ALL array. */
}
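The NUMERIC values in the sample above (a "," thousands separator and a "\3" grouping string) only describe a convention; some formatting code has to apply it. The following is a minimal sketch of how that might be done. The function name format_grouped is hypothetical and is not part of the SAS/C Library, and the sketch assumes that only the first byte of the grouping string is used, as a group size that repeats for the whole number.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a SAS/C Library function.                 */
/* Writes the decimal digits of value into out, inserting sep between */
/* groups of digits counted from the right. Assumption: grouping[0]   */
/* is a repeating group size ("\3" means groups of three); a size of  */
/* 0 means no grouping. Output is truncated if out is too small.      */
char *format_grouped(unsigned long value, const char *sep,
                     const char *grouping, char *out, size_t outsize)
{
   char digits[32];
   int ndigits = sprintf(digits, "%lu", value);
   int group = grouping[0];
   size_t seplen = strlen(sep);
   size_t pos = 0;
   int i;

   if (outsize == 0)
      return out;

   for (i = 0; i < ndigits; i++) {
      int remaining = ndigits - i;   /* digits left, including this one */

      /* A separator goes before a digit that starts a full group. */
      if (i > 0 && group > 0 && remaining % group == 0) {
         if (pos + seplen >= outsize)
            break;
         memcpy(out + pos, sep, seplen);
         pos += seplen;
      }
      if (pos + 1 >= outsize)
         break;
      out[pos++] = digits[i];
   }
   out[pos] = '\0';
   return out;
}
```

With the sample locale's values, format_grouped(1234567UL, ",", "\3", buf, sizeof buf) leaves 1,234,567 in buf.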
If the library strcoll function is not adequate for the needs of a
locale, you can write and use your own routine to do the collation.
You include this routine as part of the LC_COLLATE category of a
locale. The library strcoll function calls it after setlocale has
loaded the locale.
Because it is your own routine, you can give it any legal name,
as long as you are consistent in its use. (For instance, the following
example uses the name loclcoll.)
The locale's routine can make use of any information available in the
locale, such as the mode and collation tables. In addition, if the
LC_COLLATE mode is not 0, the collation tables coded as part of the
locale are not restricted to any format as long as the locale's
strxfrm and strcoll routines can understand them.
The locale's routine is invoked from the library's strcoll function
with the equivalent of the following call:
/* library strcoll function */
int strcoll(const char *str1, const char *str2)
{
   .
   .
   .
   /* Return locale's strcoll value to library's */
   /* strcoll caller.                            */
   return loclcoll(str1, str2);
}

The loclcoll function code appears as part of the LC_COLLATE category
within the locale source code:
.
.
.
/* collation tables, transformation tables, */
/* and other locale data */
int loclcoll(const char *str1, const char *str2)
/* ARG      DCL            DESCRIPTION                          */
/*                                                              */
/* str1     const char *   pointer to first input string        */
/*                                                              */
/* str2     const char *   pointer to second input string       */
/*                                                              */
/* RETURNS: <===  str1 < str2   a negative value                */
/*                str1 = str2   0                               */
/*                str1 > str2   a positive value                */
/*                                                              */
{
   .
   .   /* locale's equivalent strcoll function code */
   .
   return x;   /* Return a result, x. */
}
.
. /* more locale data, routines, and so on */
.
For an example of a locale routine for strcoll, see the L$CLDBEX
source code member distributed with the compiler and library.
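As a rough illustration of what the body of such a routine might contain, the following sketch compares two strings through a table of collation weights in which uppercase and lowercase letters share a weight. It is not the L$CLDBEX code. The table name coll_weight and the helper build_coll_table are hypothetical, and the table is built at run time with toupper only so that the sketch does not depend on a particular character set; a real locale would normally code its tables as static data.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical collation table: one weight per character code. */
static unsigned char coll_weight[UCHAR_MAX + 1];
static int coll_ready = 0;

/* Give each letter the weight of its uppercase form, so that   */
/* case is ignored when strings are collated.                   */
static void build_coll_table(void)
{
   int c;

   for (c = 0; c <= UCHAR_MAX; c++)
      coll_weight[c] = (unsigned char)toupper(c);
   coll_ready = 1;
}

/* Sketch of a locale strcoll routine: returns a negative value, */
/* 0, or a positive value, as the library strcoll caller expects. */
int loclcoll(const char *str1, const char *str2)
{
   const unsigned char *p1 = (const unsigned char *)str1;
   const unsigned char *p2 = (const unsigned char *)str2;

   if (!coll_ready)
      build_coll_table();

   /* Advance while the weights match and str1 has not ended. */
   while (*p1 != '\0' && coll_weight[*p1] == coll_weight[*p2]) {
      p1++;
      p2++;
   }
   return (int)coll_weight[*p1] - (int)coll_weight[*p2];
}
```

Under this table, loclcoll("apple", "APPLE") returns 0, and loclcoll("apple", "Banana") returns a negative value.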
If the library strxfrm function is not adequate for the needs of a
locale, you can write and use your own routine to do the
transformation. You include this routine as part of the LC_COLLATE
category of a locale. The library strxfrm function calls it after the
locale has been loaded with an appropriate setlocale call.
Because it is your own routine, you can give it any legal name,
as long as you are consistent in its use. (For instance, the following
example uses the name loclxfrm.)
There is one main difference between the behavior requirements for the
locale equivalent of the strxfrm function and those for the library
version. After the output buffer is filled, the locale's routine stops
scanning the input string and returns the size of the filled output
buffer rather than the total transformed length. It also places the
number of characters consumed from the input string in the area
addressed by an additional fourth parameter. The locale's routine can
use any information available in the locale, such as the mode and
collation tables. In addition, if the LC_COLLATE mode is not 0, the
collation and transformation tables coded as part of the locale are
not restricted to any format as long as the locale's strxfrm and
strcoll routines can understand them.
The behavior requirements of the library and locale strxfrm routines
differ so that the strcoll function can call strxfrm with a limited
buffer that might permit only partial transformation of the whole
string. Theoretically, any number of output characters can be produced
by strxfrm from any number of input characters.
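To make the limited-buffer idea concrete, the following sketch shows one way a collation function could compare two strings in fixed-size pieces, using a routine that reports both the number of output characters produced and the number of input characters consumed. This is an illustration only, not the library's implementation. The names sketch_xfrm, chunked_coll, and CHUNK are hypothetical, and the sketch assumes a transformation that produces exactly one output character per input character (folding to uppercase).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4   /* deliberately small buffer, for illustration */

/* Hypothetical partial-transformation routine. Fills at most n    */
/* characters of out, stops, returns the number placed in out, and */
/* stores the number of input characters consumed in *used.        */
static size_t sketch_xfrm(char *out, const char *in, size_t n,
                          size_t *used)
{
   size_t count = 0;

   while (count < n && in[count] != '\0') {
      out[count] = (char)toupper((unsigned char)in[count]);
      count++;
   }
   if (count < n)
      out[count] = '\0';   /* terminate only when there is room */
   *used = count;
   return count;
}

/* Compare two strings one buffer-load at a time. */
int chunked_coll(const char *s1, const char *s2)
{
   char b1[CHUNK], b2[CHUNK];
   size_t n1, n2, n, u1, u2;
   int r;

   for (;;) {
      n1 = sketch_xfrm(b1, s1, CHUNK, &u1);
      n2 = sketch_xfrm(b2, s2, CHUNK, &u2);
      n = (n1 < n2) ? n1 : n2;

      r = memcmp(b1, b2, n);
      if (r != 0)
         return r;                     /* pieces differ           */
      if (n1 != n2)
         return (n1 < n2) ? -1 : 1;    /* one string ended first  */
      if (n1 == 0)
         return 0;                     /* both strings exhausted  */

      s1 += u1;                        /* resume where each       */
      s2 += u2;                        /* transformation stopped  */
   }
}
```

Because each call consumes only as much input as fits in the buffer, the comparison can stop at the first differing piece without transforming the rest of either string. A transformation that expands or contracts characters would need extra care where the two pieces have different lengths.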
The locale's strxfrm routine is invoked from within strxfrm with an
equivalent of the following call. (The choice of "loclxfrm" is
arbitrary. It could be any legal name as long as it is consistent.)
size_t strxfrm(char *str1, const char *str2, size_t n)
{
   size_t nchar_xfrmed, used;
   .
   .
   .
   nchar_xfrmed = loclxfrm(str1, str2, n, &used);
   .
   .
   .
}

The loclxfrm function code appears as part of the LC_COLLATE category
within the locale source code:
.
.   /* collation tables, transformation tables, and other */
    /* locale data                                         */
.
size_t loclxfrm(char *str1, const char *str2, size_t n, size_t *used)
/* ARG      DCL            DESCRIPTION                             */
/*                                                                 */
/* str1     char *         pointer to the transformed              */
/*                         string output array                     */
/*                                                                 */
/* str2     const char *   pointer to input string array           */
/*                                                                 */
/* n        size_t         maximum number of bytes                 */
/*                         (characters) written to str1            */
/*                         including the terminating null.         */
/*                         If n or more characters are             */
/*                         required for the transformed            */
/*                         string, only the first n are            */
/*                         written and the string is not           */
/*                         null-terminated.                        */
/*                                                                 */
/* used     size_t *       pointer to size_t (unsigned int)        */
/*                         value where the number of input         */
/*                         characters consumed from the            */
/*                         input string str2 is returned           */
/*                                                                 */
/* RETURNS: <===  The number of characters placed in the           */
/*                output transformation array is returned.         */
/*                                                                 */
/*                Also, the number of characters consumed          */
/*                (scanned) from the input string str2 is          */
/*                placed in the size_t (unsigned int)              */
/*                value pointed to by used.                        */
/*                                                                 */
/*                The total number of characters required          */
/*                for the transformation is obtained by a          */
/*                special call:                                    */
/*       total_loclxfrm_len = loclxfrm(0, s2, 0, &used)            */
{
   .
   .   /* locale's strxfrm code */
}
.
.
.   /* more locale data, routines, and so on */
.

For an example of a locale routine for strxfrm, see the L$CLDBEX
source code member distributed with the compiler and library.
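The following sketch fills in the skeleton above with a deliberately simple transformation, to show how the return value, the used parameter, and the special total-length call might fit together. It is not the L$CLDBEX code. The transformation itself is an assumption made for illustration: each input character becomes two output characters, its uppercase form followed by '1' for a character that is not lowercase or '2' for a lowercase character. The sketch also assumes that the special call sets used to the full input length and that the returned count excludes any terminating null.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Sketch of a locale strxfrm routine with an illustrative */
/* two-characters-per-input-character transformation.      */
size_t loclxfrm(char *str1, const char *str2, size_t n, size_t *used)
{
   size_t in = 0;    /* input characters consumed  */
   size_t out = 0;   /* output characters produced */

   /* Special call: report the total transformed length only. */
   if (str1 == NULL && n == 0) {
      *used = strlen(str2);
      return 2 * *used;
   }

   /* Stop as soon as the next two-character unit would not fit. */
   while (str2[in] != '\0' && out + 2 <= n) {
      unsigned char c = (unsigned char)str2[in];

      str1[out++] = (char)toupper(c);
      str1[out++] = islower(c) ? '2' : '1';
      in++;
   }
   if (out < n)
      str1[out] = '\0';   /* terminate only when there is room */

   *used = in;
   return out;
}
```

With a five-character buffer, loclxfrm(buf, "abc", 5, &used) places A2B2 in buf, returns 4, and sets used to 2, so the caller can resume at the third input character. The special call loclxfrm(0, "abc", 0, &used) returns 6.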
Copyright (c) 1998 SAS Institute Inc. Cary, NC, USA. All rights reserved.