The facilities declared in <locale.h> are used to
tailor significant portions of the C run-time environment to
national conventions for punctuation of decimal numbers, currency,
and so on. This tailoring provides the capability for writing programs
that are portable across many countries. For example, some countries
separate the whole and fractional parts of a number with a comma
instead of a period. Alphabetization, currency notation, dates, and
times are all expressed differently in different countries. Each set
of definitions for culture-dependent issues is called a locale,
and each locale has a name that is a null-terminated string.
This chapter introduces fundamental concepts of localization and the basic
structure used in localization functions, and discusses the three
nonstandard locales supplied by the SAS/C Library. Descriptions of the
standard localization functions (including strcoll and strxfrm)
follow.
For details on how to create your own locales, see User-Added Locales.
"S370"
"POSIX"
"C"
exec linkage, this is the same as the
"POSIX"
locale. For other programs, it is the same as the
"S370" locale.
""
"" locale interpretation is controlled by several environment
variable settings. See the setlocale
function description for more
information.
A locale is divided into categories, which define different parts
of the whole locale. You can change one category without having to
change the entire locale. For example, you can change the way currency
is displayed without changing the expression of dates and time. The
categories of locale defined in <locale.h> are as follows:
LC_ALL
LC_COLLATE
strcoll and
strxfrm work.
LC_CTYPE
isgraph) work.
LC_CTYPE also affects the multibyte functions (such as mblen
and wcstombs) as well as the treatment of multibyte characters by
the formatted I/O functions (such as printf and sprintf) and
the string functions.
isdigit and isxdigit are not affected by LC_CTYPE.
LC_MONETARY
LC_NUMERIC
LC_TIME
strftime formats time values.
This category does not
affect the behavior of asctime.
<locale.h> defines the structure struct lconv. The standard
members of this structure are explained in the following list. The
default "C" locale values, as defined by the ANSI and ISO
C standards, are in
parentheses after the descriptions. A default value of CHAR_MAX
indicates the information is not available.
char *decimal_point
".")
char *thousands_sep
"")
char *grouping
"")
char *int_curr_symbol
"")
char *currency_symbol
"")
char *mon_decimal_point
"")
char *mon_thousands_sep
"")
char *mon_grouping
"")
char *positive_sign
"")
char *negative_sign
"")
char int_frac_digits
CHAR_MAX)
char frac_digits
CHAR_MAX)
char p_cs_precedes
Set to 1 if the currency_symbol comes before a nonnegative
monetary value; set to 0 if the currency_symbol comes after
the value. (CHAR_MAX)
char p_sep_by_space
Set to 1 if the currency_symbol and a nonnegative monetary
value are separated by a space; otherwise set to 0.
(CHAR_MAX)
char n_cs_precedes
Set to 1 if the currency_symbol comes before a negative
monetary value; set to 0 if the currency_symbol comes after
the value. (CHAR_MAX)
char n_sep_by_space
Set to 1 if the currency_symbol and a negative monetary value
are separated by a space; otherwise set to 0. (CHAR_MAX)
char p_sign_posn
positive_sign for
a nonnegative monetary value. (CHAR_MAX)
char n_sign_posn
negative_sign for
a negative monetary value. (CHAR_MAX)
grouping and mon_grouping are defined by the
following values:
CHAR_MAX
CHAR_MAX is defined in
<limits.h> as 255.
0
The values of p_sign_posn and n_sign_posn are defined by the
following:
0
currency_symbol.
1
currency_symbol.
2
currency_symbol.
3
currency_symbol.
4
currency_symbol.
"S370" locale defines its categories as follows:
LC_COLLATE
strxfrm to behave like strncpy, except that it
returns the number of characters copied, not a pointer to a string.
LC_CTYPE
LC_MONETARY
LC_NUMERIC
LC_TIME
strftime to return the same values for days of the week,
dates, and so on, as asctime.
"POSIX" locale defines its categories as follows:
LC_COLLATE
LC_CTYPE
LC_MONETARY
LC_NUMERIC
LC_TIME
strftime to return the same values for days of the week,
dates, and so on, as asctime.
"S370" locales, the run-time library supplies
three other locales in source form or load form or both. The load
module names can be found in L$CLxxxx where xxxx is the name
of the locale. See your SAS Software Representative for SAS/C software
products for the location of the source code.
"DBCS"
LC_CTYPE or LC_ALL.
For category LC_COLLATE, this locale enables strxfrm and
strcoll to transform and collate mixed double-byte character
strings. In all other aspects, the "DBCS" locale is identical to the
"S370" locale.
"SAMP"
LC_CTYPE
"S370" locale.
LC_COLLATE
"S370" locale.
LC_NUMERIC
LC_MONETARY
LC_TIME
"C" locale conventions with a modified sample date routine used
with strftime %x format.
"DBEX"
LC_CTYPE
ctype table pointer is NULL.
LC_COLLATE
strxfrm and strcoll
functions that use the table.
LC_NUMERIC
"S370" locale is used.)
LC_MONETARY
"S370" locale is used.)
LC_TIME
"S370" locale is used.)

#include <locale.h>
struct lconv *localeconv(void);
localeconv sets the elements of an object of type struct lconv to
the appropriate values for the formatting of numeric objects, both monetary and
nonmonetary, with respect to the current locale.
localeconv returns a pointer to a filled-in object of type
struct lconv. The char * members of this structure may point
to "", indicating that the value is not available or is of 0 length. The
char members are nonnegative numbers and can be equal to CHAR_MAX,
indicating the value is not available. See The Locale Structure lconv
for a discussion of each structure member returned by
localeconv.
localeconv overwrites the previous value of the
structure; if you need to reuse the previous value, be sure to save
it. The following code saves the value of the structure:
struct lconv save_lconv;
save_lconv = *(localeconv());
Also, calls to setlocale with categories LC_ALL, LC_MONETARY,
or LC_NUMERIC may overwrite the contents of the structure returned by
localeconv.
frac_digits indicates
how many digits are to the right of the decimal point character. If
frac_digits is negative, the number is padded on the right, before
the decimal point character, with the same number of 0s as the absolute
value of frac_digits. It is assumed that the digit grouping
string has at most only one element.
#include <stdio.h>
#include <stdlib.h> /* for abs */
#include <string.h>
#include <locale.h>
size_t dec_fmt(char *buf, struct lconv *lc, int amt, int frac_digits);
int main(void)
{
struct lconv *lc; /* pointer to locale values */
char buf[256];
setlocale(LC_NUMERIC, "SAMP"); /* Use "SAMP" locale. */
/* Call localeconv to obtain locale conventions. */
lc = localeconv();
dec_fmt(buf, lc, 12345678, 0);
printf("The number is: 12,345,678. == %s\n", buf);
dec_fmt(buf, lc, 12345678, 2);
printf("The number is: 123,456.78 == %s\n", buf);
dec_fmt(buf, lc, -12345678, 4);
printf("The number is: -1,234.5678 == %s\n", buf);
dec_fmt(buf, lc, 12345678, -5);
printf("The number is: 1,234,567,800,000. == %s\n", buf);
dec_fmt(buf, lc, 12345678,10);
printf("The number is: 0.0012345678 == %s\n", buf);
return 0;
}
size_t dec_fmt(char *buf, struct lconv *lc, int amt,
int frac_digits)
{
char numstr[128]; /* number string */
char *ns_ptr, /* number string pointer */
*buf_start; /* output buffer start pointer */
int ngrp, /* number of digits per group */
ntocpy, /* digits to copy */
non_frac; /* number of nonfractional digits */
size_t ns_len; /* number string length */
if (abs(frac_digits) > 100) /* Return error if too big */
return 0;
buf_start = buf;
sprintf(numstr, "%+-d", amt); /* Get amount as number string */
ns_ptr = numstr; /* Point to number string */
ns_len = strlen(ns_ptr); /* Get number string length */
if (frac_digits < 0) { /* zero pad left of decimal point */
memset(ns_ptr + ns_len, '0', (size_t) -frac_digits);
*(ns_ptr + ns_len - frac_digits) = '\0';
ns_len -= frac_digits; /* Add extra digit length. */
frac_digits = 0;
}
/* zero pad right of decimal point */
if ((non_frac = ns_len - frac_digits - 1) < 0) {
sprintf(numstr, "%+0*d", (int) (ns_len - non_frac + 1), amt); /* width must be an int */
non_frac = 1; /* e.g., 0.000012345678 */
}
if (amt < 0) *buf++ = *ns_ptr; /* Insert sign in buffer. */
ns_ptr++; /* Skip +/- in number string. */
/* Convert grouping to int. */
if (!(ngrp = (int) *(lc->grouping)))
ngrp = non_frac; /* Use non_frac len if none. */
/* Get number of digits to copy for first group. */
if (!(ntocpy = non_frac % ngrp))
ntocpy = ngrp;
while (non_frac > 0) { /* Separate groups of digits. */
memcpy(buf, ns_ptr, ntocpy); /* Copy digits. */
ns_ptr += ntocpy; /* Advance pointers. */
buf += ntocpy;
if (non_frac -= ntocpy) { /* Insert separator and set */
*buf++ = *(lc->thousands_sep);/* number of digits for other */
ntocpy = ngrp; /* groups. */
}
}
*buf++ = *(lc->decimal_point); /* Insert decimal point. */
if (frac_digits > 0)
/* Copy fraction + '\0' */
memcpy(buf, ns_ptr, frac_digits + 1);
else *buf = '\0'; /* Else just null-terminate. */
return strlen(buf_start); /* Return converted length. */
}


#include <locale.h>
char *setlocale(int category, const char *locale);
setlocale selects the current locale or a portion thereof, as
specified by the category and locale arguments.
Here are the valid categories:
LC_ALL
LC_COLLATE
strcoll and strxfrm.
LC_CTYPE
LC_MONETARY
localeconv.
LC_NUMERIC
localeconv.
LC_TIME
strftime.
locale points to a locale_string that has one of the
following formats:
xxxx
Here is an example:
setlocale(LC_ALL, "C");
xxxx ; category=yyyy <;
category=zzzz ...>
Here is an example:
setlocale(LC_MONETARY, "C;LC_TIME=SAMP");
category=yyyy <; category=zzzz ...>
Here is an example:
setlocale(LC_MONETARY, "LC_TIME=SAMP");
"" (null string)
NULL (pointer)
Here is an example:
setlocale(LC_ALL, NULL);
In the locale_string formats, xxxx ,
yyyy , and zzzz
have the following meanings:
xxxx specifies the name for the requested category.
yyyy and zzzz are overrides for mixed
categories.
x , y , and z can
be uppercase alphabetic (A through Z),
numeric (0 through 9), $, @, or #.
LC_ALL category is requested or the specific
category and the override match. For example, this statement sets only
the LC_MONETARY category; the LC_TIME override is ignored.
setlocale(LC_MONETARY, "C;LC_TIME=SAMP");
When NULL is specified, setlocale returns the current
locale string in the same format as described earlier.
If the locale string is specified as "", the library consults a
number of environment variables to determine the appropriate settings.
If none of the environment variables are defined, the "C"
locale is
used.
The locale "" is resolved in the following steps:
LC_ALL is defined and not NULL, the
value of LC_ALL is used as the locale.
LC_ALL was specified, and there is an
environment variable with the same name as the category, the value of
that variable is used (if not NULL). For instance, if the category
was LC_COLLATE, and the environment variable LC_COLLATE
is "DBEX",
then the LC_COLLATE component of the "DBEX"
locale is used. Note that
if the category is LC_ALL,
the effect is the same as if each other
category were set independently.
LANG is defined and not NULL,
the value
of LANG is used as the locale.
_LOCALE
is defined and not NULL, the
value of _LOCALE is used as the locale.
setlocale returns the string associated with the specified
category value. If the value for locale is not valid or
an error occurs when loading the requested locale, setlocale returns
NULL and the locale is not affected.
If locale is specified as NULL, setlocale
returns the
string associated with the category for the current locale and the
locale is not affected.
When issuing setlocale for a mixed locale, if a category
override fails (for example, if the locale load module cannot be
found), then for that category the LC_ALL category is used if it
is being set at the same time. Otherwise, a NULL return is issued
and the category for which the locale override was requested remains
unchanged. The returned string indicates the actual mixed locale in effect.
setlocale overwrites the previous value, so you
must save the current value before calling setlocale again, if you
need the value later. This involves determining the string length for
the locale name, allocating storage space for it, and copying it. The following
code does all three:
char *name;
name = setlocale(LC_ALL, NULL);
name = strsave(name);
When you want to revert to the locale saved in name, use the
following code:
setlocale(LC_ALL, name);
"S370" locale,
which is built-in, locale information
is kept in a load module created by compiling and linking
"S370" source
code for the locale's localization data and routines. The
load module name is created by appending up to the first four uppercased
characters of the locale's name to the "L$CL" prefix. For
example, if the locale name is MYLOCALE, the load module name is
L$CLMYLO.
The locale load module is loaded when needed (with loadm) during
setlocale processing. For the exact details and requirements on
specifying localization data and routines, see
User-Added Locales.
Also see loadm for load
module library search and location requirements.
Mixed locale strings can be specified. In this case, all requested
locales are loaded by a single call to setlocale.
#include <locale.h>
int main(void)
{
char *name;
/* Set all portions of the current locale to */
/* the values defined by the built-in "C" locale. */
setlocale(LC_ALL, "C");
/* Set the LC_COLLATE category to the value */
/* defined by the "DBCS" locale. */
setlocale(LC_COLLATE, "DBCS");
/* Set the LC_CTYPE category to the value */
/* defined by the "DBCS" locale. */
setlocale(LC_CTYPE, "DBCS");
/* Set the LC_NUMERIC category to the value */
/* defined by the "SAMP" locale. */
setlocale(LC_NUMERIC, "SAMP");
/* Return mixed locale string. */
name = setlocale(LC_ALL, NULL);
}
The following string is pointed to by name:
"C@;LC_COLLATE=DBCS;LC_CTYPE=DBCS;LC_NUMERIC=SAMP"
This string is interpreted as if the "C" locale (LC_ALL) is in
effect except for LC_COLLATE and LC_CTYPE, which have the
"DBCS" locale in effect, and LC_NUMERIC, which is using "SAMP".
The same results can be accomplished with a single call to
setlocale:
#include <locale.h>
int main(void)
{
char *name;
name = setlocale(LC_ALL,
"C;LC_COLLATE=DBCS;LC_CTYPE=DBCS;LC_NUMERIC=SAMP");
}
#include <string.h>
int strcoll(const char *str1, const char *str2);
strcoll compares two character strings (str1 and
str2)
using the collating sequence or routines (or both) defined by the
LC_COLLATE category of the current locale. The return value has the
same relationship to 0 as str1 has to str2.
If two strings
are equal up to the point where one of them terminates (that is, contains a null
character), the longer string is considered greater.
Note that
when the "POSIX" locale is in effect,
either explicitly or by default,
strcoll
compares the strings according to the ASCII collating order rather
than the EBCDIC order.
The return value from strcoll is one of the following:
0
str1 compares equal to str2.
< 0
str1 compares less than str2.
> 0
str1 compares greater than str2.
strcoll. In the "S370" locale, the strcoll
return value is the
same as if the strcmp function were used to compare the strings.
strcoll is not properly terminated, a
protection or addressing exception may occur.
strcoll uses the following logic when comparing two strings:
strcoll calls the locale's strcoll function equivalent,
if available, and returns its value. See
LOCALE strcoll EQUIVALENT.
strcoll calls the locale's strxfrm function equivalent, if
available, to transform the strings for a character-by-character
comparison. See LOCALE strxfrm EQUIVALENT.
strcoll calls the library's double-byte collation routine with a
standard double-byte collating sequence if the locale is a double-byte
locale as determined from the LC_COLLATE category, no locale
strcoll or strxfrm function is available, and no collation table
is supplied.
strcoll uses a collation table to compare the two strings (which
are then compared character by character) if the locale is a
single-byte locale and has a collation table available.
strcoll calls the strcmp function to compare the strings and
returns its value if none of the above are true.
#include <locale.h>
#include <string.h>
#include <stdio.h>
int main(void)
{
char *s1, *s2, *lcn;
int result;
/* Obtain locale name. */
lcn = setlocale(LC_COLLATE, NULL);
s1 = " A B C D";
s2 = " A C B";
result = strcoll(s1, s2);
if (result == 0)
printf("%s = %s in the \"%s\" locale", s1, s2, lcn);
else
if (result < 0)
printf("%s < %s in the \"%s\" locale", s1, s2, lcn);
else
printf("%s > %s in the \"%s\" locale", s1, s2, lcn);
}
#include <string.h>
size_t strxfrm(char *str1, const char *str2, size_t n);
strxfrm transforms the string pointed to by str2 using the
collating sequence or routines (or both) defined by the LC_COLLATE
category of the current locale. The resulting string is placed into
the array pointed to by str1.
The transformation is such that if the strcmp function is applied
to two transformed strings, it returns greater than, equal to, or less
than 0, corresponding to the result of the strcoll function
applied directly to the two original strings.
No more than n characters are placed into the resulting array
pointed to by str1, including the terminating null character. If
n is 0, str1 is permitted to be NULL. The results are
unpredictable if the strings pointed to by str1 and str2 overlap.
strxfrm returns the length of the transformed string (not including the
terminating null character). If the value returned is n or more, the
first n characters are written to the array without null termination. In
the "S370" locale, the behavior of strxfrm is like that of strncpy
except that the number of characters copied to the output array is returned
instead of a pointer to the copied string.
str2 argument is not properly terminated or n is
bigger than the output array pointed to by str1, then a
protection or addressing exception may occur. The size of the output
array needed to hold the transformed string pointed to by str2 can
be determined by the following statement:
size_needed = 1 + strxfrm(NULL, str2, 0);
strxfrm uses the following logic when transforming a string:
strxfrm calls the locale's strxfrm function equivalent,
if available, to transform str2 and place the result in the
str1 array. See LOCALE strxfrm EQUIVALENT.
strxfrm calls the library's double-byte strxfrm collation
routine with a standard double-byte collating sequence if the locale is
a double-byte locale as determined from the LC_COLLATE category,
no locale strxfrm function is available, and no collation table is
supplied.
strxfrm uses a collation table to transform the string if the
locale is a single-byte locale and has a collation table available.
strxfrm invokes the equivalent of strncpy to copy str2
to str1 if none of the above are true.
strcmp yields the same result as
strcoll when it is used to compare two strings transformed by
strxfrm.
#include <locale.h>
#include <stdlib.h> /* for exit */
#include <string.h>
int main(void)
{
char *str1, *str2, /* input strings pointers */
txf1[80], txf2[80]; /* transform arrays */
int result_strcoll, result_strcmp; /* compare results */
str1 = " A B C D";
str2 = " A C B";
if ((strxfrm(txf1, str1, sizeof(txf1)) < sizeof(txf1)) &&
(strxfrm(txf2, str2, sizeof(txf2)) < sizeof(txf2)))
result_strcmp = strcmp(txf1, txf2);
else exit(4); /* error exit if length is too big */
result_strcoll = strcoll(str1, str2); /* Get strcoll result. */
/* Result must be 0 or result signs must be the same. */
if ((result_strcmp == result_strcoll) ||
(result_strcmp*result_strcoll > 0))
exit(0);
else exit(8); /* Else this is an error. */
}
"S370" and "POSIX")
and the three nonstandard locales
supplied by the SAS/C Library by creating your own locales.
Following is a discussion of the
user-added
structures that correspond to the standard locale
structures, a listing of the header file <localeu.h> and
an example locale ("SAMP"), and
discussions of user-defined strcoll and
strxfrm functions.
setlocale.
The header file
<localeu.h> maps the various data
structures and routines required for a locale. You must
include it as well as
<locale.h> when compiling a locale. Locale
source code should be compiled with the
RENT or RENTEXT compiler
option and link-edited RENT (for re-entrant).
Each category for
setlocale has a structure mapped within <localeu.h> that
corresponds to it.
The following table describes the user-supplied
categories that correspond to the
categories for setlocale.
Table 10.1 User-Supplied Locale Categories

Category     Structure Name    Description

LC_NUMERIC   _lc_numeric       LC_NUMERIC contains nonmonetary numeric
                               formatting items; see the description
                               of localeconv in Chapter 15,
                               "Localization Functions."

LC_MONETARY  _lc_monetary      LC_MONETARY contains monetary formatting
                               items; see the description of
                               localeconv in Chapter 15.

LC_TIME      _lc_time          LC_TIME contains function pointers to
                               locale-specific date and formatting
                               routines, pointers to month and weekday
                               names, and a.m. and p.m. designation;
                               all are used with various strftime
                               formats.

LC_CTYPE     _lc_ctype         LC_CTYPE contains a flag for enablement
                               of DBCS processing and string
                               recognition by other library routines,
                               including recognition of multibyte
                               characters in formatted I/O format strings
                               such as those used in printf. LC_CTYPE
                               also contains a pointer to the character
                               type table that affects the behavior of the
                               character type functions such as isalpha
                               and tolower.

LC_COLLATE   _lc_collate       LC_COLLATE contains a mode flag indicating
                               the processing mode and collation table
                               pointer for strxfrm and strcoll. Mode
                               flag meanings are as follows:
                               0 indicates single-byte mode.
                               1 indicates double-byte mode.
                               >1 indicates multibyte mode.
                               For single-byte mode, the library's
                               strxfrm and strcoll functions use the
                               collation table if supplied. In double-byte
                               and multibyte mode, any use of the table is
                               strictly left up to the user-supplied
                               routines. The library functions use a
                               standard double-byte collation in
                               double-byte mode when the collation table
                               pointer is NULL. Included in this category
                               are function pointers to locale-specific
                               versions of strxfrm and strcoll. The
                               locale-specific version of the strxfrm
                               function requires an additional fourth
                               parameter. This parameter is a pointer
                               to a size_t variable where the number
                               of characters consumed from the input
                               string by the function is placed. Also,
                               the value returned by the locale's strxfrm is
                               the number of characters placed in the output
                               array, not necessarily the total transformed
                               string length.

LC_ALL       void *_lc_all[5]  LC_ALL is an array of pointers to the
                               other _lc structures in the following
                               order:
                               [0] &_lc_collate
                               [1] &_lc_ctype
                               [2] &_lc_monetary
                               [3] &_lc_numeric
                               [4] &_lc_time

When the pointer to a structure that corresponds to a category is
NULL, the name returned by setlocale reflects the new
locale's name. However, it has the default "C" locale
characteristics for that category. Similarly, if individual elements
of a structure (pointers) are
NULL or binary 0, that piece of
the locale also exhibits "C" locale behavior.
<localeu.h> header file required
for compiling a user-added locale.
/* This header file defines additions to the ANSI locale.h header */
/* file that are required for compiling both user-added locale */
/* value table load modules and several library functions. The */
/* "C" defaults appear as comments for the _lc_numeric and */
/* _lc_monetary categories. */
static struct _lc_numeric {
char *decimal_point; /* "." */
char *thousands_sep; /* "" */
char *grouping; /* "" */
};
static struct _lc_monetary {
char *int_curr_symbol; /* "" */
char *currency_symbol; /* "" */
char *mon_decimal_point; /* "" */
char *mon_thousands_sep; /* "" */
char *mon_grouping; /* "" */
char *positive_sign; /* "" */
char *negative_sign; /* "" */
char int_frac_digits; /* CHAR_MAX */
char frac_digits; /* CHAR_MAX */
char p_cs_precedes; /* CHAR_MAX */
char p_sep_by_space; /* CHAR_MAX */
char n_cs_precedes; /* CHAR_MAX */
char n_sep_by_space; /* CHAR_MAX */
char p_sign_posn; /* CHAR_MAX */
char n_sign_posn; /* CHAR_MAX */
};
static const struct _lc_time {
/* locale's date and time conversion routine */
char *(*_lct_datetime_conv)();
/* address of locale's day conversion routine */
char *(*_lct_date_conv)();
/* address of locale's time conversion routine */
char *(*_lct_time_conv)();
/* address of weekday abbreviation table */
char *_lct_wday_name [7] ;
/* address of full weekday name table */
char *_lct_weekday_name [7] ;
/* address of month abbreviation table */
char *_lct_mon_name [12] ;
/* address of full month name table */
char *_lct_month_name [12] ;
/* locale's before-noon designation */
char *_lct_am;
/* locale's after-noon designation */
char *_lct_pm;
};
#define SBCS 0 /* single-byte character set */
#define DBCS 1 /* double-byte character set */
static const struct _lc_collate {
/* single-, double-, or multibyte character indicator */
int _lcc_cmode;
/* pointer to collation table */
void *_lcc_colltab;
/* pointer to user-added strcoll function */
int (*_lcc_strcoll)();
/* pointer to user-added strxfrm function */
size_t (*_lcc_strxfrm)();
};
static const struct _lc_ctype {
/* single-, double-, or multibyte character indicator */
int _lcc_cmode;
/* character type table pointer */
void *_lcc_ctab;
};
/* If _lcc_cmode is set to DBCS, it only has an impact on the ANSI */
/* multibyte character handling functions, not on isalpha, and */
/* so on. _lcc_ctab is for single-byte characters only, per the */
/* ANSI ctype.h-allowed representation of "unsigned char," and */
/* has no relation to _lcc_cmode. */
static const void *_lc_all [5] ; /* pointers to _lc struct */
/* [0] - &_lc_collate */
/* [1] - &_lc_ctype */
/* [2] - &_lc_monetary */
/* [3] - &_lc_numeric */
/* [4] - &_lc_time */
L$CLSAMP ("SAMP") and L$CLDBEX ("DBEX") are provided
in source form with the compiler and library to serve as
skeleton locales. You can easily modify these locales
to create new locales.
Ask your SAS Software Representative for SAS/C compiler products for
information about obtaining copies of these programs.
Here is an abbreviated listing of the "SAMP" locale,
illustrating the data structures and routine formats required for a
locale. The L$CLDBEX example
is a double-byte example locale (not shown) with
sample strcoll and strxfrm routines.
#title l$clsamp -- "SAMP" sample locale
/* This is the "SAMP" locale value module table, which */
/* provides a skeleton example to modify for a */
/* particular locale. For those locales requiring */
/* double-byte character support, see the "DBEX" locale */
/* (L$CLDBEX) for examples of setting up a double-byte */
/* LC_CTYPE, LC_COLLATE, strcoll, and strxfrm. */
/* */
/* Any addresses of functions or tables not specified */
/* with a category use the "C" locale equivalent */
/* function or table. If a whole category is not specified */
/* and the locale is requested for that category with */
/* setlocale, effectively the "C" locale is used, although */
/* the locale string returned contains the locale's name. */
#include <stddef.h>
#include <locale.h>
#include <localeu.h>
#include <dynam.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#eject
/* */
/* ENTRY: <=== _dynamn (externally visible) needs to be */
/* compiled with SNAME l$clsam. */
/* */
/* USAGE: <=== Prototype call: _dynamn (dynamically */
/* loaded with the loadm function from the */
/* setlocale function). */
/* */
/* Needs to be compiled with RENT or RENTEXT */
/* option and link-edited RENT. */
/* */
/* For example, the following is a call made */
/* to setlocale: */
/* setlocale(LC_ALL, "SAMP"); */
/* The following code is executed within */
/* setlocale code to load L$CLSAMP and then */
/* call it: */
/* loadm(L$CLSAMP, &fp) */
/* fncptr = (char *** )(*fp)(); */
/* */
/* ARGUMENTS: <=== None */
/* */
/* RETURNS: <=== A pointer to an array of pointers */
/* */
/* static const void *lc_all_samp[5] = */
/* &collate, collate pointer */
/* &ctype, ctype pointer */
/* &monetary monetary pointer */
/* &numeric numeric pointer */
/* &time time format pointer */
/* */
/* */
/* END */
#eject
/*-----------------COLLATION category-----------------------*/
static const unsigned char sbcs_collate_table_samp [256] =
{
/* If a collation table is specified, that is, its address */
/* is nonzero, a locale strxfrm function is not coded, and */
/* the locale is not a multibyte (double-byte) locale, then */
/* the collation array must have 256 elements that */
/* translate any character's 8-bit representation to its */
/* proper place in the locale's collating sequence. */
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, /* 0x00-0f */
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, /* 0x10-1f */
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, /* 0x20-2f */
0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f,
. . . . . . . . /* 0x30-ef */
. . . . . . . .
. . . . . . . .
0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, /* 0xf0-ff */
0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff
};
#define SBCS 0
#define DBCS 1
static const struct _lc_collate lc_collate_samp = {
SBCS, /* single-byte character mode */
sbcs_collate_table_samp, /* collation table address */
0, /* locale strcoll collation */
/* function pointer */
0 /* locale strxfrm transform */
/* function pointer */
};
/* See L$CLDBEX for DBCS example of strcoll and strxfrm functions. */
#eject
/*-----------------CTYPE category--------------------------*/
#define U 1 /* uppercase */
#define L 2 /* lowercase */
#define N 4 /* number */
#define W 8 /* white space */
#define P 16 /* punctuation */
#define S 32 /* blank */
#define AX 64 /* alpha extender */
#define X 128 /* hexadecimal */
static const unsigned char lc_ctab_samp[513] =
{
/* The character type table array, if coded, must contain */
/* 513 single char elements. The first element is the EOF */
/* representation (-1 or 0xff) followed by 256 elements */
/* that contain the char types for any 8-bit character */
/* returned by functions isalpha, isnumeric, and so on. */
/* The next 256 elements contain the mappings for the */
/* tolower and toupper string transformation functions. */
0, /* -1 = EOF */
0, /* 00 = nul */
0, /* 01 = soh */
0, /* 02 = stx */
0, /* 03 = etx */
0, /* 04 = sel */
W, /* 05 = ht */
0, /* 06 = rnl */
0, /* 07 = del */
0, /* 08 = ge */
0, /* 09 = sps */
0, /* 0a = rpt */
W, /* 0B = vt */
W, /* 0C = ff */
W, /* 0D = cr */
0, /* 0E = so */
0, /* 0F = si */
0, /* 10 = dle */
0, /* 11 = dcl */
0, /* 12 = dc2 */
0, /* 13 = dc3 */
. .
. .
. .
U, /* E6 = W */
U, /* E7 = X */
U, /* E8 = Y */
U, /* E9 = Z */
0, /* EA */
0, /* EB */
0, /* EC */
0, /* ED */
0, /* EE */
0, /* EF */
N|X, /* F0 = 0 */
N|X, /* F1 = 1 */
N|X, /* F2 = 2 */
N|X, /* F3 = 3 */
N|X, /* F4 = 4 */
N|X, /* F5 = 5 */
N|X, /* F6 = 6 */
N|X, /* F7 = 7 */
N|X, /* F8 = 8 */
N|X, /* F9 = 9 */
0, /* FA */
0, /* FB */
0, /* FC */
0, /* FD */
0, /* FE */
0, /* FF = eo */
/* Lower 257 bytes contain char types, next */
/* 256 contain the tolower and toupper */
/* character mappings. */
0x00, /* 00 = nul */
0x01, /* 01 = soh */
0x02, /* 02 = stx */
0x03, /* 03 = etx */
. .
. .
. .
0x7d, /* 7D = ' */
0x7e, /* 7E = = */
0x7f, /* 7F = " */
0x80, /* 80 */
0xc1, /* 81 = a -> C1 = A */
0xc2, /* 82 = b -> C2 = B */
0xc3, /* 83 = c -> C3 = C */
0xc4, /* 84 = d -> C4 = D */
0xc5, /* 85 = e -> C5 = E */
0xc6, /* 86 = f -> C6 = F */
0xc7, /* 87 = g -> C7 = G */
0xc8, /* 88 = h -> C8 = H */
0xc9, /* 89 = i -> C9 = I */
. .
. .
. .
0xbc, /* BC */
0xbd, /* BD = ] (close bracket) */
0xbe, /* BE */
0xbf, /* BF */
0xc0, /* C0 = (open brace) */
0x81, /* C1 = A -> 81 = a */
0x82, /* C2 = B -> 82 = b */
0x83, /* C3 = C -> 83 = c */
0x84, /* C4 = D -> 84 = d */
0x85, /* C5 = E -> 85 = e */
0x86, /* C6 = F -> 86 = f */
0x87, /* C7 = G -> 87 = g */
0x88, /* C8 = H -> 88 = h */
0x89, /* C9 = I -> 89 = i */
0xca, /* CA = shy */
0xcb, /* CB */
0xcc, /* CC */
0xcd, /* CD */
0xce, /* CE */
. .
. .
. .
0xf7, /* F7 = 7 */
0xf8, /* F8 = 8 */
0xf9, /* F9 = 9 */
0xfa, /* FA */
0xfb, /* FB */
0xfc, /* FC */
0xfd, /* FD */
0xfe, /* FE */
0xff /* FF = eo */
};
static const struct _lc_ctype lc_ctype_samp = {
SBCS, /* single-byte character mode */
lc_ctab_samp /* ctype table pointer */
};
#eject
/*------------NUMERIC category--------------------*/
static const struct _lc_numeric lc_numeric_samp = {
".", /* decimal_point */
",", /* thousands_sep */
"\3" /* grouping */
};
/*------------MONETARY category---------------------*/
static const struct _lc_monetary lc_monetary_samp = {
"DOL", /* int_curr_symbol */
"$", /* currency_symbol */
".", /* mon_decimal_point */
",", /* mon_thousands_sep */
"\3", /* mon_grouping */
"", /* positive_sign */
"-", /* negative_sign */
2, /* int_frac_digits */
2, /* frac_digits */
1, /* p_cs_precedes */
0, /* p_sep_by_space */
1, /* n_cs_precedes */
0, /* n_sep_by_space */
1, /* p_sign_posn */
1 /* n_sign_posn */
};
#eject
/*------------TIME category------------------------------*/
char *sampdcnv(struct tm *tp);
static const struct _lc_time lc_time_samp = {
0, /* pointer to date and time conversion */
/* routine function pointer */
sampdcnv, /* pointer to date conversion */
/* routine function pointer */
0, /* pointer to time conversion */
/* routine function pointer */
/* weekday name abbreviations */
"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat",
/* weekday full names */
"Sunday", "Monday", "Tuesday", "Wednesday",
"Thursday", "Friday", "Saturday",
/* month name abbreviations */
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
/* month full names */
"January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November",
"December",
"AM", /* locale's "AM" equivalent */
"PM" /* locale's "PM" equivalent */
};
char *sampdcnv(struct tm *tp)
{
/* SAMP date conversion routine */
/* Function returns the date in the form: */
/* wkd mon dd 'yy */
/* for example, Thu Oct 10 '85. */
char *time_format;
time_format = asctime(tp);
memcpy(time_format + 11, " '", 2);
memcpy(time_format + 13, time_format + 22, 2);
*(time_format + 15) = '\0';
return time_format;
} /* End sampdcnv. */
#eject
/* ALL category - array of pointers to category structures */
static const void *lc_all_samp[5] = {
&lc_collate_samp, /* pointer to collate category */
&lc_ctype_samp, /* pointer to ctype category */
&lc_monetary_samp, /* pointer to monetary category */
&lc_numeric_samp, /* pointer to numeric samp */
&lc_time_samp /* pointer to time samp */
};
/*-----------------Return category pointers---------------*/
void *_dynamn() /* executable entry point */
{
return (void *)&lc_all_samp; /* Return address of ALL array. */
}
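The NUMERIC and MONETARY categories in the listing above pair a
separator string with a grouping string of "\3". Under the standard C
convention for the grouping member of struct lconv, each byte gives the
size of one digit group counting from the right, a terminating 0 repeats
the last size, and CHAR_MAX stops further grouping. The following is a
minimal sketch of how such a pair could be applied to an integer. The
helper name group_digits and its parameters are hypothetical and are not
part of the SAS/C Library.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a library function): format value into  */
/* out, inserting sep between digit groups as described by grouping. */
/* Returns out, or NULL if a buffer is too small.                    */
char *group_digits(unsigned long value, const char *sep,
                   const char *grouping, char *out, size_t outsize)
{
    char digits[32];                 /* plain decimal digits        */
    char tmp[128];                   /* result, built right to left */
    size_t pos = sizeof tmp;
    size_t seplen = strlen(sep);
    int i = sprintf(digits, "%lu", value);
    int group = (unsigned char)*grouping;  /* current group size    */
    int count = 0;                   /* digits in the current group */

    tmp[--pos] = '\0';
    while (i > 0) {
        /* A size of 0 or CHAR_MAX means no (further) grouping. */
        if (group != 0 && group != CHAR_MAX && count == group) {
            if (pos <= seplen)
                return NULL;
            pos -= seplen;
            memcpy(tmp + pos, sep, seplen);
            count = 0;
            /* Move to the next size; at the end, repeat the last. */
            if (grouping[1] != '\0')
                group = (unsigned char)*++grouping;
        }
        if (pos == 0)
            return NULL;
        tmp[--pos] = digits[--i];
        count++;
    }
    if (sizeof tmp - pos > outsize)
        return NULL;
    memcpy(out, tmp + pos, sizeof tmp - pos);  /* includes the null */
    return out;
}
```

With the sample values "," and "\3", calling group_digits(1234567UL,
",", "\3", buf, sizeof buf) leaves "1,234,567" in buf. An empty grouping
string leaves the digits ungrouped.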
If the library's strcoll function is not adequate for the
needs of a locale, you can write and use your own routine to do the
collation. Include this routine in the
LC_COLLATE category
of the locale. The library
strcoll function calls it after setlocale has loaded the
locale.
Because it is your own routine, you can give it any legal name,
as long as you are consistent in its use. (For instance, the following
example uses the name loclcoll.)
The locale's routine can make use of
any information available in the locale, such as the mode and
collation tables. In addition, if the
LC_COLLATE mode is not 0, the
collation tables coded as part of the locale are not
restricted to any format as long as the locale's
strxfrm and strcoll routines
can understand them.
The locale's routine is invoked from the library's
strcoll function with the
equivalent of the following call:
/* library strcoll function */
int strcoll(const char *str1, const char *str2)
{
.
.
.
/* Return locale's strcoll value to library's */
/* strcoll caller. */
return loclcoll(str1, str2);
}
The loclcoll function code appears as part of the
LC_COLLATE category
within the locale source code:
.
.
.
/* collation tables, transformation tables, */
/* and other locale data */
int loclcoll(const char *str1, const char *str2)
/* ARG DCL DESCRIPTION */
/* */
/* str1 const char * pointer to first input string */
/* */
/* str2 const char * pointer to second input string */
/* */
/* RETURNS: <=== str1 < str2 a negative value */
/* str1 = str2 0 */
/* str1 > str2 a positive value */
/* */
{
.
. /* locale's equivalent strcoll function code */
.
return x; /* Return a result, x. */
}
.
. /* more locale data, routines, and so on */
.
For an example of a locale routine for strcoll, see
the L$CLDBEX source code member distributed with the compiler
and library.
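As an illustration of the shape such a routine can take, the following
is a minimal sketch of a table-driven collation routine that ignores
case. It is not the L$CLDBEX code. The weight table, the build_weights
helper, and the choice of toupper as the weighting rule are assumptions
made for this example; a real locale would more likely code its weights
as constant data in the LC_COLLATE category.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical collation weights, one per character code. */
static unsigned char weight[UCHAR_MAX + 1];
static int weight_ready = 0;

/* Fill the table once; here a character's weight is simply its */
/* uppercase form, so "a" and "A" collate as equal.              */
static void build_weights(void)
{
    int c;

    for (c = 0; c <= UCHAR_MAX; c++)
        weight[c] = (unsigned char)toupper(c);
    weight_ready = 1;
}

/* Locale's strcoll equivalent: negative, 0, or positive result. */
int loclcoll(const char *str1, const char *str2)
{
    const unsigned char *a = (const unsigned char *)str1;
    const unsigned char *b = (const unsigned char *)str2;

    if (!weight_ready)
        build_weights();

    /* Skip characters whose weights match; stop at the end of str1. */
    while (*a != '\0' && weight[*a] == weight[*b]) {
        a++;
        b++;
    }
    return (int)weight[*a] - (int)weight[*b];
}
```

With this sketch, loclcoll("apple", "APPLE") returns 0 and
loclcoll("apple", "Banana") returns a negative value, because the
comparison uses the weights rather than the raw character codes.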
If the library's strxfrm function is not adequate for the
needs of a locale, you can write and use your own
routine to do the transformation.
Include this routine in the
LC_COLLATE category
of the locale. The library
strxfrm function calls it after an appropriate
setlocale call has loaded the locale.
Because it is your own routine, you can give it any legal name,
as long as you are consistent in its use.
(For instance, the following example
uses the name loclxfrm.)
The behavior required of the locale equivalent
of the strxfrm function differs from that of
the library version in one main way.
Once the output buffer is
filled, the locale's routine stops scanning the input string and
returns the size of the filled output buffer rather than the total
transformed length.
It also places the number of characters consumed from the input string
in the area addressed by
an additional fourth parameter. The locale's routine can use
any information available in the locale, such as the mode and
collation tables. In addition,
if the LC_COLLATE mode is not 0, the
collation and
transformation tables coded as part of the locale are not
restricted to any format as long as the locale's
strxfrm and strcoll routines
can understand them.
The behavior requirements of the library
and locale strxfrm routines differ
so that the strcoll function
can call strxfrm with a limited buffer that might
permit only partial transformation of the whole string. Theoretically,
any number of output characters can be produced by strxfrm from
any number of input characters.
The locale's
strxfrm routine
is invoked from within strxfrm with an
equivalent of the following call. (The choice of "loclxfrm" is
arbitrary. It could be any legal name as long as it is consistent.)
size_t strxfrm(char *str1, const char *str2, size_t n)
{
size_t nchar_xfrmed, used;
.
.
.
nchar_xfrmed = loclxfrm(str1, str2, n, &used);
.
.
.
}
The loclxfrm function code appears as part of the
LC_COLLATE category
within the locale source code:
.
. /* collation tables, transformation tables, and other */
/* locale data */
.
size_t loclxfrm(char *str1, const char *str2,
size_t n, size_t *used)
/* ARG DCL DESCRIPTION */
/* */
/* str1 char * pointer to the transformed */
/* string output array */
/* */
/* str2 const char * pointer to input string array */
/* */
/* n size_t maximum number of bytes */
/* (characters) written to str1 */
/* including the terminating null. */
/* If n or more characters are */
/* required for the transformed */
/* string, only the first n are */
/* written and the string is not */
/* null-terminated. */
/* */
/* used size_t * pointer to size_t (unsigned int) */
/* value where the number of input */
/* characters consumed from the */
/* input string str2 is returned */
/* */
/* RETURNS: <=== The number of characters placed in the*/
/* output transformation array is returned. */
/* */
/* Also, the number of characters consumed */
/* (scanned) from the input string str2 is */
/* placed in the size_t (unsigned int) */
/* value pointed to by used. */
/* */
/* The total number of characters required */
/* for the transformation is obtained by a */
/* special call: */
/* total_loclxfrm_len = loclxfrm(0, str2, 0, &used) */
{
.
. /* locale's strxfrm code */
}
.
.
. /* more locale data, routines, and so on */
.
For an example of a locale routine for strxfrm, see
the L$CLDBEX source code member distributed with the compiler
and library.
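The contract described above (stop when the output buffer is full,
return the count written, and report the input consumed through the
fourth parameter) can be sketched as follows. This is not the L$CLDBEX
code. The transformation rule is an assumption chosen for illustration:
each input character yields two output bytes, its uppercase form
followed by the character itself. The sketch also assumes that only
whole two-byte pairs are written, that the returned count excludes any
terminating null, and that the special length call sets *used to the
input length.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Sketch of a locale strxfrm equivalent under the assumptions */
/* stated in the accompanying text.                             */
size_t loclxfrm(char *str1, const char *str2, size_t n, size_t *used)
{
    size_t in = 0;    /* input characters consumed */
    size_t out = 0;   /* output bytes written      */

    /* Special call: report the total transformed length only. */
    if (str1 == NULL || n == 0) {
        in = strlen(str2);
        *used = in;
        return 2 * in;
    }

    /* Transform one character at a time while a whole pair fits. */
    while (str2[in] != '\0' && out + 2 <= n) {
        unsigned char c = (unsigned char)str2[in];

        str1[out++] = (char)toupper(c);   /* primary weight */
        str1[out++] = (char)c;            /* tie-breaker    */
        in++;
    }

    /* Null-terminate only when all input fit with room to spare. */
    if (str2[in] == '\0' && out < n)
        str1[out] = '\0';

    *used = in;
    return out;
}
```

For example, transforming "abc" into a 4-byte buffer writes the four
bytes "AaBb", returns 4, and sets *used to 2, so the caller knows to
resume at the third input character. The two-for-one expansion shows why
the separate used count is needed: the number of bytes written does not
tell the caller how far the input scan advanced.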
Copyright (c) 1998 SAS Institute Inc. Cary, NC, USA. All rights reserved.