Data Set Options under OpenVMS

Controls the size of the I/O data cache that is allocated for a SAS file.

Default: 65024
Valid in: DATA step and PROC steps
Category: Data Set Control
Engines: V9, V8, V7, V6
OpenVMS specifics: All aspects are host-specific

CACHESIZE=n | nK | nM | nG | hexX

n | nK | nM | nG
   specifies the cache size in multiples of 1 (bytes); 1,024 (kilobytes); 1,048,576 (megabytes); or 1,073,741,824 (gigabytes). For example, a value of 8 specifies 8 bytes, and a value of 3k specifies 3,072 bytes.
   Range: 0 to 2,147,483,136

hexX
   specifies the cache size as a hexadecimal value. The value must begin with a digit (0-9), followed by hexadecimal characters (0-9, A-F), and end with an X. For example, the value 2dx sets the cache size to 45 bytes.

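
The value forms described above can be sketched as follows. This is a minimal illustration, assuming the TEST.BIG data set and the [mydir] directory used elsewhere in this section; the hexadecimal value 0FE00x is an illustrative choice that works out to 65,024 bytes.

```sas
libname test v9 '[mydir]';

/* Decimal byte count: 65,024 bytes */
data _null_;
   set test.big(cachesize=65024);
run;

/* K suffix: 3k = 3 * 1,024 = 3,072 bytes */
data _null_;
   set test.big(cachesize=3k);
run;

/* Hexadecimal: FE00 (hex) = 65,024 bytes. The leading 0 is needed
   because the value must begin with a digit (0-9). */
data _null_;
   set test.big(cachesize=0FE00x);
run;
```

The first and third steps request the same cache size; only the notation differs.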
Pages of SAS files are cached in memory, with each cache containing multiple pages. The CACHESIZE= data set option controls the size (in bytes) of the data cache used to buffer the I/O pages. Memory is consumed for each data cache, and multiple caches are used for each data set that is opened, so extremely large CACHESIZE= values consume a large amount of memory. The advantage of a larger CACHESIZE= value is that it reduces the number of disk I/Os required to read from or write to a file. For example, if you are reading a large data set, you can use the following statements:
   libname test v9 '[mydir]';
   data new;
      set test.big(cachesize=65024);
      /* . . . more SAS statements . . . */
   run;
This DATA step reads the TEST.BIG data set through a 65,024-byte data cache, so each disk I/O transfers up to 65,024 bytes.
If a data cache is used, then one disk I/O is
the size of the CACHESIZE= value. If
no data cache is allocated, then one disk I/O is
the size of the BUFSIZE= value. The
size of the page is controlled with the BUFSIZE= data
set option.
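
The effect on the number of disk I/Os can be estimated with simple arithmetic. The DATA step below is a rough sketch, not a measurement: the file size and page size are hypothetical values chosen for illustration, and the estimate ignores file headers and any other overhead.

```sas
/* Rough estimate only: ignores file headers and other overhead. */
data _null_;
   file_bytes = 10000000;   /* hypothetical file size in bytes   */
   bufsize    = 8192;       /* hypothetical page size (BUFSIZE=) */
   cachesize  = 65024;      /* default CACHESIZE= value          */

   /* With a data cache, one disk I/O is the size of the cache. */
   ios_with_cache = ceil(file_bytes / cachesize);

   /* Without a data cache, one disk I/O is the size of a page. */
   ios_without_cache = ceil(file_bytes / bufsize);

   put ios_with_cache= ios_without_cache=;
run;
```

Under these assumptions the estimate is 154 I/Os with the cache and 1,221 without it.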
The CACHESIZE= and BUFSIZE= data set options
are similar, but they have important differences. BUFSIZE= specifies
the file's page size, which is a permanent attribute of the file. It can be
set only when the file is created. CACHESIZE= is
the size of the internal memory cache that is used for the duration of the
current file open. It can change any time the file is opened. Specifying a
large BUFSIZE= value and CACHESIZE=0 improves I/O the
same way that specifying a large CACHESIZE= value
does. However, because only complete pages can be written to the file, if
the actual data requires less space than the specified BUFSIZE= value,
the file uses more disk space than necessary.
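
The difference in lifetime between the two options can be sketched as follows. This is an illustration only, assuming the TEST libref from the earlier example; WORK.SOURCE is a hypothetical input data set.

```sas
libname test v9 '[mydir]';

/* BUFSIZE= sets the page size when the file is created.
   It is a permanent attribute of TEST.BIG. */
data test.big(bufsize=32768);
   set work.source;   /* WORK.SOURCE is a hypothetical input data set */
run;

/* CACHESIZE= applies only to this open of the file. */
data new;
   set test.big(cachesize=65024);
run;

/* A later open can use a different cache size, or no cache at all. */
proc print data=test.big(cachesize=0);
run;
```

The page size of TEST.BIG stays at 32,768 bytes in both later steps; only the cache used for each open changes.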
For example, if you specify BUFSIZE=65024 and CACHESIZE=0, I/O is performed in increments of the page size (65,024 bytes). If the data actually requires only 32,000 bytes of storage, then more than half the space allocated for the file is unused. If you specify BUFSIZE=32768 and CACHESIZE=65024, I/O is still performed in increments of 65,024 bytes. However, if the data requires only 32,000 bytes, little space is wasted.
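
The space comparison in this example can be worked through in a short DATA step. This is a simplified sketch that counts only whole data pages and assumes no other file overhead; the variable names are illustrative.

```sas
/* Simplified: counts whole pages only and ignores file overhead. */
data _null_;
   data_bytes = 32000;                      /* storage the data needs  */
   do bufsize = 65024, 32768;               /* the two page sizes      */
      pages     = ceil(data_bytes / bufsize);   /* only complete pages */
      allocated = pages * bufsize;              /* bytes written       */
      unused    = allocated - data_bytes;       /* bytes left unused   */
      put bufsize= pages= allocated= unused=;
   end;
run;
```

With a 65,024-byte page, 33,024 of the allocated bytes (about 51 percent) are unused; with a 32,768-byte page, only 768 bytes (about 2 percent) are unused.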
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.