DATASETS Procedure

COPY Statement

Copies all or some of the SAS files in a SAS library.
Restriction: The COPY statement does not support data set options.
Tips: See the example in PROC COPY to migrate from a 32-bit machine to a 64-bit machine.

The COPY statement defaults to the encoding and data representation of the output library when you use Remote Library Services (RLS) such as SAS/SHARE or SAS/CONNECT. If you are not using RLS, you must use the PROC COPY option NOCLONE for the output files to take on the encoding and data representation of the output library. Using the NOCLONE option results in a copy with the data representation of the data library (if specified in the OUTREP= LIBNAME option) or the native data representation of the operating environment.

Example: Manipulating SAS Files

Syntax

Required Arguments

OUT=libref-1
names the SAS library to copy SAS files to.
Alias: OUTLIB= and OUTDD=
IN=libref-2
names the SAS library containing SAS files to copy.
Alias: INLIB= and INDD=
Default: the libref of the procedure input library. Because of this default, IN= can be omitted from the COPY statement.
Interaction: To copy only selected members, use the SELECT or EXCLUDE statements.

Optional Arguments

ALTER=alter-password
provides the Alter password for any alter-protected SAS files that you are moving from one data library to another. Because the MOVE option deletes the SAS file from the original data library, you need Alter access to move the SAS file.
CLONE | NOCLONE
specifies whether to copy the following data set attributes:
  • size of input/output buffers
  • whether the data set is compressed
  • whether free space is reused
  • data representation of input data set, library, or operating environment
  • encoding value
  • whether a compressed data set can be randomly accessed by an observation number
These attributes are specified with data set options, SAS system options, and LIBNAME statement options:
  • BUFSIZE= value for the size of the input/output buffers
  • COMPRESS= value for whether the data set is compressed
  • REUSE= value for whether free space is reused
  • OUTREP= value for data representation
  • ENCODING= or INENCODING= for encoding value
  • POINTOBS= value for whether a compressed data set can be randomly accessed by an observation number
For the BUFSIZE= attribute, the following table summarizes how the COPY statement works:
CLONE and the Buffer Page Size Attribute
Option | COPY Statement
CLONE | Uses the BUFSIZE= value from the input data set for the output data set. However, specifying a BUFSIZE= value in the OVERRIDE= option list results in a copy that uses the specified value.
NOCLONE | Uses the current setting of the SAS system option BUFSIZE= for the output data set.
Neither | Determines the type of access method, sequential or random, used by the engine for the input data set and the engine for the output data set. If both engines use the same type of access, the COPY statement uses the BUFSIZE= value from the input data set for the output data set. If the engines do not use the same type of access, the COPY statement uses the setting of SAS system option BUFSIZE= for the output data set.
For the COMPRESS= attribute, the following table summarizes how the COPY statement works:
CLONE and the Compression Attribute
Option | COPY Statement
CLONE | Uses the values from the input data set for the output data set. However, specifying a COMPRESS= value in the OVERRIDE= option list results in a copy that uses the specified compression.
NOCLONE | Results in a copy with the compression of the operating environment or, if specified, the value of the COMPRESS= option in the LIBNAME statement for the library.
Neither | Defaults to CLONE.
For the REUSE= attribute, the following table summarizes how the COPY statement works:
CLONE and the Reuse Space Attribute
Option | COPY Statement
CLONE | Uses the values from the input data set for the output data set. If the engine for the input data set does not support the reuse space attribute, then the COPY statement uses the current setting of the corresponding SAS system option. However, specifying a REUSE= value in the OVERRIDE= option list results in a copy that uses the specified value.
NOCLONE | Uses the current setting of the SAS system options COMPRESS= and REUSE= for the output data set.
Neither | Defaults to CLONE.
For the OUTREP= attribute, the following table summarizes how the COPY statement works:
CLONE and the Data Representation Attribute
Option | COPY Statement
CLONE | Results in a copy with the data representation of the input data set. However, specifying an OUTREP= value in the OVERRIDE= option list results in a copy that uses the specified data representation.
NOCLONE | Results in a copy with the data representation of the operating environment or, if specified, the value of the OUTREP= option in the LIBNAME statement for the OUT= library.
Neither | Defaults to CLONE.
Data representation is the form in which data is stored in a particular operating environment. Different operating environments use the following different standards or conventions:
  • for storing floating-point numbers (for example, IEEE or IBM 390)
  • for character encoding (ASCII or EBCDIC)
  • for the ordering of bytes in memory (big endian or little endian)
  • for word alignment (4-byte boundaries or 8-byte boundaries)
  • for data-type length (16-bit, 32-bit, or 64-bit)
A file has native data representation when its data representation is the same as that of the CPU and operating environment that are accessing it. For example, a file in Windows data representation is native to the Windows operating environment.
For the ENCODING= attribute, the following table summarizes how the COPY statement works.
CLONE and the Encoding Attribute
Option | COPY Statement
CLONE | Results in a copy that uses the encoding of the input data set or, if specified, the value of the INENCODING= option in the LIBNAME statement for the input library. However, specifying an ENCODING= value in the OVERRIDE= option list results in a copy that uses the specified encoding.
NOCLONE | Results in a copy that uses the current session encoding or, if specified, the value of the OUTENCODING= option in the LIBNAME statement for the output library.
Neither | Defaults to CLONE.
All data that is stored, transmitted, or processed by a computer is in an encoding. An encoding maps each character to a unique numeric representation. An encoding is a combination of a character set with an encoding method. A character set is the repertoire of characters and symbols that are used by a language or group of languages. An encoding method is the set of rules that are used to assign the numbers to the set of characters that are used in an encoding.
For the POINTOBS= attribute, the following table summarizes how the COPY statement works. To use POINTOBS=, the output data set must be compressed.
CLONE and the POINTOBS= Attribute
Option | COPY Statement
CLONE | Uses the POINTOBS= value from the input data set for the output data set. However, specifying a POINTOBS= value in the OVERRIDE= option list results in a copy that uses the specified value.
NOCLONE | Uses the POINTOBS= value from the LIBNAME statement if the output data set is compressed and the POINTOBS= option is specified and supported by the output engine. If POINTOBS= is not specified in the LIBNAME statement and the data set is compressed, the default is POINTOBS=YES when supported by the output engine.
Neither | Defaults to CLONE.
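The following sketch illustrates the NOCLONE behavior that the tables above describe. The paths and the librefs SOURCE and DEST are hypothetical, and the OUTREP= and OUTENCODING= values are only illustrative choices:

```sas
/* Hypothetical paths; replace them with real library locations. */
libname source '/data/source';
libname dest   '/data/dest' outrep=linux_x86_64 outencoding='utf-8';

proc datasets library=source nolist;
   /* NOCLONE: the output files take their data representation and  */
   /* encoding from the DEST LIBNAME options instead of inheriting  */
   /* them from the input data sets.                                */
   copy out=dest noclone;
run;
quit;
```

If the LIBNAME options were omitted, the tables indicate that the copies would instead use the data representation of the operating environment and the current session encoding.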
CONSTRAINT=YES | NO
specifies whether to copy all integrity constraints when copying a data set.
Default: NO
Tip: For data sets with integrity constraints that have a foreign key, the COPY statement copies the general and referential constraints if CONSTRAINT=YES is specified and the entire library is copied. If you use the SELECT or EXCLUDE statement to copy the data sets, then the referential integrity constraints are not copied. For more information, see Understanding Integrity Constraints in SAS Language Reference: Concepts.
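As a minimal sketch of the tip above, the following step copies an entire library with its integrity constraints. The librefs SOURCE and DEST are hypothetical and are assumed to be assigned already:

```sas
proc datasets library=source nolist;
   /* No SELECT or EXCLUDE statement is used, so the whole library   */
   /* is copied and referential constraints can be copied as well.   */
   copy out=dest constraint=yes;
run;
quit;
```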
DATECOPY
copies the SAS internal date and time when the SAS file was created and the date and time when it was last modified to the resulting copy of the file. Note that the operating environment date and time are not preserved.
Restrictions: DATECOPY cannot be used with encrypted files or catalogs.

DATECOPY can be used only when the resulting SAS file uses the V8 or V9 engine.

Tips: You can alter the file creation date and time with the DTC= option in the MODIFY statement. See MODIFY statement.

If the file that you are copying has attributes that require additional processing, the last modified date is changed to the current date. For example, when you copy a data set that has an index, the index must be rebuilt, and the last modified date changes to the current date. Other attributes that require additional processing and that could affect the last modified date include integrity constraints and a sort indicator.
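The following sketch shows one way to use DATECOPY and then inspect the result. The librefs SOURCE and DEST and the member name SALES are hypothetical:

```sas
proc datasets library=source nolist;
   /* Carry the SAS internal creation and last-modified dates to the copy. */
   copy out=dest datecopy;
      select sales;
run;
quit;

/* PROC CONTENTS reports the created and last modified dates of the copy. */
proc contents data=dest.sales;
run;
```

If SALES has an index, integrity constraints, or a sort indicator, the last modified date in the report might still show the current date, for the reason given in the tip above.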

FORCE
allows you to use the MOVE option for a SAS data set on which an audit trail exists.
Note: The AUDIT file is not moved with the audited data set.
INDEX=YES | NO
specifies whether to copy all indexes for a data set when copying the data set to another SAS library.
Default: YES
MEMTYPE=(mtype-1 <...mtype-n>)
restricts processing to one or more member types.
Alias: MT=, MTYPE=
Default: If you omit MEMTYPE= in the PROC DATASETS statement, the default is MEMTYPE=ALL.
Note: When PROC COPY processes a SAS library on tape and the MEMTYPE= option is not specified, it scans the entire sequential library for entries until it reaches the end-of-file. If the sequential library is a multivolume tape, all tape volumes are mounted. This behavior is also true for single-volume tape libraries.
MOVE
moves SAS files from the input data library (named with the IN= option) to the output data library (named with the OUT= option) and deletes the original files from the input data library.
Restriction: The MOVE option can be used to delete a member of a SAS library only if the IN= engine supports the deletion of tables. A tape format engine does not support table deletion. If you use a tape format engine, SAS suppresses the MOVE operation and prints a warning.
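A minimal sketch of moving an alter-protected data set follows. The librefs SOURCE and DEST, the member name PAYROLL, and the password SECRET are all hypothetical:

```sas
proc datasets library=source nolist;
   /* MOVE copies PAYROLL to DEST and then deletes it from SOURCE.   */
   /* ALTER= supplies the password that the deletion requires.       */
   copy out=dest move alter=secret;
      select payroll;
run;
quit;
```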
OVERRIDE=(ds_option-1=value-1 <...ds_option-n=value-n>)
overrides specified output data set options copied from the input data set. Some data set options might not be appropriate in the output data set context of COPY.
Restriction: If the NOCLONE option is specified, the OVERRIDE= option is ignored for the attributes that NOCLONE controls. However, OVERRIDE= can still be used to modify other data set attributes.
Tip: When copying a data set stored in another host data representation or encoding, the default (or CLONE) behavior of COPY is to preserve the other host data representation or encoding in the new copy of the data set. By specifying OVERRIDE=(OUTREP=session ENCODING=session) in the COPY statement, the new copy of the data set is created in the host data representation and encoding of the SAS session that is executing the COPY.
NOCLONE
See the description of CLONE | NOCLONE.
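The OVERRIDE= tip above can be sketched as follows. The librefs SOURCE and DEST and the member name LEGACY are hypothetical; LEGACY is assumed to be stored in another host's data representation or encoding:

```sas
proc datasets library=source nolist;
   /* CLONE is the default, so the other attributes of LEGACY are    */
   /* kept, but the copy is written in the data representation and   */
   /* encoding of the current SAS session.                           */
   copy out=dest override=(outrep=session encoding=session);
      select legacy;
run;
quit;
```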

Details

Using the Block I/O Method to Copy

The block I/O method is used to copy blocks of data instead of one observation at a time. This method can increase performance when you are copying large data sets. SAS determines whether to use this method. Not all data sets can use the block I/O method. There are restrictions set by the COPY statement and the Base SAS engine.
To display information in the SAS log about the copy method that is being used, you can specify the MSGLEVEL= system option as follows:
options msglevel=i;
The following message is written to the SAS log, if the block I/O method is not used:
INFO: Data set block I/O cannot be used because:
If the COPY statement determines that the block I/O will not be used, one of the following explanations is written to the SAS log:
INFO: - The data sets use different engines, have different variables or have attributes that might differ.
INFO: - There is no member level locking.
INFO: - The OBS option is active.
INFO: - The FIRSTOBS option is active.
If the Base SAS engine determines that the block I/O method will not be used, one of the following explanations is written to the SAS log:
INFO: - Referential Integrity Constraints exist.
INFO: - Cross Environment Data Access is being used.
INFO: - The file is compressed.
INFO: - The file has an audit file which is not suspended.
If you are having performance issues and want to create a subset of a large data set for testing, you can use the OBS=0 system option. Because the OBS option is active, the block I/O method is not used, which reduces the use of system resources.
The following example uses the OBS=0 option to reduce the use of system resources:
options obs=0 msglevel=i;
proc copy in=old out=lib;
   select a;
run;
options obs=max;   /* reset OBS= so that later steps read all observations */
You get the same results when you use the SET statement:
data lib.new;
    if 0 then set old.a;
    stop;
run;

Copying an Entire Library

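To copy an entire SAS library, specify an input data library and an output data library, and do not use a SELECT or EXCLUDE statement. For example, the following statements copy all the SAS files in the SOURCE data library into the DEST data library. Because IN= is omitted from the COPY statement, the procedure input library (SOURCE) is used as the input library: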
proc datasets library=source;
   copy out=dest;
run;

Copying Selected SAS Files

To copy selected SAS files, use a SELECT or EXCLUDE statement. For more discussion of using the COPY statement with a SELECT or an EXCLUDE statement, see Specifying Member Types When Copying or Moving SAS Files and see Manipulating SAS Files for an example. Also, see EXCLUDE statement and SELECT statement.
You can also select or exclude an abbreviated list of members. For example, the following statement selects members TABS, TEST1, TEST2, and TEST3:
select tabs test1-test3;
Also, you can select a group of members whose names begin with the same letter or letters by entering the common letters followed by a colon (:). For example, you can select the four members in the previous example and all other members having names that begin with the letter T by specifying the following statement:
select t:;
You specify members to exclude in the same way that you specify those to select. That is, you can list individual member names, use an abbreviated list, or specify a common letter or letters followed by a colon (:). For example, the following statement excludes the members STATS, TEAMS1, TEAMS2, TEAMS3, TEAMS4, and all the members that begin with the letters RBI from the copy operation:
exclude stats teams1-teams4 rbi:;
Note that the MEMTYPE= option affects which types of members are available to be selected or excluded.
When a SELECT or EXCLUDE statement is used with CONSTRAINT=YES, only the general integrity constraints on the data sets are copied. Any referential integrity constraints are not copied. For more information, see “Understanding Integrity Constraints” in SAS Language Reference: Concepts.
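The SELECT and EXCLUDE fragments in this section can be placed in complete steps as sketched below. The librefs SOURCE and DEST are hypothetical, and the two COPY statements are independent alternatives rather than a required sequence:

```sas
proc datasets library=source nolist;
   /* Alternative 1: copy only TABS, TEST1, TEST2, and TEST3. */
   copy out=dest memtype=data;
      select tabs test1-test3;
run;

   /* Alternative 2: copy every data set except the excluded members. */
   copy out=dest memtype=data;
      exclude stats teams1-teams4 rbi:;
run;
quit;
```

Each COPY statement here carries either a SELECT or an EXCLUDE statement, not both.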

Specifying Member Types When Copying or Moving SAS Files

The MEMTYPE= option in the COPY statement differs from the MEMTYPE= option in other statements in the procedure in several ways:
  • A slash does not precede the option.
  • You cannot limit its effect to the member immediately preceding it by enclosing the MEMTYPE= option in parentheses.
  • The SELECT and EXCLUDE statements and the IN= option (in the COPY statement) affect the behavior of the MEMTYPE= option in the COPY statement according to the following rules:
    1. MEMTYPE= in a SELECT or EXCLUDE statement takes precedence over the MEMTYPE= option in the COPY statement. The following statements copy only VISION.CATALOG and NUTR.DATA from the default data library to the DEST data library; for VISION, the MEMTYPE= value in the SELECT statement overrides the MEMTYPE= value in the COPY statement.
      proc datasets;
         copy out=dest memtype=data;
            select vision(memtype=catalog) nutr;
      run;
    2. If you do not use the IN= option, or you use it to specify the library that happens to be the procedure input library, the value of the MEMTYPE= option in the PROC DATASETS statement limits the types of SAS files that are available for processing. The procedure uses the order of precedence described in rule 1 to further subset the types available for copying. The following statements do not copy any members from the default data library to the DEST data library. Instead, the procedure issues an error message because the MEMTYPE= value specified in the SELECT statement is not one of the values of the MEMTYPE= option in the PROC DATASETS statement.
      /* This step fails! */
      proc datasets memtype=(data program);
         copy out=dest;
            select apples / memtype=catalog;
      run;
    3. If you specify an input data library in the IN= option other than the procedure input library, the MEMTYPE= option in the PROC DATASETS statement has no effect on the copy operation. Because no subsetting has yet occurred, the procedure uses the order of precedence described in rule 1 to subset the types available for copying. The following statements successfully copy BODYFAT.DATA to the DEST data library because the SOURCE library specified in the IN= option in the COPY statement is not affected by the MEMTYPE= option in the PROC DATASETS statement.
      proc datasets library=work memtype=catalog;
         copy in=source out=dest;
            select bodyfat / memtype=data;
      run;

Copying Views

The COPY statement with NOCLONE specified supports the OUTREP= and ENCODING= LIBNAME options for SQL views, DATA step views, and some SAS/ACCESS views (Oracle and Sybase). When you use the COPY statement with Remote Library Services (RLS) such as SAS/SHARE or SAS/CONNECT, the COPY statement defaults to the encoding and data representation of the output library.
CAUTION:
If you use the DATA statement's SOURCE=NOSAVE option when creating a DATA step view, the view cannot be copied from one version of SAS to another version.
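A minimal sketch of copying views with NOCLONE follows. The path, the librefs SOURCE and DEST, and the OUTREP= and OUTENCODING= values are hypothetical, and SOURCE is assumed to be assigned already:

```sas
/* Hypothetical output library with explicit representation and encoding. */
libname dest '/data/dest' outrep=windows_64 outencoding='wlatin1';

proc datasets library=source nolist;
   /* Restrict the copy to views and let the DEST LIBNAME options */
   /* determine the data representation and encoding.             */
   copy out=dest noclone memtype=view;
run;
quit;
```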

Copying Password-Protected SAS Files

You can copy a password-protected SAS file without specifying the password. However, because the password continues to correspond to the SAS file, you must know the password in order to access and manipulate the SAS file after you copy it.
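The behavior described above might look like the following sketch. The librefs SOURCE and DEST, the member name SECURE, and the password GREEN are hypothetical:

```sas
/* Create a read-protected data set. */
data source.secure(read=green);
   x = 1;
run;

proc datasets library=source nolist;
   /* The copy itself does not require the password. */
   copy out=dest;
      select secure;
run;
quit;

/* Reading the copy still requires the original password. */
proc print data=dest.secure(read=green);
run;
```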

Copying Data Sets with Long Variable Names

If the VALIDVARNAME=V6 system option is set and the data set has long variable names, the long variable names are truncated, unique variable names are generated, and the copy succeeds. The same is true for index names. If VALIDVARNAME=ANY, the copy fails with an error if the OUT= engine does not support long variable names.
When a variable name is truncated, it is shortened to eight bytes. If the shortened name has already been defined in the data set, the name is shortened further and a digit is added, starting with the number 2. The process of truncating and adding a digit continues until the variable name is unique. For example, a variable named LONGVARNAME becomes LONGVARN, provided that a variable with that name does not already exist in the data set. If LONGVARN already exists, the variable name becomes LONGVAR2.
CAUTION:
Truncated variable names can collide with names already defined in the input data set.
This behavior is possible when the variable name that is already defined is exactly eight bytes long and ends in a digit. In the following example, the truncated name is defined in the output data set and the name from the input data set is changed:
options validvarname=any;
data test;
   longvar10='aLongVariableName';
   retain longvar1-longvar5 0;
run;

options validvarname=v6;
proc copy in=work out=sasuser;
   select test;
run;
In this example, LONGVAR10 is truncated to LONGVAR1 and placed in the output data set. Next, the original LONGVAR1 is copied. Its name is no longer unique. Therefore, it is renamed LONGVAR2. The other variables in the input data set are also renamed according to the renaming algorithm. The following output is from the SAS log:
1    options validvarname=any;
2    data test;
3       longvar10='aLongVariableName';
4       retain longvar1-longvar5 0;
5    run;

NOTE: The data set WORK.TEST has 1 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           2.60 seconds
      cpu time            0.07 seconds


6
7    options validvarname=v6;
8    proc copy in=work out=sasuser;
9       select test;
10   run;

NOTE: Copying WORK.TEST to SASUSER.TEST (memtype=DATA).
NOTE: The variable name longvar10 has been truncated to longvar1.
NOTE: The variable longvar1 now has a label set to longvar10.
NOTE: Variable LONGVAR1 already exists on file SASUSER.TEST, using LONGVAR2
instead.
NOTE: The variable LONGVAR2 now has a label set to LONGVAR1.
NOTE: Variable LONGVAR2 already exists on file SASUSER.TEST, using LONGVAR3
instead.
NOTE: The variable LONGVAR3 now has a label set to LONGVAR2.
NOTE: Variable LONGVAR3 already exists on file SASUSER.TEST, using LONGVAR4
instead.
NOTE: The variable LONGVAR4 now has a label set to LONGVAR3.
NOTE: Variable LONGVAR4 already exists on file SASUSER.TEST, using LONGVAR5
instead.
NOTE: The variable LONGVAR5 now has a label set to LONGVAR4.
NOTE: Variable LONGVAR5 already exists on file SASUSER.TEST, using LONGVAR6
instead.
NOTE: The variable LONGVAR6 now has a label set to LONGVAR5.
NOTE: There were 1 observations read from the data set WORK.TEST.
NOTE: The data set SASUSER.TEST has 1 observations and 6 variables.
NOTE: PROCEDURE COPY used (Total process time):
      real time           13.18 seconds
      cpu time            0.31 seconds


11
12   proc print data=test;
13   run;

ERROR: The value LONGVAR10 is not a valid SAS name.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.15 seconds
      cpu time            0.01 seconds

Using the COPY Procedure Instead of the COPY Statement

Generally, the COPY procedure functions the same as the COPY statement in the DATASETS procedure. The following is a list of differences:
  • The IN= argument is required with PROC COPY. In the COPY statement, IN= is optional. If omitted, the default value is the libref of the procedure input library.
  • PROC DATASETS cannot work with libraries that allow only sequential data access.
  • The COPY statement honors the NOWARN option, but PROC COPY does not.
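The first difference in the list above can be seen by writing the same copy both ways. The librefs SOURCE and DEST and the member name A are hypothetical:

```sas
/* COPY procedure: IN= must be specified. */
proc copy in=source out=dest memtype=data;
   select a;
run;

/* COPY statement: IN= is omitted, so the procedure input library */
/* (SOURCE) is used as the input library.                          */
proc datasets library=source nolist;
   copy out=dest memtype=data;
      select a;
run;
quit;
```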

Copying Generation Groups

You can use the COPY statement to copy an entire generation group. However, you cannot copy a specific version in a generation group.
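The following sketch builds a small generation group and copies it as a whole. The librefs SOURCE and DEST and the member name A are hypothetical, and the sketch assumes that both engines support generation data sets:

```sas
/* Request up to three generations for A. */
data source.a(genmax=3);
   x = 1;
run;

/* Replacing A keeps the previous version as a historical generation. */
data source.a;
   x = 2;
run;

proc datasets library=source nolist;
   /* Selecting A copies the generation group as a unit. */
   copy out=dest;
      select a;
run;
quit;
```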

Transporting SAS Data Sets between Hosts

You use the COPY procedure, along with the XPORT engine or a REMOTE engine, to transport SAS data sets between hosts. See “Strategies for Moving and Accessing SAS Files” in Moving and Accessing SAS Files for more information.
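A minimal sketch of the XPORT approach follows. The file path, the librefs SOURCE, TRANS, and DEST, and the member name A are hypothetical:

```sas
/* On the sending host: write A to a transport file. */
libname trans xport '/tmp/transfer.xpt';

proc copy in=source out=trans memtype=data;
   select a;
run;

/* On the receiving host, after the file has been transferred: */
/* read the transport file into a native library.              */
proc copy in=trans out=dest;
run;
```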