SAS File Processing with CEDA

What Types of Processing Does CEDA Support?

CEDA supports SAS 7 and later SAS files that are created in directory-based operating environments like UNIX, Windows, and OpenVMS. CEDA provides the following SAS file processing for these SAS engines:
  • BASE: the default Base SAS engine for SAS 9 (V9), SAS 8 (V8), and SAS 7 (V7).
  • SASESOCK: the TCP/IP port engine for SAS/CONNECT software.
  • TAPE: the sequential engine for SAS 9 (V9TAPE), SAS 8 (V8TAPE), and SAS 7 (V7TAPE).
SAS File Processing Provided by CEDA

SAS File Type                          Engine                 Supported Processing
-------------------------------------  ---------------------  --------------------
SAS data file                          BASE, TAPE, SASESOCK   input and output [1]
PROC SQL view                          BASE                   input
SAS/ACCESS view for Oracle or Sybase   BASE                   input
MDDB file [2]                          BASE                   input

[1] For output processing that replaces an existing SAS data file, there are behavioral differences. For more information, see Behavioral Differences for Output Processing.
[2] CEDA supports SAS 8 and later MDDB files.

Behavioral Differences for Output Processing

For output processing that replaces an existing SAS data file, the BASE and TAPE engines behave differently regarding the following attributes:
encoding
  • The BASE engine uses the encoding of the file from the source library. That is, the encoding is cloned.
  • The TAPE engine uses the current SAS session encoding.
  • For both the BASE and TAPE engines, by default PROC COPY uses the encoding of the file from the source library. If, instead, you want to use the encoding of the current SAS session, specify the NOCLONE option. If you want to use a different encoding, specify the NOCLONE option and the ENCODING= option. When you use PROC COPY with SAS/SHARE or SAS/CONNECT, the default behavior is to use the encoding of the current SAS session.
data representation
  • The BASE and TAPE engines use the data representation of the current SAS session, except with PROC COPY.
  • For both the BASE and TAPE engines, by default PROC COPY uses the data representation of the file from the source library. If, instead, you want to use the data representation of the current SAS session, specify the NOCLONE option. If you want to use a different data representation, specify the NOCLONE option and the OUTREP= option. When you use PROC COPY with SAS/SHARE or SAS/CONNECT, the default behavior is to use the data representation of the current SAS session.
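The rules above can be summarized as a small decision function. The following Python sketch is only a model of the behavior described in this section, not SAS code. The names FileAttrs, replace_output_attrs, and proc_copy_output_attrs are hypothetical, the encoding names in the usage lines are illustrative, and the sketch covers only the option combinations that the text describes.

```python
from typing import NamedTuple, Optional


class FileAttrs(NamedTuple):
    """Encoding and data representation of a file or of a SAS session."""
    encoding: str
    data_representation: str


def replace_output_attrs(engine: str, source: FileAttrs, session: FileAttrs) -> FileAttrs:
    """Attributes of a replaced data file when PROC COPY is not involved."""
    engine = engine.upper()
    if engine == "BASE":
        # BASE clones the source encoding but uses the session data representation.
        return FileAttrs(source.encoding, session.data_representation)
    if engine == "TAPE":
        # TAPE uses the session encoding and the session data representation.
        return FileAttrs(session.encoding, session.data_representation)
    raise ValueError(f"engine not covered by this sketch: {engine}")


def proc_copy_output_attrs(
    source: FileAttrs,
    session: FileAttrs,
    *,
    noclone: bool = False,
    remote: bool = False,
    encoding: Optional[str] = None,
    outrep: Optional[str] = None,
) -> FileAttrs:
    """Attributes of a file written by PROC COPY (BASE or TAPE engine).

    remote=True stands for PROC COPY used with SAS/SHARE or SAS/CONNECT.
    """
    if (encoding or outrep) and not noclone:
        # The text describes ENCODING= and OUTREP= only together with NOCLONE,
        # so this sketch declines to guess at other combinations.
        raise ValueError("ENCODING=/OUTREP= are modeled only with NOCLONE")
    # Default: clone the source file. NOCLONE, or a copy through SAS/SHARE or
    # SAS/CONNECT, falls back to the attributes of the current session.
    base = session if (noclone or remote) else source
    return FileAttrs(encoding or base.encoding, outrep or base.data_representation)


# Usage with illustrative values.
source_file = FileAttrs("wlatin1", "WINDOWS_32")
current_session = FileAttrs("utf-8", "LINUX_X86_64")
print(replace_output_attrs("BASE", source_file, current_session))
print(proc_copy_output_attrs(source_file, current_session))
print(proc_copy_output_attrs(source_file, current_session, noclone=True, outrep="SOLARIS_64"))
```

Keeping the PROC COPY rules in a separate function mirrors the text, which treats PROC COPY as an exception to the per-engine defaults.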

Restrictions for CEDA

CEDA has the following restrictions:
  • CEDA does not support DATA step views, SAS/ACCESS views that are not for SAS/ACCESS for Oracle or Sybase, SAS catalogs, stored compiled DATA step programs, item stores, DMDB files, FDB files, or any SAS file that was created prior to SAS 7.
  • Update processing is not supported.
  • Integrity constraints cannot be read or updated.
  • An audit trail file can be read, but it cannot be updated.
  • Indexes are not supported. Therefore, WHERE optimization with an index is not supported.
  • On z/OS, members of UNIX file system libraries can be created using any SAS data representation. However, when bound libraries are created, they are assigned the data representation of the SAS session that creates the library. SAS does not allow the creation of bound library members with a data representation that differs (except for the character encoding) from the data representation of the library. For example, if you create a bound library with 31-bit SAS on z/OS, the library has a data representation of MVS_32 for the duration of its existence. You cannot use the OUTREP= option of the LIBNAME statement to create a member in the library with a data representation other than MVS_32. For more information about library implementation types for BASE and sequential engines on z/OS, see SAS Companion for z/OS.
  • Because the BASE engine translates the data as the data is read, each procedure that reads the file causes SAS to read and translate the data again. Running multiple procedures against the same file can therefore affect system performance.
  • If a data set is damaged, CEDA cannot process the file in order to repair it. CEDA does not support update processing, which is required in order to repair a damaged data set. To repair the file, you must move it back to the environment where it was created or a compatible environment that does not invoke CEDA processing. For information about how to repair a damaged data set, see the REPAIR statement in the DATASETS procedure in Base SAS Procedures Guide.
  • Transcoding could result in character data loss when encodings are incompatible. For information about encoding and transcoding, see the SAS National Language Support (NLS): Reference Guide.
  • Loss of precision can occur in numeric variables when you move data between operating environments. If a numeric variable is defined with a short length, you can try increasing the length of the variable. Full-size numeric variables are less likely to encounter a loss of precision with CEDA. For more information, see Numeric Precision in SAS Software.
  • Numeric variables have a minimum length of either 2 or 3 bytes, depending on the operating environment. In an operating environment that supports a minimum of 3 bytes (such as Windows or UNIX), CEDA cannot process a numeric variable that was created with a length of 2 bytes (for example, in z/OS). If you encounter this restriction, then use the XPORT engine or the CPORT and CIMPORT procedures instead of CEDA.
Note: If you encounter these restrictions because your files were created under a previous version of SAS, consider using the MIGRATE procedure, which is documented in the Base SAS Procedures Guide. PROC MIGRATE retains many features, such as integrity constraints, indexes, and audit trails.

Understanding When CEDA Is Used to Process a File

Because CEDA translation is transparent, you might not be aware when CEDA is being used. Knowing when CEDA is used can be helpful because CEDA translation might require additional resources.
Starting in SAS 9, SAS writes a message by default to the log when CEDA is used. Here is an example:
Note: Data file HEALTH.GRADES.DATA is in a format that is native to another
host, or the file encoding does not match the session encoding. Cross
Environment Data Access will be used, which might require additional CPU
resources and might reduce performance.
CEDA is used in these situations:
  • when the encoding of character values for the SAS file is incompatible with the currently executing SAS session encoding.
  • when the data representation of the SAS file is incompatible with the data representation of the currently executing SAS session. For example, an incompatibility can occur if you move a file from an operating environment like Windows to an operating environment like UNIX, or if you have upgraded to 64-bit UNIX from 32-bit UNIX.
    In the following table, each row contains a group of operating environments that are compatible with each other. CEDA is used only when you create a file with a data representation in one row and process the file under a data representation of another row. (Environments are named by the operating system and the platform on which SAS is executed.)
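Compatibility across Environments

Data Representation Value   Environment
--------------------------  ------------------------------------------------------------------------------
ALPHA_TRU64                 Tru64 UNIX [1]
LINUX_IA64                  Linux for Itanium-based systems [1]
LINUX_X86_64                Linux for x64 [1]
SOLARIS_X86_64              Solaris for x64 [1]
--------------------------  ------------------------------------------------------------------------------
ALPHA_VMS_32                OpenVMS Alpha [2]
--------------------------  ------------------------------------------------------------------------------
ALPHA_VMS_64                OpenVMS Alpha [2]
VMS_IA64                    OpenVMS on HP Integrity [2]
--------------------------  ------------------------------------------------------------------------------
HP_IA64                     HP-UX for the Itanium Processor Family Architecture
HP_UX_64                    HP-UX for PA-RISC, 64-bit
RS_6000_AIX_64              AIX
SOLARIS_64                  Solaris for SPARC
--------------------------  ------------------------------------------------------------------------------
HP_UX_32                    HP-UX for PA-RISC
MIPS_ABI                    MIPS ABI
RS_6000_AIX_32              AIX
SOLARIS_32                  Solaris for SPARC
--------------------------  ------------------------------------------------------------------------------
LINUX_32                    Linux for Intel architecture
INTEL_ABI                   ABI for Intel architecture
--------------------------  ------------------------------------------------------------------------------
MVS_32                      31-bit SAS on z/OS
--------------------------  ------------------------------------------------------------------------------
MVS_64_BFP                  64-bit SAS on z/OS
--------------------------  ------------------------------------------------------------------------------
OS2                         OS/2 for Intel
--------------------------  ------------------------------------------------------------------------------
VAX_VMS                     OpenVMS VAX
--------------------------  ------------------------------------------------------------------------------
WINDOWS_32                  32-bit SAS on Microsoft Windows [3]
WINDOWS_64                  64-bit SAS on Microsoft Windows (for both Itanium-based systems and x64) [3]

[1] Although all four of the environments in this group are compatible, catalogs are an exception. Catalogs are compatible between Tru64 UNIX and Linux for Itanium. Catalogs are compatible between Linux for x64 and Solaris for x64.
[2] Although these OpenVMS environments have different data representations for some compiler types, SAS data sets that are created by the BASE engine do not store the data types that are different. Therefore, if the encoding is compatible, CEDA is not used between these environments. However, note that SAS 9 does not support SAS 8 catalogs from OpenVMS. You can migrate the catalogs with the MIGRATE procedure. For more information, see the Base SAS Procedures Guide.
[3] Although these Windows environments are compatible, catalogs are an exception. Catalogs are not compatible between 32-bit and 64-bit SAS for Windows.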

Determining Whether Update Processing Is Allowed

If a file's data representation is the same as the data representation of the processing environment, and if the encoding is compatible with the currently executing SAS session encoding, then you can manually update the file, because CEDA is not needed in order to translate the file. For example, if a file was created in a 64-bit Solaris environment or if the OUTREP= option was used to designate the file with that data representation, then you can update the file in a 64-bit SAS session on Solaris for SPARC, HP-UX, or AIX.
Otherwise, if CEDA is used to translate the file, you cannot update the file. If you attempt to update the file, then you receive an error message stating that updating is not allowed. For example:
ERROR: File HEALTH.OXYGEN cannot be updated because its encoding does 
not match the session encoding or the file is in a format native to another  
host, such as SOLARIS_32, HP_UX_32, RS_6000_AIX_32,MIPS_ABI.
To determine the data representation and the encoding of a file, you can use the CONTENTS procedure (or the CONTENTS statement in PROC DATASETS). For example, the data set HEALTH.OXYGEN was created in a UNIX environment in SAS 9. The file was moved to a SAS 9 Windows environment, in which the following CONTENTS output was requested:
[Figure: CONTENTS Output Showing Data Representation]
In the output, the Data Representation field lists HP_UX_64, RS_6000_AIX_64, SOLARIS_64, HP_IA64, and the Encoding field shows latin1 Western (ISO).
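The check described in this section can be sketched in a few lines of Python. This is an illustrative model only, with a hypothetical function name update_allowed. It assumes that the Data Representation field lists every value that is compatible with the file, as in the example above; a compatible value that the field does not list is not recognized by this sketch.

```python
def update_allowed(listed_representations: str, session_rep: str,
                   encoding_compatible: bool = True) -> bool:
    """Return True when the file can be updated without CEDA translation.

    listed_representations is the comma-separated text of the Data
    Representation field from the CONTENTS output.
    """
    # Split the field into individual data representation values.
    listed = {
        name.strip().upper()
        for name in listed_representations.split(",")
        if name.strip()
    }
    # Updating requires both a matching data representation and a
    # compatible encoding; otherwise CEDA is used and updates are refused.
    return encoding_compatible and session_rep.strip().upper() in listed


# Usage with the values from the HEALTH.OXYGEN example.
field = "HP_UX_64, RS_6000_AIX_64, SOLARIS_64, HP_IA64"
print(update_allowed(field, "WINDOWS_32"))   # Windows session
print(update_allowed(field, "SOLARIS_64"))   # 64-bit Solaris session
```

For the example file, a Windows session is not among the listed values, which is consistent with the update error shown earlier in this section.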