Processing Data Using Cross-Environment Data Access (CEDA)
What Types of Processing Does CEDA Support?
CEDA supports SAS 7 and later SAS files that are created in directory-based operating environments like UNIX, Windows, and OpenVMS. CEDA provides the following SAS file processing for these SAS engines:
BASE: default Base SAS engine for SAS 9 (V9), SAS 8 (V8), and SAS 7 (V7).
SASESOCK: TCP/IP port engine for SAS/CONNECT software.
TAPE: sequential engine for SAS 9 (V9TAPE), SAS 8 (V8TAPE), and SAS 7 (V7TAPE).
SAS File Type | Engine | Supported Processing |
---|---|---|
SAS data file | BASE, TAPE, SASESOCK | input and output (table note 1) processing |
PROC SQL view | BASE | input processing |
SAS/ACCESS view for Oracle or Sybase | BASE | input processing |
MDDB file (table note 2) | BASE | input processing |
TABLE NOTE 1: For output processing that replaces an existing SAS data file, there are behavioral differences. For more information, see Behavioral Differences for Output Processing.
TABLE NOTE 2: CEDA supports SAS 8 and later MDDB files.
Behavioral Differences for Output Processing
For output processing that replaces an existing SAS data file, the BASE and TAPE engines differ in how they handle the following attributes:
Encoding:
The BASE engine uses the encoding of the existing file; that is, the encoding is cloned.
The TAPE engine uses the current SAS session encoding.
For both the BASE and TAPE engines, the COPY procedure uses the encoding of the file from the source library (that is, the file being copied), regardless of whether the file already exists in the target library.
Data representation:
The BASE and TAPE engines use the data representation of the native environment, except with the COPY procedure. By default, PROC COPY uses the data representation of the file that is copied. To make PROC COPY write the file in the data representation of the target operating environment instead of that of the source file, specify the NOCLONE option.
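The PROC COPY behavior described above can be sketched as follows. This is a minimal illustration only: the librefs SOURCE and TARGET, the directory paths, and the data set name GRADES are hypothetical placeholders.

```sas
/* Hypothetical library locations; substitute your own paths. */
libname source '/hypothetical/path/source';
libname target '/hypothetical/path/target';

/* Without NOCLONE, PROC COPY would keep the data representation   */
/* of SOURCE.GRADES. With NOCLONE, the copy is written in the data */
/* representation of the environment that runs this step.          */
proc copy in=source out=target noclone;
   select grades;
run;
```

NOCLONE also changes how PROC COPY treats the other attributes that it clones by default, so check the PROC COPY documentation before relying on it for a production copy.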
Restrictions for CEDA
CEDA has the following restrictions:
CEDA does not support DATA step views, SAS/ACCESS views other than those for Oracle or Sybase, SAS catalogs, stored compiled DATA step programs, item stores, DMDB files, FDB files, or any SAS file that was created before SAS 7.
Update processing is not supported.
Integrity constraints cannot be read or updated.
An audit trail file cannot be updated but it can be read.
Indexes are not supported. Therefore, WHERE optimization with an index is not supported.
On z/OS, only UNIX file system libraries fully support CEDA. SAS can create bound library members on z/OS with a character encoding that differs from the default. CEDA is used in creating such a member as well as in any subsequent attempts to read the member. However, for bound library members, the character encoding is the only aspect of the data representation that might differ from the default. For example, it is not possible, using the OUTREP option of a LIBNAME statement, to create a member in data representation for another host besides z/OS. For more information about the various types of libraries that are supported by SAS on z/OS, see Library Implementation Types for Base and Sequential Engines in SAS Companion for z/OS.
Because the BASE engine translates the data each time the data is read, running multiple procedures against the same file requires SAS to read and translate the data multiple times. This repeated translation can affect system performance.
If a foreign data set is damaged, CEDA cannot process the file in order to repair it. CEDA does not support update processing, which is required in order to repair a damaged data set. To repair the foreign file, you must move it back to its native environment. For information on how to repair a damaged data set, see the REPAIR statement in the DATASETS procedure in Base SAS Procedures Guide.
Transcoding could result in character data loss when encodings are incompatible. For information about encoding and transcoding, see the SAS National Language Support (NLS): Reference Guide.
Loss of precision can occur in numeric variables when you move data between operating environments. If a numeric variable is defined with a short length, you can try increasing the length of the variable. Full-size numeric variables are less likely to encounter a loss of precision with CEDA. For more information, see Numeric Precision in SAS Software.
Numeric variables have a minimum length of either 2 or 3 bytes, depending on the operating environment. In an operating environment that supports a minimum of 3 bytes (such as Windows or UNIX), CEDA cannot process a numeric variable that was created with a length of 2 bytes (for example, in z/OS). If you encounter this restriction, then use the XPORT engine or the CPORT and CIMPORT procedures instead of CEDA.
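For the last restriction, the CPORT and CIMPORT route might look like the sketch below. The librefs, the fileref TRANFILE, and all paths are hypothetical. The two steps run in different SAS sessions: the first in the environment where the file is native, the second in the environment that receives it.

```sas
/* Step 1: run in the source environment.              */
/* Hypothetical locations; substitute your own.        */
libname src 'hypothetical-source-library';
filename tranfile 'hypothetical-transport-file';

/* Write every data set in SRC to a transport file. */
proc cport library=src file=tranfile memtype=data;
run;

/* Step 2: run in the target environment after the     */
/* transport file has been transferred in binary mode. */
libname tgt '/hypothetical/path/target';
filename tranfile '/hypothetical/path/transport.cpt';

/* Re-create the data sets in the target library. */
proc cimport library=tgt infile=tranfile;
run;
```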
Note: If you encounter these restrictions because your files were created under a previous version of SAS, consider using the MIGRATE procedure, which is documented in the Base SAS Procedures Guide. PROC MIGRATE retains many features, such as integrity constraints, indexes, and audit trails.
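A minimal PROC MIGRATE call has the following shape. The librefs OLDLIB and NEWLIB and their paths are hypothetical, and the sketch assumes that the target library is a separate, empty location.

```sas
/* Hypothetical locations: OLDLIB holds the files created under */
/* the earlier SAS version; NEWLIB is an empty target library.  */
libname oldlib '/hypothetical/path/old';
libname newlib '/hypothetical/path/new';

/* Migrate all supported members from OLDLIB to NEWLIB. */
proc migrate in=oldlib out=newlib;
run;
```

See the MIGRATE procedure documentation for the member types and options that apply to your source release.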
Understanding When CEDA Is Used to Process a File
Because CEDA translation is transparent, you might not be aware when CEDA is being used. Knowing when CEDA is used could be helpful because, for example, CEDA translation might require additional resources.
Starting in SAS 9, by default, SAS writes a message to the log when CEDA is used. Here is an example:
NOTE: Data file HEALTH.GRADES.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.
CEDA is used in these situations:
when the encoding of character values for the SAS file is incompatible with the currently executing SAS session encoding.
when the data representation of the SAS file is incompatible with the data representation of the currently executing SAS session. For example, an incompatibility can occur if you move a file from an operating environment like Windows to an operating environment like UNIX, or if you have upgraded to a 64-bit platform from a 32-bit platform.
In the following table, each row contains a group of operating environments that are compatible with each other. CEDA is used only when you create a file with a data representation in one row and process the file under a data representation of another row. (Environments are named by the operating system and the platform on which SAS is executed.)
Data Representation Value | Environment |
---|---|
ALPHA_TRU64; LINUX_IA64; LINUX_X86_64; SOLARIS_X86_64 | Tru64 UNIX (table note 1); Linux for Itanium (table note 1); Linux for x64 (table note 1); Solaris for x64 (table note 1) |
ALPHA_VMS_32 | OpenVMS Alpha (table note 2) |
ALPHA_VMS_64; VMS_IA64 | OpenVMS Alpha (table note 2); OpenVMS for HP Integrity servers 64-bit platform (table note 2) |
HP_IA64; HP_UX_64; RS_6000_AIX_64; SOLARIS_64 | HP-UX for Itanium on 64-bit platform; HP-UX for PA-RISC on 64-bit platform; AIX on 64-bit platform; Solaris on SPARC 64-bit platform |
HP_UX_32; MIPS_ABI; RS_6000_AIX_32; SOLARIS_32 | HP-UX on 32-bit platform; ABI UNIX on 32-bit platform; AIX on 32-bit platform; Solaris on SPARC 32-bit platform |
LINUX_32; INTEL_ABI | Linux for Intel architecture; ABI UNIX for Intel on 32-bit platform |
MVS_32 | z/OS on 32-bit platform |
OS2 | OS/2 for Intel on 32-bit platform |
VAX_VMS | OpenVMS VAX |
WINDOWS_32 | Microsoft Windows on 32-bit platform |
WINDOWS_64 | Microsoft Windows 64-Bit Edition (for both Itanium-based systems and x64) |
TABLE NOTE 1: Although all four of the environments in this group are compatible, catalogs are an exception:
Catalogs are compatible between Tru64 UNIX and Linux for Itanium.
Catalogs are compatible between Linux for x64 and Solaris for x64.
TABLE NOTE 2: Although these OpenVMS environments have different data representations for some compiler types, SAS data sets that are created by the BASE engine do not store the data types that are different. Therefore, if the encoding is compatible, CEDA is not used between these environments. However, note that SAS 9 does not support SAS 8 catalogs from OpenVMS. You can migrate the catalogs with the MIGRATE procedure. For more information on the MIGRATE procedure, see the Base SAS Procedures Guide.
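When the log note shows that CEDA is in use and several steps read the same foreign file, one option is to copy the file once into the native format so that later steps read it without translation. The following is a minimal sketch, not a prescribed procedure. The librefs FOREIGN and NATIVE, the paths, and the data set name GRADES are hypothetical.

```sas
/* Hypothetical locations: FOREIGN holds the file that was  */
/* created in another environment; NATIVE is a local library. */
libname foreign '/hypothetical/path/foreign';
libname native  '/hypothetical/path/native';

/* Read the foreign file once (CEDA translates it here) and  */
/* write a new file. A newly created file uses the native    */
/* data representation unless OUTREP= specifies otherwise.   */
data native.grades;
   set foreign.grades;
run;

/* Later steps read the native copy, so no translation occurs. */
proc means data=native.grades;
run;

proc freq data=native.grades;
run;
```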
Determining Whether Update Processing Is Allowed
If a file's data representation is the same as the data representation of the processing environment, and if the encoding is compatible with the currently executing SAS session encoding, then you can manually update the file, because CEDA is not needed in order to translate the file. For example, in a Windows environment, if a file was created in a Windows environment or if the OUTREP= option was used to designate the file in Windows data representation, then you can update the file.
Otherwise, if CEDA is used to translate the file, you cannot update it. If you attempt to update the file, then you will receive an error message that says that updating is not allowed. For example:
ERROR: File HEALTH.OXYGEN cannot be updated because its encoding does not match the session encoding or the file is in a format native to another host, such as SOLARIS_32, HP_UX_32, RS_6000_AIX_32, MIPS_ABI.
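The OUTREP= option mentioned earlier in this section can be specified either in the LIBNAME statement or as a data set option. The sketch below assumes a SAS session running in a UNIX environment that prepares a file for later updating in a 32-bit Windows environment. The libref HEALTH, the path, and the input data set WORK.OXYGEN are hypothetical.

```sas
/* Hypothetical location; every new file created through this */
/* libref is written in WINDOWS_32 data representation.       */
libname health '/hypothetical/path/health' outrep=windows_32;

data health.oxygen;
   set work.oxygen;
run;

/* Alternative: request the representation for one file only, */
/* using the OUTREP= data set option.                         */
data health.oxygen2 (outrep=windows_32);
   set work.oxygen;
run;
```

Whether the resulting file can then be updated without CEDA also depends on its encoding being compatible with the session encoding of the environment that processes it.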
To determine the data representation and the encoding of a file, you can use the CONTENTS procedure (or the CONTENTS statement in PROC DATASETS). For example, the data set HEALTH.OXYGEN was created in a UNIX environment in SAS 9. The file was moved to a SAS 9 Windows environment, in which the following CONTENTS output was requested:
CONTENTS Output Showing Data Representation
The SAS System                                                    1

The CONTENTS Procedure

Data Set Name        HEALTH.OXYGEN                           Observations           31
Member Type          DATA                                    Variables              7
Engine               V9                                      Indexes                0
Created              Wednesday, January 24, 2007 10:11:39    Observation Length     56
Last Modified        Wednesday, January 24, 2007 10:11:33    Deleted Observations   0
Protection                                                   Compressed             NO
Data Set Type                                                Sorted                 NO
Label
Data Representation  SOLARIS_32, HP_UX_32, RS_6000_AIX_32, MIPS_ABI
Encoding             latin1 Western (ISO)

Engine/Host Dependent Information

Data Set Page Size          5120
Number of Data Set Pages    1
First Data Page             1
Max Obs per Page            90
Obs in First Data Page      31
Number of Data Set Repairs  0
File Name                   /u/xxxxxx/myfiles/health/\oxygen.sas7bdat
Release Created             9.0200A0
Host Created                HP-UX

Alphabetic List of Variables and Attributes

#    Variable    Type    Len
1    AGE         Num     8
6    MAXPULSE    Num     8
7    OXYGEN      Num     8
4    RSTPULSE    Num     8
5    RUNPULSE    Num     8
3    RUNTIME     Num     8
2    WEIGHT      Num     8
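Output like the listing above could be requested with either of the two forms below. This is a sketch only; the libref HEALTH and its path are hypothetical.

```sas
/* Hypothetical location of the library that holds OXYGEN. */
libname health '/hypothetical/path/health';

/* Form 1: the CONTENTS procedure. */
proc contents data=health.oxygen;
run;

/* Form 2: the CONTENTS statement in PROC DATASETS. */
proc datasets library=health nolist;
   contents data=oxygen;
run;
quit;

/* Show the current session encoding for comparison with */
/* the Encoding value reported for the file.             */
proc options option=encoding;
run;
```

Compare the Data Representation and Encoding values in the output with those of the current session to judge whether CEDA will be needed, and therefore whether update processing is available.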
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.