SAS Institute. The Power to Know

SAS/ACCESS(R) 9.2 Interface to PC Files: Reference

Previous Page | Next Page

File Format-Specific Reference for the IMPORT and EXPORT Procedures

Microsoft Excel Workbook Files


Overview of Microsoft Excel Files Essentials

SAS/ACCESS Interface to PC Files Server works with Microsoft Excel workbook 5, 95, 97, 2000, 2002, 2003, 2007. Throughout this document these files are collectively referred to as .xls files, except for version 2007. Microsoft Excel workbook 2007 file formats are referred to as .xlsb, .xlsm, or .xlsx files. Files that are created with Excel 2007 can create and work with all Excel file formats.

An Excel file represents an Excel workbook. An Excel workbook is a collection of worksheets. A worksheet that was created with a version before Excel 2007 can contain up to 256 columns and 65,536 rows in an .xls file.

A cell is the intersection of a column and a row. It is referenced by a column number and a row number. For example, B5. A cell is the basic unit that saves data in the worksheet. A cell can contain a numeric value or a text value of up to 32,767 characters.

A range is a subset of cells in a worksheet. Its address identifies it: It begins with the name of the top-left cell and ends with the name of the bottom-right cell, separated by two periods. For example, the range B2..E8 is the range address for a rectangular block of 12 cells, where the top-left cell is B2 and the bottom-right cell is E8 (shown as shaded, below). A range name identifies a range.

Excel 2007 has been enhanced to support 16,384 columns and 1,048,576 rows in a worksheet. Files that are created with Excel 2007 can have have an .xlsb, .xlsx, or .xlsm extension. Due to the limitations of the ACE engine and the Jet engine, the SAS LIBNAME engine is limited to 255 columns in an Excel file.

A Range of Data in an Excel Worksheet

[Range of Data in an Excel Worksheet]

SAS/ACCESS Interface to PC Files Server treats an Excel workbook as a database, and a range (subset of cells in a worksheet) as a table. A range name must be defined in the Excel file before SAS can use it. A worksheet is treated as a special range. A worksheet name appended with a $ character is treated as a range.

For example, Sheet1 is a sheet name in an Excel file. SAS treats Sheet1$ as a valid range name and uses it to refer to the whole worksheet. You need to use SAS n-literal when referring to the sheet name. For example, 'Sheet1$'n. The first row of data in a range is usually treated as a column heading and used to create a SAS variables name.

  • Excel 4 files, only one spreadsheet is allowed per file

  • Excel 4, Excel 5, and Excel 95 limits are 256 columns, and 16,384 rows

  • Excel 97, 2000, 2002, 2003 limits are 256 columns, and 65,536 rows

  • Excel 2007 limits are 16,384 columns, and 1,048,576 rows.

  • Excel 95 files are treated as the same format as Excel 5 files.

  • Excel 2000, 2002, and 2003 files with an .xls file extension are treated as the same format as Excel 97 files.

  • Excel 2007 has three different file extensions, .xlsb, .xlsm, and .xlsx.


Excel Data Types

Microsoft Excel software has two data types: character and numeric.

  • Character data as labels or formula strings. Character data is generally considered text and can include character type dates and numbers. A cell can save up to 32,767 characters.

  • Numeric data can be entered as numbers or formulas. Numeric data can include numbers (0 through 9), formulas, or error values (such as #NULL!, #N/A, #VALUE!).


Excel Numeric Data and Time Values

The conversion of date and time values between SAS data sets and Microsoft Excel spreadsheets is transparent to users. However, you are encouraged to understand the differences between them.

In Microsoft Excel software, a date value is the integer portion of a number that can range from 01 January 1900 (saved as integer value: 1) to 31 December 9999 (saved as integer value: 2,958,465). A Microsoft Excel software time value is the decimal portion of a number that represents time as a proportion of a day. For example, 0.0 is midnight, 0.5 is noon, and 0.999988 is 23:59:59 (on a 24-hour clock). While a number can have both a date and a time portion, the formats in Microsoft Excel display a number in a date, time, or date/time format.

In SAS software, SAS dates are valid back to AD 1582 and ahead to AD 9,999. A date value is represented by the number of days between January 01, 1960, and that date. A time value is represented by the number of seconds between midnight and that time of day. A datetime value is represented by the number of seconds between midnight January 01, 1960, and that datetime.

When you export a SAS time value to an Excel file, the value could be displayed as "1/0/1900" in the Excel file. Format the cell with a Time format to see the time value displayed correctly.


Excel File Formats

If you use 'Save As' and select 'Save As type' of 'Excel 97-2003 Workbook', the file is saved as an .xls file in that format. Under 'Save As' you can select the other 2007 formats:

  • Excel Workbook creates a .xlsx file.

  • Excel Macro-Enabled Workbook creates an .xlsm file.

  • Excel Binary Workbook creates an .xlsb file.

  • Excel 97-2003 Workbook creates an .xls file.


SAS Import and Export Utilities Support for Excel Files

SAS Import and Export Utilities provide three components to access Microsoft Excel files.

LIBNAME statement

generates a LIBNAME statement and PROC SQL code for the PC files engine read data from or write data to an Excel file. This component supports Excel versions 5, 95, 97, 2000, 2002, 2003, and 2007.

See: LIBNAME Statement:PC Files on Microsoft Windows
The ACCESS and DBLOAD procedures

generate ACCESS procedure code to read data from an Excel file. Also, generates the DBLOAD procedure code to write data to an Excel file. This component supports only Excel versions 4 and 5/95. This is for SAS 6 compatibility support and is available only on Microsoft Windows.

See: Importing and Exporting Microsoft Excel Files with the ACCESS and DBLOAD Procedures.
.xls file formats

translates Excel .xls file formats to read data from or write data to an Excel file. This component supports Excel versions 5/95, 97, 2000, 2002, and 2003. For details about available statements.

Restriction: This component does not support V 2007 .xlsb or .xlsx files.
Restriction: This component does not currently support such DBCS character sets as Chinese, Japanese, and Korean.
See: ACCESS Procedure Syntax.

LIBNAME Statement Options to Import or Export Excel Data

The LIBNAME statement component for importing and exporting Microsoft Excel workbook data generates SAS LIBNAME statement code. The LIBNAME statement uses the Microsoft Ace engine or Microsoft Jet engine to access data in Microsoft Excel workbook files.

LIBNAME Statement Options to Import or Export Excel Data
Data Source Syntax Valid Value Default Value PROC IMPORT PROC EXPORT
EXCEL

EXCEL97

EXCEL2000

EXCEL2002

EXCEL2003

EXCEL2007

DBDSOPTS= Excel data set options
Yes Yes
DBSASLABEL= Compat|None Compat Yes No
GETNAMES= Yes|No Yes Yes No
MIXED= Yes|No No Yes No
NEWFILE= Yes|No No No Yes
RANGE= range name
Yes No
SCANTEXT= Yes|No Yes Yes No
SCANTIME= Yes|No Yes Yes No
SHEET= sheet name
Yes Yes
TEXTSIZE= 1 to 32767 1024 Yes No
USEDATE= Yes|No Yes Yes No

SAS LIBNAME statement options to import or export Excel Data files on Linux, Windows and UNIX. This component requires that the PC Files Server is running on a Microsoft Windows operating platform where the PC files reside.

SAS LIBNAME Statement Options to Import or Export Excel Data files on Linux, Windows and UNIX
Data Source Syntax Valid Value Default Value PROC IMPORT PROC EXPORT
EXCELCS DBDSOPTS= 'Excel data set options'
Yes Yes
DBSASLABEL= Yes|No Yes Yes No
PORT= 1 to 65535
Yes Yes
RANGE= 'range name'
Yes No
SCANTEXT= Yes|No Yes Yes No
SCANTIME= Yes|No Yes Yes No
SERVER= 'server name'
Yes Yes
SERVERPASS= 'server password'
Yes Yes
SERVERUSER= 'server User ID'
Yes Yes
SERVICE= 'service name'
Yes Yes
SHEET= 'sheet name'
Yes Yes
SSPI=

64-bit Windows only

Yes| No No Yes Yes
TEXTSIZE= 1 to 32767 1024 Yes No
USEDATE= Yes|No Yes Yes No
VERSION= '5' | '95' | '97' | '2000' | '2002' | '2003' | '2007' '97' Yes Yes


IMPORT and EXPORT Procedure Options for Excel Files

DBDSOPTS=any data set options that are valid for the SAS Excel LIBNAME engine

enables data set options for the LIBNAME engine such as READBUFF=, INSERTBUFF=, DBTYPE=, DROP=, FIRSTOBS=, and OBS=. These options are for advanced users who are familiar with the PC Files LIBNAME engine.

Default: 1for UNIX users who use the client server model with an Excel workbook.
Note: To improve performance for reading data, set the READBUFF= option to 25 or higher.

Enclose the options in single or double quotation marks as shown in these examples: DBDSOPTS='FIRSTOBS=10 READBUFF=25'; .

If the option string that you specify contains single quotations marks, use double quotation marks around it in your statement.

For example: DBDSOPTS="DBTYPE=(BilledTo='CHAR(8)')"; .

See Data Set Options and Data Set Options for PC Files on Linux and UNIX, and 64-Bit Windows for a list of supported data set options.

DBSASLABEL=COMPAT|NONE|YES|NO

specifies the data source column names.

COMPAT

specifies that the data source column headings are saved as the corresponding SAS label names.

Alias: YES
NONE

specifies that the data source column headings are not saved as SAS label names. The SAS label names are then left as blanks.

Alias: NO
GETNAMES=YES|NO

specifies whether the IMPORT procedure is to generate SAS variable names from the data values in the first row of the import file.

If data in the first row of the input file contains special characters for a SAS variable name, such as a blank; SAS converts the character to an underscore. For example, the column heading.

YES

specifies that the IMPORT procedure generate SAS variable names from the data values in the first row of the imported Excel file.

NO

specifies that the IMPORT procedure generate SAS variable names as F1, F2, F3, and so on.

Default: Yes
Restriction: Valid only for Windows.
Restriction: Valid only for the IMPORT procedure.
Restriction: Supported only when DBMS=EXCEL.
Restriction: When SAS reads the data value in the first row of the input file, SAS checks for invalid SAS name characters (such as a blank). Invalid characters are converted to an underscore. For example, the data value Occupancy Code becomes the SAS variable name Occupancy_Code .
MIXED=YES|NO

assigns a SAS character type for the column and converts all numeric data values to character data values when mixed data types are found.

Default: NO
Restriction: Supported only when DBMS=EXCEL.
Restriction: If MIXED=YES, the connection is set to import mode and updates are not allowed.
Note: Due to a limitation in the Microsoft Ace engine and the Microsoft Jet Excel engine, using MIXED=YES could result in improper text variable lengths.
NO

assigns numeric or character type for the column, depending on the majority of the type data that is found.

Restriction: Numeric data in a character column and character data in a numeric column are imported as missing values.

These registry settings can affect the behavior of the MIXED= option. The options are located in the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel key of the Microsoft Windows registry. To change the value of TypeGuessRows, follow these steps in Microsoft Windows:

  1. Start [arrow] Run

    Enter Regedt32 select OK.

  2. Open this key in the Registry Editor window:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel

  3. Double-click TypeGuessRows.

  4. Select Decimal, enter 0 in the Value field, and select OK.

  5. Exit from the Registry Editor window.

Registry Settings for the MIXED Option
TypeGuessRows An integer type with a default value of 8. You can use the number of rows in the worksheet range in scans to determine column types. If you set this type to 0, all rows up to 16384 in the range are checked. Microsoft states that the valid range of TypeGuessRows is 0-16. However, you could set as high as 16384, and it would still operate correctly.
CAUTION:
Changing the TypeGuessRows value cause a scan to fail if you set it higher than 16384. It also affects any software that uses the Microsoft Ace provider to access Excel file data, including accessing Excel data in a Microsoft Access database. The TypeGuessRows value is registered with and controlled by Microsoft. It is recommended that you set the value to 0.   [cautionend]
ImportMixedTypes A string type with a default value of Text. If a column contains more than one type of data while scanning TypeGuessRows rows, the column type is determined to be Text if the setting value is Text. If the setting value is Majority Type, the most common column type determines the column type.

For the MIXED=YES option to work correctly, you should you change TypeGuessingRows to 0 in the Microsoft Windows registry so that all rows in the specified range are scanned. As a result, when you use MIXED=YES, the Microsoft Ace engine and the Microsoft Jet engine always assign character type for columns with data of mixed data types.

The numeric data is converted to character data.

NEWFILE=YES|NO

when exporting a SAS data set to an existing Excel file, specifies whether to delete the specified Excel file, and load the SAS data set to a sheet in a new Excel file.

YES

specifies that the EXPORT procedure deletes the specified Excel file, if it exists, and loads the SAS data set to a sheet in the new Excel file.

NO

specifies that the EXPORT procedure loads the SAS data set to a sheet and appends it to the existing Excel file. If the specified Excel file does not exist, an Excel file is created, and the SAS data set is loaded.

Restriction: Available only when DBMS=EXCEL.
PORT=port-number

specifies the number of the port that is listening on the PC Files Server. The valid value is between 1 and 65535. This port or service name displays on the PC Files Server display when the application is started in server mode.

Default: 8621
Alias: PORT_NUMBER
Restriction: Available only for the client server model.
Restriction: The PORT= statement option and the SERVICE= statement option should not be used in the same procedure
RANGE=range-name|absolute-range

subsets a spreadsheet by identifying the rectangular set of cells to import. The range name is the user defined spreadsheet name, to represent a range of cells within the spreadsheet in the Excel file. The range name is not case sensitive and does not allow any special character, except an underscore. You must define range name with a workbook scope so that the name is visible to SAS.

When data is imported from an Excel file, a sheet name that is appended with a $ character is treated as a range name. The range name refers to the whole sheet.

You can use the DATASETS procedure to list the data set names, that are mapped to the range names. If the displayed range name contains single quotes, keep the single quotes as part of the range name to access the sheet. For example, to retrieve data from the spreadsheet for two separate sheet names, My#Test and CustomerOrders:

RANGE="'My#Test$'";
RANGE="' CustomerOrders$'";

Absolute range identifies the top-left cell that begins the range and the bottom-right cell that ends the range.

Example 1. C2:F12 represents cells within Column C, Row 2, and Column F, Row 12. Here is another example:

RANGE="'summary$a4:b20'";

Example 2, $ indicates the end of the sheet name, colon (:) separates the values for upper left (UL) and lower right (LR) of the cell range. If this statement is not specified, the IMPORT procedure reads the entire spreadsheet as a range.

If the range name is available, it is recommended that you use RANGE= option without the SHEET= option for the IMPORT procedure. To use the absolute range address, it is strongly recommended that you use the full range address with quotes. For example, specify 'sheet_name$A1:C7'n . Use the SHEET= option only when you want to access the whole sheet.

Restriction: Supported only for the IMPORT procedure.
SCANTEXT=YES|NO

specifies whether to scan the column to determine the length of the text data for each data source column. This option applies only to character data type columns.

YES

scans the length of text data for a data source column and uses the length of the longest string of data that it finds as the SAS variable width. If the maximum length that it finds is greater than the value in the TEXTSIZE= option, the smaller value in TEXTSIZE= option is applied as the SAS variable width.

NO

does not scan the length of text data for a data source column. The column length returned from the Microsoft Ace or Microsoft Jet engine is used as the SAS variable width. If the returned column width is greater than what is specified in the TEXTSIZE= option, the smaller value specified in TEXTSIZE= option is applied as the SAS variable width.

Specify SCANTEXT=NO when you need to update data in a Microsoft Excel workbook.

Alias: SCAN_TEXT|SCANMEMO
Note: So that the SCANTEXT=YES option works correctly, it is strongly recommended that you change TypeGuessingRows to 0 in the Microsoft Windows registry. This ensures that all rows in the specified range are scanned.

To correct truncated text data for DBCS users (including Chinese, Japanese, and Korean), you can set the environment variable DBE_DBCS to YES. This doubles the scanned text data length and therefore corrects the problem of truncated text data. To set the DBE_DBCS environment variable, submit this statement: Options set=DBE_DBCS YES; .

The Microsoft Ace or Microsoft Jet engine handles the SCANTEXT= option.

These registry settings could affect the behavior of the SCANTEXT= option. These are located in the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel key of the Microsoft Windows registry.

TypeGuessRows An integer type with a default value of 8. The number of rows in the worksheet range is used to scan to determine column types. If set to 0, all rows in the range are checked. Microsoft states that the valid range of TypeGuessRows is 0 to 16. However, it could be set as high as 16384 and still operate correctly.
ImportMixedTypes A string type with a default value of Text. If a column contains more than one data type when scanning of TypeGuessRows rows, the column type is determined to be Text if the setting value is Text. If the setting value is Majority Type, the most common column type determines the column type.

SCANTIME=YES|NO

specifies whether to scan the date/time data while importing data from a date/time column from the Microsoft Excel workbook.

YES

scans the date/time column and assigns the TIME. format for a date/time column only if time values are found in the column.

NO

specifies not to scan the date/time column.

  • The DATETIME. format is assigned for a date/time column if USEDATE=NO.

  • The DATE. format is assigned for a date/time column if USEDATE=YES.

SERVER=PC-Files-Server-name

specifies the PC Files Server name. Excel supports spreadsheet names up to 31 characters. SAS supports data set names up to 32 characters. You must bring up the listener on the PC Files Server before you can establish a connection. You can configure the service name, port number, maximum number of concurrent connections allowed, and data encryption on your PC Files Server.

Alias: SERVER_NAME.
Note: Required statement.
Restriction: Available only for client server model.
See: PC Files Server Administration.
SERVICE=service-name

specifies the service name that is defined on your service file for your client and server machines. This port number or service name is displayed on the PC Files Server control panel screen when it is started on the PC. The service name needs to be defined on your UNIX machine and your PC Files Server.

Alias: SERVER_NAME
Restriction: SERVICE= and PORT= options should not be used in the same procedure.
Restriction: Available only for client server model.
SHEET=sheet-name

identifies a particular spreadsheet in an Excel workbook. If the sheet-name contains a special character such as space in the the EXPORT procedure, SAS converts it to an underscore.

For example, a sheet name "Employee Information" becomes ''Employee_Information".

To ensure that the IMPORT procedure reads the data that you want, identify the spreadsheet by specifying the RANGE= option or the SHEET= option. If the sheet name contains special characters, you can use the RANGE= the RANGE= instead of the SHEET= option to import data.

If you do not specify this statement, the EXPORT procedure writes the spreadsheet name as the SAS data set name.

If your sheet name contains special characters, use PROC DATASETS to list the available data set names. If the sheet name shown contains single quotes, keep the single quotes as part of the sheet name in order to be able to access the sheet.

For example, specify SHEET="'My#Test'";

Note: it is recommended that you use the RANGE= option without the SHEET option in the IMPORT procedure. Use the SHEET= option only when you want to import the entire spreadsheet.
Note: If both the range name and the sheet name are missing, the IMPORT procedure reads the first spreadsheet in the workbook. If you do not specify this statement, the IMPORT procedure reads the first spreadsheet that was physically saved in the Excel file.
Restriction: use the SHEET option only when you want to import the entire spreadsheet.
Restriction: Avoid sheet-names that look like cell references, which have <1-3 characters> plus 1 or more digits. For example, A1, IV65536, TRY123, XFD1048576.
SSPI=YES|NO

enables the server administrator to allow Integrated Windows Authentication. This is a mechanism for the Windows client and server to exchange credentials.

Default: No
Note: SSPI can also be enabled by specifying the -SSPI option on the SAS command line.
Restriction: Valid only when used on a 64-bit Windows PC.
TEXTSIZE=1 to 32767

specifies the SAS maximum variable length that is allowed while importing data from Microsoft Excel text columns. Any text data in Excel whose length exceeds this value is truncated when it is imported into SAS.

Alias: DBMAX_TEXT
USEDATE=YES|NO

specifies whether to assign a DATE or a DATETIME format while importing a date/time column from a Microsoft Excel workbook.

YES

assigns the DATE format for the corresponding date/time column in the Microsoft Excel table.

NO

assigns the DATETIME format for the corresponding date/time column in the Microsoft Excel table.

See: The SCANTIME= statement, to assign the appropriate TIME format
See also: Processing Date and Time Values between SAS and Microsoft Excel.
VERSION=file-version'

specifies the version of the file that you want to create.

Default: the default value is 97 for .xls files
Note: Always enclose the version in single quotes.
Restriction: If the file exists on the PC Files Server, this statement is ignored.
Restriction: Available only for client server model.
Valid Values: 2007, 2003, 2002, 2000, 97, 95, and 5

Example 1: Export a SAS Data Set to an Excel File

This example exports a SAS data set called SDF.ORDERS to an Excel 2007 .xlsb file with the Orders sheet name. In this case, SHEET supports only an .xlsb file.

LIBNAME SDF V9 "&sasdir";
PROC EXPORT DATA=SDF.ORDERS
            FILE="c:\temp\demo.xlsb"
            DBMS=EXCEL REPLACE;
   SHEET="Orders";
RUN;


Example 2: Import a Range from Excel 2007 File to a SAS Data Set

This example imports a range named INVOICE from an Excel workbook file and performs these tasks:

  • uses the first row of data as column names

  • scans length for character type columns

  • assigns DATETIME. format for date/time columns

  • leaves SAS labels blank

  • limits the size of text fields to be less than or equal to 512 characters

PROC IMPORT OUT=WORK.INVOICE
            FILE="&demodir.demo.xlsb"
            DBMS=EXCEL REPLACE;
   RANGE="INVOICE";
   GETNAMES=YES;
   SCANTEXT=YES;
   USEDATE=NO;
   SCANTIME=YES;
   DBSASLABEL=NONE;
   TEXTSIZE=512;   /* default is 1024 */
RUN;


Example 3: Import a Range From an Excel File on a PC Files Server to a SAS Data Set

The code in this example imports a range named Orders from an Excel file on a PC Files Server. It assigns the DATETIME. format for date/time columns.

PROC IMPORT OUT=WORK.ORDERS
            DATAFILE="&pcfdir.demo.xls"
            DBMS=EXCELCS REPLACE;
   RANGE="Orders";
   SERVER="&server";
   PORT=8621;
   USEDATE=NO;
   SCANTIME=YES;
RUN;


Example 4: Export a SAS Data Set to an Excel Workbook File

This code exports a SAS data set named SDF.CUSTOMER to an Excel workbook file, with Customer sheet name on the PC Files Server.

LIBNAME SDF V9 "&sasdir";
PROC EXPORT DATA=SDF.CUSTOMER
            FILE="c:\temp\demo.xls"
            DBMS=EXCELCS REPLACE;
   SHEET="Customer";
   SERVER="&server";
   SERVICE=SASPCFILE;
RUN;


Example 5: The IMPORT Procedure Using SSPI - Microsoft 64-bit Windows

proc import out=work.test datafile='C:\myFile.xlsb' dbms=excelcs replace;
   server=filesrv;
   port=8621;
   sspi=yes;
run;

Previous Page | Next Page | Top of Page