SAS Institute. The Power to Know

SAS/ACCESS(R) 9.2 Interface to PC Files: Reference

Previous Page | Next Page

File Format-Specific Reference for the IMPORT and EXPORT Procedures

Overview Delimited Files

In computer programming, a delimited text file is a file in which the individual data values contain embedded delimiters, such as quotation marks, commas, and tabs. A delimiter is a character that separates words or phrases in a text string that defines the beginning or end of a contiguous string of character data.

  • the delimiter is not considered part of the character data string

  • the first row of data is usually read as column headings

  • the column headings are then converted to SAS variable names

  • a line character indicates a new row

A delimited text file is also called a delimiter-separated values file (DSV or DSM).

CSV Files

A comma-separated values file, is a form of a delimited file. The data values are separated by commas. A file where each line can represent

  • an entry

  • a record

  • a row

  • an observation in a database management system

  • other applications

First,Last,Age
Abraham,Howard,31
Smith,Jack,24
Chen,Jo,38
;


Tab Delimited File

A TAB delimited file is a form of delimited file where the delimiter is a tab character. The values form columns of a database table, that can be exported to a database table. Control characters that represent the tab key separate the values.


Other Delimiters

Files that have other delimiters such as spaces ( ) or semicolons, are also known as delimited text files or delimited files.


External File Interface (EFI)

EFI

The SAS Import and Export Wizards use the SAS External File Interface components to read and write data in delimited external files.

The delimited file component is included in Base SAS and does not require a license for SAS/ACCESS Interface to PC Files. Be aware of these behaviors when using these wizards and procedures to import or export data in a delimited external file.

  • When data values are enclosed in quotation marks, delimiters within the value are treated as character data. Quotation marks are removed from character values.

  • Two consecutive delimiters indicate a missing value.

  • A delimiter can be specified as one or more characters.

  • While exporting data, the EXPORT procedure discards items that exceed the output line length.

  • The delimiter can be in binary form. Example: delimiter='09'X.

  • As the IMPORT procedure reaches the end of the current input data row, variables without any values are set to missing.

See the DROPOVER= option in the FILE statement in the SAS Language Reference: Dictionary.

Featured in:


Summary of Supported Import and Export Procedure Options

Delimited Files

This table lists the availability of the IMPORT and EXPORT procedure statements and options. The supported file types are CSV (comma delimited values), TAB (tab-separated values), and DLM (delimiter separated values). There are alternatives to accessing delimited PC files formats under Linux and UNIX. Currently, Windows, UNIX, and VMS can access delimited files.

See: Example 1. Import a TAB Delimited File into SAS.
Summary of Supported IMPORT and EXPORT Procedure Statements
Delimited File Type Statement Options PROC Import PROC Export Valid Value Default Value
CSV and TAB DATAROW= Yes No 1 to 2147483647 2
GETNAMES= Yes No Yes | No Yes
GUESSINGROWS= Yes No 1 to 2147483647 20
PUTNAMES= No Yes Yes|No Yes
DLM DATAROW= Yes No 1 to 2147483647 2
DELIMITER= Yes Yes 'char' | 'nn'x ' '
GETNAMES= Yes No Yes|No Yes
GUESSINGROWS= Yes No 1 to 2147483647 20
PUTNAMES= No Yes Yes | No Yes


Details of Supported IMPORT and EXPORT Statements Options

DATAROW

specifies the row number where the IMPORT procedure starts reading data.

Default: 2
Valid Values: 1 to 2147483647
Restriction: If GETNAMES=Yes, then DATAROW=number ge 2
Restriction: If GETNAMES=No, then DATAROW= 1
DELIMITER='char'| 'nn'x

specifies the delimiter that separates the columns of data for the IMPORT and EXPORT procedures.

Default: Blank character.
Restriction: For DLM and DSV files only.
Restriction: If you do not specify a DELIMITER value, the default value (a blank character) is used as the delimiter.
Note: You can specify the delimiter as a single character or as a hexadecimal value. For example, if columns of data are separated by an ampersand, specify DELIMITER='&' .
GETNAMES=YES|NO

specifies whether the IMPORT procedure is to generate SAS variable names from the data values in the first row of the import file.

If data in the first row of the input file contains special characters for a SAS variable name, such as a blank; SAS converts the character to an underscore. For example, the column heading.

YES

specifies that the IMPORT procedure generate SAS variable names from the data values in the first row of the imported Excel file.

NO

specifies that the IMPORT procedure generate SAS variable names as F1, F2, F3, and so on.

Default: Yes
Restriction: Valid only for Windows.
Restriction: Valid only for the IMPORT procedure.
Restriction: Supported only when DBMS=EXCEL.
Restriction: When SAS reads the data value in the first row of the input file, SAS checks for invalid SAS name characters (such as a blank). Invalid characters are converted to an underscore. For example, the data value Occupancy Code becomes the SAS variable name Occupancy_Code .
GUESSINGROWS=1 to 2147483647

specifies that number of rows that the IMPORT procedure is to scan to determine the appropriate data type for the columns. The scan process scans from row 1 to the row number that is specified by GUESSINGROWS= option.

Default: 20
Note: The default row number can be changed in the SAS REGISTRY as follows:

From the SAS menu Solutions [arrow] Accessories [arrow] Registry Editor. When the Registry Editor opens Products [arrow] BASE [arrow] EFI [arrow] GuessingRows

PUTNAMES=YES|NO

specifies whether the EXPORT procedure is to write SAS variable names to the the first record of the exported data file.

YES specifies that the EXPORT procedure is to:
  • Write the SAS variable names to the first row of the exported data file as column headings.

  • Write the first row of the SAS data set to the second row of the exported data file.

NO specifies that the EXPORT procedure is to write the first row of SAS data set values to the first row of the exported data file.
Default: Yes
Note: If you specify the LABEL= option, the SAS variable labels (not the variable names) are written as column headings.

Examples of Importing and Exporting Delimited File Types


Example 1. Import a TAB Delimited File into SAS

This code illustrates how the IMPORT procedure uses the first row of the tab delimited file to generate SAS variable names. SAS starts to read data from row 2, and scans 10 rows of data to determine data types for each column. The invoice.txt file saves data with the tab character ('09'x) as the delimiter.

PROC IMPORT OUT=WORK.TEST
            FILE="&dlmdir.\invoice.txt"
            DBMS=TAB REPLACE;
    GETNAMES=YES;
    DATAROW=2;
    GUESSINGROWS=10;
RUN;


Example 2. Import a Space Delimited File into SAS

. The IMPORT procedure generates generic variable names such as VAR1 and VAR2. It starts to read data from row 2, and scans the default number of rows (20) to determine the data types for each column.'20'x is the hexadecimal value for a space in ASCII code.

 PROC IMPORT OUT=WORK.TEST
            DATAFILE="&dlmdir.\invoice.txt"
            DBMS=DLM REPLACE;
     DELIMITER='20'x;
     GETNAMES=NO;
     DATAROW=2;
RUN;


Example 3. Export a SAS Data Set to a CSV File

.

The EXPORT procedure exports the SAS data set, SDF.INVOICE, to a CSV file; invoice.csv. The SAS variable name is not used. The first row of the SAS data set is written to the first row of the CSV file.

LIBNAME SDF V9 "&sasdir";
PROC EXPORT DATA=SDF.INVOICE
            OUTFILE="c:\temp\invoice.csv"
            DBMS=DLM REPLACE;
     DELIMITER=',';
     PUTNAMES=NO;
RUN;


Example 4. Import a Subset of a CSV File into SAS

The IMPORT procedure starts to read data in rows 6. It reads 10 observations from the selected columns, in the customer CSV file. The global OBS= option limits the number of data rows to import. The OBS=MAX option resets the OBS= option to the default value.

OPTIONS OBS=10;
PROC IMPORT OUT= WORK.Test (KEEP=Customer_ID Name Address First-Ordered_Date)
            DATAFILE= "&dlmdir.\customer.csv"
            DBMS=CSV REPLACE;
     GETNAMES=YES;
     DATAROW=6;
RUN;
OPTIONS OBS=MAX;

Previous Page | Next Page | Top of Page