SAS Institute. The Power to Know

SAS/ACCESS(R) 9.2 Interface to PC Files: Reference

space
Previous Page | Next Page

File Format-Specific Reference for IMPORT and EXPORT Procedures

Delimited Files


Delimited File Essentials

In computer programming, a delimited text file is a flat file that contains data that is separated by a delimiting character called a delimiter. A delimiter defines the beginning or end of a contiguous string of character data. The delimiter is not considered part of the character data string. A delimited text file is also called a delimited-separated values (DSV or DLM) file.

A comma-separated values (CSV) file is a form of delimited file where the delimiter is a comma. A TAB file is a form of delimited file where the delimiter is a tab character. Files that have other delimiters such as spaces ( ) or semicolons (;) are also known as delimited text files, or delimited files.

In a delimited text file, the first row of data is normally read as column names. The column names are then used as SAS variable names. A line character indicates a new row. A CSV file might contain lines of data as follows:

First,Last,Age
Abraham,Howard,31
Smith,Jack,24
Chen,Jo,38

The Import and Export wizards and Import and Export procedures use SAS External File Interface (EFI) components to read and to write data in delimited external files.

The delimited file component is included in Base SAS and does not require a license for SAS/ACCESS Interface to PC Files. Please be aware of the following behaviors when using these wizards and procedures to import or export data in a delimited external file.

  • When data values are enclosed in quotation marks, delimiters within the value are treated as character data. Quotation marks are removed from character values.

  • Two consecutive delimiters indicate a missing value.

  • A delimiter can be specified as one or more characters and can be in binary form. Here is an example: delimiter='09'X .

  • While exporting data with PROC EXPORT, items that exceed the output line length are discarded. See the DROPOVER option of the FILE statement in SAS Language Reference: Dictionary.

  • As PROC IMPORT reaches the end of the current input data record, variables without any values assigned are set to missing.


IMPORT and EXPORT Procedure Statements for Delimited Files

This table lists the statements that are available to process delimited text external files using the IMPORT and EXPORT procedures.Data source files consist of CSV=comma-separated values files, TAB=tab-separated values files, DLM=delimiter separated values files. See Examples of Importing and Exporting Delimited Files.

IMPORT and EXPORT Procedure Statements for Delimited Files
Data Source Syntax Valid Value Default Value In PROC IMPORT? In PROC EXPORT?
CSV DATAROW= 1 to 2147483647 2 Yes No
GETNAMES= YES | NO YES Yes No
GUESSINGROWS= 1 to 2147483647 20 Yes No
PUTNAMES= YES | NO YES No Yes
TAB DATAROW= 1 to 2147483647 2 Yes No
GETNAMES= YES | NO YES Yes No
GUESSINGROWS= 1 to 2147483647 20 Yes No
PUTNAMES= YES | NO YES No Yes
DLM DATAROW= 1 to 2147483647 2 Yes No
DELIMITER= 'char' | 'nn'x ' ' Yes Yes
GETNAMES= YES | NO YES Yes No
GUESSINGROWS= 1 to 2147483647 20 Yes No
PUTNAMES= YES | NO YES No Yes

DATAROW=1 to 2147483647;

starts reading data from the specified row number in the delimited text external file. When GETNAMES=YES (default for GETNAMES=), the default value is 2 and you can specify only a value equal to or greater than 2. When GETNAMES=NO, the default value is 1.

DELIMITER='char' | 'nn'x;

specifies the delimiter that separates columns of data in the imported or exported file. You can specify the delimiter as a single character or as a hexadecimal value. For example, if columns of data are separated by an ampersand, specify DELIMITER='&' . If you do not specify DELIMITER=, PROC IMPORT assumes that the delimiter is a blank character.

GETNAMES=YES | NO;

determines whether to generate SAS variable names from the data values in the first record of the imported file.

If a data value in the first record is read and it contains special characters that are not valid in a SAS name, such as a blank, SAS converts the character to an underscore. For example, the column name Occupancy Code would become the SAS variable name Occupancy_Code .

YES

specifies that PROC IMPORT is to generate SAS variable names from the data values in the first record of the imported file.

NO

specifies that PROC IMPORT is to generate SAS variable names as VAR1, VAR2, VAR3, and so on.

GUESSINGROWS=1 to 2147483647;

specifies that the number of rows in the delimited text file to scan to determine the appropriate data type for the columns. The scan process scans from row 1 to the number that is specified by GUESSINGROWS=. The default value is 20. However, you could change the default value in the SAS registry. To do this, from the SAS menu select Solutions [arrow] Accessories [arrow] Registry Editor. From the Registry Editor that opens, select Products [arrow] BASE [arrow] EFI [arrow] GuessingRows.

PUTNAMES=YES | NO;

determines whether to write SAS variable names to the first record of the exported data file. If you specify the LABEL option, SAS variable labels are written instead of variable names.

YES

specifies that PROC EXPORT is to write SAS variable names to the first record and write the first observation data to the second record of the exported data file.

NO

specifies that PROC EXPORT is to write the first observation data to the exported data file.


Examples of Importing and Exporting Delimited Files

  1. Import CSV data into SAS

    This sample code uses the first row of data for SAS variable names, reads data in from row 2, and scans 10 rows of data to determine data types for each columns.

    PROC IMPORT OUT=WORK.TEST
                FILE="&dlmdir.\invoice.csv"
                DBMS=CSV REPLACE;
        GETNAMES=YES;
        DATAROW=2;
        GUESSINGROWS=10;
    RUN;

  2. Import Space-Separated Data Values into SAS

    This sample code lets SAS generate variable names as VAR1, VAR2, and so on. It reads data starting from row 2 and scans the default number of rows (20) to determine the data types for each column. '20'x is the hexadecimal value for a blank character (space) in ASCII code.

     PROC IMPORT OUT=WORK.TEST
                DATAFILE="&dlmdir.\invoice.txt"
                DBMS=DLM REPLACE;
         DELIMITER='20'x;
         GETNAMES=NO;
         DATAROW=;
    RUN;

  3. Export a SAS Data Set to a CSV File

    This example code exports SAS data set SDF.INVOICE to a CSV file named invoice.csv. However, it does not write column names to the first row of data. Instead, it exports data only to the output file.

    LIBNAME SDF V9 "&sasdir";
    PROC EXPORT DATA=SDF.INVOICE
                OUTFILE="c:\temp\invoice.csv"
                DBMS=DLM REPLACE;
         DELIMITER=','; 
         PUTNAMES=NO;
    RUN;

  4. Import a Subset of CSV Data into SAS

    This example code reads in data from rows 6-10 of the selected columns in the customer.csv file. It uses the global OBS= option to limit the number of data rows to import and OBS=MAX to reset the option to the default value.

    OPTIONS OBS=10;
    PROC IMPORT OUT= WORK.Test (KEEP=Customer_ID Name Address First-Ordered_Date
                DATAFILE= "&dlmdir.\customer.csv"
                DBMS=CSV REPLACE;
         GETNAMES=YES;
         DATAROW=6;
    RUN;
    OPTIONS OBS=MAX;

space
Previous Page | Next Page | Top of Page