File Format-Specific Reference for the IMPORT and EXPORT Procedures |
In computer programming, a delimited text file is a file in which the individual data values contain embedded delimiters, such as quotation marks, commas, and tabs. A delimiter is a character that separates words or phrases in a text string that defines the beginning or end of a contiguous string of character data.
The delimiter is not considered part of the character data string.
The first row of data is usually read as column headings.
The column headings are then converted to SAS variable names.
A line character indicates a new row.
Note: Support of delimited files is included in Base SAS. The SAS/ACCESS to PC Files license is not needed to use this list of features.
CSV Files |
A comma-separated values file is a form of a delimited file. The data values are separated by commas. In a CSV-type file, each line can represent one of these items:
an entry
a record
a row
an observation in a database management system
other applications
Tab-Delimited File |
A tab-delimited file is a form of delimited file. The data values are separated by control characters that represent the TAB key. The data values form columns of a database table. The columns can be exported to a database table.
Other Delimiters |
Files that have other delimiters such as spaces or semicolons are also known as delimited text files or delimited files.
External File Interface (EFI) |
The SAS Import and Export Wizards use the SAS External File Interface methods to read and write data in delimited external files.
Be aware of these behaviors when using the wizards and procedures to import or export data in delimited files.
When data values are enclosed in quotation marks, delimiters within the value are treated as character data.
Quotation marks are removed from character values.
Two consecutive delimiters indicate a missing value.
A delimiter can be specified as one or more characters.
While exporting data, the EXPORT procedure discards items that exceed the output line length. See the DROPOVER= option in the FILE statement in the SAS Language Reference: Dictionary.
The delimiter can be in binary form. Example: delimiter='09'X.
As the IMPORT procedure reaches the end of the current input data row, variables without any values are set to missing.
See: | Using the SAS Import and Export Wizards and The IMPORT Procedure |
IMPORT and EXPORT Procedure Statements for Delimited Files |
The supported file types are CSV (comma-separated values), TAB (tab-separated values), and DLM (delimiter-separated values).
See: Example 1: Import a Tab-Delimited File into SAS and Example 3: Export a SAS Data Set to a CSV File
Delimited File Type | Statement Options | PROC Import | PROC Export | Valid Value | Default Value |
---|---|---|---|---|---|
CSV and TAB | DATAROW | Yes | No | 1 to 2147483647 | 2 |
GETNAMES | Yes | No | Yes | No | Yes | |
GUESSINGROWS | Yes | No | 1 to 2147483647 | 20 | |
PUTNAMES | No | Yes | Yes|No | Yes | |
DLM | DATAROW | Yes | No | 1 to 2147483647 | 2 |
DELIMITER | Yes | Yes | 'char' | 'nn'x | ' ' | |
GETNAMES | Yes | No | Yes|No | Yes | |
GUESSINGROWS | Yes | No | 1 to 2147483647 | 20 | |
PUTNAMES | No | Yes | Yes | No | Yes |
specifies the row number where the IMPORT procedure starts reading data.
Default: | 2 |
Valid Values: | 1 to 2147483647 |
Restriction: | If GETNAMES=Yes, then DATAROW=number ge 2. |
Restriction: | If GETNAMES=No, then DATAROW=1. |
specifies the delimiter that separates the columns of data for the IMPORT and EXPORT procedures.
specifies whether the IMPORT procedure is to generate SAS variable names from the data values in the first row of the import file.
If a data value in the first record contains special characters that are not valid in a SAS name, SAS converts the character to an underscore. For example, the column name Occupancy Code becomes the SAS variable name Occupancy_Code.
YES |
specifies that the IMPORT procedure generate SAS variable names from the data values in the first row of the imported Excel file. |
NO |
specifies that the IMPORT procedure generate SAS variable names as F1, F2, F3, and so on. |
Default: | Yes |
Restriction: | Valid only for Windows. |
Restriction: | Valid only for the IMPORT procedure. |
Restriction: | Supported only when DBMS=EXCEL. Due to a Microsoft Jet engine limitation, no more than 64 characters of column names are written to SAS variable labels. |
specifies that number of rows that the IMPORT procedure is to scan to determine the appropriate data type for the columns. The scan process scans from row 1 to the row number that is specified by GUESSINGROWS= option.
Default: | 20 |
Restriction: | Number of rows: 1 to 2147483647 |
Note: |
The default row number can be changed in
the SAS REGISTRY as follows:
From the SAS menu Solutions Accessories Registry Editor. When the Registry Editor opens, select Products BASE EFI GuessingRows. This opens the Edit Signed Integer Value where you can modify the Value Data item. |
specifies whether the EXPORT procedure is to write SAS variable names to the first record of the exported data file.
YES |
specifies that the EXPORT procedure is to
do the following tasks:
|
NO | specifies that the EXPORT procedure is to write the first row of SAS data set values to the first row of the exported data file. |
Default: | Yes |
Note: | If you specify the LABEL= option, the SAS variable labels (not the variable names) are written as column headings. |
This code illustrates how the IMPORT procedure uses the first row of the tab delimited file to generate SAS variable names. SAS starts to read data from row 2, and scans 10 rows of data to determine data types for each column. The invoice.txt file saves data with the tab character ('09'x) as the delimiter.
PROC IMPORT OUT=WORK.TEST FILE="&dlmdir.\invoice.txt" DBMS=TAB REPLACE; GETNAMES=YES; DATAROW=2; GUESSINGROWS=10; RUN;
The IMPORT procedure generates generic variable names such as VAR1 and VAR2. It starts to read data from row 2, and scans the default number of rows (20) to determine the data types for each column.'20'x is the hexadecimal value for a space in ASCII code.
PROC IMPORT OUT=WORK.TEST DATAFILE="&dlmdir.\invoice.txt" DBMS=DLM REPLACE; DELIMITER='20'x; GETNAMES=NO; DATAROW=2; RUN;
The EXPORT procedure exports the SAS data set, SDF.INVOICE, to a CSV file; invoice.csv. The SAS variable name is not used. The first row of the SAS data set is written to the first row of the CSV file.
LIBNAME SDF "&sasdir"; PROC EXPORT DATA=SDF.INVOICE OUTFILE='c:\temp\invoice.csv' DBMS=DLM REPLACE; DELIMITER=','; PUTNAMES=NO; RUN;
The IMPORT procedure starts to read data in row 6. It reads 10 observations from the selected columns in the customer CSV file. The global OBS= option limits the number of data rows to import. The OBS=MAX option resets the OBS= option to the default value.
OPTIONS OBS=10; PROC IMPORT OUT= WORK.Test (KEEP=Customer_ID Name Address First-Ordered_Date) DATAFILE= "&dlmdir.\customer.csv" DBMS=CSV REPLACE; GETNAMES=YES; DATAROW=6; RUN; OPTIONS OBS=MAX;
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.