DATA Step Processing |
Creating a SAS Data File or a SAS View |
You can create either a SAS data file, a data set that holds actual data, or a SAS view, a data set that references data that is stored elsewhere. By default, you create a SAS data file. To create a SAS view instead, use the VIEW= option in the DATA statement. With a SAS view you can, for example, process monthly sales figures without having to edit your DATA step. Whenever you need to create output, the output from a SAS view reflects the current input data values.
The following DATA statement creates a SAS view called MONTHLY_SALES.
data monthly_sales / view=monthly_sales;
The following DATA statement creates a data file called TEST_RESULTS.
data test_results;
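As a minimal sketch of how a view behaves, the following program defines a view over a small data set and then prints it. The data set SALES_RAW, its variables Month and Sales, and the derived variable Commission are illustrative assumptions, not part of the original example.

```sas
/* Hypothetical source data; SALES_RAW, Month, and Sales are illustrative names. */
data sales_raw;
   input Month $ Sales;
   datalines;
Jan 100
Feb 150
;

/* Define a DATA step view: the step is compiled and stored, not executed. */
data monthly_sales / view=monthly_sales;
   set sales_raw;
   Commission = Sales * 0.05;   /* illustrative derived variable */
run;

/* The view executes here, so the output reflects whatever SALES_RAW currently holds. */
proc print data=monthly_sales;
run;
```

If SALES_RAW is later replaced with new figures, running the PROC PRINT step again should show the new values without any change to the DATA step that defines the view.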
Sources of Input Data |
You select data-reading statements based on the source of your input data. There are at least six sources of input data:
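raw data in an external file
raw data in the job stream (instream data)
data in SAS data sets
data that is created by programming statements
data that you can remotely access through FTP, a TCP/IP socket, a SAS catalog entry, or a URL
data that is stored in a Database Management System (DBMS) or other vendor's data files.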
Usually DATA steps read input data records from only one of the first three sources of input. However, DATA steps can use a combination of some or all of the sources.
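As an illustration of the remote sources in the list above, the following sketch reads raw data through a URL by using the URL access method of the FILENAME statement. The fileref REMOTE, the address, and the record layout are hypothetical; substitute a real address and an INPUT statement that matches its contents.

```sas
/* Hypothetical URL and record layout; both are assumptions for illustration. */
filename remote url 'http://www.example.com/weights.txt';

data weight_remote;
   infile remote;                    /* read from the fileref instead of a local file */
   input IDnumber $ Week1 Week16;    /* assumed layout: one ID and two weights per record */
run;
```

Once the fileref is assigned, the DATA step treats the remote file like any other external file.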
Reading Raw Data: Examples |
The components of a DATA step that produce a SAS data set from raw data stored in an external file are outlined here.
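data weight;
   infile 'your-input-file';        /* identify the external file that holds the raw data */
   input IDnumber $ Week1 Week16;   /* read one record into three variables */
   WeightLoss=Week1-Week16;         /* compute a new variable */
run;                                /* execute the DATA step */

proc print data=weight;             /* print the WEIGHT data set */
run;
In this program, the DATA statement begins the DATA step and creates a SAS data set called WEIGHT.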
This example reads raw data from instream data lines.
data weight2;
   input IDnumber $ Week1 Week16;   /* read one data line into three variables */
   WeightLoss2=Week1-Week16;        /* compute a new variable */
   datalines;
2477 195 163
2431 220 198
2456 173 155
2412 135 116
;

proc print data=weight2;
run;
The semicolon on the line after the last data line signals the end of the data lines and executes the DATA step.
You can also take advantage of options in the INFILE statement when you read instream data lines. This example shows the use of the MISSOVER option, which assigns missing values to variables for records that contain no data for those variables.
data weight2;
   infile datalines missover;       /* apply MISSOVER to the instream data lines */
   input IDnumber $ Week1 Week16;
   WeightLoss2=Week1-Week16;
   datalines;
2477 195 163
2431
2456 173 155
2412 135 116
;

proc print data=weight2;
run;
The MISSOVER option assigns missing values to any variables for which a record runs out of data before the INPUT statement is satisfied. Here, the record for ID 2431 contains no weights, so Week1, Week16, and WeightLoss2 are missing for that observation.
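For contrast, this sketch runs the same data lines without MISSOVER, so the INPUT statement uses the default FLOWOVER behavior. The data set name WEIGHT_FLOWOVER is an illustrative assumption.

```sas
/* Same data lines as the MISSOVER example, but with the default FLOWOVER behavior. */
data weight_flowover;
   input IDnumber $ Week1 Week16;
   WeightLoss2=Week1-Week16;
   datalines;
2477 195 163
2431
2456 173 155
2412 135 116
;

proc print data=weight_flowover;
run;
```

When the INPUT statement finds no weights on the line for ID 2431, it continues onto the next data line to satisfy Week1 and Week16. The values from the 2456 record are therefore attached to ID 2431, the 2456 record does not produce an observation of its own, and the log should note that SAS went to a new line. This is the behavior that MISSOVER prevents.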
This example shows how to read multiple input files whose names are supplied as instream data. The DATA step reads the records in each file and creates the ALL_ERRORS SAS data set. The program then sorts the observations by Station and creates a sorted data set called SORTED_ERRORS. The PRINT procedure prints the results.
options pageno=1 nodate linesize=60 pagesize=80;

data all_errors;
   length filelocation $ 60;
   input filelocation;   /* reads instream data */
   infile daily filevar=filelocation
                filename=daily end=done;
   do while (not done);
      input Station $ Shift $ Employee $ NumberOfFlaws;
      output;
   end;
   put 'Finished reading ' daily=;
   datalines;
. . .myfile_A. . .
. . .myfile_B. . .
. . .myfile_C. . .
;

proc sort data=all_errors out=sorted_errors;
   by Station;
run;

proc print data = sorted_errors;
   title 'Flaws Report sorted by Station';
run;
Multiple Input Files in Instream Data
        Flaws Report sorted by Station        1

                                        Number
Obs    Station     Shift    Employee    OfFlaws

  1    Amherst       2       Lynne         0
  2    Goshen        2       Seth          4
  3    Hadley        2       Jon           3
  4    Holyoke       1       Walter        0
  5    Holyoke       1       Barb          3
  6    Orange        2       Carol         5
  7    Otis          1       Kay           0
  8    Pelham        2       Mike          4
  9    Stanford      1       Sam           1
 10    Suffield      2       Lisa          1
Reading Data from SAS Data Sets |
This example reads data from one SAS data set, generates a value for a new variable, and creates a new data set.
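data average_loss;
   set weight;                                  /* read observations from WEIGHT */
   Percent=round((WeightLoss * 100) / Week1);   /* loss as a whole-number percentage of Week1 */
run;                                            /* execute the DATA step */
The DATA statement begins the DATA step and creates a SAS data set called AVERAGE_LOSS.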
Generating Data from Programming Statements |
You can create data for a SAS data set by generating observations with programming statements rather than by reading data. A DATA step that reads no input goes through only one iteration.
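data investment;
   begin='01JAN1990'd;
   end='31DEC2009'd;
   do year=year(begin) to year(end);
      /* sum statement: add the yearly 2000 deposit plus 7% interest on the new balance */
      Capital+2000 + .07*(Capital+2000);
      output;                          /* write one observation per year */
   end;
   put 'The number of DATA step iterations is '_n_;
run;

proc print data=investment;
   format Capital dollar12.2;          /* display Capital as a dollar amount */
run;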
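A smaller sketch of the same technique may make the single-iteration point easier to see. The data set name SQUARES and the variables n and square are illustrative assumptions.

```sas
/* Generate five observations from a DO loop; no input is read. */
data squares;
   do n = 1 to 5;
      square = n**2;
      output;                           /* one observation per pass through the loop */
   end;
   put 'DATA step iteration: ' _n_;     /* _N_ counts iterations of the DATA step itself */
run;

proc print data=squares;
run;
```

Because the explicit OUTPUT statement runs inside the loop, the data set should contain five observations, while the PUT statement runs once and reports an iteration count of 1.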
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.