Sample 41880: Read all files from a directory and create separate SAS® data sets with unique names

The sample code on the Full Code tab shows how to read all files in a directory and create a separate SAS data set, with a unique name, for each file. The PIPE device type in the FILENAME statement is used to obtain the directory listing. Macro code with a %DO loop then runs a separate DATA step for each file in the directory. To view the SAS log showing the execution of the sample program, click the Results tab.

This sample assumes that there are four comma-delimited files in the folder C:\_today\. The names are:

   file1.csv, file2.csv, file3.csv, and file4.csv.

The first DATA step reads the directory information and creates a data set that contains the full pathname of every file with a .csv extension, one observation per file.
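
As an illustration of this step, here is a minimal sketch rather than the sample's actual code. It assumes a Windows host, that the XCMD system option is in effect so that the PIPE device type is allowed, and that the folder is C:\_today\ as described above. The fileref DIRLIST, the data set name DIRLIST, and the variable names FILE_NAME and PATH are hypothetical names chosen for this sketch.

```sas
/* Hypothetical fileref: run the Windows DIR command and read its output. */
/* The /b switch returns bare file names, one per line, with no header.  */
filename dirlist pipe 'dir "C:\_today\*.csv" /b';

data dirlist;
   length file_name $256 path $512;
   infile dirlist truncover;
   /* Formatted input reads the whole line, so names with blanks survive. */
   input file_name $256.;
   /* Prepend the folder to build the full pathname (assumed folder). */
   path = cats('C:\_today\', file_name);
run;

/* Release the fileref once the listing has been captured. */
filename dirlist clear;
```

Building the pathname in the DATA step, instead of asking DIR for full paths with the /s switch, keeps the listing limited to the one folder; /s would also descend into subfolders.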

The next DATA step uses _NULL_ in the DATA statement so that no SAS data set is created, because the only purpose of this step is to create macro variables. In the SET statement, the END= option creates a variable that indicates when the last observation of the data set has been read. A variable called COUNT is incremented each time an observation is read, which keeps track of how many files there are. The first CALL SYMPUTX creates one macro variable per file, each containing the full path and name of a file to read; the value of COUNT is used in the macro variable name so that each path and name combination gets its own macro variable. The second CALL SYMPUTX creates a matching set of macro variables, each containing the file name that is used in the DATA statement to create a unique SAS data set for that file. The last macro variable, MAX, is created after all observations have been read and holds the total count of files that will be read.
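
The following sketch shows one way that such a step could be written. It is not the sample's actual code. The macro variable prefixes READ and DSET, the END= variable LAST, and the stand-in data set DIRLIST (built here with DATALINES so that the step can run by itself) are assumptions made for illustration; only COUNT and MAX are named in the description above.

```sas
/* Stand-in for the directory listing so that this sketch is self-contained. */
data dirlist;
   length file_name $256 path $512;
   input file_name $;
   path = cats('C:\_today\', file_name);
   datalines;
file1.csv
file2.csv
file3.csv
file4.csv
;
run;

data _null_;
   set dirlist end=last;
   /* Sum statement: COUNT is retained and starts at 0. */
   count + 1;
   /* READ1, READ2, ... each hold the full path and name of one file. */
   call symputx(cats('read', count), path);
   /* DSET1, DSET2, ... each hold the file name without its extension. */
   call symputx(cats('dset', count), scan(file_name, 1, '.'));
   /* After the last observation, store the total number of files. */
   if last then call symputx('max', count);
run;

/* Writes: max=4 read1=C:\_today\file1.csv dset1=file1 */
%put max=&max read1=&read1 dset1=&dset1;
```

CALL SYMPUTX removes leading and trailing blanks from the value, which is why no explicit trimming is needed. Note that SCAN with a period delimiter keeps only the text before the first period, and the result must still be a valid SAS name to be usable in a DATA statement.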

The macro runs a separate DATA step for each file to be read from the directory, controlled by a macro %DO loop. The DATA statement uses the %DO loop's index variable, I, to reference in turn each of the macro variables that contain the unique data set names. The INFILE statement uses the index variable in the same way, except that it references the macro variables that contain each path and name. Because this sample reads comma-delimited files, the DSD and TRUNCOVER options are used in the INFILE statement. If the input files have records longer than 256 bytes, you might need to use the LRECL= option in the INFILE statement to increase the size of the input buffer, depending on your release of SAS.
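
A macro along these lines might look like the sketch below. It is not the sample's actual code. The macro name READALL, the macro variable prefixes READ and DSET, the input variables VAR1 through VAR3, and the LRECL= value are hypothetical. The %LET statements stand in for values that the preceding DATA _NULL_ step would normally create, and the sketch assumes that the two files exist, contain three character fields, and have no header row.

```sas
/* Assumed values, normally created with CALL SYMPUTX in an earlier step. */
%let max   = 2;
%let read1 = C:\_today\file1.csv;
%let dset1 = file1;
%let read2 = C:\_today\file2.csv;
%let dset2 = file2;

%macro readall;
   %local i;
   %do i = 1 %to &max;
      /* &&dset&i resolves in two passes: &dset1 on the first, file1 on the second. */
      data &&dset&i;
         /* DSD treats consecutive commas as missing values and honors quoted  */
         /* fields; TRUNCOVER stops INPUT from moving to the next record when  */
         /* a line is short.                                                   */
         infile "&&read&i" dsd truncover lrecl=32767;
         /* Hypothetical layout: three character fields of up to 20 bytes. */
         input var1 :$20. var2 :$20. var3 :$20.;
      run;
   %end;
%mend readall;

%readall
```

The double ampersand is what makes the indirect reference work: the macro processor first turns && into & and resolves &i, and then resolves the resulting name, such as &dset1, on a second pass. If the files began with a header row, adding FIRSTOBS=2 to the INFILE statement would skip it.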




These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.