Sample 33027: Looking for Unique Codes in Your Data
Many of us are presented with SAS data sets where codes such as 9999 are intermingled with real data values. Sometimes these codes represent missing values; sometimes they represent other non-data values.
If you run SAS procedures on numeric variables in such a data set, you will, obviously, produce nonsense. What we present here is a macro that will automatically check all the numeric variables in a SAS data set for a specific data value, and produce a report showing which variables contain this special value and how many times it appeared.
The macro is called FIND_VALUE and is presented below. You can download this macro and many other useful macros from the SAS Companion Web Site: support.sas.com/publishing. Search for my book, Cody's Data Cleaning Techniques, Second Edition, and then click on the link to download the programs and data files from the book.
You call the macro like this:
%find_value(Dsn=the name of your SAS data set,
Value=the value you are looking for)
To demonstrate this macro, a data set called SPECIAL was created with several character and numeric variables. Values of 9999 were entered several times for some of the numeric variables. The macro call looks like this:
%find_value(Dsn=Special, Value=9999)
With the following result:
This sample was authored by Ron Cody. His books:
are available from the SAS Publishing online bookstore.
These sample files and code examples are provided by SAS Institute
Inc. "as is" without warranty of any kind, either express or implied, including
but not limited to the implied warranties of merchantability and fitness for a
particular purpose. Recipients acknowledge and agree that SAS Institute shall
not be liable for any damages whatsoever arising out of their use of this material.
In addition, SAS Institute will provide no support for the materials contained herein.
There are several key points to discuss in this program. First, we use the keyword _numeric_ to reference all the numeric variables in the data set, and we use an array to cycle through all the variables. We use __i (that's two underscores followed by an i) for the loop counter because we can be fairly confident that the data set we are processing does not contain a variable with the same name. Next, the key to this program is the VNAME function that returns a variable name, given an array reference.
Each time the program finds the selected value (in this case, 9999), it writes an observation out to a temporary (TEMP) data set. The only variable in this data set is the name of the variable.
Finally, PROC FREQ is used list the variable names for which the key value was found and the frequency. This information is also placed in an output data set (SUMMARY) in case you need it for further processing.
*-----------------------------------------------------------------*
| Macro name: find_value.sas |
| purpose: Identifies any specified value for all numeric vars |
| Calling arguments: dsn= sas data set name |
| value= numeric value to search for |
| example: to find variable values of 9999 in data set test, use |
| %find_value(dsn=test, value=9999) |
*-----------------------------------------------------------------*;
%macro find_value(dsn=, /* The data set name */
value=999 /* Value to look for, default is 999 */ );
title "Variables with &value as missing values";
data temp;
set &dsn;
length Varname $ 32;
array nums[*] _numeric_;
do __i = 1 to dim(nums);
if nums[__i] = &value then do;
Varname = vname(nums[__i]);
output;
end;
end;
keep Varname;
run;
proc freq data=temp;
tables Varname / out=summary(keep=Varname Count)
nocum;
run;
proc datasets library=work;
delete temp;
run;
quit;
%mend find_value;
These sample files and code examples are provided by SAS Institute
Inc. "as is" without warranty of any kind, either express or implied, including
but not limited to the implied warranties of merchantability and fitness for a
particular purpose. Recipients acknowledge and agree that SAS Institute shall
not be liable for any damages whatsoever arising out of their use of this material.
In addition, SAS Institute will provide no support for the materials contained herein.
What we present here is a macro that will automatically check all the numeric variables in a SAS data set for a specific data value, and produce a report showing which variables contain this special value and how many times it appeared. The macro is called FIND_VALUE
Date Modified: | 2008-08-22 16:11:36 |
Date Created: | 2008-08-21 09:26:26 |
Operating System and Release Information
SAS System | Base SAS | z/OS | 9 TS M0 | |
Microsoft® Windows® for 64-Bit Itanium-based Systems | 9 TS M0 | |
Microsoft Windows Server 2003 Datacenter 64-bit Edition | 9 TS M0 | |
Microsoft Windows Server 2003 Enterprise 64-bit Edition | 9 TS M0 | |
Microsoft Windows 2000 Advanced Server | 9 TS M0 | |
Microsoft Windows 2000 Datacenter Server | 9 TS M0 | |
Microsoft Windows 2000 Server | 9 TS M0 | |
Microsoft Windows 2000 Professional | 9 TS M0 | |
Microsoft Windows NT Workstation | 9 TS M0 | |
Microsoft Windows Server 2003 Datacenter Edition | 9 TS M0 | |
Microsoft Windows Server 2003 Enterprise Edition | 9 TS M0 | |
Microsoft Windows Server 2003 Standard Edition | 9 TS M0 | |
Microsoft Windows XP Professional | 9 TS M0 | |
64-bit Enabled AIX | 9 TS M0 | |
64-bit Enabled HP-UX | 9 TS M0 | |
64-bit Enabled Solaris | 9 TS M0 | |
HP-UX IPF | 9 TS M0 | |
Linux | 9 TS M0 | |
OpenVMS Alpha | 9 TS M0 | |
Tru64 UNIX | 9 TS M0 | |