The following example uses a very small data set to illustrate the various macro options. The data step below creates a simple data set with one numeric variable, id, one character variable, a, and four observations. Notice that one value of a is a missing value.
data Ex1;
infile datalines missover;
input id:1. a:$1.;
datalines;
1 b
2 c
3
4 a
;
proc print data=Ex1;
run;
This statement makes the CtoN macro available in your current SAS session. Within the quotes, specify the location of the file containing your local copy of the CtoN macro. The statement only needs to be run once per SAS session.
%inc "<location of your file containing the CtoN macro>";
The first call of the macro detects all character variables in the data= data set, and creates an output data set named Ex1_N in which the one character variable found, a, is replaced with a numeric variable, also called a, that is formatted using the character values in the original variable. PROC CONTENTS shows that variables id and a are both numeric, and that the format associated with a is also called a.
%CtoN(data=Ex1)
proc contents data=Ex1_N;
ods select variables;
run;
Printing data set Ex1_N looks identical to the original data set, Ex1, because of the formatting. Notice that the missing value in the original character variable is also missing in the new numeric variable.
proc print data=Ex1_N;
run;
Using a FORMAT statement with no format specified removes the formatting so that the numeric coding of a is visible. Notice that the values 1, 2, 3 are assigned to the original values in sorted order, which is the default.
proc print data=Ex1_N;
format a;
run;
The next macro call shows the effects of the noreplace and noformat options. With noreplace, the original character variable is retained along with a new numeric variable named a_N. The noformat option prevents the assignment of a format to the numeric variable as can be seen in the PROC PRINT results. PROC CONTENTS shows there are now three variables in data set Ex1_N and no formats are associated.
%CtoN(data=Ex1, options=noreplace noformat)
proc print data=Ex1_N;
run;
proc contents data=Ex1_N;
ods select variables;
run;
a |
Char |
1 |
a_N |
Num |
8 |
id |
Num |
8 |
|
Next, the order= option is used. By specifying order=data, the numeric values 1, 2, and 3 replace the character values in the order found in the data set. That is, the first value found, 'b', is assigned value 1. The second value, 'c', is assigned 2, and the third nonmissing value, 'a', is assigned 3. All variables are again retained by the noreplace option and formatting is prevented by the noformat option so that a simple PROC PRINT shows the change in ordering as compared to the default (order=internal).
%CtoN(data=Ex1, order=data, options=noreplace noformat)
proc print data=Ex1_N;
run;
Finally, the prefix= and suffix= options are used to show how the new variable names can be customized. The following macro call adds 'num_' at the beginning of all new numeric variable names in the output data set, new. Since the default is suffix=_N, specifying suffix= prevents any suffix characters from being added to the ends of the variable names. Note that when the replace option is in effect, the prefix= and suffix= options are ignored.
%CtoN(data=Ex1, prefix=num_, suffix=, out=new, options=noreplace noformat)
proc print data=new;
run;
A single call of the CtoN macro can create numeric variables for each of many character variables in a data set. The sashelp.class data set has two character variables, Name and Sex, along with several numeric variables as shown by PROC CONTENTS.
proc contents data=sashelp.class;
ods select variables;
run;
Age |
Num |
8 |
Height |
Num |
8 |
Name |
Char |
8 |
Sex |
Char |
1 |
Weight |
Num |
8 |
|
This call of the macro processes all character variables and creates a new data set, named newclass, which contains numeric replacements of the character variables. As in the example above, printing the data set using the formats that are assigned by default looks the same as printing the original data set. By removing all formats with a FORMAT statement, the numeric coding of the new variables can be seen.
%CtoN(data=sashelp.class, out=newclass)
proc print data=newclass;
run;
proc print data=newclass;
format _all_;
run;
14 |
69.0 |
112.5 |
Alfred |
M |
13 |
56.5 |
84.0 |
Alice |
F |
13 |
65.3 |
98.0 |
Barbara |
F |
14 |
62.8 |
102.5 |
Carol |
F |
14 |
63.5 |
102.5 |
Henry |
M |
12 |
57.3 |
83.0 |
James |
M |
12 |
59.8 |
84.5 |
Jane |
F |
15 |
62.5 |
112.5 |
Janet |
F |
13 |
62.5 |
84.0 |
Jeffrey |
M |
12 |
59.0 |
99.5 |
John |
M |
11 |
51.3 |
50.5 |
Joyce |
F |
14 |
64.3 |
90.0 |
Judy |
F |
12 |
56.3 |
77.0 |
Louise |
F |
15 |
66.5 |
112.0 |
Mary |
F |
16 |
72.0 |
150.0 |
Philip |
M |
12 |
64.8 |
128.0 |
Robert |
M |
15 |
67.0 |
133.0 |
Ronald |
M |
11 |
57.5 |
85.0 |
Thomas |
M |
15 |
66.5 |
112.0 |
William |
M |
14 |
69.0 |
112.5 |
1 |
2 |
13 |
56.5 |
84.0 |
2 |
1 |
13 |
65.3 |
98.0 |
3 |
1 |
14 |
62.8 |
102.5 |
4 |
1 |
14 |
63.5 |
102.5 |
5 |
2 |
12 |
57.3 |
83.0 |
6 |
2 |
12 |
59.8 |
84.5 |
7 |
1 |
15 |
62.5 |
112.5 |
8 |
1 |
13 |
62.5 |
84.0 |
9 |
2 |
12 |
59.0 |
99.5 |
10 |
2 |
11 |
51.3 |
50.5 |
11 |
1 |
14 |
64.3 |
90.0 |
12 |
1 |
12 |
56.3 |
77.0 |
13 |
1 |
15 |
66.5 |
112.0 |
14 |
1 |
16 |
72.0 |
150.0 |
15 |
2 |
12 |
64.8 |
128.0 |
16 |
2 |
15 |
67.0 |
133.0 |
17 |
2 |
11 |
57.5 |
85.0 |
18 |
2 |
15 |
66.5 |
112.0 |
19 |
2 |
|
Specify the var= option if only a subset of the existing character variables are to be replaced. In this call to the macro, only the Sex variable is replaced. Preventing the association of a format with the noformat option allows a basic PROC PRINT step to show the numeric coding. Additionally, the numeric values (1, 2) are assigned to the values of Sex in the order seen in the data set (via order=data), resulting in Sex='M' now being coded 1 rather than 2.
%CtoN(data=sashelp.class, var=Sex, order=data, out=newclass, options=noformat)
proc print data=newclass;
run;
Alfred |
14 |
69.0 |
112.5 |
1 |
Alice |
13 |
56.5 |
84.0 |
2 |
Barbara |
13 |
65.3 |
98.0 |
2 |
Carol |
14 |
62.8 |
102.5 |
2 |
Henry |
14 |
63.5 |
102.5 |
1 |
James |
12 |
57.3 |
83.0 |
1 |
Jane |
12 |
59.8 |
84.5 |
2 |
Janet |
15 |
62.5 |
112.5 |
2 |
Jeffrey |
13 |
62.5 |
84.0 |
1 |
John |
12 |
59.0 |
99.5 |
1 |
Joyce |
11 |
51.3 |
50.5 |
2 |
Judy |
14 |
64.3 |
90.0 |
2 |
Louise |
12 |
56.3 |
77.0 |
2 |
Mary |
15 |
66.5 |
112.0 |
2 |
Philip |
16 |
72.0 |
150.0 |
1 |
Robert |
12 |
64.8 |
128.0 |
1 |
Ronald |
15 |
67.0 |
133.0 |
1 |
Thomas |
11 |
57.5 |
85.0 |
1 |
William |
15 |
66.5 |
112.0 |
1 |
|