|
Identifies the
normality test that is applied to the data.
|
|
Provides the
label that is associated with the applied normality test.
|
|
Provides the
normality statistic that is calculated by the applied normality test.
|
|
Provide the probability
(type and value) for the applied normality test.
|
|
Provide the number
of observations, mean, and standard deviation for the analysis variable.
|
To initialize
these macro variables, we will now create a macro that calculates
their values and then runs the SGRENDER procedure with
template GINSET. The macro needs two parameters: one for passing a
SAS data set name, and a second for passing the name of a column in
that data set.
The following macro code uses PROC
UNIVARIATE to create two output data sets. A DATA step then reads
the output data sets, creates the required macro variables, and assigns
values to those macro variables in a local symbol table. When the
macro runs the SGRENDER procedure, the values of the macro variables
are imported into the GINSET template to produce a graph with insets,
similar to the graph
shown in Passing Parameter Values to a Template. As mentioned
earlier, the normality test that is performed on the analysis variable
will be based on the number of observations in that analysis variable.
Note: To make the
following macro more robust, it could be designed to validate the
parameters.
%macro histogram(dsn,var);
  /* compute tests for normality */
  ods output TestsForNormality=norm;
  proc univariate data=&dsn normaltest;
    var &var;
    output out=stats n=n mean=mean std=std;
  run;
  %local nobs mean std test testlabel stat ptype pvalue;
  data _null_;
    set stats(keep=n mean std);
    call symputx("nobs",n);
    call symput("mean",strip(put(mean,12.3)));
    call symput("std",strip(put(std,12.4)));
    if n > 2000 then /* use Kolmogorov-Smirnov */
      set norm(where=(TestLab="D"));
    else /* use Shapiro-Wilk */
      set norm(where=(TestLab="W"));
    call symput("testlabel","("||trim(testlab)||")");
    call symput("test",strip(test));
    call symput("ptype",strip(ptype));
    call symput("stat",strip(put(stat,best8.)));
    call symput("pvalue",psign||put(pvalue,pvalue6.4));
  run;
  proc sgrender data=&dsn template=ginset;
  run;
%mend;
-
The %MACRO statement declares a
macro named HISTOGRAM that takes two parameters: DSN (for the data
set name) and VAR (for the column name).
-
The ODS OUTPUT statement produces
a SAS data set named NORM from the TestsForNormality output object
that will be generated by the UNIVARIATE procedure (next statement).
For more information on the ODS OUTPUT statement,
see the
SAS Output Delivery System: User's Guide.
-
Deriving the input data set name
from the DSN parameter and the analysis variable name from the VAR
parameter, the UNIVARIATE procedure calculates the number of observations,
mean, and standard deviation for the analysis variable. It writes
the values for these statistics to an output data set named STATS,
storing the values in variables named N, MEAN, and STD.
-
The %LOCAL statement creates a
set of local macro variables to add to the local symbol table.
-
The DATA step reads variables N,
MEAN, and STD from the STATS data set.
-
The CALL SYMPUTX routine and the first
two CALL SYMPUT routines use the data input variables to assign values to the local
macro variables NOBS, MEAN, and STD. On each call, the first argument
identifies the macro variable to receive the value, and the second
argument supplies the value, derived from the data input variable,
to assign to that macro variable in the symbol table.
-
The IF/ELSE structure determines
which normality test values to read from the NORM data set that was
created by the ODS OUTPUT statement. If there are more than 2000
observations, the Kolmogorov-Smirnov test values (TestLab="D") are used; otherwise, the
Shapiro-Wilk values (TestLab="W") are used.
-
The remaining CALL SYMPUT routines
assign values to the rest of the macro variables, using the values
from variables in the NORM data set.
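The note that precedes the macro suggests validating the parameters. The following is a minimal sketch of one way to do that, not part of the original example. The wrapper name HISTOGRAM_CHECKED and its error messages are hypothetical. It assumes that the HISTOGRAM macro shown above is already compiled, and that DSN is a plain data set name without data set options.

```sas
/* Hypothetical wrapper: checks DSN and VAR, then calls %histogram. */
%macro histogram_checked(dsn,var);
  %local dsid varnum vartype rc;

  /* Both parameters must be supplied. */
  %if %length(&dsn) = 0 or %length(&var) = 0 %then %do;
    %put ERROR: A data set name and a column name are both required.;
    %return;
  %end;

  /* The data set must exist. */
  %if not %sysfunc(exist(&dsn)) %then %do;
    %put ERROR: Data set &dsn does not exist.;
    %return;
  %end;

  /* Open the data set to look up the column and its type. */
  %let dsid = %sysfunc(open(&dsn));
  %if &dsid = 0 %then %do;
    %put ERROR: Data set &dsn could not be opened.;
    %return;
  %end;
  %let varnum = %sysfunc(varnum(&dsid,&var));
  %if &varnum > 0 %then %let vartype = %sysfunc(vartype(&dsid,&varnum));
  %let rc = %sysfunc(close(&dsid));

  /* The column must exist and must be numeric for PROC UNIVARIATE. */
  %if &varnum = 0 %then %do;
    %put ERROR: Column &var was not found in &dsn..;
    %return;
  %end;
  %if &vartype ne N %then %do;
    %put ERROR: Column &var must be numeric.;
    %return;
  %end;

  /* All checks passed: run the original macro. */
  %histogram(&dsn,&var)
%mend histogram_checked;

/* Example call. The data set and column are assumptions chosen for  */
/* illustration; GINSET must be compatible with the chosen column.   */
%histogram_checked(sashelp.heart,cholesterol)
```

Keeping the checks in a separate wrapper leaves the HISTOGRAM macro unchanged. The same tests could instead be placed at the top of HISTOGRAM itself.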