Special SAS Data Sets
All SAS/STAT procedures create SAS data sets. Any table generated by a procedure can be saved to a data set by using the Output Delivery System (ODS), and many procedures also have syntax that enables you to save other statistics to data sets. Some of these data sets are organized according to certain conventions so that they can be read by a SAS/STAT procedure for further analysis. Such specially organized data sets are recognized by the TYPE= data set attribute.
The CORR procedure (see the Base SAS Procedures Guide: Statistical Procedures), for example, can create a data set with the attribute TYPE=CORR containing a correlation matrix. This TYPE=CORR data set can be read by the REG or FACTOR procedure, among others. If the original data set is large, using a special SAS data set in this way can save computer time by avoiding the recomputation of the correlation matrix in subsequent analyses.
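The workflow described above might look like the following minimal sketch. It assumes the SASHELP.CLASS sample data set is available; the output data set name work.classcorr and the choice of variables are illustrative only.

```sas
/* Compute the correlation matrix once and save it as a TYPE=CORR data set. */
proc corr data=sashelp.class outp=work.classcorr noprint;
   var Height Weight Age;
run;

/* Fit a regression from the stored matrix instead of the raw observations. */
proc reg data=work.classcorr;
   model Weight = Height Age;
run;
quit;
```

Because PROC REG reads only summary statistics here, results that require the individual observations (such as residuals) are not available from this second step.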
As another example, PROC REG can create a TYPE=EST data set containing estimated regression coefficients. If you need to make predictions for new observations, you can use the SCORE procedure to read both the TYPE=EST data set and a data set containing the new observations. PROC SCORE can then compute predicted values or residuals without repeating the entire regression analysis. See Chapter 77, The SCORE Procedure, for an example.
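A minimal sketch of scoring new observations with saved estimates follows. It assumes the SASHELP.CLASS sample data set; the data set names work.regest, work.newobs, and work.predicted, the model label WeightHat, and the two new observations are hypothetical.

```sas
/* Fit the model and save the coefficient estimates as a TYPE=EST data set. */
proc reg data=sashelp.class outest=work.regest noprint;
   WeightHat: model Weight = Height Age;
run;
quit;

/* Hypothetical new observations to be scored. */
data work.newobs;
   input Height Age;
   datalines;
60 13
65 15
;

/* TYPE=PARMS selects the parameter-estimate observation in work.regest. */
proc score data=work.newobs score=work.regest type=parms out=work.predicted;
   var Height Age;
run;
```

In this sketch the predicted values are expected to appear in work.predicted under the model label (WeightHat); the regression itself is not refit.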
A special SAS data set might contain different kinds of statistics. A special variable called _TYPE_ is used to distinguish the various statistics. For example, in a TYPE=CORR data set, an observation in which _TYPE_='MEAN' contains the means of the variables in the analysis, and an observation in which _TYPE_='STD' contains the standard deviations. Correlations appear in observations with _TYPE_='CORR'. Another special variable, _NAME_, is needed to identify the row of the correlation matrix. Thus, the correlation between variables X and Y is given by the value of the variable X in the observation for which _TYPE_='CORR' and _NAME_='Y', or by the value of the variable Y in the observation for which _TYPE_='CORR' and _NAME_='X'.
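To see how _TYPE_ and _NAME_ index the stored statistics, a sketch such as the following can be used. It assumes the SASHELP.CLASS sample data set; the data set name work.classcorr is illustrative.

```sas
/* Build a TYPE=CORR data set to inspect. */
proc corr data=sashelp.class outp=work.classcorr noprint;
   var Height Weight Age;
run;

/* List every observation: one row per statistic, identified by _TYPE_. */
proc print data=work.classcorr;
run;

/* The Height-Weight correlation: row _NAME_='Height', column Weight. */
proc print data=work.classcorr;
   where _TYPE_ = 'CORR' and _NAME_ = 'Height';
   var _TYPE_ _NAME_ Weight;
run;
```

Swapping the roles (row _NAME_='Weight', column Height) should return the same value, because the correlation matrix is symmetric.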
The special data sets created by SAS/STAT procedures can generally be used directly by other procedures without modification. However, if you create an output data set with PROC CORR and use the NOCORR option to omit the correlation matrix from the output (OUTP=) data set, you need to set the TYPE= option in one of two places: in parentheses following the output data set name in the PROC CORR statement, or in parentheses following the DATA= option in any other procedure that recognizes the special TYPE= attribute. In either case, set the TYPE= option to COV, CSSCP, or SSCP according to the type of matrix stored in the data set and the data set types accepted as input by the other procedures you plan to use. If you skip this step and use the TYPE=CORR data set with no correlation matrix as input to another procedure, the procedure might issue an error message indicating that the correlation matrix is missing from the data set.
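The NOCORR case can be sketched as follows, assuming the SASHELP.CLASS sample data set; the data set name work.classcov is illustrative. Here the COV option requests covariances, NOCORR omits the correlations, and the TYPE=COV data set option relabels the output accordingly.

```sas
/* Store covariances only and mark the output data set as TYPE=COV. */
proc corr data=sashelp.class cov nocorr outp=work.classcov(type=cov) noprint;
   var Height Weight Age;
run;

/* PROC REG accepts TYPE=COV input (see Table A.1). */
proc reg data=work.classcov;
   model Weight = Height Age;
run;
quit;
```

If the TYPE= option were left off the OUTP= data set, the alternative described above would be to write data=work.classcov(type=cov) in the PROC REG statement instead.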
You can create special SAS data sets directly in a DATA step by specifying the TYPE= option in parentheses after the data set name in the DATA statement. See Example A.2: Creating a TYPE=CORR Data Set in a DATA Step for an example. If you use a DATA step with a SET statement to modify a special SAS data set, you must specify the TYPE= option in the DATA statement. The TYPE= attribute of the data set in the SET statement is not automatically copied to the data set being created. You can determine the TYPE= attribute of a data set by using the CONTENTS procedure (see Example A.1: A TYPE=CORR Data Set Produced by PROC CORR and the Base SAS Procedures Guide for details).
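The following sketch illustrates these three points together. The variable names X, Y, and Z, the correlation values, and the data set names are hypothetical and serve only to show the syntax.

```sas
/* Create a TYPE=CORR data set directly from a hypothetical correlation matrix. */
data work.mycorr(type=corr);
   input _TYPE_ $ _NAME_ $ X Y Z;
   datalines;
CORR X 1.0 0.5 0.3
CORR Y 0.5 1.0 0.4
CORR Z 0.3 0.4 1.0
;

/* TYPE= must be repeated here; it is not inherited from work.mycorr. */
data work.mycorr_xy(type=corr);
   set work.mycorr;
   where _NAME_ in ('X', 'Y');
   drop Z;
run;

/* The data set type is reported in the PROC CONTENTS attribute listing. */
proc contents data=work.mycorr_xy;
run;
```

A data set built this way contains no N, MEAN, or STD observations, so a procedure that reads it has to rely on its own defaults for any of those statistics it needs.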
Table A.1 summarizes the TYPE= data sets that can be used as input to SAS/STAT procedures. Table A.2 summarizes the TYPE= data sets that are created by SAS/STAT procedures and the statements each procedure uses to create its special output data sets. Most procedures accept ordinary SAS data sets and create ordinary output SAS data sets with no TYPE= specification in addition to the special data sets shown in the tables. When you specify a data set with a type that the procedure does not recognize, the procedure prints an error message and stops executing.
Table A.1: TYPE= data sets accepted as input by SAS/STAT procedures

Procedure | Special TYPE= Data Sets Accepted |
---|---|
ACECLUS | ACE, CORR, COV, SSCP, UCORR, UCOV |
BOXPLOT | BOXPLOT, CHARTSUM |
CALIS | CORR, COV, FACTOR, RAM, SSCP, UCORR, UCOV, WEIGHT |
CANDISC | CORR, COV, SSCP, CSSCP |
CATMOD | EST |
CLUSTER | DISTANCE |
DISCRIM | CORR, COV, SSCP, CSSCP, LINEAR, QUAD, MIXED |
FACTOR | ACE, CORR, COV, FACTOR, SSCP, UCORR, UCOV |
LIFEREG | EST |
LOGISTIC | EST, LOGISMOD |
MI | EST, COV, CORR |
MIANALYZE | EST, COV, CORR |
MODECLUS | DISTANCE |
PHREG | EST |
PRINCOMP | ACE, CORR, COV, EST, FACTOR, SSCP, UCORR, UCOV |
PROBIT | EST |
QUANTREG | EST |
REG | CORR, COV, SSCP, UCORR, UCOV |
ROBUSTREG | EST |
SCORE | SCORE= data set can be of any type |
SIMNORM | CORR, COV |
SURVEYLOGISTIC | EST |
STEPDISC | CORR, COV, SSCP, CSSCP |
TREE | TREE |
VARCLUS | CORR, COV, FACTOR, SSCP, UCORR, UCOV |

Table A.2: TYPE= data sets created by SAS/STAT procedures

Procedure | TYPE= | Statement and Option Required |
---|---|---|
ACECLUS | ACE | PROC ACECLUS OUTSTAT= |
BOXPLOT | BOXPLOT | PLOT / OUTBOX= |
CALIS | CORR | PROC CALIS OUTSTAT= |
CANCORR | CORR | PROC CANCORR OUTSTAT= |
CANDISC | CORR | PROC CANDISC OUTSTAT= |
CATMOD | EST | RESPONSE / OUTEST= |
CLUSTER | TREE | PROC CLUSTER OUTTREE= |
DISCRIM | LINEAR | PROC DISCRIM POOL=YES OUTSTAT= |
DISTANCE | DISTANCE | PROC DISTANCE METHOD=distance-method OUT= |
FACTOR | FACTOR | PROC FACTOR OUTSTAT= |
LIFEREG | EST | PROC LIFEREG OUTEST= |
LOGISTIC | EST | PROC LOGISTIC OUTEST= |
MI | COV | EM OUTEM= |
NLIN | EST | PROC NLIN OUTEST= |
ORTHOREG | EST | PROC ORTHOREG OUTEST= |
PHREG | EST | PROC PHREG OUTEST= |
PRINCOMP | CORR | PROC PRINCOMP OUTSTAT= |
PROBIT | EST | PROC PROBIT OUTEST= |
QUANTREG | EST | PROC QUANTREG OUTEST= |
REG | EST | PROC REG OUTEST= |
ROBUSTREG | EST | PROC ROBUSTREG OUTEST= |
VARCLUS | CORR | PROC VARCLUS OUTSTAT= |

Copyright © SAS Institute, Inc. All Rights Reserved.