OUTPUT <OUT=SAS-data-set>
   <keyword<(keyword-options)> <=name>> …
   <keyword<(keyword-options)> <=name>> </ options>;
The OUTPUT statement creates a data set that contains predicted values and residual diagnostics, computed after fitting the model. By default, all variables in the original data set are included in the output data set.
You can use the ID statement to select a subset of the variables from the input data set as well as computed variables for adding to the output data set. If you reassign a data set variable through programming statements, the value of the variable from the input data set supersedes the recomputed value when observations are written to the output data set. If you list the variable in the ID statement, however, PROC GLIMMIX saves the current value of the variable after the programming statements have been executed.
For example, suppose that the data set Scores contains the variables score, machine, and person. The following statements fit a model with fixed machine and random person effects. The variable score divided by 100 is assumed to follow an inverse Gaussian distribution. The (conditional) mean and residuals are saved to the data set igausout. Because no ID statement is given, the variable score in the output data set contains the values from the input data set.
   proc glimmix;
      class machine person;
      score = score/100;
      p = 4*_linp_;
      model score = machine / dist=invgauss;
      random int / sub=person;
      output out=igausout pred=p resid=r;
   run;
In contrast, the following statements list explicitly which variables to save to the OUTPUT data set. Because the variable score is listed in the ID statement and is (re)assigned through programming statements, the values of score saved to the OUTPUT data set are the input values divided by 100.
   proc glimmix;
      class machine person;
      score = score / 100;
      model score = machine / dist=invgauss;
      random int / sub=person;
      output out=igausout pred=p resid=r;
      id machine score _xbeta_ _zgamma_;
   run;
You can specify the following syntax elements in the OUTPUT statement before the slash (/).
OUT=SAS-data-set
   specifies the name of the output data set. If the OUT= option is omitted, the procedure uses the DATAn convention to name the output data set.
keyword<(keyword-options)> <=name>
   specifies a statistic to include in the output data set and optionally assigns the variable the name name. You can use the keyword-options to control which type of a particular statistic to compute. The keyword-options can take on the following values:
   BLUP
      uses the predictors of the random effects in computing the statistic.
   ILINK
      computes the statistic on the scale of the data.
   NOBLUP
      does not use the predictors of the random effects in computing the statistic.
   NOILINK
      computes the statistic on the scale of the link function.
The default is to compute statistics by using BLUPs on the scale of the link function (the linearized scale). For example, the following OUTPUT statements are equivalent:
   output out=out1 pred=predicted lcl=lower;
   output out=out1 pred(blup noilink)=predicted lcl(blup noilink)=lower;
If a particular combination of keyword and keyword options is not supported, the statistic is not computed and a message is produced in the SAS log.
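As a minimal sketch of how the keyword options combine, the following step requests the PREDICTED statistic four times, once for each combination of BLUP/NOBLUP and ILINK/NOILINK. It assumes the Scores data set described earlier; the output data set name predout and the variable names eta, eta_pa, mu, and mu_pa are hypothetical and chosen only for illustration.

```sas
/* Sketch only: assumes the Scores data set (score, machine, person). */
/* predout, eta, eta_pa, mu, and mu_pa are illustrative names.        */
proc glimmix data=Scores;
   class machine person;
   score = score/100;
   model score = machine / dist=invgauss;
   random int / sub=person;
   output out=predout
          pred               = eta      /* default: BLUPs, link scale      */
          pred(noblup)       = eta_pa   /* no BLUPs, link scale (marginal) */
          pred(ilink)        = mu       /* BLUPs, data scale               */
          pred(noblup ilink) = mu_pa;   /* no BLUPs, data scale (marginal) */
run;
```

If the names were omitted, the procedure would assign its default names for these statistics instead.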
A keyword can appear multiple times in the OUTPUT statement. Table 44.15 lists the keywords and the default names assigned by the GLIMMIX procedure if you do not specify a name. In this table, y denotes the observed response, and p denotes the linearized pseudodata. See the section Pseudolikelihood Estimation Based on Linearization for details on notation and the section Notes on Output Statistics for further details regarding the output statistics.
Table 44.15: Keywords for Output Statistics

Keyword    Options       Description                                           Expression  Name
---------  ------------  ----------------------------------------------------  ----------  --------------
PREDICTED  Default       Linear predictor                                                  Pred
           NOBLUP        Marginal linear predictor                                         PredPA
           ILINK         Predicted mean                                                    PredMu
           NOBLUP ILINK  Marginal mean                                                     PredMuPA
STDERR     Default       Standard deviation of linear predictor                            StdErr
           NOBLUP        Standard deviation of marginal linear predictor                   StdErrPA
           ILINK         Standard deviation of mean                                        StdErr
           NOBLUP ILINK  Standard deviation of marginal mean                               StdErrMuPA
RESIDUAL   Default       Residual                                                          Resid
           NOBLUP        Marginal residual                                                 ResidPA
           ILINK         Residual on mean scale                                            ResidMu
           NOBLUP ILINK  Marginal residual on mean scale                                   ResidMuPA
PEARSON    Default       Pearson-type residual                                             Pearson
           NOBLUP        Marginal Pearson-type residual                                    PearsonPA
           ILINK         Conditional Pearson-type mean residual                            PearsonMu
STUDENT    Default       Studentized residual                                              Student
           NOBLUP        Studentized marginal residual                                     StudentPA
LCL        Default       Lower prediction limit for linear predictor                       LCL
           NOBLUP        Lower confidence limit for marginal linear predictor              LCLPA
           ILINK         Lower prediction limit for mean                                   LCLMu
           NOBLUP ILINK  Lower confidence limit for marginal mean                          LCLMuPA
UCL        Default       Upper prediction limit for linear predictor                       UCL
           NOBLUP        Upper confidence limit for marginal linear predictor              UCLPA
           ILINK         Upper prediction limit for mean                                   UCLMu
           NOBLUP ILINK  Upper confidence limit for marginal mean                          UCLMuPA
VARIANCE   Default       Conditional variance of pseudodata                                Variance
           NOBLUP        Marginal variance of pseudodata                                   VariancePA
           ILINK         Conditional variance of response                                  Variance_Dep
           NOBLUP ILINK  Marginal variance of response                                     Variance_DepPA

Studentized residuals are computed only on the linear scale (scale of the link), unless the link is the identity, in which case the two scales are equal. The keywords RESIDUAL, PEARSON, STUDENT, and VARIANCE are not available with the multinomial distribution. You can use the following shortcuts to request statistics: PRED for PREDICTED, STD for STDERR, RESID for RESIDUAL, and VAR for VARIANCE. Output statistics that depend on the marginal variance are not available with METHOD=LAPLACE or METHOD=QUAD.
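The shortcuts and residual keywords above might be used together as in the following sketch. It again assumes the Scores data set from the earlier example; the data set name diagout and the variable names r, rp, rs, and v_y are hypothetical.

```sas
/* Sketch only: assumes the Scores data set (score, machine, person). */
/* diagout, r, rp, rs, and v_y are illustrative names.                */
proc glimmix data=Scores;
   class machine person;
   score = score/100;
   model score = machine / dist=invgauss;
   random int / sub=person;
   output out=diagout
          resid      = r      /* RESID is the shortcut for RESIDUAL     */
          pearson    = rp     /* Pearson-type residual                  */
          student    = rs     /* studentized residual, link scale       */
          var(ilink) = v_y;   /* VAR is the shortcut for VARIANCE;      */
                              /* ILINK requests the variance of the     */
                              /* response rather than of the pseudodata */
run;
```

Because the step uses the default estimation method rather than METHOD=LAPLACE or METHOD=QUAD, the restriction on statistics that depend on the marginal variance does not come into play here.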
Table 44.16 summarizes the options available in the OUTPUT statement.
Table 44.16: OUTPUT Statement Options

Option       Description
-----------  -----------
ALLSTATS     Computes all statistics
ALPHA=       Determines the confidence level (1 – α)
CPSEUDO      Changes the way in which marginal residuals are computed
DERIVATIVES  Adds derivatives of model quantities to the output data set
NOMISS       Outputs only observations used in the analysis
NOUNIQUE     Requests that names not be made unique
NOVAR        Requests that variables from the input data set not be added to the output data set
OBSCAT       Writes statistics to the output data set only for the response level corresponding to the observed level of the observation
SYMBOLS      Adds computed variables to the output data set

You can specify the following options in the OUTPUT statement after a slash (/).
ALLSTATS
   requests that all statistics be computed. If you do not use a keyword to assign a name, the GLIMMIX procedure uses the default name.
ALPHA=number
   determines the coverage probability for two-sided confidence and prediction intervals. The coverage probability is computed as 1 – number. The value of number must be between 0 and 1; the default is 0.05.
CPSEUDO
   changes the way in which marginal residuals are computed when model parameters are estimated by pseudo-likelihood methods. See the section Notes on Output Statistics for details.
DERIVATIVES
   adds derivatives of model quantities to the output data set. If, for example, the model fit requires the (conditional) log likelihood of the data, then the DERIVATIVES option writes for each observation the evaluations of the first and second derivatives of the log likelihood with respect to _LINP_ and _PHI_ to the output data set. The particular derivatives produced by the GLIMMIX procedure depend on the type of model and the estimation method.
NOMISS
   requests that records be written to the output data set only for those observations that were used in the analysis. By default, the GLIMMIX procedure produces output statistics for all observations in the input data set.
NOUNIQUE
   requests that names not be made unique in the case of naming conflicts. By default, the GLIMMIX procedure avoids naming conflicts by assigning a unique name to each output variable. If you specify the NOUNIQUE option, variables with conflicting names are not renamed. In that case, the first variable added to the output data set takes precedence.
NOVAR
   requests that variables from the input data set not be added to the output data set. This option does not apply to variables listed in the BY statement or to computed variables listed in the ID statement.
OBSCAT
   requests that, in models for multinomial data, statistics be written to the output data set only for the response level that corresponds to the observed level of the observation.
SYMBOLS
   adds to the output data set computed variables that are defined or referenced in the program.
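To illustrate how options after the slash interact with the keywords, the following sketch requests 90% limits for the conditional mean and restricts the output to observations used in the analysis. It assumes the Scores data set from the earlier example; the data set name limits and the variable names mu, lower, and upper are hypothetical.

```sas
/* Sketch only: assumes the Scores data set (score, machine, person). */
/* limits, mu, lower, and upper are illustrative names.               */
proc glimmix data=Scores;
   class machine person;
   score = score/100;
   model score = machine / dist=invgauss;
   random int / sub=person;
   output out=limits
          pred(ilink) = mu
          lcl(ilink)  = lower
          ucl(ilink)  = upper
          / alpha=0.1    /* coverage probability 1 - 0.1 = 0.90        */
            nomiss;      /* write only observations used in the fit    */
run;
```

With the default ALPHA=0.05, the same statement would produce 95% limits.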