The HPFDIAGNOSE Procedure
PROC HPFDIAGNOSE Statement
The following options can be used in the PROC HPFDIAGNOSE or HPFDIAG statement.
specifies the significance level to use in computing the confidence limits in the model selection list files. The ALPHA= value must be in the interval (0, 1). The default is ALPHA=0.05, which produces 95% confidence intervals.
specifies the number of observations before the end of the data. If BACK=n and the number of observations is T, then the first T - n observations are used to diagnose a series. The default is BACK=0.
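As an illustration of BACK=, the following sketch holds back the last 12 observations of a series. It assumes the SASHELP.AIR sample data set (a monthly series AIR with time ID variable DATE and 144 observations); the data set, the ID and FORECAST variables, and the use of the ARIMAX statement are assumptions made for the example, not part of the option description.

```sas
/* Sketch: with BACK=12, only the first 144 - 12 = 132 observations */
/* of the assumed SASHELP.AIR series are used for diagnosis.        */
proc hpfdiagnose data=sashelp.air back=12 alpha=0.05;
   id date interval=month;   /* monthly time ID */
   forecast air;             /* series to diagnose */
   arimax;                   /* consider ARIMAX models */
run;
```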
prefixes the model specification file name or the model selection list file name or both. If BASENAME=MYSPEC, then the model specification files or the model selection list files or both are named MYSPEC0, ..., MYSPEC9999999999. The default SAS-name starts with DIAG, such as DIAG0, ..., DIAG9999999999. The model specification files or the model selection list files or both are stored in the model repository defined by the REPOSITORY= option.
specifies the model selection criterion to select the best model. This option is often used in conjunction with the HOLDOUT= and HOLDOUTPCT= options. The default is CRITERION=RMSE. The following statistics of fit are provided:
sum of square error
mean squared error
root mean squared error
unbiased mean squared error
unbiased root mean squared error
maximum percent error
minimum percent error
mean percent error
mean absolute percent error
median percent error
geometric mean percent error
mean absolute error percent of standard deviation
median absolute error percent of standard deviation
geometric mean absolute error percent of standard deviation
minimum predictive percent error
maximum predictive percent error
mean predictive percent error
symmetric mean absolute predictive percent error
median predictive percent error
geometric mean predictive percent error
minimum symmetric percent error
maximum symmetric percent error
mean symmetric percent error
symmetric mean absolute percent error
median symmetric percent error
geometric mean symmetric percent error
minimum relative error
maximum relative error
mean relative error
mean relative absolute error
median relative absolute error
geometric mean relative absolute error
maximum error
minimum error
mean error
mean absolute error
mean absolute scaled error
R-square
adjusted R-square
Amemiya’s adjusted R-square
random walk R-square
Akaike information criterion
corrected Akaike information criterion
Schwarz Bayesian information criterion
Amemiya’s prediction criterion
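A minimal sketch of choosing a non-default criterion together with a holdout sample follows. The keyword MAPE is assumed to be the value that corresponds to "mean absolute percent error" (the keyword column is not shown in the list above), and the SASHELP.AIR data set and its variables are assumptions for the example.

```sas
/* Sketch: select the best model by mean absolute percent error, */
/* evaluated on the last 6 observations (the holdout sample).    */
proc hpfdiagnose data=sashelp.air criterion=mape holdout=6;
   id date interval=month;
   forecast air;
   arimax;   /* candidate ARIMAX models */
   esm;      /* candidate exponential smoothing models */
run;
```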
specifies the name of the SAS data set that contains the time series. If the DATA= option is not specified, the most recently created SAS data set is used.
specifies the delay lag for the events. If the option is not specified, the delay lag for the events is set to zero by default.
specifies the delay lag for the inputs. If the option is not specified, the delay lag for the inputs is appropriately chosen by the procedure.
specifies a threshold to check the percentage increment of the criterion between two candidate models. The ENTRYPCT=value should be in (0,100); the default is ENTRYPCT=0.1.
allows finer control of message printing. The error severity level and the HPFDIAGNOSE procedure processing stages are set independently. The MAXMESSAGE=number option controls the number of messages printed. A logical ‘and’ is taken over all the specified options: a message is printed only if it satisfies every one of them.
Available severity-options are as follows:
specifies low severity, minor issues
specifies medium severity problems
specifies severe errors
specifies all severity levels of LOW, MEDIUM, and HIGH options
specifies that no messages from PROC HPFDIAGNOSE are printed
Available stage-options are as follows:
specifies that the procedure stage is option processing and validation
specifies the accumulation of data and the application of SETMISS= and ZEROMISS= options
specifies the diagnostic process
specifies all PROCEDURELEVEL, DATAPREP, and DIAGNOSE options
Examples are as follows.
The following statement prints high- and moderate-severity errors at any processing stage of PROC HPFDIAGNOSE:
errorcontrol=(severity=(high medium) stage=all)
The following statement prints high-severity errors only during the data preparation:
errorcontrol=(severity=high stage=dataprep)
Each of the following statements turns off messages from PROC HPFDIAGNOSE:
errorcontrol=(severity=none stage=all)
errorcontrol=(maxmessage=0)
Each of the following statements specifies the default behavior:
errorcontrol=( severity=(high medium low)
stage=(procedurelevel dataprep diagnose) )
errorcontrol=(severity=all stage=all)
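The fragments above are option values; in a full step the option goes in the PROC HPFDIAGNOSE statement. The following sketch shows one possible placement. The data set (SASHELP.AIR), its variables, and the MAXMESSAGE=20 limit are assumptions chosen only for illustration.

```sas
/* Sketch: print only high- and medium-severity messages that arise */
/* during data preparation, and cap the number of messages at 20.   */
proc hpfdiagnose data=sashelp.air
                 errorcontrol=(severity=(high medium)
                               stage=dataprep
                               maxmessage=20);
   id date interval=month;
   forecast air;
   arimax;
run;
```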
specifies the name of the event data set that contains the events for specific BY groups that are created by DATA steps. The events in the EVENT statement are used in all BY groups, but the events in the EVENTBY= data set are used in the specific BY group.
specifies the desired handling of arithmetic exceptions during the run. You can specify except-option as one of the following:
specifies that PROC HPFDIAGNOSE stop on an arithmetic exception. No recovery is attempted. This is the default behavior if the EXCEPTIONS= option is not specified.
specifies that PROC HPFDIAGNOSE skip the generation of diagnostic output for the variable that produces the exception in the current BY group. PROC HPFDIAGNOSE generates a record to the OUTEST= data set with a blank select list name in the _SELECT_ column. The blank select list name reflects the handled exception on that combination of variable and BY group.
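The following sketch shows how the skip-and-continue behavior might be requested and then checked. It assumes that CATCH is the keyword for the second behavior described above (the keywords themselves are not shown in this section), and the data set and output names are hypothetical.

```sas
/* Sketch: continue past arithmetic exceptions (CATCH is assumed). */
proc hpfdiagnose data=sashelp.air exceptions=catch outest=work.est;
   id date interval=month;
   forecast air;
   arimax;
run;

/* List OUTEST= records whose selection list name is blank, which */
/* marks a variable and BY group where an exception was handled.  */
proc print data=work.est;
   where _select_ = ' ';
run;
```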
specifies the size of the holdout sample to be used for model selection. The holdout sample is a subset of the dependent time series that ends at the last nonmissing observation. The statistics of a model selection criterion are computed using only the holdout sample. The default is HOLDOUT=0.
specifies the size of the holdout sample as a percentage of the length of the dependent time series. If HOLDOUT=5 and HOLDOUTPCT=10, the size of the holdout sample is min(5, 0.1T), where T is the length of the dependent time series with beginning and ending missing values removed. The default is HOLDOUTPCT=0.
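To make the interaction of HOLDOUT= and HOLDOUTPCT= concrete, the following DATA step works through the arithmetic for a hypothetical series. It assumes that the holdout size is the smaller of the HOLDOUT= value and the HOLDOUTPCT= percentage of the series length T; the value T=30 is made up for the example, and how the procedure rounds a fractional result is not covered here.

```sas
/* Sketch: holdout size as the smaller of HOLDOUT= and       */
/* HOLDOUTPCT= percent of T (assumed rule, hypothetical T).  */
data _null_;
   T          = 30;   /* series length after trimming missing ends */
   holdout    = 5;    /* HOLDOUT= value */
   holdoutpct = 10;   /* HOLDOUTPCT= value */
   size = min(holdout, holdoutpct * T / 100);
   put size=;         /* writes size=3 to the log */
run;
```

With a longer series, for example T=144, the percentage term is 14.4, so the HOLDOUT=5 limit is the one that applies.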
contains information that maps forecast variables to models or selection lists, and data set variables to model variables.
specifies the name of the event data set that contains the event definitions created by the HPFEVENTS procedure. If the INEVENT= data set is not specified, only SAS predefined event definitions can be used in the EVENT statement.
For more information about the INEVENT= option, see Chapter 7, The HPFEVENTS Procedure.
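A sketch of the two-step workflow follows: define an event with PROC HPFEVENTS, then pass the resulting data set through INEVENT=. The event name PROMO, its date, the output data set name, and the SASHELP.AIR input are all hypothetical choices made for the illustration.

```sas
/* Step 1 (sketch): define a point event and save its definition. */
proc hpfevents data=sashelp.air;
   id date interval=month;
   eventdef promo = '01JUN1955'd / type=point;   /* hypothetical event */
   eventdata out=work.myevents;
run;

/* Step 2 (sketch): make the definition available to the EVENT statement. */
proc hpfdiagnose data=sashelp.air inevent=work.myevents;
   id date interval=month;
   forecast air;
   event promo;
   arimax;
run;
```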
specifies the maximum allowed amount of missing data in an input time series, as a percentage of the length of the series. If INPUTMISSINGPCT=50, then an input time series that has more than 50% missing data is ignored in the model. The default is INPUTMISSINGPCT=10.
specifies the name of a catalog entry that serves as a model selection list. This is the selection list that includes existing model specification files. A selection list created by the HPFDIAGNOSE procedure includes the existing model specification files.
specifies that no seasonal model is fitted to any series with fewer nonmissing observations than number × (season length). The value of number must be greater than or equal to 1. The default is number = 2.
specifies that no trend model is fitted to any series with fewer nonmissing observations than number. The value of number must be greater than or equal to 1. The default is number = 1.
specifies that the series is not diagnosed. If the INSELECTNAME= option and OUTEST= option are specified, the existing model specification files are written to the OUTEST data set.
specifies that the selection lists referred to by the INEST= option are not used in the diagnosed version.
contains information that maps data set variables to model symbols and references model specification files and model selection list files.
contains information that is associated with the detected outliers.
names the output data set to contain the summary information of the processing done by PROC HPFDIAGNOSE. It is particularly useful for programmatic assessment of the status of the procedure’s execution through a data set instead of reading or parsing the SAS log.
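The following sketch shows how such a summary data set might be captured and inspected. It assumes that the option is spelled OUTPROCINFO= (the option name is not shown in this section); the data set names are hypothetical.

```sas
/* Sketch: capture the processing summary (option name assumed). */
proc hpfdiagnose data=sashelp.air outprocinfo=work.procinfo;
   id date interval=month;
   forecast air;
   arimax;
run;

/* Inspect the summary in a data set instead of parsing the log. */
proc print data=work.procinfo;
run;
```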
specifies how missing and extreme values are handled prior to diagnostic tests.
Smoothed values for missing data are applied for tentative order selection and missing values are used for the final diagnostics.
Smoothed values for missing data are applied to overall diagnoses. This option is the default.
Extreme values are set to missing for a tentative ARIMA model, and extreme values are used for the final ARIMAX model diagnostics.
This value is equivalent to both YES and EXTREME.
If the input variables have missing values, they are always smoothed for the diagnostics.
suppresses the printed output. This option is the default.
prints the model specifications. This option also prints only the significant input variables, events, and outliers.
prints the summary of the transform, the stationarity test, and the determination of ARMA order in addition to all of the information printed by PRINT=SHORT.
prints the details of the stationarity test and the determination of ARMA order. This option also prints detailed information about all input variables and events under consideration.
contains information about model specification files and model selection list files. The REPOSITORY= option can also be specified as MODELREPOSITORY=, MODELREP=, or REP=. The default model repository is SASUSER.HPFDFLT.
specifies that the CHOOSE= option in the HPFSELECT procedure is respected when re-diagnosing series. The default is RETAINCHOOSE=YES.
specifies the length of the seasonal cycle. The number should be a positive integer. For example, SEASONALITY=3 means that every group of three observations forms a seasonal cycle. By default, the length of the seasonal cycle is 1 (no seasonality) or the length implied by the INTERVAL= option specified in the ID statement. For example, INTERVAL=MONTH implies that the length of the seasonal cycle is 12.
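As a sketch of overriding the seasonal cycle implied by INTERVAL=, the following step treats a monthly series as having a cycle of 6 observations instead of 12. The value 6 and the SASHELP.AIR data set are assumptions chosen only to show the syntax.

```sas
/* Sketch: INTERVAL=MONTH would imply a cycle of 12; */
/* SEASONALITY=6 overrides it for this run.          */
proc hpfdiagnose data=sashelp.air seasonality=6;
   id date interval=month;
   forecast air;
   arimax;
run;
```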
specifies the maximum number of the input variables to select.
selects the input variables that satisfy the criteria (noncollinearity, nonnegative delay, smaller AIC). This option is the default.
selects the input variables that satisfy the criteria (noncollinearity, nonnegative delay).
selects the best number input variables that satisfy the criteria (noncollinearity, nonnegative delay).
specifies the maximum number of events to select.
selects the events that satisfy the criteria (noncollinearity, smaller AIC). This option is the default.
selects the events that satisfy the criteria (noncollinearity).
selects the best number of events that satisfy the criteria (noncollinearity).
specifies the cutoff value for all diagnostic tests such as log transformation, stationarity, tentative ARMA order selection, and significance of UCM components. The SIGLEVEL= value should be in the interval (0, 1); the default is SIGLEVEL=0.05. The SIGLEVEL= options in the TRANSFORM, TREND, ARIMAX, and UCM statements control testing independently.
prefixes the model selection list file name. If SELECTBASE=MYSELECT, then the model selection list files are named MYSELECT0, MYSELECT1, and so on. The default SAS-name starts with DIAG, such as DIAG0, DIAG1, and so on. The model selection list files are stored in the model repository defined by the REPOSITORY= option.
prefixes the model specification file name. If SPECBASE=MYSPEC, then the model specification files are named MYSPEC0, MYSPEC1, and so on. The default SAS-name starts with DIAG, such as DIAG0, DIAG1, and so on. The model specification files are stored in the model repository defined by the REPOSITORY= option.
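The naming options can be combined with REPOSITORY= so that generated files are easy to find. In the following sketch the catalog name WORK.DIAGREP, the prefixes AIRSEL and AIRSPEC, and the OUTEST= data set name are hypothetical.

```sas
/* Sketch: store selection lists as AIRSEL0, AIRSEL1, ... and  */
/* model specifications as AIRSPEC0, AIRSPEC1, ... in the      */
/* hypothetical catalog WORK.DIAGREP.                          */
proc hpfdiagnose data=sashelp.air
                 repository=work.diagrep
                 selectbase=airsel
                 specbase=airspec
                 outest=work.est;
   id date interval=month;
   forecast air;
   arimax;
   esm;
run;
```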
specifies that the log transform testing of the input variables is applied independently of the variable to be forecast.
specifies that the trend testing of the input variables is applied independently of the variable to be forecast.
specifies that the log transform and trend testing of the input variables are applied independently of the variable to be forecast.
If this option is not specified, the same differencing is applied to the input variables as is used for the variable to be forecast, and no transformation is applied to the input variables.
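A sketch of requesting independent testing of an input variable follows. It assumes that the option is TESTINPUT= with the value BOTH for the third behavior described above (the option and value names are not shown in this section), and it uses one series from the SASHELP.PRICEDATA sample data set as an assumed example with SALE as the dependent variable and PRICE as the input.

```sas
/* Sketch: test PRICE for log transform and trend independently */
/* of SALE (option name and value are assumptions).             */
proc hpfdiagnose
      data=sashelp.pricedata(where=(region=1 and line=1 and product=1))
      testinput=both;
   id date interval=month;
   forecast sale;
   input price;
   arimax;
run;
```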
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.