
The HPFRECONCILE Procedure

Data Set Input/Output

AGGDATA= Data Set

The AGGDATA= data set contains a proper subset (possibly none) of the variables specified in the BY statement, the time ID variable in the ID statement (when this statement is specified), and the following variables:

_NAME_

variable name

PREDICT

predicted values

The following variables can optionally be present in the AGGDATA= data set and are used when available. If not present, their value is assumed to be missing for computational purposes.

ACTUAL

actual values

LOWER

lower confidence limits

UPPER

upper confidence limits

ERROR

prediction errors

STD

prediction standard errors

Typically, the AGGDATA= data set is generated by the OUTFOR= option of the HPFENGINE procedure. See Chapter 5, The HPFENGINE Procedure, for more details.

The AGGDATA= data set must be either sorted by the AGGBY variables and by the ID variable (when the latter is specified) or indexed on the AGGBY variables. Even when the data set is indexed, if the ID variable is specified, its values must be sorted in ascending order within each AGGBY group. See the section BY Statement for details about AGGBY variables and AGGBY groups.
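As a minimal sketch of the sort requirement, assume a hypothetical AGGDATA= data set named WORK.LVL1FOR with one AGGBY variable REGION, a time ID variable DATE, and a single forecast variable (so that _NAME_ is constant). None of these names come from this section; they are placeholders.

```sas
/* Hypothetical names: WORK.LVL1FOR, REGION (AGGBY variable), DATE (time ID). */
/* Sort by the AGGBY variables first, then by the time ID variable.           */
proc sort data=work.lvl1for;
   by region date;
run;
```

An index on REGION could be used instead of sorting, but the DATE values would still need to be in ascending order within each REGION group.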

You can specify custom names for the variables in the AGGDATA= data set by using the AGGDATA statement. See the section AGGDATA Statement for more details.

DISAGGDATA= Data Set

The DISAGGDATA= data set contains the variables specified in the BY statement, the variable in the ID statement (when this statement is specified), and the following variables:

_NAME_

variable name

PREDICT

predicted values

The following variables can optionally be present in the DISAGGDATA= data set and are used when available. If not present, their value is assumed to be missing for computational purposes.

ACTUAL

actual values

LOWER

lower confidence limits

UPPER

upper confidence limits

ERROR

prediction errors

STD

prediction standard errors

Typically, the DISAGGDATA= data set is generated by the OUTFOR= option of the HPFENGINE procedure. See Chapter 5, The HPFENGINE Procedure, for more details.

The DISAGGDATA= data set must be either sorted by the BY variables and by the ID variable (when the latter is specified) or indexed on the BY variables. If the variable _NAME_ is present and has multiple values, then the index must be a composite index on the BY variables and _NAME_, in that order. If _NAME_ is present and has only one value, then the index can contain only the BY variables. Even when the data set is indexed, if the ID variable is specified, its values must be sorted in ascending order within each BY group, or within each BY and _NAME_ group, as applicable.

Indexing the DISAGGDATA= data set on the BY variables when it is already sorted by the BY variables leads to less efficient and less scalable operation if the available memory is not sufficient to hold the disaggregated data for the AGGBY group that is being processed. The amount of memory required depends on, among other things, the length of the series, the number of BY groups for each AGGBY group, and the number and format of the BY variables. For example, if there are four BY variables, each 16 characters long, 10,000 BY groups within each AGGBY group, and each series has length 100, then the minimum required memory for efficient processing is approximately 100 MB. If the memory is not sufficient, sorting the DISAGGDATA= data set instead of indexing it is more efficient.
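The composite index described above could be created as in the following sketch. The data set name WORK.LVL2FOR, the BY variables REGION and PRODUCT, the time ID variable DATE, and the index name BY_NAME are all hypothetical.

```sas
/* Hypothetical names: WORK.LVL2FOR, BY variables REGION and PRODUCT. */
/* Composite index: BY variables first, then _NAME_, in that order.   */
proc datasets library=work nolist;
   modify lvl2for;
   index create by_name = (region product _name_);
quit;

/* Alternative when memory is limited: sort instead of indexing.      */
/* DATE is the assumed time ID variable.                              */
proc sort data=work.lvl2for;
   by region product _name_ date;
run;
```

Only one of the two steps is needed. The sort keeps the ID values in ascending order within each BY and _NAME_ group, which the index alone does not guarantee.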

You can specify custom names for the variables in the DISAGGDATA= data set by using the DISAGGDATA statement. See the section DISAGGDATA Statement for more details.

CONSTRAINT= Data Set

The CONSTRAINT= data set specifies the constraints to be applied to the reconciled forecasts. It contains the BY variables for the level at which reconciled forecasts are generated. That is, it contains the AGGBY variables when DIRECTION=BU, and the variables specified in the BY statement when DIRECTION=TD. If the _NAME_ variable is present in the AGGDATA= and DISAGGDATA= data sets, it must also be present in the CONSTRAINT= data set. Additionally, the CONSTRAINT= data set contains the variable in the ID statement (when this statement is specified), and the following variables:

EQUALITY

an equality constraint for the predicted reconciled value

UNLOCK

a flag that specifies whether the equality constraint should be strictly enforced. Admissible values are as follows:

0

The equality constraint is locked.

1

The equality constraint is unlocked.

When EQUALITY is nonmissing and the UNLOCK flag is missing, the equality is treated as locked.

LOWERBD

lower bounds for the reconciled forecasts

UPPERBD

upper bounds for the reconciled forecasts

Locked equality constraints are treated as constraints, and therefore their value is honored. Unlocked equalities are instead treated as regular forecasts and, in general, are changed by the reconciliation process.

A constraint is said to be active when the reconciled prediction lies on the constraint. By definition, locked equalities are always active constraints.
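The following DATA step is a sketch of what a CONSTRAINT= data set might look like for DIRECTION=TD. The data set name, the BY variables REGION and PRODUCT, the time ID variable DATE, and all values are hypothetical; only EQUALITY, UNLOCK, LOWERBD, UPPERBD, and _NAME_ are variable names taken from this section.

```sas
/* Hypothetical BY variables REGION and PRODUCT, and time ID DATE. */
data work.constraint;
   length _name_ $32 region $8 product $8;
   format date date9.;
   /* A period (.) in the data lines is read as a missing numeric value. */
   input _name_ $ region $ product $ date :date9.
         equality unlock lowerbd upperbd;
   datalines;
sales East Chairs 01JAN2024 100 0 . .
sales East Tables 01JAN2024 50 1 . .
sales West Chairs 01JAN2024 . . 0 .
sales West Tables 01JAN2024 . . 10 80
;
run;
```

In this sketch the first row is a locked equality (UNLOCK=0), the second is an unlocked equality (UNLOCK=1), the third sets only a lower bound, and the fourth sets both bounds. The rows are already in REGION, PRODUCT order, which matches the sorted-order requirement when NOTSORTED is not specified.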

If the NOTSORTED option is specified in the BY statement, then any BY group in the CONSTRAINT= data set that is out of order with respect to the BY groups in the AGGDATA= or DISAGGDATA= data set is ignored without any error or warning message. If the NOTSORTED option is not specified, then the BY groups in the CONSTRAINT= data set must be in the same sorted order as the AGGBY groups in the AGGDATA= data set when DIRECTION=BU, and in the same sorted order as the BY groups in the DISAGGDATA= data set when DIRECTION=TD; otherwise, processing stops at the first mismatch.

OUTFOR= Data Set

The OUTFOR= data set contains the following variables:

_NAME_

variable name

ACTUAL

actual values

PREDICT

predicted values

LOWER

lower confidence limits

UPPER

upper confidence limits

ERROR

prediction errors

STD

prediction standard errors

_RECONSTATUS_

reconciliation status

Additionally, it contains any other variable that was present in the input data set at the same level—that is, the DISAGGDATA= data set when DIRECTION=TD and the AGGDATA= data set when DIRECTION=BU.

When DIRECTION=BU and the AGGDATA= data set has not been specified, the OUTFOR= data set contains the variables in the previous list, the BY variables specified in the AGGBY statement, and the time ID variable in the ID statement.

If reconciliation fails with _RECONSTATUS_ between 1000 and 6000, PROC HPFRECONCILE copies the input values of the relevant variables to the OUTFOR= data set. If a variable is not present in the input data set, its value is set to missing in the OUTFOR= data set. The only exception to this rule is when the problem is infeasible and the FORCECONSTRAINT option is specified. See the section The FORCECONSTRAINT Option for more details on the latter case.

The OUTFOR= data set is always sorted by the BY variables (and by the _NAME_ variable and time ID variable when these variables are present) even if input data sets are indexed and not sorted.

If the ID statement is specified, then the values of the ID variable in the OUTFOR= data set are aligned based on the ALIGN= and INTERVAL= options specified in the ID statement. If the ALIGN= option is not specified, then the values are aligned to the beginning of the interval.

If the RECDIFF option of the HPFRECONCILE statement has been specified, the OUTFOR= data set also contains the following variable:

RECDIFF

difference between the reconciled predicted value and the original predicted value
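As an illustration, the following sketch requests the RECDIFF variable in a top-down reconciliation and then lists the observations whose predictions changed. The data set names, the BY variables REGION and PRODUCT, the time ID variable DATE, and the INTERVAL= value are assumptions made for the example.

```sas
/* Hypothetical data set names; REGION, PRODUCT, and DATE are assumed names. */
proc hpfreconcile disaggdata=work.lvl2for
                  aggdata=work.lvl1for
                  direction=TD
                  outfor=work.lvl2recfor
                  recdiff;
   id date interval=month;
   by region product;
run;

/* List the observations whose predictions changed during reconciliation.   */
/* The second condition is needed because a missing value is "ne 0" in SAS. */
proc print data=work.lvl2recfor;
   where recdiff ne 0 and recdiff is not missing;
   var region product date predict recdiff;
run;
```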

The _RECONSTATUS_ variable contains a code that specifies whether the reconciliation was successful or not. A corresponding message is also displayed in the log. You can use the ERRORTRACE= option to define how often the error and warning messages are displayed in the log. The _RECONSTATUS_ variable can take the following values:

0

Reconciliation was successful.

400

An unlocked equality constraint has been imposed.

500

A locked equality constraint has been imposed.

600

A lower bound is active.

700

An upper bound is active.

1000

The ID value is outside the range defined by the START= and END= options.

2000

There is insufficient data to reconcile.

3000

Reconciliation failed for the predicted value. This implies that it also failed for the confidence limits and standard error.

4000

Reconciliation failed for the standard error.

5000

Reconciliation failed for the confidence limits.

6000

The constrained optimization problem is infeasible.

7000

The option DISAGGREGATION=PROPORTION has been changed to DISAGGREGATION=DIFFERENCE for this observation because of a discordant sign in the input.

8000

The option STDMETHOD= provided by the user has been changed for this observation.

9000

The option CLMETHOD= provided by the user has been changed for this observation.

10000

The standard error hit the limits imposed by the STDDIFBD= option.

11000

Multiple warnings have been displayed in the log for this observation.

12000

The number of missing values in the STD variable in the DISAGGDATA= data set is different from the number of missing values in the union of the PREDICT and ACTUAL variables.

13000

The solution might be suboptimal. This means that the optimizer did not find an optimal solution, but the solution provided satisfies all constraints.

14000

A failed forecast ".F" has been detected in a relevant input variable.
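One way to get an overview of the status codes in an OUTFOR= data set is to label them with a user-defined format and tabulate them. The sketch below assumes a hypothetical OUTFOR= data set named WORK.LVL2RECFOR and a hypothetical format name RECSTAT. The labels are abbreviations of the descriptions above and cover only codes 0 through 6000.

```sas
/* Hypothetical format RECSTAT: short labels for selected status codes. */
proc format;
   value recstat
          0 = 'Successful'
        400 = 'Unlocked equality imposed'
        500 = 'Locked equality imposed'
        600 = 'Lower bound active'
        700 = 'Upper bound active'
       1000 = 'ID outside START=/END= range'
       2000 = 'Insufficient data'
       3000 = 'Failed for predicted value'
       4000 = 'Failed for standard error'
       5000 = 'Failed for confidence limits'
       6000 = 'Infeasible problem'
      other = 'Other code (see list)';
run;

/* Count observations by labeled reconciliation status. */
proc freq data=work.lvl2recfor;
   tables _reconstatus_ / missing;
   format _reconstatus_ recstat.;
run;
```

Because all codes outside the listed ones fall under the OTHER label, the frequency table groups the warning codes from 7000 upward into a single row. Extend the VALUE statement if those need to be separated.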

The FORCECONSTRAINT Option

The FORCECONSTRAINT option applies when there are conflicts between the aggregation constraint and one or more constraints that you specify using the CONSTRAINT= data set, the SIGN= option, or the WEIGHTED option with zero weights. By default, when reconciliation is impossible, PROC HPFRECONCILE copies the input to the OUTFOR= data set without modification. However, if the reconciliation is infeasible because of a conflict between the constraints you specified and the aggregation constraint, you can ask PROC HPFRECONCILE to impose your constraints on the output even though that results in a violation of the aggregation constraint. For example, assume the input is described by the diagram in Figure 10.1 and assume you want to impose the following constraints on the reconciled forecasts: .

Figure 10.1 FORCECONSTRAINT Option

The constraints are clearly in conflict with the aggregation constraint ; therefore, PROC HPFRECONCILE will consider the problem infeasible. If you do not specify the FORCECONSTRAINT option, the predicted values in the OUTFOR= data set will equal the input predicted values (that is, ) and the _RECONSTATUS_ variable will take the value 6000. If you specify the FORCECONSTRAINT option, the OUTFOR= data set will contain the values .
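The following sketch sets up a small infeasible problem of the same kind. All data set names, variable names, and numbers are hypothetical and are not the values shown in Figure 10.1: a parent forecast of 10 is to be disaggregated to two children whose locked equality constraints sum to 14. The ID statement is omitted on the assumption that it is optional, as the data set descriptions in this section suggest.

```sas
/* All names and numbers are hypothetical, not the values of Figure 10.1. */

/* AGGDATA=: one parent node with a predicted value of 10. */
data work.parent;
   length _name_ $32 region $8;
   input _name_ $ region $ predict;
   datalines;
sales East 10
;
run;

/* DISAGGDATA=: two child nodes. */
data work.children;
   length _name_ $32 region $8 product $8;
   input _name_ $ region $ product $ predict;
   datalines;
sales East Chairs 4
sales East Tables 6
;
run;

/* CONSTRAINT=: locked equalities that sum to 14, not 10. */
data work.cons;
   length _name_ $32 region $8 product $8;
   input _name_ $ region $ product $ equality unlock;
   datalines;
sales East Chairs 7 0
sales East Tables 7 0
;
run;

proc hpfreconcile disaggdata=work.children aggdata=work.parent
                  constraint=work.cons direction=TD
                  outfor=work.recfor forceconstraint;
   by region product;
run;
```

Following the description above, running this without FORCECONSTRAINT should leave the child predictions at their input values with _RECONSTATUS_ equal to 6000, whereas with FORCECONSTRAINT the output should honor the two locked equalities even though they no longer add up to the parent.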

OUTINFEASIBLE= Data Set

The OUTINFEASIBLE= data set contains summary information about the nodes in the hierarchy for which reconciliation is infeasible because the aggregation constraint is incompatible with the constraints supplied by the user.

The OUTINFEASIBLE= data set is always produced at the level of the AGGDATA= data set.

The OUTINFEASIBLE= data set contains the AGGBY variables present in the AGGDATA= data set, the time ID variable, when it is specified, and the following variables:

_NAME_

variable name

ISRECONCILED

takes value 1 when the node is reconciled, and value 0 when it is not

FINALPREDICT

the predicted value for the parent node

AGGCHILDPREDICT

the aggregated prediction of the children nodes

LOWERBD

the lower bound implied by the constraints on FINALPREDICT

UPPERBD

the upper bound implied by the constraints on FINALPREDICT

If the ID statement is specified, then the values of the ID variable in the OUTINFEASIBLE= data set are aligned based on the ALIGN= and INTERVAL= options specified in the ID statement. If the ALIGN= option is not specified, then the values are aligned to the beginning of the interval.
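To see how far each unreconciled node is from feasibility, the OUTINFEASIBLE= data set can be post-processed as in this sketch. WORK.INFEAS is a hypothetical name for the data set given in the OUTINFEASIBLE= option, and GAP is a hypothetical derived variable.

```sas
/* WORK.INFEAS is a hypothetical name given in OUTINFEASIBLE=. */
data work.infeas_gap;
   set work.infeas;
   /* Keep only the nodes that were not reconciled. */
   where isreconciled = 0;
   /* Distance between the parent prediction and the children's aggregate. */
   gap = finalpredict - aggchildpredict;
run;

proc print data=work.infeas_gap;
   var _name_ finalpredict aggchildpredict lowerbd upperbd gap;
run;
```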

OUTNODESUM= Data Set

The OUTNODESUM= data set contains the BY variables in the AGGDATA= data set (or in the AGGBY statement if the AGGDATA= data set is not specified), the time ID variable in the ID statement when this statement is specified, and the following variables:

_NAME_

variable name

NONMISSCHLD

number of nonmissing children of the current AGGBY group

OUTPROCINFO= Data Set

The OUTPROCINFO= data set contains the following variables:

_SOURCE_

source procedure that produces this data set

_STAGE_

stage of the procedure execution for which the summary variable is reported

_NAME_

name of the summary variable

_LABEL_

description of the summary variable

_VALUE_

value of the summary variable

For PROC HPFRECONCILE, the value of the _SOURCE_ variable is HPFRECONCILE and the value of the _STAGE_ variable is ALL for all observations. The data set contains observations that correspond to each of the following values of the _NAME_ variable:

NOBS_RECON

total number of observations subject to reconciliation

NOBS_SUCCESS

number of observations with successful reconciliation

NOBS_PREDICTFAIL

number of observations for which reconciliation failed for PREDICT. This number does not include failures to reconcile due to an infeasible problem or a failed (".F") forecast.

NOBS_PROBLEMSTATUS

number of observations for which some problem was encountered. This is the number of observations in the OUTFOR= data set that have a _RECONSTATUS_ value greater than or equal to 1000.

NOBS_INFEASIBLE

number of observations for which reconciliation is infeasible due to incompatible constraints

NOBS_SUBOPTIMAL

number of observations for which the optimizer did not find an optimal solution

NOBS_LOCKEQ

number of observations subject to a locked equality constraint

NOBS_USER_LOCKEQ

number of observations for which a locked equality was specified in the CONSTRAINT= data set

NOBS_LOWERBD

number of observations for which a lower bound was imposed

NOBS_USER_LOWERBD

number of observations for which a lower bound was specified in the CONSTRAINT= data set

NOBS_UPPERBD

number of observations for which an upper bound was imposed

NOBS_USER_UPPERBD

number of observations for which an upper bound was specified in the CONSTRAINT= data set

NOBS_ACTIVE_LOWERBD

number of observations for which a lower bound is active

NOBS_ACTIVE_UPPERBD

number of observations for which an upper bound is active

NOBS_FAILED_FORECAST

number of observations for which a failed forecast (".F") was written

NPROB_TOTAL

total number of possible problems. One reconciliation problem is possible for each distinct value of the time ID variable that appears in the AGGDATA= or DISAGGDATA= data set.

NPROB_CHANGED_LOSS

number of reconciliation problems for which the DISAGGREGATION= option was changed internally because the supplied or default option was not feasible

NPROB_CHANGED_CLMETHOD

number of reconciliation problems for which the CLMETHOD= option was changed internally because the supplied or default option was not feasible

NPROB_CHANGED_STDMETHOD

number of reconciliation problems for which the STDMETHOD= option was changed internally because the supplied or default option was not feasible

NPROB_RECON

total number of problems subject to reconciliation

NPROB_INFEASIBLE

number of infeasible reconciliation problems due to incompatible constraints

NPROB_SUBOPTIMAL

number of reconciliation problems for which the optimizer did not find an optimal solution

NPROB_OPTIMIZER

number of reconciliation problems solved by using the optimizer

ID_MIN

minimum value of time ID for which reconciliation was attempted

ID_MAX

maximum value of time ID for which reconciliation was attempted

NAGGBY

total number of AGGBY groups processed

AVGDBY_PERABY

average number of BY groups per AGGBY group

NBY_IRREGULAR_ID

number of BY groups processed partially because of irregular ID values

NAGGBY_IRREGULAR_ID

number of AGGBY groups processed partially because of irregular ID values

AVGACTIVEDBY_PERID

average number of active BY groups per time ID for which reconciliation was attempted

NCONS_READ

total number of constraints read in the CONSTRAINT= data set

NCONS_IN_VALID_ID_RANGE

total number of constraints in the CONSTRAINT= data set with time ID values in the [START=,END=] range

NCONS_USED

number of constraints used

CONS_ISANYUNMATCHED

If any of the constraints in the CONSTRAINT= data set is left unmatched and unprocessed, then _VALUE_ for this observation is set to 1; otherwise, it is set to 0.

RC

return code of PROC HPFRECONCILE
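A few of these summary values can be pulled into the log and into macro variables for use in later steps, as in the sketch below. WORK.PROCINFO is a hypothetical name for the data set given in the OUTPROCINFO= option, the HPFREC_ prefix is arbitrary, and the code assumes that the _NAME_ values are stored in uppercase as listed above.

```sas
/* WORK.PROCINFO is a hypothetical name given in OUTPROCINFO=. */
data _null_;
   set work.procinfo;
   where _name_ in ('RC', 'NOBS_PROBLEMSTATUS', 'NOBS_INFEASIBLE',
                    'CONS_ISANYUNMATCHED');
   /* Write each selected summary value to the SAS log. */
   put _name_= _value_=;
   /* Store the value in a macro variable named after the summary variable. */
   call symputx(cats('hpfrec_', _name_), _value_);
run;

%put NOTE: PROC HPFRECONCILE return code is &hpfrec_rc;
```

Macro variable names are not case sensitive, so the value stored under HPFREC_RC can be referenced as &hpfrec_rc.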
