A key feature of PROC HPSEVERITY is that it enables you to specify whether a severity event’s magnitude is observable and, if it is observable, whether the exact value of the magnitude is known. If an event is unobservable when its magnitude falls in certain intervals, the effect is referred to as truncation. If the exact magnitude of the event is not known, but it is known to lie in a certain interval, the effect is referred to as censoring.
PROC HPSEVERITY allows a severity event to be subject to any combination of the following four censoring and truncation effects:
Left-truncation: An event is said to be left-truncated if it is observed only when $Y > T^l$, where $Y$ denotes the random variable for the magnitude and $T^l$ denotes a random variable for the truncation threshold. You can specify left-truncation using the LEFTTRUNCATED= option in the LOSS statement.
Right-truncation: An event is said to be right-truncated if it is observed only when $Y \le T^r$, where $Y$ denotes the random variable for the magnitude and $T^r$ denotes a random variable for the truncation threshold. You can specify right-truncation using the RIGHTTRUNCATED= option in the LOSS statement.
Left-censoring: An event is said to be left-censored if it is known that the magnitude is $Y \le C^l$, but the exact value of $Y$ is not known. $C^l$ is a random variable for the censoring limit. You can specify left-censoring using the LEFTCENSORED= option in the LOSS statement.
Right-censoring: An event is said to be right-censored if it is known that the magnitude is $Y > C^r$, but the exact value of $Y$ is not known. $C^r$ is a random variable for the censoring limit. You can specify right-censoring using the RIGHTCENSORED= option in the LOSS statement.
For each effect, you can specify a different threshold or limit for each observation or specify a single threshold or limit that applies to all the observations.
If all four types of effects are present on an event, then the following relationship holds: $T^l < C^r \le C^l \le T^r$. PROC HPSEVERITY checks these relationships and writes a warning to the SAS log if any is violated.
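The ordering check can be illustrated with a minimal Python sketch. It assumes the relationship $T^l < C^r \le C^l \le T^r$ and that all four values are available for the event. The function name `violated_relations` and its return format are hypothetical. This is not how PROC HPSEVERITY is implemented; it only shows which pairwise comparisons the relationship implies.

```python
def violated_relations(t_l, c_r, c_l, t_r):
    """Return the links of T^l < C^r <= C^l <= T^r that do not hold.

    Hypothetical helper for illustration only. An empty list means the
    thresholds and limits are mutually consistent.
    """
    violations = []
    if not t_l < c_r:      # left-truncation threshold must lie strictly below C^r
        violations.append("T^l < C^r")
    if not c_r <= c_l:     # right-censoring limit must not exceed left-censoring limit
        violations.append("C^r <= C^l")
    if not c_l <= t_r:     # left-censoring limit must not exceed right-truncation threshold
        violations.append("C^l <= T^r")
    return violations


# Example: a right-censoring limit above the left-censoring limit is flagged.
print(violated_relations(t_l=100, c_r=900, c_l=800, t_r=1000))
```

Reporting each violated link separately mirrors the idea of warning about any relationship that fails, rather than stopping at the first one.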
If the response variable is specified in the LOSS statement, then PROC HPSEVERITY also checks whether each observation satisfies the definitions of the specified censoring and truncation effects. If left-truncation is specified, then PROC HPSEVERITY ignores observations where $Y \le T^l$, because such observations are not observable by definition. Similarly, if right-truncation is specified, then PROC HPSEVERITY ignores observations where $Y > T^r$. If left-censoring is specified, then PROC HPSEVERITY treats an observation with $Y > C^l$ as uncensored and ignores the value of $C^l$. Observations with $Y \le C^l$ are considered left-censored, and the value of $Y$ is ignored. If right-censoring is specified, then PROC HPSEVERITY treats an observation with $Y \le C^r$ as uncensored and ignores the value of $C^r$. Observations with $Y > C^r$ are considered right-censored, and the value of $Y$ is ignored. The specification of both left-censoring and right-censoring is referred to as interval-censoring. If $C^r < C^l$ is satisfied for an observation, then the observation is considered interval-censored and the value of the response variable is ignored. If $C^r = C^l$ for an observation, then PROC HPSEVERITY assumes that observation to be uncensored. If all the observations in a data set are censored in some form, then the specification of the response variable in the LOSS statement is optional, because the actual value of the response variable is not required for the purposes of estimating a model.
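The per-observation rules above can be summarized in a short Python sketch. The function `classify_observation`, its argument names, and its string labels are hypothetical. The sketch also makes two assumptions that the text does not state: the truncation checks are applied before the censoring checks, and a response value is always supplied. A limit passed as `None` means the corresponding effect is not specified.

```python
def classify_observation(y, t_l=None, t_r=None, c_l=None, c_r=None):
    """Classify one observation under the censoring and truncation rules.

    Illustrative only; not the procedure's actual implementation.
    """
    # Truncation: such observations are unobservable by definition, so drop them.
    if t_l is not None and y <= t_l:
        return "ignored"
    if t_r is not None and y > t_r:
        return "ignored"

    # Both censoring limits specified: interval-censoring rules apply.
    if c_l is not None and c_r is not None:
        if c_r < c_l:
            return "interval-censored"   # response value is ignored
        if c_r == c_l:
            return "uncensored"
        # C^r > C^l is not covered by the rules sketched here.
        raise ValueError("right-censoring limit exceeds left-censoring limit")

    if c_l is not None:
        return "left-censored" if y <= c_l else "uncensored"
    if c_r is not None:
        return "right-censored" if y > c_r else "uncensored"
    return "uncensored"


print(classify_observation(300, c_l=500))
print(classify_observation(700, c_r=500))
```

Note the asymmetry in the boundaries: a response equal to the left-censoring limit counts as left-censored, whereas a response equal to the right-censoring limit counts as uncensored.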
Specification of censoring and truncation affects the likelihood of the data (see the section Likelihood Function) and how the empirical distribution function (EDF) is estimated (see the section Empirical Distribution Function Estimation Methods).
If left-truncation or right-truncation is specified, then the EDF estimates that are computed by all methods except the STANDARD method are conditional on the truncation information. See the section EDF Estimates and Truncation for more information. In such cases, PROC HPSEVERITY uses conditional estimates of the CDF when it computes the EDF-based statistics of fit.
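Let $t^l_{\min} = \min_i \{t^l_i\}$ be the smallest value of the left-truncation threshold ($t^l_i$ is the left-truncation threshold for observation $i$) and $t^r_{\max} = \max_i \{t^r_i\}$ be the largest value of the right-truncation threshold ($t^r_i$ is the right-truncation threshold for observation $i$). If $\hat{F}(y)$ denotes the unconditional estimate of the CDF at $y$, then the conditional estimate $\hat{F}^c(y)$ is computed as follows:
If an observation is both left-truncated and right-truncated, then $\hat{F}^c(y) = \frac{\hat{F}(y) - \hat{F}(t^l_{\min})}{\hat{F}(t^r_{\max}) - \hat{F}(t^l_{\min})}$
If an observation is left-truncated but not right-truncated, then $\hat{F}^c(y) = \frac{\hat{F}(y) - \hat{F}(t^l_{\min})}{1 - \hat{F}(t^l_{\min})}$
If an observation is right-truncated but not left-truncated, then $\hat{F}^c(y) = \frac{\hat{F}(y)}{\hat{F}(t^r_{\max})}$
If regressors are specified, then $\hat{F}(y)$, $\hat{F}(t^l_{\min})$, and $\hat{F}(t^r_{\max})$ are all computed from a mixture distribution, as described in the section CDF Estimates with Regression Effects.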
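The three truncation cases can be sketched in Python as follows. The function `conditional_cdf` and the uniform CDF used in the example are hypothetical, chosen only to make the arithmetic easy to follow. The sketch assumes the unconditional CDF estimate is available as a callable `F` and that the denominators are nonzero. It does not cover the mixture-distribution computation that applies when regressors are specified.

```python
def conditional_cdf(F, y, t_l_min=None, t_r_max=None):
    """Conditional CDF estimate at y, given the truncation thresholds.

    F        unconditional CDF estimate (callable), assumed available
    t_l_min  smallest left-truncation threshold, or None if not left-truncated
    t_r_max  largest right-truncation threshold, or None if not right-truncated
    """
    F_y = F(y)
    if t_l_min is not None and t_r_max is not None:
        # Both left- and right-truncated.
        F_l, F_r = F(t_l_min), F(t_r_max)
        return (F_y - F_l) / (F_r - F_l)
    if t_l_min is not None:
        # Left-truncated only.
        F_l = F(t_l_min)
        return (F_y - F_l) / (1.0 - F_l)
    if t_r_max is not None:
        # Right-truncated only.
        return F_y / F(t_r_max)
    # No truncation: the conditional and unconditional estimates coincide.
    return F_y


def uniform_cdf(x):
    """Hypothetical unconditional CDF: uniform on [0, 10]."""
    return min(max(x / 10.0, 0.0), 1.0)


print(conditional_cdf(uniform_cdf, 5, t_l_min=2, t_r_max=8))
```

With the uniform example, truncating to the interval between 2 and 8 places the point 5 at the midpoint of the observable range, so the conditional estimate is approximately 0.5, the same as the unconditional value of 0.5 only because both ranges are symmetric about 5.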