Figure 32.12 shows the transformations that are available when you select from the list. These transformations are often used to improve the normality of a variable. Equations for these transformations are given in Table 32.2.
Figure 32.12: Normalizing Transformations
Table 32.2: Description of Normalizing Transformations
Transformation | Default Parameter | Name of New Variable | Equation |
---|---|---|---|
log(Y+a) | | Log_Y | |
log10(Y+a) | | Log10_Y | |
sqrt(Y+a) | | Sqrt_Y | |
exp(Y) | | Exp_Y | |
power(Y;a) | | Pow_Y | if a is not integral |
arcsinh(Y) | | Arcsinh_Y | |
Box-Cox(Y;a) | MLE | BC_Y | See text. |
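As a rough illustration of the simpler rows in Table 32.2, the following Python sketch applies each transformation to a single value by reading the transformation name literally. The function name `simple_transforms` and the explicit argument `a` are hypothetical. The software's default value of `a` and its handling of `power(Y;a)` are not reproduced here, so the caller must supply `a` and the power transformation is left out.

```python
import math


def simple_transforms(y, a):
    """Apply the shift-based and parameter-free transformations to one value.

    `a` is the shift parameter for the log and square-root rows. It is a
    required argument here because this sketch does not know the software's
    default. Keys follow the "Name of New Variable" column of the table.
    """
    return {
        "Log_Y": math.log(y + a),      # natural logarithm of the shifted value
        "Log10_Y": math.log10(y + a),  # base-10 logarithm of the shifted value
        "Sqrt_Y": math.sqrt(y + a),    # square root of the shifted value
        "Exp_Y": math.exp(y),          # exponential, no parameter
        "Arcsinh_Y": math.asinh(y),    # inverse hyperbolic sine, no parameter
    }


if __name__ == "__main__":
    for name, value in simple_transforms(99.0, 1.0).items():
        print(name, value)
```

The log and square-root rows fail with a `ValueError` when `y + a` is negative (and the logs also when it is zero), which is why a shift is usually chosen so that every shifted value is positive.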
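The Box-Cox transformation (Box and Cox 1964) is a one-parameter family of power transformations that includes the logarithmic transformation as a limiting case. For $y > 0$,

$$
\mathrm{BC}(y; \lambda) =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[1ex]
\log(y) & \text{if } \lambda = 0
\end{cases}
$$

You can specify the parameter, $\lambda$, for the Box-Cox transformation, but typically you choose a value for $\lambda$ that maximizes (or nearly maximizes) a log-likelihood function.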
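The following Python sketch shows the standard Box-Cox form (a scaled power for a nonzero parameter, the natural logarithm when the parameter is zero) and checks numerically that the power branch approaches the logarithm as the parameter shrinks. The function name `box_cox` and the argument name `lam` are hypothetical, and the code is an illustration only, not the software's implementation.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a single positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("the Box-Cox transformation requires y > 0")
    if lam == 0:
        return math.log(y)          # limiting case of the power family
    return (y ** lam - 1.0) / lam   # scaled power transformation


if __name__ == "__main__":
    # The power branch gets closer to log(y) as lam approaches 0.
    for lam in (1.0, 0.5, 0.1, 0.001):
        print(lam, box_cox(2.0, lam), math.log(2.0))
```

Subtracting 1 and dividing by the parameter is what makes the family continuous at zero; a bare power `y ** lam` would collapse to the constant 1 there.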
SAS/IML Studio plots the log-likelihood function versus the parameter, as shown in Figure 32.8. An inset gives the lower and upper 95% confidence limits for the maximum log-likelihood estimate, the MLE estimate, and a convenient estimate. A convenient estimate is a fraction with a small denominator (such as an integer, a half integer, or an integer multiple of or ) that is within the 95% confidence limits about the MLE. Although the value of the parameter is not bounded, SAS/IML Studio graphs the log-likelihood function restricted to the interval .
A dialog box (see Figure 32.9) also appears that prompts you to enter the parameter value to use for the Box-Cox transformation.
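One way to picture how the inset values relate to each other is the Python sketch below. It is not SAS/IML Studio's algorithm. It assumes a likelihood-ratio interval (all parameter values whose log-likelihood lies within half the 95th percentile of a chi-square distribution with 1 degree of freedom of the maximum), a plain grid search, and a rule that prefers the smallest denominator and then the value closest to the MLE. The names `inset_summary`, `steps`, and `max_denominator` are hypothetical, and the grid endpoints are supplied by the caller.

```python
import math
from fractions import Fraction

# Half of the 95th percentile of a chi-square distribution with 1 df.
HALF_CHI2_95 = 1.920729


def inset_summary(loglik, lo, hi, steps=600, max_denominator=2):
    """Return (mle, lower, upper, convenient) for a log-likelihood function.

    loglik is any callable of one parameter. The search is restricted to an
    evenly spaced grid on [lo, hi]. convenient is a Fraction, or None when no
    candidate fraction falls inside the interval.
    """
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    values = [loglik(lam) for lam in grid]

    best = max(range(len(grid)), key=values.__getitem__)
    mle = grid[best]

    # Assumed likelihood-ratio interval: grid points close enough to the maximum.
    cutoff = values[best] - HALF_CHI2_95
    inside = [lam for lam, v in zip(grid, values) if v >= cutoff]
    lower, upper = min(inside), max(inside)

    # Fractions n/d with small d inside [lower, upper]. Fraction reduces
    # automatically, so 2/2 is stored as the integer 1.
    candidates = {
        Fraction(n, d)
        for d in range(1, max_denominator + 1)
        for n in range(math.ceil(lower * d), math.floor(upper * d) + 1)
    }
    if not candidates:
        return mle, lower, upper, None

    # Assumed preference: simplest denominator first, then nearest to the MLE.
    convenient = min(candidates, key=lambda f: (f.denominator, abs(float(f) - mle)))
    return mle, lower, upper, convenient


if __name__ == "__main__":
    # Toy concave function with its peak at 0.4, used only for illustration.
    toy = lambda lam: -10.0 * (lam - 0.4) ** 2
    print(inset_summary(toy, -3.0, 3.0))
```

With the default `max_denominator=2`, only integers and half integers are considered; passing a larger value admits fractions with larger denominators.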
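The log-likelihood function for the Box-Cox transformation is defined as follows. Write the normalized Box-Cox transformation with parameter $\lambda$, $z(\lambda; y)$, as

$$
z(\lambda; y) =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda \, \dot{y}^{\lambda - 1}} & \text{if } \lambda \neq 0 \\[1ex]
\dot{y} \log(y) & \text{if } \lambda = 0
\end{cases}
$$

where $\dot{y}$ is the geometric mean of y. Let N be the number of nonmissing values, and define

$$
s^2(\lambda; z) = \frac{1}{N} \sum_{i=1}^{N} \left( z_i - \bar{z} \right)^2
$$

where $z_i = z(\lambda; y_i)$ and $\bar{z}$ is the mean of the $z_i$. The log-likelihood function is (Atkinson 1985, p. 87)

$$
\ell(\lambda; z) = -\frac{N}{2} \log\left( s^2(\lambda; z) \right)
$$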
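The following Python sketch evaluates this kind of log-likelihood directly. It assumes the usual normalized form: the Box-Cox transform is divided by the geometric mean raised to the power (parameter minus 1), the variance uses divisor N, and the log-likelihood is minus N/2 times the log of that variance, with additive constants dropped. The function name `box_cox_loglik` and the use of `None` to mark missing values are hypothetical.

```python
import math


def box_cox_loglik(y, lam):
    """Log-likelihood of the Box-Cox parameter lam for positive data y."""
    values = [v for v in y if v is not None]   # keep nonmissing values only
    n = len(values)

    # Geometric mean, computed on the log scale to avoid overflow.
    gmean = math.exp(sum(math.log(v) for v in values) / n)

    # Normalized Box-Cox transformation.
    if lam == 0:
        z = [gmean * math.log(v) for v in values]
    else:
        scale = lam * gmean ** (lam - 1.0)
        z = [(v ** lam - 1.0) / scale for v in values]

    zbar = sum(z) / n
    s2 = sum((zi - zbar) ** 2 for zi in z) / n  # variance with divisor N
    return -0.5 * n * math.log(s2)


if __name__ == "__main__":
    # Data whose logarithms are symmetric about 0, so lam = 0 should score well.
    data = [math.exp(x) for x in (-2, -1, 0, 1, 2)]
    for lam in (-1.0, 0.0, 1.0):
        print(lam, box_cox_loglik(data, lam))
```

Dividing by a power of the geometric mean puts the transformed values on a comparable scale for every parameter value, which is what allows the variances, and therefore the log-likelihoods, to be compared across parameters.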