Nonparametric Regression

Parametric regression models express the mean of an observation as a function of the regressor variables $x_1,\ldots,x_k$ and the parameters $\beta_1,\ldots,\beta_p$:

\[  \mathrm{E}[Y] = f(x_1,\ldots,x_k;\beta_1,\ldots,\beta_p)  \]

Nonparametric regression techniques relax the assumption of linearity in the regression parameters, and they also do not require that you specify a precise functional form for the relationship between response and regressor variables. Consider a regression problem in which the relationship between response $Y$ and regressor $X$ is to be modeled. It is assumed that $Y_i = g(x_i) + \epsilon_i$, where $g(\cdot)$ is an unspecified regression function and the errors $\epsilon_i$ have mean zero, so that $\mathrm{E}[Y_i] = g(x_i)$. Two primary approaches in nonparametric regression modeling are as follows:

  • Approximate $g(x_i)$ locally by a parametric function that is constructed from information in a local neighborhood of $x_i$.

  • Approximate the unknown function $g(x_i)$ by a smooth, flexible function and determine the necessary smoothness and continuity properties from the data.
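
The two approaches can be illustrated with a minimal Python sketch that uses NumPy. It is an illustration under stated assumptions, not the algorithm of any SAS/STAT procedure. The function names (`local_linear_fit`, `penalized_smoother`, `choose_lambda_gcv`), the `span` neighborhood rule, and the tricube weights are all choices made for this example. The first function fits a weighted line in a neighborhood of a target point. The second uses a second-difference penalty (a Whittaker-type smoother, chosen here as a simple stand-in for a spline smoother) and assumes equally spaced, ordered regressor values. The third picks the smoothing parameter from the data by generalized cross validation.

```python
import numpy as np


def local_linear_fit(x, y, x0, span=0.3):
    """Approach 1: estimate g(x0) from a weighted line fit near x0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(3, int(np.ceil(span * len(x))))  # neighborhood size (assumed rule)
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:k]               # the k points nearest to x0
    h = dist[idx].max()                      # local bandwidth
    if h == 0.0:                             # all neighbors coincide with x0
        return float(y[idx].mean())
    w = (1.0 - (dist[idx] / h) ** 3) ** 3    # tricube weights, zero at the edge
    # Weighted least squares for a line centered at x0:
    # the fitted intercept is the estimate of g(x0).
    X = np.column_stack([np.ones(len(idx)), x[idx] - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
    return float(beta[0])


def penalized_smoother(y, lam):
    """Approach 2: minimize ||y - g||^2 + lam * ||second differences of g||^2.

    Assumes the y values are ordered by equally spaced x values.
    Returns the fitted values and the hat matrix H, where fit = H @ y.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)      # (n-2) x n second-difference matrix
    H = np.linalg.inv(np.eye(n) + lam * D.T @ D)
    return H @ y, H


def choose_lambda_gcv(y, grid):
    """Pick lam from a grid of positive values by generalized cross validation."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    scores = []
    for lam in grid:
        fit, H = penalized_smoother(y, lam)
        rss = float(np.sum((y - fit) ** 2))
        scores.append(n * rss / (n - np.trace(H)) ** 2)
    return grid[int(np.argmin(scores))]


# Simulated data (hypothetical): a sine curve observed with noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

g_local = np.array([local_linear_fit(x, y, x0) for x0 in x])
lam = choose_lambda_gcv(y, [0.1, 1.0, 10.0, 100.0])
g_smooth, _ = penalized_smoother(y, lam)
```

In this sketch the `span` argument plays the role of the local neighborhood size in the first approach, and `lam` controls the smoothness of the fitted function in the second. Both estimators reproduce a straight line exactly, because a line is fit without error by the local model and is not penalized by second differences.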

The SAS/STAT procedures ADAPTIVEREG, LOESS, TPSPLINE, and GAM fit nonparametric regression models by one of these methods.