Nonparametric Regression

Regression models that assume a parametric form express the mean of an observation as a function of regressor variables $x_1,\cdots,x_k$ and parameters $\beta_1,\cdots,\beta_p$:

\[ \mathrm{E}[Y] = f(x_1,\cdots,x_k;\beta_1,\cdots,\beta_p) \]

Nonparametric regression techniques not only relax the assumption of linearity in the regression parameters, but they also do not require that you specify a precise functional form for the relationship between response and regressor variables. Consider a regression problem where the relationship between response $Y$ and regressor $X$ is to be modeled. It is assumed that $Y_i = g(x_i) + \epsilon_i$, where $g(\cdot)$ is an unspecified regression function and the errors $\epsilon_i$ have mean zero, so that $\mathrm{E}[Y_i] = g(x_i)$. Two primary approaches in nonparametric regression modeling are as follows:

  • approximate $g(x_i)$ locally by a parametric function constructed from information in a local neighborhood of $x_i$

  • approximate the unknown function $g(x_i)$ by a smooth, flexible function and determine the necessary smoothness and continuity properties from the data
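
The first approach can be illustrated with a small sketch. The Python code below fits a weighted straight line in a neighborhood of a target point and evaluates it there. It is a minimal illustration under stated assumptions, not the algorithm that any SAS/STAT procedure implements: the function names `tricube` and `local_linear_fit`, the `span` parameter (the fraction of observations that form the neighborhood), and the choice of a tricube weight function are all assumptions made for this example.

```python
import math

def tricube(u):
    """Tricube weight for a scaled distance u >= 0 (zero at and beyond u = 1)."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def local_linear_fit(x0, xs, ys, span=0.5):
    """Estimate g(x0) from a weighted straight-line fit near x0.

    span is a hypothetical tuning parameter: the fraction of the
    observations that make up the local neighborhood of x0.
    """
    n = len(xs)
    k = min(n, max(3, math.ceil(span * n)))   # neighborhood size
    dists = [abs(x - x0) for x in xs]
    h = sorted(dists)[k - 1]                  # distance to the k-th nearest point

    if h == 0.0:
        # All k nearest points coincide with x0: average their responses.
        at_x0 = [y for d, y in zip(dists, ys) if d == 0.0]
        return sum(at_x0) / len(at_x0)

    w = [tricube(d / h) for d in dists]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw   # weighted mean of x
    my = sum(wi * y for wi, y in zip(w, ys)) / sw   # weighted mean of y
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))

    if sxx == 0.0:
        # No spread in x among the weighted points: fall back to the mean.
        return my
    slope = sxy / sxx
    return my + slope * (x0 - mx)             # local line evaluated at x0

if __name__ == "__main__":
    # Illustrative data generated from a known smooth function.
    xs = [i / 50 for i in range(101)]
    ys = [math.sin(x) for x in xs]
    print(local_linear_fit(1.0, xs, ys, span=0.2))
```

A smaller `span` follows the data more closely at the cost of a rougher estimate, and a larger `span` gives a smoother but potentially more biased estimate. Repeating the fit over a grid of target points traces out an estimate of $g(\cdot)$.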

The SAS/STAT procedures LOESS, GAM, and TPSPLINE fit nonparametric regression models by one of these methods.