Nonparametric Regression
Nonparametric regression relaxes the usual assumption of linearity and enables you to uncover relationships between the
independent variables and the dependent variable that might otherwise be missed.
The SAS/STAT nonparametric regression procedures include the following:
ADAPTIVEREG Procedure
The ADAPTIVEREG procedure fits multivariate adaptive regression splines. The method is a nonparametric regression technique
that combines both regression splines and model selection methods. It does not assume parametric model forms and does not
require specification of knot values for constructing regression spline terms. Instead, it constructs spline basis functions
in an adaptive way by automatically selecting appropriate knot values for different variables and obtains reduced models by
applying model selection techniques.
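The adaptive knot selection described above can be illustrated with a minimal sketch. This is not the ADAPTIVEREG implementation: it assumes a single predictor, a single candidate pair of mirrored hinge functions, and ordinary least squares, and it omits the forward and backward selection passes. The function names (`hinge_pair`, `fit_with_knot`, `best_knot`) and the data are hypothetical.

```python
def hinge_pair(x, knot):
    """Return the mirrored hinge values max(0, x - knot) and max(0, knot - x)."""
    return max(0.0, x - knot), max(0.0, knot - x)


def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def fit_with_knot(xs, ys, knot):
    """Least squares fit of y on [1, max(0, x - knot), max(0, knot - x)].

    Returns (coefficients, sum of squared errors). The knot must lie strictly
    inside the data range so that neither hinge column is identically zero.
    """
    rows = [[1.0, *hinge_pair(x, knot)] for x in xs]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(xtx, xty)
    sse = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, y in zip(rows, ys))
    return coef, sse


def best_knot(xs, ys, candidates):
    """Pick the candidate knot whose hinge-pair fit has the smallest SSE."""
    return min(candidates, key=lambda t: fit_with_knot(xs, ys, t)[1])


# Made-up data with a kink at x = 4 (slope -1 on the left, +2 on the right).
xs = [float(i) for i in range(11)]
ys = [3.0 + 2.0 * max(0.0, x - 4.0) + 1.0 * max(0.0, 4.0 - x) for x in xs]
candidates = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(best_knot(xs, ys, candidates))
```

In this toy setting the knot is chosen from the data rather than specified in advance, which is the idea behind not requiring knot values from the user.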
The procedure enables you to do the following:
 specify classification variables with ordering options
 partition your data into training, validation, and testing roles
 specify the distribution family used in the model
 specify the link function in the model
 specify an offset
 specify the maximum number of basis functions that can be used in the final model
 specify the maximum interaction levels for effects that could potentially enter the model
 specify the incremental penalty for increasing the number of variables in the model
 specify the effects to be included in the final model
 request an additive model for which only main effects are included in the fitted model
 specify the parameter that controls the number of knots considered for each variable

 force effects into the final model or restrict variables to linear forms
 specify options for fast forward selection
 perform leave-one-out and k-fold cross validation
 produce a graphical representation of the selection process, model fit, functional components, and fit diagnostics
 create an output data set that contains predicted values and residuals
 create an output data set that contains the design matrix of formed basis functions
 specify multiple SCORE statements, which create new SAS data sets that contain predicted values and residuals
 perform BY group processing to obtain separate analyses on grouped observations
 automatically produce graphs by using ODS Graphics

For further details, see
ADAPTIVEREG Procedure
GAM Procedure
The GAM procedure fits generalized additive models as those models are defined by Hastie and Tibshirani (1990).
This procedure provides powerful tools for nonparametric regression and smoothing.
The generalized additive models fit by the GAM procedure combine an additivity assumption (Stone 1985), which enables
relatively many nonparametric relationships to be explored simultaneously, with the distributional flexibility of generalized
linear models (Nelder and Wedderburn 1972). The following are highlights of the procedure's features:
 permits the following smoothing effects:
   smoothing spline (SPLINE)
   local regression (LOESS)
   bivariate thin-plate smoothing spline (SPLINE2)
 supports the following distribution families for the response variable:
   Gaussian (continuous response variables)
   binomial (binary response variables)
   Poisson (nonnegative discrete response variables)
   gamma (positive continuous response variables)
   inverse Gaussian (positive continuous response variables)

 supports the use of multidimensional data
 fits both generalized semiparametric additive models and generalized additive models
 enables you to choose a particular model by specifying the model degrees of freedom or smoothing parameter
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 scores new data sets
 creates an output data set that contains diagnostic measures
 creates a SAS data set that corresponds to any output table
 automatically creates graphs by using ODS Graphics
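As a reading aid, the additive structure that PROC GAM relies on can be written in the standard Hastie and Tibshirani form. The symbols here ($g$ for the link function, $s_j$ for the smooth functions, $p$ for the number of smoothed predictors) are notation introduced for this sketch, not taken from the text above.

```latex
g\bigl(\mathrm{E}[Y \mid X_1,\dots,X_p]\bigr) \;=\; s_0 + \sum_{j=1}^{p} s_j(X_j)
```

Each $s_j$ is an unspecified smooth function estimated from the data (for example, by a smoothing spline or local regression), and $g$ is the link function that corresponds to the chosen distribution family.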

For further details, see
GAM Procedure
GAMPL Procedure
The GAMPL procedure is a high-performance procedure that fits
generalized additive models that are based on low-rank regression splines.
This procedure provides powerful tools for
nonparametric regression and smoothing.
Generalized additive models are extensions of generalized linear
models. They relax the linearity assumption in generalized linear
models by allowing spline terms in order to characterize nonlinear
dependency structures. Each spline term is constructed by the
thin-plate regression spline technique.
A roughness penalty is applied to each spline term by a smoothing
parameter that controls the balance between goodness of fit and
the roughness of the spline curve.
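The trade-off that the smoothing parameter controls can be sketched numerically. The following is only an illustration under simplifying assumptions: a Gaussian response (so the penalized likelihood reduces to penalized least squares), equally spaced points, and squared second differences as a discrete stand-in for the roughness penalty. The function names and the data are hypothetical, and this is not how PROC GAMPL computes its fit.

```python
def roughness(f):
    """Sum of squared second differences of the fitted values (discrete roughness)."""
    return sum((f[i - 1] - 2.0 * f[i] + f[i + 1]) ** 2
               for i in range(1, len(f) - 1))


def penalized_criterion(y, f, lam):
    """Residual sum of squares plus lam times the roughness of the fit."""
    fit = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
    return fit + lam * roughness(f)


# Made-up noisy observations around a straight line.
y = [0.0, 1.2, 1.8, 3.3, 3.9, 5.1]

wiggly = list(y)                        # interpolates the data: no residual, but rough
smooth = [float(i) for i in range(6)]   # straight line: no roughness, some residual

for lam in (0.01, 1.0):
    print(lam,
          penalized_criterion(y, wiggly, lam),
          penalized_criterion(y, smooth, lam))
```

With a small smoothing parameter the interpolating curve has the lower criterion value, and with a large one the straight line does, which is the balance between goodness of fit and roughness described above.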
PROC GAMPL fits models for standard distributions in the
exponential family, such as normal, Poisson, and gamma distributions.
PROC GAMPL runs in either single-machine mode or distributed mode.
The following are highlights of the procedure's features:
 estimates the regression parameters of a generalized additive
model that has fixed smoothing parameters by using penalized
likelihood estimation
 estimates the smoothing parameters of a generalized additive
model by using either the performance iteration method or the
outer iteration method
 estimates the regression parameters of a generalized linear
model by using maximum likelihood techniques
 tests the total contribution of each spline term based on the Wald statistic
 provides model-building syntax in the CLASS
statement and effect-based parametric effects in the
MODEL statement, which are used in other
SAS/STAT analytic procedures (in particular, the GLM, LOGISTIC,
GLIMMIX, and MIXED procedures)
 provides response-variable options
 enables you to construct a spline term by using multiple variables
 provides control options for constructing a spline term, such as
fixed degrees of freedom, initial smoothing parameter, fixed
smoothing parameter, smoothing parameter search range, and user-supplied
knot values

 provides multiple link functions for any distribution
 provides a WEIGHT statement for weighted analysis
 provides a FREQ statement for grouped analysis
 provides an OUTPUT statement to produce a data set
that has predicted values and other observation-wise statistics
 produces graphs by using ODS Graphics
 enables you to run in distributed mode on a cluster of machines that
distribute the data and the computations
 enables you to run in single-machine mode on the server where SAS is installed
 exploits all the available cores and concurrent threads, regardless of execution mode

For further details, see
GAMPL Procedure
LOESS Procedure
The LOESS procedure implements a nonparametric method for estimating regression surfaces. PROC LOESS allows great flexibility
because no assumptions about the parametric form of the regression surface are needed. The following are highlights of the LOESS
procedure's features:
 supports the use of multidimensional data
 supports multiple dependent variables
 supports both direct fitting and interpolated fitting that uses k-d trees
 performs statistical inference
 performs automatic smoothing parameter selection
 performs iterative reweighting to provide robust fitting when there are outliers in the data
 scores external data sets

 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 performs weighted estimation
 creates a SAS data set that contains the predicted values and other requested statistics
 creates a SAS data set that corresponds to any output table
 automatically creates graphs by using ODS Graphics
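For orientation, local regression at a single point can be sketched as a weighted straight-line fit over the nearest observations. This is a simplified illustration, not PROC LOESS itself: it assumes one predictor, a local linear fit, tricube weights, and direct fitting, and it omits k-d tree interpolation, robust reweighting, and automatic smoothing parameter selection. The names `tricube` and `loess_point` and the `span` parameter are hypothetical.

```python
import math


def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, otherwise 0."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_point(xs, ys, x0, span=0.5):
    """Local linear estimate at x0 using the nearest span * n observations."""
    n = len(xs)
    q = min(n, max(3, math.ceil(span * n)))       # neighborhood size
    h = sorted(abs(x - x0) for x in xs)[q - 1]    # distance to the q-th nearest point
    if h == 0.0:
        raise ValueError("neighborhood has zero width; increase span")
    w = [tricube((x - x0) / h) for x in xs]

    # Weighted least squares for a straight line, in closed form.
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    denom = sw * swxx - swx ** 2
    if denom == 0.0:
        return swy / sw                           # fall back to a weighted mean
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept + slope * x0


# Made-up data that lie exactly on a line; a local linear fit reproduces it.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
print(loess_point(xs, ys, 2.5))
```

Repeating the computation over a grid of points traces out the fitted curve; a smaller `span` follows the data more closely and a larger one gives a smoother fit.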

For further details, see
LOESS Procedure
TPSPLINE Procedure
The TPSPLINE procedure uses the penalized least squares method to fit a nonparametric regression model. It computes thin-plate smoothing
splines to approximate smooth multivariate functions observed with noise. The TPSPLINE procedure allows great flexibility in the possible
form of the regression surface. In particular, PROC TPSPLINE makes no assumptions of a parametric form for the model.
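The penalized least squares criterion behind thin-plate smoothing splines can be written, for the purely nonparametric case, in the usual textbook form. The notation ($n$ observations $(\mathbf{x}_i, y_i)$, smoothing parameter $\lambda$, penalty order $m$) is introduced here for illustration and is an assumption rather than a quotation of the procedure's documentation.

```latex
S_\lambda(f) \;=\; \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - f(\mathbf{x}_i)\bigr)^2 \;+\; \lambda\, J_m(f)
```

For two predictors and $m = 2$, the roughness penalty takes the familiar form:

```latex
J_2(f) \;=\; \iint \left[
  \left(\frac{\partial^2 f}{\partial x_1^2}\right)^{2}
  + 2\left(\frac{\partial^2 f}{\partial x_1\,\partial x_2}\right)^{2}
  + \left(\frac{\partial^2 f}{\partial x_2^2}\right)^{2}
\right] dx_1\, dx_2
```

A larger $\lambda$ puts more weight on the penalty and yields a smoother surface, while $\lambda \to 0$ approaches interpolation of the data.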
The following are highlights of the TPSPLINE procedure's features:
 supports the use of multidimensional data
 supports multiple SCORE statements
 fits both semiparametric models and nonparametric models
 provides options for handling large data sets
 supports multiple dependent variables

 enables you to choose a particular model by specifying the model degrees of freedom or smoothing parameter
 performs BY group processing, which enables you to obtain separate analyses on grouped observations
 creates a SAS data set that corresponds to any output table
 automatically creates graphs by using ODS Graphics

For further details, see
TPSPLINE Procedure