### PROC TPSPLINE with Large Data Sets

The calculation of the penalized least squares estimate is computationally intensive. The amount of memory and CPU time needed
for the analysis depends on the number of unique design points, which corresponds to the number of unknown parameters to be
estimated.

You can specify the D= option in the MODEL statement to reduce the number of unknown parameters. The option groups design points that fall within the specified
range of one another (see the D= option in the MODEL statement syntax).
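As a minimal sketch of where the option goes, the following step requests grouping when fitting a two-variable smoothing spline. The data set `work.large`, the variables `y`, `x1`, and `x2`, and the value `0.05` are hypothetical and chosen only for illustration; an appropriate value depends on the scale of your smoothing variables.

```sas
/* Hypothetical input: work.large with response y and          */
/* smoothing variables x1 and x2.                              */
proc tpspline data=work.large;
   /* D= is a MODEL statement option; 0.05 is an illustrative  */
   /* value, not a recommendation.                             */
   model y = (x1 x2) / d=0.05;
   /* Save the predicted values to a data set.                 */
   output out=work.est pred;
run;
```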

PROC TPSPLINE selects one design point from each group and treats all observations in the group as replicates of that design
point. Calculation of the thin-plate smoothing spline estimates is based on the reprocessed data. Which design point is chosen
from a group depends on the order of the data. Hence, different orders of the input data might result in different estimates.
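Because the grouping depends on the order of the observations, one way to make repeated runs comparable is to fix that order before the analysis. The sketch below sorts by the smoothing variables first; the data set and variable names and the D= value are hypothetical. Sorting does not make any particular grouping more correct, it only makes the result reproducible for a given input.

```sas
/* Fix the input order so that the same design point is        */
/* selected from each group on every run.                      */
proc sort data=work.large out=work.large_sorted;
   by x1 x2;
run;

/* Fit the spline on the sorted data; d=0.05 is illustrative.  */
proc tpspline data=work.large_sorted;
   model y = (x1 x2) / d=0.05;
   output out=work.est pred;
run;
```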

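By combining several design points into one, this option reduces the number of unique design points, so the fit is based on an
approximation of the original data. The value you specify for the D= option determines the width of the range used to group the data;
a wider range merges more design points, which lowers the computational cost but makes the approximation coarser.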

Copyright © SAS Institute Inc. All Rights Reserved.