Shared Concepts and Topics

Splines and Spline Bases

This section provides details about the construction of spline bases with the EFFECT statement. A spline function is a piecewise polynomial function in which the individual polynomials have the same degree and connect smoothly at join points whose abscissa values, referred to as knots, are prespecified. You can use spline functions to fit curves to a wide variety of data.

A spline of degree 0 is a step function with steps located at the knots. A spline of degree 1 is a piecewise linear function where the lines connect at the knots. A spline of degree 2 is a piecewise quadratic curve whose values and slopes coincide at the knots. A spline of degree 3 is a piecewise cubic curve whose values, slopes, and curvature coincide at the knots. Visually, a cubic spline is a smooth curve, and it is the most commonly used spline when a smooth fit is desired. Note that when no knots are used, splines of degree d are simply polynomials of degree d.

More formally, suppose you specify knots $k_1 < k_2 < k_3 < \cdots < k_n$. Then a spline of degree $d\geq 0$ is a function $S(x)$ with d – 1 continuous derivatives such that

\[ S(x) = \left\{ \begin{array}{ll} P_0(x) & \quad x<k_1 \\ P_i(x) & \quad k_i\leq x<k_{i+1}; \, i=1, 2, \dots , n-1 \\ P_n(x) & \quad x\geq k_n \end{array} \right. \]

where each $P_i(x)$ is a polynomial of degree d. The requirement that $S(x)$ has d – 1 continuous derivatives is met by requiring that, at each knot, the adjacent polynomials agree in value and in all derivatives up to order d – 1.
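The piecewise definition above can be sketched in a few lines of Python. This is only an illustration, not how the EFFECT statement evaluates splines: the function name eval_piecewise and the coefficient layout (one list per polynomial, lowest power first) are assumptions made for the example. The knot interval is located with a binary search so that the index matches the convention $k_i \leq x < k_{i+1}$.

```python
from bisect import bisect_right


def eval_piecewise(knots, pieces, x):
    """Evaluate S(x) for sorted knots k_1..k_n and n + 1 polynomials.

    pieces[i] holds the coefficients of P_i, lowest power first
    (a hypothetical layout chosen for this sketch).
    """
    if len(pieces) != len(knots) + 1:
        raise ValueError("need exactly one more polynomial than knots")
    # bisect_right returns 0 when x < k_1, i when k_i <= x < k_{i+1},
    # and n when x >= k_n, which is the index of the active polynomial.
    i = bisect_right(knots, x)
    result = 0.0
    for coefficient in reversed(pieces[i]):  # Horner's rule
        result = result * x + coefficient
    return result


# Degree-1 example with knots 0 and 1: flat, then a ramp, then flat again.
knots = [0.0, 1.0]
pieces = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
```

In this degree-1 example only the function values need to match at the knots (d – 1 = 0), and they do: the ramp equals 0 at the first knot and 1 at the second, which agrees with the two flat pieces.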

A counting argument yields the number of parameters that define a spline with n knots. There are n + 1 polynomials of degree d, giving $(n+1)(d+1)$ coefficients. However, there are d restrictions at each of the n knots (the function value and the first d – 1 derivatives must match), so the number of free parameters is $(n+1)(d+1) - nd = n + d + 1$. In mathematical terminology, this says that the dimension of the vector space of splines of degree d on n distinct knots is n + d + 1. If you have n + d + 1 basis functions for this space, then you can fit a curve to your data by regressing your dependent variable on them, with each basis function supplying one column of the design matrix. In this context, such a spline is known as a regression spline. The EFFECT statement provides a simple mechanism for obtaining such a basis.
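The counting argument can be checked with a short Python sketch. The function names are hypothetical, and the basis used here is the common textbook form of the truncated power function basis (the powers $1, x, \dots, x^d$ followed by one truncated power $(x-k_j)_+^d$ per knot). It is assumed only for illustration; the columns that the EFFECT statement actually produces might be ordered or scaled differently.

```python
def spline_dimension(n_knots, degree):
    """Free parameters of a degree-d spline on n distinct knots."""
    coefficients = (n_knots + 1) * (degree + 1)  # n + 1 polynomials
    restrictions = n_knots * degree              # d matches at each knot
    return coefficients - restrictions


def truncated_power_row(x, knots, degree):
    """One design-matrix row: 1, x, ..., x^d, then (x - k_j)_+^d per knot."""
    powers = [x ** p for p in range(degree + 1)]
    # The explicit comparison keeps degree 0 a step function at each knot.
    truncated = [(x - k) ** degree if x >= k else 0.0 for k in knots]
    return powers + truncated


# A cubic spline on three knots has 3 + 3 + 1 = 7 basis columns.
example_row = truncated_power_row(0.5, [0.25, 0.5, 0.75], 3)
```

Stacking one such row per observation gives a design matrix with n + d + 1 columns, which is the number of free parameters that the counting argument predicts.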

If you remove the restriction that the knots of a spline must be distinct and allow repeated knots, then you can obtain functions with less smoothness and even discontinuities at the repeated knot location. For a spline of degree d and a repeated knot with multiplicity $m \leq d$, the piecewise polynomials that join at such a knot are required to have only d – m matching derivatives. Note that this increases the number of free parameters by m – 1 but also decreases the number of distinct knots by m – 1. Hence the dimension of the vector space of splines of degree d with n knots is still n + d + 1, provided that any repeated knot has a multiplicity less than or equal to d.
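The bookkeeping for repeated knots can be verified with a small sketch. The function name is hypothetical, and the sketch assumes that a knot of multiplicity m imposes d – m + 1 restrictions: the function value plus the d – m matching derivatives. With m = 1 this reduces to the d restrictions used in the count for distinct knots.

```python
from collections import Counter


def dimension_with_repeats(knots, degree):
    """Free parameters when knots may repeat (multiplicity <= degree)."""
    multiplicities = Counter(knots)
    if any(m > degree for m in multiplicities.values()):
        raise ValueError("multiplicity must not exceed the degree")
    distinct = len(multiplicities)
    coefficients = (distinct + 1) * (degree + 1)
    # Value plus (degree - m) derivatives must match at each distinct knot.
    restrictions = sum(degree - m + 1 for m in multiplicities.values())
    return coefficients - restrictions


# Four knots counted with multiplicity, degree 3: dimension 4 + 3 + 1 = 8.
repeated = dimension_with_repeats([1.0, 2.0, 2.0, 3.0], 3)
```

In the example, doubling the knot at 2 leaves three distinct knots but removes one restriction there, so the result matches n + d + 1 with n = 4 knots counted with multiplicity.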

The EFFECT statement provides support for the commonly used truncated power function basis and B-spline basis. With exact arithmetic and by using the complete basis, you obtain the same fit with either of these bases. The following sections provide details about constructing spline bases for the space of splines of degree d with n knots that satisfy $k_1\leq k_2\leq k_3\leq \cdots \leq k_n$.