# The HPPRINCOMP Procedure

### Example 58.3 Extracting Principal Components with NIPALS

This example demonstrates the NIPALS method in PROC HPPRINCOMP, which extracts principal components successively. The data that this example uses are from the Getting Started section; they provide crime rates per 100,000 people in seven categories for each of the 50 US states in 1977. The following DATA step generates the data:

```
data Crime;
   title 'Crime Rates per 100,000 Population by State';
   input State $1-15 Murder Rape Robbery Assault
         Burglary Larceny Auto_Theft;
   datalines;
Alabama        14.2 25.2  96.8 278.3 1135.5 1881.9 280.7
Alaska         10.8 51.6  96.8 284.0 1331.7 3369.8 753.3
Arizona         9.5 34.2 138.2 312.3 2346.1 4467.4 439.5
Arkansas        8.8 27.6  83.2 203.4  972.6 1862.1 183.4
California     11.5 49.4 287.0 358.0 2139.4 3499.8 663.5

... more lines ...

Wisconsin       2.8 12.9  52.2  63.7  846.9 2614.2 220.7
Wyoming          .  21.9  39.7 173.9  811.6 2772.2 282.0
;
```

The following statements use PROC HPPRINCOMP to extract principal components by using the NIPALS method:

```
proc hpprincomp data=Crime method=nipals;
run;
```

Output 58.3.1 displays the PROC HPPRINCOMP output. The "Model Information" table shows that the NIPALS method is used to extract principal components. The "Explained Variation of Variables" table lists, for each variable, the cumulative fraction of variation that is accounted for by the first one through seven principal components. Seven principal components account for all the variation in each variable because there are only seven variables. The eigenvalues indicate that two or three components provide a good summary of the data: two components account for 76% of the total variance, and three components account for 87%. Each subsequent component accounts for less than 5%.
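The percentages quoted above can be checked directly from the eigenvalues. The following Python sketch (not part of the SAS example; the function name `variance_summary` is made up for illustration) copies the eigenvalues from the "Eigenvalues" table in Output 58.3.1 and recomputes the proportion and cumulative proportion for each component:

```python
# Eigenvalues copied from the "Eigenvalues" table in Output 58.3.1.
eigenvalues = [4.045824, 1.264030, 0.747500, 0.326325,
               0.265207, 0.228364, 0.122750]


def variance_summary(values):
    """Return a (proportion, cumulative proportion) pair for each eigenvalue."""
    total = sum(values)
    summary, running = [], 0.0
    for value in values:
        running += value
        summary.append((value / total, running / total))
    return summary


summary = variance_summary(eigenvalues)
for index, (proportion, cumulative) in enumerate(summary, start=1):
    print(f"Prin{index}: proportion={proportion:.4f} cumulative={cumulative:.4f}")
```

The eigenvalues sum to the number of variables (seven), so each proportion is simply the eigenvalue divided by 7.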

Note that in the Getting Started section, the principal components are extracted from the same data by using the eigenvalue decomposition method; the "Eigenvalues" table generated there matches the one generated by the NIPALS method. Also, the eigenvectors in the "Eigenvectors" table match the loading factors in the "Loadings" table.
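To illustrate why the two methods agree, here is a minimal textbook-style sketch of NIPALS in plain Python. It is not the PROC HPPRINCOMP implementation: the function names, the starting vector, the convergence tolerance, and the toy data are all assumptions made for illustration. The sketch extracts one component at a time by alternating between loading and score updates, then removes (deflates) that component from the data before extracting the next one.

```python
import math


def standardize(rows):
    """Center each column and divide by its sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in rows) / (n - 1))
        for j in range(p)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in rows]


def nipals(rows, n_components, tol=1e-12, max_iter=1000):
    """Extract components successively; return (eigenvalues, loadings)."""
    x = [row[:] for row in rows]  # working copy that is deflated in place
    n, p = len(x), len(x[0])
    eigenvalues, loadings = [], []
    for _ in range(n_components):
        # Assumed starting point: the column with the largest sum of squares.
        start = max(range(p), key=lambda j: sum(row[j] ** 2 for row in x))
        scores = [row[start] for row in x]
        for _ in range(max_iter):
            ss = sum(s * s for s in scores)
            # Loading update: regress the columns of X on the scores, then normalize.
            load = [sum(x[i][j] * scores[i] for i in range(n)) / ss
                    for j in range(p)]
            norm = math.sqrt(sum(v * v for v in load))
            load = [v / norm for v in load]
            # Score update: project the rows of X onto the loading vector.
            new_scores = [sum(x[i][j] * load[j] for j in range(p))
                          for i in range(n)]
            change = sum((a - b) ** 2 for a, b in zip(new_scores, scores))
            scores = new_scores
            if change <= tol * sum(s * s for s in scores):
                break
        eigenvalues.append(sum(s * s for s in scores) / (n - 1))
        loadings.append(load)
        # Deflation: subtract the extracted component before the next pass.
        x = [[x[i][j] - scores[i] * load[j] for j in range(p)]
             for i in range(n)]
    return eigenvalues, loadings


# Hypothetical toy data (two variables, five observations), not the Crime data.
data = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
eigs, loads = nipals(standardize(data), n_components=2)
print([round(e, 4) for e in eigs])
```

For two standardized variables with correlation r, the correlation matrix has eigenvalues 1 + |r| and 1 - |r|, so the values that this sketch returns can be compared against an eigenvalue decomposition by hand.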

Output 58.3.1: Results of Principal Component Analysis Using NIPALS

 Crime Rates per 100,000 Population by State

The HPPRINCOMP Procedure

Performance Information
Execution Mode Single-Machine

Data Access Information

| Data | Engine | Role | Path |
| --- | --- | --- | --- |
| WORK.CRIME | V9 | Input | On Client |

Model Information
Data Source WORK.CRIME
Component Extraction Method NIPALS

Number of Observations Read 50
Number of Observations Used 48

Number of Variables 7
Number of Principal Components 7

Centering and Scaling Information

| Variable | Subtracted off | Divided by |
| --- | ---: | ---: |
| Murder | 7.51667 | 3.93059 |
| Rape | 26.07500 | 10.81304 |
| Robbery | 127.55625 | 88.49374 |
| Assault | 214.58750 | 100.64360 |
| Burglary | 1316.37917 | 423.31261 |
| Larceny | 2696.88542 | 714.75023 |
| Auto_Theft | 383.97917 | 194.37033 |

Explained Variation of Variables

| Variable | Prin1 | Prin2 | Prin3 | Prin4 | Prin5 | Prin6 | Prin7 |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Murder | 0.37117 | 0.85539 | 0.87790 | 0.89562 | 0.97555 | 0.99143 | 1.00000 |
| Rape | 0.76242 | 0.79917 | 0.84059 | 0.84199 | 0.85065 | 0.99041 | 1.00000 |
| Robbery | 0.63783 | 0.64064 | 0.82164 | 0.92942 | 0.99788 | 0.99992 | 1.00000 |
| Assault | 0.63517 | 0.79127 | 0.79341 | 0.91781 | 0.98822 | 0.99513 | 1.00000 |
| Burglary | 0.78913 | 0.84414 | 0.88183 | 0.88207 | 0.88544 | 0.94800 | 1.00000 |
| Larceny | 0.51373 | 0.72178 | 0.93718 | 0.95479 | 0.95492 | 0.95530 | 1.00000 |
| Auto_Theft | 0.33638 | 0.65746 | 0.90481 | 0.96197 | 0.99623 | 0.99706 | 1.00000 |

Eigenvalues of the Data Matrix

|   | Eigenvalue | Difference | Proportion | Cumulative |
| --- | ---: | ---: | ---: | ---: |
| 1 | 4.045824 | 2.781795 | 0.5780 | 0.5780 |
| 2 | 1.264030 | 0.516529 | 0.1806 | 0.7586 |
| 3 | 0.747500 | 0.421175 | 0.1068 | 0.8653 |
| 4 | 0.326325 | 0.061119 | 0.0466 | 0.9120 |
| 5 | 0.265207 | 0.036843 | 0.0379 | 0.9498 |
| 6 | 0.228364 | 0.105613 | 0.0326 | 0.9825 |
| 7 | 0.122750 |   | 0.0175 | 1.0000 |
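
The "Centering and Scaling Information" table in Output 58.3.1 lists the value that is subtracted from each variable and the value that each variable is then divided by. As a small worked illustration in Python (the helper name `standardize_row` is hypothetical, and the sketch assumes that these two values are applied to every observation before the components are extracted), the following code applies them to the Alabama row of the DATA step:

```python
# (subtracted off, divided by) pairs copied from Output 58.3.1.
centering = {
    "Murder": (7.51667, 3.93059),
    "Rape": (26.07500, 10.81304),
    "Robbery": (127.55625, 88.49374),
    "Assault": (214.58750, 100.64360),
    "Burglary": (1316.37917, 423.31261),
    "Larceny": (2696.88542, 714.75023),
    "Auto_Theft": (383.97917, 194.37033),
}

# Alabama row copied from the DATA step.
alabama = {
    "Murder": 14.2, "Rape": 25.2, "Robbery": 96.8, "Assault": 278.3,
    "Burglary": 1135.5, "Larceny": 1881.9, "Auto_Theft": 280.7,
}


def standardize_row(row, params):
    """Subtract the center and divide by the scale for each variable."""
    return {name: (row[name] - center) / scale
            for name, (center, scale) in params.items()}


alabama_z = standardize_row(alabama, centering)
for name, value in alabama_z.items():
    print(f"{name}: {value:.4f}")
```

A positive standardized value means that the state is above the centering value for that variable, and a negative value means that it is below.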