The following statements request a correlation analysis and a scatter plot matrix for the variables in the data set Fish1, which was created in Example 2.6.
   ods graphics on;
   title 'Fish Measurement Data';
   proc corr data=fish1 nomiss plots=matrix(histogram);
      var Height Width Length3 Weight3;
   run;
   ods graphics off;
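The “Simple Statistics” table in Output 2.8.1 displays univariate descriptive statistics for the analysis variables. Because the NOMISS option excludes any observation that has a missing value for an analysis variable, every statistic is based on the same 34 observations.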
Output 2.8.1: Simple Statistics
Fish Measurement Data
4 Variables: Height Width Length3 Weight3

Simple Statistics

Variable    N      Mean   Std Dev        Sum   Minimum   Maximum
Height     34  15.22057   1.98159  517.49950  11.52000  18.95700
Width      34   5.43805   0.72967  184.89370   4.02000   6.74970
Length3    34  38.38529   4.21628       1305  30.00000  46.50000
Weight3    34   8.44751   0.97574  287.21524   6.23168  10.00000
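As a cross-check on the table above, the same univariate statistics could be requested directly from PROC MEANS. This is a minimal sketch, not part of the original example: the WHERE statement is an assumption used here to mimic the listwise deletion that NOMISS performs in PROC CORR, and the layout of the printed output will differ from Output 2.8.1.

```sas
/* Sketch only: reproduce N, mean, standard deviation, sum, minimum, and  */
/* maximum for the four analysis variables. The WHERE statement keeps     */
/* only observations with no missing value in any of the four variables,  */
/* which is an assumed stand-in for the NOMISS option of PROC CORR.       */
proc means data=fish1 n mean std sum min max;
   where nmiss(Height, Width, Length3, Weight3) = 0;
   var Height Width Length3 Weight3;
run;
```

Without the WHERE statement, PROC MEANS would compute each statistic from all nonmissing values of that variable, so the counts could differ from variable to variable.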
The “Pearson Correlation Coefficients” table in Output 2.8.2 displays Pearson correlation statistics for pairs of analysis variables.
Output 2.8.2: Pearson Correlation Coefficients
Pearson Correlation Coefficients, N = 34
Prob > |r| under H0: Rho=0

[Correlation matrix for Height, Width, Length3, and Weight3; the cell values are not reproduced here.]
The variables are highly correlated. For example, the correlation between Height and Width is 0.92632.
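For reference, a Pearson correlation such as the one quoted above follows the standard product-moment formula, and the accompanying "Prob > |r|" value is conventionally obtained from a t statistic with n - 2 degrees of freedom. The symbols below are notation introduced here for illustration and do not appear in the original example: x_i and y_i are the paired values of two analysis variables, and n is the number of observations used (34 in this analysis).

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}\;
               \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}},
\qquad
t = r_{xy}\,\sqrt{\frac{n-2}{1 - r_{xy}^{2}}} \;\sim\; t_{\,n-2}
\quad \text{under } H_0\colon \rho = 0
```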
The PLOTS=MATRIX(HISTOGRAM) option requests a scatter plot matrix for the VAR statement variables, with a histogram of each variable in the diagonal cells. The matrix is shown in Output 2.8.3.
Note that this graphical display is requested by enabling ODS Graphics and by specifying the PLOTS= option. For more information about ODS Graphics, see Chapter 21: Statistical Graphics Using ODS in SAS/STAT User's Guide.
To explore the correlation between Height and Width, the following statements display (in Output 2.8.4) a scatter plot with prediction ellipses for the two variables:
   ods graphics on;
   proc corr data=fish1 nomiss plots=scatter(nvar=2 alpha=.20 .30);
      var Height Width Length3 Weight3;
   run;
   ods graphics off;
The PLOTS=SCATTER(NVAR=2) option requests a scatter plot for the first two variables in the VAR list. The ALPHA=.20 .30 suboption requests 80% and 70% prediction ellipses, respectively.
A prediction ellipse is a region for predicting a new observation from the population, assuming bivariate normality. It also approximates a region that contains a specified percentage of the population. The displayed prediction ellipse is centered at the sample means of the two variables. For further details, see the section Confidence and Prediction Ellipses.
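Under bivariate normality, a prediction ellipse of this kind can be written with the standard F-based result sketched below. The notation is an assumption introduced here, not taken from the example: Z is a new observation (Height, Width), Z-bar is the sample mean vector, S is the sample covariance matrix, n is the number of observations, and F_{2,n-2}(1 - alpha) is the (1 - alpha) quantile of the F distribution with 2 and n - 2 degrees of freedom. The section Confidence and Prediction Ellipses remains the reference for the procedure's own definition.

```latex
\frac{n}{n+1}\,
(\mathbf{Z} - \bar{\mathbf{Z}})^{\top}\, \mathbf{S}^{-1}\, (\mathbf{Z} - \bar{\mathbf{Z}})
  \;=\; \frac{2\,(n-1)}{n-2}\; F_{2,\,n-2}(1-\alpha)
```

With ALPHA=.20, the right-hand side uses the 0.80 quantile and the boundary is the 80% prediction ellipse; with ALPHA=.30, it is the 70% ellipse, which lies inside the first.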
Note that the following statements also display (in Output 2.8.5) a scatter plot for Height and Width:
   ods graphics on;
   proc corr data=fish1 plots=scatter(alpha=.20 .30);
      var Height Width;
   run;
   ods graphics off;
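Output 2.8.5 includes a point that was excluded from Output 2.8.4 because the observation has a missing value for Weight3. Because the VAR statement here lists only Height and Width, the missing Weight3 value no longer excludes that observation. The prediction ellipses in Output 2.8.5 also reflect the inclusion of this observation.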
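To see which observations account for the difference between the two plots, the rows with a missing value in any analysis variable can be listed. The following is a minimal sketch and not part of the original example; the use of the NMISS function in a WHERE statement is simply one convenient way to flag such rows.

```sas
/* Sketch only: print the observations in Fish1 that have a missing value */
/* in at least one of the four analysis variables. These are the rows     */
/* that NOMISS drops when all four variables are in the VAR statement.    */
proc print data=fish1;
   where nmiss(Height, Width, Length3, Weight3) > 0;
   var Height Width Length3 Weight3;
run;
```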
The following statements display (in Output 2.8.6) a scatter plot with confidence ellipses for the mean:
   ods graphics on;
   title 'Fish Measurement Data';
   proc corr data=fish1 nomiss
             plots=scatter(ellipse=confidence nvar=2 alpha=.05 .01);
      var Height Width Length3 Weight3;
   run;
   ods graphics off;
The NVAR=2 suboption within the PLOTS= option restricts the number of plots created to the first two variables in the VAR statement, and the ELLIPSE=CONFIDENCE suboption requests confidence ellipses for the mean. The ALPHA=.05 .01 suboption requests 95% and 99% confidence ellipses, respectively.
The confidence ellipse for the mean is centered at the sample means of the two variables. For further details, see the section Confidence and Prediction Ellipses.
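A confidence ellipse for the mean can be sketched with the standard result based on Hotelling's T-squared statistic, again assuming bivariate normality. The notation is introduced here for illustration and is not taken from the example: mu is the population mean vector of (Height, Width), Z-bar is the sample mean vector, S is the sample covariance matrix, n is the number of observations, and F_{2,n-2}(1 - alpha) is the (1 - alpha) quantile of the F distribution with 2 and n - 2 degrees of freedom.

```latex
\frac{n}{n-1}\,
(\bar{\mathbf{Z}} - \boldsymbol{\mu})^{\top}\, \mathbf{S}^{-1}\, (\bar{\mathbf{Z}} - \boldsymbol{\mu})
  \;=\; \frac{2}{n-2}\; F_{2,\,n-2}(1-\alpha)
```

Because the right-hand side shrinks roughly in proportion to 1/n, a confidence ellipse for the mean tightens as the sample grows, whereas a prediction ellipse for a new observation does not collapse in the same way.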