Examining Correlation Among Variables in SAS

Overview

This section covers the idea of correlation through the use of proc corr. Correlation is a measure of the strength of the linear relationship between two variables. A high correlation coefficient does not guarantee a cause and effect relationship between the pair; it shows only that the two variables tend to vary together, which is what we would usually expect to see if such a relationship existed.

Proc corr

Proc corr computes the correlation coefficient between pairs of variables; by default this is the Pearson correlation coefficient. The letter r represents the correlation coefficient, which ranges from -1 to +1. The closer the absolute value of r is to 1, the stronger the relationship. A negative correlation means that when one variable goes up the other tends to go down; a positive correlation means that when one variable goes up the other tends to go up too.

The basic format for proc corr is:

proc corr data=dataset;
var variable-list;
run;
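
As a concrete instance of the format above, the following sketch runs proc corr on sashelp.class, a sample dataset that ships with SAS. The choice of dataset and of the variables age, height, and weight is an assumption made for illustration; it is not part of the original tutorial.

```sas
/* Correlate three numeric variables from the SASHELP.CLASS sample dataset. */
/* The dataset and variable choice are illustrative assumptions.            */
proc corr data=sashelp.class;
   var age height weight;
run;
```

With three variables in the list, the correlation table has three rows and three columns, one cell for each pair.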

Proc corr will generate some basic descriptive statistics (similar to those produced by proc means) and then a correlation table. For each variable you specify, SAS will show the correlation of that variable with all other variables in the list (including itself). You should see a correlation value of 1 down the diagonal, because each of those entries is the correlation of a variable with itself. In addition to the r value you will get a probability value. This p-value is the probability of seeing a correlation at least this far from zero if the null hypothesis (that the true correlation is zero) were true. Depending on the significance level you require, you want this value to be close to zero (for example, below 0.05), which is evidence against the null hypothesis.
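
To see the diagonal of 1s and the sign of r in the output, here is a minimal sketch using a made-up dataset. The dataset name demo and the variables x, y_up, and y_down are hypothetical. y_up rises exactly linearly with x and y_down falls exactly linearly with x, so the table should show r = 1 for x with y_up and r = -1 for x with y_down. The nosimple option is used here only to suppress the descriptive statistics so the correlation table stands alone.

```sas
/* Hypothetical data: y_up = 2*x and y_down = 12 - 2*x. */
data demo;
   input x y_up y_down;
   datalines;
1 2 10
2 4 8
3 6 6
4 8 4
5 10 2
;
run;

/* nosimple omits the descriptive statistics and prints only */
/* the correlation table.                                    */
proc corr data=demo nosimple;
   var x y_up y_down;
run;
```

Real data will rarely be this clean; values of r strictly between -1 and +1, with p-values to match, are the normal case.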

Example 7 - Proc corr


Author - Jack Suess
UMBC University Computing Services
Created - 1/15/96