Examining Correlation Among Variables in SAS
Overview
This section introduces correlation through the use of proc corr.
Correlation is a measure of the strength of the linear relationship
between two variables. A high correlation coefficient does not
guarantee a cause-and-effect relationship between the pair, but a
strong correlation is generally what one expects to find when such a
relationship exists.
Proc corr
Proc corr computes the correlation coefficient between pairs of
variables. The letter r represents the correlation coefficient, which
ranges from -1 to +1. The closer the absolute value of r is to 1, the
stronger the relationship. A negative correlation implies that when one
variable goes up the other tends to go down; a positive correlation
implies that when one variable goes up the other tends to go up too.
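For reference, the coefficient proc corr reports by default is the Pearson product-moment correlation. Written out for n paired observations, with x-bar and y-bar as the sample means (the symbols here are chosen for illustration and do not come from the original notes):

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to sit on the same side of their means and negative when they tend to sit on opposite sides, which is where the sign of r comes from.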
The basic format for proc corr is:
proc corr data=dataset;
    var variable-list;
run;
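As a minimal sketch of the format above, the following builds a small made-up dataset and runs proc corr on it. The dataset name work.hw, the variable names, and the data values are all hypothetical and chosen only for illustration.

```sas
/* Hypothetical data: height (inches), weight (pounds), age (years) */
data work.hw;
    input height weight age;
    datalines;
60 110 25
62 120 31
65 140 28
68 155 40
70 170 35
72 180 45
;
run;

/* Correlate every pair of the three listed variables */
proc corr data=work.hw;
    var height weight age;
run;
```

If the var statement is left out, proc corr uses all numeric variables in the dataset.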
Proc corr will generate some basic descriptive statistics (similar to
those produced by proc means) and then a correlation table. For each
variable you specify, SAS will show the correlation of that variable
with all other variables in the list (including itself). You should see
a correlation value of 1 down the main diagonal, because each of those
entries is the correlation of a variable with itself. In addition to
the r value you will get a probability value. This p-value tests the
null hypothesis that the true correlation is zero: it is the
probability of seeing an r at least this far from zero if the null
hypothesis were true. Depending on the significance level you require,
you want this value to be close to zero (for example, below 0.05),
which is evidence against the null hypothesis.
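To make the link between r and its probability value concrete: for the Pearson correlation, the p-value is obtained by converting r into a t statistic and referring it to a t distribution with n - 2 degrees of freedom, where n is the number of observations used for that pair. The notation below is illustrative and not taken from the original notes.

```latex
t = r \sqrt{\frac{n - 2}{1 - r^2}}, \qquad
p = P\left( |T_{n-2}| \ge |t| \right)
```

This shows why the same r can be significant in a large sample and not in a small one: for fixed r, the statistic t grows with n.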
Example 7 - Proc corr
Author - Jack Suess
UMBC University Computing Services
Created - 1/15/96