How to Find the Correlation Coefficient
Research in nearly every field evaluates its results by how well they fit a model. The model predicts the results based on some set of conditions or stimuli. Plotting the independent variable against the dependent variable yields a set of data points. When the model is linear, the degree to which those points lie along a straight line indicates how well the model supports the hypothesis of the research project. A statistical measure of how closely the data fit a straight line is the correlation coefficient, r. The correlation coefficient ranges between -1 and 1. A value of 1 indicates a perfect positive linear correlation, a value of -1 indicates a perfect negative linear correlation, and a value near 0 indicates little or no linear relationship. A negative correlation means that as the independent variable increases, the dependent variable decreases.
Instructions
1
Define a data set that describes the results of a research project. Any set of results will work, as long as you can identify an independent and a dependent variable in the research. For an example of how the value of r is calculated, assume the following data points (x, y): (60, 3.1), (61, 3.6), (62, 3.8), (63, 4) and (65, 4.1). Identify these as data points 1, 2, 3, 4 and 5, respectively. N represents the number of data points, in this case 5.
2
Form a table with one row per data point. Place the x values in the first column and the y values in the second column. Label the next three columns "x * y", "x * x" and "y * y".
3
Perform the indicated calculation for each cell of the table. For the sample data, the rows would read as follows, in the column order x; y; x * y; x * x; y * y. Row 1 would read 60; 3.1; 186; 3,600; 9.61. Row 2 would read 61; 3.6; 219.6; 3,721; 12.96. Row 3 would read 62; 3.8; 235.6; 3,844; 14.44. Row 4 would read 63; 4; 252; 3,969; 16. Row 5 would read 65; 4.1; 266.5; 4,225; 16.81.
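The table from steps 2 and 3 can also be built programmatically. The following is a minimal Python sketch using the sample data from step 1; the names `points` and `rows` are chosen here for illustration only, and the products are rounded to two decimals simply to hide floating-point noise.

```python
# Sample (x, y) data points from step 1.
points = [(60, 3.1), (61, 3.6), (62, 3.8), (63, 4.0), (65, 4.1)]

# One table row per data point, in the column order x, y, x*y, x*x, y*y.
# Rounding to 2 decimals hides tiny floating-point errors in the products.
rows = [(x, y, round(x * y, 2), x * x, round(y * y, 2)) for x, y in points]

for row in rows:
    print(row)
```

Each printed tuple corresponds to one row of the hand-built table.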
4
Calculate the sum of each column of the table. Add all the values in each column, placing the total under that column. For the sample data, the totals would be 311; 18.6; 1,159.7; 19,359; 69.82.
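The column totals can be checked with a short Python sketch. The variable names (`sum_x`, `sum_xy` and so on) are illustrative assumptions, and `math.fsum` is used here only to limit floating-point rounding error when adding decimal values.

```python
import math

# Sample (x, y) data points from step 1.
points = [(60, 3.1), (61, 3.6), (62, 3.8), (63, 4.0), (65, 4.1)]

n = len(points)                                # N, the number of data points
sum_x = sum(x for x, _ in points)              # total of the x column
sum_y = math.fsum(y for _, y in points)        # total of the y column
sum_xy = math.fsum(x * y for x, y in points)   # total of the x * y column
sum_xx = sum(x * x for x, _ in points)         # total of the x * x column
sum_yy = math.fsum(y * y for _, y in points)   # total of the y * y column

print(n, sum_x, round(sum_y, 2), round(sum_xy, 2), sum_xx, round(sum_yy, 2))
```

After rounding to two decimals, the printed totals should agree with the ones worked out by hand in this step.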
5
Substitute the column totals from the table into the following equation for the correlation coefficient, r:

r = [N * (Sum x * y) - (Sum x) * (Sum y)] / Sqrt {[N * (Sum x^2) - (Sum x)^2] * [N * (Sum y^2) - (Sum y)^2]}

For the sample data:

r = [(5 * 1159.7) - (311 * 18.6)] / Sqrt {[(5 * 19359) - (311)^2] * [(5 * 69.82) - (18.6)^2]}
r = (5798.5 - 5784.6) / Sqrt [(96795 - 96721) * (349.1 - 345.96)]
r = 13.9 / Sqrt (74 * 3.14)
r = 13.9 / Sqrt (232.36)
r = 13.9 / 15.24336
r = 0.9119

A value this close to 1 indicates a strong positive linear correlation in the sample data.
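The whole calculation can be sketched as one Python function that follows the same sums-based formula. The function name `pearson_r` and its error handling are assumptions made for this illustration, not part of the original instructions.

```python
import math


def pearson_r(points):
    """Correlation coefficient r computed from column totals (sums formula)."""
    n = len(points)
    sum_x = math.fsum(x for x, _ in points)
    sum_y = math.fsum(y for _, y in points)
    sum_xy = math.fsum(x * y for x, y in points)
    sum_xx = math.fsum(x * x for x, _ in points)
    sum_yy = math.fsum(y * y for _, y in points)

    numerator = n * sum_xy - sum_x * sum_y
    # Product of the two bracketed terms under the square root.
    product = (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    if product <= 0:
        # Happens when all x values (or all y values) are identical.
        raise ValueError("r is undefined when x or y does not vary")
    return numerator / math.sqrt(product)


# Sample (x, y) data points from step 1.
points = [(60, 3.1), (61, 3.6), (62, 3.8), (63, 4.0), (65, 4.1)]
r = pearson_r(points)
print(round(r, 4))
```

For the sample data, the result should agree with the hand calculation to about four decimal places. Points that lie exactly on a rising line, such as (1, 2), (2, 4), (3, 6), give r = 1, and points on a falling line give r = -1.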