Today in my Probability and Statistics class we did one of my favorite activities. We used the Anscombe quartet to learn that summary statistics are only part of data analysis, and that it is very important to always plot your data! The Anscombe quartet consists of four sets of bivariate data. You can see the data below.
Each of the four data sets has the same correlation coefficient of about 0.816. They also share the same least-squares regression line. But is a linear model appropriate in each situation? What do the graphs tell you?
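If you want to check the matching summary statistics yourself, here is a minimal Python sketch. It is not what we did in class. The data values are the standard published ones from Anscombe's 1973 paper, typed in by hand rather than taken from this post, and the names `QUARTET`, `pearson_r`, and `least_squares` are my own illustrative choices.

```python
from math import sqrt

# Standard published values of Anscombe's quartet (assumed here; not copied from this post).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # sets I, II, and III share the same x values
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired samples xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Print the correlation and fitted line for each of the four sets.
for name, (xs, ys) in QUARTET.items():
    r = pearson_r(xs, ys)
    slope, intercept = least_squares(xs, ys)
    print(f"Set {name}: r = {r:.3f}, y = {intercept:.2f} + {slope:.3f}x")
```

The four printed lines should agree to two or three decimal places, which is exactly why the numbers alone cannot tell the data sets apart. The script deliberately stops short of plotting, since the graphs are the part worth looking at by eye.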
You could see the initial disbelief on my students' faces when we looked at each group's results on the board. First, they couldn't believe that data sets with such a high correlation coefficient could be so clearly not linearly related. Second, they realized that the only way to see that a relationship was non-linear was to look at the graphs.
For more on Anscombe's quartet, I invite you to read this interesting blog post.