BabcockHall Posted November 16, 2019 (edited) https://en.wikipedia.org/wiki/Anscombe's_quartet I would like to know whether there is a statistic that can differentiate between the case at the top left and the case at the top right. Clearly R² does not do so. One could plot the residuals, and the non-random distribution sometimes becomes apparent. However, what I was hoping to find is some number, preferably one that would be calculated by a statistics program, that could be compared in the two situations. I am reading Motulsky's book Intuitive Biostatistics (that is where I first saw the Anscombe quartet), but I have not found anything in his book yet. I am presently using ProStat, which has both a calculation of COD (which I am pretty sure is R²) and a calculation of "Corrl", which the user manual says indicates "how closely the two variables approximate a linear relationship to each other." I note the presence of squared differences in the numerator of COD, which are not found in Corrl. Edited November 16, 2019 by BabcockHall
BabcockHall (Author) Posted November 18, 2019 It has been suggested to me that Pearson's r is a good statistic for this situation. Thoughts?
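As a quick check on the Pearson's r suggestion, here is a minimal sketch that computes both r and R² for the top-left (set I) and top-right (set II) panels, using the published Anscombe values. The helper names `pearson_r` and `r_squared` are made up for this illustration and are not ProStat functions; whether ProStat's "Corrl" and "COD" correspond to them is an assumption.

```python
from math import sqrt

# Anscombe's quartet, sets I (top left) and II (top right); both share x.
X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
Y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def _sums(x, y):
    """Return the means and centered sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation coefficient (hypothetical helper name)."""
    _, _, sxx, syy, sxy = _sums(x, y)
    return sxy / sqrt(sxx * syy)


def r_squared(x, y):
    """R^2 of the least-squares straight line: 1 - SS_res / SS_tot."""
    mx, my, sxx, syy, sxy = _sums(x, y)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1 - ss_res / syy


r1, r2 = pearson_r(X, Y1), pearson_r(X, Y2)
R2_1, R2_2 = r_squared(X, Y1), r_squared(X, Y2)
print(f"set I : r = {r1:.3f}, R^2 = {R2_1:.3f}")
print(f"set II: r = {r2:.3f}, R^2 = {R2_2:.3f}")
```

Both panels come out at r of about 0.816 and R² of about 0.67, so Pearson's r on its own separates the two cases no better than R² does. For a straight-line fit, R² is simply r squared.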
studiot Posted November 18, 2019 Hi, I've been thinking about your question, and I'm not sure there is any specific test to be extracted from tabulated data, which is partly why Anscombe (strongly) recommends sketching a plot first. There are just so many different possible lines you could draw through a given set of points that comparing them pair by pair, or even class by class, is an overwhelming task. Further, there is the question of end slopes. If you fit a straight line, you cannot have zero slope at the origin or a turnover to an asymptote. A quadratic can do the first but not the second; you need at least a cubic to achieve that. There may also be points that have more certain values than others. For example, consider a plank resting on two supports. At the support points the plank must have zero deflection (or it is not resting on its supports!). Depending upon the support restraint, it may also have a curvature or zero curvature there. So these values can never be rejected.
Prometheus Posted November 19, 2019 The Kolmogorov-Smirnov test should work, as it compares the sample's entire empirical distribution against a reference distribution of any shape. A visual check would be better though.
BabcockHall (Author) Posted November 19, 2019 I thank you both for some helpful comments. In our case we were fitting a straight line to gel electrophoresis data on proteins. The standard curve, which is mobility versus logarithm of molecular weight for the standards, had noticeable curvature. I am still looking into the biophysics, but the information that I have presently is that a slight deviation at high molecular weights is expected. I am not looking to explain the results so much as to describe them, in the sense of making a more formal statement to the effect that a linear fit leads to non-random residuals.
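One way to put a number on "non-random residuals" is a runs test (Wald-Wolfowitz) on the signs of the residuals, taken in order of increasing x. Nobody in this thread proposed it, so treat the following as a sketch of one option rather than the accepted answer. The function names `fit_line` and `residual_runs` are made up for the example, and the z-score relies on a normal approximation that is rough for only 11 points.

```python
from math import sqrt

# Anscombe's quartet, sets I (top left) and II (top right); both share x.
X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
Y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_runs(x, y):
    """Count runs of same-sign residuals (ordered by x) and compare with
    the count expected if the signs were randomly ordered.

    Returns (observed_runs, expected_runs, z_score).
    """
    slope, intercept = fit_line(x, y)
    residuals = [b - (intercept + slope * a) for a, b in sorted(zip(x, y))]
    signs = [r > 0 for r in residuals if r != 0]  # drop exact zeros
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    # A new run starts wherever the sign changes.
    runs = 1 + sum(s != t for s, t in zip(signs, signs[1:]))
    expected = 2 * n_pos * n_neg / n + 1
    variance = (2 * n_pos * n_neg * (2 * n_pos * n_neg - n)) / (n ** 2 * (n - 1))
    return runs, expected, (runs - expected) / sqrt(variance)


runs_1, exp_1, z_1 = residual_runs(X, Y1)
runs_2, exp_2, z_2 = residual_runs(X, Y2)
print(f"set I : {runs_1} runs, {exp_1:.2f} expected, z = {z_1:.2f}")
print(f"set II: {runs_2} runs, {exp_2:.2f} expected, z = {z_2:.2f}")
```

For set I the residuals change sign about as often as chance would predict (7 runs against roughly 6.5 expected). For set II they form just 3 runs against roughly 6.1 expected, with a z of about -2.1, which is the kind of clustering a curved trend under a straight-line fit produces. With so few points, an exact small-sample table for the runs test would be a safer basis for a p-value than the z-score.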
Prometheus Posted November 22, 2019 Since you're not worried about inference, you could just try fitting different curves to it and seeing which has the lowest sum of squared errors. The problem is over-fitting the data: with a high enough polynomial degree you'll be able to find a 'perfect' fit. Stick to simple curves.
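The curve-comparison idea can be sketched as follows, again on the top-right Anscombe panel (set II) because the gel data are not posted here. The helpers `solve` and `polyfit_sse` are names invented for this example; they fit a polynomial by solving the normal equations on centered x, which is adequate for low degrees but is not how a statistics package would necessarily do it.

```python
# Anscombe's quartet, set II (top right panel).
X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= factor * M[col][c]
    sol = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(M[i][j] * sol[j] for j in range(i + 1, m))
        sol[i] = (M[i][m] - tail) / M[i][i]
    return sol


def polyfit_sse(x, y, degree):
    """Sum of squared errors of the least-squares polynomial of `degree`."""
    mean_x = sum(x) / len(x)
    xc = [a - mean_x for a in x]  # centering keeps the system well conditioned
    m = degree + 1
    A = [[sum(a ** (i + j) for a in xc) for j in range(m)] for i in range(m)]
    b = [sum(yv * a ** i for a, yv in zip(xc, y)) for i in range(m)]
    coeffs = solve(A, b)
    fitted = [sum(c * a ** k for k, c in enumerate(coeffs)) for a in xc]
    return sum((yv - f) ** 2 for yv, f in zip(y, fitted))


sse = {d: polyfit_sse(X, Y2, d) for d in (1, 2, 3)}
for d, value in sse.items():
    print(f"degree {d}: SSE = {value:.6f}")
```

The error drops from roughly 13.8 for the straight line to nearly zero for the quadratic, and the cubic adds essentially nothing. A large drop followed by no further gain is the signal to stop at the simpler curve, in line with the over-fitting warning above.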
BabcockHall (Author) Posted November 25, 2019 One of the reasons I am asking is that R² is a little bit like electronegativity in chemistry; one teaches students about it, and they want to use it for everything, even when there are better tools. In this instance R² is not ideal, because it is indifferent to the direction of each residual and responds only to its magnitude.