Prediction vs Explanation


swansont

I had linked this in my blog, but thought it might generate some discussion here.

 

Over at Ideas

 

We do ten experiments. A scientist observes the results, constructs a theory consistent with them, and uses it to predict the results of the next ten. We do them and the results fit his predictions. A second scientist now constructs a theory consistent with the results of all twenty experiments.

 

The two theories give different predictions for the next experiment. Which do we believe? Why?
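The setup can be sketched numerically: two models can both be consistent with the same observations yet disagree about the next one. A minimal illustration, assuming polynomial fits as stand-ins for the two theories and a linear underlying process (all invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Twenty noisy observations standing in for the twenty experiments;
# the underlying law is assumed linear here purely for illustration.
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=20)

# "Theory A": a simple degree-1 fit.  "Theory B": a flexible degree-9
# fit.  Both describe the existing data acceptably well...
theory_a = np.polynomial.Polynomial.fit(x, y, deg=1)
theory_b = np.polynomial.Polynomial.fit(x, y, deg=9)

# ...but they need not agree about the *next* experiment, here taken
# to be an observation at x = 25, outside the range already seen.
print(theory_a(25.0), theory_b(25.0))
```

Both curves pass near all twenty points, so the existing data alone cannot separate them; only the next experiment can.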


Both theories explain all of the experiments, which means the first scientist's experiments did not exclude the explanation proposed by the second. Depending on the research, one usually tries to exclude as many possibilities as possible with one's experiments, and has to entertain any explanation that the experiments could not disprove. In other words, regardless of who did the experiments, the first scientist has to concede that the second explanation could be correct, and the next set of experiments has to be one that can actually distinguish between the two. This happens quite regularly in biological papers; in order to publish, though, the second scientist would usually first have done the experiments needed to disprove the first.

In fact, in one of my first papers I pretty much did that.


Seems to me the more observations you have, the better. Even a simple regression would do better with all the data points, I would think. Does anyone think that if the first scientist had used only the second set of experiments, it would make a better model than the second scientist's?
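That intuition can be checked for ordinary least squares: more points pin the fitted slope down more tightly. A quick sketch, where the linear process and noise level are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def slope_and_se(n):
    """Fit y = a*x + b to n noisy points from a known linear process;
    return the slope estimate and its standard error."""
    x = np.arange(n, dtype=float)
    y = 3.0 * x + rng.normal(scale=1.0, size=n)
    (slope, _), cov = np.polyfit(x, y, deg=1, cov=True)
    return slope, float(np.sqrt(cov[0, 0]))

slope10, se10 = slope_and_se(10)   # the "first ten experiments"
slope20, se20 = slope_and_se(20)   # all twenty
print(se10, se20)  # the 20-point standard error should be smaller
```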


Don't we also have things like Occam's razor to help us out with these issues?

 

I mean, if both theories explain the results equally well and there are no other mediating factors at play (like, say, logical fallacies), then I'd say we could also use Occam's razor to see which is the more parsimonious.
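One rough way to make Occam's razor quantitative is an information criterion such as BIC, which rewards goodness of fit but charges a penalty for every extra parameter. A sketch, assuming the competing theories are polynomial fits of different degree (the data and noise model are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Fifty noisy points from an assumed linear law.
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=50)

def bic_and_rss(deg):
    """Bayesian information criterion (lower is better) and residual
    sum of squares for a degree-`deg` polynomial fit."""
    coeffs = np.polyfit(x, y, deg)
    rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    k = deg + 1                      # number of fitted parameters
    n = len(x)
    return n * np.log(rss / n) + k * np.log(n), rss

scores = {deg: bic_and_rss(deg)[0] for deg in (1, 3, 5)}
print(scores)  # higher degrees fit a bit better but pay for the parameters
```

A more flexible model never fits the existing data worse, which is exactly why raw fit cannot decide between the theories; the penalty term is what expresses the preference for the simpler explanation.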

 

In any case, though, if we're unsure which of the two theories is better, doesn't that mean more experiments are necessary?


I'd say the second theory has a slight advantage mathematically. The second theory agrees with 20 experiments, and the first theory only agrees with 10. Granted, the first is still 100% (10/10), but a bird in the hand is worth two in the bush. ;)

Edited by traveler

Nobody's arguing that there's no need for further experiments, and that's not part of the question either. And secondly, the quantity of experiments performed is arbitrary; sometimes it only takes one good one.


The first has more weight, sure. Theory vs. hypothesis. Of course, a plausible contrary hypothesis casts doubt on established theory, but hopefully also suggests an experiment to falsify at least one of them. (And if there's no way to falsify one or the other, we've gone terribly wrong somewhere.)

 

I'd say the second theory has a slight advantage mathematically. The second theory agrees with 20 experiments, and the first theory only agrees with 10. Granted, the first is still 100% (10/10), but a bird in the hand is worth two in the bush. ;)

 

They both agree with all 20 experiments. Read again.


A similar phenomenon occurs when one is developing an ad hoc model of some process. Should one use all of the data at hand to generate the model, or reserve some of the data as a test against the generated model? The answer is, it depends.

 

If one can use all of the data and still construct some kind of meaningful statistical measure of "goodness of fit", use all of the data. If, on the other hand, the model generation process is so complex (e.g., machine learning techniques such as neural nets and ant colony optimization) that a goodness of fit constructed from the data is suspect, it is much better to reserve some of the data. Doing so most likely will reduce the accuracy of the model when applied to the full data set, but it will certainly give an idea of how well the model performs when applied to data drawn from outside the data set used to construct the model.
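A minimal sketch of the reserve-some-data idea (a holdout split), with the process, noise level, and model degree all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# 100 noisy observations of an assumed quadratic process.
x = np.linspace(-3.0, 3.0, 100)
y = x**2 + rng.normal(scale=0.5, size=100)

# Reserve a random 20% of the data; the model never sees it.
idx = rng.permutation(100)
train, test = idx[:80], idx[80:]

# Fit a deliberately over-flexible model on the training portion only.
coeffs = np.polyfit(x[train], y[train], deg=9)

def mse(sel):
    """Mean squared error of the fitted model on the selected points."""
    return float(np.mean((y[sel] - np.polyval(coeffs, x[sel])) ** 2))

# Training error is optimistic because the fit chased that data's noise;
# the held-out error is the honest estimate of out-of-sample performance.
print(mse(train), mse(test))
```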

