Reliability and Generalisability

So I know I said I wouldn’t write about replications any more. I want to stay true to my word for once… But then a thought occurred to me last night. Sometimes it takes minimal sleep followed by a hectic, crazy day to crystallise an idea in your mind. So before I go on a much-needed break from research, social media, and (most importantly) bureaucracy, here is a post through the loophole (actually the previous one was too – I decided to split them up). It’s not about replication as such but about what “replicability” can tell us. Also it’s unusually short!

While replicability now seems widely understood to mean that a finding replicates on repeated testing, I have come to prefer the term reliability. If a finding is so fluky that it often cannot be reproduced, it was probably spurious in the first place. Hence it is unreliable. Most of the direct replication attempts underway now are seeking to establish the reliability of previous findings. That is fair enough. However, any divergence from the original experimental conditions will make the replication less direct.

This brings us to another important aspect of a finding – its generalisability (I have actually written about this whole issue before, although in a more specific context). A finding may be highly reliable but still fail to generalise, like the coin flipping example in my previous post. In my opinion science must seek to establish both the reliability and the generalisability of hypotheses. Just like Sinatra said, “you can’t have one without the other.”

This is where I think most (not all!) currently debated replication efforts fall short. They seek to establish only reliability, which you can’t do in isolation. A reliable finding that is so specific that it only occurs under very precise conditions could still lead to important theoretical advances – just like Fluke’s magnetic sand led to the invention of holographic television and hover-cars. Or, if you prefer real examples, just ask any microbiologist or single-cell electrophysiologist. Some very real effects can be very precarious.

However, it is almost certainly true that a lot of findings (especially in our field right now) are simply unreliable and thus probably unreal. My main issue with the “direct” replication effort is that by definition it can never distinguish reliability from generalisability. One could argue (and some people clearly do) that the theories underlying things like social priming are so implausible that we don’t need to ask if they generalise. I think that is wrong. It is perfectly fine to argue that some hypothesis is implausible – I have done so myself. However, I think we should always test reliability and generalisability at the same time. If you only seek to establish reliability, you may be looking in the wrong place. If you only ask if the hypothesis generalises, you may end up chasing a mirage. Either way, you invest a lot of effort and resources but you may not actually advance scientific knowledge very much.

And this, my dear friends, Romans, and country(wo)men, will really be my final post on replication. When I’m back from my much-needed science break I will want to post about another topic inspired by a recent Neuroskeptic post.
