This weekend marked another great moment in the saga surrounding the discussion about open science – a worthy sequel to “angry birds” and “shameless little bullies”. This time it was an editorial about data sharing in the New England Journal of Medicine which contains the statement that:
There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”
Remarks like this from journal editors are just all kinds of stupid. Even though this was presented in the context of quotes by unnamed “front-line researchers” (whatever that means) they implicitly endorse the interpretation that re-using other people’s published data is parasitical. In fact, their endorsement is made clear later on in the editorial when the editors express the hope that data sharing “should happen symbiotically, not parasitically.”
It shouldn’t come as a surprise that this editorial was immediately greeted with widespread ridicule and the creation of all sorts of internet memes poking fun at the notion of “research parasites.” Even if some people believe this, hell, even if the claim were true (spoiler: it’s not), this is just a very idiotic thing to do. Like it or not, open access, transparency, and post-publication scrutiny of published scientific findings are becoming increasingly common and are already required in many places. We’re now less than a year away from the date when the Peer Reviewers’ Openness Initiative, whose function it is to encourage data sharing, will come into effect. Not only is the clock not turning back on this stuff, it is also deeply counterproductive to liken the supporters of this movement to parasites. This is no way to start (or have) a reasonable conversation.
And there should be a conversation. If there is one thing I have learned from talking with colleagues, it is that worries about data sharing and open science as a whole are far from rare. Misguided as it may be, the concern about others scooping your ideas and sifting through the data you spent blood, sweat, and tears collecting resonates with many people. This editorial didn’t just pop into existence from the quantum foam – it comes from a real place. The mocking and snide remarks about this editorial are fully deserved. This editorial is moronic and ass-backwards. But speaking more generally, snide remarks and mockery are never a good way to convince people of the strength of your argument. All too often worries like this are met with disrespect and ridicule. Is it any surprise that a lot of people don’t dare to speak up against open science? Similarly, when someone discovers errors or problems in somebody else’s data, some are quick to make jokes or serious accusations about these researchers. Is this encouraging them to open their lab books and file drawers? I think not.
Scientists are human beings and they tend to have normal human reactions when being accused of wrong-doing, incompetence, or sloppiness. Whether or not the accusations are correct is irrelevant. Even mentioning the dreaded “questionable research practices” sounds like a fierce accusation to the accused even though questionable research practices can occur quite naturally without conscious ill intent when people are wandering in the garden of forking paths. In my opinion we need to be mindful of that and try to be more considerate in the way we discuss these issues. Social media like Facebook and Twitter do not exactly seem to encourage respectful dialogue. I know this firsthand as I have myself said things about (in my view) questionable research that I subsequently regretted. Scepticism is good and essential to scientific progress – disrespect is not.
It seems to have been the intention of this misguided editorial to communicate a similar message. It encourages researchers using other people’s data to work with the original authors. So far so good. I am sure no sensible person would actually disagree with that notion. But where the editorial misses the point is that there is no plan for what happens if this “symbiotic” relationship isn’t forming, either because the original authors are not cooperating or because there is a conflict of interests between skeptics and proponents of a scientific claim. In fact, the editorial lays bare what I think is the heart of the problem in a statement that to me seems much worse than the “research parasites” label. They say that people…
…even use the data to try to disprove what the original investigators had posited.
It baffles me that anyone can write something like this whilst keeping a straight face. Isn’t this how science is supposed to work? Trying to disprove a hypothesis is just basic Popperian falsification. Not only should others do that, you should do that yourself with your own research claims. To be fair, the best way to do science in my opinion is to generate competing hypotheses and test them with as little emotional attachment to any of them as possible but this is more easily said than done… So ideally we should try to find the hypothesis that best explains the data rather than just seeking to disprove. Either way however, this sentence is clearly symptomatic of a much greater problem: Science should be about “finding better ways of being wrong.” The first step towards this is to acknowledge that anything we posited is never really going to be “true” and that it can always use a healthy dose of scientific scepticism and disproving.
I want to have this dialogue. I want to debate the ways to make science healthier, more efficient, and more flexible in overturning false ideas. As I outlined in a previous post, data sharing is the single most important improvement we can make to our research culture. I think even if there are downsides to it, the benefits outweigh them by far. But not everyone shares my enthusiasm for data sharing and many people seem worried but afraid to speak up. This is wrong and it must change. I strongly believe that most of the worries can be alleviated:
- I think it’s delusional that data sharing will produce a “class” of “research parasites.” People will still need to generate their own science to be successful. Simply sitting around waiting for other people to generate data is not going to be a viable career strategy. If anything, large consortia like the Human Genome or Human Connectome Project will produce large data sets that a broad base of researchers can use. But this won’t allow them to test every possible hypothesis under the sun. In fact, most data sets are far too specific to be much use to many other people.
- I’m willing to bet that the vast majority of publicly shared data sets won’t be downloaded, let alone analysed by anyone other than the original authors. This is irrelevant. The point is that the data are available because they could be potentially useful to future science.
- Scooping other people’s research ideas by using their published data to do the experiment they wanted to do is a pretty ineffective and risky strategy. In most cases, there is just no way that someone else would be faster than you at publishing an experiment you wanted to do using your data. This doesn’t mean that it never happens but I’m still waiting for anyone to tell me of a case where this actually did happen… But if you are worried about it, preregister your intention so at least anyone can see that you planned it. Or even better, submit it as a Registered Report so you can guarantee that this work will be published in a journal regardless of what other people did with your data.
- While we’re at it, upload the preprints of your manuscripts when you submit them to journals. I still dream of a publication system where we don’t submit to journals at all, or at least not until peer review has taken place and the robustness of the finding has been confirmed. But until we get there, preprints are the next best thing. With a public preprint the chronological precedence is clear for all to see.
Now that covers the “parasites” feeding on your research productivity. But what to do if someone else subjects your data to sceptical scrutiny in the attempt to disprove what you posited? Again, first of all I don’t think this is going to be that frequent. It is probably more frequent for controversial or surprising claims and it bloody well should be. This is how science progresses and shouldn’t be a concern. And if it actually turns out that the result or your interpretation of it is wrong, wouldn’t you want to know about it? If your answer to this question is No, then I honestly wonder why you do research.
I can however empathise with the fear that people, some of whom may lack the necessary expertise or who cherry-pick the results, will actively seek to dismantle your findings. I am sure that this does happen and with more general data sharing this may certainly become more common. If the volume of such efforts becomes so large that it overwhelms an individual researcher and thus hinders their own progress unnecessarily, this would indeed be a concern. Perhaps we need to have a discussion on what safeguards could ensure that this doesn’t get out of hand or how one should deal with that situation. I think it’s a valid concern and worth some serious thought. (Update on 25 Jan 2016: Stephan Lewandowsky and Dorothy Bishop wrote an interesting comment on this issue.)
But I guarantee you, throwing the blame at data sharing is not the solution to this potential problem. The answer to scepticism and scrutiny cannot ever be to keep your data under lock and key. You may never convince a staunch sceptic but you also will not win the hearts and minds of the undecidedly doubtful by hiding in your ivory tower. In science, the only convincing argument is data, more data, better tests – and the willingness to change your mind if the evidence demands it.