How funders could encourage replication efforts

As promised, here is a post about science stuff, finally back to a more cheerful and hopeful topic than the dreadful state the world outside science is in right now…

A Dutch research funding agency recently announced a new grant initiative that exclusively funds replication attempts. The idea is to support replication efforts of particularly momentous “cornerstone” research findings. It’s not entirely clear what this means but presumably such findings include highly cited findings, those with great media coverage and public policy impact etc. It isn’t clear who determines whether a finding falls under this.

You can read about this announcement here. In that article you can see some comments by me on how I think funders should encourage replications by requiring that new grant proposals should also contain some replication of previous work. Like most people I believe replication to be one of the pillars supporting science. Before we treat any discovery as important we must know that it is reliable and meaningful. We need to know how far it generalizes or if it is fickle and subject to minor changes in experimental parameters. If you read anything I have written about replication, you will probably already know my view on this: most good research is built on previous findings. This is how science advances. You take some previously observed results and use them to generate new hypotheses to be tested in a new experiment. In order to do so, you should include a replication and/or sanity check condition in your new experiment. This is precisely the suggestion Richard Feynman made in his famous Cargo Cult Science lecture.

Imagine somebody published a finding that people perceive the world as darker when they listen to sad classical music (let’s ignore for the moment the inherent difficulty in actually demonstrating such an effect…). You now want to ask if they also perceive the world as darker when they listen to dark metal. If you simply run the same experiment but replace the music any result you find will be inconclusive. If you don’t find any perceptual effect, it could be that your participant sample simply isn’t affected by music. The only way to rule this out is to also include the sad classical music condition in your experiment to test whether this claim actually replicates. Importantly, even if you do find a perceptual effect of dark metal music, the same problem applies. While you could argue that this is a conceptual replication, if you don’t know that you could actually replicate the original effect of classical music, it is impossible to know that you really found the same phenomenon.

My idea is that when applying for funding we should be far more explicit about how the proposal builds on past research and, insofar as this is feasible, build more replication attempts into the proposed experiments. Critically, if you fail to replicate those experiments, this would in itself be an important finding that should be added to the scientific record. The funding thus implicitly sets aside some resources for replication attempts to validate previous claims. However, this approach also supports the advance of science because every proposal is nevertheless designed to test novel hypotheses. This stands in clear contrast to pure replication efforts such as those this Dutch initiative advocates or the various large-scale replication efforts like the RPP and Many Labs project. While I think these efforts clearly have value, one major concern I have with them is that they seem to stagnate scientific progress. They highlighted a lack of replicability in the current literature and it is undoubtedly important to flag that up. But surely this cannot be the way we will continue to do science from now on. Should we have a new RPP every 10 years now? And who decides which findings should be replicated? I don’t think we should really care whether every single surprising claim is replicated. Only the ones that are in fact in need of validation because they have an impact on science and society probably need to be replicated. But determining what makes a cornerstone discovery is not really that trivial.

That is not to say that such pure replication attempts should no longer happen or that they should receive no funding at all. If anyone is happy to give you money to replicate some result, by all means do so. However, my suggestion differs from these large-scale efforts and the Dutch initiative in that it treats replication the way it should be treated, as an essential part of all research, rather than as a special effort that is somehow separate from the rest. Most research would only be funded if it is explicit about which previous findings it builds on. This inherently also answers the question of which previous claims should be replicated: only those findings that are deemed important enough by other researchers to motivate new research are sufficiently important for replication attempts.

Perhaps most crucially, encouraging replication in this way will help to break down the perceived polarization between the replicators and original authors of high-impact research claims. While I doubt many scientists who published replications actually see themselves as a “replication police,” we continue to rehash these discussions. Many replication attempts are also suspected of being motivated by mistrust in the original claim. Not that there is really anything wrong with that because surely healthy skepticism is important in science. However, whether justified or not, skepticism of previous claims can lead to the perception that the replicators were biased and the outcome of the replication was a self-fulfilling prophecy. My suggestion would mitigate this problem at least to a large degree because most grant proposals would at least seek to replicate results that have a fighting chance of being true.

In the Nature article about this Dutch initiative there are also comments from Dan Gilbert, a vocal critic of the large-scale replication efforts. He bemoans the “unoriginality” of such replication research and suspects that we will learn more about the universe by spending money on “exploring important new ideas.” I think this betrays the same false dichotomy I described above. I certainly agree with Gilbert that the goal of science should be to advance our understanding of the world but originality is not really the only objective here. Scientific claims must also be valid and generalize beyond very specific experimental contexts and parameters. In my view, both are equally important for healthy science. As such, there is not a problem with the Dutch initiative but it seems rather gimmicky to me and I am unconvinced its effects will be lasting. Instead I believe the only way to encourage active and ongoing replication efforts will be to overhaul the funding structure as I outlined here.

[Image: Flag of Europe]
52% seems barely above chance. Someone should try to replicate that stupid referendum.

3 scoops of vanilla science in a low impact waffle please

A lot of young[1] researchers are worried about being “scooped”. No, this is not about something unpleasantly kinky but about when some other lab publishes an experiment that is very similar to yours before you do. Sometimes this is even more than just a worry and it actually happens. I know that this could be depressing. You’ve invested months or years of work and sleepless nights in this project and then somebody else comes along and publishes something similar and – poof – all the novelty is gone. Your science career is over. You will never publish this high impact now. You won’t ever get a grant. Immeasurable effort down the drain. Might as well give up, sell your soul to the Devil, and get a slave job in the pharmaceutical industry and get rich[2].

Except that this is total crap. There is no such thing as being scooped in this way, or at least if there is, it is not the end of your scientific career. In this post I want to briefly explain why I think so. This won’t be a lecture on the merits of open science, on replications, on how we should care more about the truth than about novelty and “sexiness”. All of these things are undoubtedly true in my mind and they are things we as a community should be actively working to change – but this is no help to young scientists who are still trying to make a name for themselves in a system that continues to reward high impact publications over substance.

No. Here I will talk about this issue with respect to the status quo. Even in the current system, imperfect as it may be, this irrational fear is in my view unfounded. It is essential to dispel these myths about impact and novelty, about how precedence is tied to your career prospects. Early career scientists are the future of science. How can we ever hope to change science for the better if we allow this sort of madness to live on in the next generation of scientists? I say ‘live on’ for good reason – I, too, used to suffer from this madness when I was a graduate student and postdoc.

Why did I have this madness? Honestly I couldn’t say. Perhaps it’s a natural evolution of young researchers, at least in our current system. People like to point the finger at the lab PIs pressuring you into this sort of crazy behaviour. But that wasn’t it for me. For most of my postdoc I worked with Geraint Rees at UCL and perhaps the best thing he ever told me was to fucking chill[3]. He taught me – more by example than words – that while having a successful career was useful, what is much more important is to remember why you’re doing it: The point of having a (reasonably successful) science career is to be able to pay the rent/mortgage and take some enjoyment out of this life you’ve been given. The reason I do science, rather than making a mint in the pharma industry[4], is that I am genuinely curious and want to figure shit out.

Guess what? Neither of these things depends on whether somebody else publishes a similar (or even very similar) experiment while you’re still running it. We all know that novelty still matters to a lot of journals. Some have been very reluctant to publish replication attempts. I agree that publishing high impact papers does help wedge your foot in the door (that is, get you short-listed) in grant and job applications. But even if this were all that matters to be a successful scientist (and it really isn’t), here’s why you shouldn’t care too much about that anyway:

No paper was ever rejected because it was scooped

While journal editors will reject papers because they aren’t “novel,” I have never seen any paper being rejected because somebody else published something similar a few months earlier. Most editors and reviewers will not even be aware of the scooping study. You may find this hard to believe because you think your own research is so timely and important, but statistically it is true. Of course, some reviewers will know of the work. But most reviewers are not actually bad people and will not say “Something like this was published three months ago already and therefore this is not interesting.” Again, you may find this hard to believe because we’ve all heard too many stories of Reviewer 2 being an asshole. But in the end, most people aren’t that big of an asshole[5]. It happens quite frequently that I suggest in reviews that the authors cite some recently published work (usually not my own, in case you were wondering) that is very similar to theirs. In my experience this has never led to a rejection but I ask them to put their results in the context of similar findings in the literature. You know, the way a Discussion section should be.

No two scooped studies are the same

You may think that the scooper’s experiment was very similar, but unless they actually stole your idea (a whole different story I also don’t believe but I have no time for this now…) and essentially pre-replicated (preclicated?) your design, I’d bet that there are still significant differences. Your study has not lost any of its value because of this. And it’s certainly no reason to quit and/or be depressed.

It’s actually a compliment

Not 100% sure about this one. Scientific curiosity shouldn’t have anything to do with a popularity contest if you ask me. Study whatever the hell you want to (within ethical limits, that is). But I admit, it feels reassuring to me when other people agree that the research questions I am interested in are also interesting to them. For one thing, this means that they will appreciate you working and (eventually) publishing on it, which again from a pragmatic point of view means that you can pay those rents/mortgages. And from a simple vanity perspective it is also reassuring that you’re not completely mad for pursuing a particular research question.

It has little to do with publishing high impact

Honestly, from what I can tell neither precedence nor the popularity of your topic is the critical factor in getting your work into high impact journals. The novelty of your techniques, how surprising and/or reassuringly expected your results are, and the simplicity of the narrative are actually major factors. Moreover, the place you work, the co-authors with whom you write your papers, and the accessibility of the writing (in particular your cover letter to the editors!) definitely matter a great deal also (and these are not independent of the first points either…). It is quite possible that your “rival”[6] will publish first, but that doesn’t mean you won’t publish similar work in a higher impact journal. Journal review outcome is pretty stochastic and not really very predictable.

Actual decisions are not based on this

We all hear the horror stories of impact factors and h-indexes determining your success with grant applications and hiring decisions. Even if this were true (and I actually have my doubts that it is as black and white as this), a CV with lots of high impact publications may get your foot in the door – but it does not absolve the panel from making a hiring/funding decision. You need to do the work on that one yourself and even then luck may be against you (the odds certainly are). It also simply is not true that most people are looking for the person with the most Nature papers. Instead I bet you they are looking for people who can string together a coherent argument, communicate their ideas, and who have the drive and intellect to be a good researcher. Applicants with a long list of high impact papers may still come up with awful grant proposals or do terribly in job interviews while people with less stellar publication records can demonstrate their excellence in other ways. You may already have made a name for yourself in your field anyway, through conferences, social media, public engagement etc. This may matter far more than any high impact paper could.

There are more important things

And now we’re coming back to the work-life balance and why you’re doing this in the first place. Honestly, who the hell cares whether someone else published this a few months earlier? Is being the first to do this the reason you’re doing science? I can see the excitement of discovery but let’s face it, most of our research is neither like the work of Einstein or Newton nor are we discovering extraterrestrial life. Your discovery is no doubt exciting to you, it is hopefully exciting to some other scientists in your little bubble and it may even be exciting to some journalist who will write a distorting, simplifying article about it for the mainstream news. But seriously, it’s not so groundbreaking that it is worth sacrificing your mental and physical health over it. Live your life. Spend time with your family. Be good to your fellow creatures on this planet. By all means, don’t be complacent, ensure you make a living but don’t pressure yourself into believing that publishing ultra-high impact papers is the meaning of life.

A positive suggestion for next time…

Now if you’re really worried about this sort of thing, why not preregister your experiment? I know I said I wouldn’t talk about open science here but bear with me just this once because this is a practical point you can implement today. As I keep saying, the whole discussion about preregistration is dominated by talking about “questionable research practices”, HARKing, and all that junk. Not that these aren’t worthwhile concerns but this is a lot of negativity. There are plenty of positive reasons why preregistration can help and the (fallacious) fear of being scooped is one of them. Preregistration does not stop anyone else from publishing the same experiment before you but it does allow you to demonstrate that you had thought of the idea before they published it. With Registered Reports it becomes irrelevant if someone else published before you because your publication is guaranteed after the method has been reviewed. And I believe it will also make it far clearer to everyone how much who published what first where actually matters in the big scheme of things.

[1] Actually there are a lot of old and experienced researchers who worry about this too. And that is far worse than when early career researchers do it because they should really know better and they shouldn’t feel the same career pressures.
[2] It may sound appealing now, but thinking about it I wouldn’t trade my current professional life for anything. Except for grant admin bureaucracy perhaps. I would happily give that up at any price… :/
[3] He didn’t quite say it in those terms.
[4] This doesn’t actually happen. If you want to make a mint you need to go into scientific publishing but the whole open science movement is screwing up that opportunity now as well so you may be out of luck!
[5] Don’t bombard me with “Reviewer 2 held up my paper to publish theirs first” stories. Unless Reviewer 2 signed their review or told you specifically that it was them I don’t take such stories at face value.
[6] The sooner we stop thinking of other scientists in those terms the better for all of us.

Started signing my reviews

As of this year, I started signing my reviews. This decision has been a long time coming. A lot of people sign their reviews making this not a particularly newsworthy event but I’ll tell you about it anyway, largely to have a record of when I started and also to explain my reasons.

To explain why, I first need to talk about why one might not want to sign peer reviews. The debate about whether or not to sign reviews has been raging for years. It divides people’s minds and the debate regularly sparks up again. Even the people who agree that the process of scientific research can be improved seem to often fall into two camps whose opinions are diametrically opposed: one side fervently argues that all peer reviews should be transparent and signed, whilst other people argue with equal fervour that ideally all reviews should be double-blind, so that neither reviewers nor authors know each other’s identities.

Whenever someone suggests double-blind reviews, people are wont to argue that this simply doesn’t work in many situations. It is possible to guess the authors from the research question and/or the methods used. If the authors previously presented the research at a conference it is likely that reviewers will have already seen it in a preliminary form. That said, the very few times I did review in a double-blind manner I actually didn’t guess the authors’ identities and in one case I was in fact reviewing the work of friends and collaborators without even knowing it. I’d like to think I would’ve been fair either way, but I must also admit that I was probably more sceptical and possibly less biased because I didn’t know who the authors were. Still, these cases are probably somewhat special – in many situations I would know the authors from the research or at least have a strong suspicion. The suspicion might also lead me to some erroneous assumptions, such as “These authors usually do this and that even though this isn’t mentioned here”. If my guess were actually wrong then this could skew my thought process unduly.

So I think double-blind reviewing is a bad idea. Now, many arguments have been brought forth as to why reviews should be anonymous. It can protect reviewers from the wrath of vengeful senior colleagues making unfair hiring or funding decisions because they didn’t like your review. There are a lot of arseholes in the world and this is certainly a possibility. But the truth is that anonymity doesn’t stop people from behaving in this way – and there is actually no compelling evidence that signed reviews make it worse. I have heard some harrowing tales from colleagues who were treated unfairly by some major players in their fields because those players thought that my colleagues had given their work a bad review. In one case, it was a PhD student of the assumed reviewer who received ill treatment – and the assumption was entirely incorrect.

You also frequently hear people’s guesses about who they think Reviewer 2 was on their latest rejected manuscript, often based on circumstantial or generally weak evidence. One of my favourites is the age-old “He (because we know all reviewers are obviously male…) asked us to cite lots of his papers!” I am sure this happens but I wonder how often this deduction is correct. I almost never ask people to cite my papers – if I do it is because I feel they are directly relevant and citing them is the scholarly thing to do. It is far more likely that I ask people to cite the work of researchers whose work I know well when it is relevant. In many cases when people just “know” that Reviewer 2 is Professor X because they want X to be cited, it seems to me far more likely that the reviewer is one of Professor X’s postdocs or former students. In many cases, it may also be that Professor X’s work is an established part of the literature and thus in the interest of scholarship an unbiased reviewer will think it deserves being cited even though you think Professor X’s work is rubbish. In short, I find these kinds of insane guessing games rather tedious and potentially quite damaging.

The first time I signed a review was when I reviewed for F1000Research where signing is mandatory. (I had already reviewed at Frontiers a few times where reviewer identities are public but I don’t think this counts: reviews aren’t signed upon submission of the review but only after publication of the paper. Moreover, the match between review and reviewer remains ambiguous.) I must say reviewing this paper all in public was a rather uplifting experience. At all stages of this process I felt the communication between me and the authors was amicable and sensible in spite of the harshness of my decisions. I have also been led to believe that the authors appreciated my scepticism (although only they can tell you that for sure).

By signing I may have also been more polite than I might have been if my review were anonymous. I am not entirely convinced of this last argument because I typically try to be polite. There are a lot of dickheads out there who aren’t polite even when their identity is public :P. I also don’t buy that anonymous reviewers aren’t accountable and that thus the quality of the review suffers. Your review is still read by at least one editor – unless that editor is your close personal friend (which is still rare for me at least) then I do feel like someone is checking my review both for factual quality and politeness.

Either way, I did not perceive any adverse consequences of signing my reviews. If anything, it made me think harder about how I would write my review and to check the arguments I am making. Scientists should criticise and scrutinise each other. By this I don’t mean you should mistrust people’s intentions or question their competence. But science is fueled by scepticism and you should challenge anything that doesn’t make sense. I have certainly done so in my collaborations in the past (often to the frustration of my collaborators) and I try to encourage this in my own lab. I would much rather have a student or postdoc who tells me that my idea makes no sense than someone who does everything I say. Researchers also do that at conferences where they discuss each other’s research. One of my most positive experiences from a conference was some rather intense – but very polite – discussions at a poster. Why can’t we do the same in paper reviews?

If I’m perfectly honest, the main reason I hadn’t signed reviews so far is that I was raised that way. Almost none of the reviews I ever received were signed – certainly none of the negative ones. Some reviewers (including very critical ones) revealed their identities after the manuscripts had been accepted for publication and I have done the same in some cases. But the status quo of my field was always that reviews were anonymous and that’s just how it was. Challenging this seemed to go against nature – but that really isn’t true. Whether or not reviews are signed is a question of culture, not nature. And I want to change this culture.

Signing reviews is a personal choice. I don’t think it should ever become mandatory. For one thing, I’m a libertarian (just to be clear, I’m not one of the delusional tea party types) and I don’t believe we should force people to do things that aren’t necessary. I don’t think signed reviews are necessary. I think making all review contents public would be an essential improvement to peer review, with or without signing. But signing reviews can be a positive development and I believe it should be encouraged. I certainly think it is a positive development for me and this is why everyone should be free to take this step of their own accord. Signing my first reviews has been a strangely liberating experience. I don’t know if this will provoke the ire of powerful senior colleagues. In a few years’ time I may post an update about my experience. Somehow I doubt it will turn out to be a problem.

A brave new world of research parasites

What a week! I have rarely seen the definition of irony being demonstrated more clearly in front of my eyes than during the days following the publication of this comment by Lewandowsky and Bishop in Nature. I mentioned this at the end of my previous post. The comment discusses the question of how to deal with data requests and criticisms of scientific claims in the new world of open science. A lot of digital ink has already been spilled elsewhere debating what they did or didn’t say and what they meant to say with their article. I have no intention of rehashing that debate here. So while I typically welcome any meaningful and respectful comments under my posts, I’ll regard any comments on the specifics of the L&B article as off-topic and will not publish them. There are plenty of other channels for this.

I think the critics attack a strawman and the L&B discussion is a red herring. Irrespective of what they actually said, I want to get back to the discussion we should be having, which I already alluded to last time. In order to do so, let’s get the premise crystal clear. I have said all this before in my various posts about data sharing but let me summarise the fundamental points:

  1. Data sharing: All data for scientific studies needed to reproduce the results should be made public in some independent repository at the point of publication. This must exclude data which would be unethical to share, e.g. unprocessed brain images from human participants. Such data fall in a grey area as to how much anonymisation is necessary and it is my policy to err on the side of caution there. We have no permission from our participants (except for some individual cases) to share their data with anyone outside the team if there is a chance that they could be identified from it so we don’t. For the overwhelming majority of purposes such data are not required and the pre-processed, anonymised data will suffice.
  2. Material sharing: When I talk about sharing data I implicitly also mean material, so any custom analysis code, stimulus protocols, or other materials used for the study should also be shared. This is not only good for reproducibility, i.e. getting the same results using the same data. It is also useful for replication efforts aiming to repeat the same experiment to collect new data.
  3. Useful documentation: Shared data are unlikely to be much use to anyone if there isn’t a minimum of documentation explaining what they contain. I don’t think this needs to be excessive, especially given the fact that most data will probably never be accessed by anyone. But there should at least be some basic guide on how to use the data to return a result. It should be reasonably clear what data can be found where or how to run the experiment. Provided the uncompiled code is included and the methods section of the publication contains sufficient detail of what is being done, anyone looking at it should be able to work it out by themselves. More extensive documentation is certainly helpful and may also help the researchers themselves in organising their work – but I don’t think we should expect more than the basics.

Now with this out of the way I don’t want to hear no lamentations about how I am “defending” the restriction of data access to anyone or any such rubbish. Let’s simply work on the assumption that the world is how it should be and that the necessary data are available to anyone with an internet connection. So let’s talk about the worries and potential problems this may bring. Note that, as I already said, most data sets will probably not generate much interest. That is fine – they should be available for potential future use in any case. More importantly this doesn’t mean the following concerns aren’t valid:

Volume of criticism

In some cases the number of people reusing the shared data will be very large. This is particularly likely for research on controversial topics. This could be because the topic is a political battleground or that the research is being used to promote policy changes people are not happy with. Perhaps the research receives undeserved accolades from the mainstream media or maybe it’s just a very sensational claim (Psi research springs to mind again…). The criticisms of this research may or may not be justified. None of this matters and I don’t care to hear about the specifics about your particular pet peeve whether it’s climate change or some medical trial. All that matters in this context is that the topic is controversial.

As I said last time, it should be natural that sensational or controversial research attracts more attention and more scepticism. This is how it should be. Scientists should be sceptical. But individual scientists or small research teams are composed of normal human beings and they have a limit to how much criticism they can keep up with. This is a simple fact. Of course this statement will no doubt draw out the usual suspects who feel the need to explain to me that criticism and scepticism are necessary in science and that this is simply what one should expect.

So let me cut the heads off this inevitable hydra right away. First of all, this is exactly what I just said: Yes, science depends on scepticism. But it is also true that humans have limited capacity for answering questions and criticisms and limited ability to handle stress. Simply saying that they should be prepared for that and have no right to complain is unrealistic. If anything it will drive people away from doing research on controversial questions which cannot be a good thing.

Similarly, it is unrealistic to say that they could just ignore criticisms if they get too much for them. It is completely natural that a given scientist will want to respond to criticisms, especially if those criticisms are public. They will want to defend the conclusions they’ve drawn and they will also feel that they have a reputation to defend. I believe science would generally be better off if we all learned to become less invested in our pet theories and conducted our inferences in a less dogmatic way. I hope there are ways we can encourage such a change – but I don’t think you can take ego out of the question completely. Especially if a critic accuses a researcher of incompetence or worse, it shouldn’t surprise anyone if they react emotionally and have personal stakes in the debate.

So what can we expect? To me it seems entirely justified in this situation that a researcher would write a summary response that addresses the criticism collectively. In that they would most likely have to be selective and only address the more serious points and ignore the minutiae. This may require some training. Even then it may be difficult because critics might insist that their subtle points are of fundamental importance. In that situation an adjudicating article by an independent party may be helpful (albeit probably not always feasible).

On a related note, it also seems justified to me that a researcher will require time to make a response. This pertains more to how we should assess a scientific disagreement as outside observers. Just because a researcher hasn’t responded to every little criticism within days of somebody criticising their work doesn’t mean that the criticism is valid. Scientists have lives too. They have other professional duties, mortgages to pay with their too-low salaries, children to feed, and – hard as it is to believe – they deserve some time off occasionally. As long as they declare their intention to respond in depth at some stage we should respect that. Of course if they never respond that may be a sign that they simply don’t have a good response to the criticism. But you need some patience, something we seem to have lost in the age of instant access social media.

Excessive criticism or harassment

This brings us to the next issue. Harassment of researchers is never okay. Which is really because harassment of anyone is never okay. So pelting a researcher with repeated criticisms, making the same points or asking the same questions over and over, is not acceptable. This certainly borders on harassment and may cross the line. This constant background noise can wear people out. It is also counterproductive because it slows them down in making their response. It may also paralyse their other research efforts which in turn will stress them out because they have grant obligations to fulfill etc. Above all, stress can make you sick. If you harassed somebody out of the ability to work, you’ll never get a response – this doesn’t make your criticism valid.

If the researchers declared their intention to respond to criticism we should leave it at that. If they don’t respond after a significant time it might be worth sending a reminder to ask whether they are still working on it. As I said above, if they never respond this may be a sign that they have no response. In that case, leave it at that.

It should require no explanation why any blatant harassment, abusive contact, or any form of interference in the researchers’ personal lives, is completely unacceptable. Depending on the severity of such cases they should be prosecuted to the full extent of the law. And if someone reports harassment, in the first instance you should believe them. It is a common tactic of harassers to downplay claims of abuse. Sure, it is also unethical to make false accusations but you should leave that for the authorities to judge, in particular if you don’t have any evidence one way or the other. Harassment is also subjective. What might not bother you may very well affect another person badly. Brushing this off as them being too sensitive demonstrates a serious lack of compassion, is disrespectful, and I think it also makes you seem untrustworthy.

Motive and bias

Speaking of untrustworthiness brings me to the next point. There has been much discussion about the motives of critics and in how far a criticism is to be taken in “good faith”. This is a complex and highly subjective judgement. In my view, your motive for reanalysing or critiquing a particular piece of research is not automatically a problem. All the data should be available, remember? Anyone can reanalyse it.

However, as all researchers should be honest so should all critics. Obviously this isn’t mandatory and it couldn’t be enforced even if it were. But this is how it should be and how good scientists should work. I have myself criticised and reanalysed research by others and I was not beating around the bush in either case – I believe I was pretty clear that I didn’t believe their hypothesis was valid. Hiding your prior notions is disrespectful to the authors and also misleads the neutral observers of the discussion. Even if you think that your public image already makes your views clear – say, because you ranted at great length on social media about how terribly flawed you think that study was – this isn’t enough. Even the Science Kardashians don’t have that large a social media following and probably only a fraction of that following will have read all your in-depth rants.

In addition to declaring your potential bias you should also state your intention. It is perfectly justified to dig into the data because you suspect it isn’t kosher. But this is an exploratory analysis and it comes with many of the same biases that uncontrolled, undeclared exploration always has. Of course you may find some big smoking gun that invalidates or undermines the original authors’ conclusions. But you are just as likely to find some spurious glitch or artifact in the data that doesn’t actually mean anything. In the latter case it would make more sense to conduct a follow-up experiment that tests your new alternative hypothesis to see if your suspicion holds up. If on the other hand you have a clear suspicion to start with you should declare it and then test it and report the findings no matter what. Preregistration may help to discriminate the exploratory fishing trips from the pointed critical reanalyses – however, it is logistically not very feasible to check whether this wasn’t just a preregistration after the fact because the data were already available.

So I think this judgement will always rely heavily on trust but that’s not a bad thing. I’m happy to trust a critic if they declare their prior opinion. I will simply take their views with some scepticism that their bias may have influenced them. A critic who didn’t declare their bias but is then shown to have a bias appears far less trustworthy. So it is actually in your interest to declare your bias.

Now before anyone inevitably reminds us that we should also worry about the motives and biases of the original authors – yes, of course. But this is a discussion we’ve already had for years and this is why data sharing and novel publication models like preregistration and registered reports are becoming more commonplace.

Lack of expertise

On to the final point. Reanalyses or criticism may come from people with too little expertise and knowledge of a research area to provide useful contributions. Such criticisms may obfuscate the discussion and that is never a good thing. Again preempting the inevitable comments: No, this does not mean that you have to prove your expertise to reanalyse the data. (Seriously guys, which part of “all data should be available to anyone” don’t you get?!). What it does mean is that I might not want to weight the criticism by someone who once took a biology class in high school the same way as that of a world expert. It also means that I will be more sceptical when someone is criticising something outside their own field.

There are many situations where this caveat doesn’t matter. Any scientist with some statistical training may be able to comment on some statistical issue. In fact, a statistician is presumably more qualified to comment on some statistical point than a non-statistician of whatever field. And even if you may not be an expert on some particular research topic you may still be an expert on the methods used by the researchers. Importantly, even a non-expert can reveal a fundamental flaw. The lack of a critic’s expertise shouldn’t be misused to discredit them. In the end, what really matters is that your argument is coherent and convincing. For that it doesn’t actually matter if you are an expert or not (an expert may however find it easier to communicate their criticism convincingly).

However, let’s assume that a large number of non-experts are descending on a data set picking little things they perceive as flaws that aren’t actually consequential or making glaring errors (to an expert) in their analysis. What should the researchers do in this situation? Not responding at all is not in their interest. This can easily be misinterpreted as a tacit acknowledgement that their research is flawed. On the other hand, responding to every single case is not in their interest either if they want to get on with their work (and their lives for that matter). As above, perhaps the best thing to do would be to write a summary response collectively rebutting the most pertinent points, make a clear argument about the inconsequentialness of these criticisms, and then leave it at that.

Conclusion

In general, scientific criticisms are publications that should work like any other scientific publications. They should be subject to peer review (which, as readers of this blog will know, I believe should be post-publication and public). This doesn’t mean that criticism cannot start on social media, blogs, journal comment sections, or on PubPeer, and the boundaries may also blur at times. For some kinds of criticism, such as pointing out basic errors or misinterpretations, some public comments may suffice and there have been cases where a publication was retracted simply because of the social media response. But for a criticism to be taken seriously by anyone, especially non-experts, it helps if it is properly vetted by independent experts – just as any study should be vetted. This may also help particularly with cases where the validity of the criticism is uncertain.

I think this is a very important discussion to have. We need to have this to bring about the research culture most of us seem to want. A brave new world of happy research parasites.


(Note: I changed the final section somewhat after Neuroskeptic rightly pointed out that the conclusions were a bit too general. Tal Yarkoni independently replicated this sentiment. But he was only giving me a hard time.)

 

Parasitical science?

This weekend marked another great moment in the saga surrounding the discussion about open science – a worthy sequel to “angry birds” and “shameless little bullies”. This time it was an editorial about data sharing in the New England Journal of Medicine which contains the statement that:

There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”

Remarks like this from journal editors are just all kinds of stupid. Even though this was presented in the context of quotes by unnamed “front-line researchers” (whatever that means) they implicitly endorse the interpretation that re-using other people’s published data is parasitical. In fact, their endorsement is made clear later on in the editorial when the editors express the hope that data sharing “should happen symbiotically, not parasitically.”

Parasites
Contact Richard Morey to add this badge to your publications!

It shouldn’t come as a surprise that this editorial was immediately greeted by widespread ridicule and the creation of all sorts of internet memes poking fun at the notion of “research parasites.” Even if some people believe this, hell, even if the claim were true (spoiler: it’s not), this is just a very idiotic thing to do. Like it or not, open access, transparency, and post-publication scrutiny of published scientific findings are becoming increasingly common and are already required in many places. We’re now less than a year away from the date when the Peer Reviewers’ Openness Initiative, whose function it is to encourage data sharing, will come into effect. Not only is the clock not turning back on this stuff – it is deeply counterproductive to liken the supporters of this movement to parasites. This is no way to start (or have) a reasonable conversation.

And there should be a conversation. If there is one thing I have learned from talking with colleagues, worries about data sharing and open science as a whole are far from rare. Misguided as it may be, the concern about others scooping your ideas and sifting through the data you spent blood, sweat, and tears collecting resonates with many people. This editorial didn’t just pop into existence from the quantum foam – it comes from a real place. The mocking and snide remarks about this editorial are fully deserved. This editorial is moronic and ass-backwards. But speaking more generally, snide and mocking are never a good way to convince people of the strength of your argument. All too often worries like this are met with disrespect and ridicule. Is it any surprise that a lot of people don’t dare to speak up against open science? Similarly, when someone discovers errors or problems in somebody else’s data, some are quick to make jokes or serious accusations about these researchers. Is this encouraging them to open their lab books and file drawers? I think not.

Scientists are human beings and they tend to have normal human reactions when being accused of wrong-doing, incompetence, or sloppiness. Whether or not the accusations are correct is irrelevant. Even mentioning the dreaded “questionable research practices” sounds like a fierce accusation to the accused even though questionable research practices can occur quite naturally without conscious ill intent when people are wandering in the garden of forking paths. In my opinion we need to be mindful of that and try to be more considerate in the way we discuss these issues. Social media like Facebook and Twitter do not exactly seem to encourage respectful dialogue. I know this firsthand as I have myself said things about (in my view) questionable research that I subsequently regretted. Scepticism is good and essential to scientific progress – disrespect is not.

It seems to have been the intention of this misguided editorial to communicate a similar message. It encourages researchers using other people’s data to work with the original authors. So far so good. I am sure no sensible person would actually disagree with that notion. But where the editorial misses the point is that there is no plan for what happens if this “symbiotic” relationship isn’t forming, either because the original authors are not cooperating or because there is a conflict of interests between skeptics and proponents of a scientific claim. In fact, the editorial lays bare what I think is the heart of the problem in a statement that to me seems much worse than the “research parasites” label. They say that people…

…even use the data to try to disprove what the original investigators had posited.

It baffles me that anyone can write something like this whilst keeping a straight face. Isn’t this how science is supposed to work? Trying to disprove a hypothesis is just basic Popperian falsification. Not only should others do that, you should do that yourself with your own research claims. To be fair, the best way to do science in my opinion is to generate competing hypotheses and test them with as little emotional attachment to any of them as possible but this is more easily said than done… So ideally we should try to find the hypothesis that best explains the data rather than just seeking to disprove. Either way however, this sentence is clearly symptomatic of a much greater problem: Science should be about “finding better ways of being wrong.” The first step towards this is to acknowledge that anything we posited is never really going to be “true” and that it can always use a healthy dose of scientific scepticism and disproving.

I want to have this dialogue. I want to debate the ways to make science healthier, more efficient, and more flexible in overturning false ideas. As I outlined in a previous post, data sharing is the single most important improvement we can make to our research culture. I think even if there are downsides to it, the benefits outweigh them by far. But not everyone shares my enthusiasm for data sharing and many people seem worried but afraid to speak up. This is wrong and it must change. I strongly believe that most of the worries can be alleviated:

  • I think it’s delusional to believe that data sharing will produce a “class” of “research parasites.” People will still need to generate their own science to be successful. Simply sitting around waiting for other people to generate data is not going to be a viable career strategy. If anything, large consortia like the Human Genome or Human Connectome Project will produce large data sets that a broad base of researchers can use. But this won’t allow them to test every possible hypothesis under the sun. In fact, most data sets are far too specific to be much use to many other people.
  • I’m willing to bet that the vast majority of publicly shared data sets won’t be downloaded, let alone analysed by anyone other than the original authors. This is irrelevant. The point is that the data are available because they could be potentially useful to future science.
  • Scooping other people’s research ideas by doing the experiment they wanted to do by using their published data is a pretty ineffective and risky strategy. In most cases, there is just no way that someone else would be faster than you publishing an experiment you wanted to do using your data. This doesn’t mean that it never happens but I’m still waiting for anyone to tell me of a case where this actually did happen… But if you are worried about it, preregister your intention so at least anyone can see that you planned it. Or even better, submit it as a Registered Report so you can guarantee that this work will be published in a journal regardless of what other people did with your data.
  • While we’re at it, upload the preprints of your manuscripts when you submit them to journals. I still dream of a publication system where we don’t submit to journals at all, or at least not until peer review took place and the robustness of the finding has been confirmed. But until we get there, preprints are the next best thing. With a public preprint the chronological precedence is clear for all to see.

Now that covers the “parasites” feeding on your research productivity. But what to do if someone else subjects your data to sceptical scrutiny in the attempt to disprove what you posited? Again, first of all I don’t think this is going to be that frequent. It is probably more frequent for controversial or surprising claims and it bloody well should be. This is how science progresses and shouldn’t be a concern. And if it actually turns out that the result or your interpretation of it is wrong, wouldn’t you want to know about it? If your answer to this question is No, then I honestly wonder why you do research.

I can however empathise with the fear that people, some of whom may lack the necessary expertise or who cherry pick the results, will actively seek to dismantle your findings. I am sure that this does happen and with more general data sharing this may certainly become more common. If the volume of such efforts becomes so large that it overwhelms an individual researcher and thus hinders their own progress unnecessarily, this would indeed be a concern. Perhaps we need to have a discussion on what safeguards could ensure that this doesn’t get out of hand or how one should deal with that situation. I think it’s a valid concern and worth some serious thought. (Update on 25 Jan 2016: Stephan Lewandowsky and Dorothy Bishop wrote an interesting comment about this.)

But I guarantee you, throwing the blame at data sharing is not the solution to this potential problem. The answer to scepticism and scrutiny cannot ever be to keep your data under lock and key. You may never convince a staunch sceptic but you also will not win the hearts and minds of the undecidedly doubtful by hiding in your ivory tower. In science, the only convincing argument is data, more data, better tests – and the willingness to change your mind if the evidence demands it.

Coconut
Here at CoCoNiT (Cook-Islands Centre Of NeuroImaging Tests) we understand that once you crack the hard shell of your data, the sweet, white knowledge will just come pouring out…

Yes, science is self-correcting

If you don’t believe science self-corrects, then you probably shouldn’t believe that evolution by natural selection occurs either – it’s basically the same thing.

I have said it many times before, both under the guise of my satirical alter ego and later – more seriously – on this blog. I am getting very tired of repeating it so I wrote this final post about it that I will simply link to next time this inevitably comes up…

My latest outburst about this was triggered by this blog post by Keith Laws entitled “Science is ‘Other-Correcting’”. I have no quarrel with the actual content of this post. It gives an interesting account of the attempt to correct an error in the publication record. The people behind this effort are great researchers for whom I have the utmost respect. The story they tell is shocking and important. In particular, the email they received by accident from a journal editor is disturbing and serves as a reminder of all the things that are wrong with the way scientific research and publishing currently operates.

My issue is with the (in my view seemingly) ubiquitous doubts about the self-correcting nature of science. To quote from the first paragraph in that post:

“I have never been convinced by the ubiquitous phrase ‘Science is self-correcting’. Much evidence points to science being conservative and looking less self-correcting and more ego-protecting. It is also not clear why ‘self’ is the correct description – most change occurs because of the ‘other’ – Science is other correcting.”

In my view this and similar criticisms of self-correction completely miss the point. The prefix ‘self-’ refers to science, not to scientists. In fact, the very same paragraph contains the key: “Science is a process.” Science is an iterative approach by which we gradually broaden our knowledge and understanding of the world. You can debate whether or not there is such a thing as the “scientific method” – perhaps it’s more of a collection of methods. However, in my view above all else science is a way of thinking.

Scientific thinking is being inquisitive, skeptical, and taking nothing for granted. Prestige, fame, success are irrelevant. Perfect theories are irrelevant. The smallest piece of contradictory evidence can refute your grand unifying theory. And science encompasses all that. It is an emergent concept. And this is what is self-correcting.

Scientists, on the other hand, are not self-correcting. Some are more so than others but none are perfect. Scientists are people and thus inherently fallible. They are subject to ego, pride, greed, and all of life’s pressures, such as the need to pay a mortgage, feed their children, and have a career. In the common vernacular “science” is often conflated with the scientific enterprise, the way scientists go about doing science. This involves all those human factors and more and, fair enough, it is anything but self-correcting. But to argue that this means science isn’t self-correcting is attacking a straw man because few people are seriously arguing that the scientific enterprise couldn’t be better.

We should always strive to improve the way we do science because due to our human failings it will never be perfect. However, in this context we also shouldn’t forget how much we have already improved it. In the times of Newton, in Europe (the hub of science then) science was largely done only by white men from a very limited socioeconomic background. Even decades later, most women or people of non-European origin didn’t even need to bother trying (although this uphill struggle makes the achievements of scientists like Marie Curie or Henrietta Swan Leavitt all the more impressive). And publishing your research findings was not subject to formal peer review but largely dependent on the ego of some society presidents and on whether they liked you. None of these problems have been wiped off the face of the Earth but I would hope most people agree that things are better than they were 100 years ago.

Like all human beings, scientists are flawed. Nevertheless I am actually optimistic about us as a group. I do believe that on the whole scientists are actually interested in learning the truth and widening their understanding of nature. Sure, there are black sheep and even the best of us will succumb to human failings. At some point or other our dogma and affinity to our pet hypotheses can blind us to the cold facts. But on average I’d like to think we do better than most of our fellow humans. (Then again, I’m probably biased…).

We will continue to make the scientific enterprise better. We will change the way we publish and evaluate scientific findings. We will improve the way we interpret evidence and communicate scientific discoveries. The scientific enterprise will become more democratic, less dependent on publishers getting rich on our free labour. Already within the decade I have been a practicing scientist we have begun to tear down the widespread illusion that when a piece of research is published it must therefore be true. When I did my PhD, the only place we could critically discuss new publications was in a small journal club and the conclusions of these discussions were almost never shared with the world. Nowadays every new study is immediately discussed online by an international audience. We have taken leaps towards scientific findings, data, and materials being available to anyone, anywhere, provided they have internet access. I am very optimistic that this is only the beginning of much more fundamental changes.

Last year I participated in a workshop called “Is Science Broken?” that was solely organised by graduate students in my department. The growing number of replication attempts in the literature and all these post-publication discussions we are having are perfect examples of science correcting itself. It seems deeply ironic to me when posts like Keith Laws’, which describes an active effort to rectify errors, argue against the self-correcting nature of the scientific process.

Of course, self-correction is not guaranteed. It can easily be stifled. There is always a danger that we drift back into the 19th century or the dark ages. But the greater academic freedom (and generous funding) scientists are given, the more science will be allowed to correct itself.

Beach
Science is like a calm lagoon in the sunset… Or whatever. There is no real reason why this picture is here.

Update (19 Jan 2016): I just read this nice post about the role of priors in Bayesian statistics. The author actually says Bayesian analysis is “self-correcting” and this epitomises my point here about science. I would say science is essentially Bayesian. We start with prior hypotheses and theories but by accumulating evidence we update our prior beliefs to posterior beliefs. It may take a long time but assuming we continue to collect data our assumptions will self-correct. It may take a reevaluation of what the evidence is (which in this analogy would be a change to the likelihood function). Thus the discussion about how we know how close to the truth we are is in my view missing the point. Self-correction describes the process.
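A toy sketch of that analogy, under assumptions that are not in the linked post: two observers hold opposite Beta priors about how often some effect occurs, and both update on the same simulated yes/no observations. The true rate of 0.7, the two priors, the sample size, and the function names are all made up for illustration. The point is only that the distance between the two posteriors shrinks as the data accumulate, whatever the starting beliefs were.

```python
import random


def posterior_mean(prior_a, prior_b, successes, failures):
    """Mean of the Beta posterior after observing binomial data.

    A Beta(prior_a, prior_b) prior combined with binomial data gives a
    Beta(prior_a + successes, prior_b + failures) posterior.
    """
    a = prior_a + successes
    b = prior_b + failures
    return a / (a + b)


def simulate(true_rate=0.7, n_obs=1000, seed=1):
    """Draw n_obs yes/no observations with a hypothetical true rate."""
    rng = random.Random(seed)
    successes = sum(rng.random() < true_rate for _ in range(n_obs))
    return successes, n_obs - successes


successes, failures = simulate()

# A sceptic (prior mean about 0.05) and a believer (prior mean about 0.95).
sceptic_prior = posterior_mean(1, 20, 0, 0)
believer_prior = posterior_mean(20, 1, 0, 0)
sceptic_post = posterior_mean(1, 20, successes, failures)
believer_post = posterior_mean(20, 1, successes, failures)

print(f"prior gap:     {believer_prior - sceptic_prior:.3f}")
print(f"posterior gap: {believer_post - sceptic_post:.3f}")
```

With these particular priors the gap between the two posterior means is 19 / (21 + n) for n observations, so it depends only on how much data has been collected, not on what the data say.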

Update (21 Jan 2016): I added a sentence from my comment in the discussion section to the top. It makes for a good summary of my post. The analogy may not be perfect – but even if not I’d say it’s close. If you disagree, please leave a comment below.

Why wouldn’t you share data?

Data sharing has been in the news a lot lately, from the refusal of the authors of the PACE trial to share their data even though the journal expects it to the eventful story of the “Sadness impairs color perception” study. A blog post by Dorothy Bishop called “Who’s afraid of Open Data?” made the rounds. The post itself is actually a month old already but it was republished by the LSE blog which gave it some additional publicity. In it she makes an impassioned argument for open data sharing and discusses the fears and criticisms many researchers have voiced against data sharing.

I have long believed in making all data available (and please note that in the following I will always mean data and materials, so not just the results but also the methods). The way I see it this transparency is the first and most important remedy to the ills of scientific research. I have regular discussions with one of my close colleagues* about how to improve science – we don’t always agree on various points like preregistration, but if there is one thing where we are on the same page, it is open data sharing. By making data available anyone can reanalyse it and check if the results reproduce and it allows you to check the robustness of a finding for yourself, if you feel that you should. Moreover, by documenting and organising your data you not only make it easier for other researchers to use, but also for yourself and your lab colleagues. It also helps you with spotting errors. It is also a good argument that stops reviewer 2 from requesting a gazillion additional analyses – if they really think these analyses are necessary they can do them themselves and publish them. This aspect in fact overlaps greatly with the debate on Registered Reports (RR) and it is one of the reasons I like the RR concept. But the benefits of data sharing go well beyond this. Access to the data will allow others to reuse the data to answer scientific questions you may not even have thought of. They can also be used in meta-analyses. With the increasing popularity and feasibility of large-scale permutation/bootstrapping methods it also means that the availability of the raw values will be particularly important. Access to the data allows you to take into account distributional anomalies, outliers, or perhaps estimate the uncertainty on individual data points.
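To illustrate why the raw values matter for resampling methods, here is a minimal sketch of a percentile bootstrap for the mean. The scores are invented for this example (the last one is a deliberate outlier), and the function name, the number of resamples, and the 95% level are assumptions rather than anything taken from a particular study. A reader who only had the published mean and standard deviation could not run this; someone with the shared data can, and can see directly how much a single observation drives the uncertainty.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the raw values with replacement and record each mean.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical raw scores; the last value is a deliberate outlier.
scores = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3, 5.1, 12.0]

with_outlier = bootstrap_ci(scores)
without_outlier = bootstrap_ci(scores[:-1])

print("95% CI with outlier:   ", with_outlier)
print("95% CI without outlier:", without_outlier)
```

The interval that includes the outlier comes out much wider than the one without it, which is the kind of distributional check that summary statistics alone do not allow.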

But as Dorothy describes, many scientists nevertheless remain afraid of publishing their actual data alongside their studies. For several years many journals and funding agencies have had a policy that data should always be shared upon request – but a laughably small proportion of such requests are successful. This is why some have now adopted the policy that all data must be shared in repositories upon publication or even upon submission. And to encourage this process the Peer Reviewers’ Openness Initiative was recently launched, by which signatories would refuse to conduct in-depth reviews of manuscripts unless the authors can give a reason why data and materials aren’t public.

My most memorable experience with fears about open data involves a case where the lab head refused to share data and materials with the graduate student* who actually created the methods and collected the data. The exact details aren’t important. Maybe one day I will talk more about this little horror story… For me this demonstrates how far we have come already. Nowadays that story would be baffling to most researchers but back then (and that’s only a few years ago – I’m not that old!) more than one person actually told me that the PI and university were perfectly justified in keeping the student’s results and the fruits of their intellectual labour under lock and key.

Clearly, people are still afraid of open data. Dorothy lists the following reasons:

  1. Lack of time to curate data; Data are only useful if they are understandable, and documenting a dataset adequately is a non-trivial task;
  2. Personal investment – sense of not wanting to give away data that had taken time and trouble to collect to other researchers who are perceived as freeloaders;
  3. Concerns about being scooped before the analysis is complete;
  4. Fear of errors being found in the data;
  5. Ethical concerns about confidentiality of personal data, especially in the context of clinical research;
  6. Possibility that others with a different agenda may misuse the data, e.g. perform selective analysis that misrepresented the findings;

In my view, points 1-4 are invalid arguments even if they seem understandable. I have a few comments about some of these:

The fear of being scooped 

I honestly am puzzled by this one. How often does this actually happen? The fear of being scooped is widespread and it may occasionally be justified. Say, if you discuss some great idea you have or post a pilot result on social media perhaps you shouldn’t be surprised if someone else agrees that the idea is great and also does it. Some people wouldn’t be bothered by that but many would and that’s understandable. Less understandable to me is if you present research at a conference and then complain about others publishing similar work because they were inspired by you. That’s what conferences are for. If you don’t want that to happen, don’t go to conferences. Personally, I think science would be a lot better if we cared a lot less about who did what first and instead cared more about what is true and how we can work together…

But anyway, as far as I can see none of that applies to data sharing. By definition data you share is either already published or at least submitted for peer review. If someone reuses your data for something else they have to cite you and give you credit. In many situations they may even do it in collaboration with you which could lead to coauthorship. More importantly, if the scooped result is so easily obtained that somebody beats you to it despite your head start (it’s your data, regardless of how well documented it is you will always know it better than some stranger) then perhaps you should have thought about that sooner. You could have held back on your first publication and combined the analyses. Or, if it really makes more sense to publish the data in separate papers, then you could perhaps declare that the full data set will be shared after the second one is published. I don’t really think this is necessary but I would accept that argument.

Either way, I don’t believe being scooped by data sharing is very realistic and any cases of that happening must be extremely rare. But please share these stories if you have them to prove me wrong! If you prefer, you can post it anonymously on the Neuroscience Devils. That’s what I created that website for.

Fear of errors being discovered

I’m sure everyone can understand that fear. It can be embarrassing to have your errors (and we all make mistakes) discovered – at least if they are errors with big consequences. Part of the problem is also that all too often the discovery of errors is associated with some malice. To err is human, to forgive divine. We really need to stop treating every revelation of somebody’s mistakes (or, for that matter, every failure to replicate somebody’s findings) as an implication of sloppy science or malpractice. Sometimes (usually?) mistakes are just mistakes.

Probably nobody wants to have all of their data combed by vengeful sleuths nitpicking every tiny detail. If that becomes excessive and the same person is targeted, it could border on harassment and that should be counteracted. In-depth scrutiny of all the data by a particular researcher should be a special case that only happens when there is a substantial reason, say, in a fraud investigation. I would hope though that these cases are also rare.

And surely nobody can seriously want the scientific record to be littered with false findings, artifacts, and coding errors. I am not happy if someone tells me I made a serious error but I would nonetheless be grateful to them for telling me! It has happened before when lab members or collaborators spotted mistakes I made. In turn I have spotted mistakes colleagues made. None of this would have been possible if we didn’t share our data and methods amongst one another. I am always surprised when I hear how uncommon this seems to be in some labs. Labs should be collaborative, and so should science as a whole. And as I already said, organising and documenting your data actually helps you to spot errors before the work is published. If anything, data sharing reduces mistakes.

Ethical issues with patient confidentiality

This is a big concern – and the only one that I have full sympathy with. But all of our ethics and data protection applications actually discuss this. The only data that is shared should be anonymised. Participants should only be identified by unique codes that only the researchers who collected the data have access to. For a lot of psychology or other behavioural experiments this shouldn’t be hard to achieve.
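As a rough illustration of the unique-code idea, here is a minimal Python sketch. Everything in it is hypothetical: the function name `pseudonymise`, the field names, and the format of the codes are made up for the example and are not taken from any particular ethics protocol or tool. It only strips the direct identifier; it does nothing about indirect identifiers such as age or handedness.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace the identifying field of each record with a random code.

    Returns (shared, key). `shared` is the list that could be made public.
    `key` maps code -> original identifier and must stay with the
    researchers who collected the data.
    """
    key = {}        # code -> identity (kept private)
    code_for = {}   # identity -> code (so repeat sessions get the same code)
    shared = []
    for record in records:
        identity = record[id_field]
        if identity not in code_for:
            code = "P" + secrets.token_hex(4)
            while code in key:  # regenerate in the unlikely event of a clash
                code = "P" + secrets.token_hex(4)
            code_for[identity] = code
            key[code] = identity
        # Copy every field except the identifying one.
        clean = {k: v for k, v in record.items() if k != id_field}
        clean["participant"] = code_for[identity]
        shared.append(clean)
    return shared, key


# Hypothetical example data: two sessions from one person, one from another.
raw = [
    {"name": "Alice Example", "session": 1, "rt_ms": 512},
    {"name": "Bob Example", "session": 1, "rt_ms": 478},
    {"name": "Alice Example", "session": 2, "rt_ms": 495},
]
shared, key = pseudonymise(raw)
```

The codes are random rather than derived from the name (for instance by hashing), so the shared file alone gives no route back to the person; only whoever holds `key` can re-identify a participant.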

Neuroimaging or biological data are a different story. I have a strict rule for my own results. We do not upload the actual brain images of our fMRI experiments to public repositories. While under certain conditions I am willing to share such data upon request as long as the participant’s name has been removed, I don’t think it is safe to make those data permanently available to the entire internet. Participant confidentiality must trump the need for transparency. It simply is not possible to remove all identifying information from these files. Skull-stripping, which removes the head tissues from an MRI scan except for the brain, does not remove all identifying information. Brains are like fingerprints and they can easily be matched up, if you have the required data. As someone* recently said in a discussion of this issue, the undergrad you are scanning in your experiment now may be Prime Minister in 20-30 years’ time. They definitely didn’t consent to their brain scans being available to anyone. It may not take much to identify a person’s data using only their age, gender, handedness, and a basic model of their head shape derived from their brain scan. We must also keep in mind what additional data mining may be possible in the coming decades that we simply have no idea about yet. Nobody can know what information could be gleaned from these data, say, about health risks or personality factors. Sharing this without very clear informed consent (that many people probably wouldn’t give) is in my view irresponsible.

I also don’t believe that for most purposes this is even necessary. Most neuroimaging studies involve group analyses. In those you first spatially normalise the images of each participant and then perform statistical analysis across participants. It is perfectly reasonable to make those group results available. For the purpose of non-parametric permutation analyses (also in the news recently) you would want to share individual data points but even there you can probably share images after sufficient processing that not much incidental information is left (e.g. condition contrast images). In our own work, these considerations don’t apply. We conduct almost all our analyses in the participant’s native brain space. As such we decided to only share the participants’ data projected on a cortical reconstruction. These data contain the functional results for every relevant voxel after motion correction and signal filtering. No, this isn’t raw data but it is sufficient to reproduce the results and it is also sufficient for applying different analyses. I’d wager that for almost all purposes this is more than enough. And again, if someone were to be interested in applying different motion correction or filtering methods, this would be a negotiable situation. But I don’t think we need to allow unrestricted permanent access for such highly unlikely purposes.
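To make concrete why permutation analyses need individual data points rather than just a group map, here is a small Python sketch of a one-sample sign-flip permutation test. It is one common way of doing such a test, not necessarily the method used in any particular study, and the numbers are invented: imagine one contrast value per participant taken from a single voxel or region. The function name `sign_flip_p_value` is made up for this example, and the test assumes the values are symmetric around zero under the null hypothesis.

```python
from itertools import product


def sign_flip_p_value(contrasts):
    """Exact two-sided sign-flip permutation test that the mean is zero.

    Enumerates all 2**n sign assignments, so it is only practical for
    small n; larger samples would use a random subset of sign flips.
    """
    n = len(contrasts)
    observed = abs(sum(contrasts) / n)
    at_least_as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        flipped_mean = abs(sum(s * c for s, c in zip(signs, contrasts)) / n)
        # Small tolerance so floating-point ties count as "as extreme".
        if flipped_mean >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Invented per-participant contrast values for eight participants.
values = [1, 2, 3, 4, 5, 6, 7, 8]
p = sign_flip_p_value(values)
```

Nothing here can be computed from the group mean alone: the null distribution is built by flipping the sign of each participant's value, which is exactly why the per-participant contrast images have to be shared.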

Basically, rather than sharing all raw data I think we need to treat each data set on a case-by-case basis and weigh the risks against benefits. What should be mandatory in my view is sharing all data after default processing that is needed to reproduce the published results.

People with agendas and freeloaders

Finally a few words about a combination of points 2 and 6 in Dorothy Bishop’s list. When it comes to controversial topics (e.g. climate change, chronic fatigue syndrome, to name a few examples where this apparently happened) there could perhaps be the danger that people with shady motivations will reanalyse and nitpick the data to find fault with them and discredit the researcher. More generally, people with limited expertise may conduct poor reanalyses. Since failed reanalyses (and again, the same applies to failed replications) often cause quite a stir and are frequently discussed as evidence that the original claims were false, this could indeed be a problem. Also some will perceive these cases as “data tourism”, using somebody else’s hard-won results for quick personal gain – say by making a name for themselves as a cunning data detective.

There can be some truth in that and for that reason I feel we really have to work harder to change the culture of scientific discourse. We must resist the bias to agree with the “accuser” in these situations. (Don’t pretend you don’t have this bias because we all do. Maybe not in all cases but in many cases…)

Of course skepticism is good. Scientists should be skeptical but the skepticism should apply to all claims (see also this post by Neuroskeptic on this issue). If somebody reanalyses somebody else’s data using a different method that does not automatically make them right and the original author wrong. If somebody fails to replicate a finding, that doesn’t mean that finding was false.

Science thrives on discussion and disagreement. The critical thing is that the discussion is transparent and public. Anyone who has an interest should have the opportunity to follow it. Anyone who is skeptical of the authors’ or the reanalysers’/replicators’ claims should be able to check for themselves.

And the only way to achieve this level of openness is Open Data.

 

* They will remain anonymous unless they want to join this debate.