Uncertain times

On 23rd June 2016 the citizens of the United Kingdom (plus immigrants from Commonwealth nations and the Republic of Ireland) will vote to decide if the UK should remain a member of the European Union. Colloquially, this is known as the “Brexit” debate. I refuse to use this horrible term again, not only because it sounds like a breakfast cereal but also because it’s a misnomer: the decision is not about Britain but the whole of the UK.

Let me be straight: I am a strong and vocal supporter of the UK staying in the EU. As an immigrant (I also won’t use the offensive term “migrant”) from an EU country, a Leave vote would have direct consequences for my life in the country I have called home for almost two decades. I would inevitably lose some of the rights I have enjoyed throughout that time. The EU is what made it possible for me to study, live and work here, it allowed me to spend a year in yet another EU country during my studies, and it made my life easier in countless ways, not least the simplicity of crossing borders. All of these things apply to all EU citizens, so from a purely selfish perspective all of them should support remaining. A lot of things we take for granted are a direct consequence of the civil liberties the EU guarantees.

The whole public debate surrounding this issue has been characterised by panic-mongering and inane bickering from both sides. From deliberate obfuscations (“If we leave the EU every household will lose £4,300!”) to outright lies (“We pay £350 million a week to Brussels!”), both camps are painting nightmare scenarios of what will happen if the other side wins. Add to that all the rubbish Boris Johnson dreams up on a typical day, which is too delusional to even constitute a lie.

The truth of the matter is this: nobody has a damn clue what will happen if the UK leaves the EU. There is no precedent for a country leaving the bloc. Some European countries like Switzerland, Norway, and Liechtenstein are not EU members, so they can give us an idea of what the relationship between the UK and the remainder of the EU could look like in the future. However, these are all very different countries from the UK. They have far smaller populations and very different economies and societies, and they have always existed outside the EU whilst the UK has for decades been an integral, albeit reluctant, member. The only thing Liechtenstein has in common with the UK is its national anthem. The one thing we do know is that most of these countries have a relationship with the EU that permits free movement of people. Since one of the main arguments put forth in support of leaving the EU is “regaining control of our borders,” it seems very unlikely that any dramatic change in border control can be achieved this way. Nevertheless, we can’t know what will happen.

This is why it worries me and why it should worry you too. The future is very uncertain and leaving the EU would be a very big risk. The doomsday scenarios painted by either side are extreme and, frankly, an insult to our collective intelligence. Anyone who tells you what terrible consequences a Leave or Remain decision will have is either lying or – at best – completely delusional. In any case, you shouldn’t listen to them. The Remain camp have one thing going for them though: they support the status quo, and voting to Remain is the conservative decision. Whatever propaganda Leave proponents may spout about it, it is unlikely that any of their horrors will become a reality if the UK stays in the EU. Rather, things will presumably not change much from how they are now. Voting Leave is by far the riskier and more radical thing to do.

That said, I doubt it will have disastrous consequences either. The initial negotiations will be difficult and cumbersome. The process of leaving will also cost a lot of money, at least in the short run, money that won’t be saved by not paying into EU coffers (for one thing because the UK will continue to pay while these negotiations take place). For science and technology, I believe the consequences will be painful: it will become much harder to access large European research grants (which the UK also cannot simply make up for by no longer paying in) and – what is worse – the added bureaucracy and administrative burden of turning skilled workers from EU countries into immigrants from overseas will stifle collaboration to some extent. So no, leaving the EU will probably not ruin the UK. But don’t fool yourself into thinking that it won’t have bad consequences.

As for myself, I recently started the process of naturalisation to become a British citizen. I had originally wanted to complete this before the referendum so I could vote in it. I held back on this for ages because for many years my country of origin did not allow dual citizenship (guess which supranational organisation is to thank for that being possible now?) and also because it’s damned expensive. Now I am too late to get there before 23rd June. If the UK stays in the EU, I will most likely finish the process. This place is my home and I am tired of being subject to taxation without representation. I feel it’s about time I was able to fully shape the future of this country with my votes.

Of course, gaining citizenship will be far more useful if the UK does leave the EU. People like me would most likely lose the right to vote in the elections we have been able to vote in until now. It is far less clear how residence rights would change. As I already said, if the UK remains in the EEA, free movement rights are unlikely to be affected at all. But if “control of the borders” is “regained” this would change. Will our automatic permanent resident status be carried over? It does seem improbable that we would suddenly be asked to apply for visas or indefinite leave to remain, as such a change would result in complete chaos. It is quite likely though that some additional bureaucratic hurdles would be erected, because that’s just what governments do. But honestly, I don’t think I’ll go through with naturalisation if the UK leaves the EU. Melodramatic as it sounds (because screw it, this whole debate has been plagued with melodrama and over-emotional rhetoric), a vote to Leave is a statement that people like me are not really welcome in this country. If the UK votes to Leave, I will most likely choose to leave the UK.

But don’t get me wrong. I do think this referendum is a good idea. For one thing, it is democratic. More importantly, I don’t actually think that Britons will vote to Leave. The polls suggest a close race but the Remain camp has been steadily ahead. As the graph below shows, the fewer people answer “Don’t know” in a poll, the farther the Remain vote is ahead of the Leave vote. It is probably simplistic to read too much into this because the relationship must depend on the particular poll, but it is consistent with the interpretation that people vote conservatively. Undecided voters are unlikely to choose the radical option on referendum day.

In my view, this referendum is actually long overdue. If the Leaves have it, then yes, the British will have spoken that European integration has gone too far for them. But if the Remain votes win, this will hopefully put to rest at last the common Eurosceptic assertion that the UK originally only “voted to join the Common Market.” It is about bloody time that this discussion moved into the 21st century.

[Figure: EuroPolls]
Difference between Remain and Leave votes plotted against the percentage of Don’t know votes. (Polls without Don’t know votes have been removed.) Source: Poll of Polls
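For anyone who wants to play with this relationship themselves, here is a minimal sketch of how such a plot can be produced. The poll numbers below are made up purely for illustration; the real data are available from the Poll of Polls:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical poll results, purely for illustration (NOT real polling data).
# Each tuple: (% Remain, % Leave, % Don't know).
polls = [(44, 42, 14), (45, 40, 15), (48, 41, 11),
         (51, 40, 9), (43, 41, 16), (45, 43, 12)]

dont_know = np.array([p[2] for p in polls])    # % undecided
lead = np.array([p[0] - p[1] for p in polls])  # Remain minus Leave

plt.scatter(dont_know, lead)
plt.xlabel("% Don't know")
plt.ylabel("Remain lead (percentage points)")
plt.show()
```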


3 scoops of vanilla science in a low impact waffle please

A lot of young[1] researchers are worried about being “scooped”. No, this is not about something unpleasantly kinky but about when another lab publishes an experiment very similar to yours before you do. Sometimes this is more than just a worry and it actually happens. I know that this can be depressing. You’ve invested months or years of work and sleepless nights in this project and then somebody else comes along and publishes something similar and – poof – all the novelty is gone. Your science career is over. You will never publish this high impact now. You won’t ever get a grant. Immeasurable effort down the drain. Might as well give up, sell your soul to the Devil, and get a slave job in the pharmaceutical industry and get rich[2].

Except that this is total crap. There is no such thing as being scooped in this way, or at least if there is, it is not the end of your scientific career. In this post I want to briefly explain why I think so. This won’t be a lecture on the merits of open science, on replications, on how we should care more about the truth than about novelty and “sexiness”. All of these things are undoubtedly true in my mind and they are things we as a community should be actively working to change – but this is no help to young scientists who are still trying to make a name for themselves in a system that continues to reward high impact publications over substance.

No. Here I will talk about this issue with respect to the status quo. Even in the current system, imperfect as it may be, this irrational fear is in my view unfounded. It is essential to dispel these myths about impact and novelty, about how precedence is tied to your career prospects. Early career scientists are the future of science. How can we ever hope to change science for the better if we allow this sort of madness to live on in the next generation of scientists? I say ‘live on’ for good reason – I, too, used to suffer from this madness when I was a graduate student and postdoc.

Why did I have this madness? Honestly, I couldn’t say. Perhaps it’s a natural stage in the evolution of young researchers, at least in our current system. People like to point the finger at lab PIs pressuring you into this sort of crazy behaviour. But that wasn’t it for me. For most of my postdoc I worked with Geraint Rees at UCL and perhaps the best thing he ever told me was to fucking chill[3]. He taught me – more by example than words – that while having a successful career is useful, it is much more important to remember why you’re doing it: the point of having a (reasonably successful) science career is to be able to pay the rent/mortgage and take some enjoyment out of this life you’ve been given. The reason I do science, rather than making a mint in the pharma industry[4], is that I am genuinely curious and want to figure shit out.

Guess what? Neither of these things depends on whether somebody else publishes a similar (or even very similar) experiment while you’re still running it. We all know that novelty still matters to a lot of journals. Some have been very reluctant to publish replication attempts. I agree that publishing high impact papers does help wedge your foot in the door (that is, get you short-listed) in grant and job applications. But even if this were all that matters to be a successful scientist (and it really isn’t), here’s why you shouldn’t care too much about it anyway:

No paper was ever rejected because it was scooped

While journal editors will reject papers because they aren’t “novel,” I have never seen a paper rejected because somebody else published something similar a few months earlier. Most editors and reviewers will not even be aware of the scooping study. You may find this hard to believe because you think your own research is so timely and important, but statistically it is true. Of course, some reviewers will know of the work. But most reviewers are not actually bad people and will not say “Something like this was published three months ago already and therefore this is not interesting.” Again, you may find this hard to believe because we’ve all heard too many stories of Reviewer 2 being an asshole. But in the end, most people aren’t that big of an asshole[5]. It happens quite frequently that I suggest in reviews that the authors cite some recently published work (usually not my own, in case you were wondering) that is very similar to theirs. In my experience this has never led to a rejection; I merely ask them to put their results in the context of similar findings in the literature. You know, the way a Discussion section should be.

No two scooped studies are the same

You may think that the scooper’s experiment was very similar to yours, but unless they actually stole your idea (a whole different story, which I also don’t believe happens, but I have no time for that now…) and essentially pre-replicated (preclicated?) your design, I’d bet that there are still significant differences. Your study has not lost any of its value because of this. And it’s certainly no reason to quit and/or be depressed.

It’s actually a compliment

I’m not 100% sure about this one. Scientific curiosity shouldn’t have anything to do with a popularity contest if you ask me. Study whatever the hell you want to (within ethical limits, that is). But I admit, it is reassuring when other people agree that the research questions I find interesting are also interesting to them. For one thing, it means they will appreciate you working and (eventually) publishing on the topic, which again from a pragmatic point of view means that you can pay those rents/mortgages. And from a simple vanity perspective it is also reassuring to know that you’re not completely mad for pursuing a particular research question.

It has little to do with publishing high impact

Honestly, from what I can tell neither precedence nor the popularity of your topic are the critical factors in getting your work into high impact journals. The novelty of your techniques, how surprising and/or reassuringly expected your results are, and the simplicity of the narrative are the major factors. Moreover, the place you work, the co-authors with whom you write your papers, and the accessibility of the writing (in particular your cover letter to the editors!) definitely matter a great deal too (and these are not independent of the first points either…). It is quite possible that your “rival”[6] will publish first, but that doesn’t mean you won’t publish similar work in a higher impact journal. The outcome of journal review is pretty stochastic and not very predictable.

Actual decisions are not based on this

We all hear the horror stories of impact factors and h-indexes determining your success with grant applications and hiring decisions. Even if this were true (and I actually have my doubts that it is as black and white as this), a CV with lots of high impact publications may get your foot in the door – but it does not absolve the panel from making a hiring/funding decision. You need to do the work on that one yourself and even then luck may be against you (the odds certainly are). It also simply is not true that most people are looking for the person with the most Nature papers. Instead I bet you they are looking for people who can string together a coherent argument, communicate their ideas, and who have the drive and intellect to be a good researcher. Applicants with a long list of high impact papers may still come up with awful grant proposals or do terribly in job interviews while people with less stellar publication records can demonstrate their excellence in other ways. You may already have made a name for yourself in your field anyway, through conferences, social media, public engagement etc. This may matter far more than any high impact paper could.

There are more important things

And now we’re coming back to work-life balance and why you’re doing this in the first place. Honestly, who the hell cares whether someone else published this a few months earlier? Is being the first the reason you’re doing science? I can see the excitement of discovery but let’s face it, most of our research is not on the level of Einstein or Newton, nor are we discovering extraterrestrial life. Your discovery is no doubt exciting to you, it is hopefully exciting to some other scientists in your little bubble, and it may even be exciting to some journalist who will write a distorting, simplifying article about it for the mainstream news. But seriously, it’s not so groundbreaking that it is worth sacrificing your mental and physical health over. Live your life. Spend time with your family. Be good to your fellow creatures on this planet. By all means, don’t be complacent, and make sure you can make a living, but don’t pressure yourself into believing that publishing ultra-high impact papers is the meaning of life.

A positive suggestion for next time…

Now if you’re really worried about this sort of thing, why not preregister your experiment? I know I said I wouldn’t talk about open science here but bear with me just this once because this is a practical point you can implement today. As I keep saying, the whole discussion about preregistration is dominated by talk of “questionable research practices”, HARKing, and all that junk. Not that these aren’t worthwhile concerns, but it makes for a lot of negativity. There are plenty of positive reasons why preregistration can help, and the (irrational) fear of being scooped is one of them. Preregistration does not stop anyone else from publishing the same experiment before you, but it does allow you to demonstrate that you had the idea before they published it. With Registered Reports it becomes irrelevant whether someone else publishes before you, because your publication is guaranteed once the method has been reviewed. And I believe it will also make it far clearer to everyone how much – or how little – it actually matters in the big scheme of things who published what first and where.

[1] Actually there are a lot of old and experienced researchers who worry about this too. And that is far worse than when early career researchers do it because they should really know better and they shouldn’t feel the same career pressures.
[2] It may sound appealing now, but thinking about it I wouldn’t trade my current professional life for anything. Except for grant admin bureaucracy perhaps. I would happily give that up at any price… :/
[3] He didn’t quite say it in those terms.
[4] This doesn’t actually happen. If you want to make a mint you need to go into scientific publishing but the whole open science movement is screwing up that opportunity now as well so you may be out of luck!
[5] Don’t bombard me with “Reviewer 2 held up my paper to publish theirs first” stories. Unless Reviewer 2 signed their review or told you specifically that it was them I don’t take such stories at face value.
[6] The sooner we stop thinking of other scientists in those terms the better for all of us.

[Image: strawberry ice cream cone]

Started signing my reviews

As of this year, I have started signing my reviews. This decision has been a long time coming. A lot of people sign their reviews, so this is not a particularly newsworthy event, but I’ll tell you about it anyway, largely to have a record of when I started and also to explain my reasons.

To explain why, I first need to talk about why one might not want to sign peer reviews. The debate about whether or not to sign reviews has been raging for years. It divides people’s minds and regularly sparks up again. Even people who agree that the process of scientific research can be improved often fall into two camps whose opinions are diametrically opposed: one side fervently argues that all peer reviews should be transparent and signed, whilst the other argues with equal fervour that ideally all reviews should be double-blind, so that neither reviewers nor authors know each other’s identities.

Whenever someone suggests double-blind reviews, people are wont to argue that this simply doesn’t work in many situations. It is often possible to guess the authors from the research question and/or the methods used. If the authors previously presented the research at a conference, it is likely that reviewers will have already seen it in a preliminary form. That said, the very few times I did review in a double-blind manner I actually didn’t guess the authors’ identities, and in one case I was in fact reviewing the work of friends and collaborators without even knowing it. I’d like to think I would’ve been fair either way, but I must also admit that I was probably more sceptical and possibly less biased because I didn’t know who the authors were. Still, these cases are probably somewhat special – in many situations I would know the authors from the research or at least have a strong suspicion. The suspicion might also lead me to some erroneous assumptions, such as “These authors usually do this and that even though this isn’t mentioned here”. If my guess were actually wrong then this could skew my thought process unduly.

So I think double-blind reviewing is a bad idea. Now, many arguments have been brought forth as to why reviews should be anonymous. Anonymity can protect reviewers from the wrath of vengeful senior colleagues who might make unfair hiring or funding decisions because they didn’t like a review. There are a lot of arseholes in the world and this is certainly a possibility. But the truth is that anonymity doesn’t stop people from behaving in this way – and there is actually no compelling evidence that signed reviews make it worse. I have heard some harrowing tales from colleagues who were treated unfairly by major players in their fields because those players believed the colleagues had given their work a bad review. In one case, it was a PhD student of the assumed reviewer who received the ill treatment – and the assumption was entirely incorrect.

You also frequently hear people’s guesses about who they think Reviewer 2 was on their latest rejected manuscript, often based on circumstantial or generally weak evidence. One of my favourites is the age-old “He (because we know all reviewers are obviously male…) asked us to cite lots of his papers!” I am sure this happens, but I wonder how often the deduction is correct. I almost never ask people to cite my papers – if I do, it is because I feel they are directly relevant and citing them is the scholarly thing to do. It is far more likely that I ask people to cite the work of researchers whose work I know well, when it is relevant. In many cases when people just “know” that Reviewer 2 is Professor X because they want X to be cited, it seems to me far more likely that the reviewer is one of Professor X’s postdocs or former students. In many cases, it may also be that Professor X’s work is an established part of the literature and thus, in the interest of scholarship, an unbiased reviewer will think it deserves to be cited even if you think Professor X’s work is rubbish. In short, I find these insane guessing games rather tedious and potentially quite damaging.

The first time I signed a review was when I reviewed for F1000Research, where signing is mandatory. (I had already reviewed at Frontiers a few times, where reviewer identities are public, but I don’t think this counts: reviews aren’t signed upon submission but only after publication of the paper. Moreover, the match between review and reviewer remains ambiguous.) I must say that reviewing this paper entirely in public was a rather uplifting experience. At all stages of the process I felt the communication between me and the authors was amicable and sensible in spite of the harshness of my decisions. I have also been led to believe that the authors appreciated my scepticism (although only they can tell you that for sure).

By signing I may have also been more polite than I would have been if my review were anonymous. I am not entirely convinced of this argument because I typically try to be polite anyway. There are a lot of dickheads out there who aren’t polite even when their identity is public :P. I also don’t buy that anonymous reviewers aren’t accountable and that the quality of reviews suffers as a result. Your review is still read by at least one editor – unless that editor is your close personal friend (which is still rare for me at least), someone is checking your review both for factual quality and politeness.

Either way, I did not perceive any adverse consequences of signing my reviews. If anything, it made me think harder about how I would write my review and double-check the arguments I was making. Scientists should criticise and scrutinise each other. By this I don’t mean you should mistrust people’s intentions or question their competence. But science is fuelled by scepticism and you should challenge anything that doesn’t make sense. I have certainly done so in my collaborations in the past (often to the frustration of my collaborators) and I try to encourage this in my own lab. I would much rather have a student or postdoc who tells me that my idea makes no sense than someone who does everything I say. Researchers also do this at conferences when they discuss each other’s research. One of my most positive experiences at a conference was a rather intense – but very polite – discussion at a poster. Why can’t we do the same in paper reviews?

If I’m perfectly honest, the main reason I hadn’t signed reviews until now is that I was raised that way. Almost none of the reviews I ever received were signed – certainly none of the negative ones. Some reviewers (including very critical ones) revealed their identities after the manuscripts had been accepted for publication and I have done the same in some cases. But the status quo of my field was always that reviews were anonymous and that’s just how it was. Challenging this seemed to go against nature – but that really isn’t true. Whether or not reviews are signed is a question of culture, not nature. And I want to change this culture.

Signing reviews is a personal choice. I don’t think it should ever become mandatory. For one thing, I’m a libertarian (just to be clear, I’m not one of the delusional tea party types) and I don’t believe we should force people to do things that aren’t necessary. I don’t think signed reviews are necessary. I think making all review contents public would be an essential improvement to peer review, with or without signing. But signing reviews can be a positive development and I believe it should be encouraged. It has certainly been a positive development for me, and this is why I think everyone should be free to take this step of their own accord. Signing my first reviews has been a strangely liberating experience. I don’t know whether it will provoke the ire of powerful senior colleagues. In a few years’ time I may post an update about my experience. Somehow I doubt it will turn out to be a problem.

[Image: coconut]

Why Gilbert et al. are missing the point

This (hopefully brief) post is yet again related to the replicability debate I discussed in my previous post. I just read a response by Gilbert et al. to the blog comments about their reply to the reply to the reply to the (in my view, misnamed) Reproducibility Project: Psychology. I won’t go over all of this again. I also won’t discuss the minutiae of the statistical issues, as many others have already done so and will no doubt do so again. I just want to say briefly why I believe they are missing the point:

The main argument put forth by Gilbert et al. is that there is no evidence for a replicability crisis in psychology and that the “conclusions” of the RPP are thus unfounded. I don’t think that the RPP ever claimed anything of the kind one way or the other (in fact, I was impressed by the modesty of the claims made by the RPP study when I read it) but I’ll leave that aside. I appreciate what Gilbert et al. are trying to do. I have myself frequently argued a contrarian position in these discussions (albeit not always entirely seriously). I am trying to view this whole debate the same way any scientist should: by evaluating the evidence without any investment in the answer. For that reason, the debate they have raised seems worthwhile. They tried to estimate a baseline level of replicability one could expect from psychology studies. I don’t think they’ve done it correctly (for statistical reasons) but I appreciate that they are talking about this. This is certainly what we would want to do in any other situation.

Unfortunately, it isn’t that simple. Even if there were no problems with publication bias, analytical flexibility, and low statistical power (and we can probably agree that this is not a tenable assumption), it wouldn’t be straightforward to estimate how many psychology studies should replicate. In order to know this you would need to know how many of the tested hypotheses are true, and we usually don’t. As Einstein said – or at least the internet tells me he did: “If we knew what it was we were doing, it would not be called research, would it?”
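To make this concrete, here is a back-of-the-envelope sketch. All numbers are assumptions picked for illustration, not estimates of the actual state of psychology; the point is only how strongly the expected replication rate depends on the unknown proportion of true hypotheses:

```python
# Expected replication rate under idealised conditions (illustrative numbers only)
prior = 0.30  # assumed proportion of tested hypotheses that are actually true
power = 0.60  # assumed power of both original and replication studies
alpha = 0.05  # false positive rate

# Proportion of significant original findings that reflect true effects (PPV):
ppv = (power * prior) / (power * prior + alpha * (1 - prior))

# Expected proportion of published findings that replicate significantly:
rep_rate = ppv * power + (1 - ppv) * alpha

print(f"PPV: {ppv:.2f}")                             # ~0.84
print(f"Expected replication rate: {rep_rate:.2f}")  # ~0.51
```

Under these made-up numbers only about half of all published findings would be expected to replicate even though nobody did anything wrong – and changing the assumed prior changes that figure dramatically.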

One of the main points they bring up is that some of the replications in the RPP may have used inappropriate procedures to test the original hypotheses. I agree this is a valid concern – but it also undermines the very argument they are trying to make, since their estimate of expected replicability is derived from those same replications. Instead of quibbling about what level of replication rate constitutes evidence for a “crisis” (a completely subjective judgement), let’s look at the data:

[Figure: scatter plot from the RPP of original vs replication effect sizes]

This scatter graph from the RPP plots effect sizes in the replications against the originally reported ones. Green (referred to as “blue” by the presumably colour-blind art editors) points are replications that turned out significant, red ones are those that were not significant and thus “failed to replicate.” The separation of the two data clouds is fairly obvious. Significant replication effects have a clear linear relationship with the original ones. Non-significant ones are uncorrelated with the original effect sizes.

We can argue until the cows come home about what this means. The red points are presumably in large part false positives. Yes, some – perhaps many – may be due to methodological differences or hidden moderators etc. There is no way to quantify this reliably. And conversely, a lot of the green dots probably don’t tell us about any cosmic truths. While they replicate wonderfully, they may just be replicating the same errors and artefacts. All of these arguments are undoubtedly valid.

But that’s not the point. When we test the reliability of something we should aim for high fidelity. Of course, perfect reliability is impossible so there must be some scatter around the identity line. We also know that there will always be false positives so there should be some data points scattering around the x-axis. But do you honestly think it should be as many as in that scatter graph? Even if these are not all false positives in the original but rather false negatives in the replication, for instance because the replicators did a poor job or there were unknown factors we don’t yet understand, this ratio of green to red dots is not very encouraging.
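For intuition, here is a minimal simulation – a sketch with assumed parameters, not a model fitted to the RPP. Suppose only a minority of tested hypotheses are true, studies are modestly powered, and originals are only published when significant. The result is a scatter graph qualitatively much like the one above: significant replications tracking the original effects, non-significant ones hovering around zero:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
n, n_studies, d_true = 30, 2000, 0.5  # per-group n, candidate studies, assumed true effect

def cohens_d(a, b):
    """Standardised mean difference between two equal-sized groups."""
    return (b.mean() - a.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)

orig_d, rep_d, rep_sig = [], [], []
for _ in range(n_studies):
    d = d_true if rng.random() < 0.3 else 0.0          # assume 30% of hypotheses are true
    a, b = rng.normal(0, 1, n), rng.normal(d, 1, n)    # original experiment
    if stats.ttest_ind(b, a).pvalue >= 0.05:
        continue                                       # publication bias: the file drawer
    a2, b2 = rng.normal(0, 1, n), rng.normal(d, 1, n)  # replication attempt
    orig_d.append(cohens_d(a, b))
    rep_d.append(cohens_d(a2, b2))
    rep_sig.append(stats.ttest_ind(b2, a2).pvalue < 0.05)

orig_d, rep_d, rep_sig = map(np.array, (orig_d, rep_d, rep_sig))
plt.scatter(orig_d[rep_sig], rep_d[rep_sig], c="green", label="replication significant")
plt.scatter(orig_d[~rep_sig], rep_d[~rep_sig], c="red", label="replication not significant")
plt.xlabel("original effect size"); plt.ylabel("replication effect size")
plt.legend(); plt.show()
```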

Replicability encompasses all of the aforementioned explanations. When I read a scientific finding I don’t expect it to be “true.” Even if the underlying effects are real, the explanation for them can be utterly wrong. But we should expect a level of replicability from a field of research that at least maximises the trustworthiness of the reported findings. Any which way you look at it, this scatter graph is unsettling: if two thirds of the dots are red because of low statistical power and publication bias in the original studies, this is a major problem. But if they are red because the replications are somehow defective, this isn’t exactly a great argument either. What this shows is that the way psychology studies are currently done does not permit very reliable replication. Either way, if you give me a psychology study I should probably bet against it replicating. Does anyone think that’s an acceptable state of affairs?

I am sure both of these issues play a role, but the encouraging thing is that it is probably the former, false positives, that dominates after all. In my opinion the best way anyone has looked at the RPP data so far is Alex Etz’s Bayesian reanalysis. It suggests that one of the main reasons the replicability in the RPP is so underwhelming is that the level of evidence for the original effects was weak to begin with. This speaks for false positives (due to low power, publication bias, QRPs) and against unknown moderators being behind most of the replication failures. Believe it or not, this is actually a good thing – because it is much easier to address the former problem than the latter.

The (non-)replicability of soft science

Since last night the internet has been all atwitter about a commentary* by Dan Gilbert and colleagues on the recent and (in my view) misnamed Reproducibility Project: Psychology. In this commentary, Gilbert et al. criticise the RPP on a number of technical grounds, asserting that the sampling was non-random and biased, and that the conclusion of a replicability crisis in psychology – particularly as drawn in the coverage by the science media and blogosphere – is unfounded. Some of their points are rather questionable to say the least, and some, like their interpretation of confidence intervals, are statistically simply wrong. But I won’t talk about that here.

One point they raise is the oft-repeated argument that the replications differed in some way from the original research. We’ve discussed this ad nauseam in the past and there is little point going over it again. Exact replication of the methods and conditions of an original experiment tests the replicability of a finding. Indirect replications, loosely testing similar hypotheses, instead inform us about the generalisability of the idea, which in turn tells us about the robustness of the processes we posit. Everybody (hopefully) knows this. Both are important aspects of scientific progress.

The main problem is that most debates about replicability go down the same road, with people arguing about whether the replication was of sufficient quality to yield interpretable results. One example by Gilbert and co is that one of the replications in the RPP used the same video stimuli as the original study, even though the original study was conducted in the US while the replication was carried out in the Netherlands, and the dependent variable was related to something that had no relevance to the participants in the replication (race relations and affirmative action). Other examples like this were brought up in previous debates about replication studies. A similar argument has also been made about the differences in language context between the original Bargh social priming studies and the replications. In my view, some of these points have merit and the example raised by Gilbert et al. is certainly worth a facepalm or two. It does seem mind-boggling how anyone could have thought it valid to replicate a result about a US-specific issue in a liberal European country whilst using the original stimuli in English.

But what this example illustrates is a much larger problem, and in my mind this is actually the crux of the matter: psychology, or at least most traditional forms of psychology, does not lend itself very well to replication. As I am wont to point out, I am not a psychologist but a neuroscientist. I do work in a psychology department, however, and my field obviously has considerable overlap with traditional psychology. I also think many subfields of experimental psychology work in much the same way as other, so-called “harder” sciences. This is not to say that neuroscience, psychophysics, or other fields do not also have problems with replicability, publication bias, and the other concerns that plague science as a whole. We know they do. But the social sciences, the loftier sides of psychology dealing with vague concepts of the mind and psyche, in my view have an additional problem: they lack the lawful regularity of effects that scientific discovery requires.

For example, we are currently conducting an fMRI experiment in which we replicate a previous finding. We are using the approach I have long advocated: design experiments that both replicate a previous result and address a novel question. The details of the experiment are not very important. (If we ever complete this experiment and publish it you can read about it then…) What matters is that we very closely replicate the methods of a study from 2012, and that study closely replicated the methods of one from 2008. The results are pretty consistent across all three instances of the experiment. The 2012 study provided a somewhat alternative interpretation of the findings of the 2008 one. Our experiment now adds more spatially sensitive methods to paint yet another somewhat different picture. Since we’re not finished with it I can’t tell you how interesting this difference is. It is however already blatantly obvious that the general finding is the same. Had we analysed our experiment in the same way as the 2008 study, we would have reached the same conclusions they did.

The whole idea of science is to find regularities in our complex observations of the world, to uncover lawfulness in the chaos. The entire empirical approach is based on the idea that I can perform an experiment with particular parameters and repeat it with the same results, blurred somewhat by random chance. Estimating the generalisability allows me to understand how tweaking the parameters affects the results, and thus to determine the laws that govern the whole system.

And right there is where much of psychology has a big problem. I agree with Gilbert et al. that repeating a social effect observed in US participants with identical methods in Dutch participants is not a direct replication. But what would be? They discuss how the same experiment was then repeated in the US and found results weakly consistent with the original findings. But this isn’t a direct replication either. It does not suffer from the same cultural and language differences as the replication in the Netherlands did, but it has other contextual discrepancies. Even repeating exactly the same experiment in the original Stanford(?) population would not necessarily be equivalent because of the time that has passed and the way cultural factors have changed. A direct replication is simply not possible.

For all the failings that all fields of science have, this is a problem my research area does not suffer from (and to clarify: “my field” is not all of cognitive neuroscience, much of which is essentially straight-up psychology with the brain tagged on; and while I don’t see myself as a psychologist, I certainly acknowledge that my research also involves psychology). Our experiment is done on people living in London. The 2012 study was presumably done mainly on Belgians in Belgium. As far as I know the 2008 study was run in the mid-western US. We are asking a question that deals with a fairly fundamental aspect of human brain function. This does not mean that there aren’t any population differences, but our prior for such things affecting the results in a very substantial way is pretty small. Similarly, the methods can certainly modulate the results somewhat, but I would expect the effects to be fairly robust to minor methodological changes. In fact, whenever we see that small changes in the method (say, the stimulus duration or the particular scanning sequence used) seem to obliterate a result completely, my first instinct is usually that such a finding is non-robust and thus unlikely to be meaningful.

From where I’m standing, social and other forms of traditional psychology can’t say the same. Small contextual or methodological differences can quite likely skew the results because the mind is a damn complex thing. For that reason alone, we should expect psychology to have low replicability and the effect sizes should be pretty small (i.e. smaller than what is common in the literature) because they will always be diluted by a multitude of independent factors. Perhaps more than any other field, psychology can benefit from preregistering experimental protocols to delineate the exploratory garden-path from hypothesis-driven confirmatory results.

I agree that a direct replication of a contextually dependent effect in a different country and at a different time makes little sense, but that is no excuse. If you just say that the effects are so context-specific that it is difficult to replicate them, you are bound to end up chasing lots of phantoms. And that isn’t science – not even a “soft” one.

[Image: “Purity”]
Then again, all fields of science are “soft”
* At first I thought the commentary was due to be published by Science on 4th March and embargoed until that date. However, it turns out to be more complicated than that: the commentary I am discussing here is not the Science article but Gilbert et al.’s reply to Nosek et al.’s reply to Gilbert et al.’s reply to the RPP (confused yet?). It appeared on a website and then swiftly vanished again. I am not sure I should post it here because the authors evidently didn’t want it to be public. In any case, having that article is not essential to understanding my post.

On studying precognition

Today I received an email from somebody who had read some of my discussions of Psi research. They made an interesting point that has so far been neglected in most of the debates I have participated in. With their permission I am posting their email (without identifying information) and my response to it. I hope this clarifies my views:

I also have experienced real and significant episodes of precognition. After many experiences I researched my ancestry and found relatives who had histories of episodes of precognition. The studies I have read that claim precognition is not real all have the same error. I can’t pick a card, I can’t tell you what the next sound will be. Precognition does not work like that. I will demonstrate with this example.
I was standing at the front desk at work when I got a terrible feeling something was wrong. I didn’t know what. I called a friend and told him something is wrong. I began a one hour drive home and continued talking to my friend. The feeling that something was wrong grew to an increasing level as I approached the river. I saw a city bus parked on the side of the road. Many vehicles were down by the river. I passed that scene and then told my friend that a child was now drowned and she was close to the bridge about 1/2 a mile down river. The next day the TV news confirmed that she was found at the bridge down river.
No one told me there was a drowning, no one told me it was a girl, no one knew she was floating by the bridge.
This type of thing happens to me regularly. I believe it results from the same thing that will stampede cattle. I think humans communicate through speech and other forms of non verbal communication. I think somehow I am able to know what the herd is thinking or saying without being there. I think the reason I got the feeling something was wrong had to do with the escalating fear and crying out of the people who were madly searching for the child who fell in the river.
So trying to study precognition by getting a person to predict the next card will never work. Look at the reality of how it happens and see if you can study it a different way.

My response to this:

Thank you for your email. I’d say we are in greater agreement than you may think. What I have written on my blog and in the scientific literature about precognition/telepathy/presentiment pertains strictly to the scientific experiments that have been done on these paranormal abilities, usually with the sole aim to prove their existence. You say you “can’t pick a card” etc – tell that to the researchers who believe that showing a very subtle difference from chance performance on such simple experiments is evidence for precognition.
Now, do I believe you have precognition? No, I don’t. The experiences you describe are not uncommon but they may be uncommonly frequent for you. Nevertheless they are anecdotal evidence and my first hunch would be to suspect cognitive biases that we know can masquerade as paranormal abilities. There may also be cognitive processes we currently simply have no understanding of. How we remember our own thoughts is still very poorly understood. The perception of causality is a fascinating topic. We know we can induce causality illusions but this line of inquiry is still in its infancy.
But I cannot be certain of this. Perhaps you do have precognition. I don’t have any intention to convince you that you don’t; I only want to clarify why I don’t believe it, certainly not based on the limited information I have. The main issue here is that your precognition is unfalsifiable. You say yourself that “Precognition does not work like that.” If it does not occur with the same regularity as other natural phenomena, it isn’t amenable to scientific study. Psi researchers believe that precognition etc have that regularity and so they think you can demonstrate it with card-picking experiments. My primary argument is about that line of thinking.
I am not one of those scientists who feel the need to tell everyone what to believe. Such people are just as irritating as religious fundamentalists who seek to convert everybody. If a belief is unfalsifiable, like the existence of God or your belief in your precognition, then it falls outside the realm of science. I have no problem with you believing that you have precognition, certainly as long as it doesn’t cause any harm to anyone. But unless we can construct a falsifiable hypothesis, science has no place in it.

No Tea in Schwarzkopf

Yesterday I came across* this essay by Etzel Cardeña entitled “The unbearable fear of psi: On scientific censorship in the 21st century” in the Journal of Scientific Exploration, an outlet that frequently publishes parapsychology studies. In this essay he bemoans the “suppression” of Psi research by the scientific establishment. I have noticed (personal opinion) that Psi researchers tend to have a bit of a persecution complex, although some of their concerns may very well be justified. Seriously entertaining the hypothesis that there is precognition or telepathy is often met with ridicule and I can imagine that it could make life with “mainstream” scientists harder. At the same time I am almost certain that the claims of this suppression are vastly overstated and they don’t make Psi researchers the 21st-century Galileos or Giordano Brunos.

[Figure 1 from my Frontiers in Human Neuroscience commentary]
Psi is like the thing on the right…

In fact, in a commentary on a Psi study that I published two years ago, I tried to outline specifically what differs between Galileo’s theories and the Psi hypothesis: the principle of parsimony. Heliocentrism may have faced dogmatic opposition from the religious establishment because it threatened their power and worldview. However, it is nonetheless the better model for explaining observations of nature whilst being consistent with the current state of our knowledge. This is why it eventually succeeded against all opposition. The truth will out. Science is self-correcting, even if it can take a long time and painful revolutions to get there. The same does not apply to the Psi hypothesis, because Psi doesn’t explain anything. Rather, Psi is the absence of an explanation: it merely posits that there are unexplained observations, something I would call stating the obvious. Anyway, I’ve said all this before and it isn’t actually the point of this blog post…

In his essay, Cardeña briefly mentions my commentary and discusses in the Appendix some of the strawman arguments that have been levelled against my points. That’s all well and good. I disagree with him but I have neither the time nor the desire to get back into this discussion right now. However, it brings me to another puzzling thing I have long wondered about – mainly because it has followed me around for most of my life (ever since moving to an English-speaking country, at least): the unbearable inability of people to spell my name correctly.

It used to frustrate me, but after decades of experiencing it regularly I have become accustomed to it. This doesn’t stop me from being mystified by the error, though. Let me repeat it again:

There is no T in Schwarzkopf

By far the most common misspelling of my surname is Schwartzkopf. There have also been other mistakes, such as dropping the second letter, C, or replacing the F with an H (that one is particularly common when people try to write it phonetically). I assume the TZ spelling is so prevalent because in the English language Z is a soft S sound and you need a T to produce the sharp German Z sound. That certainly makes sense. I know I’m not alone; a lot of people with foreign-sounding names suffer from frequent misspellings. I have become quite sensitive to this issue and I usually try very hard to spell other people’s names properly, but of course I occasionally fail, too.

But this does not explain the incredible robustness of the TZ error. Cardeña is far from the only person who has made it, and under normal circumstances it would barely have registered on my radar. What makes his essay so fascinating is that he manages to spell my name correctly at the bottom of page 9 but then repeatedly misspells it in all subsequent instances. This is in spite of the fact that he spelled it correctly in his own paper (the one this essay discusses), that it is correct in his bibliography, and that he could easily access my article. It reminds me of a dyslexic student in my high school class who baffled us all (especially the teacher) by changing the spelling of the same word from one line to the next in his school papers (this was before dyslexia was well known or widely accepted as a condition – it would probably not be as dumbfounding to teachers these days). Cardeña is not a bad speller in general. His dyslexia seems to be Schwarzkopf-specific.

And he’s not alone in that. I singled him out here because his is the latest example I came across, but it would be harsh to lay this at his door. In fact, it is possible that his misspelling started because he quoted the TZ mistake from an email by Hauke Heekeren. This does not excuse misspelling my name after that, given that he had access to the correct spelling – but it certainly proves he isn’t alone. Heekeren is German (I think) so he doesn’t have the language excuse either. How did he manage to misspell what is essentially two common German words? But it doesn’t stop there. I’ve also had my name misspelled in this way by a former employer who decided to acknowledge me in a paper they published. I worked with that person for over a year and published papers with them. You’d think they would know how to spell my name, but at the very least you’d think they’d look it up before putting it in writing.

The general language excuse is also not that valid, for statistical reasons. I am sure there are people with the same name spelled with a T, but I don’t know any. I don’t know which spelling is more frequent, but the T spelling certainly has far less exposure. Schwarzkopf (with the correct spelling :P) is the name of an international brand of hair care products (no relation, and none of the proceeds go my way, unfortunately). People should see that all the time. Schwarzkopf was also the surname of “Stormin’ Norman“, the coalition commander in the 1991 Gulf War. At least people in the United States were relatively frequently exposed to his name for some time.

So what is it that makes people consistently misspell my name despite knowing better? Is there some sort of cognitive or even perceptual bias at work here? Can we test this experimentally? If you have an idea how to do that, let me know.

[Image: a cup of tea]
Whilst mainly a coffee drinker, occasionally there is also tea in Schwarzkopf

(* Thanks to Leonid Schneider and UK Skeptics for drawing my attention to this article)

A brave new world of research parasites

What a week! I have rarely seen the definition of irony demonstrated more clearly before my eyes than in the days following the publication of this comment by Lewandowsky and Bishop in Nature. I mentioned it at the end of my previous post. The comment discusses the question of how to deal with data requests and criticisms of scientific claims in the new world of open science. A lot of digital ink has already been spilled elsewhere debating what they did or didn’t say and what they meant to say with their article. I have no intention of rehashing that debate here. So while I typically welcome meaningful and respectful comments under my posts, I’ll regard any comments on the specifics of the L&B article as off-topic and will not publish them. There are plenty of other channels for that.

I think the critics are attacking a strawman and the L&B discussion is a red herring. Irrespective of what they actually said, I want to get back to the discussion we should be having, which I already alluded to last time. In order to do so, let’s get the premise crystal clear. I have said all this before in my various posts about data sharing but let me summarise the fundamental points:

  1. Data sharing: All the data needed to reproduce the results of a scientific study should be made public in an independent repository at the point of publication. This must exclude data which would be unethical to share, e.g. unprocessed brain images from human participants. Such data fall into a grey area as to how much anonymisation is necessary, and my policy is to err on the side of caution. We have no permission from our participants (except in some individual cases) to share their data with anyone outside the team if there is a chance that they could be identified from it, so we don’t. For the overwhelming majority of purposes such data are not required and the pre-processed, anonymised data will suffice.
  2. Material sharing: When I talk about sharing data I implicitly also mean materials, so any custom analysis code, stimulus protocols, or other materials used for the study should also be shared. This is not only good for reproducibility, i.e. getting the same results from the same data. It is also useful for replication efforts aiming to repeat the same experiment to collect new data.
  3. Useful documentation: Shared data are unlikely to be of much use to anyone without a minimum of documentation explaining what they contain (see the sketch below for what that minimum might look like). I don’t think this needs to be excessive, especially given that most data will probably never be accessed by anyone. But there should at least be a basic guide on how to use the data to reproduce a result. It should be reasonably clear what data can be found where and how to run the experiment. Provided the uncompiled code is included and the methods section of the publication contains sufficient detail of what is being done, anyone looking at it should be able to work it out by themselves. More extensive documentation is certainly helpful and may also help the researchers themselves in organising their work – but I don’t think we should expect more than the basics.
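To make point 3 concrete, here is an entirely hypothetical sketch of what such minimal documentation might look like, written as the header of a shared analysis script. All file and column names are invented for illustration:

```python
"""Data and analysis code for [hypothetical study], 2016.

Layout (illustrative):
    data/sub01_summary.csv ... data/sub20_summary.csv
        Pre-processed, anonymised summary data, one file per participant.
        Columns: condition, contrast, response, reaction_time_s
    analysis.py
        Reproduces all statistics and figures in the paper from data/.
    experiment.py
        Stimulus protocol used to collect the data (see Methods for details).

To reproduce the results, run:  python analysis.py
"""
```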

Now with this out of the way, I don’t want to hear any lamentations about how I am “defending” restricting anyone’s access to data, or any such rubbish. Let’s simply work on the assumption that the world is how it should be and that the necessary data are available to anyone with an internet connection. So let’s talk about the worries and potential problems this may bring. Note that, as I already said, most data sets will probably not generate much interest. That is fine – they should be available for potential future use in any case. More importantly, this doesn’t mean the following concerns aren’t valid:

Volume of criticism

In some cases the number of people reusing the shared data will be very large. This is particularly likely for research on controversial topics. Perhaps the topic is a political battleground, or the research is being used to promote policy changes people are not happy with. Perhaps the research receives undeserved accolades from the mainstream media, or maybe it’s just a very sensational claim (Psi research springs to mind again…). The criticisms of this research may or may not be justified. None of this matters and I don’t care to hear about the specifics of your particular pet peeve, whether it’s climate change or some medical trial. All that matters in this context is that the topic is controversial.

As I said last time, it is only natural that sensational or controversial research attracts more attention and more scepticism. This is how it should be. Scientists should be sceptical. But individual scientists and small research teams are composed of normal human beings, and there is a limit to how much criticism they can keep up with. This is a simple fact. Of course this statement will no doubt draw out the usual suspects who feel the need to explain to me that criticism and scepticism are necessary in science and that this is simply what one should expect.

Bookplate of the Royal Society (Great Britain)

So let me cut the heads off this inevitable hydra right away. First of all, this is exactly what I just said: yes, science depends on scepticism. But it is also true that humans have a limited capacity for answering questions and criticisms and a limited ability to handle stress. Simply saying that they should be prepared for this and have no right to complain is unrealistic. If anything, it will drive people away from doing research on controversial questions, which cannot be a good thing.

Similarly, it is unrealistic to say that they could just ignore criticism if it gets too much for them. It is completely natural that a given scientist will want to respond to criticisms, especially public ones. They will want to defend the conclusions they’ve drawn, and they will also feel that they have a reputation to defend. I believe science would generally be better off if we all learned to become less invested in our pet theories and conducted our inferences less dogmatically. I hope there are ways we can encourage such a change – but I don’t think you can take ego out of the question completely. Especially if a critic accuses a researcher of incompetence or worse, it shouldn’t surprise anyone if they react emotionally and have personal stakes in the debate.

So what can we expect? To me it seems entirely justified in this situation that a researcher would write a summary response that addresses the criticism collectively. In it they would most likely have to be selective, addressing only the more serious points and ignoring the minutiae. This may require some training. Even then it may be difficult because critics might insist that their subtle points are of fundamental importance. In that situation an adjudicating article by an independent party may be helpful (albeit probably not always feasible).

On a related note, it also seems justified to me that a researcher will require time to make a response. This pertains more to how we should assess a scientific disagreement as outside observers. Just because a researcher hasn’t responded to every little criticism within days of somebody criticising their work doesn’t mean that the criticism is valid. Scientists have lives too. They have other professional duties, mortgages to pay with their too-low salaries, children to feed, and – hard as it is to believe – they deserve some time off occasionally. As long as they declare their intention to respond in depth at some stage we should respect that. Of course, if they never respond that may be a sign that they simply don’t have a good response to the criticism. But you need some patience, something we seem to have lost in the age of instant-access social media.

Excessive criticism or harassment

This brings us to the next issue. Harassment of researchers is never okay – simply because harassment of anyone is never okay. So pelting a researcher with repeated criticisms, making the same points or asking the same questions over and over, is not acceptable. This borders on harassment and may well cross the line. This constant background noise can wear people out. It is also counterproductive because it slows them down in making their response. It may also paralyse their other research efforts, which in turn will stress them out because they have grant obligations to fulfill, etc. Above all, stress can make you sick. If you harass somebody out of the ability to work, you’ll never get a response – and this doesn’t make your criticism valid.

If the researchers have declared their intention to respond to criticism, we should leave it at that. If they don’t respond after a significant time, it might be worth a reminder to ask whether they are still working on it. As I said above, if they never respond this may be a sign that they have no response. In that case, leave it at that.

It should require no explanation why blatant harassment, abusive contact, or any form of interference in the researchers’ personal lives is completely unacceptable. Depending on the severity of such cases they should be prosecuted to the full extent of the law. And if someone reports harassment, in the first instance you should believe them. It is a common tactic of harassers to downplay claims of abuse. Sure, it is also unethical to make false accusations, but you should leave that for the authorities to judge, in particular if you don’t have any evidence one way or the other. Harassment is also subjective. What might not bother you may very well affect another person badly. Brushing this off as them being too sensitive demonstrates a serious lack of compassion, is disrespectful, and I think it also makes you seem untrustworthy.

Motive and bias

Speaking of untrustworthiness brings me to the next point. There has been much discussion about the motives of critics and about the extent to which a criticism is offered in “good faith”. This is a complex and highly subjective judgement. In my view, your motive for reanalysing or critiquing a particular piece of research is not automatically a problem. All the data should be available, remember? Anyone can reanalyse them.

However, just as all researchers should be honest, so should all critics. Obviously this isn’t mandatory and it couldn’t be enforced even if it were. But this is how it should be and how good scientists should work. I have myself criticised and reanalysed research by others, and I was not beating around the bush in either case – I believe I was pretty clear that I didn’t believe their hypothesis was valid. Hiding your prior notions is disrespectful to the authors and also misleads the neutral observers of the discussion. Even if you think that your public image already makes your views clear – say, because you ranted at great length on social media about how terribly flawed you think that study was – this isn’t enough. Even the Science Kardashians don’t have that large a social media following, and probably only a fraction of that following will have read all your in-depth rants.

In addition to declaring your potential bias you should also state your intention. It is perfectly justified to dig into the data because you suspect something isn’t kosher. But this is an exploratory analysis, and it comes with many of the same biases that uncontrolled, undeclared exploration always has. Of course you may find some big smoking gun that invalidates or undermines the original authors’ conclusions. But you are just as likely to find some spurious glitch or artifact in the data that doesn’t actually mean anything. In the latter case it would make more sense to conduct a follow-up experiment that tests your new alternative hypothesis, to see if your suspicion holds up. If on the other hand you have a clear suspicion to start with, you should declare it, then test it, and report the findings no matter what. Preregistration may help to discriminate the exploratory fishing trips from the pointed critical reanalyses – however, since the data are already available, it is logistically not very feasible to check that a preregistration wasn’t written after the fact.

So I think this judgement will always rely heavily on trust, but that’s not a bad thing. I’m happy to trust a critic if they declare their prior opinion. I will simply read their views with some scepticism, mindful that their bias may have influenced them. A critic who didn’t declare their bias but is then shown to have one appears far less trustworthy. So it is actually in your interest to declare your bias.

Now before anyone inevitably reminds us that we should also worry about the motives and biases of the original authors – yes, of course. But this is a discussion we’ve already had for years and this is why data sharing and novel publication models like preregistration and registered reports are becoming more commonplace.

Lack of expertise

On to the final point. Reanalyses or criticism may come from people with too little expertise and knowledge of a research area to provide useful contributions. Such criticisms may obfuscate the discussion, and that is never a good thing. Again, preempting the inevitable comments: no, this does not mean that you have to prove your expertise to reanalyse the data. (Seriously guys, which part of “all data should be available to anyone” don’t you get?!) What it does mean is that I might not want to weight the criticism of someone who once took a biology class in high school the same way as that of a world expert. It also means that I will be more sceptical when someone is criticising something outside their own field.

There are many situations where this caveat doesn’t matter. Any scientist with some statistical training may be able to comment on a statistical issue. In fact, a statistician is presumably more qualified to comment on a statistical point than a non-statistician from any field. And even if you are not an expert on some particular research topic, you may still be an expert on the methods used by the researchers. Importantly, even a non-expert can reveal a fundamental flaw. The lack of a critic’s expertise shouldn’t be misused to discredit them. In the end, what really matters is that your argument is coherent and convincing. For that it doesn’t actually matter whether you are an expert or not (an expert may, however, find it easier to communicate their criticism convincingly).

However, let’s assume that a large number of non-experts are descending on a data set, picking at little things they perceive as flaws but that aren’t actually consequential, or making errors in their analysis that are glaring to an expert. What should the researchers do in this situation? Not responding at all is not in their interest. This can easily be misinterpreted as a tacit acknowledgement that their research is flawed. On the other hand, responding to every single case is not in their interest either if they want to get on with their work (and their lives, for that matter). As above, perhaps the best thing to do would be to write a summary response rebutting the most pertinent points collectively, make a clear argument about why these criticisms are inconsequential, and then leave it at that.

Conclusion

In general, scientific criticisms are publications that should work like any other scientific publication. They should be subject to peer review (which, as readers of this blog will know, I believe should be post-publication and public). This doesn’t mean that criticism cannot start on social media, blogs, journal comment sections, or on PubPeer, and the boundaries may also blur at times. For some kinds of criticism, such as pointing out basic errors or misinterpretations, public comments may suffice, and there have been cases where a publication was retracted simply because of the social media response. But for a criticism to be taken seriously by anyone, especially non-experts, it helps if it is properly vetted by independent experts – just as any study should be vetted. This may help particularly in cases where the validity of the criticism is uncertain.

I think this is a very important discussion to have. We need it to bring about the research culture most of us seem to want: a brave new world of happy research parasites.

Parasites

(Note: I changed the final section somewhat after Neuroskeptic rightly pointed out that the conclusions were a bit too general. Tal Yarkoni independently replicated this sentiment. But he was only giving me a hard time.)


Parasitical science?

This weekend marked another great moment in the saga surrounding the discussion about open science – a worthy sequel to “angry birds” and “shameless little bullies”. This time it was an editorial about data sharing in the New England Journal of Medicine, which contains the statement that:

There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”

Remarks like this from journal editors are just all kinds of stupid. Even though this was presented in the context of quotes by unnamed “front-line researchers” (whatever that means), the editors implicitly endorse the interpretation that re-using other people’s published data is parasitical. In fact, their endorsement is made clear later in the editorial when they express the hope that data sharing “should happen symbiotically, not parasitically.”

Parasites
Contact Richard Morey to add this badge to your publications!

It shouldn’t come as a surprise that this editorial was immediately greeted by widespread ridicule and the creation of all sorts of internet memes poking fun at the notion of research parasites. Even if some people believe this, hell, even if the claim were true (spoiler: it’s not), this is just a very idiotic thing to do. Like it or not, open access, transparency, and post-publication scrutiny of published scientific findings are becoming increasingly common and are already required in many places. We’re now less than a year away from the date when the Peer Reviewers Openness Initiative, whose function is to encourage data sharing, comes into effect. Not only is the clock not turning back on this stuff – it is deeply counterproductive to liken the supporters of this movement to parasites. This is no way to start (or have) a reasonable conversation.

And there should be a conversation. If there is one thing I have learned from talking with colleagues, it is that worries about data sharing and open science as a whole are far from rare. Misguided as it may be, the concern about others scooping your ideas and sifting through the data you spent blood, sweat, and tears collecting resonates with many people. This editorial didn’t just pop into existence from the quantum foam – it comes from a real place. The mocking and snide remarks about this editorial are fully deserved. It is moronic and ass-backwards. But speaking more generally, snideness and mockery are never a good way to convince people of the strength of your argument. All too often worries like this are met with disrespect and ridicule. Is it any surprise that a lot of people don’t dare to speak up against open science? Similarly, when someone discovers errors or problems in somebody else’s data, some are quick to make jokes or serious accusations about these researchers. Is this encouraging them to open their lab books and file drawers? I think not.

Scientists are human beings and they tend to have normal human reactions when accused of wrongdoing, incompetence, or sloppiness. Whether or not the accusations are correct is irrelevant. Even mentioning the dreaded “questionable research practices” sounds like a fierce accusation to the accused, even though questionable research practices can occur quite naturally, without conscious ill intent, when people are wandering in the garden of forking paths. In my opinion we need to be mindful of that and try to be more considerate in the way we discuss these issues. Social media like Facebook and Twitter do not exactly seem to encourage respectful dialogue. I know this firsthand, as I have myself said things about (in my view) questionable research that I subsequently regretted. Scepticism is good and essential to scientific progress – disrespect is not.

It seems to have been the intention of this misguided editorial to communicate a similar message. It encourages researchers using other people’s data to work with the original authors. So far so good. I am sure no sensible person would actually disagree with that notion. But where the editorial misses the point is that there is no plan for what happens if this “symbiotic” relationship doesn’t form, either because the original authors are not cooperating or because there is a conflict of interest between sceptics and proponents of a scientific claim. In fact, the editorial lays bare what I think is the heart of the problem in a statement that to me seems much worse than the “research parasites” label. They say that people…

…even use the data to try to disprove what the original investigators had posited.

It baffles me that anyone can write something like this whilst keeping a straight face. Isn’t this how science is supposed to work? Trying to disprove a hypothesis is just basic Popperian falsification. Not only should others do that, you should do it yourself with your own research claims. To be fair, the best way to do science in my opinion is to generate competing hypotheses and test them with as little emotional attachment to any of them as possible, but this is more easily said than done… So ideally we should try to find the hypothesis that best explains the data rather than just seeking to disprove. Either way, however, this sentence is clearly symptomatic of a much greater problem. Science should be about “finding better ways of being wrong.” The first step towards this is to acknowledge that anything we posited is never really going to be “true” and that it can always use a healthy dose of scientific scepticism and disproving.

I want to have this dialogue. I want to debate the ways to make science healthier, more efficient, and more flexible in overturning false ideas. As I outlined in a previous post, data sharing is the single most important improvement we can make to our research culture. Even if there are downsides to it, I think the benefits outweigh them by far. But not everyone shares my enthusiasm for data sharing, and many people seem worried but afraid to speak up. This is wrong and it must change. I strongly believe that most of the worries can be alleviated:

  • I think it’s delusional to believe that data sharing will produce a “class” of “research parasites.” People will still need to generate their own science to be successful. Simply sitting around waiting for other people to generate data is not going to be a viable career strategy. If anything, large consortia like the Human Genome or Human Connectome Project will produce large data sets that a broad base of researchers can use. But even these won’t allow them to test every possible hypothesis under the sun. In fact, most data sets are far too specific to be of much use to many other people.
  • I’m willing to bet that the vast majority of publicly shared data sets won’t be downloaded, let alone analysed by anyone other than the original authors. This is irrelevant. The point is that the data are available because they could be potentially useful to future science.
  • Scooping other people’s research ideas by using their published data to do the experiment they wanted to do is a pretty ineffective and risky strategy. In most cases, there is just no way that someone else would be faster at publishing an experiment based on your data than you are. This doesn’t mean that it never happens, but I’m still waiting for anyone to tell me of a case where it actually did… If you are worried about it, preregister your intention so at least anyone can see that you planned it. Or, even better, submit it as a Registered Report so you can guarantee that this work will be published in a journal regardless of what other people did with your data.
  • While we’re at it, upload preprints of your manuscripts when you submit them to journals. I still dream of a publication system where we don’t submit to journals at all, or at least not until peer review has taken place and the robustness of the finding has been confirmed. But until we get there, preprints are the next best thing. With a public preprint the chronological precedence is clear for all to see.

Now that covers the “parasites” feeding on your research productivity. But what to do if someone else subjects your data to sceptical scrutiny in an attempt to disprove what you posited? Again, first of all, I don’t think this is going to be that frequent. It is probably more common for controversial or surprising claims, and it bloody well should be. This is how science progresses and it shouldn’t be a concern. And if it actually turns out that the result or your interpretation of it is wrong, wouldn’t you want to know about it? If your answer to this question is no, then I honestly wonder why you do research.

I can however empathise with the fear that people, some of whom may lack the necessary expertise or who cherry-pick the results, will actively seek to dismantle your findings. I am sure this does happen, and with more widespread data sharing it may certainly become more common. If the volume of such efforts becomes so large that it overwhelms an individual researcher and thus hinders their own progress unnecessarily, this would indeed be a concern. Perhaps we need to have a discussion on what safeguards could ensure that this doesn’t get out of hand, or how one should deal with that situation. I think it’s a valid concern and worth some serious thought. (Update on 25 Jan 2016: Stephan Lewandowsky and Dorothy Bishop wrote an interesting comment about this.)

But I guarantee you, throwing the blame at data sharing is not the solution to this potential problem. The answer to scepticism and scrutiny cannot ever be to keep your data under lock and key. You may never convince a staunch sceptic, but you will not win the hearts and minds of the doubtful but undecided by hiding in your ivory tower either. In science, the only convincing argument is data, more data, better tests – and the willingness to change your mind if the evidence demands it.

Coconut
Here at CoCoNiT (Cook-Islands Centre Of NeuroImaging Tests) we understand that once you crack the hard shell of your data, the sweet, white knowledge will just come pouring out…

Yes, science is self-correcting

If you don’t believe science self-corrects, then you probably shouldn’t believe that evolution by natural selection occurs either – it’s basically the same thing.

I have said it many times before, both under the guise of my satirical alter ego and later – more seriously – on this blog. I am getting very tired of repeating it, so I wrote this final post, which I will simply link to the next time this inevitably comes up…

My latest outburst about this was triggered by this blog post by Keith Laws entitled “Science is ‘Other-Correcting’”. I have no qualms with the actual content of this post. It gives an interesting account of the attempt to correct an error in the publication record. The people behind this effort are great researchers for whom I have the utmost respect. The story they tell is shocking and important. In particular, the email they received by accident from a journal editor is disturbing and serves as a reminder of all the things that are wrong with the way scientific research and publishing currently operates.

My issue is with the seemingly ubiquitous doubts about the self-correcting nature of science. To quote from the first paragraph of that post:

“I have never been convinced by the ubiquitous phrase ‘Science is self-correcting’. Much evidence points to science being conservative and looking less self-correcting and more ego-protecting. It is also not clear why ‘self’ is the correct description – most change occurs because of the ‘other’ – Science is other correcting.”

In my view this and similar criticisms of self-correction completely miss the point. The prefix ‘self-’ refers to science, not to scientists. In fact, the very same paragraph contains the key: “Science is a process.” Science is an iterative approach by which we gradually broaden our knowledge and understanding of the world. You can debate whether or not there is such a thing as the “scientific method” – perhaps it’s more a collection of methods. However, in my view, above all else science is a way of thinking.

Scientific thinking means being inquisitive and sceptical, and taking nothing for granted. Prestige, fame, and success are irrelevant. Perfect theories are irrelevant. The smallest piece of contradictory evidence can refute your grand unifying theory. And science encompasses all of that. It is an emergent concept. And this is what is self-correcting.

Scientists, on the other hand, are not self-correcting. Some are more so than others, but none are perfect. Scientists are people and thus inherently fallible. They are subject to ego, pride, greed, and all of life’s pressures, such as the need to pay a mortgage, feed their children, and build a career. In the common vernacular “science” is often conflated with the scientific enterprise, the way scientists go about doing science. This involves all those human factors and more and, fair enough, it is anything but self-correcting. But to argue that this means science isn’t self-correcting is to attack a straw man, because few people seriously argue that the scientific enterprise couldn’t be better.

We should always strive to improve the way we do science because, due to our human failings, it will never be perfect. However, in this context we also shouldn’t forget how much we have already improved it. In Newton’s time, science in Europe (the hub of science then) was largely done by white men from a very narrow socioeconomic background. Even decades later, most women and people of non-European origin need not even have bothered trying (although this uphill struggle makes the achievements of scientists like Marie Curie or Henrietta Swan Leavitt all the more impressive). And publishing your research findings was not subject to formal peer review but largely dependent on the egos of some society presidents and on whether they liked you. None of these problems have been wiped off the face of the Earth, but I would hope most people agree that things are better than they were 100 years ago.

Like all human beings, scientists are flawed. Nevertheless, I am actually optimistic about us as a group. I do believe that, on the whole, scientists are genuinely interested in learning the truth and widening their understanding of nature. Sure, there are black sheep, and even the best of us will succumb to human failings. At some point or other our dogma and affinity for our pet hypotheses can blind us to the cold facts. But on average I’d like to think we do better than most of our fellow humans. (Then again, I’m probably biased…)

We will continue to make the scientific enterprise better. We will change the way we publish and evaluate scientific findings. We will improve the way we interpret evidence and the way we communicate scientific discoveries. The scientific enterprise will become more democratic and less dependent on publishers getting rich on our free labour. Already, within the decade I have been a practicing scientist, we have begun to tear down the widespread illusion that when a piece of research is published it must therefore be true. When I did my PhD, the only place we could critically discuss new publications was a small journal club, and the conclusions of those discussions were almost never shared with the world. Nowadays every new study is immediately discussed online by an international audience. We have taken leaps towards scientific findings, data, and materials being available to anyone, anywhere, provided they have internet access. I am very optimistic that this is only the beginning of much more fundamental changes.

Last year I participated in a workshop called “Is Science Broken?” that was organised solely by graduate students in my department. The growing number of replication attempts in the literature and all these post-publication discussions we are having are perfect examples of science correcting itself. It seems deeply ironic to me when a post like Keith Laws’, which describes an active effort to rectify errors, argues against the self-correcting nature of the scientific process.

Of course, self-correction is not guaranteed. It can easily be stifled. There is always a danger that we drift back into the 19th century or the dark ages. But the more academic freedom (and generous funding) scientists are given, the more science will be allowed to correct itself.

Beach
Science is like a calm lagoon in the sunset… Or whatever. There is no real reason why this picture is here.

Update (19 Jan 2016): I just read this nice post about the role of priors in Bayesian statistics. The author actually calls Bayesian analysis “self-correcting”, and this epitomises my point about science. I would say science is essentially Bayesian. We start with prior hypotheses and theories, but by accumulating evidence we update our prior beliefs into posterior beliefs. It may take a long time, but provided we continue to collect data our assumptions will self-correct (the toy simulation below illustrates this). It may take a reevaluation of what the evidence is (which in this analogy would be a change to the likelihood function). Thus the discussion about how we know how close to the truth we are is, in my view, missing the point. Self-correction describes the process.
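To illustrate (and this is just my own toy example, not anything from the post I linked): take a simple beta-binomial model, give it a deliberately terrible prior, and feed it a stream of data. All the numbers below are invented for the sketch.

```python
# Toy sketch of Bayesian self-correction with a beta-binomial model.
# The prior is deliberately far from the truth; the data drag it back.
import random

random.seed(1)

TRUE_RATE = 0.7          # the unknown "truth" the process should converge on
alpha, beta = 1.0, 19.0  # badly wrong prior: initial belief that the rate is ~0.05

for study in range(1, 11):
    data = [random.random() < TRUE_RATE for _ in range(20)]  # 20 new observations
    alpha += sum(data)             # successes update alpha
    beta += len(data) - sum(data)  # failures update beta
    print(f"after study {study:2d}: posterior mean = {alpha / (alpha + beta):.3f}")
```

However wrong the starting belief, the posterior mean crawls towards 0.7 as the studies accumulate. Nobody in this little simulation “corrected themselves” – the updating process did the correcting. That is all I mean.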

Update (21 Jan 2016): I added a sentence from my comment in the discussion section to the top. It makes for a good summary of my post. The analogy may not be perfect – but even if not I’d say it’s close. If you disagree, please leave a comment below.