Realistic data peeking isn’t as bad as you* thought – it’s worse

Unless you’ve been living under a rock, you have probably heard of data peeking – also known as “optional stopping”. It’s one of those nasty questionable research practices that could produce a body of scientific literature contaminated by widespread spurious findings and thus lead to poor replicability.

Data peeking is when you run a Frequentist statistical test every time you collect a new subject/observation (or after every few observations) and stop collecting data when the test comes out significant (say, at p < 0.05). Doing this clearly does not accord with good statistical practice because under the Frequentist framework you should plan your final sample size a priori based on power analysis, collect data until you have that sample size, and never look back (but see my comment below for more discussion of this…). What is worse, under the aforementioned data peeking scheme you can be theoretically certain to reject the null hypothesis eventually. Even if the null hypothesis is true, sooner or later you will hit a p-value smaller than the significance threshold.

Until recently, many researchers, at least in psychological and biological sciences, appeared to be unaware of this problem and it isn’t difficult to see that this could contribute to a prevalence of false positives in the literature. Even now, after numerous papers and blog posts have been written about this topic, this problem still persists. It is perhaps less common but I still occasionally overhear people (sometimes even in their own public seminar presentations) saying things like “This effect isn’t quite significant yet so we’ll see what happens after we tested a few more subjects.” So far so bad.

Ever since I heard about this issue (and I must admit that I was also unaware of the severity of this problem back in my younger, carefree days), I have felt somehow dissatisfied with how this issue has been described. While it is a nice illustration of a problem, the models of data peeking seem extremely simplistic to me. There are two primary aspects of this notion that in my opinion just aren’t realistic. First, the notion of indefinite data collection is obviously impossible, as this would imply having an infinite subject pool and other bottomless resources. However, even if you allow for a relatively manageable maximal sample size at which a researcher may finally stop data collection even when the test is not significant, the false positive rate is still massively inflated.

The second issue is therefore a bigger problem: the simple data peeking procedure described above seems grossly fraudulent to me. I would have thought that even if the researcher in question were unaware of the problems with data peeking, they probably would nonetheless feel that something isn’t quite right with checking for significant results after every few subjects and continuing until they get them. As always, I may be wrong about this but I sincerely doubt this is what most “normal” people do. Rather, I believe people would be more likely to peek at the data to see if the results are significant, and only if the p-value “looks promising” (say 0.05 < p < 0.1) would they continue testing. This sampling plan sounds a lot more like what may actually happen. So I wanted to find out how this sort of sampling scheme would affect results. I have no idea if anyone already did something like this. If so, I’d be grateful if you could point me to that analysis.

So what I did is the following: I used Pearson’s correlation as the statistical test. In each iteration of the simulation I generated a data set of 150 subjects, each with two uncorrelated Gaussian variables, let’s just pretend it’s the height of some bump on the subjects’ foreheads and a behavioral score of how belligerent they are. 150 is thus the maximal sample size, assuming that our simulated phrenologist – let’s call him Dr Peek – would not want to test more than 150 subjects. However, Dr Peek actually starts with only 3 subjects and then runs the correlation test. In the simplistic version of data peeking, Dr Peek will stop collecting data if p < 0.05; otherwise he will collect another subject until p < 0.05 or 150 subjects are eventually reached. In addition, I simulated three other sampling schemes that feel more realistic to me. In these cases, Dr Peek will also stop data collection when p < 0.05 but he will also stop when p is either greater than 0.1, greater than 0.3, or greater than 0.5. I repeated each of these simulations 1000 times.
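For readers who want to try this themselves, the four sampling schemes can be sketched in plain Python. This is only an illustration under stated assumptions, not the Matlab code used for the post: the function names (`betainc`, `pearson_p`, `run_once`, `rejection_rate`) and the random seed are made up for this example, and the p-value is computed from the standard identity that the two-sided p-value of Pearson’s r with n observations equals the regularized incomplete beta function I_{1-r²}((n-2)/2, 1/2).

```python
import math
import random


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b) via a continued fraction."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # The symmetry relation keeps the continued fraction in its fast-converging range.
    if x > (a + 1.0) / (a + b + 2.0):
        return 1.0 - betainc(b, a, 1.0 - x)
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x)) / a
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 500):
        # Even and odd coefficients of the continued fraction (modified Lentz method).
        even = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < 1e-12:
            break
    return front * h


def pearson_p(r, n):
    """Two-sided p-value for a Pearson correlation r based on n observations."""
    return betainc((n - 2) / 2.0, 0.5, 1.0 - r * r)


def run_once(rng, rho, give_up_above, n_min=3, n_max=150, alpha=0.05):
    """Simulate one experiment by Dr Peek; True if it ends with p < alpha."""
    sx = sy = sxx = syy = sxy = 0.0
    for n in range(1, n_max + 1):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        sx, sy = sx + x, sy + y
        sxx, syy, sxy = sxx + x * x, syy + y * y, sxy + x * y
        if n < n_min:
            continue  # no test before the minimal sample is collected
        r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
        p = pearson_p(max(-1.0, min(1.0, r)), n)
        if p < alpha:
            return True   # "significant": stop and report
        if p > give_up_above:
            return False  # no longer "promising": abandon the study
    return False          # maximal sample size reached without significance


def rejection_rate(rho, give_up_above, n_sims=1000, seed=0, **kwargs):
    """Proportion of simulated experiments that end with p < alpha."""
    rng = random.Random(seed)
    hits = sum(run_once(rng, rho, give_up_above, **kwargs) for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    # math.inf means "never give up", i.e. the simplistic scheme.
    # 300 runs per scheme keeps the demo quick; the post used 1000.
    for cutoff in (math.inf, 0.5, 0.3, 0.1):
        print(cutoff, rejection_rate(0.0, cutoff, n_sims=300))
```

Passing a non-zero `rho` gives the corresponding power estimate, and `n_min=20` corresponds to a larger minimal sample size. Exact numbers will vary with the seed and the number of runs.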

The results are in the graph below. The four sampling schemes are denoted by the different colors. On the y-axis I plotted the proportion of the 1000 simulations in which the final outcome (that is, whenever data collection was stopped) yielded p < 0.05. The scenario I described above is the leftmost set of data points in which the true effect size, the correlation between forehead bump height and belligerence, is zero. Confirming previous reports on data peeking, the simplistic case (blue curve) has an enormously inflated false positive rate of around 0.42. Nominally, the false positive rate should be 0.05. However, under the more “realistic” sampling schemes the false positive rates are far lower. In fact, for the case where data collection only continues while p-values are marginal (0.05 < p < 0.1), the false positive rate is 0.068, only barely above the nominal rate. For the other two schemes, the situation is slightly worse but not by that much. So does this mean that data peeking isn’t really as bad as we have been led to believe?

Rates

Hold on, not so fast. Let us now look at what happens in the rest of the plot. I redid the same kind of simulation for a range of true effect sizes up to rho = 0.9. The x-axis shows the true correlation between forehead bump height and belligerence. Unlike for the above case when the true correlation is zero, now the y-axis shows statistical power, the proportion of simulations in which Dr Peek concluded correctly that there actually is a correlation. All four curves rise steadily as one might expect with stronger true effects. The blue curve showing the simplistic data peeking scheme rises very steeply and reaches maximal power at a true correlation of around 0.4. The slopes of the other curves are much more shallow and while the power at strong true correlations is reasonable at least for two of them, they don’t reach the lofty heights of the simplistic scheme.

This feels somehow counter-intuitive at first but it makes sense: when the true correlation is strong, the probability of high p-values is low. However, at the very small sample sizes we start out with even a strong correlation is not always detectable – the confidence interval of the estimated correlation is very wide. Thus there will be a relatively large proportion of p-values that pass that high cut-off and terminate data collection prematurely without rejecting the null hypothesis.
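To make this argument concrete, here is a small, hedged sketch of the very first peek only. It assumes three subjects and uses the fact that with n = 3 the t statistic has a single degree of freedom, so the two-sided p-value has the closed form 1 - (2/π)·atan(|t|). The helper names `p_value_n3` and `share_abandoned_at_first_look` are invented for this illustration.

```python
import math
import random


def p_value_n3(xs, ys):
    """Two-sided p-value of Pearson's r for exactly three observations."""
    mx, my = sum(xs) / 3.0, sum(ys) / 3.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))
    if abs(r) == 1.0:
        return 0.0
    t = r / math.sqrt(1.0 - r * r)  # t = r * sqrt(df / (1 - r^2)) with df = 1
    return 1.0 - (2.0 / math.pi) * math.atan(abs(t))


def share_abandoned_at_first_look(rho, cutoff, n_sims=20000, seed=0):
    """Proportion of three-subject samples whose p-value exceeds the cutoff."""
    rng = random.Random(seed)
    abandoned = 0
    for _ in range(n_sims):
        xs = [rng.gauss(0.0, 1.0) for _ in range(3)]
        ys = [rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for x in xs]
        if p_value_n3(xs, ys) > cutoff:
            abandoned += 1
    return abandoned / n_sims


if __name__ == "__main__":
    for rho in (0.0, 0.4, 0.8):
        print(rho, share_abandoned_at_first_look(rho, cutoff=0.1))
```

Running this for a strong true correlation shows how often a study would be abandoned after the first look under the strictest “realistic” scheme, before the effect ever had a chance to show itself.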

Critically, these two things, inflated false positive rates and reduced statistical power to detect true effects, dramatically reduce the sensitivity of any analysis that is performed under these realistic data peeking schemes. In the graph below, I plot the sensitivity (quantified as d’) of the analysis. Larger d’ means there is a more favorable ratio between the number of simulations in which Dr Peek correctly detected a true effect and how often he falsely concluded there was a correlation when there wasn’t one. Sensitivity for the simplistic sample scheme (blue curve) rises steeply until power is maximal. However, sensitivity for the other sampling schemes starts off close to zero (no sensitivity) and only rises fairly slowly.

Sensitivity

For reference compare this to the situation under desired conditions, that is, without questionable research practices, with adequate statistical power of 0.8, and the nominal false positive rate of 0.05: in this case the sensitivity would be d’ = 2.49, so higher than any of the realistic sampling schemes ever get. Again, this is not really surprising because data collections will typically be terminated at sample sizes that give far less than 0.8 power. But in any case, this is bad news. Even though the more realistic forms of data peeking don’t inflate false positives as massively as the most pessimistic predictions, they impede the sensitivity of experiments dramatically and are thus very likely to only produce rubbish. It should come as no surprise that many findings fail to replicate.
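The reference value can be checked with a few lines of Python, assuming d’ is computed in the standard signal detection way as the difference between the z-transformed hit rate (here: power) and the z-transformed false positive rate. The function name `d_prime` is only illustrative.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


if __name__ == "__main__":
    # Power of 0.8 at the nominal false positive rate of 0.05
    print(round(d_prime(0.80, 0.05), 2))  # 2.49
```

Note that `inv_cdf` is only defined for rates strictly between 0 and 1, so simulated rates of exactly 0 or 1 would need to be nudged away from the boundary before computing d’.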

Obviously, what I call here more realistic data peeking is not necessarily a perfect simulation of how data peeking may work in practice. For one thing, I don’t think Dr Peek would have a fixed cut-off of p > 0.1 or p > 0.5. Rather, such a cut-off might be determined on a case-by-case basis, dependent on the prior expectation Dr Peek has that the experiment should yield significant results. (Dr Peek may not use Bayesian statistics, but like all of us he clearly has Bayesian priors.) In some cases, he may be very confident that there should be an effect and he will continue testing for a while but then finally give up when the p-value is very high. For other hypotheses that he considered to be risky to begin with, he may not be very convinced even by marginal p-values and thus will terminate data collection when p > 0.1.

Moreover, it is probably also unrealistic that Dr Peek would start with a sample size of 3. Rather, it seems more likely that he would have a larger minimal sample size in mind, for example 20 and collect that first. While he may have been peeking at the data before he completed testing 20 subjects, there is nothing wrong with that provided he doesn’t stop early if the result becomes significant. Under these conditions the situation becomes somewhat better but the realistic data peeking schemes still have reduced sensitivity, at least for lower true effect sizes, which are presumably far more prevalent in real world situations. The only reason that sensitivity goes up fairly quickly to reasonable levels is that with the starting sample size of 20 subjects, the power to detect those stronger correlations is already fairly high – so in many cases data collection will be terminated as soon as the minimum sample is completed.

Sensitivity

Finally, while I don’t think this plot is entirely necessary, I also show you the false positives / power rates for this latter case. The curves are such beautiful sigmoids that I just cannot help but include them in this post…

Rates

So to sum up, leaving aside the fact that you shouldn’t really peek at your data and stop data collection prematurely in any case, if you do this you can seriously shoot yourself in the foot. While the inflation of false positives through data peeking may have contributed a considerable number of spurious, unreplicable findings to the literature, what is worse, it may very well also have contributed a great number of false negatives to the proverbial file drawer: experiments that were run but failed to produce significant results after peeking a few times and which were then abandoned, never to be heard of again. When it comes to spurious findings in the literature, I suspect the biggest problem is not actually data peeking but other questionable practices from the Garden of Forking Paths, such as tweaking the parameters of an experiment or the analysis.

* Actually it may just be me…

Matlab code for these simulations. Please let me know if you discover the inevitable bugs in this analysis.

On brain transplants, the Matrix, and Dualism

Warning: This post contains spoilers for the movie The Matrix.

Today a tweet by Neuroskeptic pointed me to this post entitled “You are not your brain: Why a head transplant is not what you think it is”. The title initially sparked my interest because it is a topic I have been thinking about a lot. I am actually writing a novel that deals with topics such as the scientific study of unconsciousness, non-free will, and disembodied cognition*. This issue is therefore directly relevant to me.

Unfortunately, this particular post does not really deal with this topic in any depth but only espouses a trivial form of mind-brain dualism. It discusses some cherry-picked findings without any proper understanding of current neuroscientific knowledge and brushes aside most scientific arguments about consciousness as “bizarre” claims, without providing any concrete argument why that is so. Don’t get me wrong, some claims by neuroscientists about free will and consciousness are probably on logically shaky ground, and neuroscientists themselves frequently espouse a form of inadvertent dualism in their own writing about how the brain relates to the mind. However, this post doesn’t really discuss these issues in an adequate way – but go and read it and make up your own mind.

Either way, I think the general thought is intriguing nonetheless. What would actually happen if we could transplant a human brain (or the whole head) into a different body? Let’s ignore for the moment the fact that our surgical technology is nowhere near the point where we could do this with humans and allow the transplanted head to actually control the new host body. Instead let us assume that we can in fact connect up all the peripheral neurons and muscles in the body to the corresponding neurons in the transplanted brain.

Thinking about this already reveals the first problem: there has got to be a mismatch between the number of neurons in the body and the brain. Perhaps this doesn’t matter and some afferent and efferent nerve fibers need not be connected up to the brain, or – vice versa – some of the brain’s neurons need not receive any input or have any targets in the body. If the bulk of the brain is connected up properly perhaps this suffices? In any case, our brains are calibrated to the body and so to place them into a new body must inevitably throw this calibration completely out of whack. Perhaps this can be overcome and a new calibration can emerge but how far this is possible is anybody’s guess.

A related problem is how the brain represents the body in which it has been placed. Somehow we carry in our minds a body image that encodes the space that our body occupies, how it looks, how it feels, etc. There are illusions that distort this representation of our own bodies. Some malfunction or fluke in that system could also explain some out-of-body experiences although it is of course difficult to study such phenomena. It seems however pretty likely that such experiences should be exacerbated in a person whose brain has been transplanted into a new body. Imagine you are 1.5 meters tall but your brain has been transplanted into the body of an NBA player. Your experience of the world through this new, much taller body must inevitably involve far more than simply looking at the world from your new vantage point. After a lifetime of existing in your short body you would have no representation of the sensory experiences related to being 2 meters tall, nor of the feats your muscles are capable of when you can slam dunk. It is possible that we can learn to live in this new corporeal shell but who can know whether that is the case.

In that sense, there may actually be truth to the claim in the aforementioned post that there is some kind of “bodily memory”. For one thing, the flexibility and strength of various muscles is presumably related to what you are doing with them on a daily basis. Who knows, perhaps the various nervous tissues in the body also undergo other forms of synaptic plasticity we don’t yet know about? Of course, none of this suggests – as this post claims – that much of your self is in fact stored inside your body or that you become the host person. The brain is undoubtedly the seat of consciousness and of much of your memory, including the fine procedural or motor memory that you take for granted. But I think it is fair to say that by having your brain wired up to a new body you would certainly experience the world in uniquely different ways. Insofar as your perception affects how you interact with the world this may very well alter your personality and thus really change who you are.

In the same vein I also view other thought experiments, such as the common science-fiction notion that we could one day upload our brains to a computer. Even if we had a computer with the data capacity to store a complete wiring diagram and the synaptic weights of all the neurons in the human brain, and even if neuronal wiring diagrams were all there is to processing in the brain (thus completely ignoring the role of astrocytes, possibly important functional roles of particular ion channel proteins, or of slow neurochemical transmissions), whatever this stores would presumably not really be the person’s mind. Simply running such a network in a computer would effectively cut this brain off from its host body and in this way it would be comparable to the brain transplant situation. It is difficult to imagine what such a brain would in fact experience in silico. To approximate normal functioning you would also have to simulate the sensory inputs and the reciprocal interactions between the simulated brain and the simulated world this brain inhabits. This would be a bit like the Matrix (although that movie does not involve disembodiment). It is hard to imagine what this might really feel like. Quite likely, at least the cruder beta versions of such a simulation would be highly uncanny because they don’t accurately capture real world experience. In fact, this is part of the plot of the Matrix movie because the protagonist senses that something isn’t right with the world.

I find this topic quite fascinating. Whatever the case may be, I think it is safe to say that we are not just our brains. As opposed to the simplistic notion of how the brain works suggested by many science fiction and fantasy stories, our minds aren’t merely software running on brain hardware. Our brain exists inside a body and that presumably must have some influence on the mind. I don’t buy into much of the embodied cognition literature, as a lot of that also seems very simplistic. I certainly agree in large parts with Chaz Firestone and Brian Scholl that it has not really been demonstrated that things like the heaviness of a backpack can affect your perception of the steepness of the hill before you. But at the same time, I think some degree of embodiment must exist, precisely because I am not a dualist. I don’t think there is any evidence to suggest the mind simply floats in the ether, completely removed from the brain and body. Rather it is an emergent property of the brain, a brain that is intricately connected to the rest of the body it resides in (and even that is simplistic: I would in fact say that the brain is part of the body).

Coming back to that post about head transplants, the post is on a website called Religion News and the author is a professor of theological and social ethics. As such it is unsurprising that he discusses a dualist view of the mind and criticizes some of the neuroscientific claims that conflict with that notion. However, his argument is quite odd when you dig a little deeper: rather than saying that your self arises in your brain, the author implicitly suggests that it is inherent to your body – he literally states that the person whose brain is transplanted dies because they are in a new body. He further suggests that any children the patient would have in their new body would not be theirs but those of the host body. While genetically this argument is correct, it completely ignores the fact that there is now a new mind driving the body. Whatever distortions and changes to this mind may result from the brain transplant, it is clearly wrong to claim that the host body would completely override the mind inside the transplanted brain. Yes, biologically the children would be those of the host body but mentally they would be the children of the transplanted mind. Claiming otherwise is equivalent to the suggestion that adoptive parents are not real parents.

In conclusion, I agree that there are some interesting philosophical and theological ramifications to consider about brain transplants. If you believe in the existence of a soul, it is not immediately obvious how you should interpret such a case. I don’t think science can give you the answer to that but that is between you and your rabbi or guru or whoever holds your spirituality together. But I think one thing is clear to me: the soul is not inherently attached to your body any more than it is to your brain. No, you are definitely not just your brain. But you aren’t just your body either.

 

This fella knows how to have a fun time! (Source)

 

(* Work on this is going very slowly so don’t get your hopes up you’ll see any of this anytime soon – it’s more of a lifetime project…)

How funders could encourage replication efforts

As promised, here is a post about science stuff, finally back to a more cheerful and hopeful topic than the dreadful state the world outside science is in right now…

A Dutch research funding agency recently announced a new grant initiative that exclusively funds replication attempts. The idea is to support replication efforts of particularly momentous “cornerstone” research findings. It’s not entirely clear what this means but presumably such findings include highly cited findings, those with great media coverage and public policy impact etc. It isn’t clear who determines whether a finding falls under this.

You can read about this announcement here. In that article you can see some comments by me on how I think funders should encourage replications by requiring that new grant proposals should also contain some replication of previous work. Like most people I believe replication to be one of the pillars supporting science. Before we treat any discovery as important we must know that it is reliable and meaningful. We need to know how far it generalizes or if it is fickle and subject to minor changes in experimental parameters. If you read anything I have written about replication, you will probably already know my view on this: most good research is built on previous findings. This is how science advances. You take some previously observed results and use them to generate new hypotheses to be tested in a new experiment. In order to do so, you should include a replication and/or sanity check condition in your new experiment. This is precisely the suggestion Richard Feynman made in his famous Cargo Cult Science lecture.

Imagine somebody published a finding that people perceive the world as darker when they listen to sad classical music (let’s ignore for the moment the inherent difficulty in actually demonstrating such an effect…). You now want to ask if they also perceive the world as darker when they listen to dark metal. If you simply run the same experiment but replace the music any result you find will be inconclusive. If you don’t find any perceptual effect, it could be that your participant sample simply isn’t affected by music. The only way to rule this out is to also include the sad classical music condition in your experiment to test whether this claim actually replicates. Importantly, even if you do find a perceptual effect of dark metal music, the same problem applies. While you could argue that this is a conceptual replication, if you don’t know that you could actually replicate the original effect of classical music, it is impossible to know that you really found the same phenomenon.

My idea is that when applying for funding we should be far more explicit about how the proposal builds on past research and, insofar as this is feasible, build more replication attempts into the proposed experiments. Critically, if you fail to replicate those experiments, this would in itself be an important finding that should be added to the scientific record. The funding thus implicitly sets aside some resources for replication attempts to validate previous claims. However, this approach also supports the advance of science because every proposal is nevertheless designed to test novel hypotheses. This stands in clear contrast to pure replication efforts such as those this Dutch initiative advocates or the various large-scale replication efforts like the RPP and Many Labs project. While I think these efforts clearly have value, one major concern I have with them is that they seem to stagnate scientific progress. They highlighted a lack of replicability in the current literature and it is undoubtedly important to flag that up. But surely this cannot be the way we will continue to do science from now on. Should we have a new RPP every 10 years now? And who decides which findings should be replicated? I don’t think we should really care whether every single surprising claim is replicated. Only the ones that are in fact in need of validation because they have an impact on science and society probably need to be replicated. But determining what makes a cornerstone discovery is not really that trivial.

That is not to say that such pure replication attempts should no longer happen or that they should receive no funding at all. If anyone is happy to give you money to replicate some result, by all means do so. However, my suggestion differs from these large-scale efforts and the Dutch initiative in that it treats replication the way it should be treated, as an essential part of all research, rather than as a special effort that is somehow separate from the rest. Most research would only be funded if it is explicit about which previous findings it builds on. This inherently also answers the question which previous claims should be replicated: only those findings that are deemed important enough by other researchers to motivate new research are sufficiently important for replication attempts.

Perhaps most crucially, encouraging replication in this way will help to break down the perceived polarization between the replicators and original authors of high-impact research claims. While I doubt many scientists who published replications actually see themselves as a “replication police,” we continue to rehash these discussions. Many replication attempts are also suspected of being motivated by mistrust in the original claim. Not that there is really anything wrong with that because surely healthy skepticism is important in science. However, whether justified or not, skepticism of previous claims can lead to the perception that the replicators were biased and the outcome of the replication was a self-fulfilling prophecy. My suggestion would mitigate this problem at least to a large degree because most grant proposals would at least seek to replicate results that have a fighting chance of being true.

In the Nature article about this Dutch initiative there are also comments from Dan Gilbert, a vocal critic of the large-scale replication efforts. He bemoans that such replication research is based on its “unoriginality” and suspects that we will learn more about the universe by spending money on “exploring important new ideas.” I think this betrays the same false dichotomy I described above. I certainly agree with Gilbert that the goal of science should be to advance our understanding of the world but originality is not really the only objective here. Scientific claims must also be valid and generalize beyond very specific experimental contexts and parameters. In my view, both are equally important for healthy science. As such, there is not a problem with the Dutch initiative but it seems rather gimmicky to me and I am unconvinced its effects will be lasting. Instead I believe the only way to encourage active and on-going replication efforts will be to overhaul the funding structure as I outlined here.

52% seems barely above chance. Someone should try to replicate that stupid referendum.

The bottom line

So I haven’t posted in a while, first because I was depressed and lethargic from the dreadful outcome of the EU referendum, and then because I was busy with actual work. I was considering writing a post about how direct democracy has the same problems as citizen science (thanks to Chris Chambers for inspiring that thought a little) but then I don’t feel like it right now.

There isn’t much left to be said about “Brexit” (how I hate that word) that others haven’t already said. The bottom line is, it is highly likely to seriously hurt British science and, I wager, also Britain in general. It seems the political will isn’t there to simply slide into EEA membership (which would keep freedom of movement) and any other solutions appear to be a terrible deal for the UK, for the EU, and for science. What exactly will happen nobody can predict (as you know I don’t believe in precognition) so we’ll just have to wait and see. Except we don’t have to wait and see for it here. I don’t really see why I should suffer the consequences of a referendum I wasn’t even allowed to vote in despite being a settled and contributing member of society. It is too early to make any rash decisions but I can certainly perceive greener pastures elsewhere…

For the time being, however, I have merely decided to switch to American spelling. This is not reawakening the Devil’s Neuroscientist (she also used American English). It’s just a protest. And, perhaps, depending on how the US elections in November go I may have to change it back… On the bright side, my next post will presumably be about something sciency.

I am currently considering petitioning UCL to open a branch in Gibraltar given that this region will almost certainly have to get some special status after the UK leaves the EU

Six flawed arguments for leaving the EU

As anyone who reads this blog probably knows, the UK will hold a referendum about its continued membership in the EU later this month, on 23rd June. I already discussed my views on this in my previous post, so I won’t go into any depth on that here. The discussion is raging, not only in the media but no doubt in many family homes and workplaces (would be curious to be a fly on the wall when Boris Johnson and his brother, science minister Jo Johnson, talk about this in private…). I do think I have said most that I can say about it already – but I keep hearing the same tired, naïve arguments over and over. So I’ll write something about it, one last time before putting my future career, my civil rights, and most likely my continued life in this country in the hands of voters. Here I address six flawed arguments for leaving the EU:

1. “It will change everything”

Actually, most likely nothing major will happen at all. By far the most likely scenario is that the UK leaves the EU, and then joins the EEA in which free movement of people remains in return for having full access to the single market. EU citizens in the UK will retain the same rights they had previously. Parliament comprises a large number of MPs from Labour, the Lib Dems, the SNP, and one (I think?) from the Greens, plus a healthy number of Europhile Conservatives. This means that this outcome is essentially guaranteed, at least until the next general election (and even then it seems highly unlikely that this situation will change dramatically). Of course, the UK would nevertheless give up its rights to influence EU policy. Sounds like a rotten deal to me. Anyway, leaving this aside, in the remainder of this post I will pretend that a vote to leave the EU will mean also an end to freedom of movement, which is the illusory scenario the Leave campaign is peddling.

(Update 11 June 2016: The above EEA scenario of course assumes that the UK is allowed to remain in the single market. Wolfgang Schäuble seems to think that isn’t going to happen. I don’t agree with Schäuble much about anything but then again he is also highly influential in EU politics so it’s difficult to know what to think about his argument.)

2. “We can spend the money we save on UK science”

One reason I and many scientists are vehemently opposing this nostalgic independence nonsense is that a great deal of British science funding comes from the EU and that science in the UK would suffer if that were lost. An oft-repeated counterargument to this is that by leaving the EU the UK would no longer pay contributions to European funds and could thus use those savings to spend on British science. This is based on false economy and wishful thinking. The UK brings in more science funding than it pays in, so it would have to increase its science funding. When was the last time a British government did that? Do you honestly think it is likely they will do that now? Of course this argument is not even taking into account the strain on the economy now. It also ignores the likely hit the economy will take after leaving which will reduce and quite possibly wipe out any potential savings. And it blatantly neglects the substantial cost that the UK must pay to leave the EU in the first place. None of these things suggest there will be lots of spare pennies to fund UK research and development. (For similar reasons I also don’t believe this money will be used for the NHS or building homes but that’s outside the scope of my post).

3. “We will be free of EU bureaucracy”

Science has always been collaborative and it is increasingly so in our age. We need international science projects and the EU science initiatives (which go well beyond EU member states) can facilitate this far better than any single national body could. So the UK will quite likely continue to contribute to those initiatives, just as other non-EU countries (like Switzerland) are contributing – without any say in its direction.

4. “Scientists can still collaborate”

Funding is a big factor in science and the cynics on the Leave side are probably right that it is one of the driving factors why all vice-chancellors and governing bodies of British universities want the UK to stay in the EU. But it’s not just about that. Because science is collaborative and international, universities and research centres are usually extremely multinational. This may be especially true in English-speaking countries and this ability to attract bright minds from all over the world is what boosts British science output (e.g. a large proportion of research grants brought to UK universities are brought in by people who are not UK citizens). You do not help this by putting up barriers. Leave campaigners like to talk about “point-based immigration systems” that would allow the UK to hire people in professions it needs and make it possible for excellent students to come here. Sure, because the best thing is always to have more bureaucracy and paperwork! That will doubtless attract great applicants who could instead be free to move to Paris, Berlin – or Dublin.

5. “EU citizens already living here can stay”

Much of this referendum debate has focused on immigration. Recent years have seen unprecedented immigration of people from other EU nations (although this still only accounts for around half of overall immigration to the UK). It is not surprising that this could cause some issues and concerns. More people making demands on the health system, on housing, or on jobs may strain the country’s capacity. Stopping EU immigration dead in its tracks will perhaps relieve this strain – however, one question Leave campaigners steadfastly refuse to address is what happens to the people who are already here. Unless they all pack and leave voluntarily on 24th June they will still put a strain on the capacity for some time to come. One argument I often hear is “nobody will be kicked out”. However, non-EU citizens are being deported left and right, sometimes for ludicrous reasons and in ludicrous ways. Under the Reign of Terroresa May, neither having a doctorate nor having a British spouse necessarily protects you from this. Unless some sort of special agreement is negotiated, the same rules will apply to EU citizens if the UK leaves the EU. There is a lot of conflicting information out there, the most insidious of which is blatant (but presumably lucrative?) scare-mongering by law firms pushing people to apply for citizenship. Now, I don’t think many EU citizens will be deported, especially not those who are already settled here. But Leave campaigners show an obvious disconnect: On the one hand, they seem to believe that by leaving the EU the burden on the NHS and housing is magically lifted. On the other hand, they (at least the sane ones) maintain that there won’t be any mass-deportation of the very people they blame for this burden.

6. “We will regain our sovereignty”

The UK still is, and will remain, a sovereign nation insofar as such a thing exists in this globalised world. I wasn’t overly impressed by David Cameron’s performance in that cringe-worthy ITV townhall meeting but one compelling answer he gave is that voting to Leave the EU will give an illusion of independence from foreign powers whilst sacrificing actual influence on the world and European stage. I call this the Libertarian Fallacy because it is the same faulty logic that leads many self-declared Libertarians to oppose all sorts of policies in the name of “liberty” without achieving any individual freedom at all. It’s the reasoning that allows some to decry background checks on guns as tyranny but see no problem with strict tests for driving licenses. It’s the cognitive dissonance in which citizen ID cards evoke the spectre of fascist dictatorship but nobody worries about the far less controlled surveillance via credit card transactions or online activities. Whatever utopian dreams you may have about a “sovereign” UK after EU exit, it will lose its seat at the table and have reduced sway in any decision-making process in Europe – and by extension also in the world. Perhaps it’s fine with many to be an isolated island in a big sea dominated by China and the US, and a new Russian empire rattling its sabres. Fine, not all nations need to be world players. Perhaps these big guys will even leave you in peace. But don’t think for a second that by leaving the EU Britannia will rule the waves again.

Uncertain times

On 23rd June 2016 the citizens of the United Kingdom (plus immigrants from Commonwealth nations and the Republic of Ireland) will vote to decide if the UK should remain a member of the European Union. Colloquially, this is known as the “Brexit” debate. I refuse to use this horrible term again, not only because it sounds like a breakfast cereal but also because it’s a misnomer: the decision is not about Britain but the whole of the UK.

Let me be straight: I am a strong and vocal supporter of the UK staying in the EU. As an immigrant (I also won’t use the offensive term “migrant”) from an EU country, a Leave vote would have direct consequences for my life in the country I have called home for almost two decades. I would inevitably lose some of the rights I have enjoyed since then. The EU is what made it possible for me to study, live and work here, it allowed me to spend a year in yet another EU country during my studies, and it made my life easier in countless ways, not least of all the simplicity of crossing the borders. All of these things apply to all EU citizens so from a purely selfish perspective all of them should also support it. A lot of things we take for granted are a direct consequence of the civil liberties the EU guarantees.

The whole public debate surrounding this issue has been characterised by panic mongering and inane bickering from both sides. From deliberate obfuscations (“If we leave the EU every household will lose £4,300!”) to outright lies (“We pay £350 million a week to Brussels!”) both camps are painting nightmare scenarios of what will happen if the other side wins. Add to that all of the rubbish Boris Johnson dreams up on a typical day that is too delusional to even constitute a lie.

The truth of the matter is this: nobody has a damn clue what will happen if the UK leaves the EU. There is no precedent for a country leaving the bloc. Some European countries like Switzerland, Norway, and Liechtenstein are not EU members so they can give us an idea of what the relationship between the UK and the remainder of the EU could look like in the future. However, these are all also very different countries than the UK. They have far smaller populations, have very different economies and societies, and they have existed outside the EU whilst the UK has for decades been an integral, albeit reluctant, member. The only thing Liechtenstein has in common with the UK is its national anthem. The only thing we do know is that most of these countries have a relationship with the EU that permits free movement of people. Since one of the main arguments put forth in support of leaving the EU is “regaining control of our borders,” it actually seems very unlikely that any dramatic change in border control can be achieved this way. Nevertheless, we can’t know what will happen.

This is why it worries me and why it should worry you also. The future is very uncertain and leaving the EU will be a very big risk. The doomsday scenarios painted by either side are extreme and frankly also insult our collective intelligence. Anyone who tells you what terrible consequences a Leave or Remain decision will have is either lying or – at best – completely delusional. In any case, you shouldn’t listen to them. The Remain camp have one thing going for them though: they support the status quo and voting to Remain is the conservative decision. Whatever propaganda Leave proponents may spout about it, it is unlikely that any of their horrors will become a reality if the UK stays in the EU. Rather, things will presumably not really change much from how they are now. Voting Leave is by far the riskier and more radical thing to do. That said, I doubt it will have disastrous consequences either. The initial negotiations will be difficult and cumbersome. The process of leaving will also cost a lot of money, at least in the short run, money that won’t be saved by not paying into EU coffers (for one thing because the UK will continue to pay while these negotiations take place). For science and technology, I believe the consequences will be painful as it will make it much harder to access large European research grants (which the UK also cannot simply make up for by no longer paying in) and – what is worse – it will add the bureaucracy and administrative burden of turning skilled workers from EU countries into immigrants from overseas, which will stifle collaboration to some extent. So no, leaving the EU will probably not ruin the UK. But don’t fool yourself into thinking that it won’t have bad consequences.

As far as I am concerned, I recently started the process of naturalisation to become a British citizen. I had originally wanted to complete this before the referendum so I could vote in it. I held back on this for ages because for many years my country of origin did not allow dual citizenship (guess which supranational organisation is to thank for that being possible now?) and also because it’s damned expensive. Now I am too late to get there before 23rd June. If the UK stays in the EU, I will most likely finish the process. This place is my home and I am tired of being subject to taxation without representation. I feel it’s about time I can fully shape the future of this country with my votes.

Of course, gaining citizenship will be far more useful if the UK indeed leaves the EU. People like me would most likely lose the right to vote in other elections we could vote in until now. It is far less clear how residence rights will change. As I already said, if the UK remains in the EEA, free movement rights are unlikely to be affected at all. But if “control of the borders” is “regained” this would change. Will our automatic permanent resident status be carried over? It does seem improbable that we would suddenly be asked to apply for visas or indefinite leave as such a change would result in complete chaos. It is quite likely though that some additional bureaucratic hurdles would be erected because that’s just what governments do. But honestly, I don’t think I’ll go through with naturalisation if the UK leaves the EU. Melodramatic as it sounds (because screw it, this whole debate has been plagued with melodrama and over-emotional rhetoric) a vote to Leave is a statement that people like me are not really welcome in this country. If the UK votes to Leave, I will most likely choose to leave the UK.

But don’t get me wrong. I do think this referendum is a good idea. For one thing, it is democratic. More importantly, I don’t actually think that Britons will vote to Leave. The polls suggest a close race but the Remain camp has been steadily ahead. As the graph below shows, the fewer people answer “Don’t know” in a poll, the farther the Remain vote is ahead of the Leave vote. It is probably simplistic to read too much into this because this must depend on the particular poll but it could support the interpretation that people vote conservatively. Undecided voters are unlikely to choose the radical option on referendum day.

In my view, this referendum is actually way overdue. If the Leaves have it, then yes, the British will have spoken that European integration has gone too far for them. But if the Remain votes win this will hopefully finally put to rest a common Eurosceptic assertion that the UK originally only “voted to join the Common Market.” It is about bloody time that this discussion moved into the 21st century.

Figure (EuroPolls): Difference between Remain and Leave votes compared to the percentage of Don’t know votes. (Polls without Don’t know votes have been removed). Source: Poll of Polls

3 scoops of vanilla science in a low impact waffle please

A lot of young[1] researchers are worried about being “scooped”. No, this is not about something unpleasantly kinky but about when some other lab publishes an experiment that is very similar to yours before you do. Sometimes this is even more than just a worry and it actually happens. I know that this could be depressing. You’ve invested months or years of work and sleepless nights in this project and then somebody else comes along and publishes something similar and – poof – all the novelty is gone. Your science career is over. You will never publish this high impact now. You won’t ever get a grant. Immeasurable effort down the drain. Might as well give up, sell your soul to the Devil, and get a slave job in the pharmaceutical industry and get rich[2].

Except that this is total crap. There is no such thing as being scooped in this way, or at least if there is, it is not the end of your scientific career. In this post I want to briefly explain why I think so. This won’t be a lecture on the merits of open science, on replications, on how we should care more about the truth than about novelty and “sexiness”. All of these things are undoubtedly true in my mind and they are things we as a community should be actively working to change – but this is no help to young scientists who are still trying to make a name for themselves in a system that continues to reward high impact publications over substance.

No. Here I will talk about this issue with respect to the status quo. Even in the current system, imperfect as it may be, this irrational fear is in my view unfounded. It is essential to dispel these myths about impact and novelty, about how precedence is tied to your career prospects. Early career scientists are the future of science. How can we ever hope to change science for the better if we allow this sort of madness to live on in the next generation of scientists? I say ‘live on’ for good reason – I, too, used to suffer from this madness when I was a graduate student and postdoc.

Why did I have this madness? Honestly I couldn’t say. Perhaps it’s a natural evolution of young researchers, at least in our current system. People like to point the finger at the lab PIs pressuring you into this sort of crazy behaviour. But that wasn’t it for me. For most of my postdoc I worked with Geraint Rees at UCL and perhaps the best thing he ever told me was to fucking chill[3]. He taught me – more by example than words – that while having a successful career was useful, what is much more important is to remember why you’re doing it: The point of having a (reasonably successful) science career is to be able to pay the rent/mortgage and take some enjoyment out of this life you’ve been given. The reason I do science, rather than making a mint in the pharma industry[4], is that I am genuinely curious and want to figure shit out.

Guess what? Neither of these things depends on whether somebody else publishes a similar (or even very similar) experiment while you’re still running it. We all know that novelty still matters to a lot of journals. Some have been very reluctant to publish replication attempts. I agree that publishing high impact papers does help wedge your foot in the door (that is, get you short-listed) in grant and job applications. But even if this were all that matters to be a successful scientist (and it really isn’t), here’s why you shouldn’t care too much about that anyway:

No paper was ever rejected because it was scooped

While journal editors will reject papers because they aren’t “novel,” I have never seen any paper being rejected because somebody else published something similar a few months earlier. Most editors and reviewers will not even be aware of the scooping study. You may find this hard to believe because you think your own research is so timely and important, but statistically it is true. Of course, some reviewers will know of the work. But most reviewers are not actually bad people and will not say “Something like this was published three months ago already and therefore this is not interesting.” Again, you may find this hard to believe because we’ve all heard too many stories of Reviewer 2 being an asshole. But in the end, most people aren’t that big of an asshole[5]. It happens quite frequently that I suggest in reviews that the authors cite some recently published work (usually not my own, in case you were wondering) that is very similar to theirs. In my experience this has never led to a rejection; rather, I ask them to put their results in the context of similar findings in the literature. You know, the way a Discussion section should be.

No two scooped studies are the same

You may think that the scooper’s experiment was very similar, but unless they actually stole your idea (a whole different story I also don’t believe but I have no time for this now…) and essentially pre-replicated (preclicated?) your design, I’d bet that there are still significant differences. Your study has not lost any of its value because of this. And it’s certainly no reason to quit and/or be depressed.

It’s actually a compliment

Not 100% sure about this one. Scientific curiosity shouldn’t have anything to do with a popularity contest if you ask me. Study whatever the hell you want to (within ethical limits, that is). But I admit, it feels reassuring to me when other people agree that the research questions I am interested in are also interesting to them. For one thing, this means that they will appreciate you working and (eventually) publishing on it, which again from a pragmatic point of view means that you can pay those rents/mortgages. And from a simple vanity perspective it is also reassuring that you’re not completely mad for pursuing a particular research question.

It has little to do with publishing high impact

Honestly, from what I can tell neither precedence nor the popularity of your topic is the critical factor in getting your work into high impact journals. The novelty of your techniques, how surprising and/or reassuringly expected your results are, and the simplicity of the narrative are actually major factors. Moreover, the place you work, the co-authors with whom you write your papers, and the accessibility of the writing (in particular your cover letter to the editors!) definitely matter a great deal also (and these are not independent of the first points either…). It is quite possible that your “rival”[6] will publish first, but that doesn’t mean you won’t publish similar work in a higher impact journal. Journal review outcome is pretty stochastic and not really very predictable.

Actual decisions are not based on this

We all hear the horror stories of impact factors and h-indexes determining your success with grant applications and hiring decisions. Even if this were true (and I actually have my doubts that it is as black and white as this), a CV with lots of high impact publications may get your foot in the door – but it does not absolve the panel from making a hiring/funding decision. You need to do the work on that one yourself and even then luck may be against you (the odds certainly are). It also simply is not true that most people are looking for the person with the most Nature papers. Instead I bet you they are looking for people who can string together a coherent argument, communicate their ideas, and who have the drive and intellect to be a good researcher. Applicants with a long list of high impact papers may still come up with awful grant proposals or do terribly in job interviews while people with less stellar publication records can demonstrate their excellence in other ways. You may already have made a name for yourself in your field anyway, through conferences, social media, public engagement etc. This may matter far more than any high impact paper could.

There are more important things

And now we’re coming back to the work-life balance and why you’re doing this in the first place. Honestly, who the hell cares whether someone else published this a few months earlier? Is being the first to do this the reason you’re doing science? I can see the excitement of discovery but let’s face it, most of our research is neither like the work of Einstein or Newton nor are we discovering extraterrestrial life. Your discovery is no doubt exciting to you, it is hopefully exciting to some other scientists in your little bubble and it may even be exciting to some journalist who will write a distorting, simplifying article about it for the mainstream news. But seriously, it’s not so groundbreaking that it is worth sacrificing your mental and physical health over it. Live your life. Spend time with your family. Be good to your fellow creatures on this planet. By all means, don’t be complacent, ensure you make a living but don’t pressure yourself into believing that publishing ultra-high impact papers is the meaning of life.

A positive suggestion for next time…

Now if you’re really worried about this sort of thing, why not preregister your experiment? I know I said I wouldn’t talk about open science here but bear with me just this once because this is a practical point you can implement today. As I keep saying, the whole discussion about preregistration is dominated by talking about “questionable research practices”, HARKing, and all that junk. Not that these aren’t worthwhile concerns but this is a lot of negativity. There are plenty of positive reasons why preregistration can help and the (fallacious) fear of being scooped is one of them. Preregistration does not stop anyone else from publishing the same experiment before you but it does allow you to demonstrate that you had thought of the idea before they published it. With Registered Reports it becomes irrelevant if someone else published before you because your publication is guaranteed after the method has been reviewed. And I believe it will also make it far clearer to everyone how much who published what first where actually matters in the big scheme of things.

[1] Actually there are a lot of old and experienced researchers who worry about this too. And that is far worse than when early career researchers do it because they should really know better and they shouldn’t feel the same career pressures.
[2] It may sound appealing now, but thinking about it I wouldn’t trade my current professional life for anything. Except for grant admin bureaucracy perhaps. I would happily give that up at any price…:/
[3] He didn’t quite say it in those terms.
[4] This doesn’t actually happen. If you want to make a mint you need to go into scientific publishing but the whole open science movement is screwing up that opportunity now as well so you may be out of luck!
[5] Don’t bombard me with “Reviewer 2 held up my paper to publish theirs first” stories. Unless Reviewer 2 signed their review or told you specifically that it was them I don’t take such stories at face value.
[6] The sooner we stop thinking of other scientists in those terms the better for all of us.
