
Angels in our midst?

A little more on “tone” – but also some science

This post is somewhat related to the last one and will be my last words on the tone debate*. I am sorry if calling it the “tone debate” makes some people feel excluded from participating in scientific discourse. I thought my last post was crystal clear that science should be maximally inclusive, that everyone has the right to complain about things they believe to be wrong, and that unacceptable behaviour should be called out. And certainly, I believe that those with the most influence have a moral obligation to defend those who are in a weaker position (with great power comes great responsibility, etc…). It is how I have always tried to act. In fact, not so long ago I called out a particularly bullish but powerful individual because, in my estimation (and, for that matter, that of many other people), he repeatedly acts grossly inappropriately in post-publication peer review. In response, I and others have taken a fair bit of abuse from said person. Speaking more generally, I also feel that as a PI I have a responsibility to support those junior to me. I think my students and postdocs can all stand up for themselves, and I would support them in doing so, but in any direct confrontation I’ll be their first line of defence. I don’t think many who have criticised the “tone debate” would disagree with this.

The problem with arguments about tone is that they are often very subjective. The case I mentioned above is pretty clear cut; many other situations are much greyer. More importantly, all too often “tone” is put forth as a means to silence criticism. Quite contrary to the argument that this “excludes” underrepresented groups from participating in the debate, it is used to categorically dismiss any dissenting views. In my experience, the people making these arguments are almost always people in positions of power.

A recent example of the tone debate

One of the many events that recently brought the question of tone to my mind was this tweet by Tom Wallis. On PubPeer** a Lydia Maniatis has been posting comments on what seems to be just about every paper published on psychophysical vision science.

I find a lot of things to be wrong with Dr Maniatis’ comments. First and foremost, it remains a mystery to me what actual point she is trying to make. I confess I would first have to read some of the literature she cites to comprehend the fundamental problem with vision science that she clearly believes she has identified. Who knows, she might have an important theoretical point, but it eludes me. This may very well be due to my own deficiency, but it would help if she spelled it out more clearly for unenlightened readers.

The second problem with her comments is that they are in many places clearly uninformed with regard to the subject matter. It is difficult to argue with someone about the choices and underlying assumptions behind a particular model of the data when they seemingly misapprehend what these parameters are. This is not an insurmountable problem, and it may also partly originate in the lack of clarity with which these things are described in publications. Try as you might***, to some degree your method sections will always make tacit assumptions about the methodological knowledge of the reader. A related issue is that she picks seemingly random statements from papers and counters them with quotes from other papers that often do not really support her point.

The third problem is the sheer volume of Maniatis’ comments! I probably can’t talk, as I am known to write verbose blogs myself – but conciseness is a virtue in communication, and in my scientific writing, whether in manuscripts or reviews, I certainly aim for it. Her comments on this paper by my colleague John Greenwood are a perfect example: by my count she expends 5262 words before giving John a chance to respond! Now perhaps the problems with that paper are so gigantic that this is justified, but somehow I doubt it. Maniatis’ concern seems to be with the general theoretical background of the field, and it seems to me that a paper or even a continuous blog would be a far better way to communicate her concerns than targeting one particular paper with this deluge. Even if the paper were a perfect example of the fundamental problem, it is hard to see the forest for the trees here. Furthermore, it considerably lowers the signal-to-noise ratio of the PubPeer thread. If someone had an actual specific concern, say because they had identified a major statistical flaw, it would be very hard to see it in this sea of Maniatis. Fortunately most of her other comments on PubPeer aren’t as extensive, but they are still long and the same issue applies.

Why am I talking about this? Well, a fourth problem that people have raised is that her “tone” is unacceptable (see for example here). I disagree. If there is one thing I don’t take issue with it is her tone. Don’t get me wrong: I do not like her tone. I also think that her criticisms are aggressive, hostile, and unnecessarily inflammatory. Does this mean we can just brush aside her comments and ignore her immediately? It most certainly doesn’t. Even if her comments were the kind of crude bullying some other unpleasant characters in the post-publication peer review sphere are guilty of (like that bullish person I mentioned above), we should at least try to extract the meaning. If someone continues to be nasty after being called out on it, I think it is best to ignore them. In particularly bad cases they should be banned from participating in the debate. No fruitful discussion will happen with someone who just showers you in ad hominems. However, none of that categorically invalidates the arguments they make underneath all that rubbish.

Maniatis’ comments are aggressive and uncalled for. I do not, however, think they are nasty. I would prefer it if she “toned it down”, as they say, but I can live with how she says what she says (though of course YMMV). The point is that the other three issues I described above are what concern me, not her tone. To address them I see these solutions: first, I need to read some of the literature her criticisms are based on to try to understand where she is coming from. Second, people in the field need to explain to her the points of apparent misunderstanding; if she refuses to engage or to acknowledge them, then it is best to ignore her. Third, the signal-to-noise ratio of PubPeer comments could be improved by better filtering, for example by muting a commenter the way you can on Twitter. If PubPeer doesn’t implement that, then perhaps it could be achieved with a browser plug-in.

You promised there would be some science!

Yes I did. I am sorry it took so long to get here but I will briefly discuss a quote from Maniatis’ latest comment on John’s paper:

Let’s suppose that the movement of heavenly bodies is due to pushing by angels, and that some of these angels are lazier than others. We may then measure the relative motions of these bodies, fit them to functions, infer the energy with which each angel is pushing his or her planet, and report our “angel energy” findings. We may ignore logical arguments against the angel hypothesis. When, in future measurements, changes in motion are observed that makes the fit to our functions less good, we can add assumptions, such as that angels sometimes take a break, causing a lapse in their performance. And we can report these inferences as well. If discrepancies can’t be managed with quantitative fixes, we can just “hush them up.”

I may disagree with (and fail to understand) most of her criticisms, but I really like this analogy. It actually reminds me of an example I used when commenting on Psi research and which I also use in my teaching about the scientific method: the difference between the heliocentric and geocentric models of planetary movements, which illustrates Occam’s Razor, explanatory power, and the trade-off with model complexity. Maniatis’ angels are a perfect example of how we can update our models to account for new observations by increasing their complexity and overfitting the noise. The best possible model, however, should maximise explanatory power while minimising assumptions. If we can account for planetary motion without assuming the existence of angels, we may be on the right track (as disappointing as that is).
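For readers who like to see this in practice, here is a minimal toy sketch in Python of the trade-off the angel analogy points at; all numbers are made up and have nothing to do with any of the papers discussed. A wildly flexible “angel-powered” model will always fit the observations you already have at least as well as a simple law, but it tends to pay for that by predicting new observations far worse.

import numpy as np

rng = np.random.default_rng(1)

# Toy "planetary" observations: a simple underlying law plus measurement noise
t = np.linspace(0, 10, 40)
y = 2.0 + 0.5 * t + rng.normal(0, 0.5, t.size)

# Future observations that the models have not seen yet
t_new = np.linspace(10, 12, 10)
y_new = 2.0 + 0.5 * t_new + rng.normal(0, 0.5, t_new.size)

for degree in (1, 9):  # simple law vs over-parameterised "angel" model
    coefs = np.polyfit(t, y, degree)
    fit_err = np.mean((np.polyval(coefs, t) - y) ** 2)        # error on the data used for fitting
    pred_err = np.mean((np.polyval(coefs, t_new) - y_new) ** 2)  # error on new observations
    print(f"degree {degree}: fit error {fit_err:.2f}, prediction error {pred_err:.2f}")

The degree-9 model hugs the old data more closely but falls apart on the new points. Formal model comparison criteria like AIC or BIC express the same intuition by explicitly penalising every extra free parameter – every extra angel – a model needs.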

It won’t surprise you when I say I don’t believe Maniatis’ criticism applies to vision science. Our angels are supported by a long list of converging scientific observations and I think that if we remove them from the model the explanatory power of the models goes down and the complexity increases. Or at least Maniatis hasn’t made it clear why that isn’t the case. However, leaving this specific case aside, I do like the analogy a lot. There you go, I actually discussed science for a change.

* I expect someone to hold me to this!
** She also commented on PubMed Central but apparently her account there has been blocked.
*** But this is no reason not to try harder.


3 scoops of vanilla science in a low impact waffle please

A lot of young[1] researchers are worried about being “scooped”. No, this is not about something unpleasantly kinky but about another lab publishing an experiment very similar to yours before you do. Sometimes this is more than just a worry and it actually happens. I know this can be depressing. You’ve invested months or years of work and sleepless nights in this project and then somebody else comes along and publishes something similar and – poof – all the novelty is gone. Your science career is over. You will never publish this high impact now. You won’t ever get a grant. Immeasurable effort down the drain. Might as well give up, sell your soul to the Devil, take a slave job in the pharmaceutical industry, and get rich[2].

Except that this is total crap. There is no such thing as being scooped in this way, or at least if there is, it is not the end of your scientific career. In this post I want to briefly explain why I think so. This won’t be a lecture on the merits of open science, on replications, or on how we should care more about the truth than about novelty and “sexiness”. All of these things are undoubtedly important in my mind and the community should be actively working towards them – but this is no help to young scientists who are still trying to make a name for themselves in a system that continues to reward high impact publications over substance.

No. Here I will talk about this issue with respect to the status quo. Even in the current system, imperfect as it may be, this fear is in my view irrational and unfounded. It is essential to dispel these myths about impact and novelty, about how precedence is tied to your career prospects. Early career scientists are the future of science. How can we ever hope to change science for the better if we allow this sort of madness to live on in the next generation of scientists? I say ‘live on’ for good reason – I, too, used to suffer from this madness when I was a graduate student and postdoc.

Why did I have this madness? Honestly I couldn’t say. Perhaps it’s a natural evolution of young researchers, at least in our current system. People like to point the finger at the lab PIs pressuring you into this sort of crazy behaviour. But that wasn’t it for me. For most of my postdoc I worked with Geraint Rees at UCL and perhaps the best thing he ever told me was to fucking chill[3]. He taught me – more by example than words – that while having a successful career was useful, what is much more important is to remember why you’re doing it: The point of having a (reasonably successful) science career is to be able to pay the rent/mortgage and take some enjoyment out of this life you’ve been given. The reason I do science, rather than making a mint in the pharma industry[4], is that I am genuinely curious and want to figure shit out.

Guess what? Neither of these things depends on whether somebody else publishes a similar (or even very similar) experiment while you’re still running it. We all know that novelty still matters to a lot of journals, and some have been very reluctant to publish replication attempts. I agree that publishing high impact papers does help wedge your foot in the door (that is, get you short-listed) in grant and job applications. But even if this were all that matters to being a successful scientist (and it really isn’t), here’s why you shouldn’t care too much about it anyway:

No paper was ever rejected because it was scooped

While journal editors will reject papers because they aren’t “novel,” I have never seen a paper rejected because somebody else published something similar a few months earlier. Most editors and reviewers will not even be aware of the scooping study. You may find this hard to believe because you think your own research is so timely and important, but statistically it is true. Of course, some reviewers will know of the work. But most reviewers are not actually bad people and will not say “Something like this was published three months ago already and therefore this is not interesting.” Again, you may find this hard to believe because we’ve all heard too many stories of Reviewer 2 being an asshole. But in the end, most people aren’t that big of an asshole[5]. It happens quite frequently that I suggest in reviews that the authors cite some recently published work (usually not my own, in case you were wondering) that is very similar to theirs. In my experience this has never led to a rejection; I simply ask them to put their results in the context of similar findings in the literature. You know, the way a Discussion section should be.

No two scooped studies are the same

You may think that the scooper’s experiment was very similar, but unless they actually stole your idea (a whole different story I also don’t believe but I have no time for this now…) and essentially pre-replicated (preclicated?) your design, I’d bet that there are still significant differences. Your study has not lost any of its value because of this. And it’s certainly no reason to quit and/or be depressed.

It’s actually a compliment

Not 100% sure about this one. Scientific curiosity shouldn’t have anything to do with a popularity contest if you ask me. Study whatever the hell you want to (within ethical limits, that is). But I admit, it feels reassuring to me when other people agree that the research questions I am interested in are also interesting to them. For one thing, this means that they will appreciate you working and (eventually) publishing on it, which again from a pragmatic point of view means that you can pay those rents/mortgages. And from a simple vanity perspective it is also reassuring that you’re not completely mad for pursuing a particular research question.

It has little to do with publishing high impact

Honestly, from what I can tell neither precedence nor the popularity of your topic are the critical factors in getting your work into high impact journals. The novelty of your techniques, how surprising and/or reassuringly expected your results are, and the simplicity of the narrative are actually major factors. Moreover, the place where you work, the co-authors with whom you write your papers, and the accessibility of the writing (in particular your cover letter to the editors!) also matter a great deal (and these are not independent of the first points either…). It is quite possible that your “rival”[6] will publish first, but that doesn’t mean you won’t publish similar work in a higher impact journal. Journal review outcomes are pretty stochastic and not very predictable.

Actual decisions are not based on this

We all hear the horror stories of impact factors and h-indexes determining your success with grant applications and hiring decisions. Even if this were true (and I actually have my doubts that it is as black and white as this), a CV with lots of high impact publications may get your foot in the door – but it does not absolve the panel from making a hiring/funding decision. You need to do the work on that one yourself and even then luck may be against you (the odds certainly are). It also simply is not true that most people are looking for the person with the most Nature papers. Instead I bet you they are looking for people who can string together a coherent argument, communicate their ideas, and who have the drive and intellect to be a good researcher. Applicants with a long list of high impact papers may still come up with awful grant proposals or do terribly in job interviews while people with less stellar publication records can demonstrate their excellence in other ways. You may already have made a name for yourself in your field anyway, through conferences, social media, public engagement etc. This may matter far more than any high impact paper could.

There are more important things

And now we’re coming back to the work-life balance and why you’re doing this in the first place. Honestly, who the hell cares whether someone else published this a few months earlier? Is being the first to do this the reason you’re doing science? I can see the excitement of discovery but let’s face it, most of our research is neither like the work of Einstein or Newton, nor are we discovering extraterrestrial life. Your discovery is no doubt exciting to you, it is hopefully exciting to some other scientists in your little bubble, and it may even be exciting to some journalist who will write a distorting, simplifying article about it for the mainstream news. But seriously, it’s not so groundbreaking that it is worth sacrificing your mental and physical health over. Live your life. Spend time with your family. Be good to your fellow creatures on this planet. By all means, don’t be complacent and do ensure you make a living, but don’t pressure yourself into believing that publishing ultra-high impact papers is the meaning of life.

A positive suggestion for next time…

Now if you’re really worried about this sort of thing, why not preregister your experiment? I know I said I wouldn’t talk about open science here, but bear with me just this once because this is a practical point you can implement today. As I keep saying, the whole discussion about preregistration is dominated by talk of “questionable research practices”, HARKing, and all that junk. Not that these aren’t worthwhile concerns, but it is a lot of negativity. There are plenty of positive reasons why preregistration can help, and the (unfounded) fear of being scooped is one of them. Preregistration does not stop anyone else from publishing the same experiment before you, but it does allow you to demonstrate that you had the idea before they published it. With Registered Reports it becomes irrelevant whether someone else publishes before you, because your publication is guaranteed once the method has been reviewed. And I believe it will also make it far clearer to everyone how much it actually matters, in the big scheme of things, who published what first and where.

[1] Actually there are a lot of old and experienced researchers who worry about this too. And that is far worse than when early career researchers do it because they should really know better and they shouldn’t feel the same career pressures.
[2] It may sound appealing now, but thinking about it I wouldn’t trade my current professional life for anything. Except for grant admin bureaucracy perhaps. I would happily give that up at any price… :/
[3] He didn’t quite say it in those terms.
[4] This doesn’t actually happen. If you want to make a mint you need to go into scientific publishing but the whole open science movement is screwing up that opportunity now as well so you may be out of luck!
[5] Don’t bombard me with “Reviewer 2 held up my paper to publish theirs first” stories. Unless Reviewer 2 signed their review or told you specifically that it was them I don’t take such stories at face value.
[6] The sooner we stop thinking of other scientists in those terms the better for all of us.


Started signing my reviews

As of this year, I have started signing my reviews. This decision has been a long time coming. A lot of people already sign their reviews, so this is not a particularly newsworthy event, but I’ll tell you about it anyway, largely to have a record of when I started and also to explain my reasons.

To explain why, I first need to talk about why one might not want to sign peer reviews. The debate about whether or not to sign reviews has been raging for years. It divides people’s minds and regularly sparks up again. Even the people who agree that the process of scientific research can be improved often seem to fall into two camps whose opinions are diametrically opposed: one side fervently argues that all peer reviews should be transparent and signed, whilst the other argues with equal fervour that ideally all reviews should be double-blind, so that neither reviewers nor authors know each other’s identities.

Whenever someone suggests double-blind reviews, people are wont to argue that this simply doesn’t work in many situations: it is often possible to guess the authors from the research question and/or the methods used, and if the authors previously presented the research at a conference it is likely that reviewers will have already seen it in a preliminary form. That said, the very few times I did review in a double-blind manner I actually didn’t guess the authors’ identities, and in one case I was in fact reviewing the work of friends and collaborators without even knowing it. I’d like to think I would’ve been fair either way, but I must also admit that I was probably more sceptical and possibly less biased because I didn’t know who the authors were. Still, these cases are probably somewhat special – in many situations I would know the authors from the research or at least have a strong suspicion. The suspicion might also lead me to erroneous assumptions, such as “These authors usually do this and that even though this isn’t mentioned here”. If my guess were actually wrong then this could skew my thought process unduly.

So I think double-blind reviewing is a bad idea. Now, many arguments have been brought forth as to why reviews should be anonymous. Anonymity can protect reviewers from the wrath of vengeful senior colleagues who might make unfair hiring or funding decisions because they didn’t like a review. There are a lot of arseholes in the world and this is certainly a possibility. But the truth is that anonymity doesn’t stop people from behaving in this way – and there is actually no compelling evidence that signed reviews make it worse. I have heard some harrowing tales from colleagues who were treated unfairly by some major players in their fields because those players thought that they had given their work a bad review. In one case, it was a PhD student of the assumed reviewer who received the ill treatment – and the assumption was entirely incorrect.

You also frequently hear people’s guesses about who they think Reviewer 2 was on their latest rejected manuscript, often based on circumstantial or generally weak evidence. One of my favourites is the age-old “He (because we know all reviewers are obviously male…) asked us to cite lots of his papers!” I am sure this happens but I wonder how often this deduction is correct. I almost never ask people to cite my papers – if I do, it is because I feel they are directly relevant and citing them is the scholarly thing to do. It is far more likely that I ask people to cite the work of researchers whose work I know well, when it is relevant. In many cases when people just “know” that Reviewer 2 is Professor X because they want X to be cited, it seems to me far more likely that the reviewer is one of Professor X’s postdocs or former students. In many cases, it may also be that Professor X’s work is an established part of the literature, so that in the interest of scholarship an unbiased reviewer will think it deserves to be cited even if you think Professor X’s work is rubbish. In short, I find these kinds of insane guessing games rather tedious and potentially quite damaging.

The first time I signed a review was when I reviewed for F1000Research, where signing is mandatory. (I had already reviewed at Frontiers a few times, where reviewer identities are public, but I don’t think this counts: reviews aren’t signed upon submission but only after publication of the paper. Moreover, the match between review and reviewer remains ambiguous.) I must say that reviewing this paper entirely in public was a rather uplifting experience. At all stages of the process I felt the communication between me and the authors was amicable and sensible, in spite of the harshness of my decisions. I have also been led to believe that the authors appreciated my scepticism (although only they can tell you that for sure).

By signing I may have also been more polite than I might have been if my review were anonymous. I am not entirely convinced of this last argument because I typically try to be polite. There are a lot of dickheads out there who aren’t polite even when their identity is public :P. I also don’t buy that anonymous reviewers aren’t accountable and that thus the quality of the review suffers. Your review is still read by at least one editor – unless that editor is your close personal friend (which is still rare for me at least) then I do feel like someone is checking my review both for factual quality and politeness.

Either way, I did not perceive any adverse consequences of signing my reviews. If anything, it made me think harder about how I would write my review and check the arguments I am making. Scientists should criticise and scrutinise each other. By this I don’t mean you should mistrust people’s intentions or question their competence. But science is fuelled by scepticism and you should challenge anything that doesn’t make sense. I have certainly done so in my collaborations in the past (often to the frustration of my collaborators) and I try to encourage this in my own lab. I would much rather have a student or postdoc who tells me that my idea makes no sense than someone who does everything I say. Researchers also do this at conferences when they discuss each other’s research. One of my most positive conference experiences was a series of rather intense – but very polite – discussions at a poster. Why can’t we do the same in paper reviews?

If I’m perfectly honest, the main reason I hadn’t signed reviews until now is that I was raised that way. Almost none of the reviews I ever received were signed – certainly none of the negative ones. Some reviewers (including very critical ones) revealed their identities after the manuscripts had been accepted for publication, and I have done the same in some cases. But the status quo of my field was always that reviews were anonymous, and that’s just how it was. Challenging this seemed to go against nature – but that really isn’t true. Whether or not reviews are signed is a question of culture, not nature. And I want to change this culture.

Signing reviews is a personal choice and I don’t think it should ever become mandatory. For one thing, I’m a libertarian (just to be clear, I’m not one of the delusional tea party types) and I don’t believe we should force people to do things that aren’t necessary – and I don’t think signed reviews are necessary. I think making all review contents public would be an essential improvement to peer review, with or without signing. But signing reviews can be a positive development and I believe it should be encouraged. It has certainly been a positive development for me, which is exactly why everyone should be free to take this step of their own accord. Signing my first reviews has been a strangely liberating experience. I don’t know whether it will provoke the ire of powerful senior colleagues; in a few years’ time I may post an update about my experience. Somehow I doubt it will turn out to be a problem.


Is publication bias actually a good thing?*

Yesterday Neuroskeptic came to our Cognitive Drinks event in the Experimental Psychology department at UCL to talk about p-hacking. His entertaining talk (see Figure 1) was followed by a lively and fairly long debate about p-hacking and related questions about reproducibility, preregistration, and publication bias. During the course of this discussion a few interesting things came up. (I deliberately won’t name anyone as I think this complicates matters. People can comment and identify themselves if they feel that they should…)

Figure 1. Using this super-high-tech interactive fMRI results simulator Neuroskeptic clearly demonstrated a significant blob of activation in the pre-SMA (I think?) in stressed compared to relaxed people. This result made perfect sense.

It was suggested that a lot of the problems with science would be remedied effectively if only people were encouraged (or required?) to replicate their own findings before publication. Now that sounds generally like a good idea. I have previously suggested that this would work very well in combination with preregistration: you first do a (semi-)exploratory experiment to finalise the protocol, then submit a preregistration of your hypothesis and methods, and then do the whole thing again as a replication (or perhaps more than one if you want to test several boundary conditions or parameters). You then submit the final set of results for publication. Under the Registered Report format, your preregistered protocol would already undergo peer review. This would ensure that the final results are almost certain to be published provided you didn’t stray excessively from the preregistered design. So far, so good.

Should you publish unclear results?

Or is it? Someone suggested that it would be a problem if your self-replication didn’t show the same thing as the original experiment. What should one do in this case? Doesn’t publishing something incoherent like this, one significant finding and a failed replication, just add to the noise in the literature?

At first, this question simply baffled me, as I suspect it would many of the folks campaigning to improve science. (My evil twin sister called these people Crusaders for True Science, but I’m not supposed to use derogatory terms like that anymore, nor should I impersonate lady demons for that matter. Most people on both sides of this mudslinging contest “debate” never seemed to understand that I’m also a revolutionary – you might just say I’m more Proudhon, Bakunin, or Henry David Thoreau than Marx, Lenin, or Che Guevara. But I digress…)

Surely, the attitude that unclear, incoherent findings, that is, those that are more likely to be null results, are not worth publishing must contribute to the prevailing publication bias in the scientific literature? Surely, this view is counterproductive to the aims of science to accumulate evidence and gradually get closer to some universal truths? We must know which hypotheses have been supported by experimental data and which haven’t. One of the most important lessons I learned from one of my long-term mentors was that all good experiments should be published regardless of what they show. This doesn’t mean you should publish every single pilot experiment you ever did that didn’t work. (We can talk about what that does and doesn’t mean another time. But you know how life is: sometimes you think you have some great idea only to realise that it makes no sense at all when you actually try it in practice. Or maybe that’s just me? :P). Even with completed experiments you probably shouldn’t bother publishing if you realise afterwards that it is all artifactual or the result of some error. Hopefully you don’t have a lot of data sets like that though. So provided you did an experiment of suitable quality I believe you should publish it rather than hiding it in the proverbial file drawer. All scientific knowledge should be part of the scientific record.

I naively assumed that this view was self-evident and shared by almost everyone – but this clearly is not the case. Yet instead of sneering at such alternative opinions I believe we should understand why people hold them. There are reasonable arguments why one might wish not to publish every unclear finding. The person making this suggestion at our discussion said that it is difficult to interpret a null result, especially an assumed null result like this. If your original experiment O showed a significant effect supporting your hypothesis, but your replication experiment R does not, you cannot naturally conclude that the effect really doesn’t exist. For one thing you need to be more specific than that. If O showed a significant positive effect but R shows a significant negative one, this would be more consistent with the null hypothesis than if O is highly significant (p < 10^-30) and R just barely misses the threshold (p = 0.051).
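As a purely illustrative aside (this was not part of the discussion at the event), here is a minimal Python sketch of why those two scenarios carry such different evidential weight. It simply combines the two experiments’ z-scores with a sample-size-weighted Stouffer method; the sample sizes and p-values are made up to match the examples above, and this is just a toy calculation, not a recommendation for how to analyse such data.

import numpy as np
from scipy import stats

def combined_z(z_o, z_r, n_o, n_r):
    # Sample-size-weighted Stouffer combination of two z-scores;
    # positive z means an effect in the hypothesised direction.
    w_o, w_r = np.sqrt(n_o), np.sqrt(n_r)
    return (w_o * z_o + w_r * z_r) / np.sqrt(w_o**2 + w_r**2)

n_o = n_r = 30  # hypothetical participants per experiment

# Scenario A: O extremely significant, R just misses the threshold (same direction)
z_o = stats.norm.isf(1e-30)        # one-sided p < 10^-30
z_r = stats.norm.isf(0.051 / 2)    # two-sided p = 0.051, positive direction
print(stats.norm.sf(combined_z(z_o, z_r, n_o, n_r)))  # combined p: still astronomically small

# Scenario B: O significantly positive, R significantly negative
z_o = stats.norm.isf(0.01 / 2)     # two-sided p = 0.01, positive direction
z_r = -stats.norm.isf(0.01 / 2)    # two-sided p = 0.01, negative direction
print(stats.norm.sf(combined_z(z_o, z_r, n_o, n_r)))  # combined p ~ 0.5: no net evidence

Under these toy assumptions the first pair of results still amounts to overwhelming combined evidence for an effect, whereas the two opposite-signed results largely cancel out – which is exactly the distinction being drawn here.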

So let’s assume that we are talking about the former scenario. Even then things aren’t as straightforward, especially if R isn’t as exact a replication of O as you might have liked. If there is any doubt (and usually there is) that something could have been different in R than in O, this could be one of the hidden factors people always like to talk about in these discussions. Now, you hopefully know your data better than anyone. If experiment O was largely exploratory and you tried various things to see what works best (dare we say p-hacking again?), then the odds are probably quite good that a significant non-replication in the opposite direction means the original effect was just a fluke. But this is a probabilistic judgement, not a law of nature. You cannot ever know whether the original effect was real or not, especially not from such a limited data set of two non-independent experiments.

This is precisely why you should publish all results!

In my view, it is inherently dangerous if researchers decide for themselves which findings are important and which are not. This is not only a question of publishing only significant results; it applies much more broadly to the situation in which a researcher publishes only results that support their pet theory and ignores or hides those that do not. I’d like to believe that most scientists don’t engage in this sort of behaviour – but sadly it is probably not uncommon. One way to counteract this is to train researchers to design experiments that test alternative hypotheses making opposite predictions. However, such so-called “strong inference” is not always feasible. And even when it is, the two alternatives are not always equally interesting, which in turn means that people may still become emotionally attached to one hypothesis.

The decision whether a result is meaningful should be left to posterity. You should publish all your properly conducted experiments. If you have defensible grounds to believe that the data are actually rubbish (say, an fMRI data set littered with spikes, distortions, and excessive motion artifacts, or a social psychology study where you discovered post hoc that all the participants were illiterate and couldn’t read the questionnaires) then by all means throw them in the bin. But unless you have a good reason like that, you should not do this; instead, add the results to the scientific record.

Now, the suggestion during our debate was that such inconclusive findings clog up the record with unnecessary noise. There is an enormous and constantly growing scientific literature, and as it is, it is becoming ever harder to separate the wheat from the chaff. I can barely keep up with the continuous feed of new publications in my field and I am missing a lot. Total information overload. So from that point of view the notion makes sense that only those studies that meet a certain threshold for being conclusive are accepted as part of the scientific record.

I can certainly relate to this fear. For the same reason I am sceptical of proposals that papers should be published before review and all decisions about the quality and interest of some piece of research, including the whole peer review process, should be entirely post-publication. Some people even seem to think that the line between scientific publication and science blog should be blurred beyond recognition. I don’t agree with this. I don’t think that rating systems like those used on Amazon or IMDb are an ideal way to evaluate scientific research. It doesn’t sound wise to me to assess scientific discoveries and medical breakthroughs in the same way we rank our entertainment and retail products. And that is not even talking about unleashing the horror of internet comment sections onto peer review…

Solving the (false) dilemma

I think this discussion is creating a false dichotomy. These are not mutually exclusive options. The solution to a low signal-to-noise ratio in the scientific literature is not to maintain publication bias of significant results. Rather the solution is to improve our filtering mechanisms. As I just described, I don’t think it will be sufficient to employ online shopping and social network procedures to rank the scientific literature. Even in the best-case scenario this is likely to highlight the results of authors who are socially dominant or popular and probably also those who are particularly unpopular or controversial. It does not necessarily imply that the highest quality research floats to the top [cue obvious joke about what kind of things float to the top…].

No, a high quality filter requires some organisation. I am convinced the scientific community can organise itself very well to create these mechanisms without too much outside influence. (I told you I’m Thoreau and Proudhon, not some insane Chaos Worshipper :P). We need some form of acceptance to the record. As I outlined previously, we should reorganise the entire publication process so that the whole peer-review process is transparent and public. It should be completely separate from journals. The journals’ only job should be to select interesting manuscripts and to publish short summary versions of them in order to communicate particularly exciting results to the broader community. But this peer-review should still involve a “pre-publication” stage – in the sense that the initial manuscript should not generate an enormous amount of undue interest before it has been properly vetted. To reiterate (because people always misunderstand that): the “vetting” should be completely public. Everyone should be able to see all the reviews, all the editorial decisions, and the whole evolution of the manuscript. If anyone has any particular insight to share about the study, by all means they should be free to do so. But there should be some editorial process. Someone should chase potential reviewers to ensure the process takes off at all.

The good news about all this is that it benefits you. Instead of weeping bitterly and considering quitting science because yet again you didn’t find the result you hypothesised, this just means that you get to publish more research. Taking the focus off novel, controversial, special, cool or otherwise “important” results should also help make peer review more about the quality and meticulousness of the methods. Peer review should be about ensuring that the science is sound. In current practice it instead often resembles a battle, with authors defending to the death their claims about the significance of their findings against the reviewers’ scepticism. Scepticism is important in science, but this kind of scepticism is completely unnecessary when people are not incentivised to overstate the importance of their results.

Practice what you preach

I honestly haven’t followed all of the suggestions I make here. Neither have many other people who talk about improving science. I know of vocal proponents of preregistration who have yet to preregister any study of their own. The reasons for this are complex. Of course, you should “be the change you wish to see in the world” (I’m told Gandhi said this). But it’s not always that simple.

On the whole, though, I think I have published almost all of the research I’ve done. While I currently have a lot of unpublished results, there is very little in the file drawer, as most of these experiments have either been submitted or are being written up for eventual publication. There are two exceptions. One is a student project that produced somewhat inconclusive results, although I would say it is a conceptual replication of a published study by others. The main reason we haven’t tried to publish this yet is that the student isn’t here anymore and hasn’t been in contact, and the data aren’t exciting enough for us to bother with the hassle of publication (and it is a hassle!).

The other data set is perhaps ironic because it is a perfect example of the scenario I described earlier. A few years ago when I started a new postdoc I was asked to replicate an experiment a previous lab member had done. For simplicity, let’s just call this colleague Dr Toffee. Again, they can identify themselves if they wish. The main reason for this was that reviewers had asked Dr Toffee to collect eye-movement data. So I replicated the original experiment but added eye-tracking. My replication wasn’t an exact one in the strictest terms because I decided to code the experimental protocol from scratch (this was a lot easier). I also had to use a different stimulus setup than the previous experiment as that wouldn’t have worked with the eye-tracker. Still, I did my best to match the conditions in all other ways.

My results showed a highly significant effect in the opposite direction to the original finding. We did all the necessary checks to ensure that this wasn’t just a coding error etc. It seemed to be real. Dr Toffee and I discussed what to do about it and we eventually decided that we wouldn’t bother to publish this set of experiments. The original experiment had been conducted several years before my replication and Dr Toffee had moved on with their life. I, on the other hand, had done this experiment as a courtesy because I was asked to; it was very peripheral to my own research interests. So, as in the other example, we both felt that going through the publication process would have been a fairly big hassle for very little gain.

Now this is bad. Perhaps there is some other poor researcher, a student maybe, who will do a similar experiment and waste a lot of time testing a hypothesis that, at least according to our incoherent results, is unlikely to be true. And perhaps they will also not publish their failure to support this hypothesis. The circle of null results continues… :/

But you need to pick your battles. We are all just human beings and we do not have unlimited (research) energy. With both of these lacklustre or incoherent results (and these are literally the only completed experiments we haven’t at least begun to write up), undergoing the pain of submission → review → rejection → repeat seems like a daunting task that simply isn’t worth it.

So what to do? Well, the solution is again what I described: the very reason the task of publishing these results isn’t worth our energy is everything that is wrong with the current publication process! In my dream world, I could simply write up a manuscript formatted in a way that pleases me and upload it to the preprint peer-review site, and my life would be infinitely simpler. No more perusing dense journal websites for their guide to authors or hunting for the Zotero/Endnote/Whatever style to format the bibliography. No more submitting your files to one horribly designed, clunky journal website after another, checking the same stupid tick boxes, adding the same reviewer suggestions. No more rewriting your cover letters by changing the name of the journal. Certainly for my student’s project, it would not be hard to do, as there is already a dissertation that can be used as a basis for the manuscript. Dr Toffee’s experiment and its contradictory replication might require a bit more work – but to be fair, even there a previous manuscript already exists, so all we’d need to add would be the modifications to the methods and the results of my replication. In a world where all you need to do is upload the manuscript and address some reviewers’ comments to ensure the quality of the science, this would be fairly little effort. In turn it would ensure that the file drawer stays empty and we are all much more productive.

This world isn’t here yet but there are journals that will allow something that isn’t too far off from that, namely F1000Research and PeerJ (and the Winnower also counts although the content there seems to be different and I don’t quite know how much review editing happens there). So, maybe I should email Dr Toffee now…

(* In case you didn’t get this from the previous 2700ish words: the answer to this question is unequivocally “No.”)

Science is not broken – but these three things are

Because it’s so much more fun than the things I should really be doing (correcting student dissertations and responding to grant reviews), I read a long blog post entitled “Science isn’t broken” by Christie Aschwanden. In large part it is a summary of the various controversies and “crises” that seem to have engulfed scientific research in recent years. The title is a direct response to an event I participated in recently at UCL. More importantly, I think it’s a really good read, so I recommend it.

This post is a quick follow-up response to the general points raised there. As I tried to argue (probably not very coherently) at that event, I also don’t think science is broken. First of all, probably nobody seriously believes that the lofty concept of science, the scientific method (if there is one such thing), can even be broken. But even in more pragmatic terms, the human aspects of how science works are not broken either. My main point was that the very fact we are having these kinds of discussions, about how scientific research can be improved, is direct proof that science is in fact very healthy. This is what self-correction looks like.

If anything, the fact that there has been a recent surge of these kinds of debates shows that science has already improved a lot recently. After decades of complacency with the status quo there now seems to be real energy afoot to effect some changes. However, it is not the first time this happened (for example, the introduction of peer review would have been a similarly revolutionary time) and it will not be the last. Science will always need to be improved. If some day conventional wisdom were that our procedure is now perfect, that it cannot be improved anymore, that would be a tell-tale sign for me that I should do something else.

So instead of fretting over whether science is “broken” (No, it isn’t) or even whether it needs improvement (Yes, it does), what we should be talking about is specifically what really urgently needs improvement. Here is my short list. I am not proposing many solutions (except for point 1). I’d be happy to hear suggestions:

I. Publishing and peer review

The way we publish and review seriously needs to change. We are wasting far too much time on trivialities instead of the science. The trivialities range from reformatting manuscripts to fit journal guidelines and uploading files on the practical side to chasing impact factors and “novel” research on the more abstract side. Both hurt research productivity although in different ways. I recently proposed a solution that combines some of the ideas by Dorothy Bishop and Micah Allen (and no doubt many others).

II. Post-publication review

Related to this, the way we evaluate and discuss published science needs to change, too. We need to encourage more post-publication review. At present this rarely happens: most studies never receive any post-publication review or comments at all. Sure, some (including some of my own) probably just don’t deserve any attention, but how will you know unless somebody tells you the study even exists? Many precious gems will be missed that way. This has of course always been the case in science, but we should try to minimise the problem. Some believe post-publication review is all we will ever need, but unless there are robust mechanisms to attract reviewers to new manuscripts besides the authors’ fame, (un-)popularity, and/or their social media presence – none of which are good scientific arguments – I can’t see how a post-pub-only system can change this. On this note I should mention that Tal Yarkoni, with whom I’ve had some discussions about this issue, wrote an article presenting some suggestions. I am not entirely convinced by his arguments for enhancing post-publication review, but I need more time to respond in detail, so I will just point the article out for now to any interested reader.

III. Research funding and hiring decisions

Above all, what seriously needs to change is how we allocate research funds and how we make hiring decisions. The solution probably goes hand in hand with solving the other two points, but I think it also requires direct action now, in the absence of good solutions for those issues. We must stop judging grant and job applicants based on impact factors or h-indices. This is certainly easier for job applications than for grant decisions, as in the latter the volume of applications is much greater – and the expertise of the panel members in judging the applications is lower. But it should be possible to reduce the reliance on metrics and ratings – even newer, more refined ones. Grant applications also shouldn’t be killed by a single off-hand critical review comment. Most importantly, grant proposals shouldn’t all be written in a way that devalues exploratory research, either by pretending to have strong hypotheses when you don’t or – even worse – by pretending that research you have already conducted and are ready to publish is a “preliminary pilot data set.” For work that actually is hypothesis-driven I quite like Dorothy Bishop’s idea that research funds could be obtained at the pre-registration stage, when the theoretical background and experimental design have been established but before data collection commences. Realistically, this is probably more suitable for larger experimental programs than for every single study. But then again, encouraging larger, more thorough projects may in fact be a good thing.

Revolutionise the publication process

Enough of this political squabble and twitter war (twar?) and back to the “real” world of neuroneuroticism. Last year I sadly agreed (for various reasons) to act as corresponding author on one of our studies. I also have this statistics pet project that I want to try to publish as a single-author paper. Both of these experiences reminded me of something I have long known:

I seriously hate submitting manuscripts and the whole peer review and publication process.

The way publishing currently works, authors are encouraged to climb down the rungs of the impact factor ladder, starting at whatever journal they feel is sufficiently general interest and high impact to take their manuscript and then gradually working their way down through editorial and/or scientific rejections until it is eventually accepted by, as the rejection letters from high impact journals put it, a “more specialised journal.” At each step you battle with an online submission system that competes for the least user-friendly webpage of the year award, and you repeat the same actions: uploading your manuscript files, suggesting reviewers, and checking that the PDF conversion worked. Before you can do any of this you of course need to reformat your manuscript into the style the journal expects, with the right kind of citations and the various sections in the correct place. You also modify the cover letter to the editors, in which you hype up the importance of the work rather than letting the research speak for itself, to adjust it to the particular journal you are submitting to. All of this takes precious time and has very little to do with research.

Because I absolutely loathe having to do this sort of mindless work, I have long tried to outsource this to my postdocs and students as much as I can. I don’t need to be corresponding author on all my research. Of course, this doesn’t absolve you from being involved with the somewhat more important decisions, such as rewriting the manuscripts and drafting the cover letters. More importantly, while this may help my own peace of mind, it just makes somebody else suffer. It is not a real solution.

The truth is that this wasted time and effort would be far better spent doing science and ensuring that the study is of the best possible quality. I have long felt that the entire publication process should be remodelled so that these things are no longer a drain on researchers’ time and sanity. I am far from the first person to suggest a publication model like this. For instance, Micah Allen mentioned very similar ideas on his blog and, more recently, Dorothy Bishop made a passionate proposal to get rid of journals altogether. Both touched on many of the same points and partly inspired my own thoughts on this.

Centralised review platform

Some people think that all peer review could be post-publication. I don’t believe this is a good idea – depending on what you regard as publication. I think we need some sort of fundamental vetting procedure before a scientific study is indexed and regarded as part of the scientific record. I fear that without some expert scrutiny we will become swamped with poor quality outputs that make it impossible to separate the wheat from the chaff. Post-publication peer review alone is not enough to find the needles in the haystack. If there is so much material out there that most studies never get read, let alone reviewed or even receive comments, this isn’t going to work. By having some traditional review prior to “acceptance” in which experts are invited to review the manuscript – and reminded to do so – we can at least ensure that every manuscript will be read by someone. Nobody is stopping you from turning blog posts into actual publications. Daniël Lakens has a whole series of blog posts that have turned into peer reviewed publications.

A key feature of this pre-publication peer review, though, should be that it all takes place in a central place completely divorced from any of the traditional journals. Judging the scientific quality of a study requires expert reviewers and editors, but there should be no evaluation of the novelty or “impact” of the research. It should be all about the scientific details, to ensure that the work is robust. The manuscript should be as detailed as necessary to replicate the study (and the hypotheses and protocols can be pre-registered – a peer review system for pre-registered protocols is certainly an option in this system).

Ideally this review should involve access to the data and materials so that reviewers can try to reproduce the findings presented in the study. In practice, expert reviewers rarely reanalyse data even when they are available; most people simply do not have the time to get that deeply involved in a review. An interesting possible solution to this dilemma was suggested to me recently by Lee de-Wit: there could be reviewers whose primary role is to check the data and try to reproduce the analysed results based on the documentation. These data reviewers would likely be junior researchers, that is, PhD students and junior postdocs. It would provide an opportunity to learn about reviewing and also to become known to editors. There is presently huge variability in the career stage at which researchers start reviewing manuscripts: some people begin reviewing as graduate students, while others don’t seem to review much even after several years of postdoc experience. This idea could help close that gap.

Transparent reviewing

Another aspect that I consider essential is that reviews should be transparent. That is, all the review comments should be public and the various revisions of the manuscript should be accessible. Ideally, the platform would allow easy navigation between the changes, so that it is straightforward to look at the current/final version with the tracked changes filtered out – but equally easy to blend the comments back in.

It remains a very controversial and polarising issue whether reviewers’ names should be public as well. I haven’t come to a final conclusion on that; there are certainly arguments for both. One of the reasons many people dislike the idea of mandatory signed reviews is that it could put junior researchers at a disadvantage. It may discourage them from writing critical reviews of the work of senior colleagues, the people who make hiring and funding decisions. Reviewer anonymity can protect against that, but it can also lead to biased, overly harsh, and sometimes outright nasty reviews. It also has the odd effect of creating a reviewer guessing game. People often display a surprising level of confidence in who they “know” their anonymous reviewers were – and I would bet they are often wrong. In fact, I know of at least one case where this sort of false belief resulted in years of animosity directed at the presumed reviewer and even their students. Publishing reviewer names would put an end to this sort of nonsense. It also encourages people to be more polite. Editors at F1000Research (a journal with a completely transparent review process) told me that they frequently ask reviewers to check whether they are prepared to publish the review in the state they submitted it, because it will be associated with their name – and many then decide to edit their comments to tone down the hostility.

However, I think we could go a long way even with anonymous reviews, provided that the reviewer comments are public. Since the content of the review is subject to public scrutiny, it is in the reviewer’s, and even more so the editor’s, interest to ensure reviews are fair and of suitable quality. Reviews of poor quality or with potential political motivation could easily be flagged up and result in public discussion. I believe it was Chris Chambers who recently suggested a compromise in which tenured scientists must sign their reviews while junior researchers, who still exist at the mercy of senior colleagues, have the option to remain anonymous. I think this idea has merit, although even tenured researchers can still suffer from political and personal biases, so I am not sure it really protects against those problems.

One argument sometimes made against anonymous reviews is that they prevent people from taking credit for their reviewing work. I don’t think this is true. Anonymous reviews can still be associated with a reviewer’s digital account, and ratings of review quality, reliability, and so on could easily be quantified that way. (In fact, this is precisely what websites like Publons are already doing.)
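
To make this concrete, here is a minimal sketch, under entirely made-up assumptions about what a review platform might track, of how anonymous reviews could still accumulate quantifiable credit on an account:

```python
# Hypothetical sketch: reviews are tied to an account ID rather than a public
# name, and quality/reliability metrics are aggregated per account.
# All field names and numbers are illustrative assumptions.
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class ReviewerAccount:
    account_id: str                                        # stable ID, not necessarily a public name
    quality_ratings: list = field(default_factory=list)    # e.g. editor ratings from 1 to 5
    completed_on_time: list = field(default_factory=list)  # True/False per review

    def credit_summary(self) -> dict:
        """Aggregate metrics that could be shown on a (Publons-style) profile."""
        return {
            "reviews": len(self.quality_ratings),
            "mean_quality": round(mean(self.quality_ratings), 2) if self.quality_ratings else None,
            "reliability": round(mean(self.completed_on_time), 2) if self.completed_on_time else None,
        }

reviewer = ReviewerAccount("rev-0042")
reviewer.quality_ratings += [4, 5, 3]
reviewer.completed_on_time += [True, True, False]
print(reviewer.credit_summary())  # e.g. {'reviews': 3, 'mean_quality': 4.0, 'reliability': 0.67}
```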

Novelty, impact, and traditional journals

So what happens next? Let’s assume a manuscript passes this initial peer review. It then enters the official scientific record and is indexed on PubMed and Google Scholar. Perhaps it could follow the example of F1000Research in that the title of the study itself contains an indication that it has been accepted/approved by peer review.

This is where it gets complicated. A lot of the ideas I discussed are already implemented to some extent by journals like F1000Research, PeerJ, or the Frontiers brand. The one thing these implementations lack is a single, centralised platform for reviews. And although I think a single platform would be preferable to avoid confusion and splintering, even a handful of venues for scientific review could probably work.

However, what these systems currently do not provide is the role currently still played by the high impact, traditional publishers: filtering the enormous volume of scientific work to select ground-breaking, timely, and important research findings. There is a lot of hostility towards this aspect of scientific publishing. It often seems completely arbitrary, obsessed with temporary fads and shallow buzzwords. I think for many researchers the implicit or even explicit pressure to publish as much “high impact” work as possible to sustain their careers is contributing to this. It isn’t entirely clear to me how much of this pressure is real and how much is an illusion. Certainly some grant applications still require you to list impact factors and citation numbers (which are directly linked to impact factors) to support your case.

Whatever you may think about this (and I personally agree that it has lots of negative effects and can be extremely annoying) the filtering and sorting by high impact journals does also have its benefits. The short format publications, brief communications, and perspective articles in these journals make work much more accessible to wider audiences and I think there is some point in highlighting new, creative, surprising, and/or controversial findings over incremental follow-up research. While published research should provide detailed methods and well-scrutinised results, there are different audiences. When I read about findings in astronomy or particle physics, or even many studies from biological sciences that aren’t in my area, I don’t typically read all the in-depth methods (nor would I understand them). An easily accessible article that appeals to a general scientific audience is certainly a nice way to communicate scientific findings. In the present system this is typically achieved by separating a general main text from Supplementary/Online sections that contain methods, additional results, and possibly even in-depth discussion.

This is where I think we should implement an explicit tier system. The initial research is published, after the scientific peer review discussed above, in the centralised repository of new manuscripts. These publications are written as traditional journal articles, complete with detailed methods and results; novelty and impact have played no role up to this stage. However, now the more conventional publishers come into play. Authors may want to write cover letters competing for the attention of higher impact journals. Conversely, journal editors may want to contact the authors of particularly interesting studies and ask them to submit a short-form article to their journal. There are several mechanisms by which new publications might come to the attention of journal editors. They could simply generate a strong social media buzz and lots of views, downloads, and citations. This in fact seems to be the basis of the Frontiers tier system. I think this is far from optimal because it doesn’t necessarily highlight the scientifically most valuable studies but the most sensational ones, which can happen for all sorts of reasons, such as extraordinary claims or titles containing curse words. Rather, it would be ideal to highlight research that attracts a lot of post-publication review and discussion – but of course this still poses the question of how to encourage that.
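
As an illustration of the kind of weighting I have in mind, here is a hypothetical scoring heuristic. The weights and inputs are invented for this sketch and are not a description of how Frontiers or any other existing tier system actually works:

```python
# Hypothetical heuristic for surfacing papers to journal editors: weight
# substantive post-publication reviews more heavily than raw buzz
# (views, social media mentions). All weights and example numbers are made up.
def attention_score(views: int, social_mentions: int,
                    substantive_reviews: int, review_word_count: int) -> float:
    buzz = 0.0001 * views + 0.005 * social_mentions
    scrutiny = 1.0 * substantive_reviews + 0.0005 * review_word_count
    # Deliberately favour scrutiny over buzz so that a widely shared but
    # unreviewed paper ranks below one that attracted careful commentary.
    return 0.2 * buzz + 0.8 * scrutiny

papers = {
    "sensational_claim": attention_score(50_000, 800, 1, 300),
    "well_scrutinised":  attention_score(5_000, 40, 6, 4_000),
}
print(max(papers, key=papers.get))  # -> 'well_scrutinised'
```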

In either case, the decision as to what constitutes novel, general-interest research is still up to editorial discretion, which should make it easier for traditional journals to accept this change. How these articles are accepted is still up to each journal. Some may not require any further peer review and simply ask for a copy-edited summary article. Others may want some additional peer review to keep the interpretation of these summaries in check. It is likely that these high impact articles would be heavy on the implications and wider interpretation, while the original scientific publication has only brief discussion sections detailing the basic interpretation and elaborating on the limitations. Some peer review may help keep the authors honest at this stage. Importantly, instead of having endless online methods sections and (sometimes barely reviewed) supplementary materials, the full scientific detail of any study would be available within its original publication. The high impact short-form article simply contains a link to that detailed publication.

One important aim this system would achieve is to ensure that the research that actually gets published as high impact will typically meet high thresholds of scientific quality. Our current publishing model still incentivises shoddy research because it emphasises novelty and speed of publication over quality. In the new system, every study would first have to pass a quality threshold; novelty judgements would be entirely secondary to that.

How can we make this happen?

The biggest problem with all of these grand ideas we are kicking around is that it remains mostly unclear to me how we can actually effect this change. The notion that we can do away with traditional journals altogether sounds like a pipe dream to me, as it is diametrically opposed to the self-interest of traditional publishers and to our current funding structures. While some great upheavals have already happened in scientific publishing, such as the now widespread availability of open access papers, I feel that a lot of these changes have simply occurred because traditional publishers realised they can make considerable profit from open access charges.

I do hope that eventually the kinds of journals publishing short-form, general-interest articles to filter the ground-breaking research from incremental, specialised work will not be for-profit publishers. There are already a few traditional journals that are more community-driven, such as the Journal of Neuroscience, the Journal of Vision, and also e-Life (not so much community-driven as driven by a research funder rather than a for-profit publishing house). I hope to see more of that in the future. Since many scientists seem to be idealists at heart, I think there is hope for that.

But in the meantime it seems necessary to work together with traditional publishing houses rather than antagonise them. I would think it shouldn’t be that difficult to convince some publishers that what now forms the supplementary materials and online methods in many high impact journals could be proper publications in their own right. Journals that already have a system like the one I envision, e.g. F1000Research or PeerJ, could perhaps negotiate such deals with traditional journals. This need not be mutually exclusive; it could simply apply to some of the articles published in these journals.

The main obstacle to remove here is the, in my mind, obsolete notion that none of the results may have been published elsewhere. This is already no longer true in most cases anyway: most research will have been presented in conference proceedings prior to journal publication, and many traditional journals nowadays tolerate manuscripts uploaded to pre-print servers. The new aspect of the system I describe is that there would actually be an assurance that the pre-published work has been properly peer reviewed, thus guaranteeing a certain level of quality.

I know there are probably many issues still to resolve with these ideas and I would love to hear them. However, I think this vision is not a dream but a distinct possibility. Let’s make it come true.