
Is publication bias actually a good thing?*

Yesterday Neuroskeptic came to our Cognitive Drinks event in the Experimental Psychology department at UCL to talk about p-hacking. His entertaining talk (see Figure 1) was followed by a lively and fairly long debate about p-hacking and related questions about reproducibility, preregistration, and publication bias. During the course of this discussion a few interesting things came up. (I deliberately won’t name anyone as I think this complicates matters. People can comment and identify themselves if they feel that they should…)

Figure 1. Using this super-high-tech interactive fMRI results simulator Neuroskeptic clearly demonstrated a significant blob of activation in the pre-SMA (I think?) in stressed compared to relaxed people. This result made perfect sense.

It was suggested that a lot of the problems with science would be remedied effectively if only people were encouraged (or required?) to replicate their own findings before publication. Now that sounds generally like a good idea. I have previously suggested that this would work very well in combination with preregistration: you first do a (semi-)exploratory experiment to finalise the protocol, then submit a preregistration of your hypothesis and methods, and then do the whole thing again as a replication (or perhaps more than one if you want to test several boundary conditions or parameters). You then submit the final set of results for publication. Under the Registered Report format, your preregistered protocol would already undergo peer review. This would ensure that the final results are almost certain to be published provided you didn’t stray excessively from the preregistered design. So far, so good.

Should you publish unclear results?

Or is it? Someone suggested that it would be a problem if your self-replication didn’t show the same thing as the original experiment. What should one do in this case? Doesn’t publishing something incoherent like this, one significant finding and a failed replication, just add to the noise in the literature?

At first, this question simply baffled me, as I suspect it would many of the folks campaigning to improve science. (My evil twin sister called these people Crusaders for True Science but I’m not supposed to use derogatory terms like that anymore, nor should I impersonate lady demons for that matter. Most people on both sides of this mudslinging contest of a “debate” never seemed to understand that I’m also a revolutionary – you might just say that I’m more Proudhon, Bakunin, or Henry David Thoreau than Marx, Lenin, or Che Guevara. But I digress…)

Surely the attitude that unclear, incoherent findings (that is, findings more likely to be null results) are not worth publishing must contribute to the prevailing publication bias in the scientific literature? Surely this view is counterproductive to science’s aim of accumulating evidence and gradually getting closer to some universal truths? We must know which hypotheses have been supported by experimental data and which haven’t. One of the most important lessons I learned from one of my long-term mentors was that all good experiments should be published regardless of what they show. This doesn’t mean you should publish every single pilot experiment you ever did that didn’t work. (We can talk about what that does and doesn’t mean another time. But you know how life is: sometimes you think you have some great idea only to realise that it makes no sense at all when you actually try it in practice. Or maybe that’s just me? :P). Even with completed experiments you probably shouldn’t bother publishing if you realise afterwards that the results are artifactual or due to some error. Hopefully you don’t have a lot of data sets like that though. So provided you did an experiment of suitable quality I believe you should publish it rather than hiding it in the proverbial file drawer. All scientific knowledge should be part of the scientific record.

I naively assumed that this view was self-evident and shared by almost everyone – but this clearly is not the case. Yet instead of sneering at such alternative opinions I believe we should understand why people hold them. There are reasonable arguments why one might wish not to publish every unclear finding. The person making this suggestion at our discussion said that it is difficult to interpret a null result, especially an assumed null result like this. If your original experiment O showed a significant effect supporting your hypothesis, but your replication experiment R does not, you cannot naturally conclude that the effect really doesn’t exist. For one thing you need to be more specific than that. If O showed a significant positive effect but R showed a significant negative one, this would be more consistent with the null hypothesis than if O were highly significant (p < 10^-30) and R just barely missed the threshold (p = 0.051).
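To make this intuition concrete, here is a minimal simulation sketch – my own illustration, not part of the original discussion, with arbitrary sample sizes, effect size, and thresholds – comparing how often each pattern arises when an effect is real versus when it is not:

```python
# Hypothetical illustration: simulate an original experiment O and a replication R
# many times, once with a true effect and once under the null, and count how often
# we see (a) a significant R in the opposite direction and (b) a same-direction near miss.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, n_sims = 30, 10_000   # arbitrary sample size and number of simulated study pairs

def two_experiments(effect):
    """Simulate O and R as one-sample t-tests against zero."""
    o = rng.normal(effect, 1, (n_sims, n))
    r = rng.normal(effect, 1, (n_sims, n))
    t_o, p_o = stats.ttest_1samp(o, 0, axis=1)
    t_r, p_r = stats.ttest_1samp(r, 0, axis=1)
    return t_o, p_o, t_r, p_r

for label, effect in [("true effect (d = 0.5)", 0.5), ("no effect (d = 0)", 0.0)]:
    t_o, p_o, t_r, p_r = two_experiments(effect)
    sig_o = p_o < 0.05                                                    # O "worked"
    flip = (np.sign(t_r) != np.sign(t_o)) & (p_r < 0.05)                  # R significant, opposite direction
    near = (np.sign(t_r) == np.sign(t_o)) & (p_r >= 0.05) & (p_r < 0.15)  # R same direction, just missed
    print(f"{label}: given a significant O, "
          f"P(opposite-direction significant R) = {flip[sig_o].mean():.3f}, "
          f"P(same-direction near-miss R) = {near[sig_o].mean():.3f}")
```

In this toy setup a significant replication in the opposite direction essentially never happens when the effect is real but does occur under the null, whereas a same-direction near miss is more common when the effect is real – which is the asymmetry the argument above relies on.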

So let’s assume that we are talking about the former scenario. Even then things aren’t as straightforward, especially if R isn’t as exact a replication of O as you might have liked. If there is any doubt (and usually there is) that something could have been different in R than in O, this could be one of the hidden factors people always like to talk about in these discussions. Now, you hopefully know your data better than anyone. If experiment O was largely exploratory and you tried various things to see what works best (dare we say p-hacking again?), then the odds are probably quite good that a significant non-replication in the opposite direction shows the effect was just a fluke. But this is a probabilistic judgement, not a natural law. You can never know for certain whether the original effect was real, especially not from such a limited data set of two non-independent experiments.

This is precisely why you should publish all results!

In my view, it is inherently dangerous if researchers decide for themselves which findings are important and which are not. It is not only a question of publishing only significant results. It applies much more broadly to the situation when a researcher publishes only results that support their pet theory but ignores or hides those that do not. I’d like to believe that most scientists don’t engage in this sort of behaviour – but sadly it is probably not uncommon. A way to counteract this is to train researchers to design experiments that test alternative hypotheses making opposite predictions. However, such so-called “strong inference” is not always feasible. And even when it is, the two alternatives are not always equally interesting, which in turn means that people may still become emotionally attached to one hypothesis.

The decision whether a result is meaningful should be left to posterity. You should publish all your properly conducted experiments. If you have defensible reasons to believe that the data are actually rubbish (say, an fMRI data set littered with spikes, distortions, and excessive motion artifacts, or a social psychology study where you discovered post hoc that all the participants were illiterate and couldn’t read the questionnaires) then by all means throw them in the bin. But unless you have such a good reason, you should never do this and should instead add the results to the scientific record.

Now the suggestion during our debate was that such inconclusive findings clog up the record with unnecessary noise. There is an enormous and constantly growing scientific literature. As it is, it is becoming ever harder to separate the wheat from the chaff. I can barely keep up with the continuous feed of new publications in my field and I am missing a lot. Total information overload. So from that point of view it makes sense that only those studies that meet a certain threshold of conclusiveness should be accepted as part of the scientific record.

I can certainly relate to this fear. For the same reason I am sceptical of proposals that papers should be published before review, with all decisions about the quality and interest of a piece of research, including the whole peer review process, happening entirely post-publication. Some people even seem to think that the line between scientific publication and science blog should be blurred beyond recognition. I don’t agree with this. I don’t think that rating systems like those used on Amazon or IMDb are an ideal way to evaluate scientific research. It doesn’t sound wise to me to assess scientific discoveries and medical breakthroughs in the same way we rank our entertainment and retail products. And that is before we even talk about unleashing the horror of internet comment sections onto peer review…

Solving the (false) dilemma

I think this discussion is creating a false dichotomy. These are not mutually exclusive options. The solution to a low signal-to-noise ratio in the scientific literature is not to maintain a publication bias towards significant results. Rather the solution is to improve our filtering mechanisms. As I just described, I don’t think it will be sufficient to rank the scientific literature with the rating procedures of online shopping and social networks. Even in the best-case scenario this is likely to highlight the results of authors who are socially dominant or popular and probably also those who are particularly unpopular or controversial. It does not ensure that the highest quality research floats to the top [cue obvious joke about what kind of things float to the top…].

No, a high quality filter requires some organisation. I am convinced the scientific community can organise itself very well to create these mechanisms without too much outside influence. (I told you I’m Thoreau and Proudhon, not some insane Chaos Worshipper :P). We need some form of acceptance to the record. As I outlined previously, we should reorganise the entire publication process so that the whole peer-review process is transparent and public. It should be completely separate from journals. The journals’ only job should be to select interesting manuscripts and to publish short summary versions of them in order to communicate particularly exciting results to the broader community. But this peer-review should still involve a “pre-publication” stage – in the sense that the initial manuscript should not generate an enormous amount of undue interest before it has been properly vetted. To reiterate (because people always misunderstand that): the “vetting” should be completely public. Everyone should be able to see all the reviews, all the editorial decisions, and the whole evolution of the manuscript. If anyone has any particular insight to share about the study, by all means they should be free to do so. But there should be some editorial process. Someone should chase potential reviewers to ensure the process takes off at all.

The good news about all this is that it benefits you. Instead of weeping bitterly and considering quitting science because yet again you didn’t find the result you hypothesised, you simply get to publish more research. Taking the focus off novel, controversial, special, cool or otherwise “important” results should also help make peer review more about the quality and meticulousness of the methods. Peer review should be about ensuring that the science is sound. In current practice it instead often resembles a battle in which authors defend to the death their claims about the significance of their findings against the reviewers’ scepticism. Scepticism is important in science but this kind of scepticism is completely unnecessary when people are not incentivised to overstate the importance of their results.

Practice what you preach

I honestly haven’t followed all of the suggestions I make here. Neither have many other people who talk about improving science. I know of vocal proponents of preregistration who have yet to preregister any study of their own. The reasons for this are complex. Of course, you should “be the change you wish to see in the world” (I’m told Gandhi said this). But it’s not always that simple.

On the whole though I think I have published almost all of the research I’ve done. While I currently have a lot of unpublished results, there is very little in the file drawer, as most of these experiments have either been submitted or are being written up for eventual publication. There are two exceptions. One is a student project that produced somewhat inconclusive results, although I would say it is a conceptual replication of a published study by others. The main reason we haven’t tried to publish this yet is that the student isn’t here anymore and hasn’t been in contact, and the data aren’t exciting enough for us to bother with the hassle of publication (and it is a hassle!).

The other data set is perhaps ironic because it is a perfect example of the scenario I described earlier. A few years ago when I started a new postdoc I was asked to replicate an experiment a previous lab member had done. For simplicity, let’s just call this colleague Dr Toffee. Again, they can identify themselves if they wish. The main reason for this was that reviewers had asked Dr Toffee to collect eye-movement data. So I replicated the original experiment but added eye-tracking. My replication wasn’t an exact one in the strictest terms because I decided to code the experimental protocol from scratch (this was a lot easier). I also had to use a different stimulus setup than the previous experiment as that wouldn’t have worked with the eye-tracker. Still, I did my best to match the conditions in all other ways.

My results showed a highly significant effect in the opposite direction to the original finding. We did all the necessary checks to ensure that this wasn’t just a coding error etc. It seemed to be real. Dr Toffee and I discussed what to do about it and we eventually decided that we wouldn’t bother to publish this set of experiments. The original experiment had been conducted several years before my replication. Dr Toffee had moved on with their life. I on the other hand had done this experiment as a courtesy because I was asked to. It was very peripheral to my own research interests. So, as in the other example, we both felt that going through the publication process would have been a fairly big hassle for very little gain.

Now this is bad. Perhaps there is some other poor researcher, a student perhaps, who will do a similar experiment again and waste a lot of time on testing the hypothesis that, at least according to our incoherent results, is unlikely to be true. And perhaps they will also not publish their failure to support this hypothesis. The circle of null results continues… :/

But you need to pick your battles. We are all just human beings and we do not have unlimited (research) energy. For both of these lacklustre or incoherent results I mentioned (and these are literally the only completed experiments we haven’t at least begun to write up), undergoing the pain of submission->review->rejection->repeat is a daunting task that simply doesn’t seem worth it.

So what to do? Well, the solution is again what I described. The very reason the task of publishing these results isn’t worth our energy is everything that is wrong with the current publication process! In my dream world, in which I can simply write up a manuscript formatted in a way that pleases me and then upload it to the pre-print peer-review site, my life would be infinitely simpler. No more perusing dense journal websites for their guide to authors or hunting for the Zotero/Endnote/Whatever style to format the bibliography. No more submitting your files to one horribly designed, clunky journal website after another, checking the same stupid tick boxes, adding the same reviewer suggestions. No more rewriting your cover letters by changing the name of the journal. Certainly for my student’s project it would not be hard to do, as there is already a dissertation that can be used as a basis for the manuscript. Dr Toffee’s experiment and its contradictory replication might require a bit more work – but to be fair, even there a previous manuscript already exists. So all we’d need to add would be the modifications of the methods and the results of my replication. In a world where all you need to do is upload the manuscript and address some reviewers’ comments to ensure the quality of the science, this should be fairly little effort. In turn it would ensure that the file drawer is empty and we are all much more productive.

This world isn’t here yet but there are journals that already offer something not too far off from it, namely F1000Research and PeerJ (and the Winnower also counts, although the content there seems to be different and I don’t quite know how much reviewing or editing happens there). So, maybe I should email Dr Toffee now…

(* In case you didn’t get this from the previous 2700ish words: the answer to this question is unequivocally “No.”)

Replies to Dorothy Bishop about RRs

I decided to respond now before I get inundated with the next round of overdue work I need to do this week… I was going to wait for Chris’s response, as I think your answers will probably overlap a bit, but there are a lot of deadlines and things to do, so now is a better time. I also decided to write my reply as a post because it is a bit long for a comment and others may find it interesting.

I think most of your answers illustrate how we all miss each other’s points a little. I am not talking about what RRs and prereg are like right now. Any evidence we have about them now is confounded by the fact that the approach is new and that the people trying it are probably for the most part its proponents. Most of the points I raised (except perhaps the last one) are issues that only really come into play once the approach has become normalised: when it is commonplace at many journals and has stopped being a special measure to improve science and is simply how things are done – a bit like standard peer review now (and you know how much people complain about that).

DB: Nope. You have to give a comprehensive account of what you plan, thinking through every aspect of rationale, methods and analysis: Cortex really doesn’t want to publish anything flawed and so they screw you down on the details.
DB: Why any more so than for other publication methods? I really find this concern quite an odd one.

I agree that detailed review is key but the same could be said about the standard system. I don’t buy that author reputation isn’t going to influence judgements there. Like most of us, I’m sure, I always try my best not to be influenced by it, but I think we’re kidding ourselves if we think we’re perfectly unbiased. If you get a somewhat lacklustre manuscript to review, you will almost inevitably respond better to an author with a proven track record in the field (who probably also possesses great writing skills) than to some nobody you have never heard of, especially if they’re failing to communicate their ideas well (e.g. because their native language isn’t English). Nevertheless the quality of their work could actually be equal.

Now I take your point that this is also an issue in the current system, but the difference is that RR Stage 1 reviews are just about evaluating the idea and the design. As a reviewer you’re lacking some information that could actually help you make a more informed choice. And it would be very disturbing if we told people what science they can or can’t do (at the good journals that offer RRs) just because of factors like this.

DB: Well, for a start registered reports require you to have very high statistical power and so humungous great N. Most people who just happened to do a study won’t meet that criterion. Second, as Chris pointed out in his talk, if you submit a registered report, then it goes out to review, and the reviewers do what reviewers do, i.e. make suggestions for changes, new conditions, etc etc. They do also expect you to specify your analysis in advance: that is one of the important features of RR.

I think this isn’t really answering my question. It should be very easy to come up with a “highly powered experiment” if you already know the effect size you eventually observed :P. And as I said in my post, I think many outcome-dependent changes to the protocol are about the analysis, not about the design. Again, my point is also that once RRs have become more normal and people have run out of steam a bit (so review quality may suffer compared to now), it may be a fairly easy thing to do. I could also see there being hybrids (i.e. people have already collected a fair bit of “pilot” data and just add a bit more under the registered protocol).
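To illustrate how easy this would be, here is a small hedged sketch (my own, not from this exchange; the effect size and sample size are invented numbers) of a perfectly standard power calculation run after the fact: plug in the effect size the finished study happened to produce, and the sample you already collected tends to come out looking “adequately powered”.

```python
# Hypothetical illustration of a post-hoc "power justification".
# The numbers are made up; only the logic matters.
from statsmodels.stats.power import TTestPower

observed_d = 0.6     # effect size already observed in the completed experiment
n_collected = 38     # participants already tested

analysis = TTestPower()
required_n = analysis.solve_power(effect_size=observed_d, power=0.8, alpha=0.05)
achieved_power = analysis.solve_power(effect_size=observed_d, nobs=n_collected, alpha=0.05)

print(f"N 'required' for 80% power at d = {observed_d}: {required_n:.0f}")
print(f"'Achieved' power with the sample already in hand: {achieved_power:.2f}")
# With the observed effect size plugged in, the already-collected sample easily clears
# the bar, so a protocol written after the fact can look "highly powered" on paper.
```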

But I agree that this is perhaps all a bit hypothetical. I was questioning the actual logic of the response to this criticism. In the end what matters though is how likely it is that people engage in that sort of behaviour. If pre-completed grant proposals are really as common as people claim I could see it happening – but that depends largely on how difficult it is compared to being honest. Perhaps you’re right and it’s just very unlikely.

DB: So you would be unlikely to get through the RR process even if you did decide to fake your time stamps (and let’s face it, if you’re going to do that, you are beyond redemption).

I’m sure we all agree on that but I wouldn’t put it past some people. I’ve seen cases where people threw away around a third of their data points because they didn’t like the results. I am not sure that fiddling with the time stamps (which may be easier than actively changing the data) is really all that much worse.

Of course, this brings us to another question: nothing in RRs or data sharing in general really stops people from excluding “bad” subjects. Again, this is of course no different from the status quo, but my issue is that having preregistered and open experiments clearly carries a certain value judgement for people (hell, the OSF actually operates a “badge” system!). So in a way a faked RR could end up being valued more than an honest, well-done non-RR. That does bother me.

DB: Since you yourself don’t find this [people stealing my ideas from RRs] all that plausible, I won’t rehearse the reasons why it isn’t.

Again, I was mostly pointing out the holes in the logic here. And of course whether or not it is plausible, a lot of people are quite evidently afraid of what Chris called the “boogieman” of being scooped. My point was that to allay this fear pointing to Manuscript Received dates is not going to suffice. But we all seem to agree that scooping is an exaggerated problem. I think the best way to deal with this worry is to stop people from being afraid of the boogieman in the first place.

DB: Your view on this may be reinforced by PIs in your institution. However, be aware that there are some senior people who are more interested in whether your research is replicable than whether it is sexy. And who find the soundbite form of reporting required by Nature etc quite inadequate.

This seems a bit naive to me. It’s not just about what “some senior people” think. I can’t with all honesty say that these factors don’t play into grant and hiring decisions. I also think it is a bit hypocritical to advise junior researchers not to pursue a bit of high impact glory when our own careers are at least in part founded on that (although mine isn’t nearly as much as some other people’s ;)). I do advise people that chasing high impact alone is a bad idea and that you should have a healthy selection of solid studies. But I can also tell from experience that a few high impact publications clearly open doors for you. Anyway, this is really a topic for a different day I guess.

DB: My own view is that I would go for a registered report in cases where it is feasible, as it has three big benefits – 1) you get good peer review before doing the study, 2) it can be nice to have a guarantee of publication and 3) you don’t have to convince people that you didn’t make up the hypothesis after seeing the data. But where it’s not feasible, I’d go for a registered protocol on OSF which at least gives me (3).

I agree this is eminently sensible. I think the (almost) guaranteed publication is probably a quite convincing argument to many people. And by god I can say that I have in the past wished for (3) – oddly enough it’s usually the most hypothesis-driven research where (some) people don’t want to believe you weren’t HARKing…

I think this also underlines an important point. The whole prereg discussion far too often revolves around negative issues. The critics are probably partly to blame for it but I think in general you often hear it mentioned as a response to questionable research practices. But what this discussion suggests is that there are many positive aspects about prereg so rather than being a cure to an ailing scientific process, it can also be seen as a healthier way to do science.

Some questions about Registered Reports

Recently I participated in an event organised by PhD students in Experimental Psychology at UCL called “Is Science broken?”. It involved a lively panel discussion in which the panel members answered many questions from the audience about how we feel science can be improved. The opening of the event was a talk by Chris Chambers of Cardiff University about Registered Reports (RR), a new publishing format in which researchers preregister their introduction and methods sections before any data collection takes place. Peer review takes place in two stages: first, reviewers evaluate the appropriateness of the question and the proposed experimental design and analysis procedures; then, after data collection and analysis have been completed and the results are known, peer review continues to finalise the study for publication. This approach aims to make scientific publishing independent of the pressure to obtain perfect results or to change one’s apparent hypothesis depending on the outcome of the experiments.

Chris’ talk was in large part a question and answer session addressing specific concerns with the RR approach that had been raised at other talks or in writing. Most of these questions he (and his coauthors) had already answered in a similar way in a published FAQ paper. However, it was nice to see him talk so passionately about this topic. Also, speaking for myself at least, I can say that seeing a person argue their case is usually far more compelling than reading an article on it – even though the latter will in the end probably have a wider reach.

Here I want to raise some additional questions about the answers Chris (and others) have given to some of these specific concerns. The purpose in doing so is not to “bring about death by a thousand cuts” to the RR concept, as Aidan Horner calls it. I completely agree with Aidan that many concerns people have with RRs (and lighter forms of preregistration) are probably logistical. It may well be that some people just really want to oppose this idea and are looking for any little reason to use as an excuse. However, I think both sides of this debate have suffered from a focus on potential outcomes rather than facts. We simply won’t know how much good or bad preregistration will do for science unless we try it. This seemed to be a point on which everyone at the discussion very much agreed, and we discussed ways in which we could actually assess, over the next few decades, whether RRs have improved science.

Therefore I want it to be clear that the points I raise are not an ideological opposition to preregistration. Rather they are points where I found the answers Chris describes not entirely satisfying. I very much believe that preregistration must be tried but I want to provoke some thought about possible problems with it. The sooner we are aware of these issues, the sooner they can be fixed.

Wouldn’t reviewers rely even more on the authors’ reputation?

In Stage 1 of an RR, when only the scientific question and experimental design are reviewed, reviewers have little to go on to evaluate the protocol. Provided that the logic of the question and the quality of the design are evident, they would hopefully be able to make some informed decisions about it. However, I think it is a bit naive to assume that the reputation of the authors isn’t going to influence the reviewers’ judgements. I have heard of many grant reviews questioning whether the applicants would be capable of pulling off the proposed research. There is an extensive research literature on how the evaluation of identical exam papers, job applications, and the like can be influenced by factors like the subject’s gender or name. I don’t think simply stating that “Author reputation is not among” the review criteria is enough of a safeguard.

I also don’t think that having a double-blind review system is necessarily a good way to protect against this. There have been wider discussions about the shortcomings of double-blind review and this situation is no different. In many situations you could easily guess the authors’ identity from the experimental protocol alone. And double-blind review suffers even more from one of the main problems with anonymous reviewers (which I generally support): when reviewers guess the authors’ identities incorrectly, the consequences could be even worse because their decision will be based on a mistaken assumption about who the authors are.

Can’t people preregister experiments they have already completed?

The general answer here is that this would constitute fraud. The RR format would also require time-stamped data files and lab logs to guarantee that the data were produced only after the protocol was registered. Both of these points are undeniably true. However, while there may be such a thing as an absolute ethical ideal, in the end a lot of our ethics are probably governed by majority consensus. The fact that many questionable research practices are apparently so widespread presumably reflects just that: while most people deep down understand that these things are “not ideal”, they may nonetheless engage in them because they feel that “everybody else does it.”

For instance, I often hear that people submit grant proposals for experiments they have already completed, although I have personally never seen this myself with any grant proposals. I have also heard that it is perhaps more common in the US, but judging from all the anecdotes it may be fairly widespread in general. Surely this is also fraudulent, and yet people apparently do it?

Regarding time stamped data, I also don’t know if this is necessarily a sufficiently strong safeguard. For the most part, time stamps are pretty easy to “adjust”. Crucially, I don’t think many reviewers or post-publication commenters will go through the trouble of checking them. Faking time stamps is certainly deep into fraud territory but people’s ethical views are probably not all black and white. I could easily see some people bending the rules just that little, especially if preregistered studies become a new gold standard in the scientific community.

Now perhaps this is a bit too pessimistic a view of our colleagues. I agree that we probably should not exaggerate this concern. But given the concerns people have with questionable research practices now I am not entirely sure we can really just dismiss this possibility by saying that this would be fraud.

Finally, another answer to this concern is that preregistering your experiment after the fact would backfire because the authors could then not implement any changes suggested by reviewers in Stage 1. However, this only applies to changes in the experimental design, the stimuli, the apparatus, and so on. The most confusing corners of the “garden of forking paths” usually lie in the analysis procedure, not the design. There are only so many ways to run a simple experiment and most minor design changes suggested by a reviewer could easily be argued away by the authors. However, changes to the analysis approach could quite easily be implemented after the results are known.
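As a hedged illustration of why analysis flexibility is the bigger worry – my own toy simulation, not from the post, with invented analysis variants and arbitrary numbers – even a handful of innocuous-looking post-hoc analysis choices can substantially inflate the false-positive rate while the design stays fixed:

```python
# Hypothetical sketch: with null data and a fixed design, trying a few alternative
# analysis paths and keeping the "best" one inflates the false-positive rate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_sims, alpha = 40, 5_000, 0.05
false_positives = 0

for _ in range(n_sims):
    data = rng.normal(0, 1, n)            # null data: there is no true effect
    covariate = rng.normal(0, 1, n)       # an unrelated nuisance variable
    p_values = [
        stats.ttest_1samp(data, 0).pvalue,                    # the planned test
        stats.ttest_1samp(data[np.abs(data) < 2], 0).pvalue,  # "after outlier removal"
        stats.pearsonr(data, covariate)[1],                   # "maybe it correlates with X"
        stats.ttest_1samp(data[: n // 2], 0).pvalue,          # "first half of the sample only"
    ]
    false_positives += min(p_values) < alpha  # report whichever analysis "worked"

print(f"False-positive rate across four analysis paths: {false_positives / n_sims:.1%}")
# Typically well above the nominal 5%, which is why preregistering the analysis matters.
```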

Reviewers could steal my preregistered protocol and scoop me

I agree that this concern is not overly realistic. In fact, I don’t even believe the fear of being scooped is overly realistic. I’m sure it happens (and there are some historical examples) but it is far rarer than most people believe. Certainly it is rather unlikely for a reviewer to do this. For one thing, time is usually simply not on their side. There is a lot that could be written about the fear of getting scooped and I might do that at some future point. But this is outside the scope of this post.

Whatever its actual prevalence, the paranoia (or boogieman) of scooping is clearly widespread. Until we find a way to allay this fear I am not sure that it will be enough to tell people that the Manuscript Received date of a preregistered protocol would clearly show who had the idea sooner. First of all, the Received date doesn’t really tell you when somebody had an idea. The “scooper” could always argue that they had the idea before that date but only ended up submitting the study afterwards (and I am sure that actually happens fairly often without scooping).

More importantly though, one of the main reasons people are afraid of being scooped is the pressure to publish in high impact journals. Having a high impact publication has greater currency than the Received date of an RR in what is most likely a lower impact journal. I doubt many people would actually check the date unless you specifically point it out to them. We already have a problem with people not reading the publications of job and grant applicants but relying on metrics like impact factors and h-indices. I don’t easily see them looking through that information.

As a junior researcher I must publish in high impact journals

I think this is an interesting issue. I would love nothing more than for us to stop caring about who published what where. At the same time I think that there is a role for high impact journals like Nature, Science or Neuron (seriously folks, PNAS doesn’t belong in that list – even if you don’t boycott it like I do…). I would like the judgement of scientific quality and merit to be completely divorced from issues of sensationalism, novelty, and news that quite likely isn’t the whole story. I don’t know how to encourage that change though. Perhaps RRs can help with that but I am not sure they’re enough. Either way, it may be a foolish and irrational fear but I know that as a junior scientist I (and my postdocs and students) currently do seek to publish at least some (but not exclusively) “high impact” research to be successful. But I digress.

Chris et al. write that sooner or later high impact outlets will probably come on board with offering RRs. I don’t honestly see that happening, at least not without a much more wide-ranging change in culture. I think RRs are a great format for specialised journals to have. However, the top impact journals primarily exist for publishing exciting results (whatever that means). I don’t think they will be keen to open the floodgates to lots of submissions that aim to test exciting ideas but fail to deliver the results to match them. What I could perhaps see is a system in which a journal like Nature would review a protocol and agree to publish it in its flagship journal if the results are positive but in a lower-impact outlet (e.g. Nature Communications) if the results are negative. The problem with this idea is that it goes against the egalitarian philosophy of the current RR proposals: the publication venue would once again depend on the outcome of the experiments.

Registered Reports are incompatible with short student projects

After all the previous fairly negative points I think this one is actually about a positive aspect of science. For me this is one of the greatest concerns. In my mind this is a very valid worry, and Chris and co also acknowledge it in their article. I think RRs would be a viable option for experiments by a PhD student, but for master’s students, who are typically around for a few months only, it is simply not very realistic to first submit a protocol and then revise it over weeks and months of reviews before even collecting the first data.

A possible solution suggested for this problem is that you could design the experiments and have them approved by peer reviewers before the students commence. I think this is a terrible idea. For me perhaps the best part of supervising student projects in my lab is when we discuss the experimental design. The best students typically come with their own ideas and make critical suggestions and improvements to the procedure. Not only is this phase very enjoyable but I think designing good experiments is also one of the most critical skills for junior scientists to learn. Having the designs finalised before the students even step through the door of the lab would undermine that.

Perhaps in those cases it would make more sense to just use light preregistration, that is, uploading your protocol to a time-stamped archive without external review. But if RRs do become the new gold standard in the future, I would worry that this would denigrate the excellent research projects of many students.

Wrapping up…

As I said, these points are not meant to shoot down the concept of Registered Reports. Some of the points may not even be such enormous concerns at all. However, I hope that my questions provoke thought and that we can discuss ways to improve the concept further and find safeguards against these possible problems.

Sorry this post was very long as usual but there seems to be a lot to say. My next post though will be short, I promise! 😉