
Would the TEF have passed peer review?

Today we have the first NeuroNeurotic guest post! The following short rant was written by my colleague Lee de Wit. In it he talks about the recently published “Teaching Excellence Framework” (TEF), in which UK universities are ranked based on the quality of their undergraduate teaching… If you are also a psychologist and would like to write and sign a more formal letter/editorial to the Times Higher Education outlining these points, email l.de-wit@ucl.ac.uk

As psychologists, when we create a new way of measuring something complex (like a personality trait), we have to go to rigorous lengths to demonstrate that the measures we use are valid and reliable, and that they classify people meaningfully.

When it comes to measuring teaching in higher education, however, it seems we can just lower the standards. Apparently the TEF is meant to help students make meaningful choices, yet I can see no evidence that it is a valid measure, no evidence that it is reliable, and no evidence that it meaningfully clusters Universities.

Validity – One of the key measures used in the TEF is student satisfaction scores – yet we already know that these are not a valid measure of teaching quality. In fact, there are meta-analyses demonstrating that high student satisfaction scores don’t even correlate with learning outcomes.

Reliability – Apparently it is fine to just have a panel of 27 people make some subjective judgements about the quantitative data in order to classify Universities. No need to have two panels rate them independently and then check that they come to similar judgements.

Clustering – In terms of the underlying distribution of the data, no need to seriously think about whether there are meaningful clusters or more continuous variability. Gold, Silver and Bronze – job done.

If you are one of the academics tweeting today about your University’s strong result, I would seriously call into question the excellence with which you can teach critical thinking to your students.

The one lesson I would take from this for UK Universities is that we are clearly failing to educate politicians and policy makers to think carefully about evidence-based policy. Presumably most of the key players in designing and implementing the TEF went to UK Universities. So I’m worried about what they learnt there that made them think this was a good idea.

The Day the Palm hit the Face

[Image: triple facepalm]

Scientists are human beings. I get it. I really do because – all contrary reports and demonic possessions aside – I’m a human being, too. So I have all manner of sympathy for people’s hurt feelings. It can hurt when somebody criticizes you. It may also be true that the tone of criticism isn’t always what it should be to avoid that hurt.

In this post, I want to discuss ways to answer scientific criticism. I haven’t always followed this advice myself because, as I said, I’m human. But I am at least trying. The post was sparked by an as-yet unpublished editorial by a certain ex-president of the APS. I don’t want to discuss the rather inflammatory statements in that article specifically, as doing so would do no good. Since it isn’t officially published, it may still change anyway. And the last time I blogged about an unpublished editorial, I received a cease-and-desist letter forcing me to embargo my post for two full hours.

I believe most people would agree that science is an endeavor of truth seeking. It attempts to identify regularities in our chaotic observations of the world that can help us understand the underlying laws that govern it. So when multiple people are unable to replicate a previous claim, this casts doubt on the claim’s validity as a regularity of nature.

The currency of science should be evidence. Without any evidence, a claim is worthless. So if someone says “I don’t think this effect is real” but offers no evidence for that statement, be it a failed replication or a reanalysis of the same data showing the conclusions are erroneous, then you have every right to ignore them. But if they do offer evidence, this cannot be ignored. It is simply not good enough to talk about “hidden moderators” or complain about the replicators’ incompetence. Without evidence, these statements are hollow.

Whether you agree with it in principle or not, preregistration of experimental designs has become something of a standard in replication studies (and is becoming increasingly common in general). So when faced with a replication failure, and given that people of that ilk are evidently worried about analytical flexibility and publication bias, surely it shouldn’t be very surprising that they won’t just be convinced by your rants about untested moderators or Google searches of ancient conceptual replications, let alone by your accusations of “shameless bullying” or “methodological terrorism”. Instead, what might possibly convince them is a preregistered and vetted replication attempt in which you do right all of the things that these incompetent buffoons did wrong. This proposal was recently outlined very well by Brent W Roberts. More generally, it is the ground-breaking, revolutionary concept that scientific debates should be fought with equivalent evidence instead of childish playground tactics and special pleading.

Granted, some might not be convinced even by that. And that’s fine, too. Skepticism is part of science. Also, some people are not convinced by any evidence you show them. It is actually not your job as a scientist to convince all your critics. It is your job to test theories and evaluate the evidence dispassionately. If your evidence is solid, the scientific community will come around eventually. If your evidence consists only of shouting about hidden moderators and nightmare stories of people fearing tenure committees because someone failed to replicate your finding, then I doubt it will pass the test of time.

And maybe, just maybe, science is also about changing your mind when you realize that the evidence simply doesn’t support your previous thinking. I don’t think any single failed replication is enough to do that, but a set of failed replications should certainly at least push you in that direction. As far as I can see, nobody who ever published a replication failure has even suggested that people should be refused tenure or lose their research program or whatever. I can’t speak for others, but if someone applied for a job with me and openly discussed the fact that a result of theirs failed to replicate and/or that they had to revise their theories, this would work strongly in their favor compared to a candidate brimming with confidence who only published Impact Factor > 30 papers, none of which have been challenged. And, in a story I think I told before, one of my scientific heroes was a man who admitted, without my bringing it up, that the results of his Science paper had been disproven.

Seriously, people, get a grip. I am sympathetic to the idea that criticism hurts, that we should perhaps be more mindful of just how snarky and frivolous we are with our criticism, and that there is a level of glee associated with how replication failures are publicized. But there is also a lot of glee in how high-impact papers are publicized and presented in TED talks. If you want the former to stop, perhaps we should also finally put an end to the bullshitting altogether. Anyway, I will conclude with a quote by another of my heroes and let my unbridled optimism flow, in spite of it all:

In science it often happens that scientists say, ‘You know that’s a really good argument; my position is mistaken,’ and then they would actually change their minds and you never hear that old view from them again. They really do it. It doesn’t happen as often as it should, because scientists are human and change is sometimes painful. But it happens every day. I cannot recall the last time something like that happened in politics or religion.
– Carl Sagan

On the magic of independent piloting

TL;DR: Never decide whether to run a full experiment based on whether one of the small pilots in which you tweaked your paradigm supported the hypothesis. Use small pilots only to ensure the experiment produces high-quality data, judged by criteria that are unrelated to your hypothesis.

Sorry for the bombardment with posts on data peeking and piloting. I felt this would have cluttered up the previous post so I wrote a separate one. After this one I will go back to doing actual work though, I promise! That grant proposal I should be writing has been neglected for too long…

In my previous post, I simulated what happens when you conduct inappropriate pilot experiments by running a small experiment and then continuing data collection only if the pilot produces significant results. This is really data peeking, and it shouldn’t come as much of a surprise that it inflates false positives and massively skews effect size estimates. I hope most people realize that this is a terrible thing to do because it makes your final results dependent on the interim outcome. Quite possibly, some people would have learned about this in their undergrad stats classes. As one of my colleagues put it, “if it ends up in the final analysis it is not a pilot.” Sadly, I don’t think this is as widely known as it should be. I was not kidding when I said that I have seen it happen before or overheard people discussing having done this type of inappropriate piloting.

But anyway, what is an appropriate pilot then? In my previous post, I suggested you should redo the same experiment but restart data collection. You now stick to the methods that gave you a significant pilot result. Now the data set used to test your hypothesis is completely independent, so it won’t be skewed by the pre-selected pilot data. Put another way, your exploratory pilot allows you to estimate a prior, and your full experiment seeks to confirm it. Surely there is nothing wrong with that, right?

I’m afraid there is, and it is actually obvious why: your small pilot experiment is underpowered to detect real effects, especially small ones. So if you use inferential statistics to determine whether a pilot experiment “worked,” this small pilot is biased towards detecting larger effect sizes. Importantly, this does not mean you bias your experiment towards larger effect sizes. If you only continue the experiment when the pilot was significant, you are ignoring all of the pilots that would have shown true effects but which – due to the large uncertainty (low power) of the pilot – failed to do so purely by chance. Naturally, the proportion of these false negatives becomes smaller the larger you make your pilot sample – but since pilots are by definition small, the error rate is pretty high in any case. For example, for a true effect size of δ = 0.3, the false negative rate with a pilot sample of 2 per group is 95%. With a pilot sample of 15, it is still as high as 88%. Just for illustration, I show below the false negative rates (1 − power) for three different true effect sizes. Even for quite decent effect sizes, the sensitivity of a small pilot is abysmal:

[Figure: False negative rates (1 − power) of a small pilot as a function of pilot sample size, for three different true effect sizes]
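If you want to verify these numbers yourself, here is a minimal sketch of the calculation by brute-force simulation. This is not the MatLab code linked at the end of the post; it assumes a two-sample design with the quoted sample size per group, and ttest2 requires the Statistics and Machine Learning Toolbox.

```matlab
% Approximate the false negative rate (1 - power) of a small two-sample
% pilot for an assumed true effect of delta = 0.3, by simulation.
delta  = 0.3;              % true standardized effect size
pilotN = [2 5 10 15];      % pilot subjects per group
nSims  = 10000;            % simulated pilots per sample size

for n = pilotN
    sig = zeros(nSims, 1);
    for s = 1:nSims
        x = randn(n, 1);             % "control" group: mean 0, SD 1
        y = randn(n, 1) + delta;     % "experimental" group: mean delta, SD 1
        sig(s) = ttest2(x, y);       % h = 1 if p < 0.05 (two-tailed)
    end
    fprintf('n = %2d per group: false negative rate ~ %.2f\n', n, 1 - mean(sig));
end
```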

Thus, if you only pick the pilot experiments with significant results to turn into real experiments, you are deluding yourself into thinking that the methods you piloted are somehow better (or “precisely calibrated”). Remember, this is based on a theoretical scenario in which the effect is real and of fixed strength. Every single pilot experiment you ran investigated the same underlying phenomenon, and any difference in outcome is purely due to chance – the tweaking of your methods had no effect whatsoever. You are wasting all manner of resources on piloting the methods you then want to test.

So frequentist inferential statistics on pilot experiments are generally nonsense. Pilots are by nature exploratory. You should only determine significance for confirmatory results. But what are these pilots good for? Perhaps we just want to have an idea of what effect size they can produce and then do our confirmatory experiments for those methods that produce a reasonably strong effect?

I’m afraid that won’t do either. I simulated this scenario in a similar manner to my previous post. 100,000 times, I generated two groups (with a full sample size of n = 80, although the full sample size isn’t critical for this). Both groups are drawn from a population with standard deviation 1, but one group has a mean of zero while the other’s mean is shifted by 0.3 – so we have a true effect here (the actual magnitude of this true effect size is irrelevant for the conclusions). In each of the 100,000 simulations, the researcher runs a number of pilot subjects per group (plotted on the x-axis). Only if the effect size estimate for this pilot exceeds a certain criterion does the researcher run an independent, full experiment. The criterion is either 50%, 100%, or 200% of the true effect size. Obviously, the researcher cannot know the true effect size; I simply use these criteria as something a researcher might plausibly do in a real-world situation. (For the true effect size I used here, these criteria correspond to d = 0.15, d = 0.3, and d = 0.6, respectively.)
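For illustration, here is a minimal sketch of this selection step for a single pilot size and a single criterion. It is not the MatLab code linked at the end of the post; the specific numbers are just examples following the description above.

```matlab
% Sketch: run a small pilot, and only proceed to an independent full
% experiment if the pilot Cohen's d exceeds a criterion.
delta  = 0.3;              % true effect size
nPilot = 5;                % pilot subjects per group
nFull  = 80;               % full sample per group
crit   = 0.5 * delta;      % criterion: 50% of the true effect, i.e. d = 0.15
nSims  = 10000;            % number of simulated attempts

proceeded = false(nSims, 1);
dFull     = nan(nSims, 1);
for s = 1:nSims
    % --- pilot (its data are discarded afterwards) ---
    xp = randn(nPilot, 1);
    yp = randn(nPilot, 1) + delta;
    sp = sqrt((var(xp) + var(yp)) / 2);        % pooled SD
    if (mean(yp) - mean(xp)) / sp < crit       % pilot Cohen's d below criterion:
        continue;                              % abandon this "paradigm"
    end
    proceeded(s) = true;
    % --- independent, full experiment ---
    x = randn(nFull, 1);
    y = randn(nFull, 1) + delta;
    sf = sqrt((var(x) + var(y)) / 2);
    dFull(s) = (mean(y) - mean(x)) / sf;
end
fprintf('False negative rate at the pilot stage: %.2f\n', 1 - mean(proceeded));
fprintf('Mean effect size of completed experiments: %.2f\n', mean(dFull(proceeded)));
```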

The results are below. The graph on the left once again plots the false negative rates against the pilot sample size. A false negative here is not based on significance but on effect size: any simulation for which the pilot d fell below the criterion. When the criterion is equal to the true effect size, the false negative rate is constant at 50%. The reason for this is obvious: each pilot’s effect size estimate comes from a sampling distribution centered on the true effect of 0.3, so half of these estimates will exceed that value and half will not. However, when the criterion is not equal to the true effect, the false negative rate depends on the pilot sample size. If the criterion is lower than the true effect, false negatives decrease with sample size; if the criterion is stricter than the true effect, they increase. Either way, the false negative rates are substantially greater than the 20% mark you would have with an adequately powered experiment. So you will still delude yourself a considerable number of times if you only conduct the full experiment when your pilot shows a particular effect size. Even if your criterion is lax (and d = 0.15 for a pilot sounds pretty lax to me), you are missing a lot of true results. Again, remember that all of the pilot experiments here investigated a real effect of fixed size. Tweaking the method makes no difference – the difference between simulations is simply due to chance.

Finally, the graph on the right shows the mean effect sizes estimated by your completed experiments (not the absolute values this time!). The criterion you used in the pilot makes no difference here (all colors are at the same level), which is reassuring. However, all is not necessarily rosy. The open circles plot the effect size you get under publication bias, that is, if you only publish the significant experiments with p < 0.05. This estimate is clearly inflated compared to the true effect size of 0.3. The asterisks plot the effect size estimate if you take all of the experiments. This is the situation you would have (Chris Chambers will like this) if you did a Registered Report for your full experiment and publication of the results were guaranteed irrespective of whether or not they are significant. On average, this effect size is an accurate estimate of the true effect.

[Figure: Left – false negative rates at the pilot stage as a function of pilot sample size and criterion. Right – mean effect size estimates of the completed experiments, with and without publication bias]

Again, these are only the experiments that were lucky enough to go beyond the piloting stage. You already wasted a lot of time, effort, and money to get here. While the final outcome is solid if publication bias is minimized, you have thrown a considerable number of good experiments into the trash. You’ve also misled yourself into believing that you conducted a valid pilot experiment that honed the sensitivity of your methods when in truth all your pilot experiments were equally mediocre.

I have had a few comments from people saying that they are only interested in large effect sizes, and surely that means they are fine? I’m afraid not. As I said earlier, the principle here does not depend on the true effect size. It is solely a consequence of the low sensitivity of the pilot experiment. Even with a large true effect, your outcome-dependent pilot is a blind chicken stumbling around in the dark until it is lucky enough to hit a true effect more or less by chance. For this to happen you must use a very low criterion for turning your pilot into a real experiment. This, however, also means that if the null hypothesis is true, an unacceptable proportion of your pilots will produce false positives. Again, remember that the tweaking in your piloting is completely meaningless – you’re simply chasing noise here. It means that your decision whether to go from pilot to full experiment is (almost) completely arbitrary, even when the true effect is large.

So, for instance, when the true effect is a whopping δ = 1 and you are using d > 0.15 as the criterion in a pilot of 10 subjects per group (which is already large for the pilots I typically hear about), your false negative rate is nice and low at ~3%. But critically, if the null hypothesis of δ = 0 is true, your false positive rate is ~37%. How often you will fool yourself by turning a pilot into a full experiment depends on the base rate. If you give this hypothesis a 50:50 chance of being true, almost one in three of the pilots you green-light will lead you to chase a false positive. If the odds of the hypothesis being true are lower (which they very well may be), the situation becomes increasingly worse.
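To see where the “almost one in three” comes from, here is the base-rate arithmetic spelled out. The two pass rates are simply the ones quoted above, so this is a worked check rather than a new simulation.

```matlab
% Rough check of the "almost one in three" figure under a 50:50 base rate.
pPassH1 = 0.97;    % pilot exceeds d > 0.15 when delta = 1 (1 minus ~3% false negatives)
pPassH0 = 0.37;    % pilot exceeds d > 0.15 when delta = 0 (~37% false positives)
prior   = 0.5;     % assumed prior probability that the effect is real

% Among the pilots that pass the criterion and get turned into full
% experiments, this is the fraction that are chasing a null effect:
pFalseChase = (1 - prior) * pPassH0 / ((1 - prior) * pPassH0 + prior * pPassH1);
fprintf('Proportion of green-lit pilots chasing a false positive: %.2f\n', pFalseChase);
```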

What should we do then? In my view, there are two options. Either run a well-powered confirmatory experiment that tests your hypothesis based on an effect size you consider meaningful. This is the option I would choose if resources are a critical factor. Alternatively, if you can afford the investment of time, money, and effort, you could run an exploratory experiment with a reasonably large sample size (that is, larger than a pilot). If you must, tweak the analysis at the end to figure out what hides in the data. Then run a well-powered replication experiment to confirm the result. The power for this should be high enough to detect effects that are considerably weaker than the exploratory effect size. This exploratory experiment may sound like a pilot, but it isn’t, because it has decent sensitivity and the only resource you might be wasting is your time* during the exploratory analysis stage.

The take-home message here is: don’t make the decision to run a full experiment dependent on whether your pilot supported your hypothesis, even if you use independent data. It may seem like a good idea but it’s tantamount to magical thinking. Chances are that you did not refine your method at all. Again (and I apologize for the repetition, but it deserves repeating): this does not mean all small piloting is bad. If your pilot is about assuring that the task isn’t too difficult for subjects, that your analysis pipeline works, that the stimuli appear as you intended, that the subjects aren’t using a different strategy to perform the task, or quite simply about reducing measurement noise, then it is perfectly valid to run a few people first, and it can even be justified to include them in your final data set (although that last point depends on what you’re studying). The critical difference is that the criteria for green-lighting a pilot experiment must be completely unrelated to the hypothesis you are testing.

(* Well, your time and the carbon footprint produced by your various analysis attempts. But if you cared about that, you probably wouldn’t waste resources on meaningless pilots in the first place, so this post is not for you…)

MatLab code for this simulation.

On the worthlessness of inappropriate piloting

So this post is just a brief follow-up to my previous post on data peeking. I hope it will be easy to see why the two are closely related:

Today I read this long article about the RRR of the pen-in-mouth experiments – another in a growing list of failures to replicate classical psychology findings. I was quite taken aback by one comment in it: the assertion that these classical psychology experiments (in particular the social priming ones) had been “precisely calibrated to elicit tiny changes in behavior.” It is an often-repeated argument to explain why findings fail to replicate – the “replicators” simply do not have the expertise and/or skill to redo these delicate experiments. And yes, I am entirely willing to believe that I’d be unable to replicate a lot of experiments outside my area, say, finding subatomic particles or even (to take an example from my general field) difficult studies on clinical populations.

But what does this statement really mean? How were these psychology experiments “calibrated” before they were run? What did the authors do to nail down the methods before they conducted the studies? It implies that extensive pilot experiments were conducted first. I am in no position to say that this is what the authors of these psychology studies did during their piloting stage, but one possibility is that several small pilot experiments were run and the experimental parameters were tweaked until a significant result supporting the hypothesis was observed. Only then did they continue the experiment and collect a full data set that included the pilot data. I have seen and heard of people who did precisely this sort of piloting until the “experiment worked.”

So what actually happens when you “pilot” experiments to “precisely calibrate” them? I decided to simulate this, and the results are in the graph below (each data point is based on 100,000 simulations). In this simulation, an intrepid researcher first runs a small number of pilot subjects per group (plotted on the x-axis). If the pilot fails to produce significant results at p < 0.05, the experiment is abandoned and the results are thrown in the bin, never to see the light of day again. However, if the results are significant, the eager researcher collects more data until the full sample in each group is n = 20, 40, or 80. On the y-axis I plot the proportion of these continued experiments that produced significant results. Note that all simulated groups were drawn from a normal distribution with mean 0 and standard deviation 1. Therefore, any experiments that “worked” (that is, came out significant) are false positives. In a world where publication bias is still commonplace, these are the findings that make it into journals – the rest vanish in the file drawer.
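Here is a minimal sketch of that scheme for one combination of pilot and full sample size. Again, this is not the MatLab code linked at the end of the post, and ttest2 requires the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: pilot, keep only significant pilots, then top up to the full
% sample *including* the pilot data. All data are pure noise, so every
% continued experiment that stays significant is a false positive.
nPilot = 10;       % pilot subjects per group
nFull  = 20;       % final sample per group (pilot data included)
nSims  = 10000;    % number of simulated pilot attempts

continued = 0;     % pilots that were significant and hence continued
stillSig  = 0;     % continued experiments that remain significant at the end
for s = 1:nSims
    xp = randn(nPilot, 1);                 % null is true: both groups ~ N(0,1)
    yp = randn(nPilot, 1);
    if ~ttest2(xp, yp), continue; end      % pilot "didn't work": into the bin
    continued = continued + 1;
    x = [xp; randn(nFull - nPilot, 1)];    % top up both groups to the full n,
    y = [yp; randn(nFull - nPilot, 1)];    % keeping the pilot data
    stillSig = stillSig + ttest2(x, y);
end
fprintf('Proportion of continued experiments that are significant: %.2f\n', ...
        stillSig / continued);
```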

 

[Figure: Proportion of continued experiments that come out significant (false positives), as a function of pilot sample size, for full samples of n = 20, 40, and 80 per group]

As you can see, such a scheme of piloting until the experiment “works” can produce an enormous proportion of false positives among the completed experiments. Perhaps this is not really all that surprising – after all, this is just another form of data peeking. Critically, I don’t think this is unrealistic. I’d wager this sort of thing is not at all uncommon. And doesn’t it seem harmless? After all, we are only peeking once! If a pilot experiment “worked,” we simply continue sampling until the sample is complete.

Well, even under these seemingly benign conditions, false positives can be inflated dramatically. The black curve is for the case where the final sample size of the completed studies is only 20. This is the worst case, and it is perhaps unrealistic. If the pilot experiment consists of 10 subjects per group (that is, half the full sample), about a third of the completed results will be flukes. But even in the other cases, when only a handful of pilot subjects are collected compared to the much larger full samples, false positives are well above 5%. In other words, whenever you pilot an experiment and decide that it’s “working” because it seems to support your hypothesis, you are already skewing the final outcome.

Of course, the true false positive rate, taken across the whole set of 100,000 pilots that were run, would be much lower (0.05 times the rates I plotted above to be precise, because we picked from the 5% of significant “pilots” in the first place). However, since we cannot know how much of this inappropriate piloting went on behind the scenes, knowing this isn’t particularly helpful.

More importantly, we aren’t only interested in the false positive rate. A lot of researchers will care about the effect size estimates of their experiments. Crucially, this form of piloting will substantially inflate these effect size estimates as well and this may have even worse consequences for the interpretation of these experiments. In the graph below, I plot the effect sizes (the mean absolute Cohen’s d) for the same simulations for which I showed you the false positive rates above. I use the absolute effect size because the sign is irrelevant – the whole point of this simulation exercise is to mimic a full-blown fishing expedition via inappropriate “piloting.” So our researcher will interpret a significant result as meaningful regardless of whether d is positive or negative.

Forgive the somewhat cluttered plot, but it’s not that difficult to digest really. The color code is the same as in the previous figure. The open circles and solid lines show you the effect size of the experiments that “worked,” that is, the ones for which we completed data collection and which came out significant. The asterisks and dashed lines show the effect sizes for all “global false positives,” that is, all simulations with a significant pilot at p < 0.05, but with the effect size computed over the full data set, as if you had completed every one of these experiments. Finally, the crosses and dotted lines show the effect sizes you get for all simulations (ignoring inferential statistics altogether). This is just given as a reference.

[Figure: Mean absolute effect size estimates as a function of pilot sample size, for completed experiments that “worked” (open circles), all simulations with significant pilots (asterisks), and all simulations (crosses)]

Two things are notable about all this. First, effect size estimates increase with “pilot” sample size for the set of global false positives (asterisks) but not for the other curves. This is because the “pilot” sample size determines how strongly the fluke pilot effect contributes to the final effect size estimate. More importantly, the effect size estimates for those experiments with significant pilots and which also “worked” after completion are massively exaggerated (open circles); the degree of exaggeration can be seen relative to the baseline estimate (crosses). The absolute effect size estimate depends on the full sample size: at the smallest full sample size (n = 20, black curve) the effect sizes are as high as d = 0.8. Critically, the degree of exaggeration does not depend on how large your pilot sample was. Whether your “pilot” had only 2 or 15 subjects, the average effect size estimate is around 0.8.

The reason for this is that the smaller the pilot experiment, the more underpowered it is. Since it is a condition for continuing the experiment that the pilot must be significant, the pilot effect size must be considerably larger for small pilots than for larger ones. Because the true effect size is always zero, this cancels out in the end, so the final effect size estimate is constant regardless of the pilot sample size. But in any case, the effect size estimates you get from your “precisely calibrated,” inappropriately piloted experiments are enormously inflated. It shouldn’t be much of a surprise if these results don’t replicate and if post-hoc power calculations based on these effect sizes suggest low power (of course, you should never use post-hoc power in that way, but that’s another story…).
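For the skeptical reader, here is a minimal sketch of that check: under the null, select significant pilots, complete them to n = 20 per group including the pilot data, and average |d| over the ones that remain significant. This is again not the linked MatLab code, and ttest2 requires the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: the completed experiments that "work" end up with a similarly
% inflated |Cohen's d| regardless of how big the pilot was.
nFull = 20;                 % final sample per group
nSims = 20000;              % simulated pilot attempts per pilot size

for nPilot = [2 5 10 15]
    dWorked = [];
    for s = 1:nSims
        xp = randn(nPilot, 1);  yp = randn(nPilot, 1);   % null is true
        if ~ttest2(xp, yp), continue; end                % pilot must be significant
        x = [xp; randn(nFull - nPilot, 1)];              % complete the sample,
        y = [yp; randn(nFull - nPilot, 1)];              % pilot data included
        if ttest2(x, y)                                  % final result significant
            dWorked(end+1) = abs(mean(x) - mean(y)) / ...
                sqrt((var(x) + var(y)) / 2);             % absolute Cohen's d
        end
    end
    fprintf('Pilot n = %2d: mean |d| of experiments that "worked" = %.2f\n', ...
            nPilot, mean(dWorked));
end
```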

So what should we do? Ideally, you should just throw away the pilot data, preregister the design, and restart the experiment anew with the methods you piloted. In this case the results are independent and only the methods are shared. Importantly, there is nothing wrong with piloting in general – after all, I wrote a previous post praising pilot experiments. But piloting should be about ensuring that the methods are effective at producing clean data. There are many situations in which an experiment seems clever and elegant in theory, but once you actually run it in practice you realize that it just can’t work. Perhaps the participants don’t use the task strategy you envisioned. Or they simply don’t perceive the stimuli the way they were intended. In fact, this happened to us recently, and we may have stumbled onto an interesting finding in its own right (but this must also be confirmed by a proper experiment!). In all these situations, however, the decision based on the pilot results is unrelated to the hypothesis you are testing. If it is related, you must account for that.

MatLab code for these simulations is available. As always, let me know if you find errors. (To err is human, to have other people check your code divine?)

On brain transplants, the Matrix, and Dualism

Warning: This post contains spoilers for the movie The Matrix.

Today a tweet by Neuroskeptic pointed me to this post entitled “You are not your brain: Why a head transplant is not what you think it is“. The title initially sparked my interest because it touches on a topic I have been thinking about a lot. I am actually writing a novel that deals with topics such as the scientific study of unconsciousness, non-free will, and disembodied cognition*. This issue is therefore directly relevant to me.

Unfortunately, this particular post does not really deal with the topic in any depth but only espouses a trivial form of mind-brain dualism. It discusses some cherry-picked findings without any proper understanding of current neuroscientific knowledge and brushes aside most scientific arguments about consciousness as “bizarre” claims, without providing any concrete argument for why that is so. Don’t get me wrong, some claims by neuroscientists about free will and consciousness are probably on logically shaky ground, and neuroscientists themselves frequently espouse a form of inadvertent dualism in their own writing about how the brain relates to the mind. However, this post doesn’t really discuss these issues in an adequate way – but go and read it and make up your own mind.

Either way, I find the general thought experiment intriguing. What would actually happen if we could transplant a human brain (or the whole head) into a different body? Let’s ignore for the moment the fact that our surgical technology is nowhere near the point where we could do this with humans and allow the transplanted head to actually control the new host body. Instead, let us assume that we can in fact connect up all the peripheral neurons and muscles in the body to the corresponding neurons in the transplanted brain.

Thinking about this already reveals the first problem: there is bound to be a mismatch between the number of neurons in the body and in the brain. Perhaps this doesn’t matter, and some afferent and efferent nerve fibers need not be connected up to the brain, or – vice versa – some of the brain’s neurons need not receive any input or have any targets in the body. If the bulk of the brain is connected up properly, perhaps this suffices? In any case, our brains are calibrated to our bodies, so placing a brain into a new body must inevitably throw this calibration completely out of whack. Perhaps this can be overcome and a new calibration can emerge, but to what extent this is possible is anybody’s guess.

A related problem is how the brain represents the body in which it has been placed. Somehow we carry in our minds a body image that encodes the space our body occupies, how it looks, how it feels, and so on. There are illusions that distort this representation of our own bodies. Some malfunction or fluke in that system could also explain some out-of-body experiences, although such phenomena are of course difficult to study. It seems pretty likely, however, that such experiences would be exacerbated in a person whose brain has been transplanted into a new body. Imagine you are 1.5 meters tall but your brain has been transplanted into the body of an NBA player. Your experience of the world through this new, much taller body must inevitably involve far more than simply looking at the world from a higher vantage point. After a lifetime of existing in your short body, you have no representation of the sensory experiences related to being 2 meters tall, nor of the feats your muscles are capable of when you can slam dunk. It is possible that we could learn to live in this new corporeal shell, but who knows whether that is the case.

In that sense, there may actually be some truth to the claim in the aforementioned post that there is some kind of “bodily memory”. For one thing, the flexibility and strength of various muscles are presumably related to what you do with them on a daily basis. Who knows, perhaps the various nervous tissues in the body also undergo other forms of synaptic plasticity we don’t yet know about? Of course, none of this suggests – as this post claims – that much of your self is in fact stored inside your body or that you become the host person. The brain is undoubtedly the seat of consciousness and of much of your memory, including the fine procedural or motor memory that you take for granted. But I think it is fair to say that having your brain wired up to a new body would make you experience the world in uniquely different ways. Insofar as your perception affects how you interact with the world, this may very well alter your personality and thus really change who you are.

I view other thought experiments in the same vein, such as the common science-fiction notion that we could one day upload our brains to a computer. Even if we had a computer with the capacity to store a complete wiring diagram and the synaptic weights of all the neurons in the human brain, and even if neuronal wiring diagrams were all there is to processing in the brain (thus completely ignoring the role of astrocytes, the possibly important functional roles of particular ion channel proteins, or slow neurochemical transmission), whatever this stored would presumably not really be the person’s mind. Simply running such a network in a computer would effectively cut this brain off from its host body, and in this way it would be comparable to the brain transplant situation. It is difficult to imagine what such a brain would in fact experience in silico. To approximate normal functioning, you would also have to simulate the sensory inputs and the reciprocal interactions between the simulated brain and the simulated world this brain inhabits. This would be a bit like the Matrix (although that movie does not involve disembodiment). It is hard to imagine what this might really feel like. Quite likely, at least the cruder beta versions of such a simulation would feel highly uncanny because they wouldn’t accurately capture real-world experience. In fact, this is part of the plot of The Matrix: the protagonist senses that something isn’t right with the world.

I find this topic quite fascinating. Whatever the case may be, I think it is safe to say that we are not just our brains. As opposed to the simplistic notion of how the brain works suggested by many science fiction and fantasy stories, our minds aren’t merely software running on brain hardware. Our brain exists inside a body, and that presumably must have some influence on the mind. I don’t buy into much of the embodied cognition literature, as a lot of that also seems very simplistic. I certainly agree in large part with Chaz Firestone and Brian Scholl that it has not really been demonstrated that things like the heaviness of a backpack can affect your perception of the steepness of the hill before you. But at the same time, I think some degree of embodiment must exist, precisely because I am not a dualist. I don’t think there is any evidence to suggest the mind simply floats in the ether, completely removed from the brain and body. Rather, it is an emergent property of the brain – a brain that is intricately connected to the rest of the body it resides in (and even that is simplistic: I would in fact say that the brain is part of the body).

Coming back to that post about head transplants: it appeared on a website called Religion News, and the author is a professor of theological and social ethics. As such, it is unsurprising that he discusses a dualist view of the mind and criticizes some of the neuroscientific claims that conflict with that notion. However, his argument is quite odd when you dig a little deeper: rather than saying that your self arises in your brain, the author implicitly suggests that it is inherent to your body – he literally states that the person whose brain is transplanted dies because they are in a new body. He further suggests that any children the patient would have in their new body would not be theirs but those of the host body. While genetically this argument is correct, it completely ignores the fact that there is now a new mind driving the body. Whatever distortions and changes to this mind may result from the brain transplant, it is clearly wrong to claim that the host body would completely override the mind inside the transplanted brain. Yes, biologically the children would be those of the host body, but mentally they would be the children of the transplanted mind. Claiming otherwise is equivalent to suggesting that adoptive parents are not real parents.

In conclusion, I agree that there are some interesting philosophical and theological ramifications to consider about brain transplants. If you believe in the existence of a soul, it is not immediately obvious how you should interpret such a case. I don’t think science can give you the answer to that – that is between you and your rabbi or guru or whoever holds your spirituality together. But one thing seems clear to me: the soul is not inherently attached to your body any more than it is to your brain. No, you are definitely not just your brain. But you aren’t just your body either.

 

[Image: a human brain]
This fella knows how to have a fun time! (Source)

 

(* Work on this is going very slowly so don’t get your hopes up you’ll see any of this anytime soon – it’s more of a lifetime project…)