In response to failed replications, some researchers argue that replication studies are especially convincing when the people who performed the replication are ‘competent’ ‘experts’.
Paul Bloom has recently remarked: “Plainly, a failure to replicate means a lot when it’s done by careful and competent experimenters, and when it’s clear that the methods are sensitive enough to find an effect if one exists. Many failures to replicate are of this sort, and these are of considerable scientific value. But I’ve read enough descriptions of failed replications to know how badly some of them are done. I’m aware as well that some attempts at replication are done by undergraduates who have never run a study before. Such replication attempts are a great way to train students to do psychological research, but when they fail to get an effect, the response of the scientific community should be: Meh.”
This mirrors the response by John Bargh after replications of the elderly priming studies yielded no significant effects: “The attitude that just anyone can have the expertise to conduct research in our area strikes me as more than a bit arrogant and condescending, as if designing and conducting these studies were mere child's play.” “Believe it or not, folks, a PhD in social psychology actually means something; the four or five years of training actually matters.”
So where is the evidence that we should ‘meh’ replications by novices that show no effect? How do we define a ‘competent’ experimenter? And can we justify the intuition that a non-significant finding by undergraduate students is ‘meh’, when we are more than willing to submit the work of those same undergraduates for publication when the outcome is statistically significant?
One way to define a competent experimenter is simply to look at who managed to observe the effect in the past. However, this won’t do. If we look at the elderly priming literature, a p-curve analysis gives no reason to assume anything more is going on than p-hacking. Thus, merely having found a significant result in the past should not be our definition of competence. It is a good definition of an ‘expert’, where the difference between an expert and a novice is the amount of experience one has researching a topic. But I see no reason to believe expertise and competence are perfectly correlated.
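The logic of p-curve can be illustrated with a deliberately simplified sketch. The published p-curve method (Simonsohn and colleagues) works with continuous pp-values; the binomial shortcut below, with a hypothetical `pcurve_binomial` helper and made-up example p-values, only checks whether significant p-values pile up in the lower half of the significance range, as they should when a true effect exists:

```python
from math import comb

def pcurve_binomial(p_values, alpha=0.05):
    """Crude p-curve check (illustration only, not the full method).

    Under a real effect, significant p-values should cluster well below
    alpha; under p-hacking of null effects they are roughly uniform
    between 0 and alpha.
    """
    sig = [p for p in p_values if p < alpha]
    k = sum(1 for p in sig if p < alpha / 2)  # p-values in the lower half
    n = len(sig)
    # One-sided binomial test: probability of observing >= k "low"
    # p-values out of n if each significant p-value were equally likely
    # to land in either half (the null / p-hacked scenario).
    p_binom = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return n, k, p_binom

# Five significant results, all with very small p-values:
# evidence the literature contains more than p-hacking.
print(pcurve_binomial([0.004, 0.01, 0.015, 0.02, 0.021]))  # → (5, 5, 0.03125)
```

A flat distribution of significant p-values, by contrast, yields a binomial p-value near .5, which is the pattern the elderly priming literature resembles.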
There are cases where competence matters, as Paul Meehl reminds us in his lecture series (video 2, 46:30 minutes). He discusses a situation where Dayton Miller provided evidence in support of the ether drift, long after Einstein’s relativity theory had explained it away. This is perhaps the opposite of a replication showing a null effect, but the competence of Miller, who had a reputation as a very reliable experimenter, is clearly taken into account by Meehl. It took until 1955 before the ‘occult result’ Miller observed was explained by a temperature confound.
Showing that you can reliably reproduce findings is an important sign of competence – if this has been done without relying on publication bias and researchers’ degrees of freedom. This could easily be done in a single well-powered pre-registered replication study, but in recent years I am not aware of any researchers who have demonstrated their competence by reproducing a contested finding in a pre-registered study. I definitely understand that researchers prefer to spend their time in other ways than defending their past research. At the same time, I’ve seen many researchers spend a lot of time writing papers criticizing replications that yield null results. Personally, I would say that if you are going to invest in defending your study, and data collection doesn’t take too much time, the most convincing demonstration of competence is a pre-registered study showing the effect.
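What ‘well-powered’ means can be made concrete with a standard sample size calculation. A minimal sketch, assuming an illustrative effect size of d = 0.5 (my choice, not a value from any specific study) and the conventional alpha and power levels; the normal approximation used here slightly understates the exact t-test answer:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sample t-test,
    using the standard normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# A "medium" effect of d = 0.5 at 80% power:
print(n_per_group(0.5))  # → 63 per group (the exact t-test answer is 64)
```

For smaller, more realistic effect sizes the required samples grow quickly (d = 0.3 needs roughly 175 per group), which is exactly why a single pre-registered study with an a priori power analysis is so much more convincing than a string of small significant studies.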
So, the idea that there are competent researchers who can reliably demonstrate the presence of effects that others fail to observe is not (so far) supported by empirical data. In the extreme case of clear incompetence, there is no need for empirical justification, as the importance of competence in observing an effect is trivially true. It might very well matter under less trivial circumstances. But those circumstances are probably not experiments that take place entirely in computer cubicles, where participants are guided through the experiment by a computer program. I can’t see how the expertise of experimenters would have a large influence on psychological effects in such situations. This is also one of the reasons (along with the 50 participants randomly assigned to four between-subject conditions) why I don’t think the ‘experimenter bias’ explanation for the elderly priming studies by Doyen and colleagues is particularly convincing (see Lakens & Evers, 2014).
In a recent pre-registered replication project re-examining the ego-depletion effect, both experts and novices performed replication studies. Although this paper is still in press, preliminary reports at conferences and on social media tell us the overall effect is not reliably different from 0. Is expertise a moderator? I have it on good authority that the answer is: No.
This last set of studies shows the importance of getting experts involved in replication efforts, since it allows us to empirically examine the idea that competence plays a big role in replication success. There are, apparently, people who will go ‘meh’ whenever non-experts perform replications. As is clear from my post, I am not convinced the correlation between expertise and competence is 1, but in light of the importance of social aspects of science, I think experts in specific research areas should get more involved in registered replication efforts of contested findings. In my book, and regardless of the outcome of such studies, performing pre-registered studies examining the robustness of your findings is a clear sign of competence.