It’s that time of year again: on Sunday 21st December, the winner of the annual scuffle to be Christmas Number One will be announced. The inclusion of streaming in this year’s charts is likely to shake up the system, since older tracks now get more of a shot: Mariah Carey’s “All I Want For Christmas Is You” may make an improbable return to the top, alongside more recent releases by Band Aid 30 and whoever the X Factor winner ends up being. And then of course there’s Iron Maiden; and the whole thing will be accompanied by profound reflection on the pointlessness of it all, and on how it’s not nearly as important as it was ten, or twenty, or thirty years ago.
But three days earlier, on 18th December, academics like me will be hunkering down in their ivory towers and turning up the wireless to hear a different Top 40: the results of the Research Excellence Framework (REF) 2014. REF is the scheme that the main UK higher education funding body, HEFCE, uses to determine where its research funding goes. The idea, of course, is to allocate more money to those institutions that are doing the best research. That raises an old, and difficult, question: how do we measure research quality?
Ask almost any researcher what the gold standard for research assessment is and they’ll tell you that it’s anonymous peer review. This is the system used to publish in academic journals: papers are sent to an editor, who asks a number of experts in the field (usually two or three) to consider them and provide feedback. In the ideal case, neither the authors nor the reviewers know who the other parties are. Based on the reviewers’ reports, the editor then decides whether or not to publish the paper, and if so whether it needs changes before publication. In the UK, unlike almost all other countries in the world, we’re lucky enough to have a system for research assessment with peer review at its very heart: the quality of research outputs as assessed by a specialist panel accounts for 65% of a REF submission.
It may strike you as surprising, then, that there’s been a backlash against the REF from academics. Part of the problem is that academia selects for individualistic, often egotistical personalities, and such free-thinkers tend not to like the idea of their research being assessed at all. Then there are issues to do with where the other 35% of the REF result comes from: 20% is for the “impact” of research beyond academia, a poorly understood criterion whose applicability in more abstract areas of research such as mathematics and philosophy is dubious and whose dangers have been widely discussed. But the issues go deeper than that. In particular, there have been two recent suggestions that peer review should be dropped from UK research assessment. The core arguments are that REF peer review is expensive and falls short of internationally accepted standards.
Lancaster cultural historian Derek Sayer has written about the practical problems with REF peer review. In short, a tiny number of evaluators assess a huge number of submissions, which often only fall into their area of expertise in a very general sense. Oxford neuropsychologist Dorothy Bishop, meanwhile, has pointed out that a specific metric, the h-index, actually serves as a fairly robust predictor of the results of REF peer review, and this has been followed up on a larger scale in a paper by Mryglod et al. (2014).
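For readers unfamiliar with the metric, a researcher’s h-index is the largest number h such that they have h papers each cited at least h times. The arithmetic is simple enough to sketch in a few lines of Python (the function name and example citation counts here are illustrative, not drawn from any real dataset):

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still clears the bar
        else:
            break
    return h

# Five papers cited [10, 8, 5, 4, 3] times: four papers have
# at least four citations each, but not five with five.
print(h_index([10, 8, 5, 4, 3]))  # → 4
```

The simplicity is precisely the appeal for assessment purposes, and precisely the worry: the number says nothing about why any of those citations were made.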
What are the alternatives to peer review, then? There’s little doubt that a system that employs some automatically calculable measure would be cheaper to run (though how much cheaper is not clear). Such measures include journal impact factors as well as citation counts, including the h-index itself, and all are problematic. Impact factors assess individual articles based on the prestige of the journal in which they are published, which is obviously a non-starter: good journals can accept terrible research (especially if the research is “sexy”), and conversely great research may be published in obscure journals, or in books or conference proceedings volumes. A system similar to this is in use in Denmark, based on an “authority list” of good journals and publishers; colleagues familiar with this sort of system have bewailed the pernicious effect it has on publication practices, causing researchers to try to cram all their work into a select few journals and reinforcing deeply entrenched notions of prestige that are orthogonal to research quality.
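To see how crude the impact factor is: the standard two-year version divides the citations a journal receives in a given year to its articles from the previous two years by the number of citable items it published in those two years. A minimal sketch (the function name and figures are purely illustrative):

```python
def impact_factor(citations_this_year, items_prev_two_years):
    """Two-year impact factor: citations in year Y to a journal's
    items from years Y-1 and Y-2, divided by the count of those items."""
    if items_prev_two_years == 0:
        raise ValueError("journal published no citable items")
    return citations_this_year / items_prev_two_years

# A journal whose last two years' 150 articles drew 450 citations
# this year has an impact factor of 3.0.
print(impact_factor(450, 150))  # → 3.0
```

Note that it is a property of the journal, not of any individual paper published in it, which is exactly why applying it to single articles is a non-starter.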
Citation counts also have their problems. Papers can be cited for many reasons other than that they contain top quality research: often researchers will cite a study because they disagree with it or have found counterevidence, not because it is good. Furthermore, papers by more established researchers with better self-publicity strategies will be more widely read, and hence more widely cited. Swan (2010) has also shown that open access publications are cited more often than their paywalled counterparts; open access is a good thing, but of course it’s not the case that open access papers are better than those behind paywalls! Finally, the h-index has been shown to discriminate against junior academics.
These specific problems may well be fixable, as Sayer argues. However, the underlying problem is that, for any such metric, academics can and will find a way to game the system. By contrast, peer review is at least difficult to game, if not impossible: if you don’t know who the reviewers are going to be, it’s very tough to cover your bases enough to please everyone without going to the trouble of actually writing a decent paper. Sayer’s argument is based on an idealistic, as-yet-unrealized view of both metrics and researchers, and on comparing this view to a worst case scenario for REF peer review.
It’s thus not surprising that there was a tremendous backlash from academics against the proposal to introduce a bibliometric element into UK research assessment, both under Labour in 2007 and under the ConDem coalition in 2014. In having a research assessment framework with peer review as its cornerstone, despite its problems, the UK is in a world-leading position; and, whatever happens on 18th December (or on 21st December!), we shouldn’t throw it away.