REF prediction

I’ve come to feel more strongly that although the Research Excellence Framework (REF) is trying to do something reasonably decent, it is doing it in a ridiculous and counterproductive way. Not only does it take an awful lot of effort and time, it also has a big impact on how universities and university departments behave. As I’ve mentioned before, the amount of effort expended on assessing the various university departments in order to give them a REF score seems excessive, and using metrics might be more appropriate. I don’t particularly like the use of metrics, but if ever there was an appropriate time to use them, it would be when assessing a large, diverse organisation like a university.

To put my money where my mouth is, I decided to see if I could come up with a ranking for all 42 of the Physics and Astronomy departments that were included in RAE2008 (the precursor to the REF). For REF2014, each department will submit 4 papers per submitted academic, and these papers must be published or in press between January 2008 and October 2013. I therefore went to Web of Science and found all the papers published in Physics and Astronomy for each of the 42 departments included in RAE2008. Since it is currently October 2011, I used papers published between January 2006 and October 2011, and I didn’t exclude reviews or conference papers. For each department I then determined the h-index of its publications and the number of citations per publication. I ranked the departments according to these two metrics and decided that the final ranking would be determined by the average of these two rankings. The final table is shown below. It is ordered by the average of the h-index and citations-per-publication rankings, but these individual rankings are also shown, together with the ranking that each department achieved in RAE2008.

I don’t know if the above ranking has any merit, but it took a couple of hours and seems – at first glance at least – quite reasonable. The departments that one would expect to be strong are near the top, and the ones that one might expect to be weaker are near the bottom. I’m sure a more sophisticated algorithm could be devised and other factors included, but I predict (I’ll probably regret doing this) that the final rankings that will be reported sometime in 2015 will be reasonably similar to what I’ve produced in a rather unproductive afternoon. We’ll see.
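The ranking procedure described above (compute each department’s h-index and citations per publication, rank on each metric, then average the two ranks) can be sketched as follows. The department names and citation counts here are invented for illustration, not real Web of Science data.

```python
# Sketch of the ranking procedure: h-index and citations-per-publication
# ranks, averaged. All citation counts below are invented.

def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

def rank(values):
    """Rank 1 = best (largest value); ties share the better rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

departments = {
    "Dept A": [50, 30, 20, 10, 5, 1],
    "Dept B": [100, 40, 3, 2, 1],
    "Dept C": [8, 6, 4, 2],
}

h_vals = [h_index(c) for c in departments.values()]
cpp_vals = [sum(c) / len(c) for c in departments.values()]  # citations/publication

h_rank = rank(h_vals)
cpp_rank = rank(cpp_vals)
avg_rank = [(a + b) / 2 for a, b in zip(h_rank, cpp_rank)]

for name, hr, cr, ar in sorted(zip(departments, h_rank, cpp_rank, avg_rank),
                               key=lambda t: t[-1]):
    print(f"{name}: h-rank={hr}, cpp-rank={cr}, average rank={ar}")
```

Note that the two metrics can disagree (a small department with one highly cited paper does well on citations per publication but poorly on h-index), which is exactly why averaging the two ranks is used here.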

Addendum – added 21/03/2013
Deevy Bishop, who writes a blog called BishopBlog, has carried out a similar exercise for Psychology. In her post she compares the h-index rank with the RAE2008 position and also works out the correlation, so I thought I would do the same for my analysis. My analysis differs slightly in that Deevy Bishop considered the h-index rank for the time period associated with RAE2008, while I’ve considered the h-index rank for a time period similar to that for REF2014, but the comparison should still be instructive. If I plot the RAE2008 rank against the h-index rank, I get the figure below. The correlation is 0.66, smaller than the 0.84 that Deevy Bishop got for Psychology, but not insignificant. There are some clear outliers and the scatter is quite large. Also, this was a very quick analysis, and something more sophisticated, but still simpler than what is happening for REF2014, could certainly be developed.

h-index rank from this work plotted against RAE2008 rank for all Physics departments included in RAE2008.
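The correlation of 0.66 quoted above came from the full table of 42 departments. As a sketch of the calculation, the rank correlation is just a Pearson correlation computed on the two rank vectors; the ranks below are invented for illustration.

```python
# Pearson correlation between two rankings (equivalent to Spearman's rho
# when the inputs are ranks). The rank vectors below are invented examples.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

rae_rank = [1, 2, 3, 4, 5, 6]   # hypothetical RAE2008 ranks
h_rank = [2, 1, 4, 3, 6, 5]     # hypothetical h-index ranks

print(round(pearson(rae_rank, h_rank), 3))
```

With real data one would simply feed in the 42 pairs of ranks from the table.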

Additional addendum
Deevy Bishop, through a comment on my most recent post, has described a sensible method for weighting the RAE2008 results to take into account the number of staff submitted. The weighting (which essentially ranks institutions by how much funding each received) is N(0.1×2* + 0.3×3* + 0.7×4*), where N is the number of staff submitted and 2*, 3*, 4* are the percentages of the submitted papers at each rating. If I compare the h-index rank from above with this new weighted rank, I get the figure below, which (as Deevy Bishop found for Psychology) shows a much stronger correlation than my figure above. Deevy Bishop checked the correlation for Physics and found a value of 0.8 using the basic data, and a value of 0.92 if one included whether or not an institution had a staff member on the panel. I did a quick correlation and found a value of 0.92 without taking panel membership into account. Either way, the correlation is remarkably strong and seems to suggest that one could use h-indices to get quite a good estimate of how to distribute the REF2014 funding.

Plot showing the h-index rank (x-axis) and a weighted RAE2008 ranking (y-axis) for all UK Physics institutions included in RAE2008.
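The weighting described above is straightforward to compute. A minimal sketch, with invented submission profiles (the star values are taken as fractions rather than percentages):

```python
# RAE2008 funding-proportional weighting: N * (0.1*2star + 0.3*3star + 0.7*4star),
# where the star values are the fractions of the submission at each rating.
# Both profiles below are invented for illustration.

def rae2008_weight(n_staff, two_star, three_star, four_star):
    """Score proportional to the funding an institution would receive."""
    return n_staff * (0.1 * two_star + 0.3 * three_star + 0.7 * four_star)

strong = rae2008_weight(30, 0.10, 0.40, 0.50)  # mostly 3*/4* outputs
weak = rae2008_weight(30, 0.40, 0.40, 0.10)    # mostly 2*/3* outputs
print(strong, weak)
```

Note how the 0.7 weight on 4* output dominates: the two invented departments are the same size, but the stronger profile scores roughly twice as high.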

Another addendum
I realised that in the figure above I had plotted the RAE2008 funding-level rank against the h-index rank, rather than simply the RAE2008 funding level against the h-index. I’ve redone the plot and the new one is below. It still correlates well (a correlation of 0.9 according to my calculation). I’ve also done a plot showing the h-index (for the RAE2008 period, admittedly) against what might be the REF2014 formula, which is thought to be N(0.1×3* + 0.9×4*). It still correlates well but, compared to the RAE2008 plot, it seems to shift the bottom points to the right a little. This is presumably because the funding formula now depends strongly on the fraction of 4* papers, and so the supposedly weaker institutions suffer a little compared to the more highly ranked institutions. Having said that, the plot using the possible REF2014 funding formula does seem very similar to the RAE2008 figure, so I hope I haven’t made some kind of silly mistake. I don’t think so. Presumably it just means that, for RAE2008, (0.1×3* + 0.9×4*) is similar to (0.1×2* + 0.3×3* + 0.7×4*).

A plot of h-index against the RAE2008 funding formula – N(0.1×2* + 0.3×3* + 0.7×4*).

A plot showing h-index (for the RAE2008 period) plotted against a possible REF2014 formula – N(0.1×3* + 0.9×4*).
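The shift of the bottom points described above can be seen directly by applying both formulas to invented profiles; the numbers below are illustrative, not real submission data.

```python
# The two funding formulas side by side, applied to invented profiles,
# to show how the heavier 4* weighting in the possible REF2014 formula
# reduces the relative share of a department with few 4* outputs.

def rae2008_score(n, s2, s3, s4):
    return n * (0.1 * s2 + 0.3 * s3 + 0.7 * s4)

def ref2014_score(n, s3, s4):
    return n * (0.1 * s3 + 0.9 * s4)

# Invented profiles: (staff, 2* fraction, 3* fraction, 4* fraction)
strong = (20, 0.10, 0.40, 0.50)
weak = (20, 0.40, 0.40, 0.10)

for n, s2, s3, s4 in (strong, weak):
    print(rae2008_score(n, s2, s3, s4), ref2014_score(n, s3, s4))
```

For these invented profiles the weak department receives just under half of the strong department’s score under the RAE2008 formula, but only about a quarter under the possible REF2014 one, which is the rightward shift of the lower points seen in the plot.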


10 thoughts on “REF prediction”

  1. Pingback: Some more REF thoughts | To the left of centre

  2. Pingback: REF again! | To the left of centre

    • Thanks for the comment. I was thinking of doing exactly that. I thought that might be a more interesting way to predict the possible REF2014 rankings. May try and do that tomorrow if I get a chance.

    • I’ve actually just realised that I plotted the h-index ranking against the weighted RAE 2008 ranking, rather than plotting h-index against weighted RAE2008 funding level. I’ll have to have a look tomorrow and see what difference it makes. That also probably explains why my correlation was 0.92 while Deevy Bishop got 0.8.

  3. Pingback: A new REF algorithm | To the left of centre

  4. Disclaimer: I am a member of the HEFCE Steering Group that is conducting the metrics review, but I comment here as an individual — I do not speak for the committee.

    This is a very interesting and closely argued submission that makes many valid points. But there is one thing about it that has been nagging at me and I wanted to raise a question that I hope might help amplify the discussion.

    You give many particular instances of where citation counts fail to capture research quality (or impact — the two are not the same even if they may have some interrelations). For example, Huntington’s book, which has over 20,000 citations but appears, for a number of reasons, to be of dubious value. That seems fair enough, and it is quite right to point out other potential pitfalls in the use of citations. It is easy to see how, in the case of individual authors or individual works, citation counting can be very problematic as a proxy for quality.

    But what seems to me to be missing here (and it is perhaps ironic!) is any attempt to quantify the magnitude of the problem. This strikes me as important because the REF is not assessing individuals but departments, groups of people. When considering populations, some of the noise of measurement is averaged out by looking at the whole. For example, of all the monographs in International Relations that have received 20,000+ citations, what proportion are considered by the community of scholars to be of high quality? Do they all suffer the problems of Huntington’s tome, or is there a percentage that would be thought valuable? And, on average, are publications with higher citation counts considered to be better than those that attract few citations?

    If one is looking in aggregate at the output from a given department, then is there any merit in totalling citations and comparing them with the count from a department of the same discipline (and size)? Could it be that many of the known issues with citations for individual papers are washed out by treating the numbers as representative of the likely distribution of the quality of output from a relatively large group of scholars?

    I am bound to say that I don’t know the answers to these questions though there is an interesting analysis of departmental H-indices by psychologist Prof Dorothy Bishop that I think is worth looking at. (see also

    Finally, I don’t suppose for a moment that raising this question addresses all the concerns surrounding citation metrics (such as the risk of gaming, the prejudicial effects on women or minorities). But I still think it’s a question worth discussing.
