Tag Archives: REF

A REF conundrum

For those who have read any of my previous posts, you’ll know that I’m not a fan of the Research Excellence Framework (REF2014). I’ve blogged about the negative impact of REF before. Essentially, although it may be aiming to do something quite reasonable, the way in which it is aiming to do this, and the impact it is having on the way universities are behaving, seem very negative to me. I did, however, think of something particular that I thought I would blog about here.

Basically, each university department in the UK will submit – to be assessed by the relevant REF panel – 4 refereed journal papers for each of all, or some, of their academics and research fellows. Each paper will be scored as either 1*, 2*, 3*, or 4*. The amount of money that the university will then get will depend on the average score and the number of people submitted. It’s still not quite clear if it’s better to submit fewer people, so as to get a higher score, or to simply submit as many eligible people as possible. However, I believe that someone cannot be submitted if they don’t have 4 refereed journal papers published between January 2008 and October 2013.

Here’s where I thought there could be a possible issue. Consider the situation in which there is someone in a university department who is the primary author on 4 refereed journal papers that are probably okay. They will probably score 3*. Imagine there is a second person in the same department who is the lead author on only 3 papers, but they’re fantastic papers and will probably score 4*. This second person, I think, cannot be submitted to REF. However, if they happen to also be an author on one or more of the first person’s papers, one of these papers could be transferred to the second person, who now has 4 papers (one scoring 3* and the others potentially scoring 4*). If this paper has 10 or fewer authors, the second person’s contribution does not even need to be justified. If it has more than 10, there would need to be some narrative explaining the second person’s contribution to the paper. The first person, however, can now no longer be submitted to REF.
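To make the arithmetic of this scenario concrete, here is a minimal Python sketch. The star ratings are just the guesses from the example above, and the names `person_a`, `person_b` and `mean_score` are purely illustrative; nothing here reflects how a REF panel would actually score the papers.

```python
# Hypothetical star ratings taken from the scenario above.
person_a = [3, 3, 3, 3]   # lead author on 4 solid papers
person_b = [4, 4, 4]      # lead author on only 3 excellent papers


def mean_score(papers):
    """Average star rating of a set of papers."""
    return sum(papers) / len(papers)


# Option 1: submit the first person with their own 4 papers.
option_1 = list(person_a)

# Option 2: credit one co-authored 3* paper to the second person,
# who then has the required 4 papers. The first person is left
# with 3 papers and can no longer be submitted.
option_2 = person_b + [person_a[0]]

print(mean_score(option_1))  # 3.0
print(mean_score(option_2))  # 3.75
```

On these assumed ratings the department's average rises from 3.0 to 3.75 by swapping who is submitted, which is why the transfer would be tempting.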

In some sense, the fact that the first person can no longer be submitted to REF doesn’t matter. Individuals aren’t actually assessed. It’s simply that a subset of their papers is used to assess the research quality of a university department. However, an individual must be associated with each set of 4 papers. It’s in the department’s interest to submit the strongest set of papers. The first person is, however, someone who was not formally submitted and so this could disadvantage them (in a career sense) if people at their university don’t realise why. Also, if such a scenario were to occur, should the first person (and second, I guess) have to approve the strategy? Is it acceptable for a university department to simply decide who should take credit for a particular paper? What if the first person objected and insisted that their 4 papers (which are good but not fantastic) be submitted to REF and refused to allow the department to transfer one of their papers to someone else?

I’m not sure if the above scenario is at all likely. I do think, however, that there will be situations (where more than one person in a department is an author on a paper) in which a decision will have to be made as to who should be credited with a particular paper and that it may well go to the person who played the less significant role. Given that individuals are not actually being assessed, it is logical that the optimal set of papers be submitted. However, it is an interesting issue as to whether or not it is acceptable for a university department to decide on who gets credit for a paper. Given that someone objecting to this strategy would disadvantage their department, I suspect that most will be largely happy with this. It does, however, seem to be something that could create some difficulties.

The negative impact of REF

The more I learn about the Research Excellence Framework (REF) the less convinced I am about the merits of this whole exercise. That universities are assessed to get some idea of how best to distribute a pot of money is fine. The way in which it is done, and the “games” that appear to be played by universities and university departments, is what concerns me. For starters, something like 300 senior people are involved in actually carrying out the assessments and numerous others are involved in preparing the submissions. The cost of doing this must be substantial (plus these are meant to be our leading researchers who are spending a large fraction of their time assessing everyone else). Some might argue that the amount being distributed (billions) makes it worth spending all this money carrying out the assessment.

An alternative argument might be that if ever there was an appropriate occasion on which to use metrics, it would be when assessing a large diverse organisation like a university. The problem with metrics (like citations) is that comparing different fields (or even different areas within the same field) is difficult because there might be different citation practices in different fields and the size of the field plays a role. A typical university, however, has so many different fields that these variations should – to a certain extent – cancel and one could probably get a pretty good idea of the quality of a university by considering citation statistics and other metrics (number of spin-out companies, patents, etc.). One could also be a bit cruder in the rankings. I don’t really believe that we can rank universities perfectly. Rather than first, second, third…, it could be top 3, next 5, next 5, etc.

What concerns me more are the implications of what universities and university departments seem to be willing to do to optimise their REF scores. You can include research fellows in REF submissions and so there will be lots of carrots dangled to try to ensure that no Fellows leave before the REF census date in October 2013. Some of these research fellows may also be offered permanent positions that will start when their Fellowships end, either to keep them or to attract them away from another university. These will clearly be very good researchers, but I have an issue with a hiring practice in which holding a Fellowship plays a significant role in whether or not you will be hired. Getting a Fellowship is a bit of a lottery in the first place, and what about those whose Fellowships are just due to end? It becomes a bit of a career year lottery – if you have a number of years left on a Fellowship at the same time as a REF submission you are more likely to get a permanent academic job than if you don’t.

There are also other issues. Departments will potentially be creating a number of new posts at a very uncertain time. What if things do not work as expected? How do you pay these people once they come off their Fellowships? What about the stability of academic careers? A burst of hiring every 7 years to coincide with REF submissions doesn’t seem very sensible. I should add, however, that if anyone who actually reads this has managed to get a permanent job or a promise of a permanent job, well done to you. I should also add that my views are not really based on anything specific, just a sense that we are letting the REF dictate our behaviour in a way that may not be ideal and wouldn’t be how we’d behave if the REF wasn’t happening. You have to worry slightly about the validity of an assessment exercise that has such a potentially strong influence on the behaviour of the organisations it is trying to assess. It can’t really be regarded as independent.

REF strategy

I was at a meeting yesterday where we discussed our REF strategy. I probably shouldn’t say what it is (might be confidential), but it didn’t increase my confidence in the basic system. For those who don’t know, REF is the Research Excellence Framework and the basic idea is that all university departments will be assessed to determine the quality of their research, the wider impact of their research, and the vitality of their research environment.

In fairness, what the REF is attempting to do is not inherently bad. The precursor to REF was the RAE (Research Assessment Exercise). In RAE2001, if I remember correctly, individuals were assessed and given a score. They were not told (I think) what their scores were, but departments were then given a final score based on the scores of the individuals in that department. In RAE2008, rather than scoring individuals, outputs were scored. Each person who was included by a department would submit 4 papers (with brief descriptions), some invited talks and other forms of output. These were then ranked on a scale from 1* to 4*. The advantage of this (in my view) is that an individual could have some outputs that score 4* and some that score 1*, so many could contribute to the 4* outputs of a department. A department was then given a score that was essentially what fraction of their outputs were 4*, 3*, 2* and 1*. If a department had 25% 4* it wouldn’t be known whether this was because only 25% of the individuals produced 4* outputs or if a quarter of everyone’s outputs were 4* (or somewhere in between, as is more likely).
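The point about the 25% figure can be illustrated with a small Python sketch. Both departments below are invented, as is the helper `profile`; the only thing taken from the description above is that a department's score is the fraction of its outputs at each star level.

```python
from collections import Counter


def profile(people):
    """Fraction of all submitted outputs at each star level."""
    outputs = [score for person in people for score in person]
    counts = Counter(outputs)
    return {star: counts.get(star, 0) / len(outputs) for star in (4, 3, 2, 1)}


# Hypothetical department X: one person in four produces only 4* outputs.
dept_x = [[4, 4, 4, 4], [3, 3, 3, 3], [3, 3, 3, 3], [3, 3, 3, 3]]

# Hypothetical department Y: everyone produces exactly one 4* output.
dept_y = [[4, 3, 3, 3]] * 4

print(profile(dept_x))
print(profile(dept_y))
```

Both calls print the same profile (25% at 4*, 75% at 3*), so the published score alone cannot distinguish the two departments.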

The amount of money given to a university was then based on what fraction of the outputs were 1*, 2*, 3* and 4*. I forget the exact formula, but it was something like amount*[(fraction of 1*) + (fraction of 2*)*2 + (fraction of 3*)*5 + (fraction of 4*)*7] multiplied by the number of people submitted. The 3* and 4* outputs therefore counted much more than the 1* and 2* outputs. That money was given for 1* and 2* outputs was, recently, heavily criticised by Vince Cable who (incorrectly in my view) interpreted this as giving money for mediocre research.
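As a rough sketch of that formula in Python: the weights 1, 2, 5 and 7 are only the half-remembered values quoted above, and the example profile, the head count and the function name `funding` are all made up for illustration.

```python
# Weights as recalled above ("something like"); treat them as assumptions.
WEIGHTS = {1: 1, 2: 2, 3: 5, 4: 7}


def funding(profile, people, amount=1.0, weights=WEIGHTS):
    """amount * sum(weight * fraction at that star level) * people submitted."""
    weighted = sum(weights[star] * frac for star, frac in profile.items())
    return amount * weighted * people


# Hypothetical department: fractions of outputs at each star level.
example_profile = {1: 0.10, 2: 0.30, 3: 0.40, 4: 0.20}

print(funding(example_profile, people=20))
```

With these invented numbers the department would receive about 82 units of `amount`, of which roughly 83% comes from the 3* and 4* outputs even though they make up only 60% of what was submitted.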

As a consequence of the above view, it appears as though 1* and 2* outputs will not receive any funding from the upcoming REF. This, consequently, has implications for the strategy that departments might choose to use. The two strategies that I’m aware of are, firstly, to submit as many people as possible, which will dilute the fraction of 3* and 4* outputs but the reduction in amount per person may be more than compensated for by the fact that there are more people submitted. The second strategy is to submit fewer people so as to minimise the fraction of 1* and 2* outputs (and hence increase the fraction of 3* and 4* outputs) and hope that the reduction in the number of people submitted is compensated for by the fact that the amount per person increases sharply with increasing fraction of 3* and 4* outputs. The advantage of the latter strategy is that it is also likely to lead to a higher place in the rankings table, which is often regarded as extremely important.
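The trade-off between the two strategies can be sketched in Python. Everything numerical here is an assumption: the real weights are not known, so the sketch simply sets 1* and 2* to zero and invents values for 3* and 4*, and the two groups of researchers are hypothetical.

```python
# Assumed weights: 1* and 2* earn nothing; the 3* and 4* values are invented.
WEIGHTS = {1: 0, 2: 0, 3: 1, 4: 3}


def summarise(people, weights=WEIGHTS):
    """Return (weighted score per person, total for the department)."""
    outputs = [score for person in people for score in person]
    per_person = sum(weights[s] for s in outputs) / len(outputs)
    return per_person, per_person * len(people)


strong = [[4, 4, 3, 3]] * 6   # hypothetical stronger researchers
weaker = [[3, 2, 2, 1]] * 4   # hypothetical weaker researchers

inclusive = summarise(strong + weaker)  # strategy 1: submit everyone
selective = summarise(strong)           # strategy 2: submit only the strong group

print(inclusive)
print(selective)
```

Under these invented numbers the inclusive submission earns more in total (about 13 units against 12) even though its score per person is lower (1.3 against 2.0). With a formula that is linear in the fractions, as assumed here, adding people can never reduce the total, so the selective strategy would only pay off financially if the real formula rewarded a high fraction of 3* and 4* outputs more steeply than this.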

The problem that I have with the above is not that university departments are choosing strategies (which is largely because they don’t actually know how the assessment scores will translate into money), but that strategies are necessary at all. The REF is meant, in my opinion at least, to be an attempt to objectively assess the quality of research in UK universities. That two essentially identical departments could end up with different scores depending on their chosen strategies suggests that the process is flawed. It’s not meant to be about whether they can guess the best strategy or not. It’s meant to be producing a measure of their quality (relative to other departments in the same area). The future funding of UK universities should – ideally – be based on objective measures of quality, not on whether the strategy gamble paid off or not.

Citation metrics

It seems like there is an increasing tendency to use metrics to make decisions. Essentially people want to have some measurable quantity that not only allows them to judge the quality of something, but also allows them to justify the decision that is made. In science, the quantity that is often used is the number of citations that a person or scientific paper has. For those who are not familiar with this, it is essentially the number of times a particular piece of work is referred to in other pieces of scientific work. It is used when hiring people, when deciding if someone’s research proposal should be funded, and is likely to be used in the upcoming Research Excellence Framework (REF) that will in a few years’ time decide how much money each university should get from the Higher Education Funding councils.

Generally, citation numbers are not used in isolation and other factors are also considered. What is slightly worrying is the impression – that I am getting – that they are likely to become more and more important in the future. Why is this worrying? Although in some sense citations are a reasonable measure of quality, that measure is almost certainly relative. For example, the number of citations presumably must in some way depend on the size of the particular discipline or sub-discipline.

Let’s consider a field in which 100 papers are published every year and imagine each paper cites 10 other papers, none of which are more than 10 years old. This means that at any time there are 1000 papers that could be cited. A particular paper therefore has a 10 x 1/1000 chance of being cited in another paper. Since 100 papers are published every year, this means that on average a paper can expect to be cited about once a year. Over its 10 year lifetime it is therefore likely to be cited about 10 times. This number doesn’t actually depend on the size of the field. If I increase the number of papers published to 1000 every year but assume that only 10 other papers are cited in each paper published, each paper should still only be cited about 10 times in 10 years.
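The estimate above can be written out as a few lines of Python. This is only a sketch of the toy model in the paragraph, assuming every citable paper is equally likely to be cited; the function and parameter names are invented for illustration.

```python
def expected_citations(papers_per_year, refs_per_paper=10, window_years=10):
    """Expected citations a paper collects over its citable lifetime,
    assuming every citable paper is equally likely to be cited."""
    citable = papers_per_year * window_years   # papers young enough to be cited
    p_cited = refs_per_paper / citable         # chance of being cited by one new paper
    per_year = papers_per_year * p_cited       # expected citations per year
    return per_year * window_years


print(expected_citations(100))    # small field
print(expected_citations(1000))   # field ten times larger
```

Both calls give about 10 citations per paper: the field size cancels, and in this model the average is just the number of references per paper.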

The problem is that the above number is an average. Some papers will not get cited at all and others will get cited more often than the average. The maximum number of citations that a paper can receive clearly depends on the size of the field. We might expect good papers to have many more citations than the average, but this is certainly limited by the total number of papers published in a particular field. If we decide to make citation counts an important metric in determining the amount of money a particular field (or particular researcher maybe) gets, this suggests that the biggest fields (or the best researchers in the biggest fields) will get the most money. At first this may seem alright, especially if the initial size of each field has been determined by some other objective measure. Over time, however, the biggest fields will get bigger and the smallest will suffer as a result, especially if the amount of money available means that only those with significantly more than the average number of citations are likely to be funded.
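A minimal Python sketch of that ceiling, using the same toy numbers as the earlier example (100 or 1000 papers a year, a 10 year citable lifetime). The function name is invented, and the bound assumes the extreme case in which every paper published during the window cites the paper in question.

```python
def max_possible_citations(papers_per_year, window_years=10):
    """Upper bound: every paper published in the window cites this one."""
    return papers_per_year * window_years


small_field = max_possible_citations(100)
large_field = max_possible_citations(1000)

print(small_field, large_field)
```

The average stays at about 10 citations in both fields, but the ceiling is ten times higher in the larger one, which is where the room for a few very highly cited papers comes from.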

One could argue that the smaller fields weren’t very interesting and therefore deserved to be penalised and the biggest fields deserve to get the most money since their size indicates how interested people are in this area. I would buy this argument if the potential of a field to grow didn’t depend strongly on the size of the field. The above also doesn’t consider different citation practices. I was talking recently to a reasonably eminent Cambridge professor who was arguing that we should all cite each other since this is what happens in other research areas and therefore we should do the same to make sure we aren’t disadvantaged.

Essentially I am concerned that if citation numbers become the primary mechanism for determining research quality, we could do a lot of damage to very interesting areas that are not large enough to be competitive according to this rather simplistic metric. This isn’t to say that we shouldn’t use them at all, but we should be aware of the various selection effects. Of course, one problem will be that most researchers probably work in the largest research fields and at least half of these people have better than average citation counts. Since the people making some of the decisions may well fall into this category, it’s not really in their interest to be more objective about how research quality should be determined, since they will do perfectly well if citation counts become the primary metric for judging quality.