Many of us in the social sciences and humanities were a little apprehensive about the Excellence in Research for Australia (ERA) exercise. Some of that apprehension came from the very idea of assessing research quality, which was a relatively novel idea to us, but we accepted that we live in an age where accountability for public expenditure is a recurring and necessary theme.
Part of our apprehension also stemmed from the shift from the previous Research Quality Framework (RQF) system to ERA. We had got with the program on RQF, and worked hard to do as well as possible. I took part in a national consultation process to make sure the process was able to capture the nuances of various disciplines, and took part in numerous internal trials. We used search engines like Google and Google Scholar to make up for the deficiencies of the Social Sciences Citation Index which, of course, does not cover citations in books - the highest quality research output to which we (perhaps economists excepted) all aspire. Natural scientists compare strange things like h-numbers that are largely mysterious to us.
Both systems eschewed any attempt to assess the quality of research at the individual level - something quite curious when it is considered we spend much of our professional lives assessing the individual efforts of our students.
Search engines also helped us build evidence for portfolios demonstrating that elusive concept of impact: the use of one's research by policy-makers and in curricula around the world. It was an affirming experience to find references in policy documents and reading lists at universities around the world, where we had no idea our work was being used.
The natural scientists, we heard, didn't like RQF and disciplines like astronomy particularly didn't like the impact element. This we understood. Their research makes few direct contributions to the lives of those taxed to support the enterprise, but we all support such pure science as underpinning much more applied and useful knowledge. But astronomers for years have had to argue their case with policy-makers and the public. I can recall many years ago an astronomer, doing his best to maximise favourable public perception with a television interview on a new discovery, being asked by the cub reporter, 'But what are the benefits of this for the public?'. The best he could do was to stammer that 'this was research at the frontiers of science.' Quite true, but it seemed a bit lame.
So we had invested a lot of effort in RQF. Then the government changed, and ERA appeared, with many changes that seemed to reflect the concerns of the natural scientists, and especially that putative concern of astronomers over demonstrating impact. The astronomers certainly seemed to be in the ascendancy, with one of their number, Penny Sackett, appointed as the new Chief Scientist.
ERA shifted the goal posts. Especially important for us was the removal of impact, which we had initially feared but had come to quite like, and (significantly) of the disciplinary rankings of book publishers. We were particularly concerned with the latter, because (as stated earlier) the publication of the research monograph is the summit for most social science disciplines, the research output by which we judge ourselves and our peers. We heard that the Australian Research Council (ARC), the custodian of ERA, considered that ranking publishers might have left it open to legal action. This seemed surprising. The American Political Science Association, for example, ranks publishers (Cambridge University Press is number one) without being subjected to litigation from those it considers to be third rate.
So, given these apprehensions, we awaited the announcement of the results with some trepidation to see how we would measure up. My own School did well enough and not too far away from expectations. In both 'Political Science' and 'Policy and Administration' we were scored at 3 – 'at world standard'. We had expected a 4 for 'Policy and Administration', in which we have particular strength, but we understood this was a first, inevitably inexact attempt to apply the system. Our 3 for Political Science was about par for my expectations. I (correctly) thought there would be only one 5 and that would be ANU. It has a large number of high quality political scientists, many of whom have been full-time researchers. It received much more generous research funding than most institutions, so it would be surprising if it did not do well.
Strangely, in 'Policy and Administration' there was only one 4 and no 5s. The 4 was UNSW - initially surprising to those of us who are general policy and administration scholars, but understandable given that their very good social policy research centre would lie in this Field of Research (FOR) classification.
My expectations were quite realistic, rather than expressions of hope, because we knew with some accuracy where others placed us. In a global ranking of political science departments conducted earlier by a scholar at the London School of Economics (on the basis of journal publication), the University of Tasmania was ranked 167, quite a good result globally, and about 8 nationally (Hix, 2004). As a mid-sized department, we did a little better when the scores were adjusted on an FTE basis, moving to 147 globally. Size matters: Western Australia and Murdoch, with small, productive departments did well in absolute terms, but moved up substantially in rankings adjusted for size. A more recent evaluation within Australia (Sharman and Weller, 2009) gave a similar picture in terms of relativities, using journals and ARC grants, though no size adjustment was attempted. There were some differences, but one could think of reasons for those changes in terms of personnel changes and the like.
After looking to see how one was ranked, thoughts turned naturally to other disciplines, first within one's own institution and then more broadly. The areas ranked at 5 at the University of Tasmania contained no real surprises, but the geologists working in ore body research (an ARC centre of excellence with a genuine world reputation) were, for me, a surprise omission. I suspect they were disadvantaged by not having a sufficiently narrow FOR code, whereas the similarly excellent separation scientists did as well as expected, thanks to a neat fit with 'Analytical Chemistry.' One limitation with ERA would seem to be comparisons between disciplines with multiple FOR codes and those whole disciplines (like political science) that have only one.
Political Science as a single FOR category must rank the American Political Science Review as the top journal, yet this is not an international journal, with more than 90 percent of the authors it publishes coming from US institutions. The sheer number of political scientists in the US, and the maintenance of institutional subscriptions because it is number one, keep it at the top, despite the fact that few of us read it. In accepting the Mattei Dogan Prize, the closest thing to a Nobel in political science, at the World Congress in 2009, Philippe Schmitter announced proudly that he had never once published in the APSR - to thunderous applause. The FOR categories determine much of the ERA result, because they determine which journals can be considered excellent: a single FOR code covers political theory, international relations, comparative politics, political sociology, voting behaviour, and so on.
Tasmania's astronomers, I knew, were well regarded, and with a 4 ('above world standard'), did quite well, as I expected. Or did they?
I was surprised to learn that a rating of 4 in fact placed them below the national average. Astronomy achieved a quite remarkable national average score of 4.2. Of thirteen institutions rated, six were rated at 5 ('well above world standard') and five at 4. Whereas political science, like most other social sciences, seemed to have the kind of bell curve one would expect, with a distribution about a mean that (surprisingly to me, a chair of an IPSA Research Committee) was below world standard, astronomy seemed to be, with only two exceptions, better than world standard. How could this be? The answer throws into question the whole ERA exercise as a means of comparing disciplines, because the result for astronomy seems to be largely an artefact of the methodology employed.
Part of the problem seems to be the peculiarly Australian penchant for emphasising research income (an input measure) for assessing research quality (an output measure). There is a strong case for using multiple indicators in assessing research performance. To assess the quality of research papers according to the quality of the journals in which they appear would appear to come dangerously close to committing the ecological fallacy, so other measures are valuable, but there seems to be an absence in the international literature of reports using research income as an indicator of either research 'performance' or 'quality' (Martin, 1996). There seems to be a singular fixation in Australia with this measure, which commits the fallacy, well known to students of policy analysis, of confusing an input measure with outputs. Research income might well be important in providing research infrastructure funding to support research, but if we are interested in performance in terms of either effectiveness or efficiency we must focus on outputs and their relationship to inputs. We certainly would not regard a car as excellent simply because it cost a lot to buy and to operate.
Indeed, using research income introduces an acknowledged bias that is clearly demonstrated in astronomy. Large telescopes are expensive instruments, and research quality (unsurprisingly) is highly correlated with telescope size. Martin (1996: 351) emphasizes that size-adjusted indicators are vital if smaller research units are to be compared fairly with larger ones, yet most scientometric studies rely solely on size-dependent indicators such as publication or citation totals, unadjusted even for number of staff or income. The basic point that large budgets allow more researchers to be employed, and more researchers tend to produce more papers seems to have been lost sight of. The ERA rewards large budgets and fails to adjust for size. Its results therefore inevitably conflate size and quality to some extent.
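The point about size can be made concrete with a minimal sketch. The three departments and their figures below are entirely hypothetical, invented only to show how a ranking on raw publication totals can reverse once output is divided by staff numbers; this is not the ERA's method, nor Martin's.

```python
# Hypothetical departments: (name, total publications, full-time-equivalent staff).
# These figures are invented for illustration and come from no real assessment.
departments = [
    ("Large", 120, 40),
    ("Medium", 60, 15),
    ("Small", 30, 6),
]


def rank_by(departments, key):
    """Return department names ordered from highest to lowest on the given key."""
    return [name for name, *_ in sorted(departments, key=key, reverse=True)]


# Size-dependent indicator: raw publication totals.
absolute = rank_by(departments, key=lambda d: d[1])

# Size-adjusted indicator: publications per FTE staff member.
per_fte = rank_by(departments, key=lambda d: d[1] / d[2])

print("By total output:", absolute)
print("By output per FTE:", per_fte)
```

In this toy case the largest department leads on totals but comes last per head, which is the kind of reversal a size-dependent indicator cannot reveal.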
While multiple indicators are recommended, those most often suggested are things like numbers of publications or citations, peer evaluation or estimates of the quality, importance or impact of publications (perhaps assessed by peer review). Nobody seems to suggest counting inputs.
There are also acknowledged biases in any attempt to assess quality in astronomical research. One is a language bias: a requirement to publish in English advantages anglophones, who tend not to read or cite papers in other languages, and citation databases provide uneven coverage of foreign-language journals (Sánchez and Benn, 2004: 445). Another bias stems from the tendency of each community to over-cite its own results, and papers from large countries receive more citations than those from small countries. These biases are thought to favour citation of papers from the large North American and UK astronomy communities, but some also favour Australian researchers.
One possible reason for astronomy doing so well in a research quality assessment is that astronomers have long experience with the task, with published research on measuring performance going back more than 25 years (Martin and Irvine, 1983). Certainly, astronomers in other countries seem to be particularly adept at demonstrating their claims to research pre-eminence. While one might gain the impression from the ERA that Australian astronomy must lead the world, this claim would be disputed by the Canadians, who consider that they themselves occupy that position.
One of their number, Dennis Crabtree (2009: 1), recently claimed that:
Their [Science Citation Index] August, 2005 report on Science in Canada, which covered papers published in a ten-year plus ten month period, January 1994 - October 31, 2004, showed that Canada ranked #1 in the world in average citations per paper in the "Space Science" field. An examination of the journals included in the Space Science field shows that the field is dominated by astronomy.
Perhaps Australia overtook Canada by the time the ERA took place? No - Crabtree (2009: 2) thinks not:
Canadian astronomy's excellence on the world stage continues. ScienceWatch's report on Science in Canada from May 31, 2009 indicates that of all science fields, astronomy had the highest impact relative to the world. Canadian astronomy papers published between 2004 and 2008 were cited 44% above the world average. For comparison, astronomy papers from the UK and France, for a similar period, were cited 41% and 21% above the world average.
This suggests that astronomy in Canada, the UK and France is well above the world average - and nobody has even mentioned the US. As it happens - and as one might expect - the US seems to lead the world in astronomy.
The impact of astronomical research carried out by different countries has been compared by analysing the 1000 most-cited astronomy papers published 1991-8 (Sánchez and Benn, 2004: 446). The USA dominates, receiving 61% of the citations to the top 1000 papers, considerably higher than its all-science share of citations, 31%. The UK comes second (11% of citations), followed by Germany (5%), Canada (4%), Italy (3%), France (3%), Japan (2%), the Netherlands (2%), Switzerland (2%) and Australia (2%). Leaving aside for the moment the fact that Australian astronomy, so outstanding that it achieved a national average of 4.2 in the ERA exercise, ranked only a remote seventh equal in this assessment, and focusing on the deflated Canadian claims, how can Canada claim pre-eminence when it comes only a distant fourth? In answering this question we learn much about how Australian astronomy might have come to be ranked so highly.
The explanation for Canada lies in the methodology employed. In the global analysis, where Canada ranked fourth, papers were assigned to the country of the first-named author. This would seem to be a reasonable approach, given that the natural sciences place great emphasis on the order of authorship. (In the social sciences and humanities, no emphasis is placed on order, with alphabetical order commonly the norm, but order is important in the natural sciences, and often keenly fought over.) In the Canadian analysis, however, a paper was 'considered to be Canadian if there was at least one author based at a Canadian institution on the paper. The countries of all authors on a paper were given full "credit" for a paper' (Crabtree, 2009: 1). Canadian astronomers, it turns out, are junior authors on a lot of good papers, but are lead authors on far fewer. The reason for this is that Canadian astronomers are involved in a large number of international collaborations. As Crabtree (2009: 1-2) put it:
This result can also be interpreted as a result of Canadians, on average, being members of strong international collaborations. For example, there is a Canadian on the WMAP [Wilkinson Microwave Anisotropy Probe] team so all of the (very highly cited) WMAP papers get credited to Canada along with the countries of each of the other authors.
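The difference between the two counting rules can be illustrated with a minimal sketch. The five papers, their bylines and their citation counts are hypothetical, invented only to show the mechanism; they are not drawn from Sánchez and Benn or from Crabtree, and the function names are my own.

```python
from collections import Counter

# Hypothetical toy data: author countries in byline order, plus citations.
# None of these figures come from the studies cited in the text.
papers = [
    {"countries": ["US", "CA", "AU"], "citations": 900},
    {"countries": ["US", "UK"], "citations": 400},
    {"countries": ["UK", "AU", "CA"], "citations": 300},
    {"countries": ["CA"], "citations": 50},
    {"countries": ["AU"], "citations": 40},
]


def first_author_credit(papers):
    """Credit each paper's citations only to the first-named author's country."""
    totals = Counter()
    for paper in papers:
        totals[paper["countries"][0]] += paper["citations"]
    return totals


def full_credit(papers):
    """Credit each paper's citations in full to every country on the byline."""
    totals = Counter()
    for paper in papers:
        # set() ensures a country is credited once per paper,
        # even if it supplies several authors.
        for country in set(paper["countries"]):
            totals[country] += paper["citations"]
    return totals


lead = first_author_credit(papers)
shared = full_credit(papers)
for country in ("US", "UK", "CA", "AU"):
    print(country, "first-author:", lead[country], "full credit:", shared[country])
```

In this invented example the two countries that mostly supply junior authors look negligible under first-author counting and nearly level with the leader under full credit, although nothing about the underlying research has changed.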
An unusually high degree of international collaboration is a widely known feature of astronomical research, and it probably assisted in producing a high ERA ranking. The reasons for it are simply astronomical. As Elizabeth Capaldi put it recently, 'Astronomy requires the telescope be placed on the right place on the globe, so astronomers cannot always work in their own country and must collaborate with the country where the telescope is placed' (Capaldi, 2010: 71). This, of course, is the reason why the transmissions from the Apollo moon landings were relayed through Australia - and featured in the movie The Dish.
Australia has a particularly fortunate location in this regard: northern hemisphere researchers wishing to research the southern sky must find a southern hemisphere collaborator, so South Africa, Chile and Australia, with clear skies and political stability provide considerable opportunities. Since most of the Milky Way is observable only from the Southern Hemisphere, Australian astronomers end up as at least junior authors on most of the leading research, but they are lead authors on few of the leading papers.
This advantage is being demonstrated again with SKA. Most social science and humanities scholars probably think Ska is a music genre originating in Jamaica in the late 1950s. But SKA stands for 'Square Kilometre Array', a giant radio telescope that will be built in either South Africa or Australia, where the view of the Milky Way is best and radio interference is least. Construction, costing €1.5 billion of EU funds, will begin in 2016, with first observations in 2019 and full operation by 2024. SKA needed to be located in unpopulated areas where there was a guarantee of very low levels of anthropogenic radio interference. Sites in Argentina and China were soon eliminated.
An Australian SKA Pathfinder (ASKAP) costing $100m (consisting of 36 12m dishes) is already under construction at Boolardy in Western Australia (completion due in 2013), which would be the core of SKA. So $100m is being spent with the prospect of attracting $2 billion (at current exchange rates) – all on the basis that Australia has natural advantages.
Those natural advantages, combined with the ERA methodology, got astronomy to a national average ERA score of 4.2. Call me skeptical, but I remain unconvinced that this is an accurate assessment of research in a discipline that ranks seventh equal with a mere 2 percent of lead authorships in an assessment of the discipline itself. I can understand, when we are bidding for SKA, why we as a nation would wish to talk up the quality of our research performance in astronomy. But I remain unpersuaded (as a member of the IPSA Commission on Research with some knowledge of international relativities) that ERA has captured accurately relative research quality in political science and astronomy.
Not only do the results fail to provide any reliable basis for allocating funds between competing disciplines; the heavily skewed and clustered results in astronomy alone can hardly serve as a reliable indicator for allocating funding within that discipline.
The ARC has the opportunity to improve the reliability of the exercise, at least by attempting some standardization of scores that – in the first attempt – are clearly egregious. As a political scientist, I would prefer to see the pernicious effects of the FOR codes addressed, book publisher quality and citations addressed, impact assessment reinstated, and the actual quality of individual outputs interrogated more closely. The New Zealand system - submitting the four best outputs for peer assessment - seems more likely to produce reliable results. As a member of a smaller university, I would rather see the system adjust for size and numbers and not place any great emphasis on input measures like budgets.
Minister Carr (Australian, 5 February 2011) stated that "The ERA national report reveals for the first time exactly how our country's research efforts compare to the rest of the world." Sorry, Minister. I cannot accept your faith in the exactitude of the exercise. The next round is already under way - perhaps too soon to allow policy learning - and such improvements are not likely to be adopted. But let's at least try to rid the system of the faults made only too apparent in the case of astronomy.
A complete version of this paper, with references, can be downloaded by clicking here.