When Scientists Over-cite Themselves, a Metric of Research Importance Breaks

A database of the 100,000 most-cited researchers has shed light on scientists gaming the system to improve their evaluation.

New Delhi: A review of the scientific literature has found that an inordinate number of scientists around the world may have inflated the importance of their research by manipulating a metric used to measure it, giving rise to a false impression about the value of their own contributions.

For example, in March 2018, Sundarapandian Vaidyanathan received an honorary mention for an award that recognises top Indian researchers in various academic fields. Then Union human resource development minister Prakash Javadekar handed the award to Vaidyanathan, a computer scientist at Veltech University in Chennai. To determine the ‘top researchers’, the award measured productivity and citation metrics.

Now, a paper that explores the technique of citation metrics – indices that help quantify the impact and reach of a researcher’s work – has thrown up an interesting tidbit about Vaidyanathan: while he is one of the most cited researchers in the world, almost 94% of his citations were by himself.

When a scientific paper builds on the work described in an older paper, it mentions the older paper as a reference. This mention is called a citation. Scientific journals have a journal impact factor (JIF) attached to them based on the average number of citations the papers they publish receive. JIF has evolved to become a widely used, and today controversial, number used to determine the putative prestige of journals. Both the JIF and citation analysis broadly owe themselves to the late American researcher Eugene Garfield.
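As a rough illustration of the averaging described above, the sketch below computes an impact-factor-style number for a hypothetical journal. It assumes the usual two-year window: citations received in a given year by items the journal published in the two preceding years, divided by the number of those items. The function name and the figures are made up for the example and do not come from any real journal.

```python
def two_year_impact_factor(citations_received: int, citable_items: int) -> float:
    """Average citations per item over the two-year window.

    citations_received: citations counted in the reference year to items
        the journal published in the two preceding years (hypothetical input).
    citable_items: number of items the journal published in those two years.
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations_received / citable_items


# Hypothetical journal: 200 items published over two years,
# cited 500 times in the following year.
example_jif = two_year_impact_factor(citations_received=500, citable_items=200)
print(example_jif)
```

Because the result is a plain average, it says nothing about how the citations are spread across individual papers.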

The study, authored among others by John Ioannidis, an outspoken proponent of reforming scientific publishing to better filter useless or misleading results, was published in the journal PLOS Biology. Ioannidis and his team compiled a database of the world’s 100,000 most-cited researchers. While they focused on technical problems and misuse of citation metrics, the analysis also shed light on the volume of self-citation that some researchers engaged in to promote themselves.

According to Nature, more than 50% of the citations of at least 250 scientists are from themselves or their co-authors. The study puts the median self-citation rate at 12.7%. It also flags “extreme self-citation” and citation farms – where clusters of authors massively and systematically cite each other’s papers – as practices that render citation metrics “spurious and meaningless”.

Ioannidis, a physician at Stanford University, told Nature that “self-citation farms are far more common than we believe”. “Those with greater than 25% self-citation are not necessarily engaging in unethical behaviour, but closer scrutiny may be needed.”

The database adds to the discussion around problems caused by excessive self-citation, as some researchers seem to be gaming the system to receive higher rankings in citation metrics. Universities in many countries, including India, gauge, reward and promote researchers according to their ranks on citation metrics lists.

Indeed, universities themselves use citation metrics to advertise their research capabilities. As Vasudevan Mukunth, the science editor of The Wire, wrote in 2015, “A good placement on these rankings invites favourable interest towards the universities, attracting researchers and students as well as increasing opportunities for raising funds.”

For example, in a survey by Nature whose results were announced in 2015, Panjab University was ranked #1 in India by the number of citations its scientists had accrued in a year. However, P. Sriram, a professor at IIT Madras, pointed out that Panjab University’s scientists were members of an experiment at CERN’s Large Hadron Collider. Every time the members of this experiment publish a paper about some of their results, they include every member of the international collaboration as a co-author.

“Looking at the example of Panjab University, the contributors to the … collaboration are unquestionably respected researchers,” Prof Sriram told The Wire at the time, “but the issue is whether their contributions should be counted against the … collaboration or Panjab University.”

In July, a publisher-advisory body called the Committee on Publication Ethics (COPE) flagged “extreme self-citation” as one of the main forms of citation manipulation. Ioannidis further said that the study “should not lead to the vilification of particular researchers for their self-citation rates, not least because these can vary between disciplines and career stages”. The data should also not be used for “verdicts such as deciding that too high self-citation equates to a bad scientist” or university.

Because there are legitimate reasons for researchers to cite their own work, or that of their co-authors, scientists believe that completely excluding self-citations from metrics is not a solution. In the same document where COPE flagged the issue, it also argued that excluding self-citations from metrics would not “permit a nuanced understanding of when self-citation makes good scholarly sense”.

Justin Flatt, a biologist at the University of Helsinki, advocated in 2017 for a separate self-citation metric that could be displayed like the h-index, an indicator of productivity. An h-index of 10 indicates that a scientist has published 10 papers that have each received at least 10 citations. Similarly, Flatt suggested, an s-index of 10 would mean that the scientist had published 10 papers with at least 10 self-citations each.
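The two definitions can be sketched in a few lines of Python. This is a minimal illustration that assumes the common reading of the h-index as the largest number h such that h papers have at least h citations each, and applies the same rule to self-citation counts for the s-index. The function name and the per-paper counts are hypothetical and are not taken from Flatt's proposal or from the PLOS Biology database.

```python
def largest_n_with_n_counts(counts: list[int]) -> int:
    """Largest n such that n papers each have a count of at least n."""
    ranked = sorted(counts, reverse=True)
    n = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            n = rank
        else:
            break
    return n


# Hypothetical researcher with 11 papers.
total_citations = [25, 18, 12, 12, 10, 10, 10, 10, 10, 10, 3]
self_citations = [2, 1, 0, 4, 3, 0, 1, 2, 0, 1, 0]

h_index = largest_n_with_n_counts(total_citations)  # all citations per paper
s_index = largest_n_with_n_counts(self_citations)   # self-citations per paper
print(h_index, s_index)
```

For these made-up counts, the h-index is 10 and the s-index is 2, which is the kind of side-by-side reading the proposal has in mind.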

Flatt wrote in Physics Today, “Pairing the h and s indices would highlight the degree of self-promotion and help dampen the incentive to excessively self-cite. We would be able for the first time to see clearly how much the different scientific fields are resorting to self-citing, thereby making excessive behaviour more identifiable, explainable and accountable.”

According to Nature, Flatt has already received a grant to collate data for the s-index. “It’s never been about criminalising self-citations,” he told the journal, adding that when academics promote themselves using the h-index, the s-index could be used to provide useful context.

Problems with citation metrics in India

Veltech University, Vaidyanathan’s institute, received the highest score for citations in Asia. However, as Vyasa Shastry, a materials engineer, wrote in The Wire in 2017, the data that helped it achieve this score was not available. “It also pays to remember – even when the underlying data is accurate – that rankings are subjective and bibliometric measures like citations have their own nuances,” Shastry added.

The University Grants Commission (UGC) and the All India Council for Technical Education use JIF to appoint and promote teachers. The UGC’s Academic Performance Indicator (API) score also accords more points to teachers who publish papers in journals with a higher JIF.

Reporting on the flaws of the JIF, The Wire noted in 2018 that it “fluctuates erratically from year to year because of the two-year window. It favours certain journals and disciplines. It doesn’t take into account any kind of field-normalisation. It doesn’t predict citations. And it suffers from a creeping inflation.”

In another instance, Shreya Ghosh, a neuroscientist, wrote in IndiaBioscience that using publication metrics to gauge scientific merit may weigh heavily on younger graduate students who “feel the pressure to publish in high-impact journals”.

These problems, and other concerns that the evaluation of applicants for positions depended too much on citation metrics, prompted the Indian National Science Academy to release a policy statement in June 2018 recommending that “research should be evaluated on the basis of what is published rather than where it is published”. Since it may be impossible for hiring, granting and awarding committees to undertake an in-depth analysis of the scientific merit of every single publication by any particular candidate, the statement suggests that a researcher should be allowed to select their five “best” papers, which may then be classified by an expert committee as being “confirmatory”, “incremental” or “path-breaking” in nature.