When Scientific Citations Go Rogue: Exposing ‘Sneaked References’

MONews

The image of the solitary researcher – isolated from the world and the rest of the scientific community – is a classic but misleading one. Research is in fact built on continuous exchange within the scientific community: first you understand the work of others, then you share your own findings.

Reading and writing articles published in academic journals and presented at conferences is a core part of being a researcher. When researchers write academic articles, they must cite the work of their colleagues to provide context, detail the sources of inspiration, and explain differences in approach and results. Positive citations from other researchers are a key measure of the visibility of a researcher’s own work.

But what happens when this citation system is manipulated? In a recent article in the Journal of the Association for Information Science and Technology (JASIST), our team of academic sleuths, including information scientists, computer scientists, and mathematicians, exposed a sneaky way to artificially inflate citation counts through metadata manipulation: sneaked references.

Hidden manipulation

People have become more aware of scientific publications, including how they work and their potential flaws. In the past year alone, more than 10,000 scientific articles have been retracted. The problem of citation gaming, and the harm it does to the scientific community, particularly its credibility, is well documented.

Citations in scientific works follow a standardized referencing system: each reference explicitly mentions at least the title, author names, year of publication, journal or conference name, and page numbers of the cited publication. These details are stored as metadata, not directly visible in the article text itself, but attached to the article’s digital object identifier, or DOI (a unique identifier for each scientific publication).
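Reference metadata of this kind is publicly queryable. For instance, Crossref exposes deposited records through its REST API; the `reference` and `is-referenced-by-count` fields below are part of Crossref’s public schema (the `reference` list is only present when the publisher deposits it), while the helper names and the placeholder DOI are ours. A minimal sketch:

```python
import json
from urllib.request import urlopen

# Public Crossref REST API endpoint for a single work.
CROSSREF_API = "https://api.crossref.org/works/{doi}"

def extract_references(message: dict) -> list[dict]:
    """Pull the publisher-deposited reference list out of a Crossref record.

    Each deposited reference may carry a resolved DOI, unstructured
    citation text, or both; absent fields come back as None.
    """
    return [
        {"doi": ref.get("DOI"), "unstructured": ref.get("unstructured")}
        for ref in message.get("reference", [])
    ]

def fetch_message(doi: str) -> dict:
    """Fetch the public Crossref metadata record for one DOI (network call)."""
    with urlopen(CROSSREF_API.format(doi=doi)) as resp:
        return json.load(resp)["message"]

# Usage (requires network; the DOI below is a placeholder, not a real one):
#   msg = fetch_message("10.1234/example-doi")
#   msg.get("is-referenced-by-count")   # citations recorded for the work
#   extract_references(msg)             # references the work itself deposits
```

Crucially, nothing forces the deposited `reference` list to match the bibliography printed in the article, which is exactly the gap the manipulation described below exploits.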

References in scientific publications help authors justify methodological choices or present the results of past studies, emphasizing the iterative and collaborative nature of science.

But what we discovered, by accident, was that some unscrupulous actors were adding extra references when submitting articles to scientific databases – references invisible in the text but present in the article’s metadata. The result? A spike in the citation counts of certain researchers or journals, even though the authors never cited those references in their articles.

A chance discovery

The investigation began with a post by Guillaume Cabanac, a professor at the University of Toulouse, on PubPeer, a website dedicated to post-publication peer review, where scientists discuss and analyze publications. In the post, he detailed how he had noticed a discrepancy: a Hindawi journal article that he suspected was fraudulent because of its awkward wording had received far more citations than downloads, which is highly unusual.

The post caught the attention of several sleuths, including the authors of the JASIST article. We used scientific search engines to find articles citing the initial article. Google Scholar found none, but Crossref and Dimensions did. What explains the difference? Google Scholar relies primarily on the text of the article to extract the references that appear in its bibliography, while Crossref and Dimensions use metadata provided by the publisher.

A new type of fraud

To determine the extent of the manipulation, we examined three scientific journals published by the Technoscience Academy, the publisher of the articles containing the questionable citations.

Our research consisted of three phases.

  1. We listed the references explicitly present in the HTML or PDF versions of each article.

  2. We compared this list with the metadata recorded in Crossref and found extra references that had been added to the metadata but did not appear in the article.

  3. We investigated Dimensions, a bibliometric platform that uses Crossref as its metadata source, and found additional inconsistencies.
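The comparison at the heart of phases 1 and 2 boils down to a set difference between the references printed in the article and those deposited in the metadata. The function name and DOIs below are hypothetical; this is a minimal sketch of the idea, not the study’s actual pipeline:

```python
def classify_references(in_text: set[str], in_metadata: set[str]) -> dict[str, set[str]]:
    """Compare references extracted from the article body (phase 1)
    with those recorded in the Crossref metadata (phase 2).

    "sneaked": present only in the metadata, never cited in the text
    "lost":    cited in the text but missing from the metadata
    """
    return {
        "sneaked": in_metadata - in_text,
        "lost": in_text - in_metadata,
    }

# Hypothetical DOIs for illustration:
body_refs = {"10.1/a", "10.1/b"}
metadata_refs = {"10.1/a", "10.1/b", "10.9/extra1", "10.9/extra2"}

report = classify_references(body_refs, metadata_refs)
# report["sneaked"] -> {"10.9/extra1", "10.9/extra2"}  (metadata-only references)
# report["lost"]    -> set()                           (nothing dropped here)
```

Any non-empty “sneaked” set means the citation counts downstream (in Crossref, and by extension Dimensions) are inflated relative to what the article actually cites.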

At least 9% of the references recorded in journals published by the Technoscience Academy were “sneaked references.” These extra references existed only in the metadata, distorting citation counts and giving certain authors an unfair advantage. Some legitimate references were also lost, meaning they were absent from the metadata.

In addition, when we analyzed the content of the sneaked references, we found that they disproportionately benefited certain researchers. For example, a single researcher associated with the Technoscience Academy benefited from more than 3,000 sneaked references. Some journals from the same publisher benefited from hundreds of them.

We published our study as a preprint because we wanted our results to be externally validated. We informed Crossref and Dimensions of our findings and provided a link to the preprint. Dimensions acknowledged the sneaked references and confirmed that its database mirrors Crossref’s data. Crossref also confirmed the extra references and stressed that this was the first time such an issue had been reported in its database. The publisher took steps to address the issue following Crossref’s investigation.

Implications and potential solutions

Why is this discovery important? Citations have a huge impact on research funding, academic promotion, and institutional rankings. Manipulating citations could lead to unfair decisions based on false data. More worryingly, this discovery raises questions about the integrity of the scientific impact measurement system, a concern that researchers have been raising for years. The system could be manipulated to encourage unhealthy competition among researchers, tempting them to take shortcuts to publish faster or get more citations.

To combat this practice, we propose several measures:

  • Rigorous validation of metadata by publishers and by organizations such as Crossref.

  • Independent audits to ensure data integrity.

  • Improved transparency in managing references and citations.

As far as we know, this is the first study to report this type of metadata manipulation. We also discuss the implications it may have for researcher evaluation. The study again highlights that over-reliance on metrics to evaluate researchers, their work, and their impact is inherently flawed and potentially misleading.

This over-reliance has the potential to promote questionable research practices, including forming hypotheses after the results are known (HARKing); splitting a single data set into multiple papers, known as salami slicing; data manipulation; and plagiarism. It also hinders the transparency that is key to more robust and efficient research. The problematic citation metadata and sneaked references now appear to have been fixed, but, as is often the case with scientific corrections, the fix may have come too late.


Lonni Besançon is an Assistant Professor in Data Visualization at Linköping University. Guillaume Cabanac is a professor of computer science at the University of Toulouse. Thierry Viéville is a Research Director at Inria, in charge of Inria’s scientific mediation. This article is republished from The Conversation under a Creative Commons license. Read the original article.
