London

An unusual case of plagiarism has struck ArXiv, the popular physics preprint server at Cornell University in Ithaca, New York, resulting in the withdrawal of 22 papers.

In the light of the incident, Cornell physicist Paul Ginsparg, who developed the archive, says that he now intends to investigate measures to prevent such misdemeanours from recurring. In the meantime, Ginsparg is also facing the challenge of potentially libellous material being posted on the server (see next article).

The plagiarism case traces its origins to June 2002, when Yasushi Watanabe, a high-energy physicist at the Tokyo Institute of Technology, was contacted by Ramy Naboulsi, who said he was a mathematical physicist. Naboulsi asked for Watanabe's help in obtaining a research position in Japan. Impressed by Naboulsi's work, Watanabe agreed to upload some of Naboulsi's papers to ArXiv, which Naboulsi was unable to do himself as he had no academic affiliation. “I was so amazed at his productivity I began to think he was a genius,” Watanabe later wrote in an e-mail to the archive.

By April 2003, Naboulsi had 22 papers on ArXiv, but some of the server's users noticed that one of his papers copied parts of the BaBar Physics Book, an online summary of meetings about a high-energy physics experiment at the Stanford Linear Accelerator Center in California. When six more of the papers were shown to be similar to the BaBar book, Watanabe asked for all 22 preprints to be withdrawn. ArXiv labelled the papers as withdrawn but left them on the site, in accordance with its policy.

When contacted by Nature, Naboulsi said that he would like to apologize to ArXiv and to the Stanford centre, but added that his other papers, on cosmology, were not plagiarized.

ArXiv plainly states that “authors must make their own submissions”, but Ginsparg is not planning to take any action against Watanabe, who now deeply regrets the incident. “He has already suffered enough over this,” says Ginsparg.

Plagiarism is extremely rare on ArXiv: only a handful of its 250,000 submissions have been withdrawn for this reason. But the case has prompted Ginsparg to think about how to prevent a recurrence.

He says that his first step will be to take a baseline measure of similarity between the archive's documents, and then to generate a warning whenever an uploaded paper exceeds that threshold. Computer software that can do this, typically developed to help universities spot student fraud, is already available.
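
The article does not say which similarity measure ArXiv would use. As a rough illustration only, the sketch below assumes one common approach: each document is broken into overlapping runs of k words ("shingles"), and two documents are compared by the Jaccard overlap of their shingle sets. The function names, the default threshold of 0.2 and the choice of k are all hypothetical. In practice the threshold would be calibrated from the baseline similarity measured across the existing archive.

```python
import re


def shingles(text, k=5):
    """Return the set of overlapping k-word sequences found in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Too short to form a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items over all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_similar(new_text, archive, threshold=0.2, k=5):
    """Compare a new upload against every archived document.

    archive maps a document identifier to its text. Returns a list of
    (identifier, score) pairs whose score exceeds the threshold,
    most similar first. The threshold here is an assumed placeholder.
    """
    new_shingles = shingles(new_text, k)
    hits = []
    for doc_id, text in archive.items():
        score = jaccard(new_shingles, shingles(text, k))
        if score > threshold:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Calling flag_similar(upload_text, archive) with a dictionary of existing texts would return the archived documents that warrant a warning. Comparing every upload against every document in this way is slow for a quarter of a million papers, so a real service would probably need an indexing scheme on top of it.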

“The technology is there,” says Fintan Culwin, an expert in anti-plagiarism software at London's South Bank University. “The question is how much does the archive want to pay to have this service.” Ginsparg is now looking for a master's or doctoral student to work on the project.

http://www.arxiv.org