Wiping out plagiarism

Plagiarising Hamlet
Sometimes, plagiarism is obvious. But often it isn't, especially when dealing with foreign translations. The application KOPI will make plagiarism much more difficult. Image courtesy Wikimedia.

The battle against plagiarism resembles an arms race: each time an application is built to detect plagiarism, the plagiariser becomes better at avoiding detection.

However, the plagiarism arms race can reach an end point: when it takes so much time and wit to avoid detection that one might as well write original material. For English material, there are now many good plagiarism detectors, such as Turnitin. However, many other languages still lag behind.

"Plagiarism was becoming a growing problem in Hungary at the beginning of this millennium and there were no plagiarism check services available in Hungarian," said Máté Pataki, a senior research fellow at MTA SZTAKI. Many of the existing systems also had problems with Hungarian character encoding and accented characters.
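To illustrate why accented characters can trip up text-matching software, here is a minimal Python sketch. It is not taken from KOPI or from any of the systems mentioned; the function names `normalize_text` and `encodable` are hypothetical. It shows two general facts: the Hungarian letter "ő" can be stored either as one code point or as "o" plus a combining accent, and it does not exist in the Latin-1 encoding that many Western European tools once assumed.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Return text in Unicode NFC form, case-folded, so that
    equivalent spellings compare equal. (Hypothetical helper.)"""
    return unicodedata.normalize("NFC", text).casefold()


def encodable(text: str, encoding: str) -> bool:
    """Report whether text can be stored in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# "szőlő" (grape) written two ways that look identical on screen.
composed = "sz\u0151l\u0151"        # precomposed ő (U+0151)
decomposed = "szo\u030blo\u030b"    # o + combining double acute (U+030B)

same_before = composed == decomposed
same_after = normalize_text(composed) == normalize_text(decomposed)

latin1_ok = encodable(composed, "latin-1")    # Western European
latin2_ok = encodable(composed, "iso8859-2")  # Central European
utf8_ok = encodable(composed, "utf-8")
```

A matcher that compares the raw strings would treat the two spellings as different words; normalising first removes that difference.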

MTA SZTAKI, the Computer and Automation Research Institute at the Hungarian Academy of Sciences in Budapest, is launching a new version of the anti-plagiarism software KOPI. KOPI can now check for translations of Wikipedia pages and is powered by desktop grids.

The KOPI Plagiarism Search Portal has been available to the public since 2004. Until now, however, it has only checked for plagiarism of Hungarian content.

Catching translated plagiarisms

"A year ago, we decided to go a step further and try to detect not only one language copy-and-paste plagiarism cases but also translated plagiarisms. Due to the spread of the Internet and the growing English language knowledge translated plagiarism has become an issue, not only in academic circles but also in newspapers," said Pataki.

The researchers' new algorithm for KOPI checks students' work against Wikipedia. Wikipedia represents only a small segment of the Web, but the researchers were processing the material in German, French and English as well as Hungarian, and handling that quantity of data took weeks on Pataki's previous server, they said.
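The article does not describe how KOPI's algorithm works. As a rough illustration of what checking a text against a source can involve, the Python sketch below uses a common textbook technique, word n-gram ("shingle") overlap, which only covers the same-language case; detecting translated passages needs further steps that are not shown. The function names and sample sentences are invented for this example.

```python
def shingles(text: str, n: int = 3) -> set:
    """Return the set of overlapping n-word sequences in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(suspect: str, source: str, n: int = 3) -> float:
    """Fraction of the suspect text's shingles that also occur in the source."""
    suspect_shingles = shingles(suspect, n)
    if not suspect_shingles:
        return 0.0
    shared = suspect_shingles & shingles(source, n)
    return len(shared) / len(suspect_shingles)


# Invented sample texts, for illustration only.
source = "the quick brown fox jumps over the lazy dog"
suspect = "a quick brown fox jumps over a fence"

score = containment(suspect, source)
```

A score near 1 means most of the suspect text's word sequences appear in the source; a score near 0 means little overlap. Any threshold for flagging a document would have to be chosen by whoever runs the check.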

Pataki and his colleagues are using desktop grids to process Wikipedia approximately every month, meaning the processing time is much shorter and their databases can be kept up to date.

Using desktop grids

KOPI was ported to desktop grids under the program GASuC in Hungary by research fellow Attila Marosi, who said that from a technical perspective, it suits desktop grids perfectly because the tasks can easily be divided into smaller tasks. "And, from social perspective we think that the application tries to solve an interesting problem that people would like to contribute to," Marosi said.
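Marosi's point about divisibility can be sketched in a few lines of Python. This is an assumption-laden illustration, not the GASuC or KOPI code: the names `make_work_units` and `process_unit`, the page identifiers and the unit size are all hypothetical. The idea is that a large list of pages is cut into small batches, each batch can be handled by a different volunteer computer without reference to the others, and the partial results are merged at the end.

```python
from typing import Dict, Iterator, List, Sequence


def make_work_units(page_ids: Sequence[str], unit_size: int) -> Iterator[List[str]]:
    """Split a list of page identifiers into independent batches."""
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    for start in range(0, len(page_ids), unit_size):
        yield list(page_ids[start:start + unit_size])


def process_unit(unit: List[str]) -> Dict[str, int]:
    """Stand-in for the real per-page work: here, just the length of each id."""
    return {page_id: len(page_id) for page_id in unit}


# Hypothetical page identifiers.
pages = ["Hamlet", "Budapest", "Grid", "Wikipedia", "Plagiarism"]

units = list(make_work_units(pages, unit_size=2))

# Each unit could run on a different machine; results merge by simple union.
merged: Dict[str, int] = {}
for unit in units:
    merged.update(process_unit(unit))
```

Because no batch depends on another, the batches can be processed in any order and on any number of machines, which is the property that makes such a task a good fit for volunteer computing.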

KOPI will be available for everyone to use, and the MTA SZTAKI team will be showcasing the tool around Hungary, to universities and secondary schools, according to Agnes Szeberenyi, the coordinator of GASuC.

The team is already planning the next upgrade to the application, which they hope will incorporate sources other than Wikipedia and will also be able to check whether a student has tried to escape detection by using synonyms while plagiarising.

Copyright © 2015 Science Node ™