iSGTW Technology - Weka4WS: distributed data mining using web services

The Waikato Environment for Knowledge Analysis, or WEKA, is software developed at the University of Waikato, New Zealand. It gets its four-letter acronym from New Zealand's native weka, flightless brown birds about the size of a chicken.
Stock image from sxc.hu

Released in June 2007, Weka4WS is a new tool designed to open the way for worldwide use of data mining services.

Developed at the University of Calabria Grid Computing Lab, Weka4WS extends the open source Weka toolkit to support distributed data mining in grid environments.

The original Weka provides a large collection of machine learning algorithms, written in Java, for data pre-processing, classification, clustering, association rules and visualization, all of which can be invoked through a common graphical user interface (GUI).

In Weka, the entire data mining process takes place on a single machine, since the algorithms can be executed only locally. Weka4WS extends Weka to support remote grid execution of the data mining algorithms through web services, hence the "4WS".

In this way, distributed data mining algorithms for classification, clustering and association rules can be concurrently executed on decentralized grid nodes.
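As a rough illustration of this pattern, the sketch below submits one task per node and gathers the results. It is not the Weka4WS API: `run_on_node`, `run_concurrently` and the node names are hypothetical, and a local function with a placeholder computation stands in for the remote web service call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_node(node, algorithm, dataset):
    """Hypothetical stand-in for a mining web service on one grid node.

    A real client would send the request over the network; here the
    "mining" is a placeholder that just counts label occurrences.
    """
    counts = {}
    for label in dataset:
        counts[label] = counts.get(label, 0) + 1
    return {"node": node, "algorithm": algorithm, "result": counts}


def run_concurrently(tasks, dataset):
    """Run one task per (node, algorithm) pair at the same time.

    Results are returned in the same order as the submitted tasks.
    """
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [
            pool.submit(run_on_node, node, algorithm, dataset)
            for node, algorithm in tasks
        ]
        return [future.result() for future in futures]


# Assumed example inputs: three made-up nodes and a tiny label list.
tasks = [
    ("node-a", "classification"),
    ("node-b", "clustering"),
    ("node-c", "association rules"),
]
results = run_concurrently(tasks, ["yes", "no", "yes"])
for entry in results:
    print(entry["node"], entry["algorithm"], entry["result"])
```

The client does not wait for one node to finish before contacting the next; it only blocks when it collects the results.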

To enable remote invocation, all the data mining algorithms provided by the Weka library are exposed as a web service, which can be easily deployed on available grid nodes.

Weka4WS also extends the Weka GUI so that users can invoke the data mining algorithms exposed as web services on remote grid nodes.

The extended Knowledge Flow: Still under development, this component will allow execution of data mining workflows over multiple grid machines.
Image courtesy of Weka4WS

Grid integration

To achieve integration and interoperability with standard grid environments, Weka4WS was designed using the Web Services Resource Framework (WSRF) as its enabling technology.

In particular, Weka4WS was developed using the WSRF Java library provided by Globus Toolkit 4.

The current version of Weka4WS (1.0, released 7 June 2007) is based on the latest version of Weka (3.4.11, released 1 June 2007) and extends the Weka Explorer component. It runs on *nix platforms and requires Globus Toolkit 4 on both client and server nodes.

The development team is currently working on a new version that will include an extension of the Knowledge Flow component for grid-enabled data mining workflows, as well as support for running the client on any platform (including, for example, Microsoft Windows).

Weka4WS is partially funded by CoreGRID, an EU Network of Excellence on Peer-to-Peer and Grid technologies. Weka4WS is freely downloadable.

- Domenico Talia, University of Calabria, Italy

