As any layperson who has ever looked at an X-ray knows, it can be very difficult to tell what you are looking at, let alone to distinguish healthy tissue from tissue that is diseased, damaged or otherwise abnormal. (See image at right.)
Pity the poor medical expert, then, who must interpret not only two-dimensional images but also three-dimensional data, as is the case with magnetic resonance imaging or computed tomography, to name just two imaging modes. These multidimensional data carry additional information, but are also much more difficult to process and interpret.
But with advanced algorithms such as clustering, the really useful information can be more easily extracted from these imaging modes. Clustering methods automatically detect and group data that are similar; what's more, cluster analysis can "learn" how to differentiate pathological from healthy tissues and then retrieve similar pathologies in new cases.
However, the trick is to select the best scale for examining the data. A view that is too broad will yield information that is too general and coarse, leading to useless analysis. A view that is too narrow will give information too detailed to be practical.
Enter the "Mean Shift Smoothie," a Matlab application developed by the CREATIS laboratory to perform common mean-shift operations such as clustering and filtering.
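To make the role of scale concrete, here is a minimal sketch of one-dimensional mean-shift clustering in Python. It is not the CREATIS code: the Gaussian kernel, the function names, the merging rule and the toy data are all assumptions chosen for illustration. Each point is moved repeatedly to the weighted mean of its neighbors until it settles on a density peak, and the bandwidth plays the role of the scale.

```python
import math

def mean_shift_modes(points, bandwidth, iters=100, tol=1e-6):
    """Shift each point to the Gaussian-weighted mean of all points until it stops moving."""
    modes = []
    for x in points:
        for _ in range(iters):
            weights = [math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points]
            new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes

def count_clusters(points, bandwidth):
    """Count distinct modes; modes closer than half a bandwidth are merged (an assumed rule)."""
    centers = []
    for m in sorted(mean_shift_modes(points, bandwidth)):
        if not centers or m - centers[-1] > bandwidth / 2:
            centers.append(m)
    return len(centers)

# Made-up 1-D "intensities": three tight groups near 1.2, 3.2 and 9.2.
data = [1.0, 1.2, 1.4, 3.0, 3.2, 3.4, 9.0, 9.2, 9.4]

for h in (0.05, 0.5, 20.0):
    print(f"bandwidth {h}: {count_clusters(data, h)} cluster(s)")
```

On this toy data, a very small bandwidth treats every point as its own cluster, a very large one lumps everything together, and an intermediate one recovers the three groups.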
The team says that porting the application to the grid allows images to be interpreted about 66 times faster than on a regular computer: a computation that would have taken 164 days on a single PC took 2.5 days on the EGEE grid.
A series of steps
The team approached the problem by first optimizing the scale of analysis. For optimal selection of scale parameters, an exhaustive search of all possible combinations is needed. Given the several thousand parameter combinations for each image of interest, this search can require several months of CPU time. This is hardly achievable on a single PC but represents a classic case of a "parameter sweep application" that can use the grid for computational speed-up.
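The structure of such a parameter sweep can be sketched in a few lines of Python. Everything here is hypothetical: the two scale parameters, their values, the toy scoring function and the job count are placeholders, and the real application would run a mean-shift analysis of an image for each combination. The point is only that every combination is independent, so the list can be cut into jobs that run in parallel and whose results are combined at the end.

```python
from itertools import product

def toy_score(spatial_scale, range_scale):
    """Stand-in for a real quality measure; by construction it peaks at (4, 0.3)."""
    return -((spatial_scale - 4) ** 2) - (range_scale - 0.3) ** 2

def split_into_jobs(combos, n_jobs):
    """Deal the combinations out round-robin so each job gets a near-equal share."""
    return [combos[i::n_jobs] for i in range(n_jobs)]

def run_job(job):
    """What one worker would do: score its (non-empty) share and return its best result."""
    return max((toy_score(*combo), combo) for combo in job)

# Hypothetical scale values; a real sweep would cover several thousand combinations.
spatial_scales = [1, 2, 4, 8]
range_scales = [0.1, 0.2, 0.3, 0.4]

combos = list(product(spatial_scales, range_scales))
jobs = split_into_jobs(combos, n_jobs=5)

# Each job could run on a different machine; only the per-job winners are compared.
best_score, best_combo = max(run_job(job) for job in jobs)
print(len(combos), [len(job) for job in jobs], best_combo)
```

Because the jobs share no state, the wall-clock time is governed by the slowest job rather than by the sum of all of them, which is where the speed-up of a grid comes from.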
The team therefore ported the application to the EGEE grid, within the EGEE Life-Sciences cluster.
Application porting was conducted at CREATIS and made use of the VBrowser, the MOTEUR workflow engine and the Diane pilot-job framework. Users interact with the VBrowser to browse input and output files and transfer them to and from EGEE grid storage resources and the logical file catalog. The VBrowser also lets them parameterize and launch application workflows, which are executed by the MOTEUR engine.
Mean Shift Smoothie is implemented using proprietary Matlab software. To simplify its grid deployment, the application was compiled on a machine that holds the Matlab license and is representative of the EGEE worker nodes. The generated binary code can then be executed on the grid with the Matlab Compiler Runtime (MCR).
Although porting the application and retrieving results demanded assistance from grid experts, the grid environment developed at CREATIS in collaboration with CNRS-I3S, CERN and UVA allowed end-users to design and launch experiments on their own.
"We hope this work will provide a way to quickly compute optimal or quasi-optimal scale parameters. However, many tests are still required to have a robust approach."
In its current phase the application does not manipulate patient data. If patient data is to be manipulated in the future, CREATIS will ensure that it is anonymized and handled using systems such as the gLite medical data manager.
-Dan Drollette, iSGTW, and the CREATIS team. This application is being demonstrated at the EGEE User Forum in Uppsala.
*The object on Mrs. Röntgen's finger is her wedding ring.