Feature - Lining up new grid users with gqsub
Grid computing is often seen as the domain of big science, involving giant experiments and massive number crunching. What if you're not working on these huge scales? What if your work is too big for a local cluster but "too small for the grid"? Welcome to 'mesoscale' computing, the middle ground between applications that use 10 minutes on your desktop and those that use 100 hours on the grid. Researchers who fall into this area are usually well looked after by local resources. However, for various reasons - maybe high demand from other users, or a need to investigate something unusual - a user may want to move on to the grid. This move can be tricky when the user is faced with another interface to learn or a completely different way of getting their work done. One solution comes from a team at the University of Glasgow with gqsub, a grid interface based on the qsub command. qsub is found on many *nix (Unix-like operating system) machines, and is used by many scientists to submit jobs to their local batch system, in which jobs are queued for later execution. Creating a grid-enabled version makes it easier to encourage new users to try out the grid.
Origins

It all started when electrical engineers at the University of Glasgow tried to use the grid alongside their local cluster, and found that the two interfaces were just too dissimilar for them. So they asked Stuart Purdie of the GridPP team at Glasgow about getting direct access to the Glasgow grid cluster, bypassing the grid itself. Their preferred method of accessing these resources? qsub.

Giving direct access was not feasible, so Purdie started looking around the lesser-known parts of gLite. He found features that can be used to give the illusion of a shared filesystem, just like in a batch system. This led him to look at qsub in greater detail. Despite its age (it was designed over two decades ago, ancient by computing standards), qsub included the ability to talk to remote servers and the right components for describing a grid-type system. He now had two halves of a solution; he just had to weld them together.

Nothing works that easily, however, and there were a few hurdles to overcome. The major problem was that while qsub understood distributed resources, it didn't understand that they could be decentralized. This, coupled with the need to integrate digital certificates, led to a lot of reading specs, thinking and coding. The result was a set of Python scripts that act as a translation layer between the user and the gLite software.

Stuart was a little taken aback by the interest in his work, which he presented at the EGEE conference in September 2009. (His poster came in second; you can see it here.) "At the conference, the first email I got about it was on the Monday morning, before the conference had even really begun. The ultimate intent is to provide something that can be installed on an existing cluster, so that the users can use both local and Grid systems with as small as possible mismatch between them."
Tony Doyle, technical director of GridPP, said: "gqsub was only in the ideas stage a few months ago. The current prototype, implemented over the summer, provides enough functionality for those users who find the transition to the Grid daunting. It's a great example of how to recognize a problem and implement a technical solution to the benefit of everyone in EGEE in a short period of time. Congratulations, Stuart!"

The software has already proved itself in practice: the electrical engineers at Glasgow have been using it for a couple of months now, and it is freely available for use. There is still work to be done, including improved results collection for users who may not be online all the time, and increased control over a job once it has been submitted to the grid.

A version of this story originally appeared in GridPP. You can find more information on gqsub and download it here.