The Extreme Science and Engineering Discovery Environment (XSEDE), the much-anticipated successor to TeraGrid, launched 1 July 2011. Now that things have settled down in the aftermath of the last TeraGrid conference, iSGTW decided to look up XSEDE principal investigator John Towns to find out more. Towns also serves as the lead for persistent infrastructure at the National Center for Supercomputing Applications in Urbana, Illinois, USA.
iSGTW: Thank you for agreeing to speak with us, John. Before we start, I'd just like to congratulate you and your team on launching XSEDE.
John Towns: Thanks very much. It has been a very long road and it is great to finally be at the stage where we can start doing instead of just planning.
iSGTW: I know that successfully funded proposals for big projects like this are very detailed. How much of XSEDE is already planned out?
Towns: Though much of it has been planned out, there is still some work to do. There is really a three-part answer to this question.
The first part is that the NSF funded the planning of projects during the proposal development phase. This allowed for much more significant planning and preparation than we typically see in these projects. The second part is that the final plan for this project involved incorporation of new elements as a condition from NSF. This is inducing re-planning in several areas, and this work has been under way for the past few months in anticipation of the award. The third is that the project is designed to evolve over time and, as a result, will explicitly involve a level of ongoing planning throughout the project.
iSGTW: How will XSEDE be different from TeraGrid?
Towns: Wow, that's a big question and also one I have been asked frequently of late. Unfortunately, it is difficult to capture in a few minutes (or words). At the highest level, we have worked very hard to determine what has been the best of TeraGrid to bring that forward, to identify the weaknesses of TeraGrid and address them, and to determine where we could most effectively expand our scope and efforts. We have spent considerable time in public (particularly last week during the TeraGrid '11 conference in Salt Lake City) relating what is new and better about XSEDE to the community.
The short answer to the question is that we have incorporated many formal processes in the management of the project, we have expanded the scope of the community we are trying to serve, and our goal is to establish a basis for an open national cyberinfrastructure to facilitate collaboration and productivity of the research community writ large. Really, there are so many things to say in response to this question and I want to launch into a long list of things, but I will refer folks to the growing set of information we are posting to the XSEDE.org web site.
iSGTW: Were those differences mandated by the National Science Foundation, or are they part of the vision your team brings to XSEDE?
Towns: Certainly, NSF set out certain parameters of the program in the solicitation to which we responded (see NSF 08-571). The solicitation left many things undefined, however. The vision we developed over the past three-plus years took the solicitation as one of the many sources of guidance we had in the process.
iSGTW: You mentioned earlier that with XSEDE, you've increased the scope of the community. Can you expand on that statement?
Towns: I don't think it is unfair to say that an aspect of TeraGrid was that it was a technology-centric project with a focus on the delivery of capabilities, and in particular HPC capabilities, to the academic research community. TeraGrid was very successful here and facilitated many wonderful successes. However, the strong HPC focus naturally scoped the community we supported to primarily those making use of high-end computing resources. With XSEDE we are taking a slightly different perspective: a researcher-centric point of view. The focus now is much more on creating an infrastructure in which researchers can create their own environment in which all the resources they need are embedded to facilitate their work and allow them to be more productive. The resources of interest might be HPC systems, but will also be data stores, visualization capabilities, collaboration tools and resources that distributed teams might wish to share among themselves. This approach opens the door to provide support for a much larger community of researchers.
iSGTW: With XSEDE's new focus on collaboration, how will its approach to interoperability differ from TeraGrid's?
Towns: This is a bit of a tricky space. Under TeraGrid, we did not support interoperability nearly as effectively as we would have liked. With XSEDE, we have incorporated this desire from the start of the design of our distributed environment. Many things come into play, not the least of which is supporting innovative capabilities that often work at odds with maintaining interoperability. We believe we have come to a judicious balance of the use of standards and common practices to support interoperability for common capabilities while still allowing innovation to occur.
A major thrust of XSEDE is to integrate with campus cyberinfrastructures, and the use of standards and common practices will be extremely important in being successful in this space. At the national level, we wish to support interoperability with other infrastructures of interest to the research community. In many of these cases, adherence to select standards will be critical, but we must also look to be interoperable with projects that have had less of an adherence to standards.
We have already tested interoperability with the Open Science Grid, one of our key targets in this space, and we believe we can do this well technically. There are, of course, always the policy issues, and we are beginning to sort them out with OSG already.
iSGTW: How will XSEDE differ from TeraGrid with respect to international collaboration and interoperability?
Towns: On the international front, we have been engaging several infrastructures. During the planning of this project we developed a deep relationship with our friends in Europe who are part of DEISA (which is in the process of being merged with PRACE) and have plans in development with other infrastructures. We have already tested interoperability with DEISA and have plans to do this with others. At the same time we are working with PRACE management to define the relationship after the incorporation of DEISA.
Likewise, we are engaged with folks from EGI, the NGS and others in the UK, the folks from NAREGI in Japan, from CNGrid and ChinaGrid in China and from KISTI in South Korea. As it turns out, there is also emerging infrastructure in countries like Chile, which I visited earlier this year, and they are keenly interested in working with us.
There seems to be a resurgence of interest in international collaboration among those providing infrastructures, and this is critically important to effectively supporting the international nature of research collaborations in the modern world. These relationships take a significant amount of time and effort to develop, but we see great potential for accomplishing our goals in collaboration and we will continue to put effort into these developments.
iSGTW: There is a plethora of cyberinfrastructure-enabled experiments in the United States that did not use TeraGrid. Is it the hope that eventually all NSF-funded cyberinfrastructure will be plugged into XSEDE, if not directly, then via some other cyberinfrastructure project?
Towns: We recognized this in the planning for this project. To the extent possible, we have tried to reach out to these experiments, and many have been great sources of requirements for us. While it is the vision of XSEDE to develop the basis for a national cyberinfrastructure to which all of these projects might connect and thus have interoperability supported, we have some very significant challenges.
In the case of large-scale experiments, they frequently have funding timelines of multiple decades and very long planning horizons. XSEDE has been awarded as a five-year project with an option to extend for an additional five years. This gives us a maximum run of 10 years. For the large experiments, they are reluctant to become too dependent on an infrastructure that may disappear.
This introduces a significant risk, and they are thus often developing their own solutions and replicating significant amounts of effort. While we have and will continue to have technical discussions with them to assure interoperability to the extent possible, without a long-term commitment to a common infrastructure, the many cyberinfrastructure-enabled experiments in the United States will not benefit from the significant advantages of leverage they might.
I have made this point with many at NSF, and there are some sympathetic to this issue. With luck, we might hope for a longer-term commitment to such critical research infrastructure in the future.
iSGTW: What about multi-decade projects that have not yet made any decisions about their infrastructure? Have you spoken to any of them? Do you think they might be more open to making infrastructure choices that are compatible with XSEDE despite the differences in project duration?
Towns: As I mentioned before, building these relationships takes significant time and effort, and as yet we have not been able to approach all of the projects we would like to. Identifying projects in the state you ask about is difficult since they are often in the proposal development stage and we do not necessarily know who will be responding to various NSF solicitations. One of the things I have been asking of NSF is to include in their solicitations, particularly the MREFC (Major Research Equipment and Facilities Construction) solicitations, information about XSEDE as a possible provider and collaborator in this space.
Unfortunately, until there is some clear indication of greater longevity for XSEDE, the longer-lived projects will simply see committing to the use of XSEDE-provided capabilities as too risky. I am encouraged that many have been willing to discuss with us what we are doing and planning, and I think that is helpful to the research community, but we simply are not realizing the leverage that is possible at this point.