Opinion - Multiple challenges for multicore processors
Since the microprocessor's advent over 30 years ago, the vast majority of software applications have been built and executed on single processor computer systems. We have lived through an age of easy programmability where large numbers of software developers have learned to program in a variety of languages which can easily be compiled into machine code for the latest, greatest microprocessor.
For many years, high performance computing followed the same path, with faster, more complex processors delivering annual performance gains in line with Moore's Law. By the mid-1980s it had become clear that single processors could not keep delivering these performance improvements, and parallel computing was born.
The transition from serial to parallel programming is difficult, and as a result the number of programmers trained to develop code in this way has remained less than 1% of the overall programmer population.
Until the beginning of this century, companies and individuals benefited enormously from the annual improvements in microprocessors. These gains were driven by silicon feature size reductions, increased design complexity and greater clock speed. Clock speed peaked in 2005; since then, we have witnessed reduced clock speeds on many products to minimize energy consumption.
To continue delivering annual performance increases and maintain the market for their products, vendors have responded by introducing multicore chips that pack two or more microprocessors onto a single piece of silicon.
For most businesses and individuals, the introduction of multicore processors has been positive. The performance of laptop and desktop computers has continued to increase as processor-hungry applications (e.g., music players) have run on one core while other processes (e.g., virus scanning, word processing) have run on the others.
However, more technical users have not reaped the benefits of multicore processors unless they are lucky enough to possess an already parallelized application. In some cases, their applications run considerably slower on a multicore processor because of its further reduced clock speed. Furthermore, as the number of cores on multicore processors continues to increase, the benefit becomes limited: humans are poor at performing more than three or four tasks at once, even on a computer.
In 2006, Intel demonstrated an 80-core microprocessor with over one teraflop/s of peak performance. The only way to realize such performance gains in practice, however, is to parallelize all applications, yet there are too few programmers familiar with parallel programming methods to do so.
We also face a hardware challenge directly related to multicore processors and their interaction with main memory. Memory speed has been a bottleneck ever since microprocessors were invented, and the gap has only widened with time. Poorly optimized code may utilize only 5 to 10% of a single-core processor, and the problem is exacerbated further in multicore counterparts. Unless data and instructions can be cached successfully - which generally isn't the case because of the quantities involved - any expected performance gains will be lost to insufficient memory speed.
We need to address these hardware issues by investigating ways of hiding the lack of memory speed from the application and by looking at how better hardware design can increase overall speed. We also need to examine what changes can be made at the operating system level to make better use of multicore processors, and at the languages used to program them.
Languages such as Fortran and C can be parallelized using methodologies that were taught to software developers 20 years ago. Today, however, far more emphasis is placed on object-oriented languages such as C++ and Java, and much less (if any) on parallel programming.
However, it would be wrong to dismiss Fortran and C as yesterday's languages; many of Europe's industries still rely on applications written in them. We need to continue training graduates in these languages, and traditional parallel programming paradigms should be taught at undergraduate level so that IT graduates have a basic understanding of the challenges. Parallel programming is, however, conceptually complex to master, and there is insufficient infrastructure available to train enough programmers.
We should therefore also consider completely new languages, such as Chapel, which is being developed by Cray as part of the DARPA High Productivity Computing Systems (HPCS) programme. Chapel is a high-productivity programming language for multicore and parallel systems, freely available as open source. It presents today's software developers with a language, grammar and concepts with which they are familiar, and an Integrated Development Environment (the graphical interface to the editor, compiler, debugger and profiler) similar or identical to the one they already use.
Multicore microprocessors are becoming ubiquitous. However, the programming tools and hardware support they need to realize their true potential are lacking. We need to develop the hardware, the training programs and the new languages necessary to ensure that computer systems, from the laptop to the supercomputer, continue to deliver the increases in performance we have come to expect.
- Mark Parsons, EPCC. This article previously appeared in EPCC News.