The Next Generation Grid aims to support resource sharing in virtual organizations all over the world, and thereby to attract commercial users to use the Grid, develop grid-enabled applications, and offer their own resources in the Grid. Mandatory prerequisites are flexibility (building virtual organizations on demand), transparency, security, predictability, reliability in communication and cooperation (fault tolerance), and finally the application of reliable Service Level Agreements (SLAs) to guarantee the desired and negotiated Quality of Service. These requirements place new demands on all Grid middleware components, on local resource management systems, and on the underlying computer, storage, and networking architectures. However, at present none of the processing levels (computer architectures, resource management systems, Grid middleware) meets these demanding requirements.
Necessity for Quality of Service
Scientific and engineering applications in domains such as energy, CAE (Computer Aided Engineering), bio-informatics, weather modelling, pharmaceuticals, automotive engineering, fluid dynamics, and finance, to name but a few, form part of a widening range of computationally intensive and data-intensive applications running on production clusters. All of these application domains rely on a guaranteed level of Quality of Service (efficiency, predictability, scalability, and reliability) from the underlying computer architectures and from the Grid middleware in use.
Scientific and Technological Objectives
The main scientific and technological objectives of the HPC4U project are to provide:
- Predictable and reliable middleware for clusters built of Commodity-Off-the-Shelf components, which enables fault tolerance based on checkpointing and job migration in the Grid.
- An SLA-aware resource management system that provides a reliable and precise statement of the service level for a submitted job and fulfils that service level despite hardware or software failures.
- Grid middleware that supports multiple SLA-aware resource management systems in multiple administrative domains and is aware of available redundant resources that can be used in case of failure.
The main goal of the HPC4U project is a consistent realisation of this vertical approach, spanning cluster middleware, resource management, and Grid middleware, based on existing products and middleware.