Motivation: Resource Discovery in Grids
In the BabelPeers project, we look at the problem of how resources in large, heterogeneous Grids can be discovered. In world-sized Grids, two aspects of the resource discovery problem become especially important: heterogeneity and size.
Heterogeneity means that the types of resources included in the Grid are highly diverse. In addition to traditional resources like high-performance clusters and storage devices, any kind of service, including arbitrary applications and expensive physical instruments, is treated as a resource. No single standard can describe every kind of resource. As soon as there are multiple standards, additional knowledge is needed to mediate between them.
Size means that a scalable solution to the resource discovery problem is needed. Although there are numerous reasoning systems, they typically assume that all knowledge is collected on a single system, which is infeasible for arbitrarily large collections of information. We present a system that contributes the initial steps towards a solution of this problem.
Generic Problem: Scalable Semantic (Meta-) Data
Although Grid computing is the motivating application, the scope of this project is larger. Its goal is to provide a scalable approach to combining and querying large-scale collections of machine-readable information.
The main elements of the project are a system architecture based on a structured P2P network; a dissemination algorithm that places information on well-defined nodes; a reasoning mechanism that derives new knowledge from existing knowledge, combining information that originates from different nodes; and query evaluation strategies.
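To illustrate what placing information on well-defined nodes can look like in a structured P2P network, here is a minimal sketch. It assumes that the information consists of (subject, predicate, object) triples and that each triple is stored once per component on a hash ring; the `Ring` class, the node names, and the example triple are hypothetical and are not taken from the BabelPeers implementation.

```python
import hashlib
from bisect import bisect_left


def key_id(value: str, bits: int = 32) -> int:
    """Map a string to a position on the identifier ring (assumed hash scheme)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Hypothetical in-memory stand-in for a DHT: nodes sit on a hash ring."""

    def __init__(self, node_names):
        self.nodes = sorted((key_id(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]
        self.store = {name: set() for name in node_names}

    def responsible_node(self, key: str) -> str:
        # The first node clockwise from the key's position is responsible;
        # the modulo wraps around the end of the ring.
        idx = bisect_left(self.ids, key_id(key)) % len(self.nodes)
        return self.nodes[idx][1]

    def disseminate(self, triple):
        # Assumption: store the triple at the node responsible for each of
        # its components, so it can be found from any one of them.
        for component in triple:
            self.store[self.responsible_node(component)].add(triple)

    def lookup(self, component: str):
        # Every triple containing `component` lives on one well-defined node.
        node = self.responsible_node(component)
        return {t for t in self.store[node] if component in t}


ring = Ring(["node-a", "node-b", "node-c"])
triple = ("grid:cluster1", "rdf:type", "grid:Cluster")
ring.disseminate(triple)
matches = ring.lookup("rdf:type")
```

Because the responsible node depends only on the hash of the key, any peer can compute where to send a triple or a lookup without global knowledge. The same property lets information from different origins meet on one node, which is what a distributed reasoning step needs.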
The first evaluation strategy aims to extract all matches for a given query. We describe various strategies to minimize the network load. However, for queries with large result sets, an exhaustive evaluation is infeasible. We therefore present a second strategy, aimed at queries with a huge number of results, that retrieves only a restricted number of results according to some sorting criterion. We use caching and look-ahead strategies to make the algorithm efficient.
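The idea of retrieving only a restricted number of results can be sketched as follows. This is not the BabelPeers algorithm: it assumes that each node can deliver its local matches already sorted by the chosen criterion, and it merges these streams lazily. The names `node_stream`, `top_k`, and the `fetched` counter are hypothetical.

```python
import heapq
from itertools import islice

fetched = {"count": 0}  # counts how many results were pulled from nodes


def node_stream(sorted_results):
    """Simulate a remote node that returns its matches one by one, in order."""
    for result in sorted_results:
        fetched["count"] += 1
        yield result


def top_k(sorted_streams, k):
    # heapq.merge consumes the input streams lazily, so only the results
    # needed to decide the first k entries are pulled from the nodes.
    return list(islice(heapq.merge(*sorted_streams), k))


# Three hypothetical nodes, each holding 100 locally sorted results.
streams = [node_stream(range(start, 300, 3)) for start in range(3)]
best = top_k(streams, 5)
```

In this sketch only a handful of the 300 available results are transferred. A real system would additionally fetch results in batches and cache them, which corresponds roughly to the look-ahead and caching mentioned above.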
Prototype and Benchmarks
We have implemented a prototype of the system. Using this implementation, we perform various experiments, both in simulation and in real test runs, to show the efficiency of the system. See the download page for a demo of BabelPeers.
Current Work / Status
The BabelPeers project is ongoing work. The main research lines of interest are load balancing for DHT-based stores of semantic data, integration of further semantics (towards OWL-Lite and similar formalisms), and efficient query evaluation strategies.