TU Berlin


Master Project: Distributed Systems

Dates / Locations

  • Kickoff meeting: 03.11.2020, 13:00
  • Service talk on DSP pipelines + start of task one: 10.11.2020, 13:00
  • Final presentation of task one: 15.12.2020, 13:00 (presentation + slides + live demo)
  • Service talk on research + start of task two: 12.01.2021, 13:00
  • Final presentation of task two: 23.02.2021, 13:00 (presentation + slides + live demo)
  • Report submission (via ISIS): by 16.03.2021, 23:59 (PDF document)


Registration for this project is required (capacity: 12). This semester's project still has open slots.

  • Registration must be done via email to Morgan Geldenhuys (morgan.geldenhuys@tu-berlin.de) and should include: full name, matriculation number, field of study, and a list of DOS / CIT courses already completed.
  • Registration opens on 01.10.2020 at 9am and closes on 02.11.2020 at 9am. Emails sent outside this window will be ignored.
  • Each successful registration will be confirmed by 02.11.2020, 9pm.

Topic: Adaptive Resource Management for Stream Processing Jobs

This winter semester, we offer a master project on adaptive resource management for stream processing jobs.

Distributed Stream Processing (DSP) systems are critical for processing vast amounts of data in real time: events traverse a graph of streaming operators that extract valuable information from them. In many scenarios this information is most valuable at the moment the data arrives, so the system must deliver a predictable level of performance. Examples of such scenarios include IoT data processing, click-stream analytics, network monitoring, financial fraud detection, spam filtering, news processing, and many more. To process these large data streams, DSP systems such as Storm, Spark, and Flink have been introduced, which allow analytics pipelines to be deployed on clusters of commodity nodes. Applications developed within these frameworks are, in principle, required to operate indefinitely on an unbounded stream of continuous data, in an environment where partial failures are to be expected as the applications scale. Consequently, DSP systems feature high-availability modes, implement fault-tolerance mechanisms by default, and expose a rich, continuously evolving feature set.
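As a small illustration of the operator-graph idea (our own sketch, not part of the project requirements or any specific framework's API), a stream pipeline can be modeled with plain Python generators: each operator consumes an unbounded iterator of events and yields transformed events downstream, loosely mirroring the source/filter/window stages of a Flink- or Storm-style job. All names and the toy data here are illustrative.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Toy events: "sensor_id,temperature" lines arriving as a stream.

def parse(lines: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Source operator: turn raw lines into (sensor_id, temp) events."""
    for line in lines:
        sensor_id, temp = line.split(",")
        yield sensor_id, float(temp)

def filter_valid(events: Iterator[Tuple[str, float]]) -> Iterator[Tuple[str, float]]:
    """Filter operator: drop physically implausible readings."""
    for sensor_id, temp in events:
        if -50.0 <= temp <= 60.0:
            yield sensor_id, temp

def windowed_average(events: Iterator[Tuple[str, float]],
                     size: int) -> Iterator[Dict[str, float]]:
    """Window operator: emit per-sensor averages every `size` events."""
    buffer: List[Tuple[str, float]] = []
    for event in events:
        buffer.append(event)
        if len(buffer) == size:
            per_sensor: Dict[str, List[float]] = {}
            for sensor_id, temp in buffer:
                per_sensor.setdefault(sensor_id, []).append(temp)
            yield {sid: sum(ts) / len(ts) for sid, ts in per_sensor.items()}
            buffer.clear()

# Wire the operators into a small graph: source -> filter -> window.
raw = ["a,20.0", "a,999.0", "b,10.0", "a,22.0", "b,12.0"]
for result in windowed_average(filter_valid(parse(raw)), size=4):
    print(result)  # prints {'a': 21.0, 'b': 11.0}
```

Real DSP frameworks add exactly what this sketch lacks: distribution across nodes, state management, and fault tolerance for the operator state.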

In this master project, you and your team will design and implement a streaming analytics pipeline that takes advantage of the latest OS virtualization and container orchestration technologies. You will examine existing real-world data streams and create a generator that emulates their underlying data patterns. Once you have presented a proof of concept for your analytics pipeline, your group will be assigned a topic within the area of adaptive resource management and will be expected to develop a scientific approach to it. Topic areas may include: adaptive fault tolerance, runtime optimization, automatic parameter tuning, monitoring, modeling and runtime prediction, profiling, resource allocation, and fault detection/identification.
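To make the generator task concrete, here is a hedged sketch of one possible approach (the function names and parameters are our own illustration, not a prescribed design): a load generator that emulates a diurnal arrival pattern by modulating a Poisson process's rate with a sine wave, using the standard thinning technique for non-homogeneous Poisson processes.

```python
import math
import random
from typing import Iterator

def arrival_times(base_rate: float, amplitude: float, period: float,
                  horizon: float, rng: random.Random) -> Iterator[float]:
    """Yield event timestamps from a non-homogeneous Poisson process.

    The instantaneous rate follows base_rate * (1 + amplitude * sin(2*pi*t/period)),
    emulating e.g. a day/night load pattern. Uses thinning: draw candidate
    arrivals at the peak rate, then accept each candidate with probability
    rate(t) / peak_rate.
    """
    peak = base_rate * (1 + amplitude)
    t = 0.0
    while True:
        t += rng.expovariate(peak)  # candidate inter-arrival time at peak rate
        if t >= horizon:
            return
        rate = base_rate * (1 + amplitude * math.sin(2 * math.pi * t / period))
        if rng.random() < rate / peak:  # thinning: accept with prob rate/peak
            yield t

rng = random.Random(42)
events = list(arrival_times(base_rate=10.0, amplitude=0.5,
                            period=60.0, horizon=120.0, rng=rng))
print(f"{len(events)} events over 120s, first at t={events[0]:.2f}s")
```

In the project itself, the rate function would be fitted to the real-world stream you analyze rather than assumed to be sinusoidal, and each emitted timestamp would carry a generated payload.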

Target Audience

Master students with solid programming skills who would like to gain hands-on experience in a practical project in the area of distributed and operating systems.


  • We expect participants to have a basic understanding of distributed stream processing systems and frameworks such as Flink or Spark, and a keen interest in adaptive resource management.
  • We assume solid programming skills in at least one widely used programming language (Java / Scala / Python), as well as familiarity with virtualization technologies and Linux/Unix systems.

Contact


Morgan Geldenhuys
+49 30 314-79675
Room TEL 1211


Dominik Scheinert
+49 30 314-26260
Room TEL 1218