Event | Date/Time | Format |
---|---|---|
Kickoff meeting | 03.11.2020 13:00 | WebConf |
Service talk on DSP Pipelines + Start of task one | 10.11.2020 13:00 | WebConf |
Final presentation of task one | 15.12.2020 13:00 | Presentation + Slides + Live-Demo |
Service talk on Research + Start of task two | 12.01.2021 13:00 | WebConf |
Final presentation of task two | 23.02.2021 13:00 | Presentation + Slides + Live-Demo |
Submit report (via ISIS) | by 16.03.2021 23:59 | PDF document |
Registration
- Registration has to be done via email to Morgan Geldenhuys (morgan.geldenhuys@tu-berlin.de) and should include: full name, matriculation number, field of study and list of courses already completed through DOS / CIT.
- Registration opens on 01.10.2020 at 9am and closes on 02.11.2020 at 9am. Emails received outside this window will be ignored.
- Each successful registration will be confirmed by 02.11.2020 9pm.
Topic: Adaptive Resource Management for Stream Processing Jobs
This winter semester, we offer a master project on adaptive resource management for stream processing jobs.
Distributed Stream Processing (DSP) systems are critical to processing vast amounts of data in real time. In these systems, events traverse a graph of streaming operators that extract valuable information. In many scenarios this information is most valuable at the time the data arrives, so systems must deliver a predictable level of performance. Examples of such scenarios include IoT data processing, click stream analytics, network monitoring, financial fraud detection, spam filtering, and news processing. To process these large streams of data, DSP systems such as Storm, Spark, and Flink have been introduced; they allow the deployment of analytics pipelines that use the processing power of a cluster of commodity nodes. Applications developed within these frameworks are, in principle, required to operate indefinitely on an unbounded stream of continuous data, in an environment where partial failures are to be expected as these applications scale. Consequently, DSP systems feature high availability modes, implement fault tolerance mechanisms by default, and expose a rich set of continuously evolving features.
In this master project, you and your team will be required to design and implement a streaming analytics pipeline that takes advantage of the latest OS virtualization and container orchestration technologies. You will examine existing real-world data streams and create a generator that emulates the underlying data patterns. Once you have presented the proof of concept for your analytics pipeline, your group will be assigned a topic within the area of adaptive resource management and will be expected to develop a scientific approach to it. Topic areas may include: adaptive fault tolerance, runtime optimization, automatic parameter tuning, monitoring, modeling and runtime prediction, profiling, resource allocation, and fault detection/identification.
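To give a rough idea of what a generator that emulates data patterns might look like, here is a minimal Python sketch. It is an illustration only and not part of the project specification: the sinusoidal daily load, the function names `rate` and `generate_events`, the default parameters, and the `sensor_id` payload field are all assumptions, and no real data stream was used to derive them. Arrivals are drawn from a non-homogeneous Poisson process by thinning.

```python
import math
import random
from typing import Iterator, Tuple


def rate(t: float, base: float = 50.0, amplitude: float = 30.0,
         period: float = 86_400.0) -> float:
    """Hypothetical arrival rate in events/second with one daily cycle.

    Assumes amplitude <= base so that the rate never becomes negative.
    """
    return base + amplitude * math.sin(2.0 * math.pi * t / period)


def generate_events(duration: float, seed: int = 0,
                    base: float = 50.0, amplitude: float = 30.0,
                    period: float = 86_400.0) -> Iterator[Tuple[float, int]]:
    """Yield (timestamp, sensor_id) pairs for the first `duration` seconds.

    Candidate arrivals are sampled at the peak rate and each one is kept
    with probability rate(t) / peak (thinning).
    """
    rng = random.Random(seed)
    peak = base + amplitude            # upper bound on rate(t)
    t = 0.0
    while True:
        t += rng.expovariate(peak)     # gap to the next candidate arrival
        if t >= duration:
            return
        if rng.random() * peak <= rate(t, base, amplitude, period):
            # sensor_id is a made-up payload field for illustration.
            yield t, rng.randrange(100)


if __name__ == "__main__":
    events = list(generate_events(duration=10.0, seed=42))
    print(len(events), "events in 10 s; first:", events[0])
```

In an actual pipeline, the rate function would be fitted to the observed stream and the events would be written to a message broker rather than collected in a list.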
Target Audience
Master's students with solid programming skills who would like to gain hands-on experience in a practical project in the area of distributed and operating systems.
Prerequisites
- We expect participants to have a basic understanding of distributed stream processing systems, frameworks such as Flink or Spark, and a keen interest in adaptive resource management.
- We assume solid programming skills in at least one widely used programming language (Java / Scala / Python) and also expect familiarity with virtualization technologies and Linux/Unix systems.
Contact
Morgan Geldenhuys
+49 30 314-79675
Room TEL 1211
e-mail query [1]
Contact
Dominik Scheinert
+49 30 314-26260
Room TEL 1218
e-mail query [2]
[1] parameter/en/id/216552/?no_cache=1&ask_mail=YAKI7wAEdAS%2Fpr7xHHvmR5Jz%2FTSRk7%2Fn8cxkDKA%2FWZUirJyE1i1Dng%3D%3D&ask_name=Morgan%20Geldenhuys
[2] parameter/en/id/216552/?no_cache=1&ask_mail=YAKI7wAEr3iKktgOwRWIhE11203qM0JG8cK%2FyKEePhDuuvXRa5q9CA%3D%3D&ask_name=Dominik%20Scheinert