TU Berlin

Department of Telecommunication Systems

Root Cause Localization and Automatic Recovery for Microservices

A growing number of applications in domains such as the Internet of Things (IoT), cloud computing, and fog computing are adopting microservices architectures (MSA) to build large-scale systems that are more resilient, more robust, and better adapted to dynamic customer requirements.

To operate microservices reliably and with high uptime, it is essential to identify root causes and recover the affected services quickly once abnormal behaviour is detected.

However, it is difficult to achieve this in microservices systems due to the following challenges:

  • complex dependencies
  • numerous metrics
  • frequent updates
  • volatile infrastructure

In this project, we will study the following research questions: 

  • Root cause localization: How can the root cause of performance issues in microservices be located?
    (One proposed method: MicroRCA)
  • Automatic recovery: Once the root cause is identified, what action should be taken to recover from the performance degradation with no or minimal SLA violations? (Ongoing)
  • Extension to fog computing: In a geographically distributed, resource-constrained fog computing environment with unreliable networks, how can the approaches developed for the cloud be applied?


MicroRCA root cause localization procedures



