Open Sourcing Incident Management System

· 4 min read
Mykola K.

We are excited to announce that our incident management system is now open source!

Our incident management system is designed to help teams, particularly in the tech industry, respond to and resolve incidents quickly and effectively.

Features

It includes features such as incident categorization, incident escalation, and real-time communication tools.

Open Sourcing

By open sourcing the system, we hope to make it more widely accessible to tech companies and organizations, and to encourage collaboration and contributions from the wider community.

We believe that by working together, we can improve the system and make it even more powerful and effective in addressing tech-related incidents.

Getting Started

To help you get started, we provide links to our documentation, LinkedIn page, official website, and a live demo of the system on our website.

We also have a signup form where you can create a new account for free to work with the system.

You can also install the platform on your own server or in Kubernetes.

Contribution

If you are interested in contributing to the project, please visit our GitHub repository to learn more and to access the code.

Our team is also available for questions or collaboration at nikolay.k@harpia.io or via GitHub Issues.

Subscription-based Support

We also provide a subscription-based support service for our incident management system.

This includes access to priority support, regular updates, and additional features.

If you are interested in this service, please contact us at nikolay.k@harpia.io for more information and pricing.

Technologies used

We built the system with a combination of technologies, including Vue.js, Python, Aerospike, Kafka, VictoriaMetrics, and MariaDB.

Architecture Overview

Our incident management system is built on a microservices architecture, utilizing a combination of APIs and event-driven communication.

The system is composed of several services that work together to provide the functionality of incident categorization, escalation, and real-time communication.

Each service is self-contained and can be deployed independently, allowing for flexibility and scalability.

[Architecture diagram: harp-architecture.drawio.svg]

1. Technical flow to process alerts:

  • harp-collectors: receives alerts from monitoring systems, unifies their structure, and pushes them to a Kafka topic
  • harp-alert-decorator: reads alerts from the Kafka topic produced by harp-collectors and adds information about the environments and scenarios that should be applied to each alert
  • harp-daemon: reads alerts from the Kafka topic produced by harp-alert-decorator, applies the alert logic, determines the alert state, and writes the result to MariaDB
  • harp-aggregator: reads alerts from MariaDB, aggregates them, and sends them to Aerospike
  • harp-bridge: reads alerts from Aerospike and sends them to the UI via websockets
  • harp-ui: the main user interface of the platform
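
To make the flow above easier to follow, here is a minimal sketch of the same stages in plain Python. It is an illustration only, not the actual service code: Kafka, MariaDB, and Aerospike are replaced with in-memory stand-ins, and every function name, field name, and scenario label (collect, decorate, "page-on-call", and so on) is a hypothetical choice made for this example.

```python
from collections import defaultdict, deque

# In-memory stand-ins for the real infrastructure (assumptions for this sketch).
topics = defaultdict(deque)  # plays the role of Kafka topics
alerts_table = {}            # plays the role of the MariaDB alerts table
cache = {}                   # plays the role of Aerospike


def collect(raw_alert, source):
    """harp-collectors stage: unify the alert structure and push it to a topic."""
    unified = {
        "source": source,
        "name": raw_alert.get("alertname") or raw_alert.get("title", "unknown"),
        "host": raw_alert.get("instance") or raw_alert.get("host", "unknown"),
        "severity": str(raw_alert.get("severity", "info")).lower(),
    }
    topics["collected"].append(unified)


def decorate(environments):
    """harp-alert-decorator stage: attach environment and scenario info."""
    while topics["collected"]:
        alert = topics["collected"].popleft()
        alert["environment"] = environments.get(alert["host"], "unknown")
        # Hypothetical rule: critical alerts page someone, the rest only notify.
        critical = alert["severity"] == "critical"
        alert["scenario"] = "page-on-call" if critical else "notify"
        topics["decorated"].append(alert)


def process():
    """harp-daemon stage: work out the alert state and persist the result."""
    while topics["decorated"]:
        alert = topics["decorated"].popleft()
        key = (alert["name"], alert["host"])
        previous = alerts_table.get(key)
        alert["occurrences"] = previous["occurrences"] + 1 if previous else 1
        alert["state"] = "open"
        alerts_table[key] = alert


def aggregate():
    """harp-aggregator stage: group stored alerts per environment for fast reads."""
    grouped = defaultdict(list)
    for alert in alerts_table.values():
        grouped[alert["environment"]].append(alert)
    cache.clear()
    cache.update(grouped)


def bridge():
    """harp-bridge stage: build the payload that would be sent to the UI."""
    return {env: [a["name"] for a in alerts] for env, alerts in cache.items()}


def run_pipeline(raw_alerts, environments):
    """Run every stage once, in the order described in the list above."""
    for source, raw in raw_alerts:
        collect(raw, source)
    decorate(environments)
    process()
    aggregate()
    return bridge()
```

In the real platform each stage runs as its own service and the hand-off happens through Kafka topics and the databases, so the stages do not call each other directly as run_pipeline does here.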

2. Additional Services:

3. Platform Monitoring:

  • Prometheus metrics in VictoriaMetrics
  • Traces in Grafana Tempo
  • Logs in Grafana Loki
  • Dashboards and Alerts in Grafana

Conclusion

We look forward to working with you to make our incident management system the best it can be for the tech industry!