We are excited to announce that our incident management system is now open source!
Our incident management system is designed to help teams in the tech industry respond to and resolve incidents quickly and effectively.
Features
It includes features such as incident categorization, incident escalation, and real-time communication tools.
Open Sourcing
By open sourcing the system, we hope to make it more widely accessible to tech companies and organizations, and to encourage collaboration and contributions from the wider community.
We believe that by working together, we can improve the system and make it even more powerful and effective in addressing tech-related incidents.
Getting Started
To help you get started, we provide links to our documentation, LinkedIn page, official website, and a live demo of the system.
There is also a signup form where you can create a free account to work with the system.
You can also install the platform on your own server or inside Kubernetes.
Contribution
If you are interested in contributing to the project, please visit our GitHub repository to learn more and to access the code.
Our team is also available for questions or collaboration at nikolay.k@harpia.io or via GitHub Issues.
Subscription-based Support
We also provide a subscription-based support service for our incident management system.
This includes access to priority support, regular updates, and additional features.
If you are interested in this service, please contact us at nikolay.k@harpia.io for more information and pricing.
Technologies used
We use a combination of technologies such as Vue.js, Python, Aerospike, Kafka, VictoriaMetrics, and MariaDB to build the system.
Architecture Overview
Our incident management system is built on a microservices architecture, utilizing a combination of APIs and event-driven communication.
The system is composed of several services that work together to provide the functionality of incident categorization, escalation, and real-time communication.
Each service is designed to be independent and can be deployed independently, allowing for flexibility and scalability.
1. Technical flow to process alerts:
- harp-collectors: receive alerts from monitoring systems, unify their structure, and push them to a Kafka topic
- harp-alert-decorator: read alerts from the Kafka topic produced by harp-collectors and add information about the environments and scenarios that should be applied to each alert
- harp-daemon: read alerts from the Kafka topic produced by harp-alert-decorator, define the logic and state of each alert, and write the result to MariaDB
- harp-aggregator: read alerts from MariaDB, aggregate them, and send them to Aerospike
- harp-bridge: read alerts from Aerospike and send them to the UI via websockets
- harp-ui: the main user interface of the platform
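The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Alert` fields, the state names "new" and "ongoing", and the lookup tables are hypothetical, and in-memory queues and a dictionary stand in for Kafka, MariaDB, and Aerospike. It is not the actual service code.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, Dict, List, Tuple


@dataclass
class Alert:
    # Hypothetical unified structure; the real schema may differ.
    source: str
    name: str
    host: str
    severity: str
    environment: str = "unknown"
    scenario: str = "default"
    state: str = "new"


def collect(raw: Dict[str, Any], source: str) -> Alert:
    """harp-collectors step: map a monitoring payload to one common structure."""
    return Alert(
        source=source,
        name=raw.get("alertname") or raw.get("title", "unnamed"),
        host=raw.get("instance") or raw.get("host", "unknown"),
        severity=str(raw.get("severity", "info")).lower(),
    )


def decorate(alert: Alert, env_by_host: Dict[str, str],
             scenario_by_name: Dict[str, str]) -> Alert:
    """harp-alert-decorator step: attach environment and scenario info."""
    alert.environment = env_by_host.get(alert.host, "unknown")
    alert.scenario = scenario_by_name.get(alert.name, "default")
    return alert


def apply_state(alert: Alert, store: Dict[Tuple[str, str, str], Alert]) -> Alert:
    """harp-daemon step: decide the state and persist it (dict stands in for MariaDB)."""
    key = (alert.source, alert.name, alert.host)
    alert.state = "ongoing" if key in store else "new"
    store[key] = alert
    return alert


def aggregate(store: Dict[Tuple[str, str, str], Alert]) -> Dict[str, List[Alert]]:
    """harp-aggregator step: group stored alerts per environment for the UI."""
    grouped: Dict[str, List[Alert]] = {}
    for alert in store.values():
        grouped.setdefault(alert.environment, []).append(alert)
    return grouped


def run_pipeline(raw_alerts, env_by_host, scenario_by_name):
    """Run all stages in order; each deque plays the role of a Kafka topic."""
    collected: Deque[Alert] = deque()
    decorated: Deque[Alert] = deque()
    store: Dict[Tuple[str, str, str], Alert] = {}
    for source, raw in raw_alerts:
        collected.append(collect(raw, source))
    while collected:
        decorated.append(decorate(collected.popleft(), env_by_host, scenario_by_name))
    while decorated:
        apply_state(decorated.popleft(), store)
    return aggregate(store)
```

Because each stage only reads from one queue and writes to the next, any stage can be replaced or scaled without touching the others, which is the property the separate services rely on.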
2. Additional Services:
- harp-filters: create and manage user-specific filters in the UI
- harp-actions: manage alerts - handle, snooze, acknowledge
- harp-environments: create and manage environments
- harp-bots: configure your own bots to send automatic notifications to different channels: Email, SMS, Slack, etc.
- harp-integrations: create and manage the integrations with your monitoring systems
- harp-licenses: monitor the usage of the alerts and notification channels
- harp-scenarios: create and manage scenarios for alerts
- harp-users: create and manage users inside the platform, including authentication and authorization
- harp-notifications-gmail: sends automatic email notifications
- harp-notifications-msteams: sends automatic notifications to Microsoft Teams
- harp-notifications-slack: sends automatic notifications to Slack channels
- harp-notifications-sms: sends automatic SMS notifications via the Twilio integration
- harp-notifications-telegram: sends automatic notifications to Telegram channels
- harp-notifications-voice: places automatic phone calls via the Twilio integration
- harp-clientevents: receive and analyze metrics from the frontend
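The one-service-per-channel layout of the notification services can be illustrated with a small dispatcher. This is a hypothetical sketch, not the platform's API: the `register` decorator, the `notify` function, and the returned strings are invented for illustration, and the senders only format a string instead of calling Slack, Telegram, or an email provider.

```python
from typing import Callable, Dict, List, Tuple

# A sender takes a target (channel name, address, ...) and a message.
Sender = Callable[[str, str], str]

# Registry of channel name -> sender; each entry mirrors one notification service.
SENDERS: Dict[str, Sender] = {}


def register(channel: str) -> Callable[[Sender], Sender]:
    """Decorator that adds a sender function to the registry under a channel name."""
    def wrap(fn: Sender) -> Sender:
        SENDERS[channel] = fn
        return fn
    return wrap


@register("slack")
def send_slack(target: str, message: str) -> str:
    # Placeholder: a real sender would call the Slack API here.
    return f"slack:{target}:{message}"


@register("email")
def send_email(target: str, message: str) -> str:
    # Placeholder: a real sender would hand the message to a mail service.
    return f"email:{target}:{message}"


def notify(routes: List[Tuple[str, str]], message: str) -> List[str]:
    """Send one message to every (channel, target) pair, skipping unknown channels."""
    results: List[str] = []
    for channel, target in routes:
        sender = SENDERS.get(channel)
        if sender is None:
            results.append(f"skipped:{channel}")
            continue
        results.append(sender(target, message))
    return results
```

Adding a new channel then means registering one more sender, without changing the code that decides who gets notified.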
3. Platform Monitoring:
- Prometheus metrics in VictoriaMetrics
- Traces in Grafana Tempo
- Logs in Grafana Loki
- Dashboards and Alerts in Grafana
Conclusion
We look forward to working with you to make our incident management system the best it can be for the tech industry!