Monitoring Guide: Prometheus, Grafana & OpenSearch

This document describes the monitoring stack using free, open-source tools to ensure visibility and alerting for your deployment.

1. Metrics Collection with Prometheus

Installation via Helm

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus

Service Discovery

Annotate service pods to expose metrics:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/actuator/prometheus"
    prometheus.io/port: "8080"

Key Metrics to Track

API Latency: HTTP request duration (p95/p99)
Error Rate: 4xx/5xx per minute
Kafka Consumer Lag: to monitor event processing
Pod Health: CPU/memory usage, restart count

2. Dashboards with Grafana

Installation via Helm

helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana

Dashboard Setup

Import official Prometheus dashboard (ID: 1860)
Import Kafka overview dashboard (ID: 7607)
Custom panels:
- P99 latency (histogram_quantile)
- Error budget burn rate (alerts per minute)

3. Log Aggregation with OpenSearch

Installation via Helm

helm repo add opensearch https://opensearch-project.github.io/helm-charts
helm install opensearch opensearch/opensearch
helm install dashboards opensearch/opensearch-dashboards

Filebeat DaemonSet

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:7.17.0
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log

Key Log Queries

Error logs: level:ERROR in last 5m
Slow requests: HTTP duration > 1s
Kafka errors: consumer group failures

Document Version: 1.0
Date: 2025-06-21
Author: ArturChernets

1. Metrics Collection with Prometheus​

Installation via Helm​

Service Discovery​

Key Metrics to Track​

2. Dashboards with Grafana​

Installation via Helm​

Dashboard Setup​

3. Log Aggregation with OpenSearch​

Installation via Helm​

Filebeat DaemonSet​

Key Log Queries​

1. Metrics Collection with Prometheus

Installation via Helm

Service Discovery

Key Metrics to Track

2. Dashboards with Grafana

Installation via Helm

Dashboard Setup

3. Log Aggregation with OpenSearch

Installation via Helm

Filebeat DaemonSet

Key Log Queries