Auto Scaling Strategy

This guide defines how to implement efficient, metric-driven autoscaling for Spring Boot microservices running on Kubernetes.

Tooling Used

Horizontal Pod Autoscaler (HPA)
Prometheus Adapter
Spring Boot Actuator
Custom Metrics via Micrometer

Setup HPA with Custom Metrics

Step 1: Expose Metrics

<!-- pom.xml -->
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

application.yml:

management:
  endpoints:
    web.exposure.include: "*"
  metrics:
    export:
      prometheus:
        enabled: true

Step 2: Create HPA Based on CPU or Custom Metric

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: booking-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: booking-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Best Practices

Use both CPU and custom metrics (e.g., request count).
Tune thresholds with real-world traffic patterns.
Enable Kubernetes vertical pod autoscaler (VPA) for critical jobs.

Document Version: 1.0
Date: 2025-06-24
Author: ArturChernets

Tooling Used​

Setup HPA with Custom Metrics​

Step 1: Expose Metrics​

Step 2: Create HPA Based on CPU or Custom Metric​

Best Practices​

Tooling Used

Setup HPA with Custom Metrics

Step 1: Expose Metrics

Step 2: Create HPA Based on CPU or Custom Metric

Best Practices