Observability

This document covers the observability stack for the MaaS Platform, including metrics collection, monitoring, and visualization.

Important

User Workload Monitoring must be enabled to collect metrics.

Add enableUserWorkload: true to the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace.
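
For reference, enabling it usually amounts to a ConfigMap like the one below; if cluster-monitoring-config already exists in your cluster, merge this key into the existing config.yaml rather than replacing it:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true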

Overview

As part of the Dev Preview, the MaaS Platform includes a basic observability stack that provides insight into system performance, usage patterns, and operational health. The observability stack consists of:

Note

The observability stack will be enhanced in the future.

  • Limitador: Rate limiting service that exposes metrics
  • Prometheus: Metrics collection and storage
  • Grafana: Metrics visualization and dashboards
  • Future: Migration to Perses for enhanced dashboard management

Metrics Collection

Limitador Metrics

Limitador exposes several key metrics that Prometheus collects through a ServiceMonitor:

Rate Limiting Metrics

  • limitador_ratelimit_requests_total: Total number of rate limit requests
  • limitador_ratelimit_allowed_total: Number of requests allowed
  • limitador_ratelimit_denied_total: Number of requests denied
  • limitador_ratelimit_errors_total: Number of rate limiting errors
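
As a quick orientation, these counters can be combined in PromQL to watch throughput and denial ratios. The queries below are a sketch that assumes the metric names listed above; adjust them to the names your Limitador version actually exposes:

# Rate-limit checks per second over the last 5 minutes
sum(rate(limitador_ratelimit_requests_total[5m]))

# Fraction of checks that were denied over the last 5 minutes
sum(rate(limitador_ratelimit_denied_total[5m]))
/
sum(rate(limitador_ratelimit_requests_total[5m]))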

Performance Metrics

  • limitador_ratelimit_duration_seconds: Duration of rate limit checks
  • limitador_ratelimit_active_connections: Number of active connections
  • limitador_ratelimit_cache_hits_total: Total number of cache hits
  • limitador_ratelimit_cache_misses_total: Total number of cache misses
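
Two derived views that are often useful here are cache effectiveness and check latency. The sketch below assumes the metric names above and, for the quantile query, that limitador_ratelimit_duration_seconds is exposed as a Prometheus histogram:

# Cache hit ratio over the last 5 minutes
sum(rate(limitador_ratelimit_cache_hits_total[5m]))
/
(sum(rate(limitador_ratelimit_cache_hits_total[5m])) + sum(rate(limitador_ratelimit_cache_misses_total[5m])))

# 95th percentile duration of rate limit checks (histogram assumed)
histogram_quantile(0.95, sum by (le) (rate(limitador_ratelimit_duration_seconds_bucket[5m])))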

Tier-Based Metrics

  • limitador_ratelimit_tier_requests_total: Requests per tier
  • limitador_ratelimit_tier_allowed_total: Allowed requests per tier
  • limitador_ratelimit_tier_denied_total: Denied requests per tier
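
To compare tiers against each other, the per-tier counters can be broken out by their tier label. This sketch assumes the metrics above carry a label named tier; check the actual label name on the /metrics endpoint:

# Denied-request ratio per tier over the last 5 minutes
sum by (tier) (rate(limitador_ratelimit_tier_denied_total[5m]))
/
sum by (tier) (rate(limitador_ratelimit_tier_requests_total[5m]))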

ServiceMonitor Configuration

To have Prometheus discover and scrape the Limitador service automatically, create a ServiceMonitor resource:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: limitador-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: limitador
  endpoints:
  - port: metrics
    interval: 10s
    path: /metrics
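
The selector above matches labels on the Limitador Service, and the endpoint refers to a Service port named metrics. A matching Service could look like the following sketch; the namespace and port numbers are illustrative and should be taken from your actual Limitador deployment:

apiVersion: v1
kind: Service
metadata:
  name: limitador
  namespace: monitoring
  labels:
    app: limitador          # matched by the ServiceMonitor selector
spec:
  selector:
    app: limitador
  ports:
  - name: metrics           # must match the endpoint port name in the ServiceMonitor
    port: 8080
    targetPort: 8080

Apply both resources with oc apply -f (or kubectl apply -f) and confirm the new target appears on the Prometheus targets page.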

High Availability for MaaS Metrics

For production deployments where rate-limit counters (and the metrics derived from them) must persist across pod restarts and scaling events, configure Limitador to use Redis as its backend storage.

Why High Availability Matters

By default, Limitador stores rate-limiting counters in memory, which means:

  • All hit counts are lost when pods restart
  • Metrics reset when pods are rescheduled or scaled down
  • No persistence across cluster maintenance or updates

Setting Up Persistent Metrics

To enable persistent metric counts, refer to the detailed guide:

Configuring Redis storage for rate limiting

This Red Hat documentation provides:

  • Step-by-step Redis configuration for OpenShift
  • Secret management for Redis credentials
  • Limitador custom resource updates
  • Production-ready setup instructions

For local development and testing, you can also use our Limitador Persistence guide which includes a basic Redis setup script that works with any Kubernetes cluster.
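
For orientation, the switch to Redis usually comes down to pointing the Limitador custom resource at a Secret that holds the Redis connection URL. The sketch below follows the Kuadrant Limitador operator's Redis storage option; field names, the secret key, and the namespace may differ by operator version, so treat the guides above as authoritative:

apiVersion: v1
kind: Secret
metadata:
  name: redis-config
  namespace: kuadrant-system          # illustrative namespace
stringData:
  URL: redis://redis.redis.svc.cluster.local:6379
---
apiVersion: limitador.kuadrant.io/v1alpha1
kind: Limitador
metadata:
  name: limitador
  namespace: kuadrant-system          # illustrative namespace
spec:
  storage:
    redis:
      configSecretRef:
        name: redis-config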

Grafana Dashboards

MaaS Platform Overview Dashboard

We provide a basic dashboard for the MaaS Platform that gives a quick overview of the system. Its definition can be downloaded and imported from the following link: maas-token-metrics-dashboard.json

A more detailed description of the Grafana dashboard is available in the repository's README.
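
If you want to import the dashboard without going through the Grafana UI, the standard Grafana HTTP API can be used. The endpoint below is Grafana's dashboard-import API; the host, token, and file path are placeholders for your environment:

# Wrap the dashboard JSON in the payload the Grafana API expects and POST it
jq -n --slurpfile d maas-token-metrics-dashboard.json '{dashboard: $d[0], overwrite: true}' \
  | curl -s -X POST "https://<grafana-host>/api/dashboards/db" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer <grafana-api-token>" \
      --data-binary @-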