Observability
This document covers the observability stack for the MaaS Platform, including metrics collection, monitoring, and visualization.
Important
User Workload Monitoring must be enabled in order to collect metrics.
Add enableUserWorkload: true to the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace.
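As a minimal sketch of the setting above, the ConfigMap could look like the following. It assumes no cluster-monitoring-config ConfigMap exists yet; if one is already present, merge the enableUserWorkload key into its existing config.yaml rather than replacing it.

```yaml
# Enables User Workload Monitoring on OpenShift.
# Assumption: the ConfigMap does not already carry other settings
# under config.yaml; merge this key in if it does.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```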
Overview
As part of the Dev Preview, the MaaS Platform includes a basic observability stack that provides insights into system performance, usage patterns, and operational health. The observability stack consists of:
- Limitador: Rate limiting service that exposes metrics
- Prometheus: Metrics collection and storage
- Grafana: Metrics visualization and dashboards
- Future: Migration to Perses for enhanced dashboard management
Note
The observability stack will be enhanced in the future.
Metrics Collection
Limitador Metrics
Limitador exposes several key metrics, which Prometheus collects through a ServiceMonitor:
Rate Limiting Metrics
- limitador_ratelimit_requests_total: Total number of rate limit requests
- limitador_ratelimit_allowed_total: Number of requests allowed
- limitador_ratelimit_denied_total: Number of requests denied
- limitador_ratelimit_errors_total: Number of rate limiting errors
Performance Metrics
- limitador_ratelimit_duration_seconds: Duration of rate limit checks
- limitador_ratelimit_active_connections: Number of active connections
- limitador_ratelimit_cache_hits_total: Number of cache hits
- limitador_ratelimit_cache_misses_total: Number of cache misses
Tier-Based Metrics
- limitador_ratelimit_tier_requests_total: Requests per tier
- limitador_ratelimit_tier_allowed_total: Allowed requests per tier
- limitador_ratelimit_tier_denied_total: Denied requests per tier
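Because the counters above only ever increase, they are usually read as rates or ratios over a time window. The following is a sketch of a PrometheusRule that records two such ratios. The metric names are taken from the lists above as given; the rule name, namespace, group name, recorded series names, and the 5m window are all assumptions chosen for illustration.

```yaml
# Hypothetical recording rules built on the counters listed above.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: limitador-ratios        # hypothetical name
  namespace: monitoring         # assumed namespace
spec:
  groups:
    - name: limitador.rules     # hypothetical group name
      rules:
        # Fraction of rate limit requests that were denied over 5 minutes.
        - record: limitador:ratelimit_denied:ratio_rate5m
          expr: |
            sum(rate(limitador_ratelimit_denied_total[5m]))
            /
            sum(rate(limitador_ratelimit_requests_total[5m]))
        # Fraction of cache lookups that were hits over 5 minutes.
        - record: limitador:ratelimit_cache_hits:ratio_rate5m
          expr: |
            sum(rate(limitador_ratelimit_cache_hits_total[5m]))
            /
            (
              sum(rate(limitador_ratelimit_cache_hits_total[5m]))
              + sum(rate(limitador_ratelimit_cache_misses_total[5m]))
            )
```

Both expressions return no data while the denominator is zero, for example immediately after a restart when no requests have been counted yet.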
ServiceMonitor Configuration
For automatic discovery of services, use ServiceMonitor resources:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: limitador-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: limitador
  endpoints:
    - port: metrics
      interval: 10s
      path: /metrics
High Availability for MaaS Metrics
For production deployments where metric persistence across pod restarts and scaling events is critical, configure Limitador to use Redis as its storage backend.
Why High Availability Matters
By default, Limitador stores rate-limiting counters in memory, which means:
- All hit counts are lost when pods restart
- Metrics reset when pods are rescheduled or scaled down
- No persistence across cluster maintenance or updates
Setting Up Persistent Metrics
To enable persistent metric counts, refer to the detailed guide:
Configuring Redis storage for rate limiting
This Red Hat documentation provides:
- Step-by-step Redis configuration for OpenShift
- Secret management for Redis credentials
- Limitador custom resource updates
- Production-ready setup instructions
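For orientation, the Secret and Limitador custom resource changes that the guide covers generally take the following shape with the Kuadrant Limitador operator. This is only a sketch: the resource names, the namespace, and the Redis URL are placeholders, and the linked guide remains the authoritative source for a production setup.

```yaml
# Secret holding the Redis connection URL.
# Assumptions: the names, the namespace, and the URL are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: redisconfig             # hypothetical name
  namespace: kuadrant-system    # assumed namespace
type: Opaque
stringData:
  URL: redis://redis.example.svc.cluster.local:6379   # placeholder URL
---
# Limitador resource pointing its counter storage at the Secret above.
apiVersion: limitador.kuadrant.io/v1alpha1
kind: Limitador
metadata:
  name: limitador               # assumed name of the existing resource
  namespace: kuadrant-system    # assumed namespace
spec:
  storage:
    redis:
      configSecretRef:
        name: redisconfig
```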
For local development and testing, you can also use our Limitador Persistence guide, which includes a basic Redis setup script that works with any Kubernetes cluster.
Grafana Dashboards
MaaS Platform Overview Dashboard
We provide a basic dashboard for the MaaS Platform that gives a quick overview of the system. Its definition is available for import at the following link: maas-token-metrics-dashboard.json
For a more detailed description of the Grafana dashboard, see its README in the repository.