Troubleshooting
This guide helps you diagnose and resolve common issues with MaaS Platform deployments.
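As a first pass, the HTTP status code of a failing request can be mapped to the matching entry under Common Issues. The sketch below is illustrative only: `maas_hint` is a hypothetical helper name, not part of MaaS Platform, and the hint text simply restates the entries in this guide.

```shell
#!/usr/bin/env bash
# maas_hint is a hypothetical helper, not shipped with MaaS Platform.
# It maps an HTTP status code ($1) to the first checks suggested in this guide.
maas_hint() {
  case "$1" in
    501) echo "Traffic is not making it to the Gateway: verify Gateway status and HTTPRoute configuration" ;;
    401) echo "Authentication is not working: verify the maas-api-auth-policy (API key creation) or gateway-auth-policy (models endpoint) AuthPolicy is applied" ;;
    404) echo "The models endpoint is not working: verify the model-route HTTPRoute and the gateway set on the LLMInferenceService" ;;
    503) echo "Routes not accessible: verify maas-default-gateway is Programmed and check HTTPRoute status (a Dashboard 503 points at the monitoring stack instead)" ;;
    *)   echo "No entry for status $1 in this guide"; return 1 ;;
  esac
}

# Usage sketch (MAAS_URL and TOKEN are placeholders you must set yourself):
#   code="$(curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer ${TOKEN}" "${MAAS_URL}")"
#   maas_hint "${code}"
```

The helper only points at a starting entry; the detailed checks remain the ones listed under each issue.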
Common Issues
- Getting `501 Not Implemented` errors: Traffic is not making it to the Gateway.
  - Verify Gateway status and HTTPRoute configuration
- Getting `401 Unauthorized` errors when trying to create an API key: Authentication to maas-api is not working.
  - Verify the `maas-api-auth-policy` AuthPolicy is applied
  - Check if your cluster uses a custom token review audience:

        # Detect your cluster's audience
        AUD="$(kubectl create token default --duration=10m 2>/dev/null | \
          cut -d. -f2 | jq -Rr '@base64d | fromjson | .aud[0]' 2>/dev/null)"
        echo "Cluster audience: ${AUD}"

    If the audience is NOT `https://kubernetes.default.svc`, patch the AuthPolicy:
- Getting `401` errors when trying to get models: Authentication is not working for the models endpoint.
  - Create a new API key and use it in the Authorization header
  - Verify the `gateway-auth-policy` AuthPolicy is applied
  - Validate that the service account has `post` access to the `llminferenceservices` resource per MaaSAuthPolicy
    - Note: this should be automated by the ODH Controller
- Getting `404` errors when trying to get models: The models endpoint is not working.
  - Verify the `model-route` HTTPRoute exists and is applied
  - Verify the model is deployed and the `LLMInferenceService` has the `maas-default-gateway` gateway specified
  - Verify that the model is recognized by maas-api by checking the `maas-api/v1/models` endpoint (see Validation Guide - List Available Models)
- Rate limiting not working: Verify AuthPolicy and TokenRateLimitPolicy are applied
  - Verify the `gateway-rate-limits` RateLimitPolicy is applied
  - Verify TokenRateLimitPolicy is applied (e.g. gateway-default-deny or per-route policies)
  - If multiple TokenRateLimitPolicies target the same HTTPRoute, see Quota and Access Configuration
  - Verify the model is deployed and the `LLMInferenceService` has the `maas-default-gateway` gateway specified
  - Verify that the model is rate limited by checking the inference endpoint (see Validation Guide - Test Rate Limiting)
  - Verify that the model is token rate limited by checking the inference endpoint (see Validation Guide - Test Rate Limiting)
- Routes not accessible (503 errors): Check MaaS Default Gateway status and HTTPRoute configuration
  - Verify Gateway is in `Programmed` state: `kubectl get gateway -n openshift-ingress maas-default-gateway`
  - Check HTTPRoute configuration and status
- Metrics not appearing in dashboards: Prometheus is not scraping MaaS components.
  - Verify User Workload Monitoring is enabled (see Observability Prerequisites)
  - Verify Kuadrant observability is enabled (see Observability Prerequisites)
  - Check prometheus-user-workload pods are running:
  - Verify ServiceMonitors/PodMonitors exist:
- Rate limiting metrics missing (`authorized_calls`, `limited_calls`): Kuadrant observability is not enabled.
  - Enable observability on Kuadrant CR:

        kubectl patch kuadrant kuadrant -n kuadrant-system --type=merge \
          -p '{"spec":{"observability":{"enable":true}}}'

  - Verify the PodMonitor was created:
- RHOAI Dashboard Observability tab returns `503 Service Unavailable`: The Dashboard cannot reach the Perses backend.

  The error typically appears as `{"statusCode": 503, "code": "FST_REPLY_FROM_SERVICE_UNAVAILABLE", ...}`. This is a Fastify/Dashboard-level error (not a gateway 503) indicating the monitoring stack is not deployed or Perses is not running. The most common causes are missing operators (COO, OpenTelemetry) or DSCI `monitoring.metrics` not being configured.

  See RHOAI Dashboard Observability Tab for the full prerequisites and verification checklist.
- GenAI Studio tab not visible in Dashboard: Requires `llamastackoperator` set to `Managed` in the DSC and the `genAiStudio` feature flag enabled on `OdhDashboardConfig`.

  See OdhDashboardConfig Feature Flags for setup.
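For the token audience check under the `401 Unauthorized` item above, note that JWT segments are base64url-encoded without padding, which a plain base64 decoder does not always accept. A more defensive decode is sketched below. The function name `jwt_payload` is hypothetical, and the sketch assumes a `base64` binary that supports `-d` (GNU coreutils and recent macOS).

```shell
#!/usr/bin/env bash
# jwt_payload is a hypothetical helper, not part of MaaS Platform.
# Prints the decoded JSON payload (second segment) of the JWT given as $1.
jwt_payload() {
  local seg
  # Take the payload segment and map the base64url alphabet to standard base64.
  seg="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # base64url drops the trailing '=' padding; restore it before decoding.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Usage against a live cluster (requires kubectl and jq):
#   jwt_payload "$(kubectl create token default --duration=10m)" | jq -r '.aud[0]'
```

If the printed audience differs from `https://kubernetes.default.svc`, the AuthPolicy patch described in that item applies.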
Additional Resources
- Validation Guide — Manual validation steps
- Observability Guide — Metrics, monitoring, and dashboards
- scripts/README.md — Deployment scripts documentation