Evaluation & Testing

Skills for evaluating and testing AI agent skills

Plugins

assess-rfe

Assess RFEs against quality criteria using a structured rubric.

2 skills - v1.0.0

test-plan

End-to-end test planning workflow for RHOAI: generate test plans from strategies, create test cases, implement executable automation code, verify UI tests against live clusters via Playwright, publish to GitHub by opening a PR, resolve review feedback, and score quality with automated rubrics that use parallel sub-agent analysis.

17 skills - v1.0.0

quality-tooling

Quality tooling and automation for RHOAI component development. Includes automated repository analysis, build validation, and test pattern extraction.

5 skills - v1.0.0

agent-eval-harness

Generic agentic evaluation for skills and agents. Provides end-to-end skills to analyze, test, score, review, and iteratively improve agent skills, with MLflow support for experiment tracking, tracing, and reporting. Evaluation is schema-driven via eval.yaml and supports inline, LLM-based, and external judges.

7 skills - v0.1.0