Evaluation & Testing¶

Skills for evaluating and testing AI agent skills.

Plugins¶

assess-rfe ¶

Assess RFEs against quality criteria using a structured rubric.

2 skills - v1.0.0

test-plan ¶

End-to-end test planning workflow for RHOAI: generate test plans from strategies, create test cases, implement executable automation code, verify UI tests against live clusters via Playwright, publish to GitHub with PR creation, resolve review feedback, and score quality with automated rubrics using parallel sub-agent analysis.

17 skills - v1.0.0

quality-tooling ¶

Quality tooling and automation for RHOAI component development. Includes automated repository analysis, build validation, and test pattern extraction.

5 skills - v1.0.0

agent-eval-harness ¶

Generic agentic evaluation for skills and agents. Provides end-to-end skills to analyze, test, score, review, and iteratively improve agent skills with MLflow support for experiment tracking, tracing, and reporting. Schema-driven evaluation via eval.yaml with support for inline, LLM-based, and external judges.

7 skills - v0.1.0
