// Design, run, and score prompt evaluations with variance-aware benchmarks and regression tracking.

Prompt Eval Kit (Verified Creator)

repo --stat
  • stars

    2.2k

  • forks

    50

  • last update

    Apr 12, 2026

  • license

    MIT

  • version

    v0.9.4

quickstart.sh
3 steps
  1. Install

    // Drops SKILL.md into ~/.claude/skills/

    $ claude skills add prompt-eval-kit
  2. Invoke

    // Run from any project directory

    $ claude --skill prompt-eval-kit "help me ship this"
  3. Iterate

    // Re-run with edits; Claude keeps the skill loaded

    $ claude --skill prompt-eval-kit "now refactor it"
prompt-eval-kit/
references
SKILL.md
readonly
name: Prompt Eval Kit
slug: prompt-eval-kit
version: v0.9.4
license: MIT
author: @evalforge
repository: github.com/evalforge/prompt-eval-kit
categories:
tags: #eval #benchmark #rubric #regression
description: Design, run, and score prompt evaluations with variance-aware benchmarks and regression tracking.

features.md
3 capabilities

// What you can do with it

  • Automates the tedious parts of the workflow.
  • Gives Claude the right context, tools, and guardrails.
  • Produces consistent, reviewable output every time.

README.md

prompt-eval-kit/README.md
5 sections
