The onyx/ directory is the small repo-side surface the agent uses to understand, measure, and resume a research loop.
You usually do not create it by hand. When you prompt an agent with /onyx ..., the agent creates the initial files, commits them, and updates them as it learns.
Files
| File | Purpose |
|---|---|
| onyx/onyx.md | The research brief, memory, and steering file. |
| onyx/eval.sh | The required measurement entry point. |
| onyx/checks.sh | Optional correctness backpressure. |
projectPath.
onyx.md Steers the Agent
onyx.md is the most important file for humans to edit. It is durable context that survives chat resets, agent handoffs, and future resumes.
Use it to tell the agent:
- what objective matters;
- which metric to optimize;
- which tradeoffs to watch;
- what files it may edit;
- what APIs, files, or behaviors are off limits;
- what constraints must hold;
- what approaches have already failed or looked promising.
To steer a run, edit onyx.md and ask the agent to continue.
eval.sh Measures Progress
eval.sh is the benchmark entry point. It should be fast, repeatable, and informative.
It must print metric lines:
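The metric-line format itself is not reproduced in this section. As a minimal sketch, the script below assumes a hypothetical `METRIC name=value` line shape, a hypothetical metric name (`elapsed_s`), and a placeholder workload. Treat all three as illustrative and substitute whatever your onyx setup actually expects.

```shell
#!/bin/sh
# Hypothetical onyx/eval.sh sketch.
# ASSUMPTION: the "METRIC name=value" line shape is illustrative only;
# it is not confirmed as the format onyx parses.
set -eu

# Print one metric line in the assumed format.
emit_metric() {
  printf 'METRIC %s=%s\n' "$1" "$2"
}

# Placeholder workload; replace with the project's real benchmark command.
run_workload() {
  printf '3\n1\n2\n' | sort -n > /dev/null
}

# Time the workload in whole seconds and report it as one metric line.
main() {
  start=$(date +%s)
  run_workload
  end=$(date +%s)
  emit_metric elapsed_s "$((end - start))"
}

main
```

Whole-second timing is coarse. A real benchmark would likely need a finer clock or a longer workload, and a fixed input so repeated runs stay comparable.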
checks.sh Guards Correctness
checks.sh is optional. Use it for tests, typechecks, lint, or safety checks that should run after a passing eval.
Checks do not affect the primary metric timing. If they fail, the run is recorded as checks_failed.
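A checks script might look like the sketch below. It assumes that a nonzero exit status is how a failed check is signalled, which this section does not state, and the check commands are placeholders (`true`) standing in for a project's real test, typecheck, or lint commands.

```shell
#!/bin/sh
# Hypothetical onyx/checks.sh sketch.
# ASSUMPTION: a nonzero exit status is how a failed check is reported.
set -u

failures=0

# Run one named check; count it if the command exits nonzero.
run_check() {
  name=$1
  shift
  if "$@"; then
    echo "ok: $name"
  else
    echo "FAILED: $name" >&2
    failures=$((failures + 1))
  fi
}

# Run every check, then succeed only if none failed.
main() {
  # Placeholder commands; replace with real test, typecheck, or lint commands.
  run_check "tests (placeholder)" true
  run_check "lint (placeholder)" true
  [ "$failures" -eq 0 ]
}

main
```

Running every check before reporting, instead of stopping at the first failure, gives the agent the full list of problems in one pass.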
Protected Scripts
onyx/eval.sh and onyx/checks.sh define the comparison. Agents can improve them between runs, but should not modify them during a measured run.
If you edit them manually, commit the change before asking the agent to continue so later experiments remain understandable.
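One way to catch uncommitted manual edits before resuming is a small guard like the sketch below. It is not part of onyx. The function name `protected_dirty` and the commit message are hypothetical, and the guard relies only on the standard `git status --porcelain` output.

```shell
#!/bin/sh
# Sketch: warn when the protected scripts have uncommitted changes.
# ASSUMPTION: this guard is illustrative and not shipped with onyx.

# Succeeds when either protected script has staged, unstaged, or untracked changes.
protected_dirty() {
  [ -n "$(git status --porcelain -- onyx/eval.sh onyx/checks.sh 2>/dev/null)" ]
}

if protected_dirty; then
  echo "Commit the protected scripts before asking the agent to continue:" >&2
  echo "  git add onyx/eval.sh onyx/checks.sh" >&2
  echo "  git commit -m 'Update onyx scripts'" >&2
fi
```

Run it from the repository root. When it prints nothing, the scripts match the last commit.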