The onyx/ directory is the small repo-side surface the agent uses to understand, measure, and resume a research loop. You usually do not create it by hand. When you prompt an agent with /onyx ..., the agent creates the initial files, commits them, and updates them as it learns.

Files

File             Purpose
onyx/onyx.md     The research brief, memory, and steering file.
onyx/eval.sh     The required measurement entry point.
onyx/checks.sh   Optional correctness backpressure.
For scoped projects, the directory lives under projectPath.

onyx.md Steers the Agent

onyx.md is the most important file for humans to edit. It is durable context that survives chat resets, agent handoffs, and future resumes. Use it to tell the agent:
  • what objective matters;
  • which metric to optimize;
  • which tradeoffs to watch;
  • what files it may edit;
  • what APIs, files, or behaviors are off limits;
  • what constraints must hold;
  • what approaches have already failed or looked promising.
When the loop starts drifting, update onyx.md and ask the agent to continue.

eval.sh Measures Progress

eval.sh is the benchmark entry point. It should be fast, repeatable, and informative. It must print one line per metric, in the form METRIC name=value:
METRIC tracking_error=0.18
METRIC overshoot_percent=3.2
The agent will use these lines to log experiments and decide what to build from next.
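
As a minimal sketch of what such a script could look like, the example below prints the two metric lines shown above. The helper names emit_metric and run_benchmark are hypothetical, and run_benchmark returns fixed placeholder values where a real project would call its own benchmark command.

```shell
#!/bin/sh
# Sketch of onyx/eval.sh. Helper names and values are illustrative assumptions,
# not something onyx prescribes.
set -eu

# Print one metric in the "METRIC name=value" form.
emit_metric() {
  printf 'METRIC %s=%s\n' "$1" "$2"
}

# Hypothetical stand-in for the real benchmark. Replace it with a command
# that prints "<tracking_error> <overshoot_percent>" on one line.
run_benchmark() {
  echo "0.18 3.2"
}

result=$(run_benchmark)
tracking_error=${result%% *}      # text before the first space
overshoot_percent=${result##* }   # text after the last space

emit_metric tracking_error "$tracking_error"
emit_metric overshoot_percent "$overshoot_percent"
```

Routing every metric through one helper keeps the output format consistent as more metrics are added.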

checks.sh Guards Correctness

checks.sh is optional. Use it for tests, typechecks, lint, or safety checks that should run after a passing eval. Checks do not affect the primary metric timing. If they fail, the run is recorded as checks_failed.
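
A checks.sh could be sketched roughly as follows. This assumes that a failed check is signalled by a non-zero exit status, which this page does not state explicitly. The helper names run_check and run_all_checks are hypothetical, and the `true` commands are placeholders for a project's real test, typecheck, or lint commands.

```shell
#!/bin/sh
# Sketch of onyx/checks.sh. Assumption: a non-zero exit status is what marks
# the run as checks_failed. Helper names are illustrative.
set -u

# Run one named check and report the result. Returns 1 if the command fails.
run_check() {
  name=$1
  shift
  if "$@"; then
    echo "ok: $name"
  else
    echo "FAILED: $name" >&2
    return 1
  fi
}

# Run every check even if an earlier one fails, so all failures are reported.
run_all_checks() {
  failed=0
  # Placeholder commands; substitute your own test, typecheck, and lint steps.
  run_check "unit tests" true || failed=1
  run_check "lint" true || failed=1
  return "$failed"
}

# As the last command, its return value becomes the script's exit status.
run_all_checks
```

Running all checks before reporting, rather than stopping at the first failure, gives the agent a complete picture of what broke in a single run.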

Protected Scripts

onyx/eval.sh and onyx/checks.sh define the comparison. Agents can improve them between runs, but should not modify them during a measured run. If you edit them manually, commit the change before asking the agent to continue so later experiments remain understandable.