Hermes Agent Guide · Independent · Practitioner-authored

AgentOps Forensics · Cost reference

What is the fixed-overhead token breakdown of a Hermes Agent?

In a public field report for Hermes v0.6.0, about 73 percent of per-run tokens were fixed overhead, paid on every call before the task: roughly 46% tool definitions and 27% system prompt, leaving 27% for task work (issue #4379). This is the component view: what each part is, what inflates it, and how to measure your own.

One field report, sourced · Measure your own split · Per-component levers

Definition

What counts as fixed overhead in a Hermes Agent run?

Scope: this is the token-component reference only, no dollar figures. For per-run cost in dollars, the bill-creep drivers, and the keep, migrate, or stop decision, see the Hermes Agent cost breakdown.

Fixed overhead is everything the model is sent before it does the task you asked for, on every call: the tool definitions and the system prompt, plus any standing context. It is fixed in the sense that it repeats run after run, largely independent of what each run actually does.

The opposite is task-specific work: the input and the reasoning for the one thing you asked. In the public field report, the fixed part was roughly three quarters of the run (46% + 27% = 73%), which is why a reliability problem that re-runs the agent shows up first as a cost problem.
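To make the re-run point concrete, here is a minimal arithmetic sketch. The 73 percent share is the field report's 46 + 27; the 10,000-token run size and the retry count are illustrative assumptions, not figures from the report.

```python
def overhead_tokens(tokens_per_run: int, fixed_share_pct: int, runs: int) -> int:
    """Tokens spent on fixed overhead across `runs` calls (integer arithmetic)."""
    return tokens_per_run * fixed_share_pct * runs // 100


# Assumption: a 10,000-token run. Only the 73% share comes from issue #4379.
first_try = overhead_tokens(10_000, 73, runs=1)
with_two_retries = overhead_tokens(10_000, 73, runs=3)

print(first_try)         # 7300
print(with_two_retries)  # 21900
```

Every retry pays the overhead again in full, so a flaky run multiplies the fixed part rather than the task part.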

What this page reports, and does not: the 46 / 27 / 27 split is a single field report (n=1) on one Hermes version (issue #4379), not a universal constant. Your split will differ by model, tool count, and prompt. Treat it as the shape to expect, then measure your own.

The split

How is the fixed overhead split by component?

Three buckets, one of them dominant. Tool definitions alone were nearly half of every run (issue #4379):

Per-run token split (#4379)

Tool definitions lead

The schemas of registered tools are the single largest line, ahead of the system prompt and the task itself. Attack that line and you attack most of the overhead.

Tool definitions: 46%. Re-sent on every call, whether the run uses the tool or not.

System prompt: 27%. Instructions and formatting, paid before the task starts.

Task-specific work: 27%. The only part that varies with what you actually asked.

By component

What makes each overhead component grow, and how do I measure it?

For each part: what it is, what inflates it, the figure to pull from your own logs, and the one lever that moves it.

i. Tool definitions · ~46% (largest single share)

The JSON schema of every tool you register: names, descriptions, and parameters. It is re-sent on every call, whether or not that run uses the tool.

Grows with: more tools registered, and more verbose schemas (long descriptions, many parameters).

Measure: tool-definition tokens divided by the number of tools the run actually called.

Lever: register only the tools a given run can use, not the whole catalogue; trim verbose schemas.
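The measure and the lever for tool definitions can be sketched as below. Everything here is hypothetical: the catalogue shape, the function names, and the four-characters-per-token estimate are assumptions for illustration, not the Hermes registration API or its tokenizer. Swap in your model's real tokenizer for actual numbers.

```python
import json


def estimate_tokens(text: str) -> int:
    """Rough proxy only: assumes about 4 characters per token."""
    return max(1, len(text) // 4)


def tool_definition_tokens(tools: list[dict]) -> int:
    """Estimated tokens for the serialized schemas sent with a call."""
    return sum(estimate_tokens(json.dumps(t, separators=(",", ":"))) for t in tools)


def tools_for_run(catalogue: list[dict], allowed: set[str]) -> list[dict]:
    """Lever: keep only the tools this run can actually use."""
    return [t for t in catalogue if t["name"] in allowed]


def tokens_per_called_tool(definition_tokens: int, tools_called: int) -> float:
    """Measure: definition tokens divided by tools the run actually called."""
    return definition_tokens / max(tools_called, 1)


# Hypothetical catalogue; real schemas are usually far more verbose.
CATALOGUE = [
    {"name": "search", "description": "Search the web.", "parameters": {"query": "string"}},
    {"name": "send_email", "description": "Send an email.", "parameters": {"to": "string", "body": "string"}},
    {"name": "read_file", "description": "Read a local file.", "parameters": {"path": "string"}},
]

full = tool_definition_tokens(CATALOGUE)
trimmed = tool_definition_tokens(tools_for_run(CATALOGUE, {"search"}))
print(full, trimmed, tokens_per_called_tool(full, tools_called=1))
```

A high tokens-per-called-tool figure suggests the run is carrying schemas it never touches, which is the case the lever targets.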

ii. System prompt · ~27%

Standing instructions: role, formatting rules, and any worked examples. Paid in full before the task starts.

Grows with: accumulated "just in case" instructions, embedded few-shot examples, long formatting rules.

Measure: system-prompt tokens as a share of the whole run.

Lever: cut what the model already knows; move rare instructions to the runs that need them.
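One way to move rare instructions to the runs that need them is to assemble the system prompt from a short base plus opt-in sections. This is a sketch under assumptions: the prompt text, the section keys, and `build_system_prompt` are all hypothetical and not part of Hermes.

```python
# Hypothetical prompt pieces, for illustration only.
BASE_PROMPT = "You are a support agent. Answer concisely."

OPTIONAL_SECTIONS = {
    "refund": "When handling refunds, quote the order ID and the refund amount.",
    "csv": "When returning tables, use CSV with a header row.",
}


def build_system_prompt(needed: set[str]) -> str:
    """Base prompt plus only the sections this run asks for."""
    extras = [OPTIONAL_SECTIONS[key] for key in sorted(needed) if key in OPTIONAL_SECTIONS]
    return "\n\n".join([BASE_PROMPT, *extras])


print(len(build_system_prompt(set())))              # base only
print(len(build_system_prompt({"refund", "csv"})))  # base plus both sections
```

Runs that never touch refunds or tables then pay only for the base prompt.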

iii. Task-specific work · ~27%

The actual input and the work you asked for. It is the only part that scales with the task, so it is not overhead in the strict sense.

Grows with: larger inputs and longer task context. This is the part you are paying to get.

Measure: what is left after tool definitions and the system prompt are accounted for.

Lever: none worth pulling. Shrinking this shrinks the work itself, so optimise the two above instead.

This is the component reference. For the full per-run cost model, the bill-creep drivers, and the keep-migrate-stop decision, see the Hermes Agent cost breakdown.

Questions

Hermes Agent fixed overhead: frequently asked

Which part of a Hermes Agent run is the biggest token cost?

Tool definitions. In a public field report for Hermes v0.6.0, the JSON schemas of registered tools were the largest single share of per-run tokens, near 46 percent of the run (issue #4379). They are re-sent on every call regardless of whether the run uses each tool, so they tax every run.

Do unused tools still cost tokens in a Hermes Agent?

Yes. Every tool you register has its schema re-sent on every call, used or not. That is why tool definitions dominate the fixed overhead: a large catalogue is paid for on every run, even when a run touches only one tool.

How do I measure my own fixed-overhead split?

Take one representative window of usage logs and bucket the tokens into three: tool definitions, the system prompt, and the task input. The first two are your fixed overhead, paid before any task work. The field report figures (roughly 46 / 27 / 27) are one deployment on one version; measure yours rather than assuming them.
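The bucketing can be sketched as a small function. The three token counts are inputs you would pull from your own usage logs; how you obtain them is outside this sketch, and the function name and return shape are assumptions for illustration.

```python
def overhead_split(tool_tokens: int, system_tokens: int, task_tokens: int) -> dict[str, float]:
    """Percentage share of each bucket, plus the combined fixed overhead."""
    total = tool_tokens + system_tokens + task_tokens
    if total <= 0:
        raise ValueError("no tokens to bucket")

    def pct(n: int) -> float:
        return round(100 * n / total, 1)

    return {
        "tool_definitions": pct(tool_tokens),
        "system_prompt": pct(system_tokens),
        "task_work": pct(task_tokens),
        "fixed_overhead": pct(tool_tokens + system_tokens),
    }


# Illustrative counts chosen to reproduce the 46 / 27 / 27 shape from issue #4379.
split = overhead_split(tool_tokens=4_600, system_tokens=2_700, task_tokens=2_700)
print(split["fixed_overhead"])  # 73.0
```

Run it over totals for the whole window rather than a single call, so one unusually large task does not skew the shares.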

Is task-specific work part of the fixed overhead?

No. Task-specific tokens are the only part that scales with what you asked, so they are not overhead in the strict sense. They are shown alongside tool definitions and the system prompt only to give the proportion: on the field report, the work you actually asked for was the minority of the run.

Who writes this

By Jed, building and operating production software since 2007 and running self-hosted Hermes workflows since February 2026. The breakdown and the levers are given away here; a paid Verdict applies them to your specific numbers, read from your logs.

Figures sourced from the public Hermes issue tracker (#4379). The 46 / 27 / 27 split is a single field report (n=1) on one version; measure your own.
