AgentOps Forensics · Cost reference

What is the fixed-overhead token breakdown of a Hermes Agent run?

This is the component-by-component reference for the token split of a Hermes Agent run: tool definitions about 46 percent, the system prompt about 27 percent, and task work about 27 percent. For each part it covers what inflates it and how to measure your own split from your logs. The figures come from a public field report on Hermes v0.6.0 (issue #4379), where fixed overhead (tool definitions plus the system prompt, paid on every call before the task) ran near 73 percent of per-run tokens.


✓ One field report, sourced ✓ Measure your own split ✓ Per-component levers

Definition

What counts as fixed overhead in a Hermes Agent run?

Scope: this is the token-component reference only, no dollar figures. For per-run cost in dollars, the bill-creep drivers, and the keep, migrate, or stop decision, see the Hermes Agent cost breakdown.

Fixed overhead is everything the model is sent before it does the task you asked for, on every call: the tool definitions and the system prompt, plus any standing context. It is fixed in the sense that it repeats run after run, largely independent of what each run actually does.

The opposite is task-specific work: the input and the reasoning for the one thing you asked. In the public field report, the fixed part was roughly three quarters of the run (about 73 percent), which is why a reliability problem that re-runs the agent shows up first as a cost problem.
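The link between re-runs and cost can be made concrete with a minimal sketch. It uses the field-report shares (46 percent plus 27 percent, n=1) and assumes, for illustration only, a run of 10,000 tokens in which each retry repeats the whole run. The run size, the retry count, and the function name are hypothetical, not figures from issue #4379.

```python
# Field-report shares from issue #4379 (n=1), as whole percentages.
TOOL_DEFINITIONS_PCT = 46
SYSTEM_PROMPT_PCT = 27
FIXED_OVERHEAD_PCT = TOOL_DEFINITIONS_PCT + SYSTEM_PROMPT_PCT  # 73


def overhead_tokens(tokens_per_run: int, runs: int) -> int:
    """Tokens spent on fixed overhead when the same task is run `runs` times."""
    return tokens_per_run * FIXED_OVERHEAD_PCT * runs // 100


# Hypothetical run size, chosen only to make the arithmetic easy to follow.
TOKENS_PER_RUN = 10_000

one_run = overhead_tokens(TOKENS_PER_RUN, runs=1)
three_runs = overhead_tokens(TOKENS_PER_RUN, runs=3)  # one attempt plus two retries

print(one_run, three_runs)
```

Under these assumptions one run spends 7,300 tokens on overhead and three runs spend 21,900, while the task itself has not changed.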

What this page reports, and does not: the 46 / 27 / 27 split is a single field report (n=1) on one Hermes version (issue #4379), not a universal constant. Your split will differ by model, tool count, and prompt. Treat it as the shape to expect, then measure your own.

The split

How is the fixed overhead split by component?

Three buckets, one of them dominant. In the field report, tool definitions alone were nearly half of the run (issue #4379):

Per-run token split (#4379)

Tool definitions lead

The schemas of registered tools are the single largest line, ahead of the system prompt and the task itself. Attack that line and you attack most of the overhead.


Tool definitions · 46%: re-sent on every call, whether the run uses the tool or not

System prompt · 27%: instructions and formatting, paid before the task starts

Task-specific work · 27%: the only part that varies with what you actually asked

Fixed-overhead components and the figure to pull from your own logs (field-report shares from issue #4379, n=1).
Component	Field-report share	Measure from your logs
Tool definitions	about 46 percent (largest)	tool-definition tokens divided by the number of tools the run actually called
System prompt	about 27 percent	system-prompt tokens as a share of the whole run
Task-specific work	about 27 percent	what remains after tool definitions and the system prompt are accounted for
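The three figures in the table can be computed from per-run token counts along these lines. This is a sketch under stated assumptions: the `RunTokens` fields are hypothetical names, not a Hermes log schema, and it presumes you can already attribute tokens to tool definitions and the system prompt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunTokens:
    """Token counts for one run. Field names are hypothetical, not a Hermes log format."""
    tool_definition_tokens: int
    system_prompt_tokens: int
    total_tokens: int
    tools_called: int


def measure(run: RunTokens) -> dict:
    """Return the three table figures for a single run."""
    if run.total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    # Task-specific work is what remains after the two fixed components.
    task_tokens = (
        run.total_tokens - run.tool_definition_tokens - run.system_prompt_tokens
    )
    # The per-called-tool figure is undefined when the run called no tools.
    per_called_tool: Optional[float] = (
        run.tool_definition_tokens / run.tools_called if run.tools_called else None
    )
    return {
        "tool_tokens_per_called_tool": per_called_tool,
        "system_prompt_share": run.system_prompt_tokens / run.total_tokens,
        "task_share": task_tokens / run.total_tokens,
    }


# Illustrative counts that mirror the field-report proportions.
example = measure(RunTokens(4_600, 2_700, 10_000, tools_called=2))
```

A run that calls no tools still pays for every registered schema, which is why the first figure is reported as `None` here rather than as zero.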

Vendor pricing documents this overhead. Anthropic's tool-use pricing documentation confirms that input-token pricing includes the tools parameter: the names, descriptions, and schemas of every tool included in the request. In traces from this site's own self-hosted Hermes runs, Hermes sent the full schema of every registered tool on every call. That is a qualitative read from one deployment's logs, not a benchmark.

By component

What makes each overhead component grow, and how do I measure it?

For each part: what it is, what inflates it, the figure to pull from your own logs, and the one lever that moves it.

Tool definitions · ~46% (largest single share)

The JSON schema of every tool you register: names, descriptions, and parameters. It is re-sent on every call, whether or not that run uses the tool.

Grows with: more tools registered, and more verbose schemas (long descriptions, many parameters).

Measure: tool-definition tokens divided by the number of tools the run actually called.

Lever: register only the tools a given run can use, not the whole catalogue; trim verbose schemas.
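A minimal sketch of that lever, assuming tool definitions are plain dictionaries with a name, a description, and parameters. The catalogue, the tool names, and both helper functions are hypothetical, and this is not how Hermes itself registers tools. Serialized character length stands in as a rough size proxy only; it is not a token count.

```python
import json


def tools_for_run(catalogue: list[dict], allowed: set[str]) -> list[dict]:
    """Keep only the tool definitions this run is allowed to use."""
    return [tool for tool in catalogue if tool["name"] in allowed]


def schema_chars(tools: list[dict]) -> int:
    """Serialized size in characters: a rough proxy for cost, not a token count."""
    return len(json.dumps(tools, separators=(",", ":")))


# Hypothetical catalogue, for illustration only.
CATALOGUE = [
    {
        "name": "search_docs",
        "description": "Search the documentation.",
        "parameters": {"type": "object", "properties": {"query": {"type": "string"}}},
    },
    {
        "name": "send_email",
        "description": "Send an email to a recipient.",
        "parameters": {
            "type": "object",
            "properties": {"to": {"type": "string"}, "body": {"type": "string"}},
        },
    },
]

selected = tools_for_run(CATALOGUE, allowed={"search_docs"})
print(schema_chars(CATALOGUE), schema_chars(selected))
```

Comparing the two sizes before and after filtering gives a quick sense of how much of the catalogue a given run was carrying unused.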

System prompt · ~27%

Standing instructions: role, formatting rules, and any worked examples. Paid in full before the task starts.

Grows with: accumulated "just in case" instructions, embedded few-shot examples, long formatting rules.

Measure: system-prompt tokens as a share of the whole run.

Lever: cut what the model already knows; move rare instructions to the runs that need them.
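One way to move rare instructions to the runs that need them is to assemble the system prompt per run, as in this sketch. The base prompt, the section names, and `build_system_prompt` are all hypothetical; how a given Hermes deployment composes its prompt is not documented in the field report.

```python
# Hypothetical base prompt: the instructions every run needs.
BASE_PROMPT = "You are a support agent. Answer concisely."

# Hypothetical rare instructions, keyed by the kind of run that needs them.
OPTIONAL_SECTIONS = {
    "refund": "For refunds, quote the order ID and the refund window.",
    "export": "For exports, return CSV with a header row.",
}


def build_system_prompt(run_kinds: set[str]) -> str:
    """Return the base prompt plus only the sections this run needs."""
    extras = [text for kind, text in OPTIONAL_SECTIONS.items() if kind in run_kinds]
    return "\n\n".join([BASE_PROMPT, *extras])


plain_prompt = build_system_prompt(set())
refund_prompt = build_system_prompt({"refund"})
```

Runs that need no special handling pay only for the base prompt; the refund instructions are sent only on refund runs.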


Task-specific work · ~27%

The actual input and the work you asked for. It is the only part that scales with the task, so it is not strictly overhead.

Grows with: larger inputs and longer task context. This is the part you are paying to get.

Measure: what is left after tool definitions and the system prompt are accounted for.

Lever: none worth pulling, since shrinking this shrinks the work itself. Optimise tool definitions and the system prompt instead.

This is the component reference. For the full per-run cost model, the bill-creep drivers, and the keep-migrate-stop decision, see the Hermes Agent cost breakdown.

Questions

Hermes Agent fixed overhead: frequently asked

Which part of a Hermes Agent run is the biggest token cost?

Tool definitions. On a public field report for Hermes v0.6.0, the JSON schemas of registered tools were the largest single share of per-run tokens, near 46 percent of the run (issue #4379). They are re-sent on every call regardless of whether the run uses each tool, so they tax every run.

Do unused tools still cost tokens in a Hermes Agent?

Yes. Every tool you register has its schema re-sent on every call, used or not. That is why tool definitions dominate the fixed overhead: a large catalogue is paid for on every run, even when a run touches only one tool.

How do I measure my own fixed-overhead split?

Take one representative window of usage logs and bucket the tokens into three: tool definitions, the system prompt, and the task input. The first two are your fixed overhead, paid before any task work. The field report figures (roughly 46 / 27 / 27) are one deployment on one version; measure yours rather than assuming them.
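For a window of logs rather than a single run, a sketch like the following pools the tokens before dividing, so large runs weigh more than small ones. The record keys are hypothetical; map them to whatever your own logs expose.

```python
def window_split(records: list[dict]) -> dict:
    """Token-weighted split across a window of runs.

    Each record is assumed to carry three hypothetical keys:
    tool_definition_tokens, system_prompt_tokens, total_tokens.
    """
    tool = sum(r["tool_definition_tokens"] for r in records)
    system = sum(r["system_prompt_tokens"] for r in records)
    total = sum(r["total_tokens"] for r in records)
    if total == 0:
        raise ValueError("window contains no tokens")
    fixed = tool + system
    return {
        "tool_definitions": tool / total,
        "system_prompt": system / total,
        "task": (total - fixed) / total,
        "fixed_overhead": fixed / total,
    }


# Illustrative window of two runs with the same overhead and different task sizes.
window = [
    {"tool_definition_tokens": 4_000, "system_prompt_tokens": 2_000, "total_tokens": 8_000},
    {"tool_definition_tokens": 4_000, "system_prompt_tokens": 2_000, "total_tokens": 12_000},
]
split = window_split(window)
```

Pooling is a design choice: averaging per-run percentages instead would give a short run the same weight as a long one, which can misstate where the tokens actually went.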

Is task-specific work part of the fixed overhead?

No. Task-specific tokens are the only part that scales with what you asked, so they are not overhead in the strict sense. They are shown alongside tool definitions and the system prompt only to give the proportion: on the field report, the work you actually asked for was the minority of the run.

Who writes this

By Jed, building and operating production software since 2007 and running self-hosted Hermes workflows since February 2026. The breakdown and the levers are given away here; a paid Verdict applies them to your specific numbers, read from your logs.

Figures sourced from the public Hermes issue tracker (#4379). The 46 / 27 / 27 split is a single field report (n=1) on one version; measure your own.

Run the free self-check: ~5 min · no signup · in your browser