
ContextChef (4): Offloader/VFS — Relocate Information, Don't Destroy It

Chinese version: ContextChef (4): Offloader/VFS — Relocate Information, Don't Destroy It

A tool returns 15,000 characters of webpack build logs. You push them into history as-is. Next turn, token count spikes, the model’s attention gets swamped by this wall of noise, and the actual error lines at the end get buried.

The intuitive response is truncation: once output passes a threshold, keep only the last N lines. That works, but truncation means information loss. At the moment you truncate, you don’t know whether line 3,000 contains a warning that will matter ten steps from now.

Design Angle: Relocate, Don’t Delete

Manus’s principle in Context Engineering for AI Agents is: compression strategies must be restorable. They treat the file system as the agent’s ultimate context store — content can be removed from the context window, but as long as the file path remains, the model can retrieve it at any time.

Anthropic’s Claude Code uses the same pattern, calling it just-in-time context: the agent maintains lightweight identifiers (file paths, URLs, queries) and only loads complete data when it’s actually needed, rather than pre-loading everything upfront.

Offloader’s design angle aligns with both: relocate information, don’t destroy it. When output exceeds the threshold, the complete content is written to local VFS, and the context entry is replaced with a truncated summary plus a context://vfs/<hash> URI. The information isn’t gone — it’s just stored somewhere else, with a pointer still in the context.
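A minimal sketch of that relocation step. Only the `context://vfs/<hash>` URI shape and the tail-preserving behavior come from the article; the character threshold, the in-memory store standing in for the VFS, and the hashing choice are illustrative assumptions:

```typescript
import { createHash } from "crypto";

const THRESHOLD = 4_000;               // assumed character threshold
const vfs = new Map<string, string>(); // stand-in for the local VFS

function offload(raw: string, tailLines = 20): string {
  if (raw.length <= THRESHOLD) return raw; // small outputs pass through untouched

  // Relocate: the full content goes to the VFS, keyed by a content hash.
  const hash = createHash("sha256").update(raw).digest("hex").slice(0, 12);
  vfs.set(hash, raw);

  // Replace: the context entry becomes a pointer plus an optional tail.
  const tail = tailLines > 0 ? raw.split("\n").slice(-tailLines).join("\n") : "";
  return [
    `[output offloaded: ${raw.length} chars -> context://vfs/${hash}]`,
    tail,
  ].join("\n");
}
```

Note that nothing is deleted: the summary line carries the URI, so the model can always decide to go back for the full content.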

The advantage of this angle is honesty toward the model. The model sees the URI, knows full content is available on demand, and won’t misinterpret a truncated output as the tool’s complete result. This is far less risky than silent truncation — you’re not deciding “this content doesn’t matter” on the model’s behalf; you’re saying “the full content is here, come get it when you need it.”

The URI Is the Core Design, Not an Implementation Detail

context://vfs/<hash> isn’t just a file path. It’s a protocol contract: any string starting with context://vfs/ signals to the model that it can retrieve the complete content using the read_vfs tool.

The benefit of this design is consistency: regardless of whether what was offloaded is terminal logs, file content, or an API response, the retrieval mechanism is the same. The model doesn’t need to distinguish between sources. Same on the developer side — register one read_vfs tool and all large output retrieval routes through it, with no per-content-type handling.
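The retrieval side of that contract can be sketched as follows. The `read_vfs` name and the `context://vfs/` prefix come from the article; the store lookup and error handling are illustrative assumptions:

```typescript
const VFS_PREFIX = "context://vfs/";
const vfs = new Map<string, string>(); // shared with the offload side

// One tool for every offloaded source: logs, files, API responses all
// route through the same prefix check and the same lookup.
function readVfs(uri: string): string {
  if (!uri.startsWith(VFS_PREFIX)) {
    throw new Error(`not a VFS URI: ${uri}`);
  }
  const hash = uri.slice(VFS_PREFIX.length);
  const content = vfs.get(hash);
  if (content === undefined) {
    throw new Error(`no VFS entry for ${uri}`);
  }
  return content;
}
```

The prefix check is what makes the URI a protocol rather than a path: anything that starts with `context://vfs/` is retrievable, and anything else is rejected loudly instead of silently returning garbage.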

The tailLines parameter reflects an awareness that different content types have different tail importance: terminal logs preserve the last 20 lines (errors are usually at the end); static documents use 0 (any line might be relevant — offload everything, let the model fetch what it needs). The default is tuned for the most common case — tool output — so you don’t need to specify it explicitly most of the time.

const safeLog = chef.offload(rawTerminalOutput);           // logs: keep last 20 lines
const safeDoc = chef.offload(largeDoc, { tailLines: 0 }); // docs: offload everything

Working with Janitor: Lossless First

The natural division between Offloader and Janitor is: Offloader does lossless relocation; Janitor does lossy summarization. When the token budget is exceeded, try the lossless option first — offload large tool results to VFS and see if the token count comes down without summarizing. If offloading isn’t enough, then fall back to summarization.

The reason for this order: summarization is a lossy operation. You never know which “seemed unimportant at the time” details it dropped. If you can solve it losslessly, don’t take the lossy path. Janitor’s onBudgetExceeded hook lets you implement this two-stage strategy explicitly, without manually coordinating the two modules from outside.
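The two-stage ordering can be sketched as a hook body. `onBudgetExceeded` is named in the article; the entry shape and the `countTokens`, `offloadEntry`, and `summarize` helpers are hypothetical stand-ins for whatever the host framework provides:

```typescript
interface Entry { text: string; offloadable: boolean }

function onBudgetExceeded(
  history: Entry[],
  budget: number,
  countTokens: (entries: Entry[]) => number,
  offloadEntry: (e: Entry) => Entry,   // lossless: relocate to VFS, keep a URI
  summarize: (entries: Entry[]) => Entry[], // lossy: last resort
): Entry[] {
  // Stage 1 (lossless): offload every large relocatable entry first.
  const relocated = history.map(e => (e.offloadable ? offloadEntry(e) : e));
  if (countTokens(relocated) <= budget) return relocated;

  // Stage 2 (lossy): only summarize if relocation alone wasn't enough.
  return summarize(relocated);
}
```

The key property is that stage 2 runs on the already-offloaded history, so even the summarizer sees URIs instead of raw walls of text, and the full content stays recoverable from the VFS.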


Next: Core Memory — why “let the LLM decide what to remember” is a dangerous design.