ContextChef (7): The Provider Adapter Layer — Let Differences Stop at Compile Time
11 Mar 2026
5 min read
You spent two weeks tuning your prompt on Claude: prefill that guides the model to think before answering, cache breakpoints that cut input token costs by 60%, tool schemas precision-tuned to minimize hallucination. Then you need to add an OpenAI fallback. You open a new file, start copying the prompt over, and manually delete the Anthropic-specific fields.
This isn’t a one-time inconvenience — it’s a structural problem. Every time you tune something, you need to sync the change across multiple files. Every time a provider updates their API, you need to track it in multiple places. Over time, three diverging prompt implementations are inevitable.
Design Angle: Provider Differences Stop at the Compile Layer
The adapter layer’s design angle is: all provider differences should be absorbed inside compile({ target }), and should not leak into business logic.
The value of this principle: your agent code has no if (provider === 'anthropic') branches anywhere. The layer above only interacts with ContextChef’s internal message format — it doesn’t know and doesn’t need to know what Anthropic’s cache_control field looks like, or what Gemini’s functionResponse structure is. That knowledge is encapsulated in the adapter, maintained in one place, updated in one place.
The direct benefit: switching providers requires changing exactly one parameter. compile({ target: "anthropic" }) becomes compile({ target: "openai" }), everything else stays the same. Functional degradation (like prefill becoming a soft constraint on OpenAI) is real — you know exactly what’s happening — but your code won’t throw errors or break because of the switch.
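A minimal sketch of what that one-parameter switch looks like in practice. ContextChef's real compile signature isn't shown in this article, so the stub below is illustrative only; it just records which adapter would be selected for a given target.

```typescript
// Illustrative stub, not ContextChef's actual implementation.
type Target = "anthropic" | "openai" | "gemini";

function compile(opts: { target: Target; messages: string[] }) {
  // In the real library, the adapter registered for opts.target runs here;
  // this stub only records which adapter would be chosen.
  return { adapter: opts.target, body: opts.messages };
}

const msgs = ["system prompt", "user turn"];
const anthropicPayload = compile({ target: "anthropic", messages: msgs });
const openaiPayload = compile({ target: "openai", messages: msgs }); // only the target changed
```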
Where the Differences Are, and How the Adapter Handles Them
On the surface, all three providers are “send a message array, receive a response.” But the implementation details differ precisely in the places where tuning has the most impact.
Caching is where the economic impact is largest. Manus in Context Engineering for AI Agents defined KV-cache hit rate as the single most important metric for a production agent — Anthropic’s Claude Sonnet caches at 1/10 the cost of uncached tokens, and for agents with input/output ratios of 100:1, a 10% improvement in cache hit rate translates into significant cost reduction. Anthropic requires explicit cache_control: { type: 'ephemeral' } markers on messages; OpenAI automatically caches prefixes with no marker needed; Gemini’s prompt caching is a completely separate CachedContent API, decoupled from the message format.
ContextChef’s approach: you mark your intent once with _cache_breakpoint: true, and the adapter translates — Anthropic converts it to cache_control, OpenAI and Gemini ignore it (each has its own mechanism). This is intent annotation, not a provider instruction. You’re expressing “this should be a cache boundary,” not “add a cache_control field to this message.” The latter breaks when you switch providers; the former doesn’t.
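The translation step can be sketched as follows. The Message shape here is an assumption (ContextChef's internal types aren't shown in this article); the Anthropic output shape matches Anthropic's documented cache_control format, and for OpenAI the marker is simply dropped because prefix caching is automatic.

```typescript
// Assumed internal message shape with the intent marker.
type Message = {
  role: "system" | "user" | "assistant";
  content: string;
  _cache_breakpoint?: boolean;
};

// Anthropic: turn the intent marker into an explicit cache_control field
// on a content block, per Anthropic's prompt-caching format.
function toAnthropicMessage(msg: Message) {
  const { _cache_breakpoint, ...rest } = msg;
  if (!_cache_breakpoint) return rest;
  return {
    ...rest,
    content: [
      { type: "text", text: msg.content, cache_control: { type: "ephemeral" } },
    ],
  };
}

// OpenAI: prefix caching is automatic, so the marker is dropped entirely.
function toOpenAIMessage(msg: Message) {
  const { _cache_breakpoint, ...rest } = msg;
  return rest;
}
```

The point of the sketch: the marker never reaches a provider API. Each adapter decides what, if anything, the intent means for its target.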
Prefill is another key difference. Anthropic natively supports appending an assistant message at the end of the message array as the model’s starting content, enabling precise output format control. OpenAI and Gemini don’t support this. The adapter’s response is graceful degradation, not failure: the prefill content is converted to a [System Note] system annotation; the model will try to comply, but with weaker constraint than native prefill. Your code doesn’t crash, but you know this is a soft constraint, and can decide at the application level whether to accept it.
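A sketch of the degradation path, assuming a trailing assistant message is the prefill convention (as it is in Anthropic's API). The [System Note] wording is taken from the article; everything else is illustrative.

```typescript
type Message = { role: "system" | "user" | "assistant"; content: string };

// Anthropic supports a trailing assistant message as native prefill,
// so the array passes through untouched.
function compileAnthropic(messages: Message[]): Message[] {
  return messages;
}

// OpenAI/Gemini: fold the trailing assistant prefill into a system
// annotation. The model is asked to start with that text, but this is
// a soft constraint, not a guarantee.
function degradePrefill(messages: Message[]): Message[] {
  const last = messages[messages.length - 1];
  if (last?.role !== "assistant") return messages;
  return [
    ...messages.slice(0, -1),
    {
      role: "system",
      content: `[System Note] Begin your response with: ${last.content}`,
    },
  ];
}
```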
Tool call history formats are completely different across all three providers. The adapter’s value is most direct here: your history uses ContextChef’s internal format for tool calls; when switching providers, the adapter automatically converts the entire history to the target format — no code changes needed on your end.
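To make the divergence concrete, here is a sketch of one tool call rendered for two targets. The internal ToolCall shape is an assumption; the output shapes follow the providers' documented formats (OpenAI's tool_calls entry with JSON-stringified arguments, Anthropic's tool_use block with a structured input object).

```typescript
// Assumed internal representation of a single tool call.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };

// OpenAI: arguments are a JSON string inside a "function" wrapper.
function toOpenAIToolCall(call: ToolCall) {
  return {
    id: call.id,
    type: "function",
    function: { name: call.name, arguments: JSON.stringify(call.args) },
  };
}

// Anthropic: a tool_use content block with a structured input object.
function toAnthropicToolCall(call: ToolCall) {
  return { type: "tool_use", id: call.id, name: call.name, input: call.args };
}
```

Same call, two incompatible shapes. The adapter replays this conversion over the whole history, which is why switching targets requires no changes in your code.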
Adapters Can Be Used Standalone
If you have your own message assembly logic and only need the format conversion layer, adapters can be imported independently:
```typescript
import { getAdapter } from "context-chef";

const payload = getAdapter("anthropic").compile(messages);
```

Adapters are stateless pure functions: input Message[], output the target provider's payload structure. They don't depend on any other ContextChef modules and can be dropped into any existing agent project.
Custom Adapters: A One-Method Interface
ContextChef ships with built-in adapters for OpenAI, Anthropic, and Gemini, but the adapter interface is intentionally minimal:
```typescript
export interface ITargetAdapter {
  compile(messages: Message[]): TargetPayload;
}
```

Implement this interface and you can support any target format — self-hosted open-source models (like Qwen or Mistral deployed via vLLM), internal inference gateways, or any provider with special field requirements beyond OpenAI-compatible APIs.
The typical scenario is an internal inference service: the company’s model gateway requires extra auth fields on top of the standard OpenAI format, or a field’s format isn’t fully compatible. You implement a thin adapter to translate ContextChef’s internal message format into your target structure. Everything else — history compression, tool pruning, state injection, memory management — continues to work unchanged.
The interface has one method. No lifecycle hooks, no base class to inherit from. The minimal implementation is a pure function that takes a Message[] and returns your target format, wrapped in a class.
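A sketch of the internal-gateway scenario described above. The gateway name, model id, and x_auth field are all hypothetical; the point is that the custom logic lives entirely inside one compile method.

```typescript
type Message = { role: string; content: string };
type TargetPayload = Record<string, unknown>;

interface ITargetAdapter {
  compile(messages: Message[]): TargetPayload;
}

// Hypothetical adapter for an internal, OpenAI-compatible gateway
// that demands an extra auth field on every request payload.
class InternalGatewayAdapter implements ITargetAdapter {
  private teamToken: string;

  constructor(teamToken: string) {
    this.teamToken = teamToken;
  }

  compile(messages: Message[]): TargetPayload {
    return {
      model: "internal-llm-v2",                // gateway-specific model id (assumed)
      messages,                                 // standard OpenAI-style message array
      x_auth: { team_token: this.teamToken },  // the extra field this gateway requires
    };
  }
}
```

Plug this in where a built-in adapter would go, and history compression, tool pruning, and the rest of the pipeline stay unaware that the target changed.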
That’s the ContextChef series. Each module solves one independent problem and can be adopted independently — you don’t have to take the whole thing. Pick up whichever module addresses the problem you’re facing right now.