Cherry Studio AI Core (1): Architecture Shift to AI SDK and Plugins
5 Jan 2026
9 min read
This is part 1 of a series where I unpack the core LLM refactor I led in Cherry Studio. I am writing it the way I would explain it to another open-source developer who wants the real tradeoffs, not just the headline.
Key PRs I authored in this part:
- #7404 (AI SDK integration, plugin system, request/MCP changes, aiCore package)
1. Why the AI SDK pivot (and why we chose a big refactor)
The old architecture worked, but it hit a local maximum: every new provider or capability forced another fork in the pipeline. That tradeoff was fine early on, but the surface area kept expanding.
Concrete pain points I could no longer ignore:
- Provider divergence: each provider had its own edge cases, which multiplied in the middleware chain.
- Tool-call fragmentation: prompt-based tools, MCP tools, and provider-native tools needed consistent orchestration.
- Type drift: types had to be reconciled across multiple clients and adapters, which increased maintenance cost.
- Extension friction: adding new features (web search, tool approval, usage tracking) required touching multiple layers.
The AI SDK pivot solved multiple problems at once:
- A single model interface reduced provider-specific branching.
- Stronger, centralized typing simplified tool-call and streaming logic.
- The plugin engine became a stable surface for cross-cutting features.
Most importantly, this was not just “cleanup.” The old design could not handle the next wave of features without compounding complexity. The refactor was the smallest change that would actually bend that curve.
2. The new core layers
The new aiCore package is built around a simple, explicit layering model:
- Models: model creation and provider resolution
- Runtime: execution and user-facing APIs
- Plugins: lifecycle hooks and stream transforms
I wrote the structure down in the architecture doc so future me and contributors can follow the flow:
graph TD
subgraph "App"
UI[UI]
end
subgraph "aiCore"
Runtime[Runtime]
Models[Models]
Plugins[Plugins]
end
subgraph "AI SDK"
SDK[ai]
end
UI --> Runtime --> SDK
Runtime --> Plugins
Runtime --> Models

The full diagram is in packages/aiCore/AI_SDK_ARCHITECTURE.md and shows the runtime, models, plugins, middleware, and provider registry in a single flow.
3. The design center (the rule I kept repeating)
The center of this architecture is a simple rule: the runtime should be boring. Everything messy lives outside it.
That translates into three concrete decisions:
- Normalize provider quirks before the runtime sees them.
- Route all feature logic through plugins rather than branching inside executor code.
- Keep a single execution path so every request shares the same lifecycle.
4. Architecture walk-through (layer by layer)
This is the mental model I use when I explain the flow to someone new.
Models layer
The models layer does two things: it resolves model IDs (namespaced or plain) and it wraps the resulting model with middlewares when needed. That means the runtime always receives a LanguageModelV2 that is already normalized.
Runtime layer
The runtime layer is the execution boundary. I kept it small on purpose (streamText/generateText/streamObject/generateObject/generateImage), and pushed cross-cutting concerns into plugins so behavior stays predictable.
Plugin layer
Plugins are now the primary extension surface. Logging, tool use, web search transforms, and future agent extensions live here. The hook model makes ordering explicit and cuts down hidden coupling.
Provider system
Providers are centralized in aiCore. The registry and factory isolate provider-specific configuration, so the rest of the pipeline can operate on unified interfaces. This is what makes the runtime reusable and future provider additions less invasive.
5. Core modules in src/core (decisions, boundaries, and why the split works)
I split packages/aiCore/src/core/ into six modules on purpose. The goal was to make the flow obvious and to isolate “messy” concerns so they do not leak into runtime.
models/ (model resolution is not runtime logic)
Models are the compatibility layer. The key decision here: the runtime never resolves providers or model IDs. That logic lives in ModelResolver, which handles both namespaced and traditional model IDs, and only then wraps the model with middlewares.
if (modelId.includes(DEFAULT_SEPARATOR)) {
  model = this.resolveNamespacedModel(modelId)
} else {
  model = this.resolveTraditionalModel(finalProviderId, modelId)
}

if (middlewares && middlewares.length > 0) {
  model = wrapModelWithMiddlewares(model, middlewares)
}

File: packages/aiCore/src/core/models/ModelResolver.ts
Why this split works: adding a provider changes how models are built, not how the runtime behaves.
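For intuition, here is a tiny sketch of the two ID styles the resolver distinguishes. The ':' separator and the helper below are illustrative assumptions, not the real ModelResolver API:

// Sketch only: assumes ':' as the namespace separator.
const DEFAULT_SEPARATOR = ':'

function describeModelId(modelId: string, fallbackProviderId: string): string {
  if (modelId.includes(DEFAULT_SEPARATOR)) {
    // Namespaced: 'openai:gpt-4o' carries its own provider prefix.
    const sep = modelId.indexOf(DEFAULT_SEPARATOR)
    return `provider=${modelId.slice(0, sep)}, model=${modelId.slice(sep + 1)}`
  }
  // Traditional: 'gpt-4o' relies on the provider id passed alongside it.
  return `provider=${fallbackProviderId}, model=${modelId}`
}

describeModelId('openai:gpt-4o', 'openai') // namespaced path
describeModelId('gpt-4o', 'openai')        // traditional path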
runtime/ (Executor + PluginEngine, one execution path)
Runtime is where I keep the execution surface stable. It is basically two things:
- RuntimeExecutor: the user-facing API that exposes streamText/generateText/streamObject/generateObject/generateImage.
- PluginEngine: the internal pipeline that runs hooks, resolves the model, transforms params/results, and emits lifecycle events.
The key decision: every call goes through PluginEngine, and even model resolution runs as a hook inside it. The resolver still lives in models/ and is responsible for returning a LanguageModelV2; PluginEngine just wires it into the execution flow so all calls share the same ordering and context setup.
this.pluginEngine.usePlugins([
  this.createResolveModelPlugin(options?.middlewares),
  this.createConfigureContextPlugin()
])

File: packages/aiCore/src/core/runtime/executor.ts
PluginEngine itself makes the flow explicit:
await this.pluginManager.executeConfigureContext(context)
await this.pluginManager.executeParallel('onRequestStart', context)
const resolvedModel = await this.pluginManager.executeFirst('resolveModel', modelId, context)
const transformedParams = await this.pluginManager.executeSequential('transformParams', params, context)
const result = await executor(resolvedModel, transformedParams)
const transformedResult = await this.pluginManager.executeSequential('transformResult', result, context)
await this.pluginManager.executeParallel('onRequestEnd', context, transformedResult)

File: packages/aiCore/src/core/runtime/pluginEngine.ts
I also like to summarize the hook lifecycle as a simple sequence:
sequenceDiagram
participant Executor as RuntimeExecutor
participant Engine as PluginEngine
participant SDK as AI SDK
Executor->>Engine: configureContext
Engine-->>Engine: onRequestStart
Engine-->>Engine: resolveModel
Engine-->>Engine: transformParams
Engine->>SDK: execute(model, params)
SDK-->>Engine: result/stream
Engine-->>Engine: transformResult
Engine-->>Engine: onRequestEnd/onError
This is why runtime stays small but still powerful: it is a thin API layer on top of a deterministic plugin pipeline.
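To show why that holds, here is a self-contained sketch of the core idea, not the real classes: one ordered path where params are transformed, the model is called exactly once, and the result is transformed on the way back.

// Pedagogical sketch of a single execution path with ordered hooks.
type Hook<T> = (value: T) => T | Promise<T>

async function runPipeline<P, R>(
  params: P,
  transformParams: Hook<P>[],
  execute: (params: P) => Promise<R>,
  transformResult: Hook<R>[]
): Promise<R> {
  let p = params
  for (const hook of transformParams) p = await hook(p) // sequential, ordered
  let result = await execute(p)                         // the only call site
  for (const hook of transformResult) result = await hook(result)
  return result
}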
plugins/ (a real hook system, not a bag of helpers)
The plugin system is structured, not ad-hoc. I separated hooks into First / Sequential / Parallel / Stream stages, and enforce pre/normal/post ordering. This made it possible to add tool-use, logging, and web search as plugins without creating hard dependencies.
// pre -> normal -> post
private sortPlugins(plugins: AiPlugin[]): AiPlugin[] { /* ... */ }
async executeFirst(hookName, arg, context) {
  for (const plugin of this.plugins) {
    const hook = plugin[hookName]
    const result = await hook?.(arg, context)
    if (result != null) return result
  }
  return null
}

File: packages/aiCore/src/core/plugins/manager.ts
The runtime itself only knows how to execute; plugins decide how to transform and observe.
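As a concrete example, this is roughly what a small observability plugin could look like. The hook names come from the pipeline code above; the exact AiPlugin signatures here are assumptions:

// Sketch: a request-timing plugin shaped after the hook names PluginEngine calls.
type Ctx = { startedAt?: number; [key: string]: unknown }

const requestTimingPlugin = {
  name: 'request-timing',
  onRequestStart: async (context: Ctx) => {
    context.startedAt = Date.now()
  },
  transformParams: async (params: unknown, _context: Ctx) => {
    // Pass-through here; a real plugin could inject defaults or rewrite messages.
    return params
  },
  onRequestEnd: async (context: Ctx) => {
    if (context.startedAt != null) {
      console.log(`request finished in ${Date.now() - context.startedAt}ms`)
    }
  }
}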
providers/ (registry + aliases + provider-specific quirks)
Providers are centralized in a registry to keep the rest of the system clean. I also added alias support and provider-specific handling (OpenAI chat vs responses, Azure variants) at the registry layer so the rest of the pipeline does not need to know these details.
if (providerId === 'openai') {
  globalRegistryManagement.registerProvider(providerId, provider, aliases)
  const openaiChatProvider = customProvider({
    fallbackProvider: { ...provider, languageModel: (id) => provider.chat(id) }
  })
  globalRegistryManagement.registerProvider(`${providerId}-chat`, openaiChatProvider)
}

File: packages/aiCore/src/core/providers/registry.ts
middleware/ (SDK-level seam, not app-level behavior)
AI SDK already has a middleware concept. I kept it separate from plugins so SDK-level behavior stays distinct from app-level behavior. That prevents plugins from accidentally becoming “SDK middleware by another name.”
File: packages/aiCore/src/core/middleware/manager.ts
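For contrast, a minimal SDK-level middleware looks like this. It uses the AI SDK's wrapLanguageModel helper; the @ai-sdk/openai import and the logging are just for illustration, and a streaming call would additionally need wrapStream:

import { wrapLanguageModel } from 'ai'
import { openai } from '@ai-sdk/openai'

// SDK-level seam: the middleware sees raw model params, not app-level concepts.
const loggedModel = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: {
    wrapGenerate: async ({ doGenerate, params }) => {
      console.log('SDK call params:', JSON.stringify(params).slice(0, 200))
      return doGenerate()
    }
  }
})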
options/ (normalize messy provider options in one place)
Provider options are messy by nature. I isolated them and added deep-merge helpers so options can be composed without leaking into runtime. This is where provider-specific flags (like streaming metadata) are normalized.
export function mergeProviderOptions(...optionsMap: Partial<TypedProviderOptions>[]): TypedProviderOptions {
  return optionsMap.reduce((acc, options) => {
    // deep merge
  }, {} as TypedProviderOptions)
}

File: packages/aiCore/src/core/options/factory.ts
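The elided body is just a recursive object merge. A minimal sketch of the idea (not the real helper) could look like this, with later values overriding earlier ones and nested objects merged key by key:

// Sketch of the kind of deep merge the elided body performs.
function deepMergeOptions(
  target: Record<string, unknown>,
  source: Record<string, unknown>
): Record<string, unknown> {
  const result: Record<string, unknown> = { ...target }
  for (const [key, value] of Object.entries(source)) {
    const existing = result[key]
    if (
      value && typeof value === 'object' && !Array.isArray(value) &&
      existing && typeof existing === 'object' && !Array.isArray(existing)
    ) {
      // Both sides are plain objects: merge recursively.
      result[key] = deepMergeOptions(
        existing as Record<string, unknown>,
        value as Record<string, unknown>
      )
    } else {
      // Otherwise the later value wins.
      result[key] = value
    }
  }
  return result
}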
How I think about the dependency graph
graph TD
Options --> Providers
Providers --> Models
Middleware --> Models
Models --> Runtime
Plugins --> Runtime
Tradeoffs I accepted
- More files and folders up front.
- A clean but strict boundary between middleware and plugins.
The payoff is worth it: each module can evolve without turning runtime into a ball of conditionals.
6. Plugin-first extensibility
I refactored the previous middleware chain into a plugin engine with clear lifecycle hooks. That makes cross-cutting concerns (logging, tool use, web search transforms) easier to build, test, and reorder.
One practical example is the prompt-based tool use plugin for models without native function calling. The plugin builds a structured system prompt and parses tool-use tags from the model output:
const TOOL_USE_TAG_CONFIG: TagConfig = {
  openingTag: '<tool_use>',
  closingTag: '</tool_use>',
  separator: '\n'
}

function defaultBuildSystemPrompt(userSystemPrompt: string, tools: ToolSet): string {
  const availableTools = buildAvailableTools(tools)
  if (availableTools === null) return userSystemPrompt

  return DEFAULT_SYSTEM_PROMPT
    .replace('{{ TOOL_USE_EXAMPLES }}', DEFAULT_TOOL_USE_EXAMPLES)
    .replace('{{ AVAILABLE_TOOLS }}', availableTools)
    .replace('{{ USER_SYSTEM_PROMPT }}', userSystemPrompt || '')
}

File: packages/aiCore/src/core/plugins/built-in/toolUsePlugin/promptToolUsePlugin.ts
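The other half of the plugin is pulling tool calls back out of the model's text. The real implementation works incrementally on the stream; this is a simplified, non-streaming sketch that assumes each tag wraps a JSON payload:

// Companion sketch: extract tool calls delimited by the <tool_use> tags above.
function parseToolUses(text: string): { name: string; arguments: unknown }[] {
  const pattern = /<tool_use>([\s\S]*?)<\/tool_use>/g
  const calls: { name: string; arguments: unknown }[] = []
  for (const match of text.matchAll(pattern)) {
    try {
      calls.push(JSON.parse(match[1].trim()))
    } catch {
      // Ignore malformed blocks; a real implementation would surface an error.
    }
  }
  return calls
}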
7. aiCore as a standalone package
This refactor did more than move code around. I split the AI SDK integration into packages/aiCore, which is now a reusable package with its own docs, examples, and tests. It makes the LLM runtime reusable outside the main app and keeps UI and execution concerns separate.
The package also keeps the door open for future agent workflows (the architecture doc explicitly reserves extension points).
8. What changed for the request/MCP pipeline
I updated the request and MCP invocation paths so streaming events, tool use, and result handling are standardized across providers. Parts 2 and 3 go deeper, but the key milestone in #7404 is: there is now a single, SDK-backed core to plug into.
9. Tradeoffs and why the new design wins
No refactor is free. This shift introduced a new package boundary and required a transition period where old and new paths coexisted. That cost was real.
But the benefits are structural:
- One consistent pipeline for tool calls and streaming events.
- A single provider integration surface with fewer branching paths.
- Clear layers that make feature work predictable.
That is why the refactor was worth it: it turns a scaling problem into a composable system.
Takeaways
- The AI SDK integration is a structural change, not just a provider swap.
- Clear layers (models/runtime/plugins) replaced implicit middleware coupling.
- aiCore became a reusable package, which makes the architecture easier to reason about and extend.
Next up: tool-call reliability, MCP approval, and provider options.