One Agent’s Shadow Clone vs. An Ensemble of Experts — Multi-Agent and Multi-Session Agent Management Methodologies

When running a single agent, simply writing good prompts was enough.

However, the moment you have two agents, it becomes a distributed system.


What This Article Covers

  • The essential difference between Multi-Agent and Forked / Parallel Session Agents
  • 5 core orchestration patterns — Orchestrator-Worker, Swarm, Mesh, Hierarchical, Pipeline
  • Common pitfalls in practice — Handoff loops, token explosion, context collision
  • 5 pillars of management methodology — Context, State, Communication, Observability, Security

Why This Discussion Now

If 2024 was “the year AI agents emerged,” then 2025-2026 has become the year for operating agents. Projects that started with a single agent are now evolving into systems where 5, 10, or even 20 agents run concurrently.

At this juncture, we encounter the first hurdle.

“I know I need more agents, but should I use multiple agents, or should I increase the number of sessions for a single agent?”

These two approaches may look similar, but their operational philosophies are entirely different. The wrong choice can mean months spent fighting coordination overhead instead of building the product.


Essential Differences Between the Two Patterns

Multi-Agent System

This is a structure where multiple agents with different roles and system prompts collaborate. Anthropic’s Claude Code official documentation explains it as follows:

“Subagents work within a single session; agent teams coordinate across separate sessions.”

In other words, the core of multi-agent systems is specialization. A code reviewer, security auditor, document writer, and test writer each collaborate with different system prompts and tool permissions.

Parallel Sessions / Forked Subagents

This is a structure where agents with the same model, same permissions, and same context are distributed across multiple workspaces (worktree, branch, container) and operate simultaneously. The --worktree flag, available since Claude Code 2.1.50, is a prime example.

# Keep the main session, start a new session in a separate worktree
claude --worktree feature-auth

The core values are isolation and parallelism. It allows working on different features of the same codebase simultaneously without conflicts.

One-Line Summary

| Category | Multi-Agent | Multi-Session Agent |
| --- | --- | --- |
| Core Value | Specialization (Role Separation) | Isolation (Workspace Separation) |
| Agent Definition | N different prompts | 1 identical prompt |
| Communication Method | Handoff, Message Queue | Independent (Merge results only if needed) |
| Cost Pattern | Different models possible per role | N-fold linear increase |
| Typical Pitfall | Handoff Loop | Merge Conflict |

5 Orchestration Patterns

Production multi-agent systems ultimately boil down to one of these five patterns, or a combination thereof.

1. Orchestrator-Worker (Supervisor)

This is the most widely used pattern. A central orchestrator receives user requests, breaks them down into subtasks, delegates them to specialized worker agents, and then synthesizes the results.

  • Advantages: Easy to debug, clear traceability, simple output verification
  • Disadvantages: The orchestrator becomes a Single Point of Failure (SPOF)
  • Recommended timing: Almost all starting points. Microsoft’s official guide recommends “starting centrally and only distributing when clear bottlenecks are identified.”
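
The decompose, delegate, and synthesize flow can be sketched in a few lines of plain Python. This is only an illustration: the worker functions, the `plan` step, and the `WORKERS` registry are hypothetical stand-ins for LLM-backed agents, not part of any SDK.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical workers: each takes a subtask and returns a result string.
Worker = Callable[[str], str]


def review_code(task: str) -> str:
    return f"[review] {task}: ok"


def write_tests(task: str) -> str:
    return f"[tests] {task}: 3 cases"


WORKERS: Dict[str, Worker] = {"review": review_code, "tests": write_tests}


def plan(request: str) -> List[Tuple[str, str]]:
    """Stand-in for the orchestrator's planning step (normally an LLM call)."""
    return [("review", request), ("tests", request)]


def orchestrate(request: str) -> str:
    results = []
    for worker_name, subtask in plan(request):
        # Every delegation passes through this one loop, which keeps tracing simple.
        results.append(WORKERS[worker_name](subtask))
    # Synthesis step: here just a join; a real orchestrator would summarize.
    return "\n".join(results)


summary = orchestrate("login handler")
print(summary)
```

Because every call goes through `orchestrate`, logging or output checks can be added in one place. The flip side is the single point of failure noted above.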

2. Swarm (Handoff)

This pattern is adopted by the OpenAI Agents SDK. Agents explicitly hand off control to each other, passing the conversation context along. Instead of a central orchestrator, each agent decides for itself which agent runs next.
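
To make the control flow concrete, here is a minimal sketch in plain Python. It does not use the OpenAI Agents SDK; the `Conversation` class, the two agents, and `run_handoffs` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Conversation:
    messages: List[str] = field(default_factory=list)


# Each hypothetical agent appends to the shared context and names its
# successor, or returns None when the work is finished.
AgentFn = Callable[[Conversation], Optional[str]]


def triage(conv: Conversation) -> Optional[str]:
    conv.messages.append("triage: billing question detected")
    return "billing"


def billing(conv: Conversation) -> Optional[str]:
    conv.messages.append("billing: refund issued")
    return None


AGENTS: Dict[str, AgentFn] = {"triage": triage, "billing": billing}


def run_handoffs(start: str, conv: Conversation) -> List[str]:
    path: List[str] = []
    current: Optional[str] = start
    while current is not None:
        path.append(current)
        # No central planner: the running agent picks the next one.
        current = AGENTS[current](conv)
    return path


conv = Conversation(messages=["user: I was charged twice"])
path = run_handoffs("triage", conv)
print(path, len(conv.messages))
```

Note that this loop has no upper bound on the number of handoffs, which is exactly what makes handoff loops possible.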

3. Mesh

A fully distributed system where all agents can communicate with each other. Resilience is highest, but it’s prone to handoff loops (A → B → A), making guard condition design essential.

4. Hierarchical

A multi-layered structure where orchestrators exist beneath other orchestrators. Suitable for large-scale enterprise systems.
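
One way to picture this is as a tree in which inner nodes are orchestrators and leaves are workers. The sketch below assumes that shape; the `Node` class and the role names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)

    def run(self, task: str) -> List[str]:
        if not self.children:
            # A leaf is a worker that actually performs the task.
            return [f"{self.name} handled '{task}'"]
        results: List[str] = []
        for child in self.children:
            # An inner node is an orchestrator that only delegates downward.
            results.extend(child.run(task))
        return results


root = Node("lead", [
    Node("backend-lead", [Node("api-worker"), Node("db-worker")]),
    Node("frontend-lead", [Node("ui-worker")]),
])
results = root.run("add audit log")
print(results)
```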

5. Pipeline

A structure where data flows through sequential stages. The output of one agent becomes the input for the next. This is the same mindset seen in CI/CD.
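
As a rough sketch, a pipeline is just function composition over stages. The three stage functions below are hypothetical placeholders for agent calls.

```python
from functools import reduce
from typing import Callable, List

Stage = Callable[[str], str]


def draft(spec: str) -> str:
    return f"code for {spec}"


def review(code: str) -> str:
    return f"{code} (reviewed)"


def document(code: str) -> str:
    return f"{code} (documented)"


def run_pipeline(stages: List[Stage], initial: str) -> str:
    # Feed each stage's output into the next stage, in order.
    return reduce(lambda data, stage: stage(data), stages, initial)


result = run_pipeline([draft, review, document], "rate limiter")
print(result)
```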


In Practice: Multi-Agent Definition Example

Let’s look at how subagents are defined in the Claude Agent SDK.

# .claude/agents/security-reviewer.md
---
name: security-reviewer
description: Reviews code for security vulnerabilities, focusing on SQL injection, XSS, authentication bypass, and sensitive data exposure.
tools:
  - Read
  - Grep
  - Bash
model: claude-sonnet-4-5
---

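You are a senior security engineer.
Identify vulnerabilities in the given code based on the OWASP Top 10,
and report the following for each vulnerability:
1. Vulnerability type
2. Location (file:line)
3. Attack scenario
4. Recommended fix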

The key here is the description field. The orchestrator uses this description to decide whether to delegate, so if it’s written ambiguously, it might not be called or might be called for the wrong task.

Multi-Session Agent Execution Example

# Terminal 1 — Main Session
claude --worktree main

# Terminal 2 — Working on a different feature in the same repository
claude --worktree feature-payments

# Terminal 3 — Hotfix
claude --worktree hotfix-auth-bug

Each worktree has an independent directory and branch under .claude/worktrees/, and is isolated at the filesystem level.


Management Methodology Summarized in 5 Pillars

1️⃣ Context Management (Context Isolation)

The greatest value of subagents is to keep the main context clean. If the main agent directly handles a task like exploring 100 files, the context window will explode. Delegating to a subagent means all intermediate processes remain within the subagent’s isolated context, and only the final summary returns to the main agent.
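
The effect is easy to see in a toy model. In this sketch the "subagent" is an ordinary function with its own local scratch list, and `main_context` is a plain list of messages; both are hypothetical simplifications of real context windows.

```python
from typing import Dict, List


def explore_files(files: Dict[str, str], keyword: str) -> str:
    """Hypothetical subagent: reads every file, returns only a summary."""
    scratch: List[str] = []  # intermediate notes never leave this function
    hits: List[str] = []
    for path, text in files.items():
        scratch.append(f"read {path} ({len(text)} chars)")
        if keyword in text:
            hits.append(path)
    return f"{len(hits)} of {len(files)} files mention '{keyword}': {', '.join(sorted(hits))}"


main_context: List[str] = ["user: where do we handle auth?"]
repo = {f"src/mod_{i}.py": ("auth check" if i % 25 == 0 else "pass") for i in range(100)}

# The main context grows by one message, not by 100 file reads.
main_context.append(explore_files(repo, "auth"))
print(len(main_context))
```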

Forked Subagents are an exception. They start by inheriting the entire history of the main session, which means there’s no need to re-explain background information, but it comes with the trade-off of sacrificing input isolation.

2️⃣ State and Handoff Management

When handing off, you must explicitly define what to pass. OpenAI Agents SDK’s handoff patterns, LangGraph’s checkpoints, and MCP/A2A protocols are all tools to solve this problem.

A common mistake is sending vague handoff messages like “implement this.” It must always include three elements: scope, file references, and expected output.
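
One way to enforce those three elements is to make the handoff a typed object that refuses to be built without them. The `Handoff` class below is a hypothetical sketch, not the message format of any particular framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Handoff:
    scope: str
    file_refs: List[str]
    expected_output: str

    def __post_init__(self) -> None:
        # Reject handoffs where any of the three elements is empty.
        missing = [
            name
            for name in ("scope", "file_refs", "expected_output")
            if not getattr(self, name)
        ]
        if missing:
            raise ValueError(f"handoff is missing: {', '.join(missing)}")


good = Handoff(
    scope="Add rate limiting to POST /login",
    file_refs=["api/auth.py"],
    expected_output="A diff plus passing tests",
)

try:
    Handoff(scope="implement this", file_refs=[], expected_output="")
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

This only checks that the fields are present. Whether the scope is actually specific enough still needs human or model review.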

3️⃣ Observability & Cost

Multi-agent systems can use roughly 3-4 times as many tokens as single-agent systems, and token usage can differ by more than 200% from one orchestration pattern to another. Track the following metrics during operation:

  • Token usage and cost per agent
  • Response time per decision
  • Tool call chain tracing (LangSmith, Arize, OpenAI Tracing)
  • Alerts for abnormal patterns (loops, repeated API failures)

As a cost-saving tip, running the main session on Opus and the subagents on Sonnet or Haiku can significantly reduce costs without sacrificing quality.
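
A minimal sketch of per-agent token and cost tracking follows. The prices and the "large"/"small" model tiers are made-up placeholders, and `UsageTracker` is a hypothetical helper; in practice these numbers would come from your provider's usage data or a tracing tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical prices in USD per 1M tokens; substitute real rates.
PRICE_PER_MTOK: Dict[str, float] = {"large": 15.0, "small": 1.0}


@dataclass
class Usage:
    tokens: int = 0
    cost_usd: float = 0.0
    calls: int = 0


class UsageTracker:
    def __init__(self) -> None:
        self.by_agent: Dict[str, Usage] = defaultdict(Usage)

    def record(self, agent: str, model: str, tokens: int) -> None:
        usage = self.by_agent[agent]
        usage.tokens += tokens
        usage.cost_usd += tokens / 1_000_000 * PRICE_PER_MTOK[model]
        usage.calls += 1

    def over_budget(self, limit_usd: float) -> List[str]:
        # Agents whose accumulated cost exceeds the limit, for alerting.
        return sorted(a for a, u in self.by_agent.items() if u.cost_usd > limit_usd)


tracker = UsageTracker()
tracker.record("orchestrator", "large", 200_000)
tracker.record("security-reviewer", "small", 500_000)
flagged = tracker.over_budget(limit_usd=1.0)
print(flagged)
```

In this toy data the subagent consumes more tokens than the orchestrator but costs far less, which is the point of assigning cheaper models to subagents.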

4️⃣ Infrastructure

At an enterprise scale, agent operation becomes infrastructure operation.

  • Containerization: Docker isolation per agent
  • Orchestration: Kubernetes pod auto-scaling
  • GPU Scheduling: Separate node pools for inference-heavy agents
  • Message Bus: Kafka, RabbitMQ for inter-agent event delivery

5️⃣ Security & Permissions

Minimize tool permissions for each subagent. Give security auditor agents read-only access, and grant write permissions only to implementation agents. Also, pre-approve permission presets; otherwise you will be bombarded with permission prompts every time an agent is spawned.
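
The least-privilege idea can be sketched as a per-agent allowlist checked before every tool call. The `ALLOWED_TOOLS` table and `authorize` function are hypothetical; real agent frameworks enforce this in their own configuration rather than in user code like this.

```python
from typing import Dict, FrozenSet

# Hypothetical allowlists, using tool names in the style of the earlier example.
ALLOWED_TOOLS: Dict[str, FrozenSet[str]] = {
    "security-reviewer": frozenset({"Read", "Grep"}),  # read-only
    "implementer": frozenset({"Read", "Grep", "Edit", "Write"}),
}


def authorize(agent: str, tool: str) -> None:
    # Unknown agents get an empty allowlist, so the default is deny.
    allowed = ALLOWED_TOOLS.get(agent, frozenset())
    if tool not in allowed:
        raise PermissionError(f"{agent} may not call {tool}")


authorize("implementer", "Write")  # passes silently

try:
    authorize("security-reviewer", "Write")
    denied = False
except PermissionError as err:
    denied = True
    print(err)
```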


⚠️ Common Pitfalls in Practice

Pitfall 1. Starting with a Distributed System from the Outset

This is the most common mistake. The Mesh pattern might look cool, but it leads to debugging hell. Always start with Orchestrator-Worker. Most production teams don’t need distribution in the end.

Pitfall 2. Under-Parallelization

A common anti-pattern is running four independent analyses one after another. If the domains are independent, always parallelize.
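
As a sketch of the parallel version, the standard library's thread pool is enough when each analysis is an I/O-bound call. The `analyze` function and the four domain names are hypothetical; the sleep stands in for an agent request.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def analyze(domain: str) -> str:
    time.sleep(0.1)  # stands in for a slow, I/O-bound agent call
    return f"{domain}: done"


DOMAINS: List[str] = ["security", "performance", "style", "tests"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(DOMAINS)) as pool:
    # pool.map preserves input order even though the calls overlap in time.
    results = list(pool.map(analyze, DOMAINS))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Run sequentially, the four calls would take about 0.4 seconds in total; run concurrently, the wall-clock time should be close to that of a single call.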

Pitfall 3. Handoff Loops

A → B → A → B… an infinite loop. Guard conditions and maximum hop limits are essential.
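
A maximum hop limit is the simplest guard to add. In the sketch below, `MAX_HOPS`, `HandoffLoopError`, and `run_with_guard` are hypothetical names, and each agent is reduced to a function that returns the name of the next agent or None.

```python
from typing import Callable, Dict, List, Optional

MAX_HOPS = 6  # hypothetical limit; tune it to your longest legitimate chain


class HandoffLoopError(RuntimeError):
    """Raised when a chain of handoffs exceeds the hop limit."""


def run_with_guard(
    start: str,
    agents: Dict[str, Callable[[], Optional[str]]],
    max_hops: int = MAX_HOPS,
) -> List[str]:
    path: List[str] = []
    current: Optional[str] = start
    while current is not None:
        if len(path) >= max_hops:
            raise HandoffLoopError(f"exceeded {max_hops} hops: {' -> '.join(path)}")
        path.append(current)
        current = agents[current]()
    return path


finite = {"A": lambda: "B", "B": lambda: None}
ping_pong = {"A": lambda: "B", "B": lambda: "A"}

ok_path = run_with_guard("A", finite)

try:
    run_with_guard("A", ping_pong)
    looped = False
except HandoffLoopError as err:
    looped = True
    print(err)
```

The error message includes the path taken, so the alert shows which agents were bouncing the task between them.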

Pitfall 4. Merge Conflicts in Multi-Session Agents

If multiple sessions modify the same file simultaneously, the result is a merge conflict or partially applied changes. Isolation at the worktree or container level is absolutely necessary.
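
Worktrees keep the working directories apart, but branches that touch the same file still have to be merged later. One possible pre-flight check, sketched here under the assumption that each session declares the files it plans to modify, is to look for overlaps before starting. The `find_overlaps` helper and the file names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def find_overlaps(plans: Dict[str, Set[str]]) -> List[Tuple[str, str, Set[str]]]:
    """Return every pair of sessions that plans to modify the same file."""
    overlaps: List[Tuple[str, str, Set[str]]] = []
    for a, b in combinations(sorted(plans), 2):
        shared = plans[a] & plans[b]
        if shared:
            overlaps.append((a, b, shared))
    return overlaps


plans = {
    "feature-payments": {"billing/api.py", "shared/config.py"},
    "hotfix-auth-bug": {"auth/login.py", "shared/config.py"},
    "main": {"README.md"},
}
overlaps = find_overlaps(plans)
print(overlaps)
```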


✅ Summary — How to Decide

If you’re starting a new project, follow this decision tree:

  1. Does the task inherently require different specializations? → Multi-Agent
  2. Are you applying the same task simultaneously to multiple data/files/branches? → Multi-Session
  3. If both apply? → Separate roles with Multi-Agent + each agent spawns child sessions as needed
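
The three questions can be written down as a small function. The parameter names are hypothetical, and the final "single agent" fallback is an assumption for the case where neither condition holds.

```python
def choose_pattern(needs_specialization: bool, same_task_many_targets: bool) -> str:
    if needs_specialization and same_task_many_targets:
        # Case 3: separate roles, and let each agent spawn child sessions.
        return "multi-agent with child sessions"
    if needs_specialization:
        return "multi-agent"
    if same_task_many_targets:
        return "multi-session"
    # Assumption: with neither need, stay with one agent.
    return "single agent"


print(choose_pattern(needs_specialization=True, same_task_many_targets=False))
```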

And no matter which pattern you choose, don’t forget Microsoft’s guidance: start centrally and only distribute when clear bottlenecks are identified.

Operational capability in the age of agents is synonymous with distributed system design capability. If you have experience operating microservices, that intuition carries over directly. Familiar challenges have simply put on new clothes.

