Editor’s Note: Claude Code is evolving from a code assistant into an orchestratable agent workspace.
The workflows described in this article provide core value by enabling Claude to move beyond merely "thinking then acting" within a single context window. Instead, it can dynamically generate an execution framework tailored to the task: breaking down tasks, assigning sub-agents, processing in parallel, cross-validating results, iterating in cycles, and even encouraging competition among different agents before synthesizing the final outcome.
This means that the use cases for Claude Code are clearly expanding beyond its original scope. It is not only useful for code migration, refactoring, test reproduction, and code reviews, but also for non-technical tasks such as in-depth research, fact-checking, resume screening, incident retrospectives, rule documentation, business plan evaluations, and naming brainstorming. Many complex tasks are fundamentally similar to programming: they require breaking down problems, isolating context, validating assumptions, managing large volumes of details, and making decisions among multiple potential paths.
Dynamic workflows address several common failure modes of large models on long tasks: "agentic laziness," where the model declares completion midway; "self-preferential bias," where it favors its own conclusions; and "goal drift," where the work gradually deviates from the original objective over multiple iterations. By assigning the work to multiple Claude instances with independent contexts, workflows turn a complex task from a "single-agent marathon" into a "multi-agent collaboration."
Of course, workflows are not a universal solution—they often consume more tokens and may not be suitable for every routine coding task. But they point to an important direction: the future competition among AI tools may not just be about how intelligent a single model is, but whether it can organize a reliable, reusable, and auditable execution process around complex goals.
The following is the original text:
Although the default Claude Code execution framework is built for programming, it is also suitable for many other types of tasks. Many tasks have been found to be structurally similar to programming tasks. However, to achieve optimal performance for certain specific task types—such as research, security analysis, agent team collaboration, or code review—we still need to build customized execution frameworks on top of Claude Code.
Workflows allow you to dynamically create execution frameworks that enable Claude to more natively solve the above problems and many other types of issues within Claude Code. You can also share and reuse these workflows with others.
In this article, I’ll share my initial experiences and insights from using workflows to help you unlock their full potential.
However, it should be noted that relevant best practices are still evolving. Dynamic workflows typically consume more tokens, so you should carefully consider when and how to use them.
Note: This article is also published on the Claude Blog.
Example prompts
Before diving into the technical details, I’d like to provide some example prompts to help you understand the possibilities of workflows:
This test fails approximately once every 50 runs. Set up a workflow to reproduce it, formulate hypotheses, and conduct adversarial testing across different worktrees. /goal Do not stop until one hypothesis is verified.
Use the workflow to review my last 50 sessions, identify recurring corrections I've made, and turn these repeated issues into CLAUDE.md rules.
Use the workflow to review the past six months of the #incidents channel on Slack and identify recurring root causes that haven't had tickets submitted.
Run my business plan through a workflow where different agents analyze it from the perspectives of investors, customers, and competitors.
There is a folder containing 80 resumes. Use the workflow to sort them according to the backend position requirements and review the top ten. Use the AskUserQuestion tool to ask me questions to help establish your evaluation criteria.
I need to name this CLI tool. Use a workflow to brainstorm a list of options, then select the top three using a tournament mechanism.
Use the workflow to rename our User model to Account everywhere.
Read my blog draft and use the workflow to verify every technical claim against the codebase. I don't want to publish anything incorrect.
How do dynamic workflows work?
A dynamic workflow executes a JavaScript file containing several specialized functions for generating and coordinating sub-agents.

A dynamic workflow can also use standard JavaScript built-ins such as JSON, Math, and Array to process data.
Notably, the dynamic workflow can determine which model a given agent uses and whether sub-agents run within their own worktree. This allows Claude to autonomously select the required level of intelligence and isolation based on task needs.
If a workflow is interrupted, such as by a user manually stopping it or the terminal exiting, the session can be resumed and the workflow will continue from where it was interrupted.
Why is a dynamic workflow necessary?
When you have the default Claude Code execution framework handle a task, it must simultaneously plan and execute within the same context window. While this approach works very well for many programming tasks, it can sometimes fail in long-running, highly parallel, or highly structured adversarial tasks.
The reason is that the longer Claude works on a complex task within a single context window, the more prone it becomes to certain failure modes:
Agentic laziness refers to Claude prematurely stopping before fully completing particularly complex, multi-part tasks, and declaring the task finished after only making partial progress. For example, it might declare work complete after addressing only 20 out of 50 items in a security review.
Self-preferential bias refers to Claude's tendency to favor its own results or findings, particularly when asked to verify or evaluate its own output against a set of criteria.
Goal drift refers to the gradual decline in Claude’s adherence to the original objective over multiple execution rounds, especially after context is compressed. Each summary results in information loss, and certain detailed requirements—such as edge cases or constraints like “do not do X”—may be lost.
Creating a workflow helps alleviate these issues by orchestrating multiple independent Claude instances, each with its own context window and focused on clearly defined, isolated tasks.
Dynamic workflows and static workflows
You may have previously created static workflows using the Claude Agent SDK or claude -p to coordinate multiple Claude Code instances.
However, because static workflows need to account for various edge cases, they are typically more generic. With the introduction of Claude Opus 4.8 and dynamic workflows, Claude is now intelligent enough to create a customized execution framework tailored to your specific use case.

Practical patterns when using dynamic workflows
You can directly have Claude create a dynamic workflow, or use the trigger word "ultracode" to ensure Claude Code generates the workflow.
However, if you can develop a mental model of how dynamic workflows operate, it becomes easier to determine when to use them and to guide Claude effectively through prompts.
When building workflows, Claude commonly uses and combines the following patterns:

Classify and execute: Use a classification agent to determine the task type, then route to the appropriate agent or action based on the type. A classifier can also be used at the end of the process to evaluate the output.
Fan-out and synthesis: Break a task into multiple smaller steps, with each step handled by a separate agent, then combine the results. This approach is especially suitable for tasks involving numerous small steps, or when each step requires a clean context window to avoid interference or cross-contamination. The synthesis step acts as a "barrier": it waits for all fan-out agents to complete, then merges their structured outputs into a single result.
Adversarial validation: For each generated agent, run a separate agent to perform adversarial validation on its output according to a set of evaluation criteria or guidelines.
Generate and filter: Generate a large number of ideas around a topic, then screen them using evaluation criteria or validation processes to remove duplicates and return only the tested, highest-quality ideas.
Tournament: Instead of splitting the work, let agents compete against each other. Spawn N agents, each attempting the same task with a different approach. Then have a judging prompt or model compare the agents’ outputs pairwise until a winner is selected.
Loop until completion: For tasks with an unknown amount of work, do not set a fixed number of rounds; instead, keep spawning agents until a stopping condition is met, such as no new findings appearing or no more errors occurring in the logs.
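As a rough illustration of the fan-out and synthesis pattern, here is a minimal sketch in plain JavaScript. The sub-agent call is a hypothetical stub named reviewFile; the functions the workflow runtime actually provides are not shown in this article, so the names, signatures, and return shapes below are all assumptions.

```javascript
// Hypothetical stand-in for a sub-agent call (assumed name and signature).
// A real agent would inspect the file in its own clean context window.
async function reviewFile(file) {
  return { file, findings: file.endsWith(".sql") ? ["unparameterized query"] : [] };
}

// Fan out one agent per file, then synthesize.
async function fanOutAndSynthesize(files) {
  // Promise.all acts as the barrier: it resolves only after every
  // fan-out agent has returned its structured output.
  const results = await Promise.all(files.map((file) => reviewFile(file)));

  // Synthesis: merge the structured outputs into a single result.
  return {
    filesReviewed: results.length,
    findings: results.flatMap((r) => r.findings.map((f) => `${r.file}: ${f}`)),
  };
}
```

Having each agent return structured data rather than free text is what keeps the merge step deterministic in this sketch.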
Use cases
You can think more creatively about when and how to have Claude Code create dynamic workflows. I've found that workflows can sometimes be even more useful in non-technical work.

Migration and Refactoring
Bun was rewritten from Zig to Rust using workflows. You can read Jarred’s post on X to learn more about the process.
The key is to break the task into a series of units that need to be addressed, such as call sites, failing tests, or modules. Launch a sub-agent in its own worktree for each unit to make the fix, then have another agent perform an adversarial review, and finally merge the results. You can explicitly instruct the agents to avoid resource-intensive commands, which maximizes parallelism without exhausting local machine resources.
In-depth research
We have released a deep research skill (/deep-research) in Claude Code that uses dynamic workflows. Specifically, it fans out to perform web searches, scrape sources, adversarially validate the relevant claims, and synthesize a citation-rich report.
However, this type of research is not limited to web searches. For example, you can also have Claude compile a status report from Slack context or investigate how a feature works by deeply exploring a codebase.
In-depth verification

On the other hand, if you have a report and want to verify every factual claim and source cited within it, you can create a workflow: first, have one agent identify all factual claims, then trigger a sub-agent for each claim to conduct a detailed verification. You can also include a verification agent to review the sub-agents responsible for sourcing, ensuring their sources meet sufficient quality standards.
Sorting

You may have a set of items you'd like to rank according to a qualitative metric, and you believe Claude Code excels at evaluating such metrics. For example, ranking support tickets by bug severity.
However, if you attempt to sort over a thousand items within a single prompt, quality declines and the context window cannot accommodate them. A better approach is to run a tournament: build a pipeline of pairwise comparison agents, since comparative judgments are typically more reliable than absolute scoring. Alternatively, first sort the items into buckets in parallel and then merge the results. Because each comparison is performed by an independent agent, a deterministic loop can maintain the overall tournament structure, and only the current standings need to remain in context.
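The tournament idea can be sketched as a single-elimination bracket. This is an illustration under stated assumptions, not the mechanism Claude generates: moreSevere is a hypothetical comparison agent, stubbed here with a numeric severity field so that the example runs.

```javascript
// Hypothetical comparison agent (assumed name). A real agent would read both
// tickets in a fresh context and return the more severe one.
async function moreSevere(a, b) {
  return a.severity >= b.severity ? a : b;
}

// Single-elimination tournament. The loop is deterministic, and only the
// current round's contenders are kept in memory.
async function runTournament(items) {
  let round = [...items];
  while (round.length > 1) {
    const next = [];
    for (let i = 0; i < round.length; i += 2) {
      // With an odd number of contenders, the last one gets a bye.
      next.push(i + 1 < round.length ? await moreSevere(round[i], round[i + 1]) : round[i]);
    }
    round = next;
  }
  return round[0]; // undefined when items is empty
}
```

This yields only the top item. For a full ranking, the same comparison agent could serve as the comparator in a merge step over pre-sorted buckets.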
Memory and Rule Compliance

If you have a set of specific rules that Claude frequently overlooks or fails to follow, even when those rules are listed in CLAUDE.md, you can create a workflow that lists these rules and assigns a dedicated verification agent to check each one individually—each rule corresponds to one verification agent. Creating a sub-agent with a "skeptic" persona to review whether the rules themselves are reasonable can also help reduce false positives.
Conversely, you can mine your recent conversations and code review comments for corrections you make repeatedly: have parallel agents cluster these issues, perform adversarial validation on each candidate rule to determine whether it would actually prevent a real error, and finally distill the rules that pass into CLAUDE.md.
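The mining step can be pictured with a small sketch. It assumes the corrections have already been extracted as short strings, and it uses plain text normalization as a crude stand-in for the semantic clustering that agents would perform. The function name and the isUseful callback (standing in for the adversarial validator) are hypothetical.

```javascript
// Group near-identical corrections and keep only the recurring ones.
// Normalization (case and whitespace) is a crude stand-in for clustering.
function candidateRules(corrections, { minCount = 2, isUseful = () => true } = {}) {
  const counts = new Map();
  for (const text of corrections) {
    const key = text.trim().toLowerCase().replace(/\s+/g, " ");
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count >= minCount) // recurring corrections only
    .map(([rule, count]) => ({ rule, count }))
    .filter((candidate) => isUseful(candidate)); // validator stand-in
}
```

Only candidates that both recur and pass the validator would be written to CLAUDE.md.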
Root cause investigation
The most effective way to debug is to formulate several independent hypotheses and test them one by one. However, if you use only a single context window, Claude may fall into self-preferential bias.
The workflow can structurally prevent this situation: it can initiate multiple agents to generate hypotheses based on non-overlapping evidence. For example, different agents can examine logs, files, and data separately. Each hypothesis can then be reviewed by a set of validators and refuters.
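A minimal sketch of the validator and refuter structure follows. The validate and refute callbacks stand in for independent agents and are passed as parameters; their names and the toy evidence set are assumptions made only so the example runs.

```javascript
// Each hypothesis is checked by a validator and a refuter that run
// independently. It survives only if it is supported and not refuted.
async function reviewHypotheses(hypotheses, validate, refute) {
  const verdicts = await Promise.all(
    hypotheses.map(async (hypothesis) => {
      const [supported, refuted] = await Promise.all([
        validate(hypothesis),
        refute(hypothesis),
      ]);
      return { hypothesis, supported, refuted };
    })
  );
  return verdicts.filter((v) => v.supported && !v.refuted).map((v) => v.hypothesis);
}

// Toy stand-ins: real agents would examine logs, files, and data separately.
const observed = new Set(["cache timeout in logs", "deploy at 14:02"]);
const toyValidate = async (h) => observed.has(h.predicts);
const toyRefute = async (h) => h.rulesOut !== undefined && observed.has(h.rulesOut);
```

Keeping the refuter separate from the validator means a hypothesis cannot pass simply because the agent that proposed it also graded it.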
This doesn't apply only to code. Workflows can also be used for sales analysis, such as "Why did sales drop in March?"; for data engineering, such as "Why did this pipeline fail?"; or for any post-mortem review.
Mass triage

Every team has support queues, bug reports, or other backlogs that cannot be fully handled by humans. A triage workflow can categorize each item, deduplicate it against already tracked issues, and take action—whether that means attempting a fix or escalating it to a human for handling.
For a triage workflow, a useful pattern is quarantine. That is, prohibit agents that read untrusted public content from performing high-privilege operations; high-privilege operations should be carried out by dedicated action agents.
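One way to picture the quarantine pattern is as a narrow, allow-listed handoff between two roles. The sketch below is an assumption-laden illustration: readerAgent, sanitizeProposal, and actionAgent are hypothetical names, and the reader is stubbed with a regular expression instead of a model call.

```javascript
// Actions that are allowed to cross from the quarantined side.
const ALLOWED_ACTIONS = new Set(["label", "escalate"]);

// Quarantined reader: sees untrusted text, holds no privileged tools, and
// may only return a small structured proposal. (Stubbed for illustration.)
function readerAgent(untrustedText) {
  const isCrash = /crash|panic/i.test(untrustedText);
  return { action: isCrash ? "escalate" : "label", label: isCrash ? "crash" : "question" };
}

// Gate between the two sides: anything outside the allow-list is rejected.
function sanitizeProposal(proposal) {
  if (!ALLOWED_ACTIONS.has(proposal.action)) {
    throw new Error(`blocked action: ${proposal.action}`);
  }
  return { action: proposal.action, label: String(proposal.label).slice(0, 40) };
}

// Privileged action agent: never receives the raw untrusted text.
function actionAgent(proposal) {
  return `${proposal.action}:${proposal.label}`;
}

function triage(untrustedText) {
  return actionAgent(sanitizeProposal(readerAgent(untrustedText)));
}
```

Even if the untrusted text contains injected instructions, the action agent only ever sees the allow-listed fields.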
You can pair triage workflows with /loop to enable Claude to continuously execute these tasks.
Explore and make informed judgments
Workflows are useful when you need to explore different paths to a solution, especially for design, naming, or other tasks involving aesthetic judgment, and when you can benefit from a set of evaluation criteria.
You can have Claude explore a wide range of solutions and provide the review agent with a set of criteria defining what constitutes a good solution. The task is complete when the review agent determines that the results meet these criteria. Different solutions can also be ranked or filtered using a tournament-style mechanism based on these evaluation standards.
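The explore-and-review loop described above might look roughly like this. The generate and review callbacks are hypothetical stand-ins for the exploring agent and the review agent, and the maxRounds safety cap is an addition of this sketch, not something the article specifies.

```javascript
// Keep generating candidates until the reviewer accepts one.
// Rejections and their reasons are fed back to the generator.
async function exploreUntilAccepted(generate, review, maxRounds = 10) {
  const rejected = [];
  for (let round = 1; round <= maxRounds; round++) {
    const candidate = await generate(rejected);
    const verdict = await review(candidate);
    if (verdict.ok) return { candidate, rounds: round };
    rejected.push({ candidate, reason: verdict.reason });
  }
  return { candidate: null, rounds: maxRounds }; // gave up at the safety cap
}

// Toy stand-ins: propose names in order; accept names of 3 to 8 characters.
const nameIdeas = ["x", "my-super-long-tool-name", "fetchr"];
const toyGenerate = async (rejected) => nameIdeas[rejected.length];
const toyReview = async (name) =>
  name.length >= 3 && name.length <= 8
    ? { ok: true }
    : { ok: false, reason: "length must be 3 to 8 characters" };
```

Passing the rejection reasons back into the generator is what lets later rounds steer toward the criteria instead of repeating earlier mistakes.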
Evaluations
You can run lightweight evals for specific tasks by launching an independent agent in a worktree, followed by a comparison agent that evaluates and scores the output based on predefined criteria. For example, you can assess and improve a skill you’ve created to determine whether it meets certain specific standards.
Model and Intelligent Routing
You can create a classification agent optimized for your specific task to determine which model to use. This approach is particularly useful when tasks involve extensive tool usage, and conducting research beforehand helps identify the most suitable model.
For example, for the task "Explain how the auth module works," the most suitable model depends on the number of files in the auth module and the structure of the codebase. The classification agent can first conduct this research and then route the task to Sonnet or Opus based on the anticipated complexity.
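A simplified sketch of this routing step follows. It assumes the classifier's research reduces to counting files under the module directory; the threshold of 20 is arbitrary, and "sonnet" and "opus" are used as plain labels, not as real API model identifiers. Both function names are hypothetical.

```javascript
// Research step: count how many files belong to the module.
function surveyModule(paths, moduleDir) {
  const files = paths.filter((p) => p.startsWith(moduleDir + "/"));
  return { fileCount: files.length };
}

// Routing step: larger modules go to the more capable model.
function routeTask(task, paths, moduleDir, threshold = 20) {
  const { fileCount } = surveyModule(paths, moduleDir);
  return { task, fileCount, model: fileCount > threshold ? "opus" : "sonnet" };
}
```

A real classification agent would likely weigh more signals, such as how the module is wired into the rest of the codebase, before choosing.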
When not to use dynamic workflows
Workflows are still new. While they can deliver far greater results than conventional methods in many use cases, not every task requires them, and they may significantly increase token consumption.
Use workflows for tasks that can extend the boundaries of Claude Code’s capabilities in new ways. For routine programming tasks, ask yourself first: Does this task truly require additional computational resources? For example, most traditional programming tasks don’t need a team of five reviewers.
Tips for building dynamic workflows
Prompt design
When writing prompts for dynamic workflows, the more detailed the information, the better the results—especially when using the specific techniques mentioned above.
Workflows are not only for large tasks—you can also prompt the model to use a "quick workflow." For example, you can create a rapid adversarial review process to verify a hypothesis.
Use in combination with /goal and /loop
When using repeatable workflows, such as triage, research, or verification workflows, pair them with /loop to run them at fixed intervals, and use /goal to set strict completion requirements.
Token Budget
You can set a clear token budget for dynamic workflows to limit the number of tokens consumed by tasks. You can specify a budget requirement in the prompt, such as “use 10k tokens,” which will set the upper limit to 10k tokens.
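To make the idea of a hard cap concrete, here is a purely illustrative sketch. How Claude Code actually interprets and enforces the budget is not described in this article; parseTokenBudget and makeBudget are hypothetical helpers showing one way a phrase such as "use 10k tokens" could map to a numeric limit.

```javascript
// Read a phrase like "use 10k tokens" as a number (assumed convention:
// k = thousand, m = million). Returns null if no budget is mentioned.
function parseTokenBudget(prompt) {
  const match = /(\d+(?:\.\d+)?)\s*([km])?\s*tokens/i.exec(prompt);
  if (!match) return null;
  const scale = { k: 1e3, m: 1e6 }[(match[2] || "").toLowerCase()] ?? 1;
  return Math.round(parseFloat(match[1]) * scale);
}

// A hard cap: work is started only if its estimated cost still fits.
function makeBudget(limit) {
  let used = 0;
  return {
    tryCharge(estimate) {
      if (used + estimate > limit) return false;
      used += estimate;
      return true;
    },
    remaining: () => limit - used,
  };
}
```

In a workflow, a check like tryCharge would sit in front of each sub-agent launch so that the loop stops fanning out once the cap is reached.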
Save and share dynamic workflows
Press 's' in the workflow menu to save workflows. You can submit them to ~/.claude/workflows or distribute them via skills.

To share them via a skill, place the JavaScript workflow files in the skill folder and reference them in SKILL.md. For greater flexibility, you can also prompt Claude to treat the workflows within the skill as templates rather than scripts that must be executed verbatim.

A whole new world
Workflows are a useful new way to extend Claude Code. We encourage you to view this as a starting point; there’s still much to explore about how to use them effectively. We’d love to hear what you discover.
Thariq Shihipar and Sid Bidasaria (@sidbid) are members of Anthropic’s engineering team working on Claude Code.
