The Planning Pattern: How AI Agents Break Down Complex Goals

Ask a junior developer to "build a user authentication system" and they'll start coding immediately. Ask a senior developer the same thing, and they'll first ask questions, sketch out an architecture, identify dependencies, and create a plan.

AI agents work the same way. Planning is what separates agents that flail from agents that succeed.

This guide shows you how to implement planning in your AI agents—from simple linear plans to adaptive, hierarchical planning systems.

What Is the Planning Pattern?

Planning is the process of decomposing a high-level goal into a sequence of actionable steps before execution begins:

text

┌─────────────────────────────────────────────────────────────┐
│                   "Build me a dashboard"                    │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                          PLANNING                           │
│                                                             │
│   1. Gather requirements                                    │
│   2. Design data schema                                     │
│   3. Create API endpoints                                   │
│   4. Build frontend components                              │
│   5. Integrate and test                                     │
│   6. Deploy                                                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                          EXECUTION                          │
│                                                             │
│   Step 1 → Step 2 → Step 3 → ... → Done                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Without planning:

Agent jumps straight into action
Often gets stuck or goes in circles
Misses critical steps
Can't estimate effort or progress

With planning:

Agent understands the full scope
Executes steps in logical order
Tracks progress toward goal
Can adapt when obstacles arise

Why Planning Matters

1. Complex Tasks Require Decomposition

LLMs have finite context windows, and they tend to drop or blur requirements when one prompt packs in too many. A single prompt for a complex task often fails because the model can't attend to every requirement at once.

Planning breaks the problem into chunks the model can handle:

python

# ❌ Single complex prompt - often fails
prompt = (
    "Create a complete e-commerce site with user auth, product catalog, "
    "shopping cart, checkout, payment integration, and admin panel"
)
 
# ✅ Planned approach - each step is manageable
plan = [
    "Set up project structure and database",
    "Implement user authentication",
    "Create product catalog with CRUD",
    "Build shopping cart functionality",
    "Add checkout flow",
    "Integrate payment provider",
    "Build admin dashboard"
]
 

2. Dependencies and Order Matter

Some tasks depend on others. Planning identifies these dependencies:

text

┌──────────────────┐
│ Create database  │
└────────┬─────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────┐
│ Auth  │ │Product│
│ API   │ │ API   │
└───┬───┘ └───┬───┘
    │         │
    └────┬────┘
         ▼
    ┌─────────┐
    │  Cart   │
    │  API    │
    └────┬────┘
         │
         ▼
    ┌─────────┐
    │Checkout │
    └─────────┘
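The dependency graph above can be turned into a valid execution order with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The step names mirror the diagram, while the `deps` mapping and the `execution_order` helper are illustrative assumptions, not part of the agents shown in this guide.

```python
from graphlib import CycleError, TopologicalSorter

# Map each step to the set of steps it depends on (names mirror the diagram).
deps = {
    "create_database": set(),
    "auth_api": {"create_database"},
    "product_api": {"create_database"},
    "cart_api": {"auth_api", "product_api"},
    "checkout": {"cart_api"},
}

def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every step follows its dependencies."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # A cycle means the plan can never be executed as written.
        raise ValueError(f"Plan has circular dependencies: {exc.args[1]}") from exc

order = execution_order(deps)
print(order)
```

Rejecting cyclic plans up front is cheaper than discovering mid-execution that no step can ever become ready.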

3. Progress Tracking and Recovery

With a plan, you can:

Show progress ("Step 3 of 7 complete")
Resume after failures
Skip completed steps
Estimate remaining time
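As a rough illustration of resuming after a failure, step statuses can be persisted to disk and reloaded on the next run. This is a minimal sketch: the JSON file layout and the helper names (`save_progress`, `load_progress`, `remaining_steps`, `progress_message`) are assumptions made for the example.

```python
import json
import os
import tempfile

def save_progress(path: str, statuses: dict[str, str]) -> None:
    """Persist step statuses so a later run can pick up where this one stopped."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(statuses, f)

def load_progress(path: str) -> dict[str, str]:
    """Load saved statuses; a missing file means nothing has run yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def remaining_steps(step_ids: list[str], statuses: dict[str, str]) -> list[str]:
    """Steps that still need to run, in plan order (failed steps are retried)."""
    return [s for s in step_ids if statuses.get(s) != "completed"]

def progress_message(step_ids: list[str], statuses: dict[str, str]) -> str:
    done = sum(1 for s in step_ids if statuses.get(s) == "completed")
    return f"Step {done} of {len(step_ids)} complete"

# Simulate a run that stopped after two of four steps.
steps = ["step_1", "step_2", "step_3", "step_4"]
state_file = os.path.join(tempfile.mkdtemp(), "plan_state.json")
save_progress(state_file, {"step_1": "completed", "step_2": "completed", "step_3": "failed"})

restored = load_progress(state_file)
print(progress_message(steps, restored))  # Step 2 of 4 complete
print(remaining_steps(steps, restored))   # ['step_3', 'step_4']
```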

Basic Planning Implementation

Here's a minimal but complete planning agent:

python

import openai
import json
from dataclasses import dataclass
 
@dataclass
class PlanStep:
    id: str
    description: str
    dependencies: list[str]
    status: str = "pending"  # pending, in_progress, completed, failed
 
@dataclass
class Plan:
    goal: str
    steps: list[PlanStep]
    
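    def get_next_step(self) -> PlanStep | None:
        """Return the first pending step whose dependencies have all completed."""
        for step in self.steps:
            if step.status == "pending":
                # Check if dependencies are met
                deps_met = all(
                    self.get_step(dep).status == "completed"
                    for dep in step.dependencies
                )
                if deps_met:
                    return step
        return None
    
    def get_step(self, step_id: str) -> PlanStep:
        # Fail with a clear error if the model referenced a step id that doesn't exist
        for s in self.steps:
            if s.id == step_id:
                return s
        raise KeyError(f"Unknown step id: {step_id}")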
 
class PlanningAgent:
    def __init__(self):
        self.client = openai.OpenAI()
    
    def run(self, goal: str) -> str:
        # Phase 1: Create plan
        plan = self._create_plan(goal)
        print(f"Created plan with {len(plan.steps)} steps")
        
        # Phase 2: Execute plan
        results = {}
        
        while True:
            step = plan.get_next_step()
            if not step:
                break
            
            step.status = "in_progress"
            print(f"Executing: {step.description}")
            
            try:
                result = self._execute_step(step, results)
                results[step.id] = result
                step.status = "completed"
                print(f"Completed: {step.id}")
            except Exception as e:
                step.status = "failed"
                print(f"Failed: {step.id} - {e}")
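                # Steps that depend on a failed step never become ready, so the
                # loop ends once no runnable step remains.
                # Optionally: replan or abort here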
        
        # Phase 3: Synthesize results
        return self._synthesize(goal, results)
    
    def _create_plan(self, goal: str) -> Plan:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Create a plan to achieve the goal. Return JSON:
{
    "steps": [
        {
            "id": "step_1",
            "description": "What to do",
            "dependencies": []
        },
        {
            "id": "step_2", 
            "description": "What to do next",
            "dependencies": ["step_1"]
        }
    ]
}
 
Rules:
- Break into 3-10 concrete steps
- Each step should be independently executable
- List dependencies (steps that must complete first)
- Order from first to last"""
            }, {
                "role": "user",
                "content": f"Goal: {goal}"
            }],
            response_format={"type": "json_object"}
        )
        
        data = json.loads(response.choices[0].message.content)
        steps = [
            PlanStep(
                id=s["id"],
                description=s["description"],
                dependencies=s.get("dependencies", [])
            )
            for s in data["steps"]
        ]
        
        return Plan(goal=goal, steps=steps)
    
    def _execute_step(self, step: PlanStep, context: dict) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Execute the step and return the result."
            }, {
                "role": "user",
                "content": f"Step: {step.description}\n\nContext from previous steps:\n{json.dumps(context, indent=2)}"
            }]
        )
        return response.choices[0].message.content
    
    def _synthesize(self, goal: str, results: dict) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Synthesize the results into a final answer."
            }, {
                "role": "user",
                "content": f"Goal: {goal}\n\nResults:\n{json.dumps(results, indent=2)}"
            }]
        )
        return response.choices[0].message.content
 
 
# Usage
agent = PlanningAgent()
result = agent.run("Research the top 3 Python web frameworks and create a comparison table")
print(result)
 

Planning Patterns

Pattern 1: Linear Planning

Simple sequence of steps without branching:

text

Step 1 → Step 2 → Step 3 → Step 4 → Done

python

import openai
 
client = openai.OpenAI()
 
def linear_plan(goal: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Break this into sequential steps:\n{goal}"
        }]
    )
    
    # Parse the numbered steps out of the model's reply
    # (parse_numbered_list is a helper assumed to be defined elsewhere)
    steps = parse_numbered_list(response.choices[0].message.content)
    return steps
 
def execute_linear_plan(steps: list[str]) -> list[str]:
    results = []
    for step in steps:
        # execute_step is assumed to take the step plus all prior results
        result = execute_step(step, results)
        results.append(result)
    return results
 

Best for: Simple, well-understood tasks with clear sequences.
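The `parse_numbered_list` helper called in the code above is not defined in this guide. One possible version, offered as an assumption about what it might look like, uses a regular expression to pull list items out of the model's reply and ignores surrounding prose:

```python
import re

# Matches "1. text", "2) text", "- text", or "* text" at the start of a line.
_ITEM = re.compile(r"^\s*(?:\d+[.)]|[-*])\s+(.*\S)\s*$")

def parse_numbered_list(text: str) -> list[str]:
    """Extract list items from model output, skipping any non-list lines."""
    steps = []
    for line in text.splitlines():
        match = _ITEM.match(line)
        if match:
            steps.append(match.group(1))
    return steps

sample = """Here is the plan:
1. Install dependencies
2) Write the scraper
 3. Save results to CSV
Let me know if you need more detail."""

print(parse_numbered_list(sample))
# ['Install dependencies', 'Write the scraper', 'Save results to CSV']
```

Free-text parsing like this is fragile. Asking the model for JSON, as the basic planning agent does, is usually more dependable.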

Pattern 2: DAG Planning (Dependency Graph)

Steps with explicit dependencies, allowing parallel execution:

text

        ┌─────────┐
        │ Step 1  │
        └────┬────┘
             │
     ┌───────┴───────┐
     ▼               ▼
┌─────────┐     ┌─────────┐
│ Step 2a │     │ Step 2b │  ← Can run in parallel
└────┬────┘     └────┬────┘
     │               │
     └───────┬───────┘
             ▼
        ┌─────────┐
        │ Step 3  │
        └─────────┘

python

from concurrent.futures import ThreadPoolExecutor, as_completed
 
def execute_dag_plan(plan: Plan) -> dict:
    results = {}
    completed = set()
    
    with ThreadPoolExecutor(max_workers=4) as executor:
        while len(completed) < len(plan.steps):
            # Find all steps that can run now
            ready = [
                step for step in plan.steps
                if step.id not in completed
                and all(dep in completed for dep in step.dependencies)
            ]
            
            if not ready:
                break  # No progress possible
            
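            # Submit all ready steps, each with a snapshot of the results so far,
            # so worker threads never read the dict while it is being updated
            snapshot = dict(results)
            futures = {
                executor.submit(execute_step, step, snapshot): step
                for step in ready
            }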
            
            # Collect results
            for future in as_completed(futures):
                step = futures[future]
                results[step.id] = future.result()
                completed.add(step.id)
    
    return results
 

Best for: Complex tasks with independent subtasks that can parallelize.
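To see wave-by-wave execution without calling an LLM, here is a self-contained sketch. The `fake_step` function stands in for real work, and `run_in_waves` plus the `deps` mapping are hypothetical names chosen for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_in_waves(deps: dict[str, list[str]], work) -> tuple[dict, list[list[str]]]:
    """Run steps wave by wave; steps within a wave run concurrently."""
    results: dict[str, str] = {}
    waves: list[list[str]] = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        while len(results) < len(deps):
            ready = [s for s in deps
                     if s not in results and all(d in results for d in deps[s])]
            if not ready:
                raise ValueError("Unsatisfiable or circular dependencies")
            waves.append(ready)
            snapshot = dict(results)  # workers read a copy, never the live dict
            # pool.map yields outputs in the same order as its inputs
            outputs = pool.map(lambda s: work(s, snapshot), ready)
            results.update(zip(ready, outputs))
    return results, waves

def fake_step(step_id: str, context: dict) -> str:
    time.sleep(0.05)  # stand-in for an LLM or tool call
    return f"{step_id} done (saw {sorted(context)})"

deps = {
    "step_1": [],
    "step_2a": ["step_1"],
    "step_2b": ["step_1"],
    "step_3": ["step_2a", "step_2b"],
}
results, waves = run_in_waves(deps, fake_step)
print(waves)  # [['step_1'], ['step_2a', 'step_2b'], ['step_3']]
```

Each wave waits for its slowest step before the next begins. That keeps the code simple at the cost of some idle time.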

Pattern 3: Hierarchical Planning

High-level plan decomposes into sub-plans:

text

Goal: "Create a blog platform"
│
├── Sub-goal: "Set up backend"
│   ├── Step: Create database schema
│   ├── Step: Implement user auth
│   └── Step: Build post API
│
├── Sub-goal: "Build frontend"
│   ├── Step: Create layout components
│   ├── Step: Build post editor
│   └── Step: Add routing
│
└── Sub-goal: "Deploy"
    ├── Step: Configure hosting
    └── Step: Set up CI/CD
 

python

class HierarchicalPlanner:
    def __init__(self, max_depth: int = 3):
        self.client = openai.OpenAI()
        self.max_depth = max_depth
    
    def plan(self, goal: str, depth: int = 0) -> dict:
        if depth >= self.max_depth:
            return {"goal": goal, "type": "leaf", "steps": []}
        
        # Get high-level breakdown
        subgoals = self._decompose(goal)
        
        if len(subgoals) == 1 and subgoals[0] == goal:
            # Can't decompose further
            return {"goal": goal, "type": "leaf", "steps": []}
        
        # Recursively plan each subgoal
        children = []
        for subgoal in subgoals:
            child_plan = self.plan(subgoal, depth + 1)
            children.append(child_plan)
        
        return {
            "goal": goal,
            "type": "branch",
            "children": children
        }
    
    def _decompose(self, goal: str) -> list[str]:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Break the goal into 2-5 major subgoals.
If the goal is already atomic (can't be broken down), return just the goal.
Return as JSON: {"subgoals": ["...", "..."]}"""
            }, {
                "role": "user",
                "content": goal
            }],
            response_format={"type": "json_object"}
        )
        
        data = json.loads(response.choices[0].message.content)
        return data["subgoals"]
    
    def execute(self, plan: dict) -> dict:
        if plan["type"] == "leaf":
            return {"goal": plan["goal"], "result": self._execute_leaf(plan["goal"])}
        
        results = []
        for child in plan["children"]:
            result = self.execute(child)
            results.append(result)
        
        return {
            "goal": plan["goal"],
            "children_results": results
        }
 

Best for: Very complex, multi-faceted goals that benefit from divide-and-conquer.
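A nested plan is often easier to track once it is flattened into its leaf goals. The sketch below walks a hand-written tree in the same shape that `HierarchicalPlanner.plan()` returns. The `leaf_goals` helper is an assumption added for illustration.

```python
def leaf_goals(plan: dict) -> list[str]:
    """Depth-first walk that returns leaf goals in execution order."""
    if plan["type"] == "leaf":
        return [plan["goal"]]
    leaves: list[str] = []
    for child in plan["children"]:
        leaves.extend(leaf_goals(child))
    return leaves

# Hand-written tree using the "leaf" / "branch" node format from the planner
tree = {
    "goal": "Create a blog platform",
    "type": "branch",
    "children": [
        {"goal": "Set up backend", "type": "branch", "children": [
            {"goal": "Create database schema", "type": "leaf", "steps": []},
            {"goal": "Implement user auth", "type": "leaf", "steps": []},
        ]},
        {"goal": "Deploy", "type": "leaf", "steps": []},
    ],
}

flat = leaf_goals(tree)
for i, goal in enumerate(flat, start=1):
    print(f"{i}/{len(flat)}: {goal}")
```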

Pattern 4: Adaptive Planning

Plan adjusts based on execution results:

text

┌─────────────────────────────────────────────────────────────┐
│                            PLAN                             │
│   Step 1 → Step 2 → Step 3 → Step 4                         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼ Execute Step 2
                               │
                               ▼ Step 2 fails!
                               │
┌─────────────────────────────────────────────────────────────┐
│                           REPLAN                            │
│   Step 1 ✓ → Step 2b → Step 2c → Step 3 → Step 4            │
│              (alternative approach)                         │
└─────────────────────────────────────────────────────────────┘

python

class AdaptivePlanner(PlanningAgent):
    # Reuses _create_plan, _execute_step, and _synthesize from PlanningAgent
    def __init__(self):
        super().__init__()
        self.max_replans = 3
    
    def run(self, goal: str) -> str:
        plan = self._create_plan(goal)
        completed_steps = []
        replan_count = 0
        
        # Fetch the next runnable step once per iteration
        while (step := plan.get_next_step()) is not None:
            try:
                result = self._execute_step(step, completed_steps)
                completed_steps.append({
                    "step": step.description,
                    "result": result,
                    "status": "success"
                })
                step.status = "completed"
                
            except Exception as e:
                step.status = "failed"
                completed_steps.append({
                    "step": step.description,
                    "error": str(e),
                    "status": "failed"
                })
                
                if replan_count >= self.max_replans:
                    raise RuntimeError("Max replans exceeded") from e
                
                # Replan from current state
                plan = self._replan(goal, completed_steps, str(e))
                replan_count += 1
                print(f"Replanned (attempt {replan_count})")
        
        return self._synthesize(goal, completed_steps)
    
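    def _replan(self, goal: str, completed: list, error: str) -> Plan:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Create a revised plan given what's been done and what failed.
Return JSON with remaining steps only, in the form:
{"steps": [{"id": "...", "description": "...", "dependencies": []}]}
Dependencies may only reference other steps in the revised plan."""
            }, {
                "role": "user",
                "content": f"""Goal: {goal}
 
Completed steps:
{json.dumps(completed, indent=2)}
 
Last error: {error}
 
Create a new plan to complete the goal, working around the failure."""
            }],
            response_format={"type": "json_object"}
        )
        
        # Parse the revised steps into a new Plan
        data = json.loads(response.choices[0].message.content)
        new_ids = {s["id"] for s in data["steps"]}
        steps = [
            PlanStep(
                id=s["id"],
                description=s["description"],
                # Drop references to steps that are not in the revised plan
                dependencies=[d for d in s.get("dependencies", []) if d in new_ids]
            )
            for s in data["steps"]
        ]
        return Plan(goal=goal, steps=steps)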
 

Best for: Uncertain environments where steps may fail unpredictably.

Planning with Code Execution

For technical tasks, planning should include actual code execution to verify each step:

python

from hopx import Sandbox
 
class CodePlanningAgent:
    def __init__(self):
        self.client = openai.OpenAI()
        self.sandbox = None
    
    def run(self, goal: str) -> str:
        # Create persistent sandbox for the session
        self.sandbox = Sandbox.create(template="code-interpreter")
        
        try:
            # Plan
            plan = self._create_plan(goal)
            
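            # Execute each step with code. After a failure, switch to the
            # revised plan (assumed to contain only the steps still to run).
            pending = list(plan.steps)
            replans = 0
            while pending:
                step = pending.pop(0)
                if not self._execute_code_step(step):
                    if replans >= 3:
                        raise RuntimeError(f"Giving up after {replans} replans")
                    plan = self._replan_from_failure(goal, plan, step)
                    pending = list(plan.steps)
                    replans += 1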
            
            # Get final result
            return self._get_final_result(goal, plan)
        
        finally:
            self.sandbox.kill()
    
    def _execute_code_step(self, step: PlanStep) -> bool:
        # Generate code for this step
        code = self._generate_code(step)
        
        # Execute in sandbox
        self.sandbox.files.write("/app/step.py", code)
        result = self.sandbox.commands.run("python /app/step.py")
        
        if result.exit_code == 0:
            step.status = "completed"
            step.result = result.stdout
            return True
        else:
            step.status = "failed"
            step.error = result.stderr
            return False
    
    def _generate_code(self, step: PlanStep) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Generate Python code to accomplish this step.
The code should:
- Be complete and runnable
- Print results to stdout
- Handle errors gracefully
- Save any outputs to files if needed"""
            }, {
                "role": "user",
                "content": f"Step: {step.description}"
            }]
        )
        
        return self._extract_code(response.choices[0].message.content)
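The `_extract_code` helper used above is not shown in this guide. A plausible stand-in, offered as an assumption, pulls the first fenced code block out of the model's reply and falls back to the whole reply when nothing is fenced:

```python
import re

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3
_BLOCK = re.compile(FENCE + r"[a-zA-Z0-9_+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)

def extract_code(response_text: str) -> str:
    """Return the first fenced code block, or the whole reply if none is fenced."""
    match = _BLOCK.search(response_text)
    if match:
        return match.group(1).strip()
    return response_text.strip()

reply = f"Sure, here is the code:\n{FENCE}python\nprint('hello')\n{FENCE}\nThis prints a greeting."
print(extract_code(reply))  # print('hello')
```

Only the first block is returned, so a reply that spreads code across several blocks would need a different strategy.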
 

Real-World Example: Research Agent

Here's a complete planning agent that researches a topic:

python

from hopx import Sandbox
import openai
import json
 
class ResearchAgent:
    def __init__(self):
        self.client = openai.OpenAI()
    
    def research(self, topic: str) -> dict:
        # Phase 1: Plan the research
        plan = self._plan_research(topic)
        print(f"Research plan: {len(plan)} steps")
        
        # Phase 2: Execute research steps
        findings = []
        for i, step in enumerate(plan):
            print(f"Step {i+1}/{len(plan)}: {step['action']}")
            
            if step["action"] == "search":
                result = self._search(step["query"])
            elif step["action"] == "analyze":
                result = self._analyze(step["data"], step["question"])
            elif step["action"] == "synthesize":
                result = self._synthesize(step["findings"])
            else:
                result = {"error": f"Unknown action: {step['action']}"}
            
            findings.append({
                "step": step,
                "result": result
            })
        
        # Phase 3: Generate final report
        report = self._generate_report(topic, findings)
        
        return {
            "topic": topic,
            "plan": plan,
            "findings": findings,
            "report": report
        }
    
    def _plan_research(self, topic: str) -> list:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Plan a research investigation. Return JSON:
{
    "steps": [
        {"action": "search", "query": "search terms"},
        {"action": "analyze", "data": "what to analyze", "question": "what to find"},
        {"action": "synthesize", "findings": ["finding1", "finding2"]}
    ]
}
 
Available actions:
- search: Search for information
- analyze: Analyze data to answer a question
- synthesize: Combine findings into insights"""
            }, {
                "role": "user",
                "content": f"Research topic: {topic}"
            }],
            response_format={"type": "json_object"}
        )
        
        return json.loads(response.choices[0].message.content)["steps"]
    
    def _search(self, query: str) -> dict:
        # In production, use a real search API
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": f"What do you know about: {query}"
            }]
        )
        return {"query": query, "results": response.choices[0].message.content}
    
    def _analyze(self, data: str, question: str) -> dict:
        sandbox = Sandbox.create(template="code-interpreter")
        
        try:
            # Use code to analyze
            analysis_code = f'''
import json
 
# repr() yields valid Python string literals, so quotes or backslashes
# in the inputs cannot break (or inject code into) the generated script
data = {data!r}
question = {question!r}
 
# Analyze the data
# This would be more sophisticated in production
analysis = {{
    "data_summary": data[:500],
    "question": question,
    "findings": "Analysis results would go here"
}}
 
print(json.dumps(analysis))
'''
            sandbox.files.write("/app/analyze.py", analysis_code)
            result = sandbox.commands.run("python /app/analyze.py")
            
            return json.loads(result.stdout)
        finally:
            sandbox.kill()
    
    def _synthesize(self, findings: list) -> dict:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": f"Synthesize these findings into key insights:\n{json.dumps(findings)}"
            }]
        )
        return {"synthesis": response.choices[0].message.content}
    
    def _generate_report(self, topic: str, findings: list) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Generate a well-structured research report."
            }, {
                "role": "user",
                "content": f"Topic: {topic}\n\nFindings:\n{json.dumps(findings, indent=2)}"
            }]
        )
        return response.choices[0].message.content
 
 
# Usage
agent = ResearchAgent()
result = agent.research("Current trends in AI agent architectures")
print(result["report"])
 

Best Practices

1. Right-Size Your Plans

python

# ❌ Too granular - overhead exceeds benefit
plan = [
    "Open file",
    "Read first line",
    "Parse first field",
    "Convert to integer",
    ...  # 50 more steps
]
 
# ❌ Too coarse - steps are still too complex
plan = [
    "Build the entire backend",
    "Build the entire frontend"
]
 
# ✅ Just right - each step is meaningful but manageable
plan = [
    "Design database schema",
    "Implement user authentication",
    "Create REST API for products",
    "Build product listing page",
    "Add shopping cart functionality"
]
 

2. Include Verification Steps

python

plan = [
    {"step": "Write user registration endpoint", "type": "action"},
    {"step": "Test registration with valid data", "type": "verify"},
    {"step": "Test registration with invalid data", "type": "verify"},
    {"step": "Write login endpoint", "type": "action"},
    {"step": "Test login flow", "type": "verify"},
]
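One way to act on the `type` field is to halt as soon as a verification step fails, so later actions never build on unverified work. In this sketch the `handlers` mapping and `run_plan_with_checks` are hypothetical. Each handler is a stub returning `True` or `False` in place of real work.

```python
from typing import Callable

def run_plan_with_checks(
    plan: list[dict], handlers: dict[str, Callable[[], bool]]
) -> list[str]:
    """Run steps in order; stop at the first verify step that fails."""
    log = []
    for item in plan:
        ok = handlers[item["step"]]()
        if item["type"] == "verify" and not ok:
            log.append(f"FAILED: {item['step']}")
            break  # do not build on top of unverified work
        log.append(f"ok: {item['step']}")
    return log

plan = [
    {"step": "Write user registration endpoint", "type": "action"},
    {"step": "Test registration with valid data", "type": "verify"},
    {"step": "Write login endpoint", "type": "action"},
]
handlers = {
    "Write user registration endpoint": lambda: True,
    "Test registration with valid data": lambda: False,  # simulate a failing check
    "Write login endpoint": lambda: True,
}

log = run_plan_with_checks(plan, handlers)
print(log)
```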
 

3. Plan for Failure

python

class RobustPlan:
    def __init__(self):
        self.steps = []
        self.fallbacks = {}  # step_id -> fallback_step
    
    def add_step(self, step: PlanStep, fallback: PlanStep | None = None):
        self.steps.append(step)
        if fallback:
            self.fallbacks[step.id] = fallback
    
    def get_fallback(self, failed_step_id: str) -> PlanStep | None:
        return self.fallbacks.get(failed_step_id)
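A fallback table only helps if the executor consults it. The following sketch shows one possible execution loop, using plain strings as steps to stay self-contained. `run_with_fallbacks` and `fake_execute` are hypothetical names for this example.

```python
from typing import Callable

def run_with_fallbacks(
    steps: list[str],
    fallbacks: dict[str, str],
    execute: Callable[[str], str],
) -> dict[str, str]:
    """Run each step; on failure, try its registered fallback once."""
    results = {}
    for step in steps:
        try:
            results[step] = execute(step)
        except Exception as primary_error:
            fallback = fallbacks.get(step)
            if fallback is None:
                raise  # no alternative registered, surface the failure
            print(f"{step!r} failed ({primary_error}); trying {fallback!r}")
            results[step] = execute(fallback)
    return results

def fake_execute(step: str) -> str:
    # Stand-in executor: one step always fails to exercise the fallback path
    if step == "Fetch data from API":
        raise ConnectionError("API unreachable")
    return f"ok: {step}"

results = run_with_fallbacks(
    steps=["Fetch data from API", "Summarize data"],
    fallbacks={"Fetch data from API": "Load data from local cache"},
    execute=fake_execute,
)
print(results)
```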
 

4. Show Progress

python

def execute_with_progress(plan: Plan, callback):
    total = len(plan.steps)
    
    for i, step in enumerate(plan.steps):
        callback({
            "step": i + 1,
            "total": total,
            "percent": i / total * 100,  # only the steps before this one are done
            "description": step.description,
            "status": "in_progress"
        })
        
        result = execute_step(step)
        
        callback({
            "step": i + 1,
            "total": total,
            "percent": (i + 1) / total * 100,
            "description": step.description,
            "status": "completed",
            "result_preview": result[:100]
        })
 

When NOT to Use Planning

Planning adds overhead. Skip it when:

Scenario	Why Skip Planning
Simple, single-step tasks	"What's 2+2?" doesn't need a plan
Real-time responses needed	Planning adds latency
Highly unpredictable tasks	Plan will be wrong anyway
Exploratory/creative work	Structure can limit creativity

Planning + Other Patterns

Planning combines powerfully with other agentic patterns:

Planning + Reflection

python

def plan_with_reflection(goal: str) -> Plan:
    # Generate initial plan
    plan = create_plan(goal)
    
    # Reflect on the plan
    critique = reflect_on_plan(plan)
    
    # Improve if needed
    if not critique.approved:
        plan = improve_plan(plan, critique)
    
    return plan
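Reflection does not have to be an LLM call. As a cheap first pass, a plan can be checked structurally before any model-based critique. The sketch below is an assumption about what such a check could look like. `Critique` and `critique_plan_structure` are hypothetical, and the plan is represented as the list of step dictionaries from the JSON format used earlier.

```python
from dataclasses import dataclass, field

@dataclass
class Critique:
    approved: bool
    issues: list[str] = field(default_factory=list)

def critique_plan_structure(steps: list[dict]) -> Critique:
    """Structural checks only; an LLM critique could be layered on top."""
    issues = []
    ids = [s["id"] for s in steps]
    if len(set(ids)) != len(ids):
        issues.append("duplicate step ids")
    seen = set()
    for s in steps:
        for dep in s.get("dependencies", []):
            if dep not in ids:
                issues.append(f"{s['id']} depends on unknown step {dep}")
            elif dep not in seen:
                issues.append(f"{s['id']} is listed before its dependency {dep}")
        seen.add(s["id"])
    return Critique(approved=not issues, issues=issues)

good = [
    {"id": "step_1", "description": "Design schema", "dependencies": []},
    {"id": "step_2", "description": "Build API", "dependencies": ["step_1"]},
]
bad = [
    {"id": "step_1", "description": "Build API", "dependencies": ["step_9"]},
]

print(critique_plan_structure(good))
print(critique_plan_structure(bad))
```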
 

Planning + Tool Use

python

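def execute_planned_step(step: PlanStep, tools: dict):
    # Determine which tools are needed
    # (identify_tools is a helper not shown; it returns tool names)
    required_tools = identify_tools(step)
    
    # Execute with tools
    # Note: step.parameters and step.add_result are not part of the PlanStep
    # dataclass defined earlier; they assume an extended step type
    for tool_name in required_tools:
        result = tools[tool_name].execute(step.parameters)
        step.add_result(tool_name, result)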
 

Planning + Multi-Agent

python

# Must be a coroutine because it awaits gather_results below
async def distributed_plan_execution(plan: Plan, agents: dict):
    # Assign steps to specialized agents
    for step in plan.steps:
        agent = agents[step.agent_type]
        agent.queue_step(step)
    
    # Execute in parallel where possible
    results = await gather_results(agents)
    return results
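For a runnable picture of the same idea, the sketch below uses `asyncio.gather` to run stub agents concurrently. `run_agent`, `run_all`, and the `assignments` mapping are hypothetical stand-ins. Real agents would replace the `asyncio.sleep` placeholder with actual work.

```python
import asyncio

async def run_agent(name: str, steps: list[str]) -> list[str]:
    """Stand-in for a specialized agent working through its queue in order."""
    outputs = []
    for step in steps:
        await asyncio.sleep(0.01)  # placeholder for real agent work
        outputs.append(f"{name} finished: {step}")
    return outputs

async def run_all(assignments: dict[str, list[str]]) -> dict[str, list[str]]:
    # gather() runs the agents concurrently and returns results in argument order
    names = list(assignments)
    outputs = await asyncio.gather(*(run_agent(n, assignments[n]) for n in names))
    return dict(zip(names, outputs))

assignments = {
    "backend": ["Build post API"],
    "frontend": ["Create layout components", "Build post editor"],
}
results = asyncio.run(run_all(assignments))
print(results)
```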
 

Conclusion

Planning is the foundation of reliable AI agents:

Decompose complex goals into manageable steps
Order steps by dependencies
Track progress through execution
Adapt when things don't go as expected

Start with linear planning for simple tasks. Add DAG planning when you need parallelism. Use hierarchical planning for complex, multi-faceted goals. Always build in the ability to replan.

The agent that plans beats the agent that doesn't. Every time.

Ready to build planning agents with code execution? Get started with HopX — sandboxes that let your agents verify each step.

The Planning Pattern: How AI Agents Break Down Complex Goals

The Planning Pattern: How AI Agents Break Down Complex Goals

What Is the Planning Pattern?

Why Planning Matters

1. Complex Tasks Require Decomposition

2. Dependencies and Order Matter

3. Progress Tracking and Recovery

Basic Planning Implementation

Planning Patterns

Pattern 1: Linear Planning

Pattern 2: DAG Planning (Dependency Graph)

Pattern 3: Hierarchical Planning

Pattern 4: Adaptive Planning

Planning with Code Execution

Real-World Example: Research Agent

Best Practices

1. Right-Size Your Plans

2. Include Verification Steps

3. Plan for Failure

4. Show Progress

When NOT to Use Planning

Planning + Other Patterns

Planning + Reflection

Planning + Tool Use

Planning + Multi-Agent

Conclusion

Further Reading

Related articles

Evaluator-Optimizer Loop: Continuous AI Agent Improvement

Human-in-the-Loop: Balancing AI Autonomy and Human Control

Memory for AI Agents: Short-term, Long-term, and RAG

1	# ❌ Single complex prompt - often fails
2	"Create a complete e-commerce site with user auth, product catalog,
3	shopping cart, checkout, payment integration, and admin panel"
4
5	# ✅ Planned approach - each step is manageable
6	plan = [
7	"Set up project structure and database",
8	"Implement user authentication",
9	"Create product catalog with CRUD",
10	"Build shopping cart functionality",
11	"Add checkout flow",
12	"Integrate payment provider",
13	"Build admin dashboard"
14	]
15

1	import openai
2	import json
3	from dataclasses import dataclass
4
5	@dataclass
6	class PlanStep:
7	id: str
8	description: str
9	dependencies: list[str]
10	status: str = "pending" # pending, in_progress, completed, failed
11
12	@dataclass
13	class Plan:
14	goal: str
15	steps: list[PlanStep]
16
17	def get_next_step(self) -> PlanStep \| None:
18	for step in self.steps:
19	if step.status == "pending":
20	# Check if dependencies are met
21	deps_met = all(
22	self.get_step(dep).status == "completed"
23	for dep in step.dependencies
24	)
25	if deps_met:
26	return step
27	return None
28
29	def get_step(self, step_id: str) -> PlanStep:
30	return next(s for s in self.steps if s.id == step_id)
31
32	class PlanningAgent:
33	def __init__(self):
34	self.client = openai.OpenAI()
35
36	def run(self, goal: str) -> str:
37	# Phase 1: Create plan
38	plan = self._create_plan(goal)
39	print(f"Created plan with {len(plan.steps)} steps")
40
41	# Phase 2: Execute plan
42	results = {}
43
44	while True:
45	step = plan.get_next_step()
46	if not step:
47	break
48
49	step.status = "in_progress"
50	print(f"Executing: {step.description}")
51
52	try:
53	result = self._execute_step(step, results)
54	results[step.id] = result
55	step.status = "completed"
56	print(f"Completed: {step.id}")
57	except Exception as e:
58	step.status = "failed"
59	print(f"Failed: {step.id} - {e}")
60	# Optionally: replan or abort
61
62	# Phase 3: Synthesize results
63	return self._synthesize(goal, results)
64
65	def _create_plan(self, goal: str) -> Plan:
66	response = self.client.chat.completions.create(
67	model="gpt-4o",
68	messages=[{
69	"role": "system",
70	"content": """Create a plan to achieve the goal. Return JSON:
71	{
72	"steps": [
73	{
74	"id": "step_1",
75	"description": "What to do",
76	"dependencies": []
77	},
78	{
79	"id": "step_2",
80	"description": "What to do next",
81	"dependencies": ["step_1"]
82	}
83	]
84	}
85
86	Rules:
87	- Break into 3-10 concrete steps
88	- Each step should be independently executable
89	- List dependencies (steps that must complete first)
90	- Order from first to last"""
91	}, {
92	"role": "user",
93	"content": f"Goal: {goal}"
94	}],
95	response_format={"type": "json_object"}
96	)
97
98	data = json.loads(response.choices[0].message.content)
99	steps = [
100	PlanStep(
101	id=s["id"],
102	description=s["description"],
103	dependencies=s.get("dependencies", [])
104	)
105	for s in data["steps"]
106	]
107
108	return Plan(goal=goal, steps=steps)
109
110	def _execute_step(self, step: PlanStep, context: dict) -> str:
111	response = self.client.chat.completions.create(
112	model="gpt-4o",
113	messages=[{
114	"role": "system",
115	"content": "Execute the step and return the result."
116	}, {
117	"role": "user",
118	"content": f"Step: {step.description}\n\nContext from previous steps:\n{json.dumps(context, indent=2)}"
119	}]
120	)
121	return response.choices[0].message.content
122
123	def _synthesize(self, goal: str, results: dict) -> str:
124	response = self.client.chat.completions.create(
125	model="gpt-4o",
126	messages=[{
127	"role": "system",
128	"content": "Synthesize the results into a final answer."
129	}, {
130	"role": "user",
131	"content": f"Goal: {goal}\n\nResults:\n{json.dumps(results, indent=2)}"
132	}]
133	)
134	return response.choices[0].message.content
135
136
137	# Usage
138	agent = PlanningAgent()
139	result = agent.run("Research the top 3 Python web frameworks and create a comparison table")
140	print(result)
141

1	def linear_plan(goal: str) -> list[str]:
2	response = client.chat.completions.create(
3	model="gpt-4o",
4	messages=[{
5	"role": "user",
6	"content": f"Break this into sequential steps:\n{goal}"
7	}]
8	)
9
10	# Parse steps
11	steps = parse_numbered_list(response.choices[0].message.content)
12	return steps
13
14	def execute_linear_plan(steps: list[str]) -> list[str]:
15	results = []
16	for step in steps:
17	result = execute_step(step, results)
18	results.append(result)
19	return results
20

1	from concurrent.futures import ThreadPoolExecutor, as_completed
2
3	def execute_dag_plan(plan: Plan) -> dict:
4	results = {}
5	completed = set()
6
7	with ThreadPoolExecutor(max_workers=4) as executor:
8	while len(completed) < len(plan.steps):
9	# Find all steps that can run now
10	ready = [
11	step for step in plan.steps
12	if step.id not in completed
13	and all(dep in completed for dep in step.dependencies)
14	]
15
16	if not ready:
17	break # No progress possible
18
19	# Submit all ready steps
20	futures = {
21	executor.submit(execute_step, step, results): step
22	for step in ready
23	}
24
25	# Collect results
26	for future in as_completed(futures):
27	step = futures[future]
28	results[step.id] = future.result()
29	completed.add(step.id)
30
31	return results
32

Goal: "Create a blog platform"
│
├── Sub-goal: "Set up backend"
│   ├── Step: Create database schema
│   ├── Step: Implement user auth
│   └── Step: Build post API
│
├── Sub-goal: "Build frontend"
│   ├── Step: Create layout components
│   ├── Step: Build post editor
│   └── Step: Add routing
│
└── Sub-goal: "Deploy"
    ├── Step: Configure hosting
    └── Step: Set up CI/CD

import json
import openai

class HierarchicalPlanner:
    def __init__(self, max_depth: int = 3):
        self.client = openai.OpenAI()
        self.max_depth = max_depth

    def plan(self, goal: str, depth: int = 0) -> dict:
        # Stop recursing once the depth limit is reached
        if depth >= self.max_depth:
            return {"goal": goal, "type": "leaf", "steps": []}

        # Get high-level breakdown
        subgoals = self._decompose(goal)

        if len(subgoals) == 1 and subgoals[0] == goal:
            # Can't decompose further
            return {"goal": goal, "type": "leaf", "steps": []}

        # Recursively plan each subgoal
        children = []
        for subgoal in subgoals:
            child_plan = self.plan(subgoal, depth + 1)
            children.append(child_plan)

        return {
            "goal": goal,
            "type": "branch",
            "children": children
        }

    def _decompose(self, goal: str) -> list[str]:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Break the goal into 2-5 major subgoals.
If the goal is already atomic (can't be broken down), return just the goal.
Return as JSON: {"subgoals": ["...", "..."]}"""
            }, {
                "role": "user",
                "content": goal
            }],
            response_format={"type": "json_object"}
        )

        data = json.loads(response.choices[0].message.content)
        return data["subgoals"]

    def execute(self, plan: dict) -> dict:
        # Leaves are executed directly; _execute_leaf is defined elsewhere
        if plan["type"] == "leaf":
            return {"goal": plan["goal"], "result": self._execute_leaf(plan["goal"])}

        # Branches execute their children in order and collect the results
        results = []
        for child in plan["children"]:
            result = self.execute(child)
            results.append(result)

        return {
            "goal": plan["goal"],
            "children_results": results
        }
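
The nested dictionary returned by `HierarchicalPlanner.plan` can be hard to inspect by eye. A small sketch of a helper that flattens it into the ordered list of leaf goals, which is the order `execute` would visit them in. The `leaf_goals` name is hypothetical; the tree shape (`type`, `goal`, `children`) matches what `plan` builds above.

```python
def leaf_goals(plan: dict) -> list:
    """Return the leaf goals of a hierarchical plan in depth-first order."""
    if plan["type"] == "leaf":
        return [plan["goal"]]
    leaves = []
    for child in plan["children"]:
        leaves.extend(leaf_goals(child))
    return leaves


# Hand-written tree in the same shape the planner produces
blog_plan = {
    "goal": "Create a blog platform",
    "type": "branch",
    "children": [
        {"goal": "Set up backend", "type": "branch", "children": [
            {"goal": "Create database schema", "type": "leaf", "steps": []},
            {"goal": "Implement user auth", "type": "leaf", "steps": []},
        ]},
        {"goal": "Deploy", "type": "leaf", "steps": []},
    ],
}
print(leaf_goals(blog_plan))
```

Printing the flattened list before execution is a cheap way to review what the agent is about to do and to estimate how many model calls the plan will cost.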

import json
import openai

class AdaptivePlanner:
    def __init__(self):
        self.client = openai.OpenAI()
        self.max_replans = 3

    def run(self, goal: str) -> str:
        plan = self._create_plan(goal)
        completed_steps = []
        replan_count = 0

        # Fetch the next step once per iteration; None ends the loop
        while (step := plan.get_next_step()) is not None:
            try:
                result = self._execute_step(step, completed_steps)
                completed_steps.append({
                    "step": step.description,
                    "result": result,
                    "status": "success"
                })
                step.status = "completed"

            except Exception as e:
                step.status = "failed"
                completed_steps.append({
                    "step": step.description,
                    "error": str(e),
                    "status": "failed"
                })

                if replan_count >= self.max_replans:
                    raise RuntimeError("Max replans exceeded") from e

                # Replan from current state
                plan = self._replan(goal, completed_steps, str(e))
                replan_count += 1
                print(f"Replanned (attempt {replan_count})")

        return self._synthesize(goal, completed_steps)

    def _replan(self, goal: str, completed: list, error: str) -> Plan:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Create a revised plan given what's been done and what failed.
Return JSON with remaining steps only."""
            }, {
                "role": "user",
                "content": f"""Goal: {goal}

Completed steps:
{json.dumps(completed, indent=2)}

Last error: {error}

Create a new plan to complete the goal, working around the failure."""
            }],
            response_format={"type": "json_object"}
        )

        # Parse and return new plan
        data = json.loads(response.choices[0].message.content)
        return self._parse_plan(data)
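
The adaptive loop relies on a `Plan` object with a `get_next_step()` method, which is not shown here. The following sketch, under stated assumptions, makes the control flow testable without any model calls: `Plan` and `PlanStep` are minimal hypothetical versions, and `execute` and `replan` are plain callables standing in for the LLM-backed `_execute_step` and `_replan` methods.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PlanStep:
    description: str
    status: str = "pending"  # assumed states: pending, completed, failed


@dataclass
class Plan:
    steps: list = field(default_factory=list)

    def get_next_step(self) -> Optional[PlanStep]:
        # First step not yet attempted, or None when the plan is finished
        return next((s for s in self.steps if s.status == "pending"), None)


def run_adaptive(plan: Plan,
                 execute: Callable[[str], str],
                 replan: Callable[[list, str], Plan],
                 max_replans: int = 3) -> list:
    """Run steps in order; on failure, swap in a revised plan and continue."""
    history = []
    replans = 0
    while (step := plan.get_next_step()) is not None:
        try:
            result = execute(step.description)
        except Exception as exc:
            step.status = "failed"
            history.append({"step": step.description, "error": str(exc), "status": "failed"})
            if replans >= max_replans:
                raise RuntimeError("Max replans exceeded") from exc
            plan = replan(history, str(exc))
            replans += 1
        else:
            step.status = "completed"
            history.append({"step": step.description, "result": result, "status": "success"})
    return history


def flaky_execute(description: str) -> str:
    # Stand-in executor: any step that needs the API fails
    if "API" in description:
        raise ConnectionError("API unreachable")
    return f"done: {description}"


def cached_replan(history: list, error: str) -> Plan:
    # Stand-in replanner: route around the API by using a cache
    return Plan([PlanStep("Load data from cache"), PlanStep("Summarize data")])


history = run_adaptive(
    Plan([PlanStep("Fetch data from API"), PlanStep("Summarize data")]),
    flaky_execute,
    cached_replan,
)
for entry in history:
    print(entry["status"], "-", entry["step"])
```

Passing the executor and replanner in as arguments keeps the loop itself deterministic, so the replan limit and the failure bookkeeping can be unit-tested separately from prompt quality.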

import openai
from hopx import Sandbox

class CodePlanningAgent:
    def __init__(self):
        self.client = openai.OpenAI()
        self.sandbox = None
        self.max_replans = 3

    def run(self, goal: str) -> str:
        # Create persistent sandbox for the session
        self.sandbox = Sandbox.create(template="code-interpreter")

        try:
            # Plan
            plan = self._create_plan(goal)

            # Execute each step with code. A work queue is used instead of
            # iterating plan.steps directly, so a replan actually takes effect.
            pending = list(plan.steps)
            replans = 0
            while pending:
                step = pending.pop(0)
                if not self._execute_code_step(step):
                    if replans >= self.max_replans:
                        raise RuntimeError("Max replans exceeded")
                    # Replan, then continue with the revised steps
                    plan = self._replan_from_failure(goal, plan, step)
                    pending = [s for s in plan.steps if s.status != "completed"]
                    replans += 1

            # Get final result
            return self._get_final_result(goal, plan)

        finally:
            self.sandbox.kill()

    def _execute_code_step(self, step: PlanStep) -> bool:
        # Generate code for this step
        code = self._generate_code(step)

        # Execute in sandbox
        self.sandbox.files.write("/app/step.py", code)
        result = self.sandbox.commands.run("python /app/step.py")

        if result.exit_code == 0:
            step.status = "completed"
            step.result = result.stdout
            return True
        else:
            step.status = "failed"
            step.error = result.stderr
            return False

    def _generate_code(self, step: PlanStep) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Generate Python code to accomplish this step.
The code should:
- Be complete and runnable
- Print results to stdout
- Handle errors gracefully
- Save any outputs to files if needed"""
            }, {
                "role": "user",
                "content": f"Step: {step.description}"
            }]
        )

        return self._extract_code(response.choices[0].message.content)
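
`_generate_code` hands the model's reply to `_extract_code`, which is not shown. Models often wrap code in a fenced Markdown block with some prose around it, so the helper has to pull out just the code. A minimal sketch, assuming the first fenced block is the one to run and that an unfenced reply is already plain code (`extract_code` is a hypothetical free-function version of the method):

```python
import re

# Built with string multiplication so this example can itself sit inside a code fence
FENCE = "`" * 3

_CODE_BLOCK = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response_text: str) -> str:
    """Return the body of the first fenced code block, or the whole text if none."""
    match = _CODE_BLOCK.search(response_text)
    if match:
        return match.group(1).strip()
    return response_text.strip()


reply = "Here you go:\n" + FENCE + "python\nprint('hi')\n" + FENCE + "\nLet me know if it works."
print(extract_code(reply))
```

Falling back to the raw text keeps the agent working when the model skips the fence; the sandbox's non-zero exit code then catches replies that were not code at all.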

# ❌ Too granular - overhead exceeds benefit
plan = [
    "Open file",
    "Read first line",
    "Parse first field",
    "Convert to integer",
    ...  # 50 more steps
]

# ❌ Too coarse - steps are still too complex
plan = [
    "Build the entire backend",
    "Build the entire frontend"
]

# ✅ Just right - each step is meaningful but manageable
plan = [
    "Design database schema",
    "Implement user authentication",
    "Create REST API for products",
    "Build product listing page",
    "Add shopping cart functionality"
]

plan = [
    {"step": "Write user registration endpoint", "type": "action"},
    {"step": "Test registration with valid data", "type": "verify"},
    {"step": "Test registration with invalid data", "type": "verify"},
    {"step": "Write login endpoint", "type": "action"},
    {"step": "Test login flow", "type": "verify"},
]

class RobustPlan:
    def __init__(self):
        self.steps = []
        self.fallbacks = {}  # step_id -> fallback_step

    def add_step(self, step: PlanStep, fallback: PlanStep | None = None):
        self.steps.append(step)
        if fallback:
            self.fallbacks[step.id] = fallback

    def get_fallback(self, failed_step_id: str) -> PlanStep | None:
        # Returns None when no fallback was registered for this step
        return self.fallbacks.get(failed_step_id)
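
`RobustPlan` only stores fallbacks; something still has to use them when a step fails. A sketch of one possible execution loop, assuming a fallback is tried once and that a step with no fallback should re-raise its error. `run_with_fallbacks`, the minimal `PlanStep`, and the `execute` callable are all hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlanStep:
    # Hypothetical minimal step with the id that fallbacks are keyed on
    id: str
    description: str


def run_with_fallbacks(steps: list, fallbacks: dict, execute) -> dict:
    """Execute steps in order, substituting the registered fallback on failure."""
    results = {}
    for step in steps:
        try:
            results[step.id] = execute(step)
        except Exception:
            fallback = fallbacks.get(step.id)
            if fallback is None:
                raise  # no alternative registered, surface the original error
            results[step.id] = execute(fallback)
    return results


def execute(step: PlanStep) -> str:
    # Stand-in executor: the live API is always down in this demo
    if "live API" in step.description:
        raise TimeoutError("API timed out")
    return f"ok: {step.description}"


steps = [
    PlanStep("fetch", "Fetch prices from live API"),
    PlanStep("report", "Write summary report"),
]
fallbacks = {"fetch": PlanStep("fetch_cached", "Load prices from cached file")}
print(run_with_fallbacks(steps, fallbacks, execute))
```

Results are stored under the original step's id, so later steps that depend on `fetch` do not need to know whether the primary or the fallback path produced the data.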

def execute_with_progress(plan: Plan, callback):
    total = len(plan.steps)

    for i, step in enumerate(plan.steps):
        # Before the step runs, report only the work already finished
        callback({
            "step": i + 1,
            "total": total,
            "percent": i / total * 100,
            "description": step.description,
            "status": "in_progress"
        })

        result = execute_step(step)

        # After the step completes, include it in the percentage
        callback({
            "step": i + 1,
            "total": total,
            "percent": (i + 1) / total * 100,
            "description": step.description,
            "status": "completed",
            "result_preview": str(result)[:100]
        })
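
The `callback` passed to `execute_with_progress` can be anything that accepts the event dictionary. As an illustration, here is a small sketch that renders an event as a text progress bar; `format_progress` is a hypothetical name, and the event keys are the ones emitted above.

```python
def format_progress(event: dict, width: int = 20) -> str:
    """Render a progress event as a fixed-width text bar."""
    filled = int(width * event["percent"] / 100)
    bar = "#" * filled + "-" * (width - filled)
    return (
        f"[{bar}] {event['step']}/{event['total']} "
        f"{event['status']}: {event['description']}"
    )


event = {
    "step": 2,
    "total": 4,
    "percent": 50.0,
    "description": "Create API endpoints",
    "status": "completed",
}
print(format_progress(event))
```

To wire it up, a callback such as `lambda e: print(format_progress(e))` would print one line per event; a web UI could instead push the same dictionary over a socket.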

def plan_with_reflection(goal: str) -> Plan:
    # Generate initial plan
    plan = create_plan(goal)

    # Reflect on the plan
    critique = reflect_on_plan(plan)

    # Improve if needed
    if not critique.approved:
        plan = improve_plan(plan, critique)

    return plan

def execute_planned_step(step: PlanStep, tools: dict):
    # Determine which tools are needed
    required_tools = identify_tools(step)

    # Execute with tools
    for tool_name in required_tools:
        result = tools[tool_name].execute(step.parameters)
        step.add_result(tool_name, result)

async def distributed_plan_execution(plan: Plan, agents: dict):
    # Assign steps to specialized agents
    for step in plan.steps:
        agent = agents[step.agent_type]
        agent.queue_step(step)

    # Execute in parallel where possible (await requires an async function)
    results = await gather_results(agents)
    return results
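
`gather_results` is left undefined above. One plausible shape for it, sketched with `asyncio.gather`, assuming each agent exposes an async method that drains its own queue. The `EchoAgent` class and its `run_queue` method are hypothetical stand-ins for real specialized agents.

```python
import asyncio


class EchoAgent:
    """Stand-in agent: 'executes' a step by echoing it back."""

    def __init__(self, name: str):
        self.name = name
        self.queue = []

    def queue_step(self, step: str) -> None:
        self.queue.append(step)

    async def run_queue(self) -> list:
        results = []
        for step in self.queue:
            await asyncio.sleep(0)  # placeholder for real async work (API call, tool use)
            results.append(f"{self.name}: {step}")
        return results


async def gather_results(agents: dict) -> dict:
    # Run every agent's queue concurrently; gather preserves argument order
    names = list(agents)
    outputs = await asyncio.gather(*(agents[name].run_queue() for name in names))
    return dict(zip(names, outputs))


agents = {"research": EchoAgent("research"), "code": EchoAgent("code")}
agents["research"].queue_step("find docs")
agents["code"].queue_step("write parser")
print(asyncio.run(gather_results(agents)))
```

Agents run concurrently with each other while each agent still processes its own queue in order, which matches the case where steps assigned to one specialist depend on that specialist's earlier output.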