The Planning Pattern: How AI Agents Break Down Complex Goals
Ask a junior developer to "build a user authentication system" and they'll start coding immediately. Ask a senior developer the same thing, and they'll first ask questions, sketch out an architecture, identify dependencies, and create a plan.
AI agents work the same way. Planning is what separates agents that flail from agents that succeed.
This guide shows you how to implement planning in your AI agents—from simple linear plans to adaptive, hierarchical planning systems.
What Is the Planning Pattern?
Planning is the process of decomposing a high-level goal into a sequence of actionable steps before execution begins:
```
┌─────────────────────────────────────────────────────────────┐
│                   "Build me a dashboard"                    │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                          PLANNING                           │
│                                                             │
│   1. Gather requirements                                    │
│   2. Design data schema                                     │
│   3. Create API endpoints                                   │
│   4. Build frontend components                              │
│   5. Integrate and test                                     │
│   6. Deploy                                                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                          EXECUTION                          │
│                                                             │
│   Step 1 → Step 2 → Step 3 → ... → Done                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
Without planning:
- Agent jumps straight into action
- Often gets stuck or goes in circles
- Misses critical steps
- Can't estimate effort or progress
With planning:
- Agent understands the full scope
- Executes steps in logical order
- Tracks progress toward goal
- Can adapt when obstacles arise
Why Planning Matters
1. Complex Tasks Require Decomposition
LLMs have finite context windows, and they tend to drop or blur requirements as a single prompt grows more complex. A one-shot prompt for a large task often fails because the model cannot keep every requirement in focus at once.
Planning breaks the problem into chunks the model can handle:
```python
# ❌ Single complex prompt - often fails
prompt = (
    "Create a complete e-commerce site with user auth, product catalog, "
    "shopping cart, checkout, payment integration, and admin panel"
)

# ✅ Planned approach - each step is manageable
plan = [
    "Set up project structure and database",
    "Implement user authentication",
    "Create product catalog with CRUD",
    "Build shopping cart functionality",
    "Add checkout flow",
    "Integrate payment provider",
    "Build admin dashboard"
]
```
2. Dependencies and Order Matter
Some tasks depend on others. Planning identifies these dependencies:
```
┌──────────────────┐
│ Create database  │
└────────┬─────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────┐
│ Auth  │ │Product│
│  API  │ │  API  │
└───┬───┘ └───┬───┘
    │         │
    └────┬────┘
         ▼
    ┌─────────┐
    │  Cart   │
    │   API   │
    └────┬────┘
         │
         ▼
    ┌─────────┐
    │Checkout │
    └─────────┘
```
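The dependency graph above can be turned into a valid execution order with a topological sort. Here is a minimal sketch, assuming the graph is stored as a dict that maps each step to its prerequisites. The `DEPENDENCIES` dict and the `execution_order` function are illustrative names; only `graphlib` comes from the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map mirroring the diagram: step -> prerequisites.
DEPENDENCIES = {
    "database": [],
    "auth_api": ["database"],
    "product_api": ["database"],
    "cart_api": ["auth_api", "product_api"],
    "checkout": ["cart_api"],
}


def execution_order(dependencies: dict[str, list[str]]) -> list[str]:
    """Return the steps in an order that respects every dependency."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # exc.args[1] holds the nodes that form the cycle
        raise ValueError(f"Plan has circular dependencies: {exc.args[1]}") from exc


order = execution_order(DEPENDENCIES)
print(order)
```

Running the sort before execution also catches circular dependencies early, which an LLM-generated plan can contain.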
3. Progress Tracking and Recovery
With a plan, you can:
- Show progress ("Step 3 of 7 complete")
- Resume after failures
- Skip completed steps
- Estimate remaining time
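Resuming after a failure requires the plan's state to outlive the process. One way to do that is sketched below, assuming each step is a plain dict with `id` and `status` keys. The `save_plan`, `load_plan`, and `progress` helpers and the JSON file layout are illustrative, not part of any library.

```python
import json
import tempfile
from pathlib import Path


def save_plan(steps: list[dict], path: Path) -> None:
    """Persist step statuses so a later run can pick up where this one stopped."""
    path.write_text(json.dumps(steps, indent=2))


def load_plan(path: Path) -> list[dict]:
    """Reload a saved plan; interrupted or failed steps go back to pending."""
    steps = json.loads(path.read_text())
    for step in steps:
        if step["status"] in ("in_progress", "failed"):
            step["status"] = "pending"
    return steps


def progress(steps: list[dict]) -> str:
    done = sum(step["status"] == "completed" for step in steps)
    return f"Step {done} of {len(steps)} complete"


steps = [
    {"id": "step_1", "status": "completed"},
    {"id": "step_2", "status": "failed"},
    {"id": "step_3", "status": "pending"},
]
state_file = Path(tempfile.mkdtemp()) / "plan.json"
save_plan(steps, state_file)

# A fresh process would start here: completed steps are skipped, the rest rerun.
resumed = load_plan(state_file)
remaining = [step["id"] for step in resumed if step["status"] != "completed"]
print(progress(resumed))
print(remaining)
```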
Basic Planning Implementation
Here's a minimal but complete planning agent:
```python
import json
from dataclasses import dataclass

import openai


@dataclass
class PlanStep:
    id: str
    description: str
    dependencies: list[str]
    status: str = "pending"  # pending, in_progress, completed, failed


@dataclass
class Plan:
    goal: str
    steps: list[PlanStep]

    def get_next_step(self) -> PlanStep | None:
        """Return the first pending step whose dependencies are all completed."""
        for step in self.steps:
            if step.status == "pending":
                # Check if dependencies are met
                deps_met = all(
                    self.get_step(dep).status == "completed"
                    for dep in step.dependencies
                )
                if deps_met:
                    return step
        return None

    def get_step(self, step_id: str) -> PlanStep:
        for step in self.steps:
            if step.id == step_id:
                return step
        # The model may reference a step it never defined; fail loudly
        raise KeyError(f"Unknown step id: {step_id}")


class PlanningAgent:
    def __init__(self):
        self.client = openai.OpenAI()

    def run(self, goal: str) -> str:
        # Phase 1: Create plan
        plan = self._create_plan(goal)
        print(f"Created plan with {len(plan.steps)} steps")

        # Phase 2: Execute plan
        results = {}

        while True:
            step = plan.get_next_step()
            if not step:
                break

            step.status = "in_progress"
            print(f"Executing: {step.description}")

            try:
                result = self._execute_step(step, results)
                results[step.id] = result
                step.status = "completed"
                print(f"Completed: {step.id}")
            except Exception as e:
                step.status = "failed"
                print(f"Failed: {step.id} - {e}")
                # Optionally: replan or abort

        # Steps that depend on a failed step never become runnable
        skipped = [s.id for s in plan.steps if s.status == "pending"]
        if skipped:
            print(f"Skipped (unmet dependencies): {skipped}")

        # Phase 3: Synthesize results
        return self._synthesize(goal, results)

    def _create_plan(self, goal: str) -> Plan:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Create a plan to achieve the goal. Return JSON:
{
  "steps": [
    {
      "id": "step_1",
      "description": "What to do",
      "dependencies": []
    },
    {
      "id": "step_2",
      "description": "What to do next",
      "dependencies": ["step_1"]
    }
  ]
}

Rules:
- Break into 3-10 concrete steps
- Each step should be independently executable
- List dependencies (steps that must complete first)
- Order from first to last"""
            }, {
                "role": "user",
                "content": f"Goal: {goal}"
            }],
            response_format={"type": "json_object"}
        )

        data = json.loads(response.choices[0].message.content)
        steps = [
            PlanStep(
                id=s["id"],
                description=s["description"],
                dependencies=s.get("dependencies", [])
            )
            for s in data["steps"]
        ]

        return Plan(goal=goal, steps=steps)

    def _execute_step(self, step: PlanStep, context: dict) -> str:
        # Earlier results are passed along so each step can build on them
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Execute the step and return the result."
            }, {
                "role": "user",
                "content": f"Step: {step.description}\n\nContext from previous steps:\n{json.dumps(context, indent=2)}"
            }]
        )
        return response.choices[0].message.content

    def _synthesize(self, goal: str, results: dict) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Synthesize the results into a final answer."
            }, {
                "role": "user",
                "content": f"Goal: {goal}\n\nResults:\n{json.dumps(results, indent=2)}"
            }]
        )
        return response.choices[0].message.content


# Usage
agent = PlanningAgent()
result = agent.run("Research the top 3 Python web frameworks and create a comparison table")
print(result)
```
Planning Patterns
Pattern 1: Linear Planning
Simple sequence of steps without branching:
```
Step 1 → Step 2 → Step 3 → Step 4 → Done
```

```python
import openai

client = openai.OpenAI()


def linear_plan(goal: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Break this into sequential steps:\n{goal}"
        }]
    )

    # Parse steps; parse_numbered_list is assumed to turn "1. ..." lines
    # into a list of strings
    steps = parse_numbered_list(response.choices[0].message.content)
    return steps


def execute_linear_plan(steps: list[str]) -> list[str]:
    results = []
    for step in steps:
        # execute_step is assumed to run one step, given earlier results as context
        result = execute_step(step, results)
        results.append(result)
    return results
```
Best for: Simple, well-understood tasks with clear sequences.
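`parse_numbered_list` is called above without a definition. One possible implementation is sketched here, assuming the model answers with lines such as `1. Do X` or `2) Do Y`. The accepted formats are an assumption about the model's output, not a guarantee, so production code may need a stricter prompt or JSON output instead.

```python
import re

# Matches "1. text", "2) text", or "3: text", with optional leading whitespace.
_NUMBERED_LINE = re.compile(r"^\s*\d+[.):]\s+(.*\S)\s*$")


def parse_numbered_list(text: str) -> list[str]:
    """Extract the items of a numbered list, ignoring any other lines."""
    steps = []
    for line in text.splitlines():
        match = _NUMBERED_LINE.match(line)
        if match:
            steps.append(match.group(1))
    return steps


reply = """Here is the plan:
1. Set up the project
2) Write the parser
  3. Add tests

Let me know if you want more detail."""
print(parse_numbered_list(reply))
```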
Pattern 2: DAG Planning (Dependency Graph)
Steps with explicit dependencies, allowing parallel execution:
```
        ┌─────────┐
        │ Step 1  │
        └────┬────┘
             │
     ┌───────┴───────┐
     ▼               ▼
┌─────────┐     ┌─────────┐
│ Step 2a │     │ Step 2b │  ← Can run in parallel
└────┬────┘     └────┬────┘
     │               │
     └───────┬───────┘
             ▼
        ┌─────────┐
        │ Step 3  │
        └─────────┘
```

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def execute_dag_plan(plan: Plan) -> dict:
    # execute_step(step, context) is assumed to run a single step
    results = {}
    completed = set()

    with ThreadPoolExecutor(max_workers=4) as executor:
        while len(completed) < len(plan.steps):
            # Find all steps that can run now
            ready = [
                step for step in plan.steps
                if step.id not in completed
                and all(dep in completed for dep in step.dependencies)
            ]

            if not ready:
                break  # No progress possible (cycle or unknown dependency)

            # Submit all ready steps with a snapshot of the results so far,
            # so workers never read the dict while it is being updated
            context = dict(results)
            futures = {
                executor.submit(execute_step, step, context): step
                for step in ready
            }

            # Collect results; future.result() re-raises any worker exception
            for future in as_completed(futures):
                step = futures[future]
                results[step.id] = future.result()
                completed.add(step.id)

    return results
```
Best for: Complex tasks with independent subtasks that can parallelize.
Pattern 3: Hierarchical Planning
High-level plan decomposes into sub-plans:
```
Goal: "Create a blog platform"
│
├── Sub-goal: "Set up backend"
│   ├── Step: Create database schema
│   ├── Step: Implement user auth
│   └── Step: Build post API
│
├── Sub-goal: "Build frontend"
│   ├── Step: Create layout components
│   ├── Step: Build post editor
│   └── Step: Add routing
│
└── Sub-goal: "Deploy"
    ├── Step: Configure hosting
    └── Step: Set up CI/CD
```

```python
import json

import openai


class HierarchicalPlanner:
    def __init__(self, max_depth: int = 3):
        self.client = openai.OpenAI()
        self.max_depth = max_depth

    def plan(self, goal: str, depth: int = 0) -> dict:
        # Stop recursing at max_depth and treat the goal as directly executable
        if depth >= self.max_depth:
            return {"goal": goal, "type": "leaf", "steps": []}

        # Get high-level breakdown
        subgoals = self._decompose(goal)

        if len(subgoals) == 1 and subgoals[0] == goal:
            # Can't decompose further
            return {"goal": goal, "type": "leaf", "steps": []}

        # Recursively plan each subgoal
        children = []
        for subgoal in subgoals:
            child_plan = self.plan(subgoal, depth + 1)
            children.append(child_plan)

        return {
            "goal": goal,
            "type": "branch",
            "children": children
        }

    def _decompose(self, goal: str) -> list[str]:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Break the goal into 2-5 major subgoals.
If the goal is already atomic (can't be broken down), return just the goal.
Return as JSON: {"subgoals": ["...", "..."]}"""
            }, {
                "role": "user",
                "content": goal
            }],
            response_format={"type": "json_object"}
        )

        data = json.loads(response.choices[0].message.content)
        return data["subgoals"]

    def execute(self, plan: dict) -> dict:
        if plan["type"] == "leaf":
            return {"goal": plan["goal"], "result": self._execute_leaf(plan["goal"])}

        # Branch: execute children in order and collect their results
        results = []
        for child in plan["children"]:
            result = self.execute(child)
            results.append(result)

        return {
            "goal": plan["goal"],
            "children_results": results
        }

    def _execute_leaf(self, goal: str) -> str:
        # Minimal leaf executor: ask the model to carry out the atomic goal
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Execute the step and return the result."
            }, {
                "role": "user",
                "content": goal
            }]
        )
        return response.choices[0].message.content
```
Best for: Very complex, multi-faceted goals that benefit from divide-and-conquer.
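The nested dict returned by `plan()` suits recursion but is awkward for progress reporting. Below is a small sketch that flattens it into an ordered list of leaf goals, assuming the `leaf`/`branch` structure used by the planner. The `leaf_goals` helper and the sample plan are illustrative.

```python
def leaf_goals(plan: dict) -> list[str]:
    """Collect leaf goals depth-first, preserving the order of the plan."""
    if plan["type"] == "leaf":
        return [plan["goal"]]
    goals = []
    for child in plan["children"]:
        goals.extend(leaf_goals(child))
    return goals


# Hand-written sample in the same shape the planner produces.
sample_plan = {
    "goal": "Create a blog platform",
    "type": "branch",
    "children": [
        {
            "goal": "Set up backend",
            "type": "branch",
            "children": [
                {"goal": "Create database schema", "type": "leaf", "steps": []},
                {"goal": "Implement user auth", "type": "leaf", "steps": []},
            ],
        },
        {"goal": "Deploy", "type": "leaf", "steps": []},
    ],
}

for number, goal in enumerate(leaf_goals(sample_plan), start=1):
    print(f"{number}. {goal}")
```

The flattened list gives a total step count up front, which the tree alone does not.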
Pattern 4: Adaptive Planning
Plan adjusts based on execution results:
```
┌─────────────────────────────────────────────────────────────┐
│                            PLAN                             │
│   Step 1 → Step 2 → Step 3 → Step 4                         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼  Execute Step 2
                               │
                               ▼  Step 2 fails!
                               │
┌─────────────────────────────────────────────────────────────┐
│                           REPLAN                            │
│   Step 1 ✓ → Step 2b → Step 2c → Step 3 → Step 4            │
│              (alternative approach)                         │
└─────────────────────────────────────────────────────────────┘
```

```python
class AdaptivePlanner(PlanningAgent):
    """Reuses _create_plan, _execute_step, and _synthesize from PlanningAgent."""

    def __init__(self):
        super().__init__()
        self.max_replans = 3

    def run(self, goal: str) -> str:
        plan = self._create_plan(goal)
        completed_steps = []
        replan_count = 0

        while (step := plan.get_next_step()) is not None:
            try:
                result = self._execute_step(step, completed_steps)
                completed_steps.append({
                    "step": step.description,
                    "result": result,
                    "status": "success"
                })
                step.status = "completed"

            except Exception as e:
                step.status = "failed"
                completed_steps.append({
                    "step": step.description,
                    "error": str(e),
                    "status": "failed"
                })

                if replan_count >= self.max_replans:
                    raise RuntimeError("Max replans exceeded") from e

                # Replan from current state
                plan = self._replan(goal, completed_steps, str(e))
                replan_count += 1
                print(f"Replanned (attempt {replan_count})")

        return self._synthesize(goal, completed_steps)

    def _replan(self, goal: str, completed: list, error: str) -> Plan:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Create a revised plan given what's been done and what failed.
Return JSON with remaining steps only, in the form:
{"steps": [{"id": "...", "description": "...", "dependencies": []}]}"""
            }, {
                "role": "user",
                "content": f"""Goal: {goal}

Completed steps:
{json.dumps(completed, indent=2)}

Last error: {error}

Create a new plan to complete the goal, working around the failure."""
            }],
            response_format={"type": "json_object"}
        )

        # Parse and return new plan
        data = json.loads(response.choices[0].message.content)
        return self._parse_plan(goal, data)

    def _parse_plan(self, goal: str, data: dict) -> Plan:
        # Finished steps are absent from the new plan, so drop dependencies on them
        ids = {s["id"] for s in data["steps"]}
        steps = [
            PlanStep(
                id=s["id"],
                description=s["description"],
                dependencies=[d for d in s.get("dependencies", []) if d in ids]
            )
            for s in data["steps"]
        ]
        return Plan(goal=goal, steps=steps)
```
Best for: Uncertain environments where steps may fail unpredictably.
Planning with Code Execution
For technical tasks, planning should include actual code execution to verify each step:
```python
import openai
from hopx import Sandbox


class CodePlanningAgent:
    # _create_plan, _replan_from_failure, _get_final_result, and _extract_code
    # are assumed to be defined elsewhere (for example, as in PlanningAgent).

    def __init__(self):
        self.client = openai.OpenAI()
        self.sandbox = None
        self.max_replans = 3

    def run(self, goal: str) -> str:
        # Create persistent sandbox for the session
        self.sandbox = Sandbox.create(template="code-interpreter")

        try:
            # Plan
            plan = self._create_plan(goal)
            pending = list(plan.steps)
            replans = 0

            # Execute each step with code
            while pending:
                step = pending.pop(0)
                if self._execute_code_step(step):
                    continue

                # Replan on failure, then continue with the new plan's pending steps
                if replans >= self.max_replans:
                    raise RuntimeError(f"Step {step.id} failed after {replans} replans")
                plan = self._replan_from_failure(goal, plan, step)
                pending = [s for s in plan.steps if s.status == "pending"]
                replans += 1

            # Get final result
            return self._get_final_result(goal, plan)

        finally:
            self.sandbox.kill()

    def _execute_code_step(self, step: PlanStep) -> bool:
        # Generate code for this step
        code = self._generate_code(step)

        # Execute in sandbox
        self.sandbox.files.write("/app/step.py", code)
        result = self.sandbox.commands.run("python /app/step.py")

        # result/error are attached to the step dynamically for later inspection
        if result.exit_code == 0:
            step.status = "completed"
            step.result = result.stdout
            return True
        else:
            step.status = "failed"
            step.error = result.stderr
            return False

    def _generate_code(self, step: PlanStep) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Generate Python code to accomplish this step.
The code should:
- Be complete and runnable
- Print results to stdout
- Handle errors gracefully
- Save any outputs to files if needed"""
            }, {
                "role": "user",
                "content": f"Step: {step.description}"
            }]
        )

        return self._extract_code(response.choices[0].message.content)
```
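`_extract_code` is called above but not defined. Models often wrap code in a Markdown fence, so one plausible version strips that fence and falls back to the raw text. This is a sketch: the function name mirrors the call above, but the regex and the fallback behavior are assumptions.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can sit inside a fence

# Capture the body of the first fenced block, with an optional language tag.
_CODE_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response_text: str) -> str:
    """Return the first fenced code block, or the whole text if none is found."""
    match = _CODE_BLOCK.search(response_text)
    if match:
        return match.group(1).strip()
    return response_text.strip()


reply = f"Here you go:\n{FENCE}python\nprint('hello')\n{FENCE}\nDone."
print(extract_code(reply))
```

Taking only the first block is a deliberate simplification; a response with several blocks would need a policy for choosing or joining them.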
Real-World Example: Research Agent
Here's a complete planning agent that researches a topic:
```python
import json

import openai
from hopx import Sandbox


class ResearchAgent:
    def __init__(self):
        self.client = openai.OpenAI()

    def research(self, topic: str) -> dict:
        # Phase 1: Plan the research
        plan = self._plan_research(topic)
        print(f"Research plan: {len(plan)} steps")

        # Phase 2: Execute research steps
        findings = []
        for i, step in enumerate(plan):
            print(f"Step {i+1}/{len(plan)}: {step['action']}")

            # Dispatch on the action type chosen by the planner
            if step["action"] == "search":
                result = self._search(step["query"])
            elif step["action"] == "analyze":
                result = self._analyze(step["data"], step["question"])
            elif step["action"] == "synthesize":
                result = self._synthesize(step["findings"])
            else:
                result = {"error": f"Unknown action: {step['action']}"}

            findings.append({
                "step": step,
                "result": result
            })

        # Phase 3: Generate final report
        report = self._generate_report(topic, findings)

        return {
            "topic": topic,
            "plan": plan,
            "findings": findings,
            "report": report
        }

    def _plan_research(self, topic: str) -> list:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Plan a research investigation. Return JSON:
{
  "steps": [
    {"action": "search", "query": "search terms"},
    {"action": "analyze", "data": "what to analyze", "question": "what to find"},
    {"action": "synthesize", "findings": ["finding1", "finding2"]}
  ]
}

Available actions:
- search: Search for information
- analyze: Analyze data to answer a question
- synthesize: Combine findings into insights"""
            }, {
                "role": "user",
                "content": f"Research topic: {topic}"
            }],
            response_format={"type": "json_object"}
        )

        return json.loads(response.choices[0].message.content)["steps"]

    def _search(self, query: str) -> dict:
        # In production, use a real search API
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": f"What do you know about: {query}"
            }]
        )
        return {"query": query, "results": response.choices[0].message.content}

    def _analyze(self, data: str, question: str) -> dict:
        sandbox = Sandbox.create(template="code-interpreter")

        try:
            # Use code to analyze. {data!r} embeds each string as a Python
            # literal, so quotes inside the data cannot break the script
            analysis_code = f'''
import json

data = {data!r}
question = {question!r}

# Analyze the data
# This would be more sophisticated in production
analysis = {{
    "data_summary": data[:500],
    "question": question,
    "findings": "Analysis results would go here"
}}

print(json.dumps(analysis))
'''
            sandbox.files.write("/app/analyze.py", analysis_code)
            result = sandbox.commands.run("python /app/analyze.py")

            return json.loads(result.stdout)
        finally:
            sandbox.kill()

    def _synthesize(self, findings: list) -> dict:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": f"Synthesize these findings into key insights:\n{json.dumps(findings)}"
            }]
        )
        return {"synthesis": response.choices[0].message.content}

    def _generate_report(self, topic: str, findings: list) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Generate a well-structured research report."
            }, {
                "role": "user",
                "content": f"Topic: {topic}\n\nFindings:\n{json.dumps(findings, indent=2)}"
            }]
        )
        return response.choices[0].message.content


# Usage
agent = ResearchAgent()
result = agent.research("Current trends in AI agent architectures")
print(result["report"])
```
Best Practices
1. Right-Size Your Plans
```python
# ❌ Too granular - overhead exceeds benefit
plan = [
    "Open file",
    "Read first line",
    "Parse first field",
    "Convert to integer",
    ...  # 50 more steps
]

# ❌ Too coarse - steps are still too complex
plan = [
    "Build the entire backend",
    "Build the entire frontend"
]

# ✅ Just right - each step is meaningful but manageable
plan = [
    "Design database schema",
    "Implement user authentication",
    "Create REST API for products",
    "Build product listing page",
    "Add shopping cart functionality"
]
```
2. Include Verification Steps
```python
plan = [
    {"step": "Write user registration endpoint", "type": "action"},
    {"step": "Test registration with valid data", "type": "verify"},
    {"step": "Test registration with invalid data", "type": "verify"},
    {"step": "Write login endpoint", "type": "action"},
    {"step": "Test login flow", "type": "verify"},
]
```
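Verification steps only pay off if a failed check stops the run before later steps build on broken work. Here is a minimal sketch of that gating, assuming each step name maps to a callable that returns `True` on success. The `run_plan` function and the `handlers` mapping are hypothetical.

```python
from typing import Callable


def run_plan(plan: list[dict], handlers: dict[str, Callable[[], bool]]) -> list[str]:
    """Run steps in order; stop at the first verification that fails."""
    completed = []
    for item in plan:
        ok = handlers[item["step"]]()
        if item["type"] == "verify" and not ok:
            print(f"Verification failed: {item['step']}")
            break
        completed.append(item["step"])
    return completed


plan = [
    {"step": "Write user registration endpoint", "type": "action"},
    {"step": "Test registration with valid data", "type": "verify"},
    {"step": "Write login endpoint", "type": "action"},
]
handlers = {
    "Write user registration endpoint": lambda: True,
    "Test registration with valid data": lambda: False,  # simulate a failing test
    "Write login endpoint": lambda: True,
}
completed = run_plan(plan, handlers)
print(completed)
```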
3. Plan for Failure
```python
class RobustPlan:
    def __init__(self):
        self.steps = []
        self.fallbacks = {}  # step_id -> fallback_step

    def add_step(self, step: PlanStep, fallback: PlanStep | None = None):
        """Register a step, optionally with an alternative to try if it fails."""
        self.steps.append(step)
        if fallback:
            self.fallbacks[step.id] = fallback

    def get_fallback(self, failed_step_id: str) -> PlanStep | None:
        return self.fallbacks.get(failed_step_id)
```
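How the fallbacks get used is left open above. A minimal sketch of an executor that tries the primary step and switches to its fallback on failure follows. Steps are modeled as plain callables and `run_with_fallbacks` is a hypothetical helper, so this is a simplification of `RobustPlan`, not an extension of it.

```python
from typing import Callable

Step = Callable[[], str]


def run_with_fallbacks(
    steps: dict[str, Step], fallbacks: dict[str, Step]
) -> dict[str, str]:
    """Run steps in insertion order, trying the fallback when a step raises."""
    results = {}
    for step_id, step in steps.items():
        try:
            results[step_id] = step()
        except Exception as primary_error:
            fallback = fallbacks.get(step_id)
            if fallback is None:
                raise  # no alternative: surface the original failure
            print(f"{step_id} failed ({primary_error}); trying fallback")
            results[step_id] = fallback()
    return results


def fetch_from_api() -> str:
    raise ConnectionError("API unreachable")


def fetch_from_cache() -> str:
    return "cached data"


results = run_with_fallbacks(
    steps={"fetch": fetch_from_api, "summarize": lambda: "summary"},
    fallbacks={"fetch": fetch_from_cache},
)
print(results)
```

Static fallbacks are cheaper than a full replan because they need no extra model call, but they only cover failures anticipated at planning time.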
4. Show Progress
```python
def execute_with_progress(plan: Plan, callback):
    total = len(plan.steps)

    for i, step in enumerate(plan.steps):
        callback({
            "step": i + 1,
            "total": total,
            "percent": i / total * 100,  # only the steps before this one are done
            "description": step.description,
            "status": "in_progress"
        })

        # execute_step is assumed to run the step and return a string
        result = execute_step(step)

        callback({
            "step": i + 1,
            "total": total,
            "percent": (i + 1) / total * 100,
            "description": step.description,
            "status": "completed",
            "result_preview": result[:100]
        })
```
When NOT to Use Planning
Planning adds overhead. Skip it when:
| Scenario | Why Skip Planning |
|---|---|
| Simple, single-step tasks | "What's 2+2?" doesn't need a plan |
| Real-time responses needed | Planning adds latency |
| Highly unpredictable tasks | Plan will be wrong anyway |
| Exploratory/creative work | Structure can limit creativity |
Planning + Other Patterns
Planning combines powerfully with other agentic patterns:
Planning + Reflection
```python
def plan_with_reflection(goal: str) -> Plan:
    # create_plan, reflect_on_plan, and improve_plan stand in for LLM-backed
    # helpers; critique is assumed to expose an `approved` flag

    # Generate initial plan
    plan = create_plan(goal)

    # Reflect on the plan
    critique = reflect_on_plan(plan)

    # Improve if needed
    if not critique.approved:
        plan = improve_plan(plan, critique)

    return plan
```
Planning + Tool Use
```python
def execute_planned_step(step: PlanStep, tools: dict):
    # Assumes a PlanStep extended with `parameters` and an `add_result` method,
    # and tool objects that expose `execute`

    # Determine which tools are needed
    required_tools = identify_tools(step)

    # Execute with tools
    for tool_name in required_tools:
        result = tools[tool_name].execute(step.parameters)
        step.add_result(tool_name, result)
```
Planning + Multi-Agent
```python
async def distributed_plan_execution(plan: Plan, agents: dict):
    # Assumes each step carries an `agent_type` and that gather_results is a
    # coroutine that waits for every agent's queue to drain

    # Assign steps to specialized agents
    for step in plan.steps:
        agent = agents[step.agent_type]
        agent.queue_step(step)

    # Execute in parallel where possible
    results = await gather_results(agents)
    return results
```
Conclusion
Planning is the foundation of reliable AI agents:
- Decompose complex goals into manageable steps
- Order steps by dependencies
- Track progress through execution
- Adapt when things don't go as expected
Start with linear planning for simple tasks. Add DAG planning when you need parallelism. Use hierarchical planning for complex, multi-faceted goals. Always build in the ability to replan.
For complex, multi-step goals, the agent that plans beats the agent that doesn't.