
Prompt Chaining: How to Build Sequential AI Workflows

AI Agents · Alin Dobra · 14 min read

You've hit the wall. Your single prompt is getting longer, more complex, and increasingly unreliable. The LLM sometimes nails it, sometimes completely misses. Sound familiar?

Prompt chaining is the solution: break your mega-prompt into smaller, focused steps where each LLM call does one thing well.

This guide shows you how to build reliable prompt chains, when to use them, and how to avoid the common pitfalls that trip up most developers.

What Is Prompt Chaining?

Prompt chaining connects multiple LLM calls in sequence. The output of one prompt becomes the input for the next:

text
             ┌───────────┐     ┌─────────────┐     ┌──────────┐
 Raw Data ─▶ │ Prompt 1  │ ──▶ │  Prompt 2   │ ──▶ │ Prompt 3 │ ─▶ Final Output
             │ Extract   │     │  Transform  │     │ Format   │
             └───────────┘     └─────────────┘     └──────────┘
                              (Structured Data flows between steps)

Instead of asking the LLM to do everything at once:

text
"Read this document, extract the key points, translate them to Spanish,
summarize each point, and format as a newsletter"

You break it into steps:

text
Step 1: "Extract key points from this document"
Step 2: "Translate these points to Spanish"
Step 3: "Summarize each point in one sentence"
Step 4: "Format these summaries as a newsletter"

Each step is simpler, more reliable, and easier to debug.
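
To see the mechanics, here's a minimal sketch of those four steps as chained calls (the ask helper is illustrative, assuming the OpenAI Python client):

python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    # One focused LLM call per step (illustrative helper)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

document = "..."  # your source document

points = ask(f"Extract key points from this document:\n\n{document}")
spanish = ask(f"Translate these points to Spanish:\n\n{points}")
summaries = ask(f"Summarize each point in one sentence:\n\n{spanish}")
newsletter = ask(f"Format these summaries as a newsletter:\n\n{summaries}")

Every intermediate variable is an inspection point: if the newsletter looks wrong, check points, spanish, and summaries to see exactly where things went sideways.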

Why Prompt Chaining Works

1. Reduced Cognitive Load

LLMs perform better on focused tasks. A prompt that does one thing well consistently outperforms a prompt trying to juggle five things.

Research insight: LLM accuracy drops measurably as the number of subtasks packed into a single prompt grows. Decomposing a five-step task into five focused prompts is commonly reported to improve end-to-end accuracy by 20-40%.

2. Debuggability

When something goes wrong in a monolithic prompt, good luck figuring out where. With chains, you can inspect each intermediate output:

python
# Easy to debug
step1_output = extract_entities(document)      # Check: Are entities correct?
step2_output = classify_entities(step1_output) # Check: Are classifications correct?
step3_output = generate_summary(step2_output)  # Check: Is summary accurate?

3. Reusability

Chain steps become building blocks. Your "translate to Spanish" step works in any pipeline:

python
# Reuse across different workflows
translate_step = TranslatePrompt(target_language="Spanish")

workflow_a = Chain([extract, translate_step, summarize])
workflow_b = Chain([user_input, translate_step, respond])

4. Cost Optimization

You can use smaller, cheaper models for simpler steps and reserve expensive models for complex reasoning:

python
chain = [
    Step("Extract dates", model="gpt-3.5-turbo"),       # Simple extraction: cheap model
    Step("Parse to ISO format", model="gpt-3.5-turbo"), # Formatting: cheap model
    Step("Analyze timeline", model="gpt-4o"),           # Complex reasoning: powerful model
]

Basic Prompt Chain Implementation

Here's a minimal but complete implementation:

python
import openai
from dataclasses import dataclass

@dataclass
class ChainStep:
    name: str
    prompt_template: str
    model: str = "gpt-4o"

class PromptChain:
    def __init__(self, steps: list[ChainStep]):
        self.steps = steps
        self.client = openai.OpenAI()
        self.trace = []  # For debugging

    def run(self, initial_input: str) -> str:
        current_input = initial_input

        for step in self.steps:
            # Format prompt with current input
            prompt = step.prompt_template.format(input=current_input)

            # Call LLM
            response = self.client.chat.completions.create(
                model=step.model,
                messages=[{"role": "user", "content": prompt}]
            )

            output = response.choices[0].message.content

            # Save trace for debugging
            self.trace.append({
                "step": step.name,
                "input": current_input[:200],  # Truncate for readability
                "output": output[:200]
            })

            # Output becomes next input
            current_input = output

        return current_input

    def debug(self):
        """Print execution trace"""
        for i, step in enumerate(self.trace):
            print(f"\n{'='*50}")
            print(f"Step {i+1}: {step['step']}")
            print(f"Input: {step['input']}...")
            print(f"Output: {step['output']}...")


# Usage
chain = PromptChain([
    ChainStep(
        name="Extract",
        prompt_template="Extract all person names from this text:\n\n{input}"
    ),
    ChainStep(
        name="Deduplicate",
        prompt_template="Remove duplicates from this list of names:\n\n{input}"
    ),
    ChainStep(
        name="Format",
        prompt_template="Format these names as a numbered list:\n\n{input}"
    )
])

result = chain.run("John met Sarah at the coffee shop. Sarah introduced John to Mike...")
print(result)
chain.debug()  # See what happened at each step

Output:

text
1. John
2. Sarah
3. Mike

==================================================
Step 1: Extract
Input: John met Sarah at the coffee shop. Sarah introduced John to Mike...
Output: John, Sarah, John, Mike, Sarah...

==================================================
Step 2: Deduplicate
Input: John, Sarah, John, Mike, Sarah...
Output: John, Sarah, Mike...

==================================================
Step 3: Format
Input: John, Sarah, Mike...
Output: 1. John
2. Sarah
3. Mike...

Real-World Example: Document Processing Pipeline

Let's build a practical document processing chain that:

  1. Extracts key information
  2. Validates the extraction
  3. Transforms to structured data
  4. Generates a summary
python
from hopx import Sandbox
import openai
import json

class DocumentProcessor:
    def __init__(self):
        self.client = openai.OpenAI()

    def process(self, document: str) -> dict:
        # Step 1: Extract key information
        extracted = self._extract(document)

        # Step 2: Validate extraction (with code execution)
        validated = self._validate(extracted)

        # Step 3: Structure the data
        structured = self._structure(validated)

        # Step 4: Generate summary
        summary = self._summarize(structured)

        return {
            "extracted": extracted,
            "validated": validated,
            "structured": structured,
            "summary": summary
        }

    def _extract(self, document: str) -> str:
        """Step 1: Extract key entities and facts"""
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Extract the following from the document:
                - People mentioned (with roles)
                - Dates and deadlines
                - Action items
                - Key decisions

                Format as a structured list."""
            }, {
                "role": "user",
                "content": document
            }]
        )
        return response.choices[0].message.content

    def _validate(self, extracted: str) -> str:
        """Step 2: Validate with code execution"""
        sandbox = Sandbox.create(template="code-interpreter")

        try:
            # Use code to validate dates, check for inconsistencies
            # (raw f-string so the regex backslashes survive intact)
            validation_code = rf'''
import re
from datetime import datetime

text = """{extracted}"""

# Find all dates
date_patterns = [
    r'\d{{1,2}}/\d{{1,2}}/\d{{4}}',
    r'\d{{4}}-\d{{2}}-\d{{2}}',
    r'(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{{1,2}},?\s+\d{{4}}'
]

dates_found = []
for pattern in date_patterns:
    dates_found.extend(re.findall(pattern, text))

# Check for potential issues
issues = []
if len(dates_found) == 0:
    issues.append("No dates found - verify manually")

# Output validation result
print("VALIDATION RESULT")
print(f"Dates found: {{dates_found}}")
print(f"Issues: {{issues if issues else 'None'}}")
print("---")
print(text)
'''

            sandbox.files.write("/app/validate.py", validation_code)
            result = sandbox.commands.run("python /app/validate.py")

            return result.stdout
        finally:
            sandbox.kill()

    def _structure(self, validated: str) -> dict:
        """Step 3: Convert to structured JSON"""
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Convert this information to JSON with the schema:
                {
                    "people": [{"name": "", "role": ""}],
                    "dates": [{"date": "", "event": ""}],
                    "action_items": [{"task": "", "owner": "", "due": ""}],
                    "decisions": [""]
                }"""
            }, {
                "role": "user",
                "content": validated
            }],
            response_format={"type": "json_object"}
        )
        return json.loads(response.choices[0].message.content)

    def _summarize(self, structured: dict) -> str:
        """Step 4: Generate executive summary"""
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Write a 2-3 sentence executive summary of this meeting/document."
            }, {
                "role": "user",
                "content": json.dumps(structured, indent=2)
            }]
        )
        return response.choices[0].message.content


# Usage
processor = DocumentProcessor()
result = processor.process("""
Meeting Notes - Product Launch Planning
Date: January 15, 2025

Attendees: Sarah Chen (PM), Mike Johnson (Engineering Lead), Lisa Park (Marketing)

Discussion:
Sarah presented the launch timeline. Target launch date is March 1, 2025.
Mike raised concerns about the API stability - needs 2 more weeks of testing.
Lisa confirmed marketing materials will be ready by February 15.

Decisions:
- Soft launch to beta users on February 20
- Full public launch on March 1
- Mike to own the stability testing

Action Items:
- Mike: Complete API load testing by February 1
- Lisa: Finalize press release by February 10
- Sarah: Coordinate with sales team by January 20
""")

print(json.dumps(result, indent=2))

Prompt Chaining Patterns

Pattern 1: Linear Chain

The simplest pattern—each step feeds into the next:

text
Input → [A] → [B] → [C] → Output

python
def linear_chain(text):
    extracted = extract(text)
    translated = translate(extracted)
    formatted = format_output(translated)
    return formatted

Best for: Sequential transformations, document processing, data pipelines.

Pattern 2: Branching Chain

Different paths based on intermediate results:

text
              ┌→ [B1] ─┐
Input → [A] ──┤        ├──→ [D] → Output
              └→ [B2] ─┘

python
def branching_chain(text):
    classification = classify(text)

    if classification == "technical":
        processed = technical_processor(text)
    else:
        processed = general_processor(text)

    return finalize(processed)

Best for: Content routing, specialized processing, conditional logic.

Pattern 3: Parallel Chain

Multiple independent steps that merge:

text
           ┌→ [A] ─┐
Input ─────┼→ [B] ─┼──→ [Merge] → Output
           └→ [C] ─┘

python
import concurrent.futures

def parallel_chain(text):
    with concurrent.futures.ThreadPoolExecutor() as executor:
        future_summary = executor.submit(summarize, text)
        future_entities = executor.submit(extract_entities, text)
        future_sentiment = executor.submit(analyze_sentiment, text)

        summary = future_summary.result()
        entities = future_entities.result()
        sentiment = future_sentiment.result()

    return merge_results(summary, entities, sentiment)

Best for: Independent analyses, multi-perspective processing, speed optimization.

Pattern 4: Iterative Chain (Loop)

Repeat until a condition is met:

text
Input → [Process] → [Check] ──(done)──→ Output
             ▲          │
             └(not done)┘

python
def iterative_chain(text, max_iterations=5):
    current = text

    for i in range(max_iterations):
        # Process
        improved = improve(current)

        # Check if good enough
        score = evaluate(improved)
        if score > 0.9:
            return improved

        current = improved

    return current

Best for: Refinement tasks, quality improvement, self-correction.

Pattern 5: Fallback Chain

Try multiple approaches, use first success:

text
Input → [A] ──(fail)──→ [B] ──(fail)──→ [C]
         │ (success)     │ (success)     │ (success)
         ▼               ▼               ▼
       Output          Output          Output

python
def fallback_chain(text):
    strategies = [
        ("precise", precise_extract),
        ("fuzzy", fuzzy_extract),
        ("llm_only", llm_extract)
    ]

    for name, strategy in strategies:
        try:
            result = strategy(text)
            if validate(result):
                return result
        except Exception as e:
            print(f"{name} failed: {e}")
            continue

    raise ValueError("All strategies failed")

Best for: Robust systems, graceful degradation, handling edge cases.

Adding Code Execution to Chains

Many chain steps benefit from actual code execution—not just LLM reasoning. This is where sandboxed execution becomes essential:

python
from hopx import Sandbox
import openai

class CodeAugmentedChain:
    def __init__(self):
        self.client = openai.OpenAI()

    def analyze_data(self, data_description: str, question: str) -> dict:
        """
        Chain:
        1. LLM generates analysis code
        2. Code executes in sandbox
        3. LLM interprets results
        """

        # Step 1: Generate analysis code
        code = self._generate_code(data_description, question)

        # Step 2: Execute in sandbox
        execution_result = self._execute_code(code)

        # Step 3: Interpret results
        interpretation = self._interpret_results(question, execution_result)

        return {
            "code": code,
            "raw_output": execution_result,
            "interpretation": interpretation
        }

    def _generate_code(self, data_description: str, question: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": """Generate Python code to analyze data and answer the question.
                Use pandas for data manipulation.
                Print results clearly.
                Do not use plt.show() - save plots to files instead."""
            }, {
                "role": "user",
                "content": f"Data: {data_description}\n\nQuestion: {question}"
            }]
        )

        # Extract code from response
        content = response.choices[0].message.content
        if "```python" in content:
            code = content.split("```python")[1].split("```")[0]
        else:
            code = content

        return code.strip()

    def _execute_code(self, code: str) -> str:
        sandbox = Sandbox.create(template="code-interpreter")

        try:
            # Install required packages
            sandbox.commands.run("pip install pandas numpy -q")

            # Write and execute code
            sandbox.files.write("/app/analysis.py", code)
            result = sandbox.commands.run("python /app/analysis.py")

            if result.exit_code != 0:
                return f"ERROR:\n{result.stderr}"

            return result.stdout

        finally:
            sandbox.kill()

    def _interpret_results(self, question: str, raw_output: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": "Interpret these analysis results in plain English. Be specific and cite numbers."
            }, {
                "role": "user",
                "content": f"Question: {question}\n\nAnalysis Output:\n{raw_output}"
            }]
        )
        return response.choices[0].message.content


# Usage
chain = CodeAugmentedChain()
result = chain.analyze_data(
    data_description="CSV file at /app/sales.csv with columns: date, product, revenue, units_sold",
    question="What was the best-selling product in Q4 2024?"
)

Error Handling in Chains

Chains fail. Here's how to handle it gracefully:

python
from dataclasses import dataclass
from typing import Optional
import time
import traceback

@dataclass
class ChainResult:
    success: bool
    output: Optional[str]
    failed_step: Optional[str]
    error: Optional[str]
    partial_results: dict

class RobustChain:
    def __init__(self, steps: list):
        self.steps = steps

    def run(self, initial_input: str) -> ChainResult:
        current_input = initial_input
        partial_results = {}

        for step in self.steps:
            try:
                output = step.execute(current_input)
                partial_results[step.name] = output
                current_input = output

            except Exception as e:
                return ChainResult(
                    success=False,
                    output=None,
                    failed_step=step.name,
                    error=f"{type(e).__name__}: {str(e)}\n{traceback.format_exc()}",
                    partial_results=partial_results
                )

        return ChainResult(
            success=True,
            output=current_input,
            failed_step=None,
            error=None,
            partial_results=partial_results
        )


# With retry logic
class RetryableChain(RobustChain):
    def run(self, initial_input: str, max_retries: int = 3) -> ChainResult:
        current_input = initial_input
        partial_results = {}

        for step in self.steps:
            for attempt in range(max_retries):
                try:
                    output = step.execute(current_input)
                    partial_results[step.name] = output
                    current_input = output
                    break  # Success, move to next step

                except Exception as e:
                    if attempt == max_retries - 1:
                        return ChainResult(
                            success=False,
                            output=None,
                            failed_step=step.name,
                            error=str(e),
                            partial_results=partial_results
                        )
                    # Wait before retry (exponential backoff)
                    time.sleep(2 ** attempt)

        return ChainResult(
            success=True,
            output=current_input,
            failed_step=None,
            error=None,
            partial_results=partial_results
        )

When NOT to Use Prompt Chaining

Chaining isn't always the answer. Avoid it when:

Scenario                           Why Chaining Hurts            Better Alternative
Simple, single-step task           Unnecessary complexity        Single prompt
Highly interdependent reasoning    Context loss between steps    Long-context model
Real-time latency requirements     Each step adds latency        Cached/precomputed
Very short inputs                  Overhead exceeds benefit      Single prompt
Exploratory/creative tasks         Structure kills creativity    Open-ended prompt

Signs You're Over-Chaining

  • Each step is trivial (could be done with string formatting)
  • You're passing the same context through every step
  • The chain is slower than a single smart prompt
  • Steps are so coupled they always fail/succeed together

Performance Optimization

1. Parallelize Independent Steps

python
import asyncio

async def optimized_chain(text):
    # These can run in parallel
    summary_task = asyncio.create_task(summarize(text))
    entities_task = asyncio.create_task(extract_entities(text))

    summary, entities = await asyncio.gather(summary_task, entities_task)

    # This depends on previous results
    final = await generate_report(summary, entities)

    return final

2. Use Smaller Models for Simple Steps

python
steps = [
    Step("Format cleanup", model="gpt-3.5-turbo"),      # Simple
    Step("Entity extraction", model="gpt-3.5-turbo"),   # Pattern matching
    Step("Complex reasoning", model="gpt-4o"),          # Needs power
    Step("Final formatting", model="gpt-3.5-turbo"),    # Simple
]
# Cost: ~60% less than using gpt-4o for everything

3. Cache Repeated Steps

python
import hashlib

# Simple in-memory cache: (input hash, step name) -> output
_step_cache: dict[tuple[str, str], str] = {}

def chain_with_cache(text):
    # Key the cache on a hash of the input plus the step name
    key = (hashlib.md5(text.encode()).hexdigest(), "extract")

    # Check cache first
    if key in _step_cache:
        return _step_cache[key]

    # Process and cache the result
    result = extract(text)
    _step_cache[key] = result
    return result

4. Stream Long Chains

python
async def streaming_chain(text):
    """Yield results as each step completes"""

    yield {"step": "extract", "status": "starting"}
    extracted = await extract(text)
    yield {"step": "extract", "status": "complete", "preview": extracted[:100]}

    yield {"step": "transform", "status": "starting"}
    transformed = await transform(extracted)
    yield {"step": "transform", "status": "complete", "preview": transformed[:100]}

    yield {"step": "format", "status": "starting"}
    final = await format_output(transformed)
    yield {"step": "format", "status": "complete", "result": final}

Prompt Chaining vs. Agent Loops

Don't confuse chaining with agentic systems:

Prompt Chaining               Agent Loops
Fixed sequence of steps       Dynamic, decides next step
Predictable execution path    Unpredictable path
Faster, cheaper               More flexible, expensive
Easier to debug               Harder to debug
Best for known workflows      Best for open-ended tasks

Use chaining when you know the steps upfront.
Use agents when the LLM needs to figure out the steps.

Many production systems combine both: an agent that decides what to do, then triggers chains to do it.
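
Here's a minimal sketch of that hybrid (the router prompt and CHAINS registry are illustrative; extract, summarize, translate, and format_output stand in for the step functions from the pattern examples above):

python
from openai import OpenAI

client = OpenAI()

# Known workflows, each a fixed chain (step functions assumed defined elsewhere)
CHAINS = {
    "summarize_document": lambda text: format_output(summarize(extract(text))),
    "translate_document": lambda text: format_output(translate(extract(text))),
}

def route_and_run(user_request: str, document: str) -> str:
    # Agent step: the LLM decides WHICH chain to run
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Pick the best workflow for this request. "
                       f"Options: {', '.join(CHAINS)}. "
                       f"Request: {user_request}\n"
                       f"Reply with the workflow name only."
        }]
    )
    choice = response.choices[0].message.content.strip()

    # Chain step: a fixed, predictable pipeline does the actual work
    workflow = CHAINS.get(choice, CHAINS["summarize_document"])
    return workflow(document)

The routing decision is the only unpredictable part; everything downstream stays in debuggable chain territory.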

Building Your First Chain: Quickstart

python
# Install
# pip install openai hopx

from openai import OpenAI

client = OpenAI()

def chain_step(prompt: str, input_text: str, model: str = "gpt-4o") -> str:
    """Single chain step"""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{prompt}\n\nInput:\n{input_text}"}]
    )
    return response.choices[0].message.content

# Your first chain
text = "The quick brown fox jumps over the lazy dog. This is a sample text."

step1 = chain_step("Count the words in this text", text)
step2 = chain_step("Is this count correct? Verify.", step1)
step3 = chain_step("Summarize your findings in one sentence.", step2)

print(step3)

Once you're comfortable, add:

  1. Error handling
  2. Logging/tracing (see the sketch below)
  3. Parallel execution
  4. Code execution with sandboxes
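
For example, here's a minimal sketch of step 2, wrapping the chain_step helper from the quickstart with Python's standard logging module (the logger name and truncation lengths are arbitrary choices):

python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("prompt_chain")

def traced_step(name: str, prompt: str, input_text: str) -> str:
    # Log timing plus truncated input/output for every step
    start = time.time()
    logger.info("step=%s starting, input=%r", name, input_text[:100])
    output = chain_step(prompt, input_text)
    logger.info("step=%s done in %.1fs, output=%r",
                name, time.time() - start, output[:100])
    return output

# Drop-in replacement for the bare chain_step calls above
step1 = traced_step("count", "Count the words in this text", text)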

Conclusion

Prompt chaining transforms unreliable mega-prompts into robust, debuggable pipelines:

  • Break complex tasks into focused steps
  • Debug easily by inspecting intermediate outputs
  • Optimize costs by using right-sized models per step
  • Build reusable components for multiple workflows

Start simple—a 2-3 step chain. Add complexity only when needed.

The best chains feel invisible: they just work, every time.


Ready to add code execution to your chains? Get started with HopX — sandboxes that spin up in 100ms.
