Guidance
Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance, Microsoft Research's constrained generation framework.
Quick Start (3 Steps)
Get up and running in minutes
Install
```bash
claude-code skill install guidance
```
Config
First Trigger
```
@guidance help
```
Commands
| Command | Description | Required Args |
|---|---|---|
| @guidance guaranteed-json-generation | Generate valid JSON objects with enforced field formats using regex constraints | None |
| @guidance multi-step-reasoning-agent | Build ReAct-style agents with tool selection and structured thought processes | None |
| @guidance data-extraction-with-format-validation | Extract structured entities from text with guaranteed format compliance | None |
Typical Use Cases
Guaranteed JSON Generation
Generate valid JSON objects with enforced field formats using regex constraints
Multi-Step Reasoning Agent
Build ReAct-style agents with tool selection and structured thought processes
Data Extraction with Format Validation
Extract structured entities from text with guaranteed format compliance
Overview
Guidance: Constrained LLM Generation
When to Use This Skill
Use Guidance when you need to:
- Control LLM output syntax with regex or grammars
- Guarantee valid JSON/XML/code generation
- Reduce latency vs traditional prompting approaches
- Enforce structured formats (dates, emails, IDs, etc.)
- Build multi-step workflows with Pythonic control flow
- Prevent invalid outputs through grammatical constraints
GitHub Stars: 18,000+ | From: Microsoft Research
Installation
```bash
# Base installation
pip install guidance

# With specific backends
pip install guidance[transformers]  # Hugging Face models
pip install guidance[llama_cpp]     # llama.cpp models
```
Quick Start
Basic Example: Structured Generation
```python
from guidance import models, gen

# Load model (supports OpenAI, Transformers, llama.cpp)
lm = models.OpenAI("gpt-4")

# Generate with constraints
result = lm + "The capital of France is " + gen("capital", max_tokens=5)

print(result["capital"])  # "Paris"
```
With Anthropic Claude
```python
from guidance import models, gen, system, user, assistant

# Configure Claude
lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Use context managers for chat format
with system():
    lm += "You are a helpful assistant."

with user():
    lm += "What is the capital of France?"

with assistant():
    lm += gen(max_tokens=20)
```
Core Concepts
1. Context Managers
Guidance uses Pythonic context managers for chat-style interactions.
```python
from guidance import models, system, user, assistant, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# System message
with system():
    lm += "You are a JSON generation expert."

# User message
with user():
    lm += "Generate a person object with name and age."

# Assistant response
with assistant():
    lm += gen("response", max_tokens=100)

print(lm["response"])
```
Benefits:
- Natural chat flow
- Clear role separation
- Easy to read and maintain
2. Constrained Generation
Guidance ensures outputs match specified patterns using regex or grammars.
Regex Constraints
```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to valid email format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Constrain to date format (YYYY-MM-DD)
lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}")

# Constrain to phone number
lm += "Phone: " + gen("phone", regex=r"\d{3}-\d{3}-\d{4}")

print(lm["email"])  # Guaranteed valid email
print(lm["date"])   # Guaranteed YYYY-MM-DD format
```
How it works:
- The regex is compiled into a grammar that operates at the token level
- Tokens that would violate the pattern are masked out during decoding
- The model can only produce outputs that match the pattern

Note that full token-level enforcement requires a backend that exposes logits (e.g. Transformers or llama.cpp); remote API backends may support only a subset of constraints.
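As a toy illustration of the idea (a simplified sketch of the mechanism, not Guidance's actual implementation), here is token filtering for the pattern `\d{4}` over a tiny made-up vocabulary:

```python
# Toy sketch of constrained decoding for the pattern \d{4} (a 4-digit
# year). This illustrates the idea only; Guidance's internals differ.
def is_viable_prefix(text: str) -> bool:
    # A viable prefix of \d{4} is the empty string or up to 4 digits
    return text == "" or (len(text) <= 4 and text.isdigit())

def filter_tokens(generated: str, vocab: list[str]) -> list[str]:
    # Keep only tokens that leave the output extendable to a full match
    return [tok for tok in vocab if is_viable_prefix(generated + tok)]

vocab = ["19", "20", "7", "Paris", " ", "abc"]
print(filter_tokens("", vocab))     # ['19', '20', '7']
print(filter_tokens("202", vocab))  # ['7']
```

Because filtering happens before sampling, the model never spends a token on an output that cannot match.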
Selection Constraints
```python
from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to specific choices
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")

# Multiple-choice selection
lm += "Best answer: " + select(
    ["A) Paris", "B) London", "C) Berlin", "D) Madrid"],
    name="answer"
)

print(lm["sentiment"])  # One of: positive, negative, neutral
print(lm["answer"])     # One of the four option strings, e.g. "A) Paris"
```
3. Token Healing
Guidance automatically “heals” token boundaries between prompt and generation.
Problem: Tokenization creates unnatural boundaries.
```python
# Without token healing
prompt = "The capital of France is "
# Last token: " is "
# First generated token might be " Par" (with leading space)
# Result: "The capital of France is  Paris" (double space!)
```
Solution: Guidance backs up one token and regenerates.
```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Token healing enabled by default
lm += "The capital of France is " + gen("capital", max_tokens=5)
# Result: "The capital of France is Paris" (correct spacing)
```
Benefits:
- Natural text boundaries
- No awkward spacing issues
- Better model performance (sees natural token sequences)
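The boundary trick can be shown with a toy vocabulary (again a simplified model of the mechanism, not Guidance's code):

```python
# Toy illustration of token healing: drop the prompt's final token and
# constrain regeneration to tokens that extend the removed text, so the
# boundary is re-tokenized naturally and no double space can appear.
VOCAB = [" is", " is the", " Paris", "Paris", " Par"]

def heal_boundary(prompt_tokens: list[str]) -> tuple[list[str], list[str]]:
    *kept, last = prompt_tokens
    # Only tokens that begin with the removed boundary text are allowed
    allowed = [tok for tok in VOCAB if tok.startswith(last)]
    return kept, allowed

kept, allowed = heal_boundary([" The capital of France", " is"])
print(allowed)  # [' is', ' is the']
```

The model then continues from the healed boundary, free to pick the tokenization it would naturally have used.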
4. Grammar-Based Generation
Define complex structures using context-free grammars.
```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Build a JSON structure by composing literal text with constrained
# sub-grammars; the combined grammar guarantees a valid JSON object
lm += (
    '{\n'
    '  "name": "' + gen("name", regex=r"[A-Za-z ]+", max_tokens=20) + '",\n'
    '  "age": ' + gen("age", regex=r"[0-9]+", max_tokens=3) + ',\n'
    '  "email": "' + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", max_tokens=50) + '"\n'
    '}'
)

print(str(lm))  # Guaranteed valid JSON structure
```
Use cases:
- Complex structured outputs
- Nested data structures
- Programming language syntax
- Domain-specific languages
5. Guidance Functions
Create reusable generation patterns with the @guidance decorator.
```python
from guidance import guidance, gen, models

@guidance
def generate_person(lm):
    """Generate a person with name and age."""
    lm += "Name: " + gen("name", max_tokens=20, stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+", max_tokens=3)
    return lm

# Use the function
lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = generate_person(lm)

print(lm["name"])
print(lm["age"])
```
Stateful Functions:
```python
from guidance import guidance, gen, select

@guidance(stateless=False)
def react_agent(lm, question, tools, max_rounds=5):
    """ReAct agent with tool use."""
    lm += f"Question: {question}\n\n"

    for i in range(max_rounds):
        # Thought
        lm += f"Thought {i+1}: " + gen("thought", stop="\n")

        # Action
        lm += "\nAction: " + select(list(tools.keys()), name="action")

        # Execute tool
        tool_result = tools[lm["action"]]()
        lm += f"\nObservation: {tool_result}\n\n"

        # Check if done
        lm += "Done? " + select(["Yes", "No"], name="done")
        if lm["done"] == "Yes":
            break

    # Final answer
    lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
    return lm
```
Backend Configuration
Anthropic Claude
```python
from guidance import models

lm = models.Anthropic(
    model="claude-sonnet-4-5-20250929",
    api_key="your-api-key"  # Or set ANTHROPIC_API_KEY env var
)
```
OpenAI
```python
from guidance import models

lm = models.OpenAI(
    model="gpt-4o-mini",
    api_key="your-api-key"  # Or set OPENAI_API_KEY env var
)
```
Local Models (Transformers)
```python
from guidance.models import Transformers

lm = Transformers(
    "microsoft/Phi-4-mini-instruct",
    device="cuda"  # Or "cpu"
)
```
Local Models (llama.cpp)
```python
from guidance.models import LlamaCpp

lm = LlamaCpp(
    model_path="/path/to/model.gguf",
    n_ctx=4096,
    n_gpu_layers=35
)
```
Common Patterns
Pattern 1: JSON Generation
```python
from guidance import models, gen, system, user, assistant

lm = models.Anthropic("claude-sonnet-4-5-20250929")

with system():
    lm += "You generate valid JSON."

with user():
    lm += "Generate a user profile with name, age, and email."

with assistant():
    lm += """{
    "name": """ + gen("name", regex=r'"[A-Za-z ]+"', max_tokens=30) + """,
    "age": """ + gen("age", regex=r"[0-9]+", max_tokens=3) + """,
    "email": """ + gen("email", regex=r'"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"', max_tokens=50) + """
}"""

print(lm)  # Valid JSON guaranteed
```
Pattern 2: Classification
```python
from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

text = "This product is amazing! I love it."

lm += f"Text: {text}\n"
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")
lm += "\nConfidence: " + gen("confidence", regex=r"[0-9]+", max_tokens=3) + "%"

print(f"Sentiment: {lm['sentiment']}")
print(f"Confidence: {lm['confidence']}%")
```
Pattern 3: Multi-Step Reasoning
```python
from guidance import models, gen, guidance

@guidance
def chain_of_thought(lm, question):
    """Generate answer with step-by-step reasoning."""
    lm += f"Question: {question}\n\n"

    # Generate multiple reasoning steps
    for i in range(3):
        lm += f"Step {i+1}: " + gen(f"step_{i+1}", stop="\n", max_tokens=100) + "\n"

    # Final answer
    lm += "\nTherefore, the answer is: " + gen("answer", max_tokens=50)

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = chain_of_thought(lm, "What is 15% of 200?")

print(lm["answer"])
```
Pattern 4: ReAct Agent
```python
from guidance import models, gen, select, guidance

@guidance(stateless=False)
def react_agent(lm, question):
    """ReAct agent with tool use."""
    tools = {
        "calculator": lambda expr: eval(expr),  # Demo only: never eval untrusted input
        "search": lambda query: f"Search results for: {query}",
    }

    lm += f"Question: {question}\n\n"

    for _ in range(5):
        # Thought
        lm += "Thought: " + gen("thought", stop="\n") + "\n"

        # Action selection
        lm += "Action: " + select(["calculator", "search", "answer"], name="action")

        if lm["action"] == "answer":
            lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
            break

        # Action input
        lm += "\nAction Input: " + gen("action_input", stop="\n") + "\n"

        # Execute tool
        if lm["action"] in tools:
            result = tools[lm["action"]](lm["action_input"])
            lm += f"Observation: {result}\n\n"

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = react_agent(lm, "What is 25 * 4 + 10?")
print(lm["answer"])
```
Pattern 5: Data Extraction
```python
from guidance import models, gen, guidance

@guidance
def extract_entities(lm, text):
    """Extract structured entities from text."""
    lm += f"Text: {text}\n\n"

    # Extract person
    lm += "Person: " + gen("person", stop="\n", max_tokens=30) + "\n"

    # Extract organization
    lm += "Organization: " + gen("organization", stop="\n", max_tokens=30) + "\n"

    # Extract date
    lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}", max_tokens=10) + "\n"

    # Extract location
    lm += "Location: " + gen("location", stop="\n", max_tokens=30) + "\n"

    return lm

text = "Tim Cook announced at Apple Park on 2024-09-15 in Cupertino."

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = extract_entities(lm, text)

print(f"Person: {lm['person']}")
print(f"Organization: {lm['organization']}")
print(f"Date: {lm['date']}")
print(f"Location: {lm['location']}")
```
Best Practices
1. Use Regex for Format Validation
```python
# ✅ Good: Regex ensures valid format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# ❌ Bad: Free generation may produce invalid emails
lm += "Email: " + gen("email", max_tokens=50)
```
2. Use select() for Fixed Categories
```python
# ✅ Good: Guaranteed valid category
lm += "Status: " + select(["pending", "approved", "rejected"], name="status")

# ❌ Bad: May generate typos or invalid values
lm += "Status: " + gen("status", max_tokens=20)
```
3. Leverage Token Healing
```python
# Token healing is enabled by default
# No special action needed - just concatenate naturally
lm += "The capital is " + gen("capital")  # Automatic healing
```
4. Use stop Sequences
```python
# ✅ Good: Stop at newline for single-line outputs
lm += "Name: " + gen("name", stop="\n")

# ❌ Bad: May generate multiple lines
lm += "Name: " + gen("name", max_tokens=50)
```
5. Create Reusable Functions
```python
# ✅ Good: Reusable pattern
@guidance
def generate_person(lm):
    lm += "Name: " + gen("name", stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+")
    return lm

# Use multiple times
lm = generate_person(lm)
lm += "\n\n"
lm = generate_person(lm)
```
6. Balance Constraints
```python
# ✅ Good: Reasonable constraints
lm += gen("name", regex=r"[A-Za-z ]+", max_tokens=30)

# ❌ Too strict: May fail or be very slow
lm += gen("name", regex=r"^(John|Jane)$", max_tokens=10)
```
Comparison to Alternatives
| Feature | Guidance | Instructor | Outlines | LMQL |
|---|---|---|---|---|
| Regex Constraints | ✅ Yes | ❌ No | ✅ Yes | ✅ Yes |
| Grammar Support | ✅ CFG | ❌ No | ✅ CFG | ✅ CFG |
| Pydantic Validation | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
| Token Healing | ✅ Yes | ❌ No | ✅ Yes | ❌ No |
| Local Models | ✅ Yes | ⚠️ Limited | ✅ Yes | ✅ Yes |
| API Models | ✅ Yes | ✅ Yes | ⚠️ Limited | ✅ Yes |
| Pythonic Syntax | ✅ Yes | ✅ Yes | ✅ Yes | ❌ SQL-like |
| Learning Curve | Low | Low | Medium | High |
When to choose Guidance:
- Need regex/grammar constraints
- Want token healing
- Building complex workflows with control flow
- Using local models (Transformers, llama.cpp)
- Prefer Pythonic syntax
When to choose alternatives:
- Instructor: Need Pydantic validation with automatic retrying
- Outlines: Need JSON schema validation
- LMQL: Prefer declarative query syntax
Performance Characteristics
Latency Reduction:
- Constrained outputs are typically faster to produce than prompt-and-retry approaches, since no completions are discarded
- Token healing reduces unnecessary regeneration
- Grammar constraints prevent invalid token generation
Memory Usage:
- Minimal overhead vs unconstrained generation
- Grammar compilation cached after first use
- Efficient token filtering at inference time
Token Efficiency:
- Prevents wasted tokens on invalid outputs
- No need for retry loops
- Direct path to valid outputs
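For contrast, this is the validate-and-retry loop that constrained generation makes unnecessary (the `call_model` callable here is a hypothetical stand-in for any unconstrained LLM call):

```python
import re

DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

def generate_date_with_retries(call_model, max_retries: int = 3) -> str:
    # Without constraints, every invalid completion costs a full round trip
    for _ in range(max_retries):
        candidate = call_model("Reply with a date as YYYY-MM-DD:").strip()
        if DATE.fullmatch(candidate):
            return candidate
    raise ValueError("no valid date after retries")

# A flaky stand-in model: two invalid replies, then a valid one
replies = iter(["next Tuesday", "2024/09/15", "2024-09-15"])
print(generate_date_with_retries(lambda prompt: next(replies)))  # 2024-09-15
```

With a regex constraint, the first completion is valid by construction, so the retry loop and its wasted tokens disappear.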
Resources
- Documentation: https://guidance.readthedocs.io
- GitHub: https://github.com/guidance-ai/guidance (18k+ stars)
- Notebooks: https://github.com/guidance-ai/guidance/tree/main/notebooks
- Discord: Community support available
See Also
- `references/constraints.md` - Comprehensive regex and grammar patterns
- `references/backends.md` - Backend-specific configuration
- `references/examples.md` - Production-ready examples
Information
- Author
- davila7
- Updated
- 2026-01-30
- Category
- productivity-tools