Guidance

Control LLM output with regex and grammars for guaranteed valid generation


Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance, Microsoft Research's constrained-generation framework.

constrained-generation prompt-engineering structured-output json-validation grammar microsoft-research format-enforcement multi-step-workflows
Repository

See It In Action


User Prompt

Generate a user profile with name, age, and email fields in valid JSON format

Agent Response

Structured JSON output with regex-validated email format, numeric age, and proper string formatting - guaranteed to be parseable

Quick Start (3 Steps)

Get up and running in minutes

1. Install

claude-code skill install guidance

2. Config

3. First Trigger

@guidance help

Commands

Command                                          | Description                                                                     | Required Args
@guidance guaranteed-json-generation             | Generate valid JSON objects with enforced field formats using regex constraints | None
@guidance multi-step-reasoning-agent             | Build ReAct-style agents with tool selection and structured thought processes   | None
@guidance data-extraction-with-format-validation | Extract structured entities from text with guaranteed format compliance        | None

Typical Use Cases

Guaranteed JSON Generation

Generate valid JSON objects with enforced field formats using regex constraints

Multi-Step Reasoning Agent

Build ReAct-style agents with tool selection and structured thought processes

Data Extraction with Format Validation

Extract structured entities from text with guaranteed format compliance

Overview

Guidance: Constrained LLM Generation

When to Use This Skill

Use Guidance when you need to:

  • Control LLM output syntax with regex or grammars
  • Guarantee valid JSON/XML/code generation
  • Reduce latency vs traditional prompting approaches
  • Enforce structured formats (dates, emails, IDs, etc.)
  • Build multi-step workflows with Pythonic control flow
  • Prevent invalid outputs through grammatical constraints

GitHub Stars: 18,000+ | From: Microsoft Research

Installation

# Base installation
pip install guidance

# With specific backends
pip install guidance[transformers]  # Hugging Face models
pip install guidance[llama_cpp]     # llama.cpp models

Quick Start

Basic Example: Structured Generation

from guidance import models, gen

# Load model (supports OpenAI, Transformers, llama.cpp)
lm = models.OpenAI("gpt-4")

# Generate with constraints
result = lm + "The capital of France is " + gen("capital", max_tokens=5)

print(result["capital"])  # "Paris"

With Anthropic Claude

from guidance import models, gen, system, user, assistant

# Configure Claude
lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Use context managers for chat format
with system():
    lm += "You are a helpful assistant."

with user():
    lm += "What is the capital of France?"

with assistant():
    lm += gen(max_tokens=20)

Core Concepts

1. Context Managers

Guidance uses Pythonic context managers for chat-style interactions.

from guidance import models, system, user, assistant, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# System message
with system():
    lm += "You are a JSON generation expert."

# User message
with user():
    lm += "Generate a person object with name and age."

# Assistant response
with assistant():
    lm += gen("response", max_tokens=100)

print(lm["response"])

Benefits:

  • Natural chat flow
  • Clear role separation
  • Easy to read and maintain

2. Constrained Generation

Guidance ensures outputs match specified patterns using regex or grammars.

Regex Constraints

from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to valid email format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Constrain to date format (YYYY-MM-DD)
lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}")

# Constrain to phone number
lm += "Phone: " + gen("phone", regex=r"\d{3}-\d{3}-\d{4}")

print(lm["email"])  # Guaranteed valid email
print(lm["date"])   # Guaranteed YYYY-MM-DD format

How it works:

  • Regex converted to grammar at token level
  • Invalid tokens filtered during generation
  • Model can only produce matching outputs
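The filtering idea can be sketched in plain Python. This is an illustration only, not Guidance's implementation: the `allowed` helper and the hard-coded YYYY-MM-DD template are assumptions for the sketch, and real constrained decoding operates on tokenizer vocabularies rather than characters.

```python
# Illustrative sketch of token filtering: a candidate token survives only if
# the text generated so far plus that token can still extend to a full match
# of the pattern. Here the pattern YYYY-MM-DD is modeled as a per-position
# character template ("d" = digit, "-" = literal dash).
DATE_TEMPLATE = ["d"] * 4 + ["-"] + ["d"] * 2 + ["-"] + ["d"] * 2

def allowed(prefix: str, token: str) -> bool:
    """Return True if prefix + token is still a valid start of YYYY-MM-DD."""
    s = prefix + token
    if len(s) > len(DATE_TEMPLATE):
        return False  # already too long to ever match
    for ch, kind in zip(s, DATE_TEMPLATE):
        if kind == "d" and not ch.isdigit():
            return False
        if kind == "-" and ch != "-":
            return False
    return True

# At the first step, only digit-leading candidates survive the filter
candidates = ["20", "ab", "19", "-"]
print([t for t in candidates if allowed("", t)])  # ['20', '19']
```

Each decoding step repeats this check over the whole candidate set, so the model literally cannot emit a token that breaks the format.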

Selection Constraints

from guidance import models, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to specific choices
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")

# Multiple-choice selection
lm += "Best answer: " + select(
    ["A) Paris", "B) London", "C) Berlin", "D) Madrid"],
    name="answer"
)

print(lm["sentiment"])  # One of: positive, negative, neutral
print(lm["answer"])     # One of the four option strings

3. Token Healing

Guidance automatically “heals” token boundaries between prompt and generation.

Problem: Tokenization creates unnatural boundaries.

# Without token healing
prompt = "The capital of France is "
# Last token: " is "
# First generated token might be " Par" (with a leading space)
# Result: "The capital of France is  Paris" (double space!)

Solution: Guidance backs up one token and regenerates.

from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Token healing enabled by default
lm += "The capital of France is " + gen("capital", max_tokens=5)
# Result: "The capital of France is Paris" (correct spacing)

Benefits:

  • Natural text boundaries
  • No awkward spacing issues
  • Better model performance (sees natural token sequences)
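The problem and the fix can be reproduced with plain strings. This is a toy illustration: real token healing operates on token IDs, backing up over the prompt's final token and regenerating across the boundary.

```python
# Toy illustration of the token-healing idea with plain strings. Real healing
# works on tokenizer vocabularies, not characters: the engine backs up over
# the prompt's trailing token and lets the model re-emit a natural token that
# spans the boundary.
prompt = "The capital of France is "  # ends with a space
continuation = " Paris"               # a typical token carries its own leading space

naive = prompt + continuation
print(naive)    # "The capital of France is  Paris" (double space)

healed = prompt.rstrip(" ") + continuation  # back up over the trailing space first
print(healed)   # "The capital of France is Paris"
```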

4. Grammar-Based Generation

Define complex structures using context-free grammars.

from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Compose a grammar in Python: the fixed text pins the JSON structure, and
# each gen() call constrains one field. The whole expression forms a grammar
# the model must follow.
lm += '{\n'
lm += '    "name": "' + gen("name", regex=r"[A-Za-z ]+", max_tokens=20) + '",\n'
lm += '    "age": ' + gen("age", regex=r"[0-9]+", max_tokens=3) + ',\n'
lm += '    "email": "' + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", max_tokens=50) + '"\n'
lm += '}'

print(str(lm))  # Guaranteed valid JSON structure

Use cases:

  • Complex structured outputs
  • Nested data structures
  • Programming language syntax
  • Domain-specific languages
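To see why this guarantees parseability: the braces, keys, and quotes are fixed text the model never generates, and only the constrained field values vary. A quick check with a hypothetical output string (the values below are made up for illustration):

```python
import json
import re

# Hypothetical output in the shape the constrained template enforces; with
# Guidance only the constrained field values vary, so parsing cannot fail.
output = '{"name": "Ada Lovelace", "age": 36, "email": "ada@example.com"}'

data = json.loads(output)  # parses, because the structure was fixed text
assert re.fullmatch(r"[A-Za-z ]+", data["name"])
assert re.fullmatch(r"[0-9]+", str(data["age"]))
assert re.fullmatch(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", data["email"])
print("all fields valid")
```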

5. Guidance Functions

Create reusable generation patterns with the @guidance decorator.

from guidance import models, gen, guidance

@guidance
def generate_person(lm):
    """Generate a person with name and age."""
    lm += "Name: " + gen("name", max_tokens=20, stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+", max_tokens=3)
    return lm

# Use the function: decorated functions are added to the model with +=
lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += generate_person()

print(lm["name"])
print(lm["age"])

Stateful Functions:

from guidance import gen, select, guidance

@guidance(stateless=False)
def react_agent(lm, question, tools, max_rounds=5):
    """ReAct agent with tool use."""
    lm += f"Question: {question}\n\n"

    for i in range(max_rounds):
        # Thought
        lm += f"Thought {i+1}: " + gen("thought", stop="\n")

        # Action
        lm += "\nAction: " + select(list(tools.keys()), name="action")

        # Execute the selected tool
        tool_result = tools[lm["action"]]()
        lm += f"\nObservation: {tool_result}\n\n"

        # Check if done
        lm += "Done? " + select(["Yes", "No"], name="done")
        if lm["done"] == "Yes":
            break

    # Final answer
    lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
    return lm

Backend Configuration

Anthropic Claude

from guidance import models

lm = models.Anthropic(
    model="claude-sonnet-4-5-20250929",
    api_key="your-api-key"  # Or set ANTHROPIC_API_KEY env var
)

OpenAI

lm = models.OpenAI(
    model="gpt-4o-mini",
    api_key="your-api-key"  # Or set OPENAI_API_KEY env var
)

Local Models (Transformers)

from guidance.models import Transformers

lm = Transformers(
    "microsoft/Phi-4-mini-instruct",
    device="cuda"  # Or "cpu"
)

Local Models (llama.cpp)

from guidance.models import LlamaCpp

lm = LlamaCpp(
    model_path="/path/to/model.gguf",
    n_ctx=4096,
    n_gpu_layers=35
)

Common Patterns

Pattern 1: JSON Generation

from guidance import models, gen, system, user, assistant

lm = models.Anthropic("claude-sonnet-4-5-20250929")

with system():
    lm += "You generate valid JSON."

with user():
    lm += "Generate a user profile with name, age, and email."

with assistant():
    lm += """{
    "name": """ + gen("name", regex=r'"[A-Za-z ]+"', max_tokens=30) + """,
    "age": """ + gen("age", regex=r"[0-9]+", max_tokens=3) + """,
    "email": """ + gen("email", regex=r'"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"', max_tokens=50) + """
}"""

print(lm)  # Valid JSON guaranteed

Pattern 2: Classification

from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

text = "This product is amazing! I love it."

lm += f"Text: {text}\n"
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")
lm += "\nConfidence: " + gen("confidence", regex=r"[0-9]+", max_tokens=3) + "%"

print(f"Sentiment: {lm['sentiment']}")
print(f"Confidence: {lm['confidence']}%")

Pattern 3: Multi-Step Reasoning

from guidance import models, gen, guidance

@guidance
def chain_of_thought(lm, question):
    """Generate an answer with step-by-step reasoning."""
    lm += f"Question: {question}\n\n"

    # Generate multiple reasoning steps
    for i in range(3):
        lm += f"Step {i+1}: " + gen(f"step_{i+1}", stop="\n", max_tokens=100) + "\n"

    # Final answer
    lm += "\nTherefore, the answer is: " + gen("answer", max_tokens=50)

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += chain_of_thought("What is 15% of 200?")

print(lm["answer"])

Pattern 4: ReAct Agent

from guidance import models, gen, select, guidance

@guidance(stateless=False)
def react_agent(lm, question):
    """ReAct agent with tool use."""
    tools = {
        "calculator": lambda expr: eval(expr),  # demo only: eval is unsafe on untrusted input
        "search": lambda query: f"Search results for: {query}",
    }

    lm += f"Question: {question}\n\n"

    for _ in range(5):
        # Thought
        lm += "Thought: " + gen("thought", stop="\n") + "\n"

        # Action selection (constrained to valid tool names or "answer")
        lm += "Action: " + select(["calculator", "search", "answer"], name="action")

        if lm["action"] == "answer":
            lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
            break

        # Action input
        lm += "\nAction Input: " + gen("action_input", stop="\n") + "\n"

        # Execute the selected tool
        result = tools[lm["action"]](lm["action_input"])
        lm += f"Observation: {result}\n\n"

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += react_agent("What is 25 * 4 + 10?")
print(lm["answer"])

Pattern 5: Data Extraction

from guidance import models, gen, guidance

@guidance
def extract_entities(lm, text):
    """Extract structured entities from text."""
    lm += f"Text: {text}\n\n"

    # Extract person
    lm += "Person: " + gen("person", stop="\n", max_tokens=30) + "\n"

    # Extract organization
    lm += "Organization: " + gen("organization", stop="\n", max_tokens=30) + "\n"

    # Extract date
    lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}", max_tokens=10) + "\n"

    # Extract location
    lm += "Location: " + gen("location", stop="\n", max_tokens=30) + "\n"

    return lm

text = "Tim Cook announced at Apple Park on 2024-09-15 in Cupertino."

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += extract_entities(text)

print(f"Person: {lm['person']}")
print(f"Organization: {lm['organization']}")
print(f"Date: {lm['date']}")
print(f"Location: {lm['location']}")

Best Practices

1. Use Regex for Format Validation

# ✅ Good: Regex ensures valid format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# ❌ Bad: Free generation may produce invalid emails
lm += "Email: " + gen("email", max_tokens=50)

2. Use select() for Fixed Categories

# ✅ Good: Guaranteed valid category
lm += "Status: " + select(["pending", "approved", "rejected"], name="status")

# ❌ Bad: May generate typos or invalid values
lm += "Status: " + gen("status", max_tokens=20)

3. Leverage Token Healing

# Token healing is enabled by default
# No special action needed - just concatenate naturally
lm += "The capital is " + gen("capital")  # Automatic healing

4. Use stop Sequences

# ✅ Good: Stop at newline for single-line outputs
lm += "Name: " + gen("name", stop="\n")

# ❌ Bad: May generate multiple lines
lm += "Name: " + gen("name", max_tokens=50)

5. Create Reusable Functions

# ✅ Good: Reusable pattern
@guidance
def generate_person(lm):
    lm += "Name: " + gen("name", stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+")
    return lm

# Use multiple times
lm += generate_person()
lm += "\n\n"
lm += generate_person()

6. Balance Constraints

# ✅ Good: Reasonable constraints
lm += gen("name", regex=r"[A-Za-z ]+", max_tokens=30)

# ❌ Too strict: May fail or be very slow
lm += gen("name", regex=r"^(John|Jane)$", max_tokens=10)

Comparison to Alternatives

Feature             | Guidance | Instructor | Outlines   | LMQL
Regex Constraints   | ✅ Yes   | ❌ No      | ✅ Yes     | ✅ Yes
Grammar Support     | ✅ CFG   | ❌ No      | ✅ CFG     | ✅ CFG
Pydantic Validation | ❌ No    | ✅ Yes     | ✅ Yes     | ❌ No
Token Healing       | ✅ Yes   | ❌ No      | ✅ Yes     | ❌ No
Local Models        | ✅ Yes   | ⚠️ Limited | ✅ Yes     | ✅ Yes
API Models          | ✅ Yes   | ✅ Yes     | ⚠️ Limited | ✅ Yes
Pythonic Syntax     | ✅ Yes   | ✅ Yes     | ✅ Yes     | ❌ SQL-like
Learning Curve      | Low      | Low        | Medium     | High

When to choose Guidance:

  • Need regex/grammar constraints
  • Want token healing
  • Building complex workflows with control flow
  • Using local models (Transformers, llama.cpp)
  • Prefer Pythonic syntax

When to choose alternatives:

  • Instructor: Need Pydantic validation with automatic retrying
  • Outlines: Need JSON schema validation
  • LMQL: Prefer declarative query syntax

Performance Characteristics

Latency Reduction:

  • 30-50% faster than traditional prompting for constrained outputs
  • Token healing reduces unnecessary regeneration
  • Grammar constraints prevent invalid token generation

Memory Usage:

  • Minimal overhead vs unconstrained generation
  • Grammar compilation cached after first use
  • Efficient token filtering at inference time
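The compile-once behavior can be illustrated with a small memoization sketch. This shows the idea, not Guidance's internal cache:

```python
import re
from functools import lru_cache

# Sketch of the compile-once idea: each constraint pattern is compiled on
# first use and reused afterwards (Guidance caches compiled grammars in the
# same spirit, amortizing compilation across generations).
@lru_cache(maxsize=None)
def compiled(pattern: str) -> re.Pattern:
    return re.compile(pattern)

date_re = compiled(r"\d{4}-\d{2}-\d{2}")       # first use: compile (cache miss)
assert compiled(r"\d{4}-\d{2}-\d{2}") is date_re  # second use: same object (cache hit)
print(compiled.cache_info().hits)  # 1
```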

Token Efficiency:

  • Prevents wasted tokens on invalid outputs
  • No need for retry loops
  • Direct path to valid outputs
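For contrast, this is the retry loop that unconstrained prompting typically requires and that constrained generation eliminates. `fake_llm` is a made-up stand-in that alternates between malformed and valid responses, as a free-form model might:

```python
import json
from itertools import cycle

# Stand-in "model" that alternates malformed and valid JSON responses.
_responses = cycle(['Sure! Here is your JSON: {"oops', '{"status": "ok"}'])

def fake_llm(prompt: str) -> str:
    return next(_responses)

def generate_json_with_retries(prompt: str, max_retries: int = 5) -> dict:
    """Free-form generation forces parse-and-retry; every failed attempt wastes tokens."""
    for _ in range(max_retries):
        out = fake_llm(prompt)
        try:
            return json.loads(out)
        except json.JSONDecodeError:
            continue  # whole response discarded; ask again
    raise RuntimeError("no valid JSON after retries")

print(generate_json_with_retries("Return a status object"))  # succeeds on attempt 2
```

With grammar constraints the first attempt is valid by construction, so the loop (and its wasted tokens) disappears.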

Resources

See Also

  • references/constraints.md - Comprehensive regex and grammar patterns
  • references/backends.md - Backend-specific configuration
  • references/examples.md - Production-ready examples


Environment Matrix

Dependencies

guidance (Python package)
transformers (optional, for Hugging Face models)
llama-cpp-python (optional, for local GGUF models)

Framework Support

  • Anthropic Claude ✓ (recommended)
  • OpenAI GPT models ✓
  • Hugging Face Transformers ✓
  • llama.cpp GGUF models ✓

Context Window

Token Usage ~2K-8K tokens for typical structured generation tasks


Information

Author
davila7
Updated
2026-01-30
Category
productivity-tools