Microservices Patterns

Architect resilient microservices with proven patterns and best practices


Design microservices architectures with service boundaries, event-driven communication, and resilience patterns. Use when building distributed systems, decomposing monoliths, or implementing microservices.

microservices distributed-systems architecture-patterns service-boundaries event-driven resilience-patterns api-gateway saga-pattern

See It In Action

User Prompt

I need to extract the payment functionality from our monolithic e-commerce app. Help me design the service boundaries and migration strategy.

Agent Response

A service decomposition strategy with clear boundaries, data ownership, and a step-by-step migration plan using the Strangler Fig pattern.

Quick Start (3 Steps)

Get up and running in minutes

  1. Install: claude-code skill install microservices-patterns
  2. Config
  3. First Trigger: @microservices-patterns help

Commands

Command | Description | Required Args
@microservices-patterns decompose-legacy-monolith | Break down a monolithic application into well-bounded microservices using domain-driven design principles | None
@microservices-patterns design-event-driven-architecture | Implement asynchronous communication between services using event streaming and pub/sub patterns | None
@microservices-patterns build-distributed-transaction-handling | Implement the Saga pattern for managing distributed transactions across multiple microservices | None

Typical Use Cases

Decompose Legacy Monolith

Break down a monolithic application into well-bounded microservices using domain-driven design principles

Design Event-Driven Architecture

Implement asynchronous communication between services using event streaming and pub/sub patterns

Build Distributed Transaction Handling

Implement the Saga pattern for managing distributed transactions across multiple microservices

Overview

Microservices Patterns

Master microservices architecture patterns including service boundaries, inter-service communication, data management, and resilience patterns for building distributed systems.

When to Use This Skill

  • Decomposing monoliths into microservices
  • Designing service boundaries and contracts
  • Implementing inter-service communication
  • Managing distributed data and transactions
  • Building resilient distributed systems
  • Implementing service discovery and load balancing
  • Designing event-driven architectures

Core Concepts

1. Service Decomposition Strategies

By Business Capability

  • Organize services around business functions
  • Each service owns its domain
  • Example: OrderService, PaymentService, InventoryService

By Subdomain (DDD)

  • Core domain, supporting subdomains
  • Bounded contexts map to services
  • Clear ownership and responsibility

Strangler Fig Pattern

  • Gradually extract from monolith
  • New functionality as microservices
  • Proxy routes to old/new systems
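
The routing decision behind the Strangler Fig bullets above can be sketched as a small prefix table. This is a minimal illustration, not part of the skill itself: `StranglerRouter`, its method names, and the URLs in the usage note are all hypothetical. Paths that have been migrated resolve to the new service, and everything else still falls through to the monolith.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StranglerRouter:
    """Picks a backend per request path during an incremental migration (hypothetical helper)."""

    monolith_url: str
    # Path prefix -> base URL of the service that now owns that prefix.
    migrated: Dict[str, str] = field(default_factory=dict)

    def migrate(self, prefix: str, service_url: str) -> None:
        """Record that requests under `prefix` are now served by `service_url`."""
        self.migrated[prefix.rstrip("/")] = service_url

    def resolve(self, path: str) -> str:
        """Return the full upstream URL for `path`."""
        # Match whole path segments only, so "/payments" does not capture "/paymentsx".
        matches = [
            prefix for prefix in self.migrated
            if path == prefix or path.startswith(prefix + "/")
        ]
        if not matches:
            return self.monolith_url + path
        # The longest prefix wins, so a sub-path can be migrated independently.
        return self.migrated[max(matches, key=len)] + path
```

For example, after `router.migrate("/payments", "http://payment-service:8001")`, a call to `router.resolve("/payments/42")` targets the new service while `/orders/1` still resolves to the monolith. In practice this decision usually lives in a reverse proxy or API gateway rather than in application code.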

2. Communication Patterns

Synchronous (Request/Response)

  • REST APIs
  • gRPC
  • GraphQL

Asynchronous (Events/Messages)

  • Event streaming (Kafka)
  • Message queues (RabbitMQ, SQS)
  • Pub/Sub patterns
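
To make the pub/sub idea concrete without a broker, here is a minimal in-process sketch. `InMemoryPubSub` is a hypothetical stand-in for Kafka, RabbitMQ, or SQS and offers none of their durability or delivery guarantees. It only shows the shape of the pattern: the publisher names a topic and never references its subscribers.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, DefaultDict, List

# An event handler is any coroutine function that accepts the event payload.
Handler = Callable[[dict], Awaitable[None]]


class InMemoryPubSub:
    """Toy pub/sub bus for illustration; not durable and single-process only."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register `handler` to receive every event published on `topic`."""
        self._subscribers[topic].append(handler)

    async def publish(self, topic: str, event: dict) -> None:
        """Deliver `event` to all current subscribers of `topic` concurrently."""
        handlers = self._subscribers.get(topic, [])
        await asyncio.gather(*(handler(event) for handler in handlers))
```

Adding a new consumer (say, a notification service) means one more `subscribe` call and no change to the publisher, which is the coupling benefit the asynchronous style is after.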

3. Data Management

Database Per Service

  • Each service owns its data
  • No shared databases
  • Loose coupling

Saga Pattern

  • Distributed transactions
  • Compensating actions
  • Eventual consistency

4. Resilience Patterns

Circuit Breaker

  • Fail fast on repeated errors
  • Prevent cascade failures

Retry with Backoff

  • Transient fault handling
  • Exponential backoff
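
As a rough illustration of exponential backoff, the delay before each retry doubles until it reaches a cap. The function name and the default values below are assumptions chosen for the example, and the optional jitter is a common addition that the bullets above do not mention.

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 10.0, jitter: bool = True) -> float:
    """Seconds to wait before retry number `attempt` (1-based).

    base, cap, and jitter are illustrative defaults, not recommended values.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Double the delay on each attempt, but never exceed the cap.
    delay = min(cap, base * 2 ** (attempt - 1))
    # Jitter picks a random point in [0, delay] so that many clients
    # retrying at once do not hit the recovering service in lockstep.
    return random.uniform(0, delay) if jitter else delay
```

With jitter disabled and the defaults above, attempts 1 through 6 wait 0.5, 1, 2, 4, 8, and 10 seconds.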

Bulkhead

  • Isolate resources
  • Limit impact of failures
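
One way to picture a bulkhead is a per-dependency cap on in-flight calls, so that a slow downstream service cannot absorb every worker in the caller. The sketch below is a minimal asyncio version under that assumption. `Bulkhead` and `BulkheadFullError` are hypothetical names, and rejecting immediately when full is one possible policy (queueing with a timeout is another).

```python
import asyncio
from typing import Any, Awaitable, Callable


class BulkheadFullError(Exception):
    """Raised when the compartment has no free slot."""


class Bulkhead:
    """Caps concurrent calls to one dependency (illustrative sketch)."""

    def __init__(self, max_concurrent: int) -> None:
        self._max_concurrent = max_concurrent
        self._in_flight = 0

    async def run(self, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any) -> Any:
        # asyncio runs one task at a time between awaits, so this
        # check-and-increment needs no additional locking.
        if self._in_flight >= self._max_concurrent:
            raise BulkheadFullError("no free slot for this dependency")
        self._in_flight += 1
        try:
            return await func(*args, **kwargs)
        finally:
            # Always release the slot, even if the call failed.
            self._in_flight -= 1
```

Giving each downstream service its own `Bulkhead` instance keeps a saturated payment client, for example, from starving calls to inventory.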

Service Decomposition Patterns

Pattern 1: By Business Capability

# E-commerce example: each class below would be deployed as its own service.
# Order, PaymentRequest, PaymentResult, OrderItem, ReservationResult and the
# *Event classes are domain types assumed to be defined elsewhere.
from __future__ import annotations  # keeps the type hints below lazy

from typing import List


# Order Service
class OrderService:
    """Handles order lifecycle."""

    def __init__(self, event_bus):
        self.event_bus = event_bus

    async def create_order(self, order_data: dict) -> Order:
        order = Order.create(order_data)

        # Publish event for other services
        await self.event_bus.publish(
            OrderCreatedEvent(
                order_id=order.id,
                customer_id=order.customer_id,
                items=order.items,
                total=order.total
            )
        )

        return order


# Payment Service (separate service)
class PaymentService:
    """Handles payment processing."""

    def __init__(self, payment_gateway, event_bus):
        self.payment_gateway = payment_gateway
        self.event_bus = event_bus

    async def process_payment(self, payment_request: PaymentRequest) -> PaymentResult:
        # Charge the customer through the external payment gateway
        result = await self.payment_gateway.charge(
            amount=payment_request.amount,
            customer=payment_request.customer_id
        )

        # Only announce payments that actually succeeded
        if result.success:
            await self.event_bus.publish(
                PaymentCompletedEvent(
                    order_id=payment_request.order_id,
                    transaction_id=result.transaction_id
                )
            )

        return result


# Inventory Service (separate service)
class InventoryService:
    """Handles inventory management."""

    def __init__(self, inventory_repo, event_bus):
        self.inventory_repo = inventory_repo
        self.event_bus = event_bus

    async def reserve_items(self, order_id: str, items: List[OrderItem]) -> ReservationResult:
        # Check availability of every item before reserving anything
        for item in items:
            available = await self.inventory_repo.get_available(item.product_id)
            if available < item.quantity:
                return ReservationResult(
                    success=False,
                    error=f"Insufficient inventory for {item.product_id}"
                )

        # Reserve items (create_reservation is assumed to be implemented elsewhere)
        reservation = await self.create_reservation(order_id, items)

        await self.event_bus.publish(
            InventoryReservedEvent(
                order_id=order_id,
                reservation_id=reservation.id
            )
        )

        return ReservationResult(success=True, reservation=reservation)

Pattern 2: API Gateway

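import asyncio

import httpx
from circuitbreaker import CircuitBreakerError, circuit
from fastapi import Depends, FastAPI, HTTPException

app = FastAPI()


class APIGateway:
    """Central entry point for all client requests."""

    def __init__(self):
        self.order_service_url = "http://order-service:8000"
        self.payment_service_url = "http://payment-service:8001"
        self.inventory_service_url = "http://inventory-service:8002"
        self.http_client = httpx.AsyncClient(timeout=5.0)

    async def _request(self, base_url: str, path: str, method: str = "GET", **kwargs):
        """Send one request and return the decoded JSON body."""
        response = await self.http_client.request(method, f"{base_url}{path}", **kwargs)
        response.raise_for_status()
        return response.json()

    # One breaker per downstream service, so a failing dependency only trips
    # its own circuit. Decorating coroutines assumes a circuitbreaker release
    # that supports async functions.
    @circuit(failure_threshold=5, recovery_timeout=30)
    async def call_order_service(self, path: str, method: str = "GET", **kwargs):
        """Call order service with circuit breaker."""
        return await self._request(self.order_service_url, path, method, **kwargs)

    @circuit(failure_threshold=5, recovery_timeout=30)
    async def call_payment_service(self, path: str, method: str = "GET", **kwargs):
        """Call payment service with circuit breaker."""
        return await self._request(self.payment_service_url, path, method, **kwargs)

    @circuit(failure_threshold=5, recovery_timeout=30)
    async def call_inventory_service(self, path: str, method: str = "GET", **kwargs):
        """Call inventory service with circuit breaker."""
        return await self._request(self.inventory_service_url, path, method, **kwargs)

    async def create_order_aggregate(self, order_id: str) -> dict:
        """Aggregate data from multiple services."""
        # Parallel requests; exceptions are returned instead of raised
        order, payment, inventory = await asyncio.gather(
            self.call_order_service(f"/orders/{order_id}"),
            self.call_payment_service(f"/payments/order/{order_id}"),
            self.call_inventory_service(f"/reservations/order/{order_id}"),
            return_exceptions=True
        )

        # The order itself is mandatory; without it there is nothing to aggregate
        if isinstance(order, Exception):
            raise order

        # Handle partial failures: payment and inventory are optional extras
        result = {"order": order}
        if not isinstance(payment, Exception):
            result["payment"] = payment
        if not isinstance(inventory, Exception):
            result["inventory"] = inventory

        return result


# Share one gateway (and one HTTP connection pool) across all requests
_gateway = APIGateway()


def get_gateway() -> APIGateway:
    return _gateway


@app.post("/api/orders")
async def create_order(
    order_data: dict,
    gateway: APIGateway = Depends(get_gateway)
):
    """API Gateway endpoint."""
    try:
        # Route to order service
        order = await gateway.call_order_service(
            "/orders",
            method="POST",
            json=order_data
        )
        return {"order": order}
    except (httpx.HTTPError, CircuitBreakerError) as e:
        # An open circuit is reported to the client the same way as a failed call
        raise HTTPException(status_code=503, detail="Order service unavailable") from e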

Communication Patterns

Pattern 1: Synchronous REST Communication

# Service A calls Service B
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential


class ServiceClient:
    """HTTP client with retries and timeout."""

    def __init__(self, base_url: str):
        self.base_url = base_url
        self.client = httpx.AsyncClient(
            timeout=httpx.Timeout(5.0, connect=2.0),
            limits=httpx.Limits(max_keepalive_connections=20)
        )

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        reraise=True  # surface the original httpx error after the last attempt
    )
    async def get(self, path: str, **kwargs):
        """GET with automatic retries."""
        response = await self.client.get(f"{self.base_url}{path}", **kwargs)
        response.raise_for_status()
        return response.json()

    async def post(self, path: str, **kwargs):
        """POST request (not retried, because a repeated POST may not be idempotent)."""
        response = await self.client.post(f"{self.base_url}{path}", **kwargs)
        response.raise_for_status()
        return response.json()


# Usage (await this from code that is already running in an event loop)
payment_client = ServiceClient("http://payment-service:8001")


async def submit_payment(payment_data: dict):
    return await payment_client.post("/payments", json=payment_data)

Pattern 2: Asynchronous Event-Driven

# Event-driven communication with Kafka
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Awaitable, Callable, List

from aiokafka import AIOKafkaConsumer, AIOKafkaProducer


@dataclass
class DomainEvent:
    event_id: str
    event_type: str
    aggregate_id: str
    occurred_at: datetime
    data: dict


class EventBus:
    """Event publishing and subscription."""

    def __init__(self, bootstrap_servers: List[str], group_id: str = "my-service"):
        self.bootstrap_servers = bootstrap_servers
        self.group_id = group_id
        self.producer = None

    async def start(self):
        self.producer = AIOKafkaProducer(
            bootstrap_servers=self.bootstrap_servers,
            value_serializer=lambda v: json.dumps(v).encode()
        )
        await self.producer.start()

    async def stop(self):
        if self.producer is not None:
            await self.producer.stop()

    async def publish(self, event: DomainEvent):
        """Publish event to Kafka topic."""
        topic = event.event_type
        payload = asdict(event)
        # datetime is not JSON serializable, so send it as an ISO 8601 string
        payload["occurred_at"] = event.occurred_at.isoformat()
        await self.producer.send_and_wait(
            topic,
            value=payload,
            # Keying by aggregate keeps one aggregate's events in one partition
            key=event.aggregate_id.encode()
        )

    async def subscribe(self, topic: str, handler: Callable[[dict], Awaitable[None]]):
        """Subscribe to events."""
        consumer = AIOKafkaConsumer(
            topic,
            bootstrap_servers=self.bootstrap_servers,
            value_deserializer=lambda v: json.loads(v.decode()),
            group_id=self.group_id
        )
        await consumer.start()

        try:
            async for message in consumer:
                event_data = message.value
                await handler(event_data)
        finally:
            await consumer.stop()


# save_order, reserve_inventory and the started event_bus instance are
# assumed to be provided by the surrounding service code.

# Order Service publishes event
async def create_order(order_data: dict):
    order = await save_order(order_data)

    event = DomainEvent(
        event_id=str(uuid.uuid4()),
        event_type="OrderCreated",
        aggregate_id=str(order.id),
        occurred_at=datetime.now(timezone.utc),
        data={
            "order_id": order.id,
            "customer_id": order.customer_id,
            # The inventory handler below needs the items to reserve
            "items": order_data["items"],
            "total": order.total
        }
    )

    await event_bus.publish(event)


# Inventory Service listens for OrderCreated
async def handle_order_created(event_data: dict):
    """React to order creation."""
    order_id = event_data["data"]["order_id"]
    items = event_data["data"]["items"]

    # Reserve inventory
    await reserve_inventory(order_id, items)

Pattern 3: Saga Pattern (Distributed Transactions)

# Saga orchestration for order fulfillment
import logging
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)

# order_service, inventory_service and payment_service are client objects
# assumed to be provided by the surrounding application.


class SagaStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    COMPENSATING = "compensating"
    FAILED = "failed"


@dataclass
class StepResult:
    """Outcome of a single saga step."""
    success: bool
    data: dict = field(default_factory=dict)
    error: Optional[str] = None


@dataclass
class SagaResult:
    """Outcome of the whole saga."""
    status: SagaStatus
    data: Optional[dict] = None
    error: Optional[str] = None


class SagaStep:
    """Single step in saga."""

    def __init__(
        self,
        name: str,
        action: Callable,
        compensation: Callable
    ):
        self.name = name
        self.action = action
        self.compensation = compensation


class OrderFulfillmentSaga:
    """Orchestrated saga for order fulfillment."""

    def __init__(self):
        self.steps: List[SagaStep] = [
            SagaStep(
                "create_order",
                action=self.create_order,
                compensation=self.cancel_order
            ),
            SagaStep(
                "reserve_inventory",
                action=self.reserve_inventory,
                compensation=self.release_inventory
            ),
            SagaStep(
                "process_payment",
                action=self.process_payment,
                compensation=self.refund_payment
            ),
            SagaStep(
                "confirm_order",
                action=self.confirm_order,
                compensation=self.cancel_order_confirmation
            )
        ]

    async def execute(self, order_data: dict) -> SagaResult:
        """Execute saga steps."""
        completed_steps = []
        context = {"order_data": order_data}

        try:
            for step in self.steps:
                # Execute step
                result = await step.action(context)
                if not result.success:
                    # Undo every step that already completed
                    await self.compensate(completed_steps, context)
                    return SagaResult(
                        status=SagaStatus.FAILED,
                        error=result.error
                    )

                completed_steps.append(step)
                # Make this step's outputs available to later steps
                context.update(result.data)

            return SagaResult(status=SagaStatus.COMPLETED, data=context)

        except Exception as e:
            # Compensate on error
            await self.compensate(completed_steps, context)
            return SagaResult(status=SagaStatus.FAILED, error=str(e))

    async def compensate(self, completed_steps: List[SagaStep], context: dict):
        """Execute compensating actions in reverse order."""
        for step in reversed(completed_steps):
            try:
                await step.compensation(context)
            except Exception:
                # Log and continue so the remaining steps are still compensated
                logger.exception("Compensation failed for %s", step.name)

    # Step implementations
    async def create_order(self, context: dict) -> StepResult:
        order = await order_service.create(context["order_data"])
        return StepResult(success=True, data={"order_id": order.id})

    async def cancel_order(self, context: dict):
        await order_service.cancel(context["order_id"])

    async def reserve_inventory(self, context: dict) -> StepResult:
        result = await inventory_service.reserve(
            context["order_id"],
            context["order_data"]["items"]
        )
        return StepResult(
            success=result.success,
            # A failed reservation has no reservation ID to record
            data={"reservation_id": result.reservation_id} if result.success else {},
            error=result.error
        )

    async def release_inventory(self, context: dict):
        await inventory_service.release(context["reservation_id"])

    async def process_payment(self, context: dict) -> StepResult:
        result = await payment_service.charge(
            context["order_id"],
            context["order_data"]["total"]
        )
        return StepResult(
            success=result.success,
            data={"transaction_id": result.transaction_id} if result.success else {},
            error=result.error
        )

    async def refund_payment(self, context: dict):
        await payment_service.refund(context["transaction_id"])

    async def confirm_order(self, context: dict) -> StepResult:
        # order_service.confirm is an assumed client method, like the calls above
        await order_service.confirm(context["order_id"])
        return StepResult(success=True)

    async def cancel_order_confirmation(self, context: dict):
        # Never invoked in this saga because no step follows confirm_order;
        # defined so that every step has a compensation.
        pass

Resilience Patterns

Circuit Breaker Pattern

from datetime import datetime, timedelta
from enum import Enum
from typing import Any, Callable


class CircuitState(Enum):
    CLOSED = "closed"        # Normal operation
    OPEN = "open"            # Failing, reject requests
    HALF_OPEN = "half_open"  # Testing if recovered


class CircuitBreakerOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Circuit breaker for service calls."""

    def __init__(
        self,
        failure_threshold: int = 5,
        recovery_timeout: int = 30,
        success_threshold: int = 2
    ):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.success_threshold = success_threshold

        self.failure_count = 0
        self.success_count = 0
        self.state = CircuitState.CLOSED
        self.opened_at = None

    async def call(self, func: Callable, *args, **kwargs) -> Any:
        """Execute function with circuit breaker."""

        if self.state == CircuitState.OPEN:
            if self._should_attempt_reset():
                # Let trial calls through and count their successes from zero
                self.state = CircuitState.HALF_OPEN
                self.success_count = 0
            else:
                raise CircuitBreakerOpenError("Circuit breaker is open")

        try:
            result = await func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise

        self._on_success()
        return result

    def _on_success(self):
        """Handle successful call."""
        self.failure_count = 0

        # Close the circuit only after enough consecutive trial successes
        if self.state == CircuitState.HALF_OPEN:
            self.success_count += 1
            if self.success_count >= self.success_threshold:
                self.state = CircuitState.CLOSED
                self.success_count = 0

    def _on_failure(self):
        """Handle failed call."""
        self.failure_count += 1

        # Open on any failure while half-open, or on too many failures while closed
        if (
            self.state == CircuitState.HALF_OPEN
            or self.failure_count >= self.failure_threshold
        ):
            self.state = CircuitState.OPEN
            self.opened_at = datetime.now()

    def _should_attempt_reset(self) -> bool:
        """Check if enough time passed to try again."""
        return (
            datetime.now() - self.opened_at
            > timedelta(seconds=self.recovery_timeout)
        )


# Usage (payment_client is assumed to be an existing async service client)
breaker = CircuitBreaker(failure_threshold=5, recovery_timeout=30)


async def call_payment_service(payment_data: dict):
    return await breaker.call(
        payment_client.process_payment,
        payment_data
    )

Resources

  • references/service-decomposition-guide.md: Breaking down monoliths
  • references/communication-patterns.md: Sync vs async patterns
  • references/saga-implementation.md: Distributed transactions
  • assets/circuit-breaker.py: Production circuit breaker
  • assets/event-bus-template.py: Kafka event bus implementation
  • assets/api-gateway-template.py: Complete API gateway

Best Practices

  1. Service Boundaries: Align with business capabilities
  2. Database Per Service: No shared databases
  3. API Contracts: Versioned, backward compatible
  4. Async When Possible: Events over direct calls
  5. Circuit Breakers: Fail fast on service failures
  6. Distributed Tracing: Track requests across services
  7. Service Registry: Dynamic service discovery
  8. Health Checks: Liveness and readiness probes
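
The liveness/readiness distinction in item 8 can be sketched as two functions. Liveness only says the process is running, while readiness also checks the dependencies the service needs before it should receive traffic. The function names, the response shape, and the one-second timeout are assumptions made for this example. A real service would expose both through its web framework's routes.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# A dependency check is a coroutine function returning True when healthy.
Check = Callable[[], Awaitable[bool]]


def liveness() -> dict:
    """The process is up; deliberately checks no dependencies."""
    return {"status": "alive"}


async def readiness(checks: Dict[str, Check], timeout: float = 1.0) -> dict:
    """Ready only if every dependency check passes within `timeout` seconds."""

    async def run(check: Check) -> bool:
        try:
            return bool(await asyncio.wait_for(check(), timeout))
        except Exception:
            # A timeout or an error both count as "not ready".
            return False

    names = list(checks)
    outcomes = await asyncio.gather(*(run(checks[name]) for name in names))
    return {
        "status": "ready" if all(outcomes) else "not_ready",
        "checks": dict(zip(names, outcomes)),
    }
```

Keeping liveness free of dependency checks matters: if a database outage made the liveness probe fail, an orchestrator would restart healthy processes instead of just withholding traffic from them.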

Common Pitfalls

  • Distributed Monolith: Tightly coupled services
  • Chatty Services: Too many inter-service calls
  • Shared Databases: Tight coupling through data
  • No Circuit Breakers: Cascade failures
  • Synchronous Everything: Tight coupling, poor resilience
  • Premature Microservices: Starting with microservices before the service boundaries are understood
  • Ignoring Network Failures: Assuming reliable network
  • No Compensation Logic: Can’t undo failed transactions

Environment Matrix

Dependencies

Python 3.8+
FastAPI or Flask for REST APIs
Apache Kafka for event streaming
Docker for containerization

Framework Support

FastAPI ✓ (recommended)
Flask ✓
Spring Boot ✓
Express.js ✓

Context Window

Token Usage: ~5K-15K tokens for complex distributed system designs

Information

Author: wshobson
Updated: 2026-01-30
Category: architecture-patterns