Python Performance Optimization
Profile, analyze, and optimize Python code for maximum performance
Profile and optimize Python code using cProfile, memory profilers, and performance best practices. Use when debugging slow Python code, optimizing bottlenecks, or improving application performance.
Quick Start (3 Steps)
Get up and running in minutes
Install
`claude-code skill install python-performance-optimization`
Config
First Trigger
`@python-performance-optimization help`
Commands
| Command | Description | Required Args |
|---|---|---|
| @python-performance-optimization identify-performance-bottlenecks | Profile a slow Python application to find CPU and memory hotspots using cProfile and memory profilers | None |
| @python-performance-optimization optimize-data-processing-pipeline | Transform inefficient loops and data structures into high-performance implementations using NumPy, generators, and vectorization | None |
| @python-performance-optimization fix-memory-leaks | Use memory profiling tools to detect and resolve memory leaks in long-running Python applications | None |
Typical Use Cases
Identify Performance Bottlenecks
Profile a slow Python application to find CPU and memory hotspots using cProfile and memory profilers
Optimize Data Processing Pipeline
Transform inefficient loops and data structures into high-performance implementations using NumPy, generators, and vectorization
Fix Memory Leaks
Use memory profiling tools to detect and resolve memory leaks in long-running Python applications
Overview
Python Performance Optimization
Comprehensive guide to profiling, analyzing, and optimizing Python code for better performance, including CPU profiling, memory optimization, and implementation best practices.
When to Use This Skill
- Identifying performance bottlenecks in Python applications
- Reducing application latency and response times
- Optimizing CPU-intensive operations
- Reducing memory consumption and fixing memory leaks
- Improving database query performance
- Optimizing I/O operations
- Speeding up data processing pipelines
- Implementing high-performance algorithms
- Profiling production applications
Core Concepts
1. Profiling Types
- CPU Profiling: Identify time-consuming functions
- Memory Profiling: Track memory allocation and leaks
- Line Profiling: Profile at line-by-line granularity
- Call Graph: Visualize function call relationships
2. Performance Metrics
- Execution Time: How long operations take
- Memory Usage: Peak and average memory consumption
- CPU Utilization: Processor usage patterns
- I/O Wait: Time spent on I/O operations
3. Optimization Strategies
- Algorithmic: Better algorithms and data structures
- Implementation: More efficient code patterns
- Parallelization: Multi-threading/processing
- Caching: Avoid redundant computation
- Native Extensions: C/Rust for critical paths
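The payoff from the algorithmic strategy usually dwarfs micro-tuning. As a sketch (function names are illustrative), a nested-loop duplicate check is O(n²), while tracking seen values in a set makes it O(n):

```python
def has_duplicates_quadratic(items):
    """O(n^2): compare every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_linear(items):
    """O(n): remember values already seen in a set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

On 10,000 items the quadratic version does up to ~50 million comparisons while the linear one does 10,000 set lookups; no amount of loop micro-tuning closes that gap.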
Quick Start
Basic Timing
```python
import time

def measure_time():
    """Simple timing measurement."""
    start = time.time()

    # Your code here
    result = sum(range(1000000))

    elapsed = time.time() - start
    print(f"Execution time: {elapsed:.4f} seconds")
    return result

# Better: use timeit for accurate measurements
import timeit

execution_time = timeit.timeit(
    "sum(range(1000000))",
    number=100
)
print(f"Average time: {execution_time/100:.6f} seconds")
```
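On a noisy machine, `timeit.repeat` with `min()` is generally more reliable than a single average: the fastest repetition is the one least distorted by background load.

```python
import timeit

# Run 5 repetitions of 100 executions each and keep the best
times = timeit.repeat("sum(range(1000000))", number=100, repeat=5)
best = min(times) / 100  # per-call time from the fastest repetition
print(f"Best per-call time: {best:.6f} seconds")
```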
Profiling Tools
Pattern 1: cProfile - CPU Profiling
```python
import cProfile
import pstats
from pstats import SortKey

def slow_function():
    """Function to profile."""
    total = 0
    for i in range(1000000):
        total += i
    return total

def another_function():
    """Another function."""
    return [i**2 for i in range(100000)]

def main():
    """Main function to profile."""
    result1 = slow_function()
    result2 = another_function()
    return result1, result2

# Profile the code
if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()

    main()

    profiler.disable()

    # Print stats
    stats = pstats.Stats(profiler)
    stats.sort_stats(SortKey.CUMULATIVE)
    stats.print_stats(10)  # Top 10 functions

    # Save to file for later analysis
    stats.dump_stats("profile_output.prof")
```
Command-line profiling:
```bash
# Profile a script
python -m cProfile -o output.prof script.py

# View results
python -m pstats output.prof
# In pstats:
#   sort cumtime
#   stats 10
```
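A saved `.prof` file (from `dump_stats` or `python -m cProfile -o`) can also be reloaded in Python without re-running the program; this sketch generates a small profile first so it is self-contained:

```python
import cProfile
import pstats
from pstats import SortKey

# Create a small profile file to demonstrate reloading
cProfile.run("sum(range(100000))", "profile_output.prof")

# Reload the saved profile for offline analysis
stats = pstats.Stats("profile_output.prof")
stats.strip_dirs()                    # shorten file paths in the report
stats.sort_stats(SortKey.CUMULATIVE)  # sort by cumulative time
stats.print_stats(5)                  # show the top 5 entries
```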
Pattern 2: line_profiler - Line-by-Line Profiling
```python
# Install: pip install line-profiler

# Add @profile decorator (line_profiler provides this)
@profile
def process_data(data):
    """Process data with line profiling."""
    result = []
    for item in data:
        processed = item * 2
        result.append(processed)
    return result

# Run with:
#   kernprof -l -v script.py
```
Manual line profiling:
```python
from line_profiler import LineProfiler

def process_data(data):
    """Function to profile."""
    result = []
    for item in data:
        processed = item * 2
        result.append(processed)
    return result

if __name__ == "__main__":
    lp = LineProfiler()
    lp.add_function(process_data)

    data = list(range(100000))

    lp_wrapper = lp(process_data)
    lp_wrapper(data)

    lp.print_stats()
```
Pattern 3: memory_profiler - Memory Usage
```python
# Install: pip install memory-profiler

from memory_profiler import profile

@profile
def memory_intensive():
    """Function that uses lots of memory."""
    # Create large list
    big_list = [i for i in range(1000000)]

    # Create large dict
    big_dict = {i: i**2 for i in range(100000)}

    # Process data
    result = sum(big_list)

    return result

if __name__ == "__main__":
    memory_intensive()

# Run with:
#   python -m memory_profiler script.py
```
Pattern 4: py-spy - Production Profiling
```bash
# Install: pip install py-spy

# Profile a running Python process
py-spy top --pid 12345

# Generate flamegraph
py-spy record -o profile.svg --pid 12345

# Profile a script
py-spy record -o profile.svg -- python script.py

# Dump current call stack
py-spy dump --pid 12345
```
Optimization Patterns
Pattern 5: List Comprehensions vs Loops
```python
import timeit

# Slow: Traditional loop
def slow_squares(n):
    """Create list of squares using loop."""
    result = []
    for i in range(n):
        result.append(i**2)
    return result

# Fast: List comprehension
def fast_squares(n):
    """Create list of squares using comprehension."""
    return [i**2 for i in range(n)]

# Benchmark
n = 100000

slow_time = timeit.timeit(lambda: slow_squares(n), number=100)
fast_time = timeit.timeit(lambda: fast_squares(n), number=100)

print(f"Loop: {slow_time:.4f}s")
print(f"Comprehension: {fast_time:.4f}s")
print(f"Speedup: {slow_time/fast_time:.2f}x")

# Note: map only beats a comprehension when paired with a
# C-implemented function; wrapping a lambda adds call overhead,
# so benchmark before assuming map is faster.
def map_squares(n):
    """map with a lambda -- often slower than the comprehension."""
    return list(map(lambda x: x**2, range(n)))
```
Pattern 6: Generator Expressions for Memory
```python
import sys

def list_approach():
    """Memory-intensive list."""
    data = [i**2 for i in range(1000000)]
    return sum(data)

def generator_approach():
    """Memory-efficient generator."""
    data = (i**2 for i in range(1000000))
    return sum(data)

# Memory comparison
list_data = [i for i in range(1000000)]
gen_data = (i for i in range(1000000))

print(f"List size: {sys.getsizeof(list_data)} bytes")
print(f"Generator size: {sys.getsizeof(gen_data)} bytes")

# Generators use constant memory regardless of size
```
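Generators also compose into pipelines: each stage pulls one item at a time, so the whole chain runs in roughly constant memory regardless of input size. A sketch with illustrative stage names:

```python
def read_numbers(limit):
    """Source stage: yield numbers one at a time."""
    yield from range(limit)

def only_even(numbers):
    """Filter stage: keep even values."""
    return (n for n in numbers if n % 2 == 0)

def squared(numbers):
    """Transform stage: square each value."""
    return (n * n for n in numbers)

# Chain the stages; nothing is materialized until sum() consumes them
pipeline = squared(only_even(read_numbers(1_000_000)))
total = sum(pipeline)
print(f"Sum of squared evens: {total}")
```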
Pattern 7: String Concatenation
```python
import timeit

def slow_concat(items):
    """Slow string concatenation."""
    result = ""
    for item in items:
        result += str(item)
    return result

def fast_concat(items):
    """Fast string concatenation with join."""
    return "".join(str(item) for item in items)

def faster_concat(items):
    """Even faster with list."""
    parts = [str(item) for item in items]
    return "".join(parts)

items = list(range(10000))

# Benchmark
slow = timeit.timeit(lambda: slow_concat(items), number=100)
fast = timeit.timeit(lambda: fast_concat(items), number=100)
faster = timeit.timeit(lambda: faster_concat(items), number=100)

print(f"Concatenation (+): {slow:.4f}s")
print(f"Join (generator): {fast:.4f}s")
print(f"Join (list): {faster:.4f}s")
```
Pattern 8: Dictionary Lookups vs List Searches
```python
import timeit

# Create test data
size = 10000
items = list(range(size))
lookup_dict = {i: i for i in range(size)}

def list_search(items, target):
    """O(n) search in list."""
    return target in items

def dict_search(lookup_dict, target):
    """O(1) search in dict."""
    return target in lookup_dict

target = size - 1  # Worst case for list

# Benchmark
list_time = timeit.timeit(
    lambda: list_search(items, target),
    number=1000
)
dict_time = timeit.timeit(
    lambda: dict_search(lookup_dict, target),
    number=1000
)

print(f"List search: {list_time:.6f}s")
print(f"Dict search: {dict_time:.6f}s")
print(f"Speedup: {list_time/dict_time:.0f}x")
```
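When only membership matters, a set gives the same O(1) lookup as a dict without storing values. The comparison mirrors the benchmark above:

```python
import timeit

size = 10000
items = list(range(size))
lookup_set = set(items)

target = size - 1  # worst case for the list scan

list_time = timeit.timeit(lambda: target in items, number=1000)
set_time = timeit.timeit(lambda: target in lookup_set, number=1000)

print(f"List membership: {list_time:.6f}s")
print(f"Set membership: {set_time:.6f}s")
```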
Pattern 9: Local Variable Access
```python
import timeit

# Global variable (slow)
GLOBAL_VALUE = 100

def use_global():
    """Access global variable."""
    total = 0
    for i in range(10000):
        total += GLOBAL_VALUE
    return total

def use_local():
    """Use local variable."""
    local_value = 100
    total = 0
    for i in range(10000):
        total += local_value
    return total

# Local is faster
global_time = timeit.timeit(use_global, number=1000)
local_time = timeit.timeit(use_local, number=1000)

print(f"Global access: {global_time:.4f}s")
print(f"Local access: {local_time:.4f}s")
print(f"Speedup: {global_time/local_time:.2f}x")
```
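The same idea extends to attribute lookups in hot loops: binding a bound method to a local once avoids re-resolving it on every iteration. A micro-optimization worth applying only in measured hot paths:

```python
import timeit

def attribute_lookup_each_time():
    result = []
    for i in range(10000):
        result.append(i)  # resolves .append on every iteration
    return result

def bound_local():
    result = []
    append = result.append  # bind the method once
    for i in range(10000):
        append(i)
    return result

each_time = timeit.timeit(attribute_lookup_each_time, number=1000)
bound = timeit.timeit(bound_local, number=1000)

print(f"Attribute lookup per iteration: {each_time:.4f}s")
print(f"Bound local: {bound:.4f}s")
```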
Pattern 10: Function Call Overhead
```python
import timeit

def calculate_inline():
    """Inline calculation."""
    total = 0
    for i in range(10000):
        total += i * 2 + 1
    return total

def helper_function(x):
    """Helper function."""
    return x * 2 + 1

def calculate_with_function():
    """Calculation with function calls."""
    total = 0
    for i in range(10000):
        total += helper_function(i)
    return total

# Inline is faster due to no call overhead
inline_time = timeit.timeit(calculate_inline, number=1000)
function_time = timeit.timeit(calculate_with_function, number=1000)

print(f"Inline: {inline_time:.4f}s")
print(f"Function calls: {function_time:.4f}s")
```
Advanced Optimization
Pattern 11: NumPy for Numerical Operations
```python
import timeit
import numpy as np

def python_sum(n):
    """Sum using pure Python."""
    return sum(range(n))

def numpy_sum(n):
    """Sum using NumPy."""
    return np.arange(n).sum()

n = 1000000

python_time = timeit.timeit(lambda: python_sum(n), number=100)
numpy_time = timeit.timeit(lambda: numpy_sum(n), number=100)

print(f"Python: {python_time:.4f}s")
print(f"NumPy: {numpy_time:.4f}s")
print(f"Speedup: {python_time/numpy_time:.2f}x")

# Vectorized operations
def python_multiply():
    """Element-wise multiplication in Python."""
    a = list(range(100000))
    b = list(range(100000))
    return [x * y for x, y in zip(a, b)]

def numpy_multiply():
    """Vectorized multiplication in NumPy."""
    a = np.arange(100000)
    b = np.arange(100000)
    return a * b

py_time = timeit.timeit(python_multiply, number=100)
np_time = timeit.timeit(numpy_multiply, number=100)

print(f"\nPython multiply: {py_time:.4f}s")
print(f"NumPy multiply: {np_time:.4f}s")
print(f"Speedup: {py_time/np_time:.2f}x")
```
Pattern 12: Caching with functools.lru_cache
```python
from functools import lru_cache
import timeit

def fibonacci_slow(n):
    """Recursive fibonacci without caching."""
    if n < 2:
        return n
    return fibonacci_slow(n-1) + fibonacci_slow(n-2)

@lru_cache(maxsize=None)
def fibonacci_fast(n):
    """Recursive fibonacci with caching."""
    if n < 2:
        return n
    return fibonacci_fast(n-1) + fibonacci_fast(n-2)

# Massive speedup for recursive algorithms
n = 30

slow_time = timeit.timeit(lambda: fibonacci_slow(n), number=1)
fast_time = timeit.timeit(lambda: fibonacci_fast(n), number=1000)

print(f"Without cache (1 run): {slow_time:.4f}s")
print(f"With cache (1000 runs): {fast_time:.4f}s")

# Cache info
print(f"Cache info: {fibonacci_fast.cache_info()}")
```
Pattern 13: Using `__slots__` for Memory
```python
import sys

class RegularClass:
    """Regular class with __dict__."""
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

class SlottedClass:
    """Class with __slots__ for memory efficiency."""
    __slots__ = ['x', 'y', 'z']

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

# Memory comparison
regular = RegularClass(1, 2, 3)
slotted = SlottedClass(1, 2, 3)

print(f"Regular class size: {sys.getsizeof(regular)} bytes")
print(f"Slotted class size: {sys.getsizeof(slotted)} bytes")

# Significant savings with many instances.
# Note: getsizeof does not count the instance __dict__, so the
# real savings for regular instances are larger than shown here.
regular_objects = [RegularClass(i, i+1, i+2) for i in range(10000)]
slotted_objects = [SlottedClass(i, i+1, i+2) for i in range(10000)]

print(f"\nMemory for 10000 regular objects: ~{sys.getsizeof(regular) * 10000} bytes")
print(f"Memory for 10000 slotted objects: ~{sys.getsizeof(slotted) * 10000} bytes")
```
Pattern 14: Multiprocessing for CPU-Bound Tasks
```python
import multiprocessing as mp
import time

def cpu_intensive_task(n):
    """CPU-intensive calculation."""
    return sum(i**2 for i in range(n))

def sequential_processing():
    """Process tasks sequentially."""
    start = time.time()
    results = [cpu_intensive_task(1000000) for _ in range(4)]
    elapsed = time.time() - start
    return elapsed, results

def parallel_processing():
    """Process tasks in parallel."""
    start = time.time()
    with mp.Pool(processes=4) as pool:
        results = pool.map(cpu_intensive_task, [1000000] * 4)
    elapsed = time.time() - start
    return elapsed, results

if __name__ == "__main__":
    seq_time, seq_results = sequential_processing()
    par_time, par_results = parallel_processing()

    print(f"Sequential: {seq_time:.2f}s")
    print(f"Parallel: {par_time:.2f}s")
    print(f"Speedup: {seq_time/par_time:.2f}x")
```
Pattern 15: Async I/O for I/O-Bound Tasks
```python
import asyncio
import aiohttp
import time
import requests

urls = [
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/1",
]

def synchronous_requests():
    """Synchronous HTTP requests."""
    start = time.time()
    results = []
    for url in urls:
        response = requests.get(url)
        results.append(response.status_code)
    elapsed = time.time() - start
    return elapsed, results

async def async_fetch(session, url):
    """Async HTTP request."""
    async with session.get(url) as response:
        return response.status

async def asynchronous_requests():
    """Asynchronous HTTP requests."""
    start = time.time()
    async with aiohttp.ClientSession() as session:
        tasks = [async_fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    elapsed = time.time() - start
    return elapsed, results

# Async is much faster for I/O-bound work
sync_time, sync_results = synchronous_requests()
async_time, async_results = asyncio.run(asynchronous_requests())

print(f"Synchronous: {sync_time:.2f}s")
print(f"Asynchronous: {async_time:.2f}s")
print(f"Speedup: {sync_time/async_time:.2f}x")
```
Database Optimization
Pattern 16: Batch Database Operations
```python
import sqlite3
import time

def create_db():
    """Create test database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    return conn

def slow_inserts(conn, count):
    """Insert records one at a time."""
    start = time.time()
    cursor = conn.cursor()
    for i in range(count):
        cursor.execute("INSERT INTO users (name) VALUES (?)", (f"User {i}",))
        conn.commit()  # Commit each insert
    elapsed = time.time() - start
    return elapsed

def fast_inserts(conn, count):
    """Batch insert with single commit."""
    start = time.time()
    cursor = conn.cursor()
    data = [(f"User {i}",) for i in range(count)]
    cursor.executemany("INSERT INTO users (name) VALUES (?)", data)
    conn.commit()  # Single commit
    elapsed = time.time() - start
    return elapsed

# Benchmark
conn1 = create_db()
slow_time = slow_inserts(conn1, 1000)

conn2 = create_db()
fast_time = fast_inserts(conn2, 1000)

print(f"Individual inserts: {slow_time:.4f}s")
print(f"Batch insert: {fast_time:.4f}s")
print(f"Speedup: {slow_time/fast_time:.2f}x")
```
Pattern 17: Query Optimization
```python
# Use indexes for frequently queried columns
"""
-- Slow: No index
SELECT * FROM users WHERE email = 'user@example.com';

-- Fast: With index
CREATE INDEX idx_users_email ON users(email);
SELECT * FROM users WHERE email = 'user@example.com';
"""

# Inspect the query plan
import sqlite3

conn = sqlite3.connect("example.db")
cursor = conn.cursor()

# Analyze query performance
cursor.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
    ("test@example.com",),
)
print(cursor.fetchall())

# Select only the columns you need
# Slow: SELECT *
# Fast: SELECT id, name
```
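The index advice can be verified empirically. This self-contained sketch (table and index names are illustrative) times lookups before and after `CREATE INDEX` on an in-memory database:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(100000)],
)
conn.commit()

def time_lookups(conn, n=200):
    """Time n point lookups by email."""
    start = time.perf_counter()
    for i in range(n):
        conn.execute(
            "SELECT id FROM users WHERE email = ?", (f"user{i}@example.com",)
        ).fetchone()
    return time.perf_counter() - start

before = time_lookups(conn)   # full table scan per lookup
conn.execute("CREATE INDEX idx_users_email ON users(email)")
after = time_lookups(conn)    # indexed lookup

print(f"Without index: {before:.4f}s")
print(f"With index: {after:.4f}s")
```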
Memory Optimization
Pattern 18: Detecting Memory Leaks
```python
import tracemalloc
import gc

def memory_leak_example():
    """Example that leaks memory."""
    leaked_objects = []

    for i in range(100000):
        # Objects added but never removed
        leaked_objects.append([i] * 100)

    # In real code, this would be an unintended reference

def track_memory_usage():
    """Track memory allocations."""
    tracemalloc.start()

    # Take snapshot before
    snapshot1 = tracemalloc.take_snapshot()

    # Run code
    memory_leak_example()

    # Take snapshot after
    snapshot2 = tracemalloc.take_snapshot()

    # Compare
    top_stats = snapshot2.compare_to(snapshot1, 'lineno')

    print("Top 10 memory allocations:")
    for stat in top_stats[:10]:
        print(stat)

    tracemalloc.stop()

# Monitor memory
track_memory_usage()

# Force garbage collection
gc.collect()
```
Pattern 19: Iterators vs Lists
```python
def process_file_list(filename):
    """Load entire file into memory."""
    with open(filename) as f:
        lines = f.readlines()  # Loads all lines
    return sum(1 for line in lines if line.strip())

def process_file_iterator(filename):
    """Process file line by line."""
    with open(filename) as f:
        return sum(1 for line in f if line.strip())

# Iterator uses constant memory
# List loads entire file into memory
```
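The same constant-memory idea works for binary files: `iter()` with a sentinel reads fixed-size chunks. This sketch writes its own sample file so it runs as-is (the file name is illustrative):

```python
from functools import partial

# Create a sample file for the demonstration
with open("sample.bin", "wb") as f:
    f.write(b"x" * 1_000_000)

def count_bytes_chunked(filename, chunk_size=65536):
    """Read the file in fixed-size chunks; memory stays ~chunk_size."""
    total = 0
    with open(filename, "rb") as f:
        # iter(callable, sentinel) calls f.read until it returns b""
        for chunk in iter(partial(f.read, chunk_size), b""):
            total += len(chunk)
    return total

print(count_bytes_chunked("sample.bin"))
```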
Pattern 20: Weakref for Caches
```python
import weakref

class CachedResource:
    """Resource that can be garbage collected."""
    def __init__(self, data):
        self.data = data

# Regular cache prevents garbage collection
regular_cache = {}

def get_resource_regular(key):
    """Get resource from regular cache."""
    if key not in regular_cache:
        regular_cache[key] = CachedResource(f"Data for {key}")
    return regular_cache[key]

# Weak reference cache allows garbage collection
weak_cache = weakref.WeakValueDictionary()

def get_resource_weak(key):
    """Get resource from weak cache."""
    resource = weak_cache.get(key)
    if resource is None:
        resource = CachedResource(f"Data for {key}")
        weak_cache[key] = resource
    return resource

# When no strong references exist, objects can be GC'd
```
Benchmarking Tools
Custom Benchmark Decorator
```python
import time
from functools import wraps

def benchmark(func):
    """Decorator to benchmark function execution."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} seconds")
        return result
    return wrapper

@benchmark
def slow_function():
    """Function to benchmark."""
    time.sleep(0.5)
    return sum(range(1000000))

result = slow_function()
```
Performance Testing with pytest-benchmark
```python
# Install: pip install pytest-benchmark

def test_list_comprehension(benchmark):
    """Benchmark list comprehension."""
    result = benchmark(lambda: [i**2 for i in range(10000)])
    assert len(result) == 10000

def test_map_function(benchmark):
    """Benchmark map function."""
    result = benchmark(lambda: list(map(lambda x: x**2, range(10000))))
    assert len(result) == 10000

# Run with: pytest test_performance.py --benchmark-compare
```
Best Practices
- Profile before optimizing - Measure to find real bottlenecks
- Focus on hot paths - Optimize code that runs most frequently
- Use appropriate data structures - Dict for lookups, set for membership
- Avoid premature optimization - Clarity first, then optimize
- Use built-in functions - They’re implemented in C
- Cache expensive computations - Use lru_cache
- Batch I/O operations - Reduce system calls
- Use generators for large datasets
- Consider NumPy for numerical operations
- Profile production code - Use py-spy for live systems
Common Pitfalls
- Optimizing without profiling
- Using global variables unnecessarily
- Not using appropriate data structures
- Creating unnecessary copies of data
- Not using connection pooling for databases
- Ignoring algorithmic complexity
- Over-optimizing rare code paths
- Not considering memory usage
Resources
- cProfile: Built-in CPU profiler
- memory_profiler: Memory usage profiling
- line_profiler: Line-by-line profiling
- py-spy: Sampling profiler for production
- NumPy: High-performance numerical computing
- Cython: Compile Python to C
- PyPy: Alternative Python interpreter with JIT
Performance Checklist
- Profiled code to identify bottlenecks
- Used appropriate data structures
- Implemented caching where beneficial
- Optimized database queries
- Used generators for large datasets
- Considered multiprocessing for CPU-bound tasks
- Used async I/O for I/O-bound tasks
- Minimized function call overhead in hot loops
- Checked for memory leaks
- Benchmarked before and after optimization
Information
- Author: wshobson
- Updated: 2026-01-30
- Category: debugging