Grafana Dashboards
Build production-ready Grafana dashboards for comprehensive observability
Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.
User Prompt
Help me create a Grafana dashboard for monitoring our API service with request rate, error percentage, and P95 latency panels. Include alerting for error rates above 5%.
Agent Response
Complete dashboard JSON with request rate graphs, error rate monitoring with alerts, and latency percentile visualization
Quick Start (3 Steps)
Get up and running in minutes
Install
claude-code skill install grafana-dashboards
Config
First Trigger
@grafana-dashboards help
Commands
| Command | Description | Required Args |
|---|---|---|
| @grafana-dashboards api-service-monitoring | Create a comprehensive dashboard to monitor API service health using the RED method (Rate, Errors, Duration) | None |
| @grafana-dashboards infrastructure-overview | Build a high-level infrastructure dashboard showing cluster health and resource utilization | None |
| @grafana-dashboards database-performance-dashboard | Design a database monitoring dashboard with key performance indicators and connection metrics | None |
Typical Use Cases
API Service Monitoring
Create a comprehensive dashboard to monitor API service health using the RED method (Rate, Errors, Duration)
Infrastructure Overview
Build a high-level infrastructure dashboard showing cluster health and resource utilization
Database Performance Dashboard
Design a database monitoring dashboard with key performance indicators and connection metrics
Overview
Grafana Dashboards
Create and manage production-ready Grafana dashboards for comprehensive system observability.
Purpose
Design effective Grafana dashboards for monitoring applications, infrastructure, and business metrics.
When to Use
- Visualize Prometheus metrics
- Create custom dashboards
- Implement SLO dashboards
- Monitor infrastructure
- Track business KPIs
Dashboard Design Principles
1. Hierarchy of Information
┌─────────────────────────────────────┐
│ Critical Metrics (Big Numbers) │
├─────────────────────────────────────┤
│ Key Trends (Time Series) │
├─────────────────────────────────────┤
│ Detailed Metrics (Tables/Heatmaps) │
└─────────────────────────────────────┘
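As a rough sketch, the hierarchy above can be laid out on Grafana's 24-column grid with `gridPos`: a band of stat panels on top, a full-width time series below, and a table at the bottom. Panel titles and heights here are illustrative assumptions, and targets are omitted for brevity. `timeseries` is the current panel type that replaces the older `graph` panel.

```json
{
  "panels": [
    { "type": "stat", "title": "Requests/s", "gridPos": { "x": 0, "y": 0, "w": 8, "h": 4 } },
    { "type": "stat", "title": "Error Rate %", "gridPos": { "x": 8, "y": 0, "w": 8, "h": 4 } },
    { "type": "stat", "title": "P95 Latency", "gridPos": { "x": 16, "y": 0, "w": 8, "h": 4 } },
    { "type": "timeseries", "title": "Request Rate", "gridPos": { "x": 0, "y": 4, "w": 24, "h": 8 } },
    { "type": "table", "title": "Service Status", "gridPos": { "x": 0, "y": 12, "w": 24, "h": 8 } }
  ]
}
```

Each band starts at the `y` where the previous one ends (0, 4, 12), so the most critical numbers stay above the fold.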
2. RED Method (Services)
- Rate - Requests per second
- Errors - Failed requests per second
- Duration - Latency/response time
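The three RED signals map onto three PromQL queries. The sketch below reuses the metric names from this document's own examples (`http_requests_total`, `http_request_duration_seconds_bucket`) and assumes a `status` label carrying the HTTP status code; adjust to whatever your services actually export. The queries are shown as targets of one panel only to put them side by side; in practice each would usually get its own panel because the units differ.

```json
{
  "targets": [
    {
      "refId": "A",
      "expr": "sum(rate(http_requests_total[5m])) by (service)",
      "legendFormat": "Rate: {{service}}"
    },
    {
      "refId": "B",
      "expr": "sum(rate(http_requests_total{status=~\"5..\"}[5m])) by (service)",
      "legendFormat": "Errors: {{service}}"
    },
    {
      "refId": "C",
      "expr": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service))",
      "legendFormat": "P95: {{service}}"
    }
  ]
}
```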
3. USE Method (Resources)
- Utilization - % time resource is busy
- Saturation - Queue length/wait time
- Errors - Error count
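A possible set of USE queries for a Linux host, assuming metrics come from Prometheus node_exporter (`node_cpu_seconds_total`, `node_load1`, `node_network_receive_errs_total`). The choice of load average as the saturation signal and network receive errors as the error signal is an illustrative assumption, not the only reasonable mapping.

```json
{
  "targets": [
    {
      "refId": "A",
      "expr": "100 - (avg by (instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
      "legendFormat": "CPU utilization %: {{instance}}"
    },
    {
      "refId": "B",
      "expr": "node_load1",
      "legendFormat": "Load (1m): {{instance}}"
    },
    {
      "refId": "C",
      "expr": "sum by (instance) (rate(node_network_receive_errs_total[5m]))",
      "legendFormat": "Network receive errors/s: {{instance}}"
    }
  ]
}
```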
Dashboard Structure
API Monitoring Dashboard
{
  "dashboard": {
    "title": "API Monitoring",
    "tags": ["api", "production"],
    "timezone": "browser",
    "refresh": "30s",
    "panels": [
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(http_requests_total[5m])) by (service)",
            "legendFormat": "{{service}}"
          }
        ],
        "gridPos": { "x": 0, "y": 0, "w": 12, "h": 8 }
      },
      {
        "title": "Error Rate %",
        "type": "graph",
        "targets": [
          {
            "refId": "A",
            "expr": "(sum(rate(http_requests_total{status=~\"5..\"}[5m])) / sum(rate(http_requests_total[5m]))) * 100",
            "legendFormat": "Error Rate"
          }
        ],
        "alert": {
          "conditions": [
            {
              "evaluator": { "params": [5], "type": "gt" },
              "operator": { "type": "and" },
              "query": { "params": ["A", "5m", "now"] },
              "reducer": { "type": "avg" },
              "type": "query"
            }
          ]
        },
        "gridPos": { "x": 12, "y": 0, "w": 12, "h": 8 }
      },
      {
        "title": "P95 Latency",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service))",
            "legendFormat": "{{service}}"
          }
        ],
        "gridPos": { "x": 0, "y": 8, "w": 24, "h": 8 }
      }
    ]
  }
}
Reference: See assets/api-dashboard.json
Panel Types
1. Stat Panel (Single Value)
{
  "type": "stat",
  "title": "Total Requests",
  "targets": [
    {
      "expr": "sum(http_requests_total)"
    }
  ],
  "options": {
    "reduceOptions": {
      "values": false,
      "calcs": ["lastNotNull"]
    },
    "orientation": "auto",
    "textMode": "auto",
    "colorMode": "value"
  },
  "fieldConfig": {
    "defaults": {
      "thresholds": {
        "mode": "absolute",
        "steps": [
          { "value": 0, "color": "green" },
          { "value": 80, "color": "yellow" },
          { "value": 90, "color": "red" }
        ]
      }
    }
  }
}
2. Time Series Graph
{
  "type": "graph",
  "title": "CPU Usage",
  "targets": [
    {
      "expr": "100 - (avg by (instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)"
    }
  ],
  "yaxes": [
    { "format": "percent", "max": 100, "min": 0 },
    { "format": "short" }
  ]
}
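The `graph` type is Grafana's older graph panel. Recent Grafana versions use the `timeseries` panel instead, which takes its axis settings from `fieldConfig` rather than `yaxes`. A minimal sketch of the same CPU panel in that form, assuming a Grafana version that ships the `timeseries` panel (the `legendFormat` is an addition for readability):

```json
{
  "type": "timeseries",
  "title": "CPU Usage",
  "targets": [
    {
      "expr": "100 - (avg by (instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
      "legendFormat": "{{instance}}"
    }
  ],
  "fieldConfig": {
    "defaults": {
      "unit": "percent",
      "min": 0,
      "max": 100
    }
  }
}
```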
3. Table Panel
{
  "type": "table",
  "title": "Service Status",
  "targets": [
    {
      "expr": "up",
      "format": "table",
      "instant": true
    }
  ],
  "transformations": [
    {
      "id": "organize",
      "options": {
        "excludeByName": { "Time": true },
        "indexByName": {},
        "renameByName": {
          "instance": "Instance",
          "job": "Service",
          "Value": "Status"
        }
      }
    }
  ]
}
4. Heatmap
{
  "type": "heatmap",
  "title": "Latency Heatmap",
  "targets": [
    {
      "expr": "sum(rate(http_request_duration_seconds_bucket[5m])) by (le)",
      "format": "heatmap"
    }
  ],
  "dataFormat": "tsbuckets",
  "yAxis": {
    "format": "s"
  }
}
Variables
Query Variables
{
  "templating": {
    "list": [
      {
        "name": "namespace",
        "type": "query",
        "datasource": "Prometheus",
        "query": "label_values(kube_pod_info, namespace)",
        "refresh": 1,
        "multi": false
      },
      {
        "name": "service",
        "type": "query",
        "datasource": "Prometheus",
        "query": "label_values(kube_service_info{namespace=\"$namespace\"}, service)",
        "refresh": 1,
        "multi": true
      }
    ]
  }
}
Use Variables in Queries
sum(rate(http_requests_total{namespace="$namespace", service=~"$service"}[5m]))
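Because `service` is a multi-value variable, a panel can also be repeated once per selected value. The sketch below assumes the `namespace` and `service` variables defined above and the `timeseries` panel type; the title and `maxPerRow` value are illustrative.

```json
{
  "type": "timeseries",
  "title": "Request Rate: $service",
  "repeat": "service",
  "repeatDirection": "h",
  "maxPerRow": 4,
  "targets": [
    {
      "expr": "sum by (service) (rate(http_requests_total{namespace=\"$namespace\", service=~\"$service\"}[5m]))",
      "legendFormat": "{{service}}"
    }
  ]
}
```

Inside each repeated copy, `$service` resolves to a single value, so every panel shows one service.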
Alerts in Dashboards
{
  "alert": {
    "name": "High Error Rate",
    "conditions": [
      {
        "evaluator": {
          "params": [5],
          "type": "gt"
        },
        "operator": { "type": "and" },
        "query": {
          "params": ["A", "5m", "now"]
        },
        "reducer": { "type": "avg" },
        "type": "query"
      }
    ],
    "executionErrorState": "alerting",
    "for": "5m",
    "frequency": "1m",
    "message": "Error rate is above 5%",
    "noDataState": "no_data",
    "notifications": [{ "uid": "slack-channel" }]
  }
}
Dashboard Provisioning
dashboards.yml:
apiVersion: 1

providers:
  - name: "default"
    orgId: 1
    folder: "General"
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10
    allowUiUpdates: true
    options:
      path: /etc/grafana/dashboards
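Files under the provider's `path` generally hold the dashboard model itself (the object that sits under `"dashboard"` in an API-style payload), not the wrapped form. A minimal sketch of such a file, for example saved as `/etc/grafana/dashboards/api-monitoring.json`; the file name and `uid` are hypothetical, and `panels` is left empty for brevity.

```json
{
  "uid": "api-monitoring",
  "title": "API Monitoring",
  "tags": ["api", "production"],
  "timezone": "browser",
  "refresh": "30s",
  "panels": []
}
```

Setting a stable `uid` keeps dashboard links consistent when the file is reloaded or deployed to another Grafana instance.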
Common Dashboard Patterns
Infrastructure Dashboard
Key Panels:
- CPU utilization per node
- Memory usage per node
- Disk I/O
- Network traffic
- Pod count by namespace
- Node status
Reference: See assets/infrastructure-dashboard.json
Database Dashboard
Key Panels:
- Queries per second
- Connection pool usage
- Query latency (P50, P95, P99)
- Active connections
- Database size
- Replication lag
- Slow queries
Reference: See assets/database-dashboard.json
Application Dashboard
Key Panels:
- Request rate
- Error rate
- Response time (percentiles)
- Active users/sessions
- Cache hit rate
- Queue length
Best Practices
- Start with templates (Grafana community dashboards)
- Use consistent naming for panels and variables
- Group related metrics in rows
- Set appropriate time ranges (default: Last 6 hours)
- Use variables for flexibility
- Add panel descriptions for context
- Configure units correctly
- Set meaningful thresholds for colors
- Use consistent colors across dashboards
- Test with different time ranges
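Several of these practices correspond directly to fields in the dashboard model. The sketch below shows a default time range of the last 6 hours, a panel description, an explicit unit, and color thresholds. The threshold values (0.5 s and 1 s) and the description text are illustrative assumptions; the query is the P95 latency example used earlier in this document.

```json
{
  "title": "API Monitoring",
  "time": { "from": "now-6h", "to": "now" },
  "panels": [
    {
      "type": "timeseries",
      "title": "P95 Latency",
      "description": "95th percentile request latency per service over a 5m window.",
      "targets": [
        {
          "expr": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service))",
          "legendFormat": "{{service}}"
        }
      ],
      "fieldConfig": {
        "defaults": {
          "unit": "s",
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "value": null, "color": "green" },
              { "value": 0.5, "color": "yellow" },
              { "value": 1, "color": "red" }
            ]
          }
        }
      },
      "gridPos": { "x": 0, "y": 0, "w": 24, "h": 8 }
    }
  ]
}
```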
Dashboard as Code
Terraform Provisioning
resource "grafana_dashboard" "api_monitoring" {
  config_json = file("${path.module}/dashboards/api-monitoring.json")
  folder      = grafana_folder.monitoring.id
}

resource "grafana_folder" "monitoring" {
  title = "Production Monitoring"
}
Ansible Provisioning
- name: Deploy Grafana dashboards
  copy:
    src: "{{ item }}"
    dest: /etc/grafana/dashboards/
  with_fileglob:
    - "dashboards/*.json"
  notify: restart grafana
Reference Files
- assets/api-dashboard.json - API monitoring dashboard
- assets/infrastructure-dashboard.json - Infrastructure dashboard
- assets/database-dashboard.json - Database monitoring dashboard
- references/dashboard-design.md - Dashboard design guide
Related Skills
- prometheus-configuration - For metric collection
- slo-implementation - For SLO dashboards
Information
- Author: wshobson
- Updated: 2026-01-30
- Category: productivity-tools