Prometheus Sessions
Prometheus collects and analyzes metrics so that you can monitor your applications and infrastructure.
This page lists questions that help you check your Prometheus setup, dig into your metrics, and tune your monitoring for better performance.
Key areas of focus
Below are some useful questions to ask when managing your Prometheus setup:
General monitoring
- What are the latest metrics collected from my applications?
- Can I view a list of all active metrics in my Prometheus environment?
- Are there any metrics that have exceeded their thresholds in the past 24 hours?
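
For the question about listing active metrics, one possible approach is to ask the Prometheus HTTP API for every value of the `__name__` label. The sketch below assumes a server reachable at `http://localhost:9090`; that address and the helper names are illustrative and not part of this page.

```python
import json
from urllib.request import urlopen

# Assumed address of a local Prometheus server; change it for your setup.
PROMETHEUS_URL = "http://localhost:9090"


def metric_names_url(base_url: str = PROMETHEUS_URL) -> str:
    """Build the URL of the label-values endpoint for the __name__ label."""
    return base_url.rstrip("/") + "/api/v1/label/__name__/values"


def parse_metric_names(payload: dict) -> list[str]:
    """Return the sorted metric names from a decoded API response."""
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus returned an error: {payload.get('error')}")
    return sorted(payload.get("data", []))


def fetch_metric_names(base_url: str = PROMETHEUS_URL) -> list[str]:
    """Query a running server (needs network access to base_url)."""
    with urlopen(metric_names_url(base_url), timeout=10) as response:
        return parse_metric_names(json.load(response))
```

Calling `fetch_metric_names()` against a running server returns the metric names it currently knows about; the two helper functions can be used on their own without network access.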
Alerts & notifications
- What alerts are currently active in Prometheus?
- Can I get a list of alerts triggered in the last 7 days?
- How can I set up notifications for specific metrics, like CPU usage or response time?
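
As a rough illustration of the first question, the `/api/v1/alerts` endpoint of the Prometheus HTTP API returns the alerts the server is currently tracking. The sketch below only filters an already decoded response; the function name and the shape of the returned dictionaries are assumptions made for this example.

```python
def firing_alerts(payload: dict) -> list[dict]:
    """Pick the alerts in state "firing" out of a decoded /api/v1/alerts response."""
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus returned an error: {payload.get('error')}")
    alerts = payload.get("data", {}).get("alerts", [])
    return [
        {
            # "alertname" is the label Prometheus sets from the alerting rule name.
            "name": alert.get("labels", {}).get("alertname", "<unnamed>"),
            "active_at": alert.get("activeAt"),
            "labels": alert.get("labels", {}),
        }
        for alert in alerts
        if alert.get("state") == "firing"
    ]
```

This endpoint reports only alerts that are pending or firing right now, so it does not answer the seven-day history question on its own.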
Metric queries & analysis
- How do I write a query to monitor the CPU usage of my hosts?
- What’s the average response time of my services over the past week?
- How can I identify any recent spikes in memory usage across my servers?
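
One common way to answer the CPU usage question is a PromQL expression over `node_cpu_seconds_total`. The sketch below assumes the hosts run node_exporter (which this page does not state) and that the server listens on `http://localhost:9090`; the constant and function names are illustrative.

```python
from urllib.parse import urlencode

# Percentage of non-idle CPU time per instance over the last 5 minutes.
# Assumes node_exporter exposes node_cpu_seconds_total on the monitored hosts.
CPU_USAGE_QUERY = (
    '100 - (avg by (instance) '
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'
)


def instant_query_url(query: str, base_url: str = "http://localhost:9090") -> str:
    """Build a URL for the instant-query endpoint, URL-encoding the PromQL text."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


def cpu_usage_url(base_url: str = "http://localhost:9090") -> str:
    """URL that evaluates CPU_USAGE_QUERY at the current time."""
    return instant_query_url(CPU_USAGE_QUERY, base_url)
```

The same `instant_query_url` helper works for any other expression, for example a memory or response-time query, once the metric names exposed in your environment are known.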
Dashboards & visualizations
- How can I create a new dashboard for tracking key metrics?
- Which dashboards provide the best overview of my environment?
- Are there any unused or redundant widgets in my dashboards?
Performance optimization
- What are some best practices for writing efficient Prometheus queries?
- How can I reduce the load on my Prometheus server?
- Are there any low-traffic metrics I can archive or delete to save resources?
Troubleshooting
- Can I see recent gaps or failures in data collection?
- Are there any issues with data retention or storage usage?
- How can I investigate high memory usage by Prometheus itself?
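
For the data-collection question, a possible starting point is the `/api/v1/targets` endpoint, which reports the health and last scrape error of each active target. The sketch below works on an already decoded response; the function name and the returned fields are choices made for this example.

```python
def unhealthy_targets(payload: dict) -> list[dict]:
    """List active targets whose last scrape did not succeed.

    Expects a decoded /api/v1/targets response.
    """
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus returned an error: {payload.get('error')}")
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        {
            "job": target.get("labels", {}).get("job"),
            "instance": target.get("labels", {}).get("instance"),
            "health": target.get("health"),
            "last_error": target.get("lastError", ""),
        }
        for target in targets
        # health is "up", "down", or "unknown"; anything but "up" needs a look.
        if target.get("health") != "up"
    ]
```

An empty result means every active target was scraped successfully the last time Prometheus tried; it says nothing about gaps further in the past.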