Best practices for querying logs
This guide will help you write efficient LogQL queries in our Loki setup. Following these practices will make your queries faster and use fewer resources.
Understanding the Basics
What Makes Queries Fast?
Our Loki setup uses bloom filters for query acceleration. Think of bloom filters as a quick index that helps Loki know which data chunks might contain what you're looking for, without having to read everything.
Structured Metadata vs Stream Labels
Before diving into best practices, understand these two concepts:
- Stream labels: The labels in curly braces, for example {app="nginx", namespace="prod"}
- Structured metadata: Additional data attached to logs that you filter with pipe operators, for example | key="value"
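A single query often combines both. As a minimal sketch (assuming nginx logs in a prod namespace, and that your logs carry the detected_level structured metadata field used in the examples later in this guide):
{app="nginx", namespace="prod"} | detected_level="error"
The curly braces select streams through the label index; the pipe filter then narrows on structured metadata without parsing the log line itself.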
Essential Best Practices
1. Start Small, Then Expand
Always test your queries on small time ranges first:
- Start with 5 minutes or 1 hour
- Once it works, expand to larger ranges
- This prevents accidental resource overuse
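In Grafana Explore the time range comes from the time picker. If you test from the command line with logcli instead, you can cap both the range and the number of returned lines explicitly. A minimal sketch (the app name is a placeholder, and it assumes logcli is already configured against our Loki instance):
logcli query '{app="myapp"} |= "error"' --since=15m --limit=100
Once that returns what you expect, widen --since step by step rather than jumping straight to days.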
2. Use Labels to Filter First
Loki indexes labels, not log content. Always start with label filters:
✅ Good - Fast:
{app="myapp", namespace="prod"} |= "error"
❌ Bad - Very slow:
{} |= "error"
The second query scans ALL logs in the system!
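You can also narrow further with additional label matchers before any line filter runs. LogQL label matchers support =, !=, =~ and !~, so a sketch like the following (the namespace values are placeholders) still stays on the indexed path:
{app="myapp", namespace=~"prod|staging"} |= "error"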
3. Order Matters for Performance
To use query acceleration with bloom filters, place filters in the right order:
✅ Fast - Filters before parsing:
{app="nginx"}
| detected_level="error" # Filter first
| json # Parse second
❌ Slow - Parsing before filtering:
{app="nginx"}
| json # Parse first (slow!)
| detected_level="error" # Filter second
4. Use Simple String Matching When Possible
- |= (exact match) is faster than |~ (regex)
- Only use regex when you really need pattern matching
✅ Fast:
{app="nginx"} |= "404"
❌ Slower:
{app="nginx"} |~ "40[0-9]"
5. Parse Only What You Need
JSON and logfmt parsing are expensive. Only parse when necessary:
✅ Good - Filter first, then parse:
{app="api"}
|= "orderId" # Filter to relevant logs
| json orderId # Parse only the field you need
| orderId="12345"
❌ Bad - Parse everything:
{app="api"}
| json # Parses all fields in every log!
| orderId="12345"
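If the field you need is nested, the json parser also accepts label="expression" arguments, so you can still extract just that one field. A sketch, assuming a hypothetical payload that nests the ID under order.id:
{app="api"}
|= "12345" # Cheap line filter on the ID value
| json orderId="order.id" # Extract only the nested field
| orderId="12345"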
Working with Metrics and Aggregations
6. Aggregate After Filtering
When using functions like count_over_time or rate, filter your data first:
✅ Efficient - filter first, then count:
count_over_time({app="nginx"} |= "error" [5m])
❌ Inefficient:
count_over_time({app="nginx"}[5m]) # Counts everything, even non-errors
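Labels you extract after filtering can also drive grouping. A sketch that counts the filtered errors per status code (assuming the logs are JSON with a status field):
sum by (status) (count_over_time({app="nginx"} |= "error" | json status [5m]))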
7. Use Small Time Windows
For aggregation functions, use the smallest window that gives you the resolution you need:
✅ Good:
sum(rate({app="payment"} |= "error" [5m]))
❌ Resource-intensive:
sum(rate({app="payment"} |= "error" [1h]))
Advanced Tips
8. Avoid High-Cardinality Labels
Don't use labels with millions of unique values (like trace_id, user_id, or session_id) as stream labels. Keep them in structured metadata instead.
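Kept as structured metadata, those IDs are still easy to look up, because you can filter on them after the stream selector. A sketch, assuming trace IDs are shipped as the structured metadata field trace_id (the app name and ID value are placeholders):
{app="checkout", namespace="prod"} | trace_id="4bf92f3577b34da6"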
9. Structure Complex Queries in Stages
For complex queries, think in three stages:
- Narrow down: Use stream labels and simple filters
- Parse: Extract only needed fields from the filtered logs
- Aggregate: Summarize the results
Example:
{app="payment", namespace="prod"} # 1. Narrow with labels
|= "timeout" # 1. Further narrow with string filter
| json status # 2. Parse only status field
| status="500" # 2. Filter parsed data
| count_over_time[5m] # 3. Aggregate
10. Watch Your Stream Count
Each unique combination of labels creates a "stream". If your query matches too many streams (50,000+), it will be slow regardless of other optimizations.
Check how many streams a selector matches with:
count(count_over_time({app="myapp"}[1m]))
Add a by (...) clause with a specific label if you want the count broken down per label value.
Quick Reference: Do's and Don'ts
| Do ✅ | Don't ❌ |
|---|---|
| Start with label selectors | Use {} without any labels |
| Filter before parsing | Parse before filtering |
| Use \|= for exact matches | Use \|~ regex unnecessarily |
| Test on small time ranges | Start with 30-day queries |
| Aggregate after filtering | Aggregate raw data |
| Keep high-cardinality data in structured metadata | Use high-cardinality stream labels |
Additional Performance Considerations
11. Avoid Querying During Peak Hours
If you're doing deep scans (e.g., across days or multiple services), try off-peak hours to avoid throttling and reduce impact on other users.
12. Query Cost Awareness
Make sure developers understand that log queries can be expensive. Encourage:
- Dashboards: Use metrics for high-frequency monitoring, not logs
- Logs: Reserve for debugging or audit trails
- Explore vs Dashboards: Use Explore for ad hoc queries, Dashboards for focused queries
Warning: Dashboard queries auto-refresh frequently! Avoid embedding expensive queries into dashboards.
Need More Help?
- For getting started with Loki/Grafana, see the Getting Started Guide
- Contact the Platon team (#platon) if you need assistance optimizing specific queries