Our team hit a familiar Snowflake paradox: slower ETL runs arrived at the same time as FinOps alerts about rising credits.
The warehouse in question was WH_ETL_BRONZE_01, a multi-cluster warehouse dedicated to Bronze layer ingestion and merge workloads. What looked like a simple cost problem turned out to be a workload-isolation and concurrency problem.
The Problem
We started with this setup:
```sql
ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'XSMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 8; -- default behavior
```
And we saw both of these at once:
- Higher queue times (queued_overload_time)
- Higher daily credits
The critical issue was workload mix. Long-running MERGE statements (30 to 60 minutes) were running alongside tiny queries. When too many heavy MERGE statements landed on the same node/cluster, they competed for resources and all slowed down.
In other words, even with multi-cluster enabled, assignment patterns could create unstable performance if too many heavyweight queries were packed together.
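One way to see this packing in the data is to look for heavy MERGE statements whose execution windows overlap. The query below is a sketch against ACCOUNT_USAGE.QUERY_HISTORY; the 10-minute "heavy" threshold and the 1-day window are illustrative choices, not values from our analysis:

```sql
-- Pairs of heavy MERGE statements (>10 min; illustrative threshold) whose
-- execution windows overlapped on the same warehouse.
WITH heavy_merges AS (
    SELECT query_id, start_time, end_time
    FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
    WHERE warehouse_name = 'WH_ETL_BRONZE_01'
      AND query_type = 'MERGE'
      AND total_elapsed_time > 10 * 60 * 1000  -- total_elapsed_time is in ms
      AND start_time >= DATEADD('day', -1, CURRENT_TIMESTAMP)
)
SELECT a.query_id,
       COUNT(*) AS overlapping_heavy_merges
FROM heavy_merges a
JOIN heavy_merges b
  ON a.query_id <> b.query_id
 AND a.start_time < b.end_time
 AND b.start_time < a.end_time  -- classic interval-overlap test
GROUP BY a.query_id
ORDER BY overlapping_heavy_merges DESC;
```

Queries that show up here with high overlap counts are the ones most likely to be contending for the same cluster's resources.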
What We Changed
This was not a single before/after discovery from history. We ran controlled experiments in sequence.
Phase 1: Baseline
```sql
ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'XSMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 8;
```
Phase 2: First Experiment
```sql
ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'XSMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 2;
```
Goal: force scale-out behavior earlier and reduce the chance that many heavy MERGE jobs share the same node.
Phase 3: Final Optimization
```sql
ALTER WAREHOUSE WH_ETL_BRONZE_01 SET
  WAREHOUSE_SIZE = 'SMALL',
  MAX_CLUSTER_COUNT = 10,
  MIN_CLUSTER_COUNT = 1,
  SCALING_POLICY = 'STANDARD',
  AUTO_SUSPEND = 60,
  MAX_CONCURRENCY_LEVEL = 3;
```
This became the best balance for our workload profile.
Why This Helped
From deeper analysis of the workload behavior:
- The number of queries and output volume could look similar across runs, but bytes scanned at table level could still rise.
- Heavy MERGE overlap was a key instability driver.
- On slow runs, many heavy MERGE statements could get assigned to the same node, causing resource contention.
- On fast runs, fewer heavy MERGE statements shared a node, leaving room for short-running queries.
Lowering concurrency reduced the tendency to over-pack heavyweight merges. Then moving to SMALL provided enough per-query resources to cut elapsed time and queueing further.
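One way to verify that lowering MAX_CONCURRENCY_LEVEL actually shifted work onto more clusters rather than just queueing it is WAREHOUSE_LOAD_HISTORY. This is a sketch; the 7-day window is an illustrative choice:

```sql
-- Running vs. queued load over time. After the change, rising avg_running
-- alongside falling avg_queued_load suggests earlier scale-out is working.
SELECT start_time,
       avg_running,
       avg_queued_load,         -- queued because the warehouse was busy
       avg_queued_provisioning  -- queued while clusters were starting up
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_LOAD_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP)
ORDER BY start_time;
```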
Configuration Summary
| Phase | Size | Max Clusters | Max Concurrency | Intent |
|---|---|---|---|---|
| Baseline | XSMALL | 10 | 8 | Initial setup |
| Experiment | XSMALL | 10 | 2 | Force earlier scale-out / reduce heavy-query packing |
| Final | SMALL | 10 | 3 | Balance throughput, queueing, and cost |
The two intentional changes vs. baseline were:
- WAREHOUSE_SIZE: XSMALL -> SMALL
- MAX_CONCURRENCY_LEVEL: 8 -> 3
How We Measured
We used Snowflake ACCOUNT_USAGE views for both performance and cost, comparing baseline and final optimization windows.
Query Performance (Before/After)
```sql
SELECT
  CASE
    WHEN start_time < '2026-05-05' THEN 'XSMALL Period'
    ELSE 'SMALL Period'
  END AS period,
  COUNT(*) AS total_queries,
  ROUND(AVG(total_elapsed_time)/1000, 2) AS avg_elapsed_sec,
  ROUND(AVG(queued_overload_time)/1000, 2) AS avg_queued_sec,
  ROUND(MAX(total_elapsed_time)/1000, 2) AS max_elapsed_sec
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND start_time >= '2026-04-30'
GROUP BY period
ORDER BY period;
```
Credit Consumption (Daily)
```sql
SELECT
  CASE
    WHEN start_time::DATE < '2026-05-05' THEN 'XSMALL Period'
    ELSE 'SMALL Period'
  END AS period,
  ROUND(SUM(credits_used_compute) / COUNT(DISTINCT start_time::DATE), 2) AS daily_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE warehouse_name = 'WH_ETL_BRONZE_01'
  AND start_time::DATE >= '2026-04-30'
GROUP BY period
ORDER BY period;
```
Results
Cost
| Period | Size | Daily Credits |
|---|---|---|
| Baseline (5 days) | XSMALL | 15.12/day |
| Optimized (3 days) | SMALL | 13.50/day |
Daily credits dropped by about 11% after upsizing.
Performance
| Metric | XSMALL | SMALL | Improvement |
|---|---|---|---|
| Avg query time | 26.15s | 24.13s | 8% faster |
| Max query time | 3,475s | 2,850s | 18% faster |
| Total queue time | 2.20 min | 0.28 min | 87% less queuing |
Note: The intermediate XSMALL plus concurrency 2 phase was used to validate behavior and direction. The published KPI table above compares the stable baseline period against the final tuned period.
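The percentages quoted above follow directly from the raw numbers in the tables; a quick arithmetic check:

```python
# Recompute the improvement percentages from the raw before/after values
# reported in the cost and performance tables.
def pct_change(before: float, after: float) -> float:
    """Relative improvement of `after` vs. `before`, as a percentage."""
    return (before - after) / before * 100

daily_credits = pct_change(15.12, 13.50)  # daily credits, XSMALL vs SMALL
avg_query = pct_change(26.15, 24.13)      # average query time, seconds
max_query = pct_change(3475, 2850)        # max query time, seconds
queue_time = pct_change(2.20, 0.28)       # total queue time, minutes

# -> roughly 10.7%, 7.7%, 18.0%, 87.3%, matching the rounded figures above
print(f"{daily_credits:.1f}% {avg_query:.1f}% {max_query:.1f}% {queue_time:.1f}%")
```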
The Counterintuitive Lesson
In some workloads, a bigger warehouse can cost less.
1. Faster execution means earlier suspend
Queries completed faster on SMALL, so the warehouse reached AUTO_SUSPEND = 60 sooner. Less runtime plus less idle overhead translated to fewer daily credits.
2. Concurrency matched to workload shape reduces contention
Concurrency must match workload shape. In our case, moving from the default of 8 down to 2 first improved isolation for heavy MERGE jobs; landing at 3 with SMALL then gave the right balance of throughput and stability.
3. Less queueing improves throughput and utilization
With 87% less queue time, work finished in tighter windows. The warehouse did useful work and went to sleep sooner.
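To make point 1 concrete, here is a toy cost model. The credit rates are real Snowflake pricing (XSMALL bills 1 credit per cluster-hour, SMALL bills 2); every other number is purely illustrative, not from our environment:

```python
# Toy model: billed credits = rate * clusters * (busy hours + idle-before-suspend).
# Snowflake's per-cluster rate doubles with each warehouse size step.

def daily_credits(rate_per_hour: float, clusters: int,
                  busy_hours: float, idle_hours: float) -> float:
    """Credits billed per day for `clusters` clusters at `rate_per_hour`."""
    return rate_per_hour * clusters * (busy_hours + idle_hours)

# XSMALL (1 credit/hr): contention forces 4 clusters, each busy 3h
# plus 0.5h of idle time before auto-suspend kicks in. Illustrative numbers.
before = daily_credits(rate_per_hour=1, clusters=4, busy_hours=3.0, idle_hours=0.5)

# SMALL (2 credits/hr): the same work fits on 2 clusters, each busy only 2.5h
# because queries run faster, with the same idle tail. Illustrative numbers.
after = daily_credits(rate_per_hour=2, clusters=2, busy_hours=2.5, idle_hours=0.5)

print(before, after)  # 14.0 vs 12.0: the bigger size bills fewer credits
```

The point is not the specific numbers but the shape: when faster execution shrinks both busy time and cluster sprawl, the higher per-hour rate can still come out cheaper.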
Key Takeaways
- Do not assume smaller is always cheaper.
- Monitor queued_overload_time closely. Persistent queueing often means the warehouse is undersized or its concurrency setting does not match the workload.
- Tune MAX_CONCURRENCY_LEVEL together with warehouse size and workload type; they should be optimized as a pair.
- Separate or schedule heavyweight workflows when possible to reduce overlap and contention.
- Use both QUERY_HISTORY (performance) and WAREHOUSE_METERING_HISTORY (cost) for before/after decisions.
- Keep aggressive auto-suspend for bursty workloads. Faster queries plus short suspend windows compound savings.
Closing Thought
Warehouse right-sizing is a performance and FinOps exercise, not just a cost-control exercise. In many real workloads, a slightly larger warehouse with slightly higher concurrency can win on both speed and spend.
Disclosure and Scope
- Findings were validated with input from Snowflake Support and Professional Services.
- Configuration decisions, measurements, and outcomes reflect our specific environment and workload profile.
- Results may vary for other teams depending on query patterns, data shape, and scheduling behavior.
- Warehouse and workload identifiers in this post are anonymized.
This analysis used Snowflake built-in ACCOUNT_USAGE views only. No third-party monitoring stack required.