I’m experiencing a performance issue in Power BI where a small Excel file (approximately 1500 rows and 800 KB) is causing a query to load up to 800 MB during report refresh. My Power BI report has four connections to Excel files, and the problematic query appends three other queries, each of which is merged with the 1500-row file.
What could be causing this significant memory expansion from 800 KB to 800 MB? Is this issue related to the number of table relationships, poor query architecture, or could it be a bug in Power BI?
A small Excel file with 1500 rows and 800 KB can indeed cause Power BI to consume up to 800 MB of memory during refresh, and this memory expansion is primarily due to how Power BI handles merge and append operations. The issue stems from Power BI’s memory management practices during query processing, where multiple evaluation containers may be used to handle transformations, and each merge operation requires loading the entire dataset into memory for processing.
Contents
- Understanding Memory Overhead in Power BI
- Merge Query Memory Issues
- Append Query Memory Consumption
- Query Architecture Problems
- Potential Solutions
- Monitoring and Diagnostics
Understanding Memory Overhead in Power BI
Power BI’s memory usage often significantly exceeds the actual dataset size due to its internal processing requirements. When performing a full refresh, Power BI typically needs about twice the memory the semantic model requires, because it keeps both the old and new versions of the compressed data during the refresh[^1]. Note that 800 KB is the size of the Excel file on disk, not of the model in memory; even so, doubling a model of that order would account for only a few megabytes, so this factor alone does not explain 800 MB.
Power BI may also reserve extra memory as headroom for future growth. As one source explains, Power BI may allocate up to 12 GB for a 10 GB dataset[^1]. That headroom is proportional to model size (about 20% in that example), so in your case it is a minor contributor compared with the merge operations discussed in the next section.
Additionally, when Power BI processes queries, it may use multiple evaluation containers to handle Power Query transformations. Each of these containers can use up to the amount of memory specified by the MaxEvaluationWorkingSetInMB setting, which can further compound memory usage during complex operations[^9].
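The container overhead described above is simple multiplication. The Python sketch below only models that arithmetic (Power Query itself is written in M); the container count and the per-container limit are illustrative assumptions, not measured values or documented defaults.

```python
def container_memory_ceiling_mb(container_count: int, max_working_set_mb: int) -> int:
    """Upper bound on the memory all evaluation containers could use at once."""
    if container_count < 0 or max_working_set_mb < 0:
        raise ValueError("inputs must be non-negative")
    return container_count * max_working_set_mb


# Hypothetical refresh: four queries evaluated in parallel, each container
# capped at 200 MB by MaxEvaluationWorkingSetInMB (both numbers are assumed).
ceiling = container_memory_ceiling_mb(container_count=4, max_working_set_mb=200)
print(f"Worst-case container memory: {ceiling} MB")
```

With these assumed numbers the ceiling is 800 MB, even though the source file is under 1 MB. The ceiling depends on how many containers run and how large each may grow, not on the size of the file.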
Merge Query Memory Issues
The memory expansion you’re experiencing is most likely related to the merge operations in your query architecture. Merge operations in Power Query require loading the entire dataset into memory for processing, which explains why your memory usage spikes dramatically[^3][^8].
According to research findings, merges have to take place in memory, so the larger the tables involved in the merge, the more memory is needed[^3]. In your case, even though your Excel file contains only 1500 rows, when it’s merged with three other queries, each merge operation requires loading all the data into memory[^8]. This creates a cascading effect where each merge operation compounds the memory requirements.
One community member reported that every time they perform a merge query, “it has to load up all the data into memory again for each merge”[^8]. This repeated loading could explain why your memory usage multiplies with the number of merges instead of staying close to the size of the source file.
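The repeated loading can be modeled with a small counter. This is a Python sketch of the behavior the community member describes, not of Power Query's actual engine; the `ExcelSource` class, the column names, and the three sample queries are all hypothetical.

```python
from __future__ import annotations


class ExcelSource:
    """Stand-in for the 1500-row workbook; counts how often it is read in full."""

    def __init__(self, row_count: int) -> None:
        self.row_count = row_count
        self.full_loads = 0

    def load(self) -> list[dict]:
        self.full_loads += 1
        return [{"key": i, "label": f"item-{i}"} for i in range(self.row_count)]


def merge(left: list[dict], source: ExcelSource) -> list[dict]:
    """Left outer join on 'key'; the lookup side is reloaded on every call."""
    lookup = {row["key"]: row for row in source.load()}
    return [{**row, **lookup.get(row["key"], {})} for row in left]


source = ExcelSource(row_count=1500)

# Three hypothetical queries, each merged with the small file, then appended.
queries = [[{"key": k, "value": q} for k in range(0, 1500, 3)] for q in range(3)]
appended = [row for query in queries for row in merge(query, source)]

print("full loads of the small file:", source.full_loads)
print("rows in the appended result:", len(appended))
```

In this model the small file is read in full once per merge, so three merges mean three complete copies pass through memory before the append even starts.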
Sorting the data first is unlikely to help either. As one source notes, “If your data is not sorted, then you can sort it in Power Query before the merge – but since sorting itself takes time and sorting for non-foldable data sources is another one of those operations which requires the table to be held in memory, you’re unlikely to get any performance improvement”[^2]. Excel files are a non-foldable source, so any sort before the merge would itself hold the table in memory.
Append Query Memory Consumption
While merge operations get most of the attention, append queries also contribute to memory overhead in your scenario. Append operations vertically stack data from queries with identical structures, but this stacking process also requires memory allocation[^4].
The Combine Files experience in Power BI improves the end-user experience but adds overhead to the query processing[^6]. If you’re using this feature to handle your multiple Excel connections, you might be experiencing this overhead effect. The Power User source suggests that you could potentially “make a compromise on that end-user experience to optimize your query and make it run up to 500% faster”[^6].
When you append queries, Power BI needs to maintain the structure and data of each query being combined, which can lead to memory fragmentation and increased usage, especially when combined with merge operations in the same query flow.
Query Architecture Problems
Your query architecture with four Excel connections and multiple merge/append operations likely suffers from several issues that compound memory usage:
- Multiple Evaluation Containers: Power BI may create multiple evaluation containers during refresh, each capable of using significant memory[^9]. With your complex query structure involving multiple merges and appends, this could lead to several containers operating simultaneously.
- Inefficient Join Operations: One user reported that even with a small number of matching rows (less than 30,000), Power Query ran through all 40 million rows, suggesting inefficient join operations[^9]. Your merge operations might be experiencing similar inefficiency, forcing Power BI to process more data than necessary.
- Lack of Query Optimization: The Data Bear source emphasizes the need to “reduce the load on our data model in order to perform optimally. We need to keep the memory being used as small as possible”[^1]. Your current architecture might not be optimized for minimal memory usage.
- Unnecessary Data Loading: As one performance tip suggests, when you get data from data sources, apply transformations, and merge or append queries, you might end up with tables that contain more data than actually needed[^10]. Each unnecessary column or row increases memory pressure during the merge and append operations.
Potential Solutions
To address the memory expansion issue, consider these optimization strategies:
Optimize Merge Operations
- Remove Unwanted Columns: Remove columns before merge operations to reduce memory footprint[^2]. As one source states, “remove any unwanted columns before the merge anyway”[^2].
- Reduce Parallelism: Reduce the number of objects processed in parallel during refresh[^10]. “Reducing the parallelism, and introducing partitioning to large tables can greatly reduce the memory requirements of the refresh”[^10].
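- Consider Query Folding: Query folding lets Power Query push transformations back to the source, reducing memory usage. Excel workbooks and other file-based sources do not support folding, so this helps only if the data can be moved to a source that does, such as a relational database.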
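The effect of removing columns before a merge can be sketched in a few lines. The Python below is an illustration only (in Power Query the same step would be written in M); the tables, column names, and the count of 39 unused columns are made-up assumptions.

```python
def select_columns(table, columns):
    """Keep only the named columns of each row."""
    return [{c: row[c] for c in columns} for row in table]


def left_join(left, right, key):
    """Left outer join of two lists of dicts on a shared key column."""
    index = {row[key]: row for row in right}
    return [{**row, **index.get(row[key], {})} for row in left]


def cell_count(table):
    """Total number of values held by the table (a rough proxy for memory)."""
    return sum(len(row) for row in table)


# Hypothetical lookup table: 1500 rows, a key plus 40 other columns,
# of which only 'region' is needed in the report.
wide = [
    {"key": i, "region": f"R{i % 5}", **{f"extra_{n}": n for n in range(39)}}
    for i in range(1500)
]
facts = [{"key": i, "amount": i * 10} for i in range(1500)]

merged_wide = left_join(facts, wide, "key")
merged_slim = left_join(facts, select_columns(wide, ["key", "region"]), "key")

print("cells after merging the full table:  ", cell_count(merged_wide))
print("cells after merging the pruned table:", cell_count(merged_slim))
```

Under these assumptions the pruned merge holds 4,500 values instead of 63,000, and that saving is repeated for every merge that uses the same lookup table.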
Implement Incremental Refresh
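Configure incremental refresh for your semantic model[^1]. This matters most if your model grows and progressively consumes more memory, because incremental refresh processes only new or changed data rather than the entire dataset. It needs a date/time column to filter on and works best with sources that support query folding, so the benefit for Excel files may be limited.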
Memory Allocation Settings
Adjust Power Query’s memory allocation settings. You can allocate more memory to evaluation containers in Power BI Desktop[^9], but this should be done judiciously as it can increase overall memory consumption.
Query Architecture Redesign
- Consolidate Data Sources: Instead of four separate Excel connections, consider consolidating them into a single, well-structured data source if possible.
- Pre-process Data: Perform preliminary data transformations in Excel or another tool before importing into Power BI.
- Use DirectQuery Mode: For larger datasets held in a supported source such as a relational database, consider DirectQuery instead of Import mode. DirectQuery is not available for Excel files, so this applies only if the data is moved to such a source.
Monitoring and Diagnostics
To better understand your memory usage patterns, implement these monitoring strategies:
Query Diagnostics
Use Query Diagnostics in Power BI Desktop to identify which query steps are expensive[^3]. For refreshes in the Power BI Service, one source notes: “There is now information available in Profiler and Log Analytics that tells you about peak memory and CPU usage across all Power Query queries for a single refresh in the Power BI Service, as well as memory usage for the refresh as a whole”[^3].
Memory Profiling
Monitor memory usage during the refresh process using Windows Task Manager or similar tools to identify when memory usage spikes and correlate this with specific query operations.
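Correlating a spike with a query step can be done by hand, or with a few lines of code once the readings are written down. The Python sketch below assumes you have noted memory readings (for example from Task Manager) and the approximate start time of each step; the function name, step names, and every number are hypothetical.

```python
from bisect import bisect_right


def step_at_peak(samples, step_starts):
    """Return (step_name, peak_mb) for the highest memory reading.

    samples:     list of (seconds_since_start, memory_mb)
    step_starts: list of (seconds_since_start, step_name), sorted by time
    """
    peak_time, peak_mb = max(samples, key=lambda sample: sample[1])
    start_times = [start for start, _ in step_starts]
    # Index of the last step that started at or before the peak reading.
    position = bisect_right(start_times, peak_time) - 1
    step_name = step_starts[position][1] if position >= 0 else None
    return step_name, peak_mb


# Made-up readings taken every five seconds during one refresh.
samples = [(0, 150), (5, 220), (10, 790), (15, 400)]
step_starts = [(0, "Load workbook"), (4, "Merge 1"), (9, "Merge 2"), (14, "Append")]

step_name, peak_mb = step_at_peak(samples, step_starts)
print(f"Peak of {peak_mb} MB occurred during: {step_name}")
```

With these invented readings the peak falls inside the second merge, which is the kind of result that tells you which step to optimize first.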
Community Resources
Engage with the Power BI community for specific guidance on your query architecture. The Microsoft Fabric Community and Power BI Community forums have extensive discussions on memory optimization[^1][^8].
Conclusion
The memory expansion from 800 KB to 800 MB in your Power BI report is most likely caused by the cumulative effect of multiple merge and append operations, each of which loads its full input into memory. This points to query architecture rather than a bug in Power BI itself. To resolve it, focus on removing unnecessary columns before merging, reducing parallelism, and redesigning the queries so that the small file is merged as few times as possible. Incremental refresh and monitoring tools such as Query Diagnostics can also help you understand and manage memory usage. With these optimizations, you should be able to significantly reduce memory consumption during refresh while keeping the functionality you need.
Sources
- Large semantic models in Power BI Premium - Microsoft Fabric
- Chris Webb’s BI Blog: Optimising The Performance Of Power Query Merges In Power BI, Part 3: Table.Join And SortMerge
- Microsoft Power BI Insights: Power Query merge performance; Desktop features; Small multiples
- Merge Queries in Power BI for More Insightful Visuals
- Data Preparation Part 3 - Merging and Appending Queries - Data Bear
- Combine or Append Data: Optimal Combination Pattern — The Power User
- Shape and Combine Data in Power BI Desktop Tutorial - Microsoft Learn
- Solved: Merging Tables Efficiency Question - Microsoft Power BI Community
- PowerQuery: Merge Queries HORRIBLE performance - Microsoft Fabric Community
- Performance Tip for Power BI; Enable Load Sucks Memory Up - RADACAD