
Picture this: you're building a Power BI report that refreshes daily with sales data from multiple sources. Your M queries work perfectly in development, but once deployed, they take 45 minutes to refresh instead of the expected 10 minutes. Users complain, stakeholders ask questions, and you're left wondering where everything went wrong. The culprit? Performance anti-patterns that silently killed your query efficiency.
M Language performance isn't just about making things faster—it's about understanding how Power Query processes data and writing code that works with the query engine instead of against it. The difference between well-optimized M code and poorly written queries can mean the difference between reports that refresh in minutes versus hours.
What you'll learn: how query folding and lazy evaluation shape M performance, how to order filters and joins so folding is preserved, and how to diagnose the operations that break it.
Prerequisites: You should be comfortable writing M code directly, understand the basics of query folding, and have experience with Power Query's data transformation operations. Familiarity with SQL concepts will help when we discuss query folding optimization.
Before diving into specific patterns, you need to understand how M Language actually executes your code. Unlike traditional programming languages that execute line by line, M uses a lazy evaluation model combined with query folding that fundamentally changes how you should think about performance.
Query folding is Power Query's ability to push operations back to the data source rather than performing them in memory. When folding works, your source database does the heavy lifting. When it breaks, Power Query pulls all the data into memory and processes it locally—often a performance disaster.
Consider this seemingly innocent transformation:
let
Source = Sql.Database("server", "database"),
SalesTable = Source{[Schema="dbo",Item="Sales"]}[Data],
FilteredSales = Table.SelectRows(SalesTable, each [OrderDate] >= #date(2023,1,1)),
AddedCustomColumn = Table.AddColumn(FilteredSales, "DaysFromOrder",
each Duration.Days(DateTime.LocalNow() - [OrderDate])),
FinalFilter = Table.SelectRows(AddedCustomColumn, each [Amount] > 1000)
in
FinalFilter
This looks reasonable, but the DateTime.LocalNow() function breaks query folding. Everything after that custom column addition—including the final filter—now happens in memory instead of at the database level. If your sales table has millions of rows, you've just forced Power Query to download and process all of them locally.
Here's the optimized version:
let
Source = Sql.Database("server", "database"),
SalesTable = Source{[Schema="dbo",Item="Sales"]}[Data],
FilteredSales = Table.SelectRows(SalesTable, each [OrderDate] >= #date(2023,1,1) and [Amount] > 1000),
AddedCustomColumn = Table.AddColumn(FilteredSales, "DaysFromOrder",
each Duration.Days(DateTime.LocalNow() - [OrderDate]))
in
AddedCustomColumn
By moving the Amount filter before the folding-breaking operation, we dramatically reduce the data volume processed in memory.
M's lazy evaluation means expressions aren't computed until their results are actually needed. This creates optimization opportunities but also potential traps:
let
Source = Sql.Database("server", "database"),
LargeTable = Source{[Schema="dbo",Item="TransactionHistory"]}[Data],
// This step doesn't execute yet - lazy evaluation
ExpensiveTransformation = Table.AddColumn(LargeTable, "ComplexCalculation",
each SomeExpensiveFunction([TransactionId])),
// Only if we actually use the result does the calculation happen
FirstFiveRows = Table.FirstN(ExpensiveTransformation, 5)
in
FirstFiveRows
The expensive calculation only runs on the first five rows because that's all the final step needs. Understanding this evaluation model helps you structure queries for maximum efficiency.
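The flip side of lazy evaluation is that M does not cache intermediate steps across references: if two later steps both consume the same expensive step, the source may be queried, and the calculation run, twice. A minimal sketch of the workaround, reusing the SomeExpensiveFunction placeholder from above — Table.Buffer pins one evaluation in memory, at the cost of breaking folding for everything downstream:

```m
let
    Source = Sql.Database("server", "database"),
    LargeTable = Source{[Schema="dbo",Item="TransactionHistory"]}[Data],
    ExpensiveTransformation = Table.AddColumn(LargeTable, "ComplexCalculation",
        each SomeExpensiveFunction([TransactionId])),
    // Buffer once so the two consumers below don't each re-evaluate the step
    Buffered = Table.Buffer(ExpensiveTransformation),
    TopRows = Table.FirstN(Buffered, 5),
    RowTotal = Table.RowCount(Buffered)
in
    [Top = TopRows, Count = RowTotal]
```

Only buffer when a step is genuinely referenced more than once; buffering a table that is read a single time just adds memory pressure.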
The order and placement of your filters can dramatically impact performance. The golden rule: filter early, filter often, and filter in ways that preserve query folding.
Always apply the most selective filters first and as early as possible in your transformation chain:
let
Source = Sql.Database("server", "database"),
OrdersTable = Source{[Schema="dbo",Item="Orders"]}[Data],
// Precompute the cutoff once so the row filter stays a simple, foldable comparison
CutoffDate = Date.AddDays(DateTime.Date(DateTime.LocalNow()), -30),
// GOOD: Apply selective filters early
RecentLargeOrders = Table.SelectRows(OrdersTable, each
[OrderDate] >= CutoffDate and // Recent orders only
[OrderAmount] >= 5000 and // Large orders only
[Status] = "Completed" // Completed only
),
// Then apply less selective transformations
WithCategory = Table.ExpandRecordColumn(RecentLargeOrders, "CustomerDetails", {"Category"}),
Final = Table.SelectColumns(WithCategory, {"OrderId", "OrderAmount", "Category"})
in
Final
Compare this anti-pattern where filters are applied late:
let
Source = Sql.Database("server", "database"),
OrdersTable = Source{[Schema="dbo",Item="Orders"]}[Data],
// BAD: Expensive operations before filtering
WithCalculations = Table.AddColumn(OrdersTable, "Profit", each [Revenue] - [Costs]),
WithCategories = Table.ExpandRecordColumn(WithCalculations, "CustomerDetails", {"Category"}),
// Filters applied after pulling and processing all data
Filtered = Table.SelectRows(WithCategories, each
[OrderDate] >= Date.AddDays(DateTime.LocalNow(), -30) and
[OrderAmount] >= 5000
)
in
Filtered
Date filtering requires special attention because it's common and often performance-critical:
// Efficient date range filtering that maintains folding
let
StartDate = #date(2023,1,1),
EndDate = #date(2023,12,31),
Source = Sql.Database("server", "database"),
TransactionsTable = Source{[Schema="dbo",Item="Transactions"]}[Data],
// Use literal dates or simple date functions
DateFiltered = Table.SelectRows(TransactionsTable, each
[TransactionDate] >= StartDate and [TransactionDate] <= EndDate
),
// Alternative: use Date.Year for year-based filtering
YearFiltered = Table.SelectRows(TransactionsTable, each
Date.Year([TransactionDate]) = 2023
)
in
DateFiltered
Avoid these date filtering anti-patterns that break folding:
// AVOID: Dynamic date calculations in filters
Table.SelectRows(TransactionsTable, each
Duration.Days([TransactionDate] - DateTime.LocalNow()) <= 90
)
// AVOID: Complex date manipulations
Table.SelectRows(TransactionsTable, each
Date.WeekOfYear([TransactionDate]) = Date.WeekOfYear(DateTime.LocalNow())
)
Joins are often the most expensive operations in data transformations. The key is choosing the right join type and optimizing the join conditions for query folding.
Different join types have different performance characteristics:
let
// Large fact table
Orders = Sql.Database("server", "database"){[Schema="sales",Item="Orders"]}[Data],
// Smaller dimension table
Customers = Sql.Database("server", "database"){[Schema="sales",Item="Customers"]}[Data],
// GOOD: Left join with the large table as the left side
// (Table.Join errors on duplicate column names, so rename the dimension key first)
OrdersWithCustomers = Table.Join(
Orders, {"CustomerId"},
Table.RenameColumns(Customers, {{"CustomerId", "CustKey"}}), {"CustKey"},
JoinKind.LeftOuter
),
// Alternative: Use Table.NestedJoin for better control
NestedJoin = Table.NestedJoin(
Orders, {"CustomerId"},
Customers, {"CustomerId"},
"CustomerData",
JoinKind.LeftOuter
),
// Expand only needed columns
ExpandedJoin = Table.ExpandTableColumn(
NestedJoin,
"CustomerData",
{"CustomerName", "Region"},
{"CustomerName", "Region"}
)
in
ExpandedJoin
When joining multiple tables, the order matters significantly:
let
Orders = Sql.Database("server", "database"){[Schema="sales",Item="Orders"]}[Data],
Customers = Sql.Database("server", "database"){[Schema="sales",Item="Customers"]}[Data],
Products = Sql.Database("server", "database"){[Schema="sales",Item="Products"]}[Data],
Regions = Sql.Database("server", "database"){[Schema="sales",Item="Regions"]}[Data],
// GOOD: Join in order of table size (largest first) and selectivity
// (Table.Join shown for brevity - in practice rename duplicate key columns first,
// or use Table.NestedJoin as in the previous example, to avoid name collisions)
Step1 = Table.Join(Orders, {"CustomerId"}, Customers, {"CustomerId"}),
Step2 = Table.Join(Step1, {"ProductId"}, Products, {"ProductId"}),
Step3 = Table.Join(Step2, {"RegionId"}, Regions, {"RegionId"}),
// Select only needed columns after all joins
Final = Table.SelectColumns(Step3, {
"OrderId", "OrderDate", "OrderAmount",
"CustomerName", "ProductName", "RegionName"
})
in
Final
Joining data from different sources breaks query folding and forces in-memory processing:
let
// SQL Server data
SqlOrders = Sql.Database("server1", "database"){[Schema="sales",Item="Orders"]}[Data],
// Excel data
ExcelCustomers = Excel.Workbook(File.Contents("C:\Data\Customers.xlsx")){[Item="Sheet1",Kind="Sheet"]}[Data],
// This join happens in memory - unavoidable but optimize what you can
CrossSourceJoin = Table.Join(
// Filter SQL data first to minimize memory usage
Table.SelectRows(SqlOrders, each [OrderDate] >= #date(2023,1,1)),
{"CustomerId"},
// Ensure Excel data is clean and minimal
Table.SelectColumns(ExcelCustomers, {"CustomerId", "CustomerName", "Tier"}),
{"CustomerId"}
)
in
CrossSourceJoin
When query folding isn't possible, you need strategies to minimize memory usage and processing time.
Select only the columns you need as early as possible:
let
Source = Sql.Database("server", "database"),
WideTable = Source{[Schema="dbo",Item="CustomerTransactions"]}[Data], // 50+ columns
// GOOD: Select needed columns immediately
NarrowTable = Table.SelectColumns(WideTable, {
"TransactionId", "CustomerId", "TransactionDate",
"Amount", "Category", "Status"
}),
// Then apply transformations
Filtered = Table.SelectRows(NarrowTable, each [Status] = "Completed"),
WithCalculations = Table.AddColumn(Filtered, "Quarter",
each "Q" & Number.ToText(Date.QuarterOfYear([TransactionDate])))
in
WithCalculations
When dealing with very large datasets that must be processed in memory, consider batch processing:
let
ProcessBatch = (BatchData as table) as table =>
let
WithCalculations = Table.AddColumn(BatchData, "ProcessedValue",
each SomeExpensiveFunction([InputValue])),
Filtered = Table.SelectRows(WithCalculations, each [ProcessedValue] <> null)
in
Filtered,
Source = Sql.Database("server", "database"){[Schema="dbo",Item="LargeTable"]}[Data],
// Process in chunks to manage memory
// (sort the source on a stable key first - without a defined order,
// Skip/FirstN pages are not guaranteed to be consistent between batches)
BatchSize = 10000,
TotalRows = Table.RowCount(Source),
BatchCount = Number.RoundUp(TotalRows / BatchSize),
ProcessedBatches = List.Generate(
() => 0,
each _ < BatchCount,
each _ + 1,
each
let
Skip = _ * BatchSize,
BatchData = Table.Skip(Source, Skip),
LimitedBatch = Table.FirstN(BatchData, BatchSize),
ProcessedBatch = ProcessBatch(LimitedBatch)
in
ProcessedBatch
),
CombinedResult = Table.Combine(ProcessedBatches)
in
CombinedResult
Warning: Batch processing adds complexity and should only be used when memory constraints make it necessary. Always test if simpler approaches work first.
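A simple way to "test first" is to develop against a capped sample, which on most SQL sources folds down to a TOP clause. This sketch reuses the LargeTable source and the SomeExpensiveFunction placeholder from the batching example:

```m
let
    Source = Sql.Database("server", "database"){[Schema="dbo",Item="LargeTable"]}[Data],
    // Develop and time transformations against a sample; remove this step for production
    Sample = Table.FirstN(Source, 10000),
    Processed = Table.AddColumn(Sample, "ProcessedValue",
        each SomeExpensiveFunction([InputValue]))
in
    Processed
```

If the sampled version is fast and the full version is not, the problem is data volume rather than the transformation logic, and batching or buffering becomes worth the added complexity.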
String operations can be surprisingly expensive. Optimize them carefully:
let
Source = Sql.Database("server", "database"){[Schema="dbo",Item="ProductData"]}[Data],
// GOOD: Use native functions when possible
CleanedData = Table.TransformColumns(Source, {
{"ProductName", Text.Trim}, // Native function
{"Category", Text.Upper}, // Native function
{"Description", each Text.Start(_, 100)} // Truncate efficiently
}),
// AVOID: Complex string manipulations in custom functions
// This would be slower:
// {"ProductCode", each
// if Text.Contains(_, "-") then
// Text.BeforeDelimiter(_, "-") & "_" & Text.AfterDelimiter(_, "-")
// else _
// }
// BETTER: Use Table.SplitColumn for complex string operations
SplitProductCode = Table.SplitColumn(
CleanedData,
"ProductCode",
Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv),
{"CodePrefix", "CodeSuffix"}
),
MergedCode = Table.AddColumn(SplitProductCode, "NewProductCode",
each [CodePrefix] & "_" & [CodeSuffix])
in
MergedCode
Understanding what breaks query folding is crucial for maintaining performance. Here are the most common culprits and how to work around them.
Custom functions almost always break query folding:
// ANTI-PATTERN: Custom function breaks folding
let
CalculateDiscount = (amount as number, customerTier as text) as number =>
if customerTier = "Premium" then amount * 0.1
else if customerTier = "Gold" then amount * 0.05
else 0,
Source = Sql.Database("server", "database"){[Schema="sales",Item="Orders"]}[Data],
// This breaks folding - entire table pulled into memory
WithDiscount = Table.AddColumn(Source, "Discount",
each CalculateDiscount([OrderAmount], [CustomerTier]))
in
WithDiscount
The solution depends on your specific case:
// SOLUTION 1: Use conditional logic directly (may still break folding)
let
Source = Sql.Database("server", "database"){[Schema="sales",Item="Orders"]}[Data],
WithDiscount = Table.AddColumn(Source, "Discount", each
if [CustomerTier] = "Premium" then [OrderAmount] * 0.1
else if [CustomerTier] = "Gold" then [OrderAmount] * 0.05
else 0
)
in
WithDiscount
// SOLUTION 2: Use a lookup table (easier to maintain; verify whether the join still folds against your source)
let
Source = Sql.Database("server", "database"){[Schema="sales",Item="Orders"]}[Data],
DiscountRates = #table(
type table [Tier = text, DiscountRate = number],
{
{"Premium", 0.1},
{"Gold", 0.05},
{"Standard", 0}
}
),
// Join on differently named keys so the output has no duplicate column names
WithRates = Table.Join(Source, {"CustomerTier"}, DiscountRates, {"Tier"}),
WithDiscount = Table.AddColumn(WithRates, "Discount",
each [OrderAmount] * [DiscountRate])
in
WithDiscount
Many date/time functions break folding:
// ANTI-PATTERN: These break folding
Table.AddColumn(Source, "DaysAgo", each Duration.Days(DateTime.LocalNow() - [OrderDate]))
Table.AddColumn(Source, "IsWeekend", each Date.DayOfWeek([OrderDate]) >= Day.Saturday)
Table.SelectRows(Source, each Duration.Days([OrderDate] - DateTime.LocalNow()) <= 30)
// BETTER: Use simple date comparisons
let
Today = DateTime.Date(DateTime.LocalNow()),
ThirtyDaysAgo = Date.AddDays(Today, -30),
Source = Sql.Database("server", "database"){[Schema="sales",Item="Orders"]}[Data],
RecentOrders = Table.SelectRows(Source, each [OrderDate] >= ThirtyDaysAgo)
in
RecentOrders
Complex text operations are folding killers:
// ANTI-PATTERN: Complex text operations
Table.SelectRows(Source, each Text.Contains(Text.Upper([ProductName]), "WIDGET"))
Table.AddColumn(Source, "CleanPhone", each
Text.Replace(Text.Replace([PhoneNumber], "(", ""), ")", ""))
// BETTER: Use simple, folding-friendly operations
Table.SelectRows(Source, each Text.StartsWith([ProductName], "Widget"))
Table.SelectRows(Source, each [ProductCategory] = "Electronics") // Use categorical columns
When your queries run slowly, you need systematic approaches to identify bottlenecks.
Power Query's built-in diagnostics are your first tool:
// Enable diagnostics at the beginning of your query
let
Source = Diagnostics.Trace(TraceLevel.Information,
"Database Connection", Sql.Database("server", "database")),
OrdersTable = Diagnostics.Trace(TraceLevel.Information,
"Orders Table Load", Source{[Schema="dbo",Item="Orders"]}[Data]),
FilteredOrders = Diagnostics.Trace(TraceLevel.Information,
"Date Filter Applied",
Table.SelectRows(OrdersTable, each [OrderDate] >= #date(2023,1,1))),
// ... rest of transformations with tracing
in
FilteredOrders
Test whether your operations fold by checking the native query. In the Power Query editor, right-click a step under Applied Steps: if View Native Query is enabled, everything up to and including that step was pushed to the source; once the option greys out, folding broke at that step.
let
Source = Sql.Database("server", "database"),
OrdersTable = Source{[Schema="dbo",Item="Orders"]}[Data],
// View Native Query should still be available here - a simple filter folds
TestFilter = Table.SelectRows(OrdersTable, each [Amount] > 1000),
// Adding a custom text column typically breaks folding - the option greys out
FoldingTest = Table.AddColumn(TestFilter, "TestColumn", each "Folded: " & Text.From([OrderId]))
in
FoldingTest
Walk the steps in order: the last step where View Native Query is still available marks the folding boundary, and everything after it runs in memory.
Create systematic tests to measure performance:
let
// Benchmark helper function
// Note: lazy evaluation means the timing reflects when results are materialized,
// so these numbers are indicative rather than exact per-step costs
BenchmarkStep = (data as table, stepName as text, operation as function) as table =>
let
StartTime = DateTime.LocalNow(),
Result = operation(data),
EndTime = DateTime.LocalNow(),
ElapsedSeconds = Duration.TotalSeconds(EndTime - StartTime),
Traced = Diagnostics.Trace(TraceLevel.Information,
stepName & " completed in " & Number.ToText(ElapsedSeconds) & "s", Result)
in
Traced,
Source = Sql.Database("server", "database"){[Schema="dbo",Item="Orders"]}[Data],
Step1 = BenchmarkStep(Source, "Date Filter",
each Table.SelectRows(_, each [OrderDate] >= #date(2023,1,1))),
Step2 = BenchmarkStep(Step1, "Amount Filter",
each Table.SelectRows(_, each [Amount] > 500)),
Step3 = BenchmarkStep(Step2, "Customer Join", each
Table.Join(_, {"CustomerId"}, CustomerTable, {"CustomerId"})) // assumes CustomerTable is defined in an earlier step
in
Step3
Let's build a real-world performance optimization scenario. You're tasked with creating a monthly sales report that combines data from multiple sources and includes complex calculations.
let
// Data sources
SqlSales = Sql.Database("server", "database"){[Schema="sales",Item="OrderDetails"]}[Data],
ExcelProducts = Excel.Workbook(File.Contents("C:\Data\ProductMaster.xlsx")){[Item="Products",Kind="Sheet"]}[Data],
// Initial approach - multiple performance issues
AllSales = SqlSales,
// Custom function breaks folding
CalculateMargin = (revenue as number, cost as number) as number =>
if cost = 0 then 0 else (revenue - cost) / revenue,
// Adding expensive calculation early
WithMargin = Table.AddColumn(AllSales, "Margin",
each CalculateMargin([Revenue], [Cost])),
// Late filtering after expensive operations
CurrentMonth = Table.SelectRows(WithMargin, each
Date.Month([OrderDate]) = Date.Month(DateTime.LocalNow()) and
Date.Year([OrderDate]) = Date.Year(DateTime.LocalNow())),
// Cross-source join on large dataset
WithProducts = Table.Join(CurrentMonth, {"ProductId"}, ExcelProducts, {"ProductId"}),
// Final aggregation
Grouped = Table.Group(WithProducts, {"ProductCategory"}, {
{"TotalRevenue", each List.Sum([Revenue]), type number},
{"AvgMargin", each List.Average([Margin]), type number},
{"OrderCount", each Table.RowCount(_), type number}
})
in
Grouped
Here's the optimized rewrite:
let
// Step 1: Define date parameters upfront
CurrentYear = Date.Year(DateTime.LocalNow()),
CurrentMonth = Date.Month(DateTime.LocalNow()),
FirstDayOfMonth = #date(CurrentYear, CurrentMonth, 1),
// Step 2: Load and immediately filter SQL data
SqlSales = Sql.Database("server", "database"){[Schema="sales",Item="OrderDetails"]}[Data],
FilteredSales = Table.SelectRows(SqlSales, each
[OrderDate] >= FirstDayOfMonth and
[Revenue] > 0 // Additional useful filter
),
// Step 3: Select only needed columns early
SlimSales = Table.SelectColumns(FilteredSales, {
"OrderId", "OrderDate", "ProductId", "Revenue", "Cost"
}),
// Step 4: Load and prepare product data
ExcelProducts = Excel.Workbook(File.Contents("C:\Data\ProductMaster.xlsx")){[Item="Products",Kind="Sheet"]}[Data],
CleanProducts = Table.SelectColumns(ExcelProducts, {"ProductId", "ProductCategory"}),
// Step 5: Join smaller datasets
WithProducts = Table.Join(SlimSales, {"ProductId"}, CleanProducts, {"ProductId"}),
// Step 6: Aggregate first - per-row margins aren't needed for this report
Grouped = Table.Group(WithProducts, {"ProductCategory"}, {
{"TotalRevenue", each List.Sum([Revenue]), type number},
{"TotalCost", each List.Sum([Cost]), type number},
{"OrderCount", each Table.RowCount(_), type number}
}),
// Step 7: Calculate the margin at the aggregated level (more efficient than per row)
FinalResults = Table.AddColumn(Grouped, "AvgMargin", each
if [TotalCost] = 0 then 0 else ([TotalRevenue] - [TotalCost]) / [TotalRevenue]
)
in
FinalResults
Problem: Creating complex custom functions for simple operations that could use native M functions.
Example:
// WRONG: Custom function for simple text operation
let
ExtractFirstWord = (text as text) as text =>
let
Words = Text.Split(text, " "),
FirstWord = Words{0}
in
FirstWord
in
Table.AddColumn(Source, "FirstWord", each ExtractFirstWord([ProductName]))
Solution:
// RIGHT: Use native function
Table.AddColumn(Source, "FirstWord", each
Text.BeforeDelimiter([ProductName], " ", {0, RelativePosition.FromStart}))
Problem: Applying filters after transformations that break query folding or add computational overhead.
Symptoms: Long refresh times, high memory usage, timeout errors.
Debugging steps: first find where folding breaks (right-click each step and check whether View Native Query is still available), then move row filters and column selections ahead of that step, and re-measure the refresh after each change.
Problem: Repeatedly accessing the same calculated values or performing unnecessary column transformations.
Example:
// INEFFICIENT: Multiple calculations on same values
let
WithYear = Table.AddColumn(Source, "Year", each Date.Year([OrderDate])),
WithQuarter = Table.AddColumn(WithYear, "Quarter", each Date.QuarterOfYear([OrderDate])),
WithMonth = Table.AddColumn(WithQuarter, "Month", each Date.Month([OrderDate]))
in
WithMonth
Better Approach:
// MORE EFFICIENT: Single transformation with record
let
WithDateParts = Table.AddColumn(Source, "DateParts", each [
Year = Date.Year([OrderDate]),
Quarter = Date.QuarterOfYear([OrderDate]),
Month = Date.Month([OrderDate])
]),
ExpandedDateParts = Table.ExpandRecordColumn(WithDateParts, "DateParts",
{"Year", "Quarter", "Month"})
in
ExpandedDateParts
When queries fail with memory errors or simply never finish, isolate the step where folding breaks: disable or bypass the later steps one at a time until the refresh is fast again, then restructure from that point. Use this systematic approach:
let
Source = Sql.Database("server", "database"){[Schema="dbo",Item="Orders"]}[Data],
Step1 = Table.SelectRows(Source, each [Amount] > 1000), // Test folding here
Step2 = Table.AddColumn(Step1, "Custom", each "test"), // This likely breaks folding
Step3 = Table.SelectRows(Step2, each [Status] = "Complete") // This won't fold
in
Step3
Mastering M Language performance requires understanding the execution model, recognizing folding patterns, and systematically optimizing your queries. The key principles we've covered: preserve query folding for as long as possible; filter rows and select columns early; push folding-breaking operations (custom functions, dynamic date logic, complex text manipulation) to the end of the transformation chain; and verify behavior with diagnostics rather than guessing.
Your next steps should focus on applying these patterns to your specific scenarios. Start by auditing your slowest-performing queries using the diagnostic techniques we covered. Look for the common anti-patterns—custom functions, late filtering, and complex transformations—and refactor them using the performance patterns.
For advanced scenarios, consider exploring Power Query's query plan visualization tools and learning more about the specific folding capabilities of your data sources. SQL Server, for example, supports different folding operations than Oracle or web APIs.
The investment in performance optimization pays dividends not just in faster refresh times, but in more reliable data processes and better user experiences. As your datasets grow and your transformation logic becomes more complex, these patterns become essential for maintaining scalable data solutions.