Power BI Antipatterns #9: Filtering whole tables, part I

Whenever you can, you should avoid filtering whole tables in DAX–as opposed to columns–in DAX, because the performance may suffer, or you may get unexpected results, or both. In this blog post, we’re looking at the performance aspect.

Here are an instance of the antipattern that I recently encountered, which will give you the balance on the last available date:

Example bad 1 = 
CALCULATE (
    SUM ( Inventory[Balance] ),
    FILTER (
        Inventory,
        Inventory[Date]
            = LASTDATE (
                ALLSELECTED ( Inventory[Date] )
            )
    )
)

It’s bad not only because we’re filtering a whole table–Inventory–we’re also using LASTDATE over ALLSELECTED, when all we really want is the last available date. You can address the latter issue in the following way:

Example bad 2 = 
CALCULATE (
    SUM ( Inventory[Balance] ),
    FILTER (
        Inventory,
        Inventory[Date] = MAX ( Inventory[Date] )
    )
)

Make no mistake — this is still no good because we’re filtering the whole table. Here’s one way you could rewrite the formula:

Example better = 
CALCULATE (
    SUM ( Inventory[Balance] ),
    Inventory[Date] = MAX(Inventory[Date])
)

The above formula might not always be the best because using Time Intelligence might be tricky in some cases, for example, so you could use the Date dimension instead like so:

Example good 1 = 
CALCULATE (
    SUM ( Inventory[Balance] ),
    'Date'[Date] = MAX(Inventory[Date])
)

Alternatively, you may want to use LASTNONBLANKVALUE:

Example good 2 = 
LASTNONBLANKVALUE(
    'Date'[Date],
    SUM ( Inventory[Balance] )
)

Note: there are several different ways in which you can calculate semi-additive measures like balance, and the business requirements will dictate which one you should use. For more details, see Semi-additive calculations on DAX Patterns.

The good examples will perform better than the bad ones for a simple reason: instead of iterating through the whole Inventory table, which may have lots of rows, you’ll only iterate through the Date values, of which there are usually going to be fewer. You can always use Performance Analyzer in case you want to compare the formulas. Watch the video above for more details and another method!

To be continued…