| Counting Returning Customers in DAX |
| Written by Chris Webb | |||
| Monday, 03 May 2010 22:59 | |||
|
Reposted from Chris Webb's blog with the author's permission. As promised, I’m going to be putting up a few more DAX examples here – although they may not demonstrate any new and interesting concepts, I’m still learning the language and blogging is a good way for me to practise implementing common calculations. So, today’s problem is: how do we find the number of distinct customers who bought something in the current time period and who have also bought something in the past? That’s to say we want to count the number of returning customers, as opposed to completely new customers who have never bought anything from us before. This is, basically, a variation on the problem of how to get a distinct count in DAX that Marco has already dealt with comprehensively here, but with aspects of a time intelligence calculation (a topic which is well covered here). I’ll be using two tables from Adventure Works to illustrate this: FactInternetSales and DimDate. What we need to start off with is find the set of distinct customers who bought something in the current time period. The following simple DAX expression finds this set and gives me the distinct count: =COUNTROWS(DISTINCT(FactInternetSales[CustomerKey])) We now need to filter these customers so that we only return the ones who bought something in the range of dates from the very first date we have data for, up to the day before the first date in the current time range. This can be accomplished with the DatesBetween, FirstDate and DateAdd functions. With the DatesBetween function, if you pass a Blank value to the first parameter it will give you the first date you have data for as your start date (see here for another example of this); for the end date in the range, we find the first date in the current date range with FirstDate, then get the day before using DateAdd: DATESBETWEEN( We then need to use this date range inside the Filter function as follows: =CALCULATE(COUNTROWS(DISTINCT(FactInternetSales[CustomerKey])), FILTER(DISTINCT(FactInternetSales[CustomerKey]) , CALCULATE(SUM(FactInternetSales[SalesAmount]), DATESBETWEEN(DimDate[FullDateAlternateKey] ,BLANK() , DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY)) , ALL(DimDate))>0)) However, there’s one last problem to solve: on the first date in the time dimension using the DateAdd function in this way will return a Blank value, and using Blank as both a start and an end date will result in us getting all the dates in the column back from the DatesBetween function. This means that for this first date we get the same value as the distinct count because we are finding the set of all customers who bought something on that first date and applying a filter to see whether these customers bought something on any date, which of course they did: So we need to use the IF function to check if using DateAdd to get the previous date in this way returns a Blank, and if it does to return a Blank. Here’s the final expression: =IF( One last point: even though I’ve written this expression on the RC version of PowerPivot, I still need to use ALL(DimDate) on the last line I am joining FactInternetSales and DimDate using the OrderDateKey column, which is an integer surrogate key. This wouldn’t be necessary if I was joining the two tables using a datetime column. Personally I think this is going to confuse no end of people and is a bit rubbish – especially because, in many data warehouses, the only way to join a fact table and a time dimension table is using an integer surrogate key; even worse, if you try to bring a datetime column from a time dimension table down onto the fact table using a calculated column, you’ll get a circular reference error. Hopefully this will get fixed in the next release…
|
Most Popular
- Step by step guide on installing PowerPivot for SharePoint on a single machine
- List of suggested datasets to test PowerPivot
- List of PowerPivot DAX functions with description
- How to install PowerPivot for Excel and list of know issues
- Microsoft Virtual Lab: PowerPivot for Excel 2010 Introduction
- List of PowerPivot DAX functions (short)
- Learning PowerPivot and DAX
- PowerPivot's impact on BI pros?





CALCULATE(sum(GL20000[Amount]),datesbetween(GL20000[Transaction Date],date(2007,12,31),date(2011,3,31)))
It works exactly as I want, butI would like to replace date(2011,3,31) with what is selected on my Transaction Date slicer. Is this possible?