Dan English's BI Blog

Welcome to my random thoughts in regards to Business Intelligence, databases, and other technologies

Archive for the ‘PowerPivot’ Category

Calculating Cumulative Values in Microsoft BI Tools

Posted by denglishbi on January 24, 2011

In working on a business intelligence project a common situation you will find yourself in is having to come up with a cumulative value (running value) calculation.  The challenge that you will find is that depending on what product you are working with the calculation that you need is going to be different.  In this post I will compare how this can be done with T-SQL, Excel, SSIS, SSRS, MDX, and DAX.  Along the way I will provide some additional reference links to other options, content, and I will point out some differences in how the totals are being calculated so you will know what to expect with the results.

In these examples I will be working with the Contoso Retail Data Warehouse data sample.  These examples below are not necessarily going to be the optimal approaches, just showing how you can go about getting at the desired result.

SQL Server T-SQL

So let’s start out taking a quick look how calculating the sales over each of the years directly against the data in the database through SQL Server Management Studio.

SELECT d.CalendarYear,

       SUM(f.SalesAmount) SalesAmount,

       SUM(f.SalesAmount) + COALESCE((SELECT SUM(SalesAmount)

                        FROM FactSales fs JOIN DimDate dd ON fs.DateKey = dd.DateKey 

                        WHERE dd.CalendarYear < d.CalendarYear),0) CumSalesAmount

FROM FactSales f

INNER JOIN DimDate d ON f.DateKey = d.DateKey

GROUP BY d.CalendarYear

ORDER BY d.CalendarYear

And the results for this are:

CalendarYear SalesAmount           CumSalesAmount
———— ——————— ———————
2007         4561940955.0212       4561940955.0212
2008         4111233534.6841       8673174489.7053
2009         3740483119.1823       12413657608.8876

 

Not too bad.  We have an inner query that is being used to return the pervious year(s) values and append that onto the current year (if they exist).  If no value is returned then a zero is used in its place.  For additional examples and options take a look at this posting by Garth Wells – Calculating Running Totals.

Microsoft Excel

So now lets take a look at how this would be done if we were using Excel. 

image

There area a few different options you have here, in this example I enter a formula in the C2 cell of =SUM($B$2:B2) and then copy that formula down into cells C3 and C4.

SQL Server Integration Services (SSIS)

Now we will take a quick look at how this could be done in the ETL process.  As we are pulling data from a source we can evaluate the values in the data flow and accumulate the values inside a script component.  In the source we will simply get the sales for each year and then append the values in the flow to each other to get at the same results displayed above.

image

The cumulative value is being performed in the script component with some code.  We have a variable setup that we are using to append the sales amount of each row in the data flow and then sending this value back out in a new output column called CumulativeSales.

image

Nothing too fancy going on here.  I don’t have a destination in the flow, just added a derived column to get the data viewer included so that I could run the data flow to show the results.

SQL Server Reporting Services (SSRS)

Now lets switch over to Reporting Services.  In this example we will not do the cumulative value in the dataset query, we will do this value calculation in the report. 

image

In this example we leverage the RunningValue aggregate function that is available in Reporting Services.  The expression that we use in the Cumulative Sales column is =RunningValue(Fields!SalesAmount.Value,Sum,”DataSet1″)

image

This function returns a running aggregate of the values in our specified dataset using the SUM function which is also specified in the expression.

SQL Server Analysis Services (SSAS) MDX

Now how about taking a look at Analysis Services.  How would we go about doing this calculation using MDX?

WITH MEMBER [Measures].[Cumulative Sales] AS

    SUM({null:[Date].[Calendar Year].CurrentMember},[Measures].[Sales Amount])

    

SELECT {[Measures].[Sales Amount],[Measures].[Cumulative Sales]} ON 0,

    NON EMPTY [Date].[Calendar Year].[Calendar Year].Members ON 1

FROM [Sales]

And the results of this are:

image

We can apply some filters into the MDX statement to remove the years that currently don’t have Sales Amount associated with them.

WITH MEMBER [Measures].[Cumulative Sales] AS

    SUM({null:[Date].[Calendar Year].CurrentMember},[Measures].[Sales Amount])

    

SELECT {[Measures].[Sales Amount],[Measures].[Cumulative Sales]} ON 0,

    NONEMPTY([Date].[Calendar Year].[Calendar Year].Members,[Measures].[Sales Amount]) ON 1

FROM [Sales]

This returns what we have seen in our other results:

image

There are additional ways in going about this calculation and for more options take a look at this recent posting here by Amish Manubhai Shah – Various way to calculate running total from All Period to Currentmember. Now I want to show how this solution differs a bit from the other ones we have seen.  What happens if I apply a filter to the query to only display 2008 and 2009?  What would you expect?

image

Might not be what you would want, but then again maybe it is.  The Cumulative Sales calculation is summing up the values from the beginning of time (the [All] level).  So the first value that we see in 2008 is including the Sales Amount from 2007 as well (if there was sales in 2006 it would include that as well and so on). 

You can make a few modifications to the calculation and setup a named set to reference the items and come up with something along these lines.

WITH DYNAMIC SET [SelectedYears] AS Existing [Date].[Calendar Year].[Calendar Year].Members

 

 MEMBER MEASURES.[Cumulative Sales] as 

    SUM({EXISTS([Date].[Calendar YQMD].[Calendar Year].Members,

    [SelectedYears]).Item(0):[Date].[Calendar YQMD].CurrentMember},

    [Measures].[Sales Amount])

 

SELECT {Measures.[Sales Amount], Measures.[Cumulative Sales]} ON 0,

    [Date].[Calendar YQMD].[Calendar Year].Members ON 1

FROM [Sales]

WHERE {[Date].[Calendar Year].&[2008],[Date].[Calendar Year].&[2009]}

With the following results:

image

Here a dynamic set is used to get the references of the members that are in the slicer (WHERE clause) and feed this information into the calculation for the cumulative sales to grab the first item to come up with the reference to the 2008 year.  This will be the starting point for the cumulative total instead of from the beginning of time.

You might just want to look at year-to-date cumulative totals and be able to drilldown into a user defined hierarchy if you have one setup like Year-Quarter-Month-Date.  If we modify our calculation a bit we might end up with something like the following:

WITH MEMBER [Measures].[Cumulative Sales] AS

    SUM(YTD([Date].[Calendar YQMD].CurrentMember),[Measures].[Sales Amount])

 

SELECT {[Measures].[Sales Amount],[Measures].[Cumulative Sales]} ON 0,

    Hierarchize({[Date].[Calendar YQMD].[Calendar Year].Members,

    [Date].[Calendar YQMD].[Calendar Quarter].Members}) ON 1

FROM [Sales]

WHERE {[Date].[Calendar Year].&[2008],

    [Date].[Calendar Year].&[2009]}

 

With the following results:

image

Here we can see that our cumulative total stops at the year level and we can see at the lower levels that the values are accumulating as expected, so the cumulative total at Q4 is same as the Yearly value.  What we have now is a YTD calculation.

PowerPivot for Excel 2010 DAX

We have one more product that I want to take a look at and that would be PowerPivot.  This is going to be fairly similar to the Analysis Services solution since it actually is Analysis Services behind the scenes, the only difference here is that we will be using DAX instead of MDX.

So for the first example we will setup a cumulative sales calculation just like we did with the first MDX example.  The calculation we use is as follows:

=CALCULATE(SUM(FactSales[SalesAmount]),

DATESBETWEEN(DimDate[Datekey],

FIRSTDATE(ALL(DimDate[Datekey])),

LASTDATE(DimDate[Datekey])))

So we end up with the following:

image

You could go ahead and add some additional logic to evaluate the Sales Amount to determine if a value exists, but you get the general idea here.

Once again this is calculating the cumulative total from the beginning of all time and that is done by using the ALL reference in the calculation for the FIRSTDATE.  If we filter the years and only display 2008 and 2009 we will see the similar results like we did with MDX where the cumulative sales amount for 2008 includes the 2007 sales.

image

The one nice thing about PowerPivot is that this is in Excel, so if you want to do anything a little more custom or make any references to cells you can do that and use Excel functions as well.  As far as setting up a cumulative sales amount value like the second MDX example I can’t determine an equivalent in DAX.  Maybe someone else has an idea if this can be done so that it only does this based on the Date values being evaluated.  If you do, please leave a comment.

You can do the YTD calculation in DAX and here is formula for that:

=TOTALYTD(SUM(FactSales[SalesAmount]),DimDate[Datekey])

And the results look like this if we break out the years by the quarters:

image

Once again we can see that the yearly total is equivalent to the Q4 for the same year.  For some additional information and explanation of the PowerPivot calculations take a look at a great post by Kasper de Jonge here Accumulate values (running value) over multiple years in PowerPivot using DAX.

Conclusion

Cumulative total (running value) calculations are all calculated differently within each of the Microsoft products.  We took a look at some examples of how this can be done in T-SQL, Excel, SSIS, SSRS, MDX, and DAX.  These are not the only options, so if you want to see more take a look at the additional posts that I included as reference points the different sections.  It is important to understand how the formulas need to be configured and also what the results are once the calculations and logic is put into place.  I think that one of the easiest ones, besides the Excel formula, would have to be the SSRS with the RunningValue aggregate function.  I like how this is setup and it evaluates everything based on the context that is being referenced.  So if we go back to the SSRS report and apply a filter to the tablix to remove 2007 from what is displayed we would end up with the following:

image

Very simple and easy to understand, but that is just my personal opinion.

Posted in Analysis Services, Integration Services, PowerPivot, Reporting Services, SQL Server | Tagged: , , , , | 9 Comments »

Using PowerPivot with PerformancePoint Services (PPS) 2010

Posted by denglishbi on January 3, 2011

UPDATE (6/20/2011): Modified table layout of step-by-step walkthrough to be two columns instead of three.  Added a downloaded PDF file as well.

UPDATE (2/24/2012): I have a blog for using PowerPivot 2012 with PerformancePoint Servers here - Using PerformancePoint Services (PPS) with PowerPivot SQL Server 2012 RC0.

Analysis Services was one of the primary data sources with the previous version of PerformancePoint to get the most functionality out of the product.  Now with the release of the 2010 product we have some new options and one of them is PowerPivot.  PowerPivot is also a multi-dimensional data source and in this post I will talk about what this source provides for functionality and what it does not compared to Analysis Services.

First of all you will need to have an environment that has PerformancePoint Services configured.  If you are not familiar with how to do that you can take a look at this great posting on the product team’s web site – Deploying PerformancePoint 2010 Soup to Nuts.  Once you have that setup then the other thing that you are going to need to do is have PowerPivot setup with SharePoint so that we will be able to utilize the files as data sources.  There are plenty of articles and material available in regards to this configuration, here is one of them – How to: Install PowerPivot for SharePoint on a New SharePoint Server.

Once you have your environment setup and ready the next thing will need to do is create your PowerPivot data source that you are going to leverage to create your dashboard with.  In this example we are not going to be creating the file and leveraging the PowerPivot PivotCharts and PivotTables, we are simply going to be setting up a file that will be used to provide the data and measures needed for analysis.  I will be using it in place of having to implement an Analysis Services solution (except for the fact that PowerPivot is really an Analysis Services database under the hood).

Some of the reasons that we might want to use PowerPivot in a PPS dashboard would be because:

  • ability to leverage PPS analytical features like analytical grids & charts and decomposition tree
  • quickly integrate disparate data for analysis into a single source
  • leverage existing Excel Power User expertise, data, and files

For this example I will be leveraging the Contoso Retail BI Demo data – Microsoft Contoso BI Demo Dataset for Retail Industry.  I will also be using the following components to put this example together – Excel 2010 with PowerPivot add-in (this can be downloaded from here – PowerPivot Download), PowerPivot for SharePoint 2010, PerformancePoint Services, and SharePoint 2010 Enterprise.

Warning: This example does not include each step-by-step instruction and expects that you have some basic knowledge of PowerPivot, DAX, and PerformancePoint.

image image
First step is to launch PowerPivot Window from Excel 2010 to load data Now from the PowerPivot Window we will select the option to load data from a SQL Server database
image image
Select the ContosoRetailDW SQL Server database Choose the option to select from a list of database tables
image imageimage
In this example we will select DimDate, DimGeography, DimProduct, DimProductCategory, DimProductSubcategory, Dim Store, FactSales, and FactSalesQuota For the two fact tables we will remove a few columns that are not needed for analysis during the load using the Preview & Filter option (remove initial fact key column and last three ETL columns).  Also add a filter for FactSalesQuota on ScenarioKey to only include values of 2 (for Budget)
image image
Now you can click finish and watch the data load – over 5 millions rows of data! During the load process the relationships that were defined in the SQL database were retained, so we will not need to setup any additional relationships between the tables
image image
image
We might simplify the snowflaked model and pull the Product Category and Subcategory information into the Product table using DAX and the RELATED function To simplify the model a bit more we can remove columns and tables that are exposed to the PivotTable in Excel – remove the surrogate key, foreign key, ETL columns, and in the related tables remove all columns
imageimage image
Now the creation of the PivotTable data (measures) which will be needed in PerformancePoint to reference – I will comment on this more down below after the example Measures have been added for PerformancePoint, so now the file can be published to SharePoint to be referenced
image image
File –> Save & Send –> Save to SharePoint –> Save As (this will be saved to a Trusted Excel Services location in SharePoint) In Dashboard Designer a new Data Source will be created – go into the Data Connections library and add a new PerformancePoint Data Source (this will launch Dashboard Designer)
image image
Create a Multi-Dimensional Analysis Services data source and use a connection sting pointing to the Excel file in SharePoint – “DATA SOURCE = http://mybi10/Workbooks/ContosoSalesAnalysis.xlsx” and the cube name is “Sandbox” If you receive this message when setting up the data source to pick the cube you will need to verify the connection string URL and/or verify that the SQL Server Browser service is running
image
image
image
Now create a couple of KPIs for Sales and Gross Margin information (you will need to use the Add Lists option in the Workspace portion of the Ribbon to add the PerformancePoint Content list). Make sure to change the Calculation option from Default to Data Setup a Scorecard and add a dimension column (ProductCategory) to analyze
image image
Create an Analytical Chart (you need to save the data source if you haven’t done so yet to create the analytical reports). To apply formatting to the measures switch to Grid report type and change formatting to $ and %.  Also filter empty axis and series Add another Analytical Chart. Apply appropriate formatting and filter to axis and series
image image
image
Add an Analytical Grid. Adjust Settings so that the column and row headings are not truncated in the View options in the Ribbon Create a couple of Filters, one for the FiscalYear and another for RegionCountry (an MDX Query was used along with the Tree view setup for each filter)
image image
Create and setup the Dashboard Connect the filters to the Dashboard items
image image
Save all of the content and then deploy the Dashboard to SharePoint View and test the dashboard page
image
image
View the Decomposition Tree (requires Silverlight)

A couple of items that I wanted to discuss briefly about in regards to this example was the reasoning for setting up measures in the PowerPivot PivotTable and the other was for setting up the KPIs Calculation as Data Value.  External tools require that you setup measures in the PowerPivot file in our to reference.  Whether you did this in PerformancePoint or Reporting Services, you would need to have measures defined to be able to reference to utilize for analysis.  PowerPivot doesn’t have the ability to setup formatting either to pass through, so you also have to perform this in the tools referencing it currently.  You can reference items placed in the PivotTables in the Values section, but you can’t guarantee these will always exist, so I recommend setting up measures for every metric that you want to evaluate and utilize with external tools.

The other items that I wanted to mention was the KPI calculation setting.  This was changed to Data value so that we would be able to leverage the Decomposition tree option in the right-click context menu.  If this was left as Default value this would have been disabled.  You currently cannot use the Show Details option since drillthrough is not an option in the current PowerPivot release, but you can use the Decomposition tree for analysis.

For the most part PowerPivot provides very similar functionality when compared to Analysis Services.  A few items that you don’t get will be Show Details (Drillthrough), Named Set references, measure formatting (along with formatting configurations – colors), and user-defined hierarchies.

For a good reference point for this type of a setup take a look at this white paper – PerformancePoint Services and PowerPivot for Excel (white paper).

This white paper covers the basics of integration between PerformancePoint Services in Microsoft SharePoint Server 2010 and Microsoft SQL Server 2008 R2 PowerPivot for Microsoft Excel 2010. It describes the basics of utilizing a PowerPivot model in PerformancePoint Services and expands on issues and scope of using these applications together.

Well that was a really brief and fast example of using PowerPivot with PerformancePoint, but you can see how quickly you can build out a fairly advanced dashboard that provides insight and analytical functionality to the end-users.

If you liked this posting you might also be interested in my other two that are similar for Excel Services and Reporting Services:

Enjoy!

Download:

image

Posted in PerformancePoint, PowerPivot | Tagged: , , | 23 Comments »

New PowerPivot & Azure DataMarket Experience

Posted by denglishbi on November 19, 2010

I finally got around to installing the latest version of the PowerPivot application.  The new build number is 10.50.1747.0 (previous RTM build was 10.50.1600.1).  Here is my install experience while having the RTM x64 version already installed on my laptop (go from left to right and then work your way down):

image image image
image image image
image image image

Pretty neat, no need to uninstall and reinstall – thanks Microsoft!

Ok, so now that we have rebooted lets take a look at this new functionality that was added…Azure DataMarket data connection.  To the clouds we go!

image image image
New icon in the external data connection portion of the PowerPivot Ribbon – lets load up that cart with some data… The option that is available in the ‘From Other Sources’ option The new Import Wizard for getting data from the clouds
image image image
What you see when you click on the ‘View available Azure DataMarket datasets’ link in previous screenshot Need to register for the service to get an Account Key setup – this is a FREE registration Registration agreement screen
image image image
Going to subscribe to get access the to the crime data feed Sweet, we are all set to go and the list of options is available for us for exploring the data Account information provides Account Key needed in PowerPivot to connect.  This screen will be displayed if you click on the Find button in the wizard above
image image image
If you go into your account, datasets, you will get the URL for accessing it in the details section of the dataset in the browser. Place this in the wizard Next screen, let’s try the preview now Oh no, didn’t work, what now?
image image image
Never fear, we have Marco Russo to save us!  Just add an extra slash onto the end of the URL string Now we can preview the data.  The Test Connection option would work either way, but need extra slash to get data. Let the import begin.
image image image
Download complete. New table in the PowerPivot Windows. Time to analyze data for violent crimes, MN is looking okay (#34).

How cool is this?  Really neat that we can tap into some external data and get this incorporated into our existing internal data for analysis.  Lots of data is available out on the Azure DataMarket to take a look at including weather information which is always useful.  This analysis I did was extremely raw and has tons of gaps since I find it hard to believe MN had zero violent crimes in 2006 and you have to factor in population sizes into this analysis (that information was included, just not part of my Pivot or metrics).

So what are you waiting for, go ahead and download the latest version of PowerPivot and tap into some new data feeds –> Download PowerPivot.

Head to the clouds for your data – http://datamarket.azure.com/

Please support the connect bug to get this escalated – PowerPivot fails using URL published on Azure DataMarket website

Posted in Azure, PowerPivot | Tagged: , | 2 Comments »

Round 2: Calculating Previous Year Marketing Calendar Values with DAX

Posted by denglishbi on October 9, 2010

So a couple of weeks back I did the posting Calculating Previous Year Marketing Calendar Values with DAX where I talked about a solution for getting at previous (last) year values in DAX with Marketing (Retail) Calendars.  For the most part the solution worked, but as I looked at the results a bit closer I noticed that it eventually broke down and failed.  Why is this?

Well the answer should have been obvious and I have ran into this before, but eventually these 445, 454, 544, or 13 period calendars run into a small issue that they need to deal with.  The issue is that these calendars only deal with 364 days (13 periods x 4 weeks x 7 days = 364 days).  So there is an extra day that we eventually need to deal with and then we run into the leap year and all of this eventually creates the mysterious week 53!

This situation is documented here on the National Retail Federation web site “4-5-4 Calendar”:

An Explanation of the 53-Week Year

Dividing the retail calendar into 52 weeks of seven days each, or 364 days, leaves an extra day each year to be accounted for. As a result every five to six years a week is added to the fiscal calendar.

If I go back to my original solution and scroll a little further down in my PivotTable I would have seen this since in Feb 2008 we run into a leap year.  This eventually causes the solution to bomb out when we get into the 2009 and 2010 comparisons and so on.

image

So, now what do we do?  Time to regroup and look for a solution to work around this.  After some brainstorming I realized that we just needed to expand on the original solution.  This means that we need to tell each time frame that we are dealing with how it relates to the previous period.  So not only at the day level like I originally setup, but also for the week, month (period), quarter, and year.  So we go back into our table and setup some additional columns:   

image

And then populate these dates with values:

image

Make a few tweaks to the calculation to evaluate the different time frames and reference the date ranges that we just defined (I don’t show it, but we also updated the table in the PowerPivot Window to include these new columns into our model):

image

PY Sales Amount =

IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Year])) = 1,
CALCULATE([Sum of SalesAmount], DATESBETWEEN(DimRetailDate[PK_Date],
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Quarter_Of_Year])) = 1,
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Month_Of_Year])) = 1,
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Week_Of_Year]))=1,
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Day_Of_Year]))=1,
FIRSTDATE(DimRetailDate[PY_Date]),
FIRSTDATE(DimRetailDate[PY_Reporting_Week_Start_Date])),
FIRSTDATE(DimRetailDate[PY_Reporting_Month_Start_Date])),
FIRSTDATE(DimRetailDate[PY_Reporting_Quarter_Start_Date])),
FIRSTDATE(DimRetailDate[PY_Reporting_Year_Start_Date])),
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Quarter_Of_Year])) = 1,
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Month_Of_Year])) = 1,
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Week_Of_Year]))=1,
IF(COUNTROWS(VALUES(DimRetailDate[Reporting_Day_Of_Year]))=1,
FIRSTDATE(DimRetailDate[PY_Date]),
FIRSTDATE(DimRetailDate[PY_Reporting_Week_End_Date])),
FIRSTDATE(DimRetailDate[PY_Reporting_Month_End_Date])),
FIRSTDATE(DimRetailDate[PY_Reporting_Quarter_End_Date])),
FIRSTDATE(DimRetailDate[PY_Reporting_Year_End_Date]))))
,BLANK())

And now we go back and take a look at our results:

image

Tada! We are back in business. And this will work at the other levels as well (here we use the Reporting Year and Quarter):

image

So now we are back in business. Just needed to expand out the original solution into each of the different time frames that we wanted to analyze and make sure that we have the previous year mappings setup according to our business rules in the table we are referencing.

Posted in PowerPivot | Tagged: , , | 2 Comments »

Where did my Excel PowerPivot tab go? Follow-up

Posted by denglishbi on October 8, 2010

Back in June I did a post called Where did my PowerPivot Excel 2010 tab go? and I talked about being impatient waiting for Excel to startup.  The reason that it takes longer now is that I have to wait for the add-ins to load and specifically the PowerPivot one now that I am using that with Excel 2010.

I had a dream that Excel would load fast.  I even did a tweet about this back in May:

denglishbi any options to speed up #Excel 2010 loading. now that #PowerPivot is installed it takes too long to load add-in & extras. please speed up.

I then did the post that I referenced above talking about how I was able to disable the PowerPivot add-in by clicking on the cancel option on the startup splash screen when Excel loads up.  Little did I know there was someone observing my comments and they decided to take some time and actually build out a solution!  And his name is Hrvoje Piasevoli (Blog | Twitter).  He came up with a solution creating some PowerShell scripts – Shortcuts to start Excel with or without PowerPivot add-in.  Today I download the solution and ran the setup process provided in his instructions and it works like a charm!  Big kudos to Hrvoje!

imageOne thing to watch out though is that if you don’t use the shortcuts to access Excel the next time that it will load up the last option that you ran.  This means that if you do the ‘Excel Only’ shortcut, close Excel, and then click on the Excel shortcut in the start menu that Excel will load normally (without PowerPivot add-in).  To enable the PowerPivot add-in you can follow my steps in the posting above or simply close Excel and then double-click the ‘Excel with PowerPivot’ shortcut on your desktop.

Many thanks go out to Hrvoje for creating this setup.  Maybe it should be moved out to CodePlex?  Not sure what the process is for that, but definitely a sweet little setup and even includes an uninstall process.  Once again, thanks Hrvoje!  You rock!

Posted in PowerPivot | Tagged: , | 1 Comment »

Calculating Previous Year Marketing Calendar Values with DAX

Posted by denglishbi on September 24, 2010

UPDATE (9/27/2010): watch out for the mystery 53 week that occurs every 5 years or so.  formula does not account for this at all and will fail eventually.

UPDATE (10/9/2010): I have an updated post that has a resolution for handling the 53 week Marketing (Retail) Calendar setup – Round 2: Calculating Previous Year Marketing Calendar Values with DAX.

When it comes to reporting typically we will use the common calendar year that starts on January 1 and goes through December 31 with the standard months.  We might even be faced with a fiscal calendar that starts on July 1 and ends June 30 the following year.  In both of these cases we are sticking with the set days in the months and are not doing anything special.  But what happens when we start to use a custom calendar for marketing or retail purposes like a 445, 454, 544, or possibly 13 periods of 4 weeks each?  Will this have any impact on how we need to create our metrics to report on our data?  The answer – Yes.  Because in this case we are not simply evaluating September 1 of this year to September 1 of last year.  Nope.  We are actually evaluating it to September 2 of last year which is the same day of the week.

So now the challenge is how can we do this and how can we do it with PowerPivot using DAX?

Disclaimer: I cannot guarantee that this calculation and example will work the same as it did for me, so please always verify the results and add any additional checks that may be needed, this is strictly setup as a demonstration.

For this demonstration I am going to use a marketing calendar using a 454 week setup that starts with the first week in March.  To create this date dimension table I leveraged the functionality in BIDS to create a time table in my data source to generate this table for me – Creating a Time Dimension by Generating a Time Table.

image

Once this table was created for me in my SQL Server database I went into that table and created a new column in the table that I called PY_Date.  This column is going to contain the prior year equivalent date for the dates that are stored in this table in the PK_Date field that was generated by the wizard.  I then went ahead and populated this column with the prior year equivalent dates so we now have data that looks like this in the table (I am just displaying the two date columns here):

image 

This new column is what I call the ‘silver bullet’ that will be required to reference in our DAX formula.  I am kind of blowing it by introducing this first, but I will continue on and show a couple of DAX formulas for a typical calendar and then how it will be adjusted to work with the marketing calendar.

Once we have this table setup and have some facts that we want to relate to this new time (date) dimension we go ahead and load this into PowerPivot and setup our relationship between the two tables.

image

Now we are ready to setup a PowerPivot PivotTable to analyze our data and create additional measures.  In this first example I place the Year Name and Month on the Rows and Sales Amount in the Values.  After this I then create a new measure to calculate the Previous Year Sales Amount leveraging the CALCULATE, DATESBETWEEN, FIRSTDATE, DATEADD, and LASTDATE functions:

image

PY SalesAmount =CALCULATE([Sum of SalesAmount], DATESBETWEEN(DimRetailDate[PK_Date],FIRSTDATE(DATEADD(DimRetailDate[PK_Date],-12,MONTH)),LASTDATE(DATEADD(DimRetailDate[PK_Date],-12,MONTH))))

Here are the results we end up with:

image

Now this works create when you have standard reporting that uses the typical set month periods.  What does it look like when we swap out the members on the Rows with the Reporting Year Name and Reporting Month values that are referencing our marketing 454 week calendar setup:

image

As you can see this same formula will not work.  So now we take a look at using that additional column that we setup earlier for the equivalent prior year value for our dates and we modify the original calculation slightly:

image

PY Sales Amount =CALCULATE([Sum of SalesAmount], DATESBETWEEN(DimRetailDate[PK_Date],FIRSTDATE(DimRetailDate[PY_Date]),LASTDATE(DimRetailDate[PY_Date])))

Now let’s take a look at how this looks with the data:

image

There we go.  That is what we want.  Now we are looking at equivalent dates.  So we need one measure to use with standard calendar reporting periods and a different one if we are reporting against equivalent time periods like using a 454, 445, 544, or some other type like 13 equal 4 week periods.  The secret ‘silver bullet’ is the additional previous year equivalent date column in the time (date) dimension table that is utilized to make the results we need.

Warning: As stated above I cannot guarantee that this is going to work the same in your environment and you might need to include some additional IF logic to do some additional checks to return BLANK().

Enjoy!

Posted in PowerPivot | Tagged: , , | 3 Comments »

Sorting PowerPivot Labels Continued

Posted by denglishbi on September 14, 2010

I was reading the new PowerPivot book, Practical PowerPivot & DAX Formulas for Excel 2010, I purchased a few weeks back and I came across some new information (always a plus).  The information I discovered ties back into my posting I did a few months back in regards to sorting labels manually – Sorting PowerPivot Labels Manually.  The posting talked about sorting month labels so that they were not in alphabetical order, but in the standard Jan, Feb, Mar, etc. order.  Now what I come to find out is that instead of doing that manual work you can simply use the ‘Sort A to Z’ option instead.  Excel is smart enough to know that these are months and should be sorted in a particular order.  Let’s take a look at this:

Before (notice that the months are sorted alphabetically):

image

Select the ‘Sort A to Z’ option for the month label name:

image

After:

image

Now everything is in the proper order.  Not exactly what you would have expected, but this is what we want to see.  Nice little hidden feature.  So what else can this work for…hmmm…what else is standard?  How about days of the week?  Let’s take a look at that:

Before (notice that the days of the week are sorted alphabetically):

image

After the ‘Sort A to Z’:

image

Shazaam!  Sweet.  Now if only you could do this in the Slicers that would be golden, but no such luck.  Plus this is only for the particular PivotTable that you performed this on, so you would need to continue to remember to sort the label names in other PivotTables.

Hope you like that little trick, I know I do.

Posted in PowerPivot | Tagged: , | 4 Comments »

What happened to my PowerPivot Excel Data? 1904? What the?

Posted by denglishbi on September 1, 2010

A few months back I had a rather unique Excel discovery.  We were loading Excel data files into PowerPivot using the import Excel file option instead of the copy/paste and linked table method (for obvious reasons) and all of a sudden we noticed some issues with dates.  The source data looked like the following:

image

But when we loaded this into PowerPivot it looked like this:

image

I was like…what the?  Now I know I can’t stand using Excel as a source file for loading data from, but this was something I had never experienced before.  Why were all of the dates on the rows imported into PowerPivot off by 4 years and 1 day?  We did go ahead and try the copy/paste option and here are the results we saw:

image

Well now, that is what we want.  But we don’t want to use the copy/paste option because we want to streamline the load process and eventually schedule data refreshes in SharePoint.  So what is causing this to happen?  Well it turns out that different versions of spreadsheet software have different start dates and actually Excel provides a way for users to set the starting year date to not reference 1900 (which it thinks is a leap year, but it actually isn’t) and to use 1904 instead.

image

This was news to me.  I tracked down a Microsoft knowledge base article in regards to this issue as well XL: The 1900 Date System vs. the 1904 Date System.  So since the Excel file I was loading the data into did not have the 1904 setting in place, but the source file did then this caused the dates when loaded using the import option to be offset by 4 years and 1 day.  What we ended up doing was changing the source file setting to not reference the 1904 starting year and then adjusted the dates accordingly.  Once we did that we were back in business.  We did contact the business users and verify that this was okay and made sure they were aware of the change we were making.  It turned out they didn’t even know about the setting either.  Weird.  Assuming someone that originally created the file must have done it for some reason.

Posted in PowerPivot | Tagged: , | Leave a Comment »

PowerPivot Data Refresh with Excel Source Data

Posted by denglishbi on August 11, 2010

So wouldn’t you know, one of the source files that we get to load into PowerPivot is Excel.  Initially the data was being loaded into the PowerPivot Window using either the Linked table or the Copy & Paste options.  I immediately decided to change this option to help streamline the process and decided to use the ‘Excel File’ import option under the ‘From Other Sources’.

image

The reason that I did this was to be able to quickly refresh them by using the ‘Refresh’ option in the PowerPivot Window and to also be able to eventually leverage the scheduled data refresh option in SharePoint.

image

The one thing about this setup was that I had to have a fairly clean file to start with, meaning that the data was in a pretty standard format with no special headers or cells in the file, just column headers and data.  It didn’t matter if the data was starting out in cell A1 because the import wizard could detect this and would properly grab the data even if it started in B3.

image    image

What I would run into would be some typical Excel applications that had some crazy layouts with tons of worksheets and formulas all over the place.  These files typically had some data in the top left-hand corner that was helping to drive the entire worksheet.  Now for these worksheets I had to go ahead and setup a named range in the Excel worksheet.

image

Once this was setup we could then reference this named range when importing the Excel data into PowerPivot.  Now in my example here I wouldn’t have had to have done this, but this is just showing you how you go about doing this type of setup.

image

Now when you do the import you will be able to reference the data like such:

image

You will just need to make sure that if the data in the source file changes that the named range gets updated appropriately so that you can refresh the PowerPivot data without any issues.  The other thing that we ended up doing was making sure that the source data was being stored in a common location, which means not on our workstations.  The location we picked was a SharePoint document library.  We then accessed the data utilizing the SharePoint WebDAV protocol meaning we were using UNC paths out to the SharePoint library like such – \\ServerName\DavWWWRoot\Site\DocumentLibrary\ExcelData.xlsx.

Now that you have all of the data loaded into PowerPivot referencing a shared location the next step would be to create some PowerPivot PivotCharts and PivotTables (if you are using PowerPivot reporting functionality).  Once this is done you will want to then upload the data to SharePoint into a PowerPivot Gallery to share with the end-users.  The PowerPivot Gallery is also where you get the option to schedule data refreshes and this is what we are after.

image

The first time I setup the scheduled data refresh I ended up getting the following failure message:

The provider ‘Microsoft.ACE.OLEDB.12.0′ is not registered. The following system error occurred: Class not registered A connection could not be made to the data source with the DataSourceID of ‘<guid>’, Name of ‘<PowerPivot ConnectionName>’. An error occurred while processing the ‘<PowerPivot TableName>’ table. The operation has been cancelled.

I had received a similar error message when I tried to load Excel data into SQL Server on my x64 workstation.  To resolve that issue I had to install the ‘2007 Office System Driver: Data Connectivity Components’.  I tried this on the server and still ended up with the same error message (slight difference between SQL Server Management Studio, 32-bit, and SharePoint Server, 64-bit).  Thanks to some help from some external resources I realized that I needed to install a x64 version of the drivers, which meant installing ‘Microsoft Access Database Engine 2010 Redistributable’.  But when I went to install this it told me that I had to uninstall Office 32-bit software.  The server that I was using was setup in a Sandbox (single server) setup for development purposes, so all of the software was installed and this included 32-bit version of the Office 2010 suite.  I went ahead and uninstalled the data components that I just installed, Office 2010, and the PowerPivot add-in.  Once that was done I was able to install the x64 version of the 2010 components and I initially got this error message:

OLE DB or ODBC error: Failure creating file.; 3436. A connection could not be made to the data source with the DataSourceID of ‘<guid>’, Name of ‘<PowerPivot ConnectionName>’. An error occurred while processing the ‘<PowerPivot TableName>’ table. The operation has been cancelled.

After some quick research I ended up on Dave Wickert’s site (if only my initial search would have brought me here first, or if I just recalled reading this posting earlier this year…) Working with the 2010 Office ACE provider.  This told me that I needed the Desktop Experience feature enabled on the server.  Unfortunately this was already enabled since this server was setup in a Sandbox setup for development purposes.  After a little more testing, reloading the PowerPivot file in the Gallery, and running the data refresh again I ended up with a successful data refresh.  Ahhh…..relief….satisfaction….scheduled Excel data refreshes into PowerPivot:) Excel-to-Excel, sweet harmony.

Posted in PowerPivot | Tagged: , , | 7 Comments »

PowerPivot Book Reviews

Posted by denglishbi on August 2, 2010

This past month I purchased both of the PowerPivot books that were available.  The first one I got was the Professional Microsoft PowerPivot for Excel and SharePoint (wow, that was a mouthful).  I would have to say that this book is geared more toward the IT Professional where it does a great job covering the installation setup on the server side, managing and troubleshooting techniques, and going over some of the development.  While I was going through the exercises in the book I did do a few blog postings in regards to some of my experiences with the PowerPivot product

The other part that I really like about this product was the additional insights that were provided by the product team members talking about how the product came to life and the different paths that were explored initially (originally MS Access was looked at…phew).  The exercises that were provided were helpful and the troubleshooting section was very useful, especially coming from the IT side of the world.  All-in-all I would highly recommend this book to the IT Pro and would give this book 5 out of 5 stars – image.  Definitely worth the $$$.

Next on the list came the book from Mr. Excel (Bill Jelen).  I have to admit that I had not heard of him until the great Alpha Geek Challenge came about that Microsoft launched promoting the PowerPivot product.  I tried to get involved in Round 2, but ended up in Round 3 going head-to-head with Mr. Excel.

This book is definitely geared towards the Excel user and that was expected.  I have not read any of Mr. Excel’s books, so I wasn’t used to his style of writing.  There are some comments made in the book about ‘blowing a gasket’, ‘insane things that come out of Redmond’, ‘going hack off the people in Europe’ (not sure what that even means), etc. that I did not find appropriate in a formal published book.  These comments would have been better suited for a blog posting possibly, but not for a book that I might want to recommend or keep on my bookshelf.  The book does a good job explaining the differences between Excel and PowerPivot, going over all of the functions available to you, providing some examples to go through, and providing advice for publishing workbooks (formatting and look-and-feel).  At the very end it talks briefly about the SharePoint side of the world, but at a very high level.  I did get a good laugh when it stated ‘Build a PowerPoint pivot table’ (easy mistake and the names of products now are really easy to mix up).  If you are an Excel user and are looking to make the switch to PowerPivot then this would be a good reference book.  The price is right and I would have to give this book 3 out of 5 stars – image

I know that there are a few more PowerPivot books coming out and I am definitely looking forward to taking a look at them once they are available

Posted in PowerPivot | Tagged: , | 1 Comment »

 
Follow

Get every new post delivered to your Inbox.

Join 90 other followers