Dan English's BI Blog

Welcome to my random thoughts in regards to Business Intelligence, databases, and other technologies

Archive for the ‘Analysis Services’ Category

Book Review: Microsoft Tabular Modeling Cookbook

Posted by denglishbi on February 21, 2014

There is a new Analysis Services (SSAS) Tabular book available to own and add to your library.  The book is written by Paul te Braak, a lead business intelligence consultant in Australia and one of the developers on the DAX Studio project, and he has put together an outstanding cookbook.  When the book was released I was surprised and excited: surprised because I did not know that Paul was working on this (he is the sole author, big kudos to Paul), and excited because I knew it was going to be a good one.  I had this one on my radar list of books to add to my collection; I am definitely a big fan of the Packt Publishing Cookbook series style of books.  What I like about these books is that they introduce a topic or situation and then go over the solution in a very simple and easy to understand format – Getting Ready, How to do it, How it works, There’s more.  Paul adds a lot of great insights in this book explaining how the solutions work, and he includes a bunch of ‘Tips’ along the way as well.

Here is the post from last month when I first became aware that the book had been released and shared the O’Reilly Media deal:

image

So when I was asked by the publisher this past month to provide a review of the book I was more than happy to say ‘yes’.

Paul does a great job of slowly working you into the Tabular modeling concepts, and the only tools you need to get going are Excel and the Power Pivot add-in.  Paul’s examples use Excel workbooks and flat files for the most part, so it is really easy to get started and get your learn on.

What is amazing is that this book is just over 300 pages and it is loaded with great content that covers items such as how to use Power Pivot, hierarchies, drilldown, parent-child hierarchies (including how to get HideMemberIf-style behavior in DAX), smart measures, smart keys, programmatic access in Excel – cube functions and VBA, querying with DAX, Power View, and more!  Simply amazing.  Paul does a fabulous job, and this is a great intro book that progresses into advanced topics and has great examples, tips, and insights that are a big time value add.

Here is an outline of the chapters:

Chapter 1: Getting Started with Excel
Chapter 2: Importing Data
Chapter 3: Advanced Browsing Features
Chapter 4: Time Calculations and Date Functions
Chapter 5: Applied Modeling
Chapter 6: Programmatic Access via Excel
Chapter 7: Enterprise Design and Features
Chapter 8: Enterprise Management
Chapter 9: Querying the Tabular Model with DAX
Chapter 10: Visualizing Data with Power View

Book link – Microsoft Tabular Modeling Cookbook.

I would definitely rate this as a must have for anyone doing tabular SSAS development and give it 5 out of 5 stars.

Awesome job Paul and thanks for writing the book and sharing!

Posted in Analysis Services | Tagged: , , | 1 Comment »

Book Review: MDX with SSAS 2012 Cookbook

Posted by denglishbi on November 25, 2013

A couple of years back I reviewed a new book on the market related to MDX and Analysis Services, Book Review: MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook, and the book has now been updated for 2012.  The book was written by Microsoft SQL Server MVP Tomislav Piasevoli and Sherry Li.  The cookbook style makes for an outstanding reference book.  The book is full of solutions that tell you “how to do it”, “how it works”, and wait, “there’s more” (great value add and insights).

The book has been updated for the 2012 edition of SSAS, and the examples have also been updated against the 2012 version of the Adventure Works database.  The other difference from the 2008 R2 release of the book is that some of the content has been broken up into a couple of new chapters.

  • Chapter 1: Elementary MDX Techniques
  • Chapter 2: Working with Sets
  • Chapter 3: Working with Time
  • Chapter 4: Concise Reporting
  • Chapter 5: Navigation
  • Chapter 6: Business Analytics
  • Chapter 7: When MDX is Not Enough
  • Chapter 8: Advanced MDX Topics
  • Chapter 9: On the Edge

Book link – MDX with SSAS 2012 Cookbook. Amazon reviews link – Amazon Customer Reviews.

I would definitely rate this as a must have for anyone doing multidimensional SSAS development and give it 5 out of 5 stars.

Posted in Analysis Services | Tagged: , , | Leave a Comment »

Tabular Models, Compatibility Level, and Power View

Posted by denglishbi on October 4, 2012

A couple of weeks ago I did a review of the CTP4 of SP1 that came out for SQL Server 2012 – Installing SQL Server 2012 SP1 CTP4 Experience.  Part of my test involved upgrading an Excel 2013 PowerPivot file to a Tabular Analysis Services project.  After I did that I set up a BI Semantic Model connection file in SharePoint and explored the Power View designer updates.  It was business as usual, and I was seeing all of the new features and functionality that I had previously mentioned in my posts on Excel 2013 and Power View – Power View meet Microsoft Excel 2013 and Power View meet Microsoft Excel 2013 Part 2.

Well, this past week I was prepping for my pre-con for the SQLSaturday #149 event and I noticed something odd.  When I was connecting to my tabular models with Power View I was not seeing my hierarchies or key performance indicators (KPIs).  I was a bit confused.  I had upgraded my environment to SP1, so what was going on?  I thought the upgrade provided support for these model features now, right?  I knew I had seen them already; maybe I was just imagining things, though.  When I was doing my tabular project deployments I was also seeing a new dialog box that I was just clicking ‘Yes’ on without really understanding what it was talking about…

image

I finished the pre-con still confused, and then last weekend I did a bit more digging into this issue.  I went back to my Excel 2013 SQLSaturday model that I had upgraded and reviewed the property settings on the database.

image

The item I noticed that seemed like it might be important was the Compatibility Level setting.  This database was showing SQL Server 2012 SP1 (1103).  I then took a look at the database I was using for some of my demos.

image

Ahhh, now the light bulb was starting to get bright :)  The database that I was deploying was still at the SQL Server 2012 (1100) setting even though I had upgraded my environment to SP1.  Interesting.  Apparently you now have the ability to control this setting if you want to run databases at different levels or do some testing in the same environment before upgrading all of your databases.

image

I didn’t realize that I now had to make a change to my model property settings before I deployed my database.

Once this setting is changed you will be prompted with the following:

image

Once again I clicked ‘Yes’ and redeployed to upgrade the database on the server.  Now when I connect to my model with Power View I see my hierarchies and KPIs, perfect!

image

So this is just something to be aware of when working with tabular model projects now.  There is a property setting that we have to take a look at, Compatibility Level, and it is a pretty important one for the users of our data models :)
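
If you want to double-check the level on a deployed database without opening the property dialogs, a quick AMO script can list it for you.  This is just a minimal sketch (the instance name localhost\TABULAR is a placeholder, and it assumes a reference to the Microsoft.AnalysisServices AMO assembly); it reads the same CompatibilityLevel value shown in the dialog above:

// Minimal AMO sketch: list each deployed database and its compatibility level.
// The instance name is a placeholder - point it at your own tabular instance.
using System;
using Microsoft.AnalysisServices;

class CompatibilityLevelCheck
{
    static void Main()
    {
        Server server = new Server();
        server.Connect(@"Data Source=localhost\TABULAR");

        foreach (Database db in server.Databases)
        {
            // 1100 = SQL Server 2012, 1103 = SQL Server 2012 SP1
            Console.WriteLine("{0} : {1}", db.Name, db.CompatibilityLevel);
        }

        server.Disconnect();
    }
}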

Here are a couple of additional reference links to take a look at (some of this documentation and these instructions might differ from what you see with CTP4 of SP1, but they should be available in the RTM of SP1):

Cheers!

Posted in Analysis Services | Tagged: , , | 8 Comments »

Book Review: MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook

Posted by denglishbi on September 26, 2011

A couple of weeks back I mentioned a new book on the market related to MDX and Analysis Services and that I had received a copy to review.  The book is MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook, written by Microsoft SQL Server MVP Tomislav Piasevoli.

This is by no means an introduction (101) book to MDX, but it is an MDX reference book that anyone using Analysis Services should have on their desk (or device – it is available in an eBook version as well).  I had not read a cookbook-style technical book before, and I simply couldn’t put it down.  I really enjoyed the style: here is a situation, here’s how you do it, and now let’s see how this actually works.  But the author doesn’t just stop there; there’s more information about each recipe, and then he provides additional reference links to check out or other spots in the book to look at as well.  After I got done reading one recipe I continued on to the next one because I was interested in reading the solution, finding out more, and seeing how it compared to how I would approach the situation.

I found the book very easy to read, and the information is extremely valuable for anyone working with Analysis Services and trying to solve problems using MDX.  This is a must have book for any SSAS professional; if you don’t believe me, check it out for yourself and you will definitely agree, just like others already have, such as Derek Goodridge (@WorkerThread) and Jason Thomas (@de_unparagoned).

Book link – MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook.

I would definitely rate this a must have and give it 5 out of 5 stars.

Posted in Analysis Services | Tagged: , , | Leave a Comment »

New Book: MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook

Posted by denglishbi on August 30, 2011

Just a couple of weeks ago I tweeted about a new book that came out in regards to MDX and Analysis Services:

tweet

Book –> MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook

Well, just this week I received my very own eBook version for review from the publisher and I have already started diving into it to check it out.  This is a cookbook style of book that goes over different scenarios and/or issues and then provides the solution, explains the solution, and then provides even more information in regards to the item.  Very slick, and I love the format.  Not only is Microsoft SQL Server MVP Tomislav Piasevoli a pro when it comes to this content, but he also got an all-star MVP lineup to review and edit the book, including Greg Galloway, Darren Gosbell, Deepak Puri, Marco Russo, and Chris Webb.

I plan on providing a full review on the book in a couple of weeks, but just wanted to put another shout out in regards to the book in case you haven’t heard about it yet and are looking for a technical reference cookbook in regards to MDX.  Stay tuned for my complete review.

Posted in Analysis Services, SQL Server, Training | Tagged: , , | Leave a Comment »

Calculating Cumulative Values in Microsoft BI Tools

Posted by denglishbi on January 24, 2011

When working on a business intelligence project, a common situation you will find yourself in is having to come up with a cumulative value (running value) calculation.  The challenge is that depending on what product you are working with, the calculation that you need is going to be different.  In this post I will compare how this can be done with T-SQL, Excel, SSIS, SSRS, MDX, and DAX.  Along the way I will provide some additional reference links to other options and content, and I will point out some differences in how the totals are being calculated so you will know what to expect with the results.

In these examples I will be working with the Contoso Retail Data Warehouse data sample.  The examples below are not necessarily the optimal approaches; they just show how you can go about getting at the desired result.

SQL Server T-SQL

So let’s start out by taking a quick look at how to calculate the sales over each of the years directly against the data in the database through SQL Server Management Studio.

SELECT d.CalendarYear,
       SUM(f.SalesAmount) SalesAmount,
       SUM(f.SalesAmount) + COALESCE((SELECT SUM(SalesAmount)
                        FROM FactSales fs JOIN DimDate dd ON fs.DateKey = dd.DateKey
                        WHERE dd.CalendarYear < d.CalendarYear),0) CumSalesAmount
FROM FactSales f
INNER JOIN DimDate d ON f.DateKey = d.DateKey
GROUP BY d.CalendarYear
ORDER BY d.CalendarYear

And the results for this are:

CalendarYear  SalesAmount       CumSalesAmount
------------  ----------------  -----------------
2007          4561940955.0212   4561940955.0212
2008          4111233534.6841   8673174489.7053
2009          3740483119.1823   12413657608.8876

Not too bad.  We have an inner query that is being used to return the previous year(s) values and append them onto the current year (if they exist).  If no value is returned then a zero is used in its place.  For additional examples and options take a look at this posting by Garth Wells – Calculating Running Totals.

Microsoft Excel

So now let’s take a look at how this would be done if we were using Excel.

image

There are a few different options you have here; in this example I enter a formula of =SUM($B$2:B2) in cell C2 and then copy that formula down into cells C3 and C4.

SQL Server Integration Services (SSIS)

Now we will take a quick look at how this could be done in the ETL process.  As we pull data from a source we can evaluate the values in the data flow and accumulate them inside a script component.  In the source we will simply get the sales for each year and then append the values in the flow to each other to get at the same results displayed above.

image

The cumulative value is calculated in the script component with some code.  We have a variable set up that we use to accumulate the sales amount of each row in the data flow, and this value is sent back out in a new output column called CumulativeSales.

image

Nothing too fancy going on here.  I don’t have a destination in the flow; I just added a derived column so that I could include the data viewer and run the data flow to show the results.
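
Since the script itself only appears in the screenshot, here is a rough C# sketch of what that script component body might look like.  The surrounding class shell is generated by the SSIS designer, and this assumes an input column named SalesAmount plus an output column named CumulativeSales on the synchronous output (the generated buffer property names will match whatever you call the columns):

// Rough sketch only, not the exact code from the screenshot above.
// A class-level variable carries the running total across the rows in the
// data flow, and each row writes the accumulated value to CumulativeSales.
private decimal cumulativeSales = 0;

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    cumulativeSales += Row.SalesAmount;
    Row.CumulativeSales = cumulativeSales;
}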

SQL Server Reporting Services (SSRS)

Now let’s switch over to Reporting Services.  In this example we will not do the cumulative value in the dataset query; we will do this calculation in the report.

image

In this example we leverage the RunningValue aggregate function that is available in Reporting Services.  The expression that we use in the Cumulative Sales column is =RunningValue(Fields!SalesAmount.Value, Sum, "DataSet1")

image

This function returns a running aggregate of the values in our specified dataset using the SUM function which is also specified in the expression.

SQL Server Analysis Services (SSAS) MDX

Now how about taking a look at Analysis Services.  How would we go about doing this calculation using MDX?

WITH MEMBER [Measures].[Cumulative Sales] AS
    SUM({null:[Date].[Calendar Year].CurrentMember},[Measures].[Sales Amount])

SELECT {[Measures].[Sales Amount],[Measures].[Cumulative Sales]} ON 0,
    NON EMPTY [Date].[Calendar Year].[Calendar Year].Members ON 1
FROM [Sales]

And the results of this are:

image

We can apply some filters to the MDX statement to remove the years that currently don’t have a Sales Amount associated with them.

WITH MEMBER [Measures].[Cumulative Sales] AS
    SUM({null:[Date].[Calendar Year].CurrentMember},[Measures].[Sales Amount])

SELECT {[Measures].[Sales Amount],[Measures].[Cumulative Sales]} ON 0,
    NONEMPTY([Date].[Calendar Year].[Calendar Year].Members,[Measures].[Sales Amount]) ON 1
FROM [Sales]

This returns what we have seen in our other results:

image

There are additional ways of going about this calculation; for more options take a look at this recent posting by Amish Manubhai Shah – Various way to calculate running total from All Period to Currentmember.  Now I want to show how this solution differs a bit from the other ones we have seen.  What happens if I apply a filter to the query to only display 2008 and 2009?  What would you expect?

image

It might not be what you would want, but then again maybe it is.  The Cumulative Sales calculation is summing up the values from the beginning of time (the [All] level).  So the first value that we see in 2008 includes the Sales Amount from 2007 as well (and if there were sales in 2006 it would include those too, and so on).

You can make a few modifications to the calculation and set up a named set to reference the items, and come up with something along these lines.

WITH DYNAMIC SET [SelectedYears] AS Existing [Date].[Calendar Year].[Calendar Year].Members

MEMBER MEASURES.[Cumulative Sales] AS
    SUM({EXISTS([Date].[Calendar YQMD].[Calendar Year].Members,
    [SelectedYears]).Item(0):[Date].[Calendar YQMD].CurrentMember},
    [Measures].[Sales Amount])

SELECT {Measures.[Sales Amount], Measures.[Cumulative Sales]} ON 0,
    [Date].[Calendar YQMD].[Calendar Year].Members ON 1
FROM [Sales]
WHERE {[Date].[Calendar Year].&[2008],[Date].[Calendar Year].&[2009]}

With the following results:

image

Here a dynamic set is used to get the references of the members that are in the slicer (WHERE clause) and feed this information into the cumulative sales calculation, grabbing the first item to come up with the reference to the 2008 year.  This becomes the starting point for the cumulative total instead of the beginning of time.

You might just want to look at year-to-date cumulative totals and be able to drill down into a user-defined hierarchy if you have one set up like Year-Quarter-Month-Date.  If we modify our calculation a bit we might end up with something like the following:

WITH MEMBER [Measures].[Cumulative Sales] AS
    SUM(YTD([Date].[Calendar YQMD].CurrentMember),[Measures].[Sales Amount])

SELECT {[Measures].[Sales Amount],[Measures].[Cumulative Sales]} ON 0,
    Hierarchize({[Date].[Calendar YQMD].[Calendar Year].Members,
    [Date].[Calendar YQMD].[Calendar Quarter].Members}) ON 1
FROM [Sales]
WHERE {[Date].[Calendar Year].&[2008],
    [Date].[Calendar Year].&[2009]}

With the following results:

image

Here we can see that our cumulative total stops at the year level, and at the lower levels the values accumulate as expected, so the cumulative total at Q4 is the same as the yearly value.  What we have now is a YTD calculation.

PowerPivot for Excel 2010 DAX

We have one more product that I want to take a look at, and that is PowerPivot.  This is going to be fairly similar to the Analysis Services solution since it actually is Analysis Services behind the scenes; the only difference here is that we will be using DAX instead of MDX.

So for the first example we will set up a cumulative sales calculation just like we did with the first MDX example.  The calculation we use is as follows:

=CALCULATE(SUM(FactSales[SalesAmount]),
    DATESBETWEEN(DimDate[Datekey],
        FIRSTDATE(ALL(DimDate[Datekey])),
        LASTDATE(DimDate[Datekey])))

So we end up with the following:

image

You could go ahead and add some additional logic to evaluate the Sales Amount to determine if a value exists, but you get the general idea here.

Once again this is calculating the cumulative total from the beginning of all time, and that is done by using the ALL reference in the FIRSTDATE part of the calculation.  If we filter the years and only display 2008 and 2009 we will see similar results to what we saw with MDX, where the cumulative sales amount for 2008 includes the 2007 sales.

image

The one nice thing about PowerPivot is that this is in Excel, so if you want to do anything a little more custom or make any references to cells you can do that and use Excel functions as well.  As far as setting up a cumulative sales amount like the second MDX example, I can’t determine an equivalent in DAX.  Maybe someone else has an idea if this can be done so that it only accumulates based on the Date values being evaluated.  If you do, please leave a comment.

You can do the YTD calculation in DAX, and here is the formula for that:

=TOTALYTD(SUM(FactSales[SalesAmount]),DimDate[Datekey])

And the results look like this if we break out the years by the quarters:

image

Once again we can see that the yearly total is equivalent to Q4 for the same year.  For some additional information and explanation of the PowerPivot calculations take a look at a great post by Kasper de Jonge: Accumulate values (running value) over multiple years in PowerPivot using DAX.

Conclusion

Cumulative total (running value) calculations are done differently within each of the Microsoft products.  We took a look at some examples of how this can be done in T-SQL, Excel, SSIS, SSRS, MDX, and DAX.  These are not the only options, so if you want to see more, take a look at the additional posts that I included as reference points in the different sections.  It is important to understand how the formulas need to be configured and also what the results are once the calculations and logic are put into place.  I think that one of the easiest ones, besides the Excel formula, would have to be SSRS with the RunningValue aggregate function.  I like how this is set up and how it evaluates everything based on the context that is being referenced.  So if we go back to the SSRS report and apply a filter to the tablix to remove 2007 from what is displayed, we would end up with the following:

image

Very simple and easy to understand, but that is just my personal opinion.

Posted in Analysis Services, Integration Services, PowerPivot, Reporting Services, SQL Server | Tagged: , , , , | 9 Comments »

Analysis Services (SSAS) Processing and Aggregations

Posted by denglishbi on October 29, 2009

I get a lot of questions about this and see quite a few postings about Analysis Services processing in the MSDN forums.  So you created an SSAS database solution and have deployed it to a production environment.  Now the data has accumulated and you need to take a look at moving to an incremental processing approach, because you are either loading the data more regularly or the volume of data and processing no longer fits your maintenance window.  So what do you do?

The solution is to look at doing a more incremental processing approach instead of just a Process Full on the database.  You have a few options in regards to this, and I am going to discuss what happens during a dimension update process and also go over an SSIS solution that I put together.

In order to update and refresh the data in your dimension you can perform a Process Update on the dimension (or a Process Add if you are only adding new members and not updating any records, but that typically is not the case).  So what happens to the cube data when you update a dimension?  Well, if nothing changed then you are okay, but if you have flexible relationships and the data within these relationships changed, then the aggregation designs that reference them are dropped from the partitions.  Here is a walkthrough showing this:

image

Aggregation Advanced View Display in SSAS 2008

image

Query the partition information to view the aggregation information

image

image

Process Update with no changes – everything will stay the same

image

Process Update and there were data changes to dimension attributes that had aggregation designs

image

Aggregation designs that referenced those dimension attributes are dropped (A0 and A2 were dropped).  How do you get them back?

image

Run a Process Default on the affected partitions (you could also choose the option to process affected objects during your dimension processing, which will do the same thing)

image

Aggregation designs are now back

So that was just a quick overview of what happens when you do a refresh of a dimension in your SSAS database.  You can run this, just be aware of what is going on behind the scenes.  If there were changes and relationships were updated, then you will also need to rebuild the aggregations and indexes if those attributes were included in any aggregation designs.
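
If you would rather script that Process Update and Process Default sequence than click through it, AMO makes it fairly straightforward.  Here is a minimal sketch of the idea (this is not the code from the SSIS package discussed below, and the server, database, and cube names are placeholders):

// Minimal AMO sketch: ProcessUpdate the dimensions used by one cube, then
// ProcessDefault the cube so any aggregations and indexes dropped by the
// dimension updates get rebuilt without re-reading the fact data.
using System;
using System.Collections.Generic;
using Microsoft.AnalysisServices;

class IncrementalDimensionRefresh
{
    static void Main()
    {
        Server server = new Server();
        server.Connect("Data Source=localhost");

        Database db = server.Databases.FindByName("AdventureWorksDW");
        Cube cube = db.Cubes.FindByName("Adventure Works");

        HashSet<string> processed = new HashSet<string>();
        foreach (CubeDimension cubeDim in cube.Dimensions)
        {
            Dimension dim = cubeDim.Dimension;
            if (!processed.Add(dim.ID)) continue;   // skip role-playing duplicates

            Console.WriteLine("ProcessUpdate: " + dim.Name);
            dim.Process(ProcessType.ProcessUpdate);
        }

        // Rebuild whatever the dimension updates invalidated.
        Console.WriteLine("ProcessDefault: " + cube.Name);
        cube.Process(ProcessType.ProcessDefault);

        server.Disconnect();
    }
}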

So if you want to incorporate this logic along with an incremental update of your partitions, what can you do?  Well, for this I leveraged SSIS and set up a package that allows for either a full or incremental update process.  In my example here I actually do this for a specific cube in an SSAS database, so I only process the dimensions that are associated with a particular cube.

image image

I have to give credit to Vidas Matelis here for providing some of the AMO code for the processing: SSIS package that process all partitions/measure groups/cubes in one database.  I modified it a bit to be able to limit the scope to a particular cube, along with adding a Try Catch block in the code, but it is what I used as a starting point.  Using AMO allows the process to be more flexible and dynamic.  As with all things, as you expand on concepts and build them out they can start to get a bit complex, so I will try to explain this as best I can in a brief sort of way for this posting.

  1. I added a script task as a starting point so that I could evaluate the ProcessFull variable, and I also use this to set the number of months that will be processed (or this could be years depending on your volume).  I use a naming convention in the partition names so that it is like partition_name YYYY or partition_name YYYYMM.  In this process it will be the latter YYYYMM format.  So if IncrementalMonthsProcess is set to 2, then at the time of this posting the value would be set to 200909, and this will be used later in the data flow task for evaluation where the names of the partitions are parsed.
  2. Now at this point you move on to either the Full or Incremental processing.  The Full path is fairly self-explanatory: I have it broken into a full dimension process script task and then another script task for the partition processing.  I broke it out into two tasks so that I could get some additional duration information for logging and history.  The dimension processing only processes the dimensions for the specified cube, and this is the same for the incremental approach as well.  So if you have multiple cubes with more dimensions, or some unused dimensions, then those would not be touched.
  3. The incremental process starts out with a script task that does a Process Update on all of the dimensions in the specified cube.  It is the same script task code as the full version, with the exception of the type of processing that is performed.
  4. The data flow task is used to create three recordsets of data that will be used for the partition processing – a full, a data, and a default processing list.  I set up a script component as a source to create a recordset of values and to parse out the date information in the partition names if it exists.
  5. The first list that is created is the full list of partitions, and these get placed in the default processing list (object variable) that will be used in the last step of the package.
  6. The conditional split transform is used to determine which partitions will have a full process performed on them (no date information included in the partition names, so the year info was set to zero) along with the partitions that will be part of the incremental update.  These partitions are typically part of a large measure group where the data has been partitioned by year or month (in this case month), and we are only going to process the last 2 months based on step 1.  We just do a Process Data on these partitions (this reduces the impact on the server, and once it is done the data is accessible to end users – the final step in the package will build out the aggregations and indexes).  The remaining partitions will be left untouched (at least in regards to processing the data; aggregations might be rebuilt if they were affected).  A rough sketch of this naming convention logic is included just after this list.
  7. Now back to the control flow to finish off the process: we first run the process full partition list, then move on to the process data partition list, and finally the process default processing list.  As shown above in the screenshots, the last step will do nothing to the partitions that were fully processed, but it will build the aggregations and indexes for the process data partitions and also for any additional partitions that might have been affected by the dimension updates.
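
For illustration only, here is a rough sketch of the partition naming convention check described in steps 1, 5, and 6 (this is not the actual package code; it assumes partition names ending in YYYY or YYYYMM and a cutoff period such as 200909, and treating a yearly partition as if it were December for the comparison is my own assumption):

// Rough sketch of the partition classification from steps 5 and 6, not the
// package's actual script code.  Partitions with no date suffix get a full
// process, recent ones get a Process Data, and older ones are left alone.
using System;
using System.Text.RegularExpressions;

static class PartitionClassifier
{
    public static string Classify(string partitionName, int cutoffYearMonth)
    {
        Match m = Regex.Match(partitionName, @"(\d{6}|\d{4})$");
        if (!m.Success)
            return "Full";                      // no date suffix in the name

        int suffix = int.Parse(m.Value);
        int yearMonth = (m.Value.Length == 4)
            ? suffix * 100 + 12                 // assumption: compare a YYYY partition as December
            : suffix;

        return yearMonth >= cutoffYearMonth ? "Data" : "Skip";
    }
}

With a cutoff of 200909, a partition named “Internet Sales 200910” would come back as Data, “Internet Sales 2005” would come back as Skip, and a partition with no date suffix would come back as Full.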

So that was a quick overview of the process.  Now there are definitely other ways you could go about this; this is simply one solution that I came up with, and it has worked well.  I have made a few custom modifications as needed, like doing incremental processing by year or month along with limiting it to a particular cube, but these can quickly be modified and put to use in pretty much any environment.  Just set the variable values (ASServer, ASDatabase, ASCube, ProcessFull, IncrementalMonthsProcess) and you are off to the races.  If you want to turn on SSIS logging, that is up to you.  I have used the SQL Server table logging and then leveraged the SSIS report pack for reporting purposes, but that is entirely up to you.

I have uploaded the package to my SkyDrive account if you want to check it out.  There are both SSIS 2005 and 2008 versions available in the download, and I have annotations included in the packages.

Hope you find this useful.

image

(Click on the image above, which contains the link to download.  You should be able to do a right-click and save the target.)

UPDATE (6/2/2010): Package download has been updated to reflect a fix to handle the partitions with a yearly naming convention so that they do not get included in the full processing.  This was discovered during my May 2010 presentation preparation – PASSMN May 2010 Follow-up.

Posted in Analysis Services | Tagged: , | 1 Comment »

Windows Server 2008 Kerberos Bug Patch – resolves SSAS connection issues

Posted by denglishbi on July 21, 2009

Yesterday I was notified by a colleague that Microsoft had released the patch to fix the AES Kerberos issues that had been discovered earlier this year – Windows Server 2008 Kerberos Bug – Transport Connection Issues with SSAS data.  I had not been notified of this yet, but I checked out the blog posting by Dr. John Tunnicliffe, SSAS: Microsoft release fix for “Kerberos killing MDX” issue, and he provided a link to the downloads :)

UPDATED (8/23/2009) *** This has been updated and is the official release for this fix now *** :

A Kerberos authentication fails together with the error code 0X80090302 or 0x8009030f on a computer that is running Windows Server 2008 or Windows Vista when the AES algorithm is used

http://support.microsoft.com/default.aspx/kb/969083

*** This has been updated and is the official release for this fix now ***

So for those of you who have had to put the upgrade to Windows Server 2008 on hold because of this, you can now download the patch and continue on with your infrastructure upgrades.  Just make sure that you properly test the patch before deploying it into your production environments.

Enjoy!

Posted in Analysis Services | Tagged: , | 1 Comment »

Slicing Analysis Services (SSAS) Partitions and more

Posted by denglishbi on May 15, 2009

I just wanted to touch base on a couple of items with partitions, referencing the ever popular Adventure Works 2008 sample Analysis Services project available to download from CodePlex here – SQL Server 2008 SR1 downloads.  It appears that they have bundled the databases together now; they used to be broken out by the operational and the data warehouse versions.  The sample project file for Analysis Services will be part of this download and will be located in the following directory if you go with the default setup – C:\Program Files\Microsoft SQL Server\100\Tools\Samples\AdventureWorks 2008 Analysis Services Project.

Now if you open up the Enterprise version you will see that the measure groups have multiple partitions defined for them.  One thing to note is that the counts are not properly updated and that aggregations have not been defined for all partitions in the cube.

image

If you have BIDS Helper installed you can go ahead and use the add-in to perform an update on the estimated counts for all of your objects.

image

Just a warning, if you are going against a very large database you might not want to perform this operation.

image

As a workaround you could go ahead and simply update the estimated row count in the properties for the partition to provide Analysis Services an estimated row count when you define and generate the aggregations for the partitions (there is also a Partition Count setting that you will see when you go through the aggregation design wizard that gets used by the algorithm when creating aggregations, so it is important to set these values).

image

After the estimated counts have been refreshed you will see the updated values, but you would still need to design aggregations for the partitions where these have not been defined yet (you might need to save the cube file, close it, and reopen it to see the updated counts).

image

To create the aggregations in SSAS 2008 you have to switch over to the new Aggregations tab in the cube designer.  You can then select the group that you want to design the aggregates for and walk through the wizard.

image

And you can generate the aggregates for all of the partitions at once.

image

Ok, now let’s get back to the partitions portion.  I am going to make a modification to the Internet Sales partition for 2004, breaking it out into Q1 and placing Q2 in a separate partition.  This really doesn’t need to be done for the Adventure Works data since the volume of data is extremely small, but in a real-world scenario this could definitely improve query performance (plus reduce processing time if you are just processing the latest quarter instead of the entire year).

image

I went ahead and modified the query of the existing 2004 partition so that the cutoff for OrderDateKey was less than 20040401 instead of 20041231.  You need to be careful that you do not overlap the ranges, because there is no validation going on, so you could potentially include data that is already in an existing partition.  After I had modified the existing partition for 2004 and added the new partition, I went ahead and updated the estimated counts.

image

Now that this is done, let’s run a query against the cube and take a look at what is going on.  Here is the query that I will execute against the cube:

SELECT {Measures.[Internet Sales Amount], 
        Measures.[Internet Order Quantity]} ON 0,
NON EMPTY([Customer].[Customer Geography].[Country]*
        [Product].[Category].[Category]) ON 1
FROM [Adventure Works]
WHERE ([Date].[Calendar].[Month].&[2004]&[4])

image

So the query executed, and if we look at the Profiler trace we can see that the query actually touched multiple partitions for the measure group that we were querying.  The reason for this is that we have modified the basic partitioning scheme from a single year.  If we had left it at the year level it would have been fine, but since we are dividing the year up into multiple parts now, Analysis Services does not know where to retrieve the data to satisfy the query.

Let’s go back into the partitions in the cube and set the ‘Slice’ property for the partitions.  This is where you specify the tuple that defines the partition.

image

Now that we have this set up, we will redeploy and run the query again.  You will need to define the ‘Slice’ property on each of the partitions in the measure group.
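
If you would rather not click through every partition in BIDS, the same Slice property can also be set through AMO.  Here is a minimal sketch (the database, measure group, and partition names are just guesses based on the Adventure Works example above, so use whatever names your Q1/Q2 partitions actually have):

// Minimal AMO sketch: set the Slice on the two new 2004 partitions and push
// the definition change back to the server.  Object names are placeholders.
using Microsoft.AnalysisServices;

class SetPartitionSlice
{
    static void Main()
    {
        Server server = new Server();
        server.Connect("Data Source=localhost");

        Database db = server.Databases.FindByName("Adventure Works DW 2008");
        MeasureGroup mg = db.Cubes.FindByName("Adventure Works")
                            .MeasureGroups.FindByName("Internet Sales");

        // The slice is just an MDX tuple or set describing the partition's data.
        mg.Partitions.FindByName("Internet_Sales_2004_Q1").Slice =
            "{[Date].[Calendar].[Month].&[2004]&[1]:[Date].[Calendar].[Month].&[2004]&[3]}";
        mg.Partitions.FindByName("Internet_Sales_2004_Q2").Slice =
            "{[Date].[Calendar].[Month].&[2004]&[4]:[Date].[Calendar].[Month].&[2004]&[6]}";

        mg.Update(UpdateOptions.ExpandFull);   // save the changed partition definitions
        server.Disconnect();
    }
}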

image

UPDATE (5/16/2009): fixed this picture (before I had highlighted the aggregation, not the partition)

Now that we have defined the ‘Slice’ for the partitions, we see that our query only touches the partition that we are querying against, and it was faster in response time :)  Granted, this is a small set of data and using this doesn’t really make too much of a difference, but you can imagine what this would do with a very large dataset.  And if we switched over to reference the Fiscal hierarchy instead we would see the same results.

That is it for now.  I hope you enjoyed this little tip, and I want to thank Siva Harinath and Howie Dickerman from Microsoft for their presentation at last year’s Microsoft BI Conference, Designing High Performance Cubes in SQL Server 2008 Analysis Services, where they pointed out this item.

Posted in Analysis Services | Tagged: | 3 Comments »

Large Dimension Processing issue – File system error

Posted by denglishbi on May 11, 2009

A couple of weeks back I experienced an issue with a somewhat large dimension.  The dimension had around 15.5MM rows, and it continued to error while processing the key attribute within the dimension in Analysis Services (SSAS) 2008.  I don’t believe this is a version-specific issue, and I was able to come up with a solution that resolved the processing problem, which I will explain.

What was happening was that all of the other attribute hierarchies were processing fine, but every time the key attribute was being processed it would error out in the same spot (around 60K rows) and return this error message (I swapped out a few names to protect the innocent):

File system error: While attempting to read information from disk, a read error occurred for physical file: \\?\D:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Temp\MSMDCacheRowsetBadRows_1932_16629_jm313.tmp, logical file: . Errors in the OLAP storage engine: An error occurred while the ‘DimensionName Key’ attribute of the ‘DimensionName’ dimension from the ‘SSAS_DB’ database was being processed.

This processing was going against an Oracle data store; not that I feel the issue was related to that, but just as a disclaimer.

I started by verifying the drive space for the Temp directory that gets utilized during processing to ensure that enough space was allocated.  I also reviewed the design of the dimension to make sure that options such as setting the attribute hierarchies to not optimized and not ordered were configured properly.  I also verified the SSAS server properties to make sure that the ExternalCommandTimeout was adequate so that the queries would not time out during processing.  Unfortunately none of this seemed to make a difference, and I continued to get the same error message.  I even verified the size of the asstore file for the dimension attribute, thinking that maybe I was bumping up against the 4GB limit, but it was only around 355MB.

After searching the web and forums I came across an item that seemed like a potential game winner – Using ProcessingGroup Dimension property option ByTable vs. ByAttribute may error with string keys.  At this point I had nothing to lose, and based on the hardware – Dell 2950s with 32GB of RAM and two quad-core processors, plus we were running x64 software – I felt that this was a good solution.  I went ahead and switched the ProcessingGroup property on the dimension from the default ByAttribute to ByTable.  Basically, a single massive query gets executed and the results get cached, and that cache is then used to process all of the attribute hierarchies.  In running this process not only did the processing complete successfully, but it actually shaved about 20% off the time for processing the dimension.  You definitely need to read up on this setting, make sure that you have the hardware in place, and do some proper testing.
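
If you ever need to make that change with a script instead of the BIDS properties window, the same setting is exposed through AMO.  Here is a small sketch of the idea (the server, database, and dimension names below come from the masked error message above and are placeholders, and I am assuming the Dimension.ProcessingGroup AMO property, which mirrors the designer setting):

// Sketch only: switch the dimension's ProcessingGroup from the default
// ByAttribute to ByTable, save the change, and reprocess the dimension.
// The ProcessingGroup property is assumed to mirror the BIDS setting.
using Microsoft.AnalysisServices;

class SwitchToByTable
{
    static void Main()
    {
        Server server = new Server();
        server.Connect("Data Source=localhost");

        Database db = server.Databases.FindByName("SSAS_DB");
        Dimension dim = db.Dimensions.FindByName("DimensionName");

        dim.ProcessingGroup = ProcessingGroup.ByTable;
        dim.Update();                                // push the definition change
        dim.Process(ProcessType.ProcessFull);        // reprocess with the new setting

        server.Disconnect();
    }
}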

image

Some of the other references that I looked at while working to fix the processing issue were the following:

As always you should refer to the whitepapers that have been provided on the SQL Server Best Practices site; a few key ones to point out are the following:

Posted in Analysis Services | Tagged: | 1 Comment »

 