Installing Data Explorer Preview & Demo with IMDB Data
Posted by denglishbi on March 4, 2013
I can’t believe it has been over a year since I posted about Data Explorer in my post Installing Data Explorer Desktop Client. That was covering the initial release of the client plugin that used the cloud based service from SQL Azure Labs. Now there has been a considerable amount of work done and there is a public preview of the new COM Add-In you can download and use in Excel.
Let’s take a look at the requirements:
Supported Operating Systems: Windows 8, Windows 7, Windows Vista, Windows Server 2008
- requires .NET 3.5 SP1 or greater
- Office 2013 or Office 2010 SP1
You can download the 32 or 65-bit version of this from the Microsoft Office area – Download “Data Explorer”.
Here you can see the download loading in Excel 2010.
Here is the add-in being displayed in the ribbon along with the options that are available.
Version: 1.0.3207.2 (published 2/25/2013)
Now I am going to switch over to Office 2013 because once I get the data loaded into Excel I want to use Power View:)
Every time I tried to install the .NET 3.5 framework feature on my Windows 8 machine as well as download and install it from the URL specified I kept receiving the message that it was not able to install and returned the error code 0x800F0906. I did a little searching and came across this KB article from Microsoft:
I mounted my Windows 8 software and ran the following command to install this feature:
Dism /online /enable-feature /featurename:NetFx3 /All /Source:d:\sources\sxs
(where the ‘:d:’ is the drive letter for the Windows 8 media)
Here are the results of running this to get the framework installed from the command prompt:
Once that was finally setup I was able to install the add-in and it was available to load in Excel 2013 (and I am using Office Professional Plus 2013).
A little different experience for me with Excel 2013 instead of Excel 2010, I had to go into my COM Add-Ins and enable “Data Explorer”. Where as previously in Excel 2010 it was installed and enabled.
Here is the ribbon layout in Excel 2013, same as in Excel 2010.
For this example I am going to use the Get External Data->From Web option and tap into some information on the IMDB web site.
IMDB Charts: IMDb Top 250 – http://www.imdb.com/chart/top?ref_=nb_mv_3_chttp
Here is what the web page looks like that I am going to extract data from:
After I click apply the Query window opens and I can see what is available for me to access, in this case I am interested in Table 0.
NOTE: it would have been super cool if the hyperlink URL reference that was on the web site for the movie titles was included in this table, unfortunately it was not, so that forced me to do a little more additional work later on to enrich my data for Power View.
I am going to right-click on the Title column and split the column into two since the year the movie was released is included.
To split the column I am going to use a custom delimiter of a space and left parenthesis.
Now you can see the new column that was created and I have two columns, Title.1 and Title.2
Now I am going to cleanup the Title.2 column and replace the right parenthesis.
I will replace the right parenthesis with nothing and apply that change.
Now you can see the Title.2 column is cleaned up.
I am going to quick browse the values just to verify this and wouldn’t you know, I found one more item that needs to be evaluated.
I filtered for just the records with the 2011/I values and one record is displayed for “The Artist”.
I take a quick look at that move on IMDB and I see that after the title there is that additional (I), so that must be causing the issue.
So I will add another replace condition on the Title.2 column to cleanup that record and replace /I with nothing.
The next thing I need to do is take a look at the Steps pane on the right-hand side and you can see the trail of steps that I have performed so far. You can review them and in this case I am going to go back to the FilteredRows step and delete that step to remove the filter.
I get a warning about deleting a step in my process letting me know that it could impact subsequent steps, but I am okay with that.
Now we have my final dataset that I want to work with and this gets added into a new worksheet in Excel once I click Done in the Query window.
Now with a little of additional work with help from some calls out to the Internet and VBA we can get some additional information about these movies and make a Power View report like this (I made some additional adjustments to the model in PowerPivot as well, but this is more about Data Explorer, so we will skip that side of things):
The Rated ‘R’ movies have the most votes and you can see they have the top ranked moves as well by looking at the card layout.
By clicking on PG-13 in the bar chart in the middle we can filter the entire report – cards, summary data at top of report, as well as the column chart.
Based on the top 250 ranked movies from IMDB we can see that the average PG-13 movie has a rating of 8.2 and “The Dark Knight” has the highest rating of 8.9 with over 900 thousand votes.
I tracked down one of my favorite movies, “The Matrix”, and it was ranked 19 with an overall 8.7 rating.
We can also see down below that starting in the 60’s, the ‘R’ rated movies started to gain some traction and as the years go on those are the more favored movies, quite the difference compared with earlier decades where it was dominated by the ‘Approved’ rating, but maybe ‘R’ was not even a classification back then. More than likely if we had some demographic information on the voters we would probably see some correlation with that as well.
That is it for now, a quick preview of the install process as well as tapping into some data from a web URL, which in this example was IMDB.
You can check out some Getting Started, Tutorials, and How-to information here – Data Explorer Help.