Let me show you a quick way to setup Country and Region for your Data Warehouse or Data Mart. Normally, you expect to find this kind of reference data within your organization in some source system or even better in a MDCP (Master Data Control Point).
If you can’t find this data anywhere, then check out the Azure Data Market.
You can sign up for free here and browse a number of interesting datasets. One of them is the World Development Indicators. Not all the data sets out there are free, but the World Development Indicator is free.
Once you signup, you can go down to Explore the Dataset
You will see a query panel:
Check out the GetCountries dataset
I highly recommend you install the Excel Data Market Excel addin that will make this a breeze:
So here is how it looks in Excel:
Pass parameters, if any and you are done. The parameter I used is en or EN for english.
Here is the result:
There are some aggregate information in there that you can clean up. Also, you may want to clean up the (all income levels).
I am not recommending this as the best way of doing it, but if you find yourself with no reference country, region data, then give Azure Data Market a try.
PowerPivot for Excel is an Excel 2010 add-in that allows users to pull data from multiple sources, mash them up, and then build reports using regular pivot tables. You can even share these reports with others in Microsoft SharePoint (via PowerPivot for SharePoint). In this demo, Julie Strauss, Program Manager for Microsoft SQL Server Analysis Services, shows just how easy it is to get a better view into your data.
Find a nice post here
Today I played around with Netflix data after extracting it using PowerPivot for Excel addin.
Netflix OData feed: http://odata.netflix.com/v2/Catalog/
I wasn’t able to create a relationship between the entities Titles, Genres, People, TitleAudioFormats etc. The Titles data apparently contains duplicates in its TitleId column and the entities were missing the key attributes required for setting up a relationship.
Anyway, here is a shot at it:
Drill to detail:
You can download the file here. You’ll need Excel 2010 for the slicers to work. Due to the volume of the data, I only extracted Titles with release date greater than or equal to 1997.
You may be prompted to download this file as the file is ~40MB.
In this post, I describe how to quickly build a Date Dimension for your data warehouse by leveraging the Azure Data Marketplace and Excel PowerPivot. Rather than building the date table yourself, you can just download the date table from the Azure Data Marketplace using PowerPivot.
Start by signing up at the Azure Data Marketplace https://datamarket.azure.com/. Windows Data Azure Marketplace is envisioned to be the One-Stop shop for premium data and applications.
Next search for DateStream service.
The DateStream service provides all the dates from Year 1900 to 2100. You probably will not need all of the data, but that can be easily filtered out once you have all of the data downloaded. If you are not interested in extracting ~200 years worth of dates, you can apply filters at the source and extract only what you are interested in. Check out this post: http://www.bp-msbi.com/2011/10/range-queries-with-azure-datamarket-feeds/
The DateStream feed currently consists of a number of date tables:
You’ll need to subscribe to this data set. Once subscribed, you will be able to explore this data set right inside the marketplace.
I will download a connection to Excel PowerPivot which will download a ServiceQuery.atomsvc file. You could choose to download just a CSV file if you want.
Doubleclick on the ServiceQuery.atomsvc file and that should open the table import wizard. Provide your account key and proceed importing the data.
After you import the data:
After I downloaded this data, I created a count measure as follows: cnt:=COUNTROWS(DateStream). The table contained 73,414 rows since I didn’t apply any source filters.
I also created an additional column in the table and called it: MonthName2 with the following DAX: =[MonthOfYear]& " - " & [MonthName]. This will help me create a date hierarchy and sort the months in the right order. I switched to the “Diagram View” to create a date hierarchy – simply right click and create a hierarchy and drag and drop the columns that form the hierarchy.
Next, I created a Pivot table in Excel:
You don’t have to create a pivot table if all you are interested is building a Date table in your database. You should already have the data in CSV or Excel that you can import into your relational database.
Today I stumbled upon a site that has hosted some really cool PowerPivot demos.
Go https://insights.hostedpowerpivot.com/sites/Demo/Pages/default.aspx and browse the sample PowerPivot apps below.
Here is one that I created using Pluralsight’s OData feed. I uploaded it to Skydrive, but Skydrive doesn’t support links to data sources, so you’ll need to download it to see it in action.
With PowerPivot you can quickly extract data from a wide variety of data sources and built impressive looking reports right within Excel. In this post, I will extract data from an OData feed made available by Pluralsight training website.
I am on Windows 7 platform with Excel 2010 and PowerPivot.
I will start with a new Excel Spreadsheet and launch PowerPivot. In PowerPivot, click on “From other data sources” and click on “Other Feeds”.
Click Next and enter the Data feed URL: http://www.pluralsight-training.net/odata/
Here is a list of all OData producers: http://www.odata.org/producers
Click Next and it should come back with the source tables that are exposed in this feed:
You can give new friendly names to these source tables. This looks good to me, so I will go ahead and click Finish. This should extract the data into PowerPivot tables:
Take some time to explore and get familiar with data as the next step involves building a relationship.
Click on the Design View in PowerPivot:
Here is the view: note that there is no relationship that currently exists. This is what we will work on next.
To create a relationship between Modules and Authors, right click on Modules and choose “Create Relationship” and click “Create”.
This will create the relationship between Modules and Authors using the Author/AuthorName.
Similarly, I can go ahead and create the relationship amongst all the entities as seen below:
Next we need a Date Table to be our time dimension.
The Date Table can be built easily in Excel. Use a new sheet to generate date data. By exploring the data, I realized that data goes back to 2008. So, enter 1/1/2008 and drag the cell until you cover 2008, 2009, 2010, 2011 and 2012. This should give you about 1827 rows. Next create the formulas for Year, Month, Quarter and Month Name:
Quarter: =IF(C2<4, "Q1", IF(C2<7, "Q2", IF(C2<10, "Q3", "Q4")))
Month Name: =IF(C2=1, "Jan", IF(C2=2, "Feb", IF(C2=3, "Mar", IF(C2=4, "Apr", IF(C2=5, "May", IF(C2=6, "Jun", IF(C2=7, "Jul", IF(C2=8, "Aug", IF(C2=9, "Sep", IF(C2=10, "Oct", IF(C2=11, "Nov", IF(C2=12, "Dec"))))))))))))
Here is a partial screenshot:
Now, you need this as a Date Table in PowerPivot. There are two options to do this: (1) Click “Create Link Table” under the PowerPivot ribbon item in Excel (2) Copy and Paste directly into PowerPivot.
Here is the Date table in PowerPivot:
Before we setup the relationship between the Date table and Modules, we need to fix the PublishDate column in Modules because it contains the Date and Time components in it. In order for our join with Date table to work, we only need PublishDate. Rename PublishDate to PublishDatetime and create a new column PublishDate in Modules table with this formula: =DATEVALUE([PublishDateTime])
The next step is to create the relationship which can be created as follows:
Another thing to do is to create a Date hierarchy as follows:
Right Click>Create Hierarchy
Give a name to the hierarchy
Drag and drop the levels of the hierarchy
Almost there with the data model, we just need to create a couple of new measures: (1) Total Modules (2) Total Courses. You could possibly create more measures with the data we have, but we are good for now.
Switch to Data View in PowerPivot and go to Modules table. Create a new measure as follows: Total Modules:=COUNTROWS(Modules)
Next, In the Courses table: Create Total Courses:=COUNTROWS(Courses)
With that we are now ready to create our reports.
Create a new PivotTable>Chart and Table (Horizontal)
You get a blank slate. You can now start to drag and drop Objects from the Pivot Field List:
Drag and Drop the Course attributes as follows:
Row Labels: Category, Title
Slicer Vertical: Category
Values: Total Courses
Axis Fields: Category
Values: Total Courses
Here is the view I get:
Let’s take a few minutes to beautify the table and the chart:
Table: Highlight the table and choose the table format you like:
Chart: Highlight the chart, go to Design and apply the template you like:
You can hide the Axis Field Buttons on the Chart since we already have a slicer setup which is a much better way of slicing-dicing the data.
That’s it. Feel free to explore with Pivot Tables and Chart and create reports for the other Entities in PowerPivot database.
Here is the final Spreadsheet that I ended up creating: https://skydrive.live.com/redir.aspx?cid=d3aff4f279dff93d&resid=D3AFF4F279DFF93D!703&parid=D3AFF4F279DFF93D!701
If you’d like to follow a step-by-step tutorial on DAX, be sure to check out the labs here > Defining DAX Calculations with Microsoft PowerPivot for Excel 2010.
Let’s create some DAX calculations in this existing Excel 2010 report:
Go to PowerPivot>PowerPivot Window and note that we have the following tables with already extracted data from AdventureWorks SQL database:
[US Sales], [Product], [Category], [Subcategory], [Date], [US Geography], [US Population]
Add a new calculation DAX:=[SalesAmount] - [TotalProductCost]
Note support for intellisense in PowerPivot:
Create StateCode2 DAX as: =[StateCode]
Create a Relationship with US Population table:
Note: that PowerPivot does not support creating more than one relationship for a column. That’s why we are created a State2 DAX Calculation.
Go to US Population table and create a new DAX Calculation:=SUMX(RELATEDTABLE('US Sales'), 'US Sales'[SalesAmount]) / [Population]
To switch back to Excel:
Note that you need to click on the Refresh Button to see the new calculations:
Add GrossProfit to the report. In order to change the name and the formatting, right click on any cell under GrossProfit and choose “Value Field Settings”:
Change the name of the measure to Profit:
Apply a Number Format to Gross Profit to Comma Separated with Zero decimal places:
Lets apply some conditional formatting:
Now let’s add a new Measure to the PivotTable by going to PowerPivot ribbon and selecting “New Measure” and add the below DAX:=SUM('US Sales'[GrossProfit]) / SUM('US Sales'[SalesAmount]) . Note that Intellisense works inside the Measure Settings Dialog.
Rename Measure to Profit% and apply Percent (%) formatting:
Apply conditional formatting as we did earlier: