Dan English's BI Blog

Welcome to my random thoughts in regards to Business Intelligence, databases, and other technologies

Archive for October, 2022

Power BI using Synapse Serverless Pool with Delta Lake format

Posted by denglishbi on October 6, 2022

In this post I just wanted to show the ability to use the Delta Lake format that is very common now with Power BI. I will go over a quick example of creating the files to reference, building a view to use with Power BI, and then querying the data in Power BI.

In my Synapse Workspace I created a Linked service connection to an Azure SQL Database that has the AdventureWorksLT database loaded which is the sample database you can create when you first create a SQL instance in Azure and here is a walkthrough link and see the Additional settings section.

I am going to use this database as the source to generate the Delta Lake format files in an Azure Data Lake storage account. To do this I created a Data flow in my Synapse Workspace and referenced two tables as my source, one being the Product table and the other the Product Category table. Really basic example, just really want to highlight the ability to easily reference the Delta Lake format files that are in the storage account within Power BI.

Synapse Data flow screenshot
Synapse Data flow screenshot

On the source side I am referencing the Linked server connection that I already established against the AdventureWorksLT sample database in Azure SQL as you can see below in the Source settings and above you can see I am actually providing a query to simply extract just the columns I am interested in on the Source options tab.

Synapse Data flow source settings screenshot
Synapse Data flow source settings screenshot

This setup is done for both the sources in the Data flow and then on the sink side the target destination is going to be an Azure Data Lake storage account. For the Inline dataset type I specified Delta.

Synapse Data flow sink screenshot
Synapse Data flow sink screenshot

And on the sink settings tab you just need to provide the folder path and compression information as shown below for where the product table Delta Lake files will land.

Synapse Data flow sink settings screenshot
Synapse Data flow sink settings screenshot

After the Data flow is configured and in this example I am just doing the Product and Product Category tables then add the Data flow into a Synapse Pipeline and Trigger the Pipeline to run to create the files in the Azure Data Lake. After you add the Data flow activity to the Pipeline you just need to configure the settings as shown below and then under Add trigger do the Trigger now option. This should just take a few minutes to run to queue the job and spin up the resources.

Synapse Pipeline Data flow screenshot
Synapse Pipeline Data flow screenshot

Now in the Synapse Workspace I can go into the Data hub and view the files that are in the linked data lake storage account as shown below.

Synapse Data hub linked data lake screenshot
Synapse Data hub linked data lake screenshot

With the files in the Azure Data Lake we are now going to reference these with the Synapse Serverless Pool. To make it easier to reference these in Power BI I will actually create a view that combines the data from both tables to include the columns of information that I want for my report. The first thing will be to create a database in the Synapse Serverless Pool and this will store the view that we will reference in the report.

Synapse SQL Serverless script screenshot
Synapse SQL Serverless script screenshot

In the screenshot above I have the script to create the database. After that gets created as you can see in the screenshot in the top-right I connect to that database and then that is when the script to create the view is run.

If you query the view or run the SELECT statement above you will see the results as shown below in the screenshot.

Query results from Delta Lake format files in ADLS screenshot
Query results from Delta Lake format files in ADLS screenshot

In Power BI Desktop now you can Get Data and by using the Synapse Serverless SQL endpoint that you can get from the Azure portal on the Synapse Workspace Overview page you can make a connection to the servless_ondemand database we created and select the vwProductCategory view to include in the report.

Power BI Desktop Get Data Synapse Serverless SQL screenshot
Power BI Desktop Get Data Synapse Serverless SQL screenshot

By know means is this a great end to end example, just simply show casing the ability to reference the Delta Lake format files in Power BI, but you can create a report and in DirectQuery mode query and interact with the information that is in the Azure Data Lake with Synapse Serverless Pool that is stored in Delta Lake format!

Power BI Desktop report screenshot
Power BI Desktop report screenshot

This was a pretty brief overview, but hope you can see the possibilities of using the Synapse Serverless capabilities to easily reference the data in an Azure Data Lake and query the data directly using Power BI and in doing this we even created a view that joined the data from two different tables (files) in the lake!

Posted in Azure, Power BI, Synapse | Tagged: , , | Leave a Comment »