In this article, we are going to talk about Power BI Incremental Refresh. Power BI is the cloud-based business intelligence tool developed by Microsoft. Using Power BI, users can create various kinds of modern reports and dashboards that provide insights and information of their data in a meaningful way. Power BI is a strong and independent BI tool that is available for both Windows desktop and also the cloud. In order to demonstrate the capabilities of Power BI Incremental Refresh, we will leverage Power BI Desktop to develop and author the data model and then publish it to the Power BI cloud where it will be refreshed incrementally.
What is Power BI Incremental Refresh
In data warehousing solutions, often we come across a term known as Incremental Refresh. Incremental Refresh is the process of loading changed or new data from a transactional database into the data warehouse. A transactional database might contain millions of records which will also be present in the data warehouse. However, in order to keep the data warehouse updated with the latest information, we need to refresh it periodically.
If we refresh the entire data warehouse regularly, it will consume lots of resources and might also block the transactional database. So in order to overcome this, we only try to load the data that has changed or arrived fresh after the last time the warehouse was refreshed. This significantly decreases the load on the transactional database as we are just importing a small portion of the data instead of the entire database. In this case, Power BI has introduced a new feature that enables users to define an incremental refresh policy in Power BI and load only changed or new data into the Power BI model. We will look into more detail in the next sections.
Advantages of Power BI Incremental Refresh
There are multiple advantages of implementing incremental refresh within Power BI. These are discussed as follows.
- Lower Refresh Times – In order to refresh the data model from the entire database, it might take a long time. However, in the case of an incremental refresh, since we are bringing in only a small portion of the data, refreshes are generally faster
- More Reliable Queries – As we refresh a Power BI data model, in the background SQL queries are being fired to the database engine. Queries that run for a shorter duration are more reliable as they do not lock the database for long periods
- Less Resource Consumption – Due to the lower runtime of the queries, the database resources consumed are also quite less
Pre-Requisites while setting up Power BI Incremental Refresh
There are some prerequisites to implementing incremental refresh in Power BI.
- Import Data Mode – The data should be imported into the Power BI data model in Import Data mode. Incremental refresh doesn’t work with the Direct Query mode
- Power BI Data Gateway – In order to access the on-premises data sources by the Power BI Service, we need to set up the Data Gateway. This is not mandatory if your source data is present in the cloud
Setting up the Power BI Incremental Refresh in Power BI Desktop
Now that we have some idea about what incremental refresh in Power BI is all about, lets us go ahead and get started with it. In this tutorial, I am going to use Power BI and connect it to an on-premise SQL Server instance that is running on my local machine. I am going to use the popular FactInternetSales table from the AdventureWorksDW database for the demo.
Creating the Report
Head over to Power BI Desktop and set up a connection to the SQL Server instance. Select the database and click on Connect.
Provide the valid connection details to the SQL Server instance and get it connected. Select the appropriate database and the table for which the data is to be imported into the Power BI model. Click Load once completed.
This will import the data into Power BI. To keep things simple, I am going to create a simple report that would use the following two-column and create a bar chart out of it.
- OrderDate – It is a DateTime field
- SalesAmount – It is an integer field
Setting up the range parameters
In order to set up the Power BI Incremental Refresh, we need to dynamically modify the database query that Power BI will generate. This is done by using parameters in Power BI. The parameters will dynamically generate the date values based on the data available in the Power BI data model and will generate the query that would fetch records after the last present data in the model.
For this, we need to create two parameters using the reserved keywords – RangeStart and RangeEnd. Click on Transform and then select Manage Parameters. It is mandatory that the datatype for both these parameters must be of Date/Time. A default datetime value can be added to the parameter that will be used to create the initial query.
Once the parameters are set up correctly, the next step is to filter the Power Query. Click on the button on the column that you want the filters to be added. In this case, we would want the data to be filtered based on the OrderDate values. Select Date/Time Filters and then Custom Filter.
In the next window, we need to apply the range values using the parameters that we have created in the previous step. Apply similar settings to bring in the initial data based on the values defined in the parameters.
Defining the Power BI Incremental Refresh policy
Now that we have configured our query to use the parameters, the next step is to define the incremental refresh policy. This can be done by navigating to the Fields pane on the Power BI Desktop and then right-click on the table on which the policy is to be applied. Since in this case, we have only one table so we will select that and then select Incremental Refresh from the context menu that appears.
In the next pane that appears, you can define the incremental refresh policy by the number of rows of data that needs to be stored and the minimum refresh interval. The minimum refresh interval will bring in the data that is present in the last period, for example on the last day, or last month, etc. I have selected to store data of up to the last 10 years and incremental refresh for every month. This means whenever the data model will be refreshed from the Power BI service, incrementally only the last one month of data will be queried upon.
Once these settings are applied, the final step is to publish the report to Power BI Service and refresh the dataset. Since in this example, I am using SQL Server on-premises, I have also set up the Data Gateway that would enable the Power BI service to refresh the dataset.
Validating the Incremental Refresh query
Now that we have refreshed the dataset, we can verify if the queries being executed against the database engine are actually limited to a month or not. For this, we can set up SQL Server Profiler and view the queries that are being issued against the database.
As you can see in the figure above, the OrderDate is being filtered using two values which are provided by the RangeStart and the RangeEnd parameters.
In this article, we have learned about implementing incremental refresh in Power BI. Setting up an Incremental Refresh has added benefits and takes much less processing time. To learn more about incremental refresh in Power BI, you can follow the official documentation.