Dataflows

Power Query allows you, whether you use Excel or Power BI, to connect to numerous data sources, retrieve, transform, and use “refined” data to create reports. On that occasion, each step in the transformation process is recorded and, when you refresh the link to the source, the steps are repeated in the given order and you get tables of identical structure. What is a purpose of Dataflows?

Dataflows allow you to create scenarios for retrieving and transforming data without the need to create multiple connections, and can also be used with multiple datasets. They make it easier to work with large databases, prevent analysts from directly accessing data and make it easier for system administrators to work, especially when planning when to refresh data. Most importantly, the “purified” data remains in the cloud, ready for further use.

You must have Power BI Professional or Power BI Premium licenses to use dataflows. You cannot create them within the My Workspace, but you must use another existing workspace. In this example, we’ll use a workspace called MyDSC. To create a dataflow we will go to this workspace, and then from the New menu select suitable option.

Next, we choose whether we want to create a data stream based on a new source (Define new entities) or from an existing data model (Import model).

Selecting the first option will open a window in which we are offered connectors to various data sources. If we choose to import from a local file or database we need to connect using the On-premises data gateway. We connect to cloud services or databases by entering credentials, and it is even possible to enter a table or write a query manually.

When we import data we can perform all those transformations that we would otherwise do using Power Query.

If you enter the View ribbon you will see that there is a Diagram View option which is (at the time this post is written) in the experimental phase. If you turn it on, a data transformation diagram will be displayed and within it each phase of the transformation. Show Step Names will display the full name of the steps in the diagram, and Show Step Labels the abbreviated names. The Auto-highlight related queries option will highlight all related queries. Clicking Save & Close will save all transformations within the dataflow and give it a name.

Once we have created the dataflow we can also set the refresh period. With Power BI Professional licenses, the shortest refresh period is 30 minutes, while with Power BI Premium service it can be shorter. At the time of refresh, the dataflow will connect to the sources, perform the necessary transformations and save the data. Further, you access the data flow from the Power BI Desktop application, where you can use it as one of the sources and make reports based on the imported data.