Hello Folks!!
As a matter of fact and promised in previous post Power BI : Transform Data into different Format using M-Query with same number of Data rows, this time I comes with an extension of the previous post where we have discussed on how to manage or transfer one format of data into another under the umbrella of Power BI (Post name is mentioned above), this post will even go further and follow no boundaries on the number of rows data has after Header row, i.e. this will overshadow the restriction of number of rows as mentioned in previous post.
As technology grows and time demands, any restriction or limitation in any kind of technology will always degrade the fame of the technology, hence Microsoft too support on NO LIMIT policy, at least this is true in that case.
Sounds quite Interesting !! Indeed ! Let's explore the world of no limitation under the scope of Power BI with the help of our very own example illustrated in previous post.
Let's start the road roller with some illustration.
Requirement :
We have source file which is placed at any location does contains data that is not ideal for Power BI to render with, instead we have very absurd way of formatted data as detailed further, i.e. the very first row having information of the header, i.e. DATA, CLOUD and AZURE and there after, there are rows that contains rows that belongs to subsequent header, i.e. data is like we have Header and then their subsequent rows, then header and its subsequent rows and we do not have any limitation on number of rows to be available for every head, it can be any number as contrast to previous post where number of rows for each header must be same, now, requirement is to load the data into Power BI dataset which can be easily understood by Power BI and reports can be created based on this Power BI Dataset.
Resolution :
Again to proceed any further, we need to understand the source file, here by I am going to explain source file which will help us to understand what actually is required and what exact transformation and logics we are in need to create.
Just to begin with, let's say very first rows of the source file (termed as TABLE A) contains row as "ColA, ColB and ColC, and then onwards, it contains rows that consists of string having text as DATA for all the columns, post that, there are any number of rows of data beneath header DATA row which means all these corresponding rows belongs to DATA only, in a same manner, we have another set of one row that having string value as "CLOUD followed by number of rows that are belongs to CLOUD and so on for rest of the data, i.e. we have AZURE header followed by their subsequent rows with any number of data rows.
Aforementioned detail of Source file elaborated very much detail of the source data, now its right time to explore on how to transform this source file data format into the format which is easily readable by Power BI and processed in the further Reports.
In order to transform data of source file into the format which Power BI can understand, we need to perform transformation as shown in below image, i.e. Table A to Table B where Table B is the new transformed format and which is easily readable by Power BI.
Obviously, this is not something which can be achieved easily and not a straight forward at all, instead we need to apply few tricks to achieve the goal, who can be better friend in such situation other than M-Query, so let's see how M-Query will helps us to get rid of such bad format of the data and to convert into format that will be easily readable by Power BI.
i.e. all the header data now come under one umbrella i.e. under one column, lets say name of the column is Type and rest will go under specific column as previous i.e. ColA, ColB and ColC.
Let's see how we can transform the file with the help of M-Query of Power BI, illustrated and detailed in below mentioned steps.
Note - For Demo purpose, I have created Data shown in Table A via using ENTER DATA.
Step 1 : Open Power BI Desktop
Step 2 : Navigate ribbon to the Home and Click on Get Data
Step 3 : Under Get Data, search for Blank Query
Step 4 : On opened Blank Query, paste the below mentioned M-Query and provide meaningful name such as Dimension and click on OK, this will create a table named Dimension which can be further use in the Dataset.
* Below is the M-Query for the Function : Dimension
Table : Dimension
let
Source = YOUR SOURCE EXCEL,
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ColA", type text}, {"ColB", type text}, {"ColC", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Type", each if ([ColA]=[ColB]) and ([ColB]=[ColC]) then 1 else 0),
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 1, 1, Int64.Type),
#"Added Custom1" = Table.AddColumn(#"Added Index", "Custom", each List.Sum(
List.FirstN(
#"Added Index"[Type],
[Index]
)
)),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom1",{{"Custom", Int64.Type}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type1", each ([Type] = 1)),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"ColB", "Custom"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Custom", "RunningValue"}})
in
#"Renamed Columns"
* Above M-Query will create the Dimension table that having all the values of Header (DATA, CLOUD and AZURE) under one umbrella, i.e. one column will contain all the values of the Header.
* This table is intermediate table and will be used for look up and merging of data further.
* Data in the Dimension table will look like as mentioned below
* Just to synchronise with the project you work, please replace Source < Source = YOUR SOURCE EXCEL> with your source path in a code above.
Step 5 : Navigate ribbon to the Home and Click on Get Data
Step 6 : Under Get Data, search for Blank Query
Step 7 : On opened Blank Query, paste the below mentioned M-Query and provide meaningful name such as "MyData" and click on OK, this will create MyData table in Power BI platform and scope is limited to the complete report level.
M Query for Table : MyData
let
Source = YOUR SOURCE EXCEL,
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ColA", type text}, {"ColB", type text}, {"ColC", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Type", each if ([ColA]=[ColB]) and ([ColB]=[ColC]) then 1 else 0),
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 1, 1, Int64.Type),
#"Added Custom1" = Table.AddColumn(#"Added Index", "Custom", each List.Sum(
List.FirstN(
#"Added Index"[Type],
[Index]
)
)),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom1",{{"Custom", Int64.Type}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"Custom", "RunningValue"}}),
#"Merged Queries" = Table.NestedJoin(#"Renamed Columns", {"RunningValue"}, Dimension, {"RunningValue"}, "Dimension", JoinKind.LeftOuter),
#"Expanded Dimension" = Table.ExpandTableColumn(#"Merged Queries", "Dimension", {"ColB"}, {"ColB.1"}),
#"Renamed Columns1" = Table.RenameColumns(#"Expanded Dimension",{{"ColB.1", "Type Name"}}),
#"Merged Queries1" = Table.NestedJoin(#"Renamed Columns1", {"ColB"}, Dimension, {"ColB"}, "Dimension", JoinKind.LeftAnti),
#"Removed Other Columns" = Table.SelectColumns(#"Merged Queries1",{"ColA", "ColB", "ColC", "Type Name"})
in
#"Removed Other Columns"
* Please replace Source < Source = YOUR SOURCE EXCEL> with your source path in a code above.
Step - 8 - Click OK.
Hurray !!! Here we go, we got the desired result.
As its a nature of world, everything good things comes with some sort of exception as well, this too is also not exception, please check below for the same.
Pros :
1. This will change the complex format of the data into very easy and readable format which Power BI can read easily and process in Reports.
2. No external tool intervention is required to change the format of the data. M-Query does all the magic.
Cons :
1. Some time, it take more time to execute and needs good understanding of the M-Query.
Author - Feel free to ask query if you have any query
Thanks for providing this information
ReplyDeletePower BI Online Course
Power BI Online Training
Thank You
Delete