Introduction to Wrangling Data Flows in Azure Data Factory

December 3, 2020 / Mitchell Pearson

Hello! It’s been a while since I’ve done a video on Azure Data Factory. To get back in the flow of blogging on ADF I will be starting with Data Flows, specifically Wrangling Data Flows.

The video can be seen here:

What are Wrangling Data Flows in Azure Data Factory?

Wrangling Data Flows are a method of easily cleaning and transforming data at scale. Huh?

Wrangling Data Flows use the M query language and the UI experience provided by the Power Query Editor in Power BI Desktop. This is a brilliant move by Microsoft to include this technology in Azure Data Factory. Just think of the many people who are currently transforming and cleaning their data in Excel or Power BI Desktop. Now they can take their self-service ETL (extract, transform and load) skills to the enterprise level with ADF.

What makes it scalable? Power Query Editor at Scale.

Wrangling Data Flows allow the developer to use the graphical user interface to do all the hard work with minimal to no code. In the background, all of your UI steps are converted to the M language. At runtime, Azure Data Factory takes that M code, converts it to Spark and then runs your data flow against big data clusters. This means that as your data volumes grow, you should experience consistent performance!
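To make the idea of "UI steps recorded in the background, then executed by a separate engine" a little more concrete, here is a minimal Python sketch. It is only an analogy and not how ADF translates M to Spark: the `Step` class, the step names and the `run` function are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict  # one record, keyed by column name


@dataclass(frozen=True)
class Step:
    """One recorded UI action (hypothetical structure, for illustration only)."""
    name: str
    transform: Callable[[list], list]


# Each click in the editor appends a step; nothing runs until the engine is invoked.
recorded_steps = [
    Step("Filtered rows", lambda rows: [r for r in rows if r["amount"] > 0]),
    Step(
        "Renamed column",
        lambda rows: [
            {("total" if k == "amount" else k): v for k, v in r.items()}
            for r in rows
        ],
    ),
]


def run(steps, rows):
    """Stand-in for the execution engine: replay the recorded steps in order."""
    for step in steps:
        rows = step.transform(rows)
    return rows


sample = [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}]
print(run(recorded_steps, sample))
```

The point of the separation is that the recorded steps describe what to do, while the engine decides how and where to run it. That is what lets the same steps be executed on a bigger engine as the data grows.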

Are there any limitations with Wrangling Data Flows?

Yes… quite a few actually. Wrangling Data Flows are still in preview at the time of this blog and the related YouTube video. Currently there are quite a few operations that just aren’t supported. The most obvious of those operations being promoting header rows and pivoting data. I hope that these features will be available once the product is in GA.

https://docs.microsoft.com/en-us/azure/data-factory/wrangling-data-flow-functions#known-unsupported-functions

As always, thank you for reading my blog and watching my YouTube videos! Have a great day!!

Other Azure Data Factory resources!

Azure Data Factory – Metadata Activity (Part 1)
Azure Data Factory – Stored Procedure Activity (Part 2)
Azure Data Factory – Lookup and If Condition activities (Part 3)
Azure Data Factory – Foreach and Filter activities (Part 4)
Azure Data Factory – Copy and Delete Activities (Part 5)
Azure Data Factory – Web Activity / Executing a Logic App (Part 6)
Azure Data Factory – Executing a Pipeline from Azure Logic Apps (Part 7)

Azure Data Factory–Rule Based Mapping and This($$) Function

May 18, 2020 (updated December 3, 2020) / Mitchell Pearson

Hello! This is the eighth video in a series of videos that will be posted on Azure Data Factory! Feel free to follow this series and other videos I post on YouTube! Remember to like, subscribe and encourage me to keep posting new videos!

Azure Data Factory – Metadata Activity (Part 1)
Azure Data Factory – Stored Procedure Activity (Part 2)
Azure Data Factory – Lookup and If Condition activities (Part 3)
Azure Data Factory – Foreach and Filter activities (Part 4)
Azure Data Factory – Copy and Delete Activities (Part 5)
Azure Data Factory – Web Activity / Executing a Logic App (Part 6)
Azure Data Factory – Executing a Pipeline from Azure Logic Apps (Part 7)

Schema flexibility and late schema binding really separate Azure Data Factory from its on-prem rival SQL Server Integration Services (SSIS). This video focuses on leveraging the capability of flexible schemas and how rules can be defined to map changing column names to the sink.

Rule Based Mapping

Rule based mapping in ADF allows you to define rules where you can map columns that come into a data flow to a specific column. For example, you can map a column that has ‘date’ anywhere in the name to a column named ‘Order_Date’. This ability to define rule-based mappings allows for very flexible and reusable data flows. In the video below I walk through and explain how to set this up inside of a Select transform. Enjoy!
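As a rough illustration of the idea (not ADF's own expression syntax), here is a small Python sketch of a rule that sends any column with ‘date’ in its name to ‘Order_Date’. The `RULES` list and `map_columns` function are hypothetical names, and the choice to drop columns that match no rule is an assumption of this sketch.

```python
import re

# Hypothetical rules: (pattern tested against the incoming column name, output column name).
RULES = [
    (re.compile(r"date", re.IGNORECASE), "Order_Date"),
]


def map_columns(incoming, rules=RULES):
    """Return {incoming_name: output_name} for every column that matches a rule.

    Assumption of this sketch: columns that match no rule are left out.
    """
    mapping = {}
    for name in incoming:
        for pattern, output_name in rules:
            if pattern.search(name):
                mapping[name] = output_name
                break  # first matching rule wins
    return mapping


# The source file may call the column OrderDate one day and order_date the next;
# the rule maps both to the same sink column.
print(map_columns(["OrderDate", "Customer", "Amount"]))
print(map_columns(["order_date", "Customer", "Amount"]))
```

Because the rule looks at the name pattern instead of a fixed column name, the same mapping keeps working when the incoming schema drifts.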

This ( $$ ) Function in a Derived transform and a Select Transform

The this ($$) function simply returns the name of the column or value of the column depending on where it is used. In this video I show two use cases, one in a Select transform and one in a Derived transform.
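A loose Python analogy may help with the "name or value depending on where it is used" part. This is not ADF expression syntax, and `select_rename` and `derive` are made-up helper names. In the first function the callback receives the column name, in the second it receives the column value.

```python
def select_rename(row, matches, name_as):
    """Name context: the callback gets the matched column's NAME."""
    return {name_as(col): value for col, value in row.items() if matches(col)}


def derive(row, matches, value_as):
    """Value context: the callback gets the matched column's VALUE."""
    return {
        col: (value_as(value) if matches(col) else value)
        for col, value in row.items()
    }


row = {"first_name": " Ada ", "city": " London "}

# Rename every column by upper-casing its name (the values are untouched).
print(select_rename(row, lambda col: True, lambda name: name.upper()))

# Trim every value (the names are untouched).
print(derive(row, lambda col: True, lambda value: value.strip()))
```

In both cases the same placeholder stands for "the current matched column"; what changes is whether you are writing the output name or the output value.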

Video Below:

If you like what you see and want more structured end-to-end training, then check out the training offerings from Pragmatic Works! https://pragmaticworks.com/training

Azure Data Factory–Metadata Activity (Part 1)

April 5, 2020 (updated April 18, 2020) / Mitchell Pearson / 8 Comments

Hello! This is the first video in a series of videos that will be posted on Azure Data Factory! This series will be primarily in video format and can be found on YouTube! Check it out there and if you like, subscribe and encourage me to keep posting new videos!

Metadata Activity in ADF v2

Metadata Activity of a file
Metadata Activity of a folder
Metadata Activity from an Azure SQL Server Table
Debugging and output parameters

Video Below:
