Mitchellsql

Skip to content
  • Home
  • About
Search

Microsoft Cloud

Azure Data Factory – Get Metadata Activity

June 11, 2018June 11, 2018 / Mitchell Pearson / 7 Comments

Welcome to part one of a new blog series I am beginning on Azure Data Factory. In this first post I am going to discuss the get metadata activity in Azure Data Factory. In this post you are going to see how to use the get metadata activity to retrieve metadata about a file stored in Azure Blob storage and how to reference the output parameters of that activity. In part two of this blog series I am going to show you how to use the stored procedure activity to load the metadata information into a table in Azure SQL Database. Take a look at the below design pattern:

Design Pattern_Blog Image

In this blog post you are specifically going to learn the following three items:

  1. Set up and configuration of the activity.
  2. How to read the outputs.
  3. How to reference output parameters from the Get Metadata activity.

Part 1: Setting up the Get Metadata activity.

First, I am going to create a new pipeline and then add the Get Metadata activity to the pipeline.

1_NewPipeline_GetMetadata

Next, I am going to set up and configure the activity to read from a file I have in Azure Blob storage.

Please note, for this post I assume you know how to create a dataset in Azure Data Factory. If you do not, kindly let me know and I can throw together a quick blog on how that is done!

  • With the Get Metadata activity selected, complete the following tasks:
    1. Click on Dataset in the property window.
    2. Select your dataset from the dropdown, or create a new one that points to your file.
    3. Select the property Last Modified from the fields list.
    4. Select the property Size from the fields list.
    5. Select any other properties you would like to get information about.

image

Part two: How to read the outputs from Get Metadata activity

Now that the activity has been configured, it’s time to run it in debug mode to validate the output parameters.

3_Debug

Once debug completes you can now take a look at the output of the debug execution for any of the activities in your pipeline. We only have the one activity in this example. Click on the output to see the output values for the items selected:

Tip: If you don’t see the output of the debug operation, click in the background of the pipeline to deselect any activities that may be selected. The output of the debug operation is a property on the pipeline, not any particular activity.

4_Outputs

Part 3: How to reference the output parameters

For me, this was the hard part, I discovered early on that there is no “Output Parameter” option defined on any of the activities, this is something I just expected since I come from a background of SQL and SSIS. However, not all is lost, you will find that referencing these output parameters is not that difficult and they have a basic pattern you can work with. But that will come after we cover the basics!

For now, let’s take a look at the basic pattern:

@activity(‘Get Metadata1’).output

This pattern can be broken down into three basic parts.

  1. @activity – specifies you are getting information from an activity in the pipeline.
  2. ‘Get Metadata1’ – This is the name of the activity that you want to get information from.
  3. .output – you are looking for an output from the specified activity.

Unfortunately this part is not complete, now you have to specify exactly which output parameter you want, and you have to figure out how to call that parameter. For example, do you want to retrieve the Last Modified date or the Size? what specific name do you use to call that parameter? For example, @activity(‘Get Metadata1’).output.Last Modified won’t work because Last Modified is the incorrect name of the output parameter, so the challenge now is figuring out what that output parameter name is so you can use that somewhere else in your pipeline.

So how do you get the specific name of the output parameters?

You can get the specific name of the output parameters by taking a look at the output results of the Debug operation. Take another look at the output results and you will see the exact name needed for the output parameter reference. Last Modified is going to be lastModified and the final code to reference the output parameter will look like the following:

@activity(‘Get Metadata1’).output.lastModified

5_ParameterNames

As always, thanks for checking out my blog! Be sure to check out the other blogs in this series to get a better understanding of how to use use the output parameter in other activities.

Advertisement

Recent Posts

  • Working with Averages in DAX using AVERAGEX, SUMMARIZE and DATESINPERIOD
  • Managed Virtual Networks and Private Endpoints in Azure Synapse and Azure Data Factory
  • Dynamically changing title names in Power BI
  • Working with Parameters and Expressions in Azure Data Factory
  • Introduction to Wrangling Data Flows in Azure Data Factory

Recent Comments

VINICIUS AUGUSTUS PA… on Azure Data Factory–Filte…
Mitchell Pearson on Advanced TSQL Takeover
lordi on Advanced TSQL Takeover
Mitchell Pearson on The transaction log for databa…
You cant see me on The transaction log for databa…

Archives

  • January 2021
  • December 2020
  • September 2020
  • May 2020
  • April 2020
  • November 2018
  • October 2018
  • September 2018
  • August 2018
  • July 2018
  • June 2018
  • May 2018
  • August 2017
  • July 2017
  • June 2017
  • April 2017
  • August 2016
  • May 2016
  • April 2016
  • February 2016
  • July 2015
  • June 2015
  • April 2015
  • February 2015
  • January 2015
  • November 2014
  • September 2014
  • August 2014
  • June 2014
  • May 2014
  • February 2014
  • January 2014

Categories

  • Azure Data Factory
  • Azure Logic Apps
  • BIxPress
  • DAX
  • DBA
  • MDX
  • Performance Tuning
  • Power BI
  • PowerApps
  • Problem, Design, Solution
  • Professional / Organization
  • Scripting
  • Speaking
  • SQL Tips
  • SSAS – Analysis Services
  • SSIS
  • T-SQL
  • Task Factory
  • Tips and Tricks
  • Working with R

Meta

  • Register
  • Log in
  • Entries feed
  • Comments feed
  • WordPress.com

Top Posts & Pages

  • Dynamically changing title names in Power BI
  • How to add Data Labels to maps in Power BI
  • ALL vs ALLSELECTED in DAX and Power BI Desktop
  • Managed Virtual Networks and Private Endpoints in Azure Synapse and Azure Data Factory
  • SQL Tips! How to edit the query for Edit TOP 200 Rows in Management Studio (SSMS)
Create a free website or blog at WordPress.com.
Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.
To find out more, including how to control cookies, see here: Cookie Policy
  • Follow Following
    • Mitchellsql
    • Join 93 other followers
    • Already have a WordPress.com account? Log in now.
    • Mitchellsql
    • Customize
    • Follow Following
    • Sign up
    • Log in
    • Report this content
    • View site in Reader
    • Manage subscriptions
    • Collapse this bar