In part three of my Azure Data Factory series I showed you how the lookup activity could be used to return the output results from a stored procedure. In part one you learned how to use the get metadata activity to return the last modified date of a file. In this blog, we are going to use the if condition activity to compare the output of those two activities. If the last modified date of the file is greater than or equal to the last execution date (the last time the file was processed), then the copy activity will be executed. In other words, the copy activity only runs if new data has been loaded into the file, currently located on Azure Blob Storage, since the last time that file was processed.
Setup and configuration of the If Condition activity
For this blog, I will be picking up from the pipeline in the previous blog post. That pipeline already has the get metadata activity and the lookup activity, so I am going to add the if condition activity and then configure it to read the output parameters from those two activities.
- Expand the category Iteration & Conditionals in the activities pane.
- Select the if condition activity.
- Drag the if condition activity from the activities pane and drop it into the pipeline.
The next step is to configure the if condition activity to execute only after the lookup and get metadata activities complete successfully. This can be accomplished by using the built-in constraints. The default constraint is set to success. This can be changed by selecting the constraint and then right-clicking on it. There are currently four options available for the constraints:
- Successful – default behavior
Configuring the constraints between activities is quite simple. Take a look at the animated gif below:
Now it’s time to set up the configuration of the if condition activity:
- With the if condition activity selected, navigate to the properties pane and rename the activity:
- Name: Check if file is new
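If you prefer to see what the designer is doing behind the scenes, here is a rough sketch of how the dependency constraints and the activity name might look in the pipeline's JSON definition, built as a Python dictionary so it can be printed and inspected. The upstream activity names `Get Metadata1` and `Lookup1` are assumptions on my part (substitute whatever your activities are called), and the `depends_on` helper is purely illustrative. In the JSON, the four dependency conditions are written as `Succeeded`, `Failed`, `Skipped`, and `Completed`.

```python
import json

# Assumed names of the two upstream activities; replace with your own.
GET_METADATA = "Get Metadata1"
LOOKUP = "Lookup1"


def depends_on(activity_name, condition="Succeeded"):
    """Build one dependsOn entry (illustrative helper, not part of ADF).

    condition is one of: Succeeded, Failed, Skipped, Completed.
    """
    return {"activity": activity_name, "dependencyConditions": [condition]}


# The if condition activity waits for both upstream activities to succeed.
if_condition = {
    "name": "Check if file is new",
    "type": "IfCondition",
    "dependsOn": [depends_on(GET_METADATA), depends_on(LOOKUP)],
}

print(json.dumps(if_condition, indent=2))
```

Because both entries use `Succeeded`, the if condition activity should only start once the get metadata and lookup activities have both completed successfully.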
Adding a parameterized expression in Azure Data Factory
Now it is time to configure the settings tab for the if condition activity. The settings tab requires an expression that evaluates to either true or false. In this example, the expression needs to compare the output parameters from each of the previous activities to determine if the file is new since the last load time. The “Add dynamic content” menu will help you build the expression; however, it will not give you the full path to the output parameter. In the previous blog post, I showed how you could identify the exact output parameter names after the debug phase. Let’s take another look at the output results:
Let’s jump right in and build the parameterized expression:
- Select the settings tab from the properties window
- Click in the expression box
- Click the hyperlink that appears below the expression box “Add dynamic content”.
System Variables and Functions in Azure Data Factory
In the Add Dynamic Content window there are some built in system variables and functions that can be utilized when building an expression. In this blog post I am going to use the built in function greaterOrEquals. This function will allow us to compare the two dates from the output parameters of our previous activities (Last Modified date and Last Execution date).
- Expand the functions category
- Next click to expand logical functions.
- Finally, click on the function greaterOrEquals. This function will now appear in the expression box.
Now it’s time to finish building out the expression. The built-in function greaterOrEquals expects two parameters separated by a comma. This is where I will insert the output parameters from the get metadata and lookup activities. As previously mentioned, the dynamic content window will help in referencing those outputs. In the animated gif below, you can see that I use the activity outputs to begin the parameter reference. Then I manually complete the expression by adding the specific parameter names. Remember that we obtained these names by looking at the output of each activity after debugging the pipeline.
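To make the finished expression concrete, here is a minimal sketch of what it might look like, along with a small Python model of the comparison it performs. The activity names `Get Metadata1` and `Lookup1` and the column name `LastExecutionDate` are assumptions; use the exact names you saw in your own debug output. The helper functions are hypothetical and only mimic the intent of the comparison in Python; they are not ADF's own evaluator.

```python
from datetime import datetime

# Assumed activity and column names; check your debug output for the real ones.
EXPRESSION = (
    "@greaterOrEquals("
    "activity('Get Metadata1').output.lastModified, "
    "activity('Lookup1').output.firstRow.LastExecutionDate)"
)


def parse_utc(timestamp):
    """Parse a UTC timestamp such as 2018-06-06T00:00:00Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def file_is_new(last_modified, last_execution):
    """Python model of the check: is last modified >= last execution?"""
    return parse_utc(last_modified) >= parse_utc(last_execution)


print(EXPRESSION)
# File modified before the last execution: nothing new to load.
print(file_is_new("2018-06-06T00:00:00Z", "2018-06-13T00:00:00Z"))  # False
# File modified after the last execution: new data is available.
print(file_is_new("2018-06-14T00:00:00Z", "2018-06-13T00:00:00Z"))  # True
```

The first argument comes from the get metadata activity and the second from the first row returned by the lookup activity, which is why the order of the two parameters matters.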
Adding activities to the If Condition activity
The final step is to add activities to the if condition activity. You have two options under the activities tab: If True activities and If False activities. In other words, what activities would you like to perform if the expression evaluates to true? Alternatively, what activities would you like to perform if the expression evaluates to false? For the sake of simplicity I will add a wait activity to each condition. In the next blog post in this series I will replace the wait activity with the copy activity.
- Click on the Activities tab found in the properties window.
- Click the box “Add If True Activity”
- This will open a pipeline that is scoped only to the if condition activity.
- Add the Wait activity to the new pipeline.
- I named the activity wait_TRUE to help during debug and validation.
Also, pay special attention to the breadcrumb link across the top of this pipeline. This makes navigation easy and also helps identify what scope you are currently developing in. As mentioned previously, the wait activity that was added is scoped to the if condition activity and runs only if the expression evaluates to true.
I want to add a final activity before debugging the pipeline. I want to add a wait activity to the if condition if the expression evaluates to false. To do this I have to navigate back to the if condition activity and select If False Activities under the activities property.
- I will use the breadcrumb link to navigate back to the main pipeline.
- Then select the Activities tab for the If Condition activity.
- Finally, click the box for “Add If False Activity”.
- Add a wait activity to this pipeline and then name it wait_FALSE
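Putting the two branches together, the if condition activity might look roughly like the structure below in the pipeline's JSON definition, again sketched as a Python dictionary. The activity and column names inside the expression are assumptions, the one-second wait time is an arbitrary value I picked for illustration, and `wait_activity` and `branch_for` are hypothetical helpers rather than anything provided by Azure Data Factory.

```python
import json


def wait_activity(name, seconds=1):
    """Build a Wait activity definition (illustrative helper)."""
    return {
        "name": name,
        "type": "Wait",
        "typeProperties": {"waitTimeInSeconds": seconds},
    }


if_condition = {
    "name": "Check if file is new",
    "type": "IfCondition",
    "typeProperties": {
        "expression": {
            # Assumed activity and column names; adjust to your pipeline.
            "value": (
                "@greaterOrEquals("
                "activity('Get Metadata1').output.lastModified, "
                "activity('Lookup1').output.firstRow.LastExecutionDate)"
            ),
            "type": "Expression",
        },
        "ifTrueActivities": [wait_activity("wait_TRUE")],
        "ifFalseActivities": [wait_activity("wait_FALSE")],
    },
}


def branch_for(expression_result):
    """Return the names of the activities that run for a given result."""
    key = "ifTrueActivities" if expression_result else "ifFalseActivities"
    return [a["name"] for a in if_condition["typeProperties"][key]]


print(json.dumps(if_condition, indent=2))
print(branch_for(True))   # ['wait_TRUE']
print(branch_for(False))  # ['wait_FALSE']
```

Only one of the two lists runs on any given execution, which is what makes it safe to swap the wait activity for the copy activity later.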
With everything set up, it’s now time to debug the pipeline. Since the last modified date of the file is 06/06/2018 and the last execution date is 06/13/2018, I would expect the wait activity defined within the If Condition activity's If False pipeline to be executed. As you can see from the results below, the wait_FALSE activity was executed.
Thanks for checking out my blog!