Hey everyone, back in October I did a three-hour live stream on YouTube introducing Azure. A big part of those three hours focused on Azure Data Factory. In this post, I am responding to one of the questions I received during that live stream, with a blog post and an accompanying YouTube video.
Is there a way to create a secure connection between Azure Data Factory and Azure SQL DB?
Check out my YouTube video showing how to set up Managed Virtual Networks and Private Endpoints:
Let’s first take a look at the two methods I discussed in the live stream. First, I showed how to add a firewall rule for the IP address of the Azure VM that was making the connection from Azure Data Factory. This method is problematic because that IP address is not static and will change, so adding the IP address is not a permanent fix. The second method I showed was turning on Allow Azure Services. This will work, but many companies consider it a security risk.
When Allow Azure Services is enabled, any Azure resource, including resources in other customers' subscriptions, can try to authenticate to your Azure SQL DB, and that’s a problem for many organizations.
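For context, this setting is stored on the server as a firewall rule whose start and end addresses are both 0.0.0.0. Here is a minimal sketch for spotting it, assuming the firewall rules have already been exported as a list of dictionaries. The helper name, the field names, and the sample rules are my own illustrative assumptions, and nothing here calls Azure.

```python
def allows_all_azure_services(rules):
    """Return True if any rule spans exactly 0.0.0.0 to 0.0.0.0."""
    return any(
        rule.get("startIpAddress") == "0.0.0.0"
        and rule.get("endIpAddress") == "0.0.0.0"
        for rule in rules
    )


# Hypothetical exported rules: one per-IP rule and one "allow Azure" rule.
sample_rules = [
    {"name": "adf-vm", "startIpAddress": "203.0.113.10", "endIpAddress": "203.0.113.10"},
    {"name": "AllowAllWindowsAzureIps", "startIpAddress": "0.0.0.0", "endIpAddress": "0.0.0.0"},
]

print(allows_all_azure_services(sample_rules))
```

A check like this can help when auditing servers before moving them to private endpoints.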
Managed Virtual Network (VNet) Connections and Private Endpoints in Azure Data Factory
Creating private endpoints to all your services in Azure is recommended as a best practice, so we will cover the necessary steps here.
Creating a secure connection between your Azure services is a three-step process.
- Create an Azure Integration Runtime and enable Virtual Network Configuration
- Create a Managed Private Endpoint to the Azure Service (Azure SQL DB, Azure Storage, etc.)
- Approve the private endpoint request through the Private Link Center
Azure Integration Runtime with Managed VNET in ADF and Synapse
Integration runtimes are the compute used to move data. You are billed based on the number of Data Integration Units (DIUs) used during the data movement process. To securely move your data in a managed virtual network, you first need to make sure that your Azure Integration Runtime is created within a managed virtual network. This can be configured when you provision the Data Factory for the first time, or later from the Manage tab.
Note: At the time of this writing/video, Azure Synapse workspaces require that you configure this property when you are provisioning the Synapse resource. If you create your Synapse workspace and you do not enable virtual network configuration, you will not be able to enable it after the fact. Here is a screenshot from the Microsoft documentation on this:
Here are the steps to create an Integration Runtime within a Managed Virtual Network.
- Select the Manage tab in Data Factory / Synapse, then select Integration runtimes
- Click on + New
- Select Azure when prompted.
- On the next screen, name your Integration Runtime and enable Virtual Network Configuration
- Click Create.
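If you prefer to see what those portal clicks amount to, here is a minimal sketch of the integration runtime definition as a request body. It assumes the Data Factory resource shape I have seen in exported ARM templates: a runtime of type `Managed` with a `managedVirtualNetwork` reference to a managed virtual network named `default`. The helper name and the runtime name are hypothetical, and the code only builds the JSON; it does not call Azure.

```python
import json


def build_managed_vnet_ir(name, vnet_name="default"):
    """Build an Azure IR definition that references a managed virtual network."""
    return {
        "name": name,
        "properties": {
            "type": "Managed",  # Azure (managed) integration runtime
            "typeProperties": {
                "computeProperties": {"location": "AutoResolve"}
            },
            # Assumed shape of the managed virtual network reference.
            "managedVirtualNetwork": {
                "type": "ManagedVirtualNetworkReference",
                "referenceName": vnet_name,
            },
        },
    }


ir_definition = build_managed_vnet_ir("ManagedVnetIR")
print(json.dumps(ir_definition, indent=2))
```

Treat the property names as something to verify against the current REST API reference before using them in a deployment.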
How to create Managed Private Endpoints
Once the Integration Runtime with the Managed Virtual Network has been created, you need to create managed private endpoints. A managed private endpoint uses a private IP address in the managed virtual network to connect your ADF and Synapse pipelines to a specific resource. Therefore, you will create a private endpoint for each data store (Blob, ADLS, Azure SQL DB) that you wish to securely connect to.
To create a managed private endpoint in Azure Data Factory and Synapse, go to your Manage hub, click on Managed private endpoints, and then click New. Keep in mind that this option is disabled and unavailable until you have created the Integration Runtime with the managed virtual network.
Next, choose the resource in Azure that you want to connect to.
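Behind the scenes, choosing a resource comes down to two values: the resource ID of the target and the sub-resource (group ID) you want to reach on it. Here is a minimal sketch, assuming the commonly documented group IDs `sqlServer` for Azure SQL DB, `blob` for Blob storage, and `dfs` for ADLS Gen2. The helper, the dictionary keys, and the placeholder resource ID are hypothetical, and the code only builds the request body.

```python
# Assumed mapping from data store kind to private link group ID.
GROUP_IDS = {
    "azure_sql": "sqlServer",
    "blob": "blob",
    "adls_gen2": "dfs",
}


def build_managed_private_endpoint(resource_id, store_kind):
    """Build the body for a managed private endpoint to one data store."""
    if store_kind not in GROUP_IDS:
        raise ValueError(f"Unknown data store kind: {store_kind}")
    return {
        "properties": {
            "privateLinkResourceId": resource_id,
            "groupId": GROUP_IDS[store_kind],
        }
    }


# Placeholder resource ID for an Azure SQL logical server.
sql_server_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/example-rg/providers/Microsoft.Sql/servers/example-sql"
)
endpoint_body = build_managed_private_endpoint(sql_server_id, "azure_sql")
print(endpoint_body)
```

Because each endpoint targets one resource and one group ID, you would build one body like this per data store.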
Azure Private Link Center and Approving Private Endpoints
Once the private endpoint has been created, it will be in a “Pending” state and will need to be approved. You can approve a private endpoint from the specific resource, or you can go to the Azure Private Link Center.
In Azure, search for Private Link and then select Private Link from the list of services returned.
Once in the Private Link Center, go to Pending Connections. From here, you can approve, reject, or remove any connections that may be pending. In my screenshot I don’t have any pending connections because I approved them in the video!
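The approval step can also be thought of as a state change on the connection, from "Pending" to "Approved". Here is a minimal sketch, assuming private endpoint connections have been listed as dictionaries with a `privateLinkServiceConnectionState` property, which is the shape I would expect from the resource provider. The helper names and sample connections are hypothetical, and the code does not call Azure.

```python
def pending_connections(connections):
    """Return only the connections still waiting for approval."""
    return [
        conn
        for conn in connections
        if conn["properties"]["privateLinkServiceConnectionState"]["status"] == "Pending"
    ]


def approval_body(description="Approved for ADF managed VNet"):
    """Build the body that would mark a connection as approved."""
    return {
        "properties": {
            "privateLinkServiceConnectionState": {
                "status": "Approved",
                "description": description,
            }
        }
    }


# Hypothetical connections on an Azure SQL server.
sample_connections = [
    {"name": "adf-sql-endpoint", "properties": {"privateLinkServiceConnectionState": {"status": "Pending"}}},
    {"name": "older-endpoint", "properties": {"privateLinkServiceConnectionState": {"status": "Approved"}}},
]

for conn in pending_connections(sample_connections):
    print(conn["name"], "->", approval_body()["properties"]["privateLinkServiceConnectionState"]["status"])
```

Whoever approves the request needs permissions on the target resource, which is why this step is often done by a different person than the one who built the pipeline.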
Wrapping it up
If you’re like me, networking is a tough topic. I come from a background of writing code, developing solutions, and performance tuning. In on-prem development scenarios, I let the specialists handle things like networking. With Azure, the developer can branch out and learn new things! I hope you enjoyed this blog / video series. Thanks for reading!