In Azure Data Factory, continuous integration and delivery (CI/CD) involves transferring Data Factory pipelines across different environments such as development, test, UAT and production. This process leverages Azure Resource Manager templates to store the configurations of various ADF entities, including pipelines, datasets, and data flows. This article provides a detailed, step-by-step guide on how to automate deployments using the integration between Data Factory and Azure Pipelines.
Prerequisites
- Azure Data Factory: multiple ADF environments set up for the different stages of development and deployment.
- Azure DevOps: the platform for managing code repositories, pipelines, and releases.
- Git integration: ADF connected to a Git repository (Azure Repos or GitHub).
- Permissions: the ADF contributor and Azure DevOps build administrator permissions are required.
Step 1
Establish a dedicated Azure DevOps Git repository specifically for Azure Data Factory within the designated Azure DevOps project.
Step 2
Integrate Azure Data Factory (ADF) with the Azure DevOps Git repository created in Step 1.
Step 3
Create a developer feature branch in the Azure DevOps Git repository created in Step 1. In ADF, select this feature branch to start development.
Step 4
Begin the development process. For this example, I create a test pipeline named “pl_adf_cicd_deployment_testing” and select Save all.
Step 5
Submit a pull request from the developer feature branch to main.
Step 6
Once the pull request is merged from the developer's feature branch into the main branch, publish the changes from the main branch to the ADF publish branch. Publishing regenerates the ARM templates (JSON files), which are then available in the adf_publish branch of the Azure DevOps ADF repository.
Step 7
ARM templates can be customized to accommodate the different configurations of the Development, Testing, and Production environments. This customization is typically done through the ARMTemplateParametersForFactory.json file, where you specify environment-specific values such as linked service settings, environment variables, and managed links.
For example, in a Testing environment, the storage account might be named teststorageaccount, whereas in a Production environment, it could be prodstorageaccount.
- To create an environment-specific parameters file, go to Azure DevOps ADF Git repo > main branch > linkedTemplates folder and copy “ARMTemplateParametersForFactory.json”.
- Create a parameters_files folder under the root path.
- Paste ARMTemplateParametersForFactory.json into the parameters_files folder and rename it to identify the environment, for example prod-adf-parameters.json.
- Update the parameter values for each environment.
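Before deploying, it can help to confirm that every file in parameters_files is well-formed. The fragment below is a minimal sketch of such a check, not part of the original pipeline. It assumes the parameters_files folder is present in the branch the pipeline checks out, that files follow the `<environment>-adf-parameters.json` naming used above, and that the build agent has PowerShell Core available for the `pwsh` step.

```yaml
# Hypothetical pre-deployment check; not part of the original pipeline.
steps:
  - checkout: self
  - pwsh: |
      # Folder and file-name pattern follow the convention described above.
      $folder = "$(System.DefaultWorkingDirectory)/parameters_files"
      $files = @(Get-ChildItem -Path $folder -Filter "*-adf-parameters.json")
      if ($files.Count -eq 0) { throw "No parameter files found in $folder" }
      foreach ($file in $files) {
        # ConvertFrom-Json raises an error on malformed JSON, which fails the step.
        $json = Get-Content -Path $file.FullName -Raw | ConvertFrom-Json
        if ($null -eq $json.parameters) {
          throw ("{0} has no 'parameters' section" -f $file.Name)
        }
        # List the parameter names so reviewers can compare environments.
        $names = $json.parameters.PSObject.Properties.Name -join ', '
        Write-Host ("{0}: {1}" -f $file.Name, $names)
      }
    displayName: Check environment parameter files
```

The step only checks structure. It does not verify that the values are right for the target environment, so each file still needs a manual review.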
Step 8
To create an Azure DevOps CI/CD pipeline, use the following code, and update the variables to match your environment before running it. The pipeline deploys from one ADF environment to another, such as from Test to Production.
name: Release-$(rev:r)
trigger:
  branches:
    include:
      - adf_publish
variables:
  azureSubscription: <Your subscription>
  SourceDataFactoryName: <Test ADF>
  DeployDataFactoryName: <PROD ADF>
  DeploymentResourceGroupName: <PROD ADF RG>
stages:
  - stage: Release
    displayName: Release Stage
    jobs:
      - job: Release
        displayName: Release Job
        pool:
          vmImage: 'windows-2019'
        steps:
          - checkout: self
          # Stop ADF triggers in the target factory before deploying
          - task: AzurePowerShell@5
            displayName: Stop Triggers
            inputs:
              azureSubscription: '$(azureSubscription)'
              ScriptType: 'InlineScript'
              Inline: |
                $triggersADF = Get-AzDataFactoryV2Trigger -DataFactoryName "$(DeployDataFactoryName)" -ResourceGroupName "$(DeploymentResourceGroupName)"
                if ($triggersADF.Count -gt 0) {
                  $triggersADF | ForEach-Object { Stop-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $_.Name -Force }
                }
              azurePowerShellVersion: 'LatestVersion'
          # Deploy ADF using the ARM template and the production JSON parameters file
          - task: AzurePowerShell@5
            displayName: Deploy ADF
            inputs:
              azureSubscription: '$(azureSubscription)'
              ScriptType: 'InlineScript'
              Inline: |
                New-AzResourceGroupDeployment `
                  -ResourceGroupName "$(DeploymentResourceGroupName)" `
                  -TemplateFile "$(System.DefaultWorkingDirectory)/$(SourceDataFactoryName)/ARMTemplateForFactory.json" `
                  -TemplateParameterFile "$(System.DefaultWorkingDirectory)/parameters_files/prod-adf-parameters.json" `
                  -Mode "Incremental"
              azurePowerShellVersion: 'LatestVersion'
          # Restart ADF triggers in the target factory after deploying
          - task: AzurePowerShell@5
            displayName: Restart Triggers
            inputs:
              azureSubscription: '$(azureSubscription)'
              ScriptType: 'InlineScript'
              Inline: |
                $triggersADF = Get-AzDataFactoryV2Trigger -DataFactoryName "$(DeployDataFactoryName)" -ResourceGroupName "$(DeploymentResourceGroupName)"
                if ($triggersADF.Count -gt 0) {
                  $triggersADF | ForEach-Object { Start-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $_.Name -Force }
                }
              azurePowerShellVersion: 'LatestVersion'
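The pipeline above starts every trigger in the target factory after deployment, including any that were deliberately stopped beforehand. If that matters in your environment, one possible variation is sketched below: the stop step records which triggers were running, and the restart step starts only those. This is an illustration under assumptions, not the article's method. The variable name `startedTriggers` is hypothetical, and the sketch assumes trigger names contain no commas.

```yaml
# Hypothetical variation of the stop/restart steps; the deploy step goes between them.
steps:
  - task: AzurePowerShell@5
    displayName: Stop running triggers and remember them
    inputs:
      azureSubscription: '$(azureSubscription)'
      ScriptType: 'InlineScript'
      azurePowerShellVersion: 'LatestVersion'
      Inline: |
        # Keep only triggers that are currently running.
        $started = @(Get-AzDataFactoryV2Trigger -DataFactoryName "$(DeployDataFactoryName)" -ResourceGroupName "$(DeploymentResourceGroupName)" |
          Where-Object { $_.RuntimeState -eq 'Started' })
        foreach ($trigger in $started) {
          Stop-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $trigger.Name -Force
        }
        # Pass the names to later steps as a pipeline variable (hypothetical name).
        $names = ($started | ForEach-Object { $_.Name }) -join ','
        Write-Host "##vso[task.setvariable variable=startedTriggers]$names"
  - task: AzurePowerShell@5
    displayName: Restart previously running triggers
    inputs:
      azureSubscription: '$(azureSubscription)'
      ScriptType: 'InlineScript'
      azurePowerShellVersion: 'LatestVersion'
      Inline: |
        # Pipeline variables are exposed to later steps as upper-cased environment variables.
        $names = @("$env:STARTEDTRIGGERS" -split ',' | Where-Object { $_ })
        foreach ($name in $names) {
          Start-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $name -Force
        }
```

A trigger that is newly added by the deployment is not in the recorded list, so it would stay stopped until started by hand or by an extra step.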
Triggering the Pipeline
The Azure DevOps CI/CD pipeline triggers automatically whenever changes reach the adf_publish branch, which happens when you publish from the main branch in ADF after a merge. It can also be started manually or set to run on a schedule for periodic deployments.
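As a rough sketch, the automatic and scheduled options could be combined in the pipeline YAML as follows. The cron expression and display name are illustrative assumptions. Azure Pipelines evaluates cron schedules in UTC.

```yaml
# Run on every change to adf_publish ...
trigger:
  branches:
    include:
      - adf_publish

# ... and also on a schedule (example: Mondays at 02:00 UTC; adjust as needed).
schedules:
  - cron: '0 2 * * 1'
    displayName: Weekly scheduled deployment
    branches:
      include:
        - adf_publish
    always: true   # run even when the branch has not changed since the last run
```

Manual runs need no YAML: the pipeline can be queued at any time from the Azure DevOps Pipelines page.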
Monitoring and Rollback
To monitor pipeline execution, use the Azure DevOps pipeline dashboards. If a rollback is necessary, revert to a previous version of the ARM templates or pipelines in Azure DevOps and redeploy.
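One way to make such a rollback repeatable is to let a manual run redeploy an earlier commit of the adf_publish branch. The fragment below is a minimal sketch of that idea. The parameter name `rollbackCommit` and its `latest` sentinel value are hypothetical, and the deployment steps from Step 8 are assumed to follow the checkout.

```yaml
# Hypothetical runtime parameter shown when the pipeline is queued manually.
parameters:
  - name: rollbackCommit
    displayName: Commit of adf_publish to redeploy ('latest' for the newest)
    type: string
    default: 'latest'

steps:
  - checkout: self
    fetchDepth: 0   # fetch full history so an older commit can be checked out
  # Inserted only when a specific commit was requested.
  - ${{ if ne(parameters.rollbackCommit, 'latest') }}:
    - script: git checkout ${{ parameters.rollbackCommit }}
      displayName: Switch working tree to the chosen commit
  # ... stop triggers, deploy, and restart triggers as in Step 8 ...
```

Because the deployment in Step 8 uses Incremental mode, redeploying an older template does not remove resources that were added after that commit. Those would need to be cleaned up separately.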
Updated Nov 13, 2024
Version 4.0MUA
Microsoft
Data Architecture Blog