Data Architecture Blog

Automated Continuous Integration and Delivery (CI/CD) in Azure Data Factory

MUA
Microsoft
Nov 13, 2024

In Azure Data Factory, continuous integration and delivery (CI/CD) involves transferring Data Factory pipelines across different environments such as development, test, UAT and production. This process leverages Azure Resource Manager templates to store the configurations of various ADF entities, including pipelines, datasets, and data flows. This article provides a detailed, step-by-step guide on how to automate deployments using the integration between Data Factory and Azure Pipelines.

Prerequisites

  • Azure Data Factory: multiple ADF environments set up for the different stages of development and deployment.
  • Azure DevOps: the platform for managing code repositories, pipelines, and releases.
  • Git integration: ADF connected to a Git repository (Azure Repos or GitHub).
  • Permissions: the ADF contributor and Azure DevOps build administrator permissions are required.

Step 1

Establish a dedicated Azure DevOps Git repository specifically for Azure Data Factory within the designated Azure DevOps project.

Step 2

Integrate Azure Data Factory (ADF) with the Azure DevOps Git repository that was created in the first step.

Step 3

Create a developer feature branch in the Azure DevOps Git repository that was created in the first step. Then select that feature branch in ADF to start development.

Step 4

Begin the development process. For this example, I create a test pipeline named “pl_adf_cicd_deployment_testing” and save all changes.

Step 5

Submit a pull request from the developer feature branch to main.

Step 6

Once the pull request is merged from the developer's feature branch into the main branch, publish the changes from the main branch to the ADF publish branch. Publishing updates the ARM templates (JSON files), which are then available in the adf_publish branch of the Azure DevOps ADF repository.

Step 7

ARM templates can be customized to accommodate different configurations for the Development, Testing, and Production environments. This customization is typically done through the ARMTemplateParametersForFactory.json file, where you specify environment-specific values such as linked services, environment variables, and managed links.

For example, in a Testing environment, the storage account might be named teststorageaccount, whereas in a Production environment, it could be prodstorageaccount.

  1. To create an environment-specific parameters file, go to the Azure DevOps ADF Git repo > main branch > linkedTemplates folder and copy “ARMTemplateParametersForFactory.json”.
  2. Create a parameters_files folder under the root path.
  3. Paste ARMTemplateParametersForFactory.json into the parameters_files folder and rename it to identify the environment, for example prod-adf-parameters.json.
  4. Update the parameter values for each environment.
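
If you keep one parameters file per environment, the pipeline can pick the right one through a variable instead of a hard-coded path. The fragment below is only a sketch under stated assumptions: TargetEnvironment and ParametersFile are hypothetical variable names that do not appear in the pipeline in Step 8, and the file naming follows the prod-adf-parameters.json example above.

```yaml
variables:
  # Hypothetical variable: the environment prefix used when naming the file.
  TargetEnvironment: prod
  # Hypothetical variable: expected to resolve at run time to
  # <working directory>/parameters_files/prod-adf-parameters.json
  ParametersFile: '$(System.DefaultWorkingDirectory)/parameters_files/$(TargetEnvironment)-adf-parameters.json'
```

The deployment step could then pass "$(ParametersFile)" to -TemplateParameterFile, so switching environments would mean changing a single variable.
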
Step 8

To create an Azure DevOps CI/CD pipeline, use the following code, and update the variables to match your environment before running it. The pipeline deploys from one ADF environment to another, such as from Test to Production.

name: Release-$(rev:r)

trigger:
  branches:
    include:
      - adf_publish
variables:
  azureSubscription: <Your subscription>
  SourceDataFactoryName: <Test ADF>
  DeployDataFactoryName: <PROD ADF>
  DeploymentResourceGroupName: <PROD ADF RG>

stages:
- stage: Release
  displayName: Release Stage
  jobs:
    - job: Release
      displayName: Release Job
      pool:
        vmImage: 'windows-latest'
      steps:
        - checkout: self

        # Stop ADF Triggers
        - task: AzurePowerShell@5
          displayName: Stop Triggers
          inputs:
            azureSubscription: '$(azureSubscription)'
            ScriptType: 'InlineScript'
            Inline: |
              $triggersADF = Get-AzDataFactoryV2Trigger -DataFactoryName "$(DeployDataFactoryName)" -ResourceGroupName "$(DeploymentResourceGroupName)"
              if ($triggersADF.Count -gt 0) {
                $triggersADF | ForEach-Object { Stop-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $_.name -Force }
              }
            azurePowerShellVersion: 'LatestVersion'



        # Deploy ADF using the ARM template and the production JSON parameters
        - task: AzurePowerShell@5
          displayName: Deploy ADF
          inputs:
            azureSubscription: '$(azureSubscription)'
            ScriptType: 'InlineScript'
            Inline: |
              New-AzResourceGroupDeployment `
                -ResourceGroupName "$(DeploymentResourceGroupName)" `
                -TemplateFile "$(System.DefaultWorkingDirectory)/$(SourceDataFactoryName)/ARMTemplateForFactory.json" `
                -TemplateParameterFile "$(System.DefaultWorkingDirectory)/parameters_files/prod-adf-parameters.json" `
                -Mode "Incremental"
            azurePowerShellVersion: 'LatestVersion'


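        # Restart ADF Triggers
        # Note: this starts every trigger in the target factory, including any that were already stopped before the deployment.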
        - task: AzurePowerShell@5
          displayName: Restart Triggers
          inputs:
            azureSubscription: '$(azureSubscription)'
            ScriptType: 'InlineScript'
            Inline: |
              $triggersADF = Get-AzDataFactoryV2Trigger -DataFactoryName "$(DeployDataFactoryName)" -ResourceGroupName "$(DeploymentResourceGroupName)"
              if ($triggersADF.Count -gt 0) {
                $triggersADF | ForEach-Object { Start-AzDataFactoryV2Trigger -ResourceGroupName "$(DeploymentResourceGroupName)" -DataFactoryName "$(DeployDataFactoryName)" -Name $_.name -Force }
              }
            azurePowerShellVersion: 'LatestVersion'

Triggering the Pipeline

The Azure DevOps CI/CD pipeline is triggered automatically whenever changes reach the adf_publish branch, which happens when you publish from the main branch after a merge. It can also be started manually or set to run on a schedule for periodic deployments.
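
For the scheduled option, a schedules block can sit alongside the existing trigger in the pipeline YAML. This is a minimal sketch rather than part of the original pipeline: the cron expression, the display name, and the choice of always: true are assumptions to adjust for your own release cadence.

```yaml
schedules:
  - cron: '0 2 * * *'            # assumed cadence: every day at 02:00 UTC
    displayName: Nightly ADF release
    branches:
      include:
        - adf_publish            # same branch the CI trigger watches
    always: true                 # run even if adf_publish has not changed
```

With always left at its default of false, the scheduled run is skipped when the branch has no new changes since the last successful scheduled run.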

Monitoring and Rollback

To monitor pipeline execution, use the Azure DevOps pipeline dashboards. If a rollback is necessary, you can revert to a previous version of the ARM templates or pipelines in Azure DevOps and redeploy it.
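
One possible way to redeploy an earlier version is to let a manual run choose which adf_publish commit to check out before the deployment steps. The fragment below is a sketch under assumptions, not the author's method: publishCommit is a hypothetical parameter name, and the extra script step would need to run before the Deploy ADF task.

```yaml
# Sketch only: "publishCommit" is a hypothetical parameter, not part of the original pipeline.
parameters:
  - name: publishCommit
    displayName: adf_publish commit to deploy
    type: string
    default: HEAD   # HEAD keeps whatever commit the run already checked out

# ...inside the Release job:
steps:
  - checkout: self
    fetchDepth: 0   # fetch full history so older commits can be checked out

  - script: git checkout ${{ parameters.publishCommit }}
    displayName: Check out the requested adf_publish commit
```

For a rollback, you would queue the pipeline manually and supply the commit ID of the last known-good publish.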

Updated Nov 13, 2024
Version 4.0
  • jfolberth
    Copper Contributor

    Nice post! FYI, there is a series that goes down to the YAML templating level and shows how to use linked ARM templates for ADF, with source code: https://aka.ms/cicdadf