
Azure Integration Services Blog

Scaling mechanism in hybrid deployment model for Azure Logic Apps Standard

Shree_Divya_M_V
Mar 06, 2025

Hybrid Logic Apps offer a unique blend of on-premises and cloud capabilities, making them a versatile solution for various integration scenarios. 
A key feature of hybrid deployment models is their ability to scale efficiently to manage different workloads. This capability enables customers to optimize their compute costs during peak usage by scaling up to handle temporary spikes in demand and then scaling down to reduce costs when the demand decreases.
This blog will explore the scaling mechanism in hybrid deployment models, focusing on the role of the KEDA operator and its integration with other components.

Scaling Architecture:

KEDA is a Kubernetes-based Event-Driven Autoscaler. With KEDA, you can drive the scaling of any container in Kubernetes based on the number of events needing to be processed.

It mainly includes these components:

KEDA operator: The KEDA operator is responsible for managing the scaling of Kubernetes resources. It scans through all the scaled objects present in the namespace and executes the scalers for the activation and scaling logic.

Scaled objects: A KEDA scaled object is used to define the scaling behaviour for Kubernetes resources. It specifies the target resource to be scaled and the triggers that determine when scaling should occur.

Scaler: This component calculates the desired count of instances based on various metrics and decides whether to activate or deactivate the deployment. Scalers fall into two categories: built-in and external. The Logic Apps scaler is an external scaler written by the Logic Apps product team.
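To illustrate how these pieces fit together, the sketch below shows the general shape of a KEDA scaled object that ties a target deployment to an external scaler. This is a generic, hypothetical example; the names, namespace, replica limits, and scaler address are placeholders, not the exact objects the Logic Apps cluster extension creates:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: mylogicapp-scaledobject        # hypothetical name; real ones are prefixed with the logic app name
  namespace: my-extension-namespace    # namespace of the cluster extension
spec:
  scaleTargetRef:
    name: mylogicapp-deployment        # the deployment to be scaled
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
  - type: external                     # e.g. the workflowdispatcher scaler for logic apps
    metadata:
      scalerAddress: workflowdispatcher.my-extension-namespace:8080  # placeholder address
```

The scaled object specifies both the target resource and the triggers, which is why the describe commands later in this post show the attached scalers alongside the scaling configuration.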

The target concurrency is one of the parameters considered when calculating the desired count of instances.

The target concurrency is the number of parallel active dispatchers to run on a single processor. It has a default value of 42 and can be modified through the environment variable “Microsoft.Azure.Workflows.TargetScaler.TargetConcurrency”.

If the target concurrency is set to 42 and a logic app pod is allocated 0.5 vCPU (in the container configuration), it can run up to 0.5 * 42 = 21 concurrent jobs at a time. If there are more than 21 jobs in the queue, the logic app pods would be scaled out.
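The arithmetic above can be sketched as a small function. This is an illustrative model only, not the actual Logic Apps scaler implementation; the function name and the max/ceil behaviour are assumptions for the sake of the example:

```python
import math

# Default for Microsoft.Azure.Workflows.TargetScaler.TargetConcurrency
TARGET_CONCURRENCY = 42

def desired_replicas(pending_jobs: int, vcpu_per_pod: float,
                     target_concurrency: int = TARGET_CONCURRENCY) -> int:
    """Illustrative sketch: jobs one pod can run concurrently is
    vCPU * target concurrency; replicas needed is the ceiling of
    pending jobs over that per-pod capacity (at least one pod)."""
    capacity_per_pod = vcpu_per_pod * target_concurrency
    return max(1, math.ceil(pending_jobs / capacity_per_pod))

# With 0.5 vCPU per pod: capacity = 0.5 * 42 = 21 concurrent jobs.
print(desired_replicas(21, 0.5))   # 21 jobs fit on 1 pod
print(desired_replicas(22, 0.5))   # 22 jobs need a second pod
```

Raising the target concurrency lets each pod absorb more queued jobs before a scale-out, at the cost of more contention per processor; lowering it scales out earlier.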

Scaling Walkthrough:

In this section, we will go through the scaling configuration in a hybrid environment and perform a simple load test for a short duration to observe the scaling behaviour under high load.

For more details on the hybrid environment setup for the logic app, you can refer to Set up your own infrastructure for Standard logic app workflows - Azure Logic Apps | Microsoft Learn

Once the hybrid environment is configured and the logic app is up and running, we can run a few commands to verify the scaling configuration and the scaling events.

As a first step, add the kubeconfig of your Kubernetes cluster to the local context on your machine by running the below command in Windows PowerShell.

PS C:\WINDOWS\system32> az aks get-credentials --resource-group <ResourceGroupName> --name <Kubernetes cluster name> --admin

You can list the scaled objects in your Kubernetes namespace using the below command. The logic app-related scaled objects are prefixed with the name of the logic app.

PS C:\WINDOWS\system32> kubectl get scaledObjects -n <namespace of the cluster extension>


You can fetch the configuration of this scaled object, and the scalers attached to it by running the below command on the chosen scaled object.

PS C:\WINDOWS\system32> kubectl describe scaledObject <ScaledObject of the logic app> -n <namespace of the cluster extension>


In the output of the above command, you can find three scalers. Activator and http-scaler are the built-in scalers. The workflowdispatcher is an external scaler created for logic app scaling.

The events logged by the scaled object can be checked using the below command.

PS C:\WINDOWS\system32>  kubectl get events --field-selector involvedObject.kind=ScaledObject -n <namespace of the cluster extension>


You can list the horizontal pod autoscalers in the namespace using the below command.

PS C:\WINDOWS\system32> kubectl get hpa -n <namespace of the cluster extension>


In this example, I have processed a large number of requests within a duration of 2 minutes.

We can see the scaling events for this app using the below command.

PS C:\WINDOWS\system32> kubectl describe hpa <hpa of the logic app> -n <namespace of the cluster extension>

From the output, we can see that the pods gradually scaled out from 2 to 9 based on the metric ‘s1-upstream_rq_total’, and scaled back down to 1 once the load reduced.

References:

The KEDA Documentation | KEDA

Set up your own infrastructure for Standard logic app workflows - Azure Logic Apps | Microsoft Learn
