Fabric Airflow Job Sizing and Pricing: Understanding Pools and SKUs

The goal of this article is not to explore (in depth) what Airflow is. Instead, it is to provide insight into the relationship between capacity SKUs and Airflow pool types.

Capacity planning and forecasting are important so that you don’t risk throttling your capacity.

In the case of Airflow jobs, you pay for pool uptime, so it is straightforward to forecast consumption and to pick an appropriate size that won’t hinder other workloads.

In short, the Airflow job is Microsoft Fabric’s offering of Apache Airflow, aimed at users who prefer orchestrating activities via code, or who simply want to complement their existing orchestration tooling.
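For context, an Airflow DAG is just Python code. The sketch below (with a made-up DAG id and task, purely for illustration) shows the kind of definition you would deploy to an Airflow job:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    # Placeholder task body; in practice this would call your pipeline logic.
    print("Hello from a Fabric Airflow job")


# "hello_fabric" and the daily schedule are hypothetical choices for this example.
with DAG(
    dag_id="hello_fabric",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="say_hello", python_callable=say_hello)
```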

In order to run a job, you need a pool. There are two types of pools: starter and custom.

Starter pools are better suited for dev purposes and will shut down after 20 minutes of idleness.

Custom pools’ biggest selling point is scenarios that require the pool to run 24/7 – “always-on”.

The pricing model distinguishes two node sizes, small and large, which translate into different CU consumption rates: 5 CUs and 10 CUs, respectively.
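Since you pay for uptime, a rough forecast is simple arithmetic. The sketch below assumes an always-on pool with no extra nodes and an approximate 30-day month:

```python
HOURS_PER_MONTH = 24 * 30  # approximation used for illustration

base_cu = {"small": 5, "large": 10}  # CU rates from the pricing model above

for size, cu_rate in base_cu.items():
    cu_hours = cu_rate * HOURS_PER_MONTH
    print(f"{size} pool, always-on: {cu_hours} CU-hours per ~30-day month")

# small pool, always-on: 3600 CU-hours per ~30-day month
# large pool, always-on: 7200 CU-hours per ~30-day month
```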

The size choice must be iterative, as the only guidance in the public docs is the following:

  • Compute node size: The size of compute node you want to run your environment on. You can choose the value Large for running complex or production DAGs and Small for running simpler Directed Acyclic Graphs (DAGs).

…and, at the time of writing, I have not found a definition of what makes a DAG simple or complex.

What follows is an exercise enumerating all possible pool configurations:

Pool Type | Extra Nodes | Total CU Requirement | Minimum Valid SKU
Small | 0 | 5.0 | F8
Small | 1 | 5.6 | F8
Small | 2 | 6.2 | F8
Small | 3 | 6.8 | F8
Small | 4 | 7.4 | F8
Small | 5 | 8.0 | F8
Small | 6 | 8.6 | F16
Small | 7 | 9.2 | F16
Small | 8 | 9.8 | F16
Small | 9 | 10.4 | F16
Small | 10 | 11.0 | F16
Large | 0 | 10.0 | F16
Large | 1 | 11.3 | F16
Large | 2 | 12.6 | F16
Large | 3 | 13.9 | F16
Large | 4 | 15.2 | F16
Large | 5 | 16.5 | F32
Large | 6 | 17.8 | F32
Large | 7 | 19.1 | F32
Large | 8 | 20.4 | F32
Large | 9 | 21.7 | F32
Large | 10 | 23.0 | F32
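The table above can be reproduced with a short script. The per-extra-node rates (0.6 CU for small, 1.3 CU for large) are derived from the row-to-row differences in the table, and the SKU ladder is the standard Fabric F-SKU series; treat both as assumptions to be checked against the pricing docs:

```python
# Base CU rate per pool size and CU added per extra node (derived from the table above).
BASE_CU = {"Small": 5.0, "Large": 10.0}
EXTRA_NODE_CU = {"Small": 0.6, "Large": 1.3}

# Fabric F SKUs and their CU capacity (F8 = 8 CUs, F16 = 16 CUs, ...); larger SKUs omitted.
F_SKU_CAPACITIES = [2, 4, 8, 16, 32, 64]


def minimum_sku(total_cu: float) -> str:
    """Smallest F SKU whose CU capacity covers the pool's CU requirement."""
    return next(f"F{capacity}" for capacity in F_SKU_CAPACITIES if capacity >= total_cu)


for size in ("Small", "Large"):
    for extra_nodes in range(11):
        total_cu = round(BASE_CU[size] + extra_nodes * EXTRA_NODE_CU[size], 1)
        print(f"{size} | {extra_nodes} | {total_cu} | {minimum_sku(total_cu)}")
```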

The same data in a more visual form (chart generated with the help of Copilot):

Observations – for 24/7 uptime:

  1. For small pools, the minimum SKU size is F8, with no extra nodes.
  2. For large pools, the minimum SKU size is F16, with no extra nodes.

(minimum SKU size to avoid capacity throttling – assuming this is the only workload using the capacity)

Analogously, lower SKU sizes can be used with starter pools for experimentation and exploration – that is, dev purposes.

Example of what might happen, from the Fabric Capacity Metrics app’s perspective, if you don’t take the chart shown above into account:


In short, not all capacity SKUs support every possible pool configuration (in terms of node size and extra nodes) – at least if you plan to run it 24/7.

Thanks for reading!


References

https://learn.microsoft.com/en-us/fabric/data-factory/apache-airflow-jobs-concepts

https://learn.microsoft.com/en-us/fabric/data-factory/pricing-apache-airflow-job#apache-airflow-job-pricing-model

Apache Airflow Job workspace settings – Microsoft Fabric | Microsoft Learn