Mastering Microsoft Fabric CI/CD: A Code-First Approach with Python

Written by Lukas Heeren | Mar 5, 2026 3:26:48 PM

By Lukas Heeren, Data & AI Solutions Architect at SDG Group Netherlands

Welcome to Tech Station, SDG Group’s hub for uncovering the latest innovations in data and analytics!

As enterprise data teams scale up with Microsoft Fabric, Data Engineers and Architects face a critical challenge: building a robust CI/CD pipeline. The standard Fabric Git integration often requires replicating entire databases for every feature branch, leading to unnecessary compute spend and data governance issues.

One of our Data & AI Solutions Architects, Lukas Heeren, advocates a Code-First CI/CD strategy that strictly separates stable infrastructure from code. This approach delivers immediate business value:

  • Cost Reductions: Eliminate the need to duplicate database layers for every developer environment.
  • Enterprise-Grade Governance: Treat Lakehouses as centrally managed infrastructure to prevent "data sprawl" and maintain a single source of truth.
  • Accelerated Time-to-Market: Automate code deployments using the declarative fabric-cicd Python library to iterate faster and ship data products with confidence.

In the article below, SDG Group Data & AI Solutions Architect, Lukas Heeren shares a practical, step-by-step blueprint for mastering this architecture and achieving the sweet spot of agile DataOps.


Microsoft Fabric has changed the game for data analytics, but implementing a robust Continuous Integration/Continuous Deployment (CI/CD) pipeline for it can still feel like navigating uncharted territory.

Microsoft recently introduced the fabric-cicd Python library (link), a tool designed to streamline the deployment of Fabric items. Its standout feature is its declarative nature. Instead of writing complex, imperative scripts to manage every API call, you simply define your desired end-state in a config file. You tell the library what you want, and it handles the heavy lifting of how to get there.

I’ve been taking it for a spin in a real-world DevOps environment, and I’ve found it excellent for specific tasks, provided you know its boundaries.

In this post, I’ll share a practical blueprint for Fabric CI/CD. We will focus on a strategy that separates “code” (Notebooks, Pipelines) from “infrastructure” (Lakehouses) to reduce costs and complexity, using Azure DevOps and the new Python library.

The Strategy: Why Split Code and Infra?

When you read standard Microsoft guidance on Fabric Git integration, the typical flow looks like this:

  1. Each developer creates a feature branch and their own Fabric Workspace.
  2. They develop their changes (Notebooks, Lakehouses, Reports).
  3. They merge a Pull Request (PR) to main.
  4. A pipeline deploys everything to Dev, Test, and Production workspaces.

While robust, this approach has a hidden downside: cost and sprawl.

If every developer spins up their own feature workspace complete with its own Lakehouses, you are effectively replicating your entire database layer for every active branch. This increases computational costs and creates data governance headaches.

A Leaner Approach

For my implementation, I wanted a leaner approach:

  • Infrastructure (Pre-provisioned): Lakehouses (databases) are treated as infrastructure. They only exist in the shared Dev, Test and Prod workspaces, and they are created by the infrastructure CI/CD pipeline. You should manage these using Terraform or the Fabric SDK to ensure stable IDs and governance.
  • Code (The Pipeline): This CI/CD pipeline is responsible only for deploying “code blocks”, such as Notebooks and Pipelines, using the fabric-cicd library.
  • Developer Workflow: Developers work in feature branch workspaces synced to Git, but they connect their notebooks to the shared Dev Lakehouse for development data.

When code merges to main, the pipeline deploys the notebooks to Dev, Test and Prod. Crucially, it swaps out the Lakehouse references so the notebooks point to the correct database for each environment, something the fabric-cicd library can do for you.

 

The new approach by Microsoft, with the databases in each environment.

 
Step 0: The Developer Setup (Git & Workspaces)

Before we build the pipeline, we need to establish the developer workflow. Following Microsoft’s best practices, we avoid “coding in production” by ensuring isolation.

1. One Developer, One Workspace
Every developer creates their own Fabric Workspace. This acts as their local sandbox.

2. Sync to Feature Branches
In the Workspace settings, connect to your Azure DevOps Git repository. Crucially, do not sync to main. Instead, developers should create a specific branch for their task (e.g., feature/new-transform) and sync their workspace to that branch. This allows them to commit code changes without affecting the stable pipeline. You can set up the Git sync under the workspace settings:

  

3. The “No-Database” Rule
Here is the cost-saving twist: do not create Lakehouses or any other database in these feature workspaces.

If every developer spins up their own Lakehouse, you are effectively duplicating your storage and compute costs for every active developer. You also risk data drift. Instead, developers should write their Notebooks in their feature workspace but mount/reference the Shared Development Lakehouse in dev to access data.

4. Provision Environments with Infrastructure as Code
Create the dev, test and production environments using your Infrastructure as Code tool of choice (for example Terraform or the Microsoft Fabric Python SDK, which is a separate package from the fabric-cicd library).

This keeps the feature workspaces lightweight, containing only code (Notebooks for example), while testing against a central, governed data source.
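If you do not have a full Terraform setup, the provisioning step can also be sketched directly against the Fabric REST API. The sketch below is an illustration, not the author's pipeline: the endpoint shown (POST /v1/workspaces/{workspaceId}/lakehouses) is part of the Fabric REST API, but token acquisition, idempotency and error handling are left out, and the helper names are my own.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_lakehouse_request(workspace_id: str, display_name: str):
    """Build the URL and JSON body for the Fabric 'create lakehouse' call."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/lakehouses"
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    return url, body


def create_lakehouse(workspace_id: str, display_name: str, token: str):
    """Create a Lakehouse in the given workspace.

    Hypothetical helper: assumes you already obtained a valid bearer token
    for the Fabric API scope. No retry or conflict handling included.
    """
    url, body = build_lakehouse_request(workspace_id, display_name)
    req = urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

In practice you would run something like this once per environment, which is exactly why keeping it in a dedicated infrastructure pipeline (rather than the code pipeline) keeps the Lakehouse IDs stable.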

Prerequisites: The Identity Setup

Before looking at the code, we need to solve the hardest part of Fabric CI/CD: authentication. This is where I see a lot of questions and errors on the forums and Git repos, and it needs to be set up carefully.

The fabric-cicd library runs in your Azure DevOps agent and needs permission to talk to the Fabric API. We use a Service Principal with Workload Identity Federation (the modern, secure way to connect Azure DevOps to Azure).

You can create one in Azure DevOps under Project settings -> Service Connections -> New Service Connection. Select Azure Resource Manager and fill in your information there.

The “Gotcha”: Finding the Right App Name

You need to grant your Service Principal permissions on the workspaces.

Mistake to avoid: do not search for the Service Connection name you see in Azure DevOps. For my service connection I chose the name “SC_Lukas_Fabric_CICD”, but that name will not show up when you try to assign rights to the workspaces.

You must go to the Microsoft Entra ID (formerly Azure AD) portal, find the App Registration associated with your service principal, and use that name when granting access in Fabric. You can also click ‘Manage App Registration’ in the service connection details page to go directly to the right page in Entra.

This is the right name that you need to search for when assigning contributor rights to the workspaces:

Two other things need to be enabled in the Fabric Admin Portal:

  1. ‘Users can create Fabric items’ needs to be enabled. If it is enabled only for specific groups, make sure your Service Principal is added to those groups.
  2. ‘Service principals can create workspaces, connections, and deployment pipelines’ should be enabled.

Required API Permissions

If you don’t configure this correctly, your pipeline will fail with a frustratingly vague error:

"Unhandled error occurred calling POST on 'https://api.powerbi.com/v1/workspaces/WORKSPACEID/items'. Message: The feature is not available."

This error almost always means missing API scopes.

  1. Go to your App Registration in Entra ID.
  2. Go to API Permissions -> Add a permission.
  3. Select Power BI Service.
  4. Select Delegated permissions (currently required for many Fabric operations via SP).
  5. Ensure you select the required permissions. For deploying Notebooks, you specifically need: Notebook.ReadWrite.All
Add API permissions

Also, please keep in mind that any change to Fabric permissions can take up to 15 minutes (!) to sync. I have had cases where I assigned rights and tested too soon, leaving me doubting what went wrong.
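Because of that propagation delay, it can help to wrap the first authenticated call in a simple retry loop rather than treating an initial permission error as final. This is a hypothetical helper of my own, not part of fabric-cicd, and the caught exception type is a stand-in for whatever error your API client actually raises:

```python
import time


def retry_on_permission_error(call, attempts: int = 5, delay_seconds: float = 60.0):
    """Retry a callable a few times, since Fabric permission changes
    can take up to ~15 minutes to propagate.

    `call` is any zero-argument callable; PermissionError is a placeholder
    for the real authorization error your client library raises.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except PermissionError as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

With a delay of a minute or two between attempts, a freshly granted permission usually catches up before the retries are exhausted.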

The Implementation

With authentication solved, let’s look at the files that drive the deployment. I use a configuration-based setup, meaning I create config files that tell the Python library what to deploy. You will need to create several files in your Git repository. Finally, you define the pipeline YAML file and link it to an Azure DevOps pipeline to run the setup.

1. The Configuration (config.yml)

This file tells the library which workspaces correspond to which environments and limits which items we deploy. Notice we scope this to Notebook only, but there are many more item types. Currently there are 25, including CopyJob, DataPipeline, Eventhouse, Lakehouse, SemanticModel, Reflex and many more (click here for the complete list).

config.yml:

core:
  repository_directory: "."
  parameter_file_path: "parameter.yml"
  workspace_id:
    dev: "YOUR_DEV_WORKSPACE_GUID"
    test: "YOUR_TEST_WORKSPACE_GUID"
  item_type_in_scope: # Limit what gets deployed
    - Notebook

cleanup:
  enabled: true # Enables the deletion of old resources
  exclude_item_list: [] # (Optional) List names of items you NEVER want deleted

2. The Parameter Replacement (parameter.yml)

This is the magic sauce. Because our Lakehouses are pre-existing infrastructure, they have different IDs in Dev than they do in Test. You can replace any text with an environment-specific value.

When a developer works in their feature branch, their notebook might reference a Lakehouse named LAKEHOUSE_DEV_BRONZE. When we deploy to the Test workspace, we need the pipeline to automatically find that reference and replace it with the name of the Test Lakehouse.

You can also do this with IDs, so data item references will not break. Notebooks are linked to data items (a Lakehouse, for example) by the data item’s ID. By swapping these IDs, you point the notebook at another Lakehouse. I personally keep the IDs in this file.

find_replace:
  - find_value: "LAKEHOUSE_DEV_BRONZE"
    replace_value:
      dev: "LAKEHOUSE_DEV_BRONZE"
      test: "LAKEHOUSE_TST_BRONZE"

  - find_value: "LAKEHOUSE_DEV_SILVER"
    replace_value:
      dev: "LAKEHOUSE_DEV_SILVER"
      test: "LAKEHOUSE_TST_SILVER"

3. The Deployment Script (deploy.py)

We need a simple Python script to wrap the library call. This script checks which environment we are targeting (passed via an Azure DevOps variable, shown later) and triggers the configuration-based deployment. It automatically uses the identity of the pipeline.

import os
import sys

# The core library function
from fabric_cicd import deploy_with_config, change_log_level


def main():
    # FAIL-FAST: Ensure the pipeline passed the environment name
    target_env = os.environ.get("TARGET_ENV")
    if not target_env:
        print("❌ ERROR: 'TARGET_ENV' environment variable is missing.")
        print("Make sure 'export TARGET_ENV=test' is in your pipeline YAML.")
        sys.exit(1)

    print(f"🚀 Deployment Target: {target_env}")

    # Set logging for visibility
    change_log_level("DEBUG")

    try:
        # deploy_with_config looks at your config.yml
        # It uses '.' as the repo directory (where your .Notebook folders are)
        deploy_with_config(
            config_file_path="config.yml",
            environment=target_env
        )
        print(f"\n✅ SUCCESS: Items deployed to {target_env} workspace.")

    except Exception as e:
        print(f"\n❌ DEPLOYMENT FAILED")
        print(f"Detail: {str(e)}")
        # Ensure the pipeline fails if the script fails
        sys.exit(1)


if __name__ == "__main__":
    main()

The Azure Pipeline

Finally, we glue it all together in Azure DevOps. This YAML pipeline triggers when PRs are merged to main. It installs the tool and runs the deployment script twice: once for Dev, once for Test.

Make sure your FABRIC_TENANT_ID and FABRIC_CLIENT_ID are available as pipeline variables (preferably from a Key Vault backed Variable Group).

trigger:
- main

variables:
  # Map your secrets/IDs from ADO Variable Groups or define here
  FABRIC_TENANT_ID: 'TENANT_ID'
  FABRIC_CLIENT_ID: 'CLIENT_ID'

pool:
  name: 'Docker Agent' # Use your appropriate agent pool

steps:
- checkout: self

- script: |
    # Upgrade pip and install the tool
    python -m pip install --upgrade pip
    python -m pip install fabric-cicd
  displayName: 'Install fabric-cicd library'

# Deploy to DEV
- task: AzureCLI@2
  displayName: 'Execute Config-Based Deploy DEV'
  inputs:
    azureSubscription: 'SC_Lukas_Fabric_CICD' # Your OIDC Service Connection
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      # Pass credentials to the environment for the Python tool
      export FABRIC_TENANT_ID="$(FABRIC_TENANT_ID)"
      export FABRIC_CLIENT_ID="$(FABRIC_CLIENT_ID)"

      # Tell the script which environment config to use
      export TARGET_ENV="dev"

      python deploy.py

# Deploy to TEST following success in Dev
- task: AzureCLI@2
  displayName: 'Execute Config-Based Deploy TEST'
  inputs:
    azureSubscription: 'SC_Lukas_Fabric_CICD' # Your OIDC Service Connection
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      export FABRIC_TENANT_ID="$(FABRIC_TENANT_ID)"
      export FABRIC_CLIENT_ID="$(FABRIC_CLIENT_ID)"

      # Target test environment
      export TARGET_ENV="test"

      python deploy.py

Now that we have all this in place, let’s test it out! I have a feature workspace with one notebook called Notebook2:

Let’s add another Notebook called NewNotebook.

Connect this to the dev lakehouse by adding a data item:

You will see that this notebook is uncommitted:

Let’s commit it and create a PR out of it. Click on Source control and write a nice message:

Hit Commit and go to your DevOps environment. Look at your folder structure in Git and notice that there is a new folder called “NewNotebook.Notebook”. This is the folder that contains your new notebook!

Create a new PR and merge it into main. Go to your pipelines and see it all deploying:

When I now go to my dev workspace I see the new notebook in my folder:

And it is connected to my dev lakehouse:

When I go to my test workspace I also see it in my folder:

 

But here it is connected to my test lakehouse!

Conclusion

The microsoft/fabric-cicd library marks a significant step forward for DataOps in the Microsoft ecosystem. However, tools are only as good as the strategy behind them. By treating your databases as stable infrastructure and your Notebooks as fluid code, you achieve the sweet spot: agile deployments without the exorbitant cost of duplicating storage layers for every feature branch.

We have only scratched the surface of what is possible here, but this pipeline provides a scalable foundation. As your team grows, don’t let your infrastructure sprawl. Let Terraform or the Fabric SDKs handle the heavy lifting of provisioning databases, and let this library handle the speed of your application logic. That is modern DataOps.

Ready to streamline your DataOps and eliminate unnecessary compute costs?
Implementing a Code-First CI/CD strategy in Microsoft Fabric can transform how your data teams scale. Whether you need help setting up this architecture or want to optimize your current data ecosystem, our Data & AI experts at SDG Group are here to help. 
 👉 Contact us today to discuss your data strategy