Commit 9c607dc

[Post] CICD for data teams
1 parent 7100634 commit 9c607dc

6 files changed

Lines changed: 176 additions & 0 deletions

@@ -0,0 +1,176 @@
1+
---
2+
title: CICD for Fabric Data Teams
3+
description: Bring software engineering rigor to modern data teams using Microsoft Fabric
4+
image: /assets/images/blog/2026/2026-05-05-CICD-Fabric-Data-Teams/hero.png
5+
date:
6+
created: 2026-05-05
7+
authors:
8+
- jDuddy
9+
comments: true
10+
categories:
11+
- CICD
12+
slug: posts/CICD-Fabric-Data-Teams
13+
---
14+
15+
Data teams have historically operated differently from software engineering teams — relying on point-and-click tooling, sharing environments, and deploying by hand. As Fabric brings more of the data platform under one roof, it also raises expectations: data pipelines, semantic models, notebooks, and lakehouses should be treated with the same rigour as application code.
16+
17+
Without proper CI/CD practice you might struggle with:
18+
19+
- :material-source-branch: **Version control** — no history of what changed, when, why or by whom
20+
- :material-account-group: **Collaboration** — shared environments cause conflicts between in-progress work
21+
- :material-magnify: **Oversight** — no review process before changes reach higher environments
22+
- :material-check-circle-outline: **Quality** — no automated gates to validate changes before promotion
23+
- :material-cursor-default-click: **Click-ops tax** — manual, error-prone promotion steps that rely on humans remembering the right sequence
24+
- :material-restore: **Recovery** — no straightforward way to roll back a bad deployment
25+
26+
## Low Code Options
27+
28+
Fabric provides some built-in low code options to help deal with these issues.
29+
30+
### Deployment Pipelines
31+
32+
Most Power BI teams are familiar with [Deployment Pipelines](https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/intro-to-deployment-pipelines?tabs=new-ui): a low-code option for moving items between workspaces (e.g. `Dev` -> `Prod`).
33+
34+
![Deployment Pipelines](deployment-pipeline.png)
35+
36+
This option has some significant limitations:
37+
38+
- :material-content-copy: **Overwrites paired items**: deployment overwrites paired target items, making reverts harder than a source-controlled rollback
39+
- :material-source-branch-remove: **No branching or isolation**: developers share the same `Dev` workspace, so in-progress work can collide
40+
- :material-source-pull: **No pull request workflow**: changes can reach higher environments without formal review
41+
- :material-tune-variant: **Limited parameterization**: rules cover specific item types and properties, not every Fabric item
42+
- :material-account-lock: **Permission requirements**: deployments require `pipeline admin` plus source and target `workspace Contributor` access, which can create production access friction
43+
44+
### Deployment Pipelines + Git Integration
45+
46+
We can improve this situation with [git integration](https://learn.microsoft.com/en-us/fabric/cicd/git-integration/intro-to-git-integration) and simple trunk-based development. To develop new features we can branch out from the `Dev` workspace and perform development work. Once complete, the work is committed to the `feature` branch, and the changes are applied to the `Dev` workspace by merging into our long-lived branch (`main`) and syncing. Movement of items to other environments is still performed by deployment pipelines.
47+
48+
![Deployment Pipelines with git integration](deployment-pipeline-git.png)
49+
50+
This gives us a meaningful step forward:
51+
52+
- :material-source-branch: **Version control**: all item definitions are committed to git — you have a full history of every change, who made it, and why
53+
- :material-account-switch: **Branching and isolation**: developers work on feature branches, keeping in-progress work separate from the shared Dev workspace
54+
- :material-source-pull: **Pull request workflow**: changes are reviewed and approved before being merged and synced to the Dev workspace
55+
56+
### Git Integration
57+
58+
But we can take this one step further by throwing away Deployment Pipelines. This does mean we need a long-lived branch per environment. Rather than using deployment pipelines to move changes through environments, we promote them by merging into the appropriate branch.
59+
60+
![Git integration](git-integration.png)
61+
62+
Replacing Deployment Pipelines with branch-based promotion gives us additional benefits:
63+
64+
- :material-source-merge: **Consistent promotion mechanism**: The same git merge process that moves code from `feat` to `dev` also moves it to `uat` and `prod`
65+
- :material-swap-horizontal: **Environment parity**: Each environment tracks a specific branch, so you always know exactly what is deployed where
66+
- :material-clipboard-text-clock: **Auditability**: Every promotion is a merge commit with an author, timestamp, and message — a complete audit trail with no extra tooling required
67+
- :material-backup-restore: **Rollback by revert**: Undoing a bad deployment is a standard `git revert`, not a manual re-promotion through deployment pipeline stages
68+
69+
For many teams, especially those early in their CI/CD journey or working with simpler solutions, this may be enough. But as solutions grow in complexity, some gaps start to emerge:
70+
71+
- :material-folder-multiple: **Monolith**: Multi-repo deployments into a single workspace are not supported, which pushes teams toward large repos
72+
- :material-variable: **Variable libraries**: Not all Fabric items and parameters are supported
73+
- :material-timeline-clock: **Orchestration**: No native pre/post deployment scripting, rollback automation, or dependency ordering
74+
75+
## API Options
76+
77+
For teams that need more control — over repo structure, deployment orchestration, quality gates, or environment parameterization — the [Fabric REST APIs](https://learn.microsoft.com/en-us/rest/api/fabric/articles/) open up a fully code-driven approach. Rather than being constrained by what the UI supports, you can compose exactly the deployment process your solution needs, using the APIs directly or through one of the available wrappers. Each wrapper sits on top of the Fabric REST APIs and provides higher-level abstractions, reducing the amount of boilerplate code needed to interact with Fabric:
78+
79+
- **[Terraform](https://registry.terraform.io/providers/microsoft/fabric/latest/docs):** Infrastructure-as-code tooling best suited for provisioning and managing infrastructure-level resources such as capacities, workspaces, and access control. Declarative and idempotent by design
80+
- **[fabric-cicd](https://microsoft.github.io/fabric-cicd/latest/):** A Microsoft-maintained Python library purpose-built for deploying Fabric item definitions from a git repository. Handles parameterization, item ordering, and orphan cleanup
81+
- **[Fabric CLI](https://github.com/microsoft/fabric-cli):** A command-line interface that models Fabric as a filesystem — allowing you to script interactions with workspaces and items. Can invoke `fabric-cicd` for configuration deployment as part of a broader orchestration script
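
To give a feel for what "using the APIs directly" looks like, here is a minimal sketch that builds (but does not send) a request to the Fabric REST API's create-workspace endpoint for a branch-out workspace. The naming convention in `branch_workspace_name`, the helper names, and the way the access token is obtained are all assumptions for illustration, not part of any of the tools above.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def branch_workspace_name(solution: str, branch: str) -> str:
    """Hypothetical naming convention: '<solution>-<branch with slashes replaced>'."""
    return f"{solution}-{branch.replace('/', '-')}"


def build_create_workspace_request(
    display_name: str, capacity_id: str, access_token: str
) -> urllib.request.Request:
    """Build a POST /workspaces request; the caller decides when to send it."""
    body = json.dumps({"displayName": display_name, "capacityId": capacity_id})
    return urllib.request.Request(
        f"{FABRIC_API}/workspaces",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request is then a call to `urllib.request.urlopen(request)` with a token for a principal that is allowed to create workspaces. The wrappers listed above exist largely so that you do not have to hand-write calls like this.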
82+
83+
### Why not a monolith?
84+
85+
A monolithic approach means either one repo per workspace, or one repo for the entire tenant. As team size and solution complexity grow, this becomes increasingly difficult to understand, own, and manage.
86+
87+
A solution-per-repo model offers a number of advantages:
88+
89+
- :material-target: **Scope**: A focused repo is easier for any developer to reason about in its entirety — reducing onboarding time and cognitive load
90+
- :material-robot-outline: **Agentic workflows**: A concise repo gives AI agents a well-bounded context. You can define the solution's intent in `copilot-instructions.md` / `AGENTS.md` / `CLAUDE.md`, and bring in purpose-built agents and skills for that domain (e.g. a data engineer agent with Lakehouse and Semantic Model skills)
91+
- :material-account-key: **Ownership**: Clear team or domain ownership per repo, with explicit accountability for changes
92+
- :material-shield-account: **RBAC**: Least-privilege access can be enforced at the repo level — developers only have visibility and write access to the solutions they own
93+
- :material-radius-outline: **Blast radius**: A misconfigured deployment or bad merge only affects one solution, not the entire tenant
94+
- :material-calendar-sync: **Independent cadence**: Teams can deploy on their own schedule without coordinating with unrelated solutions
95+
- :material-source-pull: **Focused PRs**: Pull requests are small, scoped, and easy to review. This makes approvals faster and rollbacks trivial
96+
- :material-view-grid-plus: **Multi-workspace support**: Workspace sub-folders within a single repo allow related Fabric items spread across multiple workspaces to be deployed together as one coherent solution
97+
98+
This allows us to have a repo structure like this:
99+
100+
``` { .json .annotate .no-copy title="Solution-per-repo" }
101+
├── 📁 .github
102+
│ ├── 📄 copilot-instructions.md // (1)!
103+
│ ├── 📁 agents
104+
│ │ └── 📄 data-engineer.agent.md // (2)!
105+
│ └── 📁 skills
106+
│ ├── 📁 lakehouse
107+
│ │ └── 📄 SKILL.md // (3)!
108+
│ └── 📁 semantic-model
109+
│ └── 📄 SKILL.md
110+
├── 📁 pipelines
111+
│ └── 📄 cd.yml // (4)!
112+
├── 📁 WorkspaceFoo // (5)!
113+
│ ├── 📁 Foo.SemanticModel
114+
│ └── 📁 Foo.Report
115+
├── 📁 WorkspaceBar
116+
│ ├── 📁 Bar.UDF
117+
│ ├── 📁 Bar.Lakehouse
118+
│ └── 📁 Bar.VariableLibrary
119+
├── 📄 parameters.yml // (6)!
120+
├── 📄 config.yml // (7)!
121+
├── 📄 readme.md
122+
└── 📄 .gitignore
123+
```
124+
125+
1. **Always-on custom instructions** - project-wide coding standards and conventions applied to every Copilot request
126+
2. **Agents** - AI personas, each with its own behavior, tools, and model preferences
127+
3. **Skills** - reusable, packaged capabilities (scripts/tools) agents can invoke to expand knowledge and refine behavior
128+
4. **CI/CD pipeline definition** - runs fabric-cicd to deploy Fabric workspace items on merge
129+
5. **fabric-cicd repository_directory per workspace**
130+
6. **fabric-cicd parameter.yml** - environment-specific value replacement (GUIDs, connection IDs, spark pools) applied at deploy time
131+
7. **fabric-cicd config.yml** - deployment configuration (workspace IDs, environments, item type scope)
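
As a rough sketch of how a pipeline step might walk this layout, the snippet below discovers the workspace folders and publishes each one with `fabric-cicd`. `FabricWorkspace`, `publish_all_items` and `unpublish_all_orphan_items` come from the library; the `WORKSPACE_IDS` mapping, the placeholder IDs, the helper names, and the rule "a workspace folder is any top-level folder containing a `<Name>.<ItemType>` folder" are assumptions made for this example.

```python
from pathlib import Path

# Hypothetical mapping of workspace folder -> target workspace ID per environment.
# Real IDs would come from config or pipeline variables.
WORKSPACE_IDS = {
    "WorkspaceFoo": {"DEV": "<dev-foo-workspace-id>", "PROD": "<prod-foo-workspace-id>"},
    "WorkspaceBar": {"DEV": "<dev-bar-workspace-id>", "PROD": "<prod-bar-workspace-id>"},
}


def find_workspace_dirs(repo_root: Path) -> list[Path]:
    """Return top-level folders holding at least one '<Name>.<ItemType>' folder."""
    found = []
    for child in sorted(repo_root.iterdir()):
        # Skip files and hidden folders such as .github
        if not child.is_dir() or child.name.startswith("."):
            continue
        if any(p.is_dir() and "." in p.name for p in child.iterdir()):
            found.append(child)
    return found


def deploy_workspace(folder: Path, environment: str, item_types: list[str]) -> None:
    """Publish one workspace folder, then remove items no longer in the repo."""
    # Imported lazily so the discovery helper works without fabric-cicd installed.
    from fabric_cicd import FabricWorkspace, publish_all_items, unpublish_all_orphan_items

    workspace = FabricWorkspace(
        workspace_id=WORKSPACE_IDS[folder.name][environment],
        environment=environment,
        repository_directory=str(folder),
        item_type_in_scope=item_types,
    )
    publish_all_items(workspace)
    unpublish_all_orphan_items(workspace)
```

A `cd.yml` job could then loop over `find_workspace_dirs(Path("."))` and call `deploy_workspace(folder, "DEV", ["SemanticModel", "Report"])` for each folder, with authentication handled by whatever credential the pipeline provides.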
132+
133+
### Deployment
134+
135+
With `fabric-cicd` as the deployment framework, we can build a structured, multi-stage pipeline:
136+
137+
![API deployment](api.png)
138+
139+
- :material-source-branch: **Branch-out workspaces**: Branch-out workspaces linked to feature branches, with Fabric CLI used to export items back into the repo layout
140+
- :material-check-decagram: **PR validation**: Pre-merge dry-runs or transient deployments catch broken definitions before they reach shared environments
141+
- :material-tag-arrow-up: **Tag-based promotion**: Release tested batches to `UAT` and `PROD` instead of deploying every commit
142+
- :material-account-key: **Environment-specific SPNs**: Pipelines deploy with least-privilege service principals, removing standing production access for humans
143+
- :material-view-grid-plus: **Multi-repo, multi-workspace**: Deploy one repo to multiple workspaces, or multiple repos into the same workspace
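
Tag-based promotion mostly comes down to deciding, from the git ref that triggered the pipeline, which environment to deploy to. A minimal sketch follows, assuming a hypothetical convention where merges to `main` go to `DEV`, release-candidate tags such as `v1.2.0-rc.1` go to `UAT`, and final tags such as `v1.2.0` go to `PROD`. Your own tagging scheme may well differ.

```python
import re
from typing import Optional

# Hypothetical tagging convention, not something Fabric or fabric-cicd prescribes.
RC_TAG = re.compile(r"^refs/tags/v\d+\.\d+\.\d+-rc\.\d+$")
RELEASE_TAG = re.compile(r"^refs/tags/v\d+\.\d+\.\d+$")


def target_environment(git_ref: str) -> Optional[str]:
    """Map the triggering git ref to an environment, or None to skip deployment."""
    if git_ref == "refs/heads/main":
        return "DEV"
    if RC_TAG.match(git_ref):
        return "UAT"
    if RELEASE_TAG.match(git_ref):
        return "PROD"
    return None
```

The returned value can be passed straight through as the `environment` used for parameter replacement, so one pipeline definition serves every stage.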
144+
145+
!!! warning "Current limitations"
146+
147+
`fabric-cicd` is powerful but not yet a complete deployment DAG. Some scenarios currently require custom scripting:
148+
149+
- **Ordered dependencies**: `fabric-cicd` does order the deployment of items, but the order cannot be specified by the user
150+
- **Post-deployment checks**: running smoke tests or pipeline executions after deployment and rolling back automatically on failure
151+
- **Unsupported items**: a small number of item types (e.g. Org Apps) cannot yet be deployed programmatically
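
The post-deployment check scenario can be sketched as a thin wrapper around whatever deploy step you already have. Everything here is hypothetical scaffolding: `deploy` stands in for a function that deploys a given version (for example, a git tag checked out and published with `fabric-cicd`), and each smoke check is a named callable returning `True` on success.

```python
from typing import Callable, Iterable, Tuple

SmokeCheck = Tuple[str, Callable[[], bool]]


def deploy_with_rollback(
    deploy: Callable[[str], None],
    smoke_checks: Iterable[SmokeCheck],
    new_version: str,
    previous_version: str,
) -> str:
    """Deploy new_version; if any smoke check fails, redeploy previous_version.

    Returns the version that is live when the function finishes.
    """
    deploy(new_version)
    failed = [name for name, check in smoke_checks if not check()]
    if not failed:
        return new_version
    print(f"Smoke checks failed: {', '.join(failed)}; rolling back")
    deploy(previous_version)
    return previous_version
```

Note that this only rolls back item definitions. Data written by a failed pipeline run would need its own recovery step.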
152+
153+
### Additional Capabilities Worth Considering
154+
155+
Once the core deployment pattern is in place, there are a number of supplementary API calls worth incorporating into your pipeline:
156+
157+
- :material-calendar-clock: **Schedule management**: Set or update item schedules as part of deployment rather than relying on manual configuration in the UI post-deploy
158+
- :material-connection: **Connection binding**: Bind Semantic Models to the correct gateway connections for the target environment automatically, removing a common post-deployment manual step
159+
- :material-refresh: **Semantic model refresh**: Trigger a full refresh after deploying a new Semantic Model version to validate the model against the target data source before signalling success
160+
- :material-call-split: **Scale-out configuration**: Configure Import-mode Semantic Model read-only replicas for production environments as part of the deployment process
161+
- :material-home-sync: **Workspace lifecycle automation**: Use pipeline triggers (branch created / PR merged) to provision and deprovision branch-out workspaces on demand, eliminating the need for developers to manage workspace cleanup manually
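
As an example of one of these supplementary calls, the sketch below builds a request to the Power BI REST API's dataset refresh endpoint, asking for a full refresh. The helper name and the token handling are assumptions, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_refresh_request(
    workspace_id: str, dataset_id: str, access_token: str
) -> urllib.request.Request:
    """Build a POST request that triggers a full refresh of a semantic model."""
    url = (
        "https://api.powerbi.com/v1.0/myorg/"
        f"groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    )
    body = json.dumps({"type": "Full"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

A pipeline step would send this with `urllib.request.urlopen(request)` and then poll the refresh history until the refresh completes, failing the deployment if the refresh fails.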
162+
163+
The Fabric CI/CD story is still in its infancy and under active development. For now it is worth avoiding over-engineered bespoke solutions on top of the existing frameworks: they may become obsolete as the needed features are added, and excessive customization risks accumulating tech debt.
164+
165+
## Summary
166+
167+
There is no single right answer — the best approach depends on the maturity of the team and the complexity of the solution:
168+
169+
| Approach | Best for |
170+
|---|---|
171+
| **Deployment Pipelines** | Self-serve users and analysts who need a simple, UI-driven promotion path with no git involvement |
172+
| **Git Integration + Deployment Pipelines** | Small teams wanting version control and PR reviews, while keeping the familiar deployment pipeline for environment promotion |
173+
| **Git Integration (branch-per-env)** | Teams ready to drop deployment pipelines entirely and use branch merges as the sole promotion mechanism |
174+
| **API / fabric-cicd** | Complex, enterprise data engineering solutions requiring multi-repo deployments, parameterization, quality gates, and deployment orchestration |
175+
176+
I think it is important for all teams to start implementing at least a simple git integration approach to get some cheap wins, then increase the sophistication as the need arises.