Power BI Semantic Models Inside Microsoft Fabric: What BI Teams Need to Know
April 30, 2026

Semantic models sit at the center of every Power BI reporting layer, and Microsoft Fabric has fundamentally changed how they are built, stored, and maintained. For BI teams navigating the shift from Power BI Premium P-SKUs to Fabric F-SKUs, the surface area has expanded: new storage modes, deprecated default models, layered permissions, and capacity mechanics that behave differently than anything in classic Power BI Service. This article covers the architecture end to end, from OneLake through capacity throttling, with a focus on decisions that have real production consequences.

Fabric’s Layered Architecture and What It Means for Semantic Models

Microsoft Fabric is organized around a single, tenant-wide data lake called OneLake. Think of it as ADLS Gen2 made fully managed: BI teams never provision storage accounts or manage resource groups. Every Fabric workload (lakehouses, warehouses, dataflows, notebooks, semantic models) shares one OneLake instance per tenant, and data is stored in Delta Lake / Parquet format natively. That last point is not incidental: it is the technical prerequisite for Direct Lake mode.

The hierarchy is: tenant → OneLake → workspaces → items. Workspaces are the primary access control boundary. Each workspace is assigned to one Fabric capacity (F-SKU or P-SKU), and can be connected to a Git repository and assigned to a Fabric Domain.

F-SKU capacities are the preferred path for new deployments. One licensing point that surprises many teams: on capacities below F64, users viewing Power BI content still need a Pro, PPU, or trial license. Free-license consumption only unlocks at F64 and above.

Default vs. Custom Semantic Models: What Changed in 2025

Until September 2025, creating a Lakehouse, Warehouse, SQL Database, or Mirrored Database in Fabric automatically generated a default semantic model for that item. It was convenient for quick exploration but had real limitations: no support for hierarchies, no calculation groups, no GUI for RLS creation, and all tables were auto-included whether the BI team wanted them or not.

Microsoft deprecated the default model pattern in late 2025. Since September 5, 2025, Fabric no longer auto-generates default semantic models for new items. By November 30, 2025, all surviving default models were decoupled from their parent items and became independent semantic models that teams must manage explicitly. The “Reporting Tab” options (New Report, Manage default semantic model, Automatically update semantic model) have been removed from warehouse and SQL analytics endpoints.

The recommended approach is the custom semantic model, created explicitly through the Fabric portal or Power BI Desktop. Custom models open in the enhanced model editor, which supports full DAX (measures, calculated columns, calculated tables), hierarchies, calculation groups, and RLS. They are fully independent items with their own lifecycle. As of March 2025, custom semantic models using Direct Lake storage mode can be created in Power BI Desktop in public preview. Previously, Direct Lake models were browser-only. One practical note: decoupled default models that were in Import mode require manual credential reconfiguration after decoupling.

Direct Lake: Mechanism, F-SKU Guardrails, and the Fallback Question

Direct Lake is the storage mode that makes Fabric architecturally distinct. Rather than copying data into VertiPaq at refresh time (Import) or federating every DAX query to a SQL engine (DirectQuery), Direct Lake reads column segments directly from Delta tables in OneLake on demand, loading them into VertiPaq memory when a DAX query needs them. The engine reads Parquet column chunks and converts them from Parquet’s local dictionaries into VertiPaq’s global dictionary format, a process called transcoding. Files written with V-Order optimization (the Fabric Lakehouse default) are especially fast to transcode because their compression aligns with VertiPaq’s encoding.

There are two variants as of 2025:

  • Direct Lake on OneLake: accesses Delta tables via OneLake APIs directly, no SQL endpoint dependency, no DirectQuery fallback (public preview, March 2025+)
  • Direct Lake on SQL endpoint: the legacy approach, routes through the SQL analytics endpoint, supports automatic DirectQuery fallback when needed

The choice between them matters operationally because their guardrail behavior is completely different. For a deeper look at the Direct Lake architecture, see Power BI Direct Lake explained: performance, limits, and architecture.

F-SKU Guardrails for Direct Lake

Every F-SKU enforces limits on rows per table, Parquet files per table, and model memory. Exceeding these limits on Direct Lake on OneLake causes the framing operation to fail entirely; the model becomes unqueryable. On Direct Lake on SQL endpoint, the model falls back to DirectQuery automatically if fallback is enabled.

| Fabric SKU | Rows per table (millions) | Parquet files per table | Max memory (GB) |
|---|---|---|---|
| F2–F8 | 300 | 1,000 | 3 |
| F16 | 300 | 1,000 | 5 |
| F32 | 300 | 1,000 | 10 |
| F64 / P1 | 1,500 | 5,000 | 25 |
| F128 / P2 | 3,000 | 5,000 | 50 |
| F256 / P3 | 6,000 | 5,000 | 100 |
| F512 / P4 | 12,000 | 10,000 | 200 |
| F1024–F2048 | 24,000 | 10,000 | 400 |
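Because exceeding the hard guardrails on Direct Lake on OneLake makes the model unqueryable, it is worth validating tables before deployment. The sketch below is illustrative, not a Fabric API: the figures mirror the table above, and `check_guardrails` is a hypothetical helper; verify the limits against current Microsoft documentation before relying on them.

```python
# Sketch: validate a Delta table against the Direct Lake hard guardrails
# listed in the table above, before deploying a model against it.
# Figures are copied from this article; confirm against current docs.

GUARDRAILS = {
    # SKU: (max rows in millions, max Parquet files, max memory in GB)
    "F2":    (300, 1_000, 3),
    "F8":    (300, 1_000, 3),
    "F16":   (300, 1_000, 5),
    "F32":   (300, 1_000, 10),
    "F64":   (1_500, 5_000, 25),
    "F128":  (3_000, 5_000, 50),
    "F256":  (6_000, 5_000, 100),
    "F512":  (12_000, 10_000, 200),
    "F1024": (24_000, 10_000, 400),
    "F2048": (24_000, 10_000, 400),
}

def check_guardrails(sku: str, row_count: int, parquet_files: int) -> list[str]:
    """Return a list of violated hard guardrails (empty list = safe)."""
    max_rows_m, max_files, _ = GUARDRAILS[sku]
    violations = []
    if row_count > max_rows_m * 1_000_000:
        violations.append(f"rows {row_count:,} exceed {max_rows_m}M limit")
    if parquet_files > max_files:
        violations.append(f"files {parquet_files:,} exceed {max_files:,} limit")
    return violations

# A 450M-row table with 1,200 Parquet files breaks both limits on F8
# but is comfortably inside the F64 guardrails:
print(check_guardrails("F8", 450_000_000, 1_200))
print(check_guardrails("F64", 450_000_000, 1_200))
```

On Direct Lake on OneLake a non-empty result means framing will fail; on the SQL endpoint variant it means the model will silently fall back to DirectQuery.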

The memory figure is a soft ceiling: exceeding it causes paging rather than a hard failure. The rows and file limits are hard guardrails that cause framing failure on Direct Lake on OneLake, or DirectQuery fallback on the SQL endpoint variant.

Framing, Cold/Warm States, and Eviction

Framing is Direct Lake’s equivalent of a refresh. Instead of copying data, it copies only metadata: the semantic model scans the Delta log, identifies which Parquet files changed since the last framing commit, drops affected column segments, and remaps references. This typically takes seconds. With incremental framing (2025), only row groups that changed are reloaded; unchanged columns stay in memory.

After framing, a model progresses through memory states: cold (no columns loaded, all DAX queries trigger transcoding), warm (all required columns in memory, Import-equivalent performance), and hot (warm plus VertiScan caches from repeated queries). Under memory pressure, VertiPaq evicts column segments, pushing the model toward cold. Frequent cold states on a production model signal that the SKU is undersized for the data volume.

One operational edge case: after deploying a Direct Lake model via the XMLA endpoint, the model is unprocessed and falls back to DirectQuery until a framing operation is explicitly triggered. Always include a reframe step in deployment automation.
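That reframe step can be scripted as a TMSL refresh command against the XMLA endpoint (for example via SQL Server Management Studio or any tool that executes TMSL). This is a minimal sketch; the database name is a placeholder. For Direct Lake models, a `full` refresh performs framing (metadata only), not a data copy.

```json
{
  "refresh": {
    "type": "full",
    "objects": [
      { "database": "Sales Direct Lake Model" }
    ]
  }
}
```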

Import Mode Inside Fabric: Still the Right Default in Many Cases

Import mode remains fully supported inside Fabric and, per SQLBI’s Marco Russo, is “the gold standard for flexibility and performance” for models that fit within SKU memory limits. Data is copied from the source, transformed via Power Query, compressed, and stored in VertiPaq. In Fabric, Import mode data is automatically written to Delta tables in OneLake, making it accessible via SQL queries, notebooks, and shortcuts even though the model itself uses VertiPaq internally.

Teams ingesting data from operational source systems such as Salesforce into Fabric typically use Import mode to stage that data before landing it in OneLake. Tools like Power BI Connector for Salesforce handle the connector-level extraction into that pipeline. For a practical overview of connection types across modes, the Power BI data connection guide covers the options clearly.

Import remains preferable when the model needs full Power Query transformation capability, requires calculated columns or user-defined MDX hierarchies, must serve Excel-heavy workloads (Excel pivot tables on Direct Lake fall back to DirectQuery), or when the team is on F2–F32 and wants to avoid transcoding overhead at those memory-constrained SKUs. One sizing note: a full Import refresh temporarily requires approximately 2× the model’s in-memory size, because the existing copy stays queryable while the new copy is built. Under-sized capacity will cause refresh failures during that window.
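The 2× sizing rule can be turned into a quick headroom check. A minimal sketch, assuming the memory ceilings from the guardrail table earlier in this article; the helper name is illustrative, not a Fabric API.

```python
# Sketch: does a full Import refresh fit the SKU memory ceiling?
# A full refresh needs roughly 2x the model size, because the old copy
# stays queryable while the new copy is built. Ceilings (GB) are taken
# from the guardrail table in this article.

SKU_MEMORY_GB = {
    "F2": 3, "F8": 3, "F16": 5, "F32": 10, "F64": 25,
    "F128": 50, "F256": 100, "F512": 200, "F1024": 400, "F2048": 400,
}

def refresh_fits(sku: str, model_size_gb: float, overhead: float = 2.0) -> bool:
    """True if a full refresh (~overhead x model size) fits the SKU ceiling."""
    return model_size_gb * overhead <= SKU_MEMORY_GB[sku]

# A 15 GB model fits an F64 at rest (15 < 25), but a full refresh needs
# ~30 GB, so it fails on F64; move to F128 or use incremental refresh.
print(refresh_fits("F64", 15.0))   # False
print(refresh_fits("F128", 15.0))  # True
```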

DirectQuery and Composite Models in Fabric

Pure DirectQuery semantic models are less common in Fabric than in classic Power BI, because Direct Lake covers most of the large-data scenarios. DirectQuery remains relevant for external sources not in OneLake (Azure SQL, Snowflake, on-premises SQL Server), real-time operational requirements where even incremental framing is too slow, and as the automatic fallback mechanism when Direct Lake on SQL endpoint hits its guardrails.

The more significant development is the new Direct Lake + Import composite model in a single semantic model, available in public preview since May 2025. Tables within one model can each be flagged as Direct Lake (OneLake flavor) or Import. VertiPaq treats the entire model as one unit: relationships are regular (not limited), and performance matches a pure Import model. The canonical use case is large fact tables in Direct Lake alongside dimension tables in Import mode that need Power Query shaping or calculated columns.

The older pattern of stitching two separate semantic models together via a classic composite reference creates limited relationships across the boundary, and filter propagation degrades at real cardinality. That approach should be retired.

Dataflow Gen2 vs. Power Query Inside the Semantic Model

Both Dataflow Gen2 and Semantic Model Power Query (SM PQ) use Power Query, but they serve fundamentally different roles and the choice between them has real architectural consequences.

SM PQ lives inside a single semantic model. It is available only for Import storage mode tables in the web editor. Its legitimate use cases are narrow: last-mile cleanup that is genuinely report-specific (column renames, type corrections, minor filtering) or situations where the upstream source cannot be modified. The risk is scope creep. Once complex transformation logic moves into SM PQ, it becomes invisible to platform lineage tools, is not separately monitorable, forces Import mode, and cannot be reused by other models.

Dataflow Gen2 runs Power Query Online, lands results as Delta tables in a Lakehouse or Warehouse, and is available to any downstream consumer. It is the correct layer for shared, governed transformation logic, with first-class monitoring (refresh history, monitoring hub, CU tracking), full Git integration, and deployment pipeline support.

| Aspect | Semantic Model PQ | Dataflow Gen2 |
|---|---|---|
| Scope | One semantic model | Workspace-wide; any consumer |
| Reusable output | No | Yes (shared Lakehouse Delta tables) |
| Supports Direct Lake downstream | Risky (forces Import mode) | Yes (ideal upstream prep layer) |
| ETL ownership | BI/report owner | Data or platform team |
| Monitoring | Via model refresh settings only | Monitoring hub, logs, CU tracking |
| CI/CD and Git | Limited | Full support |
| Max queries per artifact | No fixed limit | 50 per dataflow |

The practical rule: if the transformation needs to be shared, trusted, or reused, it belongs in Dataflow Gen2. If it is genuinely report-specific and small, SM PQ is defensible. Keep it small on purpose.

XMLA, Tabular Editor, Deployment Pipelines, and Git Integration

Professional semantic model development in Fabric relies on a set of tools that most BI teams already know from Premium, now with Fabric-specific extensions.

The XMLA endpoint’s read-write mode (must be enabled in capacity settings) unlocks deploying models from external tools, fine-grained partition refresh, backup and restore, TMSL scripting, and ALM operations. Tabular Editor 3 supports Direct Lake on OneLake, Direct Lake on SQL endpoint, Fabric SQL Databases, and Mirrored Databases as of version 3.23.0. For Direct Lake + Import composite models, Tabular Editor 3 remains the recommended authoring path until the Fabric portal’s native GUI support matures.

Deployment pipelines support 2 to 10 stages. Two things do not carry across stages: refresh credentials and connection definitions. Both must be reconfigured in each target stage after deployment. For Direct Lake models, a reframe operation must be triggered after each XMLA-based deployment, or the model falls back to DirectQuery until framing completes.
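One way to automate that post-deployment reframe is the Power BI enhanced refresh REST API. The sketch below only builds the request (sending it requires an Azure AD bearer token, e.g. acquired via `msal`); the GUIDs are placeholders.

```python
# Sketch: build an enhanced-refresh REST call that reframes a Direct Lake
# model after an XMLA deployment. This only constructs the request;
# sending it needs an Authorization: Bearer header with an AAD token.
import json

def build_reframe_request(workspace_id: str, dataset_id: str) -> tuple[str, str]:
    """Return (url, body) for the Power BI enhanced refresh endpoint."""
    url = (
        "https://api.powerbi.com/v1.0/myorg/"
        f"groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    )
    # For Direct Lake, a "full" refresh performs framing (metadata only).
    body = json.dumps({"type": "full"})
    return url, body

url, body = build_reframe_request("<workspace-guid>", "<dataset-guid>")
print(url)
print(body)
```

Wiring this into the last step of a deployment pipeline (or an Azure DevOps release) ensures reports never serve DirectQuery results from an unprocessed model.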

Git integration stores semantic models as `.bim` files in Azure DevOps or GitHub repositories, with TMDL view support in Power BI Desktop (preview, March 2025). Two known limitations: sensitivity labels are not exported or imported via Git (must be applied separately in each stage), and using the Enhanced Refresh API creates a Git diff after each refresh because refresh metadata updates the `.bim` file.

Governance in Fabric: Roles, Permissions, Labels, and Domains

Fabric’s governance model layers workspace roles, item-level permissions, OneLake security, sensitivity labels, and Domains. The interactions between these layers are where access errors most commonly surface.

The four workspace roles are Admin, Member, Contributor, and Viewer. Contributors can create, edit, and delete semantic models and schedule refreshes. Viewers can read content but cannot read OneLake files directly (ReadAll requires Contributor or higher).

The critical point for Direct Lake: sharing a Power BI report built on a Direct Lake model grants item-level read permission, but it does not automatically grant permission to query the underlying Delta tables in OneLake. This causes access errors that are not obvious from the sharing UI. Either grant ReadData on the Lakehouse explicitly, or configure the semantic model to use a fixed identity for OneLake access.

Sensitivity labels from Microsoft Purview are not exported or imported via Git integration and must be applied separately in each deployment stage. As of February 2026 (GA), Fabric Domains support domain-level default sensitivity labels that automatically apply to all new items in workspaces assigned to that domain. Domains are a logical grouping mechanism for workspaces enabling federated governance: a Finance domain groups Finance workspaces and can delegate certification authority to the Finance BI lead, without controlling data access directly.

Semantic Link: Working with Models from Fabric Notebooks

Semantic link bridges Power BI semantic models and Fabric notebooks. Pre-installed in the default Fabric Runtime (Spark 3.4+), it requires no additional setup inside a Fabric workspace.

The `sempy.fabric` Python library provides direct programmatic access to semantic models: `fabric.read_table()` reads a model table into a Pandas DataFrame, `fabric.evaluate_dax()` evaluates a DAX query and returns results as a DataFrame, and `fabric.refresh_dataset()` triggers a refresh with control over refresh type, parallelism, and incremental policy override. For operational management, `fabric.create_trace_connection()` traces live DAX queries to determine whether they are hitting VertiPaq or falling back to DirectQuery. `fabric.create_tom_server()` exposes the Tabular Object Model, useful for bulk operations like changing `DirectLakeBehavior` across many models in a workspace programmatically.

Capacity, CU Consumption, Throttling, and Eviction

Fabric measures compute in Capacity Units evaluated in 30-second timepoints. Interactive operations (DAX queries, report rendering) are smoothed over 5 to 64 minutes; background operations (refreshes, pipelines) are smoothed over 24 hours. When accumulated carryforward from bursting exceeds thresholds, throttling applies:

| Carryforward | Policy | Impact |
|---|---|---|
| Up to 10 min | Overage protection | No throttling |
| 10–60 min | Interactive delay | Interactive jobs delayed 20 seconds |
| 60 min–24 h | Interactive rejection | Interactive jobs rejected; background continues |
| Over 24 h | Background rejection | All new requests rejected |
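The throttling tiers above can be expressed as a small classifier, useful for alerting on Capacity Metrics exports. A minimal sketch; the thresholds mirror the table and the function is illustrative, not a Fabric API.

```python
# Sketch: classify the throttling stage from accumulated carryforward,
# mirroring the policy table in this article. Input is in minutes.

def throttle_stage(carryforward_minutes: float) -> str:
    if carryforward_minutes <= 10:
        return "overage protection (no throttling)"
    if carryforward_minutes <= 60:
        return "interactive delay (interactive jobs delayed 20s)"
    if carryforward_minutes <= 24 * 60:
        return "interactive rejection (background continues)"
    return "background rejection (all new requests rejected)"

print(throttle_stage(5))      # overage protection
print(throttle_stage(45))     # interactive delay
print(throttle_stage(300))    # interactive rejection
print(throttle_stage(2000))   # background rejection
```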

For Import mode, the concern is peak CU consumption during the refresh window. Parallel partition refreshes during incremental refresh can spike CU consumption above capacity limits. For Direct Lake, the primary concern is memory: under pressure, VertiPaq evicts column segments from warm state back toward cold, and subsequent queries must re-transcode from OneLake. Use the Fabric Capacity Metrics app for 30-second granularity diagnostics on throttling events and per-item CU consumption. For a detailed breakdown of Fabric capacity pricing and SKU sizing economics, see Microsoft Fabric capacity pricing explained.

Storage Mode Decision Framework and Common Anti-Patterns

Choosing the right storage mode depends on data volume, source location, freshness requirements, and modeling features needed. The table below maps common scenarios to the recommended approach:

| Scenario | Recommended mode |
|---|---|
| Very large fact tables (>50M rows), frequently updated, data in OneLake | Direct Lake on OneLake |
| Large tables needing SQL-based security or fallback tolerance | Direct Lake on SQL endpoint |
| Large facts + dimensions needing Power Query or calculated columns | Direct Lake + Import composite (single model) |
| Small to medium models, full Power Query flexibility | Import |
| External sources not in OneLake (Azure SQL, Snowflake, on-premises) | DirectQuery or Import |
| Excel-heavy workloads | Import preferred |
| F2–F32, model fits in memory | Import |

For storage mode decisions in the broader context of enterprise Power BI model design, see Power BI storage modes: Import, DirectQuery, and Composite for enterprise models.

Several anti-patterns consistently cause production failures:

  • Default semantic models in new work after November 2025: now decoupled and unmanaged, they carry real governance risk.
  • Heavy ETL logic inside Semantic Model Power Query: this creates shadow ETL that is scoped to one model, invisible to lineage tools, and forces Import mode with its full refresh cost.
  • Deploying Direct Lake on OneLake against large Delta tables on F2–F8 without testing guardrails first: framing fails silently if the 300M row or 1,000 Parquet file limits are exceeded.
  • Skipping the reframe step after an XMLA deployment of a Direct Lake model: reports serve DirectQuery data until framing completes, so trigger it explicitly.
  • Letting Parquet file counts grow unchecked: run periodic `OPTIMIZE` on high-write Delta tables to stay within guardrails and speed transcoding.
  • Fragmented per-report models: one semantic model serving multiple reports is the intended architecture; per-report duplication increases refresh cost, duplicates metric definitions, and makes governance harder as the workspace scales.