WWW.SKGURU.COM

Fabric Data Ingestion (Pipelines)


Data ingestion in Microsoft Fabric is the process of collecting, moving, and transforming data from different sources into Fabric destinations (such as a Lakehouse or Warehouse). The main orchestration tool for this is Pipelines, which closely resemble Azure Data Factory (ADF) pipelines.

1. Pipelines (ADF-like)
Fabric Pipelines are very similar to Azure Data Factory pipelines.
What they are:
• A workflow orchestration tool to automate data movement and transformation
• Built using a drag-and-drop UI
• Allows combining multiple activities into a data pipeline
Key features:
• Supports ETL / ELT workflows
• Connects to multiple data sources (on-prem, cloud, SaaS)
• Reusable and modular design
• Integration with other Fabric components (Lakehouse, Warehouse, Notebooks)
Example:
A pipeline might:
1. Extract data from SQL Server
2. Transform it
3. Load it into a Lakehouse
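
The three steps above can be sketched in plain Python to show how a pipeline chains activities so that each one runs only after the previous one succeeds. This is an illustration only, not Fabric's API: the functions `extract`, `transform`, `load`, and `run_pipeline` are hypothetical names, the hard-coded rows stand in for a SQL Server query, and an in-memory list stands in for a Lakehouse table.

```python
from typing import Dict, List

Row = Dict[str, object]


def extract() -> List[Row]:
    # Stand-in for reading from SQL Server (hypothetical sample rows).
    return [
        {"id": 1, "name": " alice ", "amount": "10.5"},
        {"id": 2, "name": "BOB", "amount": "7"},
    ]


def transform(rows: List[Row]) -> List[Row]:
    # Trim and title-case names, and convert amounts from text to float.
    return [
        {
            "id": r["id"],
            "name": str(r["name"]).strip().title(),
            "amount": float(str(r["amount"])),
        }
        for r in rows
    ]


def load(rows: List[Row], lakehouse: List[Row]) -> int:
    # Stand-in for writing to a Lakehouse table; appends to an in-memory list.
    lakehouse.extend(rows)
    return len(rows)


def run_pipeline(lakehouse: List[Row]) -> int:
    # Chained like pipeline activities: extract, then transform, then load.
    return load(transform(extract()), lakehouse)


lakehouse_table: List[Row] = []
loaded = run_pipeline(lakehouse_table)
print(loaded, lakehouse_table)
```

In a real pipeline each function would be a separate activity (for example a Copy Activity followed by a Notebook or Dataflow), and the pipeline would handle ordering, retries, and monitoring.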

2. Dataflows Gen2
Dataflows Gen2 are the modern data transformation layer in Fabric.
What they are:
• Built on Power Query (M language)
• Used for data preparation and transformation
• Run at scale on Fabric compute
Key capabilities:
• Visual, no-code / low-code transformation
• Data cleaning, shaping, filtering, joins
• Handles complex transformations without hand-written code
• Writes output to a configured data destination (for example, a Lakehouse or Warehouse backed by OneLake)
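
To make the cleaning, filtering, and join capabilities above concrete, here is a minimal Python sketch of the kind of logic a dataflow expresses visually. It is not Power Query (M) and does not use any Fabric API; the sample `orders` and `customers` rows and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical sample data standing in for two source tables.
orders: List[Dict[str, object]] = [
    {"order_id": 1, "customer_id": 10, "amount": 120.0},
    {"order_id": 2, "customer_id": 11, "amount": None},   # missing amount
    {"order_id": 3, "customer_id": 10, "amount": 35.5},
    {"order_id": 4, "customer_id": 99, "amount": 50.0},   # unknown customer
]
customers: List[Dict[str, object]] = [
    {"customer_id": 10, "name": "  Contoso "},
    {"customer_id": 11, "name": "Fabrikam"},
]


def clean_customers(rows: List[Dict[str, object]]) -> Dict[object, str]:
    # Cleaning: trim whitespace from names and index them by customer_id.
    return {r["customer_id"]: str(r["name"]).strip() for r in rows}


def shape_orders(
    order_rows: List[Dict[str, object]],
    customer_rows: List[Dict[str, object]],
) -> List[Dict[str, object]]:
    names = clean_customers(customer_rows)
    shaped = []
    for order in order_rows:
        amount: Optional[object] = order["amount"]
        if amount is None:
            continue  # Filtering: drop rows with no amount.
        if order["customer_id"] not in names:
            continue  # Inner join: drop orders with no matching customer.
        shaped.append(
            {
                "order_id": order["order_id"],
                "customer": names[order["customer_id"]],
                "amount": amount,
            }
        )
    return shaped


result = shape_orders(orders, customers)
print(result)
```

In a Dataflow Gen2 the same steps would be built as Power Query steps (remove nulls, trim text, merge queries) rather than written as code.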

Think of it as:
• Pipeline = “when & how data moves”
• Dataflow = “how data is transformed”

3. Copy Activity
Copy Activity is the core data movement component inside pipelines.
What it does:
• Copies data from source → destination
• Supports structured & semi-structured data
Supported sources/destinations:
• Databases (SQL Server, Azure SQL, etc.)
• Files (CSV, Parquet, JSON)
• Cloud services (Azure Blob Storage, REST APIs)
Features:
• Schema mapping
• Incremental loading
• Parallel data transfer (high performance)
• Fault tolerance & retry
Example:
Copy data from:
• On-prem SQL → Fabric Lakehouse
• REST API → Data Warehouse
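
Three of the features listed above (schema mapping, incremental loading, and retry) can be sketched in a few lines of Python. This is a simplified model, not how Copy Activity is implemented or configured: `COLUMN_MAP`, `map_schema`, `with_retry`, and `incremental_copy` are hypothetical names, the watermark approach is one common way to do incremental loads, and lists of dicts stand in for the source and destination.

```python
import time
from typing import Callable, Dict, List, TypeVar

Row = Dict[str, object]
T = TypeVar("T")

# Schema mapping: source column name -> destination column name (assumed names).
COLUMN_MAP = {"CustID": "customer_id", "ModifiedAt": "modified_at"}


def map_schema(row: Row, column_map: Dict[str, str]) -> Row:
    # Rename mapped columns; keep unmapped columns under their original name.
    return {column_map.get(key, key): value for key, value in row.items()}


def with_retry(action: Callable[[], T], attempts: int = 3, delay_seconds: float = 0.0) -> T:
    # Retry on transient connection failures; re-raise after the last attempt.
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")


def incremental_copy(source: List[Row], sink: List[Row], watermark: str,
                     column_map: Dict[str, str]) -> str:
    # Incremental load: copy only rows modified after the last watermark.
    # ISO-8601 date strings of equal length compare correctly as text.
    new_rows = [r for r in source if str(r["ModifiedAt"]) > watermark]
    sink.extend(map_schema(r, column_map) for r in new_rows)
    # Return the new watermark (unchanged if nothing was copied).
    return max((str(r["ModifiedAt"]) for r in new_rows), default=watermark)


source_rows: List[Row] = [
    {"CustID": 1, "ModifiedAt": "2024-01-01"},
    {"CustID": 2, "ModifiedAt": "2024-01-03"},
    {"CustID": 3, "ModifiedAt": "2024-01-05"},
]
sink_rows: List[Row] = []

new_watermark = with_retry(
    lambda: incremental_copy(source_rows, sink_rows, "2024-01-02", COLUMN_MAP)
)
# A second run with the updated watermark copies nothing new.
second_watermark = incremental_copy(source_rows, sink_rows, new_watermark, COLUMN_MAP)
print(new_watermark, sink_rows)
```

Storing the returned watermark between runs is what lets the next run skip rows that were already copied.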

4. Scheduling & Triggers
Pipelines can be automated using triggers, so they run without manual intervention.
Types of triggers:
4.1 Schedule Trigger:
• Runs pipelines at fixed intervals (e.g., every hour, daily at 2 AM)
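
As a rough illustration of what a schedule means in practice, the sketch below computes the next run time for the two examples given (every hour, daily at 2 AM). The function names are hypothetical and time zones are ignored for simplicity; in Fabric the schedule is configured in the pipeline settings, not written as code.

```python
from datetime import datetime, timedelta


def next_daily_run(now: datetime, hour: int = 2) -> datetime:
    # Next occurrence of the given hour; roll over to tomorrow if already past.
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def next_hourly_run(now: datetime) -> datetime:
    # Next top of the hour strictly after the current time.
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return top_of_hour + timedelta(hours=1)


example_now = datetime(2024, 5, 1, 13, 45)
print(next_daily_run(example_now))
print(next_hourly_run(example_now))
```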
4.2 Tumbling Window Trigger:
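• Runs in fixed-size, non-overlapping time slices (useful for incremental loads)
• Ensures no data gaps, because each slice starts exactly where the previous one ends
• This trigger type comes from Azure Data Factory; confirm that it is available in your Fabric environment before relying on it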
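
The idea of time slices with no gaps can be shown with a short Python sketch that splits a time range into back-to-back windows. The function name `tumbling_windows` is hypothetical; each window is treated as half-open (start inclusive, end exclusive), which is an assumption that keeps a row on a boundary from being loaded twice.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Window = Tuple[datetime, datetime]


def tumbling_windows(start: datetime, end: datetime, size: timedelta) -> List[Window]:
    # Produce contiguous, non-overlapping [window_start, window_end) slices.
    # A trailing partial slice is not emitted until a full window fits.
    windows: List[Window] = []
    cursor = start
    while cursor + size <= end:
        windows.append((cursor, cursor + size))
        cursor += size
    return windows


windows = tumbling_windows(
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 3, 0),
    timedelta(hours=1),
)
for window_start, window_end in windows:
    # Each run would load only rows with window_start <= timestamp < window_end.
    print(window_start, "->", window_end)
```

Because every window ends exactly where the next begins, running one load per window covers the whole range once.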
4.3 Event-based Trigger:
• Triggered by events such as:
    • File arrival in storage
    • External system events
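
An event-based trigger can be pictured as a filter plus a callback: when an event matches, the pipeline runs with the event's details as input. The sketch below is a simplified model; the event dictionary shape (`type`, `path`), the folder name, and `make_file_trigger` are all assumptions for illustration and do not reflect Fabric's actual event format.

```python
from typing import Callable, Dict, List


def make_file_trigger(
    folder: str,
    suffix: str,
    run_pipeline: Callable[[str], None],
) -> Callable[[Dict[str, str]], bool]:
    # Build a handler that starts the pipeline only for matching file events.
    def handle(event: Dict[str, str]) -> bool:
        path = event.get("path", "")
        is_match = (
            event.get("type") == "file_created"
            and path.startswith(folder)
            and path.endswith(suffix)
        )
        if is_match:
            run_pipeline(path)  # Pass the file path to the pipeline as a parameter.
        return is_match

    return handle


runs: List[str] = []  # Records which files started a pipeline run.
trigger = make_file_trigger("landing/sales/", ".csv", runs.append)

events = [
    {"type": "file_created", "path": "landing/sales/2024-05-01.csv"},
    {"type": "file_created", "path": "landing/hr/2024-05-01.csv"},
    {"type": "file_deleted", "path": "landing/sales/2024-04-30.csv"},
]
fired = [trigger(event) for event in events]
print(fired, runs)
```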

Benefits:
• Full automation of data workflows
• Supports event-driven (near real-time) and batch ingestion
• Reliable execution with monitoring & alerts

Dataflows Gen2: difference from Pipelines

Pipelines            Dataflows Gen2
--------------------------------------
Orchestration        Transformation
Controls workflow    Shapes data
Uses activities      Uses Power Query

Summary:
• Pipelines → Orchestrate workflows (ADF-like)
• Dataflows Gen2 → Transform and prepare data
• Copy Activity → Move data between systems
• Triggers → Automate execution
