Azure Data Factory Training: Designing and Implementing Data Integration Solutions
Price:
$2,507.00
Course Outline
This Azure Data Factory Training covers all key aspects of the Azure Data Factory v2 platform, with special attention to the Azure services most commonly used alongside ADF v2 solutions: Azure Data Lake Storage Gen2, Azure SQL Database, Azure Databricks, Azure Key Vault, Azure Functions, and a few others.
Azure Data Factory Training: Designing and Implementing Data Integration Solutions Benefits
In this Azure Data Factory course, you will learn how to:
- Build end-to-end ETL and ELT solutions using Azure Data Factory v2
- Architect, develop, and deploy sophisticated, high-performance, easy-to-maintain, and secure pipelines that integrate data from a variety of Azure and non-Azure data sources
- Apply the latest DevOps best practices available for the ADF v2 platform
Prerequisites
Learning Tree course 8566, Microsoft Azure Fundamentals Training (AZ-900T00), or equivalent experience.
Azure Data Factory Training Outline
Module 1: Introduction to ADF
- Historical background: SSIS, ADF v1, other ETL/ELT tools
- Key capabilities and benefits of ADF v2
- Recent feature updates and enhancements
Module 2: Core Architectural Components
- Connectors: Azure services, databases, NoSQL, files, generic protocols, services & apps, custom
- Pipelines
- Activities: data movement, data transformation, control flow
- Datasets: source, sink
- Integration Runtimes: Azure, Self-Hosted, Azure-SSIS
Module 3: Building and Executing Your First Pipeline
- Creating ADF v2 instance
- Creating a pipeline and associated activities
- Executing the pipeline
- Monitoring execution
- Reviewing results
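A first pipeline like the one built in this module might look as follows in ADF's code view. This is a minimal sketch with hypothetical names (`FirstPipeline`, `WaitTenSeconds`); a single Wait activity is a common "hello world" for verifying that authoring, execution, and monitoring all work before real data movement is added.

```python
import json

# Minimal ADF v2 pipeline definition, expressed as the JSON you would see in
# the portal's code view or deploy via the REST API / ARM templates.
# Names are hypothetical placeholders.
pipeline = {
    "name": "FirstPipeline",
    "properties": {
        "activities": [
            {
                "name": "WaitTenSeconds",
                "type": "Wait",  # built-in control activity; no datasets needed
                "typeProperties": {"waitTimeInSeconds": 10},
            }
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```

Triggering a debug run of this pipeline from ADF Studio and watching it in the Monitor tab exercises the full create/execute/monitor/review loop covered above.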
Module 4: Data Movement
Copying Tools and SDKs
- Copy Data Tool/Wizard
- Copy activity
- SDKs: Python, .NET
- Automation: PowerShell, REST API, ARM Templates
Copying Considerations
- File formats: Avro, binary, delimited, JSON, ORC, Parquet
- Data store support matrix
- Write behavior: append, upsert, overwrite, write with custom logic
- Schema and data type mapping
- Fault tolerance options
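Two of the copying considerations above, write behavior and fault tolerance, can be sketched in a single Copy activity definition. Dataset and column names here are hypothetical; the sketch shows an upsert into an Azure SQL sink plus the option to skip incompatible rows rather than fail the whole copy.

```python
# Sketch of a Copy activity (hypothetical names) illustrating write behavior
# and fault tolerance settings from the considerations above.
copy_activity = {
    "name": "CopySalesToSql",
    "type": "Copy",
    "inputs": [{"referenceName": "SalesCsvDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "SalesSqlDataset", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "DelimitedTextSource"},
        "sink": {
            "type": "AzureSqlSink",
            "writeBehavior": "upsert",            # vs. insert/append
            "upsertSettings": {"keys": ["SaleId"]},  # hypothetical key column
        },
        # Fault tolerance: skip rows that fail conversion instead of aborting.
        "enableSkipIncompatibleRow": True,
    },
}
```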
Module 5: Data Transformation
Transformation with Mapping Data Flows
- Introduction to mapping data flows
- Data flow canvas
- Debug mode
- Dealing with schema drift
- Expression builder & language
- Transformation types: Aggregate, Alter row, Conditional split, Derived column, Exists, Filter, Flatten, Join, Lookup, New branch, Pivot, Select, Sink, Sort, Source, Surrogate key, Union, Unpivot, Window
Transformation with External Services
- Databricks: Notebook, Jar, Python
- HDInsight: Hive, Pig, MapReduce, Streaming, Spark
- Azure Machine Learning service
- SQL Stored procedures
- Azure Data Lake Analytics U-SQL
- Custom activities with .NET or R
Module 6: Control Flow
- Purpose of activity dependencies: branching and chaining
- Activity dependency conditions: succeeded, failed, skipped, completed
- Control flow activities: Append Variable, Azure Function, Execute Pipeline, Filter, ForEach, Get Metadata, If Condition, Lookup, Set Variable, Until, Wait, Web
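The dependency conditions and control-flow activities above compose into chains like the following sketch (names hypothetical): a Get Metadata activity lists files, and a ForEach iterates over them only when the listing Succeeded. The `dependsOn` block is how ADF encodes activity dependencies, and the `@activity(...)` syntax is the ADF expression language.

```python
# Control-flow sketch: Get Metadata feeding a ForEach, chained on "Succeeded".
# All dataset and activity names are hypothetical.
activities = [
    {
        "name": "ListFiles",
        "type": "GetMetadata",
        "typeProperties": {
            "dataset": {"referenceName": "InputFolder", "type": "DatasetReference"},
            "fieldList": ["childItems"],  # ask for the folder's file list
        },
    },
    {
        "name": "ProcessEachFile",
        "type": "ForEach",
        "dependsOn": [
            {"activity": "ListFiles", "dependencyConditions": ["Succeeded"]}
        ],
        "typeProperties": {
            # ADF expression referencing the upstream activity's output
            "items": {
                "value": "@activity('ListFiles').output.childItems",
                "type": "Expression",
            },
            "activities": [
                {
                    "name": "WaitPerFile",  # placeholder inner activity
                    "type": "Wait",
                    "typeProperties": {"waitTimeInSeconds": 1},
                }
            ],
        },
    },
]
```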
Module 7: Runtime and Operations
- Debugging
- Monitoring: visual, Azure Monitor, SDKs, runtime-specific best practices
- Scheduling execution with triggers: event-based, schedule, tumbling window
- Performance, scalability, tuning
- Common troubleshooting scenarios in activities, connectors, data flows and integration runtimes
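The trigger types above differ in shape as well as behavior. The sketch below (hypothetical names and times) contrasts a clock-based schedule trigger, which can fan out to several pipelines, with a tumbling window trigger, which fires over contiguous, non-overlapping intervals and supports backfill and bounded concurrency.

```python
# Schedule trigger: fire daily at a fixed time; "pipelines" is a list.
schedule_trigger = {
    "name": "DailyAt6",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2024-01-01T06:00:00Z",
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "FirstPipeline",  # hypothetical
                    "type": "PipelineReference",
                }
            }
        ],
    },
}

# Tumbling window trigger: one run per contiguous hourly window, with a cap
# on how many windows execute concurrently; bound to a single pipeline.
tumbling_trigger = {
    "name": "HourlyWindows",
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Hour",
            "interval": 1,
            "startTime": "2024-01-01T00:00:00Z",
            "maxConcurrency": 4,
        },
        "pipeline": {
            "pipelineReference": {
                "referenceName": "FirstPipeline",  # hypothetical
                "type": "PipelineReference",
            }
        },
    },
}
```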
Module 8: DevOps with ADF
- Quick introduction to source control with Git
- Integration with GitHub and Azure DevOps platforms
- Environment management: Development, QA, Production
- Iterative development best practices
- Continuous Integration (CI) pipelines
- Continuous Delivery (CD) pipelines
Module 9: Promoting Reuse
- Templates: out-of-the-box and organizational
- Parameters
- Naming conventions
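Parameters are the main reuse mechanism above: one pipeline definition serves many sources or environments instead of being cloned per case. A minimal sketch with hypothetical names, where the folder to copy from is supplied at run time and referenced via `@pipeline().parameters.sourceFolder`:

```python
# Parameterized pipeline sketch (hypothetical names): the source folder is a
# run-time parameter, passed through to a parameterized dataset reference.
pipeline = {
    "name": "ParameterizedCopy",
    "properties": {
        "parameters": {
            "sourceFolder": {"type": "string", "defaultValue": "landing/"}
        },
        "activities": [
            {
                "name": "CopyFromFolder",
                "type": "Copy",
                "inputs": [
                    {
                        "referenceName": "BlobFolderDataset",  # hypothetical
                        "type": "DatasetReference",
                        # Feed the pipeline parameter into the dataset:
                        "parameters": {
                            "folderPath": {
                                "value": "@pipeline().parameters.sourceFolder",
                                "type": "Expression",
                            }
                        },
                    }
                ],
                "outputs": [
                    {"referenceName": "SqlDataset", "type": "DatasetReference"}
                ],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "AzureSqlSink"},
                },
            }
        ],
    },
}
```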
Module 10: Security
- Data movement security
- Azure Key Vault
- Self-hosted IR considerations
- IP address blocks
- Managed identity
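The Azure Key Vault integration above keeps credentials out of pipeline and linked service definitions entirely. A sketch with hypothetical names: the linked service's connection string is an `AzureKeyVaultSecret` reference resolved at run time, rather than a secret stored in ADF.

```python
# Linked service sketch (hypothetical names): the Azure SQL connection string
# is fetched from Key Vault at run time via a Key Vault linked service.
linked_service = {
    "name": "SqlViaKeyVault",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "MyKeyVault",  # Key Vault linked service
                    "type": "LinkedServiceReference",
                },
                "secretName": "SqlConnectionString",  # hypothetical secret
            }
        },
    },
}
```

Combined with a managed identity granted get-secret permission on the vault, no credential ever appears in source control or deployment artifacts.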