Databricks

Databricks Topics

Module 1: Introduction to Azure Databricks

Module 2: Core Databricks Concepts

Workspace

Notebooks

Library

Repos

Compute

Workflows


Module 3: Types Of Clusters

All-Purpose Clusters

Job Clusters

Module 4: Databricks - Internal Storage

Databricks File System (DBFS)

Module 5: Databricks - External Storage

Azure Blob Storage

Azure Data Lake Storage Gen2

Azure SQL Database

Azure Synapse Dedicated SQL Pool


Module 6: Storage - Azure Credentials

Account Access Key

Shared Access Signature Token

Azure Service Principal

Module 7: Databricks Notebooks - Magic Commands

%python or %py

%r

%scala

%sql

Module 8: Big Data File Formats

Row-Based File Formats

CSV, TSV, and Avro

Columnar File Formats

Parquet, Delta, and ORC

Module 9: CSV File Format

Reading Data

Reading Data from Multiple CSV Files

Writing Data

Module 10: JSON File Format

Single Line JSON

Multi Line JSON

Complex Multi Line JSON

Arrays

Struct Fields

Module 11: Excel File Format

Single Sheet Reading

Multiple Sheet Reading Using a List Object

Dynamically Reading Multiple Sheets

Module 12: XML File Format

Simple XML Files

Complex XML Files

Module 13: Introduction To Spark SQL Module

Managed Tables (Internal Tables)

DataFrame API

Spark SQL API

Unmanaged Tables (External Tables)

DataFrame API

Spark SQL API

Temporary Views (Temporary Tables)

Global Temporary Views

Data Processing: Transformations

Joining Two DataFrames: Types of Joins

Merging DataFrames: union, unionAll, and unionByName

Module 14: Introduction To Delta Lake

Delta Lake Features

ACID transactions

Handling metadata

Schema enforcement

Time travel

Upserts and deletes

Delta Lake Components

_delta_log (Transaction Log)

Versioned Parquet files

Delta Lake Operations

Create Table

Upsert to a table

Read a table

Update a table

Delete from a table

Display table history

Time travel

Clean up snapshots with VACUUM

Delta Lake table history

Restore a Delta table to an earlier state

Vacuum unused data files


Module 15: Azure Databricks - Types of Loads

History Load

Incremental Load


Module 16: Medallion Architecture

Module 17: Unity Catalog


Module 18: Delta Lake - Slowly Changing Dimension

Type1 Dimension

Type2 Dimension

Type3 Dimension


Module 19: Databricks - Azure SQL Database

Reading Data With JDBC Driver

Writing Data With JDBC Driver


Module 20: Databricks - Synapse Dedicated SQL Pool

Reading Data From Synapse Table

Writing Data To Synapse Table


Module 21: Delta Lake - Performance Optimization Techniques

OPTIMIZE a Table

Z-ORDER by Columns

Module 22: Databricks Integration With Azure Data Factory

Call a Notebook using Notebook Activity

SetVariable Activity

Trigger ADF Pipeline

Module 23: Azure Key Vault Integration With Databricks

Create Secrets

Create Secret Scope

Project on Databricks


