Databricks Topics
Module 1: Introduction to Azure Databricks
Module 2: Core Databricks Concepts
Workspace
Notebooks
Library
Repos
Compute
Workflows
Module 3: Types of Clusters
All-Purpose Clusters
Job Clusters
Module 4: Databricks - Internal Storage
Databricks File System (DBFS)
Module 5: Databricks - External Storage
Azure Blob Storage
Azure Data Lake Storage Gen2
Azure SQL Database
Azure Synapse Dedicated SQL Pool
Module 6: Storages - Azure Credentials
Account Access Key
Shared Access Signature Token
Azure Service Principal
Module 7: Databricks Notebooks - Magic Commands
%python
%r
%scala
%sql
Module 8: Big Data File Formats
Row-Based File Formats
CSV, TSV, and Avro
Columnar File Formats
Parquet, Delta, and ORC
Module 9: CSV File Format
Reading Data
Reading Data from Multiple CSV Files
Writing Data
Module 10: JSON File Format
Single Line JSON
Multi Line JSON
Complex Multi Line JSON
Arrays
Struct Fields
Module 11: Excel File Format
Single Sheet Reading
Multiple Sheet Reading Using List object
Dynamically Reading Multiple Sheets
Module 12: XML File Format
Simple XML Files
Complex XML Files
Module 13: Introduction to the Spark SQL Module
Managed Tables (Internal Tables)
DataFrame API
Spark SQL API
Unmanaged Tables (External Tables)
DataFrame API
Spark SQL API
Temporary Views (Temporary Tables)
Global Temporary Views
Data Processing: Transformations
Merging: Joining Two DataFrames (Types of Joins)
Merging: union, unionAll, and unionByName
Module 14: Introduction to Delta Lake
Delta Lake Features
ACID transactions
Handling metadata
Schema enforcement
Time travel
Upserts and deletes
Delta Lake Components
_delta_log (Transaction Log)
Versioned Parquet files
Delta Lake Operations
Create Table
Upsert to a table
Read a table
Update a table
Delete from a table
Display table history
Time travel
Clean up snapshots with VACUUM
Delta Lake table history
Restore a Delta table to an earlier state
Vacuum unused data files
Module 15: Azure Databricks - Types of Loads
History Load
Incremental Load
Module 16: Medallion Architecture
Module 17: Unity Catalog
Module 18: Delta Lake - Slowly Changing Dimension
Type 1 Dimension
Type 2 Dimension
Type 3 Dimension
Module 19: Databricks - Azure SQL Database
Reading Data with the JDBC Driver
Writing Data with the JDBC Driver
Module 20: Databricks - Synapse Dedicated SQL Pool
Reading Data From Synapse Table
Writing Data To Synapse Table
Module 21: Delta Lake - Performance Optimization Techniques
OPTIMIZE a Table
Z-ORDER by Columns
Module 22: Databricks Integration With Azure Data Factory
Call a Notebook using Notebook Activity
SetVariable Activity
Trigger ADF Pipeline
Module 23: Azure Key Vault Integration with Databricks
Create Secrets
Create Secret Scope
Project on Databricks