Udemy
Databricks Fundamentals & Apache Spark Core
Rating: 4.2 out of 5 (3,112 ratings)
30,478 students
Created by Wadson Guimatsa
Last updated 9/2023
English

What you'll learn

  • Databricks
  • Apache Spark Architecture
  • Apache Spark DataFrame API
  • Apache Spark SQL
  • Selecting and manipulating columns of a DataFrame
  • Filtering, dropping, and sorting rows of a DataFrame
  • Joining, reading, writing and partitioning DataFrames
  • Aggregating DataFrame rows
  • Working with User Defined Functions
  • Using the DataFrameWriter API

Course content

8 sections, 72 lectures, 12h 8m total length
  • Introduction (2:20)
  • Create a Databricks community account (2:26)
  • Install the Dataset (4:30)
  • Overview of the dataset (5:46)
  • Install the notebooks (2:20)

Requirements

  • Basic Scala knowledge
  • Basic SQL knowledge

Description

Welcome to this course on Databricks and Apache Spark 2.4 and 3.0.0.

Apache Spark is a big data processing framework that runs at scale.
In this course, we will learn how to write Spark Applications using Scala and SQL.

Databricks is a company founded by the original creators of Apache Spark.
Databricks offers a managed and optimized version of Apache Spark that runs in the cloud.

The main focus of this course is to teach you how to use the DataFrame API & SQL to accomplish tasks such as:

  • Write and run Apache Spark code using Databricks

  • Read and Write Data from the Databricks File System - DBFS

  • Explain how Apache Spark runs on a cluster with multiple Nodes

Use the DataFrame API and SQL to perform data manipulation tasks such as:

  • Selecting, renaming and manipulating columns

  • Filtering, dropping and aggregating rows

  • Joining DataFrames

  • Creating UDFs and using them with the DataFrame API or Spark SQL

  • Writing DataFrames to external storage systems
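
The tasks listed above can be sketched in Scala roughly as follows. This is a minimal illustration, not material taken from the course: the sample data, the column names, the `initialOf` helper, and the output path are all hypothetical, and the `local[*]` master is an assumption for running outside Databricks (in a Databricks notebook a `spark` session is already provided).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, col, udf}

object DataFrameSketch {
  // Plain Scala function, wrapped as a UDF below (hypothetical helper).
  def initialOf(name: String): String = name.take(1).toUpperCase

  // Builds two small in-memory DataFrames (hypothetical data) and joins them.
  def staff(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val employees = Seq(
      (1, "ana", 10, 5200.0),
      (2, "ben", 20, 4100.0),
      (3, "cleo", 10, 6100.0),
      (4, "dan", 20, 2900.0)
    ).toDF("id", "name", "dept_id", "salary")

    val departments = Seq((10, "Engineering"), (20, "Sales"))
      .toDF("dept_id", "dept_name")

    // Wrap the plain function as a column-level UDF.
    val initialUdf = udf(initialOf _)

    employees
      .select(col("name"), col("dept_id"), col("salary")) // select columns
      .withColumnRenamed("name", "employee_name")         // rename a column
      .filter(col("salary") >= 4000)                       // filter rows
      .join(departments, Seq("dept_id"), "inner")          // join DataFrames
      .withColumn("initial", initialUdf(col("employee_name")))
  }

  // Aggregates the joined rows: average salary per department.
  def avgSalaryByDept(staffDf: DataFrame): DataFrame =
    staffDf.groupBy("dept_name").agg(avg("salary").alias("avg_salary"))

  def main(args: Array[String]): Unit = {
    // Assumption: local session for experimentation.
    val spark = SparkSession.builder()
      .appName("dataframe-sketch")
      .master("local[*]")
      .getOrCreate()

    val staffDf = staff(spark)

    // The same function can be registered for use from Spark SQL.
    spark.udf.register("initial_of", initialOf _)
    staffDf.createOrReplaceTempView("staff")
    spark.sql("SELECT employee_name, initial_of(employee_name) AS initial FROM staff").show()

    // Write the aggregate with the DataFrameWriter API (hypothetical path).
    avgSalaryByDept(staffDf)
      .write
      .mode("overwrite")
      .parquet("/tmp/avg_salary_by_dept")

    spark.stop()
  }
}
```

Keeping `initialOf` as an ordinary function makes it easy to check without a cluster, and the same function backs both the DataFrame UDF and the SQL-registered one.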

List and explain the elements of the Apache Spark execution hierarchy, such as:

  • Jobs

  • Stages

  • Tasks
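
To make the job, stage, and task hierarchy a little more concrete, the sketch below shows where each level typically arises. It is an illustrative assumption-laden example rather than course material: the `bucketCounts` helper, the partition count of 8, and the local session are all hypothetical choices, and the exact number of jobs and tasks Spark reports can vary with version and settings such as adaptive query execution.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object ExecutionHierarchySketch {
  // Counts even ids per last-digit bucket (hypothetical helper).
  def bucketCounts(spark: SparkSession): Map[Long, Long] = {
    // A Dataset of ids 0..999 split into 8 partitions.
    val numbers = spark.range(0, 1000, 1, 8)

    // Transformations are lazy: nothing runs on the cluster yet.
    // filter is a narrow transformation and needs no shuffle.
    val evens = numbers.filter(col("id") % 2 === 0)

    // groupBy needs a shuffle, which marks a stage boundary.
    val counts = evens
      .groupBy((col("id") % 10).alias("bucket"))
      .count()

    // The physical plan shows an Exchange operator at the shuffle.
    counts.explain()

    // collect() is an action: it triggers at least one job, whose stages
    // are executed as tasks, one task per partition of each stage.
    counts.collect()
      .map(row => row.getLong(0) -> row.getLong(1))
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local session; Databricks notebooks already provide `spark`.
    val spark = SparkSession.builder()
      .appName("execution-hierarchy-sketch")
      .master("local[*]")
      .getOrCreate()

    println(bucketCounts(spark))
    spark.stop()
  }
}
```

After running it, the Spark UI lists the resulting jobs, their stages, and the tasks within each stage.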


Who this course is for:

  • Software developers curious about big data, data engineering, and data science
  • Beginner data engineers who want to learn how to work with Databricks
  • Beginner data scientists who want to learn how to work with Databricks