
Learn the basics of Google BigQuery as a data warehouse, ingest data from multiple sources, and process it with tools to gain actionable insights, through concepts and hands-on practice.
Register for BigQuery with a valid Google account and start in the sandbox to learn storage and query basics within the free tier, with an option to upgrade to billing.
Learn to query data in Google BigQuery with standard SQL, inspect tables via preview to avoid charges, and optimize queries with selective columns, limits, and validation, including legacy SQL awareness.
Clean and transform data in BigQuery using SQL and Dataprep, applying business rules to handle nulls, bad data, and defaults like 'Unknown' and 'Unknown station'.
Explore loading data into BigQuery from an HTTP API or Kafka in a microservice architecture using Cloud Functions and BigQuery client libraries, with sample weather data and near real-time streaming.
Explore how a messaging system enables application-to-application data exchange through a centralized message broker like RabbitMQ or Kafka, decoupling systems and enabling near real-time, JSON-based integrations.
Connect Google Sheets to BigQuery to query and visualize BigQuery data without SQL; requires G Suite enterprise edition and permissions. Create calculated columns, charts, pivot tables, and auto-refresh in Sheets.
Explore UNION and INTERSECT in Google BigQuery, comparing UNION ALL and UNION DISTINCT while uploading consultant_profile.json and combining consultant_profile with contractor_profile, then select the company name, contact_email, and tax identification number.
Explore BigQuery basic statistics with the loan_default.json sample data. Compute the mean, the median via PERCENTILE_CONT and PERCENTILE_DISC, the standard deviation (sample and population), and correlation, using OVER() to compute them per group.
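To make the statistics in the last lecture concrete, here is a minimal sketch in plain Python (standard library only, no BigQuery connection). The numbers are made up for illustration and are not taken from loan_default.json, and the names loan_amount, interest_rate, and pearson are hypothetical. The comments name the BigQuery function that each value roughly corresponds to.

```python
import math
import statistics

# Hypothetical sample values, NOT taken from loan_default.json.
loan_amount = [1000, 2000, 3000, 4000]
interest_rate = [5.0, 7.0, 6.0, 10.0]


def pearson(xs, ys):
    """Pearson correlation coefficient (the quantity BigQuery's CORR computes)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


summary = {
    # AVG
    "mean": statistics.mean(loan_amount),
    # PERCENTILE_CONT(x, 0.5): interpolates between the two middle values
    "median_cont": statistics.median(loan_amount),
    # PERCENTILE_DISC(x, 0.5): returns a value that actually exists in the data
    "median_disc": statistics.median_low(loan_amount),
    # STDDEV_SAMP: divides by n - 1
    "stddev_samp": statistics.stdev(loan_amount),
    # STDDEV_POP: divides by n
    "stddev_pop": statistics.pstdev(loan_amount),
    # CORR
    "corr": pearson(loan_amount, interest_rate),
}

for name, value in summary.items():
    print(name, value)
```

With an even number of rows the two medians differ: the continuous median is 2500.0 (halfway between 2000 and 3000), while the discrete median is 2000. The sample standard deviation is always a little larger than the population one because it divides by n - 1 instead of n.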
A data warehouse is a repository of historical data that is queried to answer questions, gain insight from data, and make business decisions. BigQuery is Google’s data warehouse product. It is designed to store and query terabytes, even petabytes, of data without requiring us to set up and manage any infrastructure. It is not a transactional database for day-to-day operations.
BigQuery supports standard SQL, so if you have ever developed with a relational database such as Oracle, PostgreSQL, MySQL, or Microsoft SQL Server, it is easy to become familiar with BigQuery. BigQuery also has a few functions of its own that support modern-day requirements, and learning them will make your job easier.
There is no infrastructure to manage. We don’t need to worry about storage size, the number of processors, or memory allocation for processing queries. BigQuery scales automatically to run a query, then releases the resources when it is done. We are not even charged for memory or processor allocation.
Google also provides a sample database for practice and trial.
This course covers several topics:
an introduction, where we will see what this course is about
what a data warehouse is and what role BigQuery plays in it
how to create a simple data pipeline, including data input, data cleansing, and data visualization
tools and methods for data engineering, particularly for data ingestion from various sources into BigQuery
data visualization using Google Sheets & Data Studio
This course is for people with basic technical knowledge of SQL.
This course is not a basic SQL course, so we will not learn the meaning of basic SQL keywords such as SELECT, FROM, WHERE, GROUP BY, and ORDER BY.
See the preview video "Technology in This Course" for the SQL keywords that we will not discuss in detail.
However, we will still learn some of the modern SQL syntax that can be used in BigQuery.
In this course we will also learn how to fetch data from several sources, so this is a good course if you are an engineer who is responsible for creating data pipelines.