Using BigQuery with SQL: Hands-on Examples for Data Analysts

If you’re a data analyst looking to harness the power of cloud-scale analytics, Google BigQuery offers an elegant yet robust SQL interface that can handle terabytes of data with speed and simplicity. In this guide, we’ll walk through real-world, hands-on examples using SQL in BigQuery—designed to help you unlock insights, streamline your queries, and get more done with less overhead. Whether you’re transitioning from traditional SQL environments or starting fresh in the cloud, this article gives you the practical knowledge to hit the ground running.

What Is BigQuery and Why Use It?

Google BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse. What sets it apart is its ability to process massive datasets in seconds using standard SQL syntax. Analysts can query petabytes of data with near-zero setup—no need for infrastructure management, performance tuning, or indexing.

Key Features:

  • Standard SQL support
  • Seamless integration with Google Cloud ecosystem
  • Real-time analytics with streaming inserts
  • Built-in machine learning (BigQuery ML)
  • Pay-per-query or flat-rate pricing

BigQuery allows analysts to focus on insight, not infrastructure.

Setting Up: Your First Steps in BigQuery

Before diving into queries, let’s set up the basics.

1. Create a Google Cloud Project

2. Enable BigQuery API

  • Go to APIs & Services > Enable APIs
  • Search for BigQuery API and enable it

3. Open BigQuery Console

  • Navigate to the BigQuery UI in Google Cloud Console

You’re now ready to write and run SQL queries in a serverless environment.
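
To confirm everything is wired up, paste a trivial query into the editor and run it. It references no tables, so it processes 0 bytes and costs nothing:

SELECT
  'BigQuery is ready' AS status,
  CURRENT_DATE() AS today;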

BigQuery SQL Basics: Query Structure

BigQuery uses ANSI-compliant SQL, so if you have prior SQL experience, the learning curve is shallow.

Basic Syntax:

SELECT column_name
FROM `project.dataset.table`
WHERE condition
ORDER BY column_name
LIMIT 10;
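
Filling in the placeholders, here is how the same skeleton looks against one of BigQuery's public datasets (usa_names), which you can run as-is:

SELECT name, number
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = 'NY'
ORDER BY number DESC
LIMIT 10;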

With that skeleton in mind, let's work through some real examples.

Hands-On Example 1: Querying a Public Dataset

BigQuery hosts a wide array of public datasets. Let’s query data from the bigquery-public-data.covid19_jhu_csse.summary table.

Query: Top 5 countries by confirmed cases

SELECT country_region, SUM(confirmed) AS total_confirmed
FROM `bigquery-public-data.covid19_jhu_csse.summary`
GROUP BY country_region
ORDER BY total_confirmed DESC
LIMIT 5;

Explanation:

  • SUM() aggregates confirmed cases
  • GROUP BY groups the data per country
  • ORDER BY sorts descending

This gives a snapshot of COVID-19’s impact by country.
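
A natural follow-up is to filter on the aggregate itself with HAVING, which runs after GROUP BY. The threshold below is arbitrary and only for illustration:

SELECT country_region, SUM(confirmed) AS total_confirmed
FROM `bigquery-public-data.covid19_jhu_csse.summary`
GROUP BY country_region
HAVING total_confirmed > 1000000  -- keep only countries above an arbitrary threshold
ORDER BY total_confirmed DESC;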

Hands-On Example 2: Filtering and Date Functions

Let’s explore U.S. weather data and apply filters.

Dataset: bigquery-public-data.ghcn_d (daily observations are stored in per-year tables such as ghcnd_2023)

Query: Average temperature in New York for January 2023

SELECT
  date,
  AVG(value) / 10 AS avg_temp_celsius  -- GHCN stores temperatures in tenths of a degree Celsius
FROM
  `bigquery-public-data.ghcn_d.ghcnd_2023`
WHERE
  id = 'USW00094728' -- NYC Central Park station
  AND element = 'TAVG'
  AND EXTRACT(MONTH FROM date) = 1
GROUP BY date
ORDER BY date;

This query demonstrates BigQuery’s robust date functions and numeric transformations.
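
For reference, here are a few of the date functions you will reach for most often; this standalone query runs without touching any table:

SELECT
  CURRENT_DATE() AS today,
  DATE_TRUNC(CURRENT_DATE(), MONTH) AS month_start,                 -- first day of the current month
  DATE_DIFF(CURRENT_DATE(), DATE '2023-01-01', DAY) AS days_since,  -- whole days between two dates
  FORMAT_DATE('%Y-%m', CURRENT_DATE()) AS year_month;               -- e.g. '2023-01'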

Hands-On Example 3: Joining Tables

SQL joins are critical in analysis. Let’s join datasets for richer insights.

Query: Combining COVID and population data

SELECT
  covid.country_region,
  SUM(covid.confirmed) AS total_confirmed,
  pop.population
FROM
  `bigquery-public-data.covid19_jhu_csse.summary` AS covid
JOIN
  `bigquery-public-data.world_bank_intl_education.population_total` AS pop
ON
  LOWER(covid.country_region) = LOWER(pop.country_name)
WHERE
  pop.year = 2020
GROUP BY
  covid.country_region, pop.population
ORDER BY total_confirmed DESC;

Why this matters:

Joins in BigQuery allow analysts to merge datasets without ETL, enabling on-the-fly data enrichment.
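
The same query, reusing the tables above, can be written with a LEFT JOIN so that countries with no matching population row are kept (their population simply comes back NULL), which is usually what you want when enriching a primary dataset. Note that the year filter moves into the ON clause; leaving it in WHERE would discard those NULL rows and quietly turn the query back into an inner join:

SELECT
  covid.country_region,
  SUM(covid.confirmed) AS total_confirmed,
  pop.population
FROM
  `bigquery-public-data.covid19_jhu_csse.summary` AS covid
LEFT JOIN
  `bigquery-public-data.world_bank_intl_education.population_total` AS pop
ON
  LOWER(covid.country_region) = LOWER(pop.country_name)
  AND pop.year = 2020
GROUP BY
  covid.country_region, pop.population
ORDER BY total_confirmed DESC;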

Hands-On Example 4: Window Functions for Advanced Analysis

Window functions offer a way to do rankings, percentiles, and running totals without collapsing groups.

Query: Rank states by COVID confirmed cases

SELECT
  province_state,
  SUM(confirmed) AS total_confirmed,
  RANK() OVER (ORDER BY SUM(confirmed) DESC) AS rank
FROM
  `bigquery-public-data.covid19_jhu_csse.summary`
WHERE
  country_region = 'US'
GROUP BY province_state
ORDER BY rank;

This adds rankings dynamically, preserving each row while allowing comparison.
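
Window functions also make quick work of day-over-day changes. Because the confirmed column in this dataset is cumulative, LAG() can turn it into daily new cases; this sketch assumes country-level rows carry a NULL province_state, which is how the JHU summary table represents countries without regional breakdowns:

SELECT
  date,
  confirmed,
  confirmed - LAG(confirmed) OVER (ORDER BY date) AS new_cases  -- difference from the previous day
FROM
  `bigquery-public-data.covid19_jhu_csse.summary`
WHERE
  country_region = 'Italy'
  AND province_state IS NULL
ORDER BY date;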

Hands-On Example 5: Creating Custom Functions

BigQuery supports User-Defined Functions (UDFs), which let you extend SQL with your own logic written in JavaScript or in SQL itself.

Query: Normalize values using a UDF

CREATE TEMP FUNCTION normalize(value FLOAT64, mean FLOAT64, stddev FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS """
  return (value - mean) / stddev;
""";

SELECT
  id,
  value,
  normalize(value, AVG(value) OVER (), STDDEV(value) OVER ()) AS z_score
FROM
  `project.dataset.table`;

UDFs are especially useful when standard SQL isn’t enough.
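
JavaScript is not the only option: simpler logic is usually better expressed as a SQL UDF, which avoids the JavaScript execution overhead. A minimal sketch:

CREATE TEMP FUNCTION celsius_to_fahrenheit(c FLOAT64)
RETURNS FLOAT64
AS (c * 9 / 5 + 32);

SELECT celsius_to_fahrenheit(21.5) AS temp_f;  -- returns 70.7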

Hands-On Example 6: BigQuery ML for Predictive Analysis

Use BigQuery ML to train and predict models directly within SQL.

Query: Create a linear regression model

CREATE OR REPLACE MODEL `project.dataset.sales_model`
OPTIONS(model_type='linear_reg', input_label_cols=['revenue']) AS
SELECT
  product_category,
  marketing_spend,
  revenue
FROM
  `project.dataset.sales_data`;

Query: Predict revenue

SELECT
  product_category,
  marketing_spend,
  predicted_revenue
FROM
  ML.PREDICT(MODEL `project.dataset.sales_model`,
    (SELECT * FROM `project.dataset.new_data`));

Now analysts can build ML workflows without leaving the SQL environment.
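
Before trusting predictions, it is worth checking how well the model fits. ML.EVALUATE returns standard regression metrics (mean absolute error, R squared, and so on) for a trained model:

SELECT *
FROM ML.EVALUATE(MODEL `project.dataset.sales_model`);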

Performance Tips for SQL in BigQuery

  • Avoid SELECT *: Specify columns to reduce read costs
  • Use partitioned tables: Speed up time-based queries (see the short sketch after this list)
  • Leverage table preview: Inspect data without incurring costs
  • Reuse the query cache: Identical queries are served from cached results (kept for about 24 hours) at no extra cost
  • Watch slot usage: Especially in high-concurrency environments
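
To illustrate the partitioning tip above, here is a rough sketch of creating a date-partitioned copy of a table and querying a single day; the table names are placeholders for a dataset you control:

-- Create a partitioned table from an existing one (names are placeholders)
CREATE TABLE `project.dataset.events_partitioned`
PARTITION BY DATE(event_time) AS
SELECT * FROM `project.dataset.user_events`;

-- Filtering on the partitioning column prunes partitions, so only one day's data is scanned
SELECT COUNT(*) AS events
FROM `project.dataset.events_partitioned`
WHERE DATE(event_time) = DATE '2023-01-15';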

Real-World Use Case: Marketing Funnel Analysis

Let’s say your marketing team tracks user interactions across email, web, and purchases.

Query: User funnel behavior

SELECT
  user_id,
  MIN(CASE WHEN event_type = 'email_open' THEN event_time END) AS opened_email,
  MIN(CASE WHEN event_type = 'site_visit' THEN event_time END) AS visited_site,
  MIN(CASE WHEN event_type = 'purchase' THEN event_time END) AS made_purchase
FROM
  `project.dataset.user_events`
GROUP BY user_id;

This shows the earliest touchpoints for each user in their journey.
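
Building on that result, wrapping the query in a WITH clause makes it easy to turn the touchpoints into funnel-level conversion counts; COUNTIF simply counts the rows where each stage was reached:

WITH funnel AS (
  SELECT
    user_id,
    MIN(CASE WHEN event_type = 'email_open' THEN event_time END) AS opened_email,
    MIN(CASE WHEN event_type = 'site_visit' THEN event_time END) AS visited_site,
    MIN(CASE WHEN event_type = 'purchase' THEN event_time END) AS made_purchase
  FROM `project.dataset.user_events`
  GROUP BY user_id
)
SELECT
  COUNT(*) AS total_users,
  COUNTIF(opened_email IS NOT NULL) AS opened,
  COUNTIF(visited_site IS NOT NULL) AS visited,
  COUNTIF(made_purchase IS NOT NULL) AS purchased
FROM funnel;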

Conclusion

BigQuery is transforming the way data analysts work. It combines the familiarity of SQL with the power of Google Cloud, making it possible to analyze billions of rows without needing to manage infrastructure. With hands-on examples ranging from basic queries to predictive modeling, this guide has provided a practical foundation to grow your skills.

Whether you’re analyzing trends, predicting outcomes, or simply automating reports, BigQuery lets you do more with data, faster and smarter.

FAQs

1. Do I need to know advanced SQL to use BigQuery effectively?

No. BigQuery supports standard SQL syntax, making it beginner-friendly. You can start with basic queries and gradually adopt more complex functions like joins, window functions, and machine learning models as you grow more comfortable.

2. Can I import my own datasets into BigQuery for analysis?

Yes. You can upload CSV, JSON, or Avro files directly through the BigQuery console or use Cloud Storage for larger files. Once uploaded, you can query them like any other table.
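
If your files already live in Cloud Storage, loading can also be done in plain SQL with the LOAD DATA statement; the bucket path and table name below are placeholders:

LOAD DATA INTO `project.dataset.my_table`
FROM FILES (
  format = 'CSV',
  uris = ['gs://my-bucket/exports/my_file.csv']
);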

3. Does BigQuery store data permanently, and is it secure?

BigQuery stores your data in Google Cloud and encrypts it at rest and in transit. You can also configure access using IAM roles, set expiration times on tables, and use audit logs to track access.

4. Is there a free tier available for BigQuery?

Yes. BigQuery offers a generous free tier of 1 TB of query processing per month and 10 GB of storage. This is ideal for learning, prototyping, or light analysis work.

5. Can I automate and schedule SQL queries in BigQuery?

Absolutely. BigQuery lets you schedule queries to run at regular intervals—daily, hourly, or even by the minute—directly from the interface or using Cloud Scheduler for more complex workflows.
