Data Engineering Interview Resources

Every guide, question set, and reference on the site, organized by interview round. Find what you need, then go practice.

Interview Prep by Round

Curated Playbooks

SQL Practice by Topic

Window Functions

ROW_NUMBER, RANK, LAG, LEAD, and running totals

Joins Practice

INNER, LEFT, RIGHT, FULL OUTER, and CROSS joins

CTEs

Common Table Expressions and recursive queries

GROUP BY

Aggregation, HAVING, and grouping sets

Subqueries

Correlated and non-correlated subqueries

CASE WHEN

Conditional logic, CASE expressions, and pivot patterns

CASE WHEN Practice

Conditional logic and pivoting with CASE expressions

CASE WHEN: Multiple Conditions

AND/OR logic, nested CASE, and interview patterns

CASE Statement

Syntax, examples, and interview patterns

PIVOT

Row-to-column transformations and dynamic pivoting

COALESCE

NULL handling with COALESCE and IFNULL patterns

Self Join

Syntax, use cases, and interview questions

Self Joins Practice

Hierarchical queries and row-pair comparisons

DISTINCT

Deduplication and distinct counting techniques

SELECT DISTINCT

Multi-column DISTINCT and performance

NULL Handling

IS NULL, COALESCE, NULLIF, and three-valued logic

SQL Cheat Sheet

Quick-reference syntax guide for interviews

Advanced SQL Cheat Sheet

Window functions, CTEs, MERGE, and more

Window Functions Cheat Sheet

Every window function with examples

RANK() Function

Syntax, ties, and PARTITION BY

LATERAL JOIN

Correlated subqueries in FROM clauses

FULL OUTER JOIN

Syntax, NULL handling, and reconciliation

CROSS JOIN

Cartesian products and use cases

UNION

UNION vs UNION ALL vs JOIN

ORDER BY

ASC, DESC, and NULLS FIRST/LAST

GROUPING SETS, ROLLUP, CUBE

Multi-level aggregation in one query

GROUP BY Multiple Columns

Composite groups and interview patterns

Recursive CTE

Hierarchies, date series, and graph traversal

Common Table Expression

The WITH clause explained

String Functions

CONCAT, SUBSTRING, TRIM, REPLACE, and more

Date Functions

DATE_TRUNC, DATEDIFF, EXTRACT, and INTERVAL

SQL Server Window Functions

Differences from PostgreSQL

SQL Injection Cheat Sheet

What data engineers need to know

SQL Exercises

Practice questions and exercises

Data Modeling Deep Dives

Pipeline and Architecture

Tool-Specific Questions

Tools Hub

Tutorials, interview questions, and practice for every DE tool

PySpark Questions

DataFrame API, RDDs, and Spark optimization questions

Spark Questions

Distributed computing and Spark internals

Kafka Questions

Topics, partitions, consumer groups, and exactly-once semantics

Airflow Questions

DAGs, operators, scheduling, and orchestration patterns

Airflow DAG Reference

DAG patterns, dependencies, XCom, and operators

dbt Questions

Models, tests, snapshots, and incremental strategies

dbt Tutorial

Beginner to interview-ready dbt

Snowflake Questions

Architecture, warehouses, and Snowflake-specific features

Databricks Questions

Unity Catalog, Delta Lake, and Databricks workflows

PySpark isin()

Performance and NOT IN pitfalls

PySpark Tutorial

From SparkSession setup to writing output, with real-world context

PySpark Practice

Problems by category with real execution

PySpark Coding Practice

By difficulty with runnable code

PySpark Joins

Broadcast, anti, and multi-column joins

PySpark GroupBy

Shuffle cost, skew, and aggregation patterns

Spark Joins

Broadcast, shuffle, skew handling, and physical plans

Pandas GroupBy

Split-apply-combine, as_index pitfalls, and agg vs apply

PySpark Drop Duplicates

dropDuplicates vs window-function deduplication

PySpark Functions Cheat Sheet

The functions you reach for in interviews

Spark for L5 to L7

Advanced Spark: Catalyst, physical plans, tuning

Spark Mock Interview

Four-phase AI simulation with grading

Spark SQL Functions

Function reference for data engineers

Spark SQL LEFT ANTI JOIN

Syntax, plan, and use cases

What Is PySpark?

Python API for Apache Spark

Pandas Cheat Sheet

Quick reference for data engineers

Company Interview Guides

Career Resources

DE Salary Guide

Compensation benchmarks by level and location

Senior DE Salary

L5/L6 compensation guide

Resume Guide

How to write a data engineering resume that gets callbacks

Resume Examples

Examples by level: junior, mid, senior

Portfolio Guide

Project ideas, GitHub structure, and hiring signals

DE Roadmap

Skills to learn and the order to learn them

DE Career Path

Junior to Staff and beyond

DE Jobs

Where to find them and how to stand out

DE Job Description

What hiring managers actually look for

Big Data Engineer

Career path and job description

Analyst to Engineer

Transition guide from data analyst to data engineer

How to Become a DE

Step-by-step career path into data engineering

DE vs Data Analyst

Role differences, skills, and career trajectories

DE vs Data Scientist

Where the roles overlap and where they diverge

DE vs Software Engineer

Comparing day-to-day work and technical focus

Study Plan

Structured prep schedule for your interview timeline

DE Bootcamp Review

Honest comparison vs self-study

Behavioral Questions

STAR method answers for data engineering interviews

Analyst Interview Questions

With DE role crossover

DS Interview Questions

For data engineers

HackerRank SQL

What's missing for data engineers

LeetCode SQL

Gaps and alternatives

DE Concepts Hub

Complete index of data engineering concepts

DE Tools Guide

Orchestration, transformation, storage, and streaming

Data Democratization

Benefits, risks, and implementation patterns

Power BI vs Tableau

Which is better for your data stack

Certifications

Platform Comparisons

Ready to practice?

Reading is good. Writing code under time pressure is better. Jump into a challenge and see where you stand.