Interview Prep by Round
The most common SQL questions from real data engineering interviews
Star schema, normalization, SCD, and ER modeling questions
Pipeline architecture and distributed systems design rounds
Python coding and data manipulation questions for DE interviews
End-to-end guide covering every round of the DE interview loop
SQL Practice by Topic
ROW_NUMBER, RANK, LAG, LEAD, and running totals
INNER, LEFT, RIGHT, FULL OUTER, and CROSS joins
Common Table Expressions and recursive queries
Aggregation, HAVING, and grouping sets
Correlated and non-correlated subqueries
Conditional logic, CASE expressions, and pivot patterns
Conditional logic and pivoting with CASE expressions
AND/OR logic, nested CASE, and interview patterns
Syntax, examples, and interview patterns
Row-to-column transformations and dynamic pivoting
NULL handling with COALESCE and IFNULL patterns
Syntax, use cases, and interview questions
Hierarchical queries and row-pair comparisons
Deduplication and distinct counting techniques
Multi-column DISTINCT and performance
IS NULL, COALESCE, NULLIF, and three-valued logic
Quick-reference syntax guide for interviews
Window functions, CTEs, MERGE, and more
Every window function with examples
Syntax, ties, and PARTITION BY
Correlated subqueries in FROM clauses
Syntax, NULL handling, and reconciliation
Cartesian products and use cases
UNION vs UNION ALL vs JOIN
ASC, DESC, and NULLS FIRST/LAST
Multi-level aggregation in one query
Composite groups and interview patterns
Hierarchies, date series, and graph traversal
The WITH clause explained
CONCAT, SUBSTRING, TRIM, REPLACE, and more
DATE_TRUNC, DATEDIFF, EXTRACT, INTERVAL
Differences from PostgreSQL
What data engineers need to know
Practice questions and exercises
Data Modeling Deep Dives
Fact tables, dimension tables, and grain selection
Key differences explained
SCD Type 1, 2, and 3 with implementation patterns
Effective-dated history rows with row-version flags
Dimensional denormalized design for analytics
Normalized dimensional model with hierarchies
Bronze, silver, and gold layers in a lakehouse
Hubs, links, and satellites for agile data warehousing
Integration patterns for data engineers
3NF, primary/foreign keys, and OLTP design
Conceptual-to-physical modeling progression
Trade-offs between denormalized and normalized designs
Analytical vs transactional system design choices
Pipeline & Architecture
Patterns for batch, streaming, and hybrid pipelines
Building pipelines that are safe to re-run
Running historical partitions without breaking production
When to transform before or after loading
Choosing the right processing model for your use case
Common pipeline design questions from real interviews
Storage layer trade-offs and when to use each
Tool-Specific Questions
Tutorials, interview questions, and practice for every DE tool
DataFrame API, RDDs, and Spark optimization questions
Distributed computing and Spark internals
Topics, partitions, consumer groups, and exactly-once
DAGs, operators, scheduling, and orchestration patterns
DAG patterns, dependencies, XCom, and operators
Models, tests, snapshots, and incremental strategies
Beginner to interview-ready dbt
Architecture, warehouses, and Snowflake-specific features
Unity Catalog, Delta Lake, and Databricks workflows
Performance and NOT IN pitfalls
From SparkSession to write, with real-world context
Problems by category with real execution
By difficulty with runnable code
Broadcast, anti, and multi-column joins
Shuffle cost, skew, and aggregation patterns
Broadcast, shuffle, skew handling, and physical plans
Split-apply-combine, as_index pitfalls, and agg vs apply
dropDuplicates vs window dedup
The functions you reach for in interviews
Advanced Spark: Catalyst, physical plans, tuning
AI 4-phase simulation with grading
Function reference for data engineers
Syntax, plan, and use cases
Python API for Apache Spark
Pandas quick-reference for data engineers
Company Interview Guides
Data engineering interview process and focus areas
Full Meta DE interview loop and prep timeline
Meta DE compensation by level (E3 to E7)
Leadership principles and DE technical rounds
Amazon DE compensation by level (L4 to L7)
System design and coding round expectations
Google DE interview questions and process
Culture fit and data platform design rounds
Netflix DE compensation (all-cash)
On-site interview structure and question types
Loop interview format and preparation tips
Data infrastructure and pipeline design focus
Real-time data and large-scale system design
Uber DE interview questions and process
Data platform and ML infrastructure questions
Metrics, experimentation, and data modeling rounds
Spark internals and lakehouse architecture questions
Cloud data platform and SQL optimization rounds
Large-scale data processing and system design
Career Resources
Compensation benchmarks by level and location
L5/L6 compensation guide
How to write a data engineering resume that gets callbacks
Examples by level: junior, mid, senior
Project ideas, GitHub structure, and hiring signals
Skills to learn and the order to learn them
Junior to Staff and beyond
Where to find them and how to stand out
What hiring managers actually look for
Career path and job description
Transition guide from data analyst to data engineer
Step-by-step career path into data engineering
Role differences, skills, and career trajectories
Where the roles overlap and where they diverge
Comparing day-to-day work and technical focus
Structured prep schedule for your interview timeline
Honest comparison with self-study
STAR method answers for data engineering interviews
With DE role crossover
For data engineers
What's missing for data engineers
Gaps and alternatives
Complete index of data engineering concepts
Orchestration, transformation, storage, streaming
Benefits, risks, and implementation patterns
Which is better for your data stack
Certifications
Which DE certs actually help with interviews
Data Engineer Associate study guide
Professional Data Engineer study guide
Data Engineer Associate study guide
DP-203, DP-900, DP-700 guide
Certified Data Engineer Associate
Advanced Databricks cert guide
SnowPro Core + Advanced DE
Platform Comparisons
Why generic algorithm prep falls short for DE interviews
How DataDriven compares for data engineering prep
Feature and content comparison for DE candidates
Differences in approach and question coverage
DE-focused prep vs general coding challenges