Star Schema vs Snowflake Schema (2026)

Star schema is the most commonly tested dimensional modeling pattern in DE interviews, with medallion architecture close behind. Data modeling overall appears in roughly a third of DE loops.

What this guide actually says

Star is the default in 2026 for analytics workloads. Snowflake makes sense in a narrow set of cases: deep, frequently-changing hierarchies; storage-constrained on-prem; strict referential integrity. The interview question is rarely 'which one' — it's 'why this one for this scenario.'

Star schema example

fact_orders (order_id, customer_id, product_id, date_id, amount, quantity)
  dim_customer (customer_id, name, city, state, country, segment)
  dim_product (product_id, name, category, subcategory, brand, price)
  dim_date (date_id, date, month, quarter, year, is_weekend)

Each dimension is one flat table. 'city', 'state', and 'country' are all columns in dim_customer.

Snowflake schema example: same data, normalized hierarchies

fact_orders (order_id, customer_id, product_id, date_id, amount, quantity)
  dim_customer (customer_id, name, city_id, segment)
    dim_city (city_id, city_name, state_id)
      dim_state (state_id, state_name, country_id)
        dim_country (country_id, country_name)
  dim_product (product_id, name, subcategory_id, brand, price)
    dim_subcategory (subcategory_id, subcategory_name, category_id)
      dim_category (category_id, category_name)

Geography split into city, state, country tables. Product category is its own table. More JOINs, less redundancy.

Side-by-side comparison

DimensionStar SchemaSnowflake Schema
NormalizationDenormalized dim tables. Each dim is a single flat table with redundant data.Normalized dim tables. Dims split into sub-tables linked by foreign keys.
Query ComplexitySimpler queries. Fewer JOINs because dims are flat. Analysts can write queries without knowing the full schema.More JOINs required. Querying a single dim may require joining 2-3 tables.
StorageMore storage due to repeated values in dim tables. In practice, dim tables are small relative to fact tables, so the difference is minor.Less storage. Redundancy eliminated through normalization. Matters more when dim tables are very large.
MaintenanceEasier for analytics workloads. Updates to dim attributes require changing one table.More complex. Updates may propagate across multiple normalized tables. Better for enforcing data integrity.
When to useAnalytics, reporting, dashboards. When query speed and simplicity matter more than storage savings. The default for most warehouses.When dim tables are very large, storage cost matters, or you need strict referential integrity. More common in traditional enterprise environments.

How to answer schema design questions in interviews

Data modeling appears in ~30% of DE interviews. Interviewers don't want you to recite definitions; they want you to make a choice and defend it with trade-offs specific to the scenario.

Start with star unless told otherwise

In an interview, default to star. Simpler to explain, faster to query, easier to draw on a whiteboard. Only switch to snowflake if the interviewer gives you a reason (large dimensions, storage constraints, integrity requirements).

Name the trade-off explicitly

Don't just say 'I'd use a star schema.' Say: 'I'd use star because the dimension tables are small and query simplicity matters more than storage here. If the customer dimension grew to hundreds of millions of rows, I'd consider normalizing it.'

Know how to draw both on a whiteboard

Practice sketching a fact with 3-4 dimensions in star form, then normalizing one dimension into snowflake form. The interviewer wants to see that you understand what normalization actually changes.

Connect it to real workloads

Mention that modern columnar warehouses (which compress repeated values well) reduce the storage advantage of snowflake. Shows you think about the actual execution environment, not just textbook rules.

Frequently asked questions

Which is better: star schema or snowflake schema?+
Neither is universally better. Star schemas are simpler to query and perform well for analytics workloads. Snowflake schemas save storage and enforce stricter integrity. For most modern warehouses, star is the default because columnar storage reduces the storage penalty of denormalization.
Can you use both in the same data warehouse?+
Yes. Many warehouses mix both. Keep most dimensions denormalized (star) but normalize a very large dimension like a product catalog with deep hierarchies (snowflake). The schema should match the query patterns, not a single rule.
Do interviewers expect me to know both?+
Yes. Data modeling appears in ~30% of DE interviews, and star schema is the single most tested dimensional modeling pattern. You should explain both, draw them, and articulate when you'd choose one over the other. Simply naming them isn't enough.
How does this relate to normalization (1NF, 2NF, 3NF)?+
Snowflake applies normalization principles (typically 3NF) to dim tables. Star intentionally denormalizes for query performance. Both assume the fact table is already normalized. Understanding normal forms helps you explain why snowflake dimensions have fewer redundant columns.
02 / Why practice

Build schemas, not just read about them

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    The round is won on tradeoffs, not on the diagram

    Grain, star vs snowflake, SCD type, conformed dimensions, late-arriving data. Modeling under live pushback is what separates the bands, and it is the half almost nobody rehearses

Related guides