Most candidates treat "star vs snowflake" as a balanced trade-off question. It isn't. In a columnar warehouse in 2026 there is a default answer and a narrow exception, and the interviewer is watching for how quickly you say so. The candidates who hedge lose the round. The candidates who rattle off storage-cost arguments from a 2003 textbook lose faster.
The real question hiding underneath is whether you've actually written queries against both and measured anything. Almost nobody has.
[Stats panel: modeling questions defaulting to star · onsite modeling rounds · modeling questions analyzed · companies in dataset]
Source: DataDriven analysis of 1,042 verified data engineering interview rounds.
Eight dimensions where the two patterns actually differ. Read carefully: nearly every "advantage" listed for snowflake was true on a row-store database in 2003 and is mostly obsolete on a columnar engine today. Interviewers are listening for whether you know the difference.
Normalization

Star Schema
Denormalized. Each dimension is a single flat table. A product dimension contains category, subcategory, brand, and manufacturer all in one row. Data is duplicated across rows.
Snowflake Schema
Normalized to 2NF or 3NF. Dimensions are split into multiple related tables. Product links to category, which links to subcategory. No redundant data.
Table Count

Star Schema
Fewer tables. A typical star schema has 1 fact table and 4 to 8 dimension tables. Queries involve 2 to 5 JOINs.
Snowflake Schema
More tables. The same schema might have 15 to 30 tables because each dimension hierarchy is decomposed. Queries involve 5 to 15 JOINs.
Query Complexity

Star Schema
Simpler queries. Because dimensions are flat, you JOIN the fact table directly to each dimension with a single key. Analysts and BI tools can write queries without understanding the hierarchy structure.
Snowflake Schema
More complex queries. You often need multi-hop JOINs to traverse dimension hierarchies. A query that needs the category name for a product requires joining product to subcategory to category.
Query Performance

Star Schema
Generally faster for analytical queries. Fewer JOINs means less work for the query optimizer. On columnar warehouses (Snowflake, BigQuery, Redshift), the denormalized approach aligns well with the storage model.
Snowflake Schema
Generally slower for analytical queries due to more JOINs. However, the performance gap narrows on modern columnar engines with optimized JOIN algorithms. For very large dimensions with high cardinality, normalized tables can be faster because they're smaller.
Storage

Star Schema
Higher storage usage on paper. Repeating 'Electronics > Computers > Laptops' across every product row in the dimension duplicates data compared to storing the hierarchy once in normalized form. In practice, columnar engines compress repeated values aggressively, so the overhead is usually small.
Snowflake Schema
Lower storage usage. Each piece of data is stored once. For large dimensions with long text fields, the savings can be meaningful. On modern cloud warehouses, storage is cheap enough that this rarely matters.
Data Integrity

Star Schema
Update anomalies are possible. If a category name changes, every product row carrying that category must be updated; miss any and the dimension becomes internally inconsistent. The risk compounds when the same attribute is repeated across multiple dimension tables.
Snowflake Schema
Stronger data integrity. Each piece of data exists in one place. Updates are atomic. Foreign key constraints enforce referential integrity. This matters most in environments where dimension data changes frequently.
Load Complexity

Star Schema
Simpler to load. You build each dimension table independently and load the fact table with foreign keys. The denormalized structure means fewer transformation steps.
Snowflake Schema
More complex to load. You need to load tables in dependency order (category before subcategory before product). Foreign key constraints add validation steps. More transformation logic in the pipeline.
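The dependency-order point can be made concrete. Below is a minimal sketch using SQLite as a stand-in for a warehouse engine (table names follow this article's example schema; a real pipeline would use your warehouse's loader, not raw INSERTs): with foreign keys enforced, loading a child table before its parent fails, which is exactly why snowflake pipelines must sequence their loads.

```python
import sqlite3

# Hypothetical snowflake dimension tables matching the article's example.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled

con.executescript("""
CREATE TABLE dim_category    (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_subcategory (subcategory_key INTEGER PRIMARY KEY, subcategory_name TEXT,
                              category_key INTEGER REFERENCES dim_category);
CREATE TABLE dim_product     (product_key INTEGER PRIMARY KEY, product_name TEXT,
                              subcategory_key INTEGER REFERENCES dim_subcategory);
""")

# Loading out of dependency order violates the FK: subcategory 10 doesn't exist yet.
try:
    con.execute("INSERT INTO dim_product VALUES (1, 'ThinkPad X1', 10)")
except sqlite3.IntegrityError as e:
    print("out-of-order load failed:", e)

# Dependency order succeeds: category -> subcategory -> product.
con.execute("INSERT INTO dim_category VALUES (100, 'Electronics')")
con.execute("INSERT INTO dim_subcategory VALUES (10, 'Laptops', 100)")
con.execute("INSERT INTO dim_product VALUES (1, 'ThinkPad X1', 10)")
print("dependency-order load succeeded")
```

A star schema sidesteps this entirely: each flat dimension loads independently, and only the fact load depends on the dimensions.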
BI Tool Compatibility

Star Schema
Excellent. BI tools like Looker, Tableau, Power BI, and Metabase are designed for star schemas. Auto-generated SQL assumes flat dimensions. Star schemas 'just work' with most BI tools.
Snowflake Schema
Requires more configuration. BI tools can work with snowflake schemas, but you often need to define explicit relationships or create views that flatten the hierarchy back into a star-like structure.
An e-commerce data warehouse modeled both ways.
Star schema:

fact_orders
  order_id
  customer_key --> dim_customer
  product_key  --> dim_product
  date_key     --> dim_date
  store_key    --> dim_store
  quantity
  revenue

dim_product (flat)
  product_key
  product_name
  category_name      -- duplicated
  subcategory_name   -- duplicated
  brand_name         -- duplicated
  manufacturer_name  -- duplicated
Snowflake schema:

fact_orders
  order_id
  customer_key --> dim_customer
  product_key  --> dim_product
  date_key     --> dim_date
  store_key    --> dim_store
  quantity
  revenue

dim_product (normalized)
  product_key
  product_name
  subcategory_key --> dim_subcategory

dim_subcategory
  subcategory_key
  subcategory_name
  category_key --> dim_category

dim_category
  category_key
  category_name
Notice how the star schema has one dim_product table with all hierarchy levels flattened in. The snowflake schema splits the product hierarchy into 3 tables. To get the category name for a product in the snowflake schema, you need to JOIN product to subcategory to category.
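To see the join difference in runnable form, here is a sketch of both models in SQLite (a stand-in for the columnar warehouses discussed above; table and column names follow the diagrams, and the sample rows are invented for illustration). Revenue by category takes one JOIN against the flat dimension and three against the normalized hierarchy, yet returns the same answer:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Star: flat product dimension
CREATE TABLE dim_product_flat (product_key INTEGER PRIMARY KEY, product_name TEXT,
                               category_name TEXT, subcategory_name TEXT);
-- Snowflake: normalized hierarchy
CREATE TABLE dim_category    (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_subcategory (subcategory_key INTEGER PRIMARY KEY, subcategory_name TEXT,
                              category_key INTEGER);
CREATE TABLE dim_product     (product_key INTEGER PRIMARY KEY, product_name TEXT,
                              subcategory_key INTEGER);
CREATE TABLE fact_orders     (order_id INTEGER, product_key INTEGER, quantity INTEGER,
                              revenue REAL);
""")
con.execute("INSERT INTO dim_product_flat VALUES (1, 'ThinkPad X1', 'Electronics', 'Laptops')")
con.execute("INSERT INTO dim_category VALUES (100, 'Electronics')")
con.execute("INSERT INTO dim_subcategory VALUES (10, 'Laptops', 100)")
con.execute("INSERT INTO dim_product VALUES (1, 'ThinkPad X1', 10)")
con.execute("INSERT INTO fact_orders VALUES (1, 1, 2, 3000.0)")

# Star: revenue by category is a single JOIN.
star = con.execute("""
    SELECT p.category_name, SUM(f.revenue)
    FROM fact_orders f JOIN dim_product_flat p USING (product_key)
    GROUP BY p.category_name
""").fetchall()

# Snowflake: the same question walks the hierarchy with three JOINs.
snow = con.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_orders f
    JOIN dim_product p USING (product_key)
    JOIN dim_subcategory s USING (subcategory_key)
    JOIN dim_category c USING (category_key)
    GROUP BY c.category_name
""").fetchall()

print(star)  # [('Electronics', 3000.0)]
print(snow)  # [('Electronics', 3000.0)] -- same result, more JOINs
```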
This question appears in nearly every DE system design round. Here's how to frame your answer.
In most DE interviews, the expected answer is: 'I'd use a star schema for the analytical layer because it's simpler to query, performs better for aggregations, and works well with BI tools. I'd consider a snowflake schema if the dimension hierarchies are deep and change frequently, or if storage constraints require normalization.' This shows you understand both models and can make a pragmatic choice. Don't dogmatically say one is always better.
The question usually comes in system design rounds: 'Design the data model for [use case].' You choose star or snowflake based on the requirements. The interviewer probes your reasoning: 'Why not normalize those dimensions?' or 'What happens when the category hierarchy changes?' Having specific reasons for your choice (query patterns, consumer types, update frequency) is what separates a strong answer from a textbook recitation.
In practice, most modern data warehouses use a hybrid approach. The core fact and dimension tables follow a star schema for simplicity, but some dimensions (like geographic hierarchies or product catalogs) are partially normalized where it makes sense. Mentioning this hybrid reality and explaining when you'd normalize vs. denormalize shows production experience. An answer like 'I'd start with a star schema and only normalize dimensions where the update frequency or hierarchy depth justifies the added complexity' is stronger than picking one model absolutely.
Be prepared for: 'How would you handle slowly changing dimensions in this model?' (SCD types 1, 2, and 3), 'How would you partition the fact table?' (by date, usually), 'What if a new dimension needs to be added?' (add a new dimension table and a foreign key column to the fact table). These questions test whether you've actually built data models in production, not just studied the theory.
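For the SCD follow-up, it helps to have a Type 2 sketch ready. The one below uses SQLite and illustrative column conventions (valid_from, valid_to, is_current are common names, not a standard): instead of overwriting a changed attribute, you close the current row and insert a new version, so old fact rows keep pointing at the surrogate key that was current when they loaded.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE dim_product (
    product_sk INTEGER PRIMARY KEY,  -- surrogate key, one per version
    product_id TEXT,                 -- natural/business key
    category   TEXT,
    valid_from TEXT,
    valid_to   TEXT,                 -- NULL = still open
    is_current INTEGER
)""")
con.execute("INSERT INTO dim_product VALUES (1, 'P-42', 'Computers', '2024-01-01', NULL, 1)")

def scd2_update(con, product_id, new_category, as_of):
    # Close the current version...
    con.execute("""UPDATE dim_product SET valid_to = ?, is_current = 0
                   WHERE product_id = ? AND is_current = 1""", (as_of, product_id))
    # ...and open a new one with a fresh surrogate key. History is preserved:
    # existing fact rows still reference the old product_sk.
    con.execute("""INSERT INTO dim_product
                   (product_id, category, valid_from, valid_to, is_current)
                   VALUES (?, ?, ?, NULL, 1)""", (product_id, new_category, as_of))

scd2_update(con, 'P-42', 'Laptops', '2026-01-15')
for row in con.execute(
        "SELECT product_sk, category, valid_from, valid_to, is_current FROM dim_product"):
    print(row)
```

Type 1 would be a plain UPDATE (overwrite, no history); Type 3 adds a previous_category column (one level of history). Being able to state that contrast in a sentence is usually all the round requires.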
Practice the schema prompts on DataDriven until you can commit to a default without hedging and name the one edge case that changes the answer. That's the whole interview.
Start Practicing