Fact Table vs Dimension Table: Key Differences Explained

If you would aggregate the column, it is a fact. If you would filter or group by it, it is a dimension attribute. That is the whole game. Understand grain, measures, and surrogate keys, and you will answer this class of question cleanly on every modeling round.

Fact Tables vs Dimension Tables

The rule to internalize before any modeling round: if you would aggregate the column, it is a fact. If you would filter or group by it, it is a dimension attribute. That is the whole game. You will be asked some version of this on almost every modeling round, and the candidates who hesitate on simple examples lose ground they never get back.

Fact tables store measurable events at a stated grain. Dimension tables store descriptive context around those events. Facts are tall and narrow, growing with every transaction. Dimensions are short and wide, changing slowly over time.

Prepare for the interview
01 / Open invite
02min.

Know Fact vs Dimension the way the interviewer who asks it knows it.

a Fact vs Dimension query, the same shape a screen would give you.
The diff against expected. Where ties broke. What you missed.
sandbox
1fact_orders
2 order_id bigint PK
3 customer_sk bigint FK
4 order_date date SCD2
5
Execute your solution0.4s avg.
AirbnbInterview question
Solve a Fact vs Dimension problem

Fact Table vs Dimension Table at a Glance

PropertyFact TableDimension Table
PurposeStores measurable events (transactions, clicks, shipments)Stores descriptive context (who, what, where, when)
Column TypesForeign keys + numeric measures (revenue, quantity, duration)Surrogate key + text/categorical attributes (name, city, category)
Row CountVery high. Millions to billions of rows. Grows continuously.Low to moderate. Thousands to millions. Grows slowly.
ShapeTall and narrow (few columns, many rows)Short and wide (many columns, fewer rows)
Update FrequencyAppend-only. New rows added as events occur. Rarely updated.Updated when attributes change (SCD Type 1, 2, or 3).
Primary KeyComposite key (combination of dimension foreign keys) or row-level surrogateSingle surrogate key (system-generated integer)
AggregationMeasures are summed, averaged, or counted in queriesAttributes are used for GROUP BY, WHERE, and filter clauses

Fact Table Examples

fact_orders

Grain: One row per order line item Measures: quantity, unit_price, discount_amount, line_total, tax_amount Foreign keys: date_key, customer_key, product_key, store_key, promotion_key Volume: 50M rows/month for a mid-size retailer

fact_page_views

Grain: One row per page view event Measures: time_on_page_seconds, scroll_depth_pct Foreign keys: date_key, user_key, page_key, device_key, referrer_key Volume: 500M rows/month for a SaaS product with 100K daily active users

fact_support_tickets

Grain: One row per support ticket Measures: resolution_time_hours, customer_satisfaction_score, num_replies Foreign keys: created_date_key, resolved_date_key, customer_key, agent_key, category_key Volume: 200K rows/month for a B2B company with 5,000 customers

Dimension Table Examples

dim_customer

Attributes: customer_name, email, city, state, country, signup_date, acquisition_channel, customer_tier, lifetime_order_count Row count: 2.5 million rows for a consumer e-commerce company Update pattern: SCD Type 2 on city and customer_tier. Type 1 on email.

dim_product

Attributes: product_name, sku, category, subcategory, brand, color, size, weight_kg, price_tier, is_active Row count: 150,000 rows for a retailer with a large catalog Update pattern: SCD Type 1 on price_tier and is_active. Type 2 on category if reclassifications need historical tracking.

dim_date

Attributes: full_date, day_of_week, month_name, quarter, year, fiscal_year, is_holiday, is_weekend, week_of_year Row count: 36,525 rows (100 years of dates, preloaded) Update pattern: Static. Loaded once and never modified. The only dimension that does not change.

5 Interview Questions on Fact vs Dimension Tables

What is the difference between a fact table and a dimension table?

Fact tables store measurements: numeric values you aggregate (sum, average, count). Dimension tables store descriptive context: text and categorical values you filter and group by. Facts answer 'how much' and 'how many.' Dimensions answer 'who, what, where, when.' Facts are tall and narrow, growing with every event. Dimensions are short and wide, changing slowly. In a star schema, the fact table sits at the center with foreign keys pointing outward to dimensions.

Can a column be both a fact and a dimension attribute?

Yes, depending on context. Price is the classic example. unit_price at the time of a sale is a fact measure: it applies to that specific transaction and you might average it across transactions. standard_list_price is a dimension attribute on the product table: it describes the product and you would filter by it ('show me products priced above $100'). The same real-world concept maps to different table placements based on whether it is a measurement captured at event time or a descriptor that exists independent of any event.

Why do fact tables use surrogate keys from dimension tables instead of natural keys?

Three reasons. First, surrogate keys support slowly changing dimensions: when a customer moves cities, a new dimension row with a new surrogate key preserves the old address for historical facts. Second, surrogate keys insulate the warehouse from source system changes: if the CRM changes its customer ID format, only the ETL mapping changes, not the fact table. Third, integer surrogate keys are smaller and faster to join on than string natural keys like email addresses or SKU codes. A join on two 4-byte integers is measurably faster than a join on two varchar(255) columns across billions of fact rows.

How do you decide whether a new attribute goes in the fact or dimension table?

Apply the aggregation test. If the value would be meaningfully summed, averaged, or counted across rows, it belongs in the fact table. If the value would be used to filter rows or group results, it belongs in a dimension table. Quantity is always a fact (you sum it). customer_name is always a dimension (you group by it). Some values are ambiguous: employee_count at a company could be a periodic snapshot measure (fact) or a descriptor of the company (dimension). When ambiguous, ask what the analyst will do with it. That determines placement.

What is a factless fact table?

A factless fact table has no numeric measures. It records events that either happened or did not happen. The classic example is student attendance: each row represents a student attending a class on a date. The foreign keys are student_key, class_key, and date_key. There are no measures to sum or average. The value comes from counting rows and from identifying the absence of expected rows (which students were enrolled but did not attend?). Factless fact tables test whether candidates understand that the 'fact' in a fact table refers to the event being recorded, not necessarily to numeric columns.

Frequently Asked Questions

What is a fact table in simple terms?+
A fact table records measurable business events. Think of it as a log of things that happened: sales, clicks, shipments, logins. Each row is one event. The table contains numeric measures (how much revenue, how many items, how long the session lasted) and foreign keys pointing to dimension tables that describe the context of each event. Fact tables are the largest tables in a data warehouse, often containing billions of rows.
What is a dimension table in simple terms?+
A dimension table describes the context around business events. It answers who was involved, what product was purchased, where the transaction happened, and when it occurred. Dimension tables contain descriptive text columns like customer_name, product_category, and store_city. They are much smaller than fact tables and change infrequently. Each dimension table has a surrogate key that the fact table references through a foreign key.
How are fact and dimension tables connected?+
They connect through foreign keys. The fact table contains a foreign key column for each dimension it references. For example, a fact_orders table has customer_key, product_key, date_key, and store_key columns. Each of these keys matches the primary key (surrogate key) of the corresponding dimension table. This structure creates the star schema pattern: the fact table at the center with dimension tables around it.
Can a table be both a fact table and a dimension table?+
In rare cases, yes. An accumulating snapshot fact table that tracks milestones (like order fulfillment with order_date, ship_date, delivery_date) can serve as a dimension for another fact table. For example, a fact_returns table might reference the fact_order_fulfillment table to get the original order context. But this is uncommon. In standard star schema design, fact tables and dimension tables serve distinct roles and are designed separately.
02 / Why practice

Say the Grain Out Loud. Then Ship the Design.

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    Five problem shapes cover 80% of data engineer loops

    Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Related Guides