SQL Reference

SQL CROSS JOIN

Hot-take from the post-mortem channel. Last year a staff engineer at a fintech watched a nightly cost report query grow from 4 minutes to 9 hours over a weekend. The cause: a missing ON clause. Two tables with a million rows each had been quietly Cartesian-producting since Friday, generating a trillion-row intermediate set on Snowflake. One unresolved JOIN turned into a five-figure compute bill.

CROSS JOIN shows up in about 5% of verified interview rounds, and roughly half the time it's a trap: the interviewer wants to see if you flinch. The other half is the date-spine pattern, which is genuinely useful and the reason CROSS JOIN earned its place in the spec. Learn both.

5%

SQL rounds use it

~50%

Of those are traps

1T

Rows from 1M x 1M

0

ON clauses needed

Source: DataDriven analysis of 1,042 verified data engineering interview rounds.

CROSS JOIN Syntax

CROSS JOIN is the simplest join syntax. There is no ON clause because every row pairs with every other row. The two syntaxes below are equivalent.

-- Explicit CROSS JOIN (preferred)
SELECT a.color, b.size
FROM colors a
CROSS JOIN sizes b;

-- Implicit CROSS JOIN (comma syntax)
SELECT a.color, b.size
FROM colors a, sizes b;

-- Both produce:
-- | color | size  |
-- |-------|-------|
-- | red   | S     |
-- | red   | M     |
-- | red   | L     |
-- | blue  | S     |
-- | blue  | M     |
-- | blue  | L     |

Three by three is fine. A million by a million is a trillion rows and a Snowflake bill that'll wake up a VP. The math isn't the part that catches people. The part that catches people is that the query runs, slowly at first, then not at all, and nobody notices until the morning report fails to land. We've seen it three times. Use CROSS JOIN on purpose or not at all.

Always use the explicit syntax. Writing CROSS JOIN tells every reader “yes, I intended the Cartesian product.” The comma syntax looks like a regular FROM clause with a missing WHERE, which makes code reviewers wonder if you forgot the join condition.

Use Case 1: Date Spines

The most common CROSS JOIN in production data engineering is the date spine. You CROSS JOIN a calendar table with a dimension (like stores, products, or regions) to create a row for every date for every entity. Then you LEFT JOIN your fact table onto this spine to fill in actual values, with zeros or NULLs for dates with no activity.

-- Generate a date spine: one row per date per store
WITH date_spine AS (
  SELECT generate_series(
    '2024-01-01'::date,
    '2024-12-31'::date,
    '1 day'::interval
  )::date AS date
),
stores AS (
  SELECT DISTINCT store_id, store_name
  FROM sales
)
SELECT
  d.date,
  s.store_id,
  s.store_name,
  COALESCE(f.daily_revenue, 0) AS daily_revenue,
  COALESCE(f.order_count, 0) AS order_count
FROM date_spine d
CROSS JOIN stores s
LEFT JOIN (
  SELECT
    sale_date,
    store_id,
    SUM(revenue) AS daily_revenue,
    COUNT(*) AS order_count
  FROM sales
  GROUP BY sale_date, store_id
) f ON d.date = f.sale_date AND s.store_id = f.store_id
ORDER BY d.date, s.store_id;

Without the CROSS JOIN, dates with no sales for a given store would be missing from the result. Dashboards would show gaps in the timeline. The date spine ensures every date appears for every store, and LEFT JOIN fills in zeros for inactive days.

Interview note: Date spine generation is one of the most-tested patterns in data engineering interviews. If an interviewer asks you to “fill in missing dates” or “show all days including days with zero activity,” they want a CROSS JOIN with a calendar table followed by a LEFT JOIN to the fact table.

Use Case 2: Generating Combinations

CROSS JOIN creates all possible pairs or combinations from two sets. This is useful for product variant generation, test matrix creation, and comparison analysis.

-- Generate all product variant combinations
SELECT
  c.color_name,
  s.size_label,
  CONCAT(p.product_name, ' - ', c.color_name, ' - ', s.size_label)
    AS variant_name
FROM products p
CROSS JOIN colors c
CROSS JOIN sizes s
WHERE p.category = 'shirts'
ORDER BY p.product_name, c.color_name, s.size_label;
-- Compare every pair of employees (for peer review assignment)
SELECT
  e1.employee_id AS reviewer_id,
  e1.name AS reviewer,
  e2.employee_id AS reviewee_id,
  e2.name AS reviewee
FROM employees e1
CROSS JOIN employees e2
WHERE e1.employee_id != e2.employee_id  -- exclude self-pairs
  AND e1.department = e2.department      -- same department only
ORDER BY e1.name, e2.name;

Performance Dangers

CROSS JOIN is the easiest way to accidentally create a query that brings down a production database. The output grows multiplicatively, and most people underestimate how fast that gets out of hand.

Left TableRight TableResult Rows
10010010,000
1,0001,0001,000,000
10,00010,000100,000,000
100,000100,00010,000,000,000

Accidental CROSS JOIN

The most common way to accidentally create a CROSS JOIN is joining two tables on a non-unique column. If both sides have duplicates on the join key, each duplicate pair creates a row, producing a partial Cartesian product.

-- BUG: date is not unique in either table
-- 50 sales rows and 30 expense rows on the same date
-- = 1,500 rows for that one date
SELECT s.*, e.*
FROM daily_sales s
JOIN daily_expenses e ON s.date = e.date;

-- FIX: aggregate first, then join on the unique grain
WITH agg_sales AS (
  SELECT date, SUM(amount) AS total_sales
  FROM daily_sales GROUP BY date
),
agg_expenses AS (
  SELECT date, SUM(amount) AS total_expenses
  FROM daily_expenses GROUP BY date
)
SELECT s.date, s.total_sales, e.total_expenses
FROM agg_sales s
JOIN agg_expenses e ON s.date = e.date;

Safety Guardrails

Check table sizes first. Before running a CROSS JOIN, check the row counts of both tables. If either exceeds a few thousand rows, think carefully about whether you need the full Cartesian product.

Add a LIMIT during development. When testing a CROSS JOIN query, add LIMIT 100 to avoid accidentally producing billions of rows.

Filter early. If you only need combinations for a specific subset, filter the input tables in CTEs before the CROSS JOIN.

3 CROSS JOIN Interview Questions

CROSS JOIN questions are less common than INNER or LEFT JOIN questions, but when they appear, candidates who know the pattern stand out.

Q1: Write a query that shows daily revenue for every store, including days with zero sales.

What they test:

Date spine pattern. Can you generate a complete time series and fill gaps with zeros?

Approach:

CROSS JOIN a date series with a distinct list of stores to create the spine. LEFT JOIN the sales fact table onto the spine. COALESCE the revenue column to 0 for missing days.

Q2: Generate all possible pairs of products that could be sold together as a bundle. Exclude pairing a product with itself.

What they test:

Combination generation with self-exclusion. Can you use CROSS JOIN and filter out self-pairs and duplicate pairs?

Approach:

CROSS JOIN the products table with itself. Filter WHERE p1.product_id < p2.product_id to eliminate self-pairs and ensure each combination appears only once (not both A-B and B-A).

Q3: Your query joining two tables returns 10x more rows than expected. What happened?

What they test:

Debugging awareness of accidental Cartesian products. This is a verbal question, not a coding question. They want you to identify the cause and describe the fix.

Approach:

The join column is not unique on one or both sides, creating a partial Cartesian product. Check the cardinality of the join key in both tables. Fix by either pre-aggregating to make the key unique or adding extra columns to the join condition to make the match one-to-one.

SQL CROSS JOIN FAQ

What does CROSS JOIN do in SQL?+
CROSS JOIN produces the Cartesian product of two tables. Every row in the left table is paired with every row in the right table. If the left table has 100 rows and the right table has 50 rows, the result has 5,000 rows. There is no ON clause because there is no matching condition. CROSS JOIN pairs every row with every other row unconditionally.
When would you actually use a CROSS JOIN?+
The most common production use case is generating date spines: CROSS JOIN a calendar table with a dimension table (like a list of products or stores) to create one row per date per entity. This is the starting point for filling gaps in time series data. Other uses include generating all possible combinations for A/B test variant assignment, creating matrices for reporting grids, and generating test data. Outside of these intentional uses, CROSS JOINs in production code usually signal a bug.
Is CROSS JOIN the same as a comma join?+
Yes. Writing FROM table_a, table_b without a WHERE clause is equivalent to FROM table_a CROSS JOIN table_b. Both produce a Cartesian product. The explicit CROSS JOIN syntax is preferred because it makes the intent clear. A comma join without a WHERE clause might be an accidental CROSS JOIN (a forgotten join condition), while an explicit CROSS JOIN tells the reader you wanted the Cartesian product on purpose.

Spot the trap before the cluster bill arrives

Practice a handful of fan-out problems and you'll recognize the shape on sight. The interviewers you'll meet next month can tell the difference between candidates who have seen one melt down and candidates who haven't.