You'll reach for DATE_TRUNC more times in your first month on the job than any other date function. Cohort retention, weekly active users, monthly revenue rollups: all of them are one DATE_TRUNC call away. This page is the reference you'll keep open in a second tab. It covers Postgres, Snowflake, BigQuery, and MySQL syntax side by side so you don't have to hunt for the right flavor.
[Stats panel: DATE_DIFF interview frequency · SQL dialects shown · date-focused problems · rounds analyzed. Source: DataDriven analysis of 1,042 verified data engineering interview rounds.]
DATE_TRUNC rounds a timestamp down to the specified precision. “Truncating to month” turns 2024-03-15 14:30:00 into 2024-03-01 00:00:00. This is the single most-used date function in analytical SQL because it powers every “group by month” and “group by week” query.
-- DATE_TRUNC(precision, timestamp)
-- PostgreSQL, Snowflake, Redshift
SELECT DATE_TRUNC('month', '2024-03-15 14:30:00'::timestamp);
-- 2024-03-01 00:00:00
SELECT DATE_TRUNC('week', '2024-03-15'::date);
-- 2024-03-11 (Monday of that week in PostgreSQL)
SELECT DATE_TRUNC('quarter', '2024-08-20'::date);
-- 2024-07-01
SELECT DATE_TRUNC('year', '2024-08-20'::date);
-- 2024-01-01
-- BigQuery uses DATE_TRUNC(date, part)
-- SELECT DATE_TRUNC(DATE '2024-03-15', MONTH);
-- MySQL has no DATE_TRUNC; use DATE_FORMAT
-- SELECT DATE_FORMAT('2024-03-15', '%Y-%m-01');

-- Most common pattern: group by month
SELECT
  DATE_TRUNC('month', order_date) AS month,
  COUNT(*) AS order_count,
  SUM(revenue) AS total_revenue
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

Interview note: DATE_TRUNC appears in nearly every time-series interview question. If the question says “by month” or “by week,” reach for DATE_TRUNC first. It is cleaner than EXTRACT(MONTH FROM ...) for grouping because it preserves the year (January 2024 vs January 2025).
EXTRACT pulls a single numeric component from a date or timestamp. Use it when you need just the year, month, day, hour, or day of week as a number.
-- EXTRACT(field FROM timestamp)
SELECT EXTRACT(YEAR FROM '2024-03-15'::date); -- 2024
SELECT EXTRACT(MONTH FROM '2024-03-15'::date); -- 3
SELECT EXTRACT(DAY FROM '2024-03-15'::date); -- 15
SELECT EXTRACT(HOUR FROM '2024-03-15 14:30:00'::timestamp); -- 14
SELECT EXTRACT(DOW FROM '2024-03-15'::date); -- 5 (Friday, 0=Sunday)
SELECT EXTRACT(ISODOW FROM '2024-03-15'::date); -- 5 (Friday, 1=Monday)
SELECT EXTRACT(EPOCH FROM '2024-03-15'::date); -- Unix timestamp
-- Use case: find peak hours
SELECT
  EXTRACT(HOUR FROM event_time) AS hour_of_day,
  COUNT(*) AS event_count
FROM events
GROUP BY 1
ORDER BY 2 DESC;
-- Use case: day-of-week analysis
SELECT
  EXTRACT(ISODOW FROM order_date) AS day_of_week,
  CASE EXTRACT(ISODOW FROM order_date)
    WHEN 1 THEN 'Monday'
    WHEN 2 THEN 'Tuesday'
    WHEN 3 THEN 'Wednesday'
    WHEN 4 THEN 'Thursday'
    WHEN 5 THEN 'Friday'
    WHEN 6 THEN 'Saturday'
    WHEN 7 THEN 'Sunday'
  END AS day_name,
  COUNT(*) AS order_count
FROM orders
GROUP BY 1, 2  -- day_name must also be grouped, or PostgreSQL rejects the query
ORDER BY 1;

Common mistake: Do not use EXTRACT(MONTH ...) for grouping time series. EXTRACT gives you just the month number (1-12), so January 2024 and January 2025 both become 1 and get grouped together. Use DATE_TRUNC for time-series grouping. EXTRACT is for cross-year analysis like “which month of the year has the most sales.”
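The collision is easy to see in miniature. Here is an illustrative Python sketch using SQLite (which has neither EXTRACT nor DATE_TRUNC, so strftime stands in for both); the table and data are made up:

```python
import sqlite3

# In-memory SQLite stand-in for the warehouse (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [("2024-01-05",), ("2024-01-20",), ("2025-01-11",)],
)

# EXTRACT(MONTH ...)-style grouping: both Januaries collapse into '01'.
by_month_num = conn.execute(
    "SELECT strftime('%m', order_date), COUNT(*) FROM orders GROUP BY 1 ORDER BY 1"
).fetchall()
print(by_month_num)  # [('01', 3)] -- 2024 and 2025 merged together

# DATE_TRUNC('month', ...)-style grouping: the year survives.
by_month = conn.execute(
    "SELECT strftime('%Y-%m-01', order_date), COUNT(*) FROM orders GROUP BY 1 ORDER BY 1"
).fetchall()
print(by_month)  # [('2024-01-01', 2), ('2025-01-01', 1)]
```

The same three rows produce one bucket under month-number grouping and two under truncation.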
Calculating the difference between two dates is one of the most common operations in data pipelines. The syntax varies more across databases than any other date function.
-- PostgreSQL: subtract dates directly
SELECT '2024-03-15'::date - '2024-01-01'::date; -- 74 (integer days)
-- PostgreSQL: subtract timestamps (returns interval)
SELECT '2024-03-15 14:00'::timestamp - '2024-03-15 10:30'::timestamp;
-- '03:30:00'
-- Snowflake: DATEDIFF(unit, start, end)
SELECT DATEDIFF('day', '2024-01-01', '2024-03-15'); -- 74
SELECT DATEDIFF('month', '2024-01-01', '2024-03-15'); -- 2
SELECT DATEDIFF('hour', '2024-03-15 10:00', '2024-03-15 14:30'); -- 4
-- BigQuery: DATE_DIFF(end, start, unit)
-- SELECT DATE_DIFF('2024-03-15', '2024-01-01', DAY); -- 74
-- MySQL: DATEDIFF(end, start) returns days only
-- SELECT DATEDIFF('2024-03-15', '2024-01-01'); -- 74
-- For other units: TIMESTAMPDIFF(MONTH, start, end)

-- Add/subtract time using INTERVAL
SELECT CURRENT_DATE + INTERVAL '7 days'; -- 7 days from now
SELECT CURRENT_DATE - INTERVAL '1 month'; -- 1 month ago
SELECT CURRENT_TIMESTAMP + INTERVAL '3 hours'; -- 3 hours from now
-- Common pattern: filter last N days
SELECT *
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days';
-- Retention calculation: days between signup and first purchase
SELECT
  u.user_id,
  u.signup_date,
  MIN(o.order_date) AS first_order_date,
  MIN(o.order_date) - u.signup_date AS days_to_first_order
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.signup_date;

Interview pattern: Retention and cohort analysis questions always involve date arithmetic. “What percentage of users who signed up in January made a purchase within 7 days?” requires subtracting signup_date from order_date and checking if the result is <= 7.
These functions return the current date or timestamp at query execution time. They are used for filtering recent data, calculating ages, and stamping records.
-- Current date and time (no parentheses in standard SQL)
SELECT CURRENT_DATE; -- 2024-03-15
SELECT CURRENT_TIMESTAMP; -- 2024-03-15 14:30:00+00
-- Snowflake also supports:
-- SELECT CURRENT_DATE();
-- SELECT CURRENT_TIMESTAMP();
-- SELECT GETDATE(); -- SQL Server
-- Common patterns
-- Orders from today
SELECT * FROM orders WHERE order_date = CURRENT_DATE;
-- Orders from the last 24 hours
SELECT * FROM orders
WHERE created_at >= CURRENT_TIMESTAMP - INTERVAL '24 hours';
-- Account age in days
SELECT
  user_id,
  signup_date,
  CURRENT_DATE - signup_date AS account_age_days
FROM users;

-- Stamp an ETL load
INSERT INTO load_log (table_name, loaded_at, row_count)
VALUES ('orders', CURRENT_TIMESTAMP, 1523);

Timezones are one of the hardest problems in data engineering. The rules change between databases, daylight saving time creates ambiguity, and a single wrong assumption can shift every metric by a day. Here is the practical approach.
-- PostgreSQL: AT TIME ZONE converts between timezones
-- timestamp WITHOUT time zone -> interprets as the specified zone
SELECT '2024-03-15 14:30:00'::timestamp AT TIME ZONE 'America/New_York';
-- 2024-03-15 18:30:00+00 (converts to UTC)
-- timestamptz -> converts to display in the specified zone
SELECT '2024-03-15 18:30:00+00'::timestamptz AT TIME ZONE 'America/New_York';
-- 2024-03-15 14:30:00 (displays in Eastern)
-- Snowflake: CONVERT_TIMEZONE
SELECT CONVERT_TIMEZONE('UTC', 'America/New_York', created_at);
-- BigQuery: TIMESTAMP functions with timezone
-- SELECT TIMESTAMP('2024-03-15 14:30:00', 'America/New_York');
-- Group by local day (not UTC day)
SELECT
  DATE_TRUNC(
    'day',
    created_at AT TIME ZONE 'America/New_York'
  ) AS local_day,
  COUNT(*) AS order_count
FROM orders
GROUP BY 1
ORDER BY 1;

All timestamps in your warehouse should be stored in UTC. This removes ambiguity. Convert to local time only at the query layer (for dashboards and reports) or in the application layer. Storing in local time creates daylight saving bugs that are extremely hard to debug.
Use 'America/New_York' instead of 'EST' or '-05:00'. Named timezones automatically handle daylight saving transitions. Fixed offsets do not. EST is always UTC-5, but New York is UTC-5 in winter and UTC-4 in summer.
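A quick Python check (stdlib zoneinfo, Python 3.9+) makes the difference concrete:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # requires system tz data, present on most OSes

ny = ZoneInfo("America/New_York")

# January: New York is UTC-5, same as the fixed EST offset.
winter = datetime(2024, 1, 15, 12, 0, tzinfo=ny)
print(winter.utcoffset() == timedelta(hours=-5))  # True

# July: New York is UTC-4; a hardcoded 'EST' or '-05:00' is now an hour off.
summer = datetime(2024, 7, 15, 12, 0, tzinfo=ny)
print(summer.utcoffset() == timedelta(hours=-4))  # True
```

The named zone tracks the DST transition automatically; a fixed offset silently shifts every summer timestamp by an hour.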
If your business operates in US Eastern and you need daily metrics, convert to Eastern BEFORE applying DATE_TRUNC. Otherwise, an order placed at 11 PM Eastern (4 AM UTC next day) lands in the wrong day.
These patterns appear in production dbt models, ETL jobs, and analytical queries.
-- Monthly cohort retention
SELECT
  DATE_TRUNC('month', u.signup_date) AS cohort_month,
  -- AGE's month component resets every 12 months; include the year component
  EXTRACT(YEAR FROM AGE(o.order_date, u.signup_date)) * 12
    + EXTRACT(MONTH FROM AGE(o.order_date, u.signup_date)) AS months_since_signup,
  COUNT(DISTINCT o.user_id) AS active_users
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY 1, 2
ORDER BY 1, 2;

-- 7-day rolling average revenue
SELECT
  date,
  revenue,
  AVG(revenue) OVER (
    ORDER BY date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS rolling_7d_avg
FROM daily_revenue
ORDER BY date;

-- Check data freshness per table
SELECT
  table_name,
  MAX(loaded_at) AS last_load,
  EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - MAX(loaded_at))) / 3600
    AS hours_since_last_load,
  CASE
    WHEN MAX(loaded_at) < CURRENT_TIMESTAMP - INTERVAL '2 hours'
    THEN 'STALE'
    ELSE 'FRESH'
  END AS freshness_status
FROM load_log
GROUP BY table_name
ORDER BY hours_since_last_load DESC;

When you get a date question in an interview, the trick is to slow down and figure out what the interviewer is really checking. Is it your syntax? Your handling of gaps? Your ability to reason about time zones? Look for the word that gives it away, then write the query. The three examples below cover the patterns that come up again and again.
What they test:
DATE_TRUNC for monthly grouping, date spine generation for filling gaps, COALESCE for zero-filling.
Approach:
Generate a date spine of 12 months using generate_series or a recursive CTE. LEFT JOIN the orders table (aggregated by month) onto the spine. COALESCE the revenue to 0 for months with no orders.
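The spine-plus-LEFT JOIN shape can be sketched end to end. This is a hypothetical Python/SQLite version trimmed to three months (Postgres would use generate_series instead of the recursive CTE); table names and data are illustrative:

```python
import sqlite3

# Illustrative data: February has no orders and must still appear with 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2024-01-10", 100.0), ("2024-03-05", 50.0)],
)

rows = conn.execute("""
    WITH RECURSIVE spine(month) AS (         -- date spine: one row per month
        SELECT '2024-01-01'
        UNION ALL
        SELECT date(month, '+1 month') FROM spine
        WHERE month < '2024-03-01'
    ),
    monthly AS (                             -- orders pre-aggregated by month
        SELECT strftime('%Y-%m-01', order_date) AS month,
               SUM(revenue) AS revenue
        FROM orders
        GROUP BY 1
    )
    SELECT s.month, COALESCE(m.revenue, 0) AS revenue  -- zero-fill gaps
    FROM spine s
    LEFT JOIN monthly m ON m.month = s.month
    ORDER BY s.month
""").fetchall()
print(rows)  # [('2024-01-01', 100.0), ('2024-02-01', 0), ('2024-03-01', 50.0)]
```

The key design choice is joining the spine to the aggregate, not the other way around, so empty months survive the join.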
What they test:
Date arithmetic (signup_date + 7), date range filtering, ratio calculation with conditional counting.
Approach:
Filter users where signup_date is in January 2024. LEFT JOIN to activity table where activity_date is between signup_date and signup_date + 7. Count distinct users with activity divided by total users.
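A runnable sketch of that approach, again in Python/SQLite with made-up tables (the same logic ports to Postgres with signup_date + INTERVAL '7 days'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INT, signup_date TEXT)")
conn.execute("CREATE TABLE activity (user_id INT, activity_date TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "2024-01-05"), (2, "2024-01-10"), (3, "2024-01-20")])
conn.executemany("INSERT INTO activity VALUES (?, ?)",
                 [(1, "2024-01-08"),    # within 7 days of signup -> activated
                  (2, "2024-01-25")])   # too late; user 3 has no activity

rate = conn.execute("""
    SELECT CAST(COUNT(DISTINCT a.user_id) AS REAL)
           / COUNT(DISTINCT u.user_id)
    FROM users u
    LEFT JOIN activity a
      ON a.user_id = u.user_id
     AND a.activity_date BETWEEN u.signup_date
                             AND date(u.signup_date, '+7 days')
    WHERE u.signup_date BETWEEN '2024-01-01' AND '2024-01-31'
""").fetchone()[0]
print(rate)  # 1 of 3 January signups activated -> 0.333...
```

Note that the 7-day window lives in the join condition, not the WHERE clause: moving it to WHERE would silently drop the never-activated users from the denominator.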
What they test:
LAG window function with date subtraction, then averaging the intervals. Tests both date arithmetic and window function skills.
Approach:
Use LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) to get the previous order date. Subtract to get days between orders. Then AVG the differences per customer. Handle customers with only one order (NULL interval).
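Sketching the LAG approach in Python/SQLite (window functions need SQLite 3.25+, bundled with recent Pythons); julianday subtraction stands in for Postgres date arithmetic, and the data is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INT, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "2024-01-01"), (1, "2024-01-11"), (1, "2024-01-31"),
                  (2, "2024-02-01")])  # customer 2: one order -> NULL average

rows = conn.execute("""
    WITH gaps AS (
        SELECT customer_id,
               julianday(order_date) - julianday(
                   LAG(order_date) OVER (
                       PARTITION BY customer_id ORDER BY order_date
                   )
               ) AS days_between          -- NULL for each customer's first order
        FROM orders
    )
    SELECT customer_id, AVG(days_between) AS avg_days_between
    FROM gaps
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()
print(rows)  # [(1, 15.0), (2, None)]
```

AVG ignores the NULL first-order gap automatically, and a single-order customer averages to NULL rather than being dropped or zeroed, which is usually what the interviewer wants you to call out.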
You'll get comfortable with DATE_TRUNC and window-based date math faster than you expect. Start with one problem tonight.