Dates: Intermediate

Robinhood's tax reporting system must determine whether each stock sale is short-term or long-term, a distinction that can mean the difference between a 37% federal tax rate and a 20% rate on the exact same gain, and that determination comes down to whether the holding period is 365 days or 366. Every trade settlement, wash-sale window, and tax lot age calculation the company runs for millions of users depends on precise date arithmetic using functions like DATE_DIFF and DATE_ADD. A query that gets a holding period wrong by a single day does not just produce a bad report; it can trigger an IRS audit for a customer. This lesson covers the intermediate date functions that make this kind of precision possible.

EXTRACT()

Daily Life
Interviews

Pull year, month, or day from a date

EXTRACT pulls a single component from a date or timestamp and returns it as a number. It allows you to isolate the year, month, day, hour, minute, second, or other parts for analysis, grouping, or filtering. This function is fundamental for temporal aggregations.

The syntax is EXTRACT(part FROM date_or_timestamp). The part specifies which component to extract: YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, DAY_OF_WEEK, DAY_OF_YEAR, WEEK, and QUARTER are commonly supported. The result is always an integer.

Break apart a date into its calendar components:
1SELECT
2 order_date,
3 EXTRACT(YEAR FROM order_date) AS order_year,
4 EXTRACT(MONTH FROM order_date) AS order_month,
5 EXTRACT(DAY FROM order_date) AS order_day
6FROM orders
Result
order_dateorder_yearorder_monthorder_day
2024-03-152024315
2024-01-08202418
2023-12-2520231225

Each EXTRACT call returns a single integer. YEAR gives the full year (2024, not 24). MONTH returns 1-12. DAY returns 1-31 depending on the month. These integers enable numeric comparisons and grouping.

//

Time Component Extraction

EXTRACT also works with timestamps to pull time-of-day components:

1SELECT
2 event_timestamp,
3 EXTRACT(HOUR FROM event_timestamp) AS event_hour,
4 EXTRACT(MINUTE FROM event_timestamp) AS event_minute,
5 EXTRACT(SECOND FROM event_timestamp) AS event_second
6FROM event_data
Result
event_timestampevent_hourevent_minuteevent_second
2024-03-15 14:30:45143045
2024-03-15 09:05:129512
2024-03-15 23:59:59235959

HOUR returns 0-23 in 24-hour format. MINUTE and SECOND return 0-59. These extractions enable time-of-day analysis, peak hour detection, and shift-based reporting.

//

Calendar Extractions

EXTRACT supports higher-level calendar components:

DAY_OF_WEEK returns 1-7 where 1 is typically Sunday or Monday depending on locale. Essential for weekday vs weekend analysis.
1SELECT
2 order_date,
3 EXTRACT(DAY_OF_WEEK FROM order_date) AS dow
4FROM orders
TIP
EXTRACT() must be evaluated for every row when used in GROUP BY. For large tables, consider pre-computing these values in materialized columns or using date dimension tables for better performance.

Practical Applications

EXTRACT powers temporal aggregations by providing grouping keys:

1SELECT
2 EXTRACT(YEAR FROM order_date) AS yr,
3 EXTRACT(MONTH FROM order_date) AS mo,
4 COUNT(*) AS orders,
5 SUM(total) AS revenue
6FROM orders
7GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
8ORDER BY yr, mo
Result
yrmoordersrevenue
202411250125430.50
202421180118920.00
202431420142100.75

This query aggregates orders by year and month. The extracted components become the GROUP BY keys, creating monthly summaries. This pattern is foundational for time-series reporting.

//

Filtering by Components

EXTRACT enables selective filtering by temporal component:

1SELECT
2 order_id,
3 order_date,
4 total
5FROM orders
6WHERE EXTRACT(MONTH FROM order_date) = 12
7AND EXTRACT(DAY_OF_WEEK FROM order_date) IN (
8 1,
9 7
10 )
Result
order_idorder_datetotal
50212024-12-0789.99
50342024-12-08156.50
50892024-12-14234.00

This query finds December weekend orders. The MONTH filter restricts to December regardless of year. The DAY_OF_WEEK filter keeps only weekend days. Combining extractions creates precise temporal filters.

How would you group revenue by year and quarter?
Separate columns for year and quarter keep values as integers, making them easy to filter and sort programmatically. This is the standard approach for analytics tables.
1SELECT
2 EXTRACT(YEAR FROM order_date) AS yr,
3 EXTRACT(QUARTER FROM order_date) AS qtr,
4 SUM(total) AS revenue
5FROM orders
6GROUP BY 1, 2
7ORDER BY 1, 2

Best Practices

EXTRACT is most valuable when you need to analyze patterns or group data by specific time components.

//

When to Use EXTRACT()

These are the most common scenarios where EXTRACT proves invaluable.

Time-based grouping
Time-based grouping
Aggregate by year, month, quarter, or week for reporting
Component filtering
Component filtering
Select specific months, weekdays, or hours of operation
Pattern analysis
Pattern analysis
Detect hourly, daily, or seasonal trends in data
Companies use these patterns to uncover time-of-day trends that drive product decisions.

> Complete this query to pull the signup year from each user record.

SELECT
  ___(___ ___ signup_date) AS year
FROM users
EXTRACT
FROM
MONTH
YEAR

EXTRACT always returns an integer. You can use the result directly in arithmetic, comparison operators, and GROUP BY without any additional casting.

For large tables, using EXTRACT in WHERE clauses prevents the optimizer from using date-range indexes. Prefer range comparisons on the original column when filtering by year or month.

When building cohort analysis, EXTRACT(YEAR FROM signup_date) combined with EXTRACT(MONTH FROM signup_date) gives a precise month-of-signup cohort key without any string formatting.

Date Part Constants

Daily Life
Interviews

Reference any component of a timestamp

EXTRACT supports a variety of date part constants beyond YEAR, MONTH, and DAY. Knowing all available parts lets you slice temporal data at any granularity, from quarters down to seconds.

//

Additional Date Parts

QUARTER returns 1 through 4 and is essential for fiscal reporting. DOW gives the day of week (1 = Monday through 7 = Sunday). DOY gives the day of year (1 to 366), useful for seasonal analysis. For timestamps, HOUR, MINUTE, and SECOND extract time components.

//

Extracting Multiple Parts

1SELECT
2 order_date,
3 EXTRACT(YEAR FROM order_date) AS yr,
4 EXTRACT(QUARTER FROM order_date) AS qtr,
5 EXTRACT(DOW FROM order_date) AS day_of_week
6FROM orders
Result
order_dateyrqtrday_of_week
2024-03-15202415
2024-01-08202411
2023-12-25202341

Combining EXTRACT with GROUP BY lets you aggregate by any time period: weekly sales, quarterly revenue, hourly traffic patterns. This is one of the most common patterns in analytics.

//

CURRENT_TIMESTAMP with EXTRACT

CURRENT_TIMESTAMP provides the complete current moment with date and time. You can extract specific components like hour, day, or month to build time-aware logic directly in your queries.

1SELECT
2 EXTRACT(HOUR FROM CURRENT_TIMESTAMP) AS current_hour,
3 CASE
4 WHEN EXTRACT(HOUR FROM CURRENT_TIMESTAMP) < 12 THEN 'Morning'
5 WHEN EXTRACT(HOUR FROM CURRENT_TIMESTAMP) < 17 THEN 'Afternoon'
6 ELSE 'Evening'
7 END AS time_of_day
8FROM event_data
9LIMIT 1
Result
current_hourtime_of_day
14Afternoon
This query extracts the current hour and classifies the time of day. The pattern is useful for dashboards that display time-aware greetings or adjust behavior based on business hours.

> Complete this query to get the order month for seasonal analysis.

SELECT
  ___(___ ___ order_date) AS month
FROM orders
DAY
EXTRACT
FROM
MONTH

EXTRACT(MONTH) returns the same integer 1-12 regardless of the year. If your analysis spans multiple years, always pair it with EXTRACT(YEAR) in GROUP BY to avoid mixing months across different years.

NOW() and CURRENT_TIMESTAMP are functionally identical in most databases. NOW() is shorter for interactive queries, while CURRENT_TIMESTAMP is the ANSI SQL standard and preferred in production code for clarity.

Extracting DOW (day of week) is useful for detecting whether demand differs between weekdays and weekends. Grouping by DOW across a full year reveals structural patterns in consumer behavior.

DATE_DIFF()

Daily Life
Interviews

Calculate precise intervals between dates

DATE_DIFF calculates the difference between two timestamps or dates. It returns the result as an integer representing the number of complete units between them.

The syntax is DATE_DIFF(unit, timestamp1, timestamp2). The function returns timestamp2 - timestamp1 in the specified unit. Common units include DAY, HOUR, MINUTE, and SECOND.

//

Measuring Durations

1SELECT
2 session_id,
3 DATE_DIFF(
4 'minute',
5 start_time,
6 end_time
7 ) AS duration_minutes,
8 DATE_DIFF(
9 'second',
10 start_time,
11 end_time
12 ) AS duration_seconds
13FROM sessions
Result
session_idduration_minutesduration_seconds
S001452730
S0021056300
S0035345

DATE_DIFF counts complete units. Session S001 lasted 45 minutes and 30 seconds, which rounds down to 45 complete minutes or 2730 complete seconds. Partial units are not counted.

//

Response Time Analysis

1SELECT
2 ticket_id,
3 created_at,
4 first_response_at,
5 DATE_DIFF(
6 'minute',
7 created_at,
8 first_response_at
9 ) AS response_minutes,
10 CASE
11 WHEN DATE_DIFF(
12 'minute',
13 created_at,
14 first_response_at
15 ) <= 15 THEN 'Fast'
16 WHEN DATE_DIFF(
17 'minute',
18 created_at,
19 first_response_at
20 ) <= 60 THEN 'Normal'
21 ELSE 'Slow'
22 END AS response_rating
23FROM alert_events
Result
ticket_idcreated_atfirst_response_atresponse_minutesresponse_rating
T0012024-03-15 10:002024-03-15 10:088Fast
T0022024-03-15 11:302024-03-15 12:1545Normal
T0032024-03-15 14:002024-03-15 16:30150Slow
This query calculates response time in minutes and categorizes performance. Sub-hour granularity is critical for support SLAs where 15 minutes versus 45 minutes makes a significant difference.
TIP
For billing or compliance, use the smallest practical unit. Measuring in seconds and converting to display units preserves precision. Measuring directly in hours loses minute-level detail.
//

Aggregating Durations

1SELECT
2 user_id,
3 COUNT(*) AS session_count,
4 AVG(
5 DATE_DIFF(
6 'minute',
7 start_time,
8 end_time
9 )
10 ) AS avg_session_minutes,
11 SUM(
12 DATE_DIFF(
13 'minute',
14 start_time,
15 end_time
16 )
17 ) AS total_minutes
18FROM sessions
19GROUP BY user_id
Result
user_idsession_countavg_session_minutestotal_minutes
U0011235.5426
U002822.3178
U0032545.81145
This query calculates session statistics per user: how many sessions, average duration, and total time spent. These metrics are fundamental for user engagement analysis.
What unit should you measure in for different use cases?
Minutes work well for human-scale durations like support ticket resolution or meeting lengths. The truncation to whole minutes is acceptable when precision below one minute does not matter.
1SELECT
2 ticket_id,
3 DATE_DIFF(
4 'minute',
5 created_at,
6 resolved_at
7 ) AS resolution_min
8FROM alert_events
//

DATE_DIFF() Uses

DATE_DIFF powers time-based metrics across industries.

Session durationsResponse timesAPI latencyUser engagement
Session durations
Sessions
Track user time on site
Response times
Support
Measure ticket resolution
API latency
Latency
Monitor endpoint speeds
User engagement
Engagement
Quantify active minutes
Time difference calculations power some of the most important operational metrics in industry.

> Complete this query to calculate the run time in minutes for each batch job.

SELECT
  job_id,
  ___(
    ___,
    ___,
    ended
    ) AS run_minutes
FROM batch_jobs
'minute'
'hour'
started
DATE_DIFF

DATE_DIFF counts complete units and discards the remainder. A 90-minute session is 1 hour in DATE_DIFF hours. Use the smallest practical unit to preserve accuracy, then convert for display.

For SLA calculations where every minute matters, measure in seconds and divide for display. Measuring directly in hours can misclassify a 59-minute ticket as a 0-hour resolution.

DATE_TRUNC to month is equivalent to setting day to 1 and time to 00:00:00. The result is a valid timestamp you can use directly in GROUP BY to create monthly aggregations.

Unix Time Conversion

Daily Life
Interviews

Convert between timestamps and epoch time

Unix time (also called epoch time or POSIX time) represents time as a single integer: the number of seconds since January 1, 1970 00:00:00 UTC. This representation is compact, timezone-independent, and universally understood by computing systems.

FROM_UNIXTIME converts a Unix timestamp to a human-readable timestamp. TO_UNIXTIME converts a timestamp to its Unix representation. These functions bridge between application data formats and human-friendly display.

//

Understanding Unix Time

Unix TimestampHuman-Readable
01970-01-01 00:00:00 UTC
10000000002001-09-09 01:46:40 UTC
17105118452024-03-15 14:30:45 UTC
20000000002033-05-18 03:33:20 UTC
The Unix epoch (time zero) is January 1, 1970. Every second adds 1 to the counter. 1.7 billion seconds have passed since 1970, bringing us to 2024. This simple counting system enables easy time arithmetic.
//

Converting from Unix Time

1SELECT
2 event_id,
3 unix_timestamp,
4 FROM_UNIXTIME(unix_timestamp) AS event_time
5FROM event_data
Result
event_idunix_timestampevent_time
E00117105118452024-03-15 14:30:45
E00217105088002024-03-15 13:46:40
E00317104320002024-03-14 16:00:00

FROM_UNIXTIME takes the integer and returns a timestamp. The result is in the database's local timezone unless otherwise configured. This makes Unix timestamps human-readable for analysis.

//

Converting to Unix Time

TO_UNIXTIME converts timestamps to integers. Useful when sending data to systems that expect Unix format or for compact storage.
1SELECT
2 event_time,
3 TO_UNIXTIME(event_time) AS unix_ts
4FROM event_data
//

Millisecond Timestamps

How do you handle Unix timestamps stored in seconds versus milliseconds?
Standard Unix timestamps use seconds since epoch (10 digits, like 1710511845). FROM_UNIXTIME handles these directly with no transformation needed.
1SELECT
2 event_id,
3 FROM_UNIXTIME(epoch_seconds) AS event_time
4FROM event_data
TIP
Some systems use milliseconds since epoch (13 digits) instead of seconds (10 digits). Check your data source documentation and count digits to identify which format you have.
//

Unix Time Function Uses

Unix timestamps appear frequently in APIs, log files, and cross-platform integrations.
UNIX TIME USE CASES
  • Converting API responses that use Unix timestamps
  • Processing log files with epoch time format
  • Storing timestamps compactly as integers
  • Interoperating with systems that expect Unix time
  • Performing time arithmetic using simple integer math

> Complete this query to convert Unix epoch values into a readable timestamp.

SELECT
  ___(___) AS log_time
FROM server_logs
TO_UNIXTIME
FROM_UNIXTIME
response_time_ms

FROM_UNIXTIME converts in the database server's local timezone. If the server is not in UTC, the human-readable result will be offset from the original UTC epoch. Use AT TIME ZONE after conversion to get UTC.

JavaScript timestamps are milliseconds since epoch (13 digits). Dividing by 1000 before FROM_UNIXTIME is a common and necessary step when processing data from web event streams or browser logs.

Unix timestamps are ideal for cross-system data exchange because they are timezone-neutral integers. Any system can convert them to local time independently without needing timezone metadata.

Grouping by Time Parts

Daily Life
Interviews

Aggregate data by week, month, or quarter

Combining EXTRACT with GROUP BY enables time-based aggregations: monthly revenue trends, hourly traffic patterns, day-of-week analysis. Extracted integers become grouping keys, turning timestamp data into actionable summaries.

Group transactions by year and month to see monthly trends:
1SELECT
2 EXTRACT(YEAR FROM order_date) AS order_year,
3 EXTRACT(MONTH FROM order_date) AS order_month,
4 COUNT(*) AS order_count,
5 SUM(amount) AS total_revenue
6FROM orders
7GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
8ORDER BY order_year, order_month
Result
order_yearorder_monthorder_counttotal_revenue
2024115645230.00
2024218952100.00
2024321161450.00

Both the SELECT and GROUP BY must include the same EXTRACT expressions. The result shows one row per year-month combination with aggregated metrics.

Hourly Traffic Patterns

Group events by hour to find peak activity times:
1SELECT
2 EXTRACT(HOUR FROM event_time) AS event_hour,
3 COUNT(*) AS event_count
4FROM page_views
5GROUP BY EXTRACT(HOUR FROM event_time)
6ORDER BY event_hour
Result
event_hourevent_count
91245
102890
113102
122567
Hours 10-11 show peak traffic. This pattern reveals when users are most active, informing capacity planning and marketing schedules.

Filtering by Time Parts

EXTRACT also works in WHERE clauses to filter by time components:

Filter to weekdays only (Monday=2 through Friday=6 in most databases). Useful for business metrics excluding weekends.
1SELECT
2 *
3FROM orders
4WHERE EXTRACT(DAY_OF_WEEK FROM order_date) BETWEEN 2
5AND 6

> Complete this query to convert timestamps into Unix epoch integers for an API response.

SELECT
  ___(___) AS epoch
FROM event_data
event_timestamp
TO_UNIXTIME
FROM_UNIXTIME

Combining EXTRACT with GROUP BY is the standard pattern for time-series reporting. The extracted integer serves as a compact grouping key that sorts naturally and is easy to filter with numeric comparisons.

Filtering with EXTRACT in WHERE bypasses date-range index pruning. Instead of WHERE EXTRACT(YEAR FROM order_date) = 2024, prefer WHERE order_date BETWEEN DATE '2024-01-01' AND DATE '2024-12-31'.

EXTRACT-based grouping and TO_UNIXTIME together give you two complementary tools: extract components for aggregations within a system, and convert to Unix time for compact numeric output to external APIs.

PUTTING IT ALL TOGETHER

> You are a retention analyst at Duolingo calculating how long users stay active between their signup timestamp and last login before churning. The analysis powers a cohort report grouped by signup quarter and month.

EXTRACT() pulls the signup year and quarter from each timestamp for cohort grouping in the retention table.
Date part constants like MONTH and DAY_OF_WEEK reveal whether churn spikes on weekends or specific months.
DATE_DIFF() measures precise hours between last login and signup, giving sub-day retention granularity.
Unix time conversion aligns log-file timestamps from the mobile app with SQL timestamps for JOIN operations.
KEY TAKEAWAYS
EXTRACT(part FROM date) pulls year, month, day, hour, etc. as integers
NOW() returns current timestamp, identical to CURRENT_TIMESTAMP
DATE_DIFF measures differences in days, hours, minutes, or seconds between timestamps
Unix time is seconds since 1970-01-01 00:00:00 UTC
FROM_UNIXTIME converts integers to timestamps; TO_UNIXTIME reverses
Watch for millisecond Unix timestamps (13 digits) vs second (10 digits)
Extracted components enable grouping by time periods for trend analysis

Temporal surgery: EXTRACT, DATE_TRUNC

Category
SQL
Difficulty
intermediate
Duration
23 minutes
Challenges
0 hands-on challenges

Topics covered: EXTRACT(), Date Part Constants, DATE_DIFF(), Unix Time Conversion, Grouping by Time Parts

Lesson Sections

  1. EXTRACT() (concepts: sqlExtract)

    Break apart a date into its calendar components: Time Component Extraction Calendar Extractions Practical Applications Filtering by Components How would you group revenue by year and quarter? Best Practices When to Use EXTRACT() Companies use these patterns to uncover time-of-day trends that drive product decisions.

  2. Date Part Constants

    Additional Date Parts Extracting Multiple Parts CURRENT_TIMESTAMP with EXTRACT This query extracts the current hour and classifies the time of day. The pattern is useful for dashboards that display time-aware greetings or adjust behavior based on business hours. Extracting DOW (day of week) is useful for detecting whether demand differs between weekdays and weekends. Grouping by DOW across a full year reveals structural patterns in consumer behavior.

  3. DATE_DIFF()

    Measuring Durations Response Time Analysis This query calculates response time in minutes and categorizes performance. Sub-hour granularity is critical for support SLAs where 15 minutes versus 45 minutes makes a significant difference. Aggregating Durations This query calculates session statistics per user: how many sessions, average duration, and total time spent. These metrics are fundamental for user engagement analysis. What unit should you measure in for different use cases? DATE_DIFF() Use

  4. Unix Time Conversion

    Unix time (also called epoch time or POSIX time) represents time as a single integer: the number of seconds since January 1, 1970 00:00:00 UTC. This representation is compact, timezone-independent, and universally understood by computing systems. Understanding Unix Time The Unix epoch (time zero) is January 1, 1970. Every second adds 1 to the counter. 1.7 billion seconds have passed since 1970, bringing us to 2024. This simple counting system enables easy time arithmetic. Converting from Unix Ti

  5. Grouping by Time Parts

    Group transactions by year and month to see monthly trends: Hourly Traffic Patterns Group events by hour to find peak activity times: Hours 10-11 show peak traffic. This pattern reveals when users are most active, informing capacity planning and marketing schedules. Filtering by Time Parts