Dates: Intermediate
Robinhood's tax reporting system must determine whether each stock sale is short-term or long-term. That distinction can mean the difference between a 37% federal tax rate and a 20% rate on the exact same gain, and it comes down to whether the holding period is 365 days or 366. Every trade settlement, wash-sale window, and tax lot age calculation the company runs for millions of users depends on precise date arithmetic using functions like DATE_DIFF and DATE_ADD. A query that gets a holding period wrong by a single day does not just produce a bad report; it can trigger an IRS audit for a customer. This lesson covers the intermediate date functions that make this kind of precision possible.
EXTRACT()
Pull year, month, or day from a date
EXTRACT pulls a single component from a date or timestamp and returns it as a number. It allows you to isolate the year, month, day, hour, minute, second, or other parts for analysis, grouping, or filtering. This function is fundamental for temporal aggregations.
The syntax is EXTRACT(part FROM date_or_timestamp). The part specifies which component to extract: YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, DAY_OF_WEEK, DAY_OF_YEAR, WEEK, and QUARTER are commonly supported. The result is always an integer.
Each EXTRACT call returns a single integer. YEAR gives the full year (2024, not 24). MONTH returns 1-12. DAY returns 1-31 depending on the month. These integers enable numeric comparisons and grouping.
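A minimal sketch of the syntax, assuming a Trino/Presto-style dialect (inferred from the function names used in this lesson) and a literal date chosen purely for illustration:

```sql
-- Each EXTRACT call returns one integer component of the date.
SELECT
    EXTRACT(YEAR  FROM DATE '2024-03-15') AS yr,   -- 2024
    EXTRACT(MONTH FROM DATE '2024-03-15') AS mon,  -- 3
    EXTRACT(DAY   FROM DATE '2024-03-15') AS dy;   -- 15
```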
Time Component Extraction
EXTRACT also works with timestamps to pull time-of-day components:
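As a small sketch, again assuming a Trino/Presto-style dialect and an illustrative literal timestamp:

```sql
-- Time-of-day components pulled from a single timestamp.
SELECT
    EXTRACT(HOUR   FROM TIMESTAMP '2024-03-15 14:30:45') AS hr,   -- 14
    EXTRACT(MINUTE FROM TIMESTAMP '2024-03-15 14:30:45') AS mi,   -- 30
    EXTRACT(SECOND FROM TIMESTAMP '2024-03-15 14:30:45') AS sec;  -- 45
```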
HOUR returns 0-23 in 24-hour format. MINUTE and SECOND return 0-59. These extractions enable time-of-day analysis, peak hour detection, and shift-based reporting.
Calendar Extractions
EXTRACT supports higher-level calendar components:
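A hedged sketch of these calendar parts, assuming a Trino/Presto-style dialect in which weeks follow ISO numbering and DAY_OF_WEEK runs from 1 (Monday) to 7 (Sunday); other databases may number these differently:

```sql
-- Calendar-level components for Friday, 15 March 2024.
SELECT
    EXTRACT(QUARTER     FROM DATE '2024-03-15') AS qtr,  -- 1
    EXTRACT(WEEK        FROM DATE '2024-03-15') AS wk,   -- 11 (ISO week)
    EXTRACT(DAY_OF_YEAR FROM DATE '2024-03-15') AS doy,  -- 75
    EXTRACT(DAY_OF_WEEK FROM DATE '2024-03-15') AS dow;  -- 5 (Friday, with 1 = Monday)
```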
EXTRACT() must be evaluated for every row when used in GROUP BY. For large tables, consider pre-computing these values in materialized columns or using date dimension tables for better performance.
Practical Applications
EXTRACT powers temporal aggregations by providing grouping keys:
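A minimal sketch of such a query, assuming a Trino/Presto-style dialect. The `orders` rows and the column names `order_date` and `amount` are hypothetical and defined inline so the example is self-contained:

```sql
-- Hypothetical sample data standing in for a real orders table.
WITH orders (order_id, order_date, amount) AS (
    VALUES
        (1, DATE '2023-12-30', 40.00),
        (2, DATE '2024-01-05', 25.00),
        (3, DATE '2024-01-20', 35.00)
)
SELECT
    EXTRACT(YEAR  FROM order_date) AS order_year,
    EXTRACT(MONTH FROM order_date) AS order_month,
    COUNT(*)                       AS order_count,
    SUM(amount)                    AS total_amount
FROM orders
GROUP BY
    EXTRACT(YEAR  FROM order_date),
    EXTRACT(MONTH FROM order_date)
ORDER BY order_year, order_month;
```

With this sample data the result has two rows: one for December 2023 and one for January 2024.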
This query aggregates orders by year and month. The extracted components become the GROUP BY keys, creating monthly summaries. This pattern is foundational for time-series reporting.
Filtering by Components
EXTRACT enables selective filtering by temporal component:
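A sketch of such a filter, assuming a Trino/Presto-style dialect where DAY_OF_WEEK returns 6 for Saturday and 7 for Sunday. The `orders` rows are hypothetical:

```sql
-- Hypothetical sample data; day names are noted for each date.
WITH orders (order_id, order_date) AS (
    VALUES
        (1, DATE '2023-12-16'),  -- Saturday
        (2, DATE '2023-12-18'),  -- Monday
        (3, DATE '2024-12-01'),  -- Sunday
        (4, DATE '2024-11-30')   -- Saturday, but in November
)
SELECT order_id, order_date
FROM orders
WHERE EXTRACT(MONTH FROM order_date) = 12
  AND EXTRACT(DAY_OF_WEEK FROM order_date) IN (6, 7);  -- keeps orders 1 and 3
```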
This query finds December weekend orders. The MONTH filter restricts to December regardless of year. The DAY_OF_WEEK filter keeps only weekend days. Combining extractions creates precise temporal filters.
Best Practices
EXTRACT is most valuable when you need to analyze patterns or group data by specific time components.
When to Use EXTRACT()
These are the most common scenarios where EXTRACT proves invaluable.
> Complete this query to pull the signup year from each user record.
SELECT ____(____ signup_date) AS year FROM users
EXTRACT always returns an integer. You can use the result directly in arithmetic, comparison operators, and GROUP BY without any additional casting.
For large tables, using EXTRACT in WHERE clauses prevents the optimizer from using date-range indexes. Prefer range comparisons on the original column when filtering by year or month.
When building cohort analysis, EXTRACT(YEAR FROM signup_date) combined with EXTRACT(MONTH FROM signup_date) gives a precise month-of-signup cohort key without any string formatting.
Date Part Constants
Reference any component of a timestamp
EXTRACT supports a variety of date part constants beyond YEAR, MONTH, and DAY. Knowing all available parts lets you slice temporal data at any granularity, from quarters down to seconds.
Additional Date Parts
QUARTER returns 1 through 4 and is essential for fiscal reporting. DOW gives the day of week (1 = Monday through 7 = Sunday in this dialect; some databases, such as PostgreSQL, number days from 0 = Sunday, so check yours). DOY gives the day of year (1 to 366), useful for seasonal analysis. For timestamps, HOUR, MINUTE, and SECOND extract time components.
Extracting Multiple Parts
Combining EXTRACT with GROUP BY lets you aggregate by any time period: weekly sales, quarterly revenue, hourly traffic patterns. This is one of the most common patterns in analytics.
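One way this might look for quarterly revenue, assuming a Trino/Presto-style dialect that accepts ordinal positions in GROUP BY. The `orders` rows and column names are hypothetical:

```sql
-- Hypothetical sample data: two orders in Q1 and one in Q2.
WITH orders (order_date, amount) AS (
    VALUES
        (DATE '2024-02-10', 100.00),
        (DATE '2024-03-22', 50.00),
        (DATE '2024-05-03', 75.00)
)
SELECT
    EXTRACT(YEAR    FROM order_date) AS order_year,
    EXTRACT(QUARTER FROM order_date) AS order_quarter,
    SUM(amount)                      AS revenue
FROM orders
GROUP BY 1, 2   -- positions of the two EXTRACT expressions above
ORDER BY 1, 2;
```

Swapping QUARTER for WEEK or HOUR changes the granularity without changing the shape of the query.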
CURRENT_TIMESTAMP with EXTRACT
CURRENT_TIMESTAMP provides the complete current moment with date and time. You can extract specific components like hour, day, or month to build time-aware logic directly in your queries.
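A small sketch of time-aware logic, assuming a Trino/Presto-style dialect. The hour boundaries and labels are illustrative choices, and the extracted hour reflects the session's time zone:

```sql
-- Classify the current moment into a coarse time-of-day bucket.
SELECT
    EXTRACT(HOUR FROM CURRENT_TIMESTAMP) AS current_hour,
    CASE
        WHEN EXTRACT(HOUR FROM CURRENT_TIMESTAMP) < 12 THEN 'morning'
        WHEN EXTRACT(HOUR FROM CURRENT_TIMESTAMP) < 18 THEN 'afternoon'
        ELSE 'evening'
    END AS time_of_day;
```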
> Complete this query to get the order month for seasonal analysis.
SELECT ____(____ order_date) AS month FROM orders
EXTRACT(MONTH) returns the same integer 1-12 regardless of the year. If your analysis spans multiple years, always pair it with EXTRACT(YEAR) in GROUP BY to avoid mixing months across different years.
NOW() and CURRENT_TIMESTAMP are functionally identical in most databases. NOW() is shorter for interactive queries, while CURRENT_TIMESTAMP is the ANSI SQL standard and preferred in production code for clarity.
DATE_DIFF()
Calculate precise intervals between dates
DATE_DIFF calculates the difference between two timestamps or dates. It returns the result as an integer representing the number of complete units between them.
The syntax is DATE_DIFF(unit, timestamp1, timestamp2). The function returns timestamp2 - timestamp1 in the specified unit. Common units include DAY, HOUR, MINUTE, and SECOND.
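A minimal sketch of the call shape, assuming a Trino/Presto-style dialect in which the unit is passed as a string literal such as 'day'; other engines may take a bare keyword or a different argument order:

```sql
-- DATE_DIFF(unit, start, end) returns end - start in whole units.
SELECT
    DATE_DIFF('day',  DATE '2024-01-01', DATE '2024-03-15') AS days_elapsed,   -- 74
    DATE_DIFF('hour', TIMESTAMP '2024-03-15 08:00:00',
                      TIMESTAMP '2024-03-15 17:30:00')      AS hours_elapsed;  -- 9 (the extra 30 minutes are dropped)
```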
Measuring Durations
DATE_DIFF counts complete units. Session S001 lasted 45 minutes and 30 seconds, which rounds down to 45 complete minutes or 2730 complete seconds. Partial units are not counted.
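The S001 figures can be reproduced with a sketch like the following, assuming a Trino/Presto-style dialect. The start and end timestamps are hypothetical values chosen to be 45 minutes and 30 seconds apart:

```sql
-- One hypothetical session lasting 45 minutes 30 seconds.
WITH sessions (session_id, started, ended) AS (
    VALUES ('S001', TIMESTAMP '2024-03-15 10:00:00', TIMESTAMP '2024-03-15 10:45:30')
)
SELECT
    session_id,
    DATE_DIFF('minute', started, ended) AS minutes_complete,  -- 45
    DATE_DIFF('second', started, ended) AS seconds_complete   -- 2730
FROM sessions;
```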
Response Time Analysis
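A hedged sketch of a response-time query, assuming a Trino/Presto-style dialect. The `support_tickets` rows, the column names, and the 15- and 60-minute thresholds are all hypothetical:

```sql
-- Hypothetical tickets with the time of the first agent response.
WITH support_tickets (ticket_id, created_at, first_response_at) AS (
    VALUES
        (101, TIMESTAMP '2024-03-15 09:00:00', TIMESTAMP '2024-03-15 09:12:00'),
        (102, TIMESTAMP '2024-03-15 09:05:00', TIMESTAMP '2024-03-15 09:50:00'),
        (103, TIMESTAMP '2024-03-15 10:00:00', TIMESTAMP '2024-03-15 12:30:00')
)
SELECT
    ticket_id,
    DATE_DIFF('minute', created_at, first_response_at) AS response_minutes,
    CASE
        WHEN DATE_DIFF('minute', created_at, first_response_at) <= 15 THEN 'fast'
        WHEN DATE_DIFF('minute', created_at, first_response_at) <= 60 THEN 'acceptable'
        ELSE 'slow'
    END AS performance
FROM support_tickets;
```

Measuring in minutes rather than hours keeps a 12-minute and a 45-minute response distinguishable.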
Aggregating Durations
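A sketch of per-user duration statistics, assuming a Trino/Presto-style dialect. The `sessions` rows and column names are hypothetical:

```sql
-- Hypothetical sessions: two for U1, one for U2.
WITH sessions (user_id, started, ended) AS (
    VALUES
        ('U1', TIMESTAMP '2024-03-15 10:00:00', TIMESTAMP '2024-03-15 10:45:30'),
        ('U1', TIMESTAMP '2024-03-16 08:00:00', TIMESTAMP '2024-03-16 08:20:00'),
        ('U2', TIMESTAMP '2024-03-15 21:00:00', TIMESTAMP '2024-03-15 22:30:00')
)
SELECT
    user_id,
    COUNT(*)                                 AS session_count,
    AVG(DATE_DIFF('minute', started, ended)) AS avg_minutes,
    SUM(DATE_DIFF('minute', started, ended)) AS total_minutes
FROM sessions
GROUP BY user_id
ORDER BY user_id;
```

Because DATE_DIFF truncates each session to whole minutes before aggregation, totals can slightly understate the true time spent.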
DATE_DIFF() Uses
DATE_DIFF powers time-based metrics across industries.
> Complete this query to calculate the run time in minutes for each batch job.
SELECT job_id, ____(____, ____, ended) AS run_minutes FROM batch_jobs
DATE_DIFF counts complete units and discards the remainder. A 90-minute session is 1 hour in DATE_DIFF hours. Use the smallest practical unit to preserve accuracy, then convert for display.
DATE_TRUNC to month is equivalent to setting day to 1 and time to 00:00:00. The result is a valid timestamp you can use directly in GROUP BY to create monthly aggregations.
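A minimal sketch of monthly grouping with DATE_TRUNC, assuming a Trino/Presto-style dialect where the unit is a string literal. The `orders` rows and column names are hypothetical:

```sql
-- Hypothetical orders: two in January 2024, one in February 2024.
WITH orders (order_ts, amount) AS (
    VALUES
        (TIMESTAMP '2024-01-05 09:15:00', 25.00),
        (TIMESTAMP '2024-01-20 18:40:00', 35.00),
        (TIMESTAMP '2024-02-02 11:00:00', 10.00)
)
SELECT
    DATE_TRUNC('month', order_ts) AS order_month,  -- first instant of each month
    SUM(amount)                   AS total_amount
FROM orders
GROUP BY DATE_TRUNC('month', order_ts)
ORDER BY order_month;
```

Unlike a pair of EXTRACT keys, the truncated value is a single timestamp column that already carries both year and month.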
Unix Time Conversion
Convert between timestamps and epoch time
FROM_UNIXTIME converts a Unix timestamp to a human-readable timestamp. TO_UNIXTIME converts a timestamp to its Unix representation. These functions bridge between application data formats and human-friendly display.
Understanding Unix Time
| Unix Timestamp | Human-Readable |
|---|---|
| 0 | 1970-01-01 00:00:00 UTC |
| 1000000000 | 2001-09-09 01:46:40 UTC |
| 1710511845 | 2024-03-15 14:10:45 UTC |
| 2000000000 | 2033-05-18 03:33:20 UTC |
Converting from Unix Time
FROM_UNIXTIME takes the integer and returns a timestamp. The result is in the database's local timezone unless otherwise configured. This makes Unix timestamps human-readable for analysis.
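A short sketch, assuming a Trino/Presto-style FROM_UNIXTIME that accepts seconds since the epoch. The comments name the UTC instant; how the value is displayed depends on the time zone configuration:

```sql
-- Convert epoch seconds into timestamps.
SELECT
    FROM_UNIXTIME(0)          AS epoch_start,  -- the instant 1970-01-01 00:00:00 UTC
    FROM_UNIXTIME(1000000000) AS one_billion;  -- the instant 2001-09-09 01:46:40 UTC
```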
Converting to Unix Time
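A sketch of the reverse direction, assuming a Trino/Presto-style TO_UNIXTIME, which returns a floating-point number of seconds. The CAST to an integer type is an assumption about what a downstream consumer expects:

```sql
-- Convert a timestamp with an explicit UTC zone back to epoch seconds.
SELECT
    TO_UNIXTIME(TIMESTAMP '2001-09-09 01:46:40 UTC')                 AS epoch_seconds,
    CAST(TO_UNIXTIME(TIMESTAMP '2001-09-09 01:46:40 UTC') AS BIGINT) AS epoch_integer;  -- 1000000000
```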
Millisecond Timestamps
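A hedged sketch for 13-digit millisecond values, assuming a Trino/Presto-style dialect. The `web_events` row and the column name `ts_ms` are hypothetical:

```sql
-- One hypothetical event stamped in milliseconds since the epoch.
WITH web_events (event_id, ts_ms) AS (
    VALUES (1, BIGINT '1000000000123')
)
SELECT
    event_id,
    FROM_UNIXTIME(ts_ms / 1000)                 AS event_time_seconds,  -- integer division drops the milliseconds
    FROM_UNIXTIME(CAST(ts_ms AS DOUBLE) / 1000) AS event_time_millis    -- fractional seconds keep them
FROM web_events;
```

Passing the raw millisecond value without dividing would be interpreted as seconds and land tens of thousands of years in the future.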
Unix Time Function Uses
- Converting API responses that use Unix timestamps
- Processing log files with epoch time format
- Storing timestamps compactly as integers
- Interoperating with systems that expect Unix time
- Performing time arithmetic using simple integer math
> Complete this query to convert Unix epoch values into a readable timestamp.
SELECT ____(____) AS log_time FROM server_logs
FROM_UNIXTIME converts in the database server's local timezone. If the server is not in UTC, the human-readable result will be offset from the original UTC epoch. Use AT TIME ZONE after conversion to get UTC.
JavaScript timestamps are milliseconds since epoch (13 digits). Dividing by 1000 before FROM_UNIXTIME is a common and necessary step when processing data from web event streams or browser logs.
Grouping by Time Parts
Aggregate data by week, month, or quarter
Combining EXTRACT with GROUP BY enables time-based aggregations: monthly revenue trends, hourly traffic patterns, day-of-week analysis. Extracted integers become grouping keys, turning timestamp data into actionable summaries.
Every non-aggregated EXTRACT expression in the SELECT list must also appear in GROUP BY. When grouping by year and month, the result shows one row per year-month combination with aggregated metrics.
Hourly Traffic Patterns
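A sketch of grouping events by hour to find peak activity, assuming a Trino/Presto-style dialect. The `events` rows and column names are hypothetical:

```sql
-- Hypothetical events: three fall in hour 10, one in hour 9.
WITH events (event_id, event_time) AS (
    VALUES
        (1, TIMESTAMP '2024-03-15 09:58:00'),
        (2, TIMESTAMP '2024-03-15 10:05:00'),
        (3, TIMESTAMP '2024-03-15 10:40:00'),
        (4, TIMESTAMP '2024-03-16 10:15:00')
)
SELECT
    EXTRACT(HOUR FROM event_time) AS event_hour,
    COUNT(*)                      AS event_count
FROM events
GROUP BY EXTRACT(HOUR FROM event_time)
ORDER BY event_count DESC;   -- busiest hour first
```

Grouping by hour alone pools the same hour across different days, which is what a time-of-day profile needs.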
Filtering by Time Parts
EXTRACT also works in WHERE clauses to filter by time components:
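As one possible illustration, a weekday business-hours filter, assuming a Trino/Presto-style dialect with DAY_OF_WEEK numbered 1 (Monday) to 7 (Sunday). The `events` rows and the 09:00 to 16:59 window are hypothetical:

```sql
-- Hypothetical events on a Friday morning, a Friday evening, and a Saturday morning.
WITH events (event_id, event_time) AS (
    VALUES
        (1, TIMESTAMP '2024-03-15 10:05:00'),  -- Friday, inside the window
        (2, TIMESTAMP '2024-03-15 20:00:00'),  -- Friday, too late
        (3, TIMESTAMP '2024-03-16 10:15:00')   -- Saturday
)
SELECT event_id, event_time
FROM events
WHERE EXTRACT(HOUR FROM event_time) BETWEEN 9 AND 16   -- 09:00 through 16:59
  AND EXTRACT(DAY_OF_WEEK FROM event_time) <= 5;       -- Monday to Friday; keeps event 1 only
```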
> Complete this query to convert timestamps into Unix epoch integers for an API response.
SELECT ____(____) AS epoch FROM event_data
Combining EXTRACT with GROUP BY is the standard pattern for time-series reporting. The extracted integer serves as a compact grouping key that sorts naturally and is easy to filter with numeric comparisons.
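Filtering with EXTRACT in WHERE bypasses date-range index pruning. Instead of WHERE EXTRACT(YEAR FROM order_date) = 2024, prefer a range on the original column: WHERE order_date >= DATE '2024-01-01' AND order_date < DATE '2025-01-01'. The half-open range also covers timestamps late on December 31, which BETWEEN DATE '2024-01-01' AND DATE '2024-12-31' would miss if order_date carries a time component.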
EXTRACT-based grouping and TO_UNIXTIME together give you two complementary tools: extract components for aggregations within a system, and convert to Unix time for compact numeric output to external APIs.
> You are a retention analyst at Duolingo calculating how long users stay active between their signup timestamp and last login before churning. The analysis powers a cohort report grouped by signup quarter and month.
- EXTRACT() pulls the signup year and quarter from each timestamp for cohort grouping in the retention table.
- MONTH and DAY_OF_WEEK reveal whether churn spikes on weekends or specific months.
- DATE_DIFF() measures precise hours between last login and signup, giving sub-day retention granularity.
JOIN operations.
- EXTRACT(part FROM date) pulls year, month, day, hour, etc. as integers
- NOW() returns current timestamp, identical to CURRENT_TIMESTAMP
- DATE_DIFF measures differences in days, hours, minutes, or seconds between timestamps
- FROM_UNIXTIME converts integers to timestamps; TO_UNIXTIME reverses
Temporal surgery: EXTRACT, DATE_TRUNC
- Category: SQL
- Difficulty: intermediate
- Duration: 23 minutes
- Challenges: 0 hands-on challenges
Topics covered: EXTRACT(), Date Part Constants, DATE_DIFF(), Unix Time Conversion, Grouping by Time Parts
Lesson Sections
- EXTRACT() (concepts: sqlExtract)
Break apart a date into its calendar components: Time Component Extraction Calendar Extractions Practical Applications Filtering by Components How would you group revenue by year and quarter? Best Practices When to Use EXTRACT() Companies use these patterns to uncover time-of-day trends that drive product decisions.
- Date Part Constants
Additional Date Parts Extracting Multiple Parts CURRENT_TIMESTAMP with EXTRACT This query extracts the current hour and classifies the time of day. The pattern is useful for dashboards that display time-aware greetings or adjust behavior based on business hours. Extracting DOW (day of week) is useful for detecting whether demand differs between weekdays and weekends. Grouping by DOW across a full year reveals structural patterns in consumer behavior.
- DATE_DIFF()
Measuring Durations Response Time Analysis This query calculates response time in minutes and categorizes performance. Sub-hour granularity is critical for support SLAs where 15 minutes versus 45 minutes makes a significant difference. Aggregating Durations This query calculates session statistics per user: how many sessions, average duration, and total time spent. These metrics are fundamental for user engagement analysis. What unit should you measure in for different use cases? DATE_DIFF() Use
- Unix Time Conversion
Unix time (also called epoch time or POSIX time) represents time as a single integer: the number of seconds since January 1, 1970 00:00:00 UTC. This representation is compact, timezone-independent, and universally understood by computing systems. Understanding Unix Time The Unix epoch (time zero) is January 1, 1970. Every second adds 1 to the counter. 1.7 billion seconds have passed since 1970, bringing us to 2024. This simple counting system enables easy time arithmetic. Converting from Unix Ti
- Grouping by Time Parts
Group transactions by year and month to see monthly trends: Hourly Traffic Patterns Group events by hour to find peak activity times: Hours 10-11 show peak traffic. This pattern reveals when users are most active, informing capacity planning and marketing schedules. Filtering by Time Parts