# Session Logins Dec 13 to 19

> Logins during one specific window.

Canonical URL: <https://datadriven.io/problems/session_logins_dec_13_to_19>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The security team is investigating a suspicious login window. Find all unique users with a session start between December 13 and December 19, 2026, inclusive.

## Worked solution and explanation

### Why this problem exists in real interviews

This problem tests filtering and projection on the `user_sessions` table: you filter rows on `session_start` and return the matching `user_id` values, each listed once.

---

### Break down the requirements

#### Step 1: Apply the range filter

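The `WHERE` clause restricts rows to sessions whose `session_start` falls in the target window, December 13 through December 19, 2026. Because the range is inclusive, every session on December 19 must qualify, including those late in the day if `session_start` stores a time component. Filtering before deduplication means `DISTINCT` only has to process the rows inside the window.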
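
The boundary behavior is easy to check on a few literal values. The sketch below assumes PostgreSQL and assumes `session_start` is a `timestamp` (the problem does not state the column type); the inline `VALUES` rows and the column aliases are made up for illustration.

```sql
-- Compare an inclusive BETWEEN on date literals with a half-open range.
-- Assumption: the column behaves like a timestamp, so '2026-12-19' means
-- 2026-12-19 00:00:00 when compared against it.
SELECT
    ts,
    (ts BETWEEN '2026-12-13' AND '2026-12-19')     AS in_between,
    (ts >= '2026-12-13' AND ts < '2026-12-20')     AS in_half_open
FROM (
    VALUES
        (TIMESTAMP '2026-12-19 00:00:00'),  -- in_between = true,  in_half_open = true
        (TIMESTAMP '2026-12-19 15:30:00'),  -- in_between = false, in_half_open = true
        (TIMESTAMP '2026-12-20 00:00:00')   -- in_between = false, in_half_open = false
) AS t(ts);
```

If the column is a plain `date`, both predicates select the same rows, so the half-open form is the safer default either way.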

#### Step 2: Deduplicate the result with DISTINCT

`SELECT DISTINCT` removes duplicate rows from the output. It is needed here because a user can start several sessions inside the window, and each user should appear only once.
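
A small self-contained run shows the deduplication. This is a sketch assuming PostgreSQL; the CTE stands in for the real `user_sessions` table, and its four rows are invented sample data, not values from the problem.

```sql
-- Hypothetical sample data: user 1 has two sessions in the window,
-- user 2 has one late on Dec 19, user 3 starts just after the window.
WITH user_sessions (user_id, session_start) AS (
    VALUES
        (1, TIMESTAMP '2026-12-13 08:00:00'),
        (1, TIMESTAMP '2026-12-15 09:30:00'),
        (2, TIMESTAMP '2026-12-19 23:10:00'),
        (3, TIMESTAMP '2026-12-20 00:05:00')
)
SELECT DISTINCT user_id
FROM user_sessions
WHERE session_start >= TIMESTAMP '2026-12-13'
  AND session_start <  TIMESTAMP '2026-12-20'
ORDER BY user_id;  -- ORDER BY only makes the demo output stable
-- Returns two rows: 1 and 2
```

Without `DISTINCT`, user 1 would appear twice, once per session.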

---

### The solution

**Apply the range filter to find session logins from Dec 13 to 19**

```sql
SELECT DISTINCT user_id
FROM user_sessions
-- Half-open range: keeps all of Dec 19 even if session_start has a time part
WHERE session_start >= '2026-12-13'
  AND session_start <  '2026-12-20';
```

> **Cost Analysis**
>
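> Without a supporting index, the query performs a single sequential scan over the ~50M rows. An index on `session_start` lets the planner read only the rows in the one-week window, provided that window is a small fraction of the table.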
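
One way to try the index idea, sketched under the assumption of PostgreSQL and the `user_sessions` table from the problem. The index name is hypothetical, and adding `user_id` as a second key column is a design choice made here, not something the problem prescribes.

```sql
-- Hypothetical index: session_start leads so the range predicate can use it;
-- user_id is included so the query may be answerable from the index alone.
CREATE INDEX IF NOT EXISTS idx_user_sessions_start_user
    ON user_sessions (session_start, user_id);

-- Inspect the plan the optimizer actually picks for the range query.
EXPLAIN
SELECT DISTINCT user_id
FROM user_sessions
WHERE session_start >= '2026-12-13'
  AND session_start <  '2026-12-20';
```

The planner may still choose a sequential scan if it estimates that the window covers a large share of the table, so it is worth reading the `EXPLAIN` output rather than assuming the index is used.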

> **Interviewers Watch For**
>
> Interviewers watch for whether the query returns exactly the column the prompt asks for, with each user listed once, and for how quickly you identify the core operation and write clean, minimal code.

> **Common Pitfall**
>
> Returning extra columns that the prompt did not ask for, or using the wrong column alias, causes a grading mismatch even when the logic is correct.

---

## Common follow-up questions

- What would happen to your result if `user_sessions.session_start` contained duplicate values that you did not expect? _(Tests whether the candidate considers data quality issues in `session_start` and uses DISTINCT or deduplication where needed.)_
- `user_sessions.user_id` has roughly 4,000,000 distinct values. What index strategy would you use to avoid a full scan on `user_sessions`? _(Tests indexing knowledge specific to the high-cardinality `user_id` column in `user_sessions`.)_
- How would you modify this query if the business logic required grouping by both `session_id` and `user_id` instead of just one? _(Tests ability to adapt the query structure to changing requirements.)_

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.