# Suspected Bot Sessions

> Under 100 seconds. Probably a bot.

Canonical URL: <https://datadriven.io/problems/suspected_bot_sessions>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

Sessions shorter than 100 seconds get flagged as potential bot activity. Return the session ID, user ID, and session duration for each flagged session that started in 2026.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests whether a candidate can combine a numeric threshold filter with a date filter, including extracting the year from a timestamp. Interviewers use it early in a round as a baseline proficiency check.

---

### Break down the requirements

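#### Step 1: Apply the filters

The WHERE clause keeps only rows that satisfy both conditions: `session_duration_sec` is under 100, and `session_start` falls in 2026. Because the filter is applied before the SELECT list is evaluated, only matching rows flow into the output.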

#### Step 2: Select the target columns

The SELECT clause picks exactly the three columns the prompt asks for: `session_id`, `user_id`, and `session_duration_sec`. Returning extra columns or renaming a required column would fail the grading check.

---

### The solution

**Filter user_sessions by duration threshold and year range**

```sql
SELECT session_id, user_id, session_duration_sec
FROM user_sessions
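WHERE session_duration_sec < 100                -- strictly under the 100-second threshold
  AND strftime('%Y', session_start) = '2026';   -- SQLite syntax; in Postgres use EXTRACT(YEAR FROM session_start) = 2026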
```
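To sanity-check the query, here is a minimal sketch that runs it against an in-memory SQLite database using Python's built-in `sqlite3` module. The sample rows are hypothetical, chosen to probe the threshold and year boundaries, and the `ORDER BY` is added only to make the output deterministic. Only the table and column names come from the solution above.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not real data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_sessions ("
    "session_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "session_start TEXT, session_duration_sec INTEGER)"
)
conn.executemany(
    "INSERT INTO user_sessions VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2026-01-15 08:00:00", 4),    # short, in 2026: flagged
        (2, 11, "2026-03-02 12:30:00", 100),  # exactly 100 s: not flagged (strict <)
        (3, 12, "2025-12-31 23:59:58", 3),    # short, but started in 2025: excluded
        (4, 10, "2026-12-31 23:59:59", 99),   # short, last second of 2026: flagged
        (5, 13, "2026-06-01 09:00:00", 640),  # long session: not flagged
    ],
)

SOLUTION = """
SELECT session_id, user_id, session_duration_sec
FROM user_sessions
WHERE session_duration_sec < 100
  AND strftime('%Y', session_start) = '2026'
ORDER BY session_id  -- added here only for a stable result order
"""

suspects = conn.execute(SOLUTION).fetchall()
print(suspects)  # [(1, 10, 4), (4, 10, 99)]
```

Rows 2 and 3 are the useful ones to reason about in an interview: one sits exactly on the threshold, the other just before the year boundary.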

> **Cost Analysis**
>
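> With ~40M rows, the query performs a single sequential scan. An index on the filter columns can replace the scan with an index seek, but only if the predicates are index-friendly: wrapping `session_start` in `strftime` prevents a plain index on that column from being used, whereas a range predicate such as `session_start >= '2026-01-01' AND session_start < '2027-01-01'` does not.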
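
The scan-versus-seek point can be checked with SQLite's `EXPLAIN QUERY PLAN`. The sketch below assumes `session_start` is stored as ISO-8601 text (`YYYY-MM-DD HH:MM:SS`), so string comparison matches chronological order. The index name `idx_sessions_start` is hypothetical. The exact plan wording varies between SQLite versions, so treat the printed text as illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_sessions ("
    "session_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "session_start TEXT, session_duration_sec INTEGER)"
)
# Hypothetical index; the note above only mentions "an index on the filter columns".
conn.execute("CREATE INDEX idx_sessions_start ON user_sessions (session_start)")


def query_plan(sql: str) -> str:
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the plan detail


COLUMNS = "SELECT session_id, user_id, session_duration_sec FROM user_sessions "

# Original form: the function call hides session_start from the index.
wrapped_plan = query_plan(
    COLUMNS + "WHERE session_duration_sec < 100 "
    "AND strftime('%Y', session_start) = '2026'"
)

# Half-open range on the bare column: same rows for ISO-8601 text, and indexable.
range_plan = query_plan(
    COLUMNS + "WHERE session_duration_sec < 100 "
    "AND session_start >= '2026-01-01' AND session_start < '2027-01-01'"
)

print("wrapped:", wrapped_plan)
print("range:  ", range_plan)
```

On SQLite, the wrapped form typically reports a full table scan, while the range form reports a search using `idx_sessions_start`. Other engines have their own plan output, but the same reasoning about functions on indexed columns generally applies.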

> **Interviewers Watch For**
>
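> Interviewers watch for how you handle the date filter and whether you account for edge cases such as year boundaries (a session starting at 2026-12-31 23:59:59 still counts) and the strict `<` on the duration threshold.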

> **Common Pitfall**
>
> Returning extra columns that the prompt did not ask for, or using the wrong column alias, causes a grading mismatch even when the logic is correct.

---

## Common follow-up questions

- If session_start is stored as a TEXT timestamp, how would you reliably extract the year for filtering? _(Tests string-to-date parsing or EXTRACT/strftime usage depending on the engine.)_
- Would an index on session_duration_sec alone help, or should session_start come first in the index? _(Tests index selectivity reasoning; the date filter may be more selective than the duration threshold.)_
- If the business changes the threshold to 'fewer than 3 pages viewed AND under 100 seconds', how does the WHERE clause change? _(Tests combining multiple conditions with AND and referencing the pages_viewed column.)_
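
As a rough illustration of the last follow-up, the sketch below adds a `pages_viewed` condition to the WHERE clause. It assumes `pages_viewed` is an integer column on `user_sessions`, which the original solution does not show, and it uses hypothetical sample rows in an in-memory SQLite database. The year filter keeps the `strftime` form, which works when `session_start` is ISO-8601 text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# pages_viewed is an assumed column, named after the follow-up question.
conn.execute(
    "CREATE TABLE user_sessions ("
    "session_id INTEGER PRIMARY KEY, user_id INTEGER, session_start TEXT, "
    "session_duration_sec INTEGER, pages_viewed INTEGER)"
)
conn.executemany(
    "INSERT INTO user_sessions VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "2026-02-01 10:00:00", 5, 1),    # short and few pages: flagged
        (2, 11, "2026-02-01 10:05:00", 5, 7),    # short, but 7 pages: not flagged
        (3, 12, "2026-02-02 11:00:00", 300, 1),  # few pages, but long: not flagged
        (4, 13, "2025-02-01 10:00:00", 5, 1),    # matches both, wrong year: excluded
    ],
)

FOLLOW_UP = """
SELECT session_id, user_id, session_duration_sec
FROM user_sessions
WHERE pages_viewed < 3
  AND session_duration_sec < 100
  AND strftime('%Y', session_start) = '2026'
ORDER BY session_id  -- added only for a stable result order
"""

flagged = conn.execute(FOLLOW_UP).fetchall()
print(flagged)  # [(1, 10, 5)]
```

Because every condition is joined with AND, a session must satisfy all three to be returned. The SELECT list is unchanged, since the prompt's output columns did not change.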

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/suspected_bot_sessions)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.