# Unique Hosts by Node Type

> How many unique hosts per node type?

Canonical URL: <https://datadriven.io/problems/unique_hosts_by_node_type>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

How many unique hostnames exist across compute, storage, network, and GPU node types? Nodes appearing under multiple types should be counted only once.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests `COUNT(DISTINCT ...)` with a multi-value `WHERE IN` filter. Interviewers verify that you understand how to count unique values across a subset of categories without double-counting.

---

### Break down the requirements

#### Step 1: Filter to specified node types

`WHERE node_type IN ('compute', 'storage', 'network', 'GPU')` restricts to the four relevant types.

#### Step 2: Count distinct hostnames

`COUNT(DISTINCT hostname)` ensures nodes appearing under multiple types are counted only once.

---

### The solution

**Distinct count across multiple categories**

```sql
SELECT COUNT(DISTINCT hostname) AS unique_hosts
FROM infra_nodes
WHERE node_type IN ('compute', 'storage', 'network', 'GPU')
```

> **Cost Analysis**
>
> With 8K rows and 5 node types, the filter and distinct count are trivially fast. The hash set for DISTINCT tracking holds at most 8K values.

> **Interviewers Watch For**
>
> Candidates who group by node_type and then try to sum the counts. That approach double-counts hostnames appearing under multiple types. The single DISTINCT across all types is the correct approach.

> **Common Pitfall**
>
> Using `COUNT(hostname)` without DISTINCT. This counts total rows, not unique hostnames, inflating the result if any hostname appears under multiple node types.

---

## Common follow-up questions

- How would you get the unique count per node type instead? _(Add GROUP BY node_type, which now counts independently per type.)_
- What if node_type values were case-inconsistent? _(Use LOWER(node_type) IN (...) for case-insensitive matching.)_
- How would you find hostnames that appear in more than one node type? _(GROUP BY hostname HAVING COUNT(DISTINCT node_type) > 1.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/unique_hosts_by_node_type)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.