# Read Both Ways

> Placeholders hide in plain sight. The symmetric ones give themselves away.

Canonical URL: <https://datadriven.io/problems/read-both-ways-token-sweep>

Domain: Python · Difficulty: easy · Seniority: junior

## Problem

We sweep production records for placeholder identifiers that QA seeds as values reading the same forwards and backwards, so they are easy to spot and purge later. Scan a batch of `tokens` and return the ones that still read the same in both directions once you ignore case and any character that is not a letter or digit, keeping them in the order they appeared. A token with no letters or digits at all does not count.

## Worked solution and explanation

### What this is really testing

Strip the QA-placeholder costume and this is a filter with a normalization step hiding inside a symmetry check. Anyone can write the reverse-compare; what actually separates candidates is recognizing that the raw token is not what you compare. `'Race-car'` is not equal to its own reverse as a raw string, and `'!!!'` looks symmetric but carries no real value. The candidate who normalizes first (lowercase, keep only letters and digits) and then guards against the empty result gets the right set. The one who writes the one-liner over raw strings misses real flags and invents fake ones.

---

### Break down the requirements

#### Step 1: Normalize each token before judging it

Lowercase every character and keep only the ones where `c.isalnum()` is true. This is the load-bearing step: it turns `'Race-car'` into `'racecar'` so the comparison can succeed, and it is what a raw reverse-compare skips.
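
A minimal sketch of the normalization step in isolation. The helper name `normalize` is illustrative only and does not appear in the problem statement.

```python
def normalize(token):
    """Lowercase the token and drop every character that is not a letter or digit."""
    # str.isalnum() is Unicode-aware, so non-ASCII letters and digits are kept as well.
    return "".join(c.lower() for c in token if c.isalnum())


print(normalize("Race-car"))   # racecar
print(repr(normalize("!!!")))  # ''
```

Keeping normalization in one place means the comparison rule and the cleaning rule can change independently.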

#### Step 2: Compare the cleaned form to its reverse

Once normalized, `cleaned == cleaned[::-1]` is the symmetry test. Slicing with `[::-1]` reverses the string in one expression and reads exactly like the spec sentence.
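
As a small illustration of the symmetry test by itself, assuming the input has already been normalized (the function name `reads_same_both_ways` is hypothetical):

```python
def reads_same_both_ways(cleaned):
    """Return True when an already-normalized string equals its own reverse."""
    # A slice step of -1 walks the string backwards and builds a reversed copy.
    return cleaned == cleaned[::-1]


print(reads_same_both_ways("racecar"))    # True
print(reads_same_both_ways("order1042"))  # False
```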

#### Step 3: Reject tokens that normalize to nothing

A token like `'!!!'` becomes the empty string after normalization, and the empty string equals its own reverse, so without a guard it would be flagged. The prompt says a token with no letters or digits does not count, so require `cleaned` to be non-empty.
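
The effect of the guard can be seen on a punctuation-only token. This is a sketch using the same normalization expression as the steps above:

```python
cleaned = "".join(c.lower() for c in "!!!" if c.isalnum())

print(repr(cleaned))                               # ''
print(cleaned == cleaned[::-1])                    # True: the bare symmetry test would flag it
print(bool(cleaned and cleaned == cleaned[::-1]))  # False: the non-empty guard rejects it
```

An empty string is falsy in Python, so `cleaned and ...` short-circuits before the comparison runs.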

#### Step 4: Append the ORIGINAL token, in order

Collect the untouched input token, not its normalized form, and append as you walk the list so input order is preserved for free.

---

### The solution

**Normalize, then mirror-compare**

```python
def find_palindromes(tokens):
    flagged = []
    for token in tokens:
        cleaned = "".join(c.lower() for c in token if c.isalnum())
        if cleaned and cleaned == cleaned[::-1]:
            flagged.append(token)
    return flagged
```
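
A quick usage sketch on a made-up batch (the sample tokens are hypothetical, not from the problem's test data). The function is repeated so the snippet runs on its own.

```python
def find_palindromes(tokens):
    flagged = []
    for token in tokens:
        cleaned = "".join(c.lower() for c in token if c.isalnum())
        if cleaned and cleaned == cleaned[::-1]:
            flagged.append(token)
    return flagged


# Hypothetical sample batch: mixed case, punctuation, digits, and a punctuation-only token.
batch = ["Race-car", "order-1042", "Level", "!!!", "A1b-1a"]

print(find_palindromes(batch))  # ['Race-car', 'Level', 'A1b-1a']
```

The output keeps the original spelling and the input order, which is what Step 4 asks for.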

> **Complexity**
>
> Time is O(total characters across all tokens): one pass to normalize each token and one slice to reverse it, both linear in the token's length. Space is O(k) for the cleaned copy of the current token and its reversed slice, where k is the length of the longest token, plus O(m) for the output list, where m is the number of flagged tokens. At the scale of a batch sweep (thousands of short identifiers) this is effectively instant.

> **Interviewers Watch For**
>
> Whether you normalize BEFORE comparing rather than after, and whether you proactively name the empty-token edge case without being prompted. Strong candidates also state that they return the original token, not the cleaned one, because the caller wants to act on the real record.

> **Common Pitfall**
>
> Writing `[t for t in tokens if t == t[::-1]]` over the raw strings. It silently drops `'Race-car'` and `'Level'` because punctuation or capitalization breaks the raw match, and it flags `'!!!'` because a run of identical characters is trivially its own reverse, even though it contains no letters or digits. The bug is invisible on clean, lowercase single-word inputs and only surfaces on the realistic, messy ones.
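
To make the pitfall concrete, here is a side-by-side sketch of the raw one-liner and the normalized check on a hypothetical batch. The helper name `is_flagged` is an assumption for illustration.

```python
def is_flagged(token):
    """Normalized check: lowercase, keep letters and digits, require a non-empty result."""
    cleaned = "".join(c.lower() for c in token if c.isalnum())
    return bool(cleaned) and cleaned == cleaned[::-1]


tokens = ["Race-car", "Level", "!!!", "abba"]  # hypothetical sample batch

raw = [t for t in tokens if t == t[::-1]]
normalized = [t for t in tokens if is_flagged(t)]

print(raw)         # ['!!!', 'abba']
print(normalized)  # ['Race-car', 'Level', 'abba']
```

Only `'abba'` is handled correctly by both versions, which is why the raw comparison passes on clean inputs.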

---

## Common follow-up questions

- How would you change this to also report WHERE in each token the symmetry breaks for the non-matching ones? _(Pushes from a boolean predicate to a two-pointer walk that returns the first mismatched index pair.)_
- The tokens now arrive as a stream too large to hold in memory. How do you adapt? _(Tests turning the list-building loop into a generator that yields flagged tokens lazily.)_
- What if symmetry should be checked on the raw characters, punctuation and all, with only case ignored? _(Probes whether the candidate can isolate and adjust the normalization rule without rewriting the comparison.)_
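
One possible shape for the first two follow-ups, offered as a sketch rather than a reference answer. The names `first_mismatch` and `iter_palindromes` are hypothetical, and the reported indices refer to positions in the cleaned string, not the raw token; mapping back to raw positions would need extra bookkeeping.

```python
def first_mismatch(token):
    """Return the first mismatched (left, right) index pair in the cleaned token, or None."""
    cleaned = "".join(c.lower() for c in token if c.isalnum())
    left, right = 0, len(cleaned) - 1
    # Walk inward from both ends until the pointers meet or a pair disagrees.
    while left < right:
        if cleaned[left] != cleaned[right]:
            return left, right
        left += 1
        right -= 1
    return None  # symmetric, or nothing left after cleaning


def iter_palindromes(tokens):
    """Yield flagged tokens one at a time so the whole batch never sits in memory."""
    for token in tokens:
        cleaned = "".join(c.lower() for c in token if c.isalnum())
        if cleaned and cleaned == cleaned[::-1]:
            yield token


print(first_mismatch("ab-xyBA"))  # (2, 3)
print(first_mismatch("Level"))    # None
print(list(iter_palindromes(iter(["Level", "nope", "!!!"]))))  # ['Level']
```

Because `iter_palindromes` accepts any iterable, it works unchanged on a file handle or a network reader that produces one token at a time.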


---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.