Twitter (now X) serves a real-time timeline to 600M+ users on top of a social graph of follows, mentions, replies, and retweets. The data engineering challenge is unique: low-latency timeline generation, ad targeting at scale, and real-time abuse detection all run on the same underlying graph data. The 2025 to 2026 rebuilds under Elon-era leadership consolidated the stack significantly. The data engineer interview reflects this: performance-conscious, lean-team-friendly, and skeptical of over-engineering. Loops run 3 to 4 weeks. Pair with our data engineer interview prep hub.
4 to 5 rounds, 3 to 4 weeks. Mostly virtual.
Total compensation figures come from levels.fyi and verified offers, and cover US-based roles. Note: post-2023 stock compensation at X is private equity with no public market, and its valuation is contested.
| Level | Title | Total Comp Range (USD) | Notes |
|---|---|---|---|
| IC2 | Data Engineer | $160K - $230K | 2-4 years exp. Lean ownership, multiple pipelines. |
| IC3 | Senior Data Engineer | $220K - $340K | Most common hiring level. Cross-team systems. |
| IC4 | Staff Data Engineer | $310K - $470K | Sets technical direction for a domain. |
| IC5 | Senior Staff Data Engineer | $400K - $580K | Multi-org technical leadership. |
Questions reported by candidates in 2024-2026 loops, paraphrased and de-identified. Each answer covers the approach, the gotcha, and the typical follow-up.
from collections import deque
from datetime import datetime, timedelta


class TweetRateLimiter:
    """Sliding-window limiter: at most `daily_limit` posts per user per 24 hours."""

    def __init__(self, daily_limit: int = 2400):
        self.daily_limit = daily_limit
        self.window = timedelta(days=1)
        # Per-user timestamps of accepted posts, oldest first.
        self.user_posts: dict[str, deque[datetime]] = {}

    def can_post(self, user_id: str, now: datetime) -> bool:
        posts = self.user_posts.setdefault(user_id, deque())
        cutoff = now - self.window
        # Evict timestamps that have aged out of the window.
        while posts and posts[0] < cutoff:
            posts.popleft()
        if len(posts) >= self.daily_limit:
            return False
        # Accepting the post also records it, so this call is not a pure check.
        posts.append(now)
        return True

Tweet write
  -> Kafka (tweets topic, key=user_id)
  -> Heron stream job:
       if author.follower_count < 1M (normal user):
         FANOUT-ON-WRITE: push tweet_id to inbox of every follower
         (in Manhattan / Redis, sharded by follower_id)
       else (celebrity):
         FANOUT-ON-READ: tweet stays in author's outbox only
         on follower's timeline request, fetch celebrity outboxes
         (limited set, can be cached)

Timeline request
  -> Service merges:
       (a) follower's inbox (fanout-on-write portion)
       (b) celebrity outboxes for follows >= 1M (fanout-on-read portion)
  -> ML ranker scores combined candidate set
  -> top-K returned

Failure modes:
  1. Heron worker crash: Kafka redelivers, idempotent inbox writes.
  2. Celebrity inbox-fanout drift: rare; daily reconciliation job.
  3. Cache eviction during traffic spike: degrade to direct Manhattan
     read with higher latency, no functional break.

X overlaps with Pinterest data engineering interview prep on the social graph data modeling and recommendation pipeline patterns, and with Netflix data engineering interview prep on real-time streaming pipelines.
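The hybrid fanout design in the timeline answer above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not X's implementation: the `HybridTimeline` class, its method names, and the plain dictionaries standing in for Kafka, Heron, and Manhattan / Redis are all hypothetical, and a newest-first sort stands in for the ML ranker. The follower threshold is a constructor parameter so the celebrity path can be exercised with a handful of users.

```python
from collections import defaultdict


class HybridTimeline:
    """Toy model of fanout-on-write for normal users, fanout-on-read for celebrities."""

    def __init__(self, celebrity_threshold: int = 1_000_000):
        self.celebrity_threshold = celebrity_threshold
        self.followers: dict[str, set[str]] = defaultdict(set)    # author -> followers
        self.following: dict[str, set[str]] = defaultdict(set)    # user -> authors
        self.inbox: dict[str, dict[str, int]] = defaultdict(dict)   # user -> {tweet_id: ts}
        self.outbox: dict[str, dict[str, int]] = defaultdict(dict)  # author -> {tweet_id: ts}

    def follow(self, follower_id: str, author_id: str) -> None:
        self.followers[author_id].add(follower_id)
        self.following[follower_id].add(author_id)

    def is_celebrity(self, author_id: str) -> bool:
        return len(self.followers[author_id]) >= self.celebrity_threshold

    def write_tweet(self, author_id: str, tweet_id: str, ts: int) -> None:
        # Every tweet is recorded in the author's outbox.
        self.outbox[author_id][tweet_id] = ts
        if self.is_celebrity(author_id):
            return  # fanout-on-read: followers pull this at request time
        for follower_id in self.followers[author_id]:
            # Keyed by tweet_id, so a redelivered message overwrites
            # the same entry instead of duplicating it (idempotent write).
            self.inbox[follower_id][tweet_id] = ts

    def read_timeline(self, user_id: str, k: int = 10) -> list[str]:
        # (a) the precomputed inbox, merged with (b) celebrity outboxes.
        candidates = dict(self.inbox[user_id])
        for author_id in self.following[user_id]:
            if self.is_celebrity(author_id):
                candidates.update(self.outbox[author_id])
        # Assumption: recency sort as a placeholder for the ML ranker.
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        return [tweet_id for tweet_id, _ in ranked[:k]]


tl = HybridTimeline(celebrity_threshold=2)  # tiny threshold for the demo
tl.follow("bob", "alice")
tl.follow("bob", "celeb")
tl.follow("carol", "celeb")
tl.write_tweet("alice", "t1", ts=1)
tl.write_tweet("alice", "t1", ts=1)  # simulated redelivery after a worker crash
tl.write_tweet("celeb", "t2", ts=2)
print(tl.read_timeline("bob"))
```

Keying the inbox by `tweet_id` is what makes redelivery harmless in this sketch, and merging through a dictionary at read time also deduplicates tweets from an author who crossed the threshold between writes.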
If you're targeting a streaming-heavy team, also see the Kafka and Flink interview prep guide. Drill the rounds with system design framework for data engineers, which covers the timeline generation patterns, and behavioral interview prep for Data Engineer, which covers the kill-a-project stories X interviewers reward.
Drill timeline generation, social graph traversal, and the lean-architecture patterns that win the X data engineer loop.
Stripe Data Engineer process, comp, financial-precision SQL, and the collaboration round.
Uber Data Engineer process, marketplace and surge data modeling, geospatial pipelines.
Airbnb Data Engineer process, experimentation platform questions, two-sided marketplace modeling.
Databricks Data Engineer process, Spark internals, lakehouse architecture, Delta Lake questions.
Snowflake Data Engineer process, micro-partitions, query optimization, warehouse architecture.
Netflix Data Engineer process, streaming pipelines, A/B test infra, and the keeper test.
Continue your prep
50+ guides covering every round, company, role, and technology in the data engineer interview loop. Grounded in 2,817 verified interview reports across 929 companies, collected from real candidates.