Twitter (X) Data Engineer Interview
Twitter (X) Data Engineer Interview Process
4 to 5 rounds, 3 to 4 weeks. Mostly virtual.
- 01
Recruiter Screen (30 min)
Conversational. X hires across Timeline, Search and Discovery, Ads, Trust and Safety, Creator Tools, ML Platform. Post-2023 the teams are smaller and ownership is broader; expect a single data engineer to own multiple pipelines historically owned by separate teams. Mention experience with real-time systems, social graph data, or recommendation systems if you have it.
- 02
Technical Phone Screen (60 min)
Live SQL or Python in CoderPad. SQL leans on social graph queries (mutual follows, second-degree connections, engagement funnels per follower count bucket). Python leans on stream-processing patterns (deduplicate tweets, aggregate engagement, detect spam).
- 03
System Design Round (60 min)
Common prompts: design the timeline generation pipeline at 600M user scale, design the trending-topics computation, design the spam and abuse detection pipeline, design the ad-targeting feature pipeline. Use the 4-step framework. Cover the real-time + batch dual track, fanout-on-write vs fanout-on-read trade-offs, hot-key handling for celebrity accounts, and cost-conscious architecture choices.
- 04
Live Coding Onsite (60 min)
Second live coding round. For some teams this is C++ instead of Python (legacy infra teams). Leans on data structure choice (LRU cache, bloom filter for spam detection, count-min sketch for top-K).
- 05
Behavioral Round (60 min)
STAR-D format. X's 2023+ culture explicitly rewards lean ownership, fast shipping, and skepticism of over-engineering. Stories about killing a project, deprecating a system, or shipping a 1-week version of a 6-week proposal land especially well. A decision postmortem is essential.
Twitter (X) Data Engineer Compensation (2026)
Total compensation figures are US-based, drawn from levels.fyi and verified offers. Note: post-2023 stock comp at X is private equity with no public market, and its valuation is contested.
| Level | Title | Range | Notes |
|---|---|---|---|
| IC2 | Data Engineer | $160K - $230K | 2-4 years exp. Lean ownership, multiple pipelines. |
| IC3 | Senior Data Engineer | $220K - $340K | Most common hiring level. Cross-team systems. |
| IC4 | Staff Data Engineer | $310K - $470K | Sets technical direction for a domain. |
| IC5 | Senior Staff Data Engineer | $400K - $580K | Multi-org technical leadership. |
Twitter (X) Data Engineering Tech Stack
Languages
Processing
Storage
Streaming
Query Engines
Orchestration
ML Platform
Graph
15 Real Twitter (X) Data Engineer Interview Questions With Worked Answers
Questions reported by candidates in 2024-2026 loops, paraphrased and de-identified. Each answer covers the approach, the gotcha, and the typical follow-up.
Find users with mutual follows ("moots")
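One way to reason about this question before writing the SQL self-join is a small in-memory sketch. The `(follower_id, followee_id)` edge shape and the function name `mutual_follows` are assumptions for illustration, not X's actual schema.

```python
def mutual_follows(edges):
    """Return sorted (a, b) pairs with a < b where a follows b and b follows a.

    `edges` is an iterable of (follower_id, followee_id) tuples (assumed shape).
    """
    edge_set = set(edges)  # also removes duplicate follow rows
    pairs = set()
    for follower, followee in edge_set:
        # Ordering the ids emits each pair once and skips self-follows.
        if follower < followee and (followee, follower) in edge_set:
            pairs.add((follower, followee))
    return sorted(pairs)


edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "c"), ("e", "e")]
print(mutual_follows(edges))
```

The same idea carries over to SQL as a self-join on the follows table with a `follower_id < followee_id` predicate to avoid emitting each pair twice.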
Compute engagement rate per tweet bucketed by follower count
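A minimal sketch of the bucketing logic, assuming each tweet record carries `follower_count`, `impressions`, and `engagements` fields. The field names and the bucket boundaries are hypothetical; confirm both with the interviewer. It reports a ratio of sums per bucket, which is one common reading of "engagement rate"; an average of per-tweet ratios is the other.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical bucket boundaries (upper edges are exclusive).
BUCKET_EDGES = [1_000, 10_000, 100_000, 1_000_000]
BUCKET_LABELS = ["<1K", "1K-10K", "10K-100K", "100K-1M", "1M+"]


def follower_bucket(follower_count):
    """Map a follower count to its bucket label."""
    return BUCKET_LABELS[bisect_right(BUCKET_EDGES, follower_count)]


def engagement_rate_by_bucket(tweets):
    """Return {bucket: total engagements / total impressions}."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [engagements, impressions]
    for tweet in tweets:
        bucket = follower_bucket(tweet["follower_count"])
        totals[bucket][0] += tweet["engagements"]
        totals[bucket][1] += tweet["impressions"]
    # Guard against buckets whose tweets have zero impressions.
    return {b: (e / i if i else 0.0) for b, (e, i) in totals.items()}


tweets = [
    {"follower_count": 500, "impressions": 100, "engagements": 5},
    {"follower_count": 900, "impressions": 300, "engagements": 15},
    {"follower_count": 2_000_000, "impressions": 10_000, "engagements": 200},
]
rates = engagement_rate_by_bucket(tweets)
```

The likely gotcha is the zero-impression case and the choice between ratio-of-sums and mean-of-ratios; stating which one you picked matters more than the code.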
Compute second-degree connections (friends-of-friends)
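The friends-of-friends traversal can be sketched in memory as a two-hop expansion over an adjacency map. The `(follower_id, followee_id)` edge shape and the name `second_degree` are assumptions for illustration.

```python
from collections import defaultdict


def second_degree(edges, user):
    """Accounts followed by the accounts `user` follows, excluding
    `user` and the accounts `user` already follows directly."""
    following = defaultdict(set)
    for follower, followee in edges:
        following[follower].add(followee)

    first_degree = following.get(user, set())
    reachable = set()
    for friend in first_degree:
        reachable |= following.get(friend, set())
    return reachable - first_degree - {user}


edges = [
    ("a", "b"), ("a", "c"),
    ("b", "c"), ("b", "d"),
    ("c", "a"), ("c", "d"),
    ("d", "e"),
]
print(second_degree(edges, "a"))
```

In SQL this is a self-join of the follows table followed by an anti-join against first-degree follows. At real graph sizes the two-hop expansion explodes on high-degree accounts, so expect a follow-up about capping or sampling the fanout.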
Find tweets going viral (engagement velocity rising fast)
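One simple way to express "engagement velocity rising fast" is to compare engagement counts in the most recent window against the window before it. The sketch below assumes events arrive as `(tweet_id, epoch_seconds)` pairs; the function name and the thresholds `min_recent` and `min_ratio` are hypothetical tuning knobs, not values from X.

```python
from collections import Counter


def viral_tweets(events, now, window=600, min_recent=100, min_ratio=3.0):
    """Return (tweet_id, ratio) pairs, highest ratio first, for tweets whose
    recent-window engagement is at least `min_ratio` times the prior window."""
    recent, previous = Counter(), Counter()
    for tweet_id, ts in events:
        age = now - ts
        if 0 <= age < window:
            recent[tweet_id] += 1
        elif window <= age < 2 * window:
            previous[tweet_id] += 1

    viral = []
    for tweet_id, count in recent.items():
        # max(..., 1) avoids division by zero for brand-new tweets.
        ratio = count / max(previous[tweet_id], 1)
        # The volume floor keeps tiny tweets (1 -> 4 likes) out of the result.
        if count >= min_recent and ratio >= min_ratio:
            viral.append((tweet_id, ratio))
    return sorted(viral, key=lambda item: item[1], reverse=True)


now = 2000
events = (
    [("t1", 1900)] * 40 + [("t1", 1100)] * 10
    + [("t2", 1900)] * 25 + [("t2", 1100)] * 20
)
print(viral_tweets(events, now, min_recent=30))
```

In SQL the same comparison is usually written with windowed counts and `LAG` over time buckets; the volume floor is the part candidates most often forget.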
Detect coordinated inauthentic behavior (CIB)
Implement a sliding-window rate limiter for tweet posting
from collections import deque
from datetime import datetime, timedelta


class TweetRateLimiter:
    def __init__(self, daily_limit: int = 2400):
        self.daily_limit = daily_limit
        self.window = timedelta(days=1)
        # Per-user timestamps of accepted posts, oldest first.
        self.user_posts: dict[str, deque[datetime]] = {}

    def can_post(self, user_id: str, now: datetime) -> bool:
        """Check the limit and, if allowed, record the post at `now`."""
        posts = self.user_posts.setdefault(user_id, deque())
        cutoff = now - self.window
        # Evict timestamps that have slid out of the 24-hour window.
        while posts and posts[0] < cutoff:
            posts.popleft()
        if len(posts) >= self.daily_limit:
            return False
        posts.append(now)
        return True

Compute trending hashtags via count-min sketch
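A minimal count-min sketch for hashtag counting might look like the following. The class name, the default `width` and `depth`, and the use of salted BLAKE2b as the hash family are all assumptions chosen for a self-contained example; a production job would size the sketch from a target error bound and use a faster hash.

```python
import hashlib
import heapq


class CountMinSketch:
    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One independent hash per row, derived by salting BLAKE2b with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.rows[row][col] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum across rows is an
        # upper-biased estimate that never undercounts.
        return min(self.rows[row][col] for row, col in self._cells(item))


def top_k(sketch, candidates, k):
    """Rank a candidate set of hashtags by estimated count."""
    return heapq.nlargest(k, candidates, key=sketch.estimate)


sketch = CountMinSketch()
for tag, n in [("#worldcup", 50), ("#oscars", 30), ("#rust", 5)]:
    sketch.add(tag, n)
```

The sketch alone cannot enumerate keys, so top-K needs a separate candidate set (here passed in by the caller); in a streaming job this is typically a small heap of the current heavy hitters maintained alongside the sketch.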
Implement timeline ranking inference at p99 < 100ms
Bloom filter for already-shown-tweet detection
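A per-user Bloom filter of already-shown tweet IDs can be sketched as below. The class name, the string tweet keys, and the SHA-256 double-hashing scheme are assumptions for illustration; the sizing uses the standard formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, expected_items, false_positive_rate=0.01):
        ln2 = math.log(2)
        # Bit-array size m and hash count k from the standard sizing formulas.
        self.size = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / (ln2 ** 2)))
        self.num_hashes = max(1, round(self.size / expected_items * ln2))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1  # force odd, never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter(expected_items=1000)
for i in range(100):
    seen.add(f"tweet:{i}")
```

A Bloom filter has no false negatives, so a tweet that was shown is always reported as seen. The trade-off is the false positive: occasionally a never-shown tweet is filtered out, which is usually acceptable for timeline deduplication but worth saying out loud in the interview.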
Design the timeline generation pipeline at 600M user scale
Tweet write
  -> Kafka (tweets topic, key=user_id)
  -> Heron stream job:
       if author.follower_count < 1M (normal user):
         FANOUT-ON-WRITE: push tweet_id to inbox of every follower
         (in Manhattan / Redis, sharded by follower_id)
       else (celebrity):
         FANOUT-ON-READ: tweet stays in author's outbox only
         on follower's timeline request, fetch celebrity outboxes
         (limited set, can be cached)

Timeline request
  -> Service merges:
       (a) follower's inbox (fanout-on-write portion)
       (b) outboxes of followed accounts with >= 1M followers (fanout-on-read portion)
  -> ML ranker scores combined candidate set
  -> top-K returned

Failure modes:
  1. Heron worker crash: Kafka redelivers, idempotent inbox writes.
  2. Celebrity inbox-fanout drift: rare; daily reconciliation job.
  3. Cache eviction during traffic spike: degrade to direct Manhattan
     read with higher latency, no functional break.

Design the trending-topics computation
Design the spam and abuse detection pipeline
Design the schema for follows, mentions, replies, retweets
Model the tweet engagement rollup for analytics
Tell me about a time you killed a project or deprecated a system
What Makes Twitter (X) Data Engineer Interviews Different
Lean ownership is the new normal
Real-time at planet scale is the central challenge
Cost consciousness is graded explicitly
Skepticism of over-engineering is a culture trait
How Twitter (X) Connects to the Rest of Your Prep
X overlaps with Pinterest data engineering interview prep on the social graph data modeling and recommendation pipeline patterns, and with Netflix data engineering interview prep on real-time streaming pipelines.
If you're targeting a streaming-heavy team, also see the Kafka and Flink interview prep guide. Drill the timeline generation patterns in the system design framework for data engineers, and use the behavioral interview prep for data engineers to prepare the kill-a-project stories X interviewers reward.
Data engineer interview prep FAQ
How long does the Twitter (X) Data Engineer interview take?
Is X remote-friendly?
What level should I target?
Does X test algorithms / LeetCode?
How do I prepare for the lean-ownership culture?
What languages can I use?
How is the X stock comp valued?
How does X compare to Meta or Google for data engineering?