Company Interview Guide

Twitter (X) Data Engineer Interview

Twitter (now X) serves a real-time timeline to 600M+ users on top of a social graph of follows, mentions, replies, and retweets. The data engineering challenge is unique: low-latency timeline generation, ad targeting at scale, and real-time abuse detection all run on the same underlying graph data. The 2025 to 2026 rebuilds under Elon-era leadership consolidated the stack significantly. The data engineer interview reflects this: performance-conscious, lean-team-friendly, and skeptical of over-engineering. Loops run 3 to 4 weeks. Pair it with our data engineer interview prep hub.

The Short Answer
Expect a 4 to 5 round loop: recruiter screen, technical phone screen (SQL or Python), then a 3 to 4 round virtual onsite covering system design (often timeline generation or recommendation pipeline), live coding (typically C++ or Python depending on team), and behavioral. X's distinctive emphasis: real-time at scale, cost-conscious architecture, social graph traversal, lean-team ownership. The behavioral round leans on stories about delivering under pressure and questioning over-engineering.
Updated April 2026 · By The DataDriven Team

Twitter (X) Data Engineer Interview Process

4 to 5 rounds, 3 to 4 weeks. Mostly virtual.

1

Recruiter Screen (30 min)

Conversational. X hires across Timeline, Search and Discovery, Ads, Trust and Safety, Creator Tools, ML Platform. Post-2023 the teams are smaller and ownership is broader; expect a single data engineer to own multiple pipelines historically owned by separate teams. Mention experience with real-time systems, social graph data, or recommendation systems if you have it.
2

Technical Phone Screen (60 min)

Live SQL or Python in CoderPad. SQL leans on social graph queries (mutual follows, second-degree connections, engagement funnels per follower count bucket). Python leans on stream-processing patterns (deduplicate tweets, aggregate engagement, detect spam).
3

System Design Round (60 min)

Common: design the timeline generation pipeline at 600M user scale, design the trending-topics computation, design the spam and abuse detection pipeline, design the ad-targeting feature pipeline. Use the 4-step framework. Cover real-time + batch dual-track, fanout-on-write vs fanout-on-read trade-offs, hot-key handling for celebrity accounts, cost-conscious architecture choices.
4

Live Coding Onsite (60 min)

Second live coding round. For some teams this is C++ instead of Python (legacy infra teams). Leans on data structure choice (LRU cache, bloom filter for spam detection, count-min sketch for top-K).
5

Behavioral Round (60 min)

STAR-D format. X's 2023+ culture explicitly rewards lean ownership, fast shipping, and skepticism of over-engineering. Stories about killing a project, deprecating a system, or shipping a 1-week version of a 6-week proposal land especially well. Decision postmortem essential.

Twitter (X) Data Engineer Compensation (2026)

Total comp from levels.fyi and verified offers. Note: post-2023 stock comp at X is non-public-market private equity; valuation is contested. US-based.

Level | Title | Range | Notes
IC2 | Data Engineer | $160K - $230K | 2-4 years exp. Lean ownership, multiple pipelines.
IC3 | Senior Data Engineer | $220K - $340K | Most common hiring level. Cross-team systems.
IC4 | Staff Data Engineer | $310K - $470K | Sets technical direction for a domain.
IC5 | Senior Staff Data Engineer | $400K - $580K | Multi-org technical leadership.

Twitter (X) Data Engineering Tech Stack

Languages

Scala (heavy, legacy), Java, Python, C++ for some serving paths

Processing

Apache Spark, Apache Heron (Twitter's in-house Storm replacement), Apache Beam

Storage

Manhattan (in-house key-value store), Apache Druid, Apache Iceberg, S3, Hadoop legacy

Streaming

Apache Kafka, Heron for stream processing, EventBus (in-house pub-sub)

Query Engines

Presto/Trino, Apache Druid for real-time, BigQuery for some analytics post-GCP migration

Orchestration

Apache Airflow, in-house workflow tooling for legacy

ML Platform

DeepBird (in-house ML platform), TensorFlow, custom serving infra for ranking models

Graph

FlockDB (in-house graph store, legacy but still in production for some serving paths), GraphJet for in-memory recommendations

15 Real Twitter (X) Data Engineer Interview Questions With Worked Answers

Questions reported by candidates in 2024-2026 loops, paraphrased and de-identified. Each answer covers the approach, the gotcha, and the typical follow-up.

SQL · L4

Find users with mutual follows (mutual following)

Self-join follows table on follower-followee inversion. SELECT a.follower, a.followee FROM follows a JOIN follows b ON a.follower = b.followee AND a.followee = b.follower WHERE a.follower < a.followee (dedupe pairs). Discuss why the inequality matters for pair deduplication.
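The pair-dedup logic can be sanity-checked in miniature; a runnable sketch using SQLite with a toy `follows` table (data is illustrative):

```python
import sqlite3

# Toy follows graph: (1,2)/(2,1) and (3,4)/(4,3) are mutual; (1,3) is one-way.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (follower INTEGER, followee INTEGER)")
conn.executemany("INSERT INTO follows VALUES (?, ?)",
                 [(1, 2), (2, 1), (1, 3), (3, 4), (4, 3)])

# Self-join on the inverted edge; the inequality keeps each pair exactly once.
rows = conn.execute("""
    SELECT a.follower, a.followee
    FROM follows a
    JOIN follows b
      ON a.follower = b.followee AND a.followee = b.follower
    WHERE a.follower < a.followee
""").fetchall()
print(sorted(rows))  # [(1, 2), (3, 4)]
```

Without the inequality, every mutual pair would appear twice, once in each direction.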
SQL · L4

Compute engagement rate per tweet bucketed by follower count

JOIN tweets to user_followers_at_post_ts. Bucket follower count (0-100, 100-1K, 1K-10K, 10K-100K, 100K+). Per bucket, compute avg(impressions / followers) and avg(likes / impressions). Discuss why follower count at post-time matters more than current follower count.
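The bucketing step can be sketched as a small helper; the boundaries mirror the buckets above, and the function name is illustrative:

```python
# (floor, label) pairs for the follower-count buckets described above.
BUCKETS = [(0, "0-100"), (100, "100-1K"), (1_000, "1K-10K"),
           (10_000, "10K-100K"), (100_000, "100K+")]

def follower_bucket(followers_at_post: int) -> str:
    """Bucket by follower count captured at post time, not current count."""
    label = BUCKETS[0][1]
    for floor, name in BUCKETS:
        if followers_at_post >= floor:
            label = name
    return label

print(follower_bucket(2_500))    # '1K-10K'
print(follower_bucket(150_000))  # '100K+'
```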
SQL · L5

Compute second-degree connections (friends-of-friends)

Self-join follows twice. SELECT DISTINCT a.follower, b.followee FROM follows a JOIN follows b ON a.followee = b.follower WHERE a.follower != b.followee AND NOT EXISTS (SELECT 1 FROM follows f WHERE f.follower = a.follower AND f.followee = b.followee). Discuss why this won't scale to billions of edges; mention a precomputed neighbor table or graph database for production.
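A runnable SQLite sketch of the friends-of-friends query, with the NOT EXISTS clause excluding users already followed directly (toy data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (follower INTEGER, followee INTEGER)")
# 1 follows 2; 2 follows 3 and 4; 1 already follows 4 directly.
conn.executemany("INSERT INTO follows VALUES (?, ?)",
                 [(1, 2), (2, 3), (2, 4), (1, 4)])

rows = conn.execute("""
    SELECT DISTINCT a.follower, b.followee
    FROM follows a
    JOIN follows b ON a.followee = b.follower
    WHERE a.follower != b.followee
      AND NOT EXISTS (SELECT 1 FROM follows f
                      WHERE f.follower = a.follower
                        AND f.followee = b.followee)
""").fetchall()
print(sorted(rows))  # [(1, 3)] -- user 4 is excluded: already followed directly
```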
SQL · L5

Find tweets going viral (engagement velocity rising fast)

Per tweet per hour: engagements_this_hour vs engagements_prior_hour. CASE WHEN current > prior * 5 AND prior >= 100 THEN 1 ELSE 0 END AS is_viral. Discuss the threshold tuning trade-off (low threshold = noisy, high threshold = misses early viral tweets).
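A minimal Python version of the velocity flag, assuming hourly engagement counts are already aggregated per tweet (the function name and thresholds are illustrative):

```python
def viral_flags(hourly: dict[str, list[int]], mult: int = 5, floor: int = 100):
    """Flag hours where engagement jumps more than `mult`x over the prior hour.

    hourly maps tweet_id -> engagement counts per hour, oldest first.
    The `floor` guard keeps tiny baselines (10 -> 60) from flagging as viral.
    """
    flags = {}
    for tweet_id, counts in hourly.items():
        flags[tweet_id] = [
            prior >= floor and cur > prior * mult
            for prior, cur in zip(counts, counts[1:])
        ]
    return flags

flags = viral_flags({"t1": [120, 700, 800], "t2": [10, 90, 80]})
print(flags)  # {'t1': [True, False], 't2': [False, False]}
```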
SQL · L5

Detect coordinated inauthentic behavior (CIB)

Find user pairs with suspicious behavioral correlation: posting the same content within 1 minute of each other more than N times in 24 hours. Self-join tweets on near-identical content_hash, abs(ts_diff) < 60 sec, GROUP BY user pair, HAVING COUNT > threshold. Discuss the follow-up signal layer: account age, IP clustering, device fingerprint similarity.
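A sketch of the pairing logic in Python, assuming tweets arrive as (user_id, content_hash, ts) tuples; the function name and thresholds are illustrative:

```python
from collections import Counter
from itertools import combinations

def coordinated_pairs(tweets, window_sec=60, threshold=3):
    """Flag user pairs posting identical content within window_sec of each
    other more than `threshold` times.

    tweets: list of (user_id, content_hash, ts_epoch_sec).
    """
    by_hash = {}
    for user, chash, ts in tweets:
        by_hash.setdefault(chash, []).append((user, ts))
    pair_counts = Counter()
    for posts in by_hash.values():
        for (u1, t1), (u2, t2) in combinations(posts, 2):
            if u1 != u2 and abs(t1 - t2) < window_sec:
                pair_counts[tuple(sorted((u1, u2)))] += 1
    return {pair for pair, n in pair_counts.items() if n > threshold}

# Accounts a and b post the same four hashes seconds apart; c is organic.
tweets = [(u, f"h{i}", i * 1000 + off)
          for i in range(4) for u, off in [("a", 0), ("b", 10)]]
tweets.append(("c", "h0", 50_000))
print(coordinated_pairs(tweets))  # {('a', 'b')}
```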
Python · L4

Implement a sliding-window rate limiter for tweet posting

Per user, maintain a deque of recent post timestamps. On new post: trim deque to last 24 hours, if len > daily_limit reject, else append and accept. Discuss why deque (O(1) push/pop) vs list. Mention that production uses Redis sorted sets for distributed rate limiting.
from collections import deque
from datetime import datetime, timedelta

class TweetRateLimiter:
    def __init__(self, daily_limit: int = 2400):
        self.daily_limit = daily_limit
        self.window = timedelta(days=1)
        self.user_posts: dict[str, deque[datetime]] = {}

    def can_post(self, user_id: str, now: datetime) -> bool:
        posts = self.user_posts.setdefault(user_id, deque())
        cutoff = now - self.window
        # Evict timestamps older than the 24h window; amortized O(1) per post.
        while posts and posts[0] < cutoff:
            posts.popleft()
        if len(posts) >= self.daily_limit:
            return False
        posts.append(now)
        return True
Python · L4

Compute trending hashtags via count-min sketch

Stream of tweets, output top-K hashtags per minute. Naive hash-map grows unbounded. Count-min sketch: fixed-size 2D array, hash each hashtag with K hash functions, increment cell, take min on read. Discuss space-vs-accuracy trade-off. Mention HyperLogLog for unique-tagger counting.
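A minimal count-min sketch, using Python's built-in `hash` seeded by row index as the K hash functions (fine for a demo; production would use independent seeded hashes):

```python
class CountMinSketch:
    """Fixed-memory approximate counter: overestimates, never underestimates."""

    def __init__(self, width: int = 1024, depth: int = 4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row: int, item: str) -> int:
        # hash() is salted per process but stable within a run: fine for a demo.
        return hash((row, item)) % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item: str) -> int:
        # Taking the min across rows bounds the collision-inflated counts.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))

cms = CountMinSketch()
cms.add("#ai", 1000)
cms.add("#nba", 5)
print(cms.estimate("#ai"))  # >= 1000 (exact unless a collision inflates it)
```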
Python · L5

Implement timeline ranking inference at p99 < 100ms

Per user request: fetch candidate tweets (from cache), pull features per (user, tweet) from feature store, batch inference call to ML model, return top-K. Discuss the budget: 30ms candidate fetch, 30ms feature lookup, 30ms inference, 10ms slack. Mention the cold-start fallback (popular tweets when user has no history).
Python · L5

Bloom filter for already-shown-tweet detection

Per user, maintain a bloom filter of tweets shown in last 24 hours. On candidate generation, filter out positives from bloom. Discuss false-positive rate (1% acceptable; user occasionally misses a fresh tweet) vs false-negative rate (0% required; user never sees same tweet twice). Trade-off: bloom size vs FP rate.
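A minimal bloom filter sketch (the sizes and key format are illustrative); note the no-false-negatives property the answer relies on:

```python
import hashlib

class BloomFilter:
    """No false negatives; false-positive rate tuned via bit count m and hashes k."""

    def __init__(self, m: int = 1 << 16, k: int = 4):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, item: str):
        # Derive k positions from salted SHA-256 digests of the item.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

shown = BloomFilter()
shown.add("tweet:123")
print("tweet:123" in shown)  # True, always (no false negatives)
print("tweet:999" in shown)  # almost certainly False at this fill level
```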
System Design · L5

Design the timeline generation pipeline at 600M user scale

Two patterns. Fanout-on-write: when user A tweets, push the tweet_id into the inbox of every follower. Pro: read is O(1). Con: write amplification (Taylor Swift posts: 100M writes). Fanout-on-read: on user's timeline request, query tweets-from-follows from a cache. Pro: write is O(1). Con: read is expensive, cache key explosion. Hybrid: fanout-on-write for normal users, fanout-on-read for celebrities (defined as >1M followers). Discuss the threshold tuning, the eventual-consistency window, and the role of pre-aggregated “Home Latest” vs ML-ranked “Home For You” pipelines.
Tweet write
   -> Kafka (tweets topic, key=user_id)
   -> Heron stream job:
        if author.follower_count < 1M (normal user):
            FANOUT-ON-WRITE: push tweet_id to inbox of every follower
            (in Manhattan / Redis, sharded by follower_id)
        else (celebrity):
            FANOUT-ON-READ: tweet stays in author's outbox only
            on follower's timeline request, fetch celebrity outboxes
            (limited set, can be cached)

Timeline request
   -> Service merges:
       (a) follower's inbox (fanout-on-write portion)
       (b) celebrity outboxes for follows >1M (fanout-on-read portion)
   -> ML ranker scores combined candidate set
   -> top-K returned

Failure modes:
1. Heron worker crash: Kafka redelivers, idempotent inbox writes.
2. Celebrity inbox-fanout drift: rare; daily reconciliation job.
3. Cache eviction during traffic spike: degrade to direct Manhattan
   read with higher latency, no functional break.
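The read-path merge in the diagram above can be sketched in a few lines; the (ts, tweet_id) source shape is an assumption for illustration:

```python
import heapq

def build_timeline(inbox, celeb_outboxes, k=10):
    """Merge the fanout-on-write inbox with fanout-on-read celebrity outboxes.

    inbox: list of (ts, tweet_id) pushed at write time.
    celeb_outboxes: list of per-celebrity lists of (ts, tweet_id), fetched at
    read time. Returns the k most recent tweet_ids across all sources.
    """
    merged = heapq.nlargest(
        k, (t for src in [inbox, *celeb_outboxes] for t in src))
    return [tweet_id for _, tweet_id in merged]

timeline = build_timeline([(5, "a"), (1, "b")], [[(9, "c"), (2, "d")]], k=3)
print(timeline)  # ['c', 'a', 'd']
```

In the real system this candidate set would then go to the ML ranker rather than being returned in timestamp order.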
System Design · L5

Design the trending-topics computation

Tweets -> Kafka -> Heron stream (extract hashtags and entities) -> sliding window count per hashtag per geo over 30-min window -> rank by velocity (current rate vs trailing baseline) -> emit top-K to Druid. Cover: spam suppression (exclude hashtags from accounts < 30 days old), per-geo localization, the “always trending” problem (popular topics like #Apple always rank high on raw counts but are not newly trending).
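A toy version of the velocity-ranking step; the smoothing constant (an illustrative choice) damps low-baseline noise and keeps perennially popular tags from ranking on volume alone:

```python
def top_trending(counts_now, counts_baseline, k=3, smoothing=10.0):
    """Rank hashtags by velocity (current vs trailing baseline), not raw count."""
    scores = {tag: counts_now.get(tag, 0) / (counts_baseline.get(tag, 0) + smoothing)
              for tag in counts_now}
    return sorted(scores, key=scores.get, reverse=True)[:k]

now = {"#apple": 5000, "#outage": 900, "#lunch": 40}
base = {"#apple": 4800, "#outage": 20, "#lunch": 35}
# #apple has the highest raw count but barely moved vs its baseline.
print(top_trending(now, base, k=2))  # ['#outage', '#apple']
```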
System Design · L5

Design the spam and abuse detection pipeline

Tweet stream -> Heron (multi-stage classifier: rule-based filters first, then ML scoring) -> queue for human review or auto-action based on score. Cover: feature pipeline for the ML classifier (account age, posting velocity, content embedding similarity to known spam), label feedback loop (human reviewer decisions flow back as training data), audit log for appeals.
Modeling · L5

Design the schema for follows, mentions, replies, retweets

Four edge tables in a graph: follows (follower_id, followee_id, ts), mentions (tweet_id, mentioned_user_id, mention_position), replies (reply_tweet_id, parent_tweet_id), retweets (retweet_id, original_tweet_id, retweeter_id). Plus fact_tweet (tweet_id, author_id, content, ts) and dim_user. Discuss: why edges are separate tables vs a single edges table with edge_type column (separate is faster to query for type-specific traversal).
Modeling · L5

Model the tweet engagement rollup for analytics

fact_tweet_engagement_hourly: one row per (tweet_id, hour_bucket) with impressions, likes, retweets, replies, bookmarks, profile_visits. Aggregated from raw event log by hourly Spark job. Discuss: 7-day rolling rollup for recent analytics, monthly rollup for older. Trade-off: storage cost vs query latency. Most queries hit the hourly rollup; raw is kept 30 days for ad-hoc.
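A miniature version of the hourly aggregation the Spark job performs, in plain Python (the event tuple shape is an assumption):

```python
from collections import Counter

def hourly_rollup(events):
    """Collapse raw engagement events into one row per (tweet_id, hour_bucket).

    events: iterable of (tweet_id, ts_epoch_sec, action) tuples, where action
    is e.g. 'like', 'retweet', 'impression'.
    Returns {(tweet_id, hour_bucket): Counter({action: count})}.
    """
    rollup = {}
    for tweet_id, ts, action in events:
        key = (tweet_id, ts - ts % 3600)  # truncate timestamp to the hour
        rollup.setdefault(key, Counter())[action] += 1
    return rollup

rows = hourly_rollup([
    ("t1", 7200, "like"), ("t1", 7300, "like"), ("t1", 7200, "impression"),
    ("t1", 10900, "like"),
])
print(rows[("t1", 7200)])  # Counter({'like': 2, 'impression': 1})
```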
Behavioral · L5

Tell me about a time you killed a project or deprecated a system

X's 2023+ culture explicitly rewards eliminating work. Story should cover: the project or system you killed, why it was the right call (cost vs benefit), how you handled the affected stakeholders, the outcome 6 months later. Decision postmortem essential. Stories where you proactively proposed the deprecation land better than stories where it was assigned.

What Makes Twitter (X) Data Engineer Interviews Different

Lean ownership is the new normal

Post-2023 X teams are dramatically smaller than pre-Musk Twitter. A single data engineer often owns multiple pipelines that historically had dedicated teams. Frame stories about broader scope ownership, fewer specialists, more generalist judgment.

Real-time at planet scale is the central challenge

X serves a real-time global feed. Every system design answer should acknowledge global scale (600M users), real-time SLAs (sub-second timeline updates), and the cost-vs-latency trade-offs that follow. Vague answers about throughput don't survive.

Cost consciousness is graded explicitly

Post-2023 X interviewers ask about cost optimization unprompted. Mentioning storage class transitions, compute right-sizing, or caching strategies that reduce egress is a positive signal. Over-engineered answers that ignore cost are a downgrade signal.

Skepticism of over-engineering is a culture trait

X's post-2023 culture rewards questioning whether a system needs to exist. Stories about killing a project, deprecating a service, or shipping a simpler version of an elaborate proposal score well. Stories about elaborate engineering excellence in service of an unclear business need score badly.

How Twitter (X) Connects to the Rest of Your Prep

X overlaps with Pinterest data engineering interview prep on the social graph data modeling and recommendation pipeline patterns, and with Netflix data engineering interview prep on real-time streaming pipelines.

If you're targeting a streaming-heavy team, also see the Kafka and Flink interview prep guide. Drill the rounds in system design framework for data engineers for the timeline generation patterns and behavioral interview prep for Data Engineer for the kill-a-project stories X interviewers reward.

Data Engineer Interview Prep FAQ

How long does the Twitter (X) Data Engineer interview take?
3 to 4 weeks from recruiter screen to offer. Post-2023 the process moves faster than pre-Musk Twitter.
Is X remote-friendly?
Mostly no, post-2023. Roles typically require in-office attendance in San Francisco or one of a small number of approved locations. Confirm with the recruiter.
What level should I target?
IC3 (Senior) is the most common external hire. IC4+ usually internal promotion.
Does X test algorithms / LeetCode?
More than most data engineering loops. Focus on data structures (bloom filters, count-min sketch, LRU caches), not full DSA grinds.
How do I prepare for the lean-ownership culture?
Read the post-2023 X engineering blog posts and Musk-era public communications. Understand the explicit cultural shift from elaborate engineering to lean ownership. Have stories ready that fit this lens.
What languages can I use?
Python and SQL universally. Scala for legacy infra teams. C++ for some serving-path teams.
How is the X stock comp valued?
Post-2023 X is private with disputed valuation. Stock comp is non-trivial but illiquid. Verified levels.fyi data treats it conservatively. Negotiate with this in mind.
How does X compare to Meta or Google for data engineering?
Smaller teams, faster shipping, less platform tooling, more real-time at scale. Better fit for engineers who like broad scope; worse for engineers who want deep specialization.

Practice Real-Time Pipelines at Scale

Drill timeline generation, social graph traversal, and the lean-architecture patterns that win the X data engineer loop.


More Data Engineer Interview Prep Guides

Continue your prep

Data Engineer Interview Prep, explore the full guide

50+ guides covering every round, company, role, and technology in the data engineer interview loop. Grounded in 2,817 verified interview reports across 929 companies, collected from real candidates.
