Company Interview Guide

Uber Data Engineer Interview

Uber processes millions of trips and deliveries daily across hundreds of cities, generating massive volumes of real-time geospatial and transactional data. Its data engineer (DE) interviews test streaming architecture, geospatial reasoning, and the ability to build low-latency systems that hold up under constant load. Here is what each round covers.

Uber DE Interview Process

Three stages from first contact to offer.

1. Recruiter Screen (30 min)

Initial call covering your experience and interest in Uber. The recruiter assesses your background with real-time data systems, large-scale infrastructure, and streaming architectures. Uber operates a massive real-time platform processing millions of rides and deliveries daily, so they look for candidates comfortable with event-driven systems and low-latency requirements.

* Emphasize real-time experience: streaming pipelines, Kafka, Flink, or similar tools
* Uber has open-sourced many data tools (Hudi, AresDB, Cadence); mentioning familiarity shows research
* Ask which team: Marketplace, Maps, Safety, or Data Platform each have different focuses

2. Technical Phone Screen (60 min)

One to two coding problems, typically SQL or Python. Uber phone screens test data manipulation with ride and delivery event data. Expect questions about time-series analysis, geospatial logic, and event processing. The interviewer evaluates both correctness and your ability to reason about scale.

* Be comfortable with geospatial concepts: latitude/longitude distance calculations, geohashing
* Practice time-series SQL: sessionization, gap detection, and event ordering
* Think aloud about how your solution scales to millions of events per minute
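
As one illustration of the sessionization pattern named in the tips above, here is a minimal Python sketch. The 30-minute inactivity gap and the function name `sessionize` are assumptions chosen for the example, not anything Uber specifies.

```python
from datetime import datetime, timedelta

def sessionize(timestamps, gap=timedelta(minutes=30)):
    """Group timestamps into sessions, starting a new one after a long gap."""
    sessions = []
    for ts in sorted(timestamps):
        # Extend the current session if this event is close enough to the last one.
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions

# Hypothetical app-open events for one rider.
events = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),
    datetime(2024, 1, 1, 10, 30),
    datetime(2024, 1, 1, 10, 45),
]
sessions = sessionize(events)
```

The same idea in SQL usually uses LAG to compute the gap to the previous event and a running SUM over a "new session" flag.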

3. Onsite Loop (4 to 5 hours)

Four to five rounds covering system design, SQL deep dive, coding, data modeling, and behavioral. System design at Uber focuses on real-time architectures: surge pricing computation, ETA prediction pipelines, and marketplace matching. The data modeling round often involves designing schemas for trip data that support both real-time operations and historical analytics.

* Know the CAP theorem and be ready to explain where a real-time system should favor availability over consistency (or the reverse) during a network partition
* Uber's system design questions involve geographic partitioning and time-sensitive data
* Behavioral questions focus on working under pressure and adapting to rapidly changing requirements

8 Example Questions with Guidance

Real question types from each round. The guidance shows what the interviewer looks for.

SQL

Calculate the average wait time between ride request and driver acceptance, segmented by city and hour of day.

Join ride_requests to ride_acceptances on ride_id. Compute wait_time = acceptance_ts - request_ts. AVG grouped by city and EXTRACT(HOUR FROM request_ts). Discuss handling rides that were never accepted.
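
The guidance above can be sketched as a runnable query. This version uses Python's built-in sqlite3 module so it can be run anywhere; SQLite has no EXTRACT, so `strftime('%H', ...)` stands in for `EXTRACT(HOUR FROM ...)`. The column layout and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ride_requests (ride_id INTEGER PRIMARY KEY, city TEXT, request_ts TEXT);
CREATE TABLE ride_acceptances (ride_id INTEGER, acceptance_ts TEXT);
INSERT INTO ride_requests VALUES
  (1, 'SF',  '2024-01-01 08:00:00'),
  (2, 'SF',  '2024-01-01 08:30:00'),
  (3, 'SF',  '2024-01-01 09:05:00'),  -- never accepted
  (4, 'NYC', '2024-01-01 08:10:00');
INSERT INTO ride_acceptances VALUES
  (1, '2024-01-01 08:01:00'),
  (2, '2024-01-01 08:33:00'),
  (4, '2024-01-01 08:10:30');
""")

query = """
SELECT r.city,
       CAST(strftime('%H', r.request_ts) AS INTEGER) AS hour_of_day,
       AVG(CAST(strftime('%s', a.acceptance_ts) AS INTEGER)
         - CAST(strftime('%s', r.request_ts) AS INTEGER)) AS avg_wait_seconds,
       COUNT(*) AS accepted_rides
FROM ride_requests r
JOIN ride_acceptances a ON a.ride_id = r.ride_id  -- inner join drops unaccepted rides
GROUP BY r.city, hour_of_day
ORDER BY r.city, hour_of_day
"""
rows = conn.execute(query).fetchall()
```

The inner join silently excludes ride 3, which was never accepted. In an interview it is worth saying so out loud and offering a LEFT JOIN with a separate acceptance-rate column as the alternative.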

SQL

Find drivers who completed more than 20 trips in a single day but had an average rating below 4.0 on those trips.

Aggregate trips by driver and date. HAVING COUNT(*) > 20 AND AVG(rating) < 4.0 (strictly greater than 20, to match "more than 20 trips"). Discuss whether to include trips with no rating, and what this pattern might indicate about driver fatigue.
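
A minimal runnable sketch of this aggregation, again using sqlite3 from the Python standard library. The `trips` schema and the generated sample data are assumptions for illustration. Note that AVG ignores NULL ratings while COUNT(*) still counts those trips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (trip_id INTEGER PRIMARY KEY, driver_id INTEGER, "
    "completed_ts TEXT, rating REAL)"
)

def day_of_trips(driver_id, n_trips, rating, unrated=0):
    """Build n_trips hourly trips on 2024-01-01; the first `unrated` have no rating."""
    return [
        (driver_id, f"2024-01-01 {hour:02d}:00:00", None if hour < unrated else rating)
        for hour in range(n_trips)
    ]

sample = (
    day_of_trips(1, 21, 3.5, unrated=1)  # 21 trips, low rating: should match
    + day_of_trips(2, 21, 4.8)           # 21 trips, high rating: excluded
    + day_of_trips(3, 20, 3.0)           # exactly 20 trips: not "more than 20"
)
conn.executemany(
    "INSERT INTO trips (driver_id, completed_ts, rating) VALUES (?, ?, ?)", sample
)

query = """
SELECT driver_id,
       DATE(completed_ts) AS trip_date,
       COUNT(*)           AS trips,
       AVG(rating)        AS avg_rating   -- NULL ratings are skipped by AVG
FROM trips
GROUP BY driver_id, trip_date
HAVING COUNT(*) > 20 AND AVG(rating) < 4.0
"""
flagged = conn.execute(query).fetchall()
```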

SQL

Identify surge pricing periods: find continuous time windows where the surge multiplier exceeded 2.0 for more than 15 minutes in a given zone.

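Use the gaps-and-islands technique: within each zone, subtract a ROW_NUMBER computed separately for rows above and below the threshold from a ROW_NUMBER over all rows, so that each run of consecutive rows above 2.0 shares one group key. Then keep the groups that span more than 15 minutes. Discuss event granularity and how to handle missing data points.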
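
Here is one way the gaps-and-islands approach could look, run through sqlite3 (window functions need SQLite 3.25 or newer). The `surge_readings` table, the one-reading-per-minute cadence, and the sample values are assumptions for illustration. This sketch also assumes no readings are missing; a dropped reading would not split an island here, which is exactly the kind of caveat worth raising.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surge_readings (zone_id INTEGER, ts TEXT, multiplier REAL)")

def multiplier_at(minute):
    if 5 <= minute < 25:    # 20 readings above 2.0, spanning 19 minutes
        return 2.5
    if 30 <= minute < 40:   # 10 readings above 2.0, spanning 9 minutes
        return 2.2
    return 1.5

start = datetime(2024, 1, 1, 10, 0)
readings = [
    (1, (start + timedelta(minutes=m)).strftime("%Y-%m-%d %H:%M:%S"), multiplier_at(m))
    for m in range(40)
]
conn.executemany("INSERT INTO surge_readings VALUES (?, ?, ?)", readings)

query = """
WITH numbered AS (
  SELECT zone_id, ts, multiplier,
         ROW_NUMBER() OVER (PARTITION BY zone_id ORDER BY ts)
       - ROW_NUMBER() OVER (PARTITION BY zone_id, multiplier > 2.0 ORDER BY ts)
         AS island_id   -- constant within each consecutive run
  FROM surge_readings
)
SELECT zone_id, MIN(ts) AS start_ts, MAX(ts) AS end_ts
FROM numbered
WHERE multiplier > 2.0
GROUP BY zone_id, island_id
HAVING CAST(strftime('%s', MAX(ts)) AS INTEGER)
     - CAST(strftime('%s', MIN(ts)) AS INTEGER) > 15 * 60
"""
surge_periods = conn.execute(query).fetchall()
```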

Python

Write a function that takes a stream of GPS coordinates and detects when a driver has been stationary for more than 5 minutes.

Track last-moved timestamp. If distance between consecutive points is below threshold (e.g. 50 meters) and elapsed time exceeds 5 minutes, flag as stationary. Discuss Haversine distance and GPS drift.
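
A hedged sketch of one possible answer follows. It deviates slightly from the consecutive-point comparison described above: it measures distance from an anchor point (where the driver last moved) so that slow GPS drift of a few meters per reading does not add up unnoticed. The function names, the tuple layout `(ts_seconds, lat, lon)`, and the thresholds are assumptions for the example.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6_371_000  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def detect_stationary(points, radius_m=50.0, min_duration_s=300.0):
    """Yield (start_ts, detected_ts) once per stationary episode.

    points: iterable of (ts_seconds, lat, lon) in time order.
    """
    anchor = None
    flagged = False
    for ts, lat, lon in points:
        if anchor is None or haversine_m(anchor[1], anchor[2], lat, lon) > radius_m:
            # The driver moved: restart the clock from this point.
            anchor = (ts, lat, lon)
            flagged = False
        elif not flagged and ts - anchor[0] > min_duration_s:
            flagged = True
            yield anchor[0], ts

# One reading per minute at the same spot, then a jump of about 1.1 km.
points = [(t, 37.7749, -122.4194) for t in range(0, 421, 60)]
points.append((480, 37.7849, -122.4194))
episodes = list(detect_stationary(points))
```

Because the function is a generator holding only the anchor and a flag, it works on an unbounded stream with constant memory per driver.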

System Design

Design a real-time surge pricing computation pipeline.

Ingest ride requests and driver availability via Kafka. Flink computes supply/demand ratio per geospatial zone in tumbling windows. Serve from a low-latency key-value store. Discuss zone granularity (H3 hexagons), smoothing to avoid price oscillation, and fallback when streaming lags.
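
The windowing and smoothing ideas above can be illustrated without any streaming framework. The following is a plain-Python stand-in for what a Flink job would compute, not Flink code. The multiplier formula (demand/supply ratio clamped to a range) and the exponential smoothing factor are hypothetical choices for the sketch.

```python
from collections import defaultdict

def surge_by_window(events, window_s=60, cap=3.0):
    """Compute a surge multiplier per (window_start, zone) in tumbling windows.

    events: iterable of (ts_seconds, zone_id, kind), where kind is
    'request' or 'driver_available'.
    """
    counts = defaultdict(lambda: {"request": 0, "driver_available": 0})
    for ts, zone, kind in events:
        window_start = ts - ts % window_s  # tumbling windows do not overlap
        counts[(window_start, zone)][kind] += 1
    result = {}
    for key, c in counts.items():
        ratio = c["request"] / max(c["driver_available"], 1)  # avoid divide-by-zero
        result[key] = min(max(1.0, ratio), cap)  # hypothetical formula: clamp to [1, cap]
    return result

def smooth(previous, current, alpha=0.3):
    """Exponential smoothing so the served price does not oscillate."""
    return alpha * current + (1 - alpha) * previous

events = [
    (0, "z1", "request"), (10, "z1", "request"),
    (20, "z1", "request"), (30, "z1", "request"),
    (15, "z1", "driver_available"), (45, "z1", "driver_available"),
    (70, "z1", "request"),
    (80, "z1", "driver_available"), (90, "z1", "driver_available"),
]
multipliers = surge_by_window(events)
served = smooth(previous=1.0, current=multipliers[(0, "z1")])
```

In a real pipeline the zone key would typically be a spatial cell ID, and windows would be assigned by event time with a watermark rather than by arrival order.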

System Design

Design Uber's trip data pipeline that serves both real-time operations and historical analytics.

Lambda or Kappa architecture: Kafka for real-time, Spark for batch reprocessing, Hudi for incremental updates to the lake. Discuss exactly-once semantics, late-arriving events from mobile clients, and partition strategy (by city and date).
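
To make the late-arrival and exactly-once discussion concrete, here is a small sketch of an idempotent upsert keyed by trip ID and partitioned by city and event date. It is a plain-Python stand-in for the merge behavior a lake table format provides, not the Hudi API. The field names (`trip_id`, `event_ts`, `updated_ts`) are assumptions for the example.

```python
from datetime import datetime, timezone

def partition_key(city, event_ts):
    """Partition by city and by the date of the event, not the arrival date."""
    day = datetime.fromtimestamp(event_ts, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"city={city}/date={day}"

def upsert(table, event):
    """Merge an event into the table; replays and stale updates are no-ops."""
    partition = table.setdefault(partition_key(event["city"], event["event_ts"]), {})
    current = partition.get(event["trip_id"])
    if current is None or event["updated_ts"] > current["updated_ts"]:
        partition[event["trip_id"]] = event

requested = {"trip_id": "t1", "city": "sf", "event_ts": 1704110400,
             "updated_ts": 1, "status": "requested"}
completed = {"trip_id": "t1", "city": "sf", "event_ts": 1704110400,
             "updated_ts": 2, "status": "completed"}

table = {}
# Out-of-order and duplicated delivery, as a mobile client on a poor network might produce.
for event in (completed, requested, completed):
    upsert(table, event)
```

Because the merge is keyed and versioned, at-least-once delivery upstream still yields an exactly-once result in the table.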

Data Modeling

Model trip data to support marketplace analytics: supply/demand balance, driver utilization, and rider conversion funnels.

Fact: trips (request_ts, accept_ts, pickup_ts, dropoff_ts, fare, surge_multiplier, zone_id). Dimensions: zones (with H3 hierarchy), drivers, riders. Discuss pre-aggregating zone-level metrics hourly and the difference between completed trips and requested trips.
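
The model above could be sketched as DDL plus an hourly roll-up. The example runs on sqlite3 for portability; the table names `fact_trips` and `dim_zones`, the `parent_zone_id` column standing in for the H3 hierarchy, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_zones (
  zone_id        TEXT PRIMARY KEY,
  parent_zone_id TEXT,             -- coarser cell in the zone hierarchy
  city           TEXT
);
CREATE TABLE fact_trips (
  trip_id          INTEGER PRIMARY KEY,
  rider_id         INTEGER NOT NULL,
  driver_id        INTEGER,        -- NULL until a driver accepts
  zone_id          TEXT REFERENCES dim_zones(zone_id),
  request_ts       TEXT NOT NULL,
  accept_ts        TEXT,
  pickup_ts        TEXT,
  dropoff_ts       TEXT,           -- NULL for trips that never completed
  fare             REAL,
  surge_multiplier REAL
);
INSERT INTO dim_zones VALUES ('z1', 'p1', 'SF');
INSERT INTO fact_trips VALUES
  (1, 10, 100, 'z1', '2024-01-01 08:00:00', '2024-01-01 08:01:00',
   '2024-01-01 08:05:00', '2024-01-01 08:20:00', 14.5, 1.0),
  (2, 11, 101, 'z1', '2024-01-01 08:10:00', '2024-01-01 08:12:00',
   '2024-01-01 08:15:00', '2024-01-01 08:40:00', 22.0, 1.5),
  (3, 12, NULL, 'z1', '2024-01-01 08:30:00', NULL, NULL, NULL, NULL, 1.5);
""")

hourly = conn.execute("""
SELECT zone_id,
       strftime('%Y-%m-%d %H:00:00', request_ts) AS hour_start,
       COUNT(*)          AS requested_trips,
       COUNT(dropoff_ts) AS completed_trips   -- COUNT(col) skips NULLs
FROM fact_trips
GROUP BY zone_id, hour_start
""").fetchall()
```

Keeping unfulfilled requests in the fact table (with NULL lifecycle timestamps) is what makes the requested-versus-completed comparison and the conversion funnel possible.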

Behavioral

Tell me about a time you had to make a technical decision quickly under production pressure.

Uber operates 24/7 with real-time financial impact. Describe the incident, the options you considered, what you chose and why, and the outcome. Show you can balance speed with safety.

Uber-Specific Preparation Tips

What makes Uber different from other companies.

Real-time is the default, not the exception

Most Uber DE questions are framed around real-time or near-real-time requirements. Batch processing is secondary. Know Kafka, Flink, and streaming concepts: watermarks, windowing, exactly-once semantics, and backpressure.

Geospatial data is core to Uber's business

Uber partitions data geographically using H3 hexagonal indexing. Understand geohashing, spatial joins, and how to partition and query location-based data efficiently. This comes up in both system design and data modeling.
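
For practice with the geohashing concept mentioned above, here is a minimal sketch of a standard geohash encoder in pure Python. It illustrates geohash, not H3, which uses a different hexagonal scheme and has its own library. The function name `geohash_encode` is chosen for the example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, precision=6):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

cell = geohash_encode(57.64911, 10.40744, precision=5)
```

Nearby points usually share a prefix, so a truncated geohash can serve as a coarse partition key. Points on either side of a cell boundary do not share one, which is why spatial joins typically check neighboring cells too.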

Know Uber's open-source contributions

Uber created Apache Hudi (incremental data processing), AresDB (real-time analytics), and Cadence (workflow orchestration). Mentioning these tools and understanding their purpose shows deep familiarity with Uber's data ecosystem.

Scale is measured in events per second

Uber processes millions of events per second across rides, deliveries, and driver locations. When discussing system design, think in terms of throughput (events/sec), latency (p99 in milliseconds), and geographic distribution across hundreds of cities.

Uber DE Interview FAQ

How many rounds are in an Uber DE interview?

Typically 6 to 7: recruiter screen, technical phone screen, and 4 to 5 onsite rounds covering SQL, system design, coding, data modeling, and behavioral. Some teams add a domain-specific round for marketplace or maps.

Does Uber test Kafka and Flink in DE interviews?

Not always directly, but streaming concepts are central. You should understand event-time vs processing-time, windowing strategies, watermarks, and exactly-once semantics. Uber uses Kafka and Flink heavily, so referencing them in system design is appropriate.

What programming languages does Uber DE use?

Python, Java, Scala, and Go are common. For interviews, Python and SQL are accepted. If you have Java/Scala experience with Spark or Flink, it can be an advantage for streaming-focused roles.

How does Uber's interview compare to other ride-sharing companies?

Uber's interview is more infrastructure-focused than Lyft's, with heavier emphasis on real-time systems, geospatial data, and large-scale distributed processing. The behavioral round focuses on operating under pressure in a fast-moving environment.

Prepare at Uber Interview Difficulty

Uber DE questions emphasize real-time systems and geospatial data. Practice with problems that test streaming logic and scale.