Company Interview Guide

Lyft Data Engineer Interview

Lyft processes 1M+ rides per day across 600+ US cities. Their data pipelines power surge pricing, ETA prediction, driver matching, and financial reconciliation. The Data Engineer interview reflects this: heavy emphasis on geospatial data, marketplace dynamics, and real-time pricing pipelines. Lyft loops typically run 3 to 4 weeks, targeting IC2 through IC5. Pair this with our data engineer interview prep hub.

The Short Answer
Expect a 5-round loop: recruiter screen, technical phone screen (SQL or Python live coding), then a 4-round virtual onsite covering system design, SQL, Python, and a behavioral collaboration round. Lyft's distinctive emphasis: marketplace pricing system design (you will be asked to design a surge engine or ETA pipeline at some point), and SQL questions involving rolling-window aggregations over geospatial cells. The behavioral round leans on conflict stories, given Lyft's matrixed product-engineering structure.
Updated April 2026 · By The DataDriven Team

Lyft Data Engineer Interview Process

5 rounds, 3 to 4 weeks end to end. Mostly virtual with optional onsite for finalists.

1

Recruiter Screen (30 min)

Conversational call about your background and Lyft's current open headcount. Lyft hires across multiple data engineering teams (Marketplace, Pricing, Maps, Driver, Rider, Financial Data Platform), so be prepared to discuss which team interests you. Mention experience with geospatial data, real-time systems, or marketplace dynamics if you have it.
2

Technical Phone Screen (60 min)

Live SQL or Python coding in CoderPad. SQL leans on window functions and rolling aggregations (typical: compute rolling 7-day driver utilization rate per city). Python leans on data manipulation, often with a geospatial twist (parse trip telemetry, group by H3 cell). Strong candidates handle edge cases like NULL coordinates and timezone-shifted timestamps.
3

System Design Round (60 min)

A real Lyft-relevant problem. Common prompts: design the surge pricing pipeline, design ETA prediction infrastructure, design the driver matching event log. Use the 4-step framework. Cover real-time + batch dual-track architecture, exactly-once semantics, and SLA tiering.
4

Live Coding Onsite (60 min)

Second live coding round, usually the language you didn't use in the phone screen. Often includes a follow-up that adds streaming or scale (e.g., 'now this needs to run on 10K events/sec').
5

Behavioral / Collaboration Round (60 min)

STAR-D format. Lyft emphasizes cross-functional collaboration with product managers, data scientists, and operations teams. Expect questions about handling disagreements, prioritizing competing requests, and influencing decisions without authority. The Decision postmortem is graded heavily.

Lyft Data Engineer Compensation (2026)

Total compensation ranges including base, RSUs (4-year vest), and bonus. Sourced from levels.fyi and verified offer reports. US-based roles.

Level | Title | Range | Notes
IC2 | Data Engineer | $180K - $260K | 2-4 years experience. Owns individual pipelines, on-call rotation.
IC3 | Senior Data Engineer | $240K - $370K | Most common hiring level. Owns cross-team systems, drives architecture decisions.
IC4 | Staff Data Engineer | $340K - $500K | Sets technical direction for a domain. Cross-org influence. Rare external hire.
IC5 | Senior Staff Data Engineer | $450K - $650K | Multi-org technical leadership. Almost always internal promotion.

Lyft Data Engineering Tech Stack

Languages

Python, SQL, Scala, Go

Processing

Apache Spark, Apache Flink, Apache Beam

Storage

S3, Hive, Iceberg, Delta Lake

Streaming

Apache Kafka, AWS Kinesis

Query Engines

Presto/Trino, Apache Druid for real-time

Orchestration

Airflow (heavy use), internal scheduling for streaming jobs

Geospatial

H3 hex indexing (Uber-originated, used widely at Lyft), PostGIS for batch geospatial

ML Platform

Custom feature store (LyftLearn), TensorFlow, PyTorch for some models

10 Real Lyft Data Engineer Interview Questions

Questions reported by candidates in 2024-2026 loops, paraphrased and de-identified.

SQL

Compute rolling 7-day driver utilization per city

Driver utilization = active_minutes / online_minutes. Window function with ROWS BETWEEN 6 PRECEDING. Group by city. Edge case: cities with low driver count have noisy averages; volunteer this.
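In the interview this is written as SQL with a window function; a pure-Python sketch of the same logic (field names are illustrative) makes the window semantics concrete. Note that a `deque(maxlen=7)` mirrors `ROWS BETWEEN 6 PRECEDING AND CURRENT ROW`, which, like the SQL, counts the last 7 *rows*, not the last 7 calendar days, so gaps in the data skew the window.

```python
from collections import defaultdict, deque

def rolling_utilization(rows, window=7):
    """rows: one dict per (city, day) with pre-aggregated
    active_minutes and online_minutes. Adds a utilization_7d field
    computed over the trailing `window` rows per city."""
    rows = sorted(rows, key=lambda r: (r["city"], r["day"]))
    history = defaultdict(lambda: deque(maxlen=window))
    out = []
    for r in rows:
        dq = history[r["city"]]
        dq.append((r["active_minutes"], r["online_minutes"]))
        active = sum(a for a, _ in dq)
        online = sum(o for _, o in dq)
        # Guard against divide-by-zero for cities with no online time.
        out.append({**r, "utilization_7d": active / online if online else None})
    return out
```

Volunteering the low-driver-count caveat here is easy: a city with 3 drivers makes this ratio jump around day to day, so a minimum-denominator filter is worth mentioning.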
SQL

Find the top 3 H3 hex cells by trip count for each hour of the day

ROW_NUMBER PARTITION BY hour ORDER BY trip_count DESC, filter rn <= 3. Discuss why DENSE_RANK might be preferred for ties at the boundary.
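A minimal Python equivalent of the ROW_NUMBER pattern (input shape assumed: counts already aggregated per hour and cell) shows why ties matter: the slice below drops tied cells at the k boundary arbitrarily, exactly like ROW_NUMBER, whereas DENSE_RANK would keep all of them.

```python
from collections import defaultdict

def top_cells_per_hour(trip_counts, k=3):
    """trip_counts: dict mapping (hour, h3_cell) -> trip_count.
    Returns {hour: [(cell, count), ...]} with the top-k cells per
    hour, mirroring ROW_NUMBER() OVER (PARTITION BY hour
    ORDER BY trip_count DESC) filtered to rn <= k."""
    by_hour = defaultdict(list)
    for (hour, cell), count in trip_counts.items():
        by_hour[hour].append((cell, count))
    return {
        hour: sorted(cells, key=lambda c: -c[1])[:k]
        for hour, cells in by_hour.items()
    }
```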
SQL

Calculate surge multiplier coverage by city per day

For each city-day, what % of minutes had surge > 1.0? Use a minute-grain table or expand from event log. Discuss the trade-off between accuracy and storage.
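One way to sketch the minute-grain expansion for a single city-day (interval shape is an assumption; a real event log would need timestamp parsing first):

```python
def surge_coverage(events, day_minutes=1440):
    """events: list of (start_minute, end_minute, multiplier)
    intervals for one city-day, expanded from the surge event log.
    Returns the fraction of the day's minutes with multiplier > 1.0.
    A set handles overlapping surge intervals without double-counting."""
    surged = set()
    for start, end, mult in events:
        if mult > 1.0:
            surged.update(range(start, end))
    return len(surged) / day_minutes
```

The storage trade-off to raise: materializing a minute-grain table is 1,440 rows per city-day but makes this a trivial GROUP BY; keeping only the event log is compact but pushes the expansion into every query.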
Python

Parse driver telemetry events and detect 30-min idle gaps

Sessionization with 30-min gap. Sort by driver_id and ts. Walk events, increment session_id when gap > threshold. State assumption: events with same ts are co-located.
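The sessionization walk described above, as a sketch (event shape assumed to be `(driver_id, ts)` with epoch-second timestamps):

```python
def sessionize(events, gap_minutes=30):
    """events: list of (driver_id, ts) tuples, ts in epoch seconds.
    Assigns a per-driver session_id, incrementing whenever the gap
    between consecutive events exceeds gap_minutes."""
    events = sorted(events)  # sorts by driver_id, then ts
    out = []
    last_driver, last_ts, session_id = None, None, 0
    for driver, ts in events:
        if driver != last_driver:
            session_id = 0          # new driver: reset the counter
        elif ts - last_ts > gap_minutes * 60:
            session_id += 1         # idle gap: start a new session
        out.append((driver, ts, session_id))
        last_driver, last_ts = driver, ts
    return out
```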
Python

Match riders to drivers using Haversine distance, batch

Compute Haversine for each (rider, driver) pair within an H3 ring. The naive O(n*m) all-pairs approach doesn't scale; bucket by H3 first, then compute distances only within each bucket and its neighbors. Discuss why this scales.
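A sketch of the bucket-then-match idea. Since the `h3` library is a third-party dependency, a coarse rounded lat/lng grid stands in for H3 cells here; the pruning argument is identical.

```python
import math
from collections import defaultdict

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles."""
    r = 3958.8  # Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def match_candidates(riders, drivers, cell_deg=0.01):
    """riders/drivers: lists of (id, lat, lon). Buckets drivers by a
    coarse grid (stand-in for H3), then computes Haversine only
    within a rider's bucket and its 8 neighbors."""
    def cell(lat, lon):
        return (round(lat / cell_deg), round(lon / cell_deg))
    buckets = defaultdict(list)
    for d_id, lat, lon in drivers:
        buckets[cell(lat, lon)].append((d_id, lat, lon))
    matches = {}
    for r_id, lat, lon in riders:
        cx, cy = cell(lat, lon)
        best = None
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for d_id, dlat, dlon in buckets[(cx + dx, cy + dy)]:
                    dist = haversine_miles(lat, lon, dlat, dlon)
                    if best is None or dist < best[1]:
                        best = (d_id, dist)
        matches[r_id] = best
    return matches
```

The scaling point to make out loud: bucketing turns O(n*m) pairwise distance checks into O(n + m) bucketing plus small within-bucket comparisons, which is exactly what the H3 ring lookup buys in production.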
System Design

Design the surge pricing pipeline

Real-time: rider request events + driver supply events -> Flink keyed by H3 cell -> sliding 5-min window -> compute supply/demand ratio -> emit surge multiplier to Redis. Cover failure modes: cell with zero supply, sudden demand spike, driver app reporting lag.
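A toy single-cell version of the keyed sliding-window state helps reason about the failure modes named above. This is a sketch, not Flink: one instance per H3 cell, a 5-minute event-time window, and a cap that handles the zero-supply case instead of dividing by zero.

```python
from collections import deque

class SurgeCell:
    """Toy sliding-window surge calculator for one H3 cell,
    sketching what Flink keyed state would hold. Timestamps in
    epoch seconds."""
    def __init__(self, window_sec=300):
        self.window = window_sec
        self.requests = deque()  # rider request timestamps
        self.supply = deque()    # driver-available timestamps

    def _evict(self, dq, now):
        # Drop events older than the sliding window.
        while dq and dq[0] <= now - self.window:
            dq.popleft()

    def on_request(self, ts):
        self.requests.append(ts)

    def on_supply(self, ts):
        self.supply.append(ts)

    def multiplier(self, now, cap=3.0):
        self._evict(self.requests, now)
        self._evict(self.supply, now)
        if not self.supply:       # zero-supply failure mode:
            return cap            # emit the cap, don't divide by zero
        ratio = len(self.requests) / len(self.supply)
        return min(max(1.0, ratio), cap)
```

The other failure modes map onto this sketch directly: a sudden demand spike is bounded by `cap`, and driver app reporting lag means `supply` under-counts, which argues for watermarking or a grace period before emitting.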
System Design

Design ETA prediction inference + retraining pipeline

Inference: per-request lookup of features from Redis (online store), call ML model, return ETA. Training: nightly Spark job pulls historical trips, computes features, trains model, evaluates against held-out, deploys. Discuss feature freshness: traffic features need 5-min freshness, weather can be 1-hour.
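The feature-freshness point can be made concrete with a small guard at inference time (a sketch; feature names and thresholds are illustrative):

```python
import time

def stale_features(features, max_age_sec, now=None):
    """features: dict name -> (value, fetched_at_epoch).
    max_age_sec: dict name -> allowed staleness, e.g. traffic 300s,
    weather 3600s. Returns the names of stale features so the caller
    can fall back to defaults or reject the ETA request."""
    now = time.time() if now is None else now
    return [
        name for name, (_, fetched_at) in features.items()
        if now - fetched_at > max_age_sec.get(name, 0)
    ]
```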
System Design

Design daily reconciliation pipeline for driver payouts

Postgres OLTP -> Debezium CDC -> Kafka -> S3 raw -> idempotent Spark with run_id -> Snowflake fact_driver_payout. Reconciliation report joins fact to source by event_id. Audit any deltas. Tag all jobs with run_id for reproducibility.
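The reconciliation-report step at the end of that flow can be sketched as a key-wise diff (field names are illustrative; in practice this is a full outer join in Snowflake):

```python
def reconcile(source_rows, fact_rows, key="event_id", amount="amount_usd"):
    """Join warehouse fact rows back to source rows by event_id and
    report deltas: rows missing from the fact, extras in the fact,
    and amount mismatches. Sketch of the daily reconciliation report."""
    src = {r[key]: r for r in source_rows}
    fct = {r[key]: r for r in fact_rows}
    deltas = []
    for k in src.keys() - fct.keys():
        deltas.append((k, "missing_in_fact"))
    for k in fct.keys() - src.keys():
        deltas.append((k, "extra_in_fact"))
    for k in src.keys() & fct.keys():
        if src[k][amount] != fct[k][amount]:
            deltas.append((k, "amount_mismatch"))
    return sorted(deltas)
```

The run_id tagging mentioned above is what makes the Spark step idempotent: a rerun overwrites its own partition, so this diff stays meaningful after retries.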
Modeling

Design a star schema for ride trip analytics

Grain: one row per trip. Fact: trip_id, rider_sk, driver_sk, pickup_h3_sk, dropoff_h3_sk, request_ts, accept_ts, pickup_ts, dropoff_ts, distance_miles, fare_usd, surge_multiplier. SCD Type 2 on driver dim for vehicle changes. H3 dim conformed across trip and surge fact tables.
Behavioral

Tell me about a disagreement with a product manager about a metric definition

Lyft loves this question. Story should show: data-driven defense of your position, listening to PM's reasoning, finding a compromise definition that satisfies the underlying need. Decision postmortem essential.

What Makes Lyft Data Engineer Interviews Different

Marketplace dynamics show up everywhere

Two-sided marketplace context (riders + drivers) shapes every system design and modeling question. If your answer doesn't acknowledge supply-demand dynamics, the interviewer asks until you do. Frame ride trip data as a record of supply meeting demand; surge as a control signal; ETA as both a UX and a marketplace metric.

Geospatial fluency expected

H3 hexagonal grid indexing is the lingua franca. Know what H3 is, how resolution levels work (resolution 8 ~ 0.7 km^2, resolution 9 ~ 0.1 km^2), and when to use it vs PostGIS or Geohash. Asking what resolution to bucket at is a senior signal.

Real-time + batch dual-track architecture is standard

Almost every system at Lyft has a real-time path (Flink or Spark Structured Streaming) and a batch path (Spark daily). The batch path is the source of truth; real-time is approximate. Reconciliation pipelines compare them daily and alert on drift. Mention this dual-track pattern unprompted.
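The drift-alerting comparison is simple enough to sketch (tolerance and key shape are assumptions):

```python
def drift_alert(batch_totals, realtime_totals, tolerance=0.01):
    """Compare batch aggregates (source of truth) to real-time
    aggregates per key and flag keys whose relative drift exceeds
    tolerance. Sketch of a daily dual-track reconciliation check."""
    alerts = []
    for key, truth in batch_totals.items():
        approx = realtime_totals.get(key, 0.0)
        if truth == 0:
            drift = 0.0 if approx == 0 else float("inf")
        else:
            drift = abs(approx - truth) / abs(truth)
        if drift > tolerance:
            alerts.append((key, round(drift, 4)))
    return sorted(alerts)
```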

Cross-functional collaboration weighs heavily

Lyft's data engineering teams sit close to product and operations. The behavioral round explicitly tests whether you can translate business asks into technical scope and push back when scope is unclear. Stories about working with non-engineers score well here.

How Lyft Connects to the Rest of Your Prep

The system design questions at Lyft overlap with Uber data engineering interview prep, since both companies solve similar marketplace problems. The geospatial pipeline patterns also show up at DoorDash data engineering interview prep and Instacart data engineering interview prep, which are three-sided marketplaces with similar architecture.

Drill the round-specific guides: window functions and SQL patterns interviewers test for the rolling window and top-N patterns, system design framework for data engineers for the marketplace pricing architecture, behavioral interview prep for Data Engineer for the cross-functional collaboration stories.

Data Engineer Interview Prep FAQ

How long does the Lyft Data Engineer interview process take?
3 to 4 weeks from recruiter screen to offer. Lyft moves at a moderate pace. Some candidates report faster timelines (2 weeks) when there's mutual urgency, but plan for a month.
Is Lyft remote-friendly for data engineers?
Yes. Lyft has been remote-first since 2022. Most teams are distributed. Some roles require quarterly visits to San Francisco or NYC offices, but the interview format is fully remote.
What level should I target at Lyft?
IC3 (Senior) is the most common external hiring level. IC2 roles exist but are typically filled internally or via early-career programs. IC4+ are mostly internal promotion with rare external hires for specific domain expertise.
Does Lyft test algorithms / LeetCode style?
Lightly. The Python coding round leans on data manipulation, but expect one DSA-flavored follow-up (typically a hash map or two-pointer problem). Don't grind 200 LeetCode problems for Lyft; spend the time on data engineering patterns instead.
How important is geospatial knowledge?
Important if you're targeting Maps, Pricing, or Marketplace teams. Less critical for Financial Data Platform or Driver/Rider product teams. The recruiter will tell you which team you're interviewing for; tailor your prep accordingly.
What languages can I use in Lyft Data Engineer interviews?
Python and SQL are universally accepted. Scala is fine for Spark-heavy roles. Go is acceptable for backend-leaning Data Engineer roles. Pick the language where you can write the cleanest code under pressure.
Does Lyft have a Bar Raiser equivalent?
Not formally. The behavioral round is conducted by a calibrated interviewer who participates in cross-team hiring decisions. The function is similar to Amazon's Bar Raiser without the explicit name.
How is comp negotiated at Lyft?
Initial offers are typically at the midpoint of the range. RSU refreshers vest annually. Sign-on bonuses are negotiable. Verified offer data on levels.fyi shows successful negotiations of 10 to 25% over initial offer when candidates have competing offers.

Practice Marketplace System Design

Drill surge pricing, ETA prediction, and matching pipeline designs in our sandbox. Get instant feedback on your trade-offs and failure-mode reasoning.

Start Practicing

More Data Engineer Interview Prep Guides

Continue your prep

Data Engineer Interview Prep, explore the full guide

50+ guides covering every round, company, role, and technology in the data engineer interview loop. Grounded in 2,817 verified interview reports across 929 companies, collected from real candidates.
