Airbnb pioneered the modern data culture and created Apache Airflow. Their DE interviews test marketplace data modeling, experimentation infrastructure, and the ability to make data accessible across an entire organization. The process spans 3 to 5 weeks with leveling from L4 through L7.
Total compensation includes base salary, RSUs on a 4-year vest schedule, and annual refresh grants. Airbnb RSUs vest quarterly after the first year.
Ranges reflect total comp (base + RSUs + bonus) for US-based roles. Actual offers vary by location, experience, and negotiation.
Entry-level for new grads or candidates with 1 to 3 years of experience
Most common hiring level for experienced candidates
Requires demonstrated cross-team technical leadership
Rare external hires; typically internal promotions with org-wide impact
Three stages from recruiter call to offer decision. Timeline: 3 to 5 weeks.
Initial conversation about your experience and interest in Airbnb. The recruiter evaluates your background with data platforms, pipeline orchestration, and experimentation infrastructure. Airbnb has one of the strongest data cultures in tech, so they look for candidates who understand how data drives product decisions and can articulate the role of data engineering in enabling that culture.
SQL problems set in a marketplace context: bookings, listings, reviews, search interactions. Airbnb phone screens test aggregation, window functions, and multi-step analytical queries. The interviewer evaluates clarity of thought and how you handle ambiguity in problem definitions. You may need to ask clarifying questions about business logic before writing queries.
Five rounds covering SQL deep dive, system design, coding (Python), data modeling, and a core values interview. System design at Airbnb emphasizes experimentation platforms, search ranking data pipelines, and pricing analytics. The core values interview evaluates alignment with Airbnb's mission and how you champion the data culture. Data modeling often involves marketplace schemas that support both operational and analytical workloads.
The tools and systems Airbnb's data engineering org builds on.
Python, Java, Scala
Apache Spark, Apache Airflow (created at Airbnb), Apache Hive
S3, Parquet, custom data lake architecture
Minerva (Airbnb's centralized metrics platform, single source of truth for all metric definitions)
Presto/Trino, Spark SQL
Apache Airflow (internal fork with Airbnb-specific operators)
Bighead (Airbnb's end-to-end ML infrastructure)
Data engineers at Airbnb are embedded across these major teams. Each has distinct technical focus areas and interview emphasis.
Ranking pipelines, search relevance features, listing quality signals, personalization data
Fraud detection, identity verification, risk scoring, trust signals across hosts and guests
Transaction pipelines, pricing models, payout reconciliation, tax and compliance data
Airflow infrastructure, Minerva development, data quality tooling, schema management
Two-sided marketplace metrics, retention funnels, lifetime value models, engagement tracking
Dynamic pricing models, demand forecasting, occupancy optimization, smart pricing features
Real question types from each round. The guidance shows what the interviewer evaluates and how to structure your answer.
Join searches to bookings on user_id where booking_ts BETWEEN search_ts AND search_ts + INTERVAL 24 HOUR. Group by city and month. Discuss attribution: what if a user searched multiple cities before booking?
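The join described above can be sketched with Python's built-in `sqlite3` module. The table layouts, column names, and sample rows are illustrative assumptions, not Airbnb's schema, and the extra city match is one possible attribution rule rather than the required one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and sample data, for illustration only.
conn.executescript("""
CREATE TABLE searches (search_id INTEGER, user_id INTEGER, city TEXT, search_ts TEXT);
CREATE TABLE bookings (booking_id INTEGER, user_id INTEGER, city TEXT, booking_ts TEXT);
INSERT INTO searches VALUES
  (1, 10, 'Paris',  '2024-03-01 09:00:00'),
  (2, 11, 'Paris',  '2024-03-02 10:00:00'),
  (3, 12, 'Lisbon', '2024-03-05 08:00:00'),
  (4, 10, 'Paris',  '2024-03-01 12:00:00');
INSERT INTO bookings VALUES
  (100, 10, 'Paris',  '2024-03-01 20:00:00'),
  (101, 12, 'Lisbon', '2024-03-07 08:00:00');
""")

query = """
WITH flagged AS (
    SELECT s.search_id,
           s.city,
           strftime('%Y-%m', s.search_ts) AS month,
           -- 1 if the same user booked in the same city within 24 hours of the search
           EXISTS (
               SELECT 1 FROM bookings b
               WHERE b.user_id = s.user_id
                 AND b.city = s.city
                 AND b.booking_ts BETWEEN s.search_ts
                                      AND datetime(s.search_ts, '+24 hours')
           ) AS converted
    FROM searches s
)
SELECT city, month, COUNT(*) AS searches, SUM(converted) AS converted_searches
FROM flagged
GROUP BY city, month
ORDER BY city, month
"""
rows = conn.execute(query).fetchall()
```

In this sample, one Paris booking gives credit to both of that user's Paris searches. That is exactly the attribution question the interviewer wants you to raise: you could instead credit only the last search before the booking.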
Calculate avg response time per host per quarter. Use LAG to get prior quarter. Filter where current > 1.5 * prior. Discuss how to handle hosts with very few requests (small sample sizes).
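A minimal sketch of the LAG approach, run through `sqlite3` (window functions need SQLite 3.25 or newer). The `host_responses` table, its columns, and the `MIN_REQUESTS` cutoff are assumptions made for the example.

```python
import sqlite3

MIN_REQUESTS = 2  # hypothetical minimum sample size per host-quarter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE host_responses (host_id INTEGER, quarter TEXT, response_minutes REAL);
INSERT INTO host_responses VALUES
  (1, '2024-Q1', 20), (1, '2024-Q1', 40),
  (1, '2024-Q2', 50), (1, '2024-Q2', 70),
  (2, '2024-Q1', 30), (2, '2024-Q2', 33);
""")

query = """
WITH quarterly AS (
    SELECT host_id, quarter,
           AVG(response_minutes) AS avg_minutes,
           COUNT(*) AS n_requests
    FROM host_responses
    GROUP BY host_id, quarter
),
with_prior AS (
    SELECT *,
           -- previous row for the same host; assumes no quarter is missing
           LAG(avg_minutes) OVER (PARTITION BY host_id ORDER BY quarter) AS prior_avg
    FROM quarterly
)
SELECT host_id, quarter, avg_minutes, prior_avg
FROM with_prior
WHERE prior_avg IS NOT NULL
  AND avg_minutes > 1.5 * prior_avg
  AND n_requests >= ?
ORDER BY host_id, quarter
"""
flagged = conn.execute(query, (MIN_REQUESTS,)).fetchall()
```

Host 1 goes from a 30-minute to a 60-minute average and is flagged. Host 2 has only one request per quarter and a small change, so it is not. Note that LAG returns the previous row, not necessarily the previous calendar quarter, so gaps in host activity need separate handling.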
LEFT JOIN listings to bookings, filter WHERE booking_id IS NULL. Join to search_impressions, group by listing, HAVING COUNT >= 100. Discuss what this indicates about listing quality or pricing.
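The anti-join pattern above might look like the following in `sqlite3`. Table names, columns, and row counts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listings (listing_id INTEGER PRIMARY KEY);
CREATE TABLE bookings (booking_id INTEGER, listing_id INTEGER);
CREATE TABLE search_impressions (listing_id INTEGER);
INSERT INTO listings VALUES (1), (2), (3);
INSERT INTO bookings VALUES (500, 1);
""")

# Listing 1: 150 impressions and one booking.
# Listing 2: 120 impressions, never booked.
# Listing 3: 40 impressions, never booked.
impression_counts = [(1, 150), (2, 120), (3, 40)]
impression_rows = [(lid,) for lid, n in impression_counts for _ in range(n)]
conn.executemany("INSERT INTO search_impressions (listing_id) VALUES (?)", impression_rows)

query = """
SELECT l.listing_id, COUNT(*) AS impressions
FROM listings l
LEFT JOIN bookings b ON b.listing_id = l.listing_id
JOIN search_impressions si ON si.listing_id = l.listing_id
WHERE b.booking_id IS NULL          -- keep only listings with no booking
GROUP BY l.listing_id
HAVING COUNT(*) >= 100              -- seen often enough to be meaningful
"""
never_booked = conn.execute(query).fetchall()
```

Because the `IS NULL` filter keeps exactly one row per unbooked listing, the later join to impressions does not inflate the count.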
Weight each signal (e.g., identity = 30%, host reviews = 30%, response rate = 20%, cancellations = 20%). Use window functions with ROWS BETWEEN 6 PRECEDING AND CURRENT ROW, which gives a rolling window of 7 rows. Discuss how to handle new guests with no history and the cold-start problem.
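As a rough illustration in plain Python: the weights are the example values from the text, each signal is assumed to be pre-normalized to the range 0 to 1 (with 1 meaning "good", so a guest with no cancellations scores 1.0 on that signal), and renormalizing over the available signals is just one possible cold-start policy.

```python
WEIGHTS = {
    "identity": 0.30,
    "host_reviews": 0.30,
    "response_rate": 0.20,
    "cancellations": 0.20,
}

def trust_score(signals):
    """Weighted average of the signals that are present.

    Missing or None signals are skipped and the remaining weights are
    renormalized (an assumed cold-start policy, not a confirmed one).
    Returns None when no signal is available at all.
    """
    available = {k: v for k, v in signals.items() if k in WEIGHTS and v is not None}
    if not available:
        return None
    total_weight = sum(WEIGHTS[k] for k in available)
    return sum(WEIGHTS[k] * v for k, v in available.items()) / total_weight

def rolling_mean(values, window=7):
    """Plain-Python analogue of AVG(...) OVER (ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

A brand-new guest with only a verified identity would score on that signal alone. Whether that is acceptable, or whether new guests should get a conservative default instead, is the trade-off worth discussing.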
Define check functions for each validation. Return a structured report with pass/fail per check. Discuss how to set dynamic thresholds (e.g., expected row count based on day of week and season) and how to handle check failures in a pipeline.
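One way to structure such a report is sketched below. The check names, the 20% row-count tolerance, and the column names are assumptions. In practice `expected_rows` would come from a lookup keyed on day of week and season rather than a constant.

```python
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str

def check_row_count(rows, expected, tolerance=0.2):
    # Pass when the row count is within +/- tolerance of the expected volume.
    low, high = expected * (1 - tolerance), expected * (1 + tolerance)
    ok = low <= len(rows) <= high
    return CheckResult("row_count", ok, f"{len(rows)} rows, expected {low:.0f}-{high:.0f}")

def check_not_null(rows, column):
    missing = sum(1 for r in rows if r.get(column) is None)
    return CheckResult(f"not_null:{column}", missing == 0, f"{missing} null values")

def check_unique(rows, column):
    values = [r.get(column) for r in rows]
    dupes = len(values) - len(set(values))
    return CheckResult(f"unique:{column}", dupes == 0, f"{dupes} duplicate values")

def run_checks(rows, expected_rows):
    """Run every check and return the full report, even if an early check fails."""
    return [
        check_row_count(rows, expected_rows),
        check_not_null(rows, "listing_id"),
        check_unique(rows, "booking_id"),
    ]

sample = [
    {"booking_id": 1, "listing_id": 10},
    {"booking_id": 2, "listing_id": None},
    {"booking_id": 2, "listing_id": 12},
]
report = run_checks(sample, expected_rows=3)
failed = [r.name for r in report if not r.passed]
```

Returning every result instead of raising on the first failure lets the pipeline decide per check whether to block downstream tasks or only alert.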
Compare each price change against historical distribution for that listing and its neighborhood. Use z-scores or IQR for outlier detection. Discuss how to distinguish legitimate event pricing from manipulation, and how to handle listings with limited price history.
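A small IQR-based sketch using only the standard library. The `min_history` cutoff and the conventional `k=1.5` multiplier are assumptions, and a real system would also compare against neighborhood prices and known event dates.

```python
import statistics

def iqr_bounds(prices, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(prices, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_price_outlier(history, new_price, min_history=8, k=1.5):
    """True/False when there is enough history, None when there is too little to judge."""
    if len(history) < min_history:
        return None
    low, high = iqr_bounds(history, k)
    return not (low <= new_price <= high)

history = [100, 102, 98, 101, 99, 103, 97, 100]
spike = is_price_outlier(history, 250)          # far outside the usual range
small_change = is_price_outlier(history, 104)   # within the fences
too_new = is_price_outlier([100, 101], 500)     # not enough history to decide
```

Returning `None` for thin history makes the limited-data case explicit, so the caller can fall back to a neighborhood-level distribution instead of silently passing or flagging the change.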
Event collection, experiment assignment logging, metric computation (pre-experiment and during), statistical analysis pipeline. Discuss metric standardization (Minerva), avoiding p-hacking with pre-registered metrics, and how to handle interaction effects between overlapping experiments.
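The metric-computation step can be illustrated with a toy join between an assignment log and booking events. The tuple layouts and integer timestamps are invented for the example. The rule it demonstrates is that only outcomes after a user's first assignment count.

```python
from collections import defaultdict

def conversion_by_variant(assignments, bookings):
    """Return {variant: (converted_users, assigned_users)}.

    assignments: iterable of (user_id, variant, assigned_ts)
    bookings:    iterable of (user_id, booking_ts)
    """
    first_assignment = {}
    for user_id, variant, ts in assignments:
        # Keep the earliest assignment per user.
        if user_id not in first_assignment or ts < first_assignment[user_id][1]:
            first_assignment[user_id] = (variant, ts)

    converted = set()
    for user_id, booking_ts in bookings:
        # Ignore bookings made before the user entered the experiment.
        if user_id in first_assignment and booking_ts >= first_assignment[user_id][1]:
            converted.add(user_id)

    totals, wins = defaultdict(int), defaultdict(int)
    for user_id, (variant, _) in first_assignment.items():
        totals[variant] += 1
        if user_id in converted:
            wins[variant] += 1
    return {variant: (wins[variant], totals[variant]) for variant in totals}

assignments = [
    (1, "control", 100), (2, "treatment", 100), (3, "treatment", 105),
    (4, "control", 110), (5, "treatment", 120),
]
bookings = [(1, 90), (2, 120), (3, 130), (3, 140)]
result = conversion_by_variant(assignments, bookings)
```

User 1 booked before being assigned, so that booking is excluded. User 3 booked twice but counts once, because the metric here is per-user conversion.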
Batch features (historical booking rate, review scores) from Spark, real-time features (current availability, recent views) from a feature store. Discuss feature freshness requirements, how to backfill features for new listings, and serving latency constraints for search.
Ingest demand signals (search volume, booking pace), event calendars, and neighborhood comps. Batch compute base recommendations daily with Spark; update intra-day for demand spikes. Discuss how to handle new listings (cold start), host override behavior, and feedback loops where pricing affects demand.
Fact: bookings, searches, reviews, messages. Dimensions: listings, hosts, guests, locations. Discuss the two-sided marketplace: host metrics (occupancy rate, response time, revenue) vs guest metrics (search-to-book conversion, repeat booking rate). Define grain carefully for each fact table.
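A deliberately tiny star-schema sketch in `sqlite3` shows how one fact table can answer both a host-side and a guest-side question. Every table and column name here is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_listing (listing_id INTEGER PRIMARY KEY, host_id INTEGER, city TEXT);
CREATE TABLE dim_guest (guest_id INTEGER PRIMARY KEY);
-- Grain: one row per confirmed booking.
CREATE TABLE fact_booking (
    booking_id  INTEGER PRIMARY KEY,
    listing_id  INTEGER REFERENCES dim_listing(listing_id),
    guest_id    INTEGER REFERENCES dim_guest(guest_id),
    nights      INTEGER,
    revenue_usd REAL
);
INSERT INTO dim_listing VALUES (1, 100, 'Paris'), (2, 100, 'Paris'), (3, 200, 'Lisbon');
INSERT INTO dim_guest VALUES (10), (11);
INSERT INTO fact_booking VALUES
  (1, 1, 10, 3, 300.0),
  (2, 2, 10, 2, 180.0),
  (3, 3, 11, 4, 400.0);
""")

# Host side: booked nights and revenue per host.
host_metrics = conn.execute("""
    SELECT l.host_id, SUM(f.nights), SUM(f.revenue_usd)
    FROM fact_booking f
    JOIN dim_listing l ON l.listing_id = f.listing_id
    GROUP BY l.host_id
    ORDER BY l.host_id
""").fetchall()

# Guest side: guests with more than one booking (repeat bookers).
repeat_guests = conn.execute("""
    SELECT guest_id
    FROM fact_booking
    GROUP BY guest_id
    HAVING COUNT(*) > 1
""").fetchall()
```

Stating the grain in the table definition makes it easy to justify why sums of `nights` and `revenue_usd` are safe to aggregate from either side of the marketplace.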
Define funnel stages: search impression, listing view, booking request, host response, booking confirmed, check-in, review submitted. Use a session-based fact table keyed on search_session_id. Discuss how to handle funnel re-entry (guest who views the same listing twice), attribution across devices, and how Minerva would define standard funnel metrics.
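A minimal sketch of counting sessions per funnel stage, keyed on `search_session_id`. The event tuple shape is an assumption. Re-entry is handled here by counting each stage at most once per session, which is one of several defensible choices.

```python
FUNNEL_STAGES = [
    "search_impression",
    "listing_view",
    "booking_request",
    "host_response",
    "booking_confirmed",
    "check_in",
    "review_submitted",
]

def funnel_counts(events):
    """Count distinct sessions that reached each stage.

    events: iterable of (search_session_id, stage) tuples.
    Repeated stages within a session (funnel re-entry) are counted once.
    """
    reached = {}
    for session_id, stage in events:
        reached.setdefault(session_id, set()).add(stage)
    return {
        stage: sum(1 for stages in reached.values() if stage in stages)
        for stage in FUNNEL_STAGES
    }

events = [
    ("s1", "search_impression"), ("s1", "listing_view"),
    ("s1", "listing_view"),       # guest views the same listing twice
    ("s1", "booking_request"),
    ("s2", "search_impression"), ("s2", "listing_view"),
    ("s3", "search_impression"),
]
counts = funnel_counts(events)
```

This version does not require a session to pass through every earlier stage. A stricter funnel definition would check that each stage's predecessors were also reached.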
Airbnb's data culture means DEs serve analysts, PMs, and executives. Describe building self-serve tools, documentation, or training. Quantify impact: 'Reduced analyst time-to-answer from 2 days to 30 minutes' or 'Enabled 50 PMs to query experiment results independently.'
Patterns that consistently lead to rejections, and how to avoid them.
Airbnb is a two-sided marketplace. Every metric has a host perspective and a guest perspective. Candidates who only think about the buyer side miss half the picture. When modeling bookings, consider occupancy rate (host metric) alongside conversion rate (guest metric).
Minerva is Airbnb's centralized metrics layer. It defines how every metric is calculated so that teams cannot produce conflicting numbers. If you propose a system design without mentioning metric standardization or a single source of truth, you are ignoring a core part of Airbnb's data philosophy.
Airbnb created Airflow. Surface-level knowledge (DAGs, operators, scheduling) is the minimum. Interviewers expect you to discuss backfill strategies, idempotent task design, dynamic DAG generation, SLA monitoring, and how to handle task failures gracefully in production.
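Idempotent task design can be shown without Airflow itself. The sketch below uses `sqlite3` and a hypothetical `daily_bookings` table to illustrate the delete-then-insert pattern for a single date partition, so that a retry or backfill of the same date leaves the table in the same state.

```python
import sqlite3

def load_partition(conn, ds, rows):
    """Replace the partition for date `ds` atomically.

    rows: iterable of (listing_id, bookings) tuples for that date.
    Running this twice with the same inputs yields the same table contents.
    """
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM daily_bookings WHERE ds = ?", (ds,))
        conn.executemany(
            "INSERT INTO daily_bookings (ds, listing_id, bookings) VALUES (?, ?, ?)",
            [(ds, listing_id, n) for listing_id, n in rows],
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_bookings (ds TEXT, listing_id INTEGER, bookings INTEGER)")

load_partition(conn, "2024-07-01", [(1, 3), (2, 5)])
load_partition(conn, "2024-07-01", [(1, 3), (2, 5)])  # simulated retry or backfill rerun
load_partition(conn, "2024-07-02", [(1, 4)])

total_rows = conn.execute("SELECT COUNT(*) FROM daily_bookings").fetchone()[0]
july_first = conn.execute(
    "SELECT listing_id, bookings FROM daily_bookings WHERE ds = ? ORDER BY listing_id",
    ("2024-07-01",),
).fetchall()
```

In an orchestrated pipeline the `ds` value would come from the run's logical date, which is what makes a backfill over a date range safe to rerun.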
Airbnb's business is highly seasonal and location-dependent. A pipeline that works for New York in July will produce very different volumes for rural Japan in February. Always mention how your design handles variable data volumes, seasonal spikes, and regional differences.
The core values interview is a real evaluation round with veto power. Candidates who prepare only for technical rounds and wing the values discussion get rejected. Prepare specific stories that demonstrate belonging, hosting, and championing the mission.
Why preparing for Airbnb requires a different approach than other top-tier companies.
Airflow started as an internal Airbnb project in 2014, entered the Apache Incubator in 2016, and became an Apache top-level project in 2019. This means Airbnb's data infrastructure runs on a deeply customized fork with proprietary operators, monitoring, and SLA tooling. The internal engineering culture treats orchestration as a first-class discipline, not just a scheduler.
Unlike most companies where metric definitions live in scattered dashboards and notebooks, Airbnb built Minerva to centralize every metric definition. DEs are responsible for registering metrics in Minerva and ensuring pipelines produce data that conforms to these definitions. This changes how you think about pipeline design: output is not just a table, it is a certified metric.
Most data engineering roles deal with one type of customer. At Airbnb, every feature, every metric, and every pipeline must account for both hosts and guests. A cancellation pipeline must update host availability, guest booking history, trust scores for both parties, and financial records. This multiplier effect makes even simple-sounding problems significantly harder.
Airbnb invests in data literacy training for all employees. Product managers, designers, and operations staff are expected to query data and interpret experiment results. DEs build infrastructure that serves the entire company, not just analysts. This means you are evaluated on how well you make data accessible, not just how well you build pipelines.
Tactical advice for each aspect of the interview loop.
Apache Airflow was born at Airbnb. Interviewers expect you to understand DAG-based orchestration, dependency management, idempotent tasks, and retry strategies. If you use Airflow, know it well. If you use a competitor (Dagster, Prefect), be ready to compare.
Airbnb trains all employees on data literacy and has a centralized metrics layer (Minerva). DEs are expected to help the entire organization, not just serve other engineers. Prepare examples of making data accessible to non-technical users.
Every Airbnb metric has a host side and a guest side. When modeling data, always consider both perspectives: a booking is revenue for the host and an experience for the guest. Interviewers notice candidates who naturally think about both sides.
Airbnb runs thousands of experiments concurrently. DE pipelines compute metrics for these experiments. Understand how to design pipelines that support A/B testing: logging experiment assignment, computing treatment vs control metrics, and handling novelty effects.
Airbnb DE questions test marketplace thinking, Minerva-style metric standardization, and data culture. Practice with problems that mirror two-sided platform analytics.