Company Interview Guide
Amazon's DE interview is unique because Leadership Principles carry as much weight as technical skills. Every round blends behavioral questions with SQL or system design. The technical side covers SQL, system design with AWS services, and sometimes Python. Below: the process, the five LPs that matter most for DEs, and 10 example questions with approaches.
Three stages. The onsite loop is where most evaluation happens.
Mostly logistics and background review. The recruiter confirms your experience level, asks about interest in Amazon, and explains the process. They may ask one or two Leadership Principle questions early. Have a concise pitch about your data engineering experience and why Amazon.
Split between SQL/coding and Leadership Principle questions. Expect 1 to 2 SQL problems and 1 to 2 behavioral questions. Amazon phone screens run 60 minutes, longer than at most companies. The SQL is intermediate difficulty. Behavioral questions carry real weight even in the phone screen.
Four to five interviewers each test a different combination of technical skills and Leadership Principles. Every round includes at least one behavioral question. There is no purely behavioral round; LPs are woven into every interview. Technical rounds cover SQL, system design, and sometimes Python. One interviewer is the Bar Raiser, with veto power over the hire.
Amazon has 16 LPs. These five are most relevant for DE roles and most frequently tested.
The most relevant LP for data engineers. Amazon wants DEs who dig into data quality issues, trace anomalies to root causes, and refuse surface-level explanations. When a dashboard number looks wrong, you investigate the pipeline, source data, and transformation logic before reporting the issue.
Example question: Tell me about a time you found a data quality issue that others had missed.
DEs who own their pipelines end-to-end. You do not throw data over the wall. You monitor, alert on failures, and fix issues proactively. Amazon wants to hear about times you took ownership beyond your explicit responsibilities.
Example question: Describe a situation where you went beyond your role to solve a data problem.
Speed matters at Amazon. They want DEs who make decisions with incomplete information and iterate. If a stakeholder needs a dataset and the perfect solution takes 3 months, what can you deliver in 2 weeks? The interim solution demonstrates Bias for Action.
Example question: Tell me about a time you had to make a quick decision with limited data.
Data accuracy is non-negotiable. Amazon runs on data for pricing, inventory, recommendations, and logistics. A wrong number can cost millions. This LP tests whether you build validation, testing, and monitoring into your work.
Example question: Describe how you maintain data quality in a pipeline you own.
Data engineering tools change fast. Amazon wants DEs who keep up with new tools, evaluate them critically, and adopt what works. This LP comes up when discussing how you chose a technology or learned a new domain.
Example question: Tell me about a time you learned a new technology to solve a problem.
Technical and behavioral questions from an Amazon DE loop.
LEFT JOIN orders to returns on customer_id, aggregate both, subtract. Use COALESCE to default the return total to 0 for customers with no returns. Tests JOIN type selection and NULL handling.
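A minimal sketch of this approach using SQLite via Python; the orders/returns schema and sample values are hypothetical. Aggregating returns in a subquery before the LEFT JOIN avoids row fan-out, and COALESCE handles customers with no return rows.

```python
import sqlite3

# Hypothetical schema: orders(customer_id, amount), returns(customer_id, amount).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer_id INT, amount REAL);
CREATE TABLE returns (customer_id INT, amount REAL);
INSERT INTO orders VALUES (1, 100.0), (1, 50.0), (2, 80.0);
INSERT INTO returns VALUES (1, 30.0);
""")

query = """
SELECT o.customer_id,
       SUM(o.amount) - COALESCE(r.total_returned, 0) AS net_spend
FROM orders o
LEFT JOIN (
    -- Pre-aggregate returns so the join is one row per customer.
    SELECT customer_id, SUM(amount) AS total_returned
    FROM returns
    GROUP BY customer_id
) r ON o.customer_id = r.customer_id
GROUP BY o.customer_id
ORDER BY o.customer_id;
"""
for row in conn.execute(query):
    print(row)  # (1, 120.0) then (2, 80.0)
```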
LAG(daily_sales) OVER (PARTITION BY product_id ORDER BY date) for previous day, filter where current < 0.5 * previous. Tests window functions and computed column filtering.
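This pattern can be sketched end to end in SQLite; the daily_sales table and its values are invented for illustration. LAG is computed in an inner query so the outer WHERE can filter on it (window functions cannot appear directly in WHERE):

```python
import sqlite3

# Hypothetical table daily_sales(product_id, date, daily_sales):
# flag days where sales dropped below 50% of the previous day.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_sales (product_id INT, date TEXT, daily_sales REAL);
INSERT INTO daily_sales VALUES
  (1, '2024-01-01', 100), (1, '2024-01-02', 40), (1, '2024-01-03', 90),
  (2, '2024-01-01', 200), (2, '2024-01-02', 150);
""")

query = """
SELECT product_id, date, daily_sales, prev_sales
FROM (
    SELECT product_id, date, daily_sales,
           LAG(daily_sales) OVER (
               PARTITION BY product_id ORDER BY date
           ) AS prev_sales
    FROM daily_sales
)
WHERE daily_sales < 0.5 * prev_sales;
"""
for row in conn.execute(query):
    print(row)  # only product 1 on 2024-01-02 (40 vs 100)
```

The first day per product has a NULL prev_sales, and the NULL comparison quietly excludes it, which is usually the desired behavior here.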
SUM(revenue) per category, then SUM(cat_revenue) OVER (ORDER BY cat_revenue DESC ROWS UNBOUNDED PRECEDING) / total * 100. Tests aggregation, window functions, arithmetic.
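A runnable sketch of the cumulative-share query, again in SQLite; the sales table and numbers are made up. The CTE does the per-category aggregation, and the running SUM over a descending order produces the cumulative percentage:

```python
import sqlite3

# Hypothetical table sales(category, revenue); compute each category's
# cumulative share of total revenue, largest categories first.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (category TEXT, revenue REAL);
INSERT INTO sales VALUES ('A', 40), ('A', 20), ('B', 30), ('C', 10);
""")

query = """
WITH cat AS (
    SELECT category, SUM(revenue) AS cat_revenue
    FROM sales
    GROUP BY category
)
SELECT category,
       cat_revenue,
       100.0 * SUM(cat_revenue) OVER (
           ORDER BY cat_revenue DESC
           ROWS UNBOUNDED PRECEDING
       ) / (SELECT SUM(cat_revenue) FROM cat) AS cum_pct
FROM cat
ORDER BY cat_revenue DESC;
"""
for row in conn.execute(query):
    print(row)  # A reaches 60%, B 90%, C 100%
```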
Event-driven: inventory change events to Kinesis, Flink for processing, DynamoDB for real-time state, S3 + Glue for batch analytics. Discuss consistency, scale (millions of items, hundreds of warehouses), and the tradeoff between real-time update latency and query performance.
Batch: daily ETL from orders, build co-purchase matrices, store in Redshift or S3. Serving: precomputed recs in DynamoDB. Discuss cold start, data freshness, and A/B testing the model.
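The co-purchase matrix at the heart of the batch step can be sketched in a few lines of Python; the order data is invented. Each unordered pair of distinct products in the same order increments a counter, the raw "customers who bought X also bought Y" signal:

```python
from collections import Counter
from itertools import combinations

# Hypothetical daily batch of orders, each a list of product IDs.
orders = [
    ["book", "lamp"],
    ["book", "lamp", "desk"],
    ["book", "desk"],
]

co_purchase = Counter()
for items in orders:
    # sorted(set(...)) deduplicates within an order and gives each
    # pair a canonical (a, b) key regardless of item order.
    for a, b in combinations(sorted(set(items)), 2):
        co_purchase[(a, b)] += 1

print(co_purchase)
```

At Amazon scale this counting runs as a distributed job (e.g. Spark on EMR) rather than in one process, but the logic is the same.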
Prioritize by business impact. Automated checks: schema drift, row count anomalies, NULL rate monitoring, freshness SLAs. Alerting tiers: PagerDuty for critical, email for informational. Discuss false positive management.
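Two of the automated checks above, sketched minimally; the thresholds, function names, and sample data are illustrative assumptions, and in practice these would feed the alerting tiers rather than return booleans:

```python
def null_rate_check(rows, column, max_null_rate=0.01):
    # Fail if the fraction of NULLs in `column` exceeds the threshold.
    nulls = sum(1 for r in rows if r.get(column) is None)
    rate = nulls / len(rows) if rows else 1.0
    return rate <= max_null_rate, rate

def row_count_check(today_count, recent_counts, tolerance=0.5):
    # Fail if today's volume deviates more than `tolerance` (50%)
    # from the trailing average.
    baseline = sum(recent_counts) / len(recent_counts)
    deviation = abs(today_count - baseline) / baseline
    return deviation <= tolerance, deviation

rows = [{"order_id": 1}, {"order_id": None}, {"order_id": 3}]
ok, rate = null_rate_check(rows, "order_id", max_null_rate=0.05)
print(ok, rate)   # one NULL in three rows fails a 5% threshold

ok, dev = row_count_check(400, [1000, 1100, 900])
print(ok, dev)    # a 60% drop vs baseline fails a 50% tolerance
```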
STAR format. Describe what broke, how you diagnosed (logs, lineage, data profiling), root cause, fix, and prevention. Quantify: 'Affected 2.3M rows, identified root cause in 45 minutes, deployed fix in 2 hours, added 3 automated checks.'
Show constructive pushback. You understood their goal, explained technical constraints, proposed an alternative that met 80% of the need in 20% of the time. The stakeholder accepted because you solved their problem.
Show deliberate tradeoff. Shipped V1 with 3 of 5 metrics, documented gaps, committed to timeline for the rest. Stakeholder started making decisions immediately instead of waiting 6 weeks.
Be specific: name tools evaluated recently (dbt, Dagster, DuckDB, Iceberg). Explain evaluation criteria: does it solve a real problem better, what is migration cost, is the community active?
Familiarity with these shows you understand the AWS data ecosystem.
Columnar warehouse. Know distribution keys, sort keys, and how they affect query performance. Interviewers may ask when to use Redshift vs Athena.
Object storage backbone. Know partitioning strategies (by date, by source) and file formats (Parquet for analytics, JSON for raw ingestion).
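A date-and-source partitioning scheme usually means Hive-style key prefixes, which Glue and Athena can prune on. A tiny sketch of the key layout; the bucket and prefix names are made up:

```python
from datetime import date

def s3_key(source: str, dt: date, filename: str) -> str:
    # Hive-style partition keys (source=..., dt=...) let query engines
    # skip irrelevant prefixes instead of scanning the whole bucket.
    return f"s3://analytics-bucket/events/source={source}/dt={dt:%Y-%m-%d}/{filename}"

print(s3_key("web", date(2024, 3, 1), "part-0000.parquet"))
# s3://analytics-bucket/events/source=web/dt=2024-03-01/part-0000.parquet
```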
Managed ETL (serverless Spark). Glue Crawlers discover schemas; the Glue Data Catalog serves as a Hive-compatible metastore. Know Glue vs EMR tradeoffs.
Managed Spark/Hadoop. For heavy batch processing. Know cost model: transient clusters for batch vs persistent for interactive.
Real-time streaming. Data Streams for ingestion, Firehose for delivery, Data Analytics for processing. Know how it compares to Kafka.
Serverless SQL on S3. No infrastructure. Pay per query. Best for ad-hoc analysis when data is already in S3.
Amazon evaluates technical skills and Leadership Principles with equal weight. Practice SQL at interview difficulty while preparing your LP stories.
Practice SQL Problems