Interview Questions

Data Science Interview Questions

Most candidates think data science and data engineering interviews are different tracks. They're not. Interviewers for both roles pull SQL questions from the same well. What changes is the follow-up: DS rounds push on sampling and statistical significance, DE rounds push on scale and pipeline correctness. If you practice DS SQL questions and ignore the DE angle, you'll bomb the second half of the conversation.

SQL share in DE rounds: 41%
Python share: 35%
Pure system design: 3%
Rounds analyzed: 1,042

Source: DataDriven analysis of 1,042 verified data engineering interview rounds.

Where DS and DE Interviews Overlap

Five areas of overlap, ranked by how much they matter for data engineers.

SQL and Data Manipulation

High overlap

Data scientists and data engineers both face SQL questions in interviews, and the overlap here is almost complete. Aggregations, window functions, CTEs, joins, and subqueries appear identically in both interview tracks. If you are a data engineer interviewing at a company that titles the role 'data scientist,' the SQL portion will feel familiar. The main difference: DS interviews may include probability-themed SQL (sampling, bootstrapping with SQL) while DE interviews lean toward performance optimization.

Statistics and Probability

Medium overlap

DE interviews rarely test statistics directly, but many DE-adjacent roles (analytics engineer, ML engineer) do. Understanding p-values, confidence intervals, distributions, and hypothesis testing helps you build better data pipelines. If your pipeline feeds an A/B testing system, knowing how sample size, statistical power, and significance work makes you a better engineer. Several companies (Meta, Netflix, Spotify) ask DEs light statistics questions to gauge breadth.

ML Pipeline Design

High overlap for senior roles

At senior levels, data engineers are expected to understand how ML models consume data. Feature engineering, training data pipelines, model serving, and monitoring all live in the DE domain. DS interviews test the ML concepts. DE interviews test the infrastructure that supports them. If you can speak both languages, you are significantly more valuable in cross-functional interviews.

A/B Testing and Experimentation

Medium-High overlap

Companies like Meta, Netflix, and Airbnb ask DEs about experimentation infrastructure. How do you partition users into control and treatment groups? How do you ensure the pipeline delivers clean experiment data? How do you handle metric computation across variants? These questions sit at the intersection of DS and DE, and candidates who can answer both sides stand out.

Data Visualization and Business Metrics

Low-Medium overlap

DS interviews test your ability to choose the right chart, interpret trends, and communicate findings to stakeholders. DE interviews do not test visualization directly, but understanding business metrics (DAU, MAU, retention, churn, LTV) helps you build better data models. If a DE interviewer asks you to design a data model for user engagement, knowing what metrics the model needs to support is critical.

10 Questions with Approaches

Most DS prep guides stop at the statistical answer. Interviewers don't. They ask a DE follow-up: how would this scale, where does the data come from, what breaks at 10x volume. Every answer below includes that second beat so you're not caught flat-footed.

Q1
SQL

Write a query to calculate the 7-day rolling average of daily active users, handling days with zero activity.

Approach

Start with a date spine (generate_series or a calendar table) to ensure every day has a row, even days with zero active users. LEFT JOIN the user activity table to the date spine. COUNT(DISTINCT user_id) per day gives you daily active users. Then apply AVG() with a window frame: AVG(dau) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW). The date spine is the key insight. Without it, your rolling average skips zero-activity days, inflating the result. This question tests window functions, date handling, and awareness of data gaps.
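A runnable sketch of the date-spine pattern, using SQLite from Python (window functions require SQLite 3.25+). The table name, column names, and data are invented for illustration; SQLite builds the spine with a recursive CTE instead of generate_series:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (user_id INTEGER, activity_date TEXT);
INSERT INTO activity VALUES
  (1, '2024-01-01'), (2, '2024-01-01'),
  (1, '2024-01-02'),
  -- no activity at all on 2024-01-03
  (3, '2024-01-04');
""")

rows = conn.execute("""
WITH RECURSIVE spine(d) AS (               -- date spine: one row per day
    SELECT '2024-01-01'
    UNION ALL
    SELECT date(d, '+1 day') FROM spine WHERE d < '2024-01-04'
),
daily AS (                                 -- DAU per day, zero on gap days
    SELECT s.d,
           COUNT(DISTINCT a.user_id) AS dau
    FROM spine s
    LEFT JOIN activity a ON a.activity_date = s.d
    GROUP BY s.d
)
SELECT d,
       dau,
       AVG(dau) OVER (ORDER BY d
                      ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7d
FROM daily
ORDER BY d
""").fetchall()
```

The zero-activity day (2024-01-03) still contributes a row with dau = 0, so the rolling average is pulled down instead of silently skipping the gap.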

Q2
SQL

Find users whose total spending increased month-over-month for 3 consecutive months.

Approach

Aggregate spending by user and month. Use LAG to compare each month's spending to the previous month. Flag months where spending increased. Then identify sequences of 3 consecutive increase flags. The date-minus-ROW_NUMBER trick creates groups of consecutive months. Filter groups with COUNT >= 3. This tests window functions, aggregation, and the consecutive-sequence pattern that appears in both DS and DE interviews.
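The gaps-and-islands trick can be sketched end to end in SQLite; integer month numbers stand in for real dates, and the table name and data are invented. Subtracting ROW_NUMBER from the month number gives a constant per run of consecutive flagged months:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_spend (user_id INTEGER, month_num INTEGER, spend REAL);
-- user 1 increases for 3 consecutive months; user 2 has a dip
INSERT INTO monthly_spend VALUES
  (1, 1, 10), (1, 2, 20), (1, 3, 30), (1, 4, 40),
  (2, 1, 10), (2, 2, 20), (2, 3, 15), (2, 4, 25);
""")

rows = conn.execute("""
WITH flagged AS (                -- mark months where spend rose vs. LAG
    SELECT user_id, month_num,
           CASE WHEN spend > LAG(spend) OVER
                    (PARTITION BY user_id ORDER BY month_num)
                THEN 1 ELSE 0 END AS increased
    FROM monthly_spend
),
islands AS (                     -- month_num minus ROW_NUMBER is constant
                                 -- within each run of consecutive increases
    SELECT user_id,
           month_num - ROW_NUMBER() OVER
               (PARTITION BY user_id ORDER BY month_num) AS grp
    FROM flagged
    WHERE increased = 1
)
SELECT user_id
FROM islands
GROUP BY user_id, grp
HAVING COUNT(*) >= 3
""").fetchall()
```

Only user 1 survives the HAVING filter; user 2's two increases sit in separate islands because of the month-3 dip.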

Q3
Statistics

A product team ran an A/B test for 2 weeks and found a 3% lift in click-through rate with a p-value of 0.04. Should they ship the change?

Approach

A p-value of 0.04 is below the standard 0.05 threshold, so the result is statistically significant. But significance alone is not sufficient for a ship decision. Consider: (1) Is the 3% lift practically significant? A 3% lift on a 0.1% baseline is different from a 3% lift on a 50% baseline. (2) Was the sample size adequate? Two weeks captures weekly cycles but can miss monthly seasonality, payday effects, and novelty effects. (3) Did they look at guardrail metrics (revenue, latency, error rates)? A CTR improvement that degrades user experience is not worth shipping. (4) Was there any p-hacking (checking results daily and stopping early)? The right answer is: statistically significant, but need to verify practical significance, check guardrails, and confirm the test ran for the pre-determined duration.
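A quick sketch of the underlying test, a pooled two-proportion z-test using only the standard library. The counts are invented; the point is that the same relative lift produces very different p-values and absolute impacts depending on the baseline:

```python
from math import sqrt
from statistics import NormalDist

def two_prop_pvalue(x_a, n_a, x_b, n_b):
    """Two-sided z-test for a difference in proportions (pooled variance)."""
    p_pooled = (x_a + x_b) / (n_a + n_b)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    z = (x_b / n_b - x_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# 3% relative lift on a 50% baseline: 50.0% -> 51.5% CTR,
# 10,000 users per arm (all numbers hypothetical)
p = two_prop_pvalue(5000, 10000, 5150, 10000)
```

Here p lands between 0.01 and 0.05: significant at the conventional threshold, but the absolute lift (1.5 points on a 50% baseline vs. 3 clicks per 100k on a 0.1% baseline) is what the ship decision should weigh.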

Q4
Statistics

Explain the difference between Type I and Type II errors in the context of a data quality alert system.

Approach

Type I error (false positive): the alert fires when there is no real data quality issue. Your pipeline data is fine, but the monitor triggers a page at 3 AM. Type II error (false negative): there is a real data quality issue, but the alert does not fire. Bad data flows downstream undetected. For a data quality system, Type II errors are usually more costly because corrupted data reaches dashboards and ML models before anyone notices. When tuning alert thresholds, you trade off between the two: tighter thresholds reduce Type II errors but increase Type I (more false alarms). This framing shows interviewers you understand statistical concepts in an engineering context.
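The threshold tradeoff can be made concrete with a toy alert, where all numbers are made up for illustration: lowering the threshold catches more real incidents (fewer Type II errors) at the cost of paging on noise (more Type I errors):

```python
# Toy alert: flag a batch as "bad" when its observed null rate exceeds
# a threshold. Rates and labels below are invented for illustration.
batches = [
    # (observed_null_rate, actually_bad)
    (0.01, False), (0.02, False), (0.06, False),   # noisy but fine
    (0.08, True),  (0.15, True),                   # real incidents
]

def errors(threshold):
    type1 = sum(1 for r, bad in batches if r > threshold and not bad)
    type2 = sum(1 for r, bad in batches if r <= threshold and bad)
    return type1, type2

loose = errors(0.10)   # misses the 0.08 incident: one Type II error
tight = errors(0.05)   # catches both incidents, but pages on the 0.06 batch
```

Neither threshold is "correct"; the choice encodes which error is more expensive, which for data quality is usually the Type II error.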

Q5
ML Pipelines

Design a feature engineering pipeline for a recommendation system that serves 50M users.

Approach

Start with requirements: what features does the model need, how fresh must they be, and what latency does serving require? Batch features (user demographics, historical preferences) can be computed daily in Spark and stored in a feature store. Near-real-time features (session clicks, last-5-minutes behavior) require a streaming pipeline (Kafka to Flink) writing to a low-latency store (Redis or DynamoDB). The feature store should support both batch training (historical features joined to training labels) and online serving (point lookup by user_id at inference time). Discuss versioning: when you add a new feature, the model needs retraining on the same feature definition. This tests your understanding of the ML infrastructure that data engineers build and maintain.
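The dual-path idea, an offline store keyed by (user, date) for training joins and an online store holding only the latest row per user, can be sketched in a few lines. The class and method names are invented; a real feature store (Feast, Tecton, or an in-house build) adds versioning, TTLs, and backfills on top of this shape:

```python
# Minimal sketch of the offline/online split, assuming last-write-wins
# semantics for the serving path. Names are illustrative only.
class TinyFeatureStore:
    def __init__(self):
        self.offline = {}   # (user_id, date) -> features, for training joins
        self.online = {}    # user_id -> latest features, for point lookups

    def write(self, user_id, date, features):
        self.offline[(user_id, date)] = features
        self.online[user_id] = features   # last write wins

    def training_row(self, user_id, date):
        # point-in-time lookup: features as they were on `date`
        return self.offline.get((user_id, date))

    def serve(self, user_id):
        # low-latency lookup at inference time
        return self.online.get(user_id)

fs = TinyFeatureStore()
fs.write(42, "2024-01-01", {"clicks_7d": 3})
fs.write(42, "2024-01-02", {"clicks_7d": 5})
```

The key property to call out in an interview: training reads historical rows (no leakage from the future), while serving reads only the freshest row.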

Q6
ML Pipelines

How would you detect and handle data drift in a production ML pipeline?

Approach

Data drift means the distribution of input features changes over time, which can degrade model performance. Detection: compute statistical summaries (mean, median, percentiles, distribution histograms) for each feature on each batch. Compare current batch distributions to a reference window using KS tests, PSI (population stability index), or simple threshold checks on mean/variance. Handling: alert the ML team when drift exceeds a threshold, trigger model retraining if drift is persistent, and log drift metrics for retrospective analysis. As a data engineer, your responsibility is building the monitoring pipeline that detects drift and the retraining infrastructure that responds to it. You are not necessarily choosing the model architecture, but you are building the system that keeps the model healthy.
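PSI is simple enough to sketch directly. The inputs are binned proportions from a reference window and the current batch; the 0.2 alert threshold below is a common rule of thumb, not a universal constant, and should be tuned per feature:

```python
from math import log

def psi(expected, actual):
    """Population Stability Index between two binned distributions.
    Both inputs are lists of bin proportions summing to 1."""
    eps = 1e-6  # guard against empty bins
    return sum((a - e) * log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

reference = [0.25, 0.25, 0.25, 0.25]   # training-time distribution
same      = [0.25, 0.25, 0.25, 0.25]   # current batch, no drift
shifted   = [0.10, 0.20, 0.30, 0.40]   # current batch, mass moved right
```

An unchanged distribution scores near zero; the shifted one crosses the rule-of-thumb 0.2 threshold and would fire the drift alert.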

Q7
A/B Testing

A data scientist tells you their A/B test results are being polluted by bot traffic. How would you build a pipeline to filter bots?

Approach

Layer multiple signals: (1) Known bot user-agents (Googlebot, Bingbot, etc.) are easy to filter in the ingestion layer. (2) Request rate limits: flag user_ids with more than N actions per minute as suspicious. (3) Session behavior: bots often have zero scroll depth, no mouse movements, and perfectly uniform time-between-clicks. (4) IP reputation: cross-reference IPs against known bot networks. Build a scoring pipeline that assigns a bot_probability to each session. Let the experimentation team choose their threshold (e.g., exclude sessions with bot_probability > 0.8). Store both raw and filtered datasets so the team can compare results with and without filtering. The key is that bot filtering is a data engineering problem that directly affects data science outcomes.
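A toy version of the scoring layer, combining the signals above into a single bot_probability. The weights, rate limit, and bot-agent list are illustrative placeholders, not tuned production values:

```python
# Hypothetical bot scorer: hard filter on known agents, additive score
# on behavioral signals. All weights are made up for illustration.
KNOWN_BOT_AGENTS = {"Googlebot", "Bingbot"}

def bot_probability(session):
    if session["user_agent"] in KNOWN_BOT_AGENTS:
        return 1.0                        # hard filter at ingestion
    score = 0.0
    if session["actions_per_min"] > 60:   # rate-limit signal
        score += 0.4
    if session["scroll_depth"] == 0:      # behavioral signal
        score += 0.3
    if session["ip_flagged"]:             # IP reputation signal
        score += 0.3
    return min(score, 1.0)

human = {"user_agent": "Mozilla/5.0", "actions_per_min": 5,
         "scroll_depth": 0.7, "ip_flagged": False}
bot   = {"user_agent": "Mozilla/5.0", "actions_per_min": 200,
         "scroll_depth": 0, "ip_flagged": True}
```

Emitting a probability rather than a boolean is the design choice worth defending: the experimentation team picks the threshold, and both raw and filtered datasets stay comparable.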

Q8
A/B Testing

How would you design the data pipeline for an experimentation platform that runs 200 concurrent A/B tests?

Approach

The pipeline needs to: (1) Assign users to variants deterministically (hash user_id + experiment_id to ensure consistent assignment across sessions). (2) Log exposure events (which user saw which variant, when). (3) Join exposure events to outcome metrics (clicks, purchases, engagement). (4) Compute per-experiment statistics (mean, variance, sample size per variant). For 200 concurrent tests, the join between exposures and outcomes is the bottleneck. Partition both tables by date and experiment_id. Pre-aggregate metrics daily to reduce join size. Store results in a materialized table that the experimentation UI reads. Discuss backfill: when a metric definition changes, you need to recompute results for running experiments. This tests pipeline design, data modeling, and understanding of how experimentation infrastructure works at scale.
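Step (1), deterministic assignment, is worth being able to write on the spot. A common approach (sketched here with invented names) hashes user_id together with experiment_id, so a user keeps the same variant across sessions but is bucketed independently in each experiment:

```python
import hashlib

def assign_variant(user_id, experiment_id,
                   variants=("control", "treatment")):
    """Deterministic bucketing: hash user_id + experiment_id so the same
    user always lands in the same variant, independently per experiment."""
    key = f"{user_id}:{experiment_id}".encode()
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]

a = assign_variant(42, "exp_checkout")
```

No assignment table is needed for bucketing itself; exposure logging (step 2) is still required, because assignment without exposure overcounts the denominator.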

Q9
Business Metrics

Define retention rate, and write SQL to calculate 7-day retention for a mobile app.

Approach

Retention rate: the percentage of users who return to the app N days after their first visit. For 7-day retention: identify each user's first activity date, then check whether they had any activity on exactly day 7 (or within a 7-day window, depending on definition). SQL:

WITH first_visit AS (
    SELECT user_id, MIN(activity_date) AS first_date
    FROM activity
    GROUP BY user_id
)
SELECT f.first_date,
       COUNT(DISTINCT CASE WHEN a.activity_date = f.first_date + 7
                           THEN f.user_id END) * 100.0
         / COUNT(DISTINCT f.user_id) AS retention_7d
FROM first_visit f
LEFT JOIN activity a
  ON f.user_id = a.user_id
 AND a.activity_date = f.first_date + 7
GROUP BY f.first_date

Discuss the definition ambiguity: strict (exactly day 7) vs window (days 7 to 13). Different companies use different definitions, and clarifying this upfront shows attention to detail.
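The strict day-7 definition can be exercised end to end in SQLite against toy data (table name and rows invented). Note one dialect difference: SQLite stores dates as text, so the date arithmetic uses date(first_date, '+7 days') instead of adding an integer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (user_id INTEGER, activity_date TEXT);
INSERT INTO activity VALUES
  (1, '2024-01-01'), (1, '2024-01-08'),   -- retained on exactly day 7
  (2, '2024-01-01'),                      -- churned
  (3, '2024-01-01'), (3, '2024-01-05');   -- came back early, not on day 7
""")

rows = conn.execute("""
WITH first_visit AS (
    SELECT user_id, MIN(activity_date) AS first_date
    FROM activity
    GROUP BY user_id
)
SELECT f.first_date,
       COUNT(DISTINCT a.user_id) * 100.0
         / COUNT(DISTINCT f.user_id) AS retention_7d
FROM first_visit f
LEFT JOIN activity a
  ON a.user_id = f.user_id
 AND a.activity_date = date(f.first_date, '+7 days')
GROUP BY f.first_date
""").fetchall()
```

Only user 1 returned on exactly day 7, so the 2024-01-01 cohort's strict 7-day retention is one of three users; a windowed (days 7 to 13) definition would need a range condition on the join instead of equality.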

Q10
Business Metrics

A dashboard shows that average order value (AOV) increased 15% this month, but total revenue is flat. What happened?

Approach

If AOV went up but total revenue stayed flat, the number of orders must have decreased. Revenue = AOV * order_count. If AOV increases by 15% and revenue is constant, order count dropped by approximately 13%. Investigate: (1) Did a segment of low-value customers stop ordering, which would mechanically raise AOV while reducing volume? (2) Was there a price increase that drove away price-sensitive customers? (3) Did a promotion end that was generating high-volume, low-value orders? (4) Check for data pipeline issues: are all order sources still reporting? A broken ingestion pipeline for one order channel could reduce order count without the team realizing. This question tests business intuition and the habit of questioning data rather than accepting it at face value.
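The arithmetic behind the "approximately 13%" claim is a one-liner worth being able to reproduce:

```python
# Revenue = AOV * order_count, so flat revenue with AOV up 15% implies
# order_count scaled by 1 / 1.15.
aov_factor = 1.15                      # AOV up 15%
revenue_factor = 1.00                  # revenue flat
order_factor = revenue_factor / aov_factor
drop_pct = (1 - order_factor) * 100    # percentage drop in order count
```

order_factor comes out near 0.87, so order count fell by roughly 13%, not 15%: the reciprocal relationship is the detail interviewers check.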

Why Data Engineers Should Care About DS Topics

Three practical reasons, not theoretical hand-waving.

Many companies blur the line between DE and DS roles

At mid-size companies, the 'data engineer' role often includes experimentation analysis, metric computation, and dashboard building. These tasks overlap heavily with data science. If your interview includes a 'business case' or 'metric analysis' round, you are being tested on DS skills even if the title says data engineer. Being prepared for this overlap gives you an advantage.

Cross-functional credibility accelerates your career

Data engineers who understand statistics, ML pipelines, and business metrics earn trust from data scientists and product managers faster. This trust translates to more interesting projects, more influence on technical decisions, and faster promotions. You do not need to be an expert statistician, but understanding the basics makes you a better collaborator.

Adjacent roles expand your job search

If you can answer both DE and DS interview questions, your target job pool expands to include analytics engineer, ML engineer, and data platform engineer roles. These roles often pay comparably to or more than traditional DE roles and offer different growth paths. Breadth in your interview preparation translates directly to more opportunities.

Frequently Asked Questions

How much overlap is there between data science and data engineering interviews?
About 40 to 50% overlap, concentrated in SQL and data manipulation. Both roles test SQL heavily (window functions, CTEs, aggregation). The divergence is in depth: DS interviews go deep on statistics and ML theory, while DE interviews go deep on system design and pipeline architecture. At companies like Meta and Netflix, DE interviews also include light statistics and experimentation questions.
Should data engineers study machine learning for interviews?
You do not need to understand gradient descent or backpropagation. But you should understand how ML models consume data: what a feature store is, batch vs online feature serving, training data requirements, and model monitoring. At senior DE levels, these topics appear in system design rounds. Understanding them makes your pipeline designs more practical.
Can data science interview prep help me as a data engineer?
Yes, selectively. Focus on SQL (directly transferable), A/B testing concepts (useful for system design), and business metrics (useful for data modeling). Skip deep ML theory, Bayesian statistics, and optimization algorithms unless you are targeting an ML engineer role. The DS prep that helps DEs most is the analytical thinking framework: decompose a problem, define metrics, identify edge cases.
Are data science jobs being replaced by data engineering jobs?
Not replaced, but the boundaries are shifting. Many tasks that data scientists did manually (feature engineering, metric computation, experiment analysis) are being automated by data pipelines that DEs build. The remaining DS work is more specialized: model development, causal inference, and research. Both roles remain in demand, but the DE role has broadened to absorb some traditionally DS responsibilities.

Stop Practicing Only Half the Interview

The questions don't change between DS and DE. Your answers have to. Train on both framings in one place.

Start Practicing