Try DataDriven before you pay for it. 10 real interview questions (5 SQL, 3 Python, 2 data modeling) with full code execution and AI grading. No credit card. No trial period. No watered-down problems. The same execution environment, the same AI feedback engine, and the same interview-calibrated difficulty as the paid tier. Just fewer questions.
If you can solve these 10 problems cleanly and quickly, you're ready for a phone screen at Google, Meta, Amazon, or Netflix. If you can't, you know exactly where your gaps are and whether DataDriven's full library is worth investing in.
Free questions: 10
Domains: SQL, Python, and data modeling
Code execution: full, on every question
Cost: free, no credit card
Every SQL query runs against a real database with realistic sample data. Every Python function executes and produces real output, not a simulated approximation. This matters because the most common interview mistakes are subtle: a query that looks correct but produces wrong results due to a NULL handling issue, a Python function that works on the sample input but crashes on an edge case. You can't catch these mistakes without real execution.
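To make the NULL point concrete, here is a minimal sketch of a filter that looks correct but silently drops rows. It uses Python's built-in sqlite3 module and a hypothetical `orders` table with a `status` column; both are assumptions for illustration, not part of the DataDriven question set.

```python
import sqlite3

def count_active_orders():
    """Compare a naive filter with a NULL-aware one on the same data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "shipped"), (2, "cancelled"), (3, None)],  # order 3 has no status yet
    )
    # NULL <> 'cancelled' evaluates to NULL, so the row with a NULL status is dropped.
    naive = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status <> 'cancelled'"
    ).fetchone()[0]
    # Handling NULL explicitly keeps the order whose status is unknown.
    null_aware = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status <> 'cancelled' OR status IS NULL"
    ).fetchone()[0]
    conn.close()
    return naive, null_aware
```

Both queries read as "orders that were not cancelled", yet they return different counts on the same three rows. Reading the query does not reveal which one you wrote; running it does.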
After you submit your solution, the AI grader evaluates it on correctness, efficiency, and readability. It highlights specific lines: 'Line 14: This self-join will produce duplicate rows when a user has multiple orders on the same day. Add a DISTINCT or use a different join strategy.' This is the feedback you'd get from a senior engineer reviewing your code, not just a binary pass/fail.
The free questions aren't watered-down versions of the real thing. They're the same difficulty level you'd encounter in a phone screen at Google, Meta, Amazon, or Netflix. We include medium and hard problems because that's what interviews actually test. If you struggle with them, you know exactly what to work on.
The free tier is permanently free. It's not a 7-day trial that converts to a subscription if you forget to cancel. You create an account, you get 10 questions with full execution and grading, and they stay available forever. We do this because we want you to experience the platform before you decide whether the full library is worth paying for. If 10 questions are enough for your prep, great. If you need more, you'll know because you've already seen the quality.
The 10 free questions aren't random. Each one was chosen because it tests a pattern that appears in 80%+ of real data engineer interviews. Solve all 10 and you've covered the highest-frequency concepts.
Given an employees table with department_id, salary, and hire_date, find the top 3 earners in each department and their rank. Tests: window functions (RANK or DENSE_RANK), PARTITION BY, and filtering windowed results. This is a phone-screen-level problem that appears in some form at every major tech company.
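One possible shape for an answer, sketched with Python's built-in sqlite3 module so the SQL actually runs. The `name` column and the choice of DENSE_RANK over RANK are assumptions made for illustration; this is not the platform's reference solution.

```python
import sqlite3

def top_earners_per_department(rows, n=3):
    """Return (department_id, name, salary, salary_rank) for the top-n salary ranks per department."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (name TEXT, department_id INTEGER, salary INTEGER, hire_date TEXT)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH ranked AS (
            SELECT department_id, name, salary,
                   DENSE_RANK() OVER (
                       PARTITION BY department_id ORDER BY salary DESC
                   ) AS salary_rank
            FROM employees
        )
        -- Window results cannot be filtered in the same SELECT, hence the CTE.
        SELECT department_id, name, salary, salary_rank
        FROM ranked
        WHERE salary_rank <= ?
        ORDER BY department_id, salary_rank, name
    """
    result = conn.execute(query, (n,)).fetchall()
    conn.close()
    return result
```

With DENSE_RANK, tied salaries share a rank and the next distinct salary gets the next rank, so a department can return more than three rows. Whether that is what the interviewer means by "top 3" is worth asking out loud.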
Given a login_events table, calculate Day 1 and Day 7 retention for users who signed up in a specific month. Tests: self-joins or window functions on date differences, cohort analysis logic, and handling users who never logged in again (they should show as 0% retention, not be excluded).
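A rough sketch of the cohort logic, again using sqlite3. It assumes a separate hypothetical `users` table holding `signup_date`, dates stored as `YYYY-MM-DD` text, and a strict definition where Day N retention means a login exactly N days after signup. Real definitions vary, so treat these as stated assumptions.

```python
import sqlite3

def retention(users, logins, month_start, next_month_start):
    """Day 1 and Day 7 retention for users who signed up in [month_start, next_month_start)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER, signup_date TEXT)")
    conn.execute("CREATE TABLE login_events (user_id INTEGER, login_date TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)
    conn.executemany("INSERT INTO login_events VALUES (?, ?)", logins)
    query = """
        SELECT
            COUNT(DISTINCT u.user_id) AS cohort_size,
            COUNT(DISTINCT CASE
                WHEN julianday(e.login_date) - julianday(u.signup_date) = 1
                THEN u.user_id END) AS day1_users,
            COUNT(DISTINCT CASE
                WHEN julianday(e.login_date) - julianday(u.signup_date) = 7
                THEN u.user_id END) AS day7_users
        FROM users AS u
        -- LEFT JOIN keeps users who never logged in, so they count in the denominator.
        LEFT JOIN login_events AS e ON e.user_id = u.user_id
        WHERE u.signup_date >= ? AND u.signup_date < ?
    """
    cohort_size, day1, day7 = conn.execute(
        query, (month_start, next_month_start)
    ).fetchone()
    conn.close()
    if cohort_size == 0:
        return {"cohort_size": 0, "day1": 0.0, "day7": 0.0}
    return {
        "cohort_size": cohort_size,
        "day1": day1 / cohort_size,
        "day7": day7 / cohort_size,
    }
```

Swapping the LEFT JOIN for an inner join would shrink the cohort to users with at least one login and inflate both rates, which is the exclusion mistake the question is probing for.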
Given an orders table with order_date and amount, compute daily revenue and a 7-day running total. Tests: SUM() OVER (ORDER BY ... ROWS BETWEEN), date grouping, and the difference between ROWS and RANGE window frames (a subtle point that interviewers love to probe).
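A minimal sketch of the running-total pattern, run through sqlite3. It assumes `order_date` is stored as `YYYY-MM-DD` text and uses integer amounts to keep the arithmetic exact; both choices are illustrative.

```python
import sqlite3

def daily_revenue_with_rolling_total(orders):
    """Return (order_date, revenue, rolling_7) rows, one per date that has orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    query = """
        WITH daily AS (
            SELECT order_date, SUM(amount) AS revenue
            FROM orders
            GROUP BY order_date
        )
        SELECT order_date,
               revenue,
               -- ROWS counts physical rows: the current day plus the 6 rows before it.
               SUM(revenue) OVER (
                   ORDER BY order_date
                   ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
               ) AS rolling_7
        FROM daily
        ORDER BY order_date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Because the frame is defined in rows, a date with no orders produces no row, and the window then reaches back more than seven calendar days. Filling in missing dates first, or using a RANGE frame over a numeric day value, are two ways to address that, and it is the kind of follow-up the ROWS versus RANGE distinction leads to.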
Given a table of session events with sequential IDs, find the gaps where IDs are missing. Tests: LAG/LEAD to detect discontinuities, GENERATE_SERIES for the set-based approach, and discussion of which approach scales better on a table with 500 million rows.
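The LEAD-based approach could look something like the sketch below, run through sqlite3. The table name `session_events` and the single `id` column are assumptions; the GENERATE_SERIES variant is not shown.

```python
import sqlite3

def find_id_gaps(ids):
    """Return (gap_start, gap_end) pairs for each run of missing IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE session_events (id INTEGER)")
    conn.executemany("INSERT INTO session_events VALUES (?)", [(i,) for i in ids])
    query = """
        SELECT id + 1 AS gap_start, next_id - 1 AS gap_end
        FROM (
            -- Pair every ID with the next ID in sorted order.
            SELECT id, LEAD(id) OVER (ORDER BY id) AS next_id
            FROM session_events
        )
        -- A jump larger than 1 means at least one ID is missing in between.
        WHERE next_id - id > 1
        ORDER BY gap_start
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

This version returns one row per gap rather than one row per missing ID, so the output stays small even when a gap spans millions of IDs. That property is relevant to the scaling discussion the question asks for.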
Given three tables (users, orders, products), find the top-selling product category for each user segment. Tests: multi-table JOINs, GROUP BY with aggregate functions, subquery vs. CTE style preference, and handling NULL values when a user has no orders.
Parse a server log file (each line has a timestamp, log level, and message) and return the count of ERROR entries per hour. Tests: file I/O, string parsing, datetime handling, and dictionary-based aggregation. The interviewer follow-up: 'The file is now 10GB. How does your approach change?'
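A minimal sketch of one approach. The log line layout (`YYYY-MM-DD HH:MM:SS LEVEL message`) is an assumed format, since the question does not pin one down, and malformed lines are skipped rather than raising.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable

def count_errors_per_hour(lines: Iterable[str]) -> Dict[str, int]:
    """Count ERROR entries per hour, keyed as 'YYYY-MM-DD HH:00'."""
    counts: Counter = Counter()
    for line in lines:
        # Split into date, time, level, and the rest of the message.
        parts = line.strip().split(" ", 3)
        if len(parts) < 3:
            continue  # blank or malformed line
        date_part, time_part, level = parts[0], parts[1], parts[2]
        if level != "ERROR":
            continue
        try:
            ts = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # timestamp did not match the assumed format
        counts[ts.strftime("%Y-%m-%d %H:00")] += 1
    return dict(counts)
```

The function accepts any iterable of lines, so an open file handle can be passed directly and read one line at a time. Memory use then depends on the number of distinct hours rather than the file size, which is a reasonable starting point for the 10GB follow-up.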
Write a function that takes a nested JSON object (arbitrary depth) and returns a flat dictionary with dot-notation keys. Tests: recursion, type checking, and handling edge cases like empty objects, arrays, and None values. This is a common Meta and Google DE phone screen problem.
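One way this could be written is sketched below. Two choices here are assumptions rather than requirements of the question: array elements are keyed by their index, and empty objects or arrays are kept as values so that no key silently disappears.

```python
def flatten(obj, prefix="", sep="."):
    """Flatten nested dicts and lists into a single dict with dot-notation keys."""
    if isinstance(obj, dict):
        items = list(obj.items())
    elif isinstance(obj, list):
        # Arrays are keyed by position: {"a": [10]} becomes {"a.0": 10}.
        items = [(str(i), v) for i, v in enumerate(obj)]
    else:
        # Scalars, including None, are leaves.
        return {prefix: obj}
    if not items:
        # Keep an empty container as a value, except for an empty top-level input.
        return {prefix: obj} if prefix else {}
    flat = {}
    for key, value in items:
        child = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(value, child, sep))
    return flat
```

For example, `flatten({"a": {"b": 1}})` gives `{"a.b": 1}`. Because the function recurses once per nesting level, extremely deep input can hit Python's recursion limit; an explicit stack is the usual alternative to raise if the interviewer pushes on that.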
Given a list of records with potential duplicates (same ID but different field values), write a function that deduplicates by keeping the most recent record for each ID. Tests: dictionary-based grouping, comparison logic for timestamps, and handling records with identical timestamps.
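A short sketch under stated assumptions: each record is a dict with hypothetical `id` and `updated_at` fields, timestamps are ISO-8601 strings in a single consistent format (so string comparison matches time order), and when timestamps tie, the record that appears later in the input wins.

```python
def dedupe_latest(records):
    """Keep the most recent record per id; on a timestamp tie, the later input record wins."""
    latest = {}
    for record in records:
        current = latest.get(record["id"])
        # >= (not >) is what makes the later record win a tie.
        if current is None or record["updated_at"] >= current["updated_at"]:
            latest[record["id"]] = record
    # Dicts preserve insertion order, so ids come back in first-seen order.
    return list(latest.values())
```

The tie-breaking rule is the part worth stating explicitly in an interview, since "most recent" is ambiguous when two records share a timestamp. Changing `>=` to `>` would keep the first record seen instead.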
Design a star schema for an e-commerce platform. Define the fact table (orders) and at least 4 dimension tables (users, products, dates, geography). Discuss: grain of the fact table, slowly changing dimensions for product price changes, and how you'd support both 'total revenue by category' and 'conversion rate by marketing channel' queries.
Design the data model for a ride-sharing app's event stream. Events include: ride_requested, driver_assigned, ride_started, ride_completed, ride_cancelled, payment_processed. Discuss: how to model the ride lifecycle as a series of events, how to reconstruct the current state of any ride from its event history, and the tradeoffs between event sourcing and a stateful rides table.
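To illustrate the "reconstruct current state from event history" part, here is a minimal sketch that folds events into a per-ride state. The event field names (`ride_id`, `type`, `ts`, `driver_id`) and the status labels are hypothetical; only the six event types come from the question.

```python
# Hypothetical mapping from event type to the ride status it produces.
STATUS_BY_EVENT = {
    "ride_requested": "requested",
    "driver_assigned": "assigned",
    "ride_started": "in_progress",
    "ride_completed": "completed",
    "ride_cancelled": "cancelled",
    "payment_processed": "paid",
}

def current_ride_states(events):
    """Replay events in timestamp order and return the latest state of each ride."""
    states = {}
    # sorted() is stable, so events with equal timestamps keep their input order.
    for event in sorted(events, key=lambda e: e["ts"]):
        state = states.setdefault(
            event["ride_id"], {"status": None, "driver_id": None}
        )
        state["status"] = STATUS_BY_EVENT[event["type"]]
        if event["type"] == "driver_assigned":
            state["driver_id"] = event.get("driver_id")
    return states
```

A stateful rides table would store the result of this fold directly and update it in place, which makes reads cheap but discards the history. Keeping the events and deriving state on demand is the event-sourcing side of the tradeoff the question asks you to discuss.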
Here's exactly what you get at each tier. No marketing spin.
Free: 10 questions across SQL, Python, and Data Modeling.
Full access: 1,000+ questions across SQL, Python, Data Modeling, Pipeline Architecture, and Spark.
Free: Full execution on all 10 questions. Real databases, real output.
Full access: Full execution on all 1,000+ questions.
Free: Full AI grading with line-by-line feedback on all 10 questions.
Full access: Full AI grading on all 1,000+ questions, plus comparative scoring against other users.
Free: Single question practice mode only.
Full access: All 4 modes: coding round simulator, discussion simulator, rapid-fire drill, and full loop simulation.
Free: General data engineering questions.
Full access: Questions tagged by company (Google, Meta, Amazon, Netflix) and filtered to match each company's interview format and difficulty.
Free: Scores saved for your 10 questions.
Full access: Full progress dashboard: accuracy trends, weak topic identification, time-per-question analysis, and readiness score.
Free: Not included. Discussion rounds require the AI interviewer, which is a paid feature.
Full access: Full system design and data modeling discussion mode with AI follow-up questions and multi-dimensional scoring.
Maybe you're a data analyst considering a move to data engineering. Or a backend engineer who's heard that DE pays well. The free tier lets you try real interview problems without committing. If you solve the SQL problems comfortably, you might be closer to ready than you thought. If they're a struggle, you know you need structured prep before scheduling interviews.
Every prep platform claims to have 'realistic' problems. The free tier lets you judge for yourself. Try 3 to 4 problems. Did the execution work? Was the AI feedback useful? Did the difficulty match what you've heard about real interviews? If yes, the paid tier is more of the same. If no, you saved yourself a subscription fee.
Phone screens are typically 1 to 2 problems in 45 to 60 minutes. The free SQL questions are calibrated to phone-screen difficulty. If you can solve 3 of the 5 free SQL problems in under 20 minutes each, with correct results and clean CTEs, you're in solid shape for most phone screens. That might be all the practice you need.
Before recommending a platform to your team, try it yourself. The free tier gives you hands-on experience with the execution environment, grading quality, and problem calibration. You can make an informed recommendation without asking your company to pay for a trial.
10 questions is enough to diagnose your readiness level and decide your next step. Here's the sequence that extracts the most signal.
SQL is the most-tested skill in DE interviews. If you can only practice 5 problems, these are the right 5. They cover window functions, retention analysis, running totals, gap detection, and multi-table joins. These patterns appear in 80%+ of real SQL interview rounds.
The 3 Python questions cover the core patterns DE interviews test: file parsing, JSON transformation, and deduplication. They're at phone-screen difficulty and include the follow-up questions interviewers ask about scaling.
Data modeling rounds are the most under-practiced part of DE interviews. Most candidates focus entirely on coding. The free modeling problems introduce the format: design a schema, defend your choices, discuss tradeoffs.
After solving each problem, read the AI grading carefully. It shows you exactly where your solution could improve. If the feedback is revealing things you didn't know (subtle NULL bugs, inefficient query patterns, readability issues), that's a signal that more practice with this type of feedback would be valuable.
If you aced all 10 problems with clean solutions, you might only need a few more weeks of targeted practice. If you found gaps, the paid tier gives you 1,000+ more problems with the same execution and grading quality, plus timed mock interview modes that build the stamina and pressure-management skills the free tier doesn't train.
Most interview prep platforms gate their best content behind a paywall and show you a marketing page with screenshots. You're supposed to pay $30 to $80/month based on trust that the product is good.
We think that's backwards. DataDriven's best marketing is the product itself. When you write a SQL query and see it execute against real data, when the AI grader points out a NULL handling bug you didn't notice, when the feedback tells you exactly which line of your Python function would fail on a 10GB file, you understand the value immediately. No screenshot or testimonial communicates that as effectively.
The free tier exists because we're confident that once you solve 3 to 4 problems with real execution and AI grading, you'll understand why 1,000+ questions with timed mock interviews and discussion rounds are worth paying for. If you don't, that's fine too. The 10 free questions are genuinely useful for phone screen prep on their own.
We also know that interview prep is often urgent. You just got a phone screen scheduled for next week. You don't have time to research platforms, read reviews, and debate subscription costs. The free tier lets you start practicing in under 60 seconds. If it works for you, upgrade. If it doesn't, you lost nothing.
10 interview questions. Real code execution. AI grading. No credit card. Under 60 seconds to start.