Microsoft Data Engineer Certification

DP-203 retired March 31, 2025. DP-700 is now the Microsoft DE cert, and it's a meaningfully different exam from the one your senior teammates took. This guide covers the retirement, what DP-700 actually tests, the F-SKU pricing trap, how it compares to AWS and Google, and an eight-week plan that maps to exam objectives.

What this guide actually says

DP-203 retired March 31, 2025. DP-700 is the current Microsoft DE cert. It tests Fabric — Lakehouse, Warehouse, Real-Time Intelligence — and Synapse-only knowledge no longer covers it. DP-203 holders should renew through the Fabric path before their window closes. DP-900 is the practical prereq for career switchers. F-SKU pricing is graded aggressively; memorize the F2 / F8 / F64 anchor points.

DP-203 retired: 2025
Current DE cert: DP-700
Exam fee: $165
Time limit: 120 min

The Real-Time Intelligence section is where most candidates lose points. Eventstreams, Eventhouses, and KQL databases are net new material that DP-203 holders never saw. The exam gives you a scenario about high-cardinality clickstream or IoT telemetry and asks you to design the ingest path and the query layer in the same answer.

DP-203 retirement timeline

The dates senior interviewers reference when they read your resume.

Date | Event | What it meant
November 2023 | DP-700 announced | Microsoft signals Fabric as the strategic surface. Most candidates miss it.
Q1 2024 | DP-700 beta | Beta testers report meaningfully different shape: KQL and Eventstreams as new sections.
Q2 2024 | DP-700 GA | Launches at standard $165. DP-203 still active and booked heavily.
Late 2024 | DP-203 Learn path frozen | Updates stop. DP-203 enters maintenance mode.
March 31, 2025 | DP-203 retires | Final exam date. New registrations end. Existing certs valid through their renewal cycle.
2026 onward | DP-700 only | Microsoft DE associate track is single-cert. Renewals route through Fabric.

DP-700 in detail: what's actually on the exam

Six workload areas. The boundaries between them are the source of every scenario question.

Lakehouse

Fabric Lakehouse

Delta tables backed by OneLake. T-SQL endpoint for ad hoc reads, Spark notebooks for transformation. Shortcuts let you mount data from another workspace, ADLS Gen2, or S3 without copying bytes. The exam tests whether you understand a shortcut is a metadata pointer, not replication.

Warehouse

Fabric Warehouse

T-SQL surface that looks like Synapse but isn't. Identity columns, schema-bound views, and cross-database queries behave differently from Dedicated SQL Pools. Storage lives in OneLake as Delta; the engine is the new Polaris-derived MPP, not the legacy Synapse pool.

Pipelines

Data pipelines and Dataflow Gen2

Fabric Data pipelines are the lift-and-shift of Azure Data Factory. Dataflow Gen2 is the Power Query authoring surface for low-code transformation. Pipelines win for orchestration, parametrization, and copy-at-scale. Dataflows win for analyst-authored cleanup. Picking wrong is one of the most common scenario traps.

Real-Time

Real-Time Intelligence

Eventstreams ingest from Event Hubs, Kafka, IoT Hub, HTTP. Eventhouses store data in KQL databases (the Kusto engine behind Application Insights and Azure Data Explorer). KQL is the differentiator most candidates skip. Plan to learn summarize, mv-expand, make-series, and let bindings.
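
If mv-expand and make-series are unfamiliar, it can help to see what they do in plain Python first. The sketch below is only a rough analogue, not KQL: the function names, the sample rows, and the choice to treat the end of the range as exclusive are all assumptions made for illustration, and edge cases such as empty arrays are not modeled.

```python
from datetime import datetime, timedelta

def mv_expand(rows, column):
    """Rough analogue of KQL mv-expand: one output row per array element."""
    expanded = []
    for row in rows:
        for value in row[column]:
            # Copy the other columns and replace the array with one element.
            expanded.append({**row, column: value})
    return expanded

def make_series_count(timestamps, start, end, step):
    """Rough analogue of make-series count() default=0 on a regular time grid."""
    n_bins = -(-(end - start) // step)  # ceiling division on timedeltas
    counts = [0] * n_bins               # empty buckets keep the default of 0
    for ts in timestamps:
        if start <= ts < end:
            counts[(ts - start) // step] += 1
    return counts

# Hypothetical telemetry rows with an array-valued column.
readings = [
    {"device": "a", "tags": ["hot", "loud"]},
    {"device": "b", "tags": ["hot"]},
]
expanded = mv_expand(readings, "tags")

t0 = datetime(2026, 1, 1)
arrivals = [t0 + timedelta(seconds=s) for s in (10, 40, 185)]
series = make_series_count(arrivals, t0, t0 + timedelta(minutes=5), timedelta(minutes=1))
print(len(expanded), series)
```

The point to carry into the exam is the shape of the output: mv-expand multiplies rows, while make-series returns one evenly spaced array per group, with gaps filled rather than dropped.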

Lifecycle

Deployment, security, capacity

Deployment pipelines move artifacts dev → test → prod. Git integration (Azure DevOps or GitHub) backs everything. OneLake security uses workspace roles plus item-level permissions, with row/column security inherited from the underlying Delta table. Capacity is the F-SKU you paid for. Bursting past it throttles requests.

Semantic

Semantic models and Direct Lake

Direct Lake reads Delta files in OneLake without import or DirectQuery. Fast for Power BI, but it inherits VertiPaq limits underneath. The exam asks scenario questions about when to fall back to import or DirectQuery, and how to diagnose Direct Lake fallback in the capacity metrics app.

F-SKU pricing

Capacity is sold by F-SKU. The exam grades sizing scenarios more aggressively than prep books cover. F64 is the magic line: at F64 and above, report viewers no longer need their own Power BI Pro license to consume content. Authors who publish still need Pro.

F-SKU | Hourly | Monthly | Typical use
F2 | $0.36 / hour | $262 / month | Toy. Demos and sandboxes only.
F4 | $0.72 / hour | $525 / month | Solo developer. Tight.
F8 | $1.45 / hour | $1,057 / month | Small team or single product line.
F16 | $2.90 / hour | $2,114 / month | Mid-size analytics team.
F32 | $5.81 / hour | $4,242 / month | Multi-team workspace.
F64 | $11.62 / hour | $8,481 / month | Mid-enterprise. Viewers no longer need their own Power BI Pro license.
F128 | $23.23 / hour | $16,956 / month | Enterprise. Multiple capacities common.
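
Sizing questions are mostly arithmetic, so here is a small sketch of it. The hourly prices are copied from the table above; the 730-hour month is an assumption for estimating, and the function names are made up for illustration. The number in the SKU name is its capacity-unit count (F8 has 8 CUs), and each step up doubles both capacity and price.

```python
HOURS_PER_MONTH = 730  # assumption: average month length used for estimates

# Hourly prices in USD, keyed by capacity units, copied from the table above.
HOURLY_PRICE = {2: 0.36, 4: 0.72, 8: 1.45, 16: 2.90, 32: 5.81, 64: 11.62, 128: 23.23}

def estimate_monthly(capacity_units):
    """Approximate always-on monthly cost for one capacity."""
    return HOURLY_PRICE[capacity_units] * HOURS_PER_MONTH

def smallest_sku(required_cus):
    """Return the smallest listed F-SKU whose capacity units cover the requirement."""
    for cus in sorted(HOURLY_PRICE):
        if cus >= required_cus:
            return cus
    raise ValueError("requirement exceeds the largest SKU in the table")

def viewers_need_pro(capacity_units):
    """Below F64, report viewers still need their own Power BI Pro license."""
    return capacity_units < 64

sku = smallest_sku(20)  # a workload needing about 20 CUs lands on F32
print(f"F{sku}: ~${estimate_monthly(sku):,.0f}/month, viewers need Pro: {viewers_need_pro(sku)}")
```

The estimates land within about half a percent of the monthly column above; the small gaps are presumably because the listed hourly prices are rounded. The useful habit is checking the license boundary alongside the price, since a cheaper SKU below F64 can cost more once per-viewer licenses are counted.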

DP-700 vs the other clouds

How DP-700 maps against the other DE certs you might choose between in 2026.

Exam | Difficulty | Scope | Reach (2026) | Hiring signal | Transferability
Microsoft DP-700 | Medium-Hard | Fabric Lakehouse, Warehouse, Real-Time, Pipelines | Strongest in regulated enterprises (finance, healthcare, gov) | Strong inside Microsoft ecosystem, modest outside | Limited. F-SKU and OneLake don't map to other clouds.
AWS DEA-C01 | Medium | Glue, Redshift, Kinesis, Lake Formation, S3 | Broadest cloud DE market share in 2026 | Strong almost everywhere. Default cert when undecided. | High. Most patterns transfer to Azure and GCP equivalents.
Google Pro Data Engineer | Hard | BigQuery, Dataflow (Beam), Pub/Sub, Bigtable, Vertex AI | Smaller footprint, concentrated at GCP-first shops | Highest per-cert prestige. Hard to fake. | High for streaming, ML pipelines, watermarking concepts.
Databricks DEA | Medium | Delta Lake, Spark, medallion, Unity Catalog | Hot. Lakehouse adoption accelerating across clouds. | Strong for any company running Spark, regardless of cloud | High. Spark + Delta knowledge applies on AWS, Azure, GCP.

What interviewers grade on at Microsoft-stack shops

Real questions from Fabric-shop interview loops in 2026. The patterns recur.

Q01

Walk me through your Fabric workspace organization for a multi-domain analytics platform.

Strong answers separate workspaces by domain (sales, finance, supply chain) and lifecycle (dev, test, prod), then explain how OneLake shortcuts let teams share canonical Gold tables without copying. Mention deployment pipelines, capacity assignment, and the trade-off between one large F-SKU and many smaller capacities. Weak answers describe a single 'analytics' workspace and miss the governance question entirely.

Q02

When would you pick a Fabric Lakehouse vs Warehouse vs Eventhouse?

Lakehouse for raw ingestion, Spark transformation, ML feature engineering. Warehouse for T-SQL workloads where analysts expect SQL Server semantics and stored procedures. Eventhouse for high-cardinality time-series and log-style data where you need sub-second KQL queries over billions of rows. The interviewer is checking whether you understand all three sit on OneLake but use different engines.

Q03

OneLake shortcuts: explain the security implications when shortcutting across workspaces.

A shortcut inherits the source table's row-level security and column masking, but the destination workspace's roles control who can resolve the shortcut. That gap is where leaks happen. Strong answers also mention that shortcuts to external storage (ADLS Gen2, S3) authenticate using the source connection, not the destination, which can route reads through unintended identities.

Q04

Your Eventstream is dropping events under load. Diagnose.

Walk through the layers. First: capacity throttling at the Eventhouse (check capacity metrics app for throttled requests). Second: Eventstream throughput unit limits. Third: source side — are Event Hubs partitions saturated, or is the producer batching badly? Strong answers reference the Eventstream monitoring view and the difference between dropped events and rejected events.

Q05

Design a CDC pipeline from on-prem SQL Server into Fabric.

Most candidates start with Data Factory's self-hosted integration runtime. Better answers consider enabling SQL Server CDC and streaming the changes through Debezium into Event Hubs, then Eventstream into a Bronze Lakehouse Delta table, then a notebook merging into Silver. The interviewer wants you to understand initial snapshot vs ongoing delta, idempotency on retry, and schema drift handling on the source side.
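
"Idempotency on retry" is easier to defend in an interview if you can sketch it. The following is a minimal in-memory model, not Fabric or Delta code: the event fields (key, seq, op, row) are hypothetical, and seq stands in for any monotonically increasing change position such as a log sequence number.

```python
def apply_cdc(silver, events):
    """Apply change events to `silver` (a dict keyed by primary key), idempotently."""
    for event in sorted(events, key=lambda e: e["seq"]):
        current = silver.get(event["key"])
        if current is not None and current["seq"] >= event["seq"]:
            continue  # already applied or stale, so a replay changes nothing
        if event["op"] == "delete":
            # Keep a tombstone so a replayed older upsert cannot resurrect the row.
            silver[event["key"]] = {"seq": event["seq"], "deleted": True, "row": None}
        else:
            silver[event["key"]] = {"seq": event["seq"], "deleted": False, "row": event["row"]}
    return silver

# Hypothetical batch: two changes to order 1, then order 2 created and deleted.
batch = [
    {"key": 1, "seq": 1, "op": "upsert", "row": {"amount": 10}},
    {"key": 1, "seq": 2, "op": "upsert", "row": {"amount": 12}},
    {"key": 2, "seq": 3, "op": "upsert", "row": {"amount": 99}},
    {"key": 2, "seq": 4, "op": "delete", "row": None},
]
once = apply_cdc({}, batch)
twice = apply_cdc(apply_cdc({}, batch), batch)  # simulate a retried delivery
print(once == twice)
```

In a notebook the same guard would typically live in the match condition of a Delta merge. What matters is the comparison on the change position, which is what makes a redelivered batch harmless.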

OneLake security: the part candidates underprepare

Four layers determine who sees what in Fabric. The exam tests the gaps between them. Walk all four in order even when the question is about one.

Workspace

Workspace roles control item access

Admin, Member, Contributor, Viewer. Viewers see lakehouses and warehouses but can't edit. Contributors can create and edit items. Members can additionally share items and add other users at Member level or below. Admins own the workspace, manage its settings, and assign any role.

Item

Item-level permissions grant access one item at a time

You can grant a user read on a single Lakehouse without giving them the rest of the workspace. The exam tests scenarios where a consumer needs read on Gold tables but no access to Bronze. Item permissions answer this: share the Gold item directly instead of assigning a workspace role, because a workspace Viewer already sees every item. Item permissions add access on top of a workspace role; they cannot take away what the role already grants.

Row

Row-level security travels with the Delta table

RLS defined on a Lakehouse Delta table is enforced uniformly: T-SQL queries through the SQL endpoint, Spark notebooks reading the Delta, and shortcuts pointing at the table all see the filter. The cleanest part of the OneLake security story and the most-tested.

Shortcut

Shortcuts inherit source security but need destination authorization

When you shortcut a table from Workspace A into Workspace B, the source's RLS and column masking still apply. But resolving the shortcut requires permissions in the destination workspace. Misconfigure either side and you either over-share data or break a published dashboard.
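
The two-sided rule above can be written down as a toy model. This is only an illustration of the description in this section, not how Fabric evaluates access: the function, the reader set, and the RLS predicate are all hypothetical.

```python
def read_through_shortcut(user, shortcut_readers, source_rows, source_rls):
    """Toy model: the destination decides who may resolve, the source filters rows."""
    if user not in shortcut_readers:
        # Destination side: no permission in Workspace B, nothing resolves.
        raise PermissionError(f"{user} cannot resolve the shortcut")
    # Source side: the row filter defined in Workspace A still applies.
    return [row for row in source_rows if source_rls(user, row)]

# Hypothetical Gold table in Workspace A with a region-based row filter.
gold_sales = [
    {"region": "EU", "amount": 100},
    {"region": "US", "amount": 250},
]
user_region = {"ana": "EU", "raj": "US"}

def region_filter(user, row):
    return user_region.get(user) == row["region"]

visible = read_through_shortcut("ana", {"ana", "raj"}, gold_sales, region_filter)
print(visible)
```

Both failure modes in the paragraph fall out of the model: leave a user out of the reader set and the dashboard breaks, or loosen the source filter and every reader sees more rows.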

Interview soundbites

Short, defensible answers to recurring questions in Microsoft-stack DE interviews. Memorize the structure, not the words.

When asked

Lakehouse vs Warehouse

Lakehouse first if the workload is Spark, ML feature engineering, or open-format storage you need to share with non-Microsoft consumers. Warehouse first if it's T-SQL with stored procedures, the team is SQL developers, and you need full ANSI semantics for joins and window functions.

When asked

Direct Lake fallback

Direct Lake reads Delta files in OneLake without import or DirectQuery. It falls back to DirectQuery when the table exceeds VertiPaq limits, when calculated columns block the lake path, or when the user lacks proper SQL endpoint permissions. Diagnose with the Capacity Metrics app's fallback indicator.

When asked

Capacity throttling

Fabric smooths capacity usage over time: background operations are spread across a 24-hour window, interactive operations across a much shorter one. Workloads can burst above the SKU briefly, then throttle once the smoothed usage exceeds what the capacity can carry. The right answer to a throttle question is rarely 'increase the SKU'; it's 'right-size the workload, schedule heavy jobs off-peak, or move the noisy item to its own capacity.'
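
A back-of-the-envelope model makes the smoothing argument concrete. The sketch below is a deliberate simplification, assuming every job is a background operation whose cost is spread evenly over 24 hours; the function names and job sizes are hypothetical, and real throttling has more stages than a single ratio.

```python
SMOOTHING_WINDOW_S = 24 * 60 * 60  # the 24-hour window, in seconds

def smoothed_load(job_cu_seconds):
    """Average CU draw once each job's cost is spread evenly across the window."""
    return sum(job_cu_seconds) / SMOOTHING_WINDOW_S

def utilization(job_cu_seconds, capacity_cus):
    """Smoothed load as a fraction of the capacity; above 1.0 means overload."""
    return smoothed_load(job_cu_seconds) / capacity_cus

# Hypothetical nightly job: bursts to 64 CUs for one hour on an F8 capacity.
burst_job = 64 * 3600  # CU-seconds consumed by one run

one_run = utilization([burst_job], capacity_cus=8)
four_runs = utilization([burst_job] * 4, capacity_cus=8)
print(f"one run: {one_run:.0%}, four runs: {four_runs:.0%}")
```

One run bursts to eight times the SKU and still fits after smoothing. Four runs in the same day do not, which is why rescheduling or isolating the job is usually a better answer than buying a bigger capacity.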

When asked

Eventstream durability

Eventstreams are not durable storage. They route events. Durability lives at the destination: an Eventhouse, a Lakehouse, or a Custom App with retry logic. Treat Eventstream like a Kafka Streams topology, not like Kafka itself. The exam tests this distinction in scenario form.

When asked

Schema drift

Spark notebooks handle schema drift natively with mergeSchema=true on Delta writes. Pipelines and Dataflow Gen2 don't, and they fail loudly when the source adds a column. Strong answers walk through both paths and recommend Spark notebooks for sources where drift is common.
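
For reference, the notebook side of that answer is a one-line option on the Delta writer. This is a minimal sketch that assumes df is a PySpark DataFrame inside a Fabric notebook; the helper name and the table path are hypothetical.

```python
def append_with_schema_merge(df, path):
    """Append a Spark DataFrame to a Delta table, letting new columns be added."""
    (
        df.write.format("delta")
        .mode("append")
        .option("mergeSchema", "true")  # let the table schema gain the new columns
        .save(path)
    )

# Usage inside a notebook, where `new_batch` is a DataFrame with an extra column:
# append_with_schema_merge(new_batch, "Tables/silver_orders")
```

Schema merging covers added columns. A changed or incompatible column type is a different failure and still needs an explicit decision, which is worth saying out loud when you give this answer.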

When asked

Cross-cloud shortcuts

Fabric can shortcut to ADLS Gen2 and S3, and Microsoft also documents Google Cloud Storage as a shortcut source. The shortcut authenticates through the source connection, so a workspace can read S3 tables without data ever copying into OneLake. That is the answer to 'we have a Snowflake bill on AWS, can we keep the data there?'

Myth vs reality

Myth: Fabric replaces Synapse

Reality: Synapse Dedicated SQL Pools are still GA and supported. Most large customers run both for years during migration. The exam expects you to know the difference and choose between them.

Myth: DP-700 is just DP-203 with Fabric chapters bolted on

Reality: DP-700 is meaningfully different. KQL and Eventstreams are full sections. Synapse-specific topics (dedicated pool distributions, PolyBase) are gone. Studying only DP-203 material leaves roughly 30% of the exam uncovered.

Myth: Fabric is just Power BI dressed up

Reality: at the DE layer, Fabric runs Spark, Delta, and the Kusto engine. None of that is Power BI. The semantic model layer touches Power BI; DP-700 grades the engineering tier independently.

Myth: Microsoft's Azure DE market shrank when DP-203 retired

Reality: it grew. Regulated enterprises (finance, healthcare, government) accelerated Fabric adoption in 2025-26 because the unified billing and OneLake security model fit their compliance posture.

Myth: I can use my AWS knowledge to pass DP-700

Reality: the F-SKU capacity model and Fabric workspace concepts have no AWS analogue. Plan to study the pricing layer and OneLake security from scratch even if you're senior on AWS.

Decision matrix

Use this if you have ten seconds. The answer is one row away.

Situation | Pick | Reason
Targeting Microsoft enterprise shops | DP-700 | Direct match for the platform they actually run.
Already in Synapse, want renewal path | DP-700 | DP-203 retired. DP-700 is the official Synapse-to-Fabric bridge.
Power BI developer pivoting to DE | DP-900 then DP-700 | DP-900 builds the data vocabulary DP-700 assumes.
Multi-cloud consultant | AWS DEA-C01 first, DP-700 second | AWS for breadth, DP-700 for Microsoft engagements.
Pure DE at AWS-only shop | Skip DP-700, take AWS DEA-C01 | DP-700 won't move the needle if your stack never touches Azure.
ML engineer needing MS credentials | AI-102 instead | AI-102 (Azure AI Engineer) maps to your work; DP-700 won't.
Career switcher, no cloud background | DP-900 first, then DP-700 | Skipping fundamentals usually means failing DP-700 once and re-paying.

Eight-week DP-700 study plan

Calibrated to the actual exam blueprint, not the marketing copy.

  1. Weeks 1-2: Microsoft Learn DP-700 path

    Complete the official DP-700 path on Microsoft Learn (free). Spin up a Fabric trial tenant and confirm you can create a workspace, Lakehouse, and Warehouse. The trial includes 60 days of full F-SKU capacity — enough for the entire study cycle. Don't skip the labs; scenario questions assume hands-on familiarity with the workspace UI.

  2. Week 3: Build a medallion pipeline using OneLake shortcuts

    Ingest a real public dataset (NYC taxi, GitHub archive) into a Bronze Lakehouse, transform with a Spark notebook into Silver, aggregate into a Gold Warehouse table. Use a OneLake shortcut to expose Gold to a second workspace as if a downstream team consumed it. Highest-ROI hands-on exercise for the exam.

  3. Week 4: Eventstream and Eventhouse hands-on

    Stand up an Eventstream from a sample source (built-in Bicycles or Stocks generator). Land in an Eventhouse. Write KQL using summarize, bin, mv-expand, and make-series. Most candidates underprepare here and lose 15-20% of their score.

  4. Week 5: Practice exams (MeasureUp, Whizlabs)

    Take a full timed practice exam. Score honestly. For every wrong question, write a paragraph explaining why the right answer is right and why the others are wrong. This 'why-not' analysis catches conceptual gaps a passing flash-card score hides.

  5. Week 6: Cost and capacity scenarios

    DP-700 grades F-SKU sizing more aggressively than candidates expect. Memorize F2 / F8 / F64 anchor prices. Understand capacity smoothing, bursting, throttling. Practice questions where the answer is 'pick a smaller F-SKU and turn on autoscale' versus 'pick larger and dedicate it.'

  6. Week 7: Deployment pipelines and Git

    Configure Git integration on a workspace. Make a change, push it, deploy to a staging workspace via a deployment pipeline. Understand selective deployment, deployment rules, parameter overrides. The exam includes at least one scenario about promoting a parameterized pipeline through dev / test / prod.

  7. Week 8: Final timed practice exam

    One sitting, exam-day conditions. No notes. No pausing. Above 80%: schedule the real exam within 7 days. Below 70%: don't book yet. Re-do the weakest section's hands-on labs and re-test before scheduling.

Common pitfalls on first attempts

Patterns that appear in failed first attempts. Avoid these and your second sitting becomes your only sitting.

Studying DP-203 material and assuming it covers DP-700

About 30% of DP-700 is net new. Old material gives false confidence. Throw out the 2023 prep books and start from the current Microsoft Learn DP-700 path.

Skipping KQL because 'I'm not a streaming engineer'

KQL is on the exam regardless of role. Real-Time Intelligence is ~20% of the score. You won't pass without basic KQL fluency: summarize, bin, where, project, mv-expand.
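
If the operator names mean nothing yet, this Python sketch shows what a where, summarize, and bin pipeline computes. It is an analogue for intuition only: the KQL line in the comment uses a hypothetical table and columns, project (column selection) is not modeled, and bin is approximated by flooring timestamps to multiples of the width since a fixed epoch.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def bin_floor(ts, width):
    """Floor a timestamp to the start of its bucket, like bin(ts, width)."""
    return EPOCH + ((ts - EPOCH) // width) * width

def count_by_bin(rows, width, predicate):
    """Analogue of: Requests | where <predicate> | summarize count() by bin(ts, 5m)"""
    return Counter(bin_floor(r["ts"], width) for r in rows if predicate(r))

# Hypothetical request log.
requests = [
    {"ts": datetime(2026, 1, 1, 12, 1), "status": 500},
    {"ts": datetime(2026, 1, 1, 12, 4), "status": 200},
    {"ts": datetime(2026, 1, 1, 12, 7), "status": 500},
]
errors = count_by_bin(requests, timedelta(minutes=5), lambda r: r["status"] == 500)
for bucket, count in sorted(errors.items()):
    print(bucket, count)
```

Reading a KQL query on the exam is mostly this: filter first, then group by a time bucket, then aggregate. Note that buckets with no matching rows simply do not appear in the result.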

Memorizing F-SKU prices but not the F64 license boundary

F64 is the line where report viewers no longer need their own Power BI Pro licenses. Below F64, you still pay per viewer. Candidates who memorize prices but miss this fail the licensing scenario.

Treating Direct Lake like DirectQuery

Direct Lake is a different mode with different limits. Calculated columns, calculated tables, and certain DAX patterns force fallback. The exam grades whether you know when Direct Lake works and when fallback is required.

Ignoring deployment pipelines and Git integration

Several scenario questions assume you've promoted artifacts dev → test → prod. If you've only worked in a single workspace, you'll guess wrong on deployment-rule and parameter-override questions. Practice the flow once end to end before the exam.

Frequently asked questions

Is DP-203 still worth taking in 2026?
No. DP-203 retired March 31, 2025. New candidates can't register. Existing certifications remain valid for the rest of their renewal window. DP-700 is the current Microsoft DE associate cert.

What happens to my DP-203 cert if I already have it?
Valid until your renewal date. Microsoft offers a transitional renewal path through Fabric content on Microsoft Learn. Plan to renew through DP-700 study material since your existing renewal is the last one tied to DP-203 lineage.

How hard is DP-700 compared to DP-203?
Comparable difficulty, different scope. DP-700 trades deep Synapse Dedicated Pool material for KQL, Eventstreams, OneLake security, and F-SKU capacity sizing. Candidates strong on classic Azure data services tend to underestimate KQL and Real-Time Intelligence.

Do I need DP-900 before DP-700?
Not formally, but yes in practice if you're new to data concepts. DP-900 is two weeks of study and gives you the vocabulary (relational vs non-relational, batch vs stream, fact vs dimension) DP-700 assumes. Switchers who skip DP-900 fail DP-700 more often.

How much does Microsoft Fabric cost in production?
Sold by capacity. F2 starts at $0.36/hour (~$262/month). F64 is $11.62/hour (~$8,481/month) and is the smallest SKU at which report viewers no longer need their own Power BI Pro license. Most mid-market customers land on F32 or F64. Enterprise tenants run multiple capacities to isolate workloads.

Does DP-700 expire?
Yes. Microsoft role-based associate certs (including DP-700) require renewal once a year. The renewal is a free open-book online assessment, takes ~30 minutes, and covers features added since your last renewal.

Is KQL really on the exam, even for non-streaming roles?
Yes. Real-Time Intelligence is ~20% of DP-700, and KQL questions appear within it. You don't need expert level, but you should be able to read a KQL query, identify what summarize and bin do, and pick the right time-series operator for a scenario.