Data Engineering Interview Practice Problems
1467+ data engineering interview practice problems with real code execution. Write SQL queries and Python solutions, and design schemas against live databases with instant grading. Filter by domain, difficulty, seniority level, and target company.
Domains: Python (387), SQL (903), Data Modeling (56), Pipeline Architecture (121). Difficulty breakdown: easy (534), medium (677), hard (256).
Python Practice Problems (387)
- The Dominant Signal - easy - Hottest items in the transaction log. Ties included.
- The Original Keeper - easy - Clean up duplicate events without losing the timeline.
- The Forward Fill - easy - Patch the gaps in a noisy sensor stream.
- The Word Mismatch - easy - Some text does not match.
- The Social Graph - easy - Everyone knows someone.
- The Sequel Spotter - easy - Spot the sequels hiding in the catalog.
- The Numbered Chair - easy - A standing list. Position n holds one entry.
- The Character Encoder - easy - Squeeze a string down to its tightest form.
- The One-Way Street - easy - Monotonic time-series. Direction only.
- The IP Validator - easy - Real and fake, mixed together.
- The Log Pulse - easy - Some lines repeat themselves.
- The One-of-Each - easy - Strip the repeats, keep the originals.
- The Config Blender - easy - Config collision. The surviving values after a merge.
- Flatten the Feed - easy - Nested lists, all the way down.
- Activity Time Ledger - easy - Matching activities. One runtime.
- Batch With Metadata - easy - The list gets chopped.
- Caesar Shift Check - easy - The key turns. Does it open?
- Character Occurrence Map - easy - Character frequency as a map.
- Coalesce Fields - easy - Nulls are hiding. Fill them in.
- Column Max - easy - One value rules the column.
- Column Range - easy - From minimum to maximum. What is the spread?
- Column Sum - easy - Add up the column. Every value counts.
- Dominant Element - easy - Majority element. Appears more than half the time.
- Even Filter - easy - Only the even ones survive.
- Explode List - easy - One row holds many values. Unpack it.
- Extract Domain - easy - The domain is buried in the string.
- Flatten the Nest - easy - Mixed nesting. One flat list out.
- Greeting Formatter Class - easy - First impressions are formatted carefully.
- Normalize Name - easy - Names are messy. Standardize them.
- No Shortcuts - easy - The peak value. Built-ins off the table.
- Null Counter - easy - How many holes in the data?
- Ordered Character Check - easy - Check if all As appear before all Bs.
- Progress Milestones - easy - Progress at every 10% increment. Keep the receipts.
- Quality Gate - easy - Not everything passes inspection.
- Quantile Calculator - easy - Mark the boundary value at a given point.
- Record Filter - easy - Some records belong. Others do not.
- Reverse Field - easy - Flip it. See what happens.
- Run Length Encoding - easy - AAABBB becomes 3A3B. Compress it.
- Sanitize Field - easy - Dirty input. Clean output.
- Schema Checker - easy - The schema says one thing. The data says another.
- Sequential Word Pairs - easy - Everything has a neighbor.
- Single Element Among Pairs - easy - One element has no partner.
- Sort Descending - easy - Biggest first. No exceptions.
- The Account Manager - easy - Deposits, withdrawals, and the risk of going negative.
- The Additive Chain - easy - Each value is the sum of the two before it - no calls to itself allowed.
- The Address Surgeon - easy - One string hides a street, a city, a state, and a zip.
- The Alphabet Score - easy - Every letter has a secret numeric value - what's your total?
- The Alphabet Sorter - easy - Filing cabinet logic: everything goes in its proper drawer.
- The Balanced Sum - easy - Some numbers have a rare quality that mathematicians revere.
- The Bit Counter - easy - How many lights are on in the binary representation?
- The Bit Ladder - easy - Count the ones all the way up.
- The Bitwise Judge - easy - No division, no modulo - just a single bit tells you everything.
- The Bouncer - easy - Every door has a guest list.
- The Bronze Medalist - easy - Not first, not last - somewhere in the middle of the podium.
- The Bug Spotter - easy - It compiles. The answer is still wrong.
- The Calendar Sort - easy - Time has its own opinion about order.
- The Carousel - easy - Keep moving, same ride.
- The Character Map - easy - Character-level frequency. As a dictionary.
- The Cipher Wheel - easy - Every letter has an alias - you just need the right codebook.
- The Clock Angle - easy - Two hands. One gap. One number.
- The Code Expander - easy - Compressed messages need a decoder to come alive.
- The Column Transformer - easy - Each column gets its function.
- The Column Zipper - easy - Headers on top, values below, dict in the middle.
- The Complement Hunt - easy - Every number is looking for its other half.
- The Crowd Favorite Eatery - easy - One restaurant clearly won the most hearts.
- The Crowd Pleaser - easy - One value shows up more than all others combined.
- The Crowd Splitter - easy - The middle holds even with a dominant outlier.
- The Decomposer - easy - Every composite thing can be broken down to its simplest parts.
- The Deep Dictionary - easy - One key goes further than the rest.
- The Deep Dive - easy - A specific position in the unsorted pile.
- The Deep Selector - easy - Tell it what you want. It knows where to look.
- The Deep Unpacker - easy - Boxes inside boxes. Eventually you reach the bottom.
- The Depth of Field - easy - Some containers hold containers that hold containers.
- The Diagonal Accountant - easy - Two diagonals cross in the center of every square.
- The Duplicate Spotter - easy - Some values appear more than once - report only those.
- The Even Checkpoint - easy - Is this number in the even club? Prove it the fast way.
- The Expander - easy - What goes in small comes out big.
- The Field Counter - easy - Some fields speak louder than others.
- The First Encounter - easy - Every character has a story - but only if you remember where it started.
- The First Stranger - easy - In a crowd, the unique ones stand out first.
- The Forbidden Ceiling - easy - Round up. But not the obvious way.
- The Gap Filler - easy - Fill the Nones with the last real value.
- The Gate Keeper - easy - Not all openings have a closing.
- The Grid Pivot - easy - A different angle reveals a completely different picture.
- The Halftime Score - easy - Middle value of a dataset. No built-in shortcuts.
- The Hash Stamper - easy - One input, one irreversible output - the foundation of every secret.
- The Indivisibles - easy - Numbers that yield only to themselves.
- The Integer Sieve - easy - Not everything in this list belongs here.
- The Last Instance - easy - When duplicates appear, only the last one counts.
- The Last Seen Map - easy - For each character, where did it appear last?
- The Lazy Squares - easy - A sequence that never fully reveals itself.
- The Letter Census - easy - Every crowd has its share of talkers and quiet ones.
- The Letter Frequency Map - easy - Count every character in the string and report the results.
- The Letter Ledger - easy - Every character has a count to answer for.
- The Letter Tally - easy - Each character in the string has a count to answer for.
- The Line Cutter - easy - Did everyone with an A-pass get through before the B-crowd arrived?
- The Line Splitter - easy - Comma-separated truths, one at a time.
- The Log Decoder - easy - Every line holds a secret.
- The Lone Character - easy - It appeared exactly one time. That made it special.
- The Lone Traveler - easy - One character stands apart from the crowd.
- The Manual Sorter - easy - No shortcuts, no built-ins, just work.
- The Matching Manifest - easy - Two warehouses, one shipment - only load what's in both.
- The Merge - easy - Chaos in. Order out.
- The Messy Pipeline - easy - The upstream API has no idea what a schema is.
- The Minutes Tracker - easy - Some activities eat more time than others.
- The Mirror Flip - easy - Sometimes the fastest fix is to swap everything.
- The Mirror Image - easy - Flip the tape backwards - start from the end.
- The Mirror Test - easy - Check if a string reads the same forwards and backwards.
- The Mirror Words - easy - Each word looks back at itself.
- The Missing Number - easy - Something is missing from the sequence.
- The Molecule Report - easy - Four letters. A lot of math hidden in the sequence.
- The Multiplication Trail - easy - Each step multiplies the whole journey.
- The Never-Ending Sequence - easy - Sequence that keeps going. Follow it.
- The Number Screen - easy - Some numbers make the cut. Most do not.
- The Odd Digits - easy - Hidden inside a mess of characters are a few odd numbers.
- The Odd Extractor - easy - Not all numbers from a string are welcome here.
- The Odd Filter - easy - Strip out everything that does not belong to the odd club.
- The One-Timers - easy - Values that never repeated.
- The Op Dispatcher - easy - Name the operation, apply it everywhere.
- The Order Enforcer - easy - Some rules say every A must come before every B.
- The Overlap Finder - easy - Two guest lists - who made it onto both?
- The Pair Counter - easy - How many pairs can be formed from the crowd?
- The Paired Doors - easy - Every open bracket has a partner - but not every partner shows up.
- The Pascal Row - easy - Each number is the sum of two numbers above it.
- The Password Builder - easy - Random characters, fixed rules.
- The Password Forge - easy - Eight random characters - how many combinations exist?
- The Peak Finder - easy - Largest number in the list. Max() is not an option.
- The Pipeline Filter - easy - In the door as one thing, out the door as another.
- The Price Bander - easy - Different prices, different treatment.
- The Progress Parade - easy - Just tell them how far along you are.
- The Ranked Dict - easy - Values deserve order too.
- The Repeat Offenders - easy - Repetition is a clue.
- The Roman Converter - easy - Roman numerals decoded.
- The Runner-Up - easy - Not the winner. The one just behind it.
- The Running Total - easy - Each position holds the sum of everything before it.
- The Safe Caster - easy - Type conversion is easy, until it is not.
- The Score Sorter - easy - Points on the board, sorted by who earned the most.
- The Scramble Check - easy - Same letters, different order - are these two strings secret twins?
- The Second Summit - easy - Not the top of the mountain - just below it.
- The Secret Twins - easy - Same letters, different disguises.
- The Self-Portrait Number - easy - Some numbers describe themselves perfectly.
- The Shadow Cleaner - easy - Remove the repeats. No shortcuts.
- The Silent Locator - easy - Every lookup should cost you less than the one before it.
- The Single Bit - easy - One particular pattern hides in plain sight.
- The Solo Act - easy - One-and-done values only.
- The Spread - easy - Data spread around a center. The range matters.
- The Squeeze - easy - aaabbb gets old fast. Shrink it.
- The Step Counter - easy - You can hop one step or two - how many ways to reach the top?
- The Streak Breaker - easy - It has a problem with repetition.
- The Style Guide - easy - Not every word deserves the same treatment.
- The Syntax Sentinel - easy - Brackets opened and closed. The nesting might be off.
- The Tail End - easy - Push, pop, peek. The basics that break people.
- The Tail Trimmer - easy - Remove the k-th item from the back without counting forward first.
- The Tally Counter - easy - How many times does a single guest show up to the party?
- The Top Reviewer - easy - One restaurant receives the most feedback - which one?
- The Traffic Director - easy - Spread the load evenly - nobody should be doing all the work.
- The Tree Measurer - easy - How deep does the rabbit hole go?
- The Trip Grouper - easy - Where did everyone go, and for how long?
- The Type Sorter - easy - A mixed list is hiding its numbers - extract them.
- The Value Sorter - easy - The order was always negotiable.
- The Version Parade - easy - 1.0 before 2.0. Don't let the dots confuse you.
- The Vowel Hunt - easy - Just the vowels. All of them.
- The Word Census - easy - Who said what - and how many times?
- The Word Counter - easy - How many times does each word show up in a file?
- The Word Flipper - easy - The sentence stays, the words surrender.
- The Word Inventory - easy - Every word, twice over.
- The Word Map - easy - Input text. Output: word frequency.
- Tokenize - easy - Split it apart. Keep the pieces.
- Transform Column - easy - Same data, new shape.
- Type Caster - easy - Wrong type. Fix it.
- Unique Values - easy - Duplicates are noise. Remove them.
- Value Count - easy - How many of each? Count them.
- Word Counter - easy - Words in, counts out.
- Zip to Record - easy - Two lists become one record.
- The High Mark - easy - Scan the list. Report the max.
- The Event Bucketer - easy - Logs slotted into buckets.
- The List Merger - easy - No shortcuts.
- The Dictionary Inverter - easy - Flip the dict. Group what used to be values.
- The String Shrinker - easy - Compress the string. Shorter wins.
- The Bracket Validator - easy - Brackets opened and closed. The nesting might be off.
- The Trade Signal - easy - Buy low, sell high. Identify the ideal moment.
- The Stream Averager - easy - The answer moves with the data.
- The Generous Ones - medium - The generous ones are obvious.
- The Payload Flattener - medium - Turn a deeply nested API response into a flat row.
- The Resume Sifter - medium - Pull what's useful. Skip what you know.
- The Title Ladder - medium - Job titles and the salary tier they belong to.
- The Repeat Review - medium - The echo came back.
- The File Size Profiler - medium - File types and their disk footprint. One type dominates.
- The Schedule Cleaner - medium - Overlapping sessions. One clean line.
- Stock Range Finder - medium - Prices move. One stretch had the widest gap.
- The Status Board - medium - Make sense of a pile of raw Nginx access logs.
- The Budget Allocator - medium - Split the money. Some wore two hats.
- The Trade Log Aggregator - medium - Every trade left a footprint.
- The Timezone Trap - medium - Trip data and timezones. They're not the same thing.
- The Host Ranker - medium - Some hosts have more to offer.
- The Email Ranker - medium - Some inboxes see more action.
- The Consecutive Streak - medium - Login streaks. No gaps allowed.
- The Schema Differ - medium - Schema from yesterday vs today. Something changed.
- The Throttle Ceiling - medium - Too many requests in too short a timeframe. Throttle it.
- The Event Aggregator - medium - Bucket a firehose of events into tidy time windows.
- The Record Reconciler - medium - Two versions of the same truth.
- The Dependency Resolver - medium - Everything depends on everything.
- Batch Partitioner - medium - One pile becomes many. Split wisely.
- Batch Records - medium - Too many at once. Break them into groups.
- Char Profile - medium - Every character in the string tells a story.
- Cumulative Sum - medium - The total grows with every row.
- Deep Flatten - medium - Nested deep. Flatten everything.
- Deep Get - medium - Nested deep. Reach in and grab it.
- Detect Cycle in Sequence - medium - Follow the chain long enough and it might loop back.
- Detect Outliers - medium - Most values are normal. Some are suspicious.
- Diagonal Extract - medium - Not every value sits in a row or column.
- Dice Roll Scoring - medium - The pattern rewards the patient.
- Dictionary Key Intersection - medium - Two dictionaries. What do they share?
- Execution Timer Wrapper - medium - Function wrapped with a timer. Duration captured on exit.
- Extract Leaf Values - medium - The tree has leaves. Pluck them.
- Find Indices - medium - It is in there somewhere. Where exactly?
- Find Mode - medium - One value appears more than the rest.
- Full Outer Zip - medium - Two sides. No value left behind.
- Group By - medium - Same key, different rows. Bring them together.
- Lag Column - medium - What came before this row?
- Left Join - medium - Keep the left side. Match what you can.
- Max Length Token - medium - The longest token wins.
- Merge Counters - medium - Two tallies. Combine them.
- Merge Overlapping Time Ranges - medium - Intervals piling up. Clean the timeline.
- Palindrome Hunt - medium - It reads the same both ways. Go further.
- Parse Log Line - medium - One line. A dozen fields hidden inside.
- Permissions Manager - medium - Manage user permissions with config updates.
- Portfolio Profit Calculator - medium - Portfolio gain from purchase history and current prices.
- Precision and Recall - medium - Precision and recall. Both matter.
- Prefix Based Word Replacement - medium - Every word trimmed to its root.
- Rank Metrics - medium - Not all numbers are equal. Rank them.
- Rename Keys - medium - Old names out. New names in.
- Rotate Buffer - medium - The buffer is full. Rotate it.
- Row Aggregates - medium - Each row holds its own summary.
- Running Distinct Count - medium - New values keep appearing. Track the count.
- Subarray Signal - medium - One stretch carries the strongest signal.
- The Balanced Inspector - medium - Every branch should carry the same weight.
- The Bipartite Test - medium - Can this crowd be split into two perfectly separated groups?
- The Bit Reverser - medium - Sometimes the answer is literally backwards.
- The Blind Multiplier - medium - Compute the result of everything around you - without seeing yourself.
- The Bonus Round - medium - Consecutive matching dice rolls trigger a special scoring rule.
- The Build Order - medium - Some tasks must wait for others to finish first.
- The Chain Builder - medium - Links connect in sequence - build the chain from scratch.
- The Chain Transform - medium - One small step at a time can cover a great distance.
- The Change Tracker - medium - Before and after snapshots. The delta is in there.
- The Character Clans - medium - Words sharing the same letters belong to the same clan.
- The Chunked Reader - medium - Too big for memory. Read in pieces.
- The Clock Examiner - medium - Two hands on a clock - how wide is the gap?
- The Coin Vault - medium - Exact change only - and you want to use as few coins as possible.
- The Column Shuffle - medium - Rows in, columns out. Number them.
- The Counting Machine - medium - It knows where it stopped last time.
- The Custom Iterator - medium - Some sequences follow their own rules.
- The Cycle Detector - medium - Follow the chain long enough and you might end up where you started.
- The Date Sorter - medium - Jumbled calendar. Sort it first.
- The Deep Config - medium - Nested config, dot-notation output.
- The Dict Comparator - medium - Two dictionaries. Subtle differences.
- The Double-Ended Gateway - medium - Some queues let you skip the line from both ends.
- The Elevator Trace - medium - Nested floors. One path through.
- The Encoded Signal - medium - The encoding is hiding multipliers. Decode it.
- The Event Broadcaster - medium - Subscribers show up, listen, and sometimes leave.
- The Event Window - medium - A five-minute window is all that matters.
- The Eviction Policy - medium - Fixed capacity. Oldest unused entry gets evicted.
- The Exception Handler - medium - Good code handles failure as gracefully as success.
- The Face That Breaks the Bank - medium - Roll enough dice and one number always runs away with it.
- The Family Reunion - medium - Two cousins share a common ancestor somewhere above.
- The Fast Climber - medium - Some routes up the mountain are faster than others.
- The First Class Function - medium - Functions travel as values - prove you can pass one around.
- The Flat Mapper - medium - Nested values. One flat stream out.
- The Forbidden Sorter - medium - Put the letters in order without the obvious tool.
- The Forgetful Machine - medium - It remembers everything, until it does not.
- The Gap Reporter - medium - The missing IDs in the log - somebody has to notice.
- The Genre Filter - medium - Three tables, two conditions, one actor's total.
- The Half-Life Search - medium - Every guess cuts the problem in half.
- The High Rollers - medium - Not every gambler bets the same - some wager far more than others.
- The Horizon Scanner - medium - For each position, what is coming up ahead?
- The Hostile Takeover - medium - One dict eats another.
- The Hourly Bucket - medium - Timestamps belong somewhere.
- The Intervals - medium - Timestamps in buckets.
- The Inverted Triangle - medium - A pattern of stars narrows toward the bottom.
- The Island Counter - medium - Surrounded by water, connected by land - how many separate landmasses?
- The Lazy Unpacker - medium - Instead of loading it all at once, yield it one piece at a time.
- The Letter Kin - medium - Words that share the same letters belong together.
- The Letter Mapper - medium - A consistent substitution, or not.
- The Level Inspector - medium - Each floor of the tower tells a different story.
- The Level Summer - medium - Add up each level of the tree.
- The Link Shrinker - medium - Long addresses have aliases - you give them out, you keep the map.
- The Load Balancer - medium - Distribute incoming requests evenly across available servers.
- The Map Reducer - medium - Map it. Reduce it. One answer.
- The Market Streak - medium - Some stocks run longer than you think.
- The Market Timer - medium - One buy, one sell - when do you make the most?
- The Merge Champion - medium - Many sorted rivers flowing into one.
- The Min Tracker - medium - The stack remembers the best it ever saw.
- The Month-by-Month Snapshot - medium - Every salesperson has a story. The months just tell it sideways.
- The Mountain Peak - medium - The sequence has a summit.
- The Multiplier Rush - medium - Negatives cancel negatives - but only if you keep both in view.
- The Narrow Lens - medium - A narrow timeframe. Everything inside matters.
- The Number Miner - medium - JSON strings are hiding numeric secrets - dig them out.
- The Number Narrator - medium - Every number has a story in words.
- The Online Elite - medium - The top performers are hiding in the data.
- The OOP Pillars Exam - medium - Four principles, one class hierarchy - show you know all of them.
- The Order Inspector - medium - A binary tree has rules - is this one actually following them?
- The Page Turner - medium - Nobody loads everything at once.
- The Pandas Pivot - medium - Rows become columns. Columns become power.
- The Parentheses Factory - medium - Building balanced brackets is an art form.
- The Pay Ladder - medium - Climb the ladder the hard way. No shortcuts allowed.
- The Perfect Match - medium - Two numbers walk into an interview...
- The Placement Fixer - medium - Each value belongs in exactly one spot.
- The Postfix Processor - medium - Math without parentheses - the operators come after the numbers.
- The Precision Hunt - medium - Some answers need no decimal point.
- The Priority Queue - medium - When two things tie, something has to break the deadlock.
- The Progress Meter - medium - Report progress at every tenth of the way through.
- The Quarter Turn - medium - One rotation changes everything.
- The Queue Disguise - medium - A queue in sheep's clothing.
- The Repeat Visitor - medium - Loyal customers come back sooner than expected.
- The Response Aggregator - medium - Multiple result pages. One clean summary.
- The Rolling Peak - medium - The sweetest stretch in the sequence.
- The Rolling Window - medium - Smooth things out, one step at a time.
- The Rotated Array - medium - Someone shuffled it. Now locate what you came for.
- The Schema Diff - medium - Two versions of the same config - what changed between them?
- The Scoreboard Race - medium - Simulate rounds until someone hits the target.
- The Shifting Standard - medium - A benchmark in motion.
- The Short Address - medium - Turn a big number into a compact alphanumeric code.
- The Shortest Route - medium - Fewer hops is always better.
- The Silver Screen Summit - medium - Box office totals decide who makes the top of the marquee.
- The Slow Leak - medium - Nested iterators. One flat stream.
- The Sneaky Twins - medium - They look different but they are the same inside.
- The Spin Doctor - medium - Ninety degrees, but which way?
- The Spiral Harvest - medium - The snail reads the grid in its own special order.
- The Staircase Problem - medium - One step or two, the choices add up.
- The Subarray Tally - medium - How many hidden windows hit the target?
- The Table Thief - medium - Somewhere in that query, tables are hiding.
- The Tag Analyst - medium - Two sets of labels, one analysis.
- The Tail Finder - medium - Navigate to the end of a linked list using recursion.
- The Timing Decorator - medium - Wrap any function to capture how long it takes.
- The Top Words - medium - In every document, some words dominate the conversation.
- The Trip Aggregator - medium - Travel records hold patterns waiting to be surfaced.
- The Triplet Hunt - medium - Every path that works gets a seat at the table.
- The Velvet Rope - medium - Some users get in. Others wait outside until the window resets.
- The Version Ranker - medium - Software versions follow their own ordering rules.
- The Vocabulary Test - medium - Can you spell out the whole sentence using only the words you know?
- The Waiting Game - medium - Patience has a price - and a count.
- The Water Gauge - medium - Elevation bars trap water between peaks - count the volume.
- The Window Cleaner - medium - Keep it fresh, keep it unique.
- The Word Families - medium - Different spellings, same letters - they belong together.
- The Yahtzee Scorer - medium - Dice scoring. Multiple categories evaluated.
- The Zero Propagator - medium - One zero can change the whole picture.
- The Zigzag Encoder - medium - The message snakes its way across the rails.
- Threshold Filter - medium - Above the line or below it.
- Top N Keys - medium - Most of them do not matter. The few that do stand out.
- Transpose Table - medium - Rows become columns. Columns become rows.
- Triangle Validator - medium - Not every triangle is a triangle.
- Unflatten Keys - medium - Dots in the key names. Rebuild the structure.
- Validate Email - medium - Looks like an email. But is it?
- Distribute Values Into Container Types - medium - Round-robin the values. Keep rotating.
- The Nearest Value Mapper - medium - Close enough counts. Ties go low.
- The Target Hunt - medium - Pairs that hit a target. Every one of them.
- The Event Overlap Detector - medium - Overlapping events. The calendar knows.
- The Consecutive Sequence Finder - medium - Numbers that flow without interruption.
- The File Tree Builder - medium - Flat paths. Build the nested tree.
- The Impersonator - medium - You only have stacks. Make a queue anyway.
- The Category Ranker - medium - Categories have standing. Rows get theirs.
- The Throttle Wall - hard - Stop the abusers. Let the rest through.
- The Change Data Capture - hard - Inserts, updates, deletes: all present.
- The Stream Joiner - hard - Events don't wait for each other. This does.
- The Anomaly Detector - hard - Spot the outliers before they page someone.
- The Schema Migrator - hard - Old schema in, new schema out.
- The DAG Executor - hard - Wire up a mini pipeline and watch it run.
- Common Prefix - hard - They all start the same way. How far?
- Data Quality Report - hard - The data is not as clean as it looks.
- Group Average - hard - Same group, different values. What is typical?
- Merge Intervals - hard - Overlapping ranges. Merge them.
- Pivot Records - hard - Long format is easy. Wide format is useful.
- The Dynamic Container - hard - Build your own resizable list with no help from the standard library.
- The Frequency Eviction - hard - When storage is tight, something has to go.
- The Infection Spread - hard - It starts with one, and then it spreads.
- The Lazy Stream - hard - Yield values one at a time from a potentially infinite source.
- The Median Keeper - hard - The middle value keeps moving as new data arrives.
- The Onion Layer - hard - Peel from the outside in - one ring at a time.
- The Trapped Pool - hard - What collects in the valleys after the rain?
- The Triple Alliance - hard - Three numbers, one target.
- The Water Collector - hard - Two walls, one sky, and a very important question.
- The Yahtzee Engine - hard - Five dice. Six faces. Score it.
- Stream-Process a Large CSV - hard - Too big to load. Read what you can.
- The Meeting Room Allocator - hard - Meetings overlap on the calendar. Rooms are limited.
- The Middle Ground - hard - The middle value keeps moving.
- The Hierarchy Builder - hard - Parent-child pairs, flat. Build the family tree.
- The Output Peak - hard - One stretch outpaced all the others.
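As a rough illustration of the kind of solution these problems call for, the Run Length Encoding entry above (AAABBB becomes 3A3B) could be sketched as follows. The function name `run_length_encode` is hypothetical, and the exact signature and edge-case rules the grader expects are not stated in this listing, so treat this as a minimal sketch under those assumptions.

```python
from itertools import groupby


def run_length_encode(text: str) -> str:
    """Collapse each run of identical characters into '<count><char>'.

    Hypothetical helper: the real problem may use a different name,
    signature, or output format.
    """
    # groupby yields one (char, run) pair per maximal run of equal characters.
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


if __name__ == "__main__":
    print(run_length_encode("AAABBB"))  # 3A3B
```

This version always emits a count, even for runs of length one, and returns an empty string for empty input; a real problem statement may specify otherwise.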
SQL Practice Problems (903)
- Unmatched Credit Complaints - easy - Credits were promised. Not everyone got theirs.
- The Duplicate Detection Sprint - easy - Same email, different rows. Spot the repeats.
- Weekend Warriors - easy - Weekdays vs. weekends. When does the action really happen?
- The Dormant Accounts - easy - They are still paying. They stopped showing up.
- 30-Day Page View Counts - easy - Thirty days of engagement. Quick snapshot.
- Above Average Interactions - easy - The average user is boring. Who is above?
- Above Category Average - easy - The category average is one thing. These beat it.
- Active API Tokens - easy - Tokens that have actually been used.
- Active Campaigns - easy - Which campaigns are earning their keep?
- Active Token Owners in 2026 - easy - Active token owners this year.
- Active User Revenue for April - easy - Total revenue from active users in a single month.
- Active Users With April Transactions - easy - Active accounts that also opened their wallets. How many?
- Activity Histogram - easy - How many users did X things? Build the distribution.
- Ad Revenue 2026 - easy - Annual ad revenue. On the books.
- Alert Hotspots by Service and Severity - easy - Some services and severities light up more than others.
- All Infra Regions - easy - The infrastructure spans the globe. Map it.
- Annual Cloud Spend - easy - One year of cloud bills. The total.
- Annual Cloud Spend Summary - easy - A year of cloud bills. Add it all up.
- Annual Pipeline Failures - easy - How many pipelines broke this year?
- April and May Active Users - easy - Spring cleaning for the user base. Who was actually around?
- Auth Endpoints - easy - Not all endpoints are visible to everyone.
- Authors With Successful Deploys - easy - Who deployed successfully?
- Auth Service Health Checks - easy - One service. Full audit trail.
- Average Brand Campaign Revenue - easy - A quick benchmark on brand campaigns.
- Average Build Duration by Repo - easy - Some repos build fast. Others don't.
- Average DQ Fail Rate - easy - Average failure rate, table by table.
- Average GPU Node CPU Usage - easy - GPU nodes burning CPU. How much?
- Average Headcount by Department - easy - Compensation benchmarks, department by department.
- Average High-Range Accuracy - easy - The top-scoring models. What's their average?
- Average Latency by Health Status - easy - Healthy versus degraded. The latency gap is real.
- Average Latency by Status - easy - Each status code has its own latency story.
- Average Node CPU by Region - easy - Average infrastructure node CPU usage, broken down by region.
- Average Node Utilization - easy - CPU and memory, region by region.
- Average Rating by Category - easy - Category ratings. Some shine, some don't.
- Average Response Time by Hour - easy - Hour by hour. When does latency spike?
- Average Search Endpoint Latency - easy - One endpoint. Average speed.
- Average Search Results Per User - easy - How many results per searcher?
- Average Session Duration by Device - easy - Session length, device by device.
- Bargain Bin - easy - Floor prices. Right before the vendor call.
- Best-Selling Reps Each Month - easy - In every category, a few sellers rise to the top.
- Big Spenders - easy - The whale list.
- Budget Flag - easy - Join tables and label rows as over or under budget.
- Budget-Friendly Products - easy - Affordable does not mean invisible.
- Campaign Match Rate - easy - Campaign reach. Measured.
- Campaign Revenue Totals - easy - Every campaign has a price tag. Total them up.
- Cart Sizes - easy - Power buyers. Big carts.
- Category Census - easy - Which aisles are worth restocking?
- Category Sales Summary - easy - Category by category. How did they do?
- Category-Specific Product Volume - easy - Sum transactions for a specific payment type.
- CDN Image Request Paths - easy - CDN image traffic. Every path.
- CDN-Related DNS Lookups - easy - DNS lookups tied to the CDN.
- Character Position in Endpoint - easy - URL patterns, character by character.
- Chat Activity - easy - Which channels are ghost towns?
- Cheapest Cost Per Region - easy - Lowest spend per region.
- Cheapest Transaction per User - easy - Everyone has a smallest purchase.
- Clean Cache CDN Edges - easy - Cached, clean, error-free edges.
- Clean Latency Cast - easy - The latency column is a string. It should not be.
- Clicked Ad Impressions - easy - They saw the ad. They clicked.
- Cloud Bill - easy - Which cost buckets are bleeding money?
- Cloud Cost by Team - easy - Spend by team. Who's burning most?
- Common Age Buckets - easy - Duplicate records hiding in the users table.
- Completed Priority-1 Jobs - easy - Priority one. Completed.
- Compute Nodes in Key Regions - easy - Compute nodes across the key regions.
- Content by Specific Users - easy - Two creators. What did they publish?
- Content Duration Snapshot - easy - A popularity snapshot by duration.
- Content Mix - easy - One content format to bet the quarter on.
- Content Published in 2026 - easy - Published back then. Still relevant?
- Content Sorted by Duration - easy - The catalog, sorted by length.
- Content Type Distribution - easy - How many of each content type?
- Content Types by Creator - easy - One creator. What did they make?
- Content Viewer Penetration - easy - What share of the user base has viewed at least one piece of content?
- Cost Efficiency Ratio - easy - Dollars in, value out. What's the ratio?
- Count Distinct Services - easy - How wide is the service mesh?
- Count Nodes in Region - easy - One region. How many nodes?
- CPU Utilization Summary - easy - The CPUs are working. How hard?
- Customer Full Name Concat - easy - First name, last name. Combine them.
- Daily and Weekly Active Users - easy - One metric by day, one by week. Same users, different lenses.
- Daily Cross-Platform Users - easy - Mobile and web. Same day, same users?
- Daily Deployment Count - easy - Deploys per day.
- Department Spend Difference - easy - The compensation gap between departments.
- Department Spend Gap - easy - The gap between Engineering's and Marketing's biggest single purchase.
- Deploy Cadence - easy - Which environments ship the most?
- Deploy Count by Service - easy - Some services deploy constantly. Others barely at all.
- Deployed Models by Framework - easy - Which frameworks are actually in production?
- Deployment Duration by Status - easy - Fast deploys versus slow ones. By outcome.
- Deployments Without Alerts - easy - Deployed without a single alert. Suspicious or impressive?
- Deprecated Model Count - easy - How many models are past their expiration date?
- Device Mix - easy - The device breakdown before the redesign.
- Device Types With Chrome Users - easy - Power users and their devices.
- Disabled Feature Flags - easy - Disabled flags. Still worth auditing.
- Distinct Blog Referrers - easy - Where did the traffic really come from? No repeats.
- Distinct Product Categories - easy - A quick category inventory.
- Early 2026 Data Pipelines - easy - Early-year data pipelines.
- Employees Per Department - easy - Headcount, department by department.
- Error Severity Buckets - easy - Errors sorted by how much they hurt.
- Errors With Service Health - easy - Error data, enriched with health context.
- Even-ID February Signups - easy - A very specific slice of a very specific cohort.
- Even-ID June Signups - easy - Odd IDs, even IDs. The filter is precise.
- Event Count on Key Days - easy - Key days. Key event volumes.
- Events by Month Across Years - easy - Month by month, year by year. The pattern emerges.
- Event Types Spanning Multiple Months - easy - Some events span seasons.
- Expensive AWS Services - easy - Some AWS services quietly drain the budget.
- Extreme Headcount Departments - easy - The pay extremes tell a story.
- Failed Payment Deployments - easy - Payment deploys that went wrong.
- Features With Missing Values - easy - Missing data in the features.
- February 2024 Signups - easy - One signup window. One cohort. Who joined the club?
- Filter By Domain - easy - Select rows matching a text suffix pattern.
- Filtered User Roster - easy - A clean roster for the all-hands.
- Find Deploy Authors - easy - Same person. Many different spellings.
- First Build per Repository - easy - Every repo had a first build.
- First Migration Record - easy - The very first migration. Where it all began.
- First Run Row Count - easy - Every job's first run. How many rows?
- Flag Check - easy - Which flags are actually live?
- Full Customer Order List - easy - Every customer. Every order. The full picture.
- Gateway Connection Timeouts - easy - Timeouts at the gateway.
- Health Check Distribution - easy - Pass, fail, degraded. The distribution.
- Health Checks per Service - easy - Some services get checked constantly.
- Heavy Searchers in August - easy - August's power searchers.
- High and Critical Alerts in 2026 - easy - High and critical alerts from that year.
- Higher Performing Variant - easy - Control versus treatment. One wins.
- Higher Than Supervisor - easy - When the student outscores the teacher.
- Highest Cost Per Team - easy - Peak cost, team by team.
- Highest Latency Endpoints - easy - The slowest endpoints. Everyone notices.
- High-Output Creators - easy - High engagement creators.
- High Price Products - easy - Everything above 100.
- High-Rated In-Stock Percentage - easy - Highly rated and in stock. A rare combo.
- High-Spend 2025 Campaigns - easy - Big-budget campaigns from last year.
- High-Traffic Endpoints in February - easy - When traffic spikes, some endpoints get buried. How many crossed the line?
- High Volume Batch Jobs - easy - Batch jobs that processed millions.
- Holiday Promo Campaign Click Year - easy - One year, the holiday campaign exploded.
- Holiday Sale Campaign Revenue - easy - The holiday sale campaign. How did it do?
- Idle Team Members - easy - Sprint started. Some people never got assigned.
- Inactive Unverified Users - easy - Signed up. Never verified. Never came back.
- Initial Count - easy - Support is looking for naming patterns that predict ticket volume.
- In-Stock Product Count - easy - How many products are actually available?
- Japan Revenue for April - easy - Last month's numbers for one region.
- Joined Employee Details - easy - Combine two related tables with a join.
- Largest Group - easy - One group towers above the rest.
- Last Five Batch Jobs - easy - The last five. A quick tail check.
- Last Migration Record - easy - The most recent migration. Is it the last?
- Last Server Activity - easy - Each server's last heartbeat.
- Latency vs Regional Average - easy - Each service versus its region's average.
- Latest Metric Values - easy - Stale records hiding in the metrics.
- Latest Session Per User - easy - Everyone has a most recent session.
- Latest Version Per Service - easy - The latest version deployed. Each service.
- Log Entries by Level - easy - Info, warn, error, fatal. The breakdown matters.
- Log Volume by Day of Week - easy - Some days are noisier than others.
- Longest Active Membership Streak - easy - The longest unbroken streak.
- Longest Deploy With Full Identifier - easy - The longest deployment. Full ID.
- Long Searches Containing 'er' - easy - Long queries with 'er'. A pattern?
- Low-Byte CDN Responses - easy - Tiny responses from the edge.
- Low-Engagement User Count - easy - How many users are barely engaged?
- Lowest Average Price Category - easy - The cheapest category. Not necessarily the worst.
- Low Latency API Calls - easy - Fast endpoints. Confirmed fast.
- Low Severity DQ Checks - easy - Low severity checks. All of them.
- Low Throughput Pipelines - easy - Pipelines barely moving data.
- Low Uptime Services - easy - Underperforming services.
- Max Value Per Location - easy - Every location has a peak.
- Memory-Heavy Pods - easy - Memory-hungry workloads.
- Merge-Triggered Builds 2026 - easy - How many builds came from merges this year?
- Message Length - easy - Verbose commits. Risky changes?
- Messages Containing Keyword - easy - Flagged terms in the messages.
- Messages From Specific Users - easy - Specific users. What did they say?
- Metric Range Per Group - easy - The spread within each group.
- Metric Value Quarter Complement - easy - Two metrics that accidentally match.
- Metric Volatility Gap - easy - Stable metrics are boring. Volatile ones need attention.
- Mid-CPU Nodes - easy - Not the heaviest. Not the lightest. The middle.
- Mid-Range Cost Allocations - easy - Not the cheapest. Not the priciest. The middle.
- Mid-Tier Batch Jobs - easy - Not the biggest, not the smallest. The overlooked middle.
- Missing Email for Non-Active Users - easy - No email on file. No recent activity. Something smells off.
- Mobile Event Counts - easy - Mobile engagement, device by device.
- Monthly Active Users per Endpoint - easy - One endpoint, many users. Which ones showed up?
- Monthly Category Totals - easy - Sum amounts by category and month.
- Monthly Deployment Count - easy - Deploys by month.
- Monthly Signup Counts - easy - Signups, month by month.
- Monthly Transaction Counts - easy - Every month tells a spending story, user by user.
- Monthly Unique Users per Campaign - easy - Monthly reach, campaign by campaign.
- Morning Warning Logs - easy - Warnings before noon.
- Most Common Export Job Status - easy - The most common job status.
- Most Recent Token Usage - easy - Each user's latest token activity.
- Multi-Column User Sort - easy - Sorted by name. Then by something else.
- Multi-OS Users - easy - iOS today, Android tomorrow.
- Multi-Provider Cost Lookup - easy - AWS, GCP, Azure. Side by side.
- Multi-Variant Experiments - easy - One user, multiple experiments.
- Never-Ordered Products - easy - In the catalog. Never purchased.
- Nodes in Target Regions - easy - The target regions need attention.
- Node Summary Per Region - easy - Every region has a node story.
- No Gaps - easy - Zero blanks. A clean contact list.
- Non-Bot Acknowledged Alerts - easy - Human-acknowledged alerts only.
- Non-Draft Content - easy - Everything except drafts.
- Notifications Opened on Date - easy - One day, many pings. How many actually got opened?
- Nth Highest Salary - easy - Not the highest. Not the second. The third.
- Nth Largest Value - easy - Select the row with a specific rank position.
- NULL Keys in Joins - easy - Rows that vanish during the join.
- Oldest and Newest User Sessions - easy - The extremes of the user base.
- One-Star Product Review Count - easy - One-star reviews. How many?
- Overall Average API Latency - easy - The overall average. Across everything.
- Peak Activity by Device - easy - Activity windows, device by device.
- Peak Ad Revenue Moment - easy - The single peak earning moment.
- Peak Metric Per Department - easy - Peak metrics for the quarterly deck.
- Peak Non-Converting Month - easy - Everyone showed up. Nobody bought anything.
- Peak Satisfaction - easy - Which departments are winning on satisfaction?
- Peak Spending Month - easy - One month, the bill was unforgettable.
- Pending Batch Jobs - easy - Stuck jobs. Still pending.
- Pipeline Run History - easy - The lineage trail.
- Pipeline Throughput Ratio - easy - Compute current-to-initial value ratio per period.
- Platform Check - easy - OS and device combos. Which sessions last longest?
- Platform Team Feature Flags - easy - The platform team owns a lot of flags.
- Platform Team Mobile Flags - easy - Mobile flags under platform ownership.
- Pod Distribution by Restart Count - easy - Low-restart pods. Reliable or idle?
- Popular Categories - easy - Merchandising only cares about categories big enough to negotiate shelf space.
- Price Check - easy - Priced to sell or priced to sit?
- Production Deployment Count - easy - How many production deploys?
- Production Deploys From April Onward - easy - After the cutoff, how many times did prod get a push?
- Product Name Letter Replace - easy - A quick text transform on product names.
- Product Name Prefix - easy - Just the first three characters. That is all.
- Product Page Sale Searches - easy - They searched from the product page.
- Product Revenue Ranking - easy - Rank them by revenue. See who leads.
- Products Without Sales - easy - Listed but never sold.
- Profitable Categories by Price - easy - The most profitable categories.
- Promo Campaign Cost per Acquisition - easy - The campaign ran. What did each customer cost?
- Provider Cost Change H1 - easy - Cost swings in the first half of the year.
- Purchase Log - easy - Names on receipts, not just IDs.
- Q2 Search Volume - easy - Q2 search volume. The numbers.
- Quarterly Deployment Count - easy - Deploys per quarter.
- Recurring Error Types - easy - The same errors, recurring.
- Regional Profits - easy - P&L by region. Before the board meeting.
- Regions With 5+ Nodes - easy - Regions with five or more nodes.
- Retargeting Campaign Impressions - easy - Retargeting impressions. All of them.
- Revenue by Product - easy - Which products carry the revenue line?
- Revenue for Specific Users - easy - Alice and Bob. Total spend.
- Reviews Per Reviewer - easy - The workload split across reviewers.
- Running Node Pairs - easy - Two servers, same region, both alive.
- Satisfaction Score by Region - easy - Satisfaction scores. Missing region data.
- Search Endpoint Status Distribution - easy - Status codes on the health endpoint.
- Searches by Users With Email - easy - One user's search behavior.
- Search Terms Starting With G - easy - Queries starting with 'g'.
- Second Highest Salary - easy - Silver medal. Almost the top, but not quite.
- Second Highest Value - easy - Almost the top. Not quite.
- Service Alert Frequency - easy - How often does each service trigger alerts?
- Services With Most Error Occurrences - easy - The noisiest services.
- Service User Growth Rate - easy - User growth, service by service.
- Session-Fit Content - easy - Content that fits the session length.
- Session Logins Dec 13 to 19 - easy - Logins during one specific window.
- Session Pulse - easy - Engagement is slipping. Who is phoning it in?
- Sessions Per Device Type - easy - Sessions, device by device.
- Signups by Age Bucket Since April - easy - Recent signups by age.
- Signups Jan to Jul 2026 - easy - Signups from January through July.
- Sirens and Smoke - easy - Stale alerts. Still ringing.
- Slow Batch Jobs - easy - Promised by noon. Delivered at midnight.
- Slow Failures - easy - SRE is hunting for the endpoints that fail slowly enough to burn timeouts.
- Slow Production Deploys - easy - Production deploys that took way too long.
- Sort Tokens by Scope Character - easy - Token scopes, sorted for compliance.
- Status Report - easy - Where are orders getting stuck?
- Stock Status - easy - Human-readable availability labels.
- Storage Node Lookup - easy - The storage nodes hold the critical data.
- Successful Deploy Endpoint Calls - easy - Successful deploys only. No failures allowed.
- Successful Pipeline Runs - easy - Which pipelines completed successfully?
- Successful Production Deploys - easy - Successful production deploys with duration.
- Suspected Bot Sessions - easy - Five seconds or less. Probably a bot.
- Targeted Ad Campaigns - easy - High-value impressions. Targeted precisely.
- The Ad Ledger - easy - Annual ad revenue. On the record.
- The Campaign Trail - easy - Impressions are vanity. Conversions are sanity.
- The February Cohort - easy - One signup window. One cohort. Who joined the club?
- The First Half - easy - New arrivals during one specific window.
- The Legacy Hunt - easy - Old data. Still matters.
- The Merge Counter - easy - How many builds came from merges?
- The Publishing Audit - easy - Published years ago. Still generating views?
- The Token Census - easy - How many tokens are out there?
- Third Largest Batch Job - easy - Bronze medal in the batch job rankings.
- Threads Excluding User - easy - Every thread they're not part of.
- Three Lowest Distinct Cloud Cost Amounts - easy - The three cheapest bills on record.
- Tiered Transaction Summary - easy - Compute multiple date windowed aggregates in a single query.
- Timeout Status Records - easy - Unknown status in the health records.
- Timeout Warning Logs - easy - Timeout warnings. The postmortem trail.
- Titles Ending With S - easy - Naming conventions. Specifically the plurals.
- Top 100 Batch Jobs Total Output - easy - The hundred biggest jobs. Combined output.
- Top 10 Batch Jobs - easy - The ten biggest batch jobs.
- Top 10 Model Accuracies - easy - Top ten model performance.
- Top 10 Slowest Endpoints - easy - The ten endpoints nobody wants to call.
- Top 5 Slowest DNS Lookups - easy - Five DNS lookups that took too long.
- Top Ad Campaigns by Revenue - easy - Every campaign has a bottom line. Stack them up.
- Top API Token Scopes - easy - The highest-value token scopes.
- Top Average By Region - easy - Region by region, who pulls the best average?
- Top Deployed Model - easy - The best-performing model in production.
- Top Device by Sessions - easy - One device type generates the most sessions.
- Top Duration Content Items - easy - The content that held the number-one spot.
- Top Five - easy - The five priciest items for the luxury section.
- Top Metric Values - easy - The five highest numbers. No duplicates.
- Top Mobile OS by Session Duration - easy - Which mobile OS keeps users longest?
- Top Performing Models - easy - The models that actually perform.
- Top Product Categories by Sales - easy - The highest-grossing categories.
- Top-Ranked Wines by Variety - easy - The best bottles. Ranked by variety.
- Top Recent Sellers - easy - Fresh data, top sellers. The recent leaderboard.
- Top Selling Items - easy - Revenue crowns the winners. Who sold the most?
- Top Shelf - easy - Buyers need to know ceiling prices before negotiating with vendors.
- Top Spenders Dense Rank - easy - Spending speaks. Let the leaderboard do the talking.
- Total Compute Cloud Cost - easy - Total compute spend. The number.
- Total Cost by Category - easy - Total spend per category.
- Total Engineering Cost Allocation - easy - Engineering's total allocated budget.
- Total Rows by Pipeline Status - easy - Row counts alongside pipeline aggregates.
- Total User Spend - easy - Each customer's total. Summarized.
- Transaction Overview - easy - The executive snapshot. Users, products, revenue.
- Transaction Source Features - easy - One pipeline reviewed them. What did it see?
- Transactions With Product Names - easy - A simple select that progresses to a join.
- Trim Endpoints Right - easy - Trailing whitespace. Clean it up.
- Trim Search Terms Left - easy - Leading whitespace. Clean it up.
- Tutorial Content Count - easy - How much of the catalog is tutorials?
- Unique Hosts by Node Type - easy - How many unique hosts per node type?
- Unique Searchers - easy - How many users actually searched?
- Unique Searchers Count - easy - Unique searchers. The count.
- Unique Stream Topics - easy - A clean inventory of streaming topics.
- Unmatched Categories - easy - Categories with nothing on the shelf. Empty aisles.
- Unreviewed Models - easy - Models that have never been evaluated.
- Unused Read Tokens - easy - Active tokens that nobody uses.
- US-East KV Store Entries - easy - KV store inventory. us-east-1.
- User Age Ranking - easy - Age brackets, stacked from top to bottom.
- User Engagement Totals - easy - Per-user engagement. The totals.
- User Event Type Count - easy - How many flavors of activity does each user have?
- User Roster - easy - Which account states are bleeding users?
- User Session Roster - easy - Every user paired with their sessions, even users who never logged in.
- User Sessions on Specific Days - easy - One user. Specific days. What happened?
- Users Per Device Type - easy - Users per device. The split.
- Users Who Clicked Ads - easy - Ad clickers and their account details.
- Users With Purchase Events - easy - At least one purchase. That changes everything.
- Verify Commit ID Uniqueness - easy - Duplicate commit IDs. Are there any?
- View Count Per Page - easy - Every page has visitors. Some just have more.
- Views by Specific Users - easy - Retrieve all content views for a set of flagged user accounts.
- Weekly Transaction Volume - easy - Weekly volume. The pulse.
- Welcome Wagon - easy - How many signed up this year?
- Whale Watch - easy - The accounts driving the top line.
- Yearly Output - easy - Publishing velocity for the board deck.
- 2026 Signup Count - easy - This year's signup count.
- Join Type Row Counts - easy - Same tables, different handshakes, wildly different results.
- Ad Clickers - easy - Who clicked? What did they spend?
- Clean Averages - easy - Merchandising only cares about the categories customers actually rate.
- Log Priority - easy - Which servers are on fire before coffee?
- Unique Visitors - easy - Which months actually had an audience?
- High-Value Electronics - easy - The five priciest electronics.
- Regional Status - easy - The full regional breakdown.
- Click Revenue - easy - Which campaigns are earning their keep?
- Email Census - easy - The reachability split.
- Log Levels - easy - Severity breakdown with response times.
- Above Average - easy - Products beating the catalog average.
- The Revenue Cliff - medium - Revenue was climbing. Then it wasn't. Spot the drop.
- The Phantom Readers - medium - They read everything. They bought nothing.
- The Day-7 Retention Cohort - medium - Day one was promising. Day seven tells the truth.
- The Latest Transaction Per Product - medium - Every product has a last sale. When was it?
- 10 Lowest Uptime Services - medium - Ten services at the bottom of the reliability chart.
- 2FA Confirmation Rate - medium - Two-factor sent. How many confirmed?
- 7-Check Rolling Average - medium - Seven entries hold the trend.
- 7-Day Token Retention - medium - Premium tokens, day by day.
- 80th Percentile API Latency - medium - The 80th percentile tells the real story.
- 90th Pctl Model Accuracy Gap - medium - Most models are fine. The bottom 10% are not.
- Above-Average Cloud Spend - medium - Some services quietly burn more than the rest.
- Above Average Product Prices - medium - Some products cost more than they should.
- Active Duo - medium - Shoppers who also browse. The overlap is the insight.
- Active Searchers - medium - They typed a query. That means something.
- Active Tokens on Target Date - medium - One specific day. Which tokens were still alive?
- Active User Open Rate - medium - What share of push notifications was opened by active users?
- Active Users by Session Count - medium - Signed up is one thing. Showing up is another.
- Active vs Regional User Count - medium - Active users versus total users. The gap is telling.
- Ad Revenue by Age Bucket - medium - Ad dollars, sliced by age bucket.
- After Hours API Calls - medium - The office is dark. The API is not.
- Alert Count by Severity Tier - medium - Alerts by severity. The breakdown matters.
- Alert Response Breakdown - medium - An on-call postmortem asks which services are bleeding alerts nobody acknowledges.
- Alert Severity Pivot by Service - medium - When services cry wolf, the severity matrix tells who's serious.
- All Known Endpoints - medium - Two tables. One truth. Every endpoint accounted for.
- API Calls With and Without Errors - medium - Some calls succeed. Some do not. Break it down.
- API Calls With Matching Status - medium - Same status, same pattern. Coincidence?
- API Token Churn Rate - medium - Tokens come and go. What's the turnover?
- API Traffic by CDN Edge - medium - CDN paths carrying API traffic. Which edges?
- App Stability by Region - medium - Some regions crash more than others.
- Attributable Impression Rate - medium - What share of ad impressions can be traced to a real user account?
- Auction Lot Summary - medium - The hammer falls. Who bid the most?
- Auth Endpoint Callers - medium - Identify users who have called authentication API endpoints.
- Authors Deploying to Dev and Production - medium - Dev, staging, production. Who has touched all three?
- Average Accuracy by Framework - medium - Not all frameworks deliver the same accuracy.
- Average API Latency by Year - medium - Latency year over year. Is it getting better?
- Average Compensation by Department and Status - medium - Average compensation. Department by department.
- Average Fulfillment Lag - medium - Ordered, then... waiting.
- Average Initial Call Latency - medium - First contact latency. The benchmark.
- Average Results for Python Searches - medium - Python searches. What's the click-through?
- Average Review Comments by Author - medium - Some authors get more feedback than others.
- Average Session Duration - medium - How long do users actually stay?
- Average Spending by Account Status - medium - Average per-user lifetime spending, segmented by account status.
- Average Update Call Latency - medium - Follow-up calls. How fast?
- Average Watch Time by Format - medium - Which content format keeps viewers watching the longest?
- Avg Alerts by Severity - medium - Alert patterns by severity.
- Avg Daily Active Users per Endpoint - medium - Daily engagement, endpoint by endpoint. The averages reveal all.
- Avg Session Duration by Creator - medium - Some creators keep users longer.
- Batch Job Performance Tiers - medium - Every batch job gets a grade.
- Best Accuracy to Training Time Ratio - medium - Fast to train. Accurate too. Which model?
- Best Day for Ad Revenue - medium - One day of the month outperforms the rest.
- Biggest Deployment Decline - medium - One team's deploy count cratered. Which one?
- Binary Flag Indicators - medium - On or off. Every flag at a glance.
- Bottom Endpoints by POST Volume - medium - The quietest POST endpoints.
- Builds per Author per Branch - medium - Who triggered what, and where?
- Build Success Rate by Trigger - medium - Which triggers produce green builds?
- Build Success vs Failure by Repo - medium - Green versus red, repo by repo.
- Busiest Pipeline Month - medium - One month, more pipeline runs than any other.
- Busiest Route by Passenger Volume - medium - The busiest route by volume.
- Busy Authors - medium - Some developers spread their commits everywhere.
- Campaign Click-Through Rates - medium - Clicks per impression. Campaign by campaign.
- Campaign Cost Effectiveness - medium - Money in, conversions out. What is the ratio?
- Campaign Revenue by Click Channel - medium - Which ad format drives the most revenue?
- Campaigns With Most Clicks - medium - The campaigns getting all the clicks.
- Categories With Mixed Price Tiers - medium - Users who cross content types.
- CDN Traffic by Day and Hour - medium - CDN traffic, hour by hour.
- Cheapest High-Rated Product - medium - Cheap and highly rated. A rare combination.
- Classify Services by Name - medium - The name tells you what it is. Mostly.
- Clicked Holiday Impressions - medium - Holiday ads. Who actually clicked?
- Click vs Non-Click Rates - medium - Some searches lead to clicks. Most do not.
- Cloud Cost Stats by Provider - medium - Three providers. Three very different bills.
- Cloud Cost Trend Analysis - medium - Cost trends across billing periods.
- Combined Cloud Spend by Region and Service - medium - Region by region. Service by service. Where does the money go?
- Commit Royalty - medium - In a sea of commits, only a few wear the crown.
- Completion Rate - medium - Not every region closes orders cleanly. The percentages tell the story.
- Consistent High-Quantity Revenue - medium - Big orders, consistent revenue. A rare combination.
- Content Recommendation Engine - medium - Pages they haven't discovered yet.
- Content Session Counts - medium - Session metrics, content item by item.
- Cost Density Extremes - medium - Some regions pack more cost per node than others.
- Cost Share Within Category - medium - Each entry's slice of the category total.
- Creators With Top-Rated Content - medium - Top-rated content. Who made it?
- Cross-Region Customers - medium - Orders crossing borders.
- Cross-Variant User Pairs - medium - Same experiment. Different variants. Who overlaps?
- Cumulative Monthly Revenue Avg - medium - Revenue, cumulating month by month.
- Currently Active Feature Flags - medium - Which flags are live right now?
- Customers Without Orders - medium - Customers who have never ordered.
- Custom Message Type Counts - medium - Not all messages are created equal.
- Daily Error Count Change - medium - Errors, trending up or down?
- Daily Error Resolution Ratio - medium - Reported versus removed. The daily ratio.
- Daily Metric Percentage Change - medium - Yesterday versus today. What moved?
- Daily Session and User Counts - medium - Sessions and users, day by day.
- Daily Spam Impression Rate - medium - How much of the ad feed is spam?
- Daily Top Endpoints - medium - Three winners each day.
- Data Repo Fix Commits - medium - How many commits start with 'fix'?
- Days With More Edited Than Unedited Messages - medium - Some days, more messages get edited than sent.
- Deduplicate and Keep Latest - medium - Duplicates everywhere. Only the freshest survives.
- Deduplicated Sales Volume by Category - medium - Clean the noise, then see what each aisle really earned.
- Department Cost by Status - medium - Headcount and compensation. The dashboard view.
- Department Running Totals - medium - Compute cumulative metric values within each department using window operations.
- Deploy Author Performance Score - medium - Not all deployers are equally reliable.
- Deployment Failure Impact - medium - When deploys fail, how bad is the blast radius?
- Deployments per Environment - medium - Dev, staging, prod. Where do most deploys land?
- Deploy Reliability Scores - medium - A reliability scoreboard for deploy teams.
- Devices Per Age Bucket - medium - Device diversity among the younger users.
- Device Type Serving Most Users - medium - One device type serves more users than the rest.
- Disabled Flag Ratio - medium - Feature flags that went dark. What percentage fell silent?
- Distinct Chat Conversations - medium - How many unique conversations?
- DQ Fail Rate by Table - medium - Pass rates, table by table.
- DQ Score Spread - medium - The spread in data quality scores.
- Duplicate DQ Check Records - medium - Passed QA twice. That's the problem.
- Duplicated User Event Messages - medium - Duplicated messages from the alerts topic.
- Duplicate Training Runs - medium - Same model, trained twice.
- Early Commit Velocity by Author - medium - How productive was each author during the first year of a repo's CI pipeline?
- Early User Activation - medium - Activated early. A good sign.
- Efficient Pipeline Throughput - medium - Throughput per pipeline. The benchmark.
- Endpoint Latency Spread - medium - Latency spread across endpoints.
- Endpoint Performance Report - medium - Every endpoint has a speed and a reliability story.
- Endpoint With Most GET-Only Users - medium - Read-only users have a favorite endpoint.
- Engagement by Content Type - medium - Some content types get all the attention.
- Engagement Gap - medium - Zero transactions is still a data point. Count everyone.
- Error Hall of Fame - medium - The year's worst error categories.
- Error Rate by Region - medium - Error rate per day and region via conditional aggregation.
- Exclusive Users per Device Type - medium - Loyal to one platform only.
- Experiment Conversion Pivot - medium - Variant A or Variant B? The conversion numbers tell the story.
- Extract Deploy Versions - medium - The version number is buried in the log.
- Extreme API Token Usage - medium - Outlier tokens. Suspiciously busy.
- Extreme Category Totals - medium - The highest and the lowest. Both are interesting.
- Extremely Late Resolutions - medium - Twenty minutes past the SLA. Still unresolved.
- Failed Constraint Checks Count - medium - Constraints failed. How many?
- Failure Rate - medium - Build failures happen. Which repos break the most?
- Fastest CI Build Date - medium - The fastest build ever. When did it happen?
- Fastest Completion Per Day - medium - Every day has a speed champion.
- Fastest Regions by Latency - medium - The fastest regions. Benchmarked.
- Feature Flag Adoption - medium - How widely adopted are the flags?
- Feature Quality by Source - medium - Quality varies by source.
- Feature Vote Winner - medium - Users voted with their clicks. Who won?
- Find the Fifth Largest Cost - medium - Not the biggest. Not the smallest. The fifth.
- First and Last Peak Accuracy Dates - medium - Peak accuracy. When it first hit and when it last did.
- First and Last Timeout Per Service - medium - First timeout. Last timeout. Each service.
- First Deploy Attribution - medium - The first deploy per service.
- First Half of Page Views - medium - Half the data. The first half.
- First Time Learners Per Day - medium - Brand new users, day by day.
- First Touch Attribution - medium - The first interaction matters most. Or does it?
- Frequent Message Senders - medium - Someone is sending too many messages.
- Friday Sessions for Shared Experiments - medium - Friday vibes only. Same experiment, different users.
- Fulfillable Order Percentage - medium - What percentage of orders can be fulfilled?
- Ghost Products - medium - Listed but never sold. The shelves collect dust.
- Heavy Ad Exposure - medium - Saturated with ads. Is it too much?
- Heavy Hitters - medium - Some repos never sleep.
- Heavy Namespaces - medium - Kubernetes has favorites. Some namespaces carry more weight.
- Highest and Lowest Cloud Costs - medium - The extremes in cloud spending.
- Highest Daily Spend - medium - Somewhere in that window, someone broke the spending record.
- Highest Node Density Regions - medium - Some regions are packed with nodes.
- Highest Throughput Pipelines - medium - The pipes that carry the most water.
- Inactive Android Control Users - medium - Android control cohort. Gone quiet.
- Inactive Users in Date Range - medium - Ghost accounts. Active signup, zero sessions.
- Inactive vs Suspended Engagement - medium - Inactive versus suspended. The engagement gap.
- iOS Adoption by Age Bucket - medium - The install numbers don't match the hype.
- iOS Sessions by Device Type - medium - iOS engagement, device by device.
- Job Status Duration - medium - How long in each job state?
- Keep Most Recent Record - medium - Carbon copies clutter the table. Only the latest matters.
- Keyword-Based User Search - medium - The search terms reveal intent.
- Largest A/B Test by Participants - medium - The biggest experiment ever run.
- Largest Single Cloud Cost - medium - One line item. The biggest bill of all.
- Latency Gap to 10th Fastest - medium - One server. Compared to the 10th fastest.
- Latest Commit Build Cost - medium - The latest commit came with a build cost.
- Latest Migration Output per Author - medium - Each author's most recent migration output.
- Leading ML Frameworks by Accuracy - medium - Which frameworks lead on accuracy?
- Least Viewed Content - medium - Nobody is watching. Should it still exist?
- Longest Gap Between Token Events - medium - The longest gap between token events.
- Longest Running Pipeline - medium - One pipeline outlasted them all.
- Long Messages - medium - Some commit messages tell a novel.
- Long-Running Feature Flags - medium - Flags that have been on for too long.
- Low-Engagement Sessions - medium - Users whose average session duration is below the engagement threshold.
- Lowest Cost Network-Heavy Team - medium - Networking costs versus compute. Which teams?
- Lowest Latency per Service - medium - The fastest response each service ever gave.
- Low Severity Checks in 2026 - medium - Low severity. High volume.
- Low-Volume Stream Topics - medium - Quiet topics in the stream.
- March Revenue by Customer - medium - One month, every customer, every dollar accounted for.
- Median Null Percentage of Float Features - medium - Nulls in float columns. How widespread?
- Mentorship User Pairs - medium - Pair them up. Mentor and mentee.
- Metric Count - medium - How deep does each department's tracking go?
- Metric Value Pairs Over Threshold - medium - Two metrics, both above the line.
- Minimum Cost Per Provider - medium - The cheapest month from each provider.
- Mobile vs Desktop Session Duration - medium - Mobile versus desktop. Who stays longer?
- Models With Variable Accuracy - medium - Accuracy should be stable. These models are not.
- Model Training Completion Rate - medium - How many models finished training?
- Monthly Cohort Retention - medium - Compute month-over-month retention rates for user signup cohorts.
- Monthly Revenue Comparison - medium - Last month versus this month. Per product.
- Monthly Running Total - medium - Cumulative sales per product across months.
- Monthly Spend Pivot by Provider - medium - Cloud bills by month, split by who sent the invoice.
- Monthly Transaction Summary - medium - A monthly engagement summary.
- Month With Fewest Deploys - medium - One month, nobody deployed.
- Most Active Chat Users - medium - The loudest voices on the platform.
- Most Active Recent Committers - medium - Who has been writing the most code lately?
- Most Active Servers by Log Volume - medium - The busiest servers by log volume.
- Most Commented Code Review - medium - The code review that started a debate.
- Most Common Monday Outcome - medium - Mondays have a pattern.
- Most Efficient API Endpoint - medium - Best throughput per call.
- Most Frequent Error Types - medium - The errors that keep coming back.
- Most Ordered Product by Country - medium - Popular products in specific markets.
- Most Popular Content Type - medium - The content type everyone prefers.
- Most Popular Signup Day - medium - One day of the week wins on signups.
- Most Profitable Region Month - medium - One region, one month. Peak profit.
- Multi-Host Regions by Node Type - medium - Some regions are quietly building empires.
- Multi-Table Report - medium - Join three tables into a summary report.
- Mutual Channel Connections - medium - Two users. What channels do they share?
- Negative Outcome Rate for New Users - medium - New users have a rough first two weeks.
- Net Lines - medium - Some authors build. Others trim. The net tells the truth.
- New Customers Per Day - medium - Count users whose first order falls on each date.
- New User Purchases - medium - Revenue from the signup cohort that joined this year.
- Nodes by Region and Type - medium - Broken down by region. Broken down by type.
- Nodes in Key Regions - medium - Six regions. How many nodes in each?
- Noisiest Tables by DQ Failures - medium - The tables that fail the most checks.
- Non-Trivial Fatal Errors - medium - Short errors are noise. Long ones matter.
- Notification Delivery Ratio - medium - Sent versus delivered. The gap is the problem.
- Notification Open Rate - medium - Sent versus opened. The rate.
- Notifications Pivot by Weekday - medium - Notifications by platform and day of week.
- Nth Highest Salary Per Department - medium - Third place in every department.
- Opened Notifications in Jan-Feb - medium - Two months of push notifications. How many were actually read?
- Over-Budget Services - medium - Over budget. Flagged.
- Overlapping User Sessions - medium - Two sessions, one user, same clock. Something overlaps.
- Overloaded Infrastructure Nodes - medium - CPU above 90. Memory above 80. Red alert.
- Pages Viewed by Session Duration - medium - Longer sessions, more pages? Check.
- Pairwise Latency Maximum - medium - Every pair compared.
- Peak API Hour - medium - The hour when traffic peaks.
- Peak Hour Power Callers - medium - One hour. The phone lines exploded.
- Peak Latency for 2026-Era Endpoints - medium - Peak latency for that era's endpoints.
- Peak Retargeting Revenue Month - medium - Retargeting revenue. The peak month.
- Pipeline Completion Rate - medium - How far do users get through the flow?
- Pipeline Overhead by Environment - medium - Production overhead versus staging.
- Pipeline Recovery by Priority - medium - Recovery time, priority by priority.
- Pivot Event Counts - medium - Reshape rows into columns by event type.
- Pod CPU to Memory Ratio - medium - CPU versus memory. Resource efficiency.
- Power Users - medium - Engagement separates tourists from regulars.
- Power Users by Session Activity - medium - More sessions. More time. The power users.
- Power Users by Session Count - medium - Three sessions is casual. More than that is serious.
- Price Rank - medium - In every category, someone charges the most. Who's on top?
- Priciest Item in Each Category - medium - The most expensive item per category.
- Product Ratings vs Sales - medium - Do higher ratings actually mean more revenue?
- Products With Strong Unit Price - medium - Budget-friendly and high-performing.
- Product Transaction Counts - medium - Show how many transactions each product has, sorted by product ID.
- Profit Tiers - medium - High, moderate, or in the red. Every order gets a label.
- Prolific Authors in Largest Service Teams - medium - Senior leads in the biggest teams.
- Provider Spend Variance Between Halves - medium - Two time windows. Did the cloud bill go up or down?
- Push Notification Open Rate - medium - Push sent. How many opened?
- Push Notification Status Pivot - medium - Sent, opened, ignored. The notification lifecycle in numbers.
- Push Opens by Platform and Campaign - medium - Opens by platform and campaign.
- Quarterly Consolidated Cloud Costs - medium - Quarterly cloud spend, weighted.
- Rank Users by Search Query Count - medium - Who searches the most? The answer might surprise you.
- Rapid Retry Detection - medium - Detect retried API calls within 5 minutes of failure.
- Rate Limit Rules Per Endpoint - medium - Threshold rules, endpoint by endpoint.
- Rating Tiers - medium - No gaps, no skips. Ratings stacked tight within each category.
- Recent Price Drops - medium - The price just dropped. Who noticed?
- Regional Order Summary - medium - Region by region. The order numbers tell the story.
- Regions by Alert Volume - medium - Some regions are quiet. Others never stop screaming.
- Region With Best Uptime - medium - The single most reliable region.
- Region With Most Nodes - medium - Which region hosts the most?
- Repeat Buyers Across Halves - medium - First half buyer. Second half buyer. Same person.
- Repeated Transactions - medium - Detect same-amount transactions within 10 minutes.
- Repeat Purchases Within a Week - medium - They bought again within seven days.
- Repeat Purchase Window - medium - The retention squad is looking for repeat purchasers.
- Repository Commit Ranking - medium - Lines added tell the story of a repo's ambition.
- Repos with More Builds Than Commits - medium - More builds than commits. Something is off.
- Response Buckets - medium - Fast, normal, or slow. Every API call gets a verdict.
- Retried Failed API Calls - medium - Spot users who retry API calls within 5 minutes of a failure.
- Returning Buyers - medium - They came back and bought again.
- Revenue Per Product With Zeros - medium - Total revenue per product. Even the zeros.
- Reviewer Performance Metrics - medium - Some reviewers are thorough. Others are fast.
- Reviewers Per Repo Per Year - medium - Reviewers per repo, year by year.
- Revoked Tokens by Scope - medium - Banned tokens, sorted by what they had access to.
- Rolling Weekly Total - medium - Seven days at a time, the totals keep rolling forward.
- Rows With Multiple Flag Conditions - medium - Rows caught by multiple flags.
- Runner-Up Cost Without ORDER BY - medium - The second highest. Without sorting.
- Running Tab - medium - Every purchase adds to the total. Watch the tab grow.
- Rush Hour API Latency - medium - Rush hour hits the API differently.
- Same-Day Signup Rate - medium - Percentage of transactions on the signup date.
- Same First and Last Reply Target - medium - They started and ended the month messaging the same person.
- Satisfaction by Platform - medium - Satisfaction scores, platform by platform.
- Second Highest Cloud Cost - medium - The second biggest bill on record.
- Second Highest Latency by Method - medium - Almost the slowest. By method.
- Senior to Junior Ratio - medium - The ratio tells you a lot about the department.
- Servers Returning to Origin - medium - Servers that migrated back home.
- Server With Most Errors - medium - One server stands out. Not in a good way.
- Service Budget per Head - medium - Budget per head. Pipeline by pipeline.
- Service Component Classification - medium - Classified by naming pattern.
- Service Reliability Tiers - medium - Reliability tiers. Based on uptime.
- Services at Median Uptime - medium - Exactly at the median. Not above, not below.
- Service Uptime Minutes - medium - Status changed. How long was it actually up?
- Session Duration by Account Status - medium - Average session duration broken down by user account status.
- Session Overview - medium - Full engagement picture, even for the ones who never showed up.
- Session Rank - medium - Longest sessions rise to the top. Within each user, a pecking order.
- Sessions by Content Type - medium - Engagement, broken down by content format.
- Shared Category Purchasers - medium - They bought different things from the same aisle.
- Shared Endpoints - medium - Shared credentials across endpoints.
- Signup to Subscription Rate - medium - Conditional aggregation for conversion rates.
- Single Service Owners - medium - One owner, one service. Nobody else.
- Smooth Latency - medium - Noisy latency readings, smoothed into a trend you can trust.
- Spending by Account Status - medium - Segment user spending and activity by account status across the platform.
- Spending Tiers - medium - High rollers, mid-spenders, and the frugal. Everyone gets a tier.
- Split Metric Sums - medium - One column, two totals.
- Subscribers Without Premium - medium - Subscribed. But never upgraded.
- Successful Build Duration by Repository - medium - CI throughput, repo by repo.
- Successful Call Volume per Endpoint - medium - Not every ping is honest.
- Sum Excluding Extremes - medium - Remove the outliers. Then sum.
- Super Reviewers - medium - The most prolific code reviewers.
- Symmetric Reply Network - medium - Who replies to whom? Both directions.
- Tables With Many DQ Failures - medium - Some tables have never once passed QA.
- Tables With Most DQ Failures - medium - The tables with the most failures.
- Teams Below Double Average Spend - medium - Teams spending under twice the average.
- Tenure Mentorship Match - medium - Pair by tenure. Longest with newest.
- The Podium Finish - medium - Top two products per category.
- The Quiet Alarms - medium - Low severity. High volume. Worth a look.
- The Slow Lane - medium - Peak API load. The slow endpoints.
- Third Highest Spender - medium - Bronze medal in spending.
- Three-Item Combinations - medium - Generate all unique 3-item sets with total cost.
- Three-Value Sum Combinations - medium - Pick three. See what they add up to.
- Token Churn Rate - medium - Tokens come and go. How fast is the revolving door?
- Tokens With Non-Read Scope Prefix - medium - Tokens that don't start with 'read'.
- Top 10 AB Test Variants - medium - The ten best-performing variants.
- Top 10 CPU-Heavy Nodes - medium - The ten hungriest nodes.
- Top 10 Rated Products - medium - The ten highest-rated items.
- Top 2 Active Push Days - medium - Two days stood out from the rest. Which ones?
- Top 2 Ad Campaigns by Spend - medium - Two campaigns. Most of the budget.
- Top 2 Busiest API Slots - medium - Two time slots per week. The busiest.
- Top 2 Callers per Endpoint - medium - Two top callers per endpoint.
- Top 2 Cloud Services by Cost - medium - Two services eating most of the budget.
- Top 2 Rate-Limited Clients - medium - Two clients are hitting the rate limit harder than anyone.
- Top 3 First-View Pages - medium - The first three pages new users see.
- Top 3 Revenue Months - medium - The three best months on record.
- Top Accuracy Model - medium - The single best-performing model.
- Top Active API Tokens - medium - The five busiest tokens.
- Top Active Senders per Channel - medium - Top three messages per channel by replies.
- Top Alert Resolvers - medium - The engineers who resolve the most.
- Top API Caller - medium - One user triggered more API calls than anyone.
- Top AWS Non-APAC Service Costs - medium - Outside APAC, AWS costs tell a different story.
- Top Batch Job Under Priority 1 - medium - Priority one. Top performer.
- Top Buyers by Transaction Count - medium - Frequency is loyalty. Who keeps coming back?
- Top Buyers of Premium Products - medium - Which users bought the most top-rated products?
- Top Campaign by Opens - medium - One campaign got all the opens.
- Top Campaign by User Revenue - medium - Which campaign made each user spend the most?
- Top Category by User Segment - medium - Each segment has a favorite category.
- Top Chat Contributors - medium - The ten most active chat users.
- Top Committers in 2025 - medium - In a sea of commits, only a few wear the crown.
- Top Content by Lifetime Value - medium - Lifetime value. Measured in total watch time.
- Top Content by Views - medium - Top five content items by views.
- Top Content by Watch Time - medium - Some content holds attention. Others get skipped.
- Top Content Flagger - medium - Flagged content. Who flagged the most?
- Top Cost Categories - medium - Three categories eating the budget.
- Top Cost Entry per Team - medium - The single biggest bill per team.
- Top Earner Per Campaign - medium - The top-earning user per campaign.
- Top Error Categories in 2025 - medium - Last year's worst error categories.
- Top Error-Service Pair - medium - Which error-service pair triggered the most resolved incidents?
- Top Frameworks by Accuracy - medium - Top three frameworks by accuracy.
- Top Identified Event Types - medium - The top users by events, but only the identifiable ones.
- Top Lessons Each Month - medium - Rank items within time periods and keep the top 3.
- Top Metric per Department - medium - Peak performer in every department.
- Top Pattern Matches - medium - A needle in a haystack, but how many haystacks?
- Top Percentile Spenders - medium - Top 1% of users by total spend via percentile bucketing.
- Top Product Categories - medium - Top three categories by page views.
- Top Product Category by Transactions - medium - Organic purchases, no marketing nudge. Which category wins?
- Top Products by Quantity Sold - medium - The bestsellers. By volume.
- Top Products per Category - medium - Five winners per category.
- Top Region by Order Volume - medium - The single busiest region.
- Top Regions by Critical Alerts - medium - Which regions have the highest volume of critical alerts?
- Top Regions by Effective Uptime - medium - The most reliable regions.
- Top Repos by Commit Volume - medium - The most active repos in the org. No ties left behind.
- Top Repos by Successful Builds - medium - Green builds. Which repos lead?
- Top Revenue Products H1 - medium - First half of the year. Which products led the revenue race?
- Top Services by Regional Cost - medium - Top spenders in one region.
- Top Services by Uptime - medium - Uptime is a competition. Which services never blink?
- Top Services Per Provider - medium - Within each cloud, two services rise above the rest.
- Top Spender - medium - When your spending exceeds the priciest item on the shelf.
- Top Users by Pages Viewed - medium - Five users who browsed the most.
- Top Users by Recent Spend - medium - Big spenders in the last 30 days.
- Top Users by Session Time - medium - They spent the most time here.
- Transaction Revenue by Customer - medium - One month, every customer, every dollar accounted for.
- Transaction Share of User Spend - medium - Each transaction's share of the whole.
- Transaction Timeline - medium - First purchase to last. The full spending arc.
- Trend Spotter - medium - What did they spend last time? Context changes everything.
- Unclicked Searches by Campaign - medium - Searched but never clicked.
- Unique Hostnames per Region - medium - How many distinct machines live in each region?
- Unique Reporters per Content - medium - How many people flagged each item?
- Unmatched Deploy Services - medium - Two registries. They do not agree.
- Unsold Product Categories - medium - Dead inventory inflating storage costs.
- US Active User Share - medium - What percentage of active users are US-based?
- User Devices - medium - Desktop, mobile, tablet. What does each user actually use?
- User Engagement Summary - medium - Sessions plus searches. The full engagement picture.
- Users Outperforming Control - medium - Treatment beat control. For these users.
- User Spend Audit - medium - One user. One category. Total spend.
- Users With Admin Tokens - medium - Admin tokens. Who holds them?
- Users With API Errors - medium - Count unique users who have triggered an API error response.
- Users Without Purchases - medium - How many registered users have never made a single purchase?
- Users Without Sessions - medium - Account created. Never logged in.
- User With Most Transactions - medium - The most active buyer.
- Views by Content Type - medium - Count content views broken down by content type.
- Word Count Per Message - medium - How wordy are the messages?
- Workers Earning Above Department Average - medium - Earning above the department average.
- Yearly Build Duration by Repo - medium - Build times by repo, year by year.
- Year-over-Year Content Launches - medium - Launch velocity, year over year.
- Zero Accuracy on First Training - medium - First run. Zero accuracy. How common?
- Cumulative Sales Per Customer - medium - Each purchase adds to the running total. Watch it climb.
- Category Revenue - medium - Which categories pull their weight?
- Platform Speed - medium - Which devices keep users longest?
- Click Rate - medium - Campaigns nobody clicks.
- Above the Curve - medium - Spenders who break from the pack.
- Department Snapshot - medium - Who is underperforming and who is excelling?
- Noisy Endpoints - medium - The routes generating the most noise.
- Build Health - medium - Repos that break more than they ship.
- Category Buyers - medium - Which categories have the broadest reach?
- Diverse Shoppers - medium - They shop the whole catalog.
- Silent Users - medium - Users who have never typed a query.
- Funnel Leakage Report - hard - Users enter the funnel. Most never reach the bottom.
- The Session Stitcher - hard - Page views without sessions are just noise.
- The Regional Cost Reconciliation - hard - Two cost tables, one region. Reconcile the running balance.
- The Cannibalization Report - hard - The new product launched. The old one suffered.
- 2nd Most Common Content Type - hard - Everyone talks about number one. What about number two?
- 7-Day Onboarding Conversion - hard - Signed up Monday. Still here by Sunday?
- Above Category Avg - hard - Above average is relative. Relative to what?
- Active User Penetration Rate - hard - How much of the user base is actually alive?
- Adopters Before Migration - hard - They used the old feature. Did they ever touch the new one?
- Aggregate Votes by Paper Subject - hard - Net revenue, day by day, for one product in one region.
- Alert Severity - hard - When the alarms go off, who screams loudest?
- Allocations in Top Spending Region - hard - The biggest spenders live in one region.
- Alphabetical Tag Sort - hard - Tags in the wrong order.
- API Call Distribution Fraction - hard - Not all endpoints are created equal.
- Average Event Progression Time - hard - How fast do users move through the funnel?
- Average Sessions Per User - hard - How often do users come back?
- Best Selling Product by Month - hard - Every month has a winner.
- Bottom 2% Services by Spend - hard - The bottom 2% of spenders. Who are they?
- Cache Efficiency - hard - Some edges run hot. Others coast on the global average.
- Campaign Bookend Engagement - hard - First impression versus last. The gap.
- Campaign Conversion Count - hard - The push notification went out. Did anyone convert?
- Campaign Conversion Window - hard - A narrow window between impression and action.
- Campaign Engagement Rank Shift - hard - Two months, many countries. Who moved up? Who fell?
- Category Deep Dive - hard - Revenue, units, rank. The full category report card.
- Cheapest and Most Expensive Service per Region - hard - Every region has a bargain and a budget-buster.
- Cheapest CDN Route - hard - The cheapest path across regions.
- Classify Accounts by Activity Tier - hard - The accounts fall into tiers. Where is the cutoff?
- Cloud Cost Breakdown by Provider - hard - Cloud costs, provider by provider.
- Commit Cadence - hard - Some repos go quiet for too long.
- Consecutive Cost Growth Periods - hard - Five straight months of spending increases.
- Content Page Spreads - hard - Content, laid out in two columns.
- Cost Efficiency Variance - hard - Cost efficiency varies. By how much?
- Creator Favorite Content Type - hard - Every creator has a go-to format.
- Daily Net Revenue - hard - Net revenue, day by day. Refunds included.
- Data Quality - hard - Failed checks pile up. Which tables need the most attention?
- Department Quarterly Pivot - hard - Headcount by department, sliced by quarter. The org chart in numbers.
- Deploy Velocity - hard - Days between deploys. Some services ship fast, others crawl.
- Endpoint Name Word Count - hard - Some endpoint names are novels.
- Endpoint Ranking - hard - The slowest endpoints. Called to the principal's office.
- Error Category Breakdown - hard - Postmortem time. Categorize the errors.
- Exact Keyword Counts in Logs - hard - Errors and warnings. Count every single one.
- Experiment Impact - hard - Which experiments moved the needle? Rank them within each group.
- Experiment Variant Ratios - hard - Control versus treatment. The participation split.
- Fastest and Slowest Services by Region - hard - The fastest and slowest in every region.
- Fastest Page View to Click - hard - How fast from view to click?
- Feature Flag Engagement Impact - hard - Flags on versus flags off. The engagement gap.
- Feature Flag Fan vs Detractor Pairs - hard - Some users love the flag. Others want it gone.
- Feature Name Intersection - hard - Training names versus serving names. The overlap.
- First-Day Session Retention - hard - Day one retention. The first test.
- First Interaction Credit - hard - Attribute transactions to the earliest touchpoint.
- Flatten Org Chart Hierarchy - hard - The tree runs deep. Walk every branch to the root.
- Friday Spending Analysis - hard - Friday spending during Q1.
- Full Funnel - hard - Search. Browse. Buy. Only a few do all three.
- Healthiest Service Check History - hard - The healthiest service. Full history.
- High Engagement Pages - hard - Some pages hold attention longer than others.
- Impressions by Search Keyword - hard - Campaign performance, keyword by keyword.
- Incident Keyword Messages - hard - Certain words trigger an investigation.
- Intra-Region Latency Diff - hard - Same region. Different latency.
- Largest CDN Response - hard - One edge location served something massive.
- Latency Quartiles Per Endpoint - hard - Quartile breakdowns. Endpoint by endpoint.
- Latency Variance and Std Dev - hard - How much does latency actually vary?
- Longest Uptime Streak - hard - Pass, pass, pass. How long until fail?
- Longest Visit Streaks - hard - Day after day after day. Who kept coming back?
- Lowest CPU Pods per Namespace - hard - The five lightest pods per namespace.
- Market Share - hard - Every category wants a bigger slice.
- Median Cloud Cost by Service - hard - The median cloud bill, service by service.
- Median Failure Rate by Table - hard - Half the tables fail more than this.
- Median Household Earnings - hard - Household earnings. The median reveals the middle.
- Median Model Accuracy - hard - The median accuracy. Not the mean.
- Median Transaction by Category - hard - The middle transaction in each category.
- Mid-Range Team Spenders - hard - Above average but not extreme.
- Minimum Parallel Workers - hard - Too few workers and it stalls.
- Model Accuracy Drift - hard - Accuracy used to be higher.
- Mode of Small Team Costs - hard - One charge keeps showing up everywhere.
- Monthly Cloud Cost Forecast Error - hard - The forecast was off. By how much?
- Monthly Deploy Counts Pivoted - hard - Deploys by month. Side by side.
- Monthly Revenue Change - hard - Revenue, month over month.
- Monthly Service Retention - hard - Users came back. Or they did not.
- Most Efficient High-Volume Campaign - hard - High volume. Low cost. The dream campaign.
- Most Efficient Region by Token Usage - hard - Some regions squeeze more out of every token.
- Multi-Category Buyers - hard - One-category shoppers are boring.
- Multi-Month Active Users - hard - Active this month and last month. Who stuck around?
- New Services With Poor Health - hard - New services, already struggling.
- New vs Returning User Share - hard - Fresh faces versus familiar ones.
- Node Utilization - hard - Overloaded nodes hiding in busy regions. Spot the hot spots.
- Oldest Alert per Service - hard - The oldest unresolved alert per service.
- Peak Concurrent Pods - hard - The most pods alive at once.
- Peak Concurrent Tokens - hard - How many tokens were alive at the same time?
- Pipeline Duration vs Throughput - hard - Does throughput correlate with duration?
- Previous Day Top Service - hard - Yesterday's top spender.
- Price Pairs - hard - Same shelf, wildly different stickers. Spot the pricing gaps.
- Quarterly Peak Cloud Costs - hard - Every quarter has a peak bill.
- Quarter-over-Quarter Latency Trend - hard - Latency trending up or down? The quarters have the answer.
- Rarest Latency Value - hard - A latency value that appeared exactly once.
- Regional Sales Growth QoQ - hard - Quarter-over-quarter growth. Region by region.
- Resolved vs Unresolved Alerts - hard - Resolved versus open. By severity.
- Rolling Revenue Average - hard - Smooth out the revenue bumps. The trend matters more.
- Running Total With CTE - hard - A running total that builds step by step.
- Same-Day Session and Transaction Correlation - hard - Same day session and purchase. Connected?
- Search Algorithm Rating - hard - How good are the search results?
- Search Success by User Tenure - hard - Compare search click-through rates between new and existing users.
- Search Term Length vs Click Rates - hard - Longer queries, more clicks?
- Second Purchase - hard - The first buy is curiosity. The second is commitment.
- Sequential Service Transitions - hard - Job to job. The transitions.
- Service Scorecard - hard - Deploys vs. alerts. One row per service tells the whole story.
- Services Hitting Cost Threshold - hard - The budget line is here. How many crossed it?
- Services With Most Checks in 2025 - hard - Last year's most-checked services.
- Services With Multi-Quarter Uptime - hard - Multi-quarter uptime streaks.
- Service Uptime Turnaround - hard - It was down. Then it came back. Stronger.
- Service With Most Critical Alerts - hard - One service keeps setting off the alarms.
- Session Count Distribution - hard - How are sessions distributed among the newest users?
- Session Page View Distance - hard - Page view distance per session.
- Shared Channel Contacts - hard - User networks mapped through messages.
- Spend and Rank - hard - Five thrones at the top of the spending leaderboard.
- Spending Range - hard - Between the smallest purchase and the biggest lies the story.
- Streak Status Changes - hard - Detect value changes across consecutive rows.
- Team Cost Allocation Comparison - hard - Individual spend versus team average.
- Tenure Spread for Active Tokens - hard - Tenure extremes among active tokens.
- The Usual Suspects - hard - Same services, same checks, same problems.
- Top 3 Monthly Costs per Team - hard - Three priciest months per team.
- Top and Bottom Cloud Spenders - hard - The extremes. Top and bottom.
- Top Commit Authors by Repo - hard - Three authors per repo. The top committers.
- Top CPU Pods per Namespace - hard - The two most CPU-hungry pods in each namespace.
- Top Endpoint by Power Users - hard - Power users have a favorite endpoint.
- Top Flagged Campaign Resolutions - hard - Flagged the most. Resolved how?
- Top Framework by Deployments - hard - The framework most often deployed.
- Top Models by Framework - hard - Every framework has a star model.
- Top Per Category - hard - Every category has a champion. Crown them all.
- Top Percentile API Tokens - hard - The most suspicious tokens.
- Top Regions by High CPU Nodes - hard - Five regions with the hottest CPUs.
- Total Hours Between Consecutive Events - hard - Hours between state changes.
- Transaction-Only Features - hard - Exclusive to one source. Missing from the other.
- Upvote Percentage by Age Cohort - hard - New users versus existing. The upvote gap.
- User 360 - hard - One row per user. Everything they did, or didn't do.
- User Campaign Overlap Percentage - hard - How much ad overlap between users?
- User Connection Score - hard - Every user has a social score.
- User Spend Segmentation by Category - hard - Users segmented by spending behavior.
- Users Who Churned in February - hard - Gone in February.
- Users With and Without Ad Clicks - hard - Clicked an ad versus never clicked. The split.
- Viewer-to-Purchaser Activity - hard - Started as viewers. Became creators.
- Weekly Order Status Report - hard - Weekly order status. The report.
- Weekly Transaction Day Split - hard - Transactions by day of week.
- Weighted Variant Selection - hard - Select a row using cumulative weight probabilities.
- Worst Table Per Year by DQ Failures - hard - Every year has a worst table.
- YoY Signup Growth Rate - hard - This year versus last year. Growing or shrinking?
- Zero-Retry Job Ratio by Priority - hard - No retries needed. First try success rate.
- Slowly Changing Dimension Type 2 - hard - Addresses change. History must not be erased.
- Normalization Tradeoffs in Practice - hard - Clean data or fast queries? You can't always have both.
Data Modeling Practice Problems (56)
- Customer Address History - easy - People move. Sometimes twice in a month. How do you remember where everyone was, and when?
- B2B Invoicing Data Model - easy - Invoices go out, partial payments trickle in, and some customers are three months overdue.
- Fitness Studio Membership Schema - easy - Classes fill up. Members no-show. Billing continues.
- A Number for the Seller - easy - They want a total. Give them the right schema first.
- Event Ticketing System Data Model - easy - JSON in. Reporting warehouse out. Design both ends.
- Loan Management Schema - easy - Money out, payments back. The balance has to be exact.
- Toll Road Sensor Analytics - easy - Cars enter, cars exit. Except when they don't.
- Fitness App Data Model - easy - Reps, sets, streaks, and personal bests. Gym rats love their stats.
- Ride-Sharing Platform Schema - medium - Riders, drivers, and fares. Everyone takes a cut.
- Employee Transfer Tracking System - medium - People switch teams. HR loses track.
- Movie Streaming Analytics Schema - medium - They pressed play. What happened next is the whole question.
- Log Parsing Pipeline Schema - medium - Raw text files, terabytes of them, full of buried signals and cryptic error codes.
- Livestream Analytics Schema - medium - Someone goes live, thousands tune in, chat explodes, and virtual gifts start flying.
- POS Sales Data Warehouse - medium - Every beep at the register. Coupons, returns, all of it.
- Online Retail Star Schema - medium - Prices change. Categories shift. Revenue slices everywhere.
- Social Platform Data Model - medium - Follows, likes, replies to replies. It never stops.
- Subscription Churn Analysis Model - medium - Subscribers are leaving. The data knows why.
- Employee Application Time Tracking - medium - Every minute tracked. Every app accounted for.
- Food Truck Operations Data Model - medium - Mobile vendor, fixed menu, unpredictable locations.
- Loan Application Reporting Schema - medium - Approved, declined, or pending. Design the tables that say so.
- Machine Process Event Log Schema - medium - Machines fire events. Pair them up before they bury you.
- Order and Shipment Data Model - medium - Order placed. Now track it to the door.
- Sales Analytics Star Schema - medium - Five rounds with a data engineer. Round five: design the star.
- Subscription and Payment Data Model - medium - Two user types. Multiple payment methods. One messy billing table.
- The JSON Files That Became a Data Mart - medium - Three semi-structured inputs. One queryable warehouse.
- The Plan That Changed Twice This Month - medium - Subscribers come, go, downgrade, and share. The schema has to keep up.
- The Retail Tables That Need a New Home - medium - A working system. Now redesign it so the analysts can actually use it.
- The Talent Funnel - medium - Thousands applied. One accepted. Where did the rest go?
- The Transfer Request - medium - Apply, wait, get approved or denied. Track all of it.
- Retailer Data Warehouse Design - medium - Queries are crawling. The analysts are not happy.
- The Table That Lies - medium - Every query comes out wrong. The data is all there.
- Clickstream and Session Schema - medium - Millions of clicks, mostly anonymous.
- The Celebrity Problem - medium - One post. A million notifications. Something has to give.
- Housing Marketplace Analytics - medium - Sellers want buyers. Buyers want deals.
- Trending Dishes Dashboard - medium - What's everyone eating? The answer changes hourly.
- Airline Flight Operations Schema - medium - Flights, passengers, and routes. Before you draw a single table, tell me the grain.
- A/B Experiment Assignment Schema - medium - One user, one experiment, one variant. No exceptions.
- Multiplayer Game Match History - medium - Millions of matches. The leaderboard refreshes in fifteen minutes.
- EdTech Classroom Engagement Schema - medium - They opened the assignment. Did they actually read it?
- Telecom Network Connectivity Warehouse - hard - One device goes down. The ripple keeps going.
- Metric Definition Reverse Engineering - hard - Five numbers on a dashboard. Your job: figure out where they come from.
- Property Booking Platform - hard - Five-star listing. Three-star reality.
- E-Commerce Supply Chain Tracking - hard - A package splits, reroutes, and (maybe) arrives.
- SCD Type 2 Customer Dimension - hard - Things were different six months ago. Can you prove it?
- Financial Trading Warehouse - hard - Every trade, every tick, every fraction of a share. The regulators want receipts.
- Content Engagement Data Model - hard - Post published. Now measure everything that happens next.
- Content Search and Discovery Schema - hard - Searchable from every angle. Design it so nothing gets lost.
- Marketplace Sales Warehouse - hard - No schema given. The interviewer is watching.
- The League With Too Many Loyalties - hard - A player can belong to many teams. The schema must agree.
- The Schema That Could Not Answer Back - hard - Forty columns in. Zero useful answers out.
- The Churner Who Came Back - hard - They cancelled. They came back. The report has to tell both stories correctly.
- The Territory That Keeps Moving - hard - Reps get reassigned. The receipts have to survive.
- Insurance Claims Lifecycle - hard - A claim gets filed. Then it gets complicated. Then it gets reassigned. Then it loops back.
- Online Marketplace: Seller Payouts - hard - The buyer paid one number. The seller got a different one.
- Cloud File Storage Metadata Schema - hard - A file is also a folder. A folder is also a file.
- Three-Sided Marketplace Delivery Schema - hard - One order. Two deliveries. Revenue counted twice. Where is the bug in your schema?
Pipeline Architecture Practice Problems (121)
- Hourly ETL Pipeline with Consistency - medium - Every hour, on the hour. No excuses.
- Time Series CSV Ingestion Pipeline - medium - One massive CSV. Millions of timestamps.
- Order and Menu Recommendation Pipeline - medium - What they ordered says a lot about what they want next.
- Card Transaction Streaming Pipeline - medium - Every swipe tells a story.
- Data Pipeline for Sales Analytics - medium - Sales data is piling up. Someone has to make sense of it.
- Batch ETL: MongoDB to Redshift - medium - Two databases. One direction. No data left behind.
- Whiteboard ETL Pipeline Design - medium - Marker in hand. Draw the whole thing.
- GPS Tracking Pipeline for Logistics - medium - Trucks are moving. Every ping counts.
- SCD Pipeline into a Delta Lakehouse - medium - Dimensions change. History must survive.
- SaaS API Connector with Incremental Sync - medium - The API has rate limits. You have deadlines.
- Real-Time POS Ingestion into Snowflake - medium - The cash register data needs to be queryable by morning.
- Streaming Pipeline with Schema Validation and Snowflake Sink - medium - Bad records cannot reach the warehouse.
- Dynamic Schema File Ingestion Pipeline - medium - The schema changed overnight. Again.
- Pre-Aggregated User Activity Metrics Pipeline - medium - DAU, WAU, MAU. Refreshed every hour.
- Database Replication and Schema Normalization Pipeline - medium - Production is the source. Analytics needs its own copy.
- Document Ingestion and Text Extraction Pipeline - medium - Buried in PDFs. The data is in there somewhere.
- On-Prem to Cloud Pipeline Modernization - medium - The on-prem servers are not getting any younger.
- The API Drip Feed - medium - The API gives you 100 records at a time. You need millions.
- CDC Connector: Log-Based vs Trigger-Based - medium - Two ways to watch the database. Each has a cost.
- Snowflake Query Performance Degradation Diagnosis - medium - Queries used to be fast. Something changed.
- Real-Time POS Pipeline with Snowpipe and MERGE - medium - Sales hit the register. Snowflake needs to know now.
- GCP Sales Analytics Pipeline - medium - Sales data, BigQuery, Dataflow. Make it all sing.
- Resume Document Ingestion and Extraction Pipeline - medium - A thousand resumes. Structured data inside each one.
- Subscription Analytics Pipeline - medium - Subscribers churn. The pipeline cannot.
- Large-Scale Sales Data Pipeline for CPG Analytics - medium - Retail data at CPG scale. Every SKU, every store.
- Financial Services Pipeline with Regulatory Reporting - medium - The regulator does not accept 'eventually consistent.'
- Event-Driven Insurance Pipeline with Async Claim Processing - medium - Policies are instant. Claims take their time.
- Databricks Pipeline with Spark Performance Optimization - medium - Spark jobs are running. Just not fast enough.
- Gaming Event Pipeline: Streaming vs Batch Architecture Decision - medium - Millions of gamers. The architecture decision changes everything.
- Vehicle Fleet Telematics and Rental Operations Pipeline - medium - Every vehicle is reporting. Every rental matters.
- Insurance Claims and Policy Data Platform on Azure Databricks - medium - Claims arrive messy. The medallion cleans them up.
- Healthcare Claims CDC Pipeline with PySpark - medium - Healthcare claims change constantly. The warehouse cannot fall behind.
- Fintech Lending Platform Event Pipeline - medium - Loan approved. Loan denied. Every decision is an event.
- Azure Data Factory Orchestration with Databricks Unity Catalog - medium - ADF orchestrates. Unity Catalog governs. Nothing leaks.
- Energy Trading Market Data Pipeline - medium - Markets move in milliseconds. The pipeline has to keep up.
- Streaming Content Metadata and Viewer Engagement Pipeline - medium - The catalog updated. Did anyone notice?
- E-Commerce Platform Analytics Pipeline: Orders to Warehouse - medium - Orders placed. Data warehouse hungry.
- Regulatory Data ETL Pipeline with Dynamic Schema Handling - medium - The regulator changed the format. Again. Handle it.
- Last-Mile Delivery Shipment Tracking State Machine Pipeline - medium - Out for delivery. Delivered. Except the events arrived backwards.
- Financial Ratings Data Pipeline with dbt Incremental Strategy - medium - Ratings change. The incremental model has to keep pace.
- The Fare Aggregator - medium - Airfares shift every minute. Catch the best ones.
- The Consent Stitcher - medium - Consent was given. Or was it? Stitch the records together.
- Loyalty Rewards Pipeline with Late Bank Data - medium - The bank data shows up late. The rewards were already sent.
- Multi-Cloud Billing Unification Pipeline with Medallion Architecture - medium - AWS, Azure, GCP. Three bills. One truth.
- Multi-Touch Marketing Attribution Pipeline on Snowflake - medium - They saw the ad, clicked the email, then bought. Who gets credit?
- The Queue That Wouldn't Stop Growing - medium - 500,000 messages behind and the number keeps climbing.
- The Vendor Who Never Warns You - medium - Every month, something is different. The dashboards have no idea.
- The Sale That Needs to Land Now - medium - Three channels feeding one view. Not all of them speak the same language.
- The Provider That Sometimes Sleeps - medium - The models run at dawn. The data has to be there first.
- The Revenue That Was Wrong for Two Weeks - medium - Nobody caught it until the CFO asked a question. Design the system that catches it first.
- Six Hours to Miss a Deadline - medium - The rebuild works. It just doesn't finish in time.
- Every Device Has Its Own Dialect - medium - Three sources. Three formats. Same workout.
- Personalization Platform Ingestion - medium - Fresh signals, many teams, one pipeline.
- The Claim That Picks Its Own Lane - medium - Three entry points. Different workflows. All must route correctly.
- The Distributor Filing Problem - medium - Hundreds of suppliers. One warehouse. One deadline.
- URL Shortener Click Analytics Pipeline - medium - Billions of clicks. One tiny code. Two very different clocks.
- Real-Time Fraud Detection Pipeline - hard - The fraudsters move fast. Your pipeline has to move faster.
- Event System for Multiple Consumers - hard - One event, many hungry consumers.
- Real-Time Sales Lakehouse Ingestion - hard - The registers never stop ringing.
- Viewing Event Pipeline - hard - Someone is watching. Capture everything.
- Ad Simulation Platform Pipeline - hard - A million slots. A thousand campaigns. Every combination matters.
- Data Ingest Pipeline with Access Tradeoffs - hard - How you store it decides how fast you can read it.
- Fintech ETL with Data Validation Checks - hard - Bad data in fintech is not just messy. It is expensive.
- ML Feature Pipeline for Model Deployment - hard - The model is only as good as what you feed it.
- Streaming CDC into Delta Lake with UPSERT - hard - The source changed. The lake needs to know immediately.
- Multi-Region Payment Event Pipeline - hard - Payments from everywhere. One consistent report.
- Dual-Source Inventory Sync Pipeline - hard - Two systems, two schemas. One truth.
- Multi-Device Event Pipeline with Late Data - hard - Phones, tablets, laptops. And some of them report late.
- Cost-Optimized Clickstream Data Lake - hard - 600 million clicks a day. The budget is not infinite.
- Livestream Event Ingestion Pipeline - hard - The stream is live. The data cannot wait.
- S3-Based Data Warehouse with File-Level Access Control - hard - Everyone can see the bucket. Not everyone should.
- Multi-City Demand Forecasting Data Pipeline - hard - Five cities. Five data formats. One prediction.
- Healthcare Data Lake with Multi-Format Ingestion - hard - PDFs, HL7, JSON. All of it lands in the same lake.
- Near-Real-Time Trending Dishes Dashboard - hard - The dish rankings update faster than the kitchen.
- Lambda Architecture for Batch and Streaming Workloads - hard - Real-time and batch. Same pipeline. No compromises.
- AWS Pipeline Auto-Scaling for Variable Volume - hard - Tuesdays are quiet. Black Friday is not.
- Clickstream Pipeline for Apple Product Analytics - hard - Every tap, swipe, and scroll. At scale.
- Dual-Source Hotel Inventory Sync Pipeline - hard - Two booking systems. Rooms do not duplicate themselves.
- Merchant Payment Summary Pipeline - hard - Raw payment logs in. Clean merchant summaries out.
- Multi-Device Streaming Pipeline with GDPR Deletion - hard - Users want their data erased. Completely.
- Financial Trading Data Warehouse - hard - Fractional shares, multi-currency, point-in-time. All of it.
- Data Platform IaC with Semantic Layer - hard - Infrastructure as code. Meaning as a service.
- Online Schema Migration on a Billion-Row Table - hard - Add a new column to a billion-row production table and backfill it with zero downtime.
- Order and Menu Feature Pipeline for Recommendations - hard - They ordered pad thai twice. That means something.
- AWS Pipeline with Auto-Scaling and Cost Governance - hard - Scale up when needed. Do not bankrupt the team.
- Pharma Data Ingestion Pipeline with Governance - hard - The FDA has opinions about your data pipeline.
- City-Wide Bicycle Demand Forecasting Pipeline - hard - Bikes in, bikes out. The city needs to predict demand.
- Cost-Efficient Clickstream Analytics with Two-Year Retention - hard - Two years of clicks. Every query has to be affordable.
- Retail Clickstream Event Store at Kafka Scale - hard - 600 million events a day. Two years of retention.
- Cellular Connectivity and App Log Data Warehouse - hard - Tower signals meet app events. Somewhere in between is the truth.
- On-Prem and Event-Driven Pipeline Migration to Cloud - hard - Half the jobs run on cron. Half run on events. All of it has to move.
- HIPAA-Compliant PHI De-identification Pipeline for Development - hard - Dev needs production data. HIPAA says absolutely not.
- Streaming Device Telemetry and Ad Impression Pipeline - hard - Every ad seen. Every second watched. Real-time.
- Streaming and Batch Unified Pipeline on Azure Databricks - hard - Streaming and batch. One pipeline to rule them.
- Consumer Goods Trade Promotion Pipeline on GCP - hard - Was the promotion worth it? The data knows.
- EHR Platform Operational Data Pipeline - hard - Patient records in, operational insights out.
- Global Insurance Premium and Loss Ingestion Platform - hard - Premiums collected globally. Losses happen locally.
- Rocket Delivery Feature Store Pipeline - hard - Same-day delivery. The features have to be faster.
- Real-Money Card Game Session Reconstruction Pipeline - hard - Real money on the table. Reconstruct every hand.
- Legacy ETL Modernization with SCD Type 2 Entity Resolution - hard - The legacy pipeline works. Nobody knows how.
- Connected Vehicle Telemetry Pipeline with IaC Deployment - hard - Every vehicle is a sensor. Deploy the pipeline to catch it all.
- Real-Time Investment Portfolio Position Pipeline - hard - Positions shift by the second. The math cannot lag.
- Device Insurance Claims Pipeline with Real-Time Fraud Scoring - hard - The claim looks clean. The fraud model disagrees.
- TV Audience Measurement Pipeline with Panel Projection - hard - Set-top boxes tell you who watched. Projection tells you how many.
- Cross-Platform TV and Digital Ad Measurement Pipeline - hard - TV and digital. Same viewer, two measurement worlds.
- Real-Time News Event Detection Pipeline from Social Media Firehose - hard - The firehose is on. Separate signal from noise.
- Capital Markets Intraday Risk Pipeline with BCBS 239 Lineage - hard - Intraday risk, full lineage. The regulator is watching.
- Federated Clinical Trial Data Pipeline - hard - Patient data stays local. Insights have to be global.
- Print Order Ganging and Manufacturing Analytics - hard - One press run, many orders. Group them right.
- Daily Payment Log Pipeline - hard - Three regions, billions of payments, one merchant summary by 6 AM.
- The Booking That Came Three Ways - hard - PMS, OTA, and website all think they took the reservation first.
- The Boutique That Sold in Six Currencies - hard - Every sale is real. The rate it was converted at depends on who is asking.
- The Clock That Runs Two Ways - hard - Nightly batch and live events. One dashboard.
- The Fleet That Never Stops - hard - Every truck is talking. Not everyone can hear them yet.
- Three Providers, One Workout - hard - The same ride, reported three times.
- The Decision Before the Door Closes - hard - The window to stop it is smaller than you think.
- The Migration That Cannot Break Morning - hard - It all works today. Moving it without losing a single report is the hard part.
- Two Million Boxes by Monday Morning - hard - Shipped, maybe. Delivered, debatable.
- The Leaderboard That Costs $25K a Month - hard - Product wants it live. Engineering has a price tag.
- Four Teams, One Topic, No Agreement - hard - Everybody is writing to it. Nobody documented it. Now production is fragile.
- The Analyst Who Saw the Salary Data - hard - Two incidents. One shared lake. The access model was never designed, just assumed.