SQL JOIN Interview Questions
JOIN-focused SQL interview problems with multi-seed grading for data engineer practice.
168 JOIN problems isolated from the data engineer SQL catalog. INNER, LEFT, RIGHT, FULL OUTER, anti-joins via NOT EXISTS, self-joins on inequality, and the many-to-many duplication trap. The largest single source of wrong answers in submission logs is the JOIN that should have been INNER but was written as LEFT.
JOINs are 20 percent of the data engineer SQL interview catalog by count and roughly 30 percent by graded weight. The disproportionate weight exists because JOINs are where most silent bugs live. A LEFT JOIN that should have been INNER produces extra rows with NULLs that pass simple SELECT but break downstream aggregations. A JOIN on a non-unique key in a many-to-many produces a Cartesian explosion that inflates COUNT and SUM silently. A self-join without an inequality predicate returns pairs in both directions plus self-pairs. Data engineer interviewers test for each of these specifically.
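The many-to-many inflation described above is easy to reproduce. The sketch below uses Python's built-in sqlite3 with two hypothetical tables, orders and order_tags (names and data are invented for illustration, not taken from the catalog). It shows a SUM inflated by the bridge fan-out, then one possible fix that collapses the bridge to one row per order before joining.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one row per order, many tag rows per order.
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE order_tags (order_id INTEGER, tag TEXT);
    INSERT INTO orders VALUES (1, 100), (2, 50);
    INSERT INTO order_tags VALUES (1, 'gift'), (1, 'promo'), (1, 'rush'), (2, 'gift');
""")

# Naive join: each order repeats once per matching bridge row,
# so order 1 is counted three times.
inflated = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN order_tags t ON t.order_id = o.order_id
""").fetchone()[0]

# One fix: reduce the bridge to one row per order before joining.
correct = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN (SELECT DISTINCT order_id FROM order_tags) t
      ON t.order_id = o.order_id
""").fetchone()[0]

print(inflated, correct)  # 350 150
```

Neither query raises an error or a warning, which is why the bug survives until someone reconciles the total against the source table.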
Six JOIN sub-patterns appear in this catalog.
- INNER versus LEFT distinction: the seeds include orphan rows on both sides. A LEFT JOIN with a WHERE clause on the right table accidentally converts to INNER because the WHERE filters out the NULL rows from non-matches. The fix is to move the predicate into the ON clause.
- Self-join on inequality: pairs of employees sharing a manager use e1.id less-than e2.id as the strict inequality; not-equal removes self-pairs but still returns every pair in both directions, and omitting the predicate returns self-pairs as well.
- Anti-join via NOT EXISTS or LEFT JOIN with right.key IS NULL: NOT IN against a subquery that returns a NULL silently returns empty because x NOT IN (list containing NULL) is never true, only false or unknown; this is a graded trap.
- JOIN on a composite key with NULL: NULL equals NULL is unknown so the row drops; if NULL means "match" use IS NOT DISTINCT FROM in Postgres or COALESCE to a sentinel value.
- Many-to-many with explicit deduplication: a JOIN to a bridge table inflates by the bridge cardinality, so SELECT DISTINCT on the keys or pre-aggregate the bridge to one row per fact key before joining.
- SCD2 half-open temporal joins: effective_from less-than-or-equal-to event_time AND (effective_to IS NULL OR event_time less-than effective_to); the closed-interval mistake doubles facts at the boundary microsecond, the open-interval mistake drops them.
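The first sub-pattern, a LEFT JOIN that a WHERE clause quietly turns into an INNER JOIN, can be seen in a few lines. This is a minimal sketch using Python's sqlite3; the users and orders tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: cy has no orders, ben has only a refunded one.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
    INSERT INTO users VALUES (1, 'ana'), (2, 'ben'), (3, 'cy');
    INSERT INTO orders VALUES (1, 1, 'paid'), (2, 2, 'refunded');
""")

# Predicate in WHERE: runs after the join, so rows whose o.status is NULL
# (the preserved non-matches) are filtered out. Effectively an INNER JOIN.
where_version = conn.execute("""
    SELECT u.name, o.order_id
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    WHERE o.status = 'paid'
    ORDER BY u.id
""").fetchall()

# Predicate in ON: restricts which orders can match, but every user survives.
on_version = conn.execute("""
    SELECT u.name, o.order_id
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id AND o.status = 'paid'
    ORDER BY u.id
""").fetchall()

print(where_version)  # [('ana', 1)]
print(on_version)     # [('ana', 1), ('ben', None), ('cy', None)]
```

Which version is correct depends on the question asked: "users with a paid order" wants the first shape, "all users and their paid orders, if any" wants the second.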
The most-asked JOIN variant in 2026 data engineer interviews is the self-join for hierarchies and pairwise comparisons (employees-and-managers, products-frequently-bought-together, friends-of-friends). The second-most-asked is the anti-join for "find X without Y" (users who never made a purchase, products never reviewed, departments without any senior engineers). The anti-join has multiple equivalent SQL formulations: LEFT JOIN with IS NULL, NOT EXISTS with a correlated subquery, and EXCEPT. The choice between them is a performance question in optimization rounds. NOT EXISTS short-circuits on first match and is almost always the right pick at scale.
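The anti-join formulations can be compared side by side. The sketch below assumes hypothetical users and orders tables where one order row has a NULL user_id, which is exactly the case where NOT IN diverges from the others. It runs on Python's sqlite3; other engines follow the same three-valued logic here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: user 1 has an order; one order has no user at all.
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

def ids(sql):
    """Run a single-column query and return the values as a list."""
    return [row[0] for row in conn.execute(sql)]

# NOT IN: the NULL in the subquery makes every comparison false or unknown.
not_in = ids("""
    SELECT id FROM users
    WHERE id NOT IN (SELECT user_id FROM orders)
    ORDER BY id
""")

# NOT EXISTS: the correlated equality simply never matches the NULL row.
not_exists = ids("""
    SELECT u.id FROM users u
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id)
    ORDER BY u.id
""")

# LEFT JOIN ... IS NULL: keep only the users with no matching order.
left_join = ids("""
    SELECT u.id FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    WHERE o.user_id IS NULL
    ORDER BY u.id
""")

# EXCEPT: set difference on the key column.
except_ = ids("""
    SELECT id FROM users
    EXCEPT
    SELECT user_id FROM orders
    ORDER BY 1
""")

print(not_in, not_exists, left_join, except_)  # [] [2, 3] [2, 3] [2, 3]
```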
Dialect notes for data engineer interviews. Postgres, Snowflake, and BigQuery support IS NOT DISTINCT FROM for NULL-safe equality. In a dialect that lacks it, you can simulate it with COALESCE to a sentinel value. Hive supports LATERAL VIEW EXPLODE and Presto supports UNNEST for array column expansion, which substitutes for some many-to-many bridges. MySQL 8 added native CTEs and window functions; pre-8 MySQL forces awkward self-join workarounds for ranking. EXPLAIN reading at L5 and above asks the data engineer to identify a hash join versus a sort-merge join versus a nested loop in the plan and explain when each is appropriate.
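A small sketch of NULL-safe matching on a composite key, using Python's sqlite3. The tables a and b and the sentinel '__NULL__' are hypothetical, and the sentinel is only safe if it cannot occur in the real column. SQLite spells NULL-safe equality as IS; in Postgres the same comparison would be written IS NOT DISTINCT FROM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical composite key (k1, k2) where k2 is nullable.
    CREATE TABLE a (k1 TEXT, k2 TEXT, val INTEGER);
    CREATE TABLE b (k1 TEXT, k2 TEXT, label TEXT);
    INSERT INTO a VALUES ('x', NULL, 10), ('y', 'r', 20);
    INSERT INTO b VALUES ('x', NULL, 'x-null'), ('y', 'r', 'y-r');
""")

# Plain equality: NULL = NULL is unknown, so the ('x', NULL) row drops.
plain = conn.execute("""
    SELECT a.val, b.label
    FROM a JOIN b ON a.k1 = b.k1 AND a.k2 = b.k2
    ORDER BY a.val
""").fetchall()

# COALESCE to a sentinel: portable, assuming '__NULL__' never appears in k2.
sentinel = conn.execute("""
    SELECT a.val, b.label
    FROM a JOIN b
      ON a.k1 = b.k1
     AND COALESCE(a.k2, '__NULL__') = COALESCE(b.k2, '__NULL__')
    ORDER BY a.val
""").fetchall()

# SQLite's IS operator treats two NULLs as equal.
null_safe = conn.execute("""
    SELECT a.val, b.label
    FROM a JOIN b ON a.k1 = b.k1 AND a.k2 IS b.k2
    ORDER BY a.val
""").fetchall()

print(plain)      # [(20, 'y-r')]
print(sentinel)   # [(10, 'x-null'), (20, 'y-r')]
print(null_safe)  # [(10, 'x-null'), (20, 'y-r')]
```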
- What is the most common SQL JOIN bug interviewers fish for?
- Writing a LEFT JOIN and then putting a WHERE clause on the right table's column. The WHERE clause filters out the NULL rows from non-matches, silently converting the LEFT JOIN to an INNER. The result looks plausible but drops the rows the LEFT was supposed to preserve. The fix is to move the predicate into the ON clause: LEFT JOIN t2 ON t1.key equals t2.key AND t2.col equals 'X'.
- When does a data engineer use a self-join in SQL?
- Hierarchical data (employees reporting to managers, categories reporting to parent categories), pairwise relationships (friends-of-friends, products-frequently-bought-together), and time-series comparisons (this month's revenue vs last month's). The pattern is to JOIN the table to itself with aliases. For pairwise results without duplicates, use strict inequality (e1.id less-than e2.id). Not-equal (e1.id not-equal e2.id) removes self-pairs but still produces both (A, B) and (B, A); with no predicate at all you also get self-pairs.
- What is the difference between NOT IN and NOT EXISTS in SQL?
- NOT EXISTS is correlated and short-circuits on first match; it performs well on large data. NOT IN often materializes the subquery and performs worse on large data, and it silently returns empty if the subquery has any NULLs because 2 NOT IN (1, NULL) is unknown, not true. For 'users who never made a purchase', use NOT EXISTS (SELECT 1 FROM orders WHERE orders.user_id equals users.id) or LEFT JOIN ... WHERE orders.user_id IS NULL.
- How do you JOIN on a composite key when one column is nullable?
- Standard equality (a.k1 equals b.k1 AND a.k2 equals b.k2) drops rows where either side is NULL because NULL equals NULL is unknown. If NULL means 'match', use Postgres IS NOT DISTINCT FROM or COALESCE both sides to a sentinel value that cannot appear in the data (rare in practice). If NULL means 'no match', the standard equality is correct. Verify the semantic with the interviewer.
- What is a many-to-many duplication trap in SQL?
- Joining a fact to a many-to-many bridge inflates row counts by the bridge cardinality. SELECT SUM(fact.amount) after a JOIN to a bridge that has 3 rows per fact returns 3x the true sum. The fix is to pre-aggregate the bridge to one row per fact key before joining, or to use SELECT DISTINCT on the keys before the SUM. Data engineer interviewers engineer this trap in the seeds; the query passes on data where the bridge happens to be 1-to-1 and fails when it is not.
- When should I use FULL OUTER JOIN?
- Rarely. The legitimate use cases are reconciliation (comparing two sources where either may have rows the other does not, plus matches in the middle) and pivoted comparisons (showing both sides of a delta). In most data engineer interview contexts where FULL OUTER seems right, a LEFT JOIN combined by UNION ALL with the unmatched rows from the other side (a second LEFT JOIN filtered to IS NULL) is clearer and usually performs comparably. Mention FULL OUTER and defend whether it is the right choice.
- How do you JOIN to a slowly changing dimension Type 2 correctly?
- Half-open interval: ON dim.entity_id equals fact.entity_id AND dim.effective_from less-than-or-equal-to fact.event_time AND (dim.effective_to IS NULL OR fact.event_time less-than dim.effective_to). The half-open (less-than-or-equal on the left, strict less-than on the right) is what prevents two dim rows from matching at the boundary microsecond and doubling the fact. Closed-interval joins double; open-interval joins drop.
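The three interval choices for an SCD2 join can be checked against a fact that lands exactly on a version boundary. This is a sketch on Python's sqlite3 with hypothetical dim and fact tables; timestamps are ISO-8601 text, which compares correctly as strings. The helper match_counts is invented for the example and counts how many dimension rows each fact matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical SCD2 dimension: entity 1 moved from 'free' to 'pro' on 2024-02-01.
    CREATE TABLE dim (entity_id INTEGER, plan TEXT, effective_from TEXT, effective_to TEXT);
    CREATE TABLE fact (event_id INTEGER PRIMARY KEY, entity_id INTEGER, event_time TEXT);
    INSERT INTO dim VALUES
        (1, 'free', '2024-01-01', '2024-02-01'),
        (1, 'pro',  '2024-02-01', NULL);
    INSERT INTO fact VALUES
        (100, 1, '2024-02-01'),  -- exactly on the boundary
        (101, 1, '2024-01-15'),
        (102, 1, '2024-03-01');
""")

def match_counts(interval_predicate):
    """Return {event_id: matched dim rows}. The predicate is a trusted constant."""
    sql = f"""
        SELECT f.event_id, COUNT(d.entity_id)
        FROM fact f
        LEFT JOIN dim d
          ON d.entity_id = f.entity_id
         AND {interval_predicate}
        GROUP BY f.event_id
        ORDER BY f.event_id
    """
    return dict(conn.execute(sql).fetchall())

half_open = match_counts(
    "d.effective_from <= f.event_time "
    "AND (d.effective_to IS NULL OR f.event_time < d.effective_to)"
)
closed = match_counts(
    "d.effective_from <= f.event_time "
    "AND (d.effective_to IS NULL OR f.event_time <= d.effective_to)"
)
open_ = match_counts(
    "d.effective_from < f.event_time "
    "AND (d.effective_to IS NULL OR f.event_time < d.effective_to)"
)

print(half_open)  # {100: 1, 101: 1, 102: 1}
print(closed)     # {100: 2, 101: 1, 102: 1}  boundary fact doubled
print(open_)      # {100: 0, 101: 1, 102: 1}  boundary fact dropped
```

Only the boundary event 100 changes between the three variants, which is why the mistake tends to pass on seeds that have no event landing exactly on an effective_from value.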
922 practice problems matching this filter. Difficulty: medium (431), hard (146), easy (345).
SQL (922)
- 10 Lowest Uptime Services - medium - Ten services at the bottom of the reliability chart.
- 2FA Confirmation Rate - medium - Two-factor sent. How many confirmed?
- 2nd Most Common Content Type - hard - Everyone talks about number one. What about number three?
- 30-Day Page View Counts - easy - Thirty days of engagement. Quick snapshot.
- 7-Check Rolling Average - medium - Seven entries hold the trend.
- 7-Day Onboarding Conversion - hard - Signed up Monday. Still here by Sunday?
- 7-Day Token Retention - medium - Premium tokens, day by day.
- 80th Percentile API Latency - medium - The 80th percentile tells the real story.
- 90th Pctl Model Accuracy Gap - medium - Most models are fine. The bottom 10% are not.
- Above Average - easy - Products beating the catalog average.
- Where the Money Burns - medium - Some services quietly burn more than the rest.
- Above Average Interactions - easy - The average user is boring. Who is above?
- Above Average Product Prices - medium - Some products cost more than they should.
- Above Category Average - easy - The category average is one thing. These beat it.
- The Above Average - medium - What counts as good depends on the company you keep.
- Above the Curve - medium - Spenders who break from the pack.
- Active API Tokens - easy - Tokens that have actually been used.
- Active Campaigns - easy - Which campaigns are earning their keep?
- Active Duo - medium - Shoppers who also browse. The overlap is the insight.
- Active Searchers - medium - They typed a query. That means something.
- Active Tokens on Target Date - medium - One specific day. Which tokens were still alive?
- Active User Open Rate - medium - What share of push notifications were opened by active users?
- Active User Penetration Rate - hard - How much of the user base is actually alive?
- Active User Revenue for April - easy - Total revenue from active users in a single month.
- Beyond the Signup - medium - Anyone can create an account. Fewer actually return.
- Active Users With April Transactions - easy - Active accounts that also opened their wallets. How many?
- Presence vs. Participation - medium - Being in the region and being active are two very different things.
- Activity Histogram - easy - How many users did X things? Build the distribution.
- Ad Clickers - easy - Who clicked? What did they spend?
- Adopters Before Migration - hard - They used the old feature. Did they ever touch the new one?
- Ad Revenue 2026 - easy - Annual ad revenue. On the books.
- Ad Revenue by Age Bucket - medium - Ad dollars, sliced by country.
- After Hours API Calls - medium - The office is dark. The API is not.
- The Vote Tally - hard - Net revenue, day by day, for one product in one region.
- Alert Count by Severity Tier - medium - Alerts by severity. The breakdown matters.
- Alert Hotspots by Service and Severity - easy - Some services and severities light up more than others.
- Alert Response Breakdown - medium - An on-call postmortem asks which services are bleeding alerts nobody acknowledges.
- Alert Severity - hard - When the alarms go off, who screams loudest?
- The Severity Matrix - medium - When services cry wolf, the numbers reveal who's serious.
- All Infra Regions - easy - The infrastructure spans the globe. Map it.
- All Known Endpoints - medium - Two tables. One truth. Every endpoint accounted for.
- Allocations in Top Spending Region - hard - The biggest spenders live in one region.
- The Tag Order - hard - Tags arrived in chaos. The system needs them in line.
- Annual Cloud Spend - easy - One year of cloud bills. The total.
- Annual Cloud Spend Summary - easy - A year of cloud bills. Add it all up.
- Annual Pipeline Failures - easy - How many pipelines broke this year?
- API Call Distribution Fraction - hard - Not all endpoints are created equal.
- API Calls With and Without Errors - medium - Some calls succeed. Some do not. Break it down.
- API Calls With Matching Status - medium - Same status, same pattern. Coincidence?
- API Token Churn Rate - medium - Tokens come and go. What's the turnover?
- API Traffic by CDN Edge - medium - CDN paths carrying API traffic. Which edges?
- App Stability by Region - medium - Some regions crash more than others.
- April and May Active Users - easy - Spring cleaning for the user base. Who was actually around?
- Attributable Impression Rate - medium - What share of ad impressions can be traced to a real user account?
- Auction Lot Summary - medium - The hammer falls. Who bid the most?
- Auth Endpoint Callers - medium - Identify users who have called authentication API endpoints.
- Auth Endpoints - easy - Not all endpoints are visible to everyone.
- Authors Deploying to Dev and Production - medium - Dev, staging, production. Who has touched all three?
- Authors With Successful Deploys - easy - Who deployed successfully?
- Auth Service Health Checks - easy - One service. Full audit trail.
- Average Accuracy by Framework - medium - Not all frameworks deliver the same accuracy.
- Average API Latency by Year - medium - Latency year over year. Is it getting better?
- Average Brand Campaign Revenue - easy - A quick benchmark on brand campaigns.
- Average Build Duration by Repo - easy - Some repos build fast. Others don't.
- Average Compensation by Department and Status - medium - Average compensation. Department by department.
- Average DQ Fail Rate - easy - Average failure rate, table by table.
- Average Event Progression Time - hard - How fast do users move through the funnel?
- Average Fulfillment Lag - medium - Ordered, then... waiting.
- Average GPU Node CPU Usage - easy - GPU nodes burning CPU. How much?
- Metric Trend by Department - easy - How each team's numbers moved, year over year.
- Average High-Range Accuracy - easy - The top-scoring models. What's their average?
- Average Initial Call Latency - medium - First contact latency. The benchmark.
- Average Latency by Health Status - easy - Healthy versus degraded. The latency gap is real.
- Average Latency by Status - easy - Each status code has its own latency story.
- Average Node CPU by Region - easy - Average infrastructure node CPU usage broken down by region.
- Average Node Utilization - easy - CPU and memory, region by region.
- Average Rating by Category - easy - Category ratings. Some shine, some don't.
- Average Response Time by Hour - easy - Hour by hour. When does latency spike?
- Average Results for Python Searches - medium - Python searches. What's the click-through?
- Average Review Comments by Author - medium - Some authors get more feedback than others.
- Average Search Endpoint Latency - easy - One endpoint. Average speed.
- Average Search Results Per User - easy - How many results per searcher?
- Average Session Duration - medium - How long do users actually stay?
- Average Session Duration by Device - easy - Session length, device by device.
- Average Sessions Per User - hard - How often do users come back?
- Average Spending by Account Status - medium - Average per-user lifetime spending segmented by account status.
- Average Update Call Latency - medium - Follow-up calls. How fast?
- Average Watch Time by Format - medium - Which content format keeps viewers watching the longest?
- Not All Fires Are Equal - medium - The alert volume varies. So does what it means.
- Who's Holding Up Traffic - medium - Some endpoints carry the product. Others are barely touched.
- The Ones Who Hold Attention - medium - Time on screen is the real vote. Find the creators earning it.
- Bargain Bin - easy - Floor prices. Right before the vendor call.
- Bargains and Budget-Busters - hard - Every region has both. Find them.
- Batch Job Performance Tiers - medium - Every batch job gets a grade.
- Best Accuracy to Training Time Ratio - medium - Fast to train. Accurate too. Which model?
- Best Day for Ad Revenue - medium - One day of the month outperforms the rest.
- Best Selling Product by Month - hard - Every month has a winner.
- Best-Selling Reps Each Month - easy - In every category, a few sellers rise to the top.
- Biggest Deployment Decline - medium - One team's deploy count cratered. Which one?
- Big Spenders - easy - The whale list.
- Binary Flag Indicators - medium - On or off. Every flag at a glance.
- Bottom 2% Services by Spend - hard - The bottom 2% of spenders. Who are they?
- Bottom Endpoints by POST Volume - medium - The quietest POST endpoints.
- Bronze Medal - easy - Two ahead of you. The rest below.
- The Budget Line - easy - Some rows are over. Some are under. Label every one.
- Budget-Friendly Products - easy - Affordable does not mean invisible.
- Build a dynamic report header - medium
- Build Health - medium - Rank every repo by how often CI stays green.
- Builds per Author per Branch - medium - Who triggered what, and where?
- Build Success Rate by Trigger - medium - Which triggers produce green builds?
- Build Success vs Failure by Repo - medium - Green versus red, repo by repo.
- Busiest Pipeline Month - medium - One month, more pipeline runs than any other.
- Busiest Route by Passenger Volume - medium - The busiest route by volume.
- Busy Authors - medium - Some developers spread their commits everywhere.
- Cache Efficiency - hard - Some edges run hot. Others coast on the global average.
- Calculate the median transaction amount (50th percentile) and the 95th percentile transaction amount - medium
- Campaign Bookend Engagement - hard - First impression versus last. The gap.
- Campaign Click-Through Rates - medium - Clicks per impression. Campaign by campaign.
- The Notification That Paid Off - hard - The message went out to thousands. A smaller number actually bit.
- Campaign Conversion Window - hard - A narrow window between impression and action.
- Campaign Cost Effectiveness - medium - Money in, conversions out. What is the ratio?
- Campaign Engagement Rank Shift - hard - Two months, many countries. Who moved up? Who fell?
- Two Names, One Campaign - easy - The ad team and the push team never agreed on naming. Find where they secretly meant the same thing.
- Click-Through by Campaign - medium - Which campaigns actually got the tap.
- Campaign Revenue Totals - easy - Every campaign has a price tag. Total them up.
- Campaigns With Most Clicks - medium - The campaigns getting all the clicks.
- Cart Sizes - easy - Power buyers. Big carts.
- Categories With Mixed Price Tiers - medium - Users who cross content types.
- Category Buyers - medium - Which categories have the broadest reach?
- Category Census - easy - Which aisles are worth restocking?
- Category Deep Dive - hard - Revenue, units, rank. The full category report card.
- Category Revenue - medium - Which categories pull their weight?
- Category Sales Summary - easy - Category by category. How did they do?
- Category-Specific Product Volume - easy - Sum transactions for a specific payment type.
- CDN Image Request Paths - easy - CDN image traffic. Every path.
- CDN-Related DNS Lookups - easy - DNS lookups tied to the CDN.
- CDN Traffic by Day and Hour - medium - CDN traffic, hour by hour.
- Character Position in Endpoint - easy - URL patterns, character by character.
- Chat Activity - easy - Which channels are ghost towns?
- Cheapest CDN Route - easy - The cheapest path across regions.
- Cheapest Cost Per Region - easy - Lowest spend per region.
- Cheapest High-Rated Product - medium - Cheap and highly rated. A rare combination.
- Cheapest Transaction per User - easy - Everyone has a smallest purchase.
- The Quiet Outlier - hard - Ignore what the traffic does all day. Find the spike that barely showed up.
- Classify Services by Name - medium - The name tells you what it is. Mostly.
- Clean Averages - easy - Merchandising only cares about the categories customers actually rate.
- Clean Cache CDN Edges - easy - Cached, clean, error-free edges.
- Clean Latency Cast - easy - The latency column is a string. It should not be.
- Clicked Ad Impressions - easy - They saw the ad. They clicked.
- Loyalty's Double Tap - medium - When a nudge and a banner team up.
- Click Rate - medium - Campaigns nobody clicks.
- Click Revenue - easy - Which campaigns are earning their keep?
- Click vs Non-Click Rates - medium - Some searches lead to clicks. Most do not.
- Cloud Bill - easy - Which cost buckets are bleeding money?
- Cloud Cost Breakdown by Provider - hard - Cloud costs, provider by provider.
- Cloud Cost by Team - easy - Spend by team. Who's burning most?
- Cloud Cost Stats by Provider - medium - Three providers. Three very different bills.
- Cloud Cost Trend Analysis - medium - Cost trends across billing periods.
- Combined Cloud Spend by Region and Service - medium - Region by region. Service by service. Where does the money go?
- Commit Cadence - hard - Some repos go quiet for too long.
- Commit Royalty - medium - In a sea of commits, only a few wear the crown.
- Common Age Buckets - easy - Duplicate records hiding in the users table.
- Completed Priority-1 Jobs - easy - Priority one. Completed.
- Completion Rate - medium - Not every region closes orders cleanly. The percentages tell the story.
- Compute Nodes in Key Regions - easy - Compute nodes across the key regions.
- Consecutive Cost Growth Periods - hard - Five straight months of spending increases.
- Consistent High-Quantity Revenue - medium - Big orders, consistent revenue. A rare combination.
- Content by Specific Users - easy - Two creators. What did they publish?
- Content Duration Snapshot - easy - A popularity snapshot by duration.
- Content Mix - easy - One content format to bet the quarter on.
- Content Page Spreads - hard - Content, laid out in two columns.
- Content Published in 2026 - easy - Published back then. Still relevant?
- Content Recommendation Engine - medium - Pages they haven't discovered yet.
- Content Session Counts - medium - Session metrics, content item by item.
- Content Sorted by Duration - easy - The catalog, sorted by length.
- Content Type Distribution - easy - How many of each content type?
- Content Types by Creator - easy - One creator. What did they make?
- Content Viewer Penetration - easy - What share of the user base has viewed at least one piece of content?
- Cost Density Extremes - medium - Some regions pack more cost per node than others.
- Cost Efficiency Ratio - easy - Dollars in, value out. What's the ratio?
- Cost Efficiency Variance - hard - Cost efficiency varies. By how much?
- Cost Share Within Category - medium - Each entry's slice of the category total.
- Service Roll Call - easy - The mesh is sprawling. Find out exactly how many services are actually running.
- Regional Footprint - easy - Every node costs money. Know what you own.
- CPU Utilization Summary - easy - The CPUs are working. How hard?
- Creator Favorite Content Type - hard - Every creator has a go-to format.
- Creators With Top-Rated Content - medium - Top-rated content. Who made it?
- Cross-Variant User Pairs - medium - Same experiment. Different variants. Who overlaps?
- The Slow Build - medium - Month over month, the number grows. Track how the average moves with it.
- Cumulative Sales Per Customer - medium - Each purchase adds to the running total. Watch it climb.
- Currently Active Feature Flags - medium - Which flags are live right now?
- Customer Full Name Concat - easy - First name, last name. Combine them.
- Customers Without Orders - medium - Customers who have never ordered.
- Custom Message Type Counts - medium - Not all messages are created equal.
- Daily and Weekly Active Users - easy - One metric by day, one by week. Same users, different lenses.
- Daily Cross-Platform Users - easy - Mobile and web. Same day, same users?
- Ship It or Skip It - easy - The calendar doesn't lie. How aggressive is this team, really?
- Daily Error Count Change - medium - Errors, trending up or down?
- Daily Error Resolution Ratio - medium - Reported versus removed. The daily ratio.
- Deploy Velocity Swings - medium - Month to month, who sped up and who stalled.
- Daily Net Revenue - hard - Net revenue, day by day. Refunds included.
- Daily Session and User Counts - medium - Sessions and users, day by day.
- Campaign Click Rate - medium - Among engaged users, which campaigns landed.
- Daily Top Endpoints - medium - Three winners each day.
- Data Quality - hard - Failed checks pile up. Which tables need the most attention?
- Data Repo Fix Commits - medium - How many commits start with 'fix'?
- Days with More Edited Than Unedited Messages - medium - Some days, more messages get edited than sent.
- The Freshest Record - medium - Duplicates everywhere. Only the most recent version of the truth survives.
- The Clean Aisle Numbers - medium - Clear the noise. What did each category actually earn?
- Department Cost by Status - medium - Headcount and compensation. The dashboard view.
- The Org Chart in Numbers - hard - Headcount by department, sliced by quarter. Every seat accounted for.
- Department Running Totals - medium - Compute cumulative metric values within each department using window operations.
- Department Snapshot - medium - Who is underperforming and who is excelling?
- Department Spend Difference - easy - The compensation gap between departments.
- Department Spend Gap - easy - Gap between Engineering's and Marketing's biggest single purchase.
- Deploy Author Performance Score - medium - Not all deployers are equally reliable.
- Deploy Cadence - easy - Which environments ship the most?
- Deploy Count by Service - easy - Some services deploy constantly. Others barely at all.
- Deployed Models by Framework - easy - Which frameworks are actually in production?
- Deployment Duration by Status - easy - Fast deploys versus slow ones. By outcome.
- Deployment Failure Impact - medium - When deploys fail, how bad is the blast radius?
- Deployments per Environment - medium - Dev, staging, prod. Where do most deploys land?
- Deployments Without Alerts - easy - Deployed without a single alert. Suspicious or impressive?
- Deploy Reliability Scores - medium - A reliability scoreboard for deploy teams.
- Deploy Velocity - hard - Days between deploys. Some services ship fast, others crawl.
- The Apprentices Still in the Forge - easy - A model is not a model until it stops learning and starts earning.
- Device Mix - easy - The device breakdown before the redesign.
- Devices Per Age Bucket - medium - Device diversity among the younger users.
- Device Type Serving Most Users - medium - One device type serves more users than the rest.
- Device Types With Chrome Users - easy - Power users and their devices.
- Disabled Feature Flags - easy - Disabled flags. Still worth auditing.
- Disabled-Flag Share by Owner - medium - Which teams ship everything off by default.
- Distinct Blog Referrers - easy - Where did the traffic really come from? No repeats.
- Distinct Chat Conversations - medium - How many unique conversations?
- Distinct Product Categories - easy - A quick category inventory.
- Diverse Shoppers - medium - They shop the whole catalog.
- DQ Fail Rate by Table - medium - Pass rates, table by table.
- DQ Score Spread - medium - The spread in data quality scores.
- Duplicate DQ Check Records - medium - Passed QA twice. That's the problem.
- Duplicated User Event Messages - medium - Duplicated messages from the alerts topic.
- Duplicate Training Runs - medium - Same model, trained twice.
- Early Commit Velocity by Author - medium - How productive was each author during the first year of a repo's CI pipeline?
- Early 2026 Data Pipelines - easy - Early-year data pipelines.
- Early User Activation - medium - Activated early. A good sign.
- Efficient Pipeline Throughput - medium - Throughput per pipeline. The benchmark.
- Email Census - easy - The reachability split.
- Employees Per Department - easy - Headcount, location by location.
- Endpoint Latency Spread - medium - Latency spread across endpoints.
- Verbose by Design - hard - Audit endpoint paths. Length without the outer slashes, and how many segments.
- Endpoint Performance Report - medium - Every endpoint has a speed and a reliability story.
- Endpoint Ranking - hard - The slowest endpoints. Called to the principal's office.
- Endpoint With Most GET-Only Users - medium - Read-only users have a favorite endpoint.
- Engagement by Content Type - medium - Some content types get all the attention.
- Engagement Gap - medium - Zero transactions is still a data point. Count everyone.
- Error Category Breakdown - hard - Postmortem time. Categorize the errors.
- Error Hall of Fame - medium - The year's worst error categories.
- Fault Lines - medium - Errors by day and region. Some areas are worse than they appear.
- Error Severity Buckets - easy - Errors sorted by how much they hurt.
- Errors With Service Health - easy - Error data, enriched with health context.
- Even-ID February Signups - easy - A very specific slice of a very specific cohort.
- Even-ID June Signups - easy - Odd IDs, even IDs. The filter is precise.
- Event Count on Key Days - easy - Key days. Key event volumes.
- Events by Month Across Years - easy - Month by month, year by year. The pattern emerges.
- Event Types Spanning Multiple Months - easy - Some events span seasons.
- Exact Keyword Counts in Logs - hard - Errors and warnings. Count every single one.
- Exclusive Users per Device Type - medium - Loyal to one platform only.
- Expensive AWS Services - easy - Some AWS services quietly drain the budget.
- The A/B Verdict - medium - Variant A or Variant B. The conversion numbers pick the winner.
- Experiment Impact - hard - Which experiments moved the needle? Settle the standings inside every variant.
- Experiment Variant Ratios - hard - Control versus treatment. The participation split.
- Extract Deploy Versions - medium - The version number is buried in the log.
- Extreme API Token Usage - medium - Outlier tokens. Suspiciously busy.
- Extreme Category Totals - medium - The highest and the lowest. Both are interesting.
- Extreme Headcount Departments - easy - The pay extremes tell a story.
- Extremely Late Resolutions - medium - Twenty minutes past the SLA. Still unresolved.
- Broken Promises Between Tables - medium - Every foreign key is a pinky-swear. Count the ones that got broken.
- Rollback Roulette - easy - Some ships sink before they leave the harbor.
- Failure Rate - medium - Build failures happen. Which repos break the most?
- The High and the Low - hard - The fastest and slowest in every region.
- Fastest CI Build Date - medium - The fastest build ever. When did it happen?
- Fastest Completion Per Day - medium - Every day has a speed champion.
- Fastest Page View to Click - hard - How fast from view to click?
- Fastest Regions by Latency - medium - The fastest regions. Benchmarked.
- Feature Flag Adoption - medium - How widely adopted are the flags?
- Feature Flag Engagement Impact - hard - Flags on versus flags off. The engagement gap.
- Feature Flag Fan vs Detractor Pairs - hard - Some users love the flag. Others want it gone.
- Feature Name Intersection - hard - Training names versus serving names. The overlap.
- Feature Quality by Source - medium - Quality varies by source.
- Features With Missing Values - easy - Missing data in the features.
- Feature Vote Winner - medium - Users voted with their clicks. Who won?
- February 2024 Signups - easy - One signup window. One cohort. Who joined the club?
- Not From Around Here - easy - The data is mixed. Only some of it belongs.
- Filtered User Roster - easy - A clean roster for the all-hands.
- Find active users in the '25-34' age bucket - medium
- Find all products with a price between 20 and 80 (inclusive) - medium
- Find all products with a price greater than 25 - medium
- Find Deploy Authors - easy - Same person. Many different spellings.
- Find the Fifth Largest Cost - medium - Not the biggest. Not the smallest. The fifth.
- Find transactions over $100 with quantity greater than 2 - medium
- Find users whose age_bucket is '18-24', '25-34', or '35-44' - medium
- First and Last Peak Accuracy Dates - medium - Peak accuracy. When it first hit and when it last did.
- First and Last Timeout Per Service - medium - First timeout. Last timeout. Each service.
- First Build per Repository - easy - Every repo had a first build.
- The Ninety-Day Comeback - hard - Everyone shows up once. Who comes back before the quarter ends?
- First Deploy Attribution - medium - The first deploy per service.
- First Half of Page Views - medium - Half the data. The first half.
- First Interaction Credit - hard - Attribute transactions to the earliest touchpoint.
- First Migration Record - easy - The very first migration. Where it all began.
- First Contact - easy - Every pipeline has a first run. This is what it brought back.
- First Time Learners Per Day - medium - Brand new users, day by day.
- First Touch Attribution - medium - The first interaction matters most. Or does it?
- Flag Check - easy - Which flags are actually live?
- Flatten Org Chart Hierarchy - hard - The tree runs deep. Walk every branch to the root.
- Frequent Message Senders - medium - Someone is sending too many messages.
- Friday Sessions for Shared Experiments - medium - Friday vibes only. Same experiment, different users.
- Friday Spending Analysis - hard - Friday spending during Q1.
- Did We Actually Make Money? - medium - Cancelled deals don't count. Of the rest, how many paid off?
- Full Customer Order List - easy - Every customer. Every order. The full picture.
- Full Funnel - hard - Search. Browse. Buy. Only a few do all three.
- Engagement Depth by Event - hard - Where users actually spend their attention.
- Gateway Connection Timeouts - easy - Timeouts at the gateway.
- Ghost Products - medium - Listed but never sold. The shelves collect dust.
- Health Check Distribution - easy - Pass, fail, degraded. The distribution.
- Health Checks per Service - easy - Some services get checked constantly.
- Healthiest Service Check History - hard - The healthiest service. Full history.
- Heavy Ad Exposure - medium - Saturated with ads. Is it too much?
- Heavy Hitters - medium - Some repos never sleep.
- Heavy Namespaces - medium - Kubernetes has favorites. Some namespaces carry more weight.
- Repeat Offenders of the Search Bar - easy - Once is a fluke. Twice is a habit.
- High and Critical Alerts in 2026 - easy - High and critical alerts from that year.
- High Engagement Pages - hard - Some pages hold attention longer than others.
- Higher Performing Variant - easy - Control versus treatment. One wins.
- Higher Than Supervisor - easy - When the student outscores the teacher.
- Highest and Lowest Cloud Costs - medium - The extremes in cloud spending.
- Highest Cost Per Team - easy - Peak cost, team by team.
- Highest Daily Spend - medium - Somewhere in that window, someone broke the spending record.
- Highest Latency Endpoints - easy - The slowest endpoints. Everyone notices.
- Hottest Regions by CPU - medium - Where the fleet runs warmest.
- Highest Throughput Pipelines - medium - The pipes that carry the most water.
- High-Output Creators - easy - High engagement creators.
- High Price Products - easy - Everything above 100.
- High-Rated In-Stock Percentage - easy - Highly rated and in stock. A rare combo.
- High-Spend 2025 Campaigns - easy - Big-budget campaigns from last year.
- High-Traffic Endpoints in February - easy - When traffic spikes, some endpoints get buried. How many crossed the line?
- High-Value Electronics - easy - The five priciest electronics.
- High Volume Batch Jobs - easy - Batch jobs that processed millions.
- Holiday Promo Campaign Click Year - easy - One year, the holiday campaign exploded.
- Holiday Sale Campaign Revenue - easy - The holiday sale campaign. How did it do?
- Idle Team Members - easy - Sprint started. Some people never got assigned.
- Impressions by Search Keyword - hard - Campaign performance, keyword by keyword.
- Inactive Android Control Users - medium - Android control cohort. Gone quiet.
- Inactive Unverified Users - easy - Signed up. Never verified. Never came back.
- Inactive Users in Date Range - medium - Ghost accounts. Active signup, zero sessions.
- Inactive vs Suspended Engagement - medium - Premium versus free. The engagement gap.
- Incident Keyword Messages - hard - Certain words trigger an investigation.
- What's in a Name - easy - Group by the first letter, count the heads, show the share.
- Actually Available - easy - The catalog is big. The shelf is smaller.
- Intra-Region Latency Diff - hard - Same region. Different latency.
- iOS Adoption by Age Bucket - medium - The install numbers don't match the hype.
- iOS Sessions by Device Type - medium - iOS engagement, device by device.
- Japan Revenue for April - easy - Last month's numbers for one region.
- Job Status Duration - medium - How long in each job state?
- The Full Picture - easy - Two tables know different things about the same people. Combine them.
- The Row Count Surprise - easy - Same tables. Different handshakes. Wildly different results.
- Keep Most Recent Record - medium - Carbon copies clutter the table. Only the latest matters.
- Keyword-Based User Search - medium - The search terms reveal intent.
- Largest A/B Test by Participants - medium - The biggest experiment ever run.
- Largest CDN Response - hard - One edge location served something massive.
- Largest Group - easy - One group towers above the rest.
- Largest Single Cloud Cost - medium - One line item. The biggest bill of all.
- The Last Checkout - medium - Their last visit. Everything in the bag.
- Last Five Batch Jobs - easy - The last five. A quick tail check.
- Last Migration Record - easy - The most recent migration. Is it the last?
- Last Server Activity - easy - Each server's last heartbeat.
- Latency Gap to 10th Fastest - medium - One server. Compared to the 10th fastest.
- Latency Quartiles Per Endpoint - hard - Quartile breakdowns. Endpoint by endpoint.
- Latency Variance and Std Dev - hard - How much does latency actually vary?
- Latency vs Regional Average - easy - Each service versus its region's average.
- Latest Commit Build Cost - medium - The latest commit came with a build cost.
- Latest Metric Values - easy - Stale records hiding in the metrics.
- Latest Migration Output per Author - medium - Each author's most recent migration output.
- Latest Session Per User - easy - Everyone has a most recent session.
- Latest Version Per Service - easy - The latest version deployed. Each service.
- Leading ML Frameworks by Accuracy - medium - Which frameworks lead on accuracy?
- Least Viewed Content - medium - Nobody is watching. Should it still exist?
- Log Entries by Level - easy - Info, warn, error, fatal. The breakdown matters.
- Log Levels - easy - Severity breakdown with response times.
- Log Priority - easy - Which servers are on fire before coffee?
- Log Volume by Day of Week - easy - Some days are noisier than others.
- Longest Active Membership Streak - easy - The longest unbroken streak.
- Longest Deploy With Full Identifier - easy - The longest deployment. Full ID.
- Longest Gap Between Token Events - medium - The longest gap between token events.
- Longest Running Pipeline - medium - One pipeline outlasted them all.
- Longest Uptime Streak - hard - Pass, pass, pass. How long until fail?
- Longest Visit Streaks - hard - Day after day after day. Who kept coming back?
- Long Messages - medium - Some commit messages tell a novel.
- Long-Running Feature Flags - medium - Flags that have been on for too long.
- Long Searches Containing 'er' - easy - Long queries with 'er'. A pattern?
- Low-Byte CDN Responses - easy - Tiny responses from the edge.
- Low-Engagement Sessions - medium - Users whose average session duration is below the engagement threshold.
- On Their Way Out - easy - They signed up. They never really showed up.
- Lowest Average Price Category - easy - The cheapest category. Not necessarily the worst.
- Cheapest Line for Network-Heavy Teams - medium - Among the network spenders, the smallest single line.
- Lowest CPU Pods per Namespace - hard - The five lightest pods per namespace.
- Lowest Latency per Service - medium - The fastest response each service ever gave.
- Low Latency API Calls - easy - Fast endpoints. Confirmed fast.
- Low Severity Checks in 2026 - medium - Low severity. High volume.
- Low Severity DQ Checks - easy - Low severity checks. All of them.
- Low Throughput Pipelines - easy - Pipelines barely moving data.
- Low Uptime Services - easy - Underperforming services.
- Low-Volume Stream Topics - medium - Quiet topics in the stream.
- March Revenue by Customer - medium - One month, every customer, every dollar accounted for.
- Market Share - hard - Every category wants a bigger slice.
- Max Value Per Location - easy - Every location has a peak.
- Active Token Owners in 2026 - easy - Active token owners this year.
- Median Cloud Cost by Service - hard - The median cloud bill, service by service.
- Median Failure Rate by Table - hard - Half the tables fail more than this.
- Median Household Earnings - hard - Household earnings. The median reveals the middle.
- Median Model Accuracy - hard - The median accuracy. Not the mean.
- Median Null Percentage of Float Features - medium - Nulls in float columns. How widespread?
- Median Transaction by Category - hard - The middle transaction in each category.
- Memory-Heavy Pods - easy - Memory-hungry workloads.
- Mentorship User Pairs - medium - Pair them up. Mentor and mentee.
- Merge-Triggered Builds 2026 - easy - How many builds came from merges this year?
- Message Length - easy - Verbose commits. Risky changes?
- Messages Containing Keyword - easy - Flagged terms in the messages.
- Messages From Specific Users - easy - Specific users. What did they say?
- Metric Range by Department - medium - Where each team's numbers sit, low to high.
- Metric Range Per Group - easy - The spread within each group.
- Metric Value Pairs Over Threshold - medium - Two metrics, both above the line.
- Metric Value Quarter Complement - easy - Two metrics that accidentally match.
- Metric Volatility Gap - easy - Stable metrics are boring. Volatile ones need attention.
- Mid-CPU Nodes - easy - Not the heaviest. Not the lightest. The middle.
- Mid-Range Cost Allocations - easy - Not the cheapest. Not the priciest. The middle.
- Mid-Range Team Spenders - hard - Above average but not extreme.
- Mid-Tier Batch Jobs - easy - Not the biggest, not the smallest. The overlooked middle.
- The Floor Price - medium - Before the negotiation, find what each provider really charges at its cheapest.
- Minimum Parallel Workers - hard - Too few workers and it stalls.
- Missing Email for Non-Active Users - easy - No email on file. No recent activity. Something smells off.
- Mobile Event Counts - easy - Mobile engagement, device by device.
- Mobile vs Desktop Session Duration - medium - Mobile versus desktop. Who stays longer?
- Model Accuracy Drift - hard - Accuracy used to be higher.
- Models With Variable Accuracy - medium - Accuracy should be stable. These models are not.
- Model Training Completion Rate - medium - How many models finished training?
- Most-Allocated Service - hard - The service every big team keeps paying for.
- Monthly Active Users per Endpoint - easy - One endpoint, many users. Which ones showed up?
- Monthly Category Totals - easy - Sum amounts by category and month.
- Monthly Cloud Cost Forecast Error - hard - The forecast was off. By how much?
- Monthly Cohort Retention - medium - Compute month over month retention rates for user signup cohorts.
- Deploy Outcomes by Service - hard - Success, failure, rollback: side by side.
- Thirty Days of Shipping - easy - A month in the life of an engineering team, counted one deploy at a time.
- Monthly Revenue Change - hard - Revenue, month over month.
- Monthly Revenue Comparison - medium - Last month versus this month. Per product.
- Monthly Running Total - medium - Cumulative sales per product across months.
- Monthly Service Retention - hard - Users came back. Or they did not.
- Monthly Signup Counts - easy - Signups, month by month.
- The Cloud Bill - medium - Every provider sent an invoice. Every month tells a different story.
- The Spending Rhythm - easy - Every month tells a spending story, user by user.
- Monthly Transaction Summary - medium - A monthly engagement summary.
- Monthly Unique Users per Campaign - easy - Monthly reach, campaign by campaign.
- Month With Fewest Deploys - medium - One month, nobody deployed.
- Morning Warning Logs - easy - Warnings before noon.
- Most Active Chat Users - medium - The loudest voices on the platform.
- Most Active Recent Committers - medium - Who has been writing the most code lately?
- Most Active Servers by Log Volume - medium - The busiest servers by log volume.
- Most Commented Code Review - medium - The code review that started a debate.
- Most Common Export Job Status - easy - The most common job status.
- Most Common Monday Outcome - medium - Mondays have a pattern.
- Most Efficient API Endpoint - medium - Best throughput per call.
- Most Efficient High-Volume Campaign - hard - High volume. Low cost. The dream campaign.
- Most Efficient Region by Token Usage - hard - Some regions squeeze more out of every token.
- Most Frequent Error Types - medium - The errors that keep coming back.
- Most Ordered Product by Country - medium - Popular products in specific markets.
- Most Popular Content Type - medium - The content type everyone prefers.
- Most Popular Signup Day - medium - One day of the week wins on signups.
- Most Profitable Region Month - medium - One region, one month. Peak profit.
- Most Recent Token Usage - easy - Each user's latest token activity.
- Multi-Category Buyers - hard - One-category shoppers are boring.
- The Tiebreaker - easy - One column wasn't enough. The second column settles it.
- Multi-Host Regions by Node Type - medium - Some regions are quietly building empires.
- Multi-Month Active Users - hard - Active this month and last month. Who stuck around?
- Multi-OS Users - easy - iOS today, Android tomorrow.
- Multi-Provider Cost Lookup - easy - AWS, GCP, Azure. Side by side.
- The Three-Way Report - medium - Three tables. One summary. Every piece depends on the others.
- Multi-Variant Experiments - easy - One user, multiple experiments.
- Mutual Channel Connections - medium - Two users. What channels do they share?
- Negative Outcome Rate for New Users - medium - New users have a rough first two weeks.
- Net Lines - medium - Some authors build. Others trim. The net tells the truth.
- Never-Ordered Products - easy - In the catalog. Never purchased.
- New Customers Per Day - medium - Count users whose first order falls on each date.
- New Services With Poor Health - hard - New services, already struggling.
- New User Purchases - medium - What's this year's signup cohort worth so far?
- New vs Returning User Share - hard - Fresh faces versus familiar ones.
- Nodes by Region and Type - medium - Broken down by region. Broken down by type.
- Nodes in Key Regions - medium - Six regions. How many nodes in each?
- Nodes in Target Regions - easy - The target regions need attention.
- Node Summary Per Region - easy - Every region has a node story.
- Node Utilization - hard - Overloaded nodes hiding in busy regions. Spot the hot spots.
- No Gaps - easy - Zero blanks. A clean contact list.
- Noisiest Tables by DQ Failures - medium - The tables that fail the most checks.
- Noisy Endpoints - medium - The routes generating the most noise.
- Non-Bot Acknowledged Alerts - easy - Human-acknowledged alerts only.
- Non-Draft Content - easy - Everything except drafts.
- Non-Trivial Fatal Errors - medium - Short errors are noise. Long ones matter.
- Normalization Tradeoffs in Practice - hard - Clean data or fast queries? You can't always have both.
- Notification Delivery Ratio - medium - Sent versus delivered. The gap is the problem.
- Notification Open Rate - medium - Sent versus opened. The rate.
- Did Anyone Actually Read It? - easy - A push isn't a win until a thumb taps it.
- The Weekly Pulse - medium - Notifications by platform and day. When does the audience actually show up?
- Nth Highest Salary Per Department - medium - Third place in every department.
- Nth Largest Value - easy - Select the row with a specific rank position.
- The Vanishing Rows - easy - Some records disappear when the tables meet. Figure out why.
- Oldest Alert per Service - hard - The oldest unresolved alert per service.
- Oldest and Newest User Sessions - easy - The extremes of the user base.
- The Scorched Earth Reviews - easy - Someone was unhappy. Find out how many times.
- Opened Notifications in Jan-Feb - medium - Two months of push notifications. How many were actually read?
- Overall Average API Latency - easy - The overall average. Across everything.
- Over-Budget Services - medium - Over budget. Flagged.
- Overlapping User Sessions - medium - Two sessions, one user, same clock. Something overlaps.
- Overloaded Infrastructure Nodes - medium - CPU above 90. Memory above 80. Red alert.
- Pages Viewed by Session Duration - medium - Longer sessions, more pages? Check.
- Pairwise Latency Maximum - medium - Every pair compared.
- Peak Activity by Device - easy - Activity windows, device by device.
- Peak Ad Revenue Moment - easy - The single peak earning moment.
- Peak API Hour - medium - The hour when traffic peaks.
- Peak Concurrent Batch Jobs - medium - Jobs pile up. Find the moment the scheduler sweats the most.
- Peak Concurrent Pods - hard - The most pods alive at once.
- Peak Concurrent Tokens - hard - How many tokens were alive at the same time?
- Peak Hour Power Callers - medium - One hour. The phone lines exploded.
- The Ides of March - medium - Every endpoint has one March it would rather forget.
- Peak Metric Per Department - easy - Peak metrics for the quarterly deck.
- Peak Non-Converting Month - easy - Everyone showed up. Nobody bought anything.
- Peak Retargeting Revenue Month - medium - Retargeting revenue. The peak month.
- Peak Satisfaction - easy - Which departments are winning on satisfaction?
- Peak Spending Month - easy - One month, the bill was unforgettable.
- Ghosts in the Scheduler - easy - It says running. It has been running.
- Pipeline Completion Rate - medium - How far do users get through the flow?
- Pipeline Duration vs Throughput - hard - Does throughput correlate with duration?
- Pipeline Overhead by Environment - medium - Production overhead versus staging.
- Pipeline Recovery by Priority - medium - Recovery time, priority by priority.
- Pipeline Run History - easy - The lineage trail.
- Pipeline Throughput Ratio - easy - Compute current-to-initial value ratio per period.
- The Event Breakdown - medium - Events are piling up by type. The report needs them side by side.
- Platform Check - easy - OS and device combos. Which sessions last longest?
- Platform Speed - medium - Which devices keep users longest?
- Platform Team Feature Flags - easy - The platform team owns a lot of flags.
- Platform Team Mobile Flags - easy - Mobile flags under platform ownership.
- Pod CPU to Memory Ratio - medium - CPU versus memory. Resource efficiency.
- The Stable and the Restless - easy - Some pods never restart. That could mean anything.
- Popular Categories - easy - Merchandising only cares about categories big enough to negotiate shelf space.
- Power Users - medium - Engagement separates tourists from regulars.
- Power Users by Session Activity - medium - More sessions. More time. The power users.
- The Regulars - medium - Past a certain threshold, casual becomes committed.
- Previous Day Top Service - hard - Yesterday's top spender.
- Price Check - easy - Priced to sell or priced to sit?
- Price Pairs - hard - Same shelf, wildly different stickers. Spot the pricing gaps.
- Price Rank - medium - In every category, someone charges the most. Who's on top?
- Priciest Item in Each Category - medium - The most expensive item per category.
- Shipped to Prod - easy - Staging is safe. Production is real. How many made the jump?
- Production Deploys From April Onward - easy - After the cutoff, how many times did prod get a push?
- Product Name Letter Replace - easy - A quick text transform on product names.
- Product Name Prefix - easy - Just the first three characters. That is all.
- Everybody Wants a Bigger Screen - easy - The search bar never lies about what people actually want.
- Product Ratings vs Sales - medium - Do higher ratings actually mean more revenue?
- Product Revenue Ranking - easy - Rank them by revenue. See who leads.
- Products Without Sales - easy - Listed but never sold.
- Products With Strong Unit Price - medium - Budget-friendly and high-performing.
- Product Transaction Counts - medium - Show how many transactions each product has, sorted by product ID.
- Profitable Categories by Price - easy - The most profitable categories.
- Profit Tiers - medium - High, moderate, or in the red. Every order gets a label.
- Prolific Authors in Largest Service Teams - medium - Senior leads in the biggest teams.
- Promo Campaign Cost per Acquisition - easy - The campaign ran. What did each customer cost?
- Provider Cost Change H1 - easy - Cost swings in the first half of the year.
- Provider Spend Variance Between Halves - medium - Two time windows. Did the cloud bill go up or down?
- Purchase Log - easy - Names on receipts, not just IDs.
- Push Notification Open Rate - medium - Push sent. How many opened?
- The Notification Lifecycle - medium - Sent, opened, ignored. What happened after the alert went out?
- Push Opens by Platform and Campaign - medium - Opens by platform and campaign.
- Q2 Search Volume - easy - Q2 search volume. The numbers.
- Quarterly Consolidated Cloud Costs - medium - Quarterly cloud spend, weighted.
- Q by Q - easy - Thirteen weeks. This is how the team spent them.
- Quarterly Peak Cloud Costs - hard - Every quarter has a peak bill.
- Quarter-over-Quarter Latency Trend - hard - Latency trending up or down? The quarters have the answer.
- The Relentless Searchers - medium - Most users look once and leave. A few never stop looking.
- Rapid Retry Detection - medium - Detect retried API calls within 5 minutes of failure.
- Rarest Latency Value - hard - A latency value that appeared exactly once.
- Rate Limit Rules Per Endpoint - medium - Threshold rules, endpoint by endpoint.
- Rating Tiers - medium - No gaps, no skips. Ratings stacked tight within each category.
- Recent Price Drops - medium - The price just dropped. Who noticed?
- Recurring Error Types - easy - The same errors, recurring.
- Regional Order Summary - medium - Region by region. The order numbers tell the story.
- Regional Profits - easy - P&L by region. Before the board meeting.
- Regional Sales Growth QoQ - hard - Quarter-over-quarter growth. Region by region.
- Regional Status - easy - The full regional breakdown.
- Regions by Alert Volume - medium - Some regions are quiet. Others never stop screaming.
- Selling Where Nobody Lives - medium - Shipments land in regions our customer list has never heard of.
- Regions With 5+ Nodes - easy - Regions with five or more nodes.
- Region With Best Uptime - medium - The single most reliable region.
- Region With Most Nodes - medium - Which region hosts the most?
- Repeat Buyers Across Halves - medium - First half buyer. Second half buyer. Same person.
- The Subscription Ghost - medium - Some charges come back to haunt the same card a month later.
- Repeat Purchases Within a Week - medium - They bought again within seven days.
- Repeat Purchase Window - medium - The retention squad is looking for repeat purchasers.
- Repository Commit Ranking - medium - Lines added tell the story of a repo's ambition.
- Repos with More Builds Than Commits - medium - More builds than commits. Something is off.
- Resolved vs Unresolved Alerts - hard - Resolved versus open. By severity.
- Response Buckets - medium - Fast, normal, or slow. Every API call gets a verdict.
- Retargeting Campaign Impressions - easy - Retargeting impressions. All of them.
- Retried Failed API Calls - medium - Spot users who retry API calls within 5 minutes of a failure.
- Returning Buyers - medium - They came back and bought again.
- Revenue by Product - easy - Which products carry the revenue line?
- Two Names on the Ledger - easy - Two accounts. One ledger. Watch the spend stack up.
- Revenue Per Product With Zeros - medium - Total revenue per product. Even the zeros.
- Reviewer Performance Metrics - medium - Some reviewers are thorough. Others are fast.
- Reviewers Per Repo Per Year - medium - Reviewers per repo, year by year.
- Reviews Per Reviewer - easy - The workload split across reviewers.
- Revoked Tokens by Scope - medium - Banned tokens, sorted by what they had access to.
- Rolling Revenue Average - hard - Smooth out the revenue bumps. The trend matters more.
- Rolling Weekly Total - medium - Seven days at a time, the totals keep rolling forward.
- Rows With Multiple Flag Conditions - medium - Rows caught by multiple flags.
- Runner-Up Cost Without ORDER BY - medium - The second highest. Without sorting.
- Running Node Pairs - easy - Two servers, same region, both alive.
- Running Tab - medium - Every purchase adds to the total. Watch the tab grow.
- The Accumulator - hard - A total that builds row by row. Structure the query to match.
- Rush Hour API Latency - medium - Rush hour hits the API differently.
- Same-Day Session and Transaction Correlation - hard - Same day session and purchase. Connected?
- Honeymoon Phase - medium - How many wallets stay loyal the same year they say "I do"?
- Same First and Last Reply Target - medium - They started and ended the month messaging the same person.
- Satisfaction by Platform - medium - Satisfaction scores, platform by platform.
- Satisfaction Score by Region - easy - Satisfaction scores. Missing region data.
- Search Algorithm Rating - hard - How good are the search results?
- Search Endpoint Status Distribution - easy - Status codes on the health endpoint.
- Searches by Users With Email - easy - One user's search behavior.
- Search Success by User Tenure - hard - Compare search click-through rates between new and existing users.
- Search Term Length vs Click Rates - hard - Longer queries, more clicks?
- Search Terms Starting With G - easy - Queries starting with 'g'.
- Second Highest Cloud Cost - medium - The second biggest bill on record.
- Second Highest Latency by Method - medium - Almost the slowest. By method.
- Second Highest Salary - easy - Silver medal. Almost the top, but not quite.
- Second Highest Value - easy - Almost the top. Not quite.
- Second Purchase - hard - The first buy is curiosity. The second is commitment.
- Senior to Junior Ratio - medium - The ratio tells you a lot about the department.
- Back From the Brink - hard - Roll it back, then nail the next one.
- Servers Returning to Origin - medium - Servers that migrated back home.
- Server With Most Errors - medium - One server stands out. Not in a good way.
- Service Alert Frequency - easy - How often does each service trigger alerts?
- Service Budget per Head - medium - Budget per head. Pipeline by pipeline.
- Service Component Classification - medium - Classified by naming pattern.
- Service Reliability Tiers - medium - Reliability tiers. Based on uptime.
- Services at Median Uptime - medium - Exactly at the median. Not above, not below.
- Service Scorecard - hard - Deploys vs. alerts. One row per service tells the whole story.
- Services Hitting Cost Threshold - hard - The budget line is here. How many crossed it?
- Services With Most Checks in 2025 - hard - Last year's most-checked services.
- Services With Most Error Occurrences - easy - The noisiest services.
- Services With Multi-Quarter Uptime - hard - Multi-quarter uptime streaks.
- Service Uptime Minutes - medium - Status changed. How long was it actually up?
- Service Uptime Turnaround - hard - It was down. Then it came back. Stronger.
- Service User Growth Rate - easy - User growth, service by service.
- Service With Most Critical Alerts - hard - One service keeps setting off the alarms.
- Session Count Distribution - hard - How are sessions distributed among the newest users?
- Session Duration by Account Status - medium - Average session duration broken down by user account status.
- Session-Fit Content - easy - Content that fits the session length.
- Session Logins Dec 13 to 19 - easy - Logins during one specific window.
- Session Overview - medium - Full engagement picture, even for the ones who never showed up.
- Session Page View Distance - hard - Page view distance per session.
- Session Pulse - easy - Engagement is slipping. Who is phoning it in?
- Session Rank - medium - Longest sessions rise to the top. Within each user, a pecking order.
- Sessions by Content Type - medium - Engagement, broken down by content format.
- Sessions Per Device Type - easy - Sessions, device by device.
- Shared Category Purchasers - medium - They bought different things from the same aisle.
- Shared Channel Contacts - hard - User networks mapped through messages.
- Shared Endpoints - medium - Shared credentials across endpoints.
- Show all products in the 'Electronics' category - medium
- Show all products NOT in categories 'Toys' or 'Games' - medium
- This Year's Class - easy - The cohort is in. Time to count who made it through the door.
- Signups by Age Bucket Since April - easy - Recent signups by age.
- Signups Jan to Jul 2026 - easy - Signups from January through July.
- The Conversion Story - medium - Signups are one thing. Paid purchases are another. Find the gap by source.
- Silent Users - medium - Users who have never typed a query.
- Single Service Owners - medium - One owner, one service. Nobody else.
- Sirens and Smoke - easy - Stale alerts. Still ringing.
- Slow Batch Jobs - easy - Promised by noon. Delivered at midnight.
- Slow Failures - easy - SRE is hunting for the endpoints that fail slowly enough to burn timeouts.
- The Address That Changed - hard - Addresses change. History must not be erased.
- Slow Production Deploys - easy - Production deploys that took way too long.
- Smooth Latency - medium - Noisy latency readings, smoothed into a trend you can trust.
- The Compliance Order - easy - Token scopes need to be in the right sequence before the audit.
- Spend and Rank - hard - Five thrones at the top of the spending leaderboard.
- Spending by Account Status - medium - Segment user spending and activity by account status across the platform.
- Spending Range - hard - Between the smallest purchase and the biggest lies the story.
- Spending Tiers - medium - High rollers, mid-spenders, and the frugal. Everyone gets a tier.
- Split Metric Sums - medium - One column, two totals.
- Status Report - easy - Where are orders getting stuck?
- Stock Status - easy - Human-readable availability labels.
- Storage Node Lookup - easy - The storage nodes hold the critical data.
- Streak Status Changes - hard - Detect value changes across consecutive rows.
- Subscribers Without Premium - medium - Subscribed. But never upgraded.
- Successful Build Duration by Repository - medium - CI throughput, repo by repo.
- Successful Call Volume per Endpoint - medium - Not every ping is honest.
- Green Lights on the Order Line - easy - How often did the orders API just... work?
- Successful Pipeline Runs - easy - Which pipelines completed successfully?
- Successful Production Deploys - easy - Successful production deploys with duration.
- The Middle Ground - medium - Strip the outliers from both ends. What does the core actually add up to?
- Super Reviewers - medium - The most prolific code reviewers.
- Suspected Bot Sessions - easy - Five seconds or less. Probably a bot.
- Symmetric Reply Network - medium - Who replies to whom? Both directions.
- Tables With Many DQ Failures - medium - Some tables have never once passed QA.
- Tables With Most DQ Failures - medium - The tables with the most failures.
- Targeted Ad Campaigns - easy - High-value impressions. Targeted precisely.
- Team Cost Allocation Comparison - hard - Individual spend versus team average.
- Teams Below Double Average Spend - medium - Teams spending under twice the average.
- Tenure Mentorship Match - medium - Pair by tenure. Longest with newest.
- Tenure Spread for Active Tokens - hard - Tenure extremes among active tokens.
- The Ad Ledger - easy - Annual ad revenue. On the record.
- The Campaign Trail - easy - Impressions are vanity. Conversions are sanity.
- The Cannibalization Report - hard - The new product launched. The old one suffered.
- The Day-7 Retention Cohort - medium - Day one was promising. Day seven tells the truth.
- The Dormant Accounts - easy - They are still paying. They stopped showing up.
- Double Vision - easy - Before the records move, the ones wearing the same name twice have to surface.
- The February Cohort - easy - One signup window. One cohort. Who joined the club?
- The First Half - easy - New arrivals during one specific window.
- The Heaviest Carts - medium - Inside every age group, a few customers carry the basket. Find them.
- The Latest Transaction Per Product - medium - Every product has a last sale. When was it?
- The Legacy Hunt - easy - Old data. Still matters.
- The Lion's Share - medium - Every category claims its slice. Find out who really owns the table.
- The Merge Counter - easy - How many builds came from merges?
- The Ones That Move - medium - Every aisle has its champions. Surface the three that carry each one.
- The Ones Who Hold Attention - medium - Plenty of creators get the click. Find the ones who actually keep people watching.
- The Ones Who Return - medium - One purchase is a trial. Two is a habit. Find how many members formed one.
- The Phantom Readers - medium - They read everything. They bought nothing.
- The Podium Finish - medium - Top two products per category.
- The Publishing Audit - easy - Published years ago. Still generating views?
- The Quiet Alarms - medium - Low severity. High volume. Worth a look.
- The Regional Cost Reconciliation - hard - Two cost tables, one region. Reconcile the running balance.
- The Revenue Cliff - medium - Revenue was climbing. Then it wasn't. Spot the drop.
- The Session Stitcher - hard - Page views without sessions are just noise.
- The Slow Lane - medium - Peak API load. The slow endpoints.
- The Token Census - easy - How many tokens are out there?
- The Usual Suspects - hard - Same services, same checks, same problems.
- The Weight of Everything Before - medium - Every purchase carries the ones that came before it. Trace the climb.
- Third Highest Spender - medium - Bronze medal in spending.
- Third Largest Batch Job - easy - Bronze medal in the batch job rankings.
- Threads Excluding User - easy - Every thread they're not part of.
- Three-Item Combinations - medium - Generate all unique 3-item sets with total cost.
- Three Lowest Distinct Cloud Cost Amounts - easy - The three cheapest bills on record.
- Three-Value Sum Combinations - medium - Pick three. See what they add up to.
- The Transaction Breakdown - easy - Multiple time windows. One query. The business wants all of it at once.
- Timeout Status Records - easy - Unknown status in the health records.
- Timeout Warning Logs - easy - Timeout warnings. The postmortem trail.
- Titles Ending With S - easy - Naming conventions. Specifically the plurals.
- Keys That Never Die - medium - Some API keys have no expiry date at all. That should worry someone.
- Tokens With Non-Read Scope Prefix - medium - Tokens that don't start with 'read'.
- Top 100 Batch Jobs Total Output - easy - The hundred biggest jobs. Combined output.
- Top 10 AB Test Variants - medium - The ten best-performing variants.
- Top 10 Batch Jobs - easy - The ten biggest batch jobs.
- Top 10 CPU-Heavy Nodes - medium - The ten hungriest nodes.
- Top 10 Model Accuracies - easy - Top ten model performance.
- Top 10 Rated Products - medium - The ten highest-rated items.
- Top 10 Slowest Endpoints - easy - The ten endpoints nobody wants to call.
- Top 2 Active Push Days - medium - Two days stood out from the rest. Which ones?
- Top 2 Ad Campaigns by Spend - medium - Two campaigns. Most of the budget.
- Top 2 Busiest API Slots - medium - Two time slots per week. The busiest.
- Top 2 Callers per Endpoint - medium - Two top callers per endpoint.
- Top 2 Cloud Services by Cost - medium - Two services eating most of the budget.
- Top 2 Rate-Limited Clients - medium - Two clients are hitting the rate limit harder than anyone.
- First Impressions - medium - The first three pages decide who stays.
- Top 3 Monthly Costs per Team - hard - Three priciest months per team.
- Top 3 Revenue Months - medium - The three best months on record.
- Top 5 Slowest DNS Lookups - easy - Five DNS lookups that took too long.
- Top Accuracy Model - medium - The single best-performing model.
- Top Active API Tokens - medium - The five busiest tokens.
- Top Active Senders per Channel - medium - Top three messages per channel by replies.
- Top Ad Campaigns by Revenue - easy - Every campaign has a bottom line. Stack them up.
- Top Alert Resolvers - medium - The engineers who resolve the most.
- Top and Bottom Cloud Spenders - hard - The extremes. Top and bottom.
- Top API Caller - medium - One user triggered more API calls than anyone.
- Top API Token Scopes - easy - The highest-value token scopes.
- Top Average By Region - easy - Region by region, who pulls the best average?
- Top AWS Non-APAC Service Costs - medium - Outside APAC, AWS costs tell a different story.
- Top Batch Job Under Priority 1 - medium - Priority one. Top performer.
- Back Again - medium - Acquisition is expensive. These customers didn't need convincing twice.
- Top Buyers of Premium Products - medium - Which users bought the most top-rated products?
- Top Campaign by Opens - medium - One campaign got all the opens.
- Top Campaign by User Revenue - medium - Which campaign made each user spend the most?
- Top Category by User Segment - medium - Each segment has a favorite category.
- Top Chat Contributors - medium - The ten most active chat users.
- Top Commit Authors by Repo - hard - Three authors per repo. The top committers.
- Top Committers in 2025 - medium - In a sea of commits, only a few wear the crown.
- Top Content by Lifetime Value - medium - Lifetime value. Measured in total watch time.
- Top Content by Views - medium - Top five content items by views.
- Top Content by Watch Time - medium - Some content holds attention. Others get skipped.
- Top Content Flagger - medium - Flagged content. Who flagged the most?
- Top Cost Categories - medium - Three categories eating the budget.
- Top Cost Entry per Team - medium - The single biggest bill per team.
- Top CPU Pods per Namespace - hard - The two most CPU-hungry pods in each namespace.
- Top Deployed Model - easy - The best-performing model in production.
- Top Device by Sessions - easy - One device type generates the most sessions.
- Top Duration Content Items - easy - The content that held the number-one spot.
- Top Earner Per Campaign - medium - The top-earning user per campaign.
- Top Endpoint by Power Users - hard - Power users have a favorite endpoint.
- The Loudest Failures - medium - Twelve months of errors. Which types showed up most?
- Top Error-Service Pair - medium - Which error-service pair triggered the most resolved incidents?
- Top Five - easy - The five priciest items for the luxury section.
- Top Flagged Campaign Resolutions - hard - Flagged the most. Resolved how?
- Top Framework by Deployments - hard - The framework most often deployed.
- Top Frameworks by Accuracy - medium - Top three frameworks by accuracy.
- Top Identified Event Types - medium - The top users by events, but only the identifiable ones.
- Top Lessons Each Month - medium - Rank items within time periods and keep the top 3.
- Top Metric per Department - medium - Peak performer in every department.
- Top Metric Values - easy - The five highest numbers. No duplicates.
- Top Mobile OS by Session Duration - easy - Which mobile OS keeps users longest?
- Top Models by Framework - hard - Every framework has a star model.
- Top Pattern Matches - medium - A needle in a haystack, but how many haystacks?
- Top Per Category - hard - Every category has a champion. Crown them all.
- Top Percentile API Tokens - hard - The most suspicious tokens.
- Top Percentile Spenders - medium - Top 1% of users by total spend via percentile bucketing.
- Top Performing Models - easy - The models that actually perform.
- Top Product Categories - medium - Top three categories by page views.
- Top Product Categories by Sales - easy - The highest-grossing categories.
- Top Product Category by Transactions - medium - Organic purchases, no marketing nudge. Which category wins?
- Top Products by Quantity Sold - medium - The bestsellers. By volume.
- Top Products per Category - medium - Five winners per category.
- Top-Ranked Wines by Variety - easy - The best bottles. Ranked by variety.
- Top Recent Sellers - easy - Fresh data, top sellers. The recent leaderboard.
- Top Region by Order Volume - medium - The single busiest region.
- Top Regions by Critical Alerts - medium - Which regions have the highest volume of critical alerts?
- Top Regions by Effective Uptime - medium - The most reliable regions.
- Top Regions by High CPU Nodes - hard - Five regions with the hottest CPUs.
- Top Repos by Commit Volume - medium - The most active repos in the org. No ties left behind.
- Top Repos by Successful Builds - medium - Green builds. Which repos lead?
- Top Revenue Products H1 - medium - First half of the year. Which products led the revenue race?
- Top Selling Items - easy - Revenue crowns the winners. Who sold the most?
- Top Services by Regional Cost - medium - Top spenders in one region.
- Top Services by Uptime - medium - Uptime is a competition. Which services never blink?
- The Heavy Hitters - medium - Within each cloud, two services rise above the rest.
- Top Shelf - easy - Buyers need to know ceiling prices before negotiating with vendors.
- Top Spender - medium - When your spending exceeds the priciest item on the shelf.
- The Spender Leaderboard - easy - Spending speaks. The leaderboard does the listening.
- Top Users by Pages Viewed - medium - Five users who browsed the most.
- Top Users by Recent Spend - medium - Big spenders in the last 30 days.
- Top Users by Session Time - medium - They spent the most time here.
- Total Compute Cloud Cost - easy - Total compute spend. The number.
- Total Cost by Category - easy - Total spend per category.
- Total Engineering Cost Allocation - easy - Engineering's total allocated budget.
- Total Hours Between Consecutive Events - hard - Hours between state changes.
- Total Rows by Pipeline Status - easy - Row counts alongside pipeline aggregates.
- Total User Spend - easy - Each customer's total. Summarized.
- Transaction-Only Features - hard - Exclusive to one source. Missing from the other.
- Transaction Overview - easy - The executive snapshot. Users, products, revenue.
- Transaction Revenue by Customer - medium - One month, every customer, every dollar accounted for.
- Transaction Share of User Spend - medium - Each transaction's share of the whole.
- Transaction Source Features - easy - One pipeline reviewed them. What did it see?
- The Named Transaction - easy - Transaction IDs are useless without context. Bring in the product names.
- Transaction Timeline - medium - First purchase to last. The full spending arc.
- Trend Spotter - medium - What did they spend last time? Context changes everything.
- Trim Endpoints Right - easy - Trailing whitespace. Clean it up.
- Trim Search Terms Left - easy - Leading whitespace. Clean it up.
- Read the Manual - easy - Some titles promise to walk you through it. Count the ones that say so out loud.
- Unclicked Searches by Campaign - medium - Searched but never clicked.
- Unique Hostnames per Region - medium - How many distinct machines live in each region?
- Unique Hosts by Node Type - easy - How many unique hosts per node type?
- Unique Reporters per Content - medium - How many people flagged each item?
- Unique Searchers - easy - How many users actually searched?
- Who's Looking - easy - Every search is a question someone needed answered. Count the people asking.
- Unique Stream Topics - easy - A clean inventory of streaming topics.
- Unique Visitors - easy - Which months actually had an audience?
- Unmatched Categories - easy - Categories with nothing on the shelf. Empty aisles.
- Buyers Who Never Browsed - easy - They bought without ever loading a page.
- Unmatched Deploy Services - medium - Two registries. They do not agree.
- Unreviewed Models - easy - Models that have never been evaluated.
- Unsold Product Categories - medium - Dead inventory inflating storage costs.
- Unused Read Tokens - easy - Active tokens that nobody uses.
- Upvote Percentage by Age Cohort - hard - New users versus existing. The upvote gap.
- Where in the World Are Our Customers? - medium - One country dominates the logo wall. Or does it?
- Use APPROX_DISTINCT to estimate the number of unique users who have made transactions - medium
- US-East KV Store Entries - easy - KV store inventory. us-east-1.
- User 360 - hard - One row per user. Everything they did, or didn't do.
- User Age Ranking - easy - Age brackets, stacked from top to bottom.
- User Campaign Overlap Percentage - hard - How much ad overlap between users?
- Six Degrees - hard - Every reply ties two names together. Find whose web reaches the furthest.
- User Devices - medium - Desktop, mobile, tablet. What does each user actually use?
- User Engagement Summary - medium - Sessions plus searches. The full engagement picture.
- User Engagement Totals - easy - Per-user engagement. The totals.
- Behavioral Range - easy - Power users don't just visit more. They do more things.
- User Roster - easy - Which account states are bleeding users?
- User Session Roster - easy - Every user paired with their sessions, even users who never logged in.
- User Sessions on Specific Days - easy - One user. Specific days. What happened?
- Users Outperforming Control - medium - Treatment beat control. For these users.
- User Spend Audit - medium - One user. One category. Total spend.
- User Spend Segmentation by Category - hard - Users segmented by spending behavior.
- Users Per Device Type - easy - Users per device. The split.
- Users Who Churned in February - hard - Gone in February.
- Users Who Clicked Ads - easy - Ad clickers and their account details.
- Users With Admin Tokens - medium - Admin tokens. Who holds them?
- Users With and Without Ad Clicks - hard - Clicked an ad versus never clicked. The split.
- Users With API Errors - medium - Count unique users who have triggered an API error response.
- Users Without Purchases - medium - How many registered users have never made a single purchase?
- Users Without Sessions - medium - Account created. Never logged in.
- Users With Purchase Events - easy - At least one purchase. That changes everything.
- User With Most Transactions - medium - The most active buyer.
- Verify Commit ID Uniqueness - easy - Duplicate commit IDs. Are there any?
- View Count Per Page - easy - Every page has visitors. Some just have more.
- Point of Entry - hard - Everyone starts by looking. Count who came back to buy.
- Views by Content Type - medium - Count content views broken down by content type.
- Views by Specific Users - easy - Retrieve all content views for a set of flagged user accounts.
- Weekend Warriors - easy - Weekdays vs. weekends. When does the action really happen?
- Weekly Build Status Report - hard - Every CI run, bucketed by week.
- Weekly Transaction Day Split - hard - Transactions by day of week.
- Weekly Transaction Volume - easy - Weekly volume. The pulse.
- Weighted Variant Selection - hard - Select a row using cumulative weight probabilities.
- Welcome Wagon - easy - How many signed up this year?
- Whale Watch - easy - The accounts driving the top line.
- Where the Money Pools - medium - Every region has one line item that dwarfs the rest. Find it.
- Word Count Per Message - medium - How wordy are the messages?
- Workers Earning Above Department Average - medium - Earning above the department average.
- Worst Table Per Year by DQ Failures - hard - Every year has a worst table.
- Against the Clock - medium - Build times by repo, year by year.
- Yearly Output - easy - Publishing velocity for the board deck.
- Year-over-Year Content Launches - medium - Launch velocity, year over year.
- YoY Signup Growth Rate - hard - This year versus last year. Growing or shrinking?
- Stumbling Out of the Gate - medium - Some model versions never recover from a bad opening run.
- Zero-Retry Job Ratio by Priority - hard - No retries needed. First try success rate.