Full Refresh or Incremental?
At scale, the answer is never purely one or the other. The interviewer wants to hear you design a hybrid strategy: incremental daily loads for speed, periodic full refreshes for correctness. If you only say "incremental," they will probe until you admit the failure modes.
The Hybrid Pattern
Incremental loads accumulate drift. A missed CDC event, a race condition in the source system, or a timezone bug that shifts one hour of data into the wrong partition: errors like these compound silently. A weekly full refresh acts as a consistency checkpoint. It rebuilds the entire table from source, and any discrepancy between the full refresh output and the incrementally maintained table is a data quality alert. Your answer should describe this as an intentional design, not a fallback.
Cost Modeling at Scal