Key Generation Strategies
Concepts: dmKeyGeneration, dmSurrogateKeys
How you generate surrogate keys has real implications for performance, correctness, and pipeline complexity. There are three main strategies: database sequences, hash-based keys, and UUIDs. Each fits different scenarios.

Database Sequences
A sequence is a database-maintained counter that increments atomically. Each call to NEXTVAL returns a unique integer. Sequences are fast and produce compact, ordered values that B-tree indexes love. The downside: in distributed insert pipelines (e.g., Spark writing to PostgreSQL), every executor contends for the same sequence, creating a bottleneck.

Hash-Based Keys
A hash-based surrogate key applies a deterministic hash (MD5, SHA-256) to one or more business attributes. The same input always produces the same output. This is what makes hash keys powerfu