Loading section...
External Binary Search and Index Structures
Concepts: pyBtree, pyLSMTree, pySSTable, pyHBase
This is where binary search stops being an algorithm and becomes an architecture. Every major data storage system you will use as a data engineer is built on sorted storage and binary search. B-trees, LSM trees, SSTables, Bloom filters plus index lookups, and HBase/BigTable row key lookups. Understanding how binary search powers these systems is what separates a senior DE who knows how the tools work from a junior who only knows how to use them. B-trees: Binary Search at Every Level A B-tree of order b stores up to b-1 keys per node. Finding a key in a B-tree requires descending from root to leaf. At each node, you binary search the sorted key array to find the correct child pointer. For a tree of depth d with branching factor b containing n keys: depth d = ceil(log_b(n)). Binary search at