Loading lesson...
Dictionary mastery for production code
Dictionary mastery for production code
Topics covered: Deep Copying Dictionaries, defaultdict: Auto-Initializing Keys, Counter: Purpose-Built for Counting, Dictionary Performance, Advanced Dictionary Patterns, Dictionaries and JSON, Dictionary Gotchas
The problem is that both dictionaries share the same list object. Appending to the list affects both because they're pointing to the same list in memory. The copy Module Performance Considerations Deep copying is slower and uses more memory than shallow copying because it must traverse and duplicate every nested object. For deeply nested or large data structures, this cost can be significant:
Counting with defaultdict Nested defaultdicts You can create multi-level defaultdicts for building complex nested structures automatically. This is useful when processing hierarchical data:
The .most_common() Method Counter Arithmetic Counters support arithmetic operations. You can add, subtract, or find the intersection/union of counts: Python offers two main tools for counting. Here is how they compare. OrderedDict: Stable Order In Python 3.7 and later, regular dictionaries maintain insertion order. Before that, order was not guaranteed. OrderedDict explicitly guarantees order and provides additional methods for reordering. move_to_end() Reordering
Understanding dictionary performance is crucial for writing efficient code, especially in data engineering where you process large datasets. Dictionaries use a technique called hashing that makes most operations extremely fast. How Hashing Works When you add a key to a dictionary, Python computes a hash value from the key. This hash determines where in memory the value is stored. When you look up a key, Python computes the same hash and jumps directly to that location. dict vs list Lookups When
These patterns appear frequently in technical interviews and production code. Each demonstrates a clever way to use dictionaries to solve problems that would otherwise require slower, more complex approaches. Two-Sum Pattern Caching with Dictionaries Memoization is a technique where you cache function results to avoid redundant computation. Dictionaries are perfect for this: Graph Representation Dictionaries are the standard way to represent graphs in Python. Each key is a node, and the value is
JSON (JavaScript Object Notation) is the most common data format for APIs and configuration files. Python dictionaries map directly to JSON objects, making conversion seamless: In data engineering, you'll constantly convert between dictionaries and JSON when working with APIs, configuration files, and data pipelines.
Even experienced developers make these mistakes. Being aware of these gotchas helps you avoid subtle bugs. Modifying During Iteration Adding or removing keys while iterating causes a RuntimeError. If you need to modify, iterate over a copy: Mutable Default Arguments Using a dictionary as a default argument is a famous Python pitfall. The default is created once and shared across all calls: Integer Key Confusion In Python, the integer 1 and the boolean True hash to the same value. This can lead t