Transform, filter, and combine data
Topics covered: Using sorted() with key, Using sorted() in Reverse, Using map() for Transforms, Using filter() for Selection, Using zip() for Combining
The key parameter of sorted() is one of Python's most powerful features for data processing. Without key functions, sorting a list of dictionaries by a specific field would require writing a custom comparison function or manually extracting values. With key, you express the sort criterion in a single line that Python handles efficiently.
The key parameter represents a fundamental shift from imperative to declarative programming. Instead of writing code that compares elements step by step, you declare what value to use fo
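As a minimal sketch of sorting dictionaries by a field with key, the example below uses a hypothetical `employees` list invented purely for illustration. It shows a lambda key and the equivalent `operator.itemgetter` from the standard library.

```python
from operator import itemgetter

# Hypothetical sample records, invented for this illustration.
employees = [
    {"name": "Ana", "salary": 72000},
    {"name": "Ben", "salary": 58000},
    {"name": "Cleo", "salary": 91000},
]

# The key function is called once per element; its return value is what gets compared.
by_salary = sorted(employees, key=lambda e: e["salary"])

# operator.itemgetter builds an equivalent key function without a lambda.
by_name = sorted(employees, key=itemgetter("name"))

salary_order = [e["name"] for e in by_salary]
name_order = [e["name"] for e in by_name]

print(salary_order)  # ['Ben', 'Ana', 'Cleo']
print(name_order)    # ['Ana', 'Ben', 'Cleo']
```

Note that sorted() returns a new list and leaves `employees` unchanged.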
While you could get a descending order by sorting normally and then reversing, using the reverse parameter is cleaner and avoids a separate reversal step in your code. It also keeps the sort stable: elements that compare equal stay in their original relative order, whereas reversing an ascending result flips them.
Basic Reverse Sorting
Getting the top N items from a collection is a common operation. Sorting in descending order and slicing the first N elements is simple and readable. For very large collections where you only need a few top items, consider heapq.nlargest() f
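The sketch below illustrates descending sorts and top-N selection. The `scores` data and the `score_of` helper are hypothetical names made up for this example. It also shows one observable difference between `reverse=True` and reversing afterwards: with `reverse=True`, records with equal keys keep their original relative order.

```python
import heapq

# Hypothetical (name, score) records; "ana" and "cleo" tie on purpose.
scores = [("ana", 88), ("ben", 95), ("cleo", 88), ("dev", 71)]


def score_of(record):
    """Key function: compare records by their score."""
    return record[1]


# Descending sort; tied records keep their original relative order.
descending = sorted(scores, key=score_of, reverse=True)

# Sorting ascending and reversing afterwards flips the order of tied records.
reversed_after = sorted(scores, key=score_of)[::-1]

# Top N by sorting and slicing.
top_two = descending[:2]

# Top N without sorting the whole collection.
top_two_heap = heapq.nlargest(2, scores, key=score_of)

print(descending)      # [('ben', 95), ('ana', 88), ('cleo', 88), ('dev', 71)]
print(reversed_after)  # [('ben', 95), ('cleo', 88), ('ana', 88), ('dev', 71)]
print(top_two)         # [('ben', 95), ('ana', 88)]
print(top_two_heap)    # [('ben', 95), ('ana', 88)]
```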
The map() function embodies the principle that transformations should be separate from iteration. When you use a for loop to transform data, you mix the mechanics of iteration with the logic of transformation. With map(), you cleanly express the transformation once and let Python handle the iteration. This separation makes code easier to understand, test, and parallelize.
Basic map() Usage
map() with Builtins
Many built-in functions work directly with map() without needing lambda:
This direct fu
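A short sketch of map() in practice follows, using made-up sample values. It passes built-ins and unbound methods (`str.strip`, `len`, `int`) directly, then uses a lambda where no ready-made function fits. Because map() returns a lazy iterator in Python 3, each result is wrapped in list() to materialize it.

```python
# Hypothetical raw inputs, invented for this illustration.
raw_names = ["  ana ", "Ben", " cleo"]
raw_quantities = ["3", "10", "7"]

# Built-ins and unbound methods can be passed to map() as-is.
cleaned = list(map(str.strip, raw_names))
lengths = list(map(len, cleaned))
quantities = list(map(int, raw_quantities))

# A lambda covers transformations with no existing function.
doubled = list(map(lambda q: q * 2, quantities))

# With several iterables, map() passes one element from each per call.
unit_prices = [5, 2, 1]
totals = list(map(lambda q, p: q * p, quantities, unit_prices))

print(cleaned)     # ['ana', 'Ben', 'cleo']
print(lengths)     # [3, 3, 4]
print(quantities)  # [3, 10, 7]
print(doubled)     # [6, 20, 14]
print(totals)      # [15, 20, 7]
```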
Filter operations are lazy by default in Python 3. The filter object only computes results as you iterate through it, testing and yielding each element one at a time. This means you can filter a massive dataset without loading everything into memory at once, provided the input is itself read lazily (a file or generator, for example, rather than a list). This lazy evaluation is crucial for processing data that does not fit in memory, a common situation in data engineering.
Basic filter() Usage
The predicate function must return a truthy or falsy value. Elements where the predicate returns T
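To make the laziness visible, the sketch below records every value the predicate is asked about. The `is_large` predicate and the `calls` log are hypothetical helpers for illustration only. The last lines show that passing `None` as the predicate keeps only truthy elements.

```python
calls = []  # log of every value the predicate has been asked about


def is_large(n):
    """Hypothetical predicate: record the call, then keep values above 2."""
    calls.append(n)
    return n > 2


lazy = filter(is_large, [1, 2, 3, 4, 5])
calls_before = list(calls)   # nothing has been tested yet

first = next(lazy)           # tests 1, 2, 3 and stops at the first match
calls_after_first = list(calls)

rest = list(lazy)            # consumes the remaining elements

# With None as the predicate, filter() keeps elements that are truthy.
truthy = list(filter(None, [0, 1, "", "a", None, [], [0]]))

print(calls_before)       # []
print(first)              # 3
print(calls_after_first)  # [1, 2, 3]
print(rest)               # [4, 5]
print(truthy)             # [1, 'a', [0]]
```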
In data engineering, zip() appears when merging columns, pairing keys with values, iterating over multiple lists simultaneously, or transposing data structures. It's also the foundation for creating dictionaries from separate key and value lists, a common data transformation pattern. The name "zip" comes from the analogy to a physical zipper, which interleaves two rows of teeth into one. Just as a zipper combines alternating teeth from each side, the zip() function combines alternating elements
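The uses listed above can be sketched as follows, with sample data invented for illustration. Note that zip() pairs elements by position (first with first, second with second) and stops at the shortest input.

```python
# Hypothetical column names and one row of values.
columns = ["id", "name", "city"]
row = [7, "Ana", "Lima"]

# Pair keys with values to build a dictionary.
record = dict(zip(columns, row))

# Iterate over two lists in step.
names = ["Ana", "Ben", "Cleo"]
ages = [31, 45, 28]
pairs = list(zip(names, ages))

# Transpose rows into columns by unpacking with *.
rows = [(1, "a"), (2, "b"), (3, "c")]
transposed = list(zip(*rows))

# zip() stops at the shortest input; the extra 3 is silently dropped.
uneven = list(zip([1, 2, 3], ["x", "y"]))

print(record)      # {'id': 7, 'name': 'Ana', 'city': 'Lima'}
print(pairs)       # [('Ana', 31), ('Ben', 45), ('Cleo', 28)]
print(transposed)  # [(1, 2, 3), ('a', 'b', 'c')]
print(uneven)      # [(1, 'x'), (2, 'y')]
```

If silent truncation would hide a data problem, Python 3.10 and later accept `zip(..., strict=True)`, which raises ValueError when the inputs differ in length.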