Slack can compute the channels two users share by taking the intersection of their respective channel sets. The cost of that intersection grows only with the size of the smaller set, because each candidate channel is checked against the other set with a membership test that takes O(1) time on average. With hundreds of millions of messages processed daily across millions of workspaces, doing this with list lookups would be far too slow, because every check would require a linear scan. Sets avoid the scan because membership testing is O(1) on average, not O(n). The add, remove, and membership operations you will learn in this lesson are the same building blocks that collaboration platforms like Slack rely on to serve membership data at scale.
What is a Set?
Distinguish sets from lists
A set is an unordered collection of unique elements. These two properties define what makes a set different from other collection types like lists and tuples. Understanding both properties is essential for using sets correctly.
The word "unordered" means that sets do not maintain any particular sequence for their elements. Unlike lists, where the first item you add stays first and the last item stays last, sets make no guarantees about element order. When you iterate over a set or print it, the elements might appear in any order. This order might even change between different runs of your program. You cannot rely on sets to preserve the order in which you added elements.
The word "unique" means that each element can appear at most once in a set. If you try to add a duplicate element, the set simply ignores it without raising an error. The set remains unchanged. This automatic duplicate handling is one of the most useful features of sets. You never need to check whether an element exists before adding it. You can add freely, and the set ensures uniqueness.
Unordered: Elements have no defined position or index in a set
Unique: Each element can appear at most once in the collection
Mutable: You can add and remove elements after creation
Fast lookup: Checking membership is extremely efficient at O(1) time
Dynamic size: Sets grow and shrink as you add and remove elements
Think of a set like a bag of marbles where each marble must be a different color. If you already have a red marble in the bag and try to add another red marble, the bag rejects it because red is already represented. You cannot ask for "the third marble" or "the marble at position five" because marbles in a bag have no order. But you can very quickly check "Is there a blue marble in the bag?" by reaching in and finding it almost instantly.
This analogy also illustrates why sets are useful. If you wanted to know how many different colors of marbles you have, a set gives you the answer directly: its length equals the number of unique colors. With a list, you would need to examine each marble and keep track of which colors you have already seen.
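The marble analogy can be sketched in a few lines. The bag contents below are made up for illustration; only set(), len(), and the in operator are being demonstrated.

```python
# Hypothetical bag contents, used only to illustrate the analogy.
marbles = ["red", "blue", "red", "green", "blue", "red"]

# Converting to a set keeps one entry per color.
distinct_colors = set(marbles)

print("Marbles in the bag:", len(marbles))         # 6
print("Different colors:", len(distinct_colors))   # 3
print("Is there a blue marble?", "blue" in distinct_colors)  # True
```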
Sets vs Lists Fundamentals
Lists and sets are both collection types, but they serve fundamentally different purposes. Understanding when to use each is crucial for writing efficient, correct code. The wrong choice can lead to subtle bugs or severe performance problems.
Lists preserve order and allow duplicates. When you add items to a list, they stay in the order you added them. You can access items by their position using indexing: list[0] gives you the first item, list[1] gives you the second, and so on. Lists allow the same value to appear multiple times. A list like [1, 1, 2, 2, 3] is perfectly valid and maintains all five elements.
Sets ignore order and enforce uniqueness. When you add items to a set, they do not have positions. You cannot access set elements by index. Sets reject duplicate values automatically. A set created from {1, 1, 2, 2, 3} would contain only {1, 2, 3} because duplicates are eliminated.
•List
Ordered: [1, 2, 3] stays in that order
Allows duplicates: [1, 1, 2, 2] is valid
Access by index: list[0] returns first item
Slower membership test: O(n) time complexity
Preserves insertion order
•Set
Unordered: order is not guaranteed
No duplicates: {1, 2} only, no repeats
No indexing: set[0] raises an error
Fast membership test: O(1) time complexity
No concept of order
The notation O(n) means that checking if an item is in a list takes time proportional to the list size. If the list has n items, you might need to check all n items in the worst case. A list with a million items requires up to a million comparisons. The notation O(1) means that checking if an item is in a set takes constant time on average, regardless of set size. Whether the set has ten items or ten million items, the lookup takes approximately the same amount of time.
This performance difference matters enormously in practice. If you need to check whether items exist in a collection thousands or millions of times, using a set instead of a list can reduce your runtime from hours to seconds. Many experienced programmers have debugged slow code only to discover that converting a list to a set solved the performance problem.
Python Quiz
> A set automatically removes duplicates. Pick the built-in that counts unique elements, and the keyword that tests membership in constant time.
Sets and lists are both collections, but they solve different problems. Use a list when you care about order or need to store duplicates. Use a set when you only care whether an item is present and want instant answers regardless of collection size.
The O(1) membership test is the defining advantage of sets. It comes from hashing: Python converts each element into a number that directly points to its storage location, so no scanning is needed. This makes sets the right tool for membership checks in any performance-sensitive code.
TIP
When you find yourself writing if item in my_list inside a loop, consider converting my_list to a set first. The lookup cost drops from O(n) to O(1), turning potentially slow code into fast code with a one-word change.
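As a minimal sketch of this tip, the example below converts a list to a set once, before the loop, and then tests membership against the set. The names blocked_list, incoming, and allowed are hypothetical and exist only for this illustration.

```python
# Hypothetical data: IDs that are blocked, and a stream of incoming IDs.
blocked_list = [104, 215, 309, 412]
incoming = [101, 215, 412, 500]

# Convert once, outside the loop, then test against the set inside it.
blocked = set(blocked_list)

allowed = []
for user_id in incoming:
    if user_id not in blocked:  # average O(1) per check
        allowed.append(user_id)

print(allowed)  # [101, 500]
```

The conversion itself costs one pass over the list, so it pays off when the loop performs many checks.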
Creating Sets
Build sets from any iterable
Python provides two main ways to create sets. You can use curly braces with elements inside, similar to how you write dictionary literals but without key-value pairs. Alternatively, you can use the set() constructor function, which can convert other iterables into sets. Each approach has specific use cases and limitations that you should understand.
Using Curly Braces
The most common and concise way to create a set with initial elements is using curly braces. Place your elements inside the braces, separated by commas. This syntax looks similar to dictionary syntax, but dictionaries have key-value pairs separated by colons, while sets contain only single values.
# Create a set of colors
colors = {"red", "green", "blue"}
print(colors)
print(type(colors))

# Create a set of numbers
numbers = {1, 2, 3, 4, 5}
print(numbers)

# Create a set with mixed types
mixed = {42, "hello", 3.14, True}
print(mixed)
>>>Output
{'red', 'green', 'blue'}
<class 'set'>
{1, 2, 3, 4, 5}
{True, 42, 3.14, 'hello'}
When you print a set, Python displays it with curly braces. Notice that the order of elements in the output may differ from the order you wrote them. In the mixed set example, the elements appear in a different order than we specified. This is completely normal behavior because sets are unordered. Do not write code that depends on any particular ordering of set elements.
The type() function confirms that these objects are sets. Python's set type is a built-in type, meaning it is always available without importing anything. Sets are as fundamental to Python as lists, dictionaries, and tuples.
The Empty Set Problem
There is one critical exception to the curly brace syntax that trips up many Python programmers. You cannot create an empty set with empty curly braces. When Python sees {}, it interprets this as an empty dictionary, not an empty set. This behavior exists for historical reasons: dictionaries were added to Python before sets, and {} was already established as the dictionary literal syntax.
# Creates a DICTIONARY, not a set!
not_a_set = {}
print("Type of {}:", type(not_a_set))

# This is how you create an empty SET
empty_set = set()
print("Type of set():", type(empty_set))

empty_set.add("first element")
print("After adding:", empty_set)
>>>Output
Type of {}: <class 'dict'>
Type of set(): <class 'set'>
After adding: {'first element'}
This quirk catches even experienced Python developers. If you write code that initializes a variable with {} and later tries to use set methods like add(), you will get an AttributeError because dictionaries do not have an add() method. The error message might be confusing because you thought you had a set.
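The failure described above can be reproduced with a short sketch. The variable name tags is hypothetical; the point is only that {} produces a dict, which has no add() method.

```python
tags = {}  # looks like an empty set, but is an empty dict

try:
    tags.add("python")
except AttributeError as error:
    # dict objects have no add() method
    print("AttributeError:", error)

# The fix: build the empty set with set()
tags = set()
tags.add("python")
print(tags)  # {'python'}
```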
TIP
Always use set() to create an empty set. Using {} creates an empty dictionary. This is one of the most common Python gotchas and has caused countless debugging sessions.
Sets from Other Collections
The set() constructor is versatile. It can convert any iterable object into a set. An iterable is anything you can loop over: lists, tuples, strings, ranges, and even other sets. This conversion automatically removes any duplicate values, which is often exactly what you want.
# Convert a list with duplicates to a set
numbers_list = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_numbers = set(numbers_list)
print("Original list:", numbers_list)
print("As a set:", unique_numbers)
print("List length:", len(numbers_list))
print("Set length:", len(unique_numbers))
>>>Output
Original list: [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
As a set: {1, 2, 3, 4}
List length: 10
Set length: 4
In this example, the list has ten elements but only four unique values. The set constructor processes each element from the list. When it encounters an element it has already seen, it simply skips it. The resulting set has length four because there are only four distinct values in the original list.
# Convert a string to a set of characters
word = "mississippi"
letters = set(word)
print("Word:", word)
print("Unique letters:", letters)
print("Total characters:", len(word))
print("Unique characters:", len(letters))
>>>Output
Word: mississippi
Unique letters: {'m', 'i', 's', 'p'}
Total characters: 11
Unique characters: 4
Strings are iterable in Python, meaning you can loop over their characters. When you pass a string to set(), Python treats it as a sequence of characters and creates a set containing each unique character. The string "mississippi" has eleven characters total, but only four unique letters: m, i, s, and p. The set contains exactly these four characters.
This pattern of converting a collection to a set for deduplication is extremely common in data processing.
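One way this deduplication pattern might look in a data-cleaning step is sketched here. The email addresses are invented, and the normalization (strip and lowercase) is an assumption added so that near-duplicates collapse into one entry.

```python
# Hypothetical raw input with inconsistent casing and stray whitespace.
raw_emails = [
    "Ana@example.com",
    "ben@example.com ",
    "ana@example.com",
    "BEN@example.com",
]

# set() accepts any iterable, including a generator expression.
normalized = set(email.strip().lower() for email in raw_emails)

print(len(raw_emails))     # 4
print(sorted(normalized))  # ['ana@example.com', 'ben@example.com']
```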
Try choosing different constructors below to see how Python interprets each syntax for creating a set versus other collection types.
Fill in the Blank
> You have a list [3, 1, 2, 1] with a duplicate value. Pick a constructor to convert it and see how set, list, and tuple each handle duplicates and ordering differently.
result = ____([3, 1, 2, 1])
print(type(result))
print(result)
The constructor you choose determines everything about the resulting collection: whether it keeps duplicates, whether it maintains order, and whether it supports fast membership checks. set() is unique in that it both removes duplicates and provides O(1) lookups.
A common pattern is to convert a list to a set and back: list(set(my_list)). This deduplicates a list in one step, though the output order may differ from the input since sets do not guarantee ordering.
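To make the comparison concrete, this sketch passes the same list to each constructor and then performs the list(set(...)) round trip. Because set order is not guaranteed, the set results are checked by length or sorted before printing.

```python
values = [3, 1, 2, 1]

as_list = list(values)    # keeps order and duplicates
as_tuple = tuple(values)  # keeps order and duplicates, but is immutable
as_set = set(values)      # drops the duplicate 1; order is not guaranteed

print(as_list)      # [3, 1, 2, 1]
print(as_tuple)     # (3, 1, 2, 1)
print(len(as_set))  # 3

# Round trip: deduplicated, but the resulting order is arbitrary,
# so sort it if you need a predictable result.
deduplicated = list(set(values))
print(sorted(deduplicated))  # [1, 2, 3]
```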
TIP
Use set() with no arguments to create an empty set, not {}. Curly braces alone create an empty dictionary in Python. Always write my_set = set() for an empty set.
Automatic Duplicate Removal
Deduplicate data with add and update
The automatic duplicate removal behavior of sets is one of their most powerful and useful features. Sets eliminate duplicates both during creation and when adding new elements. This happens silently, without errors or warnings. Understanding this behavior allows you to write cleaner, more concise code.
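# Duplicates are automatically removed during creation
# Even though we wrote Alice 3 times and Bob 2 times
votes = {"Alice", "Bob", "Alice", "Charlie", "Bob", "Alice"}
print("Votes cast:", votes)

# The set contains each name exactly once
print("Unique voters:", len(votes))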
>>>Output
Votes cast: {'Alice', 'Bob', 'Charlie'}
Unique voters: 3
Even though we specified "Alice" three times and "Bob" twice in the set literal, the resulting set contains each name exactly once. Python processes the elements in order, adding each one to the set. When it encounters an element that already exists in the set, it simply skips it. This behavior is consistent and predictable.
This same deduplication happens when you add elements to an existing set. If you add an element that already exists, the set remains unchanged. No error is raised, and the size of the set does not increase. This allows you to add elements freely without first checking whether they exist.
# Duplicates are also ignored when using add()
colors = {"red", "blue"}
print("Initial set:", colors)
print("Initial size:", len(colors))

colors.add("green")
print("After adding green:", colors)
print("Size:", len(colors))

colors.add("red")
print("After adding red again:", colors)
print("Size:", len(colors))
>>>Output
Initial set: {'red', 'blue'}
Initial size: 2
After adding green: {'red', 'blue', 'green'}
Size: 3
After adding red again: {'red', 'blue', 'green'}
Size: 3
Counting Unique Values
One of the most common uses of sets is counting unique values in a dataset. Given any collection with potential duplicates, converting to a set and checking its length tells you how many distinct values exist. This operation is fast and memory-efficient.
This pattern appears constantly in data analysis and processing. Given a column of data from a database or spreadsheet, you often need to know "How many distinct values are there?" Converting to a set and checking its length answers this question in a single pass over the data.
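A minimal sketch of counting distinct values follows. The countries list stands in for a hypothetical column of data; it is not taken from any real dataset.

```python
# Hypothetical "country" column from a table of orders.
countries = ["US", "DE", "US", "JP", "DE", "US", "BR"]

distinct = set(countries)

print("Rows:", len(countries))               # 7
print("Distinct countries:", len(distinct))  # 4
```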
Because sets are unordered, converting a list to a set and back to a list loses the original order. Sometimes you need to remove duplicates while keeping the first occurrence of each item in its original position. This requires a different approach that uses a set for tracking but preserves order in a separate list.
This approach iterates through the original list once. For each item, it checks whether the item has been seen before by checking the set. If not seen, it adds the item to both the seen set (for fast future lookups) and the unique_ordered list (to preserve order). If already seen, it skips the item. The first occurrence of each item is preserved in its original position.
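The loop described above can be sketched as follows. The names seen and unique_ordered come from the description; the input list items is a made-up example.

```python
items = ["b", "a", "b", "c", "a", "d"]  # hypothetical input

seen = set()         # fast membership checks
unique_ordered = []  # first occurrences, in original order

for item in items:
    if item not in seen:
        seen.add(item)
        unique_ordered.append(item)

print(unique_ordered)  # ['b', 'a', 'c', 'd']
```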
In Python 3.7 and later, dictionaries preserve insertion order. The dict.fromkeys() method creates a dictionary where each item becomes a key (with None as the value). Since dictionary keys must be unique, duplicates are automatically eliminated while order is preserved. Converting back to a list gives you the deduplicated, ordered result.
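The dict.fromkeys() alternative fits on one line. This sketch assumes Python 3.7 or later and reuses a made-up input list.

```python
items = ["b", "a", "b", "c", "a", "d"]  # hypothetical input

# Keys are unique and keep insertion order; the values are all None.
unique_ordered = list(dict.fromkeys(items))

print(unique_ordered)  # ['b', 'a', 'c', 'd']
```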
When you only need unique values and order does not matter, convert directly to a set. When order must be preserved, use the loop pattern with a set for tracking and a list for collecting results, or use dict.fromkeys() in Python 3.7+.
Adding Elements to Sets
Sets are mutable, meaning you can add elements after creation. Python provides two methods for adding elements: add() for single elements and update() for adding multiple elements at once. Understanding when to use each method helps you write cleaner, more efficient code.
The add() Method
The .add() method inserts exactly one element into the set. If the element already exists, the set remains unchanged and no error is raised. The .add() method modifies the set in place and returns None (not the modified set).
colors = {"red", "green"}
print("Initial:", colors)

# Adding a new element
colors.add("blue")
print("After adding blue:", colors)

# Adding a duplicate has no effect
colors.add("red")
print("After adding red again:", colors)

# Note: add() returns None, not the set
result = colors.add("yellow")
print("Return value:", result)
print("Final set:", colors)
>>>Output
Initial: {'red', 'green'}
After adding blue: {'red', 'green', 'blue'}
After adding red again: {'red', 'green', 'blue'}
Return value: None
Final set: {'red', 'green', 'blue', 'yellow'}
The silent handling of duplicates is a feature, not a bug. It means you can safely add elements without first checking whether they already exist. This simplifies your code and avoids unnecessary conditional statements. The set handles uniqueness for you.
Building Sets with Loops
A common pattern is to start with an empty set and add elements one by one as you process data. This is particularly useful when you need to collect unique values from a stream of inputs or when filtering data based on some condition.
# Collect unique words from a sentence
sentence = "the quick brown fox jumps over the lazy dog"
words = sentence.split()

unique_words = set()
for word in words:
    unique_words.add(word)

print("Original sentence:", sentence)
print("Total words:", len(words))
print("Unique words:", len(unique_words))
print("Unique word set:", unique_words)
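>>>Output
Original sentence: the quick brown fox jumps over the lazy dog
Total words: 9
Unique words: 8
Unique word set: {'the', 'quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog'}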
The sentence contains nine words, but "the" appears twice. The set contains only eight unique words because the second occurrence of "the" was ignored when added. This pattern works regardless of how many times duplicates appear.
The update() Method
The .update() method adds multiple elements from any iterable (list, tuple, string, set, or other iterable). This is more concise and often more efficient than calling add() repeatedly in a loop.
primary = {"red", "green", "blue"}
print("Initial:", primary)

# Add multiple elements from a list
secondary = ["orange", "purple", "green"]
primary.update(secondary)
print("After update with list:", primary)

# Add from a tuple
primary.update(("cyan", "magenta"))
print("After update with tuple:", primary)

# Add characters from a string
primary.update("xyz")
print("After update with string:", primary)
>>>Output
Initial: {'red', 'green', 'blue'}
After update with list: {'red', 'green', 'blue', 'orange', 'purple'}
After update with tuple: {'red', 'green', 'blue', 'orange', 'purple', 'cyan', 'magenta'}
After update with string: {'red', 'green', 'blue', 'orange', 'purple', 'cyan', 'magenta', 'x', 'y', 'z'}
Notice that when updating with the list ["orange", "purple", "green"], only "orange" and "purple" were actually added. "green" was already in the set and was ignored. Also notice that updating with a string adds each character individually, not the entire string as one element.
•add()
Adds exactly one element
set.add("item")
Use for single additions
Argument must be hashable
•update()
Adds multiple elements
set.update([a, b, c])
Use for bulk additions
Argument must be iterable
TIP
Use add() for single elements and update() for bulk additions from an iterable. Never pass a list to add() (it raises TypeError) and remember that update() with a string adds each character individually. To store a collection as one element, convert it to a tuple first.
Python Quiz
> Build a set one element at a time. Duplicates are silently ignored. Pick the method that inserts a single element, and the built-in that counts how many unique items remain.
Sets silently ignore duplicate insertions. Calling add() with a value that already exists is a no-op: the set remains unchanged and no error is raised. This makes sets ideal for collecting unique items in a loop without explicit duplicate checking.
For bulk additions, update() accepts any iterable: a list, tuple, another set, or even a string, which adds each character individually. If you need to add a list as a single element, convert it to a tuple first since lists are unhashable and cannot be stored in a set.
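The difference between storing a collection as one element and unpacking it into many can be seen side by side. The variable names in this sketch are hypothetical.

```python
pair = [3, 4]

# add() with a list fails because lists are unhashable.
points = set()
try:
    points.add(pair)
except TypeError as error:
    print("add(list) failed:", error)

# Converting to a tuple stores the whole pair as ONE element.
points.add(tuple(pair))
print(points)  # {(3, 4)}

# update() instead unpacks the iterable into separate elements.
numbers = set()
numbers.update(pair)
print(sorted(numbers))  # [3, 4]

# The same contrast with a string.
words = set()
words.add("xyz")       # one element: the whole string
letters = set()
letters.update("xyz")  # three elements: 'x', 'y', 'z'
print(len(words), len(letters))  # 1 3
```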
TIP
Do not confuse add() (sets) with append() (lists). Sets have no append() method. If you get AttributeError: 'set' object has no attribute 'append', you likely have a set where you expected a list. The reverse mistake, calling add() on a list, raises the matching error for 'list' and 'add'.
Removing Elements from Sets
Remove elements and test membership
Python provides several methods for removing elements from sets: remove(), discard(), pop(), and clear(). Each behaves differently and is suited for different situations. Understanding these differences helps you choose the right method and avoid unexpected errors.
remove() vs discard()
Both remove() and discard() delete a specific element from the set. The critical difference is what happens when the element does not exist. The remove() method raises a KeyError exception if the element is not found, while discard() silently does nothing.
fruits = {"apple", "banana", "cherry", "date"}
print("Initial:", fruits)

# remove() deletes a specific element
fruits.remove("banana")
print("After removing banana:", fruits)

# discard() also deletes a specific element
fruits.discard("cherry")
print("After discarding cherry:", fruits)

# discard() on missing element: no error, no change
fruits.discard("mango")
print("After discarding mango (not present):", fruits)
>>>Output
Initial: {'apple', 'banana', 'cherry', 'date'}
After removing banana: {'apple', 'cherry', 'date'}
After discarding cherry: {'apple', 'date'}
After discarding mango (not present): {'apple', 'date'}
In this example, discarding "mango" had no effect because mango was not in the set. No error was raised, and the set remained unchanged. If we had used remove("mango") instead, Python would have raised a KeyError exception, potentially crashing our program if we did not handle it.
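To see the KeyError without crashing the program, you can wrap the call in try/except. This is a small sketch of that case, using the same two remaining fruits.

```python
fruits = {"apple", "date"}

try:
    fruits.remove("mango")  # "mango" is not in the set
except KeyError as error:
    print("KeyError for missing element:", error)

# The failed remove() leaves the set unchanged.
print(fruits == {"apple", "date"})  # True
```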
fruits = {"apple", "banana"}

# Safe approach: check before removing
if "mango" in fruits:
    fruits.remove("mango")
else:
    print("mango not found, skipping remove")

# Even simpler: use discard()
fruits.discard("mango")
print("After discard:", fruits)
>>>Output
mango not found, skipping remove
After discard: {'apple', 'banana'}
Both approaches handle missing elements gracefully. The if-check approach is explicit, while discard() handles it silently. Choose based on whether you want your code to acknowledge the absence or ignore it entirely.
•remove()
Raises KeyError if element missing
Use when element MUST exist
Fails fast on programming bugs
Good for required elements
•discard()
Silent if element is missing
Use when element MIGHT exist
Safe for uncertain removal
Good for optional cleanup
TIP
Use remove() when the element should definitely exist and its absence indicates a bug in your program. Use discard() when it is acceptable for the element to be absent, such as when cleaning up potentially incomplete data.
Try choosing different removal methods below to see how each one behaves when the element is missing from the set.
Fill in the Blank
> A set {"apple", "banana"} does not contain "grape", but you try to remove it anyway. Pick a removal method to see which one handles the missing element gracefully.
The pop() Method
The .pop() method removes and returns an arbitrary element from the set. Because sets are unordered, you cannot predict which element will be removed. This method is useful when you need to process elements one by one and do not care about the order, or when you need to empty a set while examining each element.
The exact order in which elements are popped depends on Python's internal implementation and can vary between different runs or Python versions. Do not assume any particular element will be popped first. If you need a specific order, sort the elements first or use a different data structure.
Calling pop() on an empty set raises a KeyError. Always ensure the set is not empty before popping, either by checking its length or by looping with while my_set:, which stops as soon as the set is empty.
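A minimal sketch of draining a set with pop() is shown here. The task names are invented, and the results are sorted before printing because the pop order is arbitrary.

```python
tasks = {"email", "backup", "report"}  # hypothetical work items
processed = []

# An empty set is falsy, so the loop ends before pop() can fail.
while tasks:
    task = tasks.pop()  # removes and returns an arbitrary element
    processed.append(task)

print(sorted(processed))  # ['backup', 'email', 'report']
print(len(tasks))         # 0

# pop() on an empty set raises KeyError.
try:
    tasks.pop()
except KeyError:
    print("Cannot pop from an empty set")
```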
Clearing a Set
The .clear() method removes all elements from a set, leaving it empty. This is useful when you want to reset a set for reuse without creating a new set object.
data = {1, 2, 3, 4, 5}
print("Before clear:")
print(" Set:", data)
print(" Length:", len(data))

data.clear()
print("After clear:")
print(" Set:", data)
print(" Length:", len(data))
print(" Is empty:", len(data) == 0)
>>>Output
Before clear:
Set: {1, 2, 3, 4, 5}
Length: 5
After clear:
Set: set()
Length: 0
Is empty: True
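After clearing, the set still exists as an object but contains no elements, and you can continue to add elements to it. The difference from assigning a new empty set matters when other variables reference the same set object: clear() empties the set in place, so every reference sees it become empty, whereas my_set = set() only rebinds one name and leaves the old contents visible through the others.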
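The effect of clear() on shared references can be sketched with two small cases. All variable names here are hypothetical.

```python
# Case 1: clear() empties the existing object, so every reference sees it.
active = {1, 2, 3}
alias = active       # both names refer to the same set object
active.clear()
print(alias)         # set()

# Case 2: assigning set() only rebinds the name; the old object is untouched.
current = {1, 2, 3}
snapshot = current
current = set()
print(snapshot)      # {1, 2, 3}
print(current)       # set()
```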
Membership Testing
The in operator checks whether an element exists in a set. This operation is one of the primary reasons to use sets: membership testing in sets is extremely fast, with O(1) average time complexity. This makes sets ideal for situations where you need to check existence frequently.
allowed_users = {"alice", "bob", "charlie", "diana"}

# Check if users are in the set
print("Is alice allowed?", "alice" in allowed_users)
print("Is eve allowed?", "eve" in allowed_users)

# The 'not in' operator checks for absence
print("Is eve NOT allowed?", "eve" not in allowed_users)
>>>Output
Is alice allowed? True
Is eve allowed? False
Is eve NOT allowed? True
The expression "alice" in allowed_users returns True because "alice" is a member of the set. The expression "eve" in allowed_users returns False because "eve" is not in the set. The not in operator returns the logical opposite: True if the element is absent, False if present.
Why Sets Are Fast
Understanding why sets are fast helps you make better decisions about when to use them. When you check if an item is in a list, Python must scan through each element one by one, comparing your search value to each element until it finds a match or reaches the end of the list. This is called linear search. For a list with n items, this requires up to n comparisons in the worst case.
Sets use a fundamentally different approach called hashing. When an element is added to a set, Python calculates a hash value for it, which is a number derived from the element's value. This hash value determines where the element is stored internally. When you check if an element is in the set, Python calculates its hash value and looks directly at that location. This typically requires just one or two comparisons regardless of how many elements are in the set.
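One way to observe the difference is to count how many equality comparisons each lookup performs. The Probe class below is a hypothetical helper written for this illustration, not part of Python; it wraps a value and increments a counter every time == is evaluated.

```python
class Probe:
    """Hypothetical helper that counts how often == is evaluated."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        Probe.comparisons += 1
        return isinstance(other, Probe) and self.value == other.value


items = [Probe(i) for i in range(1000)]
items_set = set(items)

target = Probe(999)  # equal to the last item, but a separate object

Probe.comparisons = 0
found_in_list = target in items
list_comparisons = Probe.comparisons

Probe.comparisons = 0
found_in_set = target in items_set
set_comparisons = Probe.comparisons

print("List comparisons:", list_comparisons)  # 1000: every item is checked
print("Set comparisons:", set_comparisons)    # a handful at most, typically 1
```

The list has to compare the target against every item until it reaches the match at the end, while the set jumps to the slot chosen by the hash and only compares against what it finds there.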
This pattern is fundamental in data validation and access control. Before processing an input, check if it belongs to a set of valid options. The lookup is fast regardless of how many valid options exist. Whether your set of valid options contains ten items or ten million items, each membership test takes approximately the same amount of time.
List-to-Set for Fast Lookup
A common optimization pattern is to convert a list to a set when you need to perform many membership tests against it. The conversion has a one-time cost proportional to the list size, but each subsequent lookup is O(1). If you perform enough lookups, the time saved far exceeds the conversion cost.
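# Illustrative setup: the list contents below are example values
valid_codes_list = ["A100", "B200", "C300", "D400", "E500", "F600", "G700"]
valid_codes_set = set(valid_codes_list)  # one-time conversion
print("Validating codes:")
for code in ["C300", "X999", "A100", "Z000", "G700"]:
    # This lookup is O(1) because valid_codes_set is a set
    if code in valid_codes_set:
        print(f" {code}: VALID")
    else:
        print(f" {code}: INVALID")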
>>>Output
Validating codes:
C300: VALID
X999: INVALID
A100: VALID
Z000: INVALID
G700: VALID
In real applications, valid_codes_list might contain thousands or millions of entries loaded from a database or configuration file. If you needed to validate millions of user inputs against this list, using a set instead of a list could reduce validation time from hours to seconds.
TIP
If you need to check membership multiple times against the same collection, convert it to a set first. The upfront cost of conversion is quickly recovered through faster lookups. Even just a few dozen lookups can justify the conversion.
The code below has a bug related to membership testing. The developer tried to use the in operator with a list literal instead of a set, losing the O(1) performance advantage. Fix it to use a set.
Debug Challenge
> This code checks membership using a list, which requires scanning every element. Switching to a set gives O(1) lookups instead of O(n).
Functional but slow: list uses O(n) lookup instead of O(1)
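A hypothetical instance of the kind of code this challenge describes, together with the fix, might look like the sketch below. The function names and the weekday example are assumptions made for illustration.

```python
# "Before" version: membership test against a list literal.
def is_weekend_slow(day):
    return day in ["saturday", "sunday"]  # scans the list item by item


# Fixed version: the same test against a set literal.
def is_weekend(day):
    return day in {"saturday", "sunday"}  # hash-based lookup


for day in ["monday", "saturday"]:
    print(day, is_weekend_slow(day), is_weekend(day))
```

Both functions return the same answers. With only two items the speed difference is negligible; the benefit grows with the size of the collection.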
Converting a list to a set is one of the most common and impactful performance optimizations in Python. The change is a single word in the source code, but it can reduce the time complexity of membership checks from O(n) to O(1), making code that scanned thousands of items per check effectively instant.
Sets work for membership testing because hashing gives each element a predictable storage address. When you check item in my_set, Python computes the hash of the item and checks one location directly, without scanning any other elements.
TIP
If you need to validate user input against a known list of allowed values, define that list as a set literal from the start: ALLOWED = {"admin", "editor", "viewer"}. This is cleaner and faster than building the set at runtime.
What Can Be in a Set?
Identify which types sets accept
Not everything can be an element of a set. Set elements must be hashable, which generally means they must be immutable (unchangeable after creation). This requirement exists because sets use hashing to organize elements internally. If an element could change after being added, the set would not be able to find it anymore because its hash value would be different.
int / float (Numbers): 42, 3.14, -17 are valid
str (Strings): "hello" and "" both work
tuple (Tuples): (1, 2) works if its contents are hashable
bool / None (Singletons): True, False, None work
frozenset (Frozen Sets): Immutable set variant
Mutable types like lists, dictionaries, and regular sets cannot be set elements because their hash values would change if modified. Python raises a TypeError if you try to add an unhashable type to a set.
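The contrast between a regular set and a frozenset as an element can be sketched as follows. The exact wording of the TypeError message varies between Python versions, so the sketch prints it rather than assuming it.

```python
# A regular set cannot be an element of another set.
try:
    groups = {{1, 2}, {3, 4}}
except TypeError as error:
    print("Rejected set:", error)

# frozenset is the immutable, hashable variant, so it can be nested.
groups = {frozenset({1, 2}), frozenset({3, 4}), frozenset({2, 1})}
print(len(groups))  # 2, because frozenset({1, 2}) == frozenset({2, 1})

# Dictionaries are rejected for the same reason as lists and sets.
try:
    {{"a": 1}}
except TypeError as error:
    print("Rejected dict:", error)
```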
# Valid set elements
valid_set = {42, "hello", 3.14, True, None, (1, 2, 3)}
print("Valid set:", valid_set)

# Tuples make excellent set elements for storing pairs
coordinates = {(0, 0), (1, 0), (0, 1), (1, 1), (0, 0)}
print("Coordinate set:", coordinates)
print("Number of unique points:", len(coordinates))
Notice that the coordinate set shows only four points even though we specified five. The point (0, 0) was specified twice but only appears once because sets eliminate duplicates. Tuples are hashable (as long as their contents are hashable), making them ideal for storing coordinate pairs, database keys, or any immutable combination of values.
my_set = set()

# Test which types can be added to a set
for item in [42, "hello", (1, 2), True]:
    my_set.add(item)
    print(f"Added {item!r} - set is now: {my_set}")

# These would raise TypeError:
try:
    my_set.add([1, 2, 3])
except TypeError as e:
    print(f"Cannot add list: {e}")
>>>Output
Added 42 - set is now: {42}
Added 'hello' - set is now: {42, 'hello'}
Added (1, 2) - set is now: {42, (1, 2), 'hello'}
Added True - set is now: {True, 42, (1, 2), 'hello'}
Cannot add list: unhashable type: 'list'
Notice that True was added as a new element, so the set grew to four items. Python considers True and 1 to be equal (and they have the same hash), so True would be treated as a duplicate only if 1 were already in the set. Here the set contains 42, not 1, so True is stored normally. Mutable types like lists trigger a TypeError immediately.
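The equality between True and 1 is easy to check directly. This short sketch shows when True counts as a duplicate and when it does not; the variable names are hypothetical.

```python
print(True == 1, hash(True) == hash(1))  # True True

flags = {1}
flags.add(True)    # treated as a duplicate of 1, so nothing changes
print(flags)       # {1}

mixed = {42, "hello"}
mixed.add(True)    # no equal element is present, so it is added
print(len(mixed))  # 3

# 0, False, and 0.0 are all equal, so only one of them is kept.
print(len({0, False, 0.0}))  # 1
```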
1. Lists are mutable: You can change a list after creation, so its content is not fixed
2. Hash would change: A modified list would produce a different hash value than the original
3. Lookup breaks: The set could not find the element at its old hash-based location
4. Python forbids it: Mutable objects are rejected from sets to maintain data integrity
# If you need to store a list-like collection in a set,
# convert it to a tuple first
data_points = [[1, 2], [3, 4], [1, 2], [5, 6]]

# Convert each list to a tuple
unique_points = {tuple(point) for point in data_points}
print("Unique points as tuples:", unique_points)
>>>Output
Unique points as tuples: {(1, 2), (3, 4), (5, 6)}
Common Mistakes to Avoid
Even experienced Python programmers sometimes make mistakes when working with sets. Learning about these common pitfalls helps you avoid them and write more robust code.
Mistake 1: {} vs set()
The most common set mistake is trying to create an empty set with empty curly braces {}. Python interprets this as an empty dictionary, not an empty set. This mistake often leads to AttributeError exceptions later when you try to use set methods.
•Wrong
empty = {}
Creates a dictionary!
type(empty) returns dict
empty.add("x") raises AttributeError
•Correct
empty = set()
Creates a set!
type(empty) returns set
empty.add("x") works correctly
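The comparison above can be reproduced in a few lines. This is a small sketch with illustrative variable names.

```python
empty_braces = {}    # a dict, despite the curly braces
empty_set = set()    # an actual empty set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

try:
    empty_braces.add("x")
except AttributeError as e:
    print(f"AttributeError: {e}")

empty_set.add("x")
print(empty_set)  # {'x'}
```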
Mistake 2: Expecting Order
Sets are unordered. Do not write code that assumes elements will appear in any particular order when you iterate over a set or print it. Even if elements seem to appear in a consistent order during testing, this order can change between Python versions, between different runs of your program, or when the set grows or shrinks.
# Order is NOT guaranteed
letters = set()
letters.add("c")
letters.add("a")
letters.add("b")

print("Set contents:", letters)
print("Elements in iteration order:")
for letter in letters:
    print(f" {letter}")
>>>Output
Set contents: {'a', 'b', 'c'}
Elements in iteration order:
a
b
c
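In this example, we added elements in the order c, a, b, and the run shown above happened to print them as a, b, c. That is only one possible result: by default, string hashes differ between runs of Python, so the same program may display the elements in a different order next time. If you need elements in a specific order, sort them explicitly or use a list instead.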
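When a predictable order matters, sorting the set produces a list whose order does not depend on how the set stores its elements. A minimal sketch:

```python
letters = {"c", "a", "b"}

# sorted() returns a new list; the set itself stays unordered.
in_order = sorted(letters)
reverse_order = sorted(letters, reverse=True)

print(in_order)       # ['a', 'b', 'c']
print(reverse_order)  # ['c', 'b', 'a']
```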
Mistake 3: Indexing Sets
You cannot access set elements by index. Sets have no concept of "first element" or "element at position 2" because they have no order. Trying to index a set with square brackets raises a TypeError.
colors = {"red", "green", "blue"}

# Trying to index a set raises TypeError
try:
    print(colors[0])
except TypeError as e:
    print(f"Error: {e}")

# Sort to list for indexing
colors_list = sorted(colors)
print("Sorted list:", colors_list)
print("First alphabetically:", colors_list[0])
>>>Output
Error: 'set' object is not subscriptable
Sorted list: ['blue', 'green', 'red']
First alphabetically: blue
Using sorted() gives you a predictable ordering every time, unlike converting to an unsorted list where the order could vary. If you need indexed access often, store your data in a list instead of a set.
Mistake 4: In-Loop Mutation
Adding or removing elements from a set while iterating over it can cause unexpected behavior or RuntimeError exceptions. If you need to modify a set based on its contents, iterate over a copy instead.
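To see the failure itself, the sketch below removes elements from the same set the loop is walking over. The wording of the error message comes from CPython and may differ in other implementations.

```python
numbers = {1, 2, 3, 4, 5, 6}
error_message = None

# WRONG: removing from the set that the loop is iterating over
try:
    for n in numbers:
        if n % 2 == 0:
            numbers.remove(n)
except RuntimeError as e:
    error_message = str(e)

print("Error:", error_message)
```

The loop stops as soon as the set's size changes, so the set is left only partially filtered.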
numbers = {1, 2, 3, 4, 5, 6}
print("Before:", numbers)

# CORRECT: Iterate over a copy
for n in numbers.copy():
    if n % 2 == 0:
        numbers.remove(n)

print("Odd numbers only:", numbers)

# ALTERNATIVE: Set comprehension
numbers2 = {1, 2, 3, 4, 5, 6}
odds = {n for n in numbers2 if n % 2 != 0}
print("Using comprehension:", odds)
>>>Output
Before: {1, 2, 3, 4, 5, 6}
Odd numbers only: {1, 3, 5}
Using comprehension: {1, 3, 5}
If you need to access elements by position, use a list. If you need uniqueness and fast membership testing, use a set. Sometimes you need both: maintain a list for ordered access and a set for fast lookups.
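The "list plus set" combination can be sketched as follows. The function name unique_in_order and the sample data are hypothetical, chosen only to illustrate the pattern.

```python
def unique_in_order(items):
    """Return items with duplicates removed, keeping first-seen order."""
    seen = set()    # fast membership checks
    ordered = []    # preserves order and supports indexing
    for item in items:
        if item not in seen:
            seen.add(item)
            ordered.append(item)
    return ordered

visits = ["home", "about", "home", "pricing", "about"]
pages = unique_in_order(visits)
print(pages)     # ['home', 'about', 'pricing']
print(pages[0])  # home
```

The set answers "have I seen this already?" quickly, while the list remembers the order and allows access by position.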
Try fixing the buggy code below. The programmer accidentally used curly braces to create what they thought was an empty set.
Debug Challenge
> This code uses {} to create what it thinks is an empty set, but Python interprets {} as an empty dictionary. The .add() call then fails.
AttributeError: 'dict' object has no attribute 'add'
Sets provide a powerful way to work with unique collections and perform membership tests efficiently. Put these fundamentals to the test with hands-on challenges in the Python Builder.
❯❯❯PUTTING IT ALL TOGETHER
> You are a data analyst at Mailchimp deduplicating email addresses collected from three separate campaign upload files before running a bulk re-engagement send, ensuring no subscriber receives the same message twice and that every address meets basic hashability requirements.
set() created from the first campaign list automatically removes any duplicate addresses already present in that single source file.
.add() merges each address from the second and third campaign files into the existing set without any risk of introducing duplicates.
The in operator checks whether a specific email was already captured before deciding whether to include it from a new source list.
The set's uniqueness guarantee means the final address list passed to Mailchimp's send API contains no repeated recipient emails.
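The scenario above might look roughly like this in code. The three lists are made-up sample data standing in for the campaign files, and no real Mailchimp API is called.

```python
# Hypothetical sample data standing in for the three campaign files
campaign_one = ["ana@example.com", "ben@example.com", "ana@example.com"]
campaign_two = ["ben@example.com", "cara@example.com"]
campaign_three = ["dev@example.com", "cara@example.com"]

# set() removes duplicates within the first file
recipients = set(campaign_one)

# .add() merges the second file without introducing duplicates
for address in campaign_two:
    recipients.add(address)

# "in" checks whether an address was already captured
newly_added = []
for address in campaign_three:
    if address not in recipients:
        newly_added.append(address)
        recipients.add(address)

print(len(recipients))     # 4
print(sorted(recipients))
print(newly_added)         # ['dev@example.com']
```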
KEY TAKEAWAYS
Sets are unordered collections that automatically eliminate duplicates
Create sets with curly braces {1, 2, 3} or use set() for empty sets
Empty curly braces {} create a dictionary, not a set
Convert lists to sets to remove duplicates: set(my_list)
.add() adds one element; .update() adds multiple from an iterable
.remove() raises KeyError if missing; .discard() is silent
Membership testing with in is O(1) on average - extremely fast regardless of set size
Set elements must be hashable (in practice, immutable): strings, numbers, and tuples of hashable values
Lists, dictionaries, and sets cannot be elements of sets
Do not modify a set while iterating over it; iterate over a copy instead
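As a quick recap of the methods listed above, here is a short sketch using an illustrative tags set.

```python
tags = {"python"}

tags.add("sets")                  # adds one element
tags.update(["lists", "python"])  # adds several; "python" is already present

tags.discard("missing")           # silent when the element is absent
try:
    tags.remove("missing")        # raises KeyError when the element is absent
except KeyError as e:
    print(f"KeyError: {e}")

print(sorted(tags))    # ['lists', 'python', 'sets']
print("sets" in tags)  # True
```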
Collections that guarantee uniqueness
Category: Python
Difficulty: beginner
Duration: 55 minutes
Challenges: 3 hands-on challenges
Topics covered: What is a Set?, Creating Sets, Automatic Duplicate Removal, Removing Elements from Sets, What Can Be in a Set?