Python

In Python, a set is an unordered collection of unique items. Sets are modeled on the mathematical concept of a set and are mainly used for membership testing and for removing duplicate entries from a collection.

1. Theoretical Overview

Characteristics of Sets

  • Unordered: Sets do not record element position or order of insertion. You therefore cannot access items by index (e.g., my_set[0] raises a TypeError).

  • Unique Elements: Sets automatically filter out duplicate values. Each element must be unique.

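  • Hashable (Typically Immutable) Items: The set itself is mutable (you can add or remove items), but every item in it must be hashable. In practice this means immutable built-in types such as strings, integers, and tuples whose own contents are hashable. Lists, dictionaries, and other sets cannot be set elements.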

  • Unindexed: Since sets are unordered, the items do not have a defined index.
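
The characteristics above can be checked directly in an interpreter. The following is a minimal sketch; the names tags and points are illustrative and not taken from the text above.

```python
# Illustrative data only: 'tags' and 'points' are made-up names.
tags = {"python", "sets", "python"}   # the repeated literal is collapsed
print(len(tags))                       # 2

# Sets do not support indexing.
try:
    tags[0]
except TypeError as exc:
    index_error = str(exc)
    print("Indexing failed:", index_error)

# Unhashable objects such as lists cannot be set elements.
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    hash_error = str(exc)
    print("Construction failed:", hash_error)

# Tuples of hashable values are accepted.
points = {(0, 0), (1, 2)}
print((1, 2) in points)                # True
```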

Why Use a Set?

  • Duplicate Removal: Quickly turn a list with duplicates into a collection of unique values.

  • Mathematical Operations: Easily perform operations like Union, Intersection, and Difference.

  • Membership Testing: Checking if an item exists in a set is significantly faster than in a list (O(1) average time complexity vs O(n)).
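
As a rough illustration of the first and last points, the sketch below removes duplicates from a list and then times the same membership test against a list and a set. The sample data and sizes are assumptions chosen for the example, and the printed timings depend on the machine.

```python
import timeit

# Hypothetical sample data with repeated values.
readings = [3, 1, 3, 2, 1, 3]
unique_readings = set(readings)        # duplicates dropped, order not kept
print(sorted(unique_readings))         # [1, 2, 3]

# Membership: a list is scanned element by element, a set uses a hash lookup.
haystack_list = list(range(100_000))
haystack_set = set(haystack_list)
needle = 99_999                        # worst case for the list scan

list_time = timeit.timeit(lambda: needle in haystack_list, number=100)
set_time = timeit.timeit(lambda: needle in haystack_set, number=100)
print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

Converting a list to a set costs O(n) once, so the set pays off when the same collection is queried many times.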

2. Basic Syntax and Creation

Sets are created using curly braces {} or the set() constructor.

Note: To create an empty set, you must use set(), because {} creates an empty dictionary.

Python
# Creating sets
fruits = {"apple", "banana", "cherry", "apple"} # "apple" is duplicated
print(fruits) # Output: {'banana', 'apple', 'cherry'} (Order may vary)

# Creating an empty set
empty_set = set()
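
The difference between {} and set() is easy to verify, and set() also accepts any iterable. A short sketch, with letters and squares as illustrative names:

```python
# {} is an empty dict; set() is an empty set.
print(type({}))       # <class 'dict'>
print(type(set()))    # <class 'set'>

# set() accepts any iterable; iterating a string yields its characters.
letters = set("hello")
print(sorted(letters))                   # ['e', 'h', 'l', 'o']

# A set comprehension builds a set from an expression.
squares = {n * n for n in range(-2, 3)}
print(sorted(squares))                   # [0, 1, 4]
```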

3. Key Set Operations

A. Adding and Removing Items

| Method | Description | Example |
| --- | --- | --- |
| add() | Adds a single element. | my_set.add("orange") |
| update() | Adds every element of an iterable (list, tuple, another set, etc.). | my_set.update(["kiwi", "mango"]) |
| remove() | Removes a specific item; raises KeyError if it is not found. | my_set.remove("banana") |
| discard() | Removes a specific item; does NOT raise an error if it is missing. | my_set.discard("banana") |
| pop() | Removes and returns an arbitrary item (sets are unordered); raises KeyError if the set is empty. | my_set.pop() |
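
The methods above can be exercised in sequence as follows. This is a small sketch; basket is an illustrative name.

```python
basket = {"apple", "banana"}

basket.add("orange")
basket.add("apple")                  # already present, so nothing changes
basket.update(["kiwi", "mango"])     # adds each element of the iterable
print(sorted(basket))                # ['apple', 'banana', 'kiwi', 'mango', 'orange']

basket.remove("banana")              # present, so it is removed
basket.discard("banana")             # already gone, silently ignored

try:
    basket.remove("banana")          # missing item
    remove_raised = False
except KeyError:
    remove_raised = True
print(remove_raised)                 # True

taken = basket.pop()                 # removes and returns an arbitrary element
print(taken in basket)               # False
print(len(basket))                   # 3
```

Prefer discard() when a missing item is acceptable, and remove() when a missing item indicates a bug you want to surface.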

B. Mathematical Set Operations

This is where sets truly shine for data analysis.

Python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}

# Union: Items in either set
print(set_a | set_b) # {1, 2, 3, 4, 5, 6}

# Intersection: Items in both sets
print(set_a & set_b) # {3, 4}

# Difference: Items in set_a but NOT in set_b
print(set_a - set_b) # {1, 2}

# Symmetric Difference: Items in either set, but NOT both
print(set_a ^ set_b) # {1, 2, 5, 6}
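
Each operator also has a method form (union(), intersection(), difference(), symmetric_difference()) and an in-place form. The sketch below shows the main practical difference: the methods accept any iterable, while the operators require sets on both sides. The names union_result, common, and merged are illustrative.

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}

# Method forms accept any iterable, not only sets.
union_result = set_a.union([5, 6])
common = set_a.intersection(range(3, 10))
print(sorted(union_result))    # [1, 2, 3, 4, 5, 6]
print(sorted(common))          # [3, 4]

# Operator forms require a set (or frozenset) on both sides.
try:
    set_a | [5, 6]
    operator_rejected_list = False
except TypeError:
    operator_rejected_list = True
print(operator_rejected_list)  # True

# In-place variants modify the left-hand set instead of building a new one.
merged = set(set_a)            # copy, so set_a itself stays unchanged
merged |= set_b                # same as merged.update(set_b)
merged -= {1}                  # same as merged.difference_update({1})
print(sorted(merged))          # [2, 3, 4, 5, 6]
```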

4. Set Methods & Membership

Python
# Membership Testing
print("apple" in fruits) # True

# Subset and Superset
set_x = {1, 2}
set_y = {1, 2, 3}

print(set_x.issubset(set_y))   # True
print(set_y.issuperset(set_x)) # True
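
The subset and superset checks also have operator forms, which additionally distinguish proper subsets. A brief sketch using the same set_x and set_y:

```python
set_x = {1, 2}
set_y = {1, 2, 3}

print(set_x <= set_y)    # True:  subset (same as issubset)
print(set_x < set_y)     # True:  proper subset (subset and not equal)
print(set_y <= set_y)    # True:  every set is a subset of itself
print(set_y < set_y)     # False: but not a proper subset of itself
print(set_y >= set_x)    # True:  superset (same as issuperset)

# isdisjoint() reports whether two sets share no elements.
print(set_x.isdisjoint({7, 8}))   # True
print(set_x.isdisjoint(set_y))    # False
```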

5. Frozen Sets

If you need a set that is entirely immutable (cannot be changed after creation), Python provides the frozenset type. Because a frozenset is hashable, it can be used as a dictionary key or as an element of another set, which an ordinary set cannot.

Python
fs = frozenset(["red", "green", "blue"])
# fs.add("yellow") # This would raise an AttributeError
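
A minimal sketch of where hashability matters: a frozenset as a dictionary key and as a member of another set. The palette names and labels are made up for illustration.

```python
# Hypothetical lookup table keyed by unordered groups of colors.
palette_names = {
    frozenset(["red", "green", "blue"]): "RGB",
    frozenset(["cyan", "magenta", "yellow"]): "CMY",
}

# Element order does not matter when building the key.
label = palette_names[frozenset(["blue", "red", "green"])]
print(label)                          # RGB

# Frozensets can be elements of a set; equal ones collapse to one entry.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))                    # 2

# Non-mutating operations still work and return a new frozenset.
fs = frozenset(["red", "green", "blue"])
extended = fs | {"yellow"}
print(type(extended).__name__)        # frozenset
print(len(fs), len(extended))         # 3 4
```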
