6.1. Readings: sets#
6.1.1. Sets#
Sets are collections of values. What sets them apart from lists and tuples is that sets do not allow duplicate elements. In addition, sets are unordered, which means that we cannot access elements of a set by indexing, and sets do not support slicing. The elements themselves must be immutable (more precisely, hashable) values such as numbers, strings, or tuples, so an element cannot be modified in place once it is in the set. The set as a whole is mutable, however: we can add and remove elements.
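The properties above can be checked directly. The following is a minimal sketch; the variable names are illustrative and not part of the chapter's running example:

```python
# Duplicates collapse: the value 1 is stored only once.
numbers = {3, 1, 2, 1}
print(len(numbers))  # 3

# Sets have no positions, so indexing raises a TypeError.
try:
    numbers[0]
except TypeError as err:
    print('TypeError:', err)

# Elements must be hashable; a list is not, so it cannot be a set element.
try:
    bad = {[1, 2], 3}
except TypeError as err:
    print('TypeError:', err)
```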
6.1.1.1. Creating a set#
The syntax to define a set is:
set_name = {element1, element2, element3,....}
For example:
chemicals = {
'Oxygen', 'Hydrogen', 'Nitrogen', 'Potassium', 'Sodium', 'Oxygen'
}
chemicals
{'Hydrogen', 'Nitrogen', 'Oxygen', 'Potassium', 'Sodium'}
The order in which the items appear is not important, since sets are unordered.
As you can see, although Oxygen is specified twice in the creation of the set, it is added only once.
We can also create sets by using the set() function. This function receives a collection of values (any iterable) and builds a set from it. The set() function is called a constructor, since it is used to build (construct) a set.
list_of_values = [
'Oxygen', 'Hydrogen', 'Nitrogen', 'Potassium', 'Sodium', 'Oxygen'
]
chemicals_1 = set(list_of_values)
chemicals_1
{'Hydrogen', 'Nitrogen', 'Oxygen', 'Potassium', 'Sodium'}
Again, Oxygen appears only once.
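One detail worth knowing when choosing between the two ways of creating a set: empty braces do not create an empty set. A small sketch (the variable names are illustrative):

```python
empty_set = set()   # an empty set
empty_dict = {}     # NOT a set: empty braces create a dictionary

print(type(empty_set))   # <class 'set'>
print(type(empty_dict))  # <class 'dict'>

# set() accepts any iterable; a string is split into its characters.
letters = set('hello')
print(len(letters))  # 4, because 'l' appears twice in 'hello'
```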
Also, we can use a for-loop to iterate over the values of a set:
for el in chemicals:
    print(el, end=' ')
Potassium Hydrogen Sodium Oxygen Nitrogen
Unlike lists, sets are well suited to storing collections of unique values when the order does not matter. This is useful, e.g., for removing duplicates from a list or for quickly counting the number of unique values in a list by converting the list to a set:
birds = ['duck', 'duck', 'duck', 'duck', 'goose']
unique_birds = set(birds)
print(unique_birds)
print('We have {0} birds of {1} different types!'.format(len(birds), len(unique_birds)))
{'goose', 'duck'}
We have 5 birds of 2 different types!
6.1.1.2. Adding elements to a set#
There are two ways to add elements to a set: the add() and update() methods. add() adds a single value, while update() adds every element of an iterable (such as a list, a tuple, or another set).
For example:
chemicals.add('Carbon')
chemicals
{'Carbon', 'Hydrogen', 'Nitrogen', 'Oxygen', 'Potassium', 'Sodium'}
chemicals.update(['Calcium','Magnesium','Aluminum'])
chemicals
{'Aluminum',
'Calcium',
'Carbon',
'Hydrogen',
'Magnesium',
'Nitrogen',
'Oxygen',
'Potassium',
'Sodium'}
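As you can see, the order in which elements are added is not important. The alphabetical order shown here comes from how the notebook displays a set; the set itself has no defined order, and print() or a for-loop may show the elements in a different order.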
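The difference between add() and update() matters most with strings, because a string is itself an iterable of characters. A short sketch with illustrative names:

```python
gases = {'Oxygen'}
gases.add('Helium')               # adds the single value 'Helium'
gases.update(['Neon', 'Argon'])   # adds each item of the list
print(len(gases))  # 4

# Pitfall: update() iterates over its argument,
# so a bare string contributes its characters, not the word.
letters = set()
letters.update('Neon')
print(sorted(letters))  # ['N', 'e', 'n', 'o']
```

To add a whole word with update(), wrap it in a list, e.g. update(['Neon']), or simply use add('Neon').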
6.1.1.3. Removing elements from a set#
To remove elements, we can use the remove(), discard(), and pop() methods. The clear() method deletes all elements of the set but not the set itself.
For example, remove(element_name) removes the specified element; if that element does not exist, it raises a KeyError.
chemicals_1.remove('Hydrogen')
chemicals_1
{'Nitrogen', 'Oxygen', 'Potassium', 'Sodium'}
chemicals_1.remove('Carbon')
chemicals_1
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/ipykernel_57703/2546945945.py in <module>
----> 1 chemicals_1.remove('Carbon')
2 chemicals_1
KeyError: 'Carbon'
The discard(element_name) method does the same, but it does not raise an error if the element does not exist.
chemicals_1.discard('Nitrogen')
chemicals_1
{'Oxygen', 'Potassium', 'Sodium'}
chemicals_1.discard('Carbon')
chemicals_1
{'Oxygen', 'Potassium', 'Sodium'}
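When it is not known in advance whether an element is present, the two methods lead to slightly different code. A minimal sketch (the set and its contents are made up for illustration):

```python
metals = {'Sodium', 'Potassium'}

# remove() raises KeyError for a missing element, so guard it if needed.
try:
    metals.remove('Iron')
except KeyError:
    print('Iron was not in the set')

# discard() is silent when the element is missing.
metals.discard('Iron')
print(len(metals))  # 2
```

Use remove() when a missing element indicates a bug you want to hear about, and discard() when absence is an acceptable outcome.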
The pop() method removes an arbitrary element from the set and returns it. Since sets are unordered, we as programmers cannot control which element is selected. The choice is arbitrary rather than truly random, so do not rely on it.
print(chemicals_1.pop())
print('Set after popping an element: ', chemicals_1)
Potassium
Set after popping an element: {'Sodium', 'Oxygen'}
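A common use of pop() is to process the elements of a set one by one until none are left. The sketch below uses illustrative names and also shows that calling pop() on an empty set raises a KeyError:

```python
pending = {'a', 'b', 'c'}
removed = []

# An empty set is falsy, so the loop stops when nothing is left.
while pending:
    removed.append(pending.pop())

print(sorted(removed))  # ['a', 'b', 'c']

# pop() on an empty set raises KeyError.
try:
    pending.pop()
except KeyError:
    print('cannot pop from an empty set')
```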
To remove all elements from a set, we can use the clear() method:
chemicals_1.clear()
chemicals_1
set()
The del keyword deletes the variable itself, so the name no longer refers to a set.
del chemicals_1
chemicals_1
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/ipykernel_57703/1218415494.py in <module>
1 del chemicals_1
----> 2 chemicals_1
NameError: name 'chemicals_1' is not defined
As you can see, there is no longer a variable called chemicals_1 in the runtime, which is why we get a NameError.
6.1.1.4. Searching for a value in a set#
To check whether a value is in a set, we can use the in or not in operators. As we have seen in previous chapters, if the value is present, in returns True and not in returns False. If the value is not present, in returns False and not in returns True.
print('Oxygen' in chemicals)
print('Carbon' in chemicals)
print('Carbon' not in chemicals)
print('Oxygen' not in chemicals)
True
True
False
False
6.1.1.5. Set Comparison#
We can use the comparison operators introduced in Chapter 2 to compare two sets. Two sets are equal (==) if they have the same elements, regardless of order. Otherwise, they are not equal.
Let us look at an example:
chemicals_1 = {
'Aluminum', 'Calcium', 'Carbon',
'Hydrogen', 'Magnesium', 'Nitrogen',
'Sodium', 'Oxygen', 'Potassium',
}
print('chemicals =', chemicals)
print('chemicals_1 =', chemicals_1)
print(chemicals_1 == chemicals)
chemicals = {'Potassium', 'Hydrogen', 'Carbon', 'Aluminum', 'Sodium', 'Calcium', 'Oxygen', 'Nitrogen', 'Magnesium'}
chemicals_1 = {'Potassium', 'Hydrogen', 'Carbon', 'Aluminum', 'Sodium', 'Calcium', 'Oxygen', 'Nitrogen', 'Magnesium'}
True
As you can see, the elements of chemicals_1 were not written in the same order as those of chemicals. But the two sets have the same elements, so they are equal.
On the other hand:
chemicals_2 = {'Aluminum', 'Calcium', 'Carbon'}
print('chemicals =', chemicals)
print('chemicals_2 =', chemicals_2)
print(chemicals_2 == chemicals)
chemicals = {'Potassium', 'Hydrogen', 'Carbon', 'Aluminum', 'Sodium', 'Calcium', 'Oxygen', 'Nitrogen', 'Magnesium'}
chemicals_2 = {'Calcium', 'Aluminum', 'Carbon'}
False
As expected, chemicals_2 is not equal to chemicals because they do not have the same elements.
We say that set A is a proper subset of set B if all elements of A are contained in B, but not all elements of B are contained in A. In other words, A and B are not equal, but B contains all elements of A. To check if A is a proper subset of B we can use the < (strictly less than) operator. The < operator checks whether the set on its left is a proper subset of the set on its right.
a = {1,2,3}
b = {1,2,3,4,5}
print(a < b)
print(b < a)
True
False
In this chapter, we say that set A is an improper subset of set B if all elements of A are contained in B, where A and B are allowed to be equal (in standard mathematical terminology this is simply a subset, A ⊆ B). To check if A is an improper subset of B we can use the <= (less than or equal to) operator. The <= operator checks whether the set on its left is an improper subset of the set on its right.
a = {2,3,1}
b = {1,2,3,4,5}
a <= b
True
c = {1,2,3}
print(a <= c)
print(c <= a)
True
True
As the examples above show, when two sets are equal, <= holds in both directions. Also, every proper subset is an improper subset as well, but not the other way round.
Equivalently, the issubset() method checks for improper subsets:
a.issubset(c)
True
Symmetrically, we can check whether set B is a proper or improper superset of A by using the > (greater than) or >= (greater than or equal to) operators, respectively. For clarity, B is a proper superset of A if all elements of A are part of B and A and B are not equal, while B is an improper superset of A if all elements of A are part of B and A and B may be equal. We can also use the issuperset() method to check for an improper superset.
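The superset checks mirror the subset examples shown earlier. A brief sketch, with the sets redefined here so the example is self-contained:

```python
a = {1, 2, 3}
b = {1, 2, 3, 4, 5}
c = {3, 2, 1}

print(b > a)   # True:  b contains all of a and is not equal to a
print(b >= a)  # True
print(c > a)   # False: c and a are equal, so c is not a proper superset
print(c >= a)  # True:  equality is allowed with >=

# The method form also accepts any iterable, not only sets.
print(b.issuperset(a))       # True
print(b.issuperset([1, 2]))  # True
```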
6.1.1.6. Set comprehension#
As you may remember from the previous chapter, we defined list comprehensions using [], since list elements are written inside []. For sets, we write the set comprehension inside {}, since set elements are written inside {}. In the next example, we use a list that contains duplicate elements to create a set using a set comprehension:
chemicals = [
'Oxygen', 'Hydrogen', 'Nitrogen', 'Oxygen', 'Hydrogen', 'Carbon'
]
set_from_list = {element for element in chemicals}
set_from_list
{'Carbon', 'Hydrogen', 'Nitrogen', 'Oxygen'}
As you can see, the set does not contain duplicate elements anymore.
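As with list comprehensions, a set comprehension can also transform or filter the elements while building the set. The following sketch uses a made-up list of names to illustrate both:

```python
words = ['Oxygen', 'hydrogen', 'OXYGEN', 'Carbon', 'carbon']

# Transform: lowercase every name, so spelling variants collapse.
normalized = {word.lower() for word in words}
print(sorted(normalized))  # ['carbon', 'hydrogen', 'oxygen']

# Filter: keep only names longer than six characters.
long_names = {word.lower() for word in words if len(word) > 6}
print(long_names)  # {'hydrogen'}
```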
6.1.1.7. Set operators (Optional)#
6.1.1.7.1. Set union#

Fig. 6.1 Set Union#
The union of two sets is another set that contains every element that appears in at least one of the two sets. To find the union of two sets you can use the | operator or the union() method.
a = {1,2,3,4}
b = {1,2,3,4,5}
print(a | b)
print(a.union(b))
{1, 2, 3, 4, 5}
{1, 2, 3, 4, 5}
Note
The union() method can take any iterable as its argument. It treats the elements of that iterable as a set and then performs the union. The | operator, in contrast, requires both operands to be sets.
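The difference between the method and the operator can be seen in a small sketch (values chosen for illustration):

```python
a = {1, 2, 3}

# union() accepts one or more iterables of any kind.
combined = a.union([3, 4], (5,))
print(sorted(combined))  # [1, 2, 3, 4, 5]

# The | operator only works between sets.
try:
    a | [3, 4]
except TypeError:
    print('| needs a set on both sides')

# union() returns a new set; the original is unchanged.
print(sorted(a))  # [1, 2, 3]
```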
6.1.1.7.2. Set intersection#

Fig. 6.2 Set Intersection#
Similarly, you can use the & operator or the intersection() method to find the common elements of two sets:
print(a & b)
print(a.intersection(b))
{1, 2, 3, 4}
{1, 2, 3, 4}
As you can see, both ways output the common elements of both sets.
6.1.1.7.3. Set difference#

Fig. 6.3 Set Difference#
The set difference is another set that contains only the elements of the first set that are not part of the second set. To find the set difference we can use the - operator or the difference() method:
print(a - b)
print(a.difference(b))
set()
set()
As you can see, both ways return an empty set because there are no elements in a that are not in b. Let us check the difference between b and a now:
print(b - a)
print(b.difference(a))
{5}
{5}
In the second case, 5 is the element that is in b but not in a.

Fig. 6.4 Symmetric Set Difference#
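The symmetric set difference, on the other hand, is the set of elements that are in exactly one of the two sets, i.e., the elements that the sets do not have in common. We can use the ^ operator or the symmetric_difference() method: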
print(a^b)
print(a.symmetric_difference(b))
{5}
{5}
It is like taking the union of the two set differences a - b and b - a.
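This relationship can be checked with two sets that each have elements the other lacks. The sets x and y below are chosen for illustration only:

```python
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}

print(sorted(x ^ y))  # [1, 2, 5, 6]

# Symmetric difference = union of the two one-sided differences.
print(x ^ y == (x - y) | (y - x))  # True

# Equivalently: everything in the union that is not in the intersection.
print(x ^ y == (x | y) - (x & y))  # True
```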
6.1.1.7.4. Disjoint sets#

Fig. 6.5 Disjoint Sets#
Two sets are disjoint if they do not have any elements in common. To check whether two sets are disjoint or not we can use the isdisjoint() method:
a.isdisjoint(b)
False
It returns a boolean value that indicates if two sets are disjoint or not.
d = {10,20,30}
a.isdisjoint(d)
True
6.1.1.8. Frozenset (Optional)#
Frozensets are immutable sets. As we have seen so far, sets are mutable, since we can modify them by adding or removing elements. A frozenset, in contrast, cannot be modified once it is created. This matters because set elements must themselves be immutable (hashable): a regular set cannot be an element of another set, but a frozenset can. A frozenset is created with the built-in function frozenset(), which takes any iterable as input and creates the frozenset from its elements.
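A short sketch of how a frozenset behaves in practice (the names and values are illustrative):

```python
noble = frozenset(['Helium', 'Neon', 'Argon'])
print('Neon' in noble)  # True

# Frozensets have no mutating methods such as add() or remove().
try:
    noble.add('Krypton')
except AttributeError:
    print('frozensets have no add() method')

# Because they are hashable, frozensets can be elements of a set.
groups = {noble, frozenset(['Sodium', 'Potassium'])}
print(len(groups))  # 2

# Operations such as union still work; they return a new frozenset.
bigger = noble | {'Krypton'}
print(type(bigger).__name__, len(bigger))  # frozenset 4
print(len(noble))  # 3, the original is unchanged
```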
Next, we will explore dictionaries.