Another type of mutable collection in Python is a set. The Python documentation on sets describes what they are and how they are used:
Python also includes a data type for sets. A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
If you're familiar with mathematical terms, then a set might already seem intuitive. If not, don't worry; you'll go over how to create one and what you can use it for in this lesson.
Creating A Set
You can create a set in Python by calling set() and passing it another collection object, such as a list. You can also create a new set by wrapping comma-separated values in curly braces ({}):
s1 = {1, 2, 3}
print(s1)
# OUTPUT: {1, 2, 3}
s2 = set([1, 2, 3])
print(s2)
# OUTPUT: {1, 2, 3}
s3 = set()
print(s3)
# OUTPUT: set()
The first two examples, s1 and s2, create equivalent sets. s2 uses set() to convert a list into a set. The third example, s3, creates an empty set.
Note: If you want to create an empty set, then you need to use set(), not {}, as the latter creates an empty dictionary. You'll learn more about dictionaries in an upcoming lesson.
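The difference between {} and set() can be checked directly. The following is a minimal sketch; the variable names empty_braces, empty_set, and letters are only illustrative and do not appear elsewhere in this lesson:

```python
empty_braces = {}   # curly braces alone create an empty dictionary
empty_set = set()   # set() creates an empty set

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>

# set() accepts any iterable, including a string.
# Duplicate elements are dropped when the set is built.
letters = set("hello")
print(len(letters))  # 4, because "l" is stored only once
```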
Now that you know how to create a set, you'll look at an example of how sets can be useful.
Eliminating Duplicates
One frequent use of sets is to eliminate duplicate entries in a collection.
Imagine that you fetched all links from a website with your web scraping script, and now you want to follow those links programmatically. However, you only want to visit each destination once. Some pages can be reached through multiple paths, so the same URL may appear more than once in your list.
You can convert the list of URLs to a set, which will remove all duplicates and present you with a collection of unique URLs:
url_list = ['http://www.example.com',
            'http://www.setsareuseful.com',
            'http://www.example.com']
unique_urls = set(url_list)
print(unique_urls)
# OUTPUT (element order may vary): {'http://www.example.com', 'http://www.setsareuseful.com'}
You removed all duplicate links from your list with a single line of code. The resulting set is also an iterable collection, which means that you can loop over it just as you would loop over a list to access each of the URLs in your script. Unlike a list, a set is unordered, so you can't access its elements by index.
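As a rough sketch of how the deduplicated collection might be used, the snippet below loops over the set and runs a membership test. The URLs are placeholders, and sorted() is used here only to make the printing order predictable, since a set itself has no defined order:

```python
url_list = [
    "http://www.example.com",
    "http://www.setsareuseful.com",
    "http://www.example.com",
]
unique_urls = set(url_list)

# A set has no positions, so unique_urls[0] would raise a TypeError.
# Looping works; sorted() returns a list in a predictable order.
for url in sorted(unique_urls):
    print(url)

# Membership testing: check whether a URL is already in the set.
already_seen = "http://www.example.com" in unique_urls
print(already_seen)  # True
```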
Using Set Operations
You can do more with sets. In fact, if you are familiar with mathematical sets, then you'll see that you can apply many set operations in the same way that you would in mathematics:
There are even more set operations. You don't have to learn all of them, but rather keep aware that they exist and look them up if you need them.
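The four operations named in the Python documentation (union, intersection, difference, and symmetric difference) can be tried out with a short sketch like the one below. The sets a and b are made-up example values. Each operation is available both as an operator and as a method:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b                 # elements in a, in b, or in both
intersection = a & b          # elements in both a and b
difference = a - b            # elements in a but not in b
symmetric_difference = a ^ b  # elements in exactly one of the two sets

print(union)                 # contains 1, 2, 3, 4, 5, 6
print(intersection)          # contains 3, 4
print(difference)            # contains 1, 2
print(symmetric_difference)  # contains 1, 2, 5, 6

# The method forms give the same results and also accept
# other iterables, such as lists.
print(a.intersection([3, 4, 5, 6]) == intersection)  # True
```

None of these operators change a or b; each one returns a new set.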
Playground: Python Set Operations
- Referencing the table above, practice using .union() and .update().
# your turn!
Additional Resources
- Wikipedia: Basic Set Operations
- Official Python Tutorial: Python Sets
- Python Documentation: Python Sets
- Real Python: Python Sets
- Snakify: Set Tutorial
In a future lesson, you'll get to know another very important data type that you'll use frequently in programming. This data type is called a dictionary.
Summary: What is a Set
- You can create sets with curly braces ({}) or the set() function.
- You can use set operations and set methods to perform actions with sets.
- Python sets work similarly to mathematical sets. In practice, you'll likely use them to quickly remove duplicates from a collection.