Group objects in a list by property value in Python
May 2, 2021 ‐ 2 min read
Lately I came across a problem where I had to group the objects in a single list by the value of one of their properties. After trying a few things I settled on the groupby function from the itertools module.
Let's consider the following example. We have a Letter class with two properties: a string that holds the character and a boolean that specifies whether the letter is a vowel or not. We want to group Letter objects based on their vowel property.
The class looks like this. The __repr__ method is only there to give us a neat string representation when we print the objects later.
class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char
Next we create a list of Letter objects. We don't need the entire alphabet; the first few letters will do.
letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]
Now we group the Letter objects in our letters list by the value of their .vowel property. The grouping is determined by the key function we pass to groupby as its second argument; the first argument is the list itself, which we sort with the same key function beforehand.
from itertools import groupby

sorted_letters = sorted(letters, key=lambda letter: letter.vowel)
grouped = [list(result) for key, result in groupby(
    sorted_letters, key=lambda letter: letter.vowel)]
print(grouped)
# => [[b, c, d], [a, e]]
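The list comprehension above throws away the key of each group. If you also want to know which value a group belongs to, one option is to collect the groups into a dictionary. This is only a minimal sketch: it repeats the Letter class so that it runs on its own, and the names is_vowel and by_vowel are my own additions for illustration.

```python
from itertools import groupby


class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char


letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]


def is_vowel(letter: Letter) -> bool:
    # Hypothetical helper: one key function shared by sorted() and groupby().
    return letter.vowel


# Sort first so equal keys are adjacent, then keep each key with its group.
by_vowel = {
    key: list(group)
    for key, group in groupby(sorted(letters, key=is_vowel), key=is_vowel)
}

print(by_vowel)
# => {False: [b, c, d], True: [a, e]}
```

Sharing a single named key function between sorted and groupby also makes it harder for the two lambdas to drift apart by accident.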
One gotcha is that the list, or any other iterable, needs to be sorted by the same key before it is passed to groupby. The function only groups consecutive items that share a key value and starts a new group every time the key changes, so unsorted input yields fragmented groups. To illustrate, the following example has the sorting commented out. As you can see, the vowels a and e end up in separate groups.
from itertools import groupby

# letters = sorted(letters, key=lambda letter: letter.vowel)
grouped = [list(result) for key, result in groupby(
    letters, key=lambda letter: letter.vowel)]
print(grouped)
# => [[a], [b, c, d], [e]]
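To avoid forgetting the sort, the two steps can be wrapped in one small function. The sketch below is just one way to do that; group_sorted is a hypothetical name and not part of itertools, and it assumes the key values can be compared with each other so that sorted works on them.

```python
from itertools import groupby
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
K = TypeVar("K")


def group_sorted(items: Iterable[T], key: Callable[[T], K]) -> List[List[T]]:
    # Hypothetical helper: sort and group with the same key function, so
    # items with equal keys are always adjacent when groupby sees them.
    return [list(group) for _, group in groupby(sorted(items, key=key), key=key)]


words = ["apple", "bob", "avocado", "cherry", "banana"]

# Group the words by their first character.
print(group_sorted(words, key=lambda word: word[0]))
# => [['apple', 'avocado'], ['bob', 'banana'], ['cherry']]
```

Because sorted is stable, items inside each group keep the relative order they had in the input.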
Our entire code sample looks like:
from itertools import groupby


class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char


letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]

sorted_letters = sorted(letters, key=lambda letter: letter.vowel)
grouped = [list(result) for key, result in groupby(
    sorted_letters, key=lambda letter: letter.vowel)]
print(grouped)
# => [[b, c, d], [a, e]]