Group objects in a list by property value in Python

I recently ran into a problem where I had to group the objects in a list by the value of one of their properties. After trying a few approaches, I settled on the groupby function from the itertools module.

Let's consider the following example: we have a Letter class with two properties, a string that holds the character and a boolean that says whether the letter is a vowel. We want to group Letter objects by their vowel property.

The class looks like this. The __repr__ method is only there to give us a neat string representation when we print the objects later.

class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char

Next we create a list of Letter objects. Not the entire alphabet; the first few letters will do.

letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]

Now we group the Letter objects in our letters list by the value of their .vowel property. We first sort the list by that property, then pass the sorted list to groupby as the first argument and the key function that determines the grouping as the second.

from itertools import groupby

sorted_letters = sorted(letters, key=lambda letter: letter.vowel)
grouped = [list(result) for key, result in groupby(
    sorted_letters, key=lambda letter: letter.vowel)]

print(grouped)
# => [[b, c, d], [a, e]]
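The listing above throws away the key that groupby yields for each group. If you also need to know which group is which, a minimal sketch is to collect the groups into a dict keyed by that value. The names vowel_key and by_vowel are made up for this illustration.

```python
from itertools import groupby


class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char


letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]


def vowel_key(letter: Letter) -> bool:
    # Hypothetical helper: the same key is used for sorting and grouping.
    return letter.vowel


# groupby yields (key, group) pairs; keep the key instead of discarding it.
by_vowel = {
    key: list(group)
    for key, group in groupby(sorted(letters, key=vowel_key), key=vowel_key)
}

print(by_vowel)
# => {False: [b, c, d], True: [a, e]}
```

False sorts before True, which is why the consonants come first.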

One gotcha is that the list, or any other iterable, needs to be sorted by the same key before you pass it to groupby. groupby only groups consecutive items with equal keys, so on unsorted input the same key can show up in several separate groups. To illustrate, see the following example, where the sorting is commented out.

from itertools import groupby

# letters = sorted(letters, key=lambda letter: letter.vowel)
grouped = [list(result) for key, result in groupby(
    letters, key=lambda letter: letter.vowel)]

print(grouped)
# => [[a], [b, c, d], [e]]
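If sorting is undesirable, for example because you want to keep the original order of first appearance, a different technique avoids the gotcha altogether: collect the items in a defaultdict from the collections module. This is not groupby, just a sketch of an alternative, and the name groups is chosen for this illustration.

```python
from collections import defaultdict


class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char


letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]

# Each missing key starts out as an empty list, so no sorting is needed.
groups = defaultdict(list)
for letter in letters:
    groups[letter.vowel].append(letter)

print(dict(groups))
# => {True: [a, e], False: [b, c, d]}
```

The groups appear in the order their key was first seen, so the vowels come first here because the list starts with 'a'.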

The entire code sample looks like this:

from itertools import groupby


class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char

letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]

sorted_letters = sorted(letters, key=lambda letter: letter.vowel)
grouped = [list(result) for key, result in groupby(
    sorted_letters, key=lambda letter: letter.vowel)]

print(grouped)
# => [[b, c, d], [a, e]]
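The sample repeats the same lambda for sorting and for grouping. One way to avoid the repetition, sketched here as an option rather than a requirement, is to build the key function once with attrgetter from the operator module. The name by_vowel is made up for this illustration.

```python
from itertools import groupby
from operator import attrgetter


class Letter:
    def __init__(self, char: str, vowel: bool) -> None:
        self.char = char
        self.vowel = vowel

    def __repr__(self) -> str:
        return self.char


letters = [
    Letter('a', True),
    Letter('b', False),
    Letter('c', False),
    Letter('d', False),
    Letter('e', True),
]

# attrgetter('vowel') behaves like lambda letter: letter.vowel.
by_vowel = attrgetter('vowel')

# Sort and group with the same key function so the two cannot drift apart.
grouped = [
    list(group)
    for _, group in groupby(sorted(letters, key=by_vowel), key=by_vowel)
]

print(grouped)
# => [[b, c, d], [a, e]]
```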
