Hash Tables — Visual Learning

Group Anagrams —
Sort the signature. Group by key.

Group a list of strings so that all anagrams of each other end up in the same bucket. The trick: every anagram has the same sorted string — use that as a hash map key. Watch each word get sorted, keyed, and bucketed on LearnBug and the grouping logic becomes completely clear in one pass.

O(n·k log k)Time complexity
O(n·k)Space complexity
PythonLanguage

What is Group Anagrams?

Given a list of strings, return them grouped so all anagrams appear together. For ["eat", "tea", "tan", "ate", "nat", "bat"], the output is [["eat","tea","ate"], ["tan","nat"], ["bat"]]. Two strings are in the same group if they're anagrams — same characters, same counts, any order.

The key insight — sorted string as signature

Every anagram of a word has the exact same sorted form. sort("eat") = "aet", sort("tea") = "aet", sort("ate") = "aet". Use this sorted form as a hash map key. All anagrams map to the same key and collect in the same bucket. One pass, no pairwise comparison.

YouTube — Group Anagrams Explained
📺 Drop your YouTube embed here
LearnBug — words bucketing by sorted key
🖼 Add a LearnBug screenshot here
Walkthrough

Grouping ["eat", "tea", "tan", "ate", "nat", "bat"]

1

"eat" → sort → "aet" → new bucket

Key "aet" doesn't exist in map yet. Create it with ["eat"].

Map: { "aet": ["eat"] }
2

"tea" → sort → "aet" → same bucket as "eat"

Key "aet" already exists. Append "tea" to that bucket.

Map: { "aet": ["eat", "tea"] }
3

"tan" → sort → "ant" → new bucket

Key "ant" doesn't exist. Create with ["tan"].

Map: { "aet": ["eat", "tea"], "ant": ["tan"] }
4

"ate" → "aet", "nat" → "ant", "bat" → "abt"

"ate" appends to "aet" bucket. "nat" appends to "ant" bucket. "bat" creates new "abt" bucket. All 6 words processed.

Final Map: { "aet": ["eat","tea","ate"], "ant": ["tan","nat"], "abt": ["bat"] }

Result: [["eat","tea","ate"], ["tan","nat"], ["bat"]]

Python Code

Sort key and frequency tuple key approaches

PythonSorted string as key — O(n·k log k)
from collections import defaultdict

def group_anagrams(strs):
    groups = defaultdict(list)

    for word in strs:
        key = ''.join(sorted(word))   # "eat" → "aet"
        groups[key].append(word)

    return list(groups.values())
PythonCharacter count tuple as key — O(n·k), avoids sort
def group_anagrams_no_sort(strs):
    groups = defaultdict(list)

    for word in strs:
        # 26-element tuple of char counts — no sorting needed
        count = [0] * 26
        for c in word:
            count[ord(c) - ord('a')] += 1
        groups[tuple(count)].append(word)

    return list(groups.values())

Time & Space Complexity

Sort key approachO(n·k log k)n strings, each sorted in O(k log k) where k = max word length
Count tuple approachO(n·k)No sort — count 26 chars per word in O(k). Faster for long words.
SpaceO(n·k)Map stores all n strings — total characters is n·k
Why defaultdict?Auto-initdefaultdict(list) creates an empty list on first key access — no KeyError

Frequently asked questions

Why use a sorted string as the key?

All anagrams produce the same sorted string — it's a canonical form that uniquely identifies the set of characters. Using it as a hash map key ensures all anagrams map to the same bucket. Any function that maps anagrams to the same value and non-anagrams to different values would work — sorted string is just the most natural choice.

When is the count tuple approach better than sorted?

For long words (large k), counting 26 characters is O(k) while sorting is O(k log k). For large inputs with long words, the count tuple approach wins. For short words (typical interview input), the sorted approach is simpler and fast enough. In interviews, start with sorted — mention count tuple as the follow-up optimisation.

What is defaultdict and why use it here?

defaultdict(list) is a dict that automatically creates an empty list when a new key is accessed — no need to check if key not in groups or call groups.setdefault(key, []). It makes the code cleaner and is the idiomatic Python approach for grouping problems.

What is the LeetCode problem for this?

LeetCode 49 — Group Anagrams. Medium difficulty. It's a standard hash map design problem that tests whether you understand using composite objects (sorted strings, tuples) as hash keys. Related: LC 242 (Valid Anagram), LC 438 (Find All Anagrams in a String).

Can the grouping order affect the result?

LeetCode accepts any ordering of groups and any ordering within groups — the test compares sets of sets. In Python 3.7+, dicts maintain insertion order, so groups.values() returns groups in the order the first word of each anagram group was encountered. This is deterministic but not required to be sorted.

Watch each word sort, key, and bucket in real time

Paste your word list into LearnBug and see every anagram group form step by step.

Open Playground →