Arrays — Visual Learning

Find Duplicate in Array —
Four approaches. One optimal.

Finding a duplicate in an array has multiple solutions — hash set O(n) space, sorting O(n log n), and Floyd's cycle detection O(1) space. Each approach reveals a different way of thinking about the problem. Watch each one animate on LearnBug and you'll understand not just the answer but why each approach works.

O(n)Optimal time
O(1)Optimal space
PythonLanguage

What is the problem?

Given an array of n+1 integers where each value is between 1 and n inclusive, find the one duplicate. By the pigeonhole principle, at least one value must repeat. The challenge is finding it efficiently — without modifying the array and without using O(n) extra space for the hardest variant.

Why visualization changes everything

Floyd's cycle detection is notoriously hard to understand from code alone. On LearnBug you watch the tortoise and hare move through the array as if it were a linked list — slow pointer advancing one step, fast advancing two — until they meet inside the cycle. The cycle entry point reveals the duplicate.

YouTube — Find Duplicate Explained Visually
📺 Drop your YouTube embed here
LearnBug — slow and fast pointer meeting in the cycle
🖼 Add a LearnBug screenshot here
Four Approaches

From brute force to O(1) space

Brute Force — O(n²) time

Check every pair with nested loops. For each element, scan the rest of the array. Returns the duplicate but is too slow for large inputs.

O(n²)O(1) spaceToo slow

Sort First — O(n log n)

Sort the array, then scan for adjacent equal elements. Simple but modifies the array and costs O(n log n). Not allowed if array is immutable.

O(n log n)Modifies array

Hash Set — O(n) time, O(n) space

Add each element to a set. If it's already there, that's the duplicate. O(n) time and space — fast but uses extra memory.

O(n)O(n) spaceEasy to code

Floyd's Cycle — O(n) time, O(1) space ✓

Treat array values as linked list pointers. Slow and fast pointers detect a cycle, then find its entry — the duplicate. Optimal: no extra space, no modification.

O(n)O(1) spaceOptimal
Deep Dive

Floyd's cycle detection — step by step

Array: [1, 3, 4, 2, 2]. Values are used as next-index pointers. Index 0 → value 1 → index 1 → value 3 → index 3 → value 2 → index 2 → value 4 → index 4 → value 2 → index 2 (cycle!)

1

Phase 1 — Find intersection point

Start both at index 0. slow = nums[slow] (one step), fast = nums[nums[fast]] (two steps). Repeat until they meet.

10
31
42
23
24

slow: 0→1→3→2→4→2...  |  fast: 0→3→4→2→4→2... → meet at index 2

2

Phase 2 — Find cycle entry (the duplicate)

Reset slow to index 0. Keep fast at intersection. Now both move one step at a time. Where they meet is the cycle entry = the duplicate value.

1slow
3
4fast
2
2

slow: 0→1→3→2  |  fast: 2→4→2 → meet at index 2, value = 2

Python Code

Hash set and Floyd's in code

PythonHash Set — simple O(n) space
def find_duplicate_set(nums):
    seen = set()
    for num in nums:
        if num in seen:
            return num
        seen.add(num)
    return -1
PythonFloyd's Cycle Detection — O(1) space (optimal)
def find_duplicate_floyd(nums):
    # Phase 1: find intersection inside cycle
    slow = fast = nums[0]
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break

    # Phase 2: find cycle entry = duplicate
    slow = nums[0]
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]

    return slow

Approach Comparison

Hash SetO(n) / O(n)Time / Space — easiest to code in an interview
SortO(n log n) / O(1)Modifies array — not valid if immutability is required
Floyd'sO(n) / O(1)Optimal — doesn't modify array, no extra space
ConstraintValues 1 to nFloyd's only works because values are valid array indices

Frequently asked questions

Why does Floyd's algorithm work on this problem?

Because values are in range 1 to n, each value can be used as an index into the array — just like a "next pointer" in a linked list. The duplicate value means two different indices point to the same next index, creating a cycle. Floyd's cycle detection finds where that cycle begins, which is exactly the duplicate value.

What if the array has multiple duplicates?

The classic formulation (LeetCode 287) guarantees exactly one duplicate. For multiple duplicates, Floyd's doesn't directly apply. Use a hash set to collect all seen values and return a list of those seen more than once, or sort and check adjacent pairs.

Can I solve this with XOR?

XOR works for finding a missing number (XOR all indices with all values), but not for finding a duplicate when values are 1 to n and one is duplicated. XOR cancels pairs perfectly but a duplicate creates an odd count, not a missing value — the XOR approach doesn't isolate which value is duplicated.

What is the LeetCode problem for this?

LeetCode 287 — Find the Duplicate Number. Constraints: array of n+1 integers, values 1 to n, exactly one duplicate which may appear more than twice. Must not modify the array and must use only O(1) extra space — that forces Floyd's. Also related: LC 442 (Find All Duplicates), LC 448 (Find All Missing Numbers).

Is the hash set approach acceptable in interviews?

Yes — always start with the hash set approach. It's O(n) time, easy to code correctly, and shows you understand the problem. Then offer Floyd's as the optimal follow-up if the interviewer asks for O(1) space. Interviewers appreciate seeing both approaches and understanding the trade-offs.

Watch slow and fast pointers find the cycle on your array

Paste your duplicate array into LearnBug and see Floyd's algorithm detect the cycle step by step.

Open Playground →