Heap#
The heapq module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. Heaps are binary trees where every parent node has a value less than or equal to any of its children (min-heap). This cheat sheet covers heap operations including heap sort, priority queues, merging sorted iterables, and finding the n largest or smallest elements efficiently.
The source code is available on GitHub.
References#
Basic Heap Operations#
The heapq module provides functions to create and manipulate heaps. Use
heapify to convert a list into a heap in-place in O(n) time. Use heappush
and heappop to add and remove elements while maintaining the heap property.
>>> import heapq
>>> # Convert list to heap in-place
>>> h = [5, 1, 3, 2, 6]
>>> heapq.heapify(h)
>>> h[0] # smallest element at root
1
>>> # Push and pop
>>> heapq.heappush(h, 0)
>>> heapq.heappop(h)
0
>>> # Push and pop in one operation
>>> heapq.heappushpop(h, 4) # push 4, then pop smallest
1
>>> # Pop and push in one operation
>>> heapq.heapreplace(h, 0) # pop smallest, then push 0
2
Implement Heap Sort with heapq#
Heap sort works by pushing all elements onto a heap and then popping them off one by one. Since the heap maintains the min-heap property, elements come out in sorted order. The time complexity is O(n log n).
>>> import heapq
>>> a = [5, 1, 3, 2, 6]
>>> h = []
>>> for x in a:
... heapq.heappush(h, x)
...
>>> x = [heapq.heappop(h) for _ in range(len(a))]
>>> x
[1, 2, 3, 5, 6]
A more efficient approach uses heapify to convert the list in-place:
>>> import heapq
>>> def heap_sort(items):
... h = items.copy()
... heapq.heapify(h)
... return [heapq.heappop(h) for _ in range(len(h))]
...
>>> heap_sort([5, 1, 3, 2, 6])
[1, 2, 3, 5, 6]
Implement Max Heap#
Python’s heapq only provides a min-heap. To implement a max-heap, negate
the values when pushing and negate again when popping.
>>> import heapq
>>> # Max heap using negation
>>> h = []
>>> for x in [5, 1, 3, 2, 6]:
... heapq.heappush(h, -x)
...
>>> [-heapq.heappop(h) for _ in range(len(h))]
[6, 5, 3, 2, 1]
For custom objects, implement __lt__ with reversed comparison:
import heapq
class MaxHeapItem:
def __init__(self, val):
self.val = val
def __lt__(self, other):
return self.val > other.val # reversed for max heap
h = []
for x in [5, 1, 3]:
heapq.heappush(h, MaxHeapItem(x))
print(heapq.heappop(h).val) # 5 (largest)
Implement Priority Queue with heapq#
A priority queue processes elements based on their priority rather than insertion
order. Use tuples (priority, value) where lower numbers indicate higher priority.
>>> import heapq
>>> pq = []
>>> heapq.heappush(pq, (2, "medium"))
>>> heapq.heappush(pq, (1, "high"))
>>> heapq.heappush(pq, (3, "low"))
>>> [heapq.heappop(pq) for _ in range(len(pq))]
[(1, 'high'), (2, 'medium'), (3, 'low')]
For custom objects, implement the __lt__ method to define comparison behavior:
import heapq
class Task:
def __init__(self, priority, name):
self.priority = priority
self.name = name
def __lt__(self, other):
return self.priority < other.priority
def __repr__(self):
return f"Task({self.priority}, {self.name!r})"
h = []
heapq.heappush(h, Task(3, "low"))
heapq.heappush(h, Task(1, "high"))
heapq.heappush(h, Task(2, "medium"))
while h:
print(heapq.heappop(h))
# Task(1, 'high')
# Task(2, 'medium')
# Task(3, 'low')
Find K Largest or Smallest Elements#
The nlargest and nsmallest functions efficiently find the k largest or
smallest elements. They are more efficient than sorting when k is small relative
to the list size.
>>> import heapq
>>> nums = [5, 1, 8, 3, 9, 2, 7]
>>> heapq.nsmallest(3, nums)
[1, 2, 3]
>>> heapq.nlargest(3, nums)
[9, 8, 7]
Use the key parameter to extract comparison keys from complex objects:
>>> import heapq
>>> data = [
... {'name': 'Alice', 'score': 85},
... {'name': 'Bob', 'score': 92},
... {'name': 'Charlie', 'score': 78},
... ]
>>> heapq.nlargest(2, data, key=lambda x: x['score'])
[{'name': 'Bob', 'score': 92}, {'name': 'Alice', 'score': 85}]
Merge Sorted Iterables#
The merge function merges multiple sorted inputs into a single sorted output.
It returns an iterator, making it memory-efficient for large datasets.
>>> import heapq
>>> a = [1, 3, 5, 7]
>>> b = [2, 4, 6, 8]
>>> c = [0, 9, 10]
>>> list(heapq.merge(a, b, c))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Use key and reverse parameters for custom merging:
>>> import heapq
>>> # Merge in descending order
>>> a = [5, 3, 1]
>>> b = [6, 4, 2]
>>> list(heapq.merge(a, b, reverse=True))
[6, 5, 4, 3, 2, 1]
Maintain a Fixed-Size Heap#
To maintain a heap of fixed size k (e.g., tracking top k elements), use
heappushpop or check the size after each push.
>>> import heapq
>>> def top_k(items, k):
... """Keep track of k largest elements using min-heap."""
... h = []
... for x in items:
... if len(h) < k:
... heapq.heappush(h, x)
... elif x > h[0]:
... heapq.heapreplace(h, x)
... return sorted(h, reverse=True)
...
>>> top_k([5, 1, 8, 3, 9, 2, 7, 4, 6], 3)
[9, 8, 7]
Heap with Index Tracking#
When you need to update priorities in a heap, use a dictionary to track element positions or mark entries as invalid.
import heapq
class IndexedHeap:
def __init__(self):
self.heap = []
self.entry_finder = {}
self.REMOVED = '<removed>'
def push(self, item, priority):
if item in self.entry_finder:
self.remove(item)
entry = [priority, item]
self.entry_finder[item] = entry
heapq.heappush(self.heap, entry)
def remove(self, item):
entry = self.entry_finder.pop(item)
entry[-1] = self.REMOVED
def pop(self):
while self.heap:
priority, item = heapq.heappop(self.heap)
if item is not self.REMOVED:
del self.entry_finder[item]
return item
raise KeyError('pop from empty heap')
# Usage
h = IndexedHeap()
h.push('task1', 3)
h.push('task2', 1)
h.push('task1', 0) # update priority
print(h.pop()) # task1 (now has priority 0)