Before we can step through heapsort we need to understand binary heaps / priority queues and their properties.
Heap properties
Even thought this heap is a tree it can be represented via an array. We can quickly access parents and leaves by doing some arithmetic.
So we now can access parent and child nodes pretty easily. Let’s try adding some elements to our max heap
So far we’ve added nodes that maintain the properties of the max heap ie the values we added are smaller than their parent nodes. What about if we add a vvalue that needs to be higher in the tree?
When we add an element to the max heap we need to do an operation called heapify_up
(also known as bubble-up, percolate-up, sift-up, trickle-up, swim-up, heapify-up, or cascade-up).
source for this algo - wiki for binary heaps:
The insert has a worst-case time complexity of O(logn)
Now what about if we wanted to extract out the highest values from our max heap? We can do the corresponding heapify_down
algorithm
Ok so we now understand heaps a bit more. How do we use it to actually sort an array?
It boils down to a few major steps.
Let’s go through this visually
def heapify(arr, n, i):
# Find largest among root and children
largest = i
l = 2 * i + 1
r = 2 * i + 2
if (l < n) and (arr[i] < arr[l]):
largest = l
if (r < n) and (arr[largest] < arr[r]):
largest = r
# If root is not largest, swap with largest and continue heapifying
if largest != i:
arr[i],arr[largest] = arr[largest],arr[i]
heapify(arr, n, largest)
def heapSort(arr):
arr_length = len(arr)
# build max heap
for i in range(arr_length, 0, -1):
heapify(arr, arr_length, i)
for i in range(arr_length - 1, 0, -1):
# swap
arr[i], arr[0] = arr[0], arr[i]
# heapify root element
heapify(arr, i, 0)