[sorting] heap sort

heap.img

Pre-req.

Binary tree:

  • Every node at most has two kids.

Full binary tree:

  • Every level is full.
  • K levels ⟺ 2^K-1 nodes.

Complete binary tree:

  • Except the last level, all levels are full.

  • Because of the above properties, it can be represented by an array, where left = root*2, right = root*2 +1, and root = kid // 2.

  • Becasue of array is usually zero- based, the below formulas are used in heap instead: left = root*2 + 1, right = root*2 + 2, and root = (kid-1) // 2

Heap

* the codes below use min heap for default.

Piority queue

A piority queue, where its element are tuples like (piority, item), is just a special heap. In python, because tuple comparision is available, heap is quivalent piority queue!

Operations on Heap

Heapify(array, i)

It adjust the tree under index i, making sure the ith element sinks down to its correct place.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

def (arr, i):
root = i
left = 2 * i + 1
right = 2 * i + 2
if right < len(arr) and arr[right] < arr[left] and arr[right] < arr[root]:
# swap root and right
arr[right], arr[root] = arr[root], arr[right]
heapify(arr, right)
elif left < len(arr) and arr[left] < arr[root]:
# swap left and root
arr[left], arr[root] = arr[root], arr[left]
heapify(arr, left)
else:
pass

or

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# use while loop
def heapify_loop(arr, i)
root = i
while root < len(arr):
left = 2 * root + 1
right = 2 * root + 2
if right < len(arr) and arr[right] < arr[left] and arr[right] < arr[root]:
# swap root and right
arr[right], arr[root] = arr[root], arr[right]
root = right
elif left < len(arr) and arr[left] < arr[root]:
# swap left and root
arr[left], arr[root] = arr[root], arr[left]
root = left
else:
break

Build_heap(array)

This done by calling heapify(arr, i) from the bottom to the top, ensureing the array satisfies the heap property.

1
2
3
def build_heap(arr):
for i in range( (len(arr)-1) // 2, -1, -1):
heapify(arr, i)

Insert(array, ele)

This is done by adding the heap to the end, and sift it up to the correct position.

1
# omiited

Pop(array)

This is done by:

  1. poping the root of the heap
  2. fill the root with the last element in the array
  3. heapify(arr, 0)
1
2
3
4
5
6
7
8
9
def pop(arr):
if arr == []:
return None
else:
temp = arr[0]
arr[0] = arr[-1]
del arr[-1]
heapify(arr, 0)
return temp

Notice keep popping will give a ordered array.

Heapsort(array):

The complete sorting process includes

  1. build the heap
  2. ouput the min(max) of the heap one at a time to form a ordered array
1
2
3
4
5
6
7
8
def heapsort(arr):
build_heap(arr)
res = []
temp = pop(arr)
while temp is not None:
res.append(temp)
temp = pop(arr)
return res

Performance

Each Heapify is log(n).

Building heap does n times of Heapify, hence nlog(n).

Each Pop calls Heapify once, hence log(n).

Totally there are n times of popping, hence nlog(n).

⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎⬇︎

Worst = 2*nlog(n) = O(nlog(n))

Ref.