This file briefly describes each type of heap, and how it has been implemented
in the supplied C source files. For detailed information refer to the source
files, or the references given at the end of this file.
-------------------------------------------------------------------------------
The Binary Heap
Files: bheap.c, bheap.h
The binary heap is a heap ordered binary tree. A binary tree allows each node
in the tree to have two children. Each node has a value associated with it,
called its key. The term `heap ordered' means that no child in the tree has a
key greater than the key of its parent. By maintaining heap order in the tree,
the root node has the smallest key. Because the heap has simple access to the
minimum node, the find_min() operation to takes O(1) time.
Heaps support the operations insert(), decrease_key()/increase_key(), delete(),
and delete_min(). When these operations occur, the heap needs to be
rearranged, shifting nodes to maintain heap order or to replace deleted nodes.
When inserting a node, it is placed at the bottom of the tree. If the inserted
node has a key greater than that of its parent, the node is swapped with its
parent. This process, called sift-down continues until heap order is
satisfied. When deleting a node, we replace the deleted node with the node at
the bottom of the tree. If it is smaller than its parent, sift-down occurs.
Otherwise, if it is greater than the minimum of its two children, the opposite
process called sift-up occurs. The sift-up process swaps the node with its
smallest child, and continues until heap order is satisfied. After a
decrease_key() operation sift-down is performed if necessary. After an
increase_key() operation sift-up is is performed if necessary.
Because of the binary heaps simplicity, it is possible to maintain it using a
one dimensional arrays. This method has been used in the source file bheap.c.
The root node is located at position 1 in the array. The first child of the
root is located at position 2 and the second child at position 3. In general,
the children of the node at position i are located at 2*i and 2*i + 1. So the
children of the node at position 3 in the array are located at positions 6 and
7. Similarly, the parent of the node at position i is located at i div 2.
Note that in file bheap.c array entry 0 is unused.
The alternative way to implement binary heaps is to define a structure type for
a node, and use pointer fields to parent and child nodes in the heap.
-------------------------------------------------------------------------------
The Fibonacci Heap
Files: fheap.c, fheap.h
In this description, when comparing nodes we are actually referring to its key.
The Fibonacci heap invented by Fredman and Tarjan consists of a collection of
several trees of different ranks. The rank of a node in a Fibonacci heap is
defined as the number of children it has, and the rank of a tree is defined as
the rank of the root node. The trees are created through a linking process. A
rank 0 tree consists of a single node. A rank 1 tree is created by comparing
the root nodes of two rank 0 trees and linking them so that the tree with the
smaller root node becomes the parent of the tree with the larger root node.
This is done by adding the smaller root node to the children of the other trees
root node, thus increasing the rank of the other tree. In general a rank i
tree is created through the linking of two rank i-1 trees. By comparing the
keys of root nodes, trees created trough linking are heap ordered.
Because linking joins whole trees, Fibonacci heaps are implemented by defining
a structure type for nodes and using pointers between nodes. In the file
fheap.h, the Fibonacci heap node type has the pointers parent, child, left, and
right. A nodes children are maintained in a circular doubly linked list using
the left and right sibling pointers. The child pointer of a node points to one
child in its list of children. The parent pointer points to a nodes parent.
For root nodes, the parent pointer is NULL.
For the Fibonacci heap implementation in fheap.c, we allow at most one tree of
each rank. This is done by maintaining an array of pointers to Fibonacci heap
trees, where entry i points to the root node of a tree of rank i. When
inserting a node, we attempt insertion at entry 0. If there is already a
rank 0 tree there we link the new node with it and attempt to insert the
resulting rank 1 tree at entry 1. Linking continues until a tree is inserted
at an empty entry.
For the delete_min() operation we must scan the root nodes of all trees in the
heap to determine which one is the minimum node. After deleting the minimum
node, its children, which make up a collection of trees, are merged back into
the heap, similar to inserting a node into the heap, except we merge at the
appropriate rank array entry for each tree. Each tree in the collection of
child trees after a delete_min is easily accessed using the left and right
sibling pointers.
After a decrease_key() operation on node x, we break the link between x and
its parent, p, then merge the resulting tree, with root node x, back into the
array of trees. There is one further detail with decrease_key(): We allow a
node to loose at most one of its children to decrease_key() operations. When a
node looses a child, we mark it. Each node has a boolean field, `marked',
which is TRUE if it has lost a child, and FALSE otherwise. Upon loosing a
second child, we must also break the link between p and its parent and merge
the tree rooted at p back into the heap. This process is called a cascading
cut, since it may propagate further up the tree if p's parent has already lost
a child, and so forth. For the Fibonacci heap implementation in file fheap.c,
the marked field of a root node is true, and is set to false after the root
node becomes non-root through linking. In fheap.c, breaking a link between
node x and its parent node, p, also requires removing x from the linked list
of p's children.
The delete() operation is not implemented in fheap.c but is similar to a
combination delete_min() and decrease_key() processes.
-------------------------------------------------------------------------------
The 2-3 Heap
Files: ttheap.c, ttheap.h
The 2-3 heap is similar to the Fibonacci heap, but unlike the Fibonacci heap,
the 2-3 heap has explicit constraints on the structure of trees in the heap.
Trees in the 2-3 heap are made up of trunks, formed by linking nodes in a
chain. Trunks can be either 2 or 3 nodes in length. In this description,
length is defined as the number of nodes in the trunk. Consider the recursive
definition:
T(0) = a single node.
T(i) = T_1(i-1) . ... . T_m(i-1) (m is either 2 or 3)
We link trees T_1 to T_m in a chain by their root nodes. We call T(i) a tree
of dimension i and the chain of nodes formed a trunk of dimension i. Main
trunks are formed by linking 0, 1 or 2 trees of the same dimension. The 2-3
heap allows one main trunk for each tree dimension. This collection is
maintained using an array of pointers to the head node of each main trunk.
Entry i points to a main trunk of dimension i trees. If there is no main trunk
for a particular dimension (i.e. no trees), then the pointer will be NULL. It
is necessary to maintain the dimension of a node. The dimension of a node, as
implemented in the file ttheap.c, is defined as the dimension of the highest
dimension non-main trunk the node lies on. This means that array entry 0
points to a main trunk containing dimension 0 nodes.
For implementing 2-3 trees we can use a similar node structure type to that of
the Fibonacci heap. A nodes child pointer points to the highest dimension
child. A circular doubly linked list of children is formed by the left and
right sibling pointers of nodes. The left pointer points to the next lowest
dimension sibling, and the right pointer points to the next highest dimension
sibling. Because it is circular, the right pointer of the highest dimension
child points to the dimension 0 child, and the left pointer of the dimension 0
child points to the highest dimension child. Every node has a parent pointer,
except for root nodes, which have the parent pointer set to NULL. Fibonacci
heap nodes had a rank field. In comparison 2-3 heap nodes have a dimension
field.
An alternative structure type pairs nodes by using an extra pointer field,
partner. We can describe trunks in this representation as consisting of the
nodes (parent, 1st child, 2nd child). 1st child and and child nodes are doubly
linked to each other using their partner pointers. The left and right pointers
of a 1st child node only point to other 1st child nodes. The parent pointer of
a 1st child node points to its parent. 2nd child nodes do not use the parent,
left, and right pointers. However, both 1st child and 2nd child nodes
maintain their own child pointers. Since trunks can be either length 2 or 3,
2nd child nodes don't necessarily have to exist, in which case, the 1st child
node's partner pointer would be NULL. Main trunks consisting of 2 nodes are
treated as (1st child, 2nd child). For convenience we also have the boolean
field, extra, which identifies whether a node is 1st child or 2nd child.
However it may be possible to use the parent pointer to identify 2nd child
nodes by always setting the parent pointer of 2nd child nodes to NULL, which
would then require another check to distinguish 2nd child nodes from root
nodes.
An insert operation involves merging the new node into the dimension 0 main
trunk, maintaining heap order. If the resulting trunk is length 3, then we
have a dimension 1 tree which is merged into the dimension 1 main trunk. The
insertion process can be viewed as adding 1 to the ternary number represented
by the collection of main trunks, with the possibility of a chain of carries
occurring. For example the ternary addition 222 + 1 = 1000 causes 3 carries.
The delete min_operation for the 2-3 heap is similar to that for the Fibonacci
heap. After removing the minimum node, we take child trunks of the minimum
node and merge them back into the main trunk level of the heap. The child
trunks can be either 1 or 2 nodes long. The list of child trunks is ordered,
with the lowest dimension child trunk consisting of dimension 0 trees, the next
consisting of dimension 1 tress, and so on. The main trunk that the minimum
node lies on decreases in length by 1. This merging process corresponds to the
addition of two ternary numbers, with each digit of the added number being
either 1 or 2.
The decrease_key operation involves removing the node whose key was decreased
and merging it back into the heap at main trunk level. Removing a node, also
removes the entire sub-tree it is the root of. When a node is removed from a
non-main trunk we allow the trunk to shrink from 3 nodes to 2 nodes. If
decrease_key occurs on a root node there is no need to remove it since heap
order will still be maintained. Removing the second node from a 2-node main
trunk shrinks the trunk to length 1, then the removed tree is merged back into
the trunk. Recall the nodes on a 3 node trunk are (parent, 1st child, 2nd
child). We treat the 2 nodes on a main trunk as (1st child, 2nd child) with
no parent. This is why the second node on a main trunk identifies as a node
that can be removed.
When removing a node from a length 2 non-main trunk, we must replace it with
some node from a nearby trunk since we do not allow length 1 non-main trunks.
This is done by rearranging a nodes workspace. The workspace of a dimension i
node consists of the dimension i trunk it is part of, the dimension i+1 trunk
it lies off, and any other dimension i trunks that also lies off the
dimension i+1 trunk. For a detailed definition, refer to Takaoka [1]. If the
dimension i+1 trunk is a non-main trunk, the workspace has between 4 and 9
nodes. When the dimension i+1 trunk is a main trunk, the work space could have
just 2 nodes. For details on how the workspace is rearranged, examine the
remove_node() function in the 2-3 heap source code file ttheap.c.
-------------------------------------------------------------------------------
The Trinomial Heap
Files: triheap.c, triheap.h
triheap_ext.c, triheap_ext.h
The trinomial heaps structure is very similar to that of the 2-3 heap.
However, with the trinomial heap, all non-main trunks are kept at length 3.
Length here is defined as the number of nodes in the trunk. Only main trunks
are allowed to vary in length, allowing lengths 0, 1, or 2. We refer to nodes
on a trunk as (parent, 1st child, 2nd child). A 1st child or 2nd child node is
called inconsistent if its key is less than that of the parent node. The
trinomial heap always enforces key(1st child) <= key(2nd child), and can swap
1st and 2nd child nodes to maintain the correct order. An active node is a
possibly inconsistent node. The trinomial heap allows a limited number of
active nodes.
We use the node-pair structure type for implementing the trinomial heap. There
are two implementations of the trinomial heap. The simpler implementation,
given in file triheap.c, maintains at most 1 active node for each dimension.
An extended implementation, given in file triheap_ext.c, allows O(log n) active
nodes of any dimension. The later supports decrease_key in O(1) worst case
time, rather than O(1) amortised time.
Insertion is the same as for the 2-3 heap, and nodes are inserted as inactive.
We will describe the simpler version of the trinomial heap first. This is just
a brief description, refer to the source code file, triheap.c for details.
Consider what happens when a decrease_key operation occurs, decreasing the key
of a node, x, of dimension d. Some pseudo code is given below. Here we assume
that the node lies on a non-main trunk.
If x is a 1st child,
If there is not already an active node for dimension d,
make node x active.
Else, unless x is already the active node,
do rearrangement with the current active node for dimension d.
Else,
If key(x) < key(1st child),
swap x with the 1st child.
If the old 1st child is active,
If the old 1st child is inconsistent, do promotion.
Else, x replaces it as the active node for dimension d.
Else if there is some other active node for dimension d,
do rearrangement with it.
Else,
make x the active node for dimension d.
Else,
If the 1st child is active,
If x is inconsistent, do promotion.
Rearrangement:
This process involves two trunks. (pv, v, v') and (pw, w, w'). Here v is
a node that tried to become active, but an active node, w, of the same
dimension already existed. In the simple trinomial heap implementation
2nd child nodes are never left inconsistent, so we know that both v' and
w' are consistent. Nodes v and w are possibly inconsistent.
If key(v') < key(w'),
make the trunk (pv, v', w') containing no inconsistent nodes.
If key(v) < key(w),
make the trunk (pw, v, w).
If key(w) < key(pw),
do promotion.
Else
v replaces w as the active node.
Else
make the trunk (pw, w, v).
If key(v) < key(pw),
do promotion.
Else
w remains the active node.
Else
make the trunk (pw, w', v') containing no inconsistent nodes.
If key(v) < key(w),
make the trunk (pv, v, w).
If key(w) < key(pv),
do promotion.
Else
v replaces w as the active node.
Else
make the trunk (pv, w, v).
If key(v) < key(pv),
do promotion.
Else
w remains the active node.
Promotion:
This process involves one trunk, where both nodes on the trunk are
inconsistent. Consider the trunk (p, v, v') with both v and v'
inconsistent. We promote v and v' up 1 position and place p at the end
of the trunk to give the resulting trunk (v, v', p). Now both v' and p are
consistent, but node v may now be inconsistent at a higher dimension.
It appears as though a decrease_key has occurred at the position where p
used to be an v now is, so the process continues at a higher dimension.
Note that node p may go from active to inactive.
The delete_min operation is more complicated than that of the 2-3 heap. We
must also check active nodes as well as the root nodes when locating the
minimum. The child trunks of the deleted node are still merged back into the
heap. However if the node that was deleted is not a root node, we must perform
the break-up operation which breaks up the higher dimension parts of the tree,
producing trunks to be merged back into the heap at root level with the list of
child trunks. For the trinomial heaps delete_min operation, all the trunks to
be merged back into the heap will be length 2. A list of trunks to be merged
back into the heap is maintained during break-up so that the merging can be
done in one sweep.
The node-pair structure is important for the break-up operation. Recall that
the (1st child, 2nd child) nodes are paired by setting their partner pointers
to point to each other. For the description of the break-up operation we refer
to the broken nodes parent and partner, so we have one of the trunk cases
(parent, break_node, partner) or (parent, partner, break_node). Initially,
break_node is the node that was deleted. Let the dimension of break_node be d.
Break-up pairs parent with partner to produce a trunk structured like a main
trunk of 2 dimension d trees. This is added to the list of trunks to be merged
back into the heap at main trunk level. If break_node's trunk has higher
dimension sibling trunks, these are cut from parent to produce length 2 trunks
and added to the list of trunks to be merged. Such a case will occur if
dim(parent) > dim(break_node) + 1.
Break-up proceeds to a higher dimensions by setting break_node to parent,
giving new parent and partner pointers, with which to continue the process.
The break-up ends once the main trunk has been reached.
This still takes O(log n) time since there are only O(log n) active nodes, and
the maximum dimension is O(log n). Note that during break-up, we must
deactivate any active nodes on trunks to be merged back into the heap.
----------
Next, the extended trinomial heap is described, which supports decrease_key in
O(1) worst case time. This differs in that it allows active nodes to be of any
dimension, instead of allowing just one active node for each dimension. To
achieve O(1) worst case time for decrease_key it is also necessary to allow 2nd
child nodes to be active. However, note that a 2nd child nodes can not be
active unless the 1st child is also active.
This trinomial heap has a tolerance bound, t, which allows t = [log_3 (n+1)]
nodes to be active. If the number of active nodes exceeds t then rearrangement
or promotion must be performed on two active nodes of the same dimension.
To keep track of active node of the same dimension the extended trinomial heap
maintains an active node queue for each dimension. In the source files
triheap_ext.c and triheap_ext.h each queue is implemented using a circularly
doubly linked list of active_ptr_t items. The heap maintains an array,
active_queues, which contains pointers to the head queue item for each
dimension. Each active_ptr_t item points to the active node that it
corresponds to. Nodes in the extended trinomial heap have an additional
pointer, active_entry, which points to the corresponding active_ptr_t item when
a node is active. If a node is not active its active_entry pointer will be
NULL.
A separate queue maintains candidates for rearrangement. Referring to
triheap_ext.h, this is implemented using a doubly linked list of
candidate_ptr_t items. Each candidate_ptr_t item has a dimension field that
specifies which active node queue it corresponds to. If one of the active node
queues has two or more items in it, then there will be a candidate_ptr_t item
that corresponds to it in the candidate queue. For each dimension, there is at
most one candidates_ptr_t item in the queue at any given time.
Note that when there are more active nodes than the tolerance level, at least
one of the active node queues has two or more items in it, since there are more
items than queues. There is guaranteed to be at least one candidate queue
item. Suppose that the tolerance level has been reached. The first item in
the candidate queue specifies which active node queue has two items in it.
From the two items, pointers to two active nodes of the same dimension can be
obtained.
Before continuing with rearrangement using the two active nodes, we must check
that neither active node has another active node as its partner:
* If we do get a (1st child, 2nd child) pair with both nodes active,
* If both nodes in the (1st child, 2nd child) pair are inconsistent,
do promotion.
* Else,
deactivate the 2nd child and finish.
* Else,
do rearrangement.
Promotion and rearrangement have the effect of decreasing the number of active
nodes by 1 or 2. Note that in the extended trinomial heap, promotion does not
propagate to higher dimensions.
To make checking active node keys easier during delete_min, an array, `active',
is maintained which contains pointers to each active node. This uses entries
0 to n_active-1, where n_active is the number of active nodes in the heap.
Each active_ptr_t queue item has a position field which specifies the of the
corresponding active nodes pointer in this array.
The code in triheap_ext.c is similar to that of triheap.c, except that now when
a node is made active, an activate() procedure is called to update the queues.
Similarly, when a node goes from active to inactive a deactivate() procedure is
called to update the queues. Note that queue items from any queue are only
added or removed when activate() or deactivate() is called. After a node is
made active using activate(), a check is made to make sure that the number of
active nodes has not exceeded the tolerance. There are also other minor
differences in parts of the code since the extended trinomial heap allow 2nd
child nodes to be active. Note that before a 2nd child node can become active,
the first child has to already be active.
-------------------------------------------------------------------------------
References
[1] T. Takaoka. Theory of 2-3 heaps. In Proc. COCOON '99, volume 1627 of
Lecture Notes in Computer Science, pages 41-50. July 1999.