Project 4: Skew Heaps

Due: Tuesday, November 19, before 9:00 pm


Addenda


Objectives

The objectives of this programming assignment are:


Introduction

A skew heap is a type of heap data structure that, like a standard max-heap, has efficient insert(), max(), and removeMax() operations, but also allows for efficient merging of heaps. Unlike a standard heap, it is not space-efficient to implement a skew heap in an array or vector, so we will have to use a linked binary tree structure.

Your skew heap must satisfy the Max-heap Property, which you were introduced to in Project 3: for any node \(w\) in the tree, the priority of \(w\) must be greater than or equal to the priorities of its children. This property guarantees that an element with the highest priority is always in the root node of the tree, so reading the highest priority element is just a matter of accessing the data in the root node.

The special feature of a skew heap is the merge operation which combines two skew heaps into a single, valid skew heap. Let p1 and p2 be positions in two skew heaps (e.g. pointers to nodes). Further, suppose the methods left() and right() return positions of the left and right children, and priority() returns the priority of a position. The merge operation is defined recursively:

    Merge(p1, p2):
      // If either heap is empty, return the other
      if p1 is Null, return p2
      if p2 is Null, return p1

      // Ensure p1 has higher priority root
      if p1.priority() < p2.priority()
        swap( p1, p2 )

      // Swap the left and right children of p1
      swap( p1.left(), p1.right() )

      // Recursively merge p1's left child with p2; make the
      // merged heap the new left child of p1
      p1.left() = Merge( p1.left(), p2 )

      return p1
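As a rough illustration, the pseudocode above might translate to C++ along the following lines. This is only a sketch under simplifying assumptions: DemoNode and demoMerge are hypothetical names, and the priority is stored directly in the node as an int rather than computed from a tagged union by a priority function, as the project requires.

```cpp
#include <utility>  // std::swap

// Hypothetical, simplified node: the priority is stored directly.
struct DemoNode {
    int priority;
    DemoNode *left = nullptr;
    DemoNode *right = nullptr;
    explicit DemoNode(int p) : priority(p) {}
};

// Merge two skew heaps given by their roots; return the root of the result.
DemoNode *demoMerge(DemoNode *p1, DemoNode *p2) {
    // If either heap is empty, return the other.
    if (p1 == nullptr) return p2;
    if (p2 == nullptr) return p1;

    // Ensure p1 points at the root with the higher priority.
    if (p1->priority < p2->priority) std::swap(p1, p2);

    // Swap the left and right children of p1.
    std::swap(p1->left, p1->right);

    // Merge p1's (new) left child with p2 and attach the result on the left.
    p1->left = demoMerge(p1->left, p2);
    return p1;
}
```

Note that the local pointers p1 and p2 are swapped, not the nodes themselves, so no data is copied.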

The following figure shows two skew heaps (blue and green) and the result of merging the two. All the steps of the merge procedure are described below the figure.

  1. First Merge
    1. Make (23) the new root since 23 > 19.
    2. Swap (23)'s left and right subtrees.
    3. Recursively Merge (23)'s left subtree, which is the blue subtree rooted at (11), with the other skew heap; replace (23)'s left subtree with the merged subtree.
  2. Second Merge
    Merge the green heap with the blue subtree rooted at (11).
    1. Make (19) the root since 19 > 11.
    2. Swap (19)'s left and right subtrees.
    3. Recursively Merge (19)'s left subtree (rooted at (5)) with the subtree rooted at (11); replace (19)'s left subtree with the merged tree.
  3. Third Merge
    Merge the green subtree rooted at (5) with the blue subtree rooted at (11).
    1. Make (11) the root since 11 > 5.
    2. Swap (11)'s left and right subtrees.
    3. Recursively Merge (11)'s left subtree (the single node (9)) with the green subtree rooted at (5); replace (11)'s left subtree with the merged tree.
  4. Fourth Merge
    Merge the green subtree rooted at (5) with the single node (9).
    1. Make (9) the root since 9 > 5.
    2. (9) has no children, so we don't have to swap anything.
    3. Recursively Merge (9)'s left subtree (which is empty!) with the green subtree rooted at (5); replace (9)'s left subtree with the merged tree.
  5. Fifth Merge
    This is a trivial merge since one of the two trees is empty. We simply return the non-empty tree, which in this case is the green subtree rooted at (5).

Other Operations

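The major operations supported by a max-skew heap are insertion of elements, reading the highest priority element, and removing the highest priority element. Reading the highest priority element is just a matter of reading the root node of the heap. The other two operations, insertion and removal, are applications of the merge function: to insert an element, treat it as a single-node skew heap and merge it with the existing heap; to remove the highest priority element, remove the root node and merge its left and right subtrees.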

We see, then, that the merge function is key to all of the major skew heap operations. If we can implement merge correctly, insertion and removal are simple.
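To make this concrete, here is a minimal sketch of insertion and removal built on merge, following the standard skew heap formulation (insertion merges the heap with a new one-node heap; removal merges the two subtrees of the old root). DemoNode, demoMerge, demoInsert, and demoRemoveTop are hypothetical names, and priorities are plain ints for brevity; the project's SkewHeap interface differs.

```cpp
#include <utility>  // std::swap

// Hypothetical, simplified node: the priority is stored directly.
struct DemoNode {
    int priority;
    DemoNode *left = nullptr;
    DemoNode *right = nullptr;
    explicit DemoNode(int p) : priority(p) {}
};

// Skew heap merge, as defined in the Introduction.
DemoNode *demoMerge(DemoNode *p1, DemoNode *p2) {
    if (p1 == nullptr) return p2;
    if (p2 == nullptr) return p1;
    if (p1->priority < p2->priority) std::swap(p1, p2);
    std::swap(p1->left, p1->right);
    p1->left = demoMerge(p1->left, p2);
    return p1;
}

// Insert: merge the existing heap with a new single-node heap.
DemoNode *demoInsert(DemoNode *root, int priority) {
    return demoMerge(root, new DemoNode(priority));
}

// Remove the top: merge the root's two subtrees, then free the old root.
// Returns the new root (nullptr if the heap becomes empty).
DemoNode *demoRemoveTop(DemoNode *root) {
    if (root == nullptr) return nullptr;
    DemoNode *rest = demoMerge(root->left, root->right);
    delete root;
    return rest;
}
```

Both functions return the new root, so a caller would write something like root = demoInsert(root, 7); a class-based version would update its root pointer member instead.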

Supporting Multiple Data Types — Tagged Unions

We would like to create a skew heap that can store multiple data types (for the purpose of the project, the types will be string and int). One way to do this is with a tagged union: a C++ union, which allows one of several different data types to be stored in a single memory location, combined with a tag variable that indicates which type is currently being stored.

The tagged union is implemented in the Node class:

    // ENUM of valid data types
    enum type_t {ISSTRING, ISINT};

    struct Node {
      enum type_t tagged_union;   // Type of data in node
      union {
        string data_string;       // Store string or int; but
        int data_int;             // not both
      };
      Node *right;                // Right child
      Node *left;                 // Left child

      // Other code omitted...
    };

The union construction indicates that a string or an int may be stored, but not both! The tagged_union variable holds one of two enumerated values, ISSTRING or ISINT, to indicate which type is being stored. The enumerated type is defined in the enum before the Node declaration.

Node constructors and a destructor have been provided for you in SkewHeap.h. Study their implementation to understand how the tagged union is used in practice.
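Because std::string has a non-trivial constructor and destructor, a union that contains one has to construct and destroy the string member by hand. The sketch below shows one common way to do that; it is not the code from SkewHeap.h, and DemoTagged, demo_type_t, and demoDescribe are hypothetical names used only for illustration.

```cpp
#include <new>     // placement new
#include <string>

enum demo_type_t { DEMO_ISSTRING, DEMO_ISINT };

// Hypothetical tagged union holding either a std::string or an int.
struct DemoTagged {
    demo_type_t tag;              // Which union member is active
    union {
        std::string data_string;
        int data_int;
    };

    explicit DemoTagged(int value) : tag(DEMO_ISINT), data_int(value) {}

    explicit DemoTagged(const std::string &value) : tag(DEMO_ISSTRING) {
        // Construct the string in place inside the union's storage.
        new (&data_string) std::string(value);
    }

    ~DemoTagged() {
        // Only the string member needs an explicit destructor call.
        if (tag == DEMO_ISSTRING) data_string.~basic_string();
    }

    // Copying would need tag-aware logic, so it is disabled in this sketch.
    DemoTagged(const DemoTagged &) = delete;
    DemoTagged &operator=(const DemoTagged &) = delete;
};

// Read the active member by checking the tag first.
std::string demoDescribe(const DemoTagged &n) {
    if (n.tag == DEMO_ISSTRING) return "string:" + n.data_string;
    return "int:" + std::to_string(n.data_int);
}
```

The key rule is to read or destroy only the member that the tag says is active; accessing data_string while the tag is DEMO_ISINT is undefined behavior.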

Function Pointers

So...you're going to build a skew heap that can hold both strings and integers. We can't just compare strings and integers, so what are we supposed to use for priority? We're going to leave that up to the user by allowing them to pass a function pointer to the skew heap constructor. The function pointer is the address of a function that will be used to compute the priority of a node. The function must be able to compute the priority of a string or an integer.

A typedef for the function pointer is provided for you in SkewHeap.h:

typedef int (*pri_fn)(Node *);

This says that pri_fn is a pointer to a function that takes a Node* argument and returns an int. The function must compute the priority from the contents of the node. Sample priority functions and examples of their use are provided in the test programs.
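For illustration, a priority function might look like the sketch below. The Node here is a stand-in that only mirrors the fields shown earlier (its constructors and destructor are assumptions, not the ones in SkewHeap.h), and demoPriority is a hypothetical example: it uses the value of an int and the length of a string as the priority.

```cpp
#include <string>

enum type_t { ISSTRING, ISINT };

// Stand-in for the Node in SkewHeap.h; constructors here are assumptions.
struct Node {
    type_t tagged_union;
    union {
        std::string data_string;
        int data_int;
    };
    Node *right = nullptr;
    Node *left = nullptr;

    explicit Node(int v) : tagged_union(ISINT), data_int(v) {}
    explicit Node(const std::string &s) : tagged_union(ISSTRING), data_string(s) {}
    ~Node() { if (tagged_union == ISSTRING) data_string.~basic_string(); }
};

typedef int (*pri_fn)(Node *);

// Hypothetical priority: int value for ints, string length for strings.
int demoPriority(Node *node) {
    if (node->tagged_union == ISSTRING) {
        return static_cast<int>(node->data_string.size());
    }
    return node->data_int;
}
```

A function with this signature can be stored in a pri_fn variable and called through it, for example pri_fn f = demoPriority; followed by f(somePointerToNode).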


Assignment

Your assignment is to implement the SkewHeap class, a skew heap that can store both strings and integers using the tagged union construction. SkewHeap allows the user to determine how priorities are computed by passing a pointer to a prioritization function to the constructor. The user can also change the prioritization function of an existing skew heap using a setter method, and the class will rebuild the heap using the new prioritization.

You must complete an additional class that uses SkewHeap, the TypedHeap class. The TypedHeap class maintains two skew heaps: an integer heap and a string heap. The user provides a vector of strings to insert, which are processed as follows:

  1. If the string represents an integer (consists only of decimal digits), then convert it to an integer and insert it into the integer skew heap.
  2. If the string does not represent an integer (contains non-digit characters), insert it into the string skew heap.
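The classification rule above could be checked with a helper like the following sketch. demoIsInteger is a hypothetical name, and treating the empty string as a non-integer is an assumption; the rule as stated does not say how an empty string should be handled.

```cpp
#include <cctype>   // std::isdigit
#include <string>

// Returns true if s consists only of decimal digits.
// Assumption: an empty string is treated as a non-integer.
bool demoIsInteger(const std::string &s) {
    if (s.empty()) return false;
    for (unsigned char c : s) {
        if (!std::isdigit(c)) return false;
    }
    return true;
}
```

A string that passes this test can be converted with std::stoi, which throws std::out_of_range if the value does not fit in an int. Under this rule a leading minus sign counts as a non-digit character, so "-5" would go to the string heap.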

In addition, the user may request, through a combineHeaps() method, that the string and integer heaps be merged into a third “total” heap. Functions to change the priority function and to dump the skew heaps must also be implemented.

Although several test programs are provided, you are responsible for thoroughly testing your program. It is particularly important that your code run without segmentation faults, memory leaks, or memory errors. Memory leaks are considered as bad as segmentation faults, since many segmentation faults are caused by poorly written destructors. A program with a poorly written destructor might avoid some segmentation faults but will leak memory horribly. Memory leaks will incur a penalty similar to that of a segmentation fault.

Following is a list of member functions that must be implemented. Function prototypes are provided in SkewHeap.h and TypedHeap.h. You will need to create the implementation files, SkewHeap.cpp and TypedHeap.cpp.

Public Methods in SkewHeap

Public Methods in TypedHeap


Additional Requirements

Requirement: You must use the supplied SkewHeap.h and TypedHeap.h files. You may not modify any of the public method prototypes or private data, although you may add private functions to SkewHeap.h.

Requirement: The implementation file for the SkewHeap class must be named SkewHeap.cpp. The implementation file for the TypedHeap class must be named TypedHeap.cpp.

Requirement: If a user attempts to merge two skew heaps with different priority functions, a domain_error exception must be thrown.

Requirement: You must implement the skew heap merge operation as described in the Introduction.

Requirement: The insert() function must implement the merge-based insertion method described in Other Operations.

Requirement: The removeTop() function must implement the merge-based removal method described in Other Operations.

Requirement: The dump() function must print the elements of the skew heap in decreasing priority order.

Requirement: The max-heap property must hold after any call to insert(), removeTop(), or skewHeapMerge().
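The domain_error requirement above can be met by comparing the two function pointers before merging. The following is a minimal sketch of that check only; DemoHeap, demo_pri_fn, demoCheckMergeable, and the two sample priority functions are hypothetical and do not match the SkewHeap interface.

```cpp
#include <stdexcept>  // std::domain_error

typedef int (*demo_pri_fn)(int);

// Two hypothetical priority functions.
int demoIdentity(int x) { return x; }
int demoNegate(int x) { return -x; }

// Hypothetical heap that remembers which priority function it uses.
struct DemoHeap {
    demo_pri_fn priority;
};

// Throws std::domain_error if the heaps use different priority functions.
void demoCheckMergeable(const DemoHeap &a, const DemoHeap &b) {
    if (a.priority != b.priority) {
        throw std::domain_error("cannot merge heaps with different priority functions");
    }
}
```

Function pointers compare equal exactly when they point to the same function, so two heaps built with the same priority function pass the check even if they are separate objects.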


Provided Programs

Header files are provided for both classes (SkewHeap.h and TypedHeap.h). You may add private helper functions in SkewHeap.h, but TypedHeap.h should not be modified.

Two test programs, one for SkewHeap and one for TypedHeap, are provided, along with sample output from each program. However, it is your responsibility to write additional tests to ensure that your implementation is correct.


What to Submit

You must submit the following files to the proj4 directory.

If you followed the instructions on the Project Submission page to set up your directories, you can submit your code using this Unix command:

cp SkewHeap.h SkewHeap.cpp TypedHeap.h TypedHeap.cpp ~/cs341proj/proj4/