Homework 6: Hash Tables & Sorting

Due: Tuesday, May 15, 8:59:59pm



To allow us to automate part of the grading (what's the point of computers if they don't save us work, and do a better job to boot?), we are providing this assignment as a text file, which you must fill in and submit. You can use any text editor (vi, emacs, nano, etc.) to edit the file. Make sure you follow the directions exactly, fill in your answers where directed, and try to not modify any other part of the file. Otherwise, your homework might not be accurately graded.

Because this assignment was put out later than planned, a question has been removed to make the assignment shorter. The point value will remain the same.

How to submit: Here's the link for the file shown below: hw6.txt. Edit the file and copy into the hw6 submission directory on GL as usual.

Name: Userid: CMSC 341 Fall 2017 Homework 6 [Modified 5/14: Ques. A: Missing comma between "5" and "12" added (there was already a separating space there) Ques. C: - typo: "union" -> "intersection" (explanation stays same) - Added section for entering "which is better?" answer.] ------------------------------------------------------------------- Question A (15 points) ------------------------------------------------------------------- [Adapted from problem R-9.8 in textbook] Hash the values 94, 11, 39, 20, 16, 5, 12, 44, 13, 88 and 23 into a hash table with 11 slots. Handle collisions by linear probing. Use the hash function: h(i) = (3 * i + 5) % 11 Type in each key on the same line as the index of the slot where it is eventually inserted. YOUR ANSWER TO QUESTION A: 0: 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: ------------------------------------------------------------------- Question B (15 points) ------------------------------------------------------------------- [Adapted from problem R-9.10 in textbook] Repeat Question A using double hashing instead of linear probing. Use the secondary hash function h'(k) = 7 - (k % 7). YOUR ANSWER TO QUESTION B: 0: 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: ------------------------------------------------------------------- Question C (20 points total) ------------------------------------------------------------------- [Inspired by Prof. Chang] You are given two sets of numbers, S1 and S2. Being true mathematical sets, there are no repeated numbers in either set, but there ARE numbers that appear in both sets. In fact, that is your task: to find all the numbers that appear in both sets; i.e., the set intersection of S1 and S2. You and your classmate are arguing about which of your methods is faster, i.e., has a smaller O(). Your proposed method is to combine all the elements from the two sets into a new list, L. (Obviously, L is not necessarily a true set, since it might have duplicates.) You will then sort the list L using your favorite O(n * log(n)) sorting algorithm. You will then linearly scan the resulting sorted list, comparing sequential pairs to see if there are any duplicates. Your friend Ann has a slightly different proposal: she wants to sort the two sets independently, into the resulting sorted lists S1' and S2'. Then, she will compare the first element from each list. If they match, a duplicate was found, and both elements are removed from their respective lists. However, if they do not match, the smaller of the two is removed from its list. This process loops until one of the lists is empty. For each of the two algorithms, first compute the asymptotic running time for that algorithm, along with a short but clear justification. Then, state which is better again giving a cogent explanation why. Note that there are two different sets involved. Give your answer in terms of m and n, and 'm' is the size of set S1, and 'n' is the size of set S2. (Please fill in your answer in the following space--you can take as many lines as you need, but do not modify the headers.) YOUR ANSWERs TO QUESTION C: a) RUNTIME FOR YOUR METHOD: (4 pts) EXPLANATION FOR YOUR RUNTIME: (4 pts) RUNTIME FOR ANN'S METHOD: (4 pts) EXPLANATION FOR ANN'S RUNTIME: (4 pts) EXPLANATION OF WHICH METHOD IS BETTER AND WHY (4 pts);