11.2 - Indexing and Hashing
Show slide
* Most of the time we are only looking for a few particular records in a table
    Looking for a particular employee
    Don't want to have to scan an ENTIRE table to do this
* We need additional structures to do this

SLIDE 11.3 - Basic Concepts
Show slide
READ 1
* Works just like an index in the back of a book
    Look for a topic by name
    Find the page it's on
    Turn to that page
* Library card catalog
    Look up book
    Find Dewey Decimal number
    Go to shelf
* Database
    Look up record
    Find block number where it resides on disk
    Go read that block into main memory
READ 2
    Book - topics are sorted alphabetically to make them easy to find
    Library - card catalog sorted by title or author
READ 3
READ 4
    Indices may become VERY large themselves
        If you store EVERY search key value in them
    Need more sophisticated methods

SLIDE 11.4 - Index Evaluation Metrics
Show slide
* We will look at several techniques for indexing
* No one technique is best in every situation
* How do you evaluate the different techniques?
READ 1
READ 2
    Time it takes to find a particular data item, or set of items
READ 3
    What would an insert involve?
        Find the correct place in the index to insert
        Update the index
READ 4
READ 5
    Balance space vs performance
* May want to have more than one index
    Title, author, subject
* Search key may or may NOT be the same as the PK

SLIDE 11.5 - Ordered Indices
Show slide
* GOAL is PERFORMANCE ! ! !
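The lookup chain on slide 11.3 - search key, to index entry, to block number, to block in main memory - can be sketched as below. This is a minimal illustration, not the book's code; the names and block numbers are made up.

```python
import bisect

# A tiny ordered index: (search key, block number) pairs kept in
# search-key order, so the index itself can be binary-searched.
index = [("Brandt", 0), ("Califieri", 0), ("Crick", 1),
         ("Einstein", 1), ("El Said", 2), ("Gold", 2)]
keys = [k for k, _ in index]

def lookup(search_key):
    """Binary-search the index; return the block to read, or None."""
    i = bisect.bisect_left(keys, search_key)
    if i < len(keys) and keys[i] == search_key:
        return index[i][1]      # block number to bring into main memory
    return None

print(lookup("Einstein"))   # -> 1
```

The point of the sketch: the index answers "which block?" in a few comparisons, so only one data block has to be read instead of scanning the whole file.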
    So far, when we have talked about keys, we were talking about referential integrity & normalization
    Now we are talking about how fast we can get to the data THAT WE WANT THE MOST
* Each index is associated with a particular search key on a particular relation
READ 1
POINT 2
    If the file containing the records is sequentially ordered, a clustering index is an index whose search key ALSO defines the sequential order of the records in the file
    A clustering index is ALSO the primary index
        Primary INDEX and primary KEY are NOT the same thing
        You may very well choose to make your PK the primary index
POINT 3
    What is a non-clustering index?
        An index whose search key specifies an order different from the sequential order of the file
        Also called a secondary index
POINT 4
    Clustering index on a search key

SLIDE 11.6 - Dense Index Files
Show slide
* Index Entry / Index Record
    search key value
    pointer to one or more records with that value as their search key value
        consists of disk block number and offset within that block
* Clustered indexes mean that data records are kept in the same order
* What is the search key for this index?
    Notice how the records are ordered in this file
* What is the PK of this relation?
READ 1

SLIDE 11.7 - Dense Index Files
Show slide
* Index record contains a pointer to the FIRST record with that search key
    How are other records with that search key found?
* What if this is a dense NON-clustering index?
    How would other records with that search key be found?
    Must keep multiple pointers

SLIDE 11.8 - Sparse Index Files
Show slide
READ 1
    Applies only to clustering indices
* How do the headers in a dictionary work?
READ 2
* What happens when we are looking for
    ID = 10101
    ID = 22222
* What advantages do sparse indices have over dense ones?
    What are the tradeoffs?
    Wait until next slide to answer !!

SLIDE 11.9 - Sparse Index Files
Show slide
READ 1
* What advantages do sparse indices have over dense ones?
    What are the tradeoffs?
    speed vs space
READ 2
* Dominant cost in processing a request is the time it takes to bring a block from disk into main memory
    VERY fast to scan a block in memory
* One entry per block
    Minimize block accesses
    Minimize space

SLIDE 11.10 - Multilevel Index
Show slide
* Assume you have 1 million tuples
    Building a dense index
    Can fit 100 index entries per 4K block
    How many blocks does our index take? 10k blocks
* Change our assumption - now we have 100 million tuples
    1 million index blocks at 4K each => 4 GB of memory
* Indexes themselves incur a great deal of disk overhead
* May require several block reads to get to the entry you are looking for
* Assume binary search
    Requires log2(blocks) reads
    10k blocks => 14 block reads
    Each read takes 10 ms
    Entire search will take 140 ms
    You only get 7 index searches per second
    If overflow blocks are being used, you degrade to sequential search
* How do we fix this?
READ 1
READ 2
READ 3
READ 4

SLIDE 11.11 - Multilevel Index
Show slide
* Inner index of 10k blocks
    Requires 10k entries in the outer index
    100 blocks of space in the outer index
    May fit into main memory
        Would only read one index block per search
        Can perform 14 times as many searches per second
* What happens when the outer index gets too big?

SLIDE 11.12 - Index Update: Deletion
Show slide
* When does the system need to update (write to) the index?
    whenever a record is
        inserted
        deleted
        search key value is updated
            May be handled as a delete followed by an insert of the new value
READ 1
Go back to slide 6
* What needs to happen if the index is dense and we delete Einstein?
    Delete the index entry
    Delete the record
    Maintain order in the file
Go back to slide 12
* What needs to happen with a sparse index IF we
    Delete Einstein?
        delete record
        maintain record pointers
    Delete Crick?
        Third index entry updated to point to Brandt

SLIDE 11.13 - Index Update: Insertion
Show slide
READ ALL

SLIDE 11.14 - Secondary Indices
Show slide
* Secondary indices MUST be dense
    No way to find intermediate records
    They may appear ANYWHERE in the file

SLIDE 11.15 - Secondary Indices Example
Show slide
* Secondary indices on candidate keys
    Work the same as primary indices
    Just the records pointed to are not in THAT sort order
* If NOT on a candidate key
    Extra level of indirection is needed

SLIDE 11.16 - Primary and Secondary Indices
Show slide
READ 1
READ 2
READ 3
p 485

SLIDE 11.17 - B+-Tree Index Files
Show slide
* What does the B stand for in B-tree?
    It is a balanced tree
* How is a B-tree different from a binary tree?
    Binary tree has at most 2 children and may not be balanced
* What is a balanced tree?
    All paths from the root to a leaf have the same length

SLIDE 11.18 - Example of B+-Tree
Show slide
* How many values does each node hold?
    What values could go into the 3rd node on the second level?
* Is this tree balanced? How do you know?
* What is the search key for this index?

* * * * * * * * * * * * * * * * * * * *

SLIDE 11.19 - B+-Tree Index Files
Show slide
* What is a root node?
* What is a leaf node?
* What is an internal node?
* Each node contains
    n-1 key values
    n pointers
    Why is there one more pointer than there are values?
        Points to the next node in sort order at the leaf level
        OR the next child node
        Each node uses 1 more pointer than the number of KVs that it contains
*** Show previous slide ***
RULES FOR A BALANCED TREE
* Each node must be AT LEAST half full
* Except for the root
    Root starts out as a leaf
    As soon as its KVs are filled
        Root splits and MUST have AT LEAST 2 children
* Assume we have a tree where n = 5
* How many key values may exist in a leaf node?
    between 2-4
* Internal nodes
    Each internal node should have how many children?
        between 3-5

SLIDE 11.20 - B+-Tree Node Structure
Show slide

SLIDE 11.21 - Leaf Nodes in B+-Trees
Show slide

SLIDE 11.22 - Non-Leaf Nodes in B+-Trees
Show slide

SLIDE 11.23 - Example of B+-tree
Show slide
* B+-tree for instructor file
    n = 6
    How many key values per node?
    How many pointers per node?
READ ALL

SLIDE 11.24 - Observations about B+-trees
Show slide
* What happens when n = 6?
    1st level: min values: 2^3 = 8
    2nd level: min values: 2^3*3 = 2^9 = 512
p 488

SLIDE 11.25 - Queries on B+-Trees
Show slide
1. ROOT
    Start with the root as the current node
    Each i consists of a PAIR: a pointer Pi and an associated search-key value Ki
2. INTERNAL NODES
    1. Look for the smallest i in the node such that V <= search-key value Ki
    2. If NOT found in the current node
        Make the node pointed to by the last non-null pointer in the node our new current node
    3. If such an i is found,
        IF V < Ki THEN follow that pointer Pi to get our new current node
        IF V = Ki THEN follow the NEXT pointer, Pi+1, to get our new current node
3. LEAF NODES
    Look for V = Ki in that node
    If found, get the pointer to the record
    If NOT found, the value does not exist in the relation
* Find Singh
* Find El Said
* Find Mozart

SLIDE 11.26 - Handling Duplicates
Show slide

SLIDE 11.27 - Handling Duplicates
Show slide
* This is the algorithm from the previous slide

SLIDE 11.28 - Queries on B+-Trees
Show slide
* Root nodes of heavily used indices are probably kept in the main memory buffer

SLIDE 11.29 - Updates on B+-Trees: Insertion
Show slide

SLIDE 11.30 - Updates on B+-Trees: Insertion
Show slide

SLIDE 11.31 - B+-Tree Insertion - ADAMS
Show slide

SLIDE 11.32 - B+-Tree Insertion - LAMPORT
Show slide
* Split leaf level
* Split parent
    KVs that lie between the pointers moved to the right node (Kim) are moved along with their pointers
    KVs that lie between the pointers that stay on the left (Califieri, Einstein) remain undisturbed
    The KV that lies between those two moves up to its parent node in the correct order

SLIDE 11.33 - B+-Tree Insertion - Algorithm
Show slide

SLIDE 11.34 - Updates on B+-Trees: Deletion
Show slide

SLIDE 11.35 - Updates on B+-Trees: Deletion
Show slide

SLIDE 11.36 - Examples of B+-Trees: Deletion - SRINIVASAN
Show slide
* Delete Srinivasan from leaf
* Merge with node on the left (Wu goes to Mozart and Singh)
* Remove Srinivasan from parent
    All we have left is 1 pointer
* Merge parent with its sibling
    Can't - then there would be 5 pointers
* Redistribute pointers between the node and its sibling
    Rightmost pointer in the merged parent points to Mozart
    Can't just move Gold to the sibling on the right
        Gold's KV is less than Mozart's
    Move KV Gold up and KV Mozart down
    Gold's pointer moves to be the first pointer in the level-2 parent

SLIDE 11.37 - Examples of B+-Trees: Deletion - SINGH and WU
Show slide

SLIDE 11.38 - Examples of B+-Trees: Deletion - GOLD
Show slide
* Error on the slide: Gold should be Katz

SLIDE 11.39 - Non-Unique Search Keys
Show slide
* A relation that can have more than 1 record containing the same search key value
    Non-unique index
READ 1a
* Additional block access is bad
READ 1b
* Variable-size buckets
* What happens when the bucket becomes bigger than the size of the leaf node?
READ 1c
* Added attribute is called a UNIQUIFIER

SLIDE 11.40 - B+-Tree File Organization
Show slide
p 500
READ 1
* Degradation of index-sequential files
    Overflow blocks
READ 2
READ 3
* Record lengths in leaf nodes > pointers in internal nodes
    Max records in a leaf < max pointers in internal nodes
READ 4

SLIDE 11.41 - B+-Tree File Organization
Show slide
p 501
READ 1
READ 2
* Insertion
    Attempt to insert into the node
    If the node is full, redistribute between 2 adjacent nodes to make room
    If that fails, create one new node, and redistribute between the 3 nodes to make room
* Deletion
    If a node falls below 2/3 full, attempt to borrow an entry from an adjacent node
    If both nodes are 2/3 full
        delete the entry, merge the others, and delete the node

SLIDE 11.42 - Other Issues in Indexing
Show slide
p 502

SLIDE 11.43 - Indexing Strings
Show slide
p 502

SLIDE 11.44 - Bulk Loading and Bottom-Up Build
Show slide
p 503
What happens when:
* Relation is larger than main memory
* Non-clustering index is larger than main memory
READ 1
* One I/O to fetch, one to write
* 100 million recs * 10 ms / read-write = 1 million seconds to build the index
    277 hours
* Assuming 100-byte records, it only takes 200 secs to read that file in from disk
READ 2
* Read all the search key values into a temporary file
* Sort the file
* Scan the file and insert entries into the index
* When entries are already in sort order when inserted
    all entries on a particular leaf node appear consecutively
    each leaf needs to be written out only once
    nodes will never have to be read from disk during the bulk load if the tree was empty to start with
* Each leaf node incurs only one I/O operation
* If each leaf contains 100 entries
    Leaf level will contain 1 million nodes
    1 million I/O operations
    Successive leaf nodes are allocated on sequential blocks
        Minimizing seeks
        Means 1 ms / block write for sequential disk block access
* 1000 seconds ! ! !
READ 3
* Records are read into blocks
* Each level of the B-tree points to the beginning of a block

SLIDE 11.45 - B-Tree Index Files
Show slide
p 504
READ 1
READ 2
* KVs appear in internal nodes as well as leaf nodes
READ 3
* Leaf nodes are the same as in B+-trees
READ 4
* Non-leaf nodes contain Ps, Ks and Bs
    Buckets or file-record pointers

SLIDE 11.46 - B-Tree Index File Example
Show slide
p 505
* Record pointer to Einstein appears in the internal node instead of the leaf node

SLIDE 11.47 - B-Tree Index Files
Show slide
p 506
READ ALL

SLIDE 11.48 - Multiple-Key Access
Show slide
p 506
* Assume the instructor file has 2 indices
    dept_name
    salary
READ ALL

SLIDE 11.49 - Indices on Multiple Keys
Show slide
p 508
* Instead of having 2 different indices on 2 different attributes
    Create ONE index on both attributes
READ ALL

SLIDE 11.50 - Indices on Multiple Attributes
Show slide
p 508
READ ALL
* Condition on the first attribute is NOT equality
    Does not correspond to a range query on the search key
    For EACH department, it will search for salary
    Will be located on different blocks

SLIDE 11.51 - Other Features
Show slide
p 509
* Assume
    1 we have an index on ID for the instructor relation
    2 we do a lot of queries that involve instructors and their salaries by ID
* What if we added salary to the leaf level of the index?
    1 We don't need to load the actual record
    2 Saves space in search keys by not making it a composite index

* * * * * * * * * * * * * * * * * * * *

SLIDE 11.53 - Hashing
Show slide
p 509
* With sequential file organizations
    You must use an index file to get to the data efficiently
    This involves I/O
* Hashing
    Avoids accessing an index structure
    Provides a way of constructing indices
READ 1
READ 2
READ 3
    Hash function maps a search key value to a particular bucket
    Worst possible hash function maps all keys into the same bucket
    Choice of hash function should spread search keys out evenly over all the buckets
READ 4

SLIDE 11.54 - Example of Hash File Organization
Show slide
p 510
READ ALL

SLIDE 11.55 - Example of Hash File Organization
Show slide
p 511

SLIDE 11.56 - Hash Functions
Show slide
p 510
READ ALL
* What about a hash function that maps the first letter of each search key to one of 26 buckets?
    Is that a good hash function?
        Non-uniform
        more names begin with S and T than Y and Z
* What about a hash function for salary that divides the values into ranges by 10k increments?
    Is that a good hash function?
        Non-random

SLIDE 11.57 - Handling of Bucket Overflows
Show slide
p 512
READ 1
* Number of buckets must be > # of total recs / # recs per bucket
* Reasonable algorithm for # of buckets
    (# of recs / recs per bucket) * (1 + fudge factor), fudge factor usually 0.2

SLIDE 11.58 - Handling of Bucket Overflows
Show slide
p 513
READ ALL
* In open hashing, buckets overflow into already existing buckets
    maybe the next sequential bucket
    causes problems with deletions

SLIDE 11.59 - Hash Indices
Show slide
p 514

SLIDE 11.60 - Example of Hash Index
Show slide
p 514

SLIDE 11.61 - Deficiencies of Static Hashing
Show slide
p 515
READ ALL
* Split and merge buckets

SLIDE 11.62 - Dynamic Hashing
Show slide
p 515
READ 1
* Only reorganizing one bucket at a time
READ 2
READ 3
    a  the hash function generates values for up to 4 billion possible buckets
    b  we only use the first i bits as an offset into a hash table containing ptrs to the buckets
        the number of bucket ptr entries is 2^i
    c  i changes => the number of ptr entries in the bucket address table changes
        size of the bucket address table changes
    d  we do not create a bucket for each hash value
    e

SLIDE 11.63 - General Extendable Hash Structure
Show slide
p 516
* Notice that the 2nd and 3rd buckets each have their own bucket ptr because their integer = hash prefix
* 1st bucket has 2 ptrs to it because its integer is one less than the current hash prefix

SLIDE 11.64 - Use of General Extendable Hash Structure
Show slide
p 517
Lookups
* Get the first i bits from the hash code
* Use that as an offset into the bucket address table
* Follow the pointer to the correct bucket
READ SLIDE

SLIDE 11.65 - Insertion in General Extendable Hash Structure
Show slide
p 517
* There is a key value associated with each bucket
    Call it Kj
POINT 1
* If the hash prefix is greater than the integer for that particular bucket
    Create a new bucket
        Assign the integers for old bucket and new bucket to old integer + 1
    Update the bucket address table
        Take half of the entries that pointed to the old bucket,
        Point them to the new bucket
    Update the buckets
        take all the records in the old bucket
        rehash them and move them into the appropriate bucket
READ 2
* If the bucket is full and there is only 1 ptr to that bucket in the bucket address table
    Create an overflow bucket OR
    Create a bigger bucket address table so that you can create more buckets
        Increase your hash prefix size by 1
            Doubles the size of the table
        Now there are 2 entries in the table for each bucket
        Split the bucket that we were originally trying to insert into

SLIDE 11.66 - Deletion in General Extendable Hash Structure
Show slide
p 518
* If you need to remove a record
    Locate the bucket and data record using the hash table
    Remove the record from the data file
    Remove the key value from the hash bucket
    Remove the bucket if it becomes empty
READ

SLIDE 11.67 - Use of Extendable Hash Structure: Example
Show slide
p 517
* Search key value is dept_name
* 32-bit hash

SLIDE 11.68 - Example
Show slide
p 518
* i = 0
* # of hash table entries = 2^i = 1
* # of buckets = 1
* bucket size = 2 (can hold 2 hash key values - arbitrary design-time choice)
* integer for bucket = 0, which = i
    since integer = i, only one bucket address points to this bucket
* Insert Srinivasan
    Compute hash value on Comp. Sci. => 1111 0001 ...
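The lookup and insertion rules above can be sketched in code. This is a minimal illustration, not the book's algorithm verbatim: it uses CRC-32 as a stand-in 32-bit hash function, and it hashes the instructor name (so search keys stay unique) rather than dept_name as the slides do; bucket size 2 matches the slide example.

```python
import zlib

HASH_BITS = 32          # slides assume a 32-bit hash
BUCKET_SIZE = 2         # each bucket holds 2 entries, as in the example

class Bucket:
    def __init__(self, depth):
        self.depth = depth      # this bucket's "integer" ij (local depth)
        self.items = {}         # search-key value -> record

class ExtendableHash:
    def __init__(self):
        self.i = 0                      # global hash prefix length
        self.table = [Bucket(0)]        # bucket address table, 2^i entries

    def _prefix(self, key):
        h = zlib.crc32(key.encode())    # deterministic 32-bit hash
        return h >> (HASH_BITS - self.i) if self.i else 0   # first i bits

    def lookup(self, key):
        return self.table[self._prefix(key)].items.get(key)

    def insert(self, key, record):
        b = self.table[self._prefix(key)]
        if key in b.items or len(b.items) < BUCKET_SIZE:
            b.items[key] = record       # room in the bucket: just store it
            return
        if b.depth == self.i:           # prefix = integer: grow the prefix
            self.i += 1                 # ...and double the address table
            self.table = [x for x in self.table for _ in (0, 1)]
        # Split the full bucket: the table entries pointing at it form a
        # contiguous run; redirect the top half of that run to a new bucket.
        idxs = [j for j, x in enumerate(self.table) if x is b]
        b.depth += 1
        new_b = Bucket(b.depth)
        for j in idxs[len(idxs) // 2:]:
            self.table[j] = new_b
        # Rehash the old bucket's records (plus the new one) into place;
        # re-inserting may trigger further splits if keys still collide.
        pending = dict(b.items)
        pending[key] = record
        b.items = {}
        for k, r in pending.items():
            self.insert(k, r)

eh = ExtendableHash()
for name, dept in [("Srinivasan", "Comp. Sci."), ("Wu", "Finance"),
                   ("Mozart", "Music"), ("Einstein", "Physics"),
                   ("El Said", "History"), ("Gold", "Physics")]:
    eh.insert(name, dept)
print(eh.lookup("Mozart"))   # -> Music
```

Note how the sketch keeps the slides' invariant: a bucket whose integer equals the hash prefix has exactly one table entry pointing to it, so inserting into it when full forces the table to double; a bucket whose integer is smaller can be split without touching the table size.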
    Look at slide 11.67
    hash prefix = 0
        Looking at 0 bits of the hash code
        Goes immediately into the first bucket
* Insert Wu
    Same thing

SLIDE 11.69 - Example
Show slide
p 519
* Insert Mozart
    bucket is full
* Compare hash prefix (i = 0) to bucket integer (ij = 0)
    They are the same
    Increase the number of bits in the hash prefix (i = 1)
    Double the size of the hash address table
    Split the old bucket
        Both buckets, old and new, now have a bucket integer of 1
        Old bucket - hash values start with 0
        New bucket - hash values start with 1
        All entries whose hash values start with 1 move to the new bucket
* Insert Einstein
    Calculate hash value for department => 1001
    Hash prefix = 1, first bit of hash code is 1
    Goes in 2nd bucket
    Bucket is full and hash prefix = bucket integer for that bucket

SLIDE 11.70 - Example
Show slide
p 520
* Increase the number of bits in the hash prefix (i = 2)
    Double the size of the hash address table
    First 2 hash address entries both point to bucket 1
        Bucket integer remains unchanged
    Split the full bucket
        Both buckets, old and new, now have a bucket integer of 2
        Old bucket - dept hash values start with 10
        New bucket - dept hash values start with 11
        All entries whose hash values start with 11 move to the new bucket

SLIDE 11.71 - Example - Gold & El Said
Show slide
p 520
* Insert Gold
    Calculate hash value for department => 1001
    Starts with 10
    Goes in 2nd bucket (again)
    Bucket is full and hash prefix = bucket integer for that bucket
    Increase the number of bits in the hash prefix (i = 3)
    Double the size of the hash address table
    First 4 hash address entries all point to bucket 1
        Bucket integer remains unchanged
    Split the full bucket
        Both buckets, old and new, now have a bucket integer of 3
        Old bucket - dept hash values start with 100
        New bucket - dept hash values start with 101
        All entries whose hash values start with 101 move to the new bucket
* Insert El Said
    Calculate hash value for department => 1100
    Hash prefix = 3
    Starts with 110
    Goes in 4th bucket

SLIDE 11.72 - Example - Katz
Show slide
p 520
* Insert Katz
    Last bucket is full
    hash prefix > bucket integer
    Split the bucket
        Old bucket starts with 110
        New bucket starts with 111

SLIDE 11.73 - Example - Overflow record with Brandt
Show slide
p 521
* Insert Califieri
* Insert Singh
* Insert Crick
* Insert Brandt
* Notice the overflow bucket for 111
    There are 3 records with the same hash value
        ie, same department

SLIDE 11.74 - Example - Kim
Show slide
p 522
* Split bucket #1
    Increase its bucket integer by 1 (= 2)

SLIDE 11.75 - Extendable Hashing vs. Other Schemes
Show slide
p 522
* Static hashing vs dynamic hashing
READ 1
* No buckets need to be reserved for future growth
READ 2
READ 3

SLIDE 11.76 - Comparison of Ordered Indexing and Hashing
Show slide
p 523
* Organization for files
    * Ordered files
        index-sequential
            recs are stored sequentially in order
            out-of-order records are chained together
        B+-tree organization
    * Unordered
        Heap
READ SLIDE
SHOW Hash_Index_Works_Best_With.PDF
http://science.kennesaw.edu/~mguimara/4490/tuning.ppt
* Hash index works best with
    Very-high-cardinality columns
    Only equality (=) tests are used
    Index values do not change
    Number of rows is known ahead of time
* Entries that have the same key value are in the same bucket
* Key values themselves are spread randomly
* Ordered index works best with
    range queries
        inequalities in the WHERE clause
    records are stored in order of key value
        guarantees a sequential search (almost)

SLIDE 11.77 - Bitmap Indices
Show slide
p 524
READ 1
* Each bitmap applies to a single key
READ 2
* n identifies the logical number of the record
    divide by block size to get the block number
    remainder of the divide is the offset into the block
READ 3
* More possible values => more bitmaps needed
READ 4

SLIDE 11.78 - Bitmap Indices
Show slide
p 525
* Because each value of each attribute gets its own bitmap
* Each record gets its own bit in the bitmap
* Note that a bitmap wouldn't be useful for the ID field
    too many possible values

SLIDE 11.79 - Bitmap Indices
Show slide
p 526
READ 1
* Finding all records with males in the file is best done with a sequential search
    You probably need to read every block of the file anyway
READ 2
READ 3
* Find all records where gender = 'F' and income_level = 'L2'
    Fetch the bitmaps for gender AND income level
    AND the two bitmaps together
* Only makes sense if the number of records to be retrieved is small
    otherwise a sequential read of the entire table would be cheaper
* If all we want is a count of the number of records satisfying a certain condition
    Don't need to access the table at all
    Count the number of bits that are 1's

SLIDE 11.80 - Bitmap Indices
Show slide
p 526
READ 1
* Assume the data record is 100 bytes
    You only need 1 bit per record
    Space required per bitmap, per record, is 1/800th of the size of the record
* If you have 8 possible values
    Total space for all bitmaps is 1/100th, or 1%, of the size of the relation
READ 2
* Deletions
    Cannot re-order records on deletion
        Too expensive to rebuild the bitmaps
    Just mark them as deleted
* Inserts
    Append them to the end of the file
READ 3

SLIDE 11.81 - Efficient Implementation of Bitmap Operations
Show slide
p 527
READ 1
* 1 million records => bitmap that is 1 million bits => 128K bytes
    4 bytes / word => only needs 32K instruction cycles on the CPU
    2.2 GHz processor => FAST !!!
READ 2
READ 3

SLIDE 11.82 - Efficient Implementation of Bitmap Operations
Show slide
p 528
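The bitmap intersection walked through on slides 11.79-11.81 can be sketched as below, using a Python int as the bit vector (bit n = record n). The gender/income_level records are made-up illustrations, not slide data.

```python
# Five hypothetical records: (gender, income_level).
records = [
    ("F", "L1"), ("M", "L2"), ("F", "L2"), ("M", "L1"), ("F", "L2"),
]

def build_bitmap(attr_index, value):
    """One bitmap per attribute value: bit n is 1 iff record n has that value."""
    bm = 0
    for n, rec in enumerate(records):
        if rec[attr_index] == value:
            bm |= 1 << n
    return bm

females  = build_bitmap(0, "F")      # bits 0, 2, 4 set
level_l2 = build_bitmap(1, "L2")     # bits 1, 2, 4 set

# gender = 'F' AND income_level = 'L2': a single bitwise AND.
both = females & level_l2
matches = [n for n in range(len(records)) if both >> n & 1]

# A pure count needs no table access at all - just count the 1 bits.
print(matches, bin(both).count("1"))     # -> [2, 4] 2
```

The same AND runs one machine word (32 or 64 records) at a time on real hardware, which is why a million-bit bitmap costs only tens of thousands of instructions.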