SLIDE 12.2 - Chapter 12: Query Processing
Show slide
p 537

*	Query processing involves
	*	translations of sql statements into reading actual blocks from disk
	*	optimizations and transformations
	*	evaluations of queries
	
*	Covering only the Overview, Measures, and Joins


SLIDE 12.3 - Basic Steps in Query Processing
Show slide
p 537

SELECT salary FROM instructor WHERE salary < 75000


SLIDE 12.4 - Basic Steps in Query Processing
Show slide
p 538

*	Parsing
		Check syntax
			relations & attribute names
		Create parse-tree representation of the query
		Translate into relational algebra


SLIDE 12.5 - Basic Steps in Query Processing: Optimization
Show slide
p 539

SELECT salary FROM instructor WHERE salary < 75000

READ 1
*	May be many ways to represent the same query in relational algebra
READ 2
*	Same relational algebra expression may be executed by different algorithms
READ 3
*	Annotations may be added to the RA expression to specify which algorithm should be used
			Called EVALUATION PRIMITIVE
		Sequence of evaluation primitives is called a
			QUERY-EXECUTION plan or QUERY-EVALUATION plan


SLIDE 12.6 - Basic Steps in Query Processing: Optimization
Show slide
p 539

READ 1
*	Optimization
		Users writing SQL shouldn't have to worry about this
READ 2
READ 3


SLIDE 12.7 - Measures of Query Cost
Show slide
p 540

*	Evaluate plans according to their cost
		How do you estimate the cost ?
		
READ 1
		
*	Focus on number of block transfers
*	What makes up the cost of a disk block access ?
		Seek time 					(on avg 4 ms)
		Rotational Latency 
*	Block Transfer time 			(on avg 0.1 ms per 4k block)
	
READ 2 


SLIDE 12.8 - Measures of Query Cost
Show slide
p 540

READ 1
READ 2
*	CPU costs depends on many other factors and is very fast compared to disk IO


SLIDE 12.9 - Measures of Query Cost
Show slide
p 541

READ SLIDE

* * * SKIP AHEAD TO JOINS * * * 


SLIDE 12.23 - Join Operation
Show slide
p 549

READ SLIDE


SLIDE 12.24 - Nested-Loop Join
Show slide
p 550

*	This join algorithm consists of a pair of nested loops

POINT 1
*	theta is a join condition
*	for each row in r
		read each row in s
			if the 2 rows satisfy the join condition Theta
				Combine the 2 rows
				Add them to the result set

*	natural join
		theta join on attributes of the same name
		delete repeated attributes columns in result
		
READ 2
READ 3
READ 4


SLIDE 12.25 - Nested-Loop Join
Show slide
p 550

POINT 1
	Nr is number of tuples in r
	Br is the number of blocks to hold all tuples in r
	Bs is the number of blocks to hold all tuples in s
	
	*	For each record in r, we have to perform a complete scan on s

READ 1	
	*	Seeks
		*	Need 1 seek for each block of r
		*	Need 1 seek on s for each record in r
				Costs only 1 seek for each scan of s because records are read sequentially
			
READ SLIDE


SLIDE 12.26 - Block Nested-Loop Join
Show slide
p 551
	
*	Improve performance if we process relations on a per-block basis
		rather than on a per-tuple basis
*	Every block of the inner relation
		is paired with every block of the outer relation
*	Within each pair of blocks
		every tuple in one block is paired with every tuple in the other block
		

SLIDE 12.27 - Block Nested-Loop Join
Show slide
p 552

READ 1
*	Doing a cross product of the blocks
		Read block from r
			Read ALL blocks from s
READ 2
*	If all the blocks fit into memory
READ 3


SLIDE 12.28 - Indexed Nested-Loop Join
Show slide
p 553

READ 1
READ 2
*	Go read tuple from data block whose index entry matced the join condition
READ 3
*	Room for 1 data block of r and one index block for s
READ 4


SLIDE 12.29 - Example of Indexed Nested-Loop Join Costs
Show slide
p 553

READ 1-4
READ 5
*	Transfers:
		Read 1 block of takes once for each student
		400 blocks of takes * 100 blocks of student 
*	Seeks: 
		Seek once for each scan of inner relation
		Seek once for each block of outer relation
		2 * 100 blocks of students
READ 6
*	Cut I/O in half


SLIDE 12.30 - Merge-Join
Show slide
p 553


SLIDE 12.31 - Merge Merge-Join
Show slide
p 557

READ 1-3
*	Bb is the number of buffer blocks
READ 4


SLIDE 12.32 - Hash-Join
Show slide
p 557

*	Value in r is hashed to one of its partitions
*	Equal value in s is hashed to one of its partitions
*	Only those 2 partitions need to be checked


SLIDE 12.33 - Hash-Join
Show slide
p 558


SLIDE 12.34 - Hash-Join
Show slide
p 558


SLIDE 12.35 - Hash-Join Algorithm
Show slide
p 558


SLIDE 12.36 - Hash-Join Algorithm
Show slide
p 559


SLIDE 12.37 - Handling of Overflows
Show slide
p 560


SLIDE 12.38 - Cost of Hash Hash-Join
Show slide
p 561

READ 1


SLIDE 12.39 - Cost of Hash Hash-Join
Show slide
p 561


SLIDE 12.40 - Cost of Hash Hash-Join
Show slide
p 562


SLIDE 12.41 - Cost of Hash Hash-Join
Show slide
p 563

*	ANDs
READ 1

*	ORs
READ 2