LOOK AT UNDERLYING ASSUMPTIONS RELATIONS * What is the difference between a spreadsheet and a relation (table) ? Spreadsheet - each cell is completely independent from every other cell SHOW Clares function dependency.xls - Sheet 1 May have column letters and row numbers - but this is just for addressing purposes Data may be completely random - it does not need to relate to other data (although it may) Table / relation SHOW Clares function dependency.xls - Sheet 2 A set of values that are related and belong together Instructor - each value has to do with that particular instructor Each set of values is related in the same way Same attribute names and with the same domain * If this was not true, then nothing that we have been learning in this class has any meaning That is the basic assumption FUNCTIONAL DEPENDENCIES TWO questions * What does the word FUNCTION mean in functional dependency? * What does a functional dependency have to do with keys? * Function is not the word FUNCTION used in the general sense Like "How's your iPhone funtioning today?" Or "What functions does that have?" Or "How do want it to function?" * Function is used in a mathematical sense * What is a function in math? Mathematical function - Expression in which every value of X uniquely determines a value of Y For every value of X there is ONLY one Y Draw graph of straight line Show how there is a unique value of Y for every value of X Draw graph of circle Is this a function? Why not ? It is an algebraic expression but NOT a function * It is invalid to return more than one Y for any X * Same concept of FUNCTION applies in dealing with databases and functional dependency * There is a PARTICULAR FUNCTION that we are talking about when we say FUNCTIONAL DEPENDENCY * The "function" being discussed is the function of IDENTIFICATION. A set of attributes we are calling ALPHA, uniquely identifies another set of related attributes BETA ALPHA is the set of attributes that form the KEY for the identification The VALUES of a particular instance of the key, that is ALPHA, will return the related set of VALUES for the attributes contained in BETA Given a tuple and the values of the attributes in ALPHA, one can determine the corresponding values of the attributes in BETA * It is invalid to have two records with the same set of ALPHA-values, but a different set of BETA-values ALPHA selects a particular value for the BETAs, and the values for each set of ALPHAs is unique * A Functional Dependency describes a relationship between attributes in a relation. * A set of attributes is functionally dependant on another if we can use the values of one set of attributes in a tuple (the key) to determine (ie, uniquely identify) the value of another set of attributes in the same tuple * determinant - the attributes on the left hand side of the functional dependency ( ALPHA) * dependent - the attribute on the right side of the functional dependency ( BETA ) * a determinant is an attribute or a group of attributes that select the value of other attributes * dependent attributes are uniquely identified by the attributes of the determinant * key - the determinant super key One last time * Functional dependency In a given relation, a set of attributes BETA is said to have a functional dependency on a set of attributes ALPHA IF AND ONLY IF each set of values for ALPHA is associated with precisely one set of values for BETA. OTHER DEFINITIONS * Trivial functional dependency A trivial functional dependency is a functional dependency of an attribute on a superset of itself. AB -> A Its a key that is just identifying itself * Full functional dependency * Dependent attributes rely on ALL attributes of the candidate key * An attribute is fully functionally dependent on a set of attributes ALPHA if it is functionally dependent on ALPHA, and NOT functionally dependent on any proper subset of ALPHA * Transitive dependency A transitive dependency is an indirect functional dependency If X->Y and Y->Z, then X -> Z * SATISIFES An INSTANCE of r SATISFIES the functional dependency IF for ALL pairs of tuples IN THAT INSTANCE the dependency is true * LEGAL INSTANCE of a relation Instance that satisfies all constraints including functional dependencies * HOLDS A functional dependency HOLDS on a relation IF EVERY LEGAL instance of the relation SATISFIES the functional dependency NORMALIZATION Normalisation is the process of taking data from a problem and reducing it to a set of relations ensuring data integrity and eliminating data redundancy * Data integrity all of the data in the database are consistent, and satisfy all integrity constraints. * Data redundancy direct redundancy - data in the database can be found in two different locations indirect redundancy - data can be calculated from other data items NORMAL FORMS * The normal forms represent guidelines for table design * The normal forms are applicable to individual tables; to say that an entire DATABASE is in normal form N is to say that ALL of its tables are in normal form N. * To be in a particular form requires that the data meets the criteria to also be in all normal forms BEFORE that form. To be in 2NF the data must meet the criteria for both 2NF and 1NF. * The higher the form the more redundancy has been eliminated. * 1NF - First Normal Form A relation R is in 1NF if the DOMAINS of all attributes of R are ATOMIC (simple values) Not composite or multi-valued Second and third normal forms deal with the relationship between non-key and key fields. * 2NF - Second Normal Form No partial functional dependencies on the key ? No attribute in BETA is functionally dependent on a proper subset of a ALPHA No non-key attribute is functionally dependent on a proper subset of the key There must be no partial key dependencies (PKDs). A relation in 1NF whose key consists of only a single attribute is automatically in 2NF There is no proper subset of the key 2NF is violated when a non-key field is a fact about a part of a key instead of the whole key. * 3NF - All non-key attributes must be mutually independent 1) Beta attributes are all part of the key 2) Alpha is a superkey 3) All non-key attributes must be contained in some other candidate key Non-key attributes may not be functionally dependent on other non-key attribute Third normal form is violated when a non-key field is a fact about another non-key field inst_dept ( id, name, salary , dept_name, building, budget ) budget is dependent on dept_name Creates a TRANSITIVE dependency on id, dept_name & budget id -> dept_name dept_name -> budget 3NF removes virtually all the redundant data * KEY a set of attributes that functionally determines all of the other attributes in a relation. * CANDIDATE KEY a minimal key * BCNF - Every non-trivial functional dependency in the table is a dependency on a superkey * BCNF is based on the concept of a determinant. A relation is in BCNF is, and only if, every determinant is a candidate key. * When a relation has more than one candidate key, anomalies may result even though the relation is in 3NF. 3NF does not deal satisfactorily with the case of a relation with overlapping candidate keys i.e. when composite candidate keys have at least one attribute in common.