Slide 7.2 * What happens when you have to start from scratch to put a database application together? How do you do that? Pictures - structure Diagrams - allow info to be viewed at a high level Lists of rules Constraints Design Process 1. Specification of user reqs * You need to know what you are going to build 2. Conceuptual design detailed overview of the enterprise => Data and Relationships 3. Functional Reqs Operations performed on the data 4. Logical design Map the conceptual design to Implementation data model Relational model 5. Physical design Files, indexes, etc PITFALLS in design Redundancy Too much data Incompletenes Not enough data Slide 7.3 Modeling + Gives us the overall logical structure of the database * A db can be modeled as collection of entities - nouns relatonships between entities - verbs, is-a (specialization), has-a (aggregation) * An entity is an object that exists and is distinguishable from other objects Analagous to - OO instance Specific instructor, student, etc Candidate keys uniquely identify a particular entity * Entities have attributes Adjectives Info about the entity Attributes have Values * Entity Set is a set of entities of the same type that share the same properties. Analagous to OO Class It is an abstract term set of all POSSIBLE instructors / students EXTENSION OF THE ENTITY SET - Actual set of instructors / students data currently in the set + DB => collection of entity sets each of which contains any number of entities of the same type Slide 7.4 - Entity Sets instructor and student Show slide Slide 7.5 - Relationship Sets Show slide * RELATIONSHIP - association among 2 or more entities * RELATIONSHIP SET - set of relationships of the same type + Association between entity sets is called participation Entity set E1, .., En PARTICIPATE in relationship R + RELATIONSHIP INSTANCE represents an association between the named entities in the real world enterprise + Entity's ROLE - function that that entity plays in a relationship Implicitly understood - but may be added explicitly for clarification Slide 7.6 - Relationship Set advisor Show slide + Relationships are the lines between the 2 entity sets Slide 7.7 - Relationship Sets cont. Show slide + Better example of relationship attribute Student takes Course => grade + Relationship instance in a given set must be uniquely identifiable from its participating entities Can't use relationship attributes to form uniqueness + The same entity sets may participate in more than one relationship Another kind of advisor eg, dept_advisor Slide 7.9 - Degree of a Relationship Set Show slide + What does binary mean? + Ternary Particular student guided by a particular instructor on a particular project A student can have different instructors for different projects Can't be captured in a binary relationship Slide 7.10 - Attributes Show slide + Simple -> scalar, undividable + Composite -> can be broken into subparts What examples can you give? Name -> first, middle, last Address -> + Single valued Attribute is only made up of one value May be simple or composite + Multi valued May allow a set of values for that attribute Multiple phone numbers + Derived Students - totcredits + Attribute may have a null value Missing / doesn't exist Unknown Slide 7.10 - Composite Attributes Show slide Slide 7.11 - Mapping Cardinality Constraints Show slide * One-to-one For each entity in A there is at most one matching entity in B Does anyone see a 1-1 relationship in the university schema * One-to-Many For each entity in A there are 0 or more matching entities in B For each entity in B, there is at most one match entity in A Does anyone see a 1-many relationship in the university schema * Many-to-One Opposite of above * Many-to-many For each entity in A there are 0 or more matching entities in B Vice versa + Cardinality depends on real-world situation + Total participation Every entity participates in at least on relationship + Partial participation Slide 7.12 - Mapping Cardinality Show slide Slide 7.13 - Mapping Cardinality Show slide Slide 7.14 - Keys Show slide + No two entities in an entity set are allowed to have exactly the same value for all attributes Uniqueness is determined by the value of ALL the attributes for each entity in the set + KEY - for an entity is a set of set attributes that suffice to distinguish entities from each other * SUPER KEY * CANDIDATE KEY Slide 7.15 - Keys for Relationship Sets Show slide + Keys help define relationships uniquely A relationship set is an association between what? Two entity sets How do we uniquely define relationship entities in a relationship set? What would be the unique key in relationship come from? What happens if attributes from the 2 entity sets have the same name? What if an entity set has a relationship with itself? Slide 7.16 - Redundant Attributes Show slide + Database design 1. Choose entities 2. Choose attributes 3. Choose relationships May result in a situation where the attributes in the various entity sets are redundant May need to be removed * Read slide Remove dept_name from instructor FKs that we currently find in our tables are really part of the relationship Happens on the Many side of the relationship + Good entity-relationship design does not contain redundant attriubtes Slide 7.17 - E-R Diagrams Show slide + Dashed lines link attributes of a relationship set to the relationship set Slide 7.18 - Entity With Composite, Multivalued Multivalued, and Derived, Attributes Show slide Slide 7.19 - Relationship Sets with Attributes Show slide + If a relationship set has some attributes, we enclose them in a simple rectangle and link it with a dashed line Slide 7.20 - Roles Show slide + We indicate roles by labeling the lines that connect diamonds to rectangles Slide 7.21 - Cardinality Constraints Show slide + Mapping cardinality Line ending with arrow indicates the ONE side of a 1-1 or 1-many relationship Else, it is the ANY side of the relationship Slide 7.22 - One-to-One Relationship Show slide Slide 7.23 - One-to-Many Relationships Show slide Slide 7.24 - Many-to-One Relationships Show slide Slide 7.25 - Many-to-Many Relationships Show slide Slide 7.26 - Participation of an Entity Set in a Relationship Set Show slide + Total participation Every entity participates in at least on relationship + Partial participation Slide 7.27 - Alternative Notation for Cardinality Limits Show slide Slide 7.28 - E-R Diagram with a Ternary Relationship Show slide Slide 7.29 - Cardinality Constraints on Ternary Relationship Show slide Slide 7.30 - Interactive ER Design with Class Show slide * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Slide 7.31 - Weak Entity Sets NO Show slide SHOW university diagram * Look at instructor and department relations * SHOW inst_dept.jpg Each instructor has an associated department dept_name appears in both entity set When we create the relationship between the 2 enitity sets How do we create a unique key for a relationship set that associates section and course ? * What is the PK for instructor? * What is the PK for department? dept_name becomes one of the attributes of the relationship set because it is the PK of department dept_name becomes redundant in instructor because that info is now kept in the relationship set Remove dept_name from instructor SHOW university diagram * Look at section and course entity sets * SHOW sec_course_1.jpg How are we going to create the key for a relationship set that associates section and course ? * What is the PK for section * What is the PK for course * SHOW sec_course_2.jpg * How do we deal with the redundancy ? Get rid of extra course_id * SHOW sec_course_3.jpg * What happens to course if we remove course_id ? * What happens to section if we remove course_id ? Lose uniqueness on section sec_course becomes a SPECIAL relationship that provides extra info required for uniqueness on section * STRONG ENTITY SET - Entity set that has a PK * WEAK ENTITY SET - does not have sufficient attributes to form its own PK MUST be associated with another set call the IDENTIFYING or OWNER ENTITY SET The week entity set is EXISTENCE DEPENDENT on the identifying entity set The identifying set is said to OWN the weak entity set that it identifies * IDENTIFYING RELATIONSHIP is the realtionship associating the weak entity set with the identifying entity set Is M-to-1 from the weak entity set to the identifying set Participation of the weak set in the relationship is TOTAL * Weak entity set no longer has its own PK Still need a way to link individual entities (sections) with the strong entity set (course) * DISCRIMINATOR - set of attributes in the weak entity set that allows the link to occur sections => secion_id, year, semester This set of attributes uniquely identifies one single section for that course Also called a PARTIAL KEY * PK of the weak entity set is formed by 1) the PK if the identifying entity set (course) , 2) the discriminator of the weak entity set (section) * PK of section is still {course_id, sec_id, semester, year} but course_id comes from the PK of the identifying enitity set DISCRIMINATOR of weak entity set is underlined with dashes on the ER diagram Relationship set is depicted using a DOUBLE DIAMOND Slide 7.31 - Weak Entity Sets Show slide Slide 7.32 - Weak Entity Sets Show slide Slide 7.33 - Weak Entity Sets Show slide Slide 7.34 - E-R Diagram for a University Enterprise Show slide * What is different between this and the university diagram we've been using? * What does the double diamond mean on the sec_course relationship set? Connects a weak entity set to its identifying strong entity set * What do the double lines between the entity sets and relationship sets mean? Total participation * What does the arrow at the end of the line indicate? The ONE side of the relationship Slide 7.35 - Reduction to Relational Schemas Show slide * Going back to where we started at the beginning of the book * Convert ER diagrams back into relation schemas * Both are abstract, logical representations of the real world * Very similar in design Slide 7.36 - Reduction to Relational Schemas Show slide 1 READ - Notice that both entity sets and relationship sets reduce to a relation schema 2 READ - An ER database schema (single) is reduced to a SET of relation schemas (plural) 3 READ - Entity sets keep their same names in the corresponding relation schemas 4 READ - Attributes (generally) also keep their same names in the corresponding relation schema Slide 7.37 - Representing Entity Sets With Simple Attributes Show slide * Tell me all the STRONG entity sets in the university ER diagram with SIMPLE attributes classroom( building, room_number, capacity) department ( dept_name, building, budget) course ( course_id, title, credits) instructor ( id, name, salary ) student ( id, name, tot_cred) * Tell me all the WEAK entity sets in the university ER diagram with SIMPLE attributes ? * How do you know what the weak entity sets are? Their discriminators have dashed underlines * Weak entity sets get the primary key of their corresponding strong identifying entity set As well as their own discriminator set to form its PK section( course_id, sec_id, semester, year ) * Weak entity sets ALSO get a FK constraint back to the strong identifying entity set For each tuple in the weak entity, there is a correspoinding tuple representing the corresponding strong entity section( course_id, sec_id, semester, year ) Slide 7.38 - Representing Relationship Sets Show slide * How do you know from the ER diagram the a relationship set is many-to-many ? No arrows on the lines connecting the relationship set to the participating entity sets * In a binary relationship, where to the attributes come from that make up the PK of the relationship entity set ? PKs from participating entity sets Union of the PKs from the participating entity sets Allows us to uniquely identify all the individual relationships between particular entities * Taking ALL the PKs from BOTH sides of a BINARY relationship is OVERKILL * ONLY needed for Many-to-Many binary relationships * Many-to-One or One-to-Many PK on the MANY side becomes the PK of the relationship * One-to-One PK from EITHER entity set Choice is arbitrary * Must also create FK constraints on the appropriate relations schemas for the relationships * What about more than BINARY sets? Sets WITHOUT arrows PKs from ALL participating sets becomes the PK of the relationship set Sets WITH arrow on ONE of its edges The PK of the entity set on the end of the arrow becomes the PK of the relationship set * Example - Advisor relationship PK of relationship set is PKs from both entity sets - id, and id Must rename attributes to make them unique - i_id and s_id From instructor to student Many-to-one PK for the relation schema is just s_id Must create 2 FK constraints on Advisor relation One on i_id and one on s_id teaches ( id, course_id, sec_id, semester, year ) takes ( id, course_id, sec_id, semester, year, grade ) prereq ( course_id, prereq_id ) advisor ( s_id, i_id ) sec_course ( course_id, sec_id, semester, year ) sec_time_slot ( course_id, sec_id, semester, year, time_slot_id ) sec_class ( course_id, sec_id, semester, year, building, room_number ) inst_dept ( id, dept_name ) stud_dept ( id, dept_name ) course_dept ( course_id, dept_name ) * Tell me the cardinality of each of the relationship sets above Slide 7.39 - Redundancy of Schemas Show slide 1 READ 2 READ * What do we do to fix 1:M and M:1 redundancies in the relations above? Slide 7.40 - Redundancy of Schemas Show slide 1 READ 2 READ 3 READ * What do we do to fix 1:1 redundancies in the relations above? * What do we do to fix weak entity set redundancies? Slide 7.41 - Composite and Multivalued Attributes Show slide Slide 7.42 - Composite and Multivalued Attributes Show slide Slide 7.43 - Composite and Multivalued Attributes Show slide Slide 7.44 - Design Issues Show slide * Choice of entity sets and relation schemas is BOTH art and science ie, it is NOT precise, there is no one right way * When is an object an entity vs an attribute ? Entities are more general Allow multiple entities of that type Allows other attributes to associated with it It is a DESIGN decision Talk to the user about how that object is being used in the real world * Common errors in ER diagrams * Not using a relationship set to indicate a relationship between entity sets Using PKs / FKs in the ER design / diagram * Specifying PKs of relationship sets as attributes of that set They are implicit ! ! ! Slide 7.45 - Design Issues Show slide * When do you use an entity set vs a relationship set? Replace Takes relationship set with an entity set A registration record that records this event Each registration is associated with exactly one student and one section of a course Need to add 2 new relationship sets There is total participation on both sides Both are correct Takes is more compact without losing info Might need Registration if there is more info needed * Guideline Use a relationship set to describe an action that occurs between entities Slide 7.46 - Design Issues Show slide * ANY bigger than BINARY (n-ary) relationship may be can be broken down into strictly binary relationships by replaces the relationship with another entity and creating relationship sets between that entity and all the other entities in the original relationship * Might NOT want to do this Doesn't clearly represent the inherent relationship Might lose constraints More complex design * Cardinality of relationship affects where relationship attributes go Attributes of 1:M relationship sets MUST be put in MANY side of the 2 participating entity sets Attributes of 1:1 relationship sets can be put in EITHER of the 2 participating entity sets It is a DESIGN DECISION Attributes of M:M MUST keep the relationship set intact It is NOT redundant and cannot be removed without losing information Slide 7.47 - Binary Vs. Non Non-Binary Relationships Show slide Slide 7.48 - Binary Vs. Non Non-Binary Relationships Show slide Slide 7.49 - Binary Vs. Non Non-Binary Relationships Show slide Slide 7.50 - Extended ER Features Show slide Slide 7.51 - Extended E-R Features: Specialization Show slide * We will extend our university database to model various people within the univ Defining a new entity - PERSON - with attributes ID, NAME, ADDRESS * Specialization - TOP DOWN design * A subset of entities within a set may include attributes not shared by all entities in the set * PERSON entity set may be broken into to other subgroupings - EMPLOYEE and STUDENT * A PERSON entity may be an EMPLOYEE, a STUDENT, both, or neither * STUDENT could be further divided in GRADUATE (gets and office) and UNDERGRADUATE (residential_college) * EMPLOYEE could be broken down by job such as INSTRUCTOR and SECRETARY * May also be designated as TEMPORARY vs PERMANENT * In ER diagrams, this is a special kind of relationship called IS-A * Specialization is shown as a hollow tipped arrow From specialized entity to more general entity * Attributes of higher-level entity sets are inherited (shared) by lower-level entity sets Applies through ALL tiers instructors and secretaries as well as students all have id, name, address * Relationships are also inherited down to the lowest level Slide 7.52 - Specialization Example Show slide * Can an entity belong to MORE than one subset at a time * Yes - OVERLAPPING specialization 2 separate arrows are used person may be both an employee and a student at the same time * No - DISJOINT specialization 1 single arrow with 2 branches is used Slide 7.53 - Extended ER Features: Generalization Show slide * Generalization - BOTTOM UP design Refinement of initial entity sets into successive levels of entity subgrouping Multiple entity sets are synthesized into a higher-level entity set on the basis of common features * Similarities between instructor and secretary Instructor - instructor_id, instructor_name, insturctor_salary, rank Secretary - secratary_id. secratary_name, secratary_salary, hourly_rate Several attributes are conceptually the same * Generalization A containment relationship that exists between a higher-level entity set and one or more lower-level entity sets Attributes are given a common name and represented in a higher-level entity - PERSON * Specialization and Generalization are used back and forth in the design process They are treated the same in an ER diagram Slide 7.54 -Specialization and Generalization (Cont.) Show slide Slide 7.55 -Design Constraints on a Specialization / Generalization Show slide * Membership * What determines if an entity may be a member of a lower-level entity? Attribute defined Explicit condition or predicate May be attribute that flags whether the entity is a member of that lower-level set student_type = "undergraduate" or "graduate" User defined An actual person operating the software assigns an entity to a particular group Decision is outside the database * How many subgroups may an entity belong to? * Disjoint Entity belongs to no more than one Either undergrad or graduate * Overlapping Entity may belong to more than 1 subgroup Student AND employee Slide 7.56 -Design Constraints on a Specialization / Generalization Show slide * Completeness * Total Each higher-level entity MUST belong to AT LEAST one lower-level entity set Student must be classified as grad or undergrad Depict this on diagram by writing the word "Total" next to the hollow arrowhead * Partial Each higher-level entity MAY or MAY NOT belong to a lower-level entity set This is the default * Completeness and Disjointedness are INDEPENDENT constraints * If there is a total completeness contraint in place What happens when you do an insert? Slide 7.57 - Aggregation Show slide * ER diagrams are unable to represent relationships among relationships * Instructors are required to give monthly evaluations of students on their projects * Eval becomes a new entity * Options Turn it into a 4-way relationship? Slide 7.58 - Aggregation Show slide * Combine proj_guide with eval_for ? Some combinations of proj_guid may not have evaluations * Make evaluation an attribute of proj_guide eval_for has the same combination of attributes as proj_guide * May not be possible if evaluations also has relationships with other entity sets Slide 7.59 - Aggregation Show slide * AGGREGATION An abstraction through which relationships are treated as higher-level entities * Combine Treat Students, instructors, projects entity sets, and proj_guide relationship set into one BIG entity Create a normal binary relationship between evaluations and the higher-level entity set Slide 7.60 - Representing Specialization via Schemas - Method 1 Show slide * How do represent specialization and generalization in relation schema? * PK of highest level relation becomes PK of all lower-level relations * FKs defined on each of the lower-level relations back to the highest Slide 7.61 - Representing Specialization via Schemas - Method 2 Show slide * Used when generalization is disjoint and complete No redundancy All info is EITHER in one relation or the other - NOT both NONE of the attributes are kept at the higher level Don't need it as a relation - Can we get rid of it ? * BOTH relations have the same attribute as their PK - id * NO - need it for the PK * If overlapping generalization Some values would appear multiple times * If disjoing but NOT complete Person is NEITHER student NO employee need the extra Person schema * Still have FK problem Remove common attributes from below You end up with Method 1 Slide 7.62 - Schemas Corresponding to Aggregation Show slide * Straight forward Define relation for evaluation Define relation for eval_for PK for evaluation PK for proj_guide Any descriptive attributes * Define all PKs and FKs as before Slide 7.63 - Schemas Corresponding to Aggregation Show slide Slide 7.64 - E-R Design Decisions Show slide Slide 7.65 - E-R Design Show slide 1. A publishing company produces scientific books on various subjects. 2. The books are written by authors who specialize in one particular subject. 3. The company employs editors who, not necessarily being specialists in a particular area, each take sole responsibility for editing one or more publications. 4. A publication covers essentially one of the specialist subjects and is normally written by a single author. 5. When writing a particular book, each author works with on editor, but may submit another work for publication to be supervised by other editors. 6. To improve their competitiveness, the company tries to employ a variety of authors, more than one author being a specialist in a particular subject. Slide 7.66 - E-R Notation ER_Notation_defs_1.jpg Slide 7.67 - E-R Notation ER_Notation_defs_2.jpg Slide 7.68 - Alternative E-R Notations Show slide Slide 7.69 - Alternative E-R Notations Show slide Slide 7.70 - Alternative E-R Notations Show slide * 7 different diagrams used in UML Class diagram similar to ER diagram Use case diagram how will the system be used? by whom? Activity diagram Flow of work between compoments in the system Implementation Diagram All the components of the system and how they fit together hardware & software Slide 7.71 - ER vs. UML Class Diagrams Show slide Slide 7.72 - ER vs. UML Class Diagrams Show slide Slide 7.73 - ER vs. UML Class Diagrams Show slide