Definitions / Rules
Entity <--> Database Table
- Naming convention: singular, uppercase
- Instance of an entity <--> Database Row (or Tuple or Record)
Attribute <--> Database Column (or Field).
- Naming convention: singular, lowercase
- Relationship = Link between two entities, represents a business rule and can be expressed as a verb phrase.
- Primary key = an attribute that uniquely identify each instance of an entity.
- "one-to-many" relationship: one instance of the parent entity relates to many instances of the child entity.
"many-to-many" relationship: many instances of one entity relate to many instances of the other entity.
- Need to be resolved (to avoid data redundancy): create an intermediate entity known as a cross-reference (or XREF) entity. The XREF entity is made up of the primary keys from both of the two original entities. Both of the two original entities become parent entities of the XREF entity. Thus, the "many-to-many" relationship becomes resolved as two "one-to-many" relationships.
- Relationships are either "identifying" or "non-identifying". Identifying relationships exist when the primary key of the parent entity is included in the primary key of the child entity. On the other hand, a non-identifying relationship exists when the primary key of the parent entity is included in the child entity but not as part of the child entity's primary key. In addition, non-identifying relationships may be further classified as being either "mandatory" or "non-mandatory". A mandatory non-identifying relationship exists when the value in the child table cannot be null. On the other hand, a non-mandatory non-identifying relationship exists when the value in the child table can be null.
A primary Key is an attribute that is selected to uniquely identify each instance of an entity.
- cannot be null
- immutable: cannot be changed
- shall contain no semantic meaning
- shall not be re-used/recycled
Primary key of a parent entity in a child entity (reference)
Natural key = Business key = Domain key
is a type of unique key in a database formed of attributes that exist and are used in the external world outside the database (i.e. in the business domain or domain of discourse). In the relational model of data, a natural key is a candidate key and is therefore a functional determinant for all attributes in a relation.
Primary key generated by the database management system. The only significance of the surrogate key is to act as the primary key. The value:
- is unique system-wide, hence never reused
- is system generated
- is not manipulable by the user or application
- contains no semantic meaning
- is not visible to the user or application
- is not composed of several values from different domains.
Combination of attributes leading to a primary key
Candidate key (relation)
A candidate key of a relation is a minimal superkey for that relation; that is, a set of attributes such that:
- the relation does not have two distinct tuples (i.e. rows or records in common database language) with the same values for these attributes (which means that the set of attributes is a superkey)
- there is no proper subset of these attributes for which (1) holds (which means that the set is minimal).
A composite key is a candidate key that consists of two or more attributes (table columns) that together uniquely identify an entity occurrence (table row). A compound key is a composite key for which each attribute that makes up the key is a simple (foreign) key in its own right.
|Simple||A key made from only one attribute.|
|Concatenated||A key made from more than one attribute joined together as a single key, such as part or whole name with a system generated number appended as often used for E-mail addresses.|
|Compound||A key made from at least two attributes or simple keys, only simple keys exist in a compound key.|
|Composite||A key containing at least one compound key with at least one other attribute or simple key (this is an extension of a compound key).|
|Natural||A key made from data that exists outside the current database. In other words, the data is not system generated, such as a social security number imported from another system.|
|Surrogate||An artificial key made from data that is system assigned or generated when another candidate key exists. Surrogate keys are usually numeric ID values and often used for performance reasons.|
|Candidate||A key that may become the primary key.|
|Primary||The key that is selected as the primary key. Only one key within an entity is selected to be the primary key. This is the key that is allowed to migrate to other entities to define the relationships that exist among the entities. When the data model is instantiated into a physical database, it is the key that the system uses the most when accessing the table, or joining the tables together when selecting data.|
|Alternate||A non-primary key that can be used to identify only one row in a table. Alternate keys may be used like a primary key in a single-table select.|
|Foreign||A key that has migrated to another entity.|
A foreign key requires that values must be present in the parent entity before like values may be inserted in the child entity.
- Cardinality of the relationship between the child entity and the parent entity. The cardinality of a relationship may be determined by asking the following question: "How many instances of the child entity relate to each instance of the parent entity?"
There are four types of cardinality:
- One to zero or more (common cardinality)
- One to one or more (P cardinality)
- One to zero or one (Z cardinality)
- One to exactly N (N cardinality).
- minimize data redundancy
- minimize data restructuring
- minimize I/O by reducing transaction sizes
- enforce referential integrity
First normal form (1NF): very basic rules:
Second normal form (2NF): remove duplicate data:
- Meet all the requirements of 1NF.
- Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- Create relationships between these new tables and their predecessors through the use of foreign keys.
Third normal form (3NF):
- Meet all the requirements of the second normal form.
- Remove columns that are not dependent upon the primary key.
Boyce-Codd NormalForm (BCNF or 3.5NF):
- Meet all the requirements of the 3NF.
- Every determinant must be a candidate key.
Fourth NormalForm (4NF):
- Meet all the requirements of the 3NF.
- Relationships have no multi-valued dependencies.
Fifth normal form (5NF), Project-join normal form (PJ/NF) designed to reduce redundancy in relational databases recording multi-valued facts by isolating semantically related multiple relationships.