Relational model


The relational model for management of a database is a data model based on predicate logic and set theory.


The model

The fundamental assumption of the relational model is that all data are represented as mathematical n-ary relations, an n-ary relation being a subset of the Cartesian product of n sets. In the mathematical model, reasoning about such data is done in two-valued predicate logic (that is, without NULLs), meaning there are two possible evaluations for each proposition: either true or false. Data are operated upon by means of a relational calculus and algebra.
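As a minimal sketch of this idea, the following Python fragment builds the Cartesian product of two sets and treats a relation as a subset of it. The domains, the relation lives_in, and the helper holds are hypothetical names chosen for illustration only.

```python
from itertools import product

# Hypothetical domains (the n sets, here n = 2); values are illustrative only.
names = {"Alice", "Bob"}
cities = {"Paris", "Oslo"}

# The Cartesian product of the domains: every possible (name, city) pair.
cartesian = set(product(names, cities))

# A binary relation is any subset of that product.
lives_in = {("Alice", "Paris"), ("Bob", "Oslo")}
is_subset = lives_in <= cartesian


def holds(relation, *values):
    """Two-valued evaluation: a proposition is true iff its tuple is in the relation."""
    return tuple(values) in relation
```

Under this reading, every proposition such as "Alice lives in Oslo" evaluates to either true or false, with no third outcome.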

The relational data model permits the designer to create a consistent logical model of information, to be refined through database normalization. The access plans and other implementation and operation details are handled by the DBMS engine, and should not be reflected in the logical model. This contrasts with common practice for SQL DBMSs in which performance tuning often requires changes to the logical model.

The basic relational building block is the domain, or data type. A tuple is an unordered set of attributes, which are ordered pairs of domain and value. A relvar (relation variable) is a set of ordered pairs of domain and name, which serves as the header for a relation. A relation is a set of tuples. Although these relational concepts are mathematically defined, they correspond loosely to traditional database concepts. A table is an accepted visual representation of a relation; a tuple is similar to the concept of a row.

The basic principle of the relational model is the Information Principle: all information is represented by data values in relations. Thus, the relvars are not related to each other at design time: rather, designers use the same domain in several relvars, and if one attribute is dependent on another, this dependency is enforced through referential integrity.


Other models are the hierarchical model and the network model. Some systems using these older architectures are still in use today in data centers with high data volume needs, or where existing systems are so complex that migrating them to the relational model would be cost-prohibitive. Also of note are newer object-oriented databases, although many of them are DBMS-construction kits rather than proper DBMSs.

The relational model was the first formal database model. After it was defined, informal models were made to describe hierarchical databases (the hierarchical model) and network databases (the network model). Hierarchical and network databases existed before relational databases, but were only described as models after the relational model was defined, in order to establish a basis for comparison.


The relational model was invented by Dr. Ted Codd as a general model of data, and subsequently maintained and developed by Chris Date and Hugh Darwen among others. In The Third Manifesto (1995) they show how the relational model can be extended with object-oriented features without compromising its fundamental principles.


SQL, initially pushed as the standard language for relational databases, has always violated the relational model. SQL DBMSs are thus not actually RDBMSs, and the current ISO SQL standard doesn't mention the relational model or use relational terms or concepts.


There have been several attempts to produce a true implementation of the relational database model originally developed by Codd, Date, Darwen and others, but none have been popular successes so far. Rel is one of the more recent attempts to do this.


Codd himself proposed a three-valued logic version of the relational model, and a four-valued logic version has also been proposed, in order to deal with missing information. But these have never been implemented, presumably because of the attendant complexity. SQL NULLs were intended to be part of a three-valued logic system, but fell short of that due to logical errors in the standard and in its implementations.


Database normalization is usually performed when designing a relational database, to improve the logical consistency of the database design and the transactional performance.

There are two commonly used systems of diagramming to aid in the visual representation of the relational model: the entity-relationship diagram (ERD), and the related IDEF diagram used in the IDEF1X method created by the U.S. Air Force based on ERDs.

Example database

An idealized, very simple example of a description of some relvars and their attributes:

Customer(Customer ID, Tax ID, Name, Address, City, State, Zip, Phone)

Order(Order No, Customer ID, Invoice No, Date Placed, Date Promised, Terms, Status)

Order Line(Order No, Order Line No, Product Code, Qty)

Invoice(Invoice No, Customer ID, Order No, Date, Status)

Invoice Line(Invoice No, Line No, Product Code, Qty Shipped)

Product(Product Code, Product Description)

In this design we have six relvars: Customer, Product, Order, Order Line, Invoice, and Invoice Line. The bold, underlined attributes are candidate keys. The non-bold, underlined attributes are foreign keys.

Usually one candidate key is arbitrarily chosen to be called the primary key and used in preference over the other candidate keys, which are then called alternate keys.
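The roles of keys in this design can be sketched in Python. The sketch assumes, purely for illustration, that Customer ID and Tax ID are both candidate keys of Customer and that Customer ID in Order is a foreign key referencing Customer; the sample rows and the helpers is_key and references are hypothetical.

```python
# Hypothetical sample data; attribute names follow the example relvars.
customers = [
    {"Customer ID": 1, "Tax ID": "T-100", "Name": "Ann"},
    {"Customer ID": 2, "Tax ID": "T-200", "Name": "Ben"},
]
orders = [
    {"Order No": 10, "Customer ID": 1, "Status": "open"},
    {"Order No": 11, "Customer ID": 2, "Status": "shipped"},
]


def is_key(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def references(child, parent, attrs):
    """True if every attrs-value in child also occurs in parent
    (a simple referential integrity check)."""
    parent_values = {tuple(row[a] for a in attrs) for row in parent}
    return all(tuple(row[a] for a in attrs) in parent_values for row in child)
```

With these rows, both is_key(customers, ["Customer ID"]) and is_key(customers, ["Tax ID"]) hold, so either could be chosen as the primary key and the other would become an alternate key.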

A candidate key is a unique identifier that ensures no tuple is duplicated; a duplicate would turn the relation into something else, namely a bag, violating the basic definition of a set. A key can be composite, that is, composed of several attributes. Below is a tabular depiction of a relation of our example Customer relvar; a relation can be thought of as a value that can be assigned to a relvar.

Set Theory Formulation

Basic notions in the relational model are relation names and attribute names. We will represent these as strings such as "Person" and "name" and we will usually use the variables r, s, t, ... and a, b, c to range over them. Another basic notion is the set of atomic values that contains values such as numbers and strings.

Our first definition concerns the notion of tuple, which formalizes the notion of row or record in a table:

Def. A tuple is a partial function from attribute names to atomic values.
Def. A header is a finite set of attribute names.
Def. The projection of a tuple t on a finite set of attributes A is t[A] = { (a, v) : (a, v) ∈ t, a ∈ A }.
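These definitions map naturally onto Python dictionaries. The following is a minimal sketch, assuming a tuple is modelled as a dict from attribute names to values; the sample tuple and the function names header_of and project are illustrative, not part of the formal model.

```python
# A tuple as a finite mapping from attribute names to atomic values.
t = {"name": "Harry", "age": 34, "city": "Oslo"}  # hypothetical values


def header_of(t):
    """The domain of the tuple: the set of attribute names it is defined on."""
    return frozenset(t)


def project(t, attrs):
    """t[A]: keep only the pairs (a, v) of t with a in A."""
    return {a: v for a, v in t.items() if a in attrs}
```

Note that projecting on an attribute the tuple does not define simply yields no pair for it, which matches the set-builder form of the definition.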

The next definition introduces the relation, which formalizes the contents of a table as it is defined in the relational model.

Def. A relation is a pair (H, B) with H, the header, and B, the body, a set of tuples that all have the domain H.

Such a relation closely corresponds to what is usually called the extension of a predicate in first-order logic except that here we identify the places in the predicate with attribute names. Usually in the relational model a database schema is said to consist of a set of relation names, the headers that are associated with these names and the constraints that should hold for every instance of the database schema.

Def. A relation universe U over a header H is a non-empty set of relations with header H.
Def. A relation schema (H, C) consists of a header H and a predicate C(R) that is defined for all relations R with header H.
Def. A relation satisfies the relation schema (H, C) if it has header H and satisfies C.
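A relation and a relation schema can be sketched in Python as follows. This is only an illustration under stated assumptions: tuples are modelled as frozensets of (attribute, value) pairs so that they can be members of a set, and the Person header, its sample body, and the age constraint are hypothetical.

```python
def make_tuple(**pairs):
    """A hashable tuple: a frozenset of (attribute, value) pairs."""
    return frozenset(pairs.items())


def is_relation(header, body):
    """(H, B) is a relation if every tuple in B has exactly the domain H."""
    return all({a for a, _ in t} == set(header) for t in body)


def satisfies(relation, schema):
    """A relation satisfies (H, C) if it has header H and C holds for it."""
    (header, body), (schema_header, constraint) = relation, schema
    return set(header) == set(schema_header) and constraint((header, body))


# Hypothetical example: Person(name, age) with the constraint "age is non-negative".
H = frozenset({"name", "age"})
B = {make_tuple(name="Harry", age=34), make_tuple(name="Sally", age=28)}
person_schema = (H, lambda rel: all(dict(t)["age"] >= 0 for t in rel[1]))
```

Here satisfies((H, B), person_schema) is true, while a body containing a tuple with a negative age would fail the predicate C.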

Key constraints and functional dependencies

One of the simplest and most important types of relation constraints is the key constraint. It tells us that in every instance of a certain relational schema the tuples can be identified by their values for certain attributes.

Def. A superkey is written as a finite set of attribute names.
Def. A superkey K holds in a relation (H, B) if K ⊆ H and there are no two distinct tuples t1 and t2 in B such that t1[K] = t2[K].
Def. A superkey holds in a relation universe U over a header H if it holds in all relations in U.
Def. A superkey K holds as a candidate key for a relation universe U over H if it holds as a superkey for U and there is no proper subset of K that also holds as a superkey for U.
Def. A functional dependency (or FD for short) is written as X->Y with X and Y finite sets of attribute names.
Def. A functional dependency X->Y holds in a relation (H, B) if X and Y are subsets of H and for all tuples t1 and t2 in B it holds that if t1[X] = t2[X] then t1[Y] = t2[Y]
Def. A functional dependency X->Y holds in a relation universe U over a header H if it holds in all relations in U.
Def. A functional dependency is trivial under a header H if it holds in all relation universes over H.
Theorem A FD X->Y is trivial under a header H iff Y ⊆ X ⊆ H.
Theorem A superkey K holds in a relation universe U over H iff K ⊆ H and K->H holds in U.
Def. (Armstrong's rules) Let S be a set of FDs then the closure of S under a header H, written as S+, is the smallest superset of S such that:
(reflexivity) if Y ⊆ X ⊆ H then X->Y in S+
(transitivity) if X->Y in S+ and Y->Z in S+ then X->Z in S+
(augmentation) if X->Y in S+ and Z ⊆ H then XZ -> YZ in S+
Theorem Armstrong's rules are sound and complete, i.e., given a header H and a set S of FDs that only contain subsets of H then the FD X->Y is in S+ iff it holds in all relation universes over H in which all FDs in S hold.
Def. If X is a finite set of attributes and S a finite set of FDs then the completion of X under S, written as X+, is the smallest superset of X such that:
if Y->Z in S and Y ⊆ X+ then Z ⊆ X+
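
The definitions of a superkey and a functional dependency holding in a single relation (H, B) can be checked mechanically. The following Python sketch assumes tuples are dicts and the body is given as a list of such dicts; the sample header, body, and function names are hypothetical.

```python
def project(t, attrs):
    """t[A] as a hashable value: the (attribute, value) pairs of t with attribute in A."""
    return frozenset((a, v) for a, v in t.items() if a in attrs)


def superkey_holds(K, header, body):
    """K holds in (H, B) if K is a subset of H and no two distinct tuples agree on K."""
    if not set(K) <= set(header):
        return False
    distinct = {frozenset(t.items()) for t in body}  # B is a set: drop repeats
    return len({project(dict(t), K) for t in distinct}) == len(distinct)


def fd_holds(X, Y, header, body):
    """X->Y holds in (H, B) if X, Y are subsets of H and
    t1[X] = t2[X] implies t1[Y] = t2[Y] for all tuples in B."""
    if not (set(X) <= set(header) and set(Y) <= set(header)):
        return False
    seen = {}  # maps each X-projection to the Y-projection first seen with it
    for t in body:
        x, y = project(t, X), project(t, Y)
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical relation: zip determines city, and id identifies each tuple.
H = {"id", "zip", "city"}
B = [
    {"id": 1, "zip": "0150", "city": "Oslo"},
    {"id": 2, "zip": "0150", "city": "Oslo"},
    {"id": 3, "zip": "75001", "city": "Paris"},
]
```

In this sample, {"id"} holds as a superkey and {"id"} -> H holds as a functional dependency, in line with the theorem relating the two notions. A check on one relation says nothing about a whole relation universe, which would require the property to hold in every relation of the universe.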

The completion of an attribute set can be used to compute if a certain dependency is in the closure of a set of FDs.
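
A minimal Python sketch of the completion X+ is given here, using a simple fixed-point loop; the representation of FDs as a list of (Y, Z) pairs of attribute sets and the function names are assumptions made for illustration. Membership of X->Y in the closure is then tested as Y ⊆ X+.

```python
def completion(X, fds):
    """X+: the smallest superset of X closed under the FDs in fds.

    fds is a list of (Y, Z) pairs of attribute sets, one pair per FD Y->Z.
    """
    result = set(X)
    changed = True
    while changed:  # repeat until no FD adds a new attribute
        changed = False
        for Y, Z in fds:
            if set(Y) <= result and not set(Z) <= result:
                result |= set(Z)
                changed = True
    return frozenset(result)


def implied(X, Y, fds):
    """True if X->Y follows from fds, tested as Y being a subset of X+."""
    return set(Y) <= completion(X, fds)


# Hypothetical FD set S = { a->b, b->c }.
S = [({"a"}, {"b"}), ({"b"}, {"c"})]
```

The loop terminates because each pass either adds at least one attribute or stops, and the number of attributes mentioned in the FDs is finite.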

Theorem Given a header H and a set S of FDs that only contain subsets of H it holds that X->Y is in S+ iff Y ⊆ X+.
Algorithm (deriving candidate keys from FDs)
      INPUT: a set S of FDs that contain only subsets of a header H
      OUTPUT: the set C of superkeys that hold as candidate keys in
              all relation universes over H in which all FDs in S hold
        C := ∅;          // found candidate keys
        Q := { H };      // superkeys that contain candidate keys
        while Q <> ∅ do
          let K be some element from Q;
          Q := Q - { K };  
          minimal := true;
          for each X->Y in S do 
            K' := (K - Y) ∪ X;   // derive new superkey
            if K' ⊂ K
              minimal := false;
              Q := Q ∪ { K' };
          if minimal and there is not a subset of K in C
            remove all supersets of K from C;
            C := C ∪ { K };
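
The pseudocode above can be transcribed into Python fairly directly. This is a sketch rather than a reference implementation: it assumes FDs are given as a list of (X, Y) pairs of attribute sets, reads the comparison of K' with K as a proper-subset test, and uses the hypothetical function name candidate_keys.

```python
def candidate_keys(header, fds):
    """Transcription of the pseudocode; fds is a list of (X, Y) pairs for X->Y."""
    found = set()                # C: candidate keys found so far
    queue = {frozenset(header)}  # Q: superkeys that contain candidate keys
    while queue:
        K = queue.pop()          # take some element from Q and remove it
        minimal = True
        for X, Y in fds:
            K_new = (K - frozenset(Y)) | frozenset(X)  # derive new superkey
            if K_new < K:        # proper subset of K, so K is not minimal
                minimal = False
                queue.add(K_new)
        if minimal and not any(c <= K for c in found):
            found = {c for c in found if not (c >= K)}  # drop supersets of K
            found.add(K)
    return found
```

For example, with header {a, b, c} and the FDs a->b and b->c, the call returns the single key {a}; with ab->c and c->a it returns the two keys {a, b} and {b, c}. The loop terminates because every superkey added to the queue is strictly smaller than the one it was derived from.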
Def. Given a header H and a set of FDs S that only contain subsets of H an irreducible cover of S is a set T of FDs such that
  1. S+ = T+
  2. there is no proper subset U of T such that S+ = U+,
  3. if X->Y in T then Y is a singleton set and
  4. if X->Y in T and Z a proper subset of X then Z->Y is not in S+.

See also


  • Codd, E. F. (1970). "A relational model of data for large shared data banks". Communications of the ACM, Vol. 13, No. 6, pp. 377-387. Retrieved Sept. 4, 2004.
  • Date, Christopher J. (2003). An Introduction to Database Systems. 8th ed.
