relation algebra

Overview of the relational algebra

  • It provides a series of operations including Union, Intersection, Difference, Cartesian Product, Project, Select, Join and Division based on sets.
  • The opertors take one or more relations as inputs and give a new relation as a result.
  • Procedural language
    For example : $prod_{Sno,Cname}(sigma_{Cno = “002”}(Studentbowtie SC))$

  • An abstract language, and is the foundation to learn SQL.

Traditional Relational Algebra Operations [only involve row]

Operation Relation R Relation S Notation
UNION $R$ $S$ $R cup S$
INTERSECTION $R$ $S$ $R cap S$
DIFFFERENCE $R$ $S$ $R-S$
Cartesian PRODUCT $R$ $S$ $R times S$

Special Relational Algebra Operations [involve row and column]

Operation Relation R Relation S Notation
PROJECT $R$ $pi_{A}(R)$
SELECT $R$ $sigma_{Con}(R)$
JOIN $R$ $S$ $R mathop{Join}limits_{A theta B} S$
Cartesian PRODUCT $R$ $S$ $R div S$

Compatibility for the relational algebra

  • The relations involved in Union, Intersection and Difference must be compatible(相容的) to ensure above relational algebra operations to be valid
  • Relation $R$ and relation $S$ are compatible when
    $[1]$ $R$,$S$ must have the same arity(同元的, same number of attributes)
    $[2]$ The attribute domains must be compatible (e.g., the $2^{nd}$ column of $S$.)

    For relation $R(A_1,A_2,…,A_n)$ and relation $S(B_1,B_2,…,B_m)$,
    IF $R$ and $S$ are compatible,
    Then $n = m$ and Domain$(A_i)$ = Domain$(B_i)$, $i=1,2,…,n$

Relational algebra (1) : Union

  • Notation: $R cup S$
  • Defined as: $R cap S$ = $lbrace t | t in R vee t in S rbrace$
  • For $R cup S$ to be valid, $R$ and $S$ should be compatible
  • $R cup S $ = $S cup R$

Relational algebra (2) : Difference

  • Notation: $R - S$
  • Defined as: $R-S = lbrace t | t in R land t notin S rbrace$
  • For $R-S$ to be valid, R and S should be compatible
  • $R - S ne S - R$

Relational algebra (3) : Intersection

  • Notation: $R cap S$
  • Defined as: $R cap S$ = $lbrace t | t in R land t in S rbrace$
  • For $R cap S$ to be valid, R and S should be compatible
  • $R cap S$ = $S cap R$
  • $R cap S$ = $R - (R-S)$ = $S - (S - R)$

Relational algebra (4) : Cartesian Product

  • Notation: $R times S$
  • Defined as: $R times S$ = $lbrace t,q | t in R land q in S rbrace$
  • $R times S$ = $S times R$
  • If the defree of $R$ is $n$, and the degree of $S$ is $m$, then the degree of $R times S$ is $n + m$
  • If the cardinality of $R$ is $n$, and the cardinality of $S$ is $m$, then the cardinality of $R times S$ is $n times m$.

Relational algebra (5) : Select

  • Notation: $sigma_p(R)$
  • $p$ is called the selection predicate (选择谓词)
  • Defined as: $sigma_p(R)$ = $lbrace t | t in R land p(t) rbrace$
    where $p$ is a formula in propositional calculus(命题演算) consisting of terms connected by: $land (and), lor (or),lnot (not)$
    Each term is one of: <$attribute$> $op$ <$attribute$> or <$constant$>
    where $op$ is obne of: $=,ne,gt,ge,lt,le$

Usually, there are many operators in the selection predicate $p$, and the priority sequence is as following:

  • () [Parentheses]
  • $=,ne,gt,ge,lt,le$
  • $lnot$
  • $land$
  • $lor$

Relational algebra (6) : Project

  • Notation : $prod _{A_1,A_2,…,A_k}(R)$
    where $A_1$,$A_2$,…,$A_k$ are attribute names and $R$ is a relation name.
  • The result is defined as the relation of $k$ columns obtained by erasing the columns that are not listed
  • Duplicate rows removed from result, since relations are sets

Ralational algebra (7) : Join

$theta$-Join

  • Notation: $R mathop{Join}limits_{A theta B} S$
  • Defined as: $Rmathop{Join}limits_{A theta B} S = sigma _{t[A] theta s[B]}(R times S)$
    • $R(A_1,A_2,…,A_n)$, $A in lbrace A_1,A_2,…A_n rbrace$
    • $S(B_1,B_2,…,B_m)$, $B in lbrace B_1,B_2,…B_m rbrace$
    • $t in R$, $s in S$
    • $A$ and $B$ are compatible
    • $theta in lbrace gt,ge,lt,le,=,<> rbrace$
  • $theta$-Join usually used with Select and Project together.

Rename

  • Notation: $rho$
  • Rename a relation to another with a different name
  • Duplicate a relation and give a new name
  • $R_1 rightarrow R_2$, and only the relation names are different for R_1 and R_2

Query: Select all course No.s which both “2015030101” and “2015040101” from relation SC have taken.

Answer: $pi_{SC.Cno = “2015030101” lor SC1.Sno = “2015040101”}(SC mathop{Join}limits_{SC.Cno = SC1.Cno} rho_{SC1}(SC))$

Equal-Join

Natural-Join

  • Notation: $R Join S$
  • Defined as: $R Join S = sigma_{t[B]=s[b]}(Rtimes S)$
    • $R$ and $S$ have one same attribute or a group of same attributes
    • Duplicated columns should be deleted in the result relation
  • Equal-Join is a special case if $theta $-Join

Outer-Join

  • An etension of the join operation that avoids loss of information
  • Computes the join and then adds tuples from one relation that does not match tuples in the other relation to the result of the join
  • User null values: null signifies that the value is unknown or does not exist
  • Notation:
    • Left-Join: ⟕
    • Right-Join: ⟖
    • Full-Join: ⟗

Ralational algebra (8) : Divisiojn

  • Notation: $R div S$
  • $R = (A_1,…,A_m,B_1,…,B_n)$
  • $S = (B_1,…,B_n)$
  • $S subseteq R$ Each attribute of schema $S$ is also in schema $R$
  • The result of $R div S$ is a relation schema, and containing all attributes of $R$ that are not in $S$.
    $R - S = (A_1,…,A_m)$
  • Suited to queries that include the phrase “for all”
  • $R div S = lbrace t | t in prod _{R-S}(R) land forall u in S(tu in R)rbrace$