Implement indexed tables as databases (#38)
* Add IndexedTable file

* Add singleton Indexed Table

* Implement unions of indexed tables

* Implement projection

* Implement selection on IndexedTables

* Implement aggregation in indexed table

* Define natural join for indexed tables

* Explain natural join for indexed tables
MatBon01 authored Apr 28, 2023
1 parent 19223cd commit aecbe7a
Showing 5 changed files with 91 additions and 4 deletions.
@@ -26,7 +26,8 @@ library
Data.Bag,
Data.PointedSet,
Data.Key,
-Database.Bag
+Database.Bag,
+Database.IndexedTable

-- Modules included in this library but not exported.
other-modules:
@@ -66,6 +67,7 @@ test-suite spec
Data.CMonoidSpec,
Data.PointedSetSpec,
Data.KeySpec,
-Database.BagSpec
+Database.BagSpec,
+Database.IndexedTableSpec
build-depends: base >=4.16.4.0, hspec ^>=2.10, a-deeper-dive-into-relational-algebra-by-way-of-adjunctions
build-tool-depends: hspec-discover:hspec-discover == 2.*
4 changes: 2 additions & 2 deletions report/background/relationalmodel.tex
@@ -199,7 +199,7 @@ \subsubsection{Joins}\label{sec:joins}
\caption{Relation \relation{S} as example for joins.}
\label{tab:joinRelationS}
\end{table}
-\paragraph{Natural join} The natural join is the first way to combine relations. Given that relations \relation{R} and \relation{S} have common attributes \attribute{a_1}, \ldots, \attribute{a_k}, tuples in \relation{R} and \relation{S} are combined if the component of all attributes are equal. This join is expressed as \natjoin{R}{S}.\cite{DatabaseSystems}
+\paragraph{Natural join}\label{sec:natjoin} The natural join is the first way to combine relations. Given that relations \relation{R} and \relation{S} have common attributes \attribute{a_1}, \ldots, \attribute{a_k}, tuples in \relation{R} and \relation{S} are combined if the component of all attributes are equal. This join is expressed as \natjoin{R}{S}.\cite{DatabaseSystems}
\subparagraph*{Example of the natural join} Given the relations \relation{R} and \relation{S} in \fref{tab:joinRelationR} and \fref{tab:joinRelationS} respectively, the natural join $\natjoin{R}{S}$ is as in \fref{tab:naturalJoinResult}.\cite{RelationalModel}
In this example we call the tuple \verb|(1, 2, 4)| a \emph{dangling tuple} as it failed to pair with any other tuple in relation \relation{S}.\cite{DatabaseSystems}
\begin{table}[h]
@@ -220,4 +220,4 @@ \subsubsection{Joins}\label{sec:joins}
\paragraph{Equijoin} The most important class of joins concerning this project, a specialisation of the theta-join. Equijoin is used when the operator of predicate $\theta$ between two attributes is an equality\footnote{So common that joins using operators other than $=$, such as $<$, are sometimes called \emph{nonequijoins}.\cite{JoinProcessing}}.\cite{JoinProcessing} An equijoin between relations \relation{R} and \relation{S} where we want to join the values of attributes \attribute{a} and \attribute{b} respectively is denoted \equijoin{R}{a}{S}{b}.
\todo{Write example for equijoin}
\subsubsection{Note on permutations}
-Permutations is another specialist operation in relational algebra, though not important to the scope of the project. For completion, despite the fact that relations are domain--unordered, their internal representation in computers is not and so permutation may be done for performance benefits despite no logical difference storing a relation and its permutations.\todo{Make sure I worded the performance benefits thing correctly}\cite{RelationalModel} Furthermore, permutation can be used (and is usually implied) to ensure that tuples with identical schemas differing only in ordering can have the normal set operations applied to them. \cite{DatabaseSystems}
+Permutations is another specialist operation in relational algebra, though not important to the scope of the project. For completion, despite the fact that relations are domain--unordered, their internal representation in computers is not and so permutation may be done for performance benefits despite no logical difference storing a relation and its permutations.\todo{Make sure I worded the performance benefits thing correctly}\cite{RelationalModel} Furthermore, permutation can be used (and is usually implied) to ensure that tuples with identical schemas differing only in ordering can have the normal set operations applied to them. \cite{DatabaseSystems}
9 changes: 9 additions & 0 deletions report/project/benchmark/implementation.tex
@@ -38,3 +38,12 @@ \subsection{Commutative Monoids}
outcome of the aggregation should not depend on the internal representation of
the bag as would happen given a non-commutative monoid.
\todo{Write implementation of CMonoid}

\subsection{Indexed Tables}
\paragraph{Natural Joins} In the implementation given, a natural join is defined
by \todo{Add code and mathematical description here} merging two indexed tables
and then applying the Cartesian product of bags, raised to act on every entry of
the merged table. This amounts to a local Cartesian product at each key indexed
by the table. In \fref{sec:natjoin} we define the natural join as the join that
pairs tuples agreeing on all common attributes, so the implementation is a
natural join provided that the common attributes form the key of the finite map.
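
The \todo in the paragraph above asks for a mathematical description. One possible formulation, offered as a sketch and not as the report's own text, treats an indexed table as a finite map from keys to bags and assumes that an absent key denotes the empty bag. The symbols $T_1$, $T_2$, $K$, $V$ and $W$ are introduced here for illustration only.

```latex
% Assumption: T_1 : K -> Bag(V) and T_2 : K -> Bag(W) are indexed tables,
% and T_i(k) is the empty bag whenever k is not a key of T_i.
% The symbol \times denotes the Cartesian product of bags (with multiplicity).
\[
  (T_1 \bowtie T_2)(k) \;=\; T_1(k) \times T_2(k)
  \qquad \mbox{for every } k \in K .
\]
```

Under this reading a key contributes tuples to the result only when both bags are non-empty, which is how dangling tuples drop out of the join.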
27 changes: 27 additions & 0 deletions src/Database/IndexedTable.hs
@@ -0,0 +1,27 @@
module Database.IndexedTable where

import qualified Data.Bag as Bag
import qualified Data.Key as Map
import Data.CMonoid

empty :: (Map.Key k) => Map.Map k (Bag.Bag v)
empty = Map.empty

singleton :: (Map.Key k) => (k, v) -> Map.Map k (Bag.Bag v)
singleton (k, v) = Map.single (k, Bag.single v)

union :: (Map.Key k) => Map.Map k (Bag.Bag v) -> Map.Map k (Bag.Bag v) -> Map.Map k (Bag.Bag v)
union t1 t2 = (fmap (uncurry Bag.union) . Map.merge) (t1, t2)

projection :: (Map.Key k) => (v -> w) -> Map.Map k (Bag.Bag v) -> Map.Map k (Bag.Bag w)
projection = fmap . fmap

selection :: (Map.Key k) => (v -> Bool) -> Map.Map k (Bag.Bag v) -> Map.Map k (Bag.Bag v)
selection p = fmap (Bag.filter p)

aggregation :: (Map.Key k, CMonoid m) => Map.Map k (Bag.Bag m) -> Map.Map k m
aggregation = fmap Bag.reduceBag

-- Joins on common keys
naturalJoin :: (Map.Key k) => Map.Map k (Bag.Bag v) -> Map.Map k (Bag.Bag w) -> Map.Map k (Bag.Bag (v, w))
naturalJoin t1 t2 = fmap (uncurry Bag.cp) (Map.merge (t1 , t2))
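
To make the idea of a join as a local Cartesian product concrete without depending on the repository's `Data.Key` and `Data.Bag` modules, the following is a minimal standalone sketch. It assumes `Data.Map.Strict` from the `containers` package as the finite map and plain lists as bags; the names `Table` and `cp` and the use of `M.intersectionWith` are illustrative choices, not the commit's implementation. Unlike a merge-based definition, this version drops keys that are missing from either table instead of keeping them with an empty bag.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for the repository's indexed table:
-- a finite map from keys to bags, with bags modelled as plain lists.
type Table k v = M.Map k [v]

-- Cartesian product of two bags (lists), preserving multiplicity.
cp :: [v] -> [w] -> [(v, w)]
cp xs ys = [(x, y) | x <- xs, y <- ys]

-- Natural join on the key: keep only keys present in both tables and
-- pair up the bags stored under each shared key.
naturalJoin :: Ord k => Table k v -> Table k w -> Table k (v, w)
naturalJoin = M.intersectionWith cp
```

For example, joining `M.fromList [(1, ["pen", "ink"]), (2, ["paper"])]` with `M.fromList [(1, ["Ada"]), (3, ["Bob"])]` keeps only key `1`, since keys `2` and `3` have no partner in the other table.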
49 changes: 49 additions & 0 deletions test/Database/IndexedTableSpec.hs
@@ -0,0 +1,49 @@
module Database.IndexedTableSpec (spec) where

import Test.Hspec
import qualified Database.IndexedTable as Table
import qualified Data.Key as Map
import qualified Data.Bag as Bag
import Data.Monoid

type Name = String
data Person = Person { firstName :: Name, lastName :: Name} deriving (Show, Eq)

people :: Map.Map () (Bag.Bag Person)
people = Map.Lone (Bag.Bag [Person "John" "Smith", Person "Jane" "Doe", Person "John" "Doe"])

spec :: Spec
spec = do
    describe "empty" $ do
        it "returns an empty map" $ do
            (Table.empty :: Map.Map () (Bag.Bag Int)) `shouldBe` (Map.empty :: Map.Map () (Bag.Bag Int))
    describe "singleton" $ do
        it "returns a single table" $ do
            Table.singleton ((), 3) `shouldBe` Map.Lone (Bag.Bag [3])
    describe "union" $ do
        it "can correctly handle union of singletons" $ do
            Table.union (Table.singleton ((), 3)) (Table.singleton ((), 4)) `shouldBe` Map.Lone (Bag.Bag [3, 4])
        it "can correctly deal with first element empty" $ do
            Table.union (Table.empty :: Map.Map () (Bag.Bag Char)) (Table.singleton ((), 'a')) `shouldBe` Map.Lone (Bag.Bag ['a'])
        it "can correctly deal with second element empty" $ do
            Table.union (Table.singleton ((), 'a')) (Table.empty :: Map.Map () (Bag.Bag Char)) `shouldBe` Map.Lone (Bag.Bag ['a'])
    describe "projection" $ do
        it "can correctly do a general projection" $ do
            Table.projection firstName people `shouldBe` Map.Lone (Bag.Bag ["John", "Jane", "John"])
        it "can correctly project on an empty map" $ do
            Table.projection lastName (Table.empty :: Map.Map () (Bag.Bag Person)) `shouldBe` Map.empty
        it "can correctly use the identity projection" $ do
            Table.projection id people `shouldBe` people
    describe "selection" $ do
        it "can correctly select in general" $ do
            Table.selection ((== "John") . firstName) people `shouldBe` Map.Lone (Bag.Bag [Person "John" "Smith", Person "John" "Doe"])
        it "can correctly select all elements of a table" $ do
            Table.selection (const True) people `shouldBe` people
        it "can correctly select no elements of a table" $ do
            Table.selection (const False) people `shouldBe` Map.empty
    describe "aggregation" $ do
        it "can correctly aggregate a table in general" $ do
            Table.aggregation (Map.Lone (Bag.Bag [Any True, Any True, Any False])) `shouldBe` Map.Lone (Any True)
    describe "natural join" $ do
        it "is a local cartesian product" $ do
            Table.naturalJoin (Map.Lone (Bag.Bag [1, 2])) (Map.Lone (Bag.Bag [2, 3])) `shouldBe` Map.Lone (Bag.Bag [(1, 2), (1, 3), (2, 2), (2, 3)])
