5. Database

Maruf Bepary edited this page May 9, 2023 · 2 revisions

Non-Relational Database

A non-relational database is a database that does not use the tabular schema of rows and columns found in most traditional database systems. Instead, non-relational databases use a storage model that is optimized for the specific requirements of the type of data being stored. Non-relational databases are sometimes referred to as “NoSQL”. They are often used when large quantities of complex and diverse data need to be organized, or when data is frequently changed or updated. In document databases, the type used in this project, the data is stored in a structure similar to a file system.

This project uses a non-relational database provided by Firebase called Firestore. The rationale for using Firestore was discussed in Non-Relational Database using on Final System. Firestore is a document database, meaning that it is structured similarly to a file system. A collection in Firestore is similar to a folder and can store many related documents; for example, a "users" collection can store many user objects. The objects stored in a collection are called "documents". Documents can also contain sub-collections, which were used in this project to capture relations that carry metadata. Since Firestore is a non-relational database, there was no normalization involved in the design.
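
The folder analogy can be made concrete with a small sketch. Firestore paths alternate between collection IDs and document IDs, so the number of segments tells you what a path points at. The helper name `pathKind` and the document ID `alice` are hypothetical and used only for illustration; this is not part of the project's code.

```typescript
// Paths alternate collection / document / collection / ...
//   users                          -> collection
//   users/alice                    -> document
//   users/alice/communitySnippets  -> sub-collection
type PathKind = "collection" | "document";

function pathKind(path: string): PathKind {
  const segments = path.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) {
    throw new Error("path must contain at least one segment");
  }
  // An odd number of segments ends on a collection, an even number on a document.
  return segments.length % 2 === 1 ? "collection" : "document";
}

console.log(pathKind("users"));                         // collection
console.log(pathKind("users/alice"));                   // document
console.log(pathKind("users/alice/communitySnippets")); // collection
```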

Advantages

Non-relational databases have many advantages. One advantage is that they are easier to scale horizontally because of the way the data is stored: as mentioned before, the data is stored in a similar way to files in a file system (folders and documents), which makes it much easier to split into multiple smaller databases and store them on different servers. Because they make minimal use of relations between objects, reads rarely need joins, which often makes them faster than relational databases for large amounts of data. The lack of normalisation also makes them more flexible, allowing the schema to change. Non-relational databases are also sometimes used as a cache for relational databases when the data is highly relational but there is a lot of data to process.

Disadvantages

There are some disadvantages to using non-relational databases. In many of them, support for ACID (atomicity, consistency, isolation, durability) transactions across multiple documents or collections is limited or absent, which makes it harder to update related data and can lead to failures caused by inconsistent data. Because relations between objects are not enforced by the database, querying non-relational databases is also more difficult, especially in the case of complex queries. Unlike relational databases, which all work in a similar way, different non-relational databases work in different ways, making the ecosystem more fragmented. For example, Firestore and MongoDB do not function the same way even though the base concept is the same.

Use in the Project

As mentioned before, the current system uses a non-relational database provided by Firebase. Because the objects modelled in the project are highly relational, querying data was difficult, although these complex queries were not carried out often. In return, the schema was much more flexible, which was vital for this project: the requirements were changing constantly as the project went on, so the database was also being extended often. With a relational database, this would have required repeated normalisation. This flexibility also decreased development time, facilitating Agile development as discussed in the Methodology section. As mentioned before, this type of database is also more scalable, meaning that it would be much easier to reach a wider audience at lower cost.

Entities

User object in the users collection

The user entity in the system represents the users who interact with the website. Although there are other fields that exist for the user, they are not relevant for the purpose of this description. The user performs various actions within the system, such as creating and deleting posts and creating and deleting comments. This entity is the central piece for the user's interaction with the website.

  • uid: identifies the user
  • email: email for the user
  • disabled: whether the account is banned
  • displayName: name of the user
  • emailVerified: whether the email was verified
  • passwordHash: hashed password of the user
  • providerData: metadata provided if the user signed up using a third-party provider

Communities object in the communities collection

The communities entity in the system represents the communities that a user can interact with. Each community has a unique name, which serves as its identifier. Each user can subscribe to multiple communities, and communities can have many subscribers. The numberOfMembers field captures the number of subscribers in a community. In a relational database, this would be a derived field, calculated by counting the number of users that are subscribed, but in this case, it is a stored field. This means that the data must be manually updated each time a user subscribes or unsubscribes, but it also means that there is no need for extra computation if only the number of members in a community is needed. The communities a user has subscribed to are stored under that user's document in the "users" collection. The privacyType field stores the type of community, with options for "public" (where any user can view and post), "restricted" (where any user can view posts but only subscribers can post), and "private" (where only subscribers can view and post).

  • createdAt: when the community was created
  • creatorId: user who created the community
  • imageURL: URL (from Firebase Storage) for the community logo
  • numberOfMembers: number of users subscribed to the community
  • privacyType: whether the community is ‘public’, ‘private’ or ‘restricted’
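
The three privacy types translate into two simple permission checks. The following is a minimal sketch of those rules in TypeScript, assuming the caller already knows whether the user is subscribed. The `id` field, the helper names `canView` and `canPost`, and the sample values are illustrative assumptions rather than the project's actual code.

```typescript
type PrivacyType = "public" | "restricted" | "private";

interface Community {
  id: string; // assumed: the unique community name
  createdAt: Date;
  creatorId: string;
  imageURL?: string;
  numberOfMembers: number;
  privacyType: PrivacyType;
}

// Anyone can view public and restricted communities; private ones need a subscription.
function canView(community: Community, isSubscriber: boolean): boolean {
  return community.privacyType !== "private" || isSubscriber;
}

// Anyone can post in public communities; restricted and private ones need a subscription.
function canPost(community: Community, isSubscriber: boolean): boolean {
  return community.privacyType === "public" || isSubscriber;
}

const example: Community = {
  id: "typescript",
  createdAt: new Date(),
  creatorId: "user-1",
  numberOfMembers: 1,
  privacyType: "restricted",
};

console.log(canView(example, false)); // true
console.log(canPost(example, false)); // false
```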

Posts object in the posts collection

The posts entity in the system represents the content created by users within the communities. Users can create posts in communities, and other users can view them. Each post belongs to a single community and is created by one user. Posts can be commented on, resulting in many comments for a single post. The number of comments for a post is stored directly for the same reason as storing the number of subscribers in a community. The comments themselves are stored in the "comments" collection.

A post has a title, a body storing the contents of the post, and an optional image, which is stored in Firebase Storage. Posts can also be voted on by other users, with the overall vote status captured by the "voteStatus" field. The overall vote status is stored for the same reason as storing the number of subscribers in a community. When a post is created, its vote status is initialized to 0, as no user has liked or disliked the post. When a user votes on a post, the vote status is incremented if the user liked the post and decremented if the user disliked it. The votes cast by each user are stored in a sub-collection of that user's document in the "users" collection.

  • uid: uniquely identifies a post
  • title: stores the title of the post
  • body: stores the extra description/content of the post
  • imageURL: optional image that can be posted
  • creatorId: stores the unique identifier of the user who created the post
  • creatorUsername: stores the username of the creator of the post for quick access
  • createTime: when the post was created
  • voteStatus: overall vote of the post

Comments object in the comments collection

The comments entity in the system represents the responses to the posts created by users. Users can comment on posts, which can be viewed by other users. Each comment belongs to a single post, and a post can have many comments on it. This entity captures the conversation and discussions related to the posts within the communities.

  • id: unique identifier of the comment
  • postId: identifier to the post the comment belongs to
  • postTitle: title of the post the comment belongs to for quick access
  • creatorId: identifier of the user who created the comment
  • creatorDisplayText: username of the user who created the comment for quick access
  • text: comment text itself
  • createdAt: time when the comment was created
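
The "quick access" fields are copies of data that lives elsewhere, made when the comment is written. As a rough sketch of that denormalisation, the function below builds a comment from a post and an author and bumps the stored comment count on the post. It works on plain objects rather than Firestore; the names `PostSummary`, `Author`, `addComment` and the counter field `numberOfComments` are assumptions, since the page does not name the field that stores the count.

```typescript
interface PostSummary {
  uid: string;
  title: string;
  numberOfComments: number; // assumed field name for the stored comment count
}

interface Author {
  uid: string;
  displayName: string;
}

interface Comment {
  id: string;
  postId: string;
  postTitle: string;          // copied from the post for quick access
  creatorId: string;
  creatorDisplayText: string; // copied from the user for quick access
  text: string;
  createdAt: Date;
}

function addComment(
  post: PostSummary,
  author: Author,
  id: string,
  text: string,
  now: Date = new Date()
): { post: PostSummary; comment: Comment } {
  const comment: Comment = {
    id,
    postId: post.uid,
    postTitle: post.title,
    creatorId: author.uid,
    creatorDisplayText: author.displayName,
    text,
    createdAt: now,
  };
  // The count is stored, not derived, so it has to be updated with every new comment.
  return { post: { ...post, numberOfComments: post.numberOfComments + 1 }, comment };
}
```

The trade-off is the one described for numberOfMembers: reads are cheap, but every write has to keep the copies and the counter up to date.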

Relations

communitySnippets sub-collection in the users collection

The relation between a user and a community is represented in the database. As mentioned earlier, a user can subscribe to multiple communities, and a community can have many users subscribed to it. The user objects are stored in the "users" collection, and each of these objects contains a "communitySnippets" sub-collection that represents the relations between the user and the communities. The community snippet objects are stored in the following format: "users/userId/communitySnippets/communitySnippetObject". This means that a user can have many community snippet objects, one for each community they are subscribed to.

There is some data repetition in this structure for data that is frequently used, such as the relation between a user and a community. If a user is the creator of a community, they are considered the admin. This information is stored in the community snippet object, allowing for quick access to the information without the need for complex computations or queries.

  • communityId: identifier of the community the user is subscribed to
  • imageURL: logo of the community for quick access
  • isAdmin: specifying if the user is the admin of this community
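
Subscribing therefore touches two places: a snippet is added under the user, and the stored numberOfMembers on the community changes. The sketch below models this with an in-memory `Map` standing in for one user's communitySnippets sub-collection. The function names, the `id` field on the community, and the choice to ignore a repeated subscription are assumptions made for illustration.

```typescript
interface Community {
  id: string; // assumed: the unique community name
  creatorId: string;
  imageURL?: string;
  numberOfMembers: number;
}

interface CommunitySnippet {
  communityId: string;
  imageURL?: string; // copied from the community for quick access
  isAdmin: boolean;
}

// Stand-in for one user's communitySnippets sub-collection, keyed by communityId.
type SnippetStore = Map<string, CommunitySnippet>;

function joinCommunity(snippets: SnippetStore, community: Community, uid: string): Community {
  if (snippets.has(community.id)) {
    return community; // already subscribed: nothing to write, counter unchanged
  }
  const snippet: CommunitySnippet = {
    communityId: community.id,
    isAdmin: community.creatorId === uid, // the creator is treated as the admin
  };
  if (community.imageURL !== undefined) {
    snippet.imageURL = community.imageURL;
  }
  snippets.set(community.id, snippet);
  return { ...community, numberOfMembers: community.numberOfMembers + 1 };
}

function leaveCommunity(snippets: SnippetStore, community: Community): Community {
  if (!snippets.delete(community.id)) {
    return community; // was not subscribed
  }
  return { ...community, numberOfMembers: community.numberOfMembers - 1 };
}
```

Against a real database the two writes would need to be applied together, so that the counter cannot drift away from the snippets.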

postVotes sub-collection in the users collection

As mentioned earlier, a user can vote on posts in the system. This means that multiple users can vote on multiple posts. A user's votes are stored in the "postVotes" sub-collection within that user's document in the "users" collection, representing the list of all votes the user has made on posts. Each vote records whether the post was liked or disliked in its "voteValue" field, and these values are used to calculate the overall vote status, which is stored in the post object. This structure allows for easy tracking of the votes made by each user and the overall vote status of each post.

  • id: unique identifier for the vote
  • communityId: community to which the voted-on post belongs
  • postId: identifier of the post being voted on
  • voteValue: whether the post was liked (+1) or disliked (-1)
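
The interaction between voteValue and the post's stored voteStatus can be sketched as follows. A `Map` keyed by postId stands in for one user's postVotes sub-collection. How a changed vote is handled is not specified above, so this sketch assumes the previous vote is replaced and only the difference is applied; `VotablePost` and `castVote` are hypothetical names.

```typescript
type VoteValue = 1 | -1;

interface VotablePost {
  uid: string;
  voteStatus: number; // starts at 0 when the post is created
}

// Stand-in for one user's postVotes sub-collection, keyed by postId.
type VoteStore = Map<string, VoteValue>;

function castVote(post: VotablePost, votes: VoteStore, next: VoteValue): VotablePost {
  const previous = votes.get(post.uid) ?? 0; // 0 means the user has not voted yet
  votes.set(post.uid, next);
  // Applying only the difference keeps voteStatus equal to the sum of all voteValues.
  return { ...post, voteStatus: post.voteStatus + (next - previous) };
}

const votes: VoteStore = new Map();
const created: VotablePost = { uid: "post-1", voteStatus: 0 };

const liked = castVote(created, votes, 1);
console.log(liked.voteStatus); // 1

const disliked = castVote(liked, votes, -1);
console.log(disliked.voteStatus); // -1
```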

These are the non-relational database (Firestore) components for the discussion platform. The structure allows for efficient data access and minimizes the need for complex computations or queries. By using a non-relational database like Firestore, the system can easily scale and adapt to changing requirements, providing a flexible and performant solution for managing data in a discussion platform.