-
Notifications
You must be signed in to change notification settings - Fork 5
Indexes Redevelopment
Example Content schema definition:
var Content = db.define('Content', {
type: String,
title: String,
text: String,
excerpt: String,
previewImage: String,
attachment: db.JSON,
poll: db.JSON,
location: db.JSON,
comments: { type: [], fulltext: true },
repliesTotal: Number,
likes: [],
dislikes: [],
likesTotal: Number,
views: Number,
score: Number,
groupId: { type: Number, index: true },
replyToId: { type: Number, index: true },
tags: [],
tagString: String,
authorId: { type: Number, index: true },
privacy: { type: String, index: 'unsorted' },
createdAt: { type: Date, defaultSort: 'desc' },
updatedAt: Date,
timestamp: Number,
priority: Number,
url: { type: String, index: 'unique' },
fulltext: String,
importData: db.JSON
});
db.adapter.defineTagsIndex('Content', 'tags');
There are a few changes to the existing schema definition:
The defaultSort property defines that this property is used to define the default sort order and whether it is ascending or descending. If this is not defined for a model we should default to ‘id desc’ - which is in most cases the same as ‘createdAt desc’. The defaultSort property defines the value that is used for the sorted indexes as the score value.
A unique
index should only ever have one value. Therefore we can use a string to define the value rather than a set or sorted set.
Some indexes do not need to be sorted. These can be indexes that are only used in conjunction with another index. E.g. privacy is only ever used alongside a groupId index or a tag index.
Tags can be used to define arbitrary indexes which are able to define their own sort property. An example set of tags for a content item might be as follows:
post.id = 1
post.title = 'hello, world';
post.groupId = 1;
post.privacy = 'public';
post.url = 'localhost:3000/hello-world';
post.createdAt = '01 jan 2001';
post.score = 5;
post.numberOfComments = 3;
post.tags = {
{ id: 1, index: 'group1-1', name: 'blog' },
{ id: 2, index: 'group1-2', name: 'popular', sort: 'score desc' },
{ id: 5, index: 'group1-5', name: 'most comments', sort: 'numberOfComments desc' }
};
post.save();
When the content item post
is save to the database, the standard indexes are first processed. Then the tags are iterated through and their index value added. In this example, the Redis commands would be:
#standard indexes
ZADD z:Content:groupId:1 01-jan-2001 1
SADD i:Content:privacy:public 1
SET u:Content:url:localhost:3000/hello-world 1
#tag indexes
ZADD z:Content:tags:group1-1 01-jan-2001 1
ZADD z:Content:tags:group1-2 5 1
ZADD z:Content:tags:group1-5 3 1
Querying via JugglingDB is as per usual but the Redis commands that are issued to the database may be different depending on index used. Here are a few examples: Complexity is calculated assuming that each tag or field index contains 1,000,000 records.
Note that because the defaultSort is descending, we use ZREVRANGE rather than ZRANGE
db.model(‘Content’).all({ where: { groupId; 1 }}, function(err, posts) { … }});
ZREVRANGE z:Content:groupId:1 0 10
Complexity = log(1000000) + 10 = 6 + 10 = 16
This will return the top 10 posts in order of score
var tagId = group.getTagId(‘popular’);
db.model(‘Content’).all({ where: { tags: tagId }}, function(err, posts) { … }});
ZREVRANGE z:Content:tags:group1-2 0 10
Complexity = log(1000000) + 10 = 6 + 10 = 16
This will return the top 50 public posts with the most comments
var tagId = group.getTagId(‘most comments’);
db.model(‘Content’).all({ where: { tags: tagId, privacy: ‘public’ }}, function(err, posts) { … }});
ZINTERSTORE temp123 z:Content:tags:group1-2 i:Content:privacy:public
EXPIRE temp123 7
ZREVRANGE temp123 0 50
Complexity = (1,000,000 * 2) + 333,333 * log(333,333) = 3,840,000
The last example is clearly inefficient and should be replaced with a tag called ‘most popular public’ which would reduce the complexity to a maximum of 16 ops.