LongLivedReadTransactions

The following article applies to version 2.0+

Transactions are great. Except when they're not.

From an abstract perspective, it's easy to think in terms of transactions. We execute a read-write transaction and make some changes to the database. And after the transaction is complete we know that future transactions will see the changes. But the problem is, we don't always think in terms of transactions. This is especially true on the main thread. Consider the following code:

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath
{
    __block id onSaleItem = nil;
    [databaseConnection readWithBlock:^(YapDatabaseReadTransaction *transaction){
        onSaleItem = [[transaction ext:@"view"] objectAtIndex:indexPath.row inGroup:@"sales"];
    }];
    
    // configure and return cell...
}

At first glance, this code looks correct. In fact, this is the natural and recommended way to write this code. But what about in terms of transactions? What happens if we execute a read-write transaction on a background thread and remove a bunch of sales items? And meanwhile the main thread is chugging away, populating the tableView or scrolling the tableView, and invoking the above dataSource method?

The answer is that things are going to get out-of-sync. At least temporarily. Views such as table views and collection views require a stable data source. The underlying data needs to remain in a consistent state until the main thread is able to process a notification (for example) and update the view and its underlying data in a synchronized fashion.

This is where long-lived read-only transactions come in.

A read-only transaction represents an immutable snapshot-in-time of the database. But the traditional block-based architecture limits the duration of the snapshot. Long-lived transactions allow you to "bookmark" a snapshot of the database, and ensure that all future read-only transactions use the previously "bookmarked" snapshot. Furthermore, the long-lived architecture allows you to move your "bookmarked" snapshot forward in time in a single atomic operation.

The architecture was designed to make it easy to use YapDatabase without having to worry about asynchronous read-write issues. Your main thread can move from one steady-state to another. And the code-block above will work just fine, without worrying about transaction issues.

Hello World

Step One is to begin a long-lived transaction:

- (void)viewDidLoad
{
    [databaseConnection beginLongLivedReadTransaction];

    // ...
}

After you invoke beginLongLivedReadTransaction, future invocations of readWithBlock or asyncReadWithBlock on that connection will use the snapshot that was "bookmarked" at the point in time at which you called beginLongLivedReadTransaction.
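The bookmarking behavior can be pictured with a small model. The sketch below is plain C (which compiles unchanged in an Objective-C file) and does not use YapDatabase itself. The names Database, Connection, begin_long_lived_read, and visible_commit are hypothetical, invented for illustration, and the whole database is reduced to a commit counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT the YapDatabase API: the database is just a commit counter. */
typedef struct {
    int latest_commit;        /* number of the most recent read-write commit */
} Database;

typedef struct {
    const Database *db;
    bool long_lived;          /* true once a snapshot has been bookmarked */
    int bookmarked_commit;    /* meaningful only while long_lived is true */
} Connection;

/* Bookmark the database's current state for all future reads on this
 * connection. Calling it again moves the bookmark forward in one step. */
static void begin_long_lived_read(Connection *conn)
{
    assert(conn != NULL && conn->db != NULL);
    conn->bookmarked_commit = conn->db->latest_commit;
    conn->long_lived = true;
}

/* Which commit does a read on this connection observe? */
static int visible_commit(const Connection *conn)
{
    assert(conn != NULL && conn->db != NULL);
    return conn->long_lived ? conn->bookmarked_commit
                            : conn->db->latest_commit;
}
```

In this model, a connection that bookmarks commit 12 keeps observing commit 12 even after latest_commit moves on to 14. Only a second call to begin_long_lived_read moves it forward.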

Now, obviously, you don't want to stay on the same snapshot forever. When the database changes, you want to update your view.

Step Two is to listen for YapDatabaseModifiedNotification:

- (void)viewDidLoad
{
    [databaseConnection beginLongLivedReadTransaction];

    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(yapDatabaseModified:)
                                                 name:YapDatabaseModifiedNotification
                                               object:database];
}

- (void)yapDatabaseModified:(NSNotification *)notification
{
    [databaseConnection beginLongLivedReadTransaction]; // End & Re-Begin the long-lived transaction atomically
    [self updateView];
}

Handling Multiple Modifications

YapDatabase is fast. In fact, it can often complete multiple read-write transactions in the time it takes your main thread to update its view once...

Which raises the question: What if multiple read-write transactions occur before my main thread invokes beginLongLivedReadTransaction?

The answer is that you may jump multiple commits. For example, if you were previously on snapshot 12, you may end up jumping to snapshot 14 (which is 2 read-write commits later).

In fact, you may also jump zero commits. For example, suppose the main thread was busy scrolling, and multiple YapDatabaseModifiedNotifications got queued up on the main thread. The first time yapDatabaseModified is hit, you jump 2 commits. The second time yapDatabaseModified is hit, you jump zero commits (because the first call already moved you past them).

But no need to freak out! There's a clear way to handle it:

- (void)yapDatabaseModified:(NSNotification *)notification
{
    // End & Re-Begin the long-lived transaction atomically.
    // Also grab all the notifications for all the commits that I jump.
    NSArray *notifications = [databaseConnection beginLongLivedReadTransaction];
    if ([notifications count] > 0) {
        [self updateView];
    }
}
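To see why the count check matters, here is the "jump 2, then jump 0" scenario as a toy model in plain C. It is only a sketch: advance_snapshot and handle_modified_notification are hypothetical names, commits are plain integers, and the returned jump count stands in for the count of the notifications array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT the YapDatabase API. Moves the bookmarked snapshot forward
 * to the latest commit and returns how many commits were jumped. */
static int advance_snapshot(int *bookmarked_commit, int latest_commit)
{
    assert(bookmarked_commit != NULL);
    assert(latest_commit >= *bookmarked_commit);  /* snapshots never go back */

    int jumped = latest_commit - *bookmarked_commit;
    *bookmarked_commit = latest_commit;
    return jumped;
}

/* Called once per queued notification. Returns true when the view needs
 * updating, i.e. when at least one commit was actually jumped. */
static bool handle_modified_notification(int *bookmarked_commit,
                                         int latest_commit)
{
    return advance_snapshot(bookmarked_commit, latest_commit) > 0;
}
```

Starting on snapshot 12 with the database at commit 14 and two notifications queued, the first call jumps 2 commits and returns true. The second call jumps 0 commits and returns false, so the view is not updated a second time for no reason.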

Did the change affect my view?

YapDatabaseModifiedNotification has all the juicy details on what changed. And there are various methods you can use to inspect the notification to see if anything related to your view actually changed. Thus you can be more selective about potentially expensive view updates.

- (void)yapDatabaseModified:(NSNotification *)notification
{
    // End & Re-Begin the long-lived transaction atomically.
    // Also grab all the notifications for all the commits that I jump.
    NSArray *notifications = [databaseConnection beginLongLivedReadTransaction];

    // Update views if needed
    if ([databaseConnection hasChangeForKey:itemId inNotifications:notifications]) {
        [self updateItemView];
    }
    if ([databaseConnection hasChangeForKey:cartId inNotifications:notifications]) {
        [self updateShoppingCartImage];
    }
}
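Conceptually, a check like this has to look across every commit that was jumped, not just the latest one. The plain C sketch below models that idea only. It says nothing about how YapDatabase implements it; Commit and has_change_for_key are hypothetical names, and each commit is reduced to a list of changed keys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model, NOT the YapDatabase API: a commit is the list of keys it changed. */
typedef struct {
    const char *const *changed_keys;
    size_t key_count;
} Commit;

/* Returns true if any of the jumped commits changed the given key. */
static bool has_change_for_key(const char *key,
                               const Commit *commits, size_t commit_count)
{
    assert(key != NULL);
    assert(commits != NULL || commit_count == 0);

    for (size_t i = 0; i < commit_count; i++) {
        for (size_t j = 0; j < commits[i].key_count; j++) {
            if (strcmp(commits[i].changed_keys[j], key) == 0) {
                return true;
            }
        }
    }
    return false;
}
```

With zero jumped commits the function returns false for every key, which matches the "jump zero commits" case: nothing changed, so no view needs updating.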

Important Warning

It is absolutely critical that you listen for YapDatabaseModifiedNotification, and properly move your long-lived transactions forward. Failure to do so could significantly slow down the database.

In order to provide things like concurrency, snapshots, and long-lived transactions, the database uses SQLite's write-ahead log (WAL) mechanism. Here's a really high-level overview of how this works:

- There is the database file, and a separate write-ahead log (WAL)
- The WAL contains commits that have yet to be synced into the database file
- As the minimum transaction (the oldest snapshot still in use by any connection) moves forward in time, old commits are moved from the WAL to the database
- If the WAL gets too big, the performance of the database begins to suffer
- Typically this won't ever happen, unless you start a long-lived read transaction and never move it forward!

Think of each read-write commit as having a number, where the number is incremented by 1 each time. So the database file might reflect every commit up to, say, number 12. And each commit after number 12 is in the WAL, like so:

Database: {12}
WAL     : [13, 14]
            ^   ^
   connection1  ^
                ^
       connection2

In the picture above, connection1 is reading commit #13, and connection2 is on the latest (commit #14). At this point we can move commit #13 into the database file. Why? Because every single connection is at or past commit #13. However, we cannot move commit #14 into the database file. Why? Because connection1 doesn't know about it yet. It's still on commit #13.

In most situations this is perfectly fine. Eventually connection1 will move forward, and all commits in the WAL will get moved into the database. And then the WAL file will get reset (become empty).

But if you use a long-lived read transaction, and don't bother to move your transaction forward, then you end up with a situation like this:

Danger, Will Robinson!

Database: {12}
WAL     : [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
            ^                                               ^
   connection1                                              ^
                                                            ^
                                                   connection2

As your WAL file continues to grow, performance begins to degrade. If left unchecked over a large number of commits, the SQLite database may become sluggish. What may be even worse is the startup time for a database that encounters a giant WAL file.
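The rule behind both diagrams fits in a few lines. The plain C sketch below follows the simplified description in this article rather than SQLite's actual checkpointing code. The names checkpoint_limit and wal_backlog are hypothetical, and each connection is reduced to the commit number it is currently reading.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT SQLite's implementation. Returns the highest commit number
 * that may be moved from the WAL into the database file: the oldest commit
 * still being read by any connection. */
static int checkpoint_limit(const int *connection_commits, size_t count,
                            int latest_commit)
{
    assert(connection_commits != NULL || count == 0);

    int limit = latest_commit;  /* no readers: everything can be moved */
    for (size_t i = 0; i < count; i++) {
        if (connection_commits[i] < limit) {
            limit = connection_commits[i];
        }
    }
    return limit;
}

/* Number of commits that must stay in the WAL after moving what we can. */
static int wal_backlog(const int *connection_commits, size_t count,
                       int latest_commit)
{
    return latest_commit - checkpoint_limit(connection_commits, count,
                                            latest_commit);
}
```

For the first diagram (connections on 13 and 14, latest commit 14) the limit is 13 and one commit stays in the WAL. For the second (connections on 13 and 25, latest commit 25) the limit is still 13, and 12 commits are stuck in the WAL until connection1 moves forward.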

The solution is quite simple. Just listen for the proper notification and move your long-lived transactions forward in time (as demonstrated above).
