This repository has been archived by the owner on Nov 23, 2018. It is now read-only.

Add Coalescer manager to batch writes #2

Open
pkieltyka opened this issue Aug 6, 2014 · 3 comments

Comments

@pkieltyka
Contributor

Batched writes are way faster for some data stores, e.g. for boltdb: https://gist.github.com/benbjohnson/59b57e3772bfb7a65fbf

@andrewwatson

The idea being that you'd feed data into a channel instead of calling bolt directly, and something else would wake up every N milliseconds and write batches out to bolt?
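A minimal Go sketch of that loop, under stated assumptions: `Write`, `Coalesce`, and the `flush` callback are hypothetical names for illustration, not part of chainstore or bolt. Writes arrive on a channel, are buffered, and are handed to `flush` as one batch on every tick; whatever is left is flushed when the channel is closed.

```go
package main

import "time"

// Write is a hypothetical key/value pair queued for the backing store.
type Write struct {
	Key   string
	Value []byte
}

// Coalesce drains in, buffers the writes, and passes them to flush as a
// single batch on every tick of interval. When in is closed it flushes
// whatever is still buffered and returns.
func Coalesce(in <-chan Write, interval time.Duration, flush func([]Write)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	var batch []Write
	emit := func() {
		if len(batch) == 0 {
			return // nothing queued since the last flush
		}
		flush(batch)
		batch = nil // start a fresh slice so the flushed batch is not reused
	}

	for {
		select {
		case w, ok := <-in:
			if !ok {
				emit() // channel closed: write out the remainder
				return
			}
			batch = append(batch, w)
		case <-ticker.C:
			emit()
		}
	}
}
```

In practice `flush` would wrap all of the batched writes in a single transaction on the backing store, which is where the speedup comes from.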

@pkieltyka
Contributor Author

Yes, exactly. It would be similar to what other databases do when they fsync. For example, mongo uses memory-mapped files and syncs data changes to disk on a timer; I believe the default flush interval is 60 seconds. Of course they also keep a fast append-only log, so if anything goes wrong between flushes, the log is replayed the next time the database boots up. This is called a WAL (http://en.wikipedia.org/wiki/Write-ahead_logging). That could also be implemented as part of the coalescer package. The WAL could be optional too, via #5. The coalescer could flush based on the fsync interval, the number of objects queued up to write, or both.

store := chainstore.New(
  memstore.New(100*1024*1024),
  coalescer.New(30*time.Second,
    boltstore.New("/tmp/store.db", "myBucket"),
  ),
)

or..

store := chainstore.New(
  memstore.New(100*1024*1024),
  coalescer.New().Fsync(30*time.Second).MaxObjects(100).WriteLog("/tmp/bolt.log").For(
    boltstore.New("/tmp/store.db", "myBucket"),
  ),
)

In fact, the coalescer should use the memstore internally if one hasn't been defined on the chain. The idea is that once the data is in the chain, it can be requested from the store; it just wouldn't have been "committed" to bolt until the fsync interval elapses or the batch reaches its object limit.
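A rough sketch of that behavior, assuming a hypothetical `BatchStore` interface and `Coalescer` type (neither exists in chainstore today). Pending writes stay readable through `Get` before they are committed, and reaching the object limit triggers a flush. `MemBatchStore` is only an in-memory stand-in for a store such as bolt; a timer would call `Flush` to cover the fsync-interval case.

```go
package main

import "sync"

// BatchStore is a hypothetical backing store that accepts many writes at once.
type BatchStore interface {
	BatchWrite(items map[string][]byte) error
}

// MemBatchStore is an in-memory stand-in for a real store such as bolt.
type MemBatchStore struct {
	Data    map[string][]byte
	Batches int // number of BatchWrite calls received
}

func (m *MemBatchStore) BatchWrite(items map[string][]byte) error {
	for k, v := range items {
		m.Data[k] = v
	}
	m.Batches++
	return nil
}

// Coalescer buffers writes in memory and commits them to next in batches.
type Coalescer struct {
	mu         sync.Mutex
	pending    map[string][]byte
	maxObjects int
	next       BatchStore
}

func NewCoalescer(maxObjects int, next BatchStore) *Coalescer {
	return &Coalescer{
		pending:    make(map[string][]byte),
		maxObjects: maxObjects,
		next:       next,
	}
}

// Put queues a write and flushes once maxObjects writes are pending.
func (c *Coalescer) Put(key string, val []byte) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pending[key] = val
	if len(c.pending) >= c.maxObjects {
		return c.flushLocked()
	}
	return nil
}

// Get serves writes that are queued but not yet committed.
func (c *Coalescer) Get(key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.pending[key]
	return v, ok
}

// Flush commits all pending writes; a timer would call this periodically.
func (c *Coalescer) Flush() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.flushLocked()
}

// flushLocked must be called with c.mu held.
func (c *Coalescer) flushLocked() error {
	if len(c.pending) == 0 {
		return nil
	}
	if err := c.next.BatchWrite(c.pending); err != nil {
		return err // keep the pending writes so a later flush can retry
	}
	c.pending = make(map[string][]byte)
	return nil
}
```

Once a batch has been flushed, `Get` no longer finds those keys in the buffer; in a real chain the read would fall through to the backing store at that point.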

I have more experience with Go now than when I originally wrote this project, so I think I can do even better at defining the chain and make it more flexible and cleaner. Also, the coalescer would have to work with any kind of store, e.g. even batching s3 writes, which makes it a bit more difficult to define a general manager.

@pkieltyka
Contributor Author

Might need a BatchWrite interface that some stores could implement. The coalescer would test for the interface upgrade (http://avtok.com/2014/11/05/interface-upgrades.html) on the fly to see if it's supported, and skip coalescing for any store that doesn't have that support. That should work.
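One way the interface upgrade might look in Go, as a sketch only: `Store`, `BatchWriter`, `Wrap`, and `buffered` are hypothetical names, and `plainStore` and `batchStore` are toy stores included just to exercise both paths. A type assertion checks whether the store also implements `BatchWrite`; if not, the store is returned unwrapped, so coalescing is skipped.

```go
package main

// Store is a hypothetical minimal store interface.
type Store interface {
	Put(key string, val []byte) error
}

// BatchWriter is the optional interface a store may also implement.
type BatchWriter interface {
	BatchWrite(items map[string][]byte) error
}

// buffered queues writes for a store that supports BatchWrite.
type buffered struct {
	bw      BatchWriter
	pending map[string][]byte
}

func (b *buffered) Put(key string, val []byte) error {
	b.pending[key] = val
	return nil
}

// Flush sends all queued writes to the store in one BatchWrite call.
func (b *buffered) Flush() error {
	if len(b.pending) == 0 {
		return nil
	}
	if err := b.bw.BatchWrite(b.pending); err != nil {
		return err
	}
	b.pending = make(map[string][]byte)
	return nil
}

// Wrap returns a buffering wrapper when s supports BatchWrite and returns
// s unchanged otherwise, so stores without batching skip the coalescer.
func Wrap(s Store) Store {
	if bw, ok := s.(BatchWriter); ok {
		return &buffered{bw: bw, pending: make(map[string][]byte)}
	}
	return s
}

// plainStore is a toy store without batch support.
type plainStore struct{ puts int }

func (p *plainStore) Put(key string, val []byte) error {
	p.puts++
	return nil
}

// batchStore is a toy store that also implements BatchWriter.
type batchStore struct {
	plainStore
	batches int
}

func (b *batchStore) BatchWrite(items map[string][]byte) error {
	b.batches++
	return nil
}
```

The check happens once, when the chain is assembled, so stores that cannot batch pay no extra cost per write.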
