Changes to support RocksDB
This includes the following changes:

1. Honour O_CLOEXEC in open call
2. FCNTL honour O_CLOEXEC
3. Redirect FREAD_UNLOCKED to FREAD
4. Handle 'pread' calls in SplitFS
5. Intercept fallocate
6. Intercept sync_file_range
7. Add unit tests
8. Add implementation.md
OmSaran committed Aug 4, 2020
1 parent cf787a1 commit a3ac90f
Showing 18 changed files with 748 additions and 68 deletions.
18 changes: 18 additions & 0 deletions implementation.md
@@ -0,0 +1,18 @@
## Implementation details
Some implementation details of the calls that SplitFS intercepts:
- `fallocate`, `posix_fallocate`
  - We pass these calls to the kernel.
  - Before passing them on, we fsync (relink) the file so that the kernel and SplitFS both see consistent file contents and metadata.
  - We also clear the mmap table in SplitFS, because its entries might become stale after the system call.
  - After the system call, we update the file size in SplitFS accordingly before returning to the application.
- `sync_file_range`
  - `sync_file_range` guarantees data durability only for overwrites on certain filesystems. It does not guarantee metadata durability on any filesystem.
  - In the POSIX mode of SplitFS we likewise guarantee data durability but not metadata durability, i.e. we aim to provide the same guarantees as POSIX.
  - Data durability is guaranteed by virtue of doing non-temporal writes to the memory-mapped file, so nothing further is needed here. When the file is not memory mapped (e.g. file size < 16 MB), we pass the call on to the underlying filesystem.
  - In the Sync and Strict modes of SplitFS, durability is guaranteed by the filesystem itself, so `sync_file_range` is not required.
- `O_CLOEXEC`
  - This is supported via `open` and `fcntl` in SplitFS. We store the flag value in SplitFS.
  - In the supported `exec` calls, we first close the files marked close-on-exec before passing the `exec` call to the kernel.
  - We do not currently handle the failure scenario for `exec`.
- `fcntl`
  - Currently, SplitFS only handles the value of the close-on-exec flag before the call is passed through to the kernel.
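
The `O_CLOEXEC` and `fcntl` notes above can be illustrated with a minimal sketch. It is not SplitFS's actual bookkeeping, which this diff does not show: the table `cloexec_flag`, its size `MAX_TRACKED_FDS`, and the functions `track_open`, `track_fcntl`, `is_cloexec`, and `close_cloexec_files` are all hypothetical names chosen for illustration. The sketch records the flag when it is seen in `open` or in `fcntl(fd, F_SETFD, ...)`, and closes the marked descriptors before an `exec` would be forwarded.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

#define MAX_TRACKED_FDS 1024 /* hypothetical table size */

/* Hypothetical per-descriptor table of close-on-exec state. */
static bool cloexec_flag[MAX_TRACKED_FDS];

static bool fd_in_range(int fd)
{
    return fd >= 0 && fd < MAX_TRACKED_FDS;
}

/* Record whether O_CLOEXEC was among the flags passed to open(). */
void track_open(int fd, int open_flags)
{
    if (fd_in_range(fd))
        cloexec_flag[fd] = (open_flags & O_CLOEXEC) != 0;
}

/* Record FD_CLOEXEC when an F_SETFD command is seen; ignore other commands. */
void track_fcntl(int fd, int cmd, int arg)
{
    if (cmd == F_SETFD && fd_in_range(fd))
        cloexec_flag[fd] = (arg & FD_CLOEXEC) != 0;
}

bool is_cloexec(int fd)
{
    return fd_in_range(fd) && cloexec_flag[fd];
}

/* Close every tracked descriptor marked close-on-exec, as would happen
 * just before an exec call is forwarded. Returns how many were closed. */
int close_cloexec_files(void)
{
    int closed = 0;
    for (int fd = 0; fd < MAX_TRACKED_FDS; fd++) {
        if (cloexec_flag[fd]) {
            if (close(fd) == 0)
                closed++;
            cloexec_flag[fd] = false;
        }
    }
    return closed;
}
```

Like the notes above, this sketch does not handle the case where `exec` fails after the descriptors have already been closed.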
4 changes: 2 additions & 2 deletions splitfs/bg_clear_mmap.h
@@ -44,9 +44,9 @@ static void clean_dr_mmap() {
assert(0);
}
if (clean_overwrite)
- ret = posix_fallocate(dr_fd, 0, DR_OVER_SIZE);
+ ret = _hub_find_fileop("posix")->POSIX_FALLOCATE(dr_fd, 0, DR_OVER_SIZE);
else
- ret = posix_fallocate(dr_fd, 0, DR_SIZE);
+ ret = _hub_find_fileop("posix")->POSIX_FALLOCATE(dr_fd, 0, DR_SIZE);

if (ret < 0) {
MSG("%s: posix_fallocate failed. Err = %s\n",
18 changes: 18 additions & 0 deletions splitfs/fileops_hub.c
@@ -276,6 +276,9 @@ RETT_FOPEN64 _hub_FOPEN64(INTF_FOPEN64);
RETT_IOCTL ALIAS_IOCTL(INTF_IOCTL) WEAK_ALIAS("_hub_IOCTL");
RETT_IOCTL _hub_IOCTL(INTF_IOCTL);

RETT_FCNTL ALIAS_FCNTL(INTF_FCNTL) WEAK_ALIAS("_hub_FCNTL");
RETT_FCNTL _hub_FCNTL(INTF_FCNTL);

RETT_OPEN64 ALIAS_OPEN64(INTF_OPEN64) WEAK_ALIAS("_hub_OPEN64");
RETT_OPEN64 _hub_OPEN64(INTF_OPEN64);

@@ -1399,6 +1402,21 @@ RETT_UNLINK _hub_UNLINK(INTF_UNLINK)
return result;
}

RETT_FCNTL _hub_FCNTL(INTF_FCNTL)
{
CHECK_RESOLVE_FILEOPS(_hub_);

DEBUG("CALL: _hub_FCNTL\n");

va_list ap;
void * arg;
va_start (ap, cmd);
arg = va_arg (ap, void*);
va_end (ap);

return _hub_managed_fileops->FCNTL(CALL_FCNTL, arg);
}

RETT_UNLINKAT _hub_UNLINKAT(INTF_UNLINKAT)
{
CHECK_RESOLVE_FILEOPS(_hub_);
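
The `_hub_FCNTL` function added in this hunk reads a single `void *` from the variadic list regardless of the command and forwards it. As a point of comparison only, here is a sketch of a stricter variant that picks the argument type per command. It is not what the commit does: `wrapper_fcntl` is a hypothetical name, it forwards straight to libc's `fcntl` instead of a SplitFS backend, and the command list is not exhaustive.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdarg.h>

/* Hypothetical variadic wrapper that forwards to libc's fcntl(). */
int wrapper_fcntl(int fd, int cmd, ...)
{
    va_list ap;
    int result;

    va_start(ap, cmd);
    switch (cmd) {
    case F_GETFD:
    case F_GETFL:
    case F_GETOWN:
        /* These commands take no third argument, so none is read. */
        result = fcntl(fd, cmd);
        break;
    case F_SETFD:
    case F_SETFL:
    case F_SETOWN:
    case F_DUPFD:
    case F_DUPFD_CLOEXEC:
        /* These commands take an int argument. */
        result = fcntl(fd, cmd, va_arg(ap, int));
        break;
    default:
        /* Assumption: remaining commands carry a pointer argument,
         * for example F_SETLK with a struct flock pointer. */
        result = fcntl(fd, cmd, va_arg(ap, void *));
        break;
    }
    va_end(ap);
    return result;
}
```

Reading one `void *` unconditionally, as the diff does, is shorter and works on common ABIs. The per-command switch avoids reading an argument the caller never passed, at the cost of having to maintain the command list.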