Integration | Collaboration with https://github.com/abraunegg/onedrive #105

Closed
abraunegg opened this issue Jan 27, 2021 · 4 comments

@abraunegg

@jstaf
Great tool you have created.

Would love to collaborate here & potentially work out a way to utilise your tool to solve this open issue: abraunegg/onedrive#757 - or get your help in building / creating a similar overlay for the tool which I now maintain.

If you could let me know what your thoughts are that would be greatly appreciated.

@jstaf
Owner

jstaf commented Jan 28, 2021

@abraunegg I admittedly am not familiar enough with your project's codebase to suggest an immediate solution - however, I can probably help by explaining how onedriver implemented the on-demand downloads.

The important thing to note is that onedriver doesn't actually sync files. It's a filesystem that acts as a "middleman", handling I/O syscalls on behalf of the kernel. When a program tries to read a directory or open a file, it makes a syscall to the kernel, and the kernel then asks onedriver what's in the directory or what's in the file. This gives onedriver the ability to download files and fetch metadata on demand: every I/O operation on OneDrive files goes through onedriver. So to implement similar on-demand functionality, you would need to intercept syscalls in a similar manner.
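To make the "middleman" idea concrete, here is a minimal sketch of a directory handler that only contacts the remote the first time a directory is read. It does not use a real filesystem library or the OneDrive API: `Lister`, `LazyFS`, and `ReadDir` are hypothetical names invented for this illustration, and `ReadDir` stands in for whatever callback the kernel would end up invoking.

```go
package main

import (
	"fmt"
	"sync"
)

// Lister is a hypothetical stand-in for the call that fetches a
// directory's children from the remote service.
type Lister func(path string) ([]string, error)

// LazyFS caches directory listings and only asks the remote when a
// directory is read for the first time.
type LazyFS struct {
	mu      sync.Mutex
	remote  Lister
	entries map[string][]string // path -> cached child names
	Fetches int                 // remote calls made so far (for illustration)
}

func NewLazyFS(remote Lister) *LazyFS {
	return &LazyFS{remote: remote, entries: make(map[string][]string)}
}

// ReadDir plays the role of the handler that runs when a program lists
// a directory. The lock is held during the remote call to keep the
// sketch short; a real filesystem would avoid blocking other readers.
func (fs *LazyFS) ReadDir(path string) ([]string, error) {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	if children, ok := fs.entries[path]; ok {
		return children, nil // already known, no network needed
	}
	children, err := fs.remote(path)
	if err != nil {
		return nil, err
	}
	fs.Fetches++
	fs.entries[path] = children
	return children, nil
}

func main() {
	remote := func(path string) ([]string, error) {
		return []string{"Documents", "Pictures"}, nil
	}
	fs := NewLazyFS(remote)
	first, _ := fs.ReadDir("/")
	second, _ := fs.ReadDir("/") // served from the cache
	fmt.Println(first, second, "remote fetches:", fs.Fetches)
}
```

The point of the sketch is only that metadata is fetched as a side effect of the read itself, rather than by a separate sync pass.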

Here's a sample code path (this is for listing the files of a directory, but reading/writing files is similar):

@abraunegg
Author

@jstaf
Thanks - will have to look into this a bit deeper. In terms of caching all the online items, using a /delta call and storing the tree of items / objects is what 'onedrive' does at the moment. This is refreshed every 5 mins, however you can call /delta on a specific path - so ls could make a query to /delta + path & provide the details as well.

Handling the download is going to be a tricky one. In my initial attempts, the problem was pausing the application's request long enough for a file to download. OneDrive itself does not really support HTTP/2, so all calls need to use HTTP/1.1, and 4xx / 5xx error handling and retry logic need to be really tight, along with using the QuickXOR hash of the file plus its size to determine whether it has been downloaded correctly.

How are uploads to OneDrive being handled? For Business / SharePoint accounts there are many quirks with certain file types: OneDrive modifies the file after upload by removing the original (if you query the original endpoint you get a 404) and adding metadata, so the online file is now out of sync with the file on the local disk.

@jstaf
Owner

jstaf commented Feb 3, 2021

File downloads work the same way as readdir(). In this case, onedriver intercepts the open() syscall and performs the download if the checksum of the local copy does not match what's on the server: https://github.com/jstaf/onedriver/blob/master/fs/inode.go#L786
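The open-time check described above can be sketched roughly as follows. This is not the code at the linked line: `Remote`, `Cache`, and `fakeRemote` are hypothetical names for this illustration, and SHA-1 is used only as a convenient standard-library stand-in for whichever checksum the server actually reports.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// Remote is a hypothetical view of the server side.
type Remote interface {
	Hash(id string) (string, error)
	Download(id string) ([]byte, error)
}

// Cache maps item IDs to locally stored content.
type Cache struct {
	remote Remote
	local  map[string][]byte
}

func NewCache(remote Remote) *Cache {
	return &Cache{remote: remote, local: make(map[string][]byte)}
}

func hashOf(data []byte) string {
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:])
}

// Open returns the file content, downloading only when the local copy
// is missing or its checksum differs from the server's.
func (c *Cache) Open(id string) ([]byte, error) {
	want, err := c.remote.Hash(id)
	if err != nil {
		return nil, err
	}
	if data, ok := c.local[id]; ok && hashOf(data) == want {
		return data, nil
	}
	data, err := c.remote.Download(id)
	if err != nil {
		return nil, err
	}
	c.local[id] = data
	return data, nil
}

// fakeRemote is an in-memory server used only to exercise the sketch.
type fakeRemote struct {
	content   map[string][]byte
	downloads int
}

func (r *fakeRemote) Hash(id string) (string, error) { return hashOf(r.content[id]), nil }

func (r *fakeRemote) Download(id string) ([]byte, error) {
	r.downloads++
	return r.content[id], nil
}

func main() {
	remote := &fakeRemote{content: map[string][]byte{"id1": []byte("v1")}}
	cache := NewCache(remote)
	cache.Open("id1")
	cache.Open("id1") // checksum matches, no second download
	remote.content["id1"] = []byte("v2")
	data, _ := cache.Open("id1") // checksum differs, download again
	fmt.Println(string(data), "downloads:", remote.downloads)
}
```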

Uploads are where things get kinda weird (uploads are independent of fs ops to keep things fast so no one has to wait for things to upload every time they save):

  • When a user creates a file, onedriver uploads an empty file to obtain a OneDrive ID
  • Reads/writes happen against the local file
  • When flush() or fsync() is called, onedriver creates an "upload session" with the contents to be uploaded and inserts it into a queue
  • An "upload manager" periodically reads from the queue, uploads any pending upload sessions, and retries them if they fail: https://github.com/jstaf/onedriver/blob/master/fs/upload_manager.go. Uploads use the OneDrive ID created earlier so we don't lose the file if the user renames it right after saving. (All items are tracked by ID instead of path for this reason.)
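The steps above can be sketched as a small in-memory model. Everything here is hypothetical and simplified: `Uploader`, `FS`, `session`, and `fakeUploader` are invented for the illustration, and `ProcessQueue` makes a single pass when called, where a real upload manager would run periodically in the background.

```go
package main

import (
	"errors"
	"fmt"
)

// Uploader is a hypothetical stand-in for the remote API.
type Uploader interface {
	CreateEmpty(name string) (id string, err error)
	Upload(id string, content []byte) error
}

// session is a snapshot of a file's contents waiting to be uploaded.
type session struct {
	id      string
	content []byte
}

// FS tracks every file by remote ID, never by path.
type FS struct {
	remote Uploader
	files  map[string][]byte // id -> local content
	names  map[string]string // id -> current name
	queue  []session
}

func NewFS(remote Uploader) *FS {
	return &FS{remote: remote, files: map[string][]byte{}, names: map[string]string{}}
}

// Create uploads an empty file first so the item has an ID from the start.
func (fs *FS) Create(name string) (string, error) {
	id, err := fs.remote.CreateEmpty(name)
	if err != nil {
		return "", err
	}
	fs.files[id], fs.names[id] = []byte{}, name
	return id, nil
}

// Write only touches the local copy.
func (fs *FS) Write(id string, data []byte) { fs.files[id] = append(fs.files[id], data...) }

// Rename changes the name but not the ID, so queued uploads stay valid.
func (fs *FS) Rename(id, newName string) { fs.names[id] = newName }

// Flush snapshots the current contents and queues them without waiting.
func (fs *FS) Flush(id string) {
	snapshot := append([]byte(nil), fs.files[id]...)
	fs.queue = append(fs.queue, session{id: id, content: snapshot})
}

// ProcessQueue tries each pending session once, keeps the failed ones
// for the next pass, and returns how many are still pending.
func (fs *FS) ProcessQueue() int {
	var pending []session
	for _, s := range fs.queue {
		if err := fs.remote.Upload(s.id, s.content); err != nil {
			pending = append(pending, s)
		}
	}
	fs.queue = pending
	return len(pending)
}

// fakeUploader is an in-memory remote that can simulate failures.
type fakeUploader struct {
	nextID   int
	failures int               // upcoming Upload calls that should fail
	stored   map[string][]byte // id -> uploaded content
}

func (u *fakeUploader) CreateEmpty(name string) (string, error) {
	u.nextID++
	id := fmt.Sprintf("item-%d", u.nextID)
	u.stored[id] = nil
	return id, nil
}

func (u *fakeUploader) Upload(id string, content []byte) error {
	if u.failures > 0 {
		u.failures--
		return errors.New("simulated upload failure")
	}
	u.stored[id] = content
	return nil
}

func main() {
	u := &fakeUploader{failures: 1, stored: map[string][]byte{}}
	fs := NewFS(u)
	id, _ := fs.Create("draft.txt")
	fs.Write(id, []byte("hello"))
	fs.Flush(id)
	fs.Rename(id, "final.txt") // rename right after saving
	fmt.Println("pending after pass 1:", fs.ProcessQueue())
	fmt.Println("pending after pass 2:", fs.ProcessQueue())
	fmt.Printf("%s holds %q\n", id, u.stored[id])
}
```

Because the queued session carries the ID rather than a path, the rename between `Flush` and the upload does not affect where the content ends up.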

@jstaf
Owner

jstaf commented May 16, 2021

Going to close this one because there are no specific tasks that need to be completed in this issue.

@jstaf jstaf closed this as completed May 16, 2021