Support detection of Placeholder files #355
Comments
I'm not familiar with these placeholder files. Are they still real files that can be opened and parsed? If so, we'd be open to discussing further and might ultimately accept a PR.
They are still files, but sparse files: they are under 1 KB in size, regardless of how big the real file is. The real file is kept in a sync provider's cloud service (iCloud, OneDrive, Google, etc.). If you try to open the file, the Windows kernel instructs the sync provider to download the file and make it fully available. So when I use MetadataExtractor against one of these files, it technically works, because your code doesn't see what goes on behind the scenes. But it leads to every file being downloaded, which defeats the whole point of these types of files. What placeholder-aware code should do is identify that the file is a sparse file, ask the Windows property system which properties are available (think EXIF data), and then request those properties. This is very fast. Sync providers typically fill the 1 KB sparse file with a thumbnail, common properties for images, video, music, etc., and other information.
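To make the detection step above concrete, here is a minimal sketch in Python (not MetadataExtractor code). It assumes the sync provider marks placeholders with one of the Win32 "offline" or "recall" file attributes. Whether a given provider sets these flags, and whether reading attributes ever triggers a download, are assumptions that would need checking per provider. The function names are hypothetical.

```python
import os

# Win32 file attribute flags (values as defined in the Windows SDK headers).
FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x00040000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000

# Any of these suggests the file content is not fully present on disk.
RECALL_MASK = (
    FILE_ATTRIBUTE_OFFLINE
    | FILE_ATTRIBUTE_RECALL_ON_OPEN
    | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
)


def attributes_indicate_placeholder(attrs: int) -> bool:
    """Return True if the attribute bits suggest a cloud placeholder."""
    return bool(attrs & RECALL_MASK)


def is_placeholder(path: str) -> bool:
    """Check a path using only its metadata, without reading file content.

    st_file_attributes exists only on Windows; elsewhere this falls back
    to 0 and therefore always reports False.
    """
    attrs = getattr(os.stat(path), "st_file_attributes", 0)
    return attributes_indicate_placeholder(attrs)
```

A caller could skip or defer files for which `is_placeholder` returns True, rather than opening them and forcing a download. Reading the cached properties themselves would still need the Windows property system, which this sketch does not cover.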
How would MetadataExtractor read the file if the kernel's going to transparently intercept the file system request and do the download? I would be concerned that any fix here would be platform-specific.
Apple has the same concept (storing full versions in iCloud), and on Windows Apple does the same, as does Microsoft. I'm sure it's implemented differently on Windows and Mac, but anything that reads metadata from multiple files will hit issues.
It would help if you could find some analysis or documentation about the file formats.
When getting metadata from photos, everything is fine until it hits a Placeholder file (OneDrive/iCloud, etc.).
For those files, the metadata cannot be extracted without downloading the file from the cloud, which is a very slow process, requires an internet connection, and defeats the disk-saving purpose of those files.
The information is available from the file, just not by opening it. For example, the Windows property system will have a subset of properties (depending on the sync provider), and this can also be read using the system indexer.
Is there any potential for this being implemented? I understand that it would be a Windows-only feature.