-
-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Remove original file stem from filenames in the blobstorage (#4309) #4583
feat: Remove original file stem from filenames in the blobstorage (#4309) #4583
Conversation
05a5904
to
f8133a1
Compare
d14434c
to
1c8fa0d
Compare
1c8fa0d
to
48edc0c
Compare
thinking it over again, forcing random file name in blob-directory will worsen UX at least on deltachat-desktop, at least currently: on desktop, currently, eg. a pdf-reader is opened without the file being copied, the pdf-reader will show the filename then - and there, sth. as i known, the idea is that all files should be copied before being accessed from external apps, however, we're not there yet on any UI; most visible, however it seems to be on desktop, cc @Simon-Laux so, as the filename in the blob-directory does not have a meaning anyways and we're anyways have no consistency there, it would remove pressure from UIs and simplify getting this pr to stable if we keep filename generation as is (or maybe always add the random number so that it is more visible where the "wrong" filenames are used). this pr could add only add the save-function, use correct names on forwarding, and what else is done here (@iequidoo i find pr descriptions as "some discussions here and there" without some additional words hard for reviewers, it would be good to always have at least a handful sentences in your own words about reasonings, effects, successor of etc) as a side-effect, this would keep debugging blob-dir simple :) when UI are adapted to the new API, we can think over to use completely random file names. |
The main tracking issue for this PR is #4309 This PR is not meant to be a next step at this point, see my comment at #4309 (comment) |
Ok, i will better improve commit messages, i prefer putting everything there and use PR descriptions just for additional info. As for dc_msg_get_file(), i can implement what @link2xt suggested:
but then we need to add some new call like |
Also i can just remove the last commit so that there will be random suffixes, but not whole filenames. Anyway now we can get files with random suffixes from |
48edc0c
to
b79b2e1
Compare
b79b2e1
to
0776811
Compare
0776811
to
eadd138
Compare
eadd138
to
bb48814
Compare
bb48814
to
53f01ca
Compare
53f01ca
to
f2bae94
Compare
f2bae94
to
7358c33
Compare
Removed the last commit to unblock merging this |
49b0e82
to
aaffb9d
Compare
a1bf4a9
to
e635108
Compare
fbceb01
to
435ff59
Compare
aaffb9d
to
27dc07e
Compare
27dc07e
to
25f1eec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a high-level comment:
making dc_msg_get_file()
return unmangled name will improve some things in UI. at the same time, as this requires copying the file, converts the function from a cheap to an expensive one, potentially worsening things - UI have used dc_msg_get_file()
more as get_state()
, it is always fast anyways, no need to think over it can be slow or one needs to save results.
to target this, UI has to check all usages of dc_msg_get_file()
- to avoid side effects, but also as dc_msg_get_file
is anyways deprecated
but if UI anyways needs to check this, then it seems better to leave the functionality of dc_msg_get_file()
as is - so it will become the dc_msg_get_filedata_path()
(which it is already mostly, function description is mostly the same), UI wise, things may get slightly worse, as the filename is always “bad” - but UI would need to do the checks anyways and by this approach:
- no unexpected slowness or worsening can happen - there are lots of calls to
dc_msg_get_file()
- and that in already complex areas as "share", i expect things to be overseen - no new deprecated API
- things are working as before (apart from more often mangled names)
- it is very visible to devs/users what is missing: if eg. external apps show the mangled names, one needs to transfer things via a BLOB+
dc_msg_get_filename()
or usedc_msg_save_file()
- finally, but only as a sidenode on an implementation detail: not creating additional subdirectories in blobdir may have maintenance benefits in the future
might be i voted against that approach before, idk, but it looks better to me, thinking it over, also with the otherwise behaviour change of dc_msg_get_file()
Recently there was an accident with a chatbot that replaced its avatar set from the command line with an unrelated avatar of a contact. Both the `selfavatar` setting and the contact avatar `i` param pointed to `$BLOBDIR/avatar.png` at the time it was detected. How this happened is unclear, but it is possible that `avatar.png` was removed, unmounted or otherwise not detected by the core, and the core stored avatar received from the contact as `avatar.png`, while `selfavatar` config still pointed to `$BLOBDIR/avatar.png`. Such bugs are unavoidable even if the core itself has no bugs as we cannot rely on blobdir not reside on the faulty network filesystem, being incorrectly backed up and restored etc., so we should assume that files may be randomly removed. Then there may be dangling `$BLOBDIR/...` references in the database which may accidentally point to unrelated files, could even be an `avatar.png` file sent to the bot in private. To prevent such bugs, we add random filename suffixes for the blobdir objects. Thanks to the added Param::Filename these random suffixes aren't sent over the network.
b248bbf
to
ed33f30
Compare
) This way filenames in the blobstorage are just random hex numbers. This also allows us to get rid of the `sanitize-filename` dependency. This also requires `Param::Filename` to be set to "debug_logging*.xdc" for messages containing logging webxdc-s, otherwise they are not detected properly. This is done in "fix: Message::set_file_from_bytes(): Set Param::Filename", so don't forget to update senders as well.
…pace (#5338) Before file extensions were also limited to 32 chars, but extra chars in the beginning were just cut off, e.g. "file.with_lots_of_characters_behind_point_and_double_ending.tar.gz" was considered to have an extension "d_point_and_double_ending.tar.gz". Better to take only "tar.gz" then. Also don't include whitespace-containing parts in extensions. File extensions generally don't contain whitespaces.
25f1eec
to
562c4f8
Compare
deca65d
to
0610b7d
Compare
8a315e8
to
d2e299d
Compare
Opened #6033 instead |
Based on #5495
Fix #5338