Videos randomly disappearing #1453
Comments
Another example: https://signbank.cls.ru.nl//dictionary/gloss/47336 When the gloss video was deleted, the NME video moved from NME to gloss. |
At gloss AFBROKKELEN (https://signbank.cls.ru.nl//dictionary/gloss/46883), I am seeing this:
|
Can you increase the amount of time between updating/uploading and retrieving? Maybe increase it to 10 minutes? There was previously a problem where operations followed each other too quickly. |
I had this problem locally on my own computer using iCloud for storage. |
Yes, I could do that, but I would rather have the Signbank side do something about the transactions. For example, incoming transactions could always be enqueued in a list instead of being executed immediately. Once the server has finished processing one transaction, it can handle the next one. Would that be something you can work on? |
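A minimal sketch of the queuing idea in the comment above, assuming a single worker that runs incoming jobs strictly one after another. This is not Signbank code: `SerialTransactionQueue` and the job names are hypothetical, and a real Django deployment would more likely use a task queue or database-level locking.

```python
import queue
import threading


class SerialTransactionQueue:
    """Run submitted jobs one at a time, in arrival order (hypothetical helper)."""

    def __init__(self):
        self._jobs = queue.Queue()
        self.errors = []  # exceptions raised by jobs, kept for inspection
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, job):
        """Enqueue a zero-argument callable; it runs after all earlier jobs."""
        self._jobs.put(job)

    def _run(self):
        while True:
            job = self._jobs.get()
            try:
                job()
            except Exception as exc:
                # Record the failure and keep the worker alive for later jobs.
                self.errors.append(exc)
            finally:
                self._jobs.task_done()

    def wait_until_idle(self):
        """Block until every submitted job has been processed."""
        self._jobs.join()


processed = []
transactions = SerialTransactionQueue()
for name in ["delete NME video", "create NME video", "update gloss video"]:
    # Bind `name` as a default argument so each job keeps its own value.
    transactions.submit(lambda name=name: processed.append(name))
transactions.wait_until_idle()
print(processed)
```

Because there is only one worker, a delete can never overlap with the create that follows it, which is the property the comment asks for.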
@rem0g that sounds like an interesting approach. I will discuss that with @vanlummelhuizen how to implement that. He is the Django expert. A queuing mechanism. |
@rem0g How are you deleting the video? (There are some "signals" when objects are deleted. These may or may not move or delete video files. It would help in debugging to know which commands were run.) Theoretically, if you are uploading video files in rapid succession, the temporary files (that Unix creates) could end up being linked to the wrong object in Django. I have suspected this for a long time, but cannot fix it myself. I will ask the others. I implemented a lot of code in November/December for managing the video files. There are pull requests for these, but nobody has reviewed them yet. The intention is that the dataset manager can inspect what is in the file system. That also makes it possible to retrieve the videos of deleted glosses. Gloss IDs are not reused, so new videos should not interfere with deleted glosses: the filename always contains the ID. |
@rem0g for this gloss: https://signbank.cls.ru.nl/dictionary/gloss/47361 the NME video is not in the correct format! (On Firefox, it shows that it is not supported.) Recall that you asked us to not test for MP4 anymore. Thus it can be that incorrect formats are causing problems. |
@rem0g here are the gloss video objects for AFBROKKELEN, yes, you can see that the same filename appears multiple times, for different GlossVideo objects with various perspectives and NME set. |
Is it possible something was wrong with the permissions on your source file? Or that it was a symbolic link? |
@vanlummelhuizen can you help on this issue? |
@rem0g there are hundreds of videos with the wrong filename, as you point out. Did you change anything in your script? I copied the newest database to signbank-test in order to inspect the filenames (in the objects). https://signbank-test.cls.ru.nl/datasets/checks/5 (There are no files, but you can see the filenames in the objects.) The last time I checked filenames (end of November) everything was as expected. The problem is that all the gloss video objects of a gloss are indeed sharing a single video file. They all point to the same file. I can only think that this is caused by an alias or something similar, so that the file system points to a single file during upload. I am speculating, but I know Django does not allow uploading multiple files in the Django Form Template. (We used to do this for the eaf files in the Dataset Manager, but when we updated to Django 4.2, the code had to be modified to upload only one file.) Perhaps Django is doing something here because multiple video files are in the same API request. (The feature was removed from Django for security reasons.) |
From Django manual
|
I need help with this issue. One of the glosses that is messed up is STRUISVOGEL. Here, you see that all the objects refer to the same file (the perspective videos have the same file name as the normal video): Here you see the stats for the file:
But in the file system, the timestamp on the file shows that it was not changed on December 20.
|
I don't know how to solve this. Hopefully, @vanlummelhuizen can have a go. |
Hello, thank you for looking into the issue. The issue is not clear to me yet, but now I know it's not caused by my script, as everything relies on the Signbank API. For uploading videos I use this API: /dictionary/api_update_gloss/{glossid}/video That's it. For NME video upload I use: /dictionary/api_create_gloss_nmevideo/{datasetid}/{glossid}/ And for deleting an NME video: /dictionary/api_delete_gloss_nmevideo/{datasetid}/{glossid}/{videoid}/ Every time I want to upload an NME video, I execute api_delete_gloss_nmevideo first: I obtain the unique ID of the existing NME video, delete it, and then upload the new NME video. |
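To make the workflow in the comment above concrete, here is a small sketch that only builds the three endpoint URLs and lists them in the order described (delete first, then create). The paths and host are taken from this thread; the function names and the example video ID are hypothetical, and no HTTP request is sent.

```python
BASE_URL = "https://signbank.cls.ru.nl"


def update_gloss_video_url(gloss_id):
    """URL used to upload the main gloss video."""
    return f"{BASE_URL}/dictionary/api_update_gloss/{gloss_id}/video"


def create_nme_video_url(dataset_id, gloss_id):
    """URL used to upload a new NME video."""
    return f"{BASE_URL}/dictionary/api_create_gloss_nmevideo/{dataset_id}/{gloss_id}/"


def delete_nme_video_url(dataset_id, gloss_id, video_id):
    """URL used to delete one existing NME video."""
    return (
        f"{BASE_URL}/dictionary/api_delete_gloss_nmevideo/"
        f"{dataset_id}/{gloss_id}/{video_id}/"
    )


def replace_nme_video_steps(dataset_id, gloss_id, old_video_id):
    """Ordered URLs for replacing an NME video: delete the old one, then create."""
    return [
        delete_nme_video_url(dataset_id, gloss_id, old_video_id),
        create_nme_video_url(dataset_id, gloss_id),
    ]


# Dataset 5 and gloss 47336 appear in this thread; video ID 123 is made up.
for url in replace_nme_video_steps(5, 47336, 123):
    print(url)
```

Writing the steps down this way makes it easy to log exactly which calls were made for a gloss, which is the information the maintainers ask for when trying to reproduce the problem.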
Hi all, I added 'blocking' to indicate extra extra priority. If there's something on our end we can do, please let us know. |
I'm not able to solve this myself. I have been aware of this problem for a long time, but only on a local server running on iCloud, so I chalked it up to Apple quirks. There are quite a few messed-up video objects/files now. I'm experimenting with this locally, so I can inspect the file system and the admin without messing anything up. Since multiple video objects refer to exactly the same file, it is not possible to "fix" this other than by deleting the objects that point to the wrong file. But to do that, we need to turn off the "normal" deletion process, otherwise the "correct" file may be deleted. I made a command (pull request) for renaming backup video files, since those have been messed up for a long time. |
@rem0g can you stop deleting the videos on your side? Because objects are referring to the same file, deleting one object deletes a file that other objects share. I still think this is due to the API commands happening too fast for the file system. But once a file ends up shared by different objects, that escalates the problem: a domino effect. |
@rem0g I put the database (per yesterday) onto Out of curiosity, I looked at the gloss video history. Here you can see the names of the files that were uploaded: the source file and the (desired) target file. It looks like, in your naming convention with L, M, R in the filenames, you are also using M for NME videos. |
I have a copy of the database on signbank-susan. (As of Monday. No videos, but you can browse the video objects / filenames / paths in the admin.) I am working on filters to enhance the admin in order to detect and fix video problems. #1398 I added filters for NME and Perspective, plus a filter for "wrong filename". Now it's possible to query on those wrong filenames. For NME videos with the wrong filename, there are 1366 results! For Perspective videos with the wrong filename, there are 4688 results! @Woseseltops @vanlummelhuizen this is a huge problem. Since multiple objects are referring to the same video file (see the examples above), it is not possible to simply erase or delete anything. I can make an "unlink file from object" command in order to uncouple the link. Then it would be possible to delete NME and Perspective objects that are pointing to the normal video file. (The files need to be either deleted or unlinked in order to delete the objects. But this runs the risk of deleting files that should not be deleted.) Moreover, when you do delete an object, this sets in motion lots of "signal" commands that move around backup video files. |
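The "multiple objects share one file" condition described above can be checked mechanically. Below is a minimal sketch on plain dictionaries rather than the real Django models: the record shape (`id`, `kind`, `file`), the IDs, and the second file name are assumptions for illustration only.

```python
from collections import defaultdict


def find_shared_files(video_records):
    """Return {file path: [object ids]} for every path used by more than one object."""
    ids_by_path = defaultdict(list)
    for record in video_records:
        ids_by_path[record["file"]].append(record["id"])
    return {path: ids for path, ids in ids_by_path.items() if len(ids) > 1}


def safe_to_delete_file(record_id, video_records):
    """True only if no other object points at the same file as `record_id`."""
    target = next(r for r in video_records if r["id"] == record_id)
    return all(
        r["file"] != target["file"] for r in video_records if r["id"] != record_id
    )


# Hypothetical records: a normal video, an NME video wrongly pointing at the
# same file, and a perspective video with its own (made-up) file name.
video_records = [
    {"id": 1, "kind": "gloss", "file": "glossvideo/NGT/RA/RAT-A-45834.mp4"},
    {"id": 2, "kind": "nme", "file": "glossvideo/NGT/RA/RAT-A-45834.mp4"},
    {"id": 3, "kind": "perspective", "file": "glossvideo/NGT/RA/RAT-A-45834_left.mp4"},
]

shared = find_shared_files(video_records)
print(shared)
print(safe_to_delete_file(2, video_records))
print(safe_to_delete_file(3, video_records))
```

A check like `safe_to_delete_file` is the kind of guard an "unlink file from object" command would need before any file is removed from disk.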
I have looked into this problem, but without success so far. My hypothesis is that, because GlossVideoNME is a subclass of GlossVideo, the delete function of GlossVideoNME is executed as if it were a GlossVideo. Things I found out about FRIKANDEL-A (gloss id 47336): The contents of the corresponding video file were modified on December 18 (the file's birth), while its metadata was changed on December 20 (change of file name, move to a different location, etc.):
An NME video was deleted and created again on December 18:
and in the logs
|
NME videos are indeed deleted via API url:
the videoid is obtained via:
then we execute deletion via:
When we want to replace an NME video, we call api_delete_gloss_nmevideo and then upload the new NME video via:
|
@rem0g In the procedure you describe above you seem to delete an NME video twice via https://signbank.cls.ru.nl/dictionary/api_delete_gloss_nmevideo/5/{gloss_id}/{videoid}/. Is that the case? I would think one call should be enough. |
It's executed once, just before the new NME video is created. The first mention is just a "title" of the workflow, so the delete is done only one time.
On Fri, Jan 10, 2025 at 08:42 Micha Hulsbosch wrote:
@rem0g In the procedure you describe above you seem to delete a NME video twice via https://signbank.cls.ru.nl/dictionary/api_delete_gloss_nmevideo/5/{gloss_id}/{videoid}/. Is that the case? I would think only one call should be enough.
|
I'm not sure if we're all on the same page regarding this problem. Let me summarize:
Unlikely as it is, this seems to have happened:
|
As pointed out earlier, the video history (admin) shows that some uploaded video files use the M... name for the NME video as well, so the source file name is the same for the normal video and the NME video. Videos larger than 2.5 MB (see the Django quote above) are kept in temporary storage. It could be that the uploaded file in temporary storage is somehow used by multiple objects because it has the same name. |
Two new discoveries:
|
I don't think this is what is happening here: mixing up files from temporary storage would lead to duplicates, all with correct file paths. Instead, we see that the URL paths stored in the database themselves are incorrect, and that the files at the correct path don't exist. |
Could it happen if a transaction doesn't complete? The object creations don't wait on the file system. Or if the implementation (of Django) is somehow using symbolic links? No idea, I'll stop. It's just weird. When running it on iCloud, I had the same problem. It turned out the files were there, but with a "." in front of the name, and they had not been uploaded to iCloud yet. |
What would happen if you executed the transaction with 10 different glosses in a short time span?
I am going to run the script on ten glosses soon.
Gomer
On Fri, Jan 10, 2025 at 12:04 Wessel Stoop wrote:
Two new discoveries:
- I just tried to reproduce the problem by adding and removing NME videos via the API a few times, but no luck... it all worked as intended :/ . Gomer must have done something differently, but I cannot imagine what.
- The other example glosses given by Gomer, #BMW and #PSV, don't show perspective and NME videos, but that's because of an unrelated problem (the hashtag in the name). The actual videos linked show the same pattern as FRIKANDEL-A.
|
I have yet to see weird results, but I have encountered a bug with this API: I have to send the POST update to api_update_gloss/{gloss}/video twice, because the videos don't appear the first time I POST them. When I POST again after 5 seconds, they do appear. I don't know if that's related to this issue. Gomer |
I am going ahead and re-uploading all base gloss video files, but with a sleep of 5 seconds between uploads. When that has finished, I will start with the NME video deletes and re-uploads. |
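A sketch of a paced re-upload loop like the one described above: upload, pause, check, and retry a limited number of times. The `upload` and `is_present` callables are hypothetical stand-ins for the real API calls, and the retry limit is an assumption. The demo passes a no-op `sleep` so it finishes instantly.

```python
import time


def reupload_with_pause(gloss_ids, upload, is_present,
                        pause_seconds=5, max_attempts=2, sleep=time.sleep):
    """Upload each gloss video, wait, verify, and retry; return the IDs that failed."""
    failed = []
    for gloss_id in gloss_ids:
        for _attempt in range(max_attempts):
            upload(gloss_id)
            sleep(pause_seconds)  # give the server time before checking
            if is_present(gloss_id):
                break
        else:
            # The loop never hit `break`: the video was still missing.
            failed.append(gloss_id)
    return failed


# Fake server for the demo: a video only "appears" after the second upload,
# mimicking the double-POST behaviour reported in this thread.
upload_counts = {}


def fake_upload(gloss_id):
    upload_counts[gloss_id] = upload_counts.get(gloss_id, 0) + 1


def fake_is_present(gloss_id):
    return upload_counts.get(gloss_id, 0) >= 2


failed = reupload_with_pause(
    [47361, 47475], fake_upload, fake_is_present, sleep=lambda seconds: None
)
print(failed, upload_counts)
```

Returning the failed IDs instead of retrying forever keeps the run bounded and leaves a short list of glosses to inspect by hand.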
I expect nothing special, except maybe database locks if it's faster than the database can handle.
I think unrelated. Respond to this comment with your fav emoji if you want me to create a separate issue for it :)
Apologies that this is needed, I realize the problem is on our side :(. I think our best chance of fixing the problem in the current issue is being able to reproduce it. If you notice the disappearing/overwriting of videos again during the re-upload, would be great if you could (for one example gloss) explain exactly which commands you run so that I can try to make the behavior appear at will. |
OK, thank you! We will see what happens. No need for a separate issue for that, for now :) |
I am still re-uploading, as I have also written a script that double-checks whether the video is present at the gloss and re-uploads it if not. I came across this, for example: https://signbank.cls.ru.nl//dictionary/gloss/45834 The gloss video is in the NME index, but it should not be there. I have a theory about what is happening: The gloss video URL is https://signbank.cls.ru.nl/dictionary/protected_media/glossvideo/NGT/RA/RAT-A-45834.mp4?v=20250111170412 The NME video URL is also: "Link": "https://signbank.cls.ru.nl//dictionary/protected_media/glossvideo/NGT/RA/RAT-A-45834.mp4" When I call the API to delete the NME video at index 0, it removes the gloss video. That looks like a reasonable cause. My proposal for a bugfix: disable api_delete_gloss_nmevideo, since we can simply overwrite the existing NME video at, for example, index 0. The API documentation states this: Updates the details of a 'non-manual element' video for a gloss. To replace the video, delete it using api_delete_gloss_nmevideo and create a new NME video using api_create_gloss_nmevideo. So that requirement would no longer be needed. For now I have disabled the NME video upload script. When the bug is recognized and fixed, I will re-activate it. |
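The theory above suggests a client-side guard: before deleting an NME video, check whether its URL resolves to the same file as the gloss video. A minimal sketch, assuming that the `?v=...` query string and doubled slashes do not change which file is meant; the function names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def media_key(url):
    """Reduce a media URL to (host, path), ignoring the query and repeated slashes."""
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return (parts.netloc, path)


def points_to_same_file(url_a, url_b):
    """True if both URLs refer to the same host and file path."""
    return media_key(url_a) == media_key(url_b)


# The two URLs quoted in the comment above.
gloss_video_url = (
    "https://signbank.cls.ru.nl/dictionary/protected_media/"
    "glossvideo/NGT/RA/RAT-A-45834.mp4?v=20250111170412"
)
nme_video_url = (
    "https://signbank.cls.ru.nl//dictionary/protected_media/"
    "glossvideo/NGT/RA/RAT-A-45834.mp4"
)

if points_to_same_file(gloss_video_url, nme_video_url):
    print("NME entry shares the gloss video file; skip the delete call.")
```

With this check in place, a script could refuse to call api_delete_gloss_nmevideo for an NME entry that would take the main gloss video with it.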
[CODE BABBLE] Yes, it is true that it overwrites the video with the same index number. When you used the "zip" upload, the video files were overwritten. The backup system was avoided altogether. The normal videos do not delete the video files, but the subclasses do delete them. If you just stop using delete, it should just work, since it overwrites them. Did you try this? There could also be something messed up with the "str" coercion used on video files. Normally, this yields the relative path. Although while working on some new video admin commands, the This is a concern (to me) because it means the wrong The type of the variables is dynamic and determined when the variable is first assigned. So it could be that Django somehow "first" used a variable for an object of the supertype (normal video), so that later in the code the type was already determined. The supertype methods also work on the subtype objects. @vanlummelhuizen can write this more concisely. What I'm describing would be a kind of subclass error in the Django implementation. I'm guessing we need to remove the "subclass"
Another idea: can you make the index of the NME video start at 1? Using 0 can be dangerous, since it is sometimes treated the same as False or None in the database. Because values are strings in the API, it could be a problem with type coercion. Also, Django automatically converts objects to their "id" when they are passed around. They only remain actual objects in the template if they come in via context variables. It could be that the 0 somehow becomes a "Null" value, so Django thinks the field is not set in the object. |
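The concern about index 0 can be illustrated with plain Python truthiness. This is only a demonstration of the general pitfall, not a claim about what the Signbank code does; both function names are hypothetical.

```python
def index_given_truthy(index):
    """Buggy check: treats 0 the same as a missing value."""
    return bool(index)


def index_given_explicit(index):
    """Safer check: only None counts as 'not set'."""
    return index is not None


for value in (0, 1, None, "0"):
    print(repr(value), index_given_truthy(value), index_given_explicit(value))
```

The integer 0 fails the truthiness check while the string "0" passes it, so the outcome depends on whether the value was converted from the API's string form before being tested.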
Using 0 as a replacement for the gloss sounds weird, as I understand the index number is for NME videos only. Nothing has been sent to Signbank via the API since the last re-upload of all videos before my vacation (last Friday or so). The NME auto-upload and the api_delete_gloss_nmevideo API call have been disabled since then too. Per @uklomp's comment everything was fine this morning, so has something happened on the Signbank server outside of our scope? I hope @Woseseltops can solve this issue this week, as we are nearing the end of our Signbank project. |
Some glosses have lost their videos, for example:
https://signbank.cls.ru.nl/dictionary/gloss/47361
https://signbank.cls.ru.nl/dictionary/gloss/47475
Some glosses have the wrong still image from the video.
Some glosses also have the wrong video perspective.
Some glosses even have an NMM video as the gloss video.
This is happening everywhere. I have checked my scripts and everything looks fine on my end.