Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add locking to create operations for packages #100

Closed
wants to merge 2 commits into from

Conversation

mreid-tt
Copy link
Contributor

@mreid-tt mreid-tt commented Apr 29, 2023

During the upload of SPK files, if this is for a new package or version, folders and corresponding database entries are created. During this process, if there are multiple concurrent requests there is a risk that a clash may occur and put the system into an inconsistent state. Adding locks to the creation of the package or version entries should avoid this.

Fixes: #95

@Diaoul
Copy link
Member

Diaoul commented Apr 29, 2023

I am not sure how thread locks behave in a web context under gunicorn workers. Did you check that it would solve the issue? We may need more than this approach.

@mreid-tt
Copy link
Contributor Author

mreid-tt commented Apr 29, 2023

I am not sure how thread locks behave in a web context under gunicorn workers. Did you check that it would solve the issue? We may need more than this approach.

Hmm, I had not considered this. Looking into the code I see this:

CMD [ "gunicorn", "-b", "0.0.0.0:8000", "-w", "5", "wsgi:app" ]

From my limited understanding this is a WSGI HTTP server with 5 worker processes. Reading up some more on "Selecting gunicorn worker types for different python web applications" there are a number of different worker types. I'm not sure if we are using request per process or request per coroutines described in the article.

Either way, with multiple parallel workers my understanding is that having a thread lock would only be effective within a single worker. As you have pointed out, the PR would not be an effective solution. From other searches, using process locks (e.g. multiprocessing.Lock) instead of thread locks may be effective but can potentially reduce performance due to the overhead of acquiring and releasing locks.

This is not an area I'm very familiar with so I would need assistance in formulating an effective solution.

@Diaoul
Copy link
Member

Diaoul commented Apr 29, 2023

Maybe we could use a redis lock instead. I think we use redis already, maybe for caching. I have not looked into that code for a long time 😆

@mreid-tt
Copy link
Contributor Author

mreid-tt commented Apr 29, 2023

Maybe we could use a redis lock instead. I think we use redis already, maybe for caching. I have not looked into that code for a long time 😆

@Diaoul, I've tried switching to a redis implementation. Let me know your thoughts.

@mreid-tt mreid-tt changed the title Add threading to create operations for packages Add locking to create operations for packages Apr 29, 2023
@mreid-tt mreid-tt force-pushed the api-threading-patch branch from ec88680 to 1240c5e Compare April 29, 2023 22:26
@mreid-tt
Copy link
Contributor Author

On further review, I believe the reason for the try block failing during the creation of a new version (create_version=True) is likely because of the os.mkdir() function raising a FileExistsError if the folder already exists. A solution specific to this has been proposed in #101 and this PR will be withdrawn.

@mreid-tt mreid-tt closed this Apr 30, 2023
@mreid-tt mreid-tt deleted the api-threading-patch branch April 30, 2023 01:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to publish package (500 Internal Server Error)
2 participants