Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve database management, files storage management and sub-process management #662

Draft
wants to merge 81 commits into
base: main
Choose a base branch
from

Conversation

gschwind
Copy link
Collaborator

@gschwind gschwind commented Sep 7, 2022

Overview

This patch series include many change to have a even better handle of sub-process. The patch series include:

  • Change the management of end-points, now the service provide /wps, /status, /files URL
  • Refactor of Process and Services
  • Rewrite of Storage Back-ends
  • Dynamic status files (Implement dynamic status response and data response #354)
  • Implement optional file lock for the database access, after several test the current lock is not safe.

Missing peace:

  • Implementation of the back-end for S3 storage

More detail on patches series

Storage Back-end rewrite:
The rewrite of back-end storage, simplify the API of back-end storage by only allowing read and write and does not provide particular API for one or another Storage type (for example I removed the copy/move optimisation). Now files in the back-end storage may be exported. If they are exported they can be downloaded via /files endpoint using it's identifier. I did not implemented the S3 back-end because I can't test it, but I think there is nothing that blocking the implementation of the new API.

Multiple end-point:
This allow to implement dynamic status, and to manage files output via Storage back-end. There is no more direct file access, all files download goes through /files URL.

New locks scheme for databases.
The current lock scheme does not work, the filelock scheme is 100% safe in some setup, in particular one WPS server that access to the database. The filelock will not work if some process that is not aware of file lock try to access to the database or if the database is shared within several sever. I did not found a 'standard' lock method that is shared accross several database back-end thus I implemented this one.

I currently use this implementation on our server.

Best regards

Related Issue / Discussion

Additional Information

This contribution is supported by MINES ParisTech.

Contribution Agreement

(as per https://github.com/geopython/pywps/blob/master/CONTRIBUTING.rst#contributions-and-licensing)

  • I'd like to contribute [feature X|bugfix Y|docs|something else] to PyWPS. I confirm that my contributions to PyWPS will be compatible with the PyWPS license guidelines at the time of contribution.
  • I have already previously agreed to the PyWPS Contributions and Licensing Guidelines

Move the process management logic from Process to Service. This make
more sence that process does not manage then self and are used only for
running.
@gschwind gschwind changed the title Fix status failed pull request Improve database management, files storage management and sub-process management Sep 7, 2022
@gschwind gschwind force-pushed the fix-status-failed-pull-request branch from 45e3960 to f6fa83f Compare September 8, 2022 13:27
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.07%) to 80.978% when pulling f6fa83f on gschwind:fix-status-failed-pull-request into 85ca819 on geopython:main.

@huard
Copy link
Collaborator

huard commented Sep 16, 2022

I tried testing with https://github.com/bird-house/emu (a test and demo servier) and all processes failed.

@gschwind
Copy link
Collaborator Author

I will have a look,

Thanks for feed back :)

@gschwind
Copy link
Collaborator Author

gschwind commented Oct 12, 2022

Hello, after a first look I gather two issue.

The first one is fixed by gschwind/pywps-emu@6e4e3aa and it's due to change that make request url ends with /wps when wps is made, to separate /files or /status request. I may change that by using service query parameter instead, as exemple.

The second issue that I found is related to the new Storage handle, I do not store file immediately in the storage when we use code such as response.outputs["output"].file = "/some/path". and it's seems that during test temporary file vanish before I try to put them into the store. If I put the into a directory that do not vanish it's work. In test outputs files are stored in process.workdir, if I put them directly in "/tmp" every thing goes well. I will investigate futher.

@gschwind
Copy link
Collaborator Author

Hello,

I did fixed more test regarding previous comment here : https://github.com/gschwind/pywps-emu/commits/update-test

I have also some update to fix outputs storage, but I need to fix issue for inputs. I will try to fix it and update the branch once I got something that I'm confident with it.

And I still have the test test_wps_ncml.py that I cannot fix, and that is invalid test in my opinion, it tries to retrieve data from not-existing server. Thus the test should be rethinked using client.get(...) instead of d3.retrieveData()

Best Regards.

@gschwind gschwind mentioned this pull request Nov 9, 2022
2 tasks
@gschwind
Copy link
Collaborator Author

gschwind commented Nov 9, 2022

Hello,

I working on replacement of this patch suite, the first part is in #667

Best regards

@cehbrecht cehbrecht marked this pull request as draft October 30, 2023 13:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants