cylc review: port to Cylc 8 #5937

Open
oliver-sanders opened this issue Jan 24, 2024 · 5 comments
@oliver-sanders
Member

oliver-sanders commented Jan 24, 2024

The cylc review utility provided us with a database-driven, browser-based monitoring tool.

It did not support interactivity or live updates, but proved useful in a number of cases: providing read-only access to other people's workflows at scale (thanks to its efficient backend), debugging (thanks to linkable line numbers in log files), and reviewing historical workflow data.
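To illustrate the "linkable line numbers" feature, here is a minimal sketch of rendering a log file as HTML with one anchor per line. This is not the Cylc 7 implementation; the function name and the `L<n>` anchor scheme are assumptions made for the example.

```python
from html import escape


def render_log_html(text):
    """Render log text as HTML with one linkable anchor per line.

    ASSUMPTION: anchors are named "L<line number>" so that a URL ending in
    "#L42" jumps straight to line 42. The naming scheme is illustrative only.
    """
    rows = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Escape the log content so that "<" or "&" cannot break the page.
        rows.append(
            f'<a id="L{number}" href="#L{number}">{number}</a> {escape(line)}'
        )
    return "<pre>\n" + "\n".join(rows) + "\n</pre>"
```

A link such as `<log-url>#L2` could then be pasted into a bug report to point a colleague at the exact failing line.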

The plan was to replace this with cylc-ui, and the cylc review source code was removed from cylc-flow master. Unfortunately, we have not yet been able to bring the required features into cylc-ui to satisfy these use cases, leaving us with a gap in functionality.

This issue proposes porting the cylc review utility to Python 3 / Cylc 8 to give cylc-ui the time required to fill in these functionality gaps.

Must:

  • Provide the Cylc 7 cylc review interface.
  • Run with Python 3

Should:

  • Be able to run under Apache (WSGI or ASGI) for ease of deployment / service management.
  • Continue to support Cylc 7 workflows if possible.

Questions:

  • Where should this go, cylc-flow, cylc-uiserver or elsewhere?
    • Propose cylc-flow as an optional dependency.
  • What server framework should we use?
    • Propose either Tornado (due to familiarity) or uvicorn/fastapi (due to popularity) providing Apache integration potential.
@oliver-sanders oliver-sanders added this to the some-day milestone Jan 24, 2024
@hjoliver
Member

Another question (I'm expecting a "no" answer, but maybe worth asking):

Is it feasible to incorporate cylc review into the UIS, in the sense of presenting a sort of "cylc review view" that is not completely integrated into the new UI as such, but is at least served by the UIS? On the upside, we could then drop Question 2 (server framework).

@oliver-sanders
Member Author

That is absolutely possible, but it may leak some of the limitations of the UIS into Review, so it needs some thought.

One of the big problems that cylc-review solves very well is large-scale anonymous access for users who don't necessarily have accounts on the system they are inspecting. E.g. we may have a large number of users monitoring a production workflow. We wouldn't want to dump that load onto the server that the production folks are using, so we would have to set up another server under another user account, at which point you're halfway to an Apache deployment anyway.

@oliver-sanders
Member Author

oliver-sanders commented Mar 8, 2024

Investigation:

Find out which Python 3 server frameworks can run under Apache:

  • The old standard was WSGI, but this is no longer supported by Tornado.
  • The new standard is ASGI, but it is unclear whether / how well this works with Apache.

Ideally we would find a modern ASGI framework and an Apache support module. If this exists, try it out with a simple example to find out how well it works.
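For reference, the "simple example" could start from something like the sketch below: a bare ASGI application (ASGI being the asynchronous successor to WSGI) written against the interface itself, with no framework. The response text and the `call_app` helper are placeholders invented for this example; `call_app` drives the app in-process so it can be tried without any server or Apache module installed.

```python
import asyncio


async def app(scope, receive, send):
    """A minimal ASGI application that answers HTTP requests with plain text."""
    if scope["type"] != "http":
        # Raising for other scope types (e.g. "lifespan") tells the server
        # that this sketch does not handle them.
        raise NotImplementedError("only HTTP is handled in this sketch")
    body = f"cylc review placeholder: {scope['path']}".encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/plain; charset=utf-8"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})


async def call_app(path):
    """Drive the app without a server and collect the messages it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent
```

Calling `asyncio.run(call_app("/cylc-review/"))` returns the two response messages. The same `app` callable is what an ASGI server would be pointed at, so the open question is only what sits between Apache and that server.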

@oliver-sanders
Member Author

oliver-sanders commented May 13, 2024

> Is it feasible to incorporate cylc review into the UIS

> that is absolutely possible but may leak some of the limitations of the UIS into Review, needs some thought.

The UIS doesn't fit the Cylc Review model (single-user vs multi-user, distributed vs central). However, there are also Jupyter Hub services. These run under the Hub, not the UIS, so they should be both centralised and multi-user, fitting the Cylc Review model nicely.

I'm not sure how these services are accessed, so this may require a little research. Hopefully a hub service could provide a public endpoint that does not require authentication. If so, this would be a very nice solution that we could bundle with the Cylc Hub. The Hub user would require read access to the relevant portions of the filesystem for this to work.
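On the read-access point, a rough sketch of how a service account could query a workflow's SQLite database without being able to modify it. The `task_states` table and its `status` column are assumptions standing in for the real schema, and `count_task_statuses` is a hypothetical helper, not existing Cylc code.

```python
import sqlite3
from collections import Counter
from pathlib import Path


def count_task_statuses(db_path):
    """Return a Counter of task statuses read from a workflow database.

    The database is opened with mode=ro, so this process can never write
    to the files of the workflow it is inspecting.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # ASSUMPTION: a "task_states" table with a "status" column.
        rows = conn.execute("SELECT status FROM task_states").fetchall()
    finally:
        conn.close()
    return Counter(status for (status,) in rows)
```

Opening read-only also means that a bug in the review service cannot corrupt a running workflow's database.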

Note, the Jupyter Hub service approach is also of interest to cylc/cylc-admin#72

@oliver-sanders
Member Author

oliver-sanders commented Jul 2, 2024

Running Cylc Review under Jupyter Hub (JH)

> Is it feasible to incorporate cylc review into the UIS in the sense of presenting a sort of "cylc review view" that is not completely integrated into the new UI as such, but is at least served by the UIS.

UIS no, but Hub, yes. After a bit of poking:

  • We could integrate Cylc Review with JH.
  • We could serve a fully public (no authentication required) Review web app this way.
  • This wouldn't restrict the choice of approach or web server tooling, but it may make Tornado advisable.

Jupyter Hub (JH) Services

Long story short, JH services give you a proxy, but not a server:

  • JH Services allow you to start a web server externally to Jupyter Hub.
  • JH will configure its proxy so that this external service is accessible behind the JH URL.
  • These services can be "hub managed" (i.e. the hub will start and restart the external server as needed) or fully "external" (i.e. the server's lifecycle is not coupled to the hub's lifecycle). Note: services are started on hub start-up. From experimentation, if the service is killed, it is relaunched on the next API request (i.e. on the next visit to the service's URL).
  • These services can use any web server implementation they like in any language (it's a proxy mapping not a code API).
  • However, if you want things from the hub, e.g. authentication or the ability to call JH APIs, then it makes sense to stick with Tornado, as this allows you to use the JH base classes. It would be possible to use other web server implementations, but you would have to develop the JH integration from scratch (the jupyverse code is probably a good reference for doing this).
  • These services may provide authenticated (via JH integration) or unauthenticated endpoints.
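Because the hub only proxies requests, the service sees the full URL path including its `/services/<name>/` prefix and has to route accordingly. Below is a small sketch of an unauthenticated service that does this using only the standard library (chosen to keep the example dependency-free, not as a recommendation over Tornado). `JUPYTERHUB_SERVICE_PREFIX` is exported by Jupyter Hub to hub-managed services; `local_path`, `ReviewHandler` and `serve` are hypothetical names, and the response body is a placeholder.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set by Jupyter Hub for managed services, e.g. "/services/cylc-review/".
# The "/" fallback allows the same code to run standalone.
PREFIX = os.environ.get("JUPYTERHUB_SERVICE_PREFIX", "/")


def local_path(request_path, prefix=PREFIX):
    """Strip the proxy prefix; return None if the path lies outside it."""
    path = request_path.split("?", 1)[0]  # ignore any query string
    base = prefix.rstrip("/")
    if path == base or path.startswith(base + "/"):
        return path[len(base):] or "/"
    return None


class ReviewHandler(BaseHTTPRequestHandler):
    """Placeholder read-only handler; no authentication is performed."""

    def do_GET(self):
        path = local_path(self.path)
        if path is None:
            self.send_error(404)
            return
        body = f"read-only view of {path}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8042):
    """Run the service; the port must match the 'url' in the hub config."""
    ThreadingHTTPServer(("127.0.0.1", port), ReviewHandler).serve_forever()
```

Calling `serve()` from the service's `command` would start it. Reading the prefix from the environment is one way a launcher would not need to patch the application's namespace by hand.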

So we could potentially run cylc review behind JH. Technically this is the same thing as running it standalone, but there are two advantages to this approach:

  1. Both cylc-ui and cylc-review can be served from the same host:port (one fewer port to open).
  2. Both can be configured via the same jupyter_config.py file. In theory cylc-review could even be enabled by default in the Cylc UI Server configuration.

POC Service (Python 2 cylc-review)

Jupyter Configuration:

c.JupyterHub.services = [
    {
        'name': 'cylc-review',
        'command': ['/path/to/cylc-review-launcher'],
        'url': 'http://0.0.0.0:8042/',
    }
]

Launcher script (Python 2):

#!/usr/bin/python

import sys

# load the Cylc 7 library code
sys.path.insert(0, '/path/to/cylc-7/lib')
import cylc.review
from cylc.ws import _ws_init

# hack the log path
cylc.review.LOG_ROOT_TMPL = '~/.cylc/cylc-review'

# hack the service namespace to allow it to run under the JUPYTERHUB root URL
cylc.review.CylcReviewService.NS = 'services/cylc'

# start review in standalone mode
_ws_init(cylc.review.CylcReviewService, 8042, service_root_mode=True)

Launch Jupyter Hub as normal, then navigate to <hub-url>/services/cylc-review.

Note, a Python launcher script is only required to hack the Cylc Review code in order to allow it to be served behind the Jupyter Hub proxy. We should be able to achieve this with JH config alone.

Conclusions

  • JH services should take the pressure off the Apache/WSGI requirement.
    • Sites are already committed to maintaining a Jupyter Hub deployment to serve the Cylc UI so this doesn't constitute an extra service to deploy.
  • It probably makes sense to go with Tornado.
    • Permits use of Jupyter Hub APIs if desired.
    • Doesn't impact the dependency base.
  • We should be able to get cylc-review running "out of the box" in Cylc Hub.
    • It could be configured in jupyter_config.py.
  • Unless standalone access is of interest, it would make sense to put the cylc-review code in the cylc-uiserver repository.
  • The same approach would be suitable for the Cylc Server Monitor.
