Elyra workbench with Airflow support #49

Open
mamurak opened this issue Jan 5, 2024 · 5 comments

Comments

mamurak commented Jan 5, 2024

The current ODH/RHOAI workbenches come with Elyra-KFP for out-of-the-box integration with Data Science Pipelines.
Upstream Elyra supports Airflow as an additional backend. I'm proposing a new community workbench image for the explicit purpose of developing and submitting Airflow pipelines through Elyra, with the option of integrating GitHub- or GitLab-based Git servers.

shalberd commented Jan 5, 2024

Hi @mamurak, good to hear from you again. Yeah, I think we have to explicitly enable the gitlab option during pip install, correct? At least that is what I did in my custom image build.
And yes, I had to manually modify the list of runtimes in the array to enable airflow again. With Elyra 4.x, Airflow 2.x support for generic pipelines will come, by the way, and Airflow 1.x support will be removed. I've been working on that with a lot of good people who have great input. Elyra overall will get more attention again: @LaVLaS told me that once we can show that Elyra is up to snuff regarding current Airflow, it might even make it back into the official notebook images (aside from contrib) some time in the future.

shalberd commented May 13, 2024

@mamurak @harshad16 I am working on this and it is going quite well. I will make a PR soon, probably in 2-3 weeks.
That is, elyra[gitlab] only, without all the kfp fluff in here.
For example, in
https://github.com/opendatahub-io-contrib/workbench-images/blob/main/snippets/ides/1-jupyter/files/utils/jupyter_elyra_config.py#L11

instead of

c.PipelineProcessorRegistry.runtimes = ['kfp']

in the airflow snippet

c.PipelineProcessorRegistry.runtimes = ['airflow']

or even

c.PipelineProcessorRegistry.runtimes = ['kfp','local']

and in the Pipfile, Pipfile.lock, and requirements-jupyter.txt

https://github.com/opendatahub-io-contrib/workbench-images/blob/main/snippets/bundles/1-minimal/py39/Pipfile#L18

instead of

"elyra[kfp-tekton]" = "~=3.15.0"

for that airflow case then

"elyra[gitlab]" = "~=3.15.0"

I've been using a locally built 3.16-dev wheel file because of changes for Airflow 2 support, but this shows the general direction.
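
The runtime switch described above could also be driven by an environment variable, so that one config file serves both the kfp and the airflow image. This is only a minimal sketch: the variable name ELYRA_RUNTIMES and the helper select_runtimes are hypothetical and not part of Elyra or of the linked jupyter_elyra_config.py, and the set of runtime names is simply the one mentioned in this thread.

```python
import os

# Runtime names as they appear in this thread (assumption: no others are needed).
THREAD_RUNTIMES = ("kfp", "airflow", "local")


def select_runtimes(raw, default=("airflow",)):
    """Parse a comma-separated runtime list such as "airflow,local".

    Falls back to `default` when nothing is given and rejects names
    that are not in THREAD_RUNTIMES.
    """
    names = [part.strip().lower() for part in (raw or "").split(",") if part.strip()]
    if not names:
        return list(default)
    unknown = [name for name in names if name not in THREAD_RUNTIMES]
    if unknown:
        raise ValueError(f"unknown runtime(s): {unknown}")
    # Remove duplicates while keeping the order the user wrote.
    return list(dict.fromkeys(names))


# ELYRA_RUNTIMES is a hypothetical variable name chosen for this sketch.
runtimes = select_runtimes(os.environ.get("ELYRA_RUNTIMES"))

# Inside jupyter_elyra_config.py, where Jupyter provides the `c` object:
# c.PipelineProcessorRegistry.runtimes = runtimes
```

With the variable unset, the sketch falls back to airflow only, which matches the airflow snippet above.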

I will also propose what I found out about completing Elyra without kfp (got that working) as my own PR in Elyra (not by @akchinSTC, though his initial work was invaluable to me):
elyra-ai/elyra#3144

Contributor

koep commented May 13, 2024

@thomas-gremm 🥂

koep commented Jul 5, 2024

Hi @shalberd, I was wondering if you could share an update on your PR?

shalberd commented Jul 11, 2024

Targeting August 15. Still some loose ends to tie up, but gettin' there. See my communication in Slack.
Y'all are not the only ones who think it is worth having Airflow / Airflow 2 support in workbench and runtime images.
Key, among other things, is bootstrapper.py in https://github.com/opendatahub-io-contrib/workbench-images/blob/main/snippets/ides/5-runtime/files/utils/bootstrapper.py

We have disabled Pipelines in the Open Data Hub Dashboard and, for now, do everything with a custom Elyra wheel file that supports Airflow 2.x DAG code rendering for generic pipeline nodes.
I can confirm this works nicely: https://github.com/elyra-ai/elyra/pull/3167/files
I just want to also add all the other Airflow 2 aspects: parsing the Airflow 2.x wheel file for operators via AST (see https://codedamn.com/news/python/python-abstract-syntax-trees-ast-manipulating-code-core) and assembling the package catalog Elyra GUI fields correctly, see
elyra-ai/elyra#3208
For now, with my custom-built wheel file, I am just not using the package catalog functions.
The goal is to get all this into Elyra 4, with Airflow 2.x support and no more Airflow 1.x support.
Tracker: elyra-ai/elyra#3165
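
To illustrate the kind of AST parsing mentioned above, here is a minimal sketch using Python's standard ast module. It is not how Elyra does it: the SAMPLE source, the helper names base_name and find_operators, and the rule "any class with a base whose name ends in Operator" are all assumptions made for this example. Reading the .py files out of a wheel (which is a zip archive) is left out.

```python
import ast

# Made-up module source standing in for a file from an Airflow provider wheel.
# It is only parsed, never executed, so airflow does not need to be installed.
SAMPLE = '''
from airflow.models import BaseOperator

class HelloOperator(BaseOperator):
    def __init__(self, name: str, retries: int = 0, **kwargs):
        super().__init__(**kwargs)
        self.name = name

class NotAnOperator:
    pass
'''


def base_name(node):
    """Return the rightmost name of a base class expression, or ""."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return ""


def find_operators(source):
    """Map each operator-like class name to its __init__ parameter names.

    Assumption: a class counts as an operator if one of its direct bases
    has a name ending in "Operator". `self` and **kwargs are skipped.
    """
    found = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        if not any(base_name(base).endswith("Operator") for base in node.bases):
            continue
        params = []
        for item in node.body:
            if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                args = item.args
                all_args = args.posonlyargs + args.args + args.kwonlyargs
                params = [a.arg for a in all_args if a.arg != "self"]
        found[node.name] = params
    return found


operators = find_operators(SAMPLE)
```

The parameter names collected this way are the raw material a package catalog would need to build GUI fields from. Type annotations and defaults are available on the same AST nodes but are not extracted here.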

About "with the option of integrating Github or Gitlab based git servers.": yeah, we have GitLab, too. So I added
the elyra entry with extras = ["gitlab"] to the Pipfile, for example. OK, with a file reference in my case :-)
elyra = {extras = ["gitlab"], file = "elyra-3.16.0.dev0-py3-none-any.whl"}

I will first make a PR here around August 14, disregarding the custom build and the specifics of Airflow 2 support,
and focusing on multi-runtime support in interactive-image-builder plus some more tweaks here.
Later, expect full Airflow 2 support in Elyra by October. Anyone who can help with how to parse the code via AST in
elyra-ai/elyra#3208 is welcome to comment.
