Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing packaging structure #5

Merged
merged 40 commits into from
Jul 23, 2024
Merged

Changing packaging structure #5

merged 40 commits into from
Jul 23, 2024

Conversation

aradhakrishnanGFDL
Copy link
Collaborator

We need a sub-package structure, as in one can import just catalogs, or the wrapper scripts, or the API itself, or get all at once. But, also noting that going forward the catalogs perhaps will also be pushed to a schema repo.

Given the landscape change, we are revisiting the way packaging was done and see if technical issues can be resolved to get this done, collectively.

@aradhakrishnanGFDL aradhakrishnanGFDL marked this pull request as draft July 19, 2024 19:42
@aradhakrishnanGFDL
Copy link
Collaborator Author

I've opened a couple issues for those that could be addressed in a different PR so this is a bit manageable to review. Appreciate quick improvements and fixes needed to address this PR.. while the bigger ones can be new "issues" to address later.

@aradhakrishnanGFDL aradhakrishnanGFDL marked this pull request as ready for review July 19, 2024 20:02
@ilaflott
Copy link
Member

@ilaflott with the changes in the PR, we should be able to the following

from catalogbuilder.scripts import gen_intake_gfdl

I can confirm this in a clean environment but i am getting a warning with a fresh code checkout:

Python 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import catalogbuilder
>>> from catalogbuilder.scripts import gen_intake_gfdl
The module intakebuilder is not installed. Do you have intakebuilder in your sys.path or have you activated the conda environment with the intakebuilder package in it?
Attempting again with adjusted sys.path

which brings me to these lines that i find confusing. it does seem to work though.

and hopefully be able to do the following

conda install noaa-gfdl::catalogbuilder
and bonus conda install catalogbuilder.scripts -c noaa-gfdl

happy to test that out upon publishing pipeline, which runs on merge i take it?

Someone needs to remove the intakebuilder package once this PR is approved, and communicate to ensure analysis-scripts is updated.

@ilaflott Do you know how to update the CONDA tokens (if GitHub teams allow)? See a silent failure here

sorry i haven't touched github secrets and/or conda tokens. I am unsure if i even have permissions, we can chat elsewhere about this if we need.

@ilaflott
Copy link
Member

I pushed a super small tweak to the reqs detailed in the meta.yaml and setup.py

@ilaflott
Copy link
Member

@Ciheim i think you can set the secrets for the respository? https://github.com/orgs/NOAA-GFDL/teams/catalogbuilder-admins

@aradhakrishnanGFDL
Copy link
Collaborator Author

@ilaflott with the changes in the PR, we should be able to the following
from catalogbuilder.scripts import gen_intake_gfdl

I can confirm this in a clean environment but i am getting a warning with a fresh code checkout:

Python 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import catalogbuilder
>>> from catalogbuilder.scripts import gen_intake_gfdl
The module intakebuilder is not installed. Do you have intakebuilder in your sys.path or have you activated the conda environment with the intakebuilder package in it?
Attempting again with adjusted sys.path

which brings me to these lines that i find confusing. it does seem to work though.

Right, if the conda package is installed and user activates the environment, you won't see this error. Since users have an option to run the catalog builder without the conda package, we expect them to add the necessary system paths so we can find the intakebuilder. We do need a new CI test that tests the conda package upon publishing (also a GitHub issue, though unsure when it will be taken up)

and hopefully be able to do the following
conda install noaa-gfdl::catalogbuilder
and bonus conda install catalogbuilder.scripts -c noaa-gfdl

happy to test that out upon publishing pipeline, which runs on merge i take it?

Someone needs to remove the intakebuilder package once this PR is approved, and communicate to ensure analysis-scripts is updated.
@ilaflott Do you know how to update the CONDA tokens (if GitHub teams allow)? See a silent failure here

sorry i haven't touched github secrets and/or conda tokens. I am unsure if i even have permissions, we can chat elsewhere about this if we need.

I've added the secret. The conda publish CI pipeline is now working as expected per the logs. Though it's not approved - I did push the package to the noaa-gfdl channel through this CI (have issues opened, we need a test channel or something too later). So conda install could be tested now.

@aradhakrishnanGFDL
Copy link
Collaborator Author

@ilaflott
Copy link
Member

so conda install noaa-gfdl::catalogbuilder works, but...

(ians_CLEAN_env) ldt-4788479: ~ $] python
Python 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import catalogbuilder
>>> from catalogbuilder.scripts import gen_intake_gfdl
The module intakebuilder is not installed. Do you have intakebuilder in your sys.path or have you activated the conda environment with the intakebuilder package in it?
Attempting again with adjusted sys.path

now i look back at those import lines we WANT to work, in the try blocks:

from intakebuilder import gfdlcrawler, CSVwriter, builderconfig, configparser

this won't work because the package is no longer called intakebuilder, it's called catalogbuilder, and these pieces are under catalogbuilder.intakebuilder here. that's why python can't find those scripts and why the dirname(dirname(filename))) output getting added to sys.path works to solve the import issue- it's essentially saying "look for the scripts at the root directory of this package"

If we adjust the import calls, we can avoid sys.path manipulation, change the line to instead:

from catalogbuilder.intakebuilder import gfdlcrawler, CSVwriter, builderconfig, configparser

...or i can even use a relative import:

from . import gfdlcrawler, CSVwriter, builderconfig, configparser

... and changing the same import lines in applicable files within catalogbuilder/intakebuilder and we get the desired behavior, no warning.

@ilaflott
Copy link
Member

not sure conda install catalogbuilder.scripts -c noaa-gfdl its working as excpected. but that being said, the package is quite small anyways, so i dont think the incentive to get this installation mode working is huge. The main approach, conda install noaa-gfdl::catalogbuilder works well and is more important.

(ians_CLEAN_env) ldt-4788479: ~ $] mamba install catalogbuilder.scripts -c noaa-gfdl

Looking for: ['catalogbuilder.scripts']

conda-forge/linux-64                                        Using cache
conda-forge/noarch                                          Using cache
r/linux-64                                                  Using cache
r/noarch                                                    Using cache
noaa-gfdl/linux-64                                            No change
noaa-gfdl/noarch                                              No change
main/linux-64                                                 No change
main/noarch                                                   No change
Could not solve for environment specs
The following package could not be installed
└─ catalogbuilder.scripts does not exist (perhaps a typo or a missing channel).
(ians_CLEAN_env) ldt-4788479: ~ $] conda install catalogbuilder.scripts -c noaa-gfdl
Channels:
 - noaa-gfdl
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - catalogbuilder.scripts

Current channels:

  - https://conda.anaconda.org/noaa-gfdl
  - defaults
  - https://conda.anaconda.org/conda-forge

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

ilaflott added 3 commits July 22, 2024 15:53
`catalogbuilder.intakebuilder` not found in pipeline. b.c. the package isnt installed into the created environment. adding `pip install .` to check
…like one always expects?

.... wouldnt totally suprise me.
@ilaflott
Copy link
Member

Well i certainly see the challenge in keeping both a user's interactive python shell happy, and github's CI/CD pipeline happy, at the same time.

... python packaging, sigh...

ilaflott added 7 commits July 22, 2024 17:01
we 1) ask github CI/CD to setup a python3.10
we 2) ask conda to create an env with a python, but the package is NOT installed.

i tried "pip install .", but that didnt work... because it used the first python, i imagine.

now i will try calling the explicit pip of the CONDA python to install the package, before calling the test.
sometimes the environment is built with environemtn.yml, which has a slightly diff array of depencdencies in the yaml file, when compared to the conda-build target, meta.yaml./
Copy link
Member

@ilaflott ilaflott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i reverted back to the state it was in when i introduced a tiny tweak. pipeline passes. will look at this in other issues/branches/PRs as you suggested- nontrivial for some reason.

honestly, i think we're fighting the pipeline+it's env much more than we are the packaging. those two sets of problems LOOK very similar, though.

Copy link
Contributor

@ceblanton ceblanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! thanks for this, @ilaflott

@ilaflott ilaflott merged commit 54f92f4 into main Jul 23, 2024
4 checks passed
@ilaflott ilaflott deleted the subpkgtest branch July 23, 2024 15:44
@aradhakrishnanGFDL aradhakrishnanGFDL mentioned this pull request Jul 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants