Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import strategy #100

Merged
merged 22 commits into from
Aug 6, 2024
Merged

Import strategy #100

merged 22 commits into from
Aug 6, 2024

Conversation

gumityolcu
Copy link
Collaborator

@gumityolcu gumityolcu commented Aug 5, 2024

Hiii. Let's discuss in a meeting, below is a wall of text so I remember everything.

I deleted a bunch of lines from Makefile (checkpoints, isort) and files (tutorials) to make this work.

So I looked at quantus, scikit-learn, torchvision and pandas. There seems to be 2 things to decide:

Examples from other libraries
1- scikit-learn and torchvision use relative imports, not going sklearn.foo.bar.bla for each import. If the file is already inside sklearn/foo, they just use import .bar.bla. I chose to go with using absolute imports, as we did up to now.

2- pandas and torchvision add code inside the init.py files. So they have everything where they want them to be imported. Quantus basically uses our current import scheme. Scikit-learn does what I chose to go with. So I explain this below.

What I did
In this PR, we don't go all the way into the leaves (files) in the directory tree, we stop at one level above (directories that include files are used to include the objects inside those files.). For example, to include the BaseExplainer class from quanda/explainers/base.py, we use from quanda.explainers import BaseExplainer.

I think you will find that this reduces a lot of redundancy in the import statements.

For example, we use
from quanda.utils.datasets import ActivationDataset instead of from quanda.utils.datasets.activation_dataset import ActivationDataset

Also note that, because quanda/utils/datasets/transformed is a directory, for transformed datasets we go one level deeper:
from quanda.utils.datasets.transformed import LabelFlippingDataset instead of from quanda.utils.datasets.transformed.label_flipping import LabelFlippingDataset

I think this makes sense for our library where we try to have dedicated files for everything. The user shouldn't have to guess when we dedicate a file to something or not. I think they should be able to reach all explainer-related classes or modules (like quanda.explainers.wrappers) when they go from quanda.explainers import

This applies to all of the library, with 2 exceptions:

utils folders/files include a wide range of functionality, so I decided to make explainers.utils its own subpackage. Thus, even though it is a file, I made it it's own submodule. So you need to use from quanda.explainers.utils import explain_fn_from_explainer. Likewise, individual files in the quanda/utils/ directory are taken to be their own submodules.

init.py files (and exception files like quanda/utils/common.py) have the all array defined, which determines which symbols will be imported when the user uses a wildcard "*" import. This prevents implicitly importing symbols with a line like from quanda.metrics import *. I also think it is great to make explicit where each symbol is importable (should be imported). It is also a little bit more of housekeeping to do as the library evolves.

I kind of took some decisions on my own, but I think they make sense for our codebase and different versions can be quickly realized so please inform me if you guys have a strong preference for a different design.

…it__ file of their directory. Exceptions: the files in quanda/utils/, also quanda/explainers/utils.py is imported as from quanda.explainers.utils import foo, bar. Finally: aggregators.py is now inside quanda.metrics
@gumityolcu
Copy link
Collaborator Author

Needs more work, was confused by git.

@gumityolcu gumityolcu closed this Aug 5, 2024
@gumityolcu gumityolcu reopened this Aug 6, 2024
@dilyabareeva
Copy link
Owner

@dilyabareeva will check the remaining erros, put tutorials and isort back

@gumityolcu gumityolcu closed this Aug 6, 2024
@gumityolcu
Copy link
Collaborator Author

gumityolcu commented Aug 6, 2024

I did small changes, fixing relative imports and moving aggregator to explainers. I added tutorials back, and src is now called quanda. This should be the exact same state as here, where there is a flake8 error. Which is why I closed the PR.

i had previously deleted "tutorials" as a quick fix. Then, we get this mypy failure

# Conflicts:
#	Makefile
#	quanda/toy_benchmarks/localization/mislabeling_detection.py
#	quanda/toy_benchmarks/localization/subclass_detection.py
#	quanda/toy_benchmarks/unnamed/dataset_cleaning.py
#	tests/conftest.py
#	tests/metrics/test_unnamed_metrics.py
#	tests/toy_benchmarks/localization/test_mislabeling_detection.py
#	tests/toy_benchmarks/localization/test_subclass_detection.py
#	tests/toy_benchmarks/unnamed/test_dataset_cleaning.py
#	tests/utils/test_training.py
@dilyabareeva dilyabareeva reopened this Aug 6, 2024
@dilyabareeva dilyabareeva linked an issue Aug 6, 2024 that may be closed by this pull request
@dilyabareeva dilyabareeva reopened this Aug 6, 2024
@dilyabareeva dilyabareeva merged commit ca2b0dc into main Aug 6, 2024
6 checks passed
@dilyabareeva dilyabareeva deleted the import_strategy branch August 6, 2024 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Figure out the import strategy
2 participants