Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New loaders #47

Merged
merged 216 commits into from
Nov 7, 2023
Merged

New loaders #47

merged 216 commits into from
Nov 7, 2023

Conversation

johentsch
Copy link
Member

@johentsch johentsch commented Jun 27, 2023

Welcoming your feedback and suggestions for a loader API, @huguesdevimeux and @apmcleod; I hope the demos in tests/test_loaders.py give a good enough idea for now. Documentation is yet to follow.

I would also be thankful for some input concerning #44 (see above) which, ideally, I would like to see addressed with this PR, too.

JFYI, this version currently required the bleeding-edge ms3@schema (specified in requirements) to generate JSON resource descriptors, so please pip install -e .[dev] to run the code.

johentsch added 16 commits June 23, 2023 14:04
…<path>); replaces ms3.resolve_dir() with utils.resolve_path();
…tly implemented derive from;

this happens in preparation for the Loader(PipelineStep) and others in the future which do not operate on features
…ucture table; datatypes not cleaned; values such as time signatures not propagated; many redundant columns by using parse_m21_object() a lot; that also slows the parsing down, profiler shows that the use of getattr() on m21 objects takes a lot of time)
…Ds currently a fallback to (folder, fname); switched order of __init__ arguments, making package_name mandatory, but not source (can be added later, or is not necessary when specifying a list of paths, PathFactory, etc...)
…ough the user of Loader.iter_package_descriptors()
@johentsch johentsch mentioned this pull request Jun 27, 2023
Copy link
Collaborator

@huguesdevimeux huguesdevimeux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments mostly about path handling. I did not review the core of it, as, to be completely honest, I don't feel I have the knowledge for.

tests/test_loaders.py Outdated Show resolved Hide resolved
src/dimcat/steps/loaders/utils.py Outdated Show resolved Hide resolved
src/dimcat/steps/loaders/utils.py Outdated Show resolved Hide resolved
src/dimcat/steps/loaders/utils.py Outdated Show resolved Hide resolved
return tuple(ext if ext[0] == "." else f".{ext}" for ext in extensions)


class PathFactory(Iterable[str]):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite too many paramaters, and a builder here would be nice. It would allow to reuse some "base" scanner, and to specialize them quite easily instead of having to redefine the whole thing everytime.

src/dimcat/utils.py Outdated Show resolved Hide resolved
src/dimcat/utils.py Show resolved Hide resolved
src/dimcat/steps/loaders/base.py Outdated Show resolved Hide resolved
@apmcleod
Copy link

I can't say I really looked through the 2000+ lines of changes in any depth, since I already didn't really know the inner workings, but I left some comments, probably on something you weren't asking about though :) Namely, the naming of FeatureStep I find confusing. What is a FeatureStep, and what are its sibling classes? What other types of PipelineSteps are there, and it seems counter-intuitive that every Analyzer and Grouper would be a FeatureStep.

Copy link

@apmcleod apmcleod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment.

src/dimcat/steps/base.py Outdated Show resolved Hide resolved
src/dimcat/steps/base.py Show resolved Hide resolved
src/dimcat/steps/extractors.py Outdated Show resolved Hide resolved
src/dimcat/steps/analyzers/base.py Outdated Show resolved Hide resolved
src/dimcat/steps/pipelines.py Outdated Show resolved Hide resolved
…(), apply_slice_intervals(), relevant for slicing
@johentsch johentsch marked this pull request as ready for review November 7, 2023 02:36
@johentsch johentsch merged commit 396eabe into DCMLab:development Nov 7, 2023
1 check failed
@johentsch johentsch deleted the loader branch November 7, 2023 02:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

implement Loader base class
4 participants