Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add EnsembleFrame Support to Tape (#308)
* A minimal Dask Dataframe subclass for the Ensemble * Addressed comments, added test fixture. * Make convert_flux_to_mag part of the EnsembleFrame * Ensembles can now track a group of labeled frames * Preserve EnsembleFrame metadata after assign() * Parquet support for frame subclasses checkpoint * Reverting changes to tests * Adds test for objsor_from_parquet * Addressed comments * Removed adding column via apply * Fix comment typo * Fix EnsembleFrame.set_index * Add update_ensemble() and Use EnsembleFrames (#252) * Adds EnsembleFrame.update_ensemble() * Use EnsembleFrames throughout the Ensemble * Udpdate ensemble test * Extends update_ensemble test cases * Unpin sphinx to address docs build fail * Fix minor test error * Remove debug line * Propagate EnsembleFrame._is_dirty (#264) * EnsembleFrames should propagate is_dirty * Test that a frame's dirty status propagates * Update doc strings * Address review comment * Have update_frame mark frames as dirty (#267) * Remove calls to set_dirty in ensemble (#269) * Update refactor (#274) * Add ensemble loader functions for dataframes * Updated unit tests * Lint fixes * Always update column mapping * Addressed review comments * Ensure object frame is indexed * adds a dask_on_ray tutorial * add performance comp; add use_map comment --------- Co-authored-by: Doug Branton <[email protected]> * Merge main into tape_ensemble_refactor (#277) * Add ensemble loader functions for dataframes * Updated unit tests * Lint fixes * Always update column mapping * Addressed review comments * Ensure object frame is indexed * adds a dask_on_ray tutorial * add performance comp; add use_map comment * sync with map_partitions * sync with map_partitions * sync with map_partitions * sync with map_partitions * coalesce with map_partitions * use dataframes instead of series * add descriptive comments * implement suggestions * Update TAPE README.md Update the project description for TAPE to better reflect the current state and goals of the project. * Set object table index for from_dask_dataframe * add zero_point as float input :q q * add ensemble default cols * S82 RRLyr notebook * Move rrlyr nb to examples * Update requirements.txt to unpin sphinx * Update pyproject.toml to unpin sphinx * add calc_nobs * add calc_nobs * add calc_nobs * reduce scope of sync_tables * address divisions issue * add temporary cols test * improve coverage * add temporary kwarg to assign * add temporary kwarg to assign * drop divisions * drop brackets * fix bug in sync * Issue 199: Added static Ensemble read constructors to tape namespace (#256) * Added static read constructors to tape namespace * Removed @staticmethod as python 3.9 didn't like it * Reformatted via black * Changed read_dask_dataframe to call from_ method * Collapsed create dask client args to single arg * Fixed dask_client parameter * reformatted via black * Added missing unit test * Resolved code review comments from PR 256 * Fixed failing unit test Removed reference to Ensemble._nobs_band_cols field * fix bug in sync --------- Co-authored-by: Doug Branton <[email protected]> Co-authored-by: Konstantin Malanchev <[email protected]> Co-authored-by: Olivia R. Lynn <[email protected]> Co-authored-by: Chris Wenneman <[email protected]> * Fix EnsembleFrame.set_dirty and map_partitions metadata propagation (#280) * FIx _Frame.set_dirty * Update propgating data in map_partitions * Fix typo * Ensemble.update_frame no longer infers if a frame is dirty by checking if row count changed (#281) * Mark frames dirty without len() call * Move calls to set_dirty to EnsembleFrame * Support storing batch results for custom meta (#285) * Add meta handling for batch * Add unit tests for custom meta * Remove unit test sanity check, fix warning output * Provide default labels for result frames. * Update Remaining TAPE Documentation Notebooks for the Refactor (#298) * Remove ._source and ._object * Update notebooks for refactor * Fix find-replace error * Update Docs for TAPE EnsembleFrame Refactor (#290) * Initial commit for notebooks with refactor API * Removed _object and _source references * Added sync tables example * Address comment * Addressed frame renaming * Update docs/tutorials/working_with_the_ensemble.ipynb Co-authored-by: Konstantin Malanchev <[email protected]> * Addressed comments * Clear output --------- Co-authored-by: Konstantin Malanchev <[email protected]> * Allow EnsembleFrame.compute to Trigger Object-Source Table Syncing (#295) * Allow EnsembleFrame.compue to sync tables * Fixed docstring * Add Explicit Metadata Propagation for EnsembleFrame joins (#301) * Support propagating frame metadata in joins * Update doc strings and test * Update test * Merge Main into Ensemble Refactor Branch (#304) * check divisions, enable lazy syncs * check divisions, enable lazy syncs * initial tests * add tests; calc_nobs preserve divisions * batch with divisions * cleanup * fix sf2 tests * add sync_tables check * cleanup * fix calc_nobs reset_index issue * per table warnings; index comments * add map_partitions mode for calc_nobs when divisions are known * build metadata * build metadata * add multi partition test * add version file to init * add small test * Fix table syncing to use inner joins. (#303) * Fix table syncing to use inner joins. * fix lint error * Update test --------- Co-authored-by: Doug Branton <[email protected]> * Revert "Merge Main into Ensemble Refactor Branch (#304)" This reverts commit 5c847e1. * Fix linting * Remove unsupported type annotations * Fix merge error * Use client=False in test_analysis * Remove '_object' and '_source' fields * Fix linting errors * Address review comments, add tests --------- Co-authored-by: Doug Branton <[email protected]> Co-authored-by: Konstantin Malanchev <[email protected]> Co-authored-by: Olivia R. Lynn <[email protected]> Co-authored-by: Chris Wenneman <[email protected]>
- Loading branch information