Releases: mitdbg/palimpzest
0.5.3: Create chat.rst (#96)
* Create chat.rst
* Update pyproject.toml: hotfix for chat
* Update conf.py: hotfix for chat.rst
0.5.2: Update datasources.py (#93)
* Update datasources.py: remove the hardcoded limit on PDF text. We kept it for a demo, but it is now undocumented behavior.
* bump version

Co-authored-by: Matthew Russo <[email protected]>
0.5.1: Add Static Redirect to Chat Demo (#90)
* add static redirect to chat demo
* bump version
* make pandas version constraint more flexible
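Making a dependency version "more flexible" usually means replacing a strict pin with a bounded range in `pyproject.toml`. A hypothetical sketch only; the actual constraint used in palimpzest's `pyproject.toml` may differ:

```toml
# Hypothetical example: strict pin vs. flexible range for pandas.
[project]
dependencies = [
    # before (strict pin, hypothetical):
    # "pandas==2.1.4",
    # after (flexible: any 2.x release at or above 2.0):
    "pandas>=2.0,<3.0",
]
```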
0.5.0: Dev to main. (#88)
* update README
* 1. support add_columns in Dataset; 2. support run().to_df(); 3. add demo in df-newinterface.py (#78)
* Support add_columns in Dataset; support demo in df-newinterface.py. Currently we have to do:

  ```python
  records, _ = qr3.run()
  outputDf = DataRecord.to_df(records)
  ```

  I'll try to make qr3.run().to_df() work in another PR.
* ruff check --fix
* Support run().to_df(): update run() to return a DataRecordCollection, so that it will be easier to support more features for run() output. We support to_df() in this change; follow-up commits will update the other demos.
* ruff check --fix
* fix typo in DataRecordCollection
* Update records.py
* Fix tiny bug in the MAB processor: the code will run into an issue if we don't return any stats for this function in

  ```python
  max_quality_record_set = self.pick_highest_quality_output(all_source_record_sets)
  if (
      not prev_logical_op_is_filter
      or (
          prev_logical_op_is_filter
          and max_quality_record_set.record_op_stats[0].passed_operator
      )
  ```
* Update the record.to_df interface to record.to_df(records: list[DataRecord], project_cols: list[str] | None = None), which is consistent with the other functions in this class.
* Update demos for the new execute() output format
* Better way to get the plan from output.run()
* Fix getting the plan from DataRecordCollection: people used to get the plan from execute() of the streaming processor, which is not good practice. I updated plan_str to plan_stats, so the physical plan must now be obtained from the processor. Consider better ways to provide the executed physical plan to DataRecordCollection, possibly from stats.
* Update df-newinterface.py
* Update code based on comments from Matt: 1. add a cardinality param in add_columns; 2. remove extra testdata files; 3. add __iter__ in DataRecordCollection to help iterate over streaming output.
* see if copilot just saved me 20 minutes
* fix package name
* use sed to get version from pyproject.toml
* bump project version; keep docs behind to test CI pipeline
* bump docs version to match code version
* use new __iter__ method in demos where possible
* add type hint for output of __iter__; use __iter__ in unit tests
* Update download-testdata.sh (#89): added enron-tiny.csv

Co-authored-by: Matthew Russo <[email protected]>
Co-authored-by: Gerardo Vitagliano <[email protected]>
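The 0.5.0 notes describe a new result type, DataRecordCollection, that supports both `to_df()` and direct iteration via `__iter__`. The following is an illustrative sketch only, using stand-in classes that mirror the names in the notes; the real palimpzest implementations differ:

```python
# Sketch of the interface change described above: run() returns a
# collection object that supports to_df(project_cols=...) and iteration.
# DataRecord and DataRecordCollection here are simplified stand-ins,
# not the actual palimpzest classes.
from dataclasses import dataclass

import pandas as pd


@dataclass
class DataRecord:
    """Toy record holding arbitrary field values."""
    fields: dict


@dataclass
class DataRecordCollection:
    """Wraps the records returned by run(), per the release notes."""
    records: list

    def to_df(self, project_cols=None):
        # Optionally project a subset of columns, mirroring the
        # record.to_df(records, project_cols) signature in the notes.
        df = pd.DataFrame([r.fields for r in self.records])
        return df[project_cols] if project_cols else df

    def __iter__(self):
        # Lets callers iterate over (streaming) output directly.
        yield from self.records


# Hypothetical usage in the style of the notes (output = dataset.run()):
output = DataRecordCollection([
    DataRecord({"name": "a", "score": 1}),
    DataRecord({"name": "b", "score": 2}),
])
df = output.to_df(project_cols=["name"])
print(list(df.columns))   # → ['name']
print(len(list(output)))  # → 2
```

The design point is that returning a collection object (rather than a bare tuple of records and stats) gives one place to hang future conveniences such as plan stats or streaming iteration.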
0.4.0: Docs (#77)
* update gitignore and add sphinx to pyproject
* add initial empty docs
* update README
* add docs build to gitignore
* load image over HTTP from S3
* add workflow to publish docs
* fix yaml
* remove workflow.yaml
* fix webpage content; temporarily deploy on docs branch
* install pz before building docs
* fix publish_dir
* test package push
* test push package logic
* test push package
* test push package after OIDC
* update publishing workflow
* fix bug
* fix CNAME changing
* add missing step name
* fix job name
* finally get the damn links to not show up
* starting point for documentation; tweaked README and quickstart
* fixed metaclass causing type issues
* try to extract package version
* try to extract package version
* try to extract package version again
* try to extract package version again
* bump version to help debug pipeline
* bump and add
* yolo
* version bump
* prepare version 0.4.0

Co-authored-by: Matthew Russo <[email protected]>
0.3.4
Update quickstart.ipynb
docs
Delete CNAME
Preliminary alpha release
This code is very preliminary but stable: it will run programs, but it implements only very basic optimizations.