Dataframe v2: inline deduped latest logic #7705

Merged: 2 commits, Oct 12, 2024

@teh-cmc commented Oct 12, 2024

It is impossible for Chunk::deduped_latest_at_index to be fast with current Arrow limitations.

This PR works around that by inlining that logic into the dataframe walk directly.
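For readers unfamiliar with the idea, the following is a minimal sketch of what "deduped latest" semantics could look like when computed inline during a walk, rather than by first building a deduplicated copy of the data. It is not the code from this PR and does not use Rerun's or Arrow's APIs: the function names, the plain `i64` slice standing in for an index column, and the assumption that the column is sorted in ascending order are all hypothetical, chosen only for illustration.

```rust
/// Hypothetical helper, not part of Rerun: given index values sorted in
/// ascending order, advance `cursor` past one run of equal values and return
/// the position of the last row in that run (the "latest" one).
fn next_deduped_latest(times: &[i64], cursor: &mut usize) -> Option<usize> {
    let start = *cursor;
    // Returns `None` once the cursor has walked past the end of the column.
    let current = *times.get(start)?;

    // Extend `end` while the following row carries the same index value.
    let mut end = start;
    while end + 1 < times.len() && times[end + 1] == current {
        end += 1;
    }

    // The next call resumes right after this run.
    *cursor = end + 1;
    Some(end)
}

/// Walks the whole column and collects the surviving row positions.
/// No deduplicated copy of the column is ever materialized.
fn deduped_latest_rows(times: &[i64]) -> Vec<usize> {
    let mut cursor = 0;
    let mut rows = Vec::new();
    while let Some(row) = next_deduped_latest(times, &mut cursor) {
        rows.push(row);
    }
    rows
}

fn main() {
    let times = [10, 10, 20, 30, 30, 30];
    // Rows 1, 2 and 5 are the last occurrences of 10, 20 and 30.
    println!("{:?}", deduped_latest_rows(&times));
}
```

The point of the sketch is the shape of the loop: duplicates are skipped by moving a cursor during the walk itself, so no intermediate filtered array has to be allocated.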

Before:
image

After:
image

Checklist

  • I have read and agree to the Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable)
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@teh-cmc added the following labels on Oct 12, 2024: 🏹 arrow (concerning arrow), 🚀 performance (Optimization, memory use, etc), include in changelog, feat-dataframe-api (Everything related to the dataframe API)
Deployed docs

Commit Link
5a94ea9 https://landing-3jqjh0069-rerun.vercel.app/docs

@teh-cmc merged commit 2a81c67 into main Oct 12, 2024
34 of 35 checks passed
@teh-cmc deleted the cmc/dataframev2_inlined_deduped_latest_logic branch October 12, 2024 15:24