Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate dask-cudf README improvements to dask-cudf sphinx docs #16765

Merged
merged 17 commits into from
Sep 16, 2024

Conversation

rjzamora
Copy link
Member

@rjzamora rjzamora commented Sep 6, 2024

Description

Follow up to #16671

  • Moves most of the information recently added to the dask-cudf README into the dask-cudf Sphinx documentation
  • Adds a "Quick-start" example to the simplified dask-cudf README

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@rjzamora rjzamora added 2 - In Progress Currently a work in progress doc Documentation dask Dask issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Sep 6, 2024
@rjzamora rjzamora self-assigned this Sep 6, 2024
@rjzamora rjzamora removed the improvement Improvement / enhancement to an existing function label Sep 6, 2024
@github-actions github-actions bot added the Python Affects Python cuDF API. label Sep 6, 2024
@rjzamora
Copy link
Member Author

@bdice - This PR was partially motivated by your comments/suggestions in #16671 (comment)

The primary goal here is to avoid redundant/overlapping documentation in both the dask-cudf readme and the docs.

@rjzamora rjzamora marked this pull request as ready for review September 10, 2024 14:29
@rjzamora rjzamora requested a review from a team as a code owner September 10, 2024 14:29
@rjzamora rjzamora added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Sep 10, 2024
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We <3 docs!

python/dask_cudf/README.md Outdated Show resolved Hide resolved
@rjzamora rjzamora added 4 - Needs Review Waiting for reviewer to review or respond and removed 3 - Ready for Review Ready for review by team labels Sep 11, 2024
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@rjzamora rjzamora added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 4 - Needs Review Waiting for reviewer to review or respond labels Sep 11, 2024
@rjzamora
Copy link
Member Author

/merge

Copy link
Member

@pentschev pentschev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some suggestions mostly on formatting and naming, otherwise LGTM. Thanks @rjzamora !

docs/cudf/source/user_guide/10min.ipynb Outdated Show resolved Hide resolved
Comment on lines +21 to +23
"<div class=\"alert alert-block alert-info\">\n",
"<b>Note:</b> This notebook uses the explicit Dask cuDF API (dask_cudf) for clarity. However, we strongly recommend that you use Dask's <a href=\"https://docs.dask.org/en/latest/configuration.html\">configuration infrastructure</a> to set the \"dataframe.backend\" option to \"cudf\", and work with the Dask DataFrame API directly. Please see the <a href=\"https://github.com/rapidsai/cudf/tree/main/python/dask_cudf\">Dask cuDF documentation</a> for more information.\n",
"</div>\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these changes actually format it correctly when viewing the notebook? This is a markdown cell, shouldn't it be formatted as markdown? This is a real question, I have no idea what it should look like and didn't check it myself to how that looks after the changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just made this change after seeing that the cell was "off" in my browser: https://docs.rapids.ai/api/cudf/24.10/user_guide/10min/

I had to do a bit of research to learn that you need to use html to make a note like this work in a jupyter notebook.

docs/dask_cudf/source/index.rst Outdated Show resolved Hide resolved
docs/dask_cudf/source/index.rst Outdated Show resolved Hide resolved
docs/dask_cudf/source/index.rst Outdated Show resolved Hide resolved
python/dask_cudf/README.md Show resolved Hide resolved
python/dask_cudf/README.md Outdated Show resolved Hide resolved
Comment on lines 57 to 58
# Compute, persist, or write out the result
query.compute()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a note/warning like https://github.com/rapidsai/cudf/pull/16765/files#diff-9fe601374fbe50ac48fcee9f76db5d573411078bcd654e7c2bbbbe11abc8d6aaR184-R188 in here? Or possibly even use .persist() only to avoid suggesting practices that aren't often the best.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compute is typically correct for groupby aggregations, but I'll change to head to be safe. Seem reasonable?

python/dask_cudf/README.md Outdated Show resolved Hide resolved
python/dask_cudf/README.md Outdated Show resolved Hide resolved
@rapids-bot rapids-bot bot merged commit 124d3e3 into rapidsai:branch-24.10 Sep 16, 2024
94 checks passed
@@ -115,7 +116,7 @@
"source": [
"ds = dask_cudf.from_cudf(s, npartitions=2)\n",
"# Note the call to head here to show the first few entries, unlike\n",
"# cuDF objects, dask-cuDF objects do not have a printing\n",
"# cuDF objects, Dask-cuDF objects do not have a printing\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to annoy you with this @rjzamora , but just wanted to point out in case you missed that you left a few Dask-cuDF entries (with hyphen) and you wanted to explicitly move away from it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to hyphenate Dask cuDF when it is used as an adjective.

@rjzamora rjzamora deleted the dask-cudf-api-doc-update branch September 16, 2024 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge dask Dask issue doc Documentation non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants