If you are interested in contributing to cuML, your contributions will fall into three categories:
- You want to report a bug, feature request, or documentation issue
    - File an issue describing what you encountered or what you want to see changed.
    - When reporting a bug, please run the `cuml/print_env.sh` script and paste its output into the issue so that the relevant environment details are captured.
    - The RAPIDS team will evaluate the issues and triage them, scheduling them for a release. If you believe the issue needs priority attention, comment on the issue to notify the team.
- You want to propose a new feature and implement it
    - Post about your intended feature, and we shall discuss the design and implementation.
    - Once we agree that the plan looks good, go ahead and implement it, using the code contributions guide below.
- You want to implement a feature or bug fix for an outstanding issue
    - Follow the code contributions guide below.
    - If you need more context on a particular issue, please ask and we shall provide.
1. Read the project's README.md to learn how to set up the development environment.
2. Find an issue to work on. The best way is to look for the "good first issue" or "help wanted" labels.
3. Comment on the issue saying you are going to work on it.
4. Get familiar with the developer guide relevant for you:
    - For C++ developers, it is available in DEVELOPER_GUIDE.md.
    - For Python developers, a Python DEVELOPER_GUIDE.md is available as well.
5. Code! Make sure to update unit tests!
6. When done, create your pull request.
7. Verify that CI passes all status checks, or fix if needed.
8. Wait for other developers to review your code and update code as needed.
9. Once reviewed and approved, a RAPIDS developer will merge your pull request.
After you have started a PR (refer to step 6 in the previous section), every `git push <yourRemote> <pr-branch>` triggers a new CI run on all the commits thus far. gpuCI has mechanisms to deal with this to a certain extent, but if you push too frequently, you may clog our gpuCI servers and slow down every PR and conda package generation. So, please be mindful of this and avoid frequent pushes.
To quantify this, the average check in our CI takes between 25 and 32 minutes on our servers. The gpuCI infrastructure has limited resources, so if the servers get overwhelmed, none of the currently active PRs will be able to schedule CI correctly.
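To make the cost of frequent pushes concrete, here is a rough back-of-the-envelope sketch. It assumes each push triggers exactly one check in the 25 to 32 minute range quoted above and that checks do not overlap; `ci_minutes_range` is a hypothetical helper written for this illustration, not something shipped with cuML.

```shell
#!/bin/sh
# Hypothetical helper: estimates total CI minutes consumed by a number of pushes,
# assuming one 25-32 minute check per push (the range quoted above).
ci_minutes_range() {
    pushes="$1"
    echo "$((pushes * 25))-$((pushes * 32))"
}

echo "5 separate pushes: $(ci_minutes_range 5) minutes of CI"
echo "1 batched push:    $(ci_minutes_range 1) minutes of CI"

# Batching in practice: commit locally as often as you like, then push once.
#   git commit -m "first change"
#   git commit -m "second change"
#   git push <yourRemote> <pr-branch>
```

Under these assumptions, collecting several local commits into a single push costs one CI run instead of one per commit.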
Remember, if you are unsure about anything, don't hesitate to comment on issues and ask for clarifications!
Once you have gotten your feet wet and are more comfortable with the code, you can look at the prioritized issues of our next release in our project boards.
Pro Tip: Always look at the release board with the highest number for issues to work on. This is where RAPIDS developers also focus their efforts.
Look at the unassigned issues, and find an issue you are comfortable with contributing to. Start with Step 3 from above, commenting on the issue to let others know you are working on it. If you have any questions related to the implementation of the issue, ask them in the issue instead of the PR.
The cuML repository has two main branches:
- `master` branch: it contains the last released version. Only hotfixes are targeted and merged into it.
- `branch-x.y`: it is the development branch, which contains the upcoming release. All new features should be based on this branch, and merge/pull requests should target this branch (with the exception of hotfixes).
For every new version `x.y` of cuML there is a corresponding branch called `branch-x.y`, from which new feature development starts and into which PRs are targeted and merged before its release. The exceptions are hotfixes, which address critical issues raised by GitHub users: they target the `master` branch, are merged directly into it, and create a new subversion of the project. If you are patching an issue that requires a hotfix, please state the intent in the PR.
For all development, push your changes to a branch (created using the naming instructions below) in your own fork of cuML, then create a pull request when the code is ready.
A few days before releasing version `x.y`, the code of the current development branch (`branch-x.y`) will be frozen and a new branch, `branch-x+1.y`, will be created to continue development.
- Create a new branch based on `branch-x.y`, named following the format `<type>-<name>`, where:
    - Type:
        - fea - if the branch is for a new feature(s)
        - enh - if the branch is an enhancement of an existing feature(s)
        - bug - if the branch is for fixing a bug(s) or regression(s)
    - Name:
        - A name to convey what is being worked on
        - Please use dashes or underscores between words as opposed to spaces.
- Add a line to the `CHANGELOG.md` file (located in the repository root folder) with a one-line description of the functionality implemented in the pull request. Please put the line in the appropriate section: New Feature, Improvement, or Bug Fix. The cuML repository CI requires this change before a pull request can be merged.
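The two steps above can be sketched as local checks to run before opening a pull request. This is a minimal sketch, not part of the cuML repository: `is_valid_branch_name` and `changelog_touched` are hypothetical helpers, and limiting names to letters, digits, dashes, and underscores is an assumption (the guide itself only asks for dashes or underscores instead of spaces).

```shell
#!/bin/sh
# Hypothetical pre-push helpers; neither function is part of the cuML repository.

# Succeeds if the branch name follows <type>-<name> with type fea, enh, or bug.
# Assumption: names are limited to letters, digits, dashes, and underscores.
is_valid_branch_name() {
    case "$1" in
        fea-?*|enh-?*|bug-?*) ;;          # known type followed by a non-empty name
        *) return 1 ;;
    esac
    case "$1" in
        *[!A-Za-z0-9_-]*) return 1 ;;     # reject spaces and any other character
    esac
    return 0
}

# Succeeds if the list of changed paths on stdin contains the root CHANGELOG.md.
changelog_touched() {
    grep -qxF 'CHANGELOG.md'
}

for candidate in fea-new-solver bug-fix_kmeans_init feature-new-solver enh-; do
    if is_valid_branch_name "$candidate"; then
        echo "ok:  $candidate"
    else
        echo "bad: $candidate"
    fi
done

# Typical use inside a clone (branch-x.y stands for the current development branch):
#   git checkout -b fea-new-solver branch-x.y
#   git diff --name-only branch-x.y...HEAD | changelog_touched \
#       || echo "CHANGELOG.md not updated"
```

The changelog check only confirms that the file was modified; choosing the right section (New Feature, Improvement, or Bug Fix) still has to be done by hand.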
Before submitting a pull request, you can do a local build and test on your machine that mimics our gpuCI environment using the `ci/local/build.sh` script.
For detailed information on usage of this script, see here.
Portions adopted from https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md