Thanks for considering contributing to Streaming!
Issues tagged with good first issue are great options to start contributing.
If you have questions, join us on Slack -- we'll be happy to help you!
We welcome contributions for bug fixes, new features you'd like to share with the community, and improvements to the test suite!
To set up the development environment on your local machine, run the commands below.
1. Install the dependencies needed for testing and linting the code:
pip install -e '.[dev]'
# Optional: If you would like to install all the dependencies
pip install -e '.[all]'
2. Configure pre-commit, which automatically formats code before each commit:
pre-commit install
To submit a contribution:
1. Fork a copy of the Streaming library to your own account.
2. Clone your fork locally and add the mosaicml repo as a remote repository:
git clone git@github.com:<github_id>/streaming.git
cd streaming
git remote add upstream https://github.com/mosaicml/streaming.git
3. Create a branch and make your proposed changes.
cd streaming
git checkout -b cool-new-feature
4. Run linting as part of pre-commit.
git add <file1> <file2>
pre-commit run
# Optional: Run pre-commit for all files
pre-commit run --all-files
5. Run the unit tests to ensure they pass locally.
ulimit -n unlimited # Workaround for 'Too many open files' errors: streaming uses an atexit handler to close its file descriptors at the end of the run.
pytest -vv -s . # run all the unit tests
cd docs && make clean && make doctest # run doctests
6. [Optional] Compile and visualize the documentation locally. If you have documentation changes, running the commands below is mandatory.
pip install -e '.[docs]' # run from the repository root
cd docs
make clean && make html
make host # open the output link in a browser.
See the Makefile for more information.
7. When you are ready, submit a pull request into the streaming repository!
git commit -m "cool feature" # Add relevant commit message
git push origin cool-new-feature
Create a pull request to propose changes you've made to a fork of an upstream repository by following this guide.
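On some systems the `ulimit -n unlimited` workaround from step 5 may be rejected, typically because the hard limit on open files is finite and an unprivileged shell cannot exceed it. The sketch below is one possible fallback, not part of the original instructions: it tries the documented command first and, if the shell refuses, raises the soft limit to the current hard limit. The variable name `hard_limit` is illustrative only.

```shell
# Try the documented workaround first.
if ! ulimit -n unlimited 2>/dev/null; then
    # Assumption: raising the soft limit to the hard limit is "high enough"
    # for the test suite; this is a fallback, not the project's own advice.
    hard_limit=$(ulimit -Hn)
    ulimit -Sn "$hard_limit"
fi

# Show the limit that the following pytest run will inherit.
echo "open-file soft limit: $(ulimit -Sn)"
```

The limit applies only to the current shell and its child processes, so run `pytest` from the same shell session.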
Streaming uses pytest-codeblocks to test all example code snippets. The pytest-codeblocks repository explains how to annotate code snippets and supports most pytest configurations.
For example, if a snippet requires model training, it should be skipped by applying the skip mark (<!--pytest.mark.skip-->).
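As a rough illustration of the annotation described above, the sketch below writes a small Markdown file with two Python snippets: one preceded by the skip mark and one left unannotated. The file name `example_snippet.md`, the temporary directory, and the snippet contents are all hypothetical; the fence characters are built with `printf` only so that this example nests cleanly. If pytest-codeblocks is installed, the file is then checked with `pytest --codeblocks`.

```shell
# Hypothetical scratch location for the example file.
tmpdir=$(mktemp -d)
fence=$(printf '\140\140\140')  # three backticks, built indirectly

# Write a Markdown file containing one skipped and one tested snippet.
cat > "$tmpdir/example_snippet.md" <<EOF
<!--pytest.mark.skip-->
${fence}python
raise RuntimeError("stand-in for an expensive training run")
${fence}

${fence}python
assert 1 + 1 == 2
${fence}
EOF

# Run the snippets only when the plugin is available in this environment.
if python3 -c 'import pytest_codeblocks' 2>/dev/null; then
    python3 -m pytest --codeblocks "$tmpdir/example_snippet.md" \
        || echo "pytest reported failures"
else
    echo "pytest-codeblocks not installed; wrote $tmpdir/example_snippet.md"
fi
```

The comment must sit directly above the fenced block it applies to; the second snippet has no annotation and is therefore executed as a normal test.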