Skip to content

Commit

Permalink
Review comments
Browse files Browse the repository at this point in the history
  • Loading branch information
thvasilo committed Sep 27, 2023
1 parent 0fc856c commit ad9ec0b
Show file tree
Hide file tree
Showing 5 changed files with 279 additions and 204 deletions.
61 changes: 32 additions & 29 deletions graphstorm-processing/docs/source/developer/developer-guide.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,8 @@ jump into the project.

The steps we recommend are:

Install JDK 8, 11 or 17
~~~~~~~~~~~~~~~~~~~~~~~
Install JDK 8, 11
~~~~~~~~~~~~~~~~~

PySpark requires a compatible Java installation to run, so
you will need to ensure your active JDK is using either
Expand All @@ -33,7 +33,7 @@ On Amazon Linux 2 you can use:
sudo yum install java-11-amazon-corretto-headless
sudo yum install java-11-amazon-corretto-devel
Install pyenv
Install ``pyenv``
~~~~~~~~~~~~~

``pyenv`` is a tool to manage multiple Python version installations. It
Expand All @@ -50,13 +50,14 @@ or use ``brew`` on a Mac:
brew update
brew install pyenv
For more info on ``pyenv`` see https://github.com/pyenv/pyenv
For more info on ``pyenv`` see `its documentation. <https://github.com/pyenv/pyenv>`

Create a Python 3.9 env and activate it.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We use Python 3.9 in our images so this most closely resembles the
execution environment on SageMaker.
execution environment on our Docker images that will be used for distributed
training.

.. code-block:: bash
Expand All @@ -65,12 +66,12 @@ execution environment on SageMaker.
..
Note: We recommend not mixing up conda and pyenv. When developing for
Note: We recommend not mixing up ``conda`` and ``pyenv``. When developing for
this project, simply ``conda deactivate`` until there's no ``conda``
env active (even ``base``) and just rely on pyenv+poetry to handle
env active (even ``base``) and just rely on ``pyenv`` and ``poetry`` to handle
dependencies.

Install poetry
Install ``poetry``
~~~~~~~~~~~~~~

``poetry`` is a dependency and build management system for Python. To install it
Expand All @@ -80,7 +81,7 @@ use:
curl -sSL https://install.python-poetry.org | python3 -
Install dependencies through poetry
Install dependencies through ``poetry``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now we are ready to install our dependencies through ``poetry``.
Expand Down Expand Up @@ -111,13 +112,12 @@ You can also activate and use the virtual environment using:
# We're now using the graphstorm-processing-py3.9 env so we can just run
pytest ./graphstorm-processing/tests
To learn more about poetry see:
https://python-poetry.org/docs/basic-usage/
To learn more about ``poetry`` see its `documentation <https://python-poetry.org/docs/basic-usage/>`_

Use ``black`` to format code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use ``black`` to format code [optional]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We use `black <https://black.readthedocs.io/en/stable/index.html>`__ to
We use `black <https://black.readthedocs.io/en/stable/index.html>`_ to
format code in this project. ``black`` is an opinionated formatter that
helps speed up development and code reviews. It is included in our
``dev`` dependencies so it will be installed along with the other dev
Expand Down Expand Up @@ -148,19 +148,18 @@ We include the ``mypy`` and ``pylint`` linters as a dependency under the ``dev``
of dependencies. These linters perform static checks on your code and
can be used in a complimentary manner.

We recommend using VSCode and enabling the mypy linter to get in-editor
annotations:

https://code.visualstudio.com/docs/python/linting#_general-settings
We recommend `using VSCode and enabling the mypy linter <https://code.visualstudio.com/docs/python/linting#_general-settings>`_
to get in-editor annotations.

You can also lint the project code through:

.. code-block:: bash
poetry run mypy ./graphstorm_processing
To learn more about ``mypy`` and how it can help development see:
https://mypy.readthedocs.io/en/stable/
To learn more about ``mypy`` and how it can help development
`see its documentation <https://mypy.readthedocs.io/en/stable/>`_.


Our goal is to minimize ``mypy`` errors as much as possible for the
project. New code should be linted and not introduce additional mypy
Expand All @@ -169,17 +168,21 @@ errors. When necessary it's OK to use ``type: ignore`` to silence

As a project, GraphStorm requires a 10/10 pylint score, so
ensure your code conforms to the expectation by running
`pylint --rcfile=/path/to/graphstorm/tests/lint/pylintrc` .

.. code-block:: bash
pylint --rcfile=/path/to/graphstorm/tests/lint/pylintrc
on your code before commits. To make this easier we include
a pre-commit hook below.

Use a pre-commit hook to ensure black and pylint runs before commits
Use a pre-commit hook to ensure ``black`` and ``pylint`` runs before commits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To make code formatting and pylint checks easier for graphstorm-processing
developers we recommend using a pre-commit hook.
To make code formatting and ``pylint`` checks easier for graphstorm-processing
developers, we recommend using a pre-commit hook.

We include ``pre-commit`` in the projects ``dev`` dependencies, so once
We include ``pre-commit`` in the project's ``dev`` dependencies, so once
you have activated the project's venv (``poetry shell``) you can just
create a file named ``.pre-commit-config.yaml`` with the following contents:

Expand Down Expand Up @@ -213,15 +216,15 @@ And then run:
pre-commit install
which will install the ``black`` hook into your local repository and
which will install the ``black`` and ``pylin`` hooks into your local repository and
ensure it runs before every commit.

.. note:: text
.. note::

The pre-commit hook will also apply to all commits you make to the root
GraphStorm repository. Since that one doesn't use ``black``, you might
GraphStorm repository. Since that Graphstorm doesn't use ``black``, you might
want to remove the hooks. You can do so from the root repo
using ``rm -rf .git/hooks``.

Both project use ``pylint`` to check Python files so we'd still recommend using
Both projects use ``pylint`` to check Python files so we'd still recommend using
that hook even if you're doing development for both GSProcessing and GraphStorm.
Loading

0 comments on commit ad9ec0b

Please sign in to comment.