Merge branch 'main' into add_wg_sparse_emb_rebased

awslabs · Jan 13, 2024 · f8ab9df
2 parents afa3d05 + aacf520
commit f8ab9df

Showing 64 changed files with 3,000 additions and 100 deletions.
diff --git a/.github/workflow_scripts/e2e_check.sh b/.github/workflow_scripts/e2e_check.sh
@@ -8,6 +8,7 @@ sh ./tests/end2end-tests/create_data.sh
 sh ./tests/end2end-tests/tools/test_mem_est.sh
 sh ./tests/end2end-tests/data_process/test.sh
 sh ./tests/end2end-tests/data_process/movielens_test.sh
+sh ./tests/end2end-tests/data_process/homogeneous_test.sh
 sh ./tests/end2end-tests/custom-gnn/run_test.sh
 bash ./tests/end2end-tests/graphstorm-nc/test.sh
 bash ./tests/end2end-tests/graphstorm-lp/test.sh

diff --git a/.github/workflow_scripts/lint_check.sh b/.github/workflow_scripts/lint_check.sh
@@ -7,6 +7,7 @@ python3 -m pip install --upgrade prospector pip
 yes | pip3 install astroid==v3.0.0
 FORCE_CUDA=1 python3 -m pip install -e '.[test]'  --no-build-isolation
 pylint --rcfile=./tests/lint/pylintrc ./python/graphstorm/data/*.py
+pylint --rcfile=./tests/lint/pylintrc ./python/graphstorm/distributed/
 pylint --rcfile=./tests/lint/pylintrc ./python/graphstorm/dataloading/
 pylint --rcfile=./tests/lint/pylintrc ./python/graphstorm/gconstruct/
 pylint --rcfile=./tests/lint/pylintrc ./python/graphstorm/config/

diff --git a/docker/sagemaker/Dockerfile.sm b/docker/sagemaker/Dockerfile.sm
@@ -46,7 +46,7 @@ ENV PYTHONPATH="/opt/ml/code/graphstorm/python/:${PYTHONPATH}"
 RUN cp /opt/ml/code/graphstorm/sagemaker/run/* /opt/ml/code/
 
 # Download DGL source code
-RUN cd /root; git clone https://github.com/dmlc/dgl.git; cd dgl; git checkout -b 1.1.0 1.1.0
+RUN cd /root; git clone https://github.com/dmlc/dgl.git
 # Un-comment if we prefer a local DGL distribution
 # COPY dgl /root/dgl
 ENV PYTHONPATH="/root/dgl/tools/:${PYTHONPATH}"

diff --git a/docs/requirements.txt b/docs/requirements.txt
@@ -1,5 +1,7 @@
 sphinx==7.1.2
 sphinx-rtd-theme==1.3.0
+nbsphinx
+pandoc
 --extra-index-url https://download.pytorch.org/whl/cpu
 torch==1.13.1+cpu
 -f https://data.dgl.ai/wheels-internal/repo.html

diff --git a/docs/source/advanced/own-models.rst b/docs/source/advanced/own-models.rst
@@ -272,7 +272,7 @@ The GraphStorm trainers can have evaluators and task trackers associated. The fo
                                   config.early_stop_strategy)
     trainer.setup_evaluator(evaluator)
     # Optional: set up a task tracker to show the progress of training.
-    tracker = GSSageMakerTaskTracker(config)
+    tracker = GSSageMakerTaskTracker(config.eval_frequency)
     trainer.setup_task_tracker(tracker)
 
 GraphStorm's `evaluators <https://github.com/awslabs/graphstorm/blob/main/python/graphstorm/eval/evaluator.py>`_ could help to compute the required evaluation metrics, such as ``accuracy``, ``f1``, ``mrr``, and etc. Users can select the proper evaluator and use the trainer's ``setup_evaluator()`` method to attach them. GraphStorm's `task trackers <https://github.com/awslabs/graphstorm/blob/main/python/graphstorm/tracker/graphstorm_tracker.py>`_ serve as log collectors, which are used to show the process information.

diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -35,6 +35,7 @@
     "sphinx.ext.autosummary",
     "sphinx.ext.coverage",
     "sphinx.ext.mathjax",
+    "nbsphinx",
 ]
 templates_path = ['_templates']
 exclude_patterns = []

diff --git a/docs/source/configuration/configuration-run.rst b/docs/source/configuration/configuration-run.rst
@@ -126,11 +126,6 @@ GraphStorm provides a set of parameters to control how and where to save and res
     - Yaml: ``task_tracker: sagemaker_task_tracker``
     - Argument: ``--task_tracker sagemaker_task_tracker``
     - Default value: ``sagemaker_task_tracker``
-- **log_report_frequency**: The frequency of reporting model performance metrics through task_tracker. The frequency is defined by using number of iterations, i.e., every N iterations the evaluation metrics will be reported. (Please note the evaluation metrics should be generated at the reporting iteration. See "eval_frequency" for how evaluation frequency is controlled.)
-
-    - Yaml: ``log_report_frequency: 1000``
-    - Argument: ``--log-report-frequency 1000``
-    - Default value: ``1000``
 - **restore_model_path**: A path where GraphStorm model parameters were saved. For training, if restore_model_path is set, GraphStom will retrieve the model parameters from restore_model_path instead of initializing the parameters. For inference, restore_model_path must be provided.
 
     - Yaml: ``restore_model_path: /model/checkpoint/``
@@ -278,7 +273,7 @@ GraphStorm provides a set of parameters to control model evaluation.
     - Yaml: ``use_mini_batch_infer: false``
     - Argument: ``--use-mini-batch-infer false``
     - Default value: ``true``
-- **eval_frequency**: The frequency of doing evaluation. GraphStorm trainers do evaluation at the end of each epoch. However, for large-scale graphs, training one epoch may take hundreds of thousands of iterations. One may want to do evaluations in the middle of an epoch. When eval_frequency is set, every **eval_frequency** iterations, the trainer will do evaluation once. The evaluation results can be printed and reported. See **log_report_frequency** for more details.
+- **eval_frequency**: The frequency of doing evaluation. GraphStorm trainers do evaluation at the end of each epoch. However, for large-scale graphs, training one epoch may take hundreds of thousands of iterations. One may want to do evaluations in the middle of an epoch. When eval_frequency is set, every **eval_frequency** iterations, the trainer will do evaluation once. The evaluation results can be printed and reported.
 
     - Yaml: ``eval_frequency: 10000``
     - Argument: ``--eval-frequency 10000``
@@ -381,20 +376,20 @@ Classification and Regression Task
 
 Node Classification/Regression Specific
 .........................................
-- **target_ntype**: (**Required**) The node type for prediction.
+- **target_ntype**: The node type for prediction.
 
     - Yaml: ``target_ntype: movie``
     - Argument: ``--target-ntype movie``
-    - Default value: This parameter must be provided by user.
+    - Default value: For heterogeneous input graph, this parameter must be provided by the user. If not provided, GraphStorm will assume the input graph is a homogeneous graph and set ``target_ntype`` to "_N".
 
 Edge Classification/Regression Specific
 ..........................................
-- **target_etype**: (**Required**) The list of canonical edge types that will be added as a training target in edge classification/regression tasks, for example ``--train-etype query,clicks,asin`` or ``--train-etype query,clicks,asin query,search,asin``. A canonical edge type should be formatted as `src_node_type,relation_type,dst_node_type`. Currently, GraphStorm only supports single task edge classification/regression, i.e., it only accepts one canonical edge type.
+- **target_etype**: The list of canonical edge types that will be added as training targets in edge classification/regression tasks, for example ``--train-etype query,clicks,asin`` or ``--train-etype query,clicks,asin query,search,asin``. A canonical edge type should be formatted as `src_node_type,relation_type,dst_node_type`. Currently, GraphStorm only supports single task edge classification/regression, i.e., it only accepts one canonical edge type.
 
     - Yaml: ``target_etype:``
            | ``- query,clicks,asin``
     - Argument: ``--target-etype query,clicks,asin``
-    - Default value: This parameter must be provided by user.
+    - Default value: For heterogeneous input graph, this parameter must be provided by the user. If not provided, GraphStorm will assume the input graph is a homogeneous graph and set ``target_etype`` to ("_N", "_E", "_N").
 - **remove_target_edge_type**: When set to true, GraphStorm removes target_etype in message passing, i.e., any edge with target_etype will not be sampled during training and inference.
 
     - Yaml: ``remove_target_edge_type: false``
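
Reviewer note on the `target_ntype` / `target_etype` documentation change above: the fallback it describes can be sketched as below. This is a minimal illustration of the documented behavior only, not GraphStorm's actual implementation. The helper names (`parse_canonical_etype`, `resolve_target_ntype`, `resolve_target_etype`) are hypothetical; the only values taken from the diff are the reserved names "_N" and ("_N", "_E", "_N") and the `src_node_type,relation_type,dst_node_type` format.

```python
from typing import List, Optional, Tuple

# Reserved names that the updated docs say are assumed for homogeneous graphs.
DEFAULT_NTYPE = "_N"
DEFAULT_ETYPE = ("_N", "_E", "_N")

CanonicalEtype = Tuple[str, str, str]


def parse_canonical_etype(text: str) -> CanonicalEtype:
    """Parse 'src_node_type,relation_type,dst_node_type' into a 3-tuple.

    Hypothetical helper for illustration; not part of GraphStorm's API.
    """
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected 'src_node_type,relation_type,dst_node_type', got {text!r}"
        )
    src, rel, dst = parts
    return (src, rel, dst)


def resolve_target_ntype(target_ntype: Optional[str]) -> str:
    """Fall back to the homogeneous node type when none is configured."""
    return DEFAULT_NTYPE if target_ntype is None else target_ntype


def resolve_target_etype(target_etype: Optional[List[str]]) -> List[CanonicalEtype]:
    """Fall back to the homogeneous edge type when none is configured."""
    if not target_etype:
        return [DEFAULT_ETYPE]
    return [parse_canonical_etype(t) for t in target_etype]
```

With this sketch, `resolve_target_etype(["query,clicks,asin"])` yields one canonical edge type, while passing `None` yields the homogeneous default. A heterogeneous graph would still need explicit values, as the updated text states.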

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -35,6 +35,14 @@ Welcome to the GraphStorm Documentation and Tutorials
    scale/distributed
    scale/sagemaker
 
+.. toctree::
+   :maxdepth: 1
+   :caption: Programming User Guide
+   :hidden:
+   :glob:
+
+   notebooks/Notebook_0_Data_Prepare
+
 .. toctree::
    :maxdepth: 1
    :caption: Advanced Topics