diff --git a/docs/source/cli/model-training-inference/configuration-run.rst b/docs/source/cli/model-training-inference/configuration-run.rst index b99be25708..8260a8ed19 100644 --- a/docs/source/cli/model-training-inference/configuration-run.rst +++ b/docs/source/cli/model-training-inference/configuration-run.rst @@ -7,7 +7,7 @@ GraphStorm provides dozens of configurable parameters for users to control their Launch Arguments -------------------- -GraphStorm's `graphstorm.run.launch `_ command has a set of parameters to control the launch behavior of training and inference. +GraphStorm's model training and inference CLIs (both task-specific and task-agnostic) have a set of parameters to control the behavior of training and inference. - **workspace**: the folder where launch command assume all artifacts were saved. If the other parameters' file paths are relative paths, launch command will consider these files in the workspace. - **part-config**: (**Required**) Path to a file containing graph partition configuration. The graph partition is generated by GraphStorm Partition tools. **HINT**: Use absolute path to avoid any path related problems. Otherwise, the file should be in workspace. @@ -51,13 +51,16 @@ GraphStorm provides a set of parameters to config the GNN model structure (input - Yaml: ``model_encoder_type: rgcn`` - Argument: ``--model-encoder-type rgcn`` - Default value: This parameter must be provided by user. -- **node_feat_name**: User defined feature name. It accepts two formats: a) `fname`, if a node has node features, the corresponding feature name will be fname; b) `ntype0:feat0 ntype1:featA ...`, different node types have different node feature name(s). In the example, "ntype0" has a node feature named "feat0" and "ntype1" has a node feature named "featA". Note: Characters `:` and ` ` are not allowed to be used in node feature names. And in Yaml format, need to put each node's feature in a separated line that starts with a hyphon. 
+- **node_feat_name**: User-defined feature name. It accepts two formats: a) `fname`, if a node has node features, the corresponding feature name will be fname; b) `ntype0:feat0 ntype1:featA ...`, different node types have different node feature name(s). In the example, "ntype0" has a node feature named "feat0" and "ntype1" has a node feature named "featA". - Yaml: ``node_feat_name:`` | ``- "ntype1:featA"`` | ``- "ntype0:feat0"`` - Argument: ``--node-feat-name "ntype0:feat0 ntype1:featA"`` - Default value: If not provided, there will be no node features used by GraphStorm even graphs have node features attached. + + .. Note:: Characters ``:`` and white space are not allowed in node feature names. In Yaml format, each node's feature must be put on a separate line that starts with a hyphen. + - **num_layers**: Number of GNN layers. Must be an integer larger than 0 if given. By default, it is set to 0, which means no GNN layers. - Yaml: ``num_layers: 2`` @@ -111,7 +114,7 @@ GraphStorm provides a set of parameters to control how and where to save and res - Yaml: ``save_model_frequency: 1000`` - Argument: ``--save-model-frequency 1000`` - Default value: ``-1``. GraphStorm will not save models within an epoch. -- **topk_model_to_save**: The number of top best GraphStorm model to save. By default, GraphStorm will keep all the saved models in disk, which will consume huge number of disk space. Users can set a positive integer, e.g. `K`, to let GraphStorm only save `K`` models with the best performance. +- **topk_model_to_save**: The number of best-performing GraphStorm models to save. By default, GraphStorm keeps all saved models on disk, which can consume a huge amount of disk space. Users can set a positive integer, e.g. `K`, to let GraphStorm save only the `K` models with the best performance. 
- Yaml: ``topk_model_to_save: 3`` - Argument: ``--topk-model-to-save 3`` diff --git a/docs/source/cli/model-training-inference/index.rst b/docs/source/cli/model-training-inference/index.rst index e4aa6c0829..a99c59578e 100644 --- a/docs/source/cli/model-training-inference/index.rst +++ b/docs/source/cli/model-training-inference/index.rst @@ -10,7 +10,7 @@ This section provides guidelines of GraphStorm model training and inference on : GraphStorm CLIs require less- or no-code operations for users to perform Graph Machine Learning (GML) tasks. In most cases, users only need to configure the parameters or arguments provided by GraphStorm to fulfill their GML tasks. Users can find the details of these configurations in the :ref:`Model Training and Inference Configurations`. -In addition, there are two node ID mapping operations during the graph construction procedure, and these mapping results are saved in a certain folder by which GraphStorm inference pipelines will automatically use to remap prediction results' node IDs back to the original IDs. In case when such automatic remapping does not occur, you can do it mannually according to the :ref:`GraphStorm Output Node ID Remapping ` guideline. +In addition, there are two node ID mapping operations during the graph construction procedure, and the mapping results are saved in a folder that GraphStorm training and inference CLIs will automatically use to remap prediction results' node IDs back to the original IDs. If such automatic remapping does not occur, you can find the details of the non-remapped training and inference outputs in :ref:`GraphStorm Training and Inference Output `. Users can also do the remapping manually according to the :ref:`GraphStorm Output Node ID Remapping ` guideline. .. toctree:: :maxdepth: 2
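
As a quick illustration of the parameters touched by this diff, a minimal YAML sketch might look like the following. The node type names (`ntype0`, `ntype1`) and feature names (`feat0`, `featA`) are hypothetical, and the fragment shows only the flat keys documented above; the section nesting of a full GraphStorm configuration file may differ, so consult the complete configuration reference when assembling a real file.

```yaml
# Hypothetical configuration fragment illustrating the documented keys.
model_encoder_type: rgcn     # required; no default
num_layers: 2                # must be > 0 if given; default 0 means no GNN layers
node_feat_name:
  - "ntype0:feat0"           # one "ntype:feat" pair per line, starting with a hyphen;
  - "ntype1:featA"           # ":" and white space are not allowed in feature names
save_model_frequency: 1000   # default -1: do not save models within an epoch
topk_model_to_save: 3        # keep only the 3 best-performing saved models
```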