Commit

response to Xiang comments
Ubuntu committed Jan 9, 2024
1 parent 3e2873f commit 5fb25ab
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions docs/source/notebooks/Notebook_0_Data_Prepare.ipynb
@@ -89,7 +89,7 @@
 "metadata": {},
 "source": [
 "## Generate ACM Raw Table Data\n",
-"Then we can use the command below to build the raw table data, which is the stand input data for GraphStorm's gconstruct module."
+"Then we can use the command below to build the raw table data, which is the standard input data for GraphStorm's gconstruct module."
 ]
 },
 {
@@ -613,7 +613,7 @@
 "metadata": {},
 "source": [
 "#### Raw ACM Tables in the `nodes/` and `edges/` folder.\n",
-"As defined in the `./acm_raw/config.json` file, the node data files are stored at the `./acm_raw/nodes/` folder, and edge data files are stored at the `./acm_raw/edges/` folder. General description of these files can be found at the [Input raw node/edge data files](https://graphstorm.readthedocs.io/en/latest/tutorials/own-data.html#input-raw-node-edge-data-files). Here, we can read some nodes(\"paper\") and edges(\\[\"paper\", \"citing\", \"paper\"\\]) tables to learn more about them."
+"As defined in the `./acm_raw/config.json` file, the node data files are stored at the `./acm_raw/nodes/` folder, and edge data files are stored at the `./acm_raw/edges/` folder. General description of these files can be found at the [Input raw node/edge data files](https://graphstorm.readthedocs.io/en/latest/tutorials/own-data.html#input-raw-node-edge-data-files). Here, we can read some node (\"paper\") and edge (\\[\"paper\", \"citing\", \"paper\"\\]) tables to learn more about them."
 ]
 },
 {
@@ -636,7 +636,7 @@
 "source": [
 "**The \"paper\" node table**\n",
 "\n",
-"The paper type node table could be read in as a Pandas DataFrame. The table has a few columns, whose name are used in the `config.json`. For the \"paper\" nodes, there is a `node_id` column, including a unique identifier for each node, a `feat` column, including a 256D numerical tensor for each node, a `text` column, including free text feature for each node, and a `label` column, including an integer to indicate the class that each node is assigned.\n",
+"The paper node table could be read in as a Pandas DataFrame. The table has a few columns, whose names are used in the `config.json`. For the \"paper\" nodes, there is a `node_id` column, including a unique identifier for each node, a `feat` column, including a 256D numerical tensor for each node, a `text` column, including free text feature for each node, and a `label` column, including an integer to indicate the class that each node is assigned.\n",
 "\n",
 "The other two node types, \"author\" and \"subject\", have similar data tables. Users can explore them with the similar code below."
 ]
@@ -849,7 +849,7 @@
 "\n",
 "1. a GraphStorm partitioned configuration JSON file;\n",
 "2. original node id space to GraphStorm node id space mapping files, created during graph processing;\n",
-"3. GraphStorm node id space to shuffled node id space mapping, created during graph patitioning;\n",
+"3. GraphStorm node id space to shuffle node id space mapping, created during graph patitioning;\n",
 "4. label statitic files."
 ]
 },
@@ -918,7 +918,7 @@
 "id": "7cbd682e-36e8-4410-ae77-b0d6931f3052",
 "metadata": {},
 "source": [
-"Because the choice of the different number of partitions, the two folders have different partition data sub-folders, named after \"part0\" to \"part*N*\", where *N* is the number of partitions specified with the `--num-parts` argument.\n",
+"Because the choice of the different number of partitions, the two folders have different partition data sub-folders, named after \"part0\" to \"part***N***\", where ***N*** is the number of partitions specified with the `--num-parts` argument of construct_graph command.\n",
 "\n",
 "<div class=\"alert alert-block alert-info\">\n",
 "<b>Tip:</b> In the next sections, we use the 3-partition graph to explore these four sets of files and sub-folders one by one. But we will use the 1-partition graph in the other notebooks for GraphStorm standalone mode programming tutorials. </div>\n"
@@ -1231,7 +1231,7 @@
 "These node id mappings, in the form of a python dictionary, are stored in those `****_mapping.pt` files, which can be loaded using Pytorch.\n",
 "\n",
 "<div class=\"alert alert-block alert-info\">\n",
-"<b>Tip:</b>In general, uses do not need to do the id mapping back operations. If use GraphStorm's command line interface to train models and do inference, GraphStorm will automatically remapping the partitioned ID space to the original node IDs space. </div>"
+"<b>Tip:</b>In general, uses do not need to do the id mapping back operations. If use GraphStorm's command line interface to train models and do inference, GraphStorm will automatically remapping the partitioned ID space to the original node ID space. </div>"
 ]
 },
 {
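
The hunks at -849 and -1231 describe two node ID mappings: original ID space to GraphStorm ID space, and GraphStorm ID space to the shuffled (partitioned) ID space. A minimal sketch of how two such mappings could be chained and reversed is given here. Everything in it is an assumption for illustration: the toy IDs, the helper names `compose` and `invert`, and the use of plain dictionaries. It does not read the `*_mapping.pt` files or call any GraphStorm API.

```python
# Toy illustration only: the IDs are made up and no GraphStorm file is read.

def compose(first, second):
    """Chain two mappings: key -> first[key] -> second[first[key]]."""
    return {key: second[mid] for key, mid in first.items()}


def invert(mapping):
    """Reverse a one-to-one mapping so that values become keys."""
    inverse = {value: key for key, value in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not one-to-one")
    return inverse


# Hypothetical original IDs -> integer IDs assigned during graph processing.
orig_to_gs = {"p_a": 0, "p_b": 1, "p_c": 2}
# Hypothetical integer IDs -> shuffled IDs assigned during graph partitioning.
gs_to_shuffled = {0: 2, 1: 0, 2: 1}

# Original ID -> shuffled ID, then the reverse lookup used to map results back.
orig_to_shuffled = compose(orig_to_gs, gs_to_shuffled)
shuffled_to_orig = invert(orig_to_shuffled)

print(orig_to_shuffled)
print(shuffled_to_orig[0])
```

The reverse lookup is the "mapping back" step that, per the tip in the last hunk, GraphStorm's command line interface performs automatically.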
