Commit

thvasilo committed Nov 17, 2023
1 parent f199677 commit c06b4c2
Showing 2 changed files with 3 additions and 12 deletions.
5 changes: 2 additions & 3 deletions python/graphstorm/gconstruct/file_io.py
@@ -200,16 +200,15 @@ def read_data_parquet(data_file, data_fields=None):
     """
     table = pq.read_table(data_file)
     data = {}
-    df_table: pd.DataFrame = table.to_pandas()
+    df_table = table.to_pandas()
     assert df_table.shape[0] > 0, \
         f"{data_file} has an empty data. The data frame shape is {df_table.shape}"
 
     if data_fields is None:
         data_fields = list(df_table.keys())
     for key in data_fields:
         assert key in df_table, f"The data field {key} does not exist in {data_file}."
-        val = df_table[key].to_numpy()
-        d = np.array(val)
+        d = df_table[key].to_numpy()
 
         # For multi-dimension arrays, we split them by rows and
         # save them as objects in parquet. We need to merge them
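
The `file_io.py` hunk drops the intermediate `np.array(val)` call. The sketch below illustrates why that appears safe: `Series.to_numpy()` already returns a NumPy array, so wrapping it in `np.array` only produces another array with the same dtype and contents. The data frame, its column names, and the helpers `old_style` and `new_style` are hypothetical stand-ins written for this illustration; they are not taken from the repository.

```python
import numpy as np
import pandas as pd

# Hypothetical frame standing in for table.to_pandas(); not GraphStorm data.
df_table = pd.DataFrame({"feat": [0.5, 1.5, 2.5], "label": [0, 1, 0]})


def old_style(df, key):
    """Mirror the removed lines: convert to NumPy, then wrap in np.array."""
    val = df[key].to_numpy()
    return np.array(val)


def new_style(df, key):
    """Mirror the added line: use the result of to_numpy() directly."""
    return df[key].to_numpy()


for key in df_table.keys():
    old, new = old_style(df_table, key), new_style(df_table, key)
    # Both paths yield an ndarray with identical dtype and values.
    assert isinstance(new, np.ndarray)
    assert old.dtype == new.dtype
    assert np.array_equal(old, new)
```

One difference worth keeping in mind: `to_numpy()` may return a view of the frame's data rather than a fresh copy, whereas `np.array` copies by default. Code that writes into the result in place could therefore behave differently; whether that matters here depends on the rest of `read_data_parquet`, which this diff does not show.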
10 changes: 1 addition & 9 deletions tests/end2end-tests/graphstorm-ec/mgpu_test.sh
@@ -89,15 +89,7 @@ then
 fi
 
 echo "**************dataset: Generated multilabel MovieLens EC, do inference on saved model"
-python3 -m graphstorm.run.gs_edge_classification --inference --workspace $GS_HOME/inference_scripts/ep_infer \
-    --num-trainers $NUM_INFO_TRAINERS --num-servers 1 --num-samplers 0 \
-    --part-config /data/movielen_100k_multi_label_ec/movie-lens-100k.json \
-    --ip-config ip_list.txt --ssh-port 2222 --cf ml_ec_infer.yaml \
-    --multilabel true --num-classes 6 --node-feat-name movie:title user:feat \
-    --use-mini-batch-infer false --save-embed-path /data/gsgnn_ec/infer-emb/ \
-    --restore-model-path /data/gsgnn_ec/epoch-$best_epoch/ \
-    --save-prediction-path /data/gsgnn_ec/prediction/ --logging-file /tmp/log.txt \
-    --logging-level debug --preserve-input True
+python3 -m graphstorm.run.gs_edge_classification --inference --workspace $GS_HOME/inference_scripts/ep_infer --num-trainers $NUM_INFO_TRAINERS --num-servers 1 --num-samplers 0 --part-config /data/movielen_100k_multi_label_ec/movie-lens-100k.json --ip-config ip_list.txt --ssh-port 2222 --cf ml_ec_infer.yaml --multilabel true --num-classes 6 --node-feat-name movie:title user:feat --use-mini-batch-infer false --save-embed-path /data/gsgnn_ec/infer-emb/ --restore-model-path /data/gsgnn_ec/epoch-$best_epoch/ --save-prediction-path /data/gsgnn_ec/prediction/ --logging-file /tmp/log.txt --logging-level debug --preserve-input True
 
 error_and_exit $?
 
