[Enhancement] change the input argument of GSTaskTrackerAbc to be an integer #699

Merged
merged 5 commits into from
Jan 11, 2024
2 changes: 1 addition & 1 deletion docs/source/advanced/own-models.rst
@@ -272,7 +272,7 @@ The GraphStorm trainers can have evaluators and task trackers associated. The fo
config.early_stop_strategy)
trainer.setup_evaluator(evaluator)
# Optional: set up a task tracker to show the progress of training.
tracker = GSSageMakerTaskTracker(config)
tracker = GSSageMakerTaskTracker(config.eval_frequency)
trainer.setup_task_tracker(tracker)

GraphStorm's `evaluators <https://github.com/awslabs/graphstorm/blob/main/python/graphstorm/eval/evaluator.py>`_ compute the required evaluation metrics, such as ``accuracy``, ``f1``, and ``mrr``. Users can select the appropriate evaluator and attach it with the trainer's ``setup_evaluator()`` method. GraphStorm's `task trackers <https://github.com/awslabs/graphstorm/blob/main/python/graphstorm/tracker/graphstorm_tracker.py>`_ serve as log collectors, which are used to show progress information.
7 changes: 1 addition & 6 deletions docs/source/configuration/configuration-run.rst
@@ -126,11 +126,6 @@ GraphStorm provides a set of parameters to control how and where to save and res
- Yaml: ``task_tracker: sagemaker_task_tracker``
- Argument: ``--task_tracker sagemaker_task_tracker``
- Default value: ``sagemaker_task_tracker``
- **log_report_frequency**: The frequency of reporting model performance metrics through task_tracker. The frequency is defined by using number of iterations, i.e., every N iterations the evaluation metrics will be reported. (Please note the evaluation metrics should be generated at the reporting iteration. See "eval_frequency" for how evaluation frequency is controlled.)

- Yaml: ``log_report_frequency: 1000``
- Argument: ``--log-report-frequency 1000``
- Default value: ``1000``
- **restore_model_path**: A path where GraphStorm model parameters were saved. For training, if restore_model_path is set, GraphStorm will retrieve the model parameters from restore_model_path instead of initializing the parameters. For inference, restore_model_path must be provided.

- Yaml: ``restore_model_path: /model/checkpoint/``
@@ -278,7 +273,7 @@ GraphStorm provides a set of parameters to control model evaluation.
- Yaml: ``use_mini_batch_infer: false``
- Argument: ``--use-mini-batch-infer false``
- Default value: ``true``
- **eval_frequency**: The frequency of doing evaluation. GraphStorm trainers do evaluation at the end of each epoch. However, for large-scale graphs, training one epoch may take hundreds of thousands of iterations. One may want to do evaluations in the middle of an epoch. When eval_frequency is set, every **eval_frequency** iterations, the trainer will do evaluation once. The evaluation results can be printed and reported. See **log_report_frequency** for more details.
- **eval_frequency**: The frequency of doing evaluation. GraphStorm trainers do evaluation at the end of each epoch. However, for large-scale graphs, training one epoch may take hundreds of thousands of iterations. One may want to do evaluations in the middle of an epoch. When eval_frequency is set, every **eval_frequency** iterations, the trainer will do evaluation once. The evaluation results can be printed and reported.

- Yaml: ``eval_frequency: 10000``
- Argument: ``--eval-frequency 10000``
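
To make the ``eval_frequency`` semantics described above concrete, here is a minimal sketch of an iteration-based evaluation schedule. It is not GraphStorm's trainer code: ``should_evaluate`` and ``evaluation_points`` are hypothetical helpers, and counting iterations globally across epochs (rather than restarting the count each epoch) is an assumption made for illustration.

```python
from typing import List, Optional


def should_evaluate(iteration: int, eval_frequency: Optional[int]) -> bool:
    """Return True when a mid-epoch evaluation is due at this iteration.

    Hypothetical helper: iterations are 1-based, and None (or a
    non-positive value) means "only evaluate at the end of each epoch".
    """
    if eval_frequency is None or eval_frequency <= 0:
        return False
    return iteration % eval_frequency == 0


def evaluation_points(iters_per_epoch: int, num_epochs: int,
                      eval_frequency: Optional[int]) -> List[int]:
    """List the global iterations at which an evaluation would run."""
    points = []
    total_iters = 0
    for _ in range(num_epochs):
        for _ in range(iters_per_epoch):
            total_iters += 1
            if should_evaluate(total_iters, eval_frequency):
                points.append(total_iters)
        # End-of-epoch evaluation, skipped if one just ran at this iteration.
        if not points or points[-1] != total_iters:
            points.append(total_iters)
    return points
```

With 25 iterations per epoch, 2 epochs, and a frequency of 10, this schedule combines the mid-epoch evaluations with the end-of-epoch ones; with no frequency set, only the end-of-epoch evaluations remain.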
2 changes: 1 addition & 1 deletion examples/customized_models/HGT/hgt_nc.py
@@ -335,7 +335,7 @@ def main(args):
config.early_stop_strategy)
trainer.setup_evaluator(evaluator)
# Optional: set up a task tracker to show the progress of training.
tracker = GSSageMakerTaskTracker(config)
tracker = GSSageMakerTaskTracker(config.eval_frequency)
trainer.setup_task_tracker(tracker)

# Start the training process.
2 changes: 1 addition & 1 deletion examples/peft_llm_gnn/main_nc.py
@@ -62,7 +62,7 @@ def main(config_args):
config.early_stop_strategy,
)
trainer.setup_evaluator(evaluator)
tracker = GSSageMakerTaskTracker(config)
tracker = GSSageMakerTaskTracker(config.eval_frequency)
trainer.setup_task_tracker(tracker)

# create train loader
3 changes: 0 additions & 3 deletions examples/peft_llm_gnn/nc_config_Video_Games.yaml
@@ -19,11 +19,8 @@ gsf:
batch_size: 4
dropout: 0.1
eval_batch_size: 4
# eval_frequency: 100
#log_report_frequency: 50
lr: 0.0001
num_epochs: 10
# save_model_frequency: 300
wd_l2norm: 1.0e-06
input:
restore_model_path: null
2 changes: 1 addition & 1 deletion python/graphstorm/gsf.py
@@ -656,4 +656,4 @@ def check_homo(g):

def create_builtin_task_tracker(config):
tracker_class = get_task_tracker_class(config.task_tracker)
return tracker_class(config)
return tracker_class(config.eval_frequency)
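
The change to the factory above can be illustrated with a self-contained sketch. Everything here is a stand-in rather than GraphStorm code: ``ToyTaskTracker``, ``TRACKER_CLASSES``, and ``create_task_tracker`` are hypothetical names, and a ``SimpleNamespace`` plays the role of the config object. The point is only that the factory now extracts the one integer the tracker needs instead of handing over the whole config.

```python
from types import SimpleNamespace


class ToyTaskTracker:
    """Hypothetical stand-in for a tracker class; not part of GraphStorm."""

    def __init__(self, log_report_frequency):
        self.report_frequency = log_report_frequency


# Hypothetical registry. GraphStorm resolves the class through
# get_task_tracker_class(); a plain dict is used here for illustration.
TRACKER_CLASSES = {"toy_task_tracker": ToyTaskTracker}


def create_task_tracker(config):
    """Look up the tracker class by name and pass it only an integer."""
    tracker_class = TRACKER_CLASSES[config.task_tracker]
    return tracker_class(config.eval_frequency)


# A minimal config stand-in with just the two fields the factory reads.
config = SimpleNamespace(task_tracker="toy_task_tracker", eval_frequency=1000)
tracker = create_task_tracker(config)
```

One consequence of this design is that a tracker can be constructed in a test or a notebook without building a full configuration object.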
10 changes: 6 additions & 4 deletions python/graphstorm/tracker/graphstorm_tracker.py
@@ -22,11 +22,13 @@ class GSTaskTrackerAbc():

Parameters
----------
config: GSConfig
Configurations. Users can add their own configures in the yaml config file.
log_report_frequency: int
The frequency of reporting model performance metrics through task_tracker.
The frequency is defined by using number of iterations, i.e., every N iterations
the evaluation metrics will be reported.
"""
def __init__(self, config):
self._report_frequency = config.log_report_frequency # Can be None if not provided
def __init__(self, log_report_frequency):
self._report_frequency = log_report_frequency # Can be None if not provided

@abc.abstractmethod
def log_metric(self, metric_name, metric_value, step, force_report=False):
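
As a rough illustration of the integer-based constructor shown in this hunk, the sketch below defines a small abstract tracker and one concrete subclass. The class names ``TaskTrackerSketch`` and ``ListTaskTracker`` are hypothetical, and the reporting policy in ``_do_report`` (report when the step is a multiple of the frequency, never when the frequency is unset) is an assumption, not a copy of GraphStorm's implementation.

```python
import abc


class TaskTrackerSketch(abc.ABC):
    """Hypothetical base class mirroring the integer-based constructor."""

    def __init__(self, log_report_frequency):
        # May be None when no frequency is configured.
        self._report_frequency = log_report_frequency

    def _do_report(self, step):
        """Assumed policy: report every N steps; never when N is unset."""
        if self._report_frequency is None or self._report_frequency <= 0:
            return False
        return step % self._report_frequency == 0

    @abc.abstractmethod
    def log_metric(self, metric_name, metric_value, step, force_report=False):
        """Record one metric value for the given step."""


class ListTaskTracker(TaskTrackerSketch):
    """Collects reported metrics in a list instead of printing them."""

    def __init__(self, log_report_frequency):
        super().__init__(log_report_frequency)
        self.records = []

    def log_metric(self, metric_name, metric_value, step, force_report=False):
        if force_report or self._do_report(step):
            self.records.append((metric_name, metric_value, step))


tracker = ListTaskTracker(100)
tracker.log_metric("accuracy", 0.81, step=100)  # on schedule, recorded
tracker.log_metric("accuracy", 0.82, step=150)  # off schedule, skipped
tracker.log_metric("accuracy", 0.83, step=150, force_report=True)  # forced
```

Only the on-schedule call and the forced call end up in ``tracker.records``.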
7 changes: 5 additions & 2 deletions python/graphstorm/tracker/sagemaker_tracker.py
@@ -25,8 +25,11 @@ class GSSageMakerTaskTracker(GSTaskTrackerAbc):

Parameters
----------
config: GSConfig
Configurations. Users can add their own configures in the yaml config file.
log_report_frequency: int
The frequency of reporting model performance metrics through task_tracker.
The frequency is defined by using number of iterations, i.e., every N iterations
the evaluation metrics will be reported.

"""

def _do_report(self, step):