Commit

Merge branch 'main' into gsprocessing-docker-fix
jalencato authored Sep 26, 2024
2 parents 614adb4 + ccc931b commit 1521703
Showing 18 changed files with 1,342 additions and 66 deletions.
2 changes: 1 addition & 1 deletion .github/workflow_scripts/e2e_gb_check.sh
@@ -8,6 +8,6 @@ GS_HOME=$(pwd)
# Install graphstorm from checked out code
pip3 install "$GS_HOME" --upgrade

-bash ./tests/end2end-tests/setup.sh
+bash ./tests/end2end-tests/create_data.sh
bash ./tests/end2end-tests/graphbolt-gs-integration/graphbolt-graph-construction.sh
bash ./tests/end2end-tests/graphbolt-gs-integration/graphbolt-training-inference.sh
1 change: 1 addition & 0 deletions .github/workflows/continuous-integration.yml
@@ -191,6 +191,7 @@ jobs:
uses: aws-actions/configure-aws-credentials@v1
with:
role-to-assume: arn:aws:iam::698571788627:role/github-oidc-role
role-duration-seconds: 14400
aws-region: us-east-1
- name: Checkout repository
uses: actions/checkout@v3
22 changes: 18 additions & 4 deletions docs/source/advanced/link-prediction.rst
@@ -12,8 +12,8 @@ Optimizing model performance
----------------------------
GraphStorm incorporates three ways of improving model performance of link
prediction. Firstly, GraphStorm avoids information leak in model training.
-Secondly, to better handle heterogeneous graphs, GraphStorm provides three ways
-to compute link prediction scores: dot product, DistMult and RotatE.
+Secondly, to better handle heterogeneous graphs, GraphStorm provides four ways
+to compute link prediction scores: dot product, DistMult, TransE, and RotatE.
Thirdly, GraphStorm provides two options to compute training losses, i.e.,
cross entropy loss and contrastive loss. The following sub-sections provide more details.

@@ -32,7 +32,7 @@ GraphStorm provides support to avoid these problems:

Computing Link Prediction Scores
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-GraphStorm provides three ways to compute link prediction scores: Dot Product, DistMult and RotatE.
+GraphStorm provides four ways to compute link prediction scores: Dot Product, DistMult, TransE, and RotatE.

* **Dot Product**: The Dot Product score function is defined as:

@@ -53,7 +53,21 @@ GraphStorm provides three ways to compute link prediction scores: Dot Product, D
The ``relation_emb`` values are initialized from a uniform distribution
within the range of ``(-gamma/hidden_size, gamma/hidden_size)``,
where ``gamma`` and ``hidden_size`` are hyperparameters defined in
-:ref:`Model Configurations<configurations-model>`。
+:ref:`Model Configurations<configurations-model>`.
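The excerpt above only describes the initialization range; for illustration, the standard DistMult score is the sum of the elementwise product of head, relation, and tail embeddings. A minimal pure-Python sketch (GraphStorm's actual decoder operates on batched PyTorch tensors, so these helper names are illustrative):

```python
import random

def init_relation_emb(hidden_size, gamma):
    # Uniform initialization in (-gamma/hidden_size, gamma/hidden_size),
    # matching the range described for the DistMult relation embeddings.
    bound = gamma / hidden_size
    return [random.uniform(-bound, bound) for _ in range(hidden_size)]

def distmult_score(head_emb, relation_emb, tail_emb):
    # DistMult: sum over the elementwise product h * r * t.
    return sum(h * r * t for h, r, t in zip(head_emb, relation_emb, tail_emb))

relation_emb = init_relation_emb(hidden_size=128, gamma=12.0)
score = distmult_score([1.0, 2.0], [1.0, 1.0], [3.0, 4.0])  # 1*1*3 + 2*1*4 = 11.0
```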

* **TransE**: The TransE score function is defined as:

.. math::

    score = gamma - \|h+r-t\|_1 \quad \text{or} \quad score = gamma - \|h+r-t\|_2

where the L1 norm is used by the ``transe_l1`` decoder and the L2 norm by the
``transe_l2`` decoder; the ``head_emb`` (:math:`h`) is the node embedding of the head node,
the ``tail_emb`` is the node embedding of the tail node,
the ``relation_emb`` is the relation embedding of the specific edge type.
The ``relation_emb`` values are initialized from a uniform distribution
within the range of ``(-gamma/(hidden_size/2), gamma/(hidden_size/2))``,
where ``gamma`` and ``hidden_size`` are hyperparameters defined in
:ref:`Model Configurations<configurations-model>`.
To learn more about TransE, please refer to `the DGLKE doc <https://dglke.dgl.ai/doc/kg.html#transe>`__.
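To illustrate the two norms above, here is a minimal pure-Python sketch of the TransE score (hypothetical helper names; GraphStorm's real decoders operate on batched PyTorch tensors):

```python
import math

def transe_score(head_emb, relation_emb, tail_emb, gamma=12.0, norm="l2"):
    # score = gamma - ||h + r - t||, with the L1 distance for transe_l1
    # and the L2 distance for transe_l2.
    diff = [h + r - t for h, r, t in zip(head_emb, relation_emb, tail_emb)]
    if norm == "l1":
        dist = sum(abs(d) for d in diff)
    else:
        dist = math.sqrt(sum(d * d for d in diff))
    return gamma - dist

# A perfect triple (h + r == t) attains the maximum score, gamma.
print(transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]))             # 12.0
print(transe_score([1.0, 0.0], [0.0, 0.0], [0.0, 0.0], norm="l1"))  # 11.0
```

Higher scores mean the triple is more plausible; `gamma` bounds the score from above, which is why it also appears in the relation-embedding initialization range.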

* **RotatE**: The RotatE score function is defined as:

4 changes: 4 additions & 0 deletions docs/source/api/references/graphstorm.model.rst
@@ -101,3 +101,7 @@ Decoder Layer
LinkPredictContrastiveDistMultDecoder
LinkPredictRotatEDecoder
LinkPredictContrastiveRotatEDecoder
LinkPredictWeightedRotatEDecoder
LinkPredictTransEDecoder
LinkPredictContrastiveTransEDecoder
LinkPredictWeightedTransEDecoder
@@ -482,12 +482,12 @@ Link Prediction Task
- Yaml: ``num_negative_edges_eval: 1000``
- Argument: ``--num-negative-edges-eval 1000``
- Default value: ``1000``
-- **lp_decoder_type**: Set the decoder type for loss function in Link Prediction tasks. Currently GraphStorm supports ``dot_product``, ``distmult`` and ``rotate``.
+- **lp_decoder_type**: Set the decoder type for loss function in Link Prediction tasks. Currently GraphStorm supports ``dot_product``, ``distmult``, ``rotate``, ``transe_l1``, and ``transe_l2``.

- Yaml: ``lp_decoder_type: dot_product``
- Argument: ``--lp-decoder-type dot_product``
- Default value: ``distmult``
-- **gamma**: Set the value of the hyperparameter denoted by the symbol gamma. Gamma is used in the following cases: i/ focal loss for binary classification, ii/ DistMult score function for link prediction, and iii/ RotatE score function for link prediction.
+- **gamma**: Set the value of the hyperparameter denoted by the symbol gamma. Gamma is used in the following cases: i/ focal loss for binary classification, ii/ DistMult score function for link prediction, iii/ TransE score function for link prediction, and iv/ RotatE score function for link prediction.

- Yaml: ``gamma: 10.0``
- Argument: ``--gamma 10.0``
@@ -586,4 +586,4 @@ GraphStorm provides a set of parameters to control GNN distillation.

- Yaml: ``max_seq_len: 1024``
- Argument: ``--max-seq-len 1024``
- Default value: ``1024``
4 changes: 3 additions & 1 deletion python/graphstorm/config/__init__.py
@@ -31,7 +31,7 @@

from .config import (BUILTIN_LP_DOT_DECODER,
BUILTIN_LP_DISTMULT_DECODER,
-BUILTIN_LP_ROTATE_DECODER)
+BUILTIN_LP_ROTATE_DECODER,
+BUILTIN_LP_TRANSE_L1_DECODER,
+BUILTIN_LP_TRANSE_L2_DECODER)
from .config import SUPPORTED_LP_DECODER

from .config import (GRAPHSTORM_MODEL_EMBED_LAYER,
Expand Down
72 changes: 36 additions & 36 deletions python/graphstorm/config/argument.py
@@ -83,7 +83,7 @@ def get_argument_parser():
arguments in GraphStorm launch CLIs. Specifically, it parses the yaml config file first,
and then parses arguments to overwrite parameters defined in the yaml file or add new
parameters.
This ``get_argument_parser()`` is also useful when users want to convert customized models
to use GraphStorm CLIs.
@@ -166,7 +166,7 @@ def get_argument_parser():
# pylint: disable=no-member
class GSConfig:
"""GSgnn configuration class.
GSConfig contains all GraphStorm model training and inference configurations, which can
either be loaded from a yaml file specified in the ``--cf`` argument, or from CLI arguments.
"""
@@ -1223,9 +1223,9 @@ def edge_feat_name(self):
@property
def node_feat_name(self):
""" User defined node feature name. Default is None.
It can be in the following format:
- ``feat_name``: global feature name, if a node has node feature, the corresponding
feature name is <feat_name>.
- ``"ntype0:feat0","ntype1:feat0,feat1",...``: different node types have different
@@ -1291,16 +1291,16 @@ def _check_fanout(self, fanout, fot_name):
def fanout(self):
""" The fanouts of GNN layers. The values of fanouts must be integers larger
than 0. The number of fanouts must equal ``num_layers``. Must provide.
It accepts two formats:
- ``20,10``, which defines the number of neighbors
to sample per edge type for each GNN layer with the i_th element being the
fanout for the i_th GNN layer.
- "etype2:20@etype3:20@etype1:10,etype2:10@etype3:4@etype1:2", which defines
the numbers of neighbors to sample for different edge types for each GNN layer
with the i_th element being the fanout for the i_th GNN layer.
"""
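As a hedged sketch (not GraphStorm's actual parsing code), the two fanout formats in the docstring above could be parsed like this, assuming edge-type names contain no commas:

```python
def parse_fanout(fanout_str):
    # "20,10" -> [20, 10]
    # "etype2:20@etype3:20,etype2:10@etype3:4"
    #   -> [{"etype2": 20, "etype3": 20}, {"etype2": 10, "etype3": 4}]
    layers = []
    for layer in fanout_str.split(","):
        if ":" in layer:
            # Per-edge-type fanout: "@"-separated "etype:count" pairs.
            per_etype = {}
            for part in layer.split("@"):
                etype, num = part.split(":")
                per_etype[etype] = int(num)
            layers.append(per_etype)
        else:
            # Uniform fanout: one integer per GNN layer.
            layers.append(int(layer))
    return layers
```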
# pylint: disable=no-member
if self.model_encoder_type in BUILTIN_GNN_ENCODER:
@@ -1440,7 +1440,7 @@ def use_mini_batch_infer(self):

@property
def gnn_norm(self):
""" Normalization method for GNN layers. Options include ``batch`` or ``layer``.
Default is None.
"""
# pylint: disable=no-member
@@ -1616,7 +1616,7 @@ def dropout(self):
@property
# pylint: disable=invalid-name
def lr(self):
""" Learning rate for dense parameters of input encoders, model encoders,
and decoders. Must provide.
"""
assert hasattr(self, "_lr"), "Learning rate must be specified"
@@ -1825,7 +1825,7 @@ def early_stop_rounds(self):

@property
def early_stop_strategy(self):
""" The strategy used to decide whether to stop training early. GraphStorm supports two
strategies: 1) ``consecutive_increase``, and 2) ``average_increase``.
Default is ``average_increase``.
"""
@@ -1843,7 +1843,7 @@ def early_stop_strategy(self):

@property
def use_early_stop(self):
""" Whether to use early stopping during training. Default is False.
"""
# pylint: disable=no-member
if hasattr(self, "_use_early_stop"):
@@ -1954,7 +1954,7 @@ def check_multilabel(multilabel):
def multilabel_weights(self):
"""Used to specify label weight of each class in a multi-label classification task.
It is fed into ``th.nn.BCEWithLogitsLoss`` as ``pos_weight``.
The weights should be in the following format 0.1,0.2,0.3,0.1,0.0, ...
Default is None.
"""
@@ -2182,7 +2182,7 @@ def remove_target_edge_type(self):
Default is True.
If set to True, GraphStorm will set the fanout of the training target edge
type as zero. This is only used with edge classification.
If the edge classification is to predict the existence of an edge between
two nodes, GraphStorm should remove the target edge in the message passing to
avoid information leak. If it's to predict some attributes associated with
@@ -2205,7 +2205,7 @@

@property
def decoder_type(self):
""" The type of edge classification or regression decoders. Built-in decoders include
``DenseBiDecoder`` and ``MLPDecoder``. Default is ``DenseBiDecoder``.
"""
# pylint: disable=no-member
@@ -2265,7 +2265,7 @@ def decoder_edge_feat(self):
### Link Prediction specific ###
@property
def train_negative_sampler(self):
""" The negative sampler used for link prediction training.
Built-in samplers include ``uniform``, ``joint``, ``localuniform``,
``all_etype_uniform`` and ``all_etype_joint``. Default is ``uniform``.
"""
@@ -2276,7 +2276,7 @@ def train_negative_sampler(self):

@property
def eval_negative_sampler(self):
""" The negative sampler used for link prediction evaluation.
Built-in samplers include ``uniform``, ``joint``, ``localuniform``,
``all_etype_uniform`` and ``all_etype_joint``. Default is ``joint``.
"""
@@ -2316,8 +2316,8 @@ def num_negative_edges_eval(self):
@property
def lp_decoder_type(self):
""" The decoder type for loss function in link prediction tasks.
-Currently GraphStorm supports ``dot_product``, ``distmult`` and ``rotate``.
-Default is ``distmult``.
+Currently GraphStorm supports ``dot_product``, ``distmult``,
+``transe`` (``transe_l1`` and ``transe_l2``), and ``rotate``. Default is ``distmult``.
"""
# pylint: disable=no-member
if hasattr(self, "_lp_decoder_type"):
@@ -2379,10 +2379,10 @@ def lp_edge_weight_for_loss(self):
positive edge loss for link prediction tasks. Default is None.
The edge_weight can be in following format:
- ``weight_name``: global weight name, if an edge has weight,
the corresponding weight name is ``weight_name``.
- ``"src0,rel0,dst0:weight0","src0,rel0,dst0:weight1",...``:
different edge types have different edge weights.
"""
@@ -2450,21 +2450,21 @@ def _get_predefined_negatives_per_etype(self, negatives):

@property
def train_etypes_negative_dstnode(self):
""" The list of canonical edge types that have hard negative edges
constructed by corrupting destination nodes during training.
For each edge type to use different fields to store the hard negatives,
the format of the argument is:
.. code:: json
train_etypes_negative_dstnode:
- src_type,rel_type0,dst_type:negative_nid_field
- src_type,rel_type1,dst_type:negative_nid_field
or, for all edge types to use the same field to store the hard negatives,
the format of the argument is:
.. code:: json
train_etypes_negative_dstnode:
@@ -2482,7 +2482,7 @@ def train_etypes_negative_dstnode(self):

@property
def num_train_hard_negatives(self):
""" Number of hard negatives to sample for each edge type during training.
Default is None.
For each edge type to have a number of hard negatives,
@@ -2496,7 +2496,7 @@ def num_train_hard_negatives(self):
or, for all edge types to have the same number of hard negatives,
the format of the argument is:
.. code:: json
num_train_hard_negatives:
@@ -2533,21 +2533,21 @@

@property
def eval_etypes_negative_dstnode(self):
""" The list of canonical edge types that have hard negative edges
constructed by corrupting destination nodes during evaluation.
For each edge type to use different fields to store the hard negatives,
the format of the argument is:
.. code:: json
eval_etypes_negative_dstnode:
- src_type,rel_type0,dst_type:negative_nid_field
- src_type,rel_type1,dst_type:negative_nid_field
or, for all edge types to use the same field to store the hard negatives,
the format of the argument is:
.. code:: json
eval_etypes_negative_dstnode:
@@ -2565,7 +2565,7 @@ def eval_etypes_negative_dstnode(self):

@property
def train_etype(self):
""" The list of canonical edge types that will be added as training target.
If not provided, all edge types will be used as training target. A canonical
edge type should be formatted as ``src_node_type,relation_type,dst_node_type``.
"""
@@ -2582,7 +2582,7 @@ def train_etype(self):

@property
def eval_etype(self):
""" The list of canonical edge types that will be added as evaluation target.
If not provided, all edge types will be used as evaluation target. A canonical
edge type should be formatted as ``src_node_type,relation_type,dst_node_type``.
"""
@@ -2638,7 +2638,7 @@ def alpha(self):

@property
def class_loss_func(self):
""" Classification loss function. Built-in loss functions include
``cross_entropy`` and ``focal``. Default is ``cross_entropy``.
"""
# pylint: disable=no-member
@@ -2652,7 +2652,7 @@ def class_loss_func(self):

@property
def lp_loss_func(self):
""" Link prediction loss function. Built-in loss functions include
``cross_entropy`` and ``contrastive``. Default is ``cross_entropy``.
"""
# pylint: disable=no-member
6 changes: 5 additions & 1 deletion python/graphstorm/config/config.py
@@ -81,10 +81,14 @@
BUILTIN_LP_DOT_DECODER = "dot_product"
BUILTIN_LP_DISTMULT_DECODER = "distmult"
BUILTIN_LP_ROTATE_DECODER = "rotate"
BUILTIN_LP_TRANSE_L1_DECODER = "transe_l1"
BUILTIN_LP_TRANSE_L2_DECODER = "transe_l2"

SUPPORTED_LP_DECODER = [BUILTIN_LP_DOT_DECODER,
BUILTIN_LP_DISTMULT_DECODER,
-BUILTIN_LP_ROTATE_DECODER]
+BUILTIN_LP_ROTATE_DECODER,
+BUILTIN_LP_TRANSE_L1_DECODER,
+BUILTIN_LP_TRANSE_L2_DECODER]
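Given the constants above, a downstream check of a user-supplied ``lp_decoder_type`` might look like the sketch below (illustrative; the validation GraphStorm actually performs may differ):

```python
BUILTIN_LP_DOT_DECODER = "dot_product"
BUILTIN_LP_DISTMULT_DECODER = "distmult"
BUILTIN_LP_ROTATE_DECODER = "rotate"
BUILTIN_LP_TRANSE_L1_DECODER = "transe_l1"
BUILTIN_LP_TRANSE_L2_DECODER = "transe_l2"

SUPPORTED_LP_DECODER = [BUILTIN_LP_DOT_DECODER,
                        BUILTIN_LP_DISTMULT_DECODER,
                        BUILTIN_LP_ROTATE_DECODER,
                        BUILTIN_LP_TRANSE_L1_DECODER,
                        BUILTIN_LP_TRANSE_L2_DECODER]

def check_lp_decoder_type(lp_decoder_type):
    # Reject unsupported decoder names early, before training starts.
    if lp_decoder_type not in SUPPORTED_LP_DECODER:
        raise ValueError(
            f"Unsupported lp_decoder_type {lp_decoder_type!r}; "
            f"expected one of {SUPPORTED_LP_DECODER}")
    return lp_decoder_type
```

Note that the TransE variants are selected explicitly as ``transe_l1`` or ``transe_l2``; a bare ``transe`` is not one of the constants added by this commit.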

################ Task info data classes ############################
def get_mttask_id(task_type, ntype=None, etype=None, label=None):
