Apply suggestions from code review
Co-authored-by: xiang song(charlie.song) <[email protected]>
thvasilo and classicsong authored Jul 1, 2024
1 parent db38b9a commit 1d49c25
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions docs/source/gs-processing/usage/emr.rst
@@ -77,7 +77,7 @@ inline policy:
]
}
-Launch an EMR cluster with GSProcessing step
+Launch an AWS EMR cluster with GSProcessing step
--------------------------------------------

Once our roles are set up, that is, we have an EMR EC2 instance role,
@@ -99,15 +99,15 @@ and how to
`run Spark applications with Docker on Amazon EMR <https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-docker.html>`_.

To launch a GSProcessing job with EMR on EC2 we will use the ``graphstorm-processing/scripts/submit_gsp_emr_step.py`` Python
-script that uses ``boto3`` to launch a cluster and corresponding GSProcessing job as a step.
+script that uses ``boto3`` to launch a cluster and the corresponding GSProcessing job as a step.
The script has four required arguments:

* ``--entry-point-s3``: We need to upload the GSProcessing entry point,
``graphstorm-processing/graphstorm_processing/distributed_executor.py``, to a location
on S3 from which our leader instance will be able to read it.
* ``--gsp-arguments``: Here we pass all the arguments to the entry point as one space-separated
string. To ensure they are parsed as one string, enclose these in double quotes, e.g.
-``"--input-config gsp-config.json --input-prefix s3://my-bucket/raw-data [...]"``.
+``--gsp-arguments "--input-config gsp-config.json --input-prefix s3://my-bucket/raw-data [...]"``.
* ``--instance-type``: The instance type to use for our cluster. Our script currently supports only
a single, uniform instance type.
* ``--worker-count``: Number of worker instances to launch for the cluster.
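
As a rough illustration of how the four required arguments fit together, the sketch below assembles an invocation of the script and checks that the ``--gsp-arguments`` value survives shell parsing as a single string. The S3 entry point location, instance type and worker count are placeholder assumptions, not values taken from the documentation, and any further entry point arguments are left out. The sketch only builds and prints the command; it does not launch a cluster.

```python
import shlex

# Placeholder values for illustration only; substitute your own.
entry_point_s3 = "s3://my-bucket/gsp/distributed_executor.py"  # assumed upload location
gsp_arguments = "--input-config gsp-config.json --input-prefix s3://my-bucket/raw-data"

# The four required arguments described above, as an argv-style list.
command = [
    "python",
    "graphstorm-processing/scripts/submit_gsp_emr_step.py",
    "--entry-point-s3", entry_point_s3,
    "--gsp-arguments", gsp_arguments,   # one list item, so one argument
    "--instance-type", "m6i.4xlarge",   # assumed instance type
    "--worker-count", "4",              # assumed worker count
]

# shlex.join quotes any item that contains spaces, so the shell passes
# the whole --gsp-arguments value through as a single string.
shell_line = shlex.join(command)
print(shell_line)

# Parsing the line the way a POSIX shell would recovers the original list.
assert shlex.split(shell_line) == command
```

``shlex.join`` emits single quotes rather than the double quotes suggested above; in a POSIX shell either form keeps the value together as one argument.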