Commit

Update README
parthosa committed Oct 30, 2024
1 parent 9bd2462 commit 870467d
Showing 1 changed file with 13 additions and 4 deletions.
17 changes: 13 additions & 4 deletions tools/emr/README.md
@@ -5,10 +5,19 @@ GPU acceleration as well as analyzing GPU Spark jobs. For customers on EMR, the
notebooks offer a simple interface for running the tools given a set of Spark event logs from
CPU (qualification) or GPU (profiling) application runs.

-To use a demo notebook, you can import the notebook in the EMR Workspace.
+## Usage

-Once the demo notebook is imported, you can enter in the log path location in the cell below the `User Input` in the
-notebook. After that, click on the `fast-forward` icon which says *Restart the kernel, then re-run the whole notebook* to execute the tools for the specific logs in the log path.
+### Pre-requisites: Setup EMR Studio and Workspace
+1. Ensure that you have an **EMR cluster** running.
+2. Set up **EMR Studio** and **Workspace** by following the instructions in the [AWS Documentation](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-studio-create-studio.html):
+- Select **Custom Settings** while creating the Studio.
+- Choose the **VPC** and **Subnet** where the EMR cluster is running.
+3. Attach the Workspace to the running EMR cluster. For more details, refer to the [AWS Documentation](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-studio-create-use-clusters.html).
+
+### Running the Notebook
+1. Import the notebook into the EMR Workspace by dragging and dropping the notebook file.
+2. In the **User Input** section of the notebook, enter the path to event log files.
+3. Click the **fast-forward** icon labeled *Restart the kernel, then re-run the whole notebook* to process the logs at the specified path.

## Limitations
1. Currently, local and S3 event log paths are supported.
@@ -18,4 +27,4 @@ notebook. After that, click on the `fast-forward` icon which says *Restart the
1. To specify multiple event logs, separate the paths with commas.
- For example: `s3://my-bucket/path/to/eventlog1,s3://my-bucket/path/to/eventlog2`

-**Latest Tools Version Supported** 24.08.2
+**Latest Tools Version Supported** 24.08.2
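
For readers who want to sanity-check a log path value before pasting it into the notebook's **User Input** cell, the comma-separated format and the local/S3 limitation from the README can be sketched in Python as follows. This is only an illustration: the helper names `split_event_log_paths` and `classify_path` are hypothetical, and the scheme check is an assumption of this sketch, not the validation the tools or the notebook actually perform.

```python
from urllib.parse import urlparse


def split_event_log_paths(raw: str) -> list[str]:
    """Split a comma-separated string of event log paths, dropping blanks."""
    return [p.strip() for p in raw.split(",") if p.strip()]


def classify_path(path: str) -> str:
    """Return "s3" or "local"; raise ValueError for any other URI scheme.

    Assumption of this sketch: only the literal "s3" scheme counts as S3,
    and a path with no scheme (or "file") counts as local.
    """
    scheme = urlparse(path).scheme
    if scheme == "s3":
        return "s3"
    if scheme in ("", "file"):
        return "local"
    raise ValueError(f"Unsupported event log path: {path}")


# Example value taken from the README's multiple-path note.
paths = split_event_log_paths(
    "s3://my-bucket/path/to/eventlog1,s3://my-bucket/path/to/eventlog2"
)
kinds = [classify_path(p) for p in paths]
```

Raising on an unrecognized scheme is a design choice made here so that a mistyped path surfaces before the notebook is re-run; the real tools may accept or reject paths differently.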
