Ray Project Setup

This project uses Ray to manage AWS EC2 instances for distributed computing. Building on our previous work with Ray, it uses DeepSpeed's ZeRO Stage 3 optimizer to fine-tune a model on a Ray cluster. Tested with Python 3.11.

Getting Started

Clone the repository:

git clone ray-distributed-compute.git
cd ray-distributed-compute

Install Ray:
```
pip install ray
```
Install DeepSpeed:
```
pip install deepspeed
```
Configure AWS CLI:
```
aws configure
```
Find your AWS account number:
```
aws sts get-caller-identity --query Account --output text
```
Note down your AWS account number, as you will need it for the next steps.

Create the IAM role that will later receive full S3 access. Run this from the repository root, where trust-policy.json lives:

aws iam create-role --role-name ray-s3-fullaccess --assume-role-policy-document file://trust-policy.json
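
The contents of trust-policy.json are not shown in this README. For a role that EC2 instances assume, the trust policy typically names the EC2 service as principal. The sketch below is an assumption about what such a file looks like, not a copy of the file in this repository; the function name `build_ec2_trust_policy` is hypothetical.

```python
import json


def build_ec2_trust_policy() -> dict:
    """Return a trust policy that lets EC2 instances assume the role.

    Assumption: the role is only assumed by EC2 instances (the Ray nodes).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be compared with the repository's file.
    print(json.dumps(build_ec2_trust_policy(), indent=4))
```

Comparing the printed document with the repository's trust-policy.json is a quick way to confirm the role can be assumed by the cluster nodes.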

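Create an S3 bucket. Bucket names are globally unique, so choose a different name if ray-bucket-model-output is already taken: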

aws s3api create-bucket --bucket ray-bucket-model-output --region eu-central-1 --create-bucket-configuration LocationConstraint=eu-central-1

Attach the S3 full access policy to the role:

aws iam attach-role-policy --role-name ray-s3-fullaccess --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess

Create an instance profile:

aws iam create-instance-profile --instance-profile-name ray-s3-instance-profile

Add the role to the instance profile:

aws iam add-role-to-instance-profile --instance-profile-name ray-s3-instance-profile --role-name ray-s3-fullaccess

Create a policy that allows iam:PassRole for the ray-autoscaler-v1 role. Replace <your-account-id> with your AWS account number:

aws iam create-policy \
    --policy-name PassRolePolicy \
    --policy-document '{
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": "arn:aws:iam::<your-account-id>:role/ray-s3-fullaccess"
            }
        ]
    }'
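
Editing the JSON by hand makes it easy to leave the placeholder in place. As a minimal sketch, the same policy document can be generated from the account number; the helper name `build_pass_role_policy` is hypothetical, and the 12-digit check reflects the standard AWS account ID format.

```python
import json
import re


def build_pass_role_policy(account_id: str, role_name: str = "ray-s3-fullaccess") -> str:
    """Return the PassRole policy document as a JSON string."""
    # AWS account IDs are 12-digit numbers; reject an unreplaced placeholder.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError(f"not a 12-digit AWS account ID: {account_id!r}")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": f"arn:aws:iam::{account_id}:role/{role_name}",
            }
        ],
    }
    return json.dumps(policy, indent=4)


if __name__ == "__main__":
    # 123456789012 is a dummy account number used only for illustration.
    print(build_pass_role_policy("123456789012"))
```

The output can be saved to a file and passed to the CLI with `--policy-document file://<path>` instead of the inline JSON.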

Attach the PassRolePolicy to the ray-autoscaler-v1 role:

aws iam attach-role-policy \
    --role-name ray-autoscaler-v1 \
    --policy-arn arn:aws:iam::<your-account-id>:policy/PassRolePolicy

Retrieve the ARN for ray-s3-instance-profile:

aws iam list-instance-profiles-for-role --role-name ray-s3-fullaccess --query 'InstanceProfiles[0].Arn' --output text

Note down the retrieved ARN for use in the next steps.

Update the YAML configuration:

Open your raycluster.yaml file and replace the placeholder with the ARN you retrieved:

ray.worker.default:
  resources:
    CPU: 1
    resources: 15
  node_config:
    ImageId: ami-07652eda1fbad7432
    InstanceType: p3.2xlarge
    IamInstanceProfile:
      Arn: arn:aws:iam::<your-account-id>:instance-profile/ray-s3-instance-profile
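
If you prefer not to edit the file by hand, the placeholder can be substituted with a few lines of Python. This is a sketch under the assumption that raycluster.yaml contains the literal text `<your-account-id>`; the helper name `fill_account_id` is hypothetical.

```python
import re

PLACEHOLDER = "<your-account-id>"


def fill_account_id(text: str, account_id: str) -> str:
    """Replace every account-ID placeholder in the given config text."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError(f"not a 12-digit AWS account ID: {account_id!r}")
    if PLACEHOLDER not in text:
        raise ValueError("placeholder not found; the file may already be filled in")
    return text.replace(PLACEHOLDER, account_id)


if __name__ == "__main__":
    # Dummy account number, for illustration only.
    sample = "Arn: arn:aws:iam::<your-account-id>:instance-profile/ray-s3-instance-profile"
    print(fill_account_id(sample, "123456789012"))
```

To apply it to the real file, read raycluster.yaml with `pathlib.Path.read_text()`, pass the text through `fill_account_id`, and write the result back.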

Start the Ray cluster:
```
ray up raycluster.yaml
```
Access the Ray dashboard:
```
ray dashboard raycluster.yaml
```

Submit a Ray job:

Open a new terminal window and navigate to your project directory:

cd <project-directory>

Submit the Ray job:

ray job submit --address http://localhost:8265 --working-dir . -- python3 main.py
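
The same submission can be made from Python with Ray's Job Submission SDK. The sketch below assumes the dashboard is still forwarded to localhost:8265 and that main.py is in the current directory; the helper names `build_job_request` and `submit` are hypothetical.

```python
def build_job_request(entrypoint: str = "python3 main.py", working_dir: str = ".") -> dict:
    """Build the keyword arguments passed to JobSubmissionClient.submit_job."""
    return {
        "entrypoint": entrypoint,
        # working_dir is uploaded to the cluster, like --working-dir on the CLI.
        "runtime_env": {"working_dir": working_dir},
    }


def submit(address: str = "http://localhost:8265") -> str:
    """Submit the job and return its ID. Requires a running cluster."""
    # Imported lazily so the helper above works without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(address)
    job_id = client.submit_job(**build_job_request())
    print("status:", client.get_job_status(job_id))
    return job_id
```

Calling `submit()` returns a job ID that can be passed to `client.get_job_status` again later to poll for completion.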

Check the S3 bucket:

When the job finishes, open the S3 bucket you created (ray-bucket-model-output); the trained model should be there.
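
The bucket can also be inspected from Python. This sketch assumes boto3 is installed and that your AWS credentials are configured as above; boto3 is not listed among this project's requirements, and the helper names `list_objects` and `format_listing` are hypothetical.

```python
def list_objects(bucket: str, prefix: str = "") -> list:
    """Return (key, size_in_bytes) pairs for every object under the prefix."""
    # Imported lazily so format_listing works without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    entries = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            entries.append((obj["Key"], obj["Size"]))
    return entries


def format_listing(entries: list) -> list:
    """Render (key, size) pairs as human-readable lines."""
    return [f"{size / 1e6:.1f} MB  {key}" for key, size in entries]


if __name__ == "__main__":
    for line in format_listing(list_objects("ray-bucket-model-output")):
        print(line)
```

An empty listing after the job reports success usually points at the output path or the instance profile permissions.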

Overview

Ray is a distributed computing framework that scales applications across multiple machines. In this setup, Ray manages a head node and large worker instances with powerful CPUs and GPUs to run computational tasks efficiently. DeepSpeed's ZeRO Stage 3 optimizer partitions optimizer states, gradients, and model parameters across the workers, which lowers per-GPU memory use during fine-tuning.
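
For orientation, a ZeRO Stage 3 setup is selected through the DeepSpeed configuration. The dictionary below is a minimal sketch, not the configuration used in main.py; the batch size and accumulation values are illustrative assumptions, and fp16 is chosen because the V100 GPUs in p3.2xlarge instances do not support bf16.

```python
import json

# Illustrative DeepSpeed configuration enabling ZeRO Stage 3.
# All numeric values are placeholders, not tuned settings from this project.
DEEPSPEED_CONFIG = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 4,
    "fp16": {"enabled": True},
    "zero_optimization": {
        # Stage 3 partitions optimizer states, gradients, and parameters.
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        # Gather full 16-bit weights when saving so the checkpoint is usable
        # without the partitioned shards.
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}

if __name__ == "__main__":
    print(json.dumps(DEEPSPEED_CONFIG, indent=2))
```

A dictionary like this can be written to a JSON file or handed directly to the training code, depending on how the training script loads its DeepSpeed settings.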

For detailed documentation on Ray, visit the Ray documentation. For more on DeepSpeed, check out the DeepSpeed documentation.

Made with ❤️ by datamax.ai.
