
Simple

In this Get Started guide, you will perform a simple test against a remote Ray endpoint. This guide assumes the following:

  • You have already created a "Ray as Service" tenant using Rafay
  • You have the HTTPS URL and access credentials for the remote endpoint
  • You have Python 3 installed on your laptop

Review Code

Download the source code file "simple.py" and open it in your favorite IDE such as VS Code to review it. If you are using a Jupyter Notebook to execute your code, copy this file into your notebook's file tree.

This code is a simple example that demonstrates how to use Ray to execute tasks in parallel on a Ray cluster. It defines a remote function and submits multiple instances of this function to be run in parallel, then collects and prints the results.


Performance Benefit

Time Savings

Because each task sleeps for 1 second, running the five tasks sequentially would take approximately 5 seconds (1 second per task). When the cluster has enough free CPUs to run all five at once, they execute in parallel and the total execution time is close to 1 second, plus a small overhead for scheduling the tasks.
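
The arithmetic behind this estimate can be checked on a laptop without a cluster. The sketch below is not Ray code: it swaps in Python's standard-library ThreadPoolExecutor as a local stand-in for Ray workers, and the names DELAY, run_sequential, and run_parallel are illustrative assumptions. The sleep is shortened to 0.2 seconds so the comparison finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.2  # Shortened stand-in for the 1-second sleep in simple.py


def simple_task(x):
    time.sleep(DELAY)  # Simulate some work
    return x * x


def run_sequential(n):
    """Run n tasks one after another and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [simple_task(i) for i in range(n)]
    return results, time.perf_counter() - start


def run_parallel(n):
    """Run n tasks at once on a thread pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    # One worker per task, so no task has to wait for a free slot
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(simple_task, range(n)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    seq_results, seq_time = run_sequential(5)
    par_results, par_time = run_parallel(5)
    print(f"sequential: {seq_results} in {seq_time:.2f}s")
    print(f"parallel:   {par_results} in {par_time:.2f}s")
```

The sequential run should take roughly five times the delay and the parallel run roughly one delay; exact timings vary by machine. A Ray cluster adds scheduling and network overhead that this local sketch does not model.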

Scalability

If connected to a larger Ray cluster with many nodes, the same approach could scale to handle thousands of tasks, leveraging the computational power of the entire cluster.


Step-by-Step Explanation

Here’s a detailed explanation of what each part of the code is doing:

Import Libraries

import ray
import time

ray: The library used for distributed computing. It allows you to run functions as remote tasks across multiple cores or nodes in a cluster.

time: Used to simulate some delay in the remote function by adding a sleep time.


Initialize Ray

ray.init()

This initializes Ray and connects the script to a Ray cluster. When the script runs on your laptop with no cluster configured, it starts a local Ray instance; when the script runs as a job on a remote Ray cluster, it connects to that cluster. After this step, the script can submit tasks to the cluster.


Define a Remote Function

@ray.remote
def simple_task(x):
    time.sleep(1)  # Simulate some work
    return x * x

@ray.remote: Decorates the function simple_task, making it a Ray remote function. This means it can be executed as a distributed task on any available worker in the Ray cluster.

simple_task: A function that takes an input x, sleeps for 1 second to simulate some processing time, and then returns the square of x.

This function can be run concurrently on multiple Ray workers, allowing parallel execution of many instances of the function.


Main Function

if __name__ == "__main__":
    print("Starting test on Ray as Service Endpoint...")

    # Create multiple parallel tasks
    futures = [simple_task.remote(i) for i in range(5)]

    # Retrieve the results
    results = ray.get(futures)
    print(f"Results from Ray cluster: {results}")

__name__ == "__main__": This block ensures that the code only runs when the script is executed directly, not if it is imported as a module.

Print a Message: Displays a message indicating that the test is starting.

Create Multiple Parallel Tasks:

futures = [simple_task.remote(i) for i in range(5)]

Uses a list comprehension to create 5 remote tasks by calling simple_task.remote(i) for i ranging from 0 to 4. Each call to simple_task.remote(i) submits a task to the Ray cluster to calculate i * i after a 1-second delay.

Ray schedules these tasks on available workers so they can execute in parallel, which reduces overall execution time compared to running them sequentially.

futures: A list of object references (futures) that represent the results of the remote tasks. The tasks are executed in the background, and the futures can be used to retrieve the results once they are completed.

Retrieve the Results:

results = ray.get(futures)

ray.get(futures) blocks until all the remote tasks complete and returns their results in the same order as the futures. In this case, the results are the squares 0, 1, 4, 9, and 16.

Print the Results:

print(f"Results from Ray cluster: {results}")

Prints the list of results obtained from the Ray cluster, showing the squares of the numbers from 0 to 4.


Job Submission Code

Download the source code file "submit_ray_job.py" and open it in your favorite IDE such as VS Code to review it. Alternatively, you can copy the code from this file into your Jupyter Notebook and run the code from your notebook. As you can see from the code snippet below, we will be using Ray's Job Submission Client to submit a job to the remote Ray endpoint.

import ray
from ray.job_submission import JobSubmissionClient
import urllib3
import time

# Suppress the warning about unverified HTTPS requests since 
# we are using self signed certificates for testing 
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Ray client
client = JobSubmissionClient(
    "https URL for endpoint", 
    headers={"Authorization": "Basic <Base 64 encoded admin:password>"}, 
    verify=False  # Disable SSL verification
)

# Submit the job to the remote Ray cluster
job_id = client.submit_job(
    entrypoint="python simple.py",  # The script to be executed remotely
    runtime_env={
        "working_dir": "./",  # The working directory containing the script
        "pip": []  # No additional dependencies for this simple test
    }
)

print(f"Submitted job with ID: {job_id}")

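# Poll the job status until it reaches a terminal state.
# Checking only for "SUCCEEDED" would loop forever if the job failed or was stopped.
terminal_states = ("SUCCEEDED", "FAILED", "STOPPED")
status = client.get_job_status(job_id)
while status not in terminal_states:
    print(f"Job status: {status}")
    time.sleep(5)  # Wait 5 seconds before checking again
    status = client.get_job_status(job_id)

print(f"Job finished with status: {status}")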

# Retrieve the logs or output of the job
logs = client.get_job_logs(job_id)
print(f"Job logs: {logs}")
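
The polling loop above places no upper bound on how long it waits. One possible refinement, sketched here under stated assumptions, wraps the polling in a helper that stops at any terminal status or after a timeout. The helper name wait_for_job, its parameters, and the FakeClient class are hypothetical and not part of Ray; the helper only assumes that the client exposes get_job_status(job_id), as JobSubmissionClient does, and that terminal statuses compare equal to the strings SUCCEEDED, FAILED, and STOPPED.

```python
import time

# Statuses after which the job will not change state again (assumed set)
TERMINAL_STATES = ("SUCCEEDED", "FAILED", "STOPPED")


def wait_for_job(client, job_id, timeout_s=300.0, poll_s=5.0):
    """Poll until the job reaches a terminal status or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.get_job_status(job_id)
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job {job_id} still {status} after {timeout_s} seconds"
            )
        time.sleep(poll_s)


class FakeClient:
    """Hypothetical stand-in for JobSubmissionClient, used only for this demo."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_job_status(self, job_id):
        # Return the next scripted status on each call
        return next(self._statuses)


if __name__ == "__main__":
    demo = FakeClient(["PENDING", "RUNNING", "SUCCEEDED"])
    print(wait_for_job(demo, "demo-job", timeout_s=5, poll_s=0.01))
```

In submit_ray_job.py, the helper would be called with the real client and job_id in place of the demo objects. Returning the final status instead of raising on FAILED leaves the decision about how to react to the caller.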

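Now, in submit_ray_job.py, replace the placeholder in the Authorization header with the base64 encoded credentials for your Ray endpoint, and replace the placeholder URL with your endpoint's https URL. You can use the following command to perform the encoding.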

echo -n 'admin:PASSWORD' | base64
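
If the base64 utility is not available, for example inside a Jupyter Notebook, the same value can be produced in Python with the standard library. This is a minimal sketch; the function name basic_auth_header is an assumption for illustration, and admin and PASSWORD are the placeholders from the command above.

```python
import base64


def basic_auth_header(username, password):
    """Build the value for an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Replace the placeholders with the credentials for your endpoint
    print(basic_auth_header("admin", "PASSWORD"))
```

The returned string can be used as the value of the "Authorization" key in the headers dictionary passed to JobSubmissionClient.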

Submit Job

To submit the job to your remote Ray endpoint:

  • First, in your web browser, access the Ray Dashboard's URL and keep it open. We will monitor the status and progress of the submitted job here.
  • Now, open Terminal and enter the following command:
python3 ./submit_ray_job.py 

This will submit the job to the configured Ray endpoint and you can review progress and the results on the Ray Dashboard.

Once the Ray endpoint receives the job, it remains in the pending state for a few seconds before it starts running. When the job completes, its logs should look similar to what is shown below.

Starting test on Ray as Service Endpoint...
Results from Ray cluster: [0, 1, 4, 9, 16]

Starting Message: Indicates that the process of testing Ray has begun.

Results: Shows the squared values of 0, 1, 2, 3, and 4, computed in parallel by the Ray cluster.