Introducing Ray Serve
Ray Serve is a scalable, framework-agnostic model-serving library that creates inference APIs on your behalf. Some of the key concepts in Ray Serve are as follows:
- Deployment
- ServeHandle
- Ingress deployment
We will look at each of these in the following sections.
Deployment
A deployment contains the business logic and the ML model that will be served. To define a deployment, apply the @serve.deployment decorator to a Python class or function. For example, the following code snippet shows a very basic deployment that returns the message supplied to its constructor:
from ray import serve

@serve.deployment
class MyFirstDeployment:
    # Take the message to return as an argument to the constructor.
    def __init__(self, msg):
        self.msg = msg

    # Return the stored message each time the deployment is called.
    def __call__(self):
        return self.msg

my_first_deployment = MyFirstDeployment.bind("Hello...
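To make the data flow in this deployment concrete, here is a minimal sketch of the same pattern in plain Python, with the Ray Serve decorator and the bind call left out so that it runs without a Ray cluster. The class name PlainDeployment and the message string are illustrative assumptions and are not part of Ray Serve.

```python
# Plain-Python stand-in for the deployment class (hypothetical name,
# no Ray Serve involved). It shows only how the message flows from the
# constructor to the call.
class PlainDeployment:
    def __init__(self, msg):
        # The message is fixed when the object is constructed.
        self.msg = msg

    def __call__(self):
        # Each call returns the stored message unchanged.
        return self.msg


# Illustrative message; any string behaves the same way.
replica = PlainDeployment("Hello from a plain class")
result = replica()
print(result)
```

In the Ray Serve version, the arguments given to bind are forwarded to the constructor in the same way, so the message is set when the deployment is configured and not by each caller.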