Endpoint
An endpoint is a URL through which end users and applications access one or more LLMs via the OpenAI-compatible API. An endpoint can be either multitenant or dedicated to a single tenant.
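For example, a client application might reach an endpoint through the official OpenAI Python SDK by pointing it at the endpoint's base URL. This is a minimal sketch; the base URL, API key, and model name are illustrative placeholders, not values provisioned by the Ops Console.

```python
# Minimal sketch of how an end user or application reaches an endpoint through
# its OpenAI-compatible API. base_url, api_key, and model are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inference.com/v1",   # the endpoint's host name + /v1
    api_key="YOUR_API_KEY",                    # hypothetical credential
)

response = client.chat.completions.create(
    model="example-llm",                       # hypothetical model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```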
New Endpoint
- In the Ops Console, click on GenAI and then Endpoint.
- Click "New Endpoint" to start the creation workflow.
General Section
Provide a unique name for the endpoint and an optional description.
Deployment Section
Enter the host name for the endpoint (e.g. https://api.inference.com) and, from the dropdown, select the compute cluster that will power the inference service.
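Because the host name must point at the selected compute cluster's ingress, it can be useful to confirm that DNS resolution is in place. The sketch below, using the illustrative host name from the example above, simply checks that the name resolves.

```python
# Illustrative check that the endpoint's host name resolves in DNS
# (e.g. to the ingress/load balancer of the selected compute cluster).
import socket

host = "api.inference.com"  # placeholder host name
try:
    addresses = sorted({info[4][0] for info in socket.getaddrinfo(host, 443)})
    print(f"{host} resolves to: {', '.join(addresses)}")
except socket.gaierror as exc:
    print(f"{host} does not resolve yet: {exc}")
```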
Certificate Section
Users and applications that access the inference service's API endpoint expect it to be secured with server-side TLS. Upload the server certificate (chain) and private key in PEM format.
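Before uploading, it can help to confirm that the certificate and private key actually belong together. The sketch below assumes the `cryptography` Python package is installed; the file names are illustrative.

```python
# Sketch of a pre-upload sanity check: does the server certificate match the
# private key? Assumes the `cryptography` package; file names are illustrative.
from cryptography import x509
from cryptography.hazmat.primitives import serialization

with open("server-cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())       # leaf/server certificate
with open("server-key.pem", "rb") as f:
    key = serialization.load_pem_private_key(f.read(), password=None)

# The pair matches if the public halves are byte-for-byte identical.
cert_pub = cert.public_key().public_bytes(
    serialization.Encoding.PEM, serialization.PublicFormat.SubjectPublicKeyInfo
)
key_pub = key.public_key().public_bytes(
    serialization.Encoding.PEM, serialization.PublicFormat.SubjectPublicKeyInfo
)
print("certificate and key match" if cert_pub == key_pub else "MISMATCH")
```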
List All Endpoints
In the Ops Console, click on GenAI and then Endpoint. This will display the list of configured endpoints, their status, and associated metadata for the administrator.
View Endpoint Details
In the Ops Console, click on GenAI and then Endpoint. This will display details about the selected endpoint.
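To confirm from outside the Ops Console that an endpoint is actually serving traffic, an administrator can also list its models over the OpenAI-compatible API. The URL and API key in this sketch are placeholders.

```python
# Illustrative external health check: list the models served by an endpoint
# over its OpenAI-compatible API. URL and API key are placeholders.
import requests

resp = requests.get(
    "https://api.inference.com/v1/models",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=10,
)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model.get("id"))
```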
Delete Endpoint
To delete an endpoint, click the ellipsis (three dots) under Actions for the selected endpoint.
Important
This action is not reversible. If an endpoint is deleted accidentally, the administrator will need to recreate it.