Swimm on-prem agent - installation guide
The Swimm on-prem agent is a dedicated container that runs inside your local network and acts as a bridge between your Azure OpenAI instance and the Swimm IDE extension. It is the only component that communicates directly with your Azure OpenAI instance, so your API key is never exposed to the Swimm IDE extension or to any other external component.
Prerequisites
- Make sure you have the following details of your Azure OpenAI instance for GPT-4o and GPT-3.5:
- API Key
- Deployment URL
- The deployed container will be exposed to the Swimm IDE extension in order to bridge Azure OpenAI with the Swimm IDE extension. Make sure that your network allows inbound traffic to this container from the developers' machines.
- Since Swimm uses the Azure OpenAI streaming API, the onprem-agent will stream responses from the Azure OpenAI instance using SSE (Server-Sent Events). Make sure that your network and container orchestration platform allow SSE.
- The container's healthcheck route is /health and its internal port is 24605.
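Because the agent streams over SSE, any proxy between the developers' machines and the container must pass `text/event-stream` responses through unbuffered. As a rough illustration of what an SSE stream looks like on the wire, here is a minimal parser sketch; the payload and the `parse_sse` helper are purely illustrative, not the agent's actual wire format:

```python
def parse_sse(raw: str) -> list[str]:
    """Extract the `data:` payload from each event in a raw SSE stream.

    Illustrative only -- real SSE clients also handle `event:`, `id:`,
    and multi-line `data:` fields (see the WHATWG SSE specification).
    """
    events = []
    for block in raw.split("\n\n"):      # events are separated by a blank line
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(line[len("data:"):].strip())
    return events

# A hypothetical two-event stream:
stream = "data: Hello\n\ndata: world\n\n"
print(parse_sse(stream))  # ['Hello', 'world']
```

If a corporate proxy buffers these responses, the IDE extension will receive the whole answer at once instead of token by token, which is the most common symptom of a blocked SSE path.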
Installation Steps
Step 1: Download the Swimm On-Prem Agent Image
Pull the latest version of the Swimm On-Prem Agent image by running:
docker pull swimmio/onprem-agent:latest
(Optional) Download the image from the Swimm Image Storage and upload it to your Organization's Registry.
| Option 1 | Option 2 |
|---|---|
| (Non-Azure) Kubernetes / Amazon ECS | Azure Container Apps |
(Non-Azure) Download the image from the Swimm Image Storage and upload it to your Organization's Registry

If you are not able to pull the image directly, you can download it from the Swimm Image Storage and upload it to your Organization's Registry.

1. Download the latest version of the Swimm On-Prem Agent image from here and save it locally on a machine that has access to your Organization's Registry.
2. Load the image:
docker load -i swimmio_onprem_agent_latest.tar
3. Tag the image with your Organization's Registry URL:
docker tag swimmio/onprem-agent:latest <your-registry-url>/swimmio/onprem-agent:latest
4. Upload the image to your Organization's Registry:
docker push <your-registry-url>/swimmio/onprem-agent:latest
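The tag step above simply prefixes the public image reference with your registry's hostname. As a tiny sketch of the resulting name (the `retag` helper and the example registry hostname are ours, for illustration only):

```python
def retag(image: str, registry_url: str) -> str:
    """Prefix a public image reference with a private registry URL,
    mirroring `docker tag swimmio/onprem-agent:latest <your-registry-url>/swimmio/onprem-agent:latest`."""
    return f"{registry_url.rstrip('/')}/{image}"

print(retag("swimmio/onprem-agent:latest", "registry.example.com"))
# registry.example.com/swimmio/onprem-agent:latest
```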
Azure Container Apps Environment and Registry
1. Create an Azure Container Apps environment.
Ensure you have the Azure CLI installed and configured on your machine, then create a new Azure Container Apps environment:
az containerapp env create --name myContainerAppEnv --resource-group myResourceGroup --location eastus
If you don't have a resource group yet, create one first:
az group create --name myResourceGroup --location eastus
2. Create an Azure Container Registry (ACR) (optional).
This step is optional because the image is publicly available on Docker Hub:
docker pull swimmio/onprem-agent:latest
If you prefer to use your own container registry, pull the image from Docker Hub and push it to your own ACR:
- Create an Azure Container Registry:
az acr create --resource-group myResourceGroup --name myContainerRegistry --sku Basic
- Log in to your ACR:
az acr login --name myContainerRegistry
- Pull the image from Docker Hub:
docker pull swimmio/onprem-agent:latest
- Tag the image with your ACR URL:
docker tag swimmio/onprem-agent:latest mycontainerregistry.azurecr.io/onprem-agent:latest
- Push the image to your ACR:
docker push mycontainerregistry.azurecr.io/onprem-agent:latest
Go to Step 2 to deploy the Swimm on-prem agent.
Step 2: Deploy the Swimm On-Prem Agent
You have multiple options to deploy the Swimm On-Prem Agent. Choose your preferred deployment method and follow the instructions below.
| Option 1 | Option 2 | Option 3 |
|---|---|---|
| Kubernetes | Amazon ECS | Azure Container Apps |
Kubernetes deployment

1. Edit the following deployment.yaml template, replacing the image URL with your Organization's Registry URL and filling in the following API keys:

Note: If you have access to GPT-4o, set the same GPT-4o API_KEY and DEPLOYMENT_URL for both the GPT_4_LONG and GPT_4_SHORT env variable pairs.

- GPT_4_LONG_API_KEY (32K)
- GPT_4_LONG_DEPLOYMENT_URL (32K)
- GPT_4_SHORT_API_KEY
- GPT_4_SHORT_DEPLOYMENT_URL
- GPT_3_5_API_KEY
- GPT_3_5_DEPLOYMENT_URL
apiVersion: apps/v1
kind: Deployment
metadata:
  name: onprem-agent
  labels:
    app: onprem-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: onprem-agent
  template:
    metadata:
      labels:
        app: onprem-agent
    spec:
      containers:
        - name: onprem-agent
          image: <your-registry-url>/onprem-agent:latest
          ports:
            - containerPort: 24605
          env:
            # Your GPT 4 32K details
            - name: GPT_4_LONG_API_KEY
              value: <your-gpt-4-long-api-key>
            - name: GPT_4_LONG_DEPLOYMENT_URL
              value: <your-gpt-4-long-deployment-url>
            # Your GPT 4 details
            - name: GPT_4_SHORT_API_KEY
              value: <your-gpt-4-short-api-key>
            - name: GPT_4_SHORT_DEPLOYMENT_URL
              value: <your-gpt-4-short-deployment-url>
            # Your GPT 3.5 details
            - name: GPT_3_5_API_KEY
              value: <your-gpt-3-5-api-key>
            - name: GPT_3_5_DEPLOYMENT_URL
              value: <your-gpt-3-5-deployment-url>
2. Depending on your setup, you might also need a service.yaml file with the following content:

apiVersion: v1
kind: Service
metadata:
  name: onprem-agent
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 24605
  selector:
    app: onprem-agent
3. Deploy the service by running the following commands (or through your container orchestration platform):
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
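The deployment depends on all six environment variables from the list above being set. As a hedged sketch (this check is ours, not part of the agent), a fail-fast validation like the following can run before deploying, for example in a CI step that renders the manifest:

```python
# Fail fast when a required agent variable is missing or empty.
# Variable names come from the deployment.yaml template above;
# the helper itself is illustrative.
REQUIRED_VARS = [
    "GPT_4_LONG_API_KEY", "GPT_4_LONG_DEPLOYMENT_URL",
    "GPT_4_SHORT_API_KEY", "GPT_4_SHORT_DEPLOYMENT_URL",
    "GPT_3_5_API_KEY", "GPT_3_5_DEPLOYMENT_URL",
]

def missing_vars(env: dict) -> list:
    """Return the required variables that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example with a partially filled environment:
example = {"GPT_4_LONG_API_KEY": "xxx", "GPT_3_5_API_KEY": "yyy"}
print(missing_vars(example))
```

Pass `dict(os.environ)` instead of the example mapping to check the real environment.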
Amazon ECS deployment
1. Navigate to the Amazon ECS console and create a new task definition.
- Log in to the AWS Management Console and navigate to the Amazon ECS dashboard.
- Select Task Definitions from the navigation pane.
- Click the Create new Task Definition button.
- Choose the launch type compatibility based on your requirements (EC2 or Fargate).
- Enter a name (e.g., swimm-ai-proxy).
- Define the container image from your Organization's Registry URL by specifying the repository URL and image tag.
- Configure essential container settings, including memory and CPU limits, logging options, and networking mode.
- Set up environment variables for your API keys and OpenAI deployments:
  Note: If you have access to GPT-4o, set the same GPT-4o API_KEY and DEPLOYMENT_URL for both the GPT_4_LONG and GPT_4_SHORT env variable pairs.
  - GPT_4_LONG_API_KEY (32K)
  - GPT_4_LONG_DEPLOYMENT_URL (32K)
  - GPT_4_SHORT_API_KEY
  - GPT_4_SHORT_DEPLOYMENT_URL
  - GPT_3_5_API_KEY
  - GPT_3_5_DEPLOYMENT_URL
- Set the container's internal port to 24605.
- Review and create the task definition.
2. Create a new ECS service using the task definition.
- Once your task definition is created, go back to the Amazon ECS dashboard.
- Select the cluster where you want to deploy your service.
- Click the Create button to create a new service.
- Enter a service name and configure the desired task count. For automatic scaling, leave the desired task count field empty.
- Choose a deployment type (rolling update or blue/green) and set up deployment options as per your requirements.
- Configure network settings, such as VPC, subnets, and security groups.
- Optionally, enable service auto-scaling if you anticipate changes in traffic patterns. Define scaling policies based on metrics like CPU utilization or request counts.
- Configure load balancing settings if you want to expose your service to external traffic.
- Review and create the service.
Azure container deployment
Deploy the Swimm On-Prem Agent to Azure Container Apps using whichever of the following approaches fits your workflow.
Create the following Azure OpenAI model deployments
Once you have access to an Azure OpenAI account, create the following model deployments.
1. Go to Azure Console > Azure OpenAI > Select or create a new Azure OpenAI resource.
2. Go to Model Deployments > Manage Deployments.
3. Click Deploy Model.
4. Fill in the details for both models:

| Model | Model Name | Deployment Name |
|---|---|---|
| GPT 4 | gpt-4o | GPT4o |
| GPT 3.5 | gpt-35-turbo-16k | GPT35turbo16K |
How to create a model deployment for GPT-4o
Deployment options:
Deploy with Azure CLI
1. Run the following command to deploy the container using the Azure CLI. Replace the placeholder values with your specific environment variables.

Note: If you have access to GPT-4o, set the same GPT-4o API_KEY and DEPLOYMENT_URL for both the GPT_4_LONG and GPT_4_SHORT env variable pairs.

az containerapp create --name myContainerApp --resource-group myResourceGroup --environment myContainerAppEnv --image swimmio/onprem-agent:latest --target-port 24605 --ingress 'external' --env-vars GPT_4_LONG_API_KEY=<> GPT_4_LONG_DEPLOYMENT_URL=<> GPT_4_SHORT_API_KEY=<> GPT_4_SHORT_DEPLOYMENT_URL=<> GPT_3_5_API_KEY=<> GPT_3_5_DEPLOYMENT_URL=<>

Note: If you decide to use your own container registry, update the --image value with your container registry's image reference.
Deploy with YAML file
1. Create a containerapp.yaml file with the following content. Replace the placeholder values with your specific environment variables; the DEPLOYMENT_URL values should be the deployment URL of the model.
How to get the model deployment URL for Azure OpenAI

| Format | Deployment URL Example |
|---|---|
| ❌ | https://genaiga.openai.azure.com/openai/deployments/GPT4O |
| ✅ | https://genaiga.openai.azure.com/openai/deployments/GPT4O/chat/completions?api-version=2023-03-15-preview |
properties:
environmentId: /subscriptions/<YOUR_SUBSCRIPTION_ID>/resourceGroups/myResourceGroup/providers/Microsoft.App/managedEnvironments/myContainerAppEnv
configuration:
ingress:
external: true
targetPort: 24605
template:
containers:
      - image: swimmio/onprem-agent:latest
name: onprem-agent
env:
- name: GPT_4_LONG_API_KEY
value: ""
- name: GPT_4_LONG_DEPLOYMENT_URL
value: ""
- name: GPT_4_SHORT_API_KEY
value: ""
- name: GPT_4_SHORT_DEPLOYMENT_URL
value: ""
- name: GPT_3_5_API_KEY
value: ""
- name: GPT_3_5_DEPLOYMENT_URL
value: ""
Note: If you decide to use your own container registry, update the image value with your container registry's image reference.
2. Deploy the container using the containerapp.yaml file:
az containerapp create --name myContainerApp --resource-group myResourceGroup --yaml containerapp.yaml
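A common mistake is supplying only the bare deployment URL instead of the full chat-completions endpoint shown in the format table above. As a sketch of a quick sanity check (the `looks_like_completions_url` helper is ours, and the heuristic is an assumption, not an official validation rule):

```python
from urllib.parse import urlparse, parse_qs

def looks_like_completions_url(url: str) -> bool:
    """Heuristic: a usable DEPLOYMENT_URL ends in /chat/completions and
    pins an api-version query parameter, matching the example format
    https://<resource>.openai.azure.com/openai/deployments/<name>/chat/completions?api-version=..."""
    parsed = urlparse(url)
    return (
        parsed.path.endswith("/chat/completions")
        and "api-version" in parse_qs(parsed.query)
    )

print(looks_like_completions_url(
    "https://genaiga.openai.azure.com/openai/deployments/GPT4O"))  # False
print(looks_like_completions_url(
    "https://genaiga.openai.azure.com/openai/deployments/GPT4O"
    "/chat/completions?api-version=2023-03-15-preview"))           # True
```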
Step 3: Verify that the service is up and running
Make sure that the service is up and running by trying to access it from your local machine. The service URL will be used to configure the Swimm IDE extension to communicate with the service for your Swimm workspace.
| Option 1 | Option 2 |
|---|---|
| (Non-Azure) Kubernetes / Amazon ECS | Azure Container Apps |
Non-Azure Deployment Verification

1. Use the following cURL command (or your browser):
curl http://<service-url>/health
2. If the server is up and running, you should get an HTTP 200 response.
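The same check can be scripted. As a minimal sketch of building the health endpoint and interpreting the result (the helper names are ours; only the /health route and the HTTP 200 success code come from this guide):

```python
def health_url(service_url: str) -> str:
    """Build the health-check endpoint from the service's base URL."""
    return service_url.rstrip("/") + "/health"

def is_healthy(status_code: int) -> bool:
    """The agent is considered up when /health returns HTTP 200."""
    return status_code == 200

print(health_url("http://agent.internal:8080/"))
# http://agent.internal:8080/health
```

In practice you would issue the request with any HTTP client (e.g. `urllib.request.urlopen`) and feed the response status into `is_healthy`.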
🎉 Please send the service URL to the Swimm team. 🎉
Azure Deployment Verification

1. Verify deployment status:
Ensure that your deployment succeeded by checking the status of your container app with the Azure CLI:
az containerapp show --name <your-container-app-name> --resource-group <your-resource-group>
2. Check application logs:
Ensure there are no errors by viewing your application's logs:
az containerapp logs show --name <your-container-app-name> --resource-group <your-resource-group>
3. Get your application URL:
Find it on the app's overview page in the Azure console, or via the Azure CLI:
az containerapp show --name <your-container-app-name> --resource-group <your-resource-group> --query properties.configuration.ingress.fqdn
🎉 Please send the application URL to the Swimm team. 🎉
Set up a budget and an alert for your Azure OpenAI subscription
The Azure OpenAI token is not technically limited to a certain number of API calls or a spending cap, so it's best to set up an alert for unusual or unauthorized usage.
Here is a short video that shows how to do that.