
Swimm on-prem agent - installation guide

The Swimm on-prem agent is a dedicated container that runs inside your local network and bridges your Azure OpenAI instance with the Swimm IDE extension. It is the only component that communicates directly with your Azure OpenAI instance, which keeps your API key protected: the key is never exposed to the Swimm IDE extension or to any other external component.

Prerequisites​

  1. Make sure you have the API keys and deployment URLs of your Azure OpenAI instance for GPT-4o and GPT-3.5.
  2. The deployed container will be exposed to the Swimm IDE extension in order to bridge Azure OpenAI with the Swimm IDE extension. Make sure that your network allows inbound traffic to this container from the developers' machines.
  3. Since Swimm uses the Azure OpenAI streaming API, the onprem-agent streams responses from the Azure OpenAI instance using SSE (Server-Sent Events). Make sure that your network and container orchestration platform allow SSE.
  4. The container's health check route is /health, and its internal port is 24605.
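To sanity-check these prerequisites from a developer machine, you can probe the health route once the agent is deployed. A small sketch; the hostname is a placeholder and `health_url` is a hypothetical helper that builds the health endpoint from the agent's base URL:

```shell
# Hypothetical helper: build the /health URL from the agent's base URL,
# tolerating an optional trailing slash on the input.
health_url() {
  local base="${1%/}"          # strip a single trailing slash if present
  printf '%s/health\n' "$base"
}

# Example probe (replace the host with your deployment's address):
#   curl -sf "$(health_url http://onprem-agent.internal:24605)"
health_url "http://onprem-agent.internal:24605/"
# → http://onprem-agent.internal:24605/health
```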

Installation Steps​

Step 1: Download the Swimm On-Prem Agent Image​

Pull the latest version of the Swimm On-Prem Agent image by running:

docker pull swimmio/onprem-agent:latest

(Non-Azure) Download the image from the Swimm Image Storage and upload it to your Organization's Registry​
  • If you are unable to pull the image from Docker Hub, you can download it from the Swimm Image Storage and upload it to your Organization's Registry.

    1. Download the latest version of the Swimm On-Prem Agent image from here and save it locally on a machine that has access to your Organization's Registry.
    2. Use the following command to load the image:
      docker load -i swimmio_onprem_agent_latest.tar
    3. Tag the image with your Organization's Registry URL:
      docker tag swimmio/onprem-agent:latest <your-registry-url>/swimmio/onprem-agent:latest
    4. Upload the image to your Organization's Registry.
βœ… Go to step 2 to deploy the Swimm On-Prem Agent.
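The mirroring steps above can be sketched as one script. Here `registry.example.com` is a placeholder for your Organization's Registry URL, and the `docker` commands are echoed rather than executed so you can review the exact tag first:

```shell
# Sketch of the load/tag/push sequence. REGISTRY is a placeholder for
# <your-registry-url>; swap it for your real registry before running.
REGISTRY="registry.example.com"
SRC="swimmio/onprem-agent:latest"
TARGET="$REGISTRY/swimmio/onprem-agent:latest"

# Commands are echoed for review; remove the echo to run them for real.
echo "docker load -i swimmio_onprem_agent_latest.tar"
echo "docker tag $SRC $TARGET"
echo "docker push $TARGET"
```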

Azure Container Apps Environment and Registry​
  1. Create an Azure Container Apps Environment:

    Ensure you have the Azure CLI installed and configured on your machine.

    Create a new Azure Container Apps environment by running the following command:

    az containerapp env create --name myContainerAppEnv --resource-group myResourceGroup --location eastus

    If you don't have a resource group, you can create a new one:

    az group create --name myResourceGroup --location eastus
  2. Create an Azure Container Registry (ACR) (Optional)

    This is optional because our Docker image is publicly available on Docker Hub:

    docker pull swimmio/onprem-agent:latest

    If you prefer to use your own container registry, you can pull the image from Docker Hub and push it to your own ACR.

    1. Create an Azure Container Registry:

      az acr create --resource-group myResourceGroup --name myContainerRegistry --sku Basic
    2. Login to your ACR:

      az acr login --name myContainerRegistry
    3. Pull the image from Docker Hub:

      docker pull swimmio/onprem-agent:latest
    4. Tag the image with your ACR URL:

      docker tag swimmio/onprem-agent:latest mycontainerregistry.azurecr.io/onprem-agent:latest
    5. Push the image to your ACR:

      docker push mycontainerregistry.azurecr.io/onprem-agent:latest

βœ… Go to step 2 to deploy the Swimm on-prem agent.

Step 2: Deploy the Swimm On-Prem Agent​

You have multiple options to deploy the Swimm On-Prem Agent. Choose your preferred deployment method and follow the instructions below.

If you are using a different platform, feel free to reach out and our team will be happy to help you get started.

Kubernetes deployment​

  1. Edit the following deployment.yaml template, replace the image URL with your Organization's Registry URL, and fill in the following API keys and deployment URLs:

    Note: If you have access to GPT-4o, set the same GPT-4o …API_KEY and …DEPLOYMENT_URL for both the GPT_4_LONG and GPT_4_SHORT environment variable pairs.

    • GPT_4_LONG_API_KEY (32K)
    • GPT_4_LONG_DEPLOYMENT_URL (32K)
    • GPT_4_SHORT_API_KEY
    • GPT_4_SHORT_DEPLOYMENT_URL
    • GPT_3_5_API_KEY
    • GPT_3_5_DEPLOYMENT_URL

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: onprem-agent
      labels:
        app: onprem-agent
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: onprem-agent
      template:
        metadata:
          labels:
            app: onprem-agent
        spec:
          containers:
            - name: onprem-agent
              image: <your-registry-url>/onprem-agent:latest
              ports:
                - containerPort: 24605
              env:
                # Your GPT-4 32K details
                - name: GPT_4_LONG_API_KEY
                  value: <your-gpt-4-long-api-key>
                - name: GPT_4_LONG_DEPLOYMENT_URL
                  value: <your-gpt-4-long-deployment-url>

                # Your GPT-4 details
                - name: GPT_4_SHORT_API_KEY
                  value: <your-gpt-4-short-api-key>
                - name: GPT_4_SHORT_DEPLOYMENT_URL
                  value: <your-gpt-4-short-deployment-url>

                # Your GPT-3.5 details
                - name: GPT_3_5_API_KEY
                  value: <your-gpt-3-5-api-key>
                - name: GPT_3_5_DEPLOYMENT_URL
                  value: <your-gpt-3-5-deployment-url>
  2. Depending on your setup, you might also need a service.yaml file with the following content:

    apiVersion: v1
    kind: Service
    metadata:
      name: onprem-agent
    spec:
      type: NodePort
      ports:
        - port: 8080
          targetPort: 24605
      selector:
        app: onprem-agent
  3. Deploy the service by running the following commands (or through your container orchestration platform):

    kubectl apply -f deployment.yaml
    kubectl apply -f service.yaml
βœ… Go to step 3 to verify that the service is up and running.
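The deployment.yaml template keeps the API keys inline as plain environment values. If you'd rather not commit keys into the manifest, a common Kubernetes pattern is to move them into a Secret and inject it with envFrom. A minimal sketch, assuming the same variable names (the Secret name onprem-agent-keys is illustrative):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: onprem-agent-keys
type: Opaque
stringData:
  GPT_4_LONG_API_KEY: <your-gpt-4-long-api-key>
  GPT_4_LONG_DEPLOYMENT_URL: <your-gpt-4-long-deployment-url>
  GPT_4_SHORT_API_KEY: <your-gpt-4-short-api-key>
  GPT_4_SHORT_DEPLOYMENT_URL: <your-gpt-4-short-deployment-url>
  GPT_3_5_API_KEY: <your-gpt-3-5-api-key>
  GPT_3_5_DEPLOYMENT_URL: <your-gpt-3-5-deployment-url>
```

In the container spec of deployment.yaml, the env: block would then be replaced with:

```yaml
envFrom:
  - secretRef:
      name: onprem-agent-keys
```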

Amazon ECS deployment​

  1. Navigate to the Amazon ECS console and create a new task definition.

    1. Log in to the AWS Management Console and navigate to the Amazon ECS dashboard.
    2. Select Task Definitions from the navigation pane.
    3. Click on the Create new Task Definition button.
    4. Choose the launch type compatibility based on your requirements (EC2 or Fargate).
    5. Enter a name (e.g., swimm-ai-proxy).
    6. Define the container image from your Organization's Registry URL by specifying the repository URL and image tag.
    7. Configure essential container settings, including memory and CPU limits, logging options, and networking mode.
    8. Set up environment variables for your API keys and OpenAI deployments:

    Note: If you have access to GPT-4o, set the same GPT-4o …API_KEY and …DEPLOYMENT_URL for both the GPT_4_LONG and GPT_4_SHORT environment variable pairs.

    • GPT_4_LONG_API_KEY (32K)
    • GPT_4_LONG_DEPLOYMENT_URL (32K)
    • GPT_4_SHORT_API_KEY
    • GPT_4_SHORT_DEPLOYMENT_URL
    • GPT_3_5_API_KEY
    • GPT_3_5_DEPLOYMENT_URL

    9. Set the container's internal port to 24605.
    10. Review and create the task definition.

  2. Create a new ECS service using the task definition.

    1. Once your task definition is created, go back to the Amazon ECS dashboard.
    2. Select the cluster where you want to deploy your service.
    3. Click on the Create button to create a new service.
    4. Enter a service name and configure the service's desired tasks count. For automatic scaling, leave the desired tasks count field empty.
    5. Choose a deployment type (Rolling update or Blue/green) and set up deployment options as per your requirements.
    6. Configure network settings, such as VPC, subnets, and security groups.
    7. Optionally, enable service auto-scaling if you anticipate changes in traffic patterns. Define scaling policies based on metrics like CPU utilization or request counts.
    8. Configure load balancing settings if you want to expose your service to external traffic.
    9. Review and create the service.
βœ… Go to step 3 to verify that the service is up and running.

Azure container deployment​

You can deploy the agent to Azure Container Apps either with the Azure CLI or with a YAML file.

Create the following Azure OpenAI model deployments​

Once you have access to an Azure OpenAI account, create the following model deployments.

  1. Go to Azure Console > Azure OpenAI > Select or create a new Azure OpenAI resource
  2. Go to Model Deployments > Manage Deployments
  3. Click on Deploy Model
    • Fill in the details for both models:
| Model   | Model Name       | Deployment Name |
|---------|------------------|-----------------|
| GPT 4   | gpt-4o           | GPT4o           |
| GPT 3.5 | gpt-35-turbo-16k | GPT35turbo16K   |

How to create a model deployment for GPT 4o​

Deployment options:​

  • Option 1: Azure CLI
  • Option 2: YAML file

Deploy with Azure CLI​

  1. Run the following command to deploy the container using the Azure CLI. Replace the placeholder values with your specific environment variables.

    Note: If you have access to GPT-4o, set the same GPT-4o …API_KEY and …DEPLOYMENT_URL for both the GPT_4_LONG and GPT_4_SHORT environment variable pairs.

    az containerapp create --name myContainerApp --resource-group myResourceGroup --environment myContainerAppEnv --image swimmio/onprem-agent:latest --target-port 24605 --ingress 'external' --env-vars \
      GPT_4_LONG_API_KEY=<> \
      GPT_4_LONG_DEPLOYMENT_URL=<> \
      GPT_4_SHORT_API_KEY=<> \
      GPT_4_SHORT_DEPLOYMENT_URL=<> \
      GPT_3_5_API_KEY=<> \
      GPT_3_5_DEPLOYMENT_URL=<>

    Note: If you decide to use your own container registry, ensure you update the --image value with your container registry.

Deploy with YAML file​

  1. Create a containerapp.yaml file with the following content. Replace the placeholder values with your specific environment variables; each deployment URL value must follow the base format shown below.
How to get model deployment URL for Azure OpenAI​

| Format | Deployment URL Example |
|--------|------------------------|
| βœ…     | https://genaiga.openai.azure.com/openai/deployments/GPT4O |
| ❌     | https://genaiga.openai.azure.com/openai/deployments/GPT4O/chat/completions?api-version=2023-03-15-preview |
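If you copied a full completions URL from the Azure console, you can trim it back to the expected base form. A small sketch; `deployment_base` is a hypothetical helper, not part of the agent:

```shell
# Hypothetical helper: strip the /chat/completions path and query string,
# leaving only the deployment base URL the agent expects. URLs that are
# already in base form pass through unchanged.
deployment_base() {
  printf '%s\n' "${1%%/chat/completions*}"
}

deployment_base "https://genaiga.openai.azure.com/openai/deployments/GPT4O/chat/completions?api-version=2023-03-15-preview"
# → https://genaiga.openai.azure.com/openai/deployments/GPT4O
```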

properties:
  environmentId: /subscriptions/<YOUR_SUBSCRIPTION_ID>/resourceGroups/myResourceGroup/providers/Microsoft.App/managedEnvironments/myContainerAppEnv
  configuration:
    ingress:
      external: true
      targetPort: 24605
  template:
    containers:
      - image: swimmio/onprem-agent:latest
        name: onprem-agent
        env:
          - name: GPT_4_LONG_API_KEY
            value: ""
          - name: GPT_4_LONG_DEPLOYMENT_URL
            value: ""
          - name: GPT_4_SHORT_API_KEY
            value: ""
          - name: GPT_4_SHORT_DEPLOYMENT_URL
            value: ""
          - name: GPT_3_5_API_KEY
            value: ""
          - name: GPT_3_5_DEPLOYMENT_URL
            value: ""

Note: If you decide to use your own container registry, ensure you update the image value with your container registry URL.

  2. Deploy the container using the containerapp.yaml file.
    az containerapp create --resource-group myResourceGroup --file containerapp.yaml
βœ… Go to step 3 to verify that the service is up and running.

Step 3: Verify that the service is up and running​

The service URL will be used to configure the Swimm IDE extension to communicate with the service for your Swimm workspace.

Make sure that the service is up and running. Try to access it from your local machine.


Non-Azure Deployment Verification​

  1. Run the following cURL command (or open the URL in your browser):

    curl http://<service-url>/health
  2. If the server is up and running, you should get an HTTP 200 response.
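As a sketch of an automated check, you can capture the status code with standard curl flags and treat a 200 (or any 2xx) response as healthy. The service URL is a placeholder and `is_healthy` is a hypothetical helper:

```shell
# Hypothetical helper: succeed (exit 0) for any 2xx HTTP status code.
# The agent's /health route returns 200 when the server is up.
is_healthy() {
  case "$1" in
    2??) return 0 ;;
    *)   return 1 ;;
  esac
}

# Example use against a real deployment (placeholder URL):
#   status=$(curl -s -o /dev/null -w '%{http_code}' "http://<service-url>/health")
#   is_healthy "$status" && echo "agent is up"
is_healthy 200 && echo "healthy"
# → healthy
```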


πŸ”— Please send the service URL to the Swimm team. πŸ”—β€‹


Azure Deployment Verification​

  1. Verify Deployment Status:

    Ensure that your deployment is successful by checking the status of your container app. You can do this using the Azure CLI:

    az containerapp show --name my-container-app --resource-group my-resource-group
  2. Check Application Logs:

    Ensure there are no errors by checking the logs of your application. Use the Azure CLI to view logs:

    az containerapp logs show --name <your-container-app-name> --resource-group <your-resource-group>
  3. Get your application URL:

    Get your application URL on the App overview page on the Azure console or using the Azure CLI:

    az containerapp show --name <your-container-app-name> --resource-group <your-resource-group> --query properties.configuration.ingress.fqdn
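The fqdn value returned by that query is a bare hostname; the application URL is `https://` plus the FQDN (Container Apps external ingress serves HTTPS). A trivial sketch with a placeholder FQDN:

```shell
# Placeholder FQDN as returned by the --query above (yours will differ).
FQDN="my-container-app.kindocean-12345678.eastus.azurecontainerapps.io"
APP_URL="https://$FQDN"

# Health endpoint you can probe before sharing the URL:
echo "$APP_URL/health"
```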

πŸ”— Please send the application URL to the Swimm team. πŸ”—β€‹


Set up a budget and an alert for your Azure OpenAI subscription​

The Azure OpenAI key is not technically limited to a certain number of API calls or a spending cap, so it's best to set up an alert for unusual or unauthorized usage.

Here is a short video that shows how to do that.