
On-Prem Agent Installation Guide

Swimm's On-Prem Agent is a dedicated server running in a container within your company's network. It grants Swimm access to services that you run your own instance of, such as Azure OpenAI, whose keys you don't want to share directly with Swimm. This keeps your keys confidential and private.

Prerequisites

  1. If you are deploying the agent for AI features, make sure you have the following details of your Azure OpenAI deployments for GPT-4o, GPT-4o mini, and GPT-3.5 Turbo Instruct:

    • API Key
    • Deployment URL

  2. The server must be deployed with HTTPS/TLS, using a certificate trusted by your organization's computers. See TLS Configuration.

  3. The deployed container will be exposed to the Swimm IDE extension and web page in order to bridge Azure OpenAI with Swimm. Make sure that your network allows inbound traffic to this container from the developers' machines.

  4. If you direct the agent to connect to Azure OpenAI via a forward/reverse proxy/VPN, and that proxy/VPN does TLS Man-in-the-Middle (MITM), you will need to configure the On-Prem Agent to trust the certificate of the proxy/VPN. You can configure an additional certificate bundle that the On-Prem Agent will use by using the NODE_EXTRA_CA_CERTS environment variable. See the FAQ for more details.

  5. Swimm uses the Azure OpenAI streaming API: the On-Prem Agent streams responses from your Azure OpenAI instance using SSE (Server-Sent Events), just like a direct streaming request to Azure OpenAI. Make sure that your network, load balancer/proxy, and container orchestration platform allow SSE. You can verify this by checking their documentation, or by asking the personnel at your organization responsible for them.
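
If you deploy with Docker Compose, the extra CA bundle from item 4 can be wired in roughly like this (a sketch; the corp-ca.pem file name and the service name are our assumptions, not requirements):

```yaml
services:
  onprem-agent:
    image: swimmio/onprem-agent:2
    environment:
      # Extra CA bundle so the agent trusts a TLS-intercepting proxy/VPN
      NODE_EXTRA_CA_CERTS: /etc/swimm/corp-ca.pem
    volumes:
      - ./corp-ca.pem:/etc/swimm/corp-ca.pem:ro
```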

Technical Details

  • The server listens on port 24605 inside the container by default; this can be overridden using the PORT environment variable. You can also remap the external port using Docker port mapping.
  • The server has an HTTP health check endpoint at /health.
  • The path to the configuration file is specified using the CONFIG_PATH environment variable.
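
The default-port rule above behaves like a standard shell fallback; a tiny sketch (an illustration of the behavior, not the agent's actual code):

```shell
# Listen on $PORT if set, otherwise fall back to the default 24605
PORT_DEFAULT=24605
listen_port="${PORT:-$PORT_DEFAULT}"
echo "listening on $listen_port"
```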

Deploying Models

This step is only needed if you want to use your own Azure OpenAI instance and the required models aren't already deployed.

  1. Open Azure AI Foundry and create or select the correct Azure OpenAI resource at the top.

    Or enter Azure Portal → Azure OpenAI → Create/Select OpenAI Resource → Explore Azure AI Foundry Portal

  2. Select "Deployments" in the sidebar.

  3. Select "Deploy model".

  4. Deploy gpt-4o, gpt-4o-mini and gpt-35-turbo-instruct with a name of your choice.

How to get the deployment URL

Click your deployed model in the Azure AI Foundry portal; you should see its target URI and examples in various languages. The URL you need is the target URI without the "/chat/completions" and "?api-version=*" suffixes. For example:

  • Good ✅ - https://ai-resource.openai.azure.com/openai/deployments/{deployment-name}
  • Bad ❌ - https://ai-resource.openai.azure.com/openai/deployments/{deployment-name}/chat/completions?api-version=2025-01-01-preview
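
If you copied the full target URI, the suffixes can be stripped with plain shell parameter expansion (a sketch; the resource and deployment names here are hypothetical):

```shell
# Full target URI as copied from the Azure AI Foundry portal (hypothetical values)
full="https://ai-resource.openai.azure.com/openai/deployments/my-gpt-4o/chat/completions?api-version=2025-01-01-preview"

# Remove the "/chat/completions" and "?api-version=*" suffixes in one cut
deployment_url="${full%%/chat/completions*}"
echo "$deployment_url"
# → https://ai-resource.openai.azure.com/openai/deployments/my-gpt-4o
```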

Installation

Step 1 - Pull

Pull the latest version of the image using:

docker pull swimmio/onprem-agent:2 # From Docker Hub
# Or
docker pull us-docker.pkg.dev/swimmio/public/onprem-agent:2 # From Google Artifact Registry

The image is available on Docker Hub (swimmio/onprem-agent) and Google Artifact Registry. You can see the available tags there if you want to pin a version for more control over updates. We publish the following tags:

  • latest - Always the latest version.
  • 2, etc. - The latest release of that major version (here, major version 2).
  • 2.0.7 - A specific version.

If you can't or don't want to use Docker Hub or Google Artifact Registry, you can also mirror the image:

Mirroring
docker save swimmio/onprem-agent:2 --platform linux/amd64 -o swimm-onprem-agent-2.tar # From Docker Hub
# Or
docker save us-docker.pkg.dev/swimmio/public/onprem-agent:2 --platform linux/amd64 -o swimm-onprem-agent-2.tar # From Google Artifact Registry

And then to load the image:

docker load -i swimm-onprem-agent-2.tar

And if needed you can also push it to a registry:

docker tag <image> <your-registry>/<image> # e.g. docker tag swimmio/onprem-agent:latest registry.foo.test/swimmio/onprem-agent:latest
docker push <your-registry>/<image> # e.g. docker push registry.foo.test/swimmio/onprem-agent:latest

Step 2 - Configure

Create your configuration file:

swimm-onprem-agent.yaml
enterprise_name: <enterprise_name> # Optional, but recommended

# For configuring AI features, remove if not used
openai:
  models:
    gpt-4o:
      versions:
        '2024-08-06':
          api_key: <api_key>
          deployment_url: <gpt-4o_deployment_url>
          usd_per_1000_input_tokens: 0.005
          usd_per_1000_output_tokens: 0.015
          max_token_count: 131072
        '2024-05-13':
          api_key: <api_key>
          deployment_url: <gpt-4o_deployment_url>
          usd_per_1000_input_tokens: 0.005
          usd_per_1000_output_tokens: 0.015
          max_token_count: 131072
    gpt-4o-mini:
      versions:
        '2024-07-18':
          api_key: <api_key>
          deployment_url: <gpt-4o-mini_deployment_url>
          usd_per_1000_input_tokens: 0.000165
          usd_per_1000_output_tokens: 0.00066
          max_token_count: 131072
    gpt-35-turbo-instruct:
      versions:
        '0613':
          api_key: <api_key>
          deployment_url: <gpt-35-turbo-instruct_deployment_url>
          usd_per_1000_input_tokens: 0.003
          usd_per_1000_output_tokens: 0.004
          max_token_count: 16384

# For configuring On-Prem/Enterprise Git Provider authentication, remove if not used
git_oauth:
  git_hosting: <git_server_url>
  client_id: <client_id>
  client_secret: <client_secret>

Configuring On-prem Enterprise Git Provider (if needed)

For more information on how to create a Swimm OAuth App for your on-prem Git hosting, please refer to the following instructions:

Step 3 - Deploy

Since the On-Prem Agent is a Docker image/container, you can deploy it using any container orchestrator of your choice: plain Docker or with Docker Compose, Kubernetes, ECS, GCP Cloud Run, and so on.

Assuming your configuration file is in the current working directory and named swimm-onprem-agent.yaml, run:

docker run -d \
-p 24605:24605 \
-v $PWD/swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro \
-e CONFIG_PATH=/etc/swimm/onprem-agent.yaml \
swimmio/onprem-agent:2
note

The path on the left-hand side of -v must be absolute; we use $PWD to make it absolute.

note

You will need a load balancer/reverse proxy in front of the container to handle HTTPS, or you can mount your own certificate and private key into the container and specify them in the configuration file.
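
As an alternative to the docker run command above, the same deployment can be sketched in Docker Compose (the healthcheck and restart policy are our assumptions, and the healthcheck assumes wget is available inside the image):

```yaml
services:
  onprem-agent:
    image: swimmio/onprem-agent:2
    ports:
      - "24605:24605"
    environment:
      CONFIG_PATH: /etc/swimm/onprem-agent.yaml
    volumes:
      # Compose resolves relative host paths, so no $PWD is needed here
      - ./swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro
    healthcheck:
      # Assumes wget exists in the image; swap in curl or another tool if not
      test: ["CMD-SHELL", "wget -qO- http://localhost:24605/health || exit 1"]
      interval: 30s
    restart: unless-stopped
```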

Step 4 - Verify

On the client machines that will connect to the On-Prem Agent, browse to:

https://<your-service-url>/status

This should let you know if the service is up and running, and if the configuration is working correctly to communicate with your Azure OpenAI deployment.

You can also try from a terminal:

curl -v https://<your-service-url>/health
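
For scripting the check, curl's -f flag turns HTTP error statuses into a non-zero exit code, so the health probe can be wrapped like this (a sketch; agent.example.internal is a hypothetical placeholder for your service URL):

```shell
SERVICE_URL="https://agent.example.internal"  # hypothetical; use your real service URL

# -f: fail on HTTP errors; -sS: silent, but still report transport errors
if curl -fsS "$SERVICE_URL/health" > /dev/null 2>&1; then
  status_msg="healthy"
else
  status_msg="unreachable or unhealthy"
fi
echo "$status_msg"
```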

Step 5 - Service URLs

Please send the following service URL(s) to your contact at Swimm:

We'll configure these within your workspace to ensure proper connectivity.

Setting up a budget alert

The Azure OpenAI API key is not technically limited to a certain number of API calls or a spending cap, so it's best to set up an alert for unusual or unauthorized usage.

Here is a short video that shows how to do that: