On-Prem Agent Installation Guide
Swimm's On-Prem Agent is a dedicated server running in a container within your company's network. It grants Swimm access to services you run your own instance of, such as Azure OpenAI, whose keys you wouldn't want to share directly with Swimm. This keeps your keys confidential and private.
Prerequisites
- If you are deploying the agent for AI features, make sure you have the following details of your Azure OpenAI instance for GPT-4o, GPT-4o mini, and GPT-3.5 Turbo Instruct:
  - API Key
  - Deployment URL
  See the Deploying Models section below.
- The server must be deployed with HTTPS/TLS configured, using a certificate trusted by your organization's computers. See TLS Configuration.
- The deployed container will be exposed to the Swimm IDE extension and web page in order to bridge Azure OpenAI with Swimm. Make sure that your network allows inbound traffic to this container from the developers' machines.
- If you direct the agent to connect to Azure OpenAI via a forward/reverse proxy or VPN that performs TLS Man-in-the-Middle (MITM), you will need to configure the On-Prem Agent to trust the proxy/VPN's certificate. You can supply an additional certificate bundle via the NODE_EXTRA_CA_CERTS environment variable (see the sketch after this list). See the FAQ for more details.
- Swimm uses the Azure OpenAI streaming API: the On-Prem Agent streams responses from your Azure OpenAI instance using SSE (Server-Sent Events), the same as a direct streaming request to Azure OpenAI. Make sure that your network, load balancer/proxy, and container orchestration platform allow SSE. You can verify this by checking their documentation, or by asking the personnel at your organization responsible for them.
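For the proxy/VPN MITM case, here is a minimal sketch of passing an extra CA bundle to the agent. The bundle file name and mount path are assumptions, and other required options (such as the configuration file) are omitted for brevity:

# Mount your proxy/VPN CA bundle (PEM) into the container and point
# NODE_EXTRA_CA_CERTS at it; proxy-ca.pem is a placeholder name.
docker run -d \
  -p 24605:24605 \
  -v $PWD/proxy-ca.pem:/etc/swimm/proxy-ca.pem:ro \
  -e NODE_EXTRA_CA_CERTS=/etc/swimm/proxy-ca.pem \
  swimmio/onprem-agent:2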
Technical Details
- The server listens on port 24605 inside the container by default; this can be overridden using the PORT environment variable. You can also always remap the outside port using Docker port mapping (see the example after this list).
- The server has an HTTP health check endpoint at /health.
- The path to the configuration file is specified using the CONFIG_PATH environment variable.
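As a quick sanity check of these defaults, something like the following should work. This is a sketch assuming the container runs locally with the default port mapping; adjust hosts, ports, and paths to your deployment:

# Health check against the default port (assumes -p 24605:24605)
curl -fsS http://localhost:24605/health

# Hypothetical example of overriding the internal port with PORT instead of remapping it
docker run -d \
  -e PORT=8080 \
  -p 8080:8080 \
  -v $PWD/swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro \
  -e CONFIG_PATH=/etc/swimm/onprem-agent.yaml \
  swimmio/onprem-agent:2
curl -fsS http://localhost:8080/health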
Deploying Models
Only needed if you want to use your own Azure OpenAI instance and the required models are not already deployed.
- Enter Azure AI Foundry, creating or selecting the correct Azure OpenAI resource at the top.
  Or enter Azure Portal → Azure OpenAI → Create/Select OpenAI Resource → Explore Azure AI Foundry Portal.
- Select "Deployments" in the sidebar.
- Select "Deploy model".
- Deploy gpt-4o, gpt-4o-mini and gpt-35-turbo-instruct with a name of your choice.
How to get the deployment URL
Click your deployed model in the Azure AI Foundry portal, and you should see its target URI and examples in various languages. The URL you need is the target URI without the "/chat/completions" and "?api-version=*" suffixes. For example:

| | URL |
|---|---|
| Good ✅ | https://ai-resource.openai.azure.com/openai/deployments/{deployment-name} |
| Bad ❌ | https://ai-resource.openai.azure.com/openai/deployments/{deployment-name}/chat/completions?api-version=2025-01-01-preview |
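If you copied a full target URI, a small shell sketch for trimming off the suffixes (the URI below is a placeholder):

# Strip everything from /chat/completions onward to get the deployment URL
TARGET_URI="https://ai-resource.openai.azure.com/openai/deployments/{deployment-name}/chat/completions?api-version=2025-01-01-preview"
echo "${TARGET_URI%%/chat/completions*}"
# -> https://ai-resource.openai.azure.com/openai/deployments/{deployment-name}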
Installation
Step 1 - Pull
Pull the latest version of the image using:
docker pull swimmio/onprem-agent:2 # From Docker Hub
# Or
docker pull us-docker.pkg.dev/swimmio/public/onprem-agent:2 # From Google Artifact Registry
The image is on Docker Hub swimmio/onprem-agent and Google Artifact Registry. You can see the available tags there if you want to pin the version to have more control over updates. We publish the following tags:
- latest - Always the latest version.
- 2, etc. - The latest major version.
- 2.0.7, etc. - A specific version.
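For example, to pin a specific version instead of a major tag (2.0.7 here is only an illustration; check the registries for the tags that actually exist):

docker pull swimmio/onprem-agent:2.0.7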
If you prefer not to use Docker Hub or Google Artifact Registry, or cannot access them, you can also mirror the image:
Mirroring
docker save swimmio/onprem-agent:2 --platform linux/amd64 -o swimm-onprem-agent-2.tar # From Docker Hub
# Or
docker save us-docker.pkg.dev/swimmio/public/onprem-agent:2 --platform linux/amd64 -o swimm-onprem-agent-2.tar # From Google Artifact Registry
And then to load the image:
docker load -i swimm-onprem-agent-2.tar
And if needed you can also push it to a registry:
docker tag <image> <your-registry>/<image> # e.g. docker tag swimmio/onprem-agent:latest registry.foo.test/swimmio/onprem-agent:latest
docker push <your-registry>/<image> # e.g. docker push registry.foo.test/swimmio/onprem-agent:latest
Step 2 - Configure
Create your configuration file:
enterprise_name: <enterprise_name> # Optional, but recommended
# For configuring AI features, remove if not used
openai:
  models:
    gpt-4o:
      versions:
        '2024-08-06':
          api_key: <api_key>
          deployment_url: <gpt-4o_deployment_url>
          usd_per_1000_input_tokens: 0.005
          usd_per_1000_output_tokens: 0.015
          max_token_count: 131072
        '2024-05-13':
          api_key: <api_key>
          deployment_url: <gpt-4o_deployment_url>
          usd_per_1000_input_tokens: 0.005
          usd_per_1000_output_tokens: 0.015
          max_token_count: 131072
    gpt-4o-mini:
      versions:
        '2024-07-18':
          api_key: <api_key>
          deployment_url: <gpt-4o-mini_deployment_url>
          usd_per_1000_input_tokens: 0.000165
          usd_per_1000_output_tokens: 0.00066
          max_token_count: 131072
    gpt-35-turbo-instruct:
      versions:
        '0613':
          api_key: <api_key>
          deployment_url: <gpt-35-turbo-instruct_deployment_url>
          usd_per_1000_input_tokens: 0.003
          usd_per_1000_output_tokens: 0.004
          max_token_count: 16384
# For configuring On-Prem/Enterprise Git Provider authentication, remove if not used
git_oauth:
  git_hosting: <git_server_url>
  client_id: <client_id>
  client_secret: <client_secret>
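Before deploying, you may want to sanity-check an api_key/deployment_url pair directly against Azure OpenAI. Here is a sketch using the standard chat completions endpoint; the api-version value is only an example, use one your resource supports:

# Replace the placeholders with the values from your configuration file
curl -fsS "<gpt-4o_deployment_url>/chat/completions?api-version=2024-08-01-preview" \
  -H "api-key: <api_key>" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "ping"}], "max_tokens": 1}'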
Configuring On-prem Enterprise Git Provider (if needed)
For more information on how to create a Swimm OAuth App for your on-prem Git hosting, please refer to the following instructions:
Step 3 - Deploy
Since the On-Prem Agent is a Docker image/container, you can deploy it using any container orchestrator of your choice: plain Docker or with Docker Compose, Kubernetes, ECS, GCP Cloud Run, and so on.
- Docker
- Docker Compose
- Kubernetes
- ECS
- Azure Container Instances
- Azure Container Apps
- Google Cloud Run
Assuming your configuration file is in the current working directory and named swimm-onprem-agent.yaml:
docker run -d \
  -p 24605:24605 \
  -v $PWD/swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro \
  -e CONFIG_PATH=/etc/swimm/onprem-agent.yaml \
  swimmio/onprem-agent:2
The path on the left-hand side of -v must be absolute; we use $PWD to make it absolute.
Assuming your configuration file is in the current working directory and named swimm-onprem-agent.yaml, and you place compose.yaml alongside it:
services:
onprem-agent:
image: swimmio/onprem-agent:2
ports:
- 24605:24605
volumes:
- ./swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro
environment:
CONFIG_PATH: /etc/swimm/onprem-agent.yaml
docker compose up -d
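To confirm the service came up cleanly, you can check its state and logs (service name as in the compose.yaml above):

docker compose ps
docker compose logs onprem-agent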
apiVersion: v1
kind: Secret
metadata:
  name: onprem-agent-config
# Replace <onprem-agent-config> with the contents of your configuration file
# properly indented, or use kubectl create secret or kustomize using
# secretGenerator to load this from a file, or any such similar tool
stringData:
  onprem-agent.yaml: |
    <onprem-agent-config>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: onprem-agent
  labels:
    app: onprem-agent
spec:
  selector:
    matchLabels:
      app: onprem-agent
  template:
    metadata:
      labels:
        app: onprem-agent
    spec:
      containers:
        - name: onprem-agent
          image: swimmio/onprem-agent:2
          ports:
            - containerPort: 24605
          volumeMounts:
            - name: config
              mountPath: /etc/swimm
              readOnly: true
          env:
            - name: CONFIG_PATH
              value: /etc/swimm/onprem-agent.yaml
          resources:
            requests:
              memory: 512Mi
              cpu: '1'
            limits:
              memory: 512Mi
              cpu: '1'
          readinessProbe:
            httpGet:
              port: 24605
              path: /health
              # Uncomment if you configured TLS in the On-Prem Agent configuration
              # scheme: HTTPS
      volumes:
        - name: config
          secret:
            secretName: onprem-agent-config
      securityContext:
        runAsNonRoot: true
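As the comment in the Secret manifest suggests, one way to create the Secret from your local configuration file instead of inlining it (names match the manifests above):

kubectl create secret generic onprem-agent-config \
  --from-file=onprem-agent.yaml=swimm-onprem-agent.yaml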
Then take your pick of how to expose the service; note that you need to configure HTTPS:
- Ingress
- Gateway
- LoadBalancer
apiVersion: v1
kind: Service
metadata:
  name: onprem-agent-ingress-svc
spec:
  type: NodePort
  selector:
    app: onprem-agent
  ports:
    - port: 24605
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: onprem-agent
spec:
  # ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: onprem-agent-ingress-svc
                port:
                  number: 24605
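If you terminate TLS at the Ingress, you will also need a TLS certificate Secret referenced from the Ingress's spec.tls section. A sketch of creating one; the Secret name and file names are placeholders for your organization's certificate and key:

kubectl create secret tls onprem-agent-tls --cert=tls.crt --key=tls.key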
You will need to create your own Gateway resource with HTTPS/TLS, and then use the HTTPRoute below, replacing the Gateway name in parentRefs.
For example:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: onprem-agent
spec:
  gatewayClassName: nginx
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: onprem-agent.example.internal
      tls:
        mode: Terminate # If protocol is `TLS`, `Passthrough` is a possible mode
        certificateRefs:
          - group: ""
            kind: Secret
            name: <onprem-agent-cert>
Then create the Service and HTTPRoute:
apiVersion: v1
kind: Service
metadata:
  name: onprem-agent-gateway-svc
spec:
  type: NodePort
  selector:
    app: onprem-agent
  ports:
    - port: 24605
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: onprem-agent-httproute
spec:
  parentRefs:
    - name: <onprem-agent-gateway>
  rules:
    - backendRefs:
        - name: onprem-agent-gateway-svc
          port: 24605
apiVersion: v1
kind: Service
metadata:
  name: onprem-agent
spec:
  # You will need to configure HTTPS in the load balancer, or use a separate
  # proxy and a NodePort Service
  type: LoadBalancer
  selector:
    app: onprem-agent
  ports:
    - port: 8080 # Or 443 if you configured HTTPS in the onprem-agent config
      targetPort: 24605
Coming soon: Instructions for deploying with ECS.
Coming soon: Instructions for deploying with Azure Container Instances.
Coming soon: Instructions for deploying with Azure Container Apps.
These instructions are for a Bourne-like shell (e.g. bash); adapt them to your specific shell as needed.
PROJECT_ID=$(gcloud config get project)
REGION=us-central1

gcloud iam service-accounts create swimm-onprem-agent \
  --display-name "Swimm's On-Prem Agent" \
  --description "Swimm's On-Prem Agent Service Account"

gcloud secrets create swimm-onprem-agent-config \
  --data-file=swimm-onprem-agent.yaml

gcloud secrets add-iam-policy-binding swimm-onprem-agent-config \
  --member=serviceAccount:swimm-onprem-agent@$PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/secretmanager.secretAccessor

gcloud run deploy swimm-onprem-agent \
  --image=us-docker.pkg.dev/swimmio/public/onprem-agent:2 \
  --allow-unauthenticated \
  --region=$REGION \
  --service-account=swimm-onprem-agent@$PROJECT_ID.iam.gserviceaccount.com \
  --startup-probe=initialDelaySeconds=1,periodSeconds=3,timeoutSeconds=1,httpGet.port=8080,httpGet.path=/health \
  --set-env-vars=CONFIG_PATH=/etc/swimm/onprem-agent.yaml \
  --set-secrets=/etc/swimm/onprem-agent.yaml=swimm-onprem-agent-config:latest
You will need a load balancer/reverse proxy in front of the container to handle HTTPS, or to mount your own certificate and private key into the container and specify them in the configuration file.
Step 4 - Verify
In your browser, on the client machines that are going to connect to the onprem-agent, browse to:
https://<your-service-url>/status
This should tell you whether the service is up and running, and whether the configuration allows it to communicate with your Azure OpenAI deployment.
You can also try from a terminal:
curl -v https://<your-service-url>/health
Step 5 - Service URLs
Please send the following service URL(s) to your contact at Swimm:
- On-prem agent service URL
- Git hosting server URL (if applicable for self-hosted Git providers)
We'll configure these within your workspace to ensure proper connectivity.
Setting up a budget alert
The Azure OpenAI key is not technically limited to a certain number of API calls or a certain budget, so it's best to set up an alert for unusual or unauthorized usage.
Here is a short video that shows how to do that: