On-Prem Agent Installation Guide
Swimm's On-Prem Agent is a dedicated server running in a container within your company's network. It grants Swimm access to services that you host your own instance of, such as Azure OpenAI, whose keys you wouldn't want to share directly with Swimm. This keeps your keys confidential and private.
Prerequisites
- If you are deploying the agent for AI features, make sure you have the following details of your Azure OpenAI instance for GPT-4o, GPT-4o mini, and GPT-3.5 Turbo Instruct:
  - API Key
  - Deployment URL
  See Deploying Models below.
- The server must be deployed with HTTPS/TLS configured, using a certificate trusted by your organization's computers. See TLS Configuration.
- The deployed container will be exposed to the Swimm IDE extension and web page in order to bridge Azure OpenAI with Swimm. Make sure that your network allows inbound traffic to this container from the developers' machines.
- If you direct the agent to connect to Azure OpenAI via a forward/reverse proxy or VPN that performs TLS Man-in-the-Middle (MITM), you will need to configure the On-Prem Agent to trust the certificate of the proxy/VPN. You can configure an additional certificate bundle for the On-Prem Agent using the NODE_EXTRA_CA_CERTS environment variable. See the FAQ for more details.
- Swimm uses the Azure OpenAI streaming API: the On-Prem Agent streams responses from your Azure OpenAI instance using SSE (Server-Sent Events), the same as a direct streaming request to Azure OpenAI. Make sure that your network, load balancer/proxy, and container orchestration platform allow SSE. You can verify this by checking their documentation, or by asking the personnel responsible for them at your organization.
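One way to confirm that SSE works end to end is to issue a streaming request directly from a developer machine. This is only a sketch: the resource name, deployment name, API version, and key are placeholders you must fill in with your own values.

```shell
# Streaming chat completion against Azure OpenAI; responses should arrive
# incrementally as "data: ..." SSE lines, not as one buffered blob at the end.
curl -N "https://<resource>.openai.azure.com/openai/deployments/<deployment>/chat/completions?api-version=2024-06-01" \
  -H "api-key: <api_key>" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say hi"}],"stream":true}'
```

If the output arrives all at once after a long pause, something in the path (proxy, load balancer) is buffering the response.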
Technical Details
- The server listens on port 24605 inside the container by default; this can be overridden using the PORT environment variable. You can also always remap the outside port using Docker port mapping.
- The server has an HTTP health check endpoint at /health.
- The path to the configuration file is specified using the CONFIG_PATH environment variable.
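As an illustration of these defaults, here is a sketch of running the container with a custom internal port remapped externally; the config mount and image tag follow the installation steps in this guide.

```shell
# Override the internal listen port via the PORT environment variable,
# then remap it from outside with Docker port mapping
docker run -d --name onprem-agent \
  -e PORT=8080 \
  -e CONFIG_PATH=/etc/swimm/onprem-agent.yaml \
  -v $PWD/swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro \
  -p 24605:8080 \
  swimmio/onprem-agent:2

# The HTTP health check endpoint should then respond on the mapped port
curl -f http://localhost:24605/health
```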
Deploying Models
This is only needed if you want to use your own Azure OpenAI instance and it wasn't already set up, or you don't have the needed models deployed.
- Enter Azure AI Foundry, creating or selecting the correct Azure OpenAI resource at the top. Alternatively, enter Azure Portal → Azure OpenAI → Create/Select OpenAI Resource → Explore Azure AI Foundry Portal.
- Select "Deployments" in the sidebar.
- Select "Deploy model".
- Deploy gpt-4o, gpt-4o-mini, and gpt-35-turbo-instruct with a name of your choice.
How to get the deployment URL
Click your deployed model in the Azure AI Foundry portal, and you should see its target URI and examples in various languages. The URL you need is the target URI without the "/chat/completions" and "?api-version=*" suffixes. For example:
| | URL |
| --- | --- |
| Good ✅ | https://ai-resource.openai.azure.com/openai/deployments/{deployment-name} |
| Bad ❌ | https://ai-resource.openai.azure.com/openai/deployments/{deployment-name}/chat/completions?api-version=2025-01-01-preview |
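If you copied the full target URI, the suffixes can be stripped mechanically. A small shell sketch (the example URI and deployment name are illustrative):

```shell
# Full target URI as copied from the Azure AI Foundry portal (illustrative)
full="https://ai-resource.openai.azure.com/openai/deployments/my-gpt-4o/chat/completions?api-version=2025-01-01-preview"

url="${full%%\?*}"              # drop the "?api-version=..." query string
url="${url%/chat/completions}"  # drop the "/chat/completions" path suffix
echo "$url"                     # https://ai-resource.openai.azure.com/openai/deployments/my-gpt-4o
```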
Installation
Step 1 - Pull
Pull the latest version of the image using:
docker pull swimmio/onprem-agent:2 # From Docker Hub
# Or
docker pull us-docker.pkg.dev/swimmio/public/onprem-agent:2 # From Google Artifact Registry
The image is on Docker Hub swimmio/onprem-agent and Google Artifact Registry. You can see the available tags there if you want to pin the version to have more control over updates. We publish the following tags:
- latest - Always the latest version.
- 2, etc. - The latest release of that major version (e.g. the latest 2.x.y).
- 2.0.7 - A specific version.
If you don't want to use Docker Hub or Google Artifact Registry, or don't have access to them, you can also mirror the image:
Mirroring
docker save swimmio/onprem-agent:2 --platform linux/amd64 -o swimm-onprem-agent-2.tar # From Docker Hub
# Or
docker save us-docker.pkg.dev/swimmio/public/onprem-agent:2 --platform linux/amd64 -o swimm-onprem-agent-2.tar # From Google Artifact Registry
And then to load the image:
docker load -i swimm-onprem-agent-2.tar
And if needed you can also push it to a registry:
docker tag <image> <your-registry>/<image> # e.g. docker tag swimmio/onprem-agent:latest registry.foo.test/swimmio/onprem-agent:latest
docker push <your-registry>/<image> # e.g. docker push registry.foo.test/swimmio/onprem-agent:latest
Step 2 - Configure
Create your configuration file:
enterprise_name: <enterprise_name> # Optional, but recommended
# For configuring AI features, remove if not used
openai:
models:
gpt-4o:
versions:
'2024-08-06':
api_key: <api_key>
deployment_url: <gpt-4o_deployment_url>
usd_per_1000_input_tokens: 0.005
usd_per_1000_output_tokens: 0.015
max_token_count: 131072
'2024-05-13':
api_key: <api_key>
deployment_url: <gpt-4o_deployment_url>
usd_per_1000_input_tokens: 0.005
usd_per_1000_output_tokens: 0.015
max_token_count: 131072
gpt-4o-mini:
versions:
'2024-07-18':
api_key: <api_key>
deployment_url: <gpt-4o-mini_deployment_url>
usd_per_1000_input_tokens: 0.000165
usd_per_1000_output_tokens: 0.00066
max_token_count: 131072
gpt-35-turbo-instruct:
versions:
'0613':
api_key: <api_key>
deployment_url: <gpt-35-turbo-instruct_deployment_url>
usd_per_1000_input_tokens: 0.003
usd_per_1000_output_tokens: 0.004
max_token_count: 16384
# For configuring On-Prem/Enterprise Git Provider authentication, remove if not used
git_oauth:
git_hosting: <git_server_url>
client_id: <client_id>
client_secret: <client_secret>
Configuring On-prem Enterprise Git Provider (if needed)
For more information on how to create a Swimm OAuth App for your on-prem Git hosting, please refer to the following instructions:
Step 3 - Deploy
Since the On-Prem Agent is a Docker image/container, you can deploy it using any container platform/orchestrator of your choice: plain Docker or with Docker Compose, Kubernetes, ECS, GCP Cloud Run, and so on.
- Docker
- Docker Compose
- Kubernetes
- ECS
- Azure Container Apps
- Azure Container Instances
- Google Cloud Run
Assuming your configuration file is in the current working directory and named swimm-onprem-agent.yaml:
docker run -d \
-p 24605:24605 \
-v $PWD/swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro \
-e CONFIG_PATH=/etc/swimm/onprem-agent.yaml \
swimmio/onprem-agent:2
The path on the left-hand side of -v must be absolute; we use $PWD to make it absolute.
Assuming your configuration file is in the current working directory and named swimm-onprem-agent.yaml, and you place compose.yaml alongside it:
services:
onprem-agent:
image: swimmio/onprem-agent:2
ports:
- 24605:24605
volumes:
- ./swimm-onprem-agent.yaml:/etc/swimm/onprem-agent.yaml:ro
environment:
CONFIG_PATH: /etc/swimm/onprem-agent.yaml
docker compose up -d
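Once the stack is up, a quick way to check that the agent started correctly (the service name is taken from the compose file above):

```shell
docker compose ps                 # the container should be listed as Up
docker compose logs onprem-agent  # look for configuration or startup errors
curl -f http://localhost:24605/health
```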
apiVersion: v1
kind: Secret
metadata:
name: onprem-agent-config
# Replace <onprem-agent-config> with the contents of your configuration file
# properly indented, or use kubectl create secret or kustomize using
# secretGenerator to load this from a file, or any such similar tool
stringData:
onprem-agent.yaml: |
<onprem-agent-config>
apiVersion: apps/v1
kind: Deployment
metadata:
name: onprem-agent
labels:
app: onprem-agent
spec:
selector:
matchLabels:
app: onprem-agent
template:
metadata:
labels:
app: onprem-agent
spec:
containers:
- name: onprem-agent
image: swimmio/onprem-agent:2
ports:
- containerPort: 24605
volumeMounts:
- name: config
mountPath: /etc/swimm
readOnly: true
env:
- name: CONFIG_PATH
value: /etc/swimm/onprem-agent.yaml
resources:
requests:
memory: 512Mi
cpu: '1'
limits:
memory: 512Mi
cpu: '1'
readinessProbe:
initialDelaySeconds: 1
periodSeconds: 3
httpGet:
port: 24605
path: /health
# Uncomment if you configured TLS in the On-Prem Agent configuration
# scheme: HTTPS
volumes:
- name: config
secret:
secretName: onprem-agent-config
securityContext:
runAsNonRoot: true
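Assuming you saved the two manifests above as onprem-agent.yaml, a sketch of applying them and smoke-testing the Deployment before exposing it:

```shell
kubectl apply -f onprem-agent.yaml
kubectl rollout status deployment/onprem-agent

# Temporarily forward the container port and hit the health endpoint
kubectl port-forward deployment/onprem-agent 24605:24605 &
sleep 2
curl -f http://localhost:24605/health
kill %1
```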
And then take your pick of how to expose the service; note that you need to configure HTTPS:
- Ingress
- Gateway
- LoadBalancer
apiVersion: v1
kind: Service
metadata:
name: onprem-agent-ingress-svc
spec:
type: NodePort
selector:
app: onprem-agent
ports:
- port: 24605
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: onprem-agent
spec:
# ingressClassName: nginx
rules:
- http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: onprem-agent-ingress-svc
port:
number: 24605
You will need to create your own Gateway resource with HTTPS/TLS, and then use the HTTPRoute below, replacing the Gateway name with your own.
For example:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: onprem-agent
spec:
gatewayClassName: nginx
listeners:
- name: https
protocol: HTTPS
port: 443
hostname: onprem-agent.example.internal
tls:
mode: Terminate # If protocol is `TLS`, `Passthrough` is a possible mode
certificateRefs:
- group: ""
kind: Secret
name: <onprem-agent-cert>
And then create the Service and HTTPRoute:
apiVersion: v1
kind: Service
metadata:
name: onprem-agent-gateway-svc
spec:
type: NodePort
selector:
app: onprem-agent
ports:
- port: 24605
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: onprem-agent-httproute
spec:
parentRefs:
- name: <onprem-agent-gateway>
rules:
- backendRefs:
- name: onprem-agent-gateway-svc
port: 24605
apiVersion: v1
kind: Service
metadata:
name: onprem-agent
spec:
  # You will need to configure HTTPS in the load balancer, or use a separate proxy in front of a NodePort service
type: LoadBalancer
selector:
app: onprem-agent
ports:
- port: 8080 # Or 443 if you configured HTTPS in the onprem-agent config
targetPort: 24605
These are instructions using the AWS CLI and a Bourne-like shell (e.g. Bash, Zsh); adapt them to your specific shell as needed.
You can also use the AWS console instead to perform similar steps. Adapt the instructions as needed to suit your requirements, e.g. ECS cluster using Fargate or EC2, etc. Make sure that the AWS CLI is configured with the correct account, credentials, and region.
These are basic instructions that assume you have a public or organization-private domain for which you can obtain a certificate to assign to the load balancer via ACM (automatic or imported). You can also choose to let the On-Prem Agent handle TLS if you have a fixed certificate, or expose the On-Prem Agent directly with no load balancer and replication, use a VPC with NAT and no public IPs, add rules so that the load balancer only responds to the correct host name, and other tweaks.
You can view your AWS account ID using:
aws sts get-caller-identity --query Account --output text
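Since several files below use a {AccountID} placeholder, it can help to capture the account ID in a variable and substitute it mechanically. A sketch (the file names are the ones used in the steps below):

```shell
# Capture the current AWS account ID
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

# Substitute the {AccountID} placeholder in the policy and task definition files in place
sed -i "s/{AccountID}/$ACCOUNT_ID/g" config-resource-policy.json onprem-agent-task.json
```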
- Create an IAM role for access to the configuration secret:
aws iam create-role --role-name SwimmOnPremAgentRole --assume-role-policy-document file://ecs-trust-policy.json
ecs-trust-policy.json:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ecs-tasks.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
- Create a secret in AWS Secrets Manager with the configuration file:
aws secretsmanager create-secret --name onprem-agent-config --secret-string file://swimm-onprem-agent.yaml
- Give permissions to the IAM role to access the secret (replace {AccountID} with your AWS account ID):
aws secretsmanager put-resource-policy --secret-id onprem-agent-config --resource-policy file://config-resource-policy.json
config-resource-policy.json:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::{AccountID}:role/SwimmOnPremAgentRole"
},
"Action": "secretsmanager:GetSecretValue",
"Resource": "*"
}
]
}
- Create an ECS task definition (replace {AccountID} with your AWS account ID):
aws ecs register-task-definition --cli-input-json file://onprem-agent-task.json
onprem-agent-task.json:
{
"family": "swimm-onprem-agent",
"containerDefinitions": [
{
"name": "onprem-agent",
"image": "swimmio/onprem-agent:2",
"cpu": 0,
"portMappings": [
{
"name": "http",
"containerPort": 24605,
"hostPort": 24605,
"protocol": "tcp",
"appProtocol": "http"
}
],
"essential": true,
"environment": [
{
"name": "CONFIG_PATH",
"value": "/etc/swimm/onprem-agent.yaml"
}
],
"environmentFiles": [],
"mountPoints": [
{
"sourceVolume": "config",
"containerPath": "/etc/swimm",
"readOnly": true
}
],
"volumesFrom": [],
"dependsOn": [
{
"containerName": "files-composer",
"condition": "SUCCESS"
}
],
"ulimits": [],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/swimm-onprem-agent",
"mode": "non-blocking",
"awslogs-create-group": "true",
"max-buffer-size": "25m",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
},
"secretOptions": []
},
"healthCheck": {
"command": [
"curl",
"-f",
"http://localhost:24605/health"
],
"interval": 30,
"timeout": 2,
"retries": 3,
"startPeriod": 1
},
"systemControls": []
},
{
"name": "files-composer",
"image": "public.ecr.aws/compose-x/ecs-files-composer:latest",
"cpu": 0,
"portMappings": [],
"essential": false,
"environment": [
{
"name": "ECS_CONFIG_CONTENT",
"value": "{\"files\": {\"/etc/swimm/onprem-agent.yaml\": {\"source\": {\"Secret\": {\"SecretId\": \"onprem-agent-config\"}}}}}"
}
],
"environmentFiles": [],
"mountPoints": [
{
"sourceVolume": "config",
"containerPath": "/etc/swimm",
"readOnly": false
}
],
"volumesFrom": [],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/swimm-onprem-agent",
"mode": "non-blocking",
"awslogs-create-group": "true",
"max-buffer-size": "25m",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
},
"secretOptions": []
},
"systemControls": []
}
],
"taskRoleArn": "arn:aws:iam::{AccountID}:role/SwimmOnPremAgentRole",
"executionRoleArn": "arn:aws:iam::{AccountID}:role/ecsTaskExecutionRole",
"networkMode": "awsvpc",
"volumes": [
{
"name": "config",
"host": {}
}
],
"placementConstraints": [],
"requiresCompatibilities": [
"FARGATE"
],
"cpu": "1024",
"memory": "2048",
"runtimePlatform": {
"cpuArchitecture": "X86_64",
"operatingSystemFamily": "LINUX"
},
"enableFaultInjection": false
}
- Create or choose an ECS cluster:
# To create a dedicated cluster
aws ecs create-cluster --cluster-name onprem-agent-cluster --capacity-providers FARGATE FARGATE_SPOT --default-capacity-provider-strategy FARGATE
# Or just use an existing cluster of your choice
- Create an application load balancer:
lb_group_id=$(aws ec2 create-security-group --group-name swimm-onprem-agent-lb --description "Swimm On-Prem Agent LB" --query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress --group-id $lb_group_id --protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id $lb_group_id --protocol tcp --port 443 --cidr 0.0.0.0/0
# Pick subnets, you can also use the AWS console (https://us-east-1.console.aws.amazon.com/vpcconsole/home#subnets:)
aws ec2 describe-subnets
# Use the security group ID from the previous command
load_balancer=$(aws elbv2 create-load-balancer \
--name swimm-onprem-agent-lb \
--subnets "<subnets>" \
--security-groups $lb_group_id \
--query 'LoadBalancers[0].LoadBalancerArn' \
--output text)
# List VPCs, you can also use the AWS console (https://us-east-1.console.aws.amazon.com/vpcconsole/home#vpcs:)
aws ec2 describe-vpcs
# Create a target group, specify the VPC ID
target_group=$(aws elbv2 create-target-group --name=swimm-onprem-agent-tg --protocol=HTTP --port=24605 --health-check-path=/health --target-type=ip --vpc-id "<vpc>" --query 'TargetGroups[0].TargetGroupArn' --output text)
aws elbv2 create-listener \
--load-balancer-arn $load_balancer \
--protocol HTTP --port 80 \
--default-actions "Type=redirect,RedirectConfig={Protocol=HTTPS,StatusCode=HTTP_301}"
aws elbv2 create-listener \
  --load-balancer-arn $load_balancer \
  --protocol HTTPS --port 443 \
  --certificates "CertificateArn=<certificate-arn>" \
  --default-actions "Type=forward,TargetGroupArn=$target_group"
- Create the service (replace with your correct cluster):
group_id=$(aws ec2 create-security-group --group-name swimm-onprem-agent --description "Swimm On-Prem Agent" --output text --query 'GroupId')
aws ec2 authorize-security-group-ingress --group-id $group_id --protocol tcp --port 24605 --source-group $lb_group_id
# Pick subnets, you can also use the AWS console (https://us-east-1.console.aws.amazon.com/vpcconsole/home#subnets:), you can use the same as the load balancer
aws ecs create-service \
--cluster onprem-agent-cluster \
--service-name swimm-onprem-agent \
  --task-definition swimm-onprem-agent \
--desired-count 1 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[<subnets>],securityGroups=[$group_id],assignPublicIp=ENABLED}" \
--load-balancer "targetGroupArn=$target_group,containerName=onprem-agent,containerPort=24605"
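After creating the service, you can wait for it to stabilize and confirm that the target registered as healthy. This uses the shell variables defined in the previous steps:

```shell
# Wait until the deployment reaches a steady state
aws ecs wait services-stable --cluster onprem-agent-cluster --services swimm-onprem-agent

# The target should report "healthy" once the container passes health checks
aws elbv2 describe-target-health --target-group-arn $target_group

# The load balancer DNS name to put behind your HTTPS host name
aws elbv2 describe-load-balancers --load-balancer-arns $load_balancer \
  --query 'LoadBalancers[0].DNSName' --output text
```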
These are instructions using the Azure CLI and a Bourne-like shell (e.g. Bash, Zsh); adapt them to your specific shell as needed. Make sure that you are logged in to the right subscription.
You can also use the Azure Portal instead to perform similar steps. Though you will have to convert the YAML to JSON to submit it using the portal. Adapt the instructions as needed to suit your requirements.
- First, select or create a resource group:
# List resource groups
az group list
# Create a resource group
az group create --name <resource-group> --location <location>
- Select or create a container apps environment:
# List container apps environments
az containerapp env list [--resource-group <resource-group-name>]
# Create a container apps environment
az containerapp env create --name swimm-onprem-agent --resource-group <resource-group> --location <location>
- Deploy the swimm-onprem-agent using the YAML file below, replacing <onprem-agent-config> with the contents of your config file:
swimm-onprem-agent.yaml:
properties:
configuration:
ingress:
external: true
allowInsecure: false
targetPort: 24605
secrets:
- name: config
value: |
<onprem-agent-config>
template:
containers:
- image: docker.io/swimmio/onprem-agent:2
name: swimm-onprem-agent
resources:
cpu: 1
ephemeralStorage: 4Gi
memory: 2Gi
env:
- name: CONFIG_PATH
value: /etc/swimm/onprem-agent.yaml
volumeMounts:
- mountPath: /etc/swimm
volumeName: config
probes:
- type: Readiness
initialDelaySeconds: 1
periodSeconds: 3
httpGet:
path: /health
port: 24605
scheme: HTTP
volumes:
- name: config
storageType: Secret
secrets:
- secretRef: config
            path: onprem-agent.yaml
az containerapp create \
--resource-group <resource-group> \
--environment <container-apps-environment> \
--name swimm-onprem-agent \
  --yaml swimm-onprem-agent.yaml
Note that you can also reference the config file from a secret in Azure Key Vault. You will need to add an Identity to the container app and give it access to the Key Vault secret, then specify keyVaultUrl and identity for the secret instead of value in the YAML file.
- To see the service URL, run:
az containerapp show --resource-group <resource-group> --name swimm-onprem-agent --query properties.latestRevisionFqdn --output tsv
These are instructions using the Azure CLI and a Bourne-like shell (e.g. Bash, Zsh); adapt them to your specific shell as needed. Make sure that you are logged in to the right subscription.
Adapt the instructions as needed to suit your requirements.
Note that this configuration is more difficult to set up TLS for than Container Apps, which has automatic TLS. You will need to configure TLS in the On-Prem Agent configuration, or TLS termination using some other service in front of the container.
- First, select or create a resource group:
# List resource groups
az group list
# Create a resource group
az group create --name <resource-group> --location <location>
- Create the instance using the YAML file below, replacing <base64-encoded-onprem-agent-config> with the base64-encoded contents of your config file, and consider filling in or removing logAnalytics:
swimm-onprem-agent.yaml:
apiVersion: '2021-10-01'
type: Microsoft.ContainerInstance/containerGroups
location: eastus
properties:
osType: Linux
containers:
- name: swimm-onprem-agent
properties:
image: us-docker.pkg.dev/swimmio/public/onprem-agent:2
environmentVariables:
- name: CONFIG_PATH
value: /etc/swimm/onprem-agent.yaml
volumeMounts:
- name: config
mountPath: /etc/swimm
readOnly: true
ports:
- port: 24605
protocol: TCP
resources:
requests:
cpu: 1.0
memoryInGB: 1.5
readinessProbe:
initialDelaySeconds: 1
periodSeconds: 3
httpGet:
path: /health
port: 24605
volumes:
- name: config
secret:
onprem-agent.yaml: |
<base64-encoded-onprem-agent-config>
ipAddress:
type: Public
ports:
- protocol: tcp
port: 24605
dnsNameLabel: swimm-onprem-agent
autoGeneratedDomainNameLabelScope: TenantReuse
# If you want to persist logs to Azure Log Analytics, you need to fill in the workspace-id and workspace-key, otherwise remove this section
diagnostics:
logAnalytics:
workspaceId: <workspace-id>
workspaceKey: <workspace-key>
      logType: ContainerInstanceLogs
az container create \
--resource-group <resource-group> \
--name swimm-onprem-agent \
  --file swimm-onprem-agent.yaml
- To see the service URL, run:
az container show --resource-group <resource-group> --name swimm-onprem-agent --query ipAddress.fqdn --output tsv
If you didn't configure TLS using a fixed cert and key in the configuration (You will have to add the cert and key to the YAML to mount them to the container as well), you will also need to configure TLS termination using some other service in front of the container, for example Azure Front Door or Application Gateway.
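To produce the <base64-encoded-onprem-agent-config> value expected by the YAML above, encode your config file to a single base64 line. A sketch with a stand-in config file (the -w0 flag is GNU coreutils; on macOS use base64 -i <file> instead):

```shell
# Stand-in config for illustration; use your real swimm-onprem-agent.yaml
printf 'enterprise_name: acme\n' > /tmp/swimm-onprem-agent.yaml

# -w0 disables line wrapping so the output is a single line suitable for the YAML
b64=$(base64 -w0 /tmp/swimm-onprem-agent.yaml)
echo "$b64"
```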
These instructions are for a Bourne-like shell (e.g. Bash, Zsh); adapt them to your specific shell as needed.
PROJECT_ID=$(gcloud config get project)
REGION=us-central1
gcloud iam service-accounts create swimm-onprem-agent \
--display-name "Swimm's On-Prem Agent" \
--description "Swimm's On-Prem Agent Service Account"
gcloud secrets create swimm-onprem-agent-config \
--data-file=swimm-onprem-agent.yaml
gcloud secrets add-iam-policy-binding swimm-onprem-agent-config \
--member=serviceAccount:swimm-onprem-agent@$PROJECT_ID.iam.gserviceaccount.com \
--role=roles/secretmanager.secretAccessor
gcloud run deploy swimm-onprem-agent \
  --image=us-docker.pkg.dev/swimmio/public/onprem-agent:2 \
--allow-unauthenticated \
--region=$REGION \
--service-account=swimm-onprem-agent@$PROJECT_ID.iam.gserviceaccount.com \
--startup-probe=initialDelaySeconds=1,periodSeconds=3,timeoutSeconds=1,httpGet.port=8080,httpGet.path=/health \
--set-env-vars=CONFIG_PATH=/etc/swimm/onprem-agent.yaml \
--set-secrets=/etc/swimm/onprem-agent.yaml=swimm-onprem-agent-config:latest
To see the service URL, run:
gcloud run services describe --region=$REGION swimm-onprem-agent --format="get(status.url)"
Depending on your deployment method you will need to configure a load balancer/reverse proxy in front of the container to handle TLS/HTTPS, or to mount your own certificate and private key to the container, and specify them in the configuration file.
Step 4 - Verify
In your browser, on the client machines that are going to connect to the onprem-agent, browse to:
https://<your-service-url>/status
This should let you know if the service is up and running, and if the configuration is working correctly to communicate with your Azure OpenAI deployment.
You can also try from a terminal:
curl -v https://<your-service-url>/health
Step 5 - Service URLs
Please send the following service URL(s) to your contact at Swimm:
- On-prem agent service URL
- Git hosting server URL (if applicable for self-hosted Git providers)
We'll configure these within your workspace to ensure proper connectivity.
Setting up a budget alert
The Azure OpenAI API key is not technically limited to a certain number of API calls or budget, so it's best to set up an alert for unusual or unauthorized usage.
Here is a short video that shows how to do that: