
Ask Swimm on-prem service installation guide

What is the /ask Swimm On-Prem Service?​

The /ask Swimm On-Prem Service is a dedicated container that runs inside your network and acts as the bridge between your Azure OpenAI instance and the Swimm IDE extension. It is the only component that communicates directly with your Azure OpenAI instance, so your API key is never exposed to the Swimm IDE extension or to any other external component.

Prerequisites​

  1. Make sure you have the following details of your Azure OpenAI instance for GPT-4o, GPT-4 32K, GPT-4 and GPT-3.5:
    • API Key
    • Deployment URL
    • Max Token Count
  2. The deployed container will be exposed to the Swimm IDE extension in order to bridge Azure OpenAI with the Swimm IDE extension. Make sure that your network allows inbound traffic to this container from the developers' machines.
  3. Since Swimm uses the Azure OpenAI streaming API, the ask-swimm-onprem-middleware streams responses from your Azure OpenAI instance using SSE (Server-Sent Events). Make sure that your network and container orchestration platform allow SSE.
  4. The container's healthcheck route is /health and its internal port is 3000.
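
Once the container is running, the reachability requirements above can be sanity-checked from a developer machine. This is a hedged sketch: `<service-url>` is a placeholder for wherever you expose the container.

```shell
# Health route from this guide: expect HTTP 200 when the service is reachable.
# -f makes curl exit non-zero on HTTP errors, -s silences progress output.
curl -fs "http://<service-url>/health" && echo "service reachable"
```

For the SSE requirement, note that intermediate proxies must not buffer the stream; when you later test a real request, curl's `-N` (`--no-buffer`) flag prints streamed data as it arrives, which makes proxy buffering easy to spot.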

Installation Steps​

Step 1: Download the /ask Swimm On-Prem Service Image​

Pull the latest version of the /ask Swimm On-Prem Service image by running:

$ docker pull swimmio/ask-swimm-onprem-middleware:latest
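
Optionally, confirm the pull succeeded before moving on (a hedged sketch using standard Docker commands):

```shell
# List the local copy of the image; a row with the "latest" tag confirms the pull.
docker images swimmio/ask-swimm-onprem-middleware

# Or inspect the image ID directly.
docker image inspect swimmio/ask-swimm-onprem-middleware:latest --format '{{.Id}}'
```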

(Non-Azure) Download the image from the Swimm Image Storage and upload it to your Organization's Registry​
  • If you are not able to pull the image directly, you can download it from the Swimm Image Storage and upload it to your Organization's Registry.

    1. Download the latest version of the /ask Swimm On-Prem Service image from here and save it locally on a machine that has access to your Organization's Registry.
    2. Use the following command to load the image:
      $ docker load -i ask-swimm-onprem-middleware.tar
    3. Tag the image with your Organization's Registry URL:
      $ docker tag ask-swimm-onprem-middleware:latest <your-registry-url>/ask-swimm-onprem-middleware:latest
    4. Upload the image to your Organization's Registry.
✅ Go to step 2 to deploy the /ask Swimm on-prem service.

Azure Container Apps Environment and Registry​
  1. Create an Azure Container Apps Environment:

    Ensure you have the Azure CLI installed and configured on your machine.

    Create a new Azure Container Apps environment by running the following command:

    az containerapp env create --name myContainerAppEnv --resource-group myResourceGroup --location eastus

    If you don't have a resource group, you can create a new one:

    az group create --name myResourceGroup --location eastus
  2. Create an Azure Container Registry (ACR) (Optional)

    This is optional because our Docker image is readily available on Docker Hub in a public registry.

    image: swimmio/ask-swimm-onprem-middleware:latest

    If you prefer to use your own container registry, you can pull the image from Docker Hub and push it to your own ACR.

    1. Create an Azure container registry:

      az acr create --resource-group myResourceGroup --name myContainerRegistry --sku Basic
    2. Login to your ACR:

      az acr login --name myContainerRegistry
    3. Pull the image from Docker Hub:

      docker pull swimmio/ask-swimm-onprem-middleware:latest
    4. Tag the image with your ACR URL:

      docker tag swimmio/ask-swimm-onprem-middleware:latest mycontainerregistry.azurecr.io/ask-swimm-onprem-middleware:latest
    5. Push the image to your ACR:

      docker push mycontainerregistry.azurecr.io/ask-swimm-onprem-middleware:latest
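
To confirm the push landed in your registry, you can list the repository's tags (a hedged sketch; `myContainerRegistry` matches the examples above):

```shell
# Lists tags for the repository in your ACR; expect "latest" in the output.
az acr repository show-tags --name myContainerRegistry --repository ask-swimm-onprem-middleware
```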

✅ Go to step 2 to deploy the /ask Swimm on-prem service.

Step 2: Deploy the /ask Swimm On-Prem Service​

You have multiple options to deploy the /ask Swimm On-Prem Service. Choose your preferred deployment method and follow the instructions below.

  • Option 1: Kubernetes
  • Option 2: Amazon ECS
  • Option 3: Azure container
If you are using a different platform, feel free to reach out and our team will be happy to help you get started.

Kubernetes deployment​

  1. Edit the following deployment.yaml template and replace the image URL with your Organization's Registry URL as well as the following API Keys:

    Note: If you have access to GPT-4o, please set the same GPT-4o …API_KEY and …DEPLOYMENT_URL values for each GPT_4 LONG and SHORT env variable pair.

    • GPT_4_LONG_API_KEY (32K)
    • GPT_4_LONG_DEPLOYMENT_URL (32K)
    • GPT_4_SHORT_API_KEY
    • GPT_4_SHORT_DEPLOYMENT_URL
    • GPT_3_5_API_KEY
    • GPT_3_5_DEPLOYMENT_URL

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: ask-swimm-onprem-middleware
      labels:
        app: ask-swimm-onprem-middleware
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: ask-swimm-onprem-middleware
      template:
        metadata:
          labels:
            app: ask-swimm-onprem-middleware
        spec:
          containers:
            - name: ask-swimm-onprem-middleware
              image: <your-registry-url>/ask-swimm-onprem-middleware:latest
              ports:
                - containerPort: 3000
              env:
                # Your GPT-4 32K details
                - name: GPT_4_LONG_API_KEY
                  value: <your-gpt-4-long-api-key>
                - name: GPT_4_LONG_DEPLOYMENT_URL
                  value: <your-gpt-4-long-deployment-url>
                # Your GPT-4 details
                - name: GPT_4_SHORT_API_KEY
                  value: <your-gpt-4-short-api-key>
                - name: GPT_4_SHORT_DEPLOYMENT_URL
                  value: <your-gpt-4-short-deployment-url>
                # Your GPT-3.5 details
                - name: GPT_3_5_API_KEY
                  value: <your-gpt-3-5-api-key>
                - name: GPT_3_5_DEPLOYMENT_URL
                  value: <your-gpt-3-5-deployment-url>
  2. Depending on your setup, you might also need a service.yaml file with the following content:

    apiVersion: v1
    kind: Service
    metadata:
      name: ask-swimm-onprem-middleware
    spec:
      type: NodePort
      ports:
        - port: 8080
          targetPort: 3000
      selector:
        app: ask-swimm-onprem-middleware
  3. Deploy the service by running the following commands (or through your container orchestration platform):

    kubectl apply -f deployment.yaml
    kubectl apply -f service.yaml
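
The template above inlines the API keys for brevity. In practice you may prefer to keep them out of deployment.yaml by storing them in a Kubernetes Secret. This is a hedged sketch; the secret name ask-swimm-keys is an assumption, and the values are placeholders:

```shell
# Create a Secret holding the API keys so they never appear in deployment.yaml.
kubectl create secret generic ask-swimm-keys \
  --from-literal=GPT_4_LONG_API_KEY=<your-gpt-4-long-api-key> \
  --from-literal=GPT_4_SHORT_API_KEY=<your-gpt-4-short-api-key> \
  --from-literal=GPT_3_5_API_KEY=<your-gpt-3-5-api-key>
```

Each API-key `value:` entry in deployment.yaml can then be replaced with a `valueFrom.secretKeyRef` pointing at ask-swimm-keys.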
✅ Go to step 3 to verify that the service is up and running.

Amazon ECS deployment​

  1. Navigate to the Amazon ECS console and create a new task definition.

    1. Log in to the AWS Management Console and navigate to the Amazon ECS dashboard.
    2. Select Task Definitions from the navigation pane.
    3. Click on the Create new Task Definition button.
    4. Choose the launch type compatibility based on your requirements (EC2 or Fargate).
    5. Enter a name (e.g., swimm-ai-proxy).
    6. Define the container image from your Organization's Registry URL by specifying the repository URL and image tag.
    7. Configure essential container settings, including memory and CPU limits, logging options, and networking mode.
    8. Set up environment variables for your API keys and OpenAI deployments:

    Note: If you have access to GPT-4o, please set the same GPT-4o …API_KEY and …DEPLOYMENT_URL values for each GPT_4 LONG and SHORT env variable pair.

    • GPT_4_LONG_API_KEY (32K)
    • GPT_4_LONG_DEPLOYMENT_URL (32K)
    • GPT_4_SHORT_API_KEY
    • GPT_4_SHORT_DEPLOYMENT_URL
    • GPT_3_5_API_KEY
    • GPT_3_5_DEPLOYMENT_URL

    9. Specify 3000 as the container's internal port.
    10. Review and create the task definition.

  2. Create a new ECS service using the task definition.

    1. Once your task definition is created, go back to the Amazon ECS dashboard.
    2. Select the cluster where you want to deploy your service.
    3. Click on the Create button to create a new service.
    4. Enter a service name and configure the service's desired tasks count. For automatic scaling, leave the desired tasks count field empty.
    5. Choose a deployment type (Rolling update or Blue/green) and set up deployment options as per your requirements.
    6. Configure network settings, such as VPC, subnets, and security groups.
    7. Optionally, enable service auto-scaling if you anticipate changes in traffic patterns. Define scaling policies based on metrics like CPU utilization or request counts.
    8. Configure load balancing settings if you want to expose your service to external traffic.
    9. Review and create the service.
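
The console steps above correspond roughly to a task definition like this hedged fragment. The family name and registry URL are assumptions; only the port and environment variable names come from this guide, and you would still set CPU, memory, and logging per your requirements:

```json
{
  "family": "swimm-ai-proxy",
  "containerDefinitions": [
    {
      "name": "ask-swimm-onprem-middleware",
      "image": "<your-registry-url>/ask-swimm-onprem-middleware:latest",
      "portMappings": [{ "containerPort": 3000 }],
      "environment": [
        { "name": "GPT_4_LONG_API_KEY", "value": "<your-gpt-4-long-api-key>" },
        { "name": "GPT_4_LONG_DEPLOYMENT_URL", "value": "<your-gpt-4-long-deployment-url>" },
        { "name": "GPT_4_SHORT_API_KEY", "value": "<your-gpt-4-short-api-key>" },
        { "name": "GPT_4_SHORT_DEPLOYMENT_URL", "value": "<your-gpt-4-short-deployment-url>" },
        { "name": "GPT_3_5_API_KEY", "value": "<your-gpt-3-5-api-key>" },
        { "name": "GPT_3_5_DEPLOYMENT_URL", "value": "<your-gpt-3-5-deployment-url>" }
      ]
    }
  ]
}
```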
✅ Go to step 3 to verify that the service is up and running.

Azure container deployment​

Deploy the container to Azure Container Apps using either the Azure CLI or a YAML file.

  • Option 1: Azure CLI
  • Option 2: YAML file

Deploy with Azure CLI​

  1. Run the following command to deploy the container using the Azure CLI. Replace the placeholder values with your specific environment variables.

    Note: If you have access to GPT-4o, please set the same GPT-4o …API_KEY and …DEPLOYMENT_URL values for each GPT_4 LONG and SHORT env variable pair.

    az containerapp create --name myContainerApp --resource-group myResourceGroup \
      --environment myContainerAppEnv \
      --image swimmio/ask-swimm-onprem-middleware:latest \
      --target-port 3000 --ingress 'external' \
      --env-vars \
        GPT_4_LONG_API_KEY=<> \
        GPT_4_LONG_DEPLOYMENT_URL=<> \
        GPT_4_SHORT_API_KEY=<> \
        GPT_4_SHORT_DEPLOYMENT_URL=<> \
        GPT_3_5_API_KEY=<> \
        GPT_3_5_DEPLOYMENT_URL=<> \
        GPT_3_5_TI_API_KEY=<> \
        GPT_3_5_TI_DEPLOYMENT_URL=<>

    Note: If you decide to use your own container registry, ensure you update the --image value with your container registry.

Deploy with YAML file​

  1. Create a containerapp.yaml file with the following content. Replace the placeholder values with your specific environment variables.

    properties:
      environmentId: /subscriptions/<YOUR_SUBSCRIPTION_ID>/resourceGroups/myResourceGroup/providers/Microsoft.App/managedEnvironments/myContainerAppEnv
      configuration:
        ingress:
          external: true
          targetPort: 3000
      template:
        containers:
          - image: swimmio/ask-swimm-onprem-middleware:latest
            name: ask-swimm-onprem-middleware
            env:
              - name: GPT_4_LONG_API_KEY
                value: ""
              - name: GPT_4_LONG_DEPLOYMENT_URL
                value: ""
              - name: GPT_4_SHORT_API_KEY
                value: ""
              - name: GPT_4_SHORT_DEPLOYMENT_URL
                value: ""
              - name: GPT_3_5_API_KEY
                value: ""
              - name: GPT_3_5_DEPLOYMENT_URL
                value: ""
              - name: GPT_3_5_TI_API_KEY
                value: ""
              - name: GPT_3_5_TI_DEPLOYMENT_URL
                value: ""

    Note: If you decide to use your own container registry, ensure you update the image value with your container registry's URL.

  2. Deploy the container using the containerapp.yaml file.

    az containerapp create --resource-group myResourceGroup --file containerapp.yaml
✅ Go to step 3 to verify that the service is up and running.

Step 3: Verify that the service is up and running​

The service URL will be used to configure the Swimm IDE extension to communicate with the service for your Swimm workspace.

Make sure that the service is up and running. Try to access it from your local machine.


Non-Azure Deployment Verification​

  1. Using the following cURL command (or through your browser):

    curl http://<service-url>/health
  2. If the server is up and running, you should get an HTTP 200 response.


🔗 Please send the service URL to the Swimm team. 🔗


Azure Deployment Verification​

  1. Verify Deployment Status:

    Ensure that your deployment is successful by checking the status of your container app. You can do this using the Azure CLI:

    az containerapp show --name my-container-app --resource-group my-resource-group
  2. Check Application Logs:

    Ensure there are no errors by checking the logs of your application. Use the Azure CLI to view logs:

    az containerapp logs show --name <your-container-app-name> --resource-group <your-resource-group>
  3. Get your application URL:

    Get your application URL on the App overview page on the Azure console or using the Azure CLI:

    az containerapp show --name <your-container-app-name> --resource-group <your-resource-group> --query properties.configuration.ingress.fqdn

🔗 Please send the application URL to the Swimm team. 🔗


Set up a budget and an alert for your Azure OpenAI subscription​

Your Azure OpenAI key is not inherently limited to a certain number of API calls or a spending cap, so it's best to set up an alert for unusual or unauthorized usage.

Here is a short video that shows how to do that.
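
The same can also be approximated from the CLI (a hedged sketch; the budget name, amount, and dates are placeholders, and the command must target the subscription or resource group that holds your Azure OpenAI resource):

```shell
# Create a monthly cost budget; pair it with a notification/alert in the
# Azure portal so you are emailed as spend approaches the amount.
az consumption budget create \
  --budget-name ask-swimm-openai-budget \
  --amount 200 \
  --category cost \
  --time-grain monthly \
  --start-date 2024-07-01 \
  --end-date 2025-07-01
```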