
AirGap Controller Upgrade

This document provides detailed instructions for upgrading a Rafay Controller in air-gapped environments.

Prerequisites

Before proceeding with the upgrade, ensure the following:

  • All clusters under required projects are up and healthy
  • CPU and memory usage are within normal parameters
  • Applications/services installed using custom blueprints are running
  • Sufficient disk space is available for the new image installation
  • Cluster backup is available (refer to backup documentation)
  • Management VM has at least 30GB available in the root directory
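The disk-space prerequisite can be checked from the shell before starting. The following is a minimal sketch, not part of the Rafay tooling: the helper names avail_gb and has_enough_space are hypothetical, and the 30GB threshold is taken from the list above.

```shell
#!/usr/bin/env bash
# Minimum free space required on / (from the prerequisites), in GB.
REQUIRED_GB=30

# Print the free space on the given mount point in whole GB.
# Uses POSIX-format df output (-P) in 1K blocks (-k); column 4 is "Available".
avail_gb() {
  df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Return 0 if the available GB ($1) meets the requirement ($2).
has_enough_space() {
  [ "$1" -ge "$2" ]
}

free=$(avail_gb /)
if has_enough_space "$free" "$REQUIRED_GB"; then
  echo "OK: ${free}GB free on /"
else
  echo "WARNING: only ${free}GB free on /, ${REQUIRED_GB}GB required"
fi
```

The check only reports; it does not stop the upgrade, so treat a WARNING as a signal to free space before continuing.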

Important

The controller may be visibly unavailable during the upgrade. Ensure no deployments are in progress before you start.

Pre-upgrade Steps

1. Clean Up Temporary Files

Navigate to the /tmp directory and remove the leftover Rafay controller package folders and log file:

cd /tmp
sudo rm -rf radm.log rafay-dep rafay-cluster rafay-core istio

2. Backup Configuration

Back up the existing configuration. Run these commands from the directory that contains your current config.yaml:

mkdir -p /home/controller/backup
mv config.yaml /home/controller/backup/config.yaml-2.12
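If you script this step, it can help to guard against overwriting an earlier backup. The sketch below wraps the same mkdir and mv commands in a hypothetical helper, backup_config; the function name and its arguments are illustrative assumptions, not part of RADM.

```shell
# backup_config SRC DEST_DIR SUFFIX
# Moves SRC to DEST_DIR/config.yaml-SUFFIX.
# Fails if SRC is missing or if a backup with that name already exists.
backup_config() {
  src=$1
  dest_dir=$2
  suffix=$3
  target="$dest_dir/config.yaml-$suffix"

  if [ ! -f "$src" ]; then
    echo "missing source: $src" >&2
    return 1
  fi
  if [ -e "$target" ]; then
    echo "backup already exists: $target" >&2
    return 1
  fi
  mkdir -p "$dest_dir" && mv "$src" "$target"
}
```

For the paths used in this document, the call would be: backup_config config.yaml /home/controller/backup 2.12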

Upgrade Process

1. Download and Prepare Upgrade Package

  1. Download the airgap setup binary from the URL provided by the support team:
    wget <airgap_upgrade_package>
    
  2. Extract the package:
    tar -xf <airgap_upgrade_package>
    
  3. Copy and configure the new config.yaml:
    cp -rp config.yaml-airgap-tmpl config.yaml
    

2. Configure the Upgrade

Before running the upgrade, review the configuration:

  1. Compare the new config.yaml with your previous version's configuration
  2. Update the archive-directory path in the config file to point to the correct location of your upgrade tar ball
  3. Verify that all other configuration parameters match your previous setup

The new config.yaml should contain the settings below, carried over from your previous configuration. Pay particular attention to archive-directory, which must point to the location of your upgrade tar ball:

spec:
  deployment:
    ha: true  # Set to true for HA controller
  repo:
    archive-directory: /path/to/upgradetar/location  # Update this path to your upgrade tar ball location
    unarchive-path: /tmp
  app-config:
    generate-self-signed-certs: true  # If using self-signed certificates
    partner:
      star-domain: "*.example.com"

Configuration Verification

Double-check that the archive-directory path correctly points to where your upgrade tar ball is located, as RADM will use this path to access the configuration files during the upgrade process.
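The archive-directory check can be done mechanically. The following sketch reads the value out of config.yaml and confirms that the directory exists. The helper names archive_dir and check_archive_dir are hypothetical, and the sed pattern assumes the key appears once, on its own line, with a path that contains no spaces or # characters.

```shell
# archive_dir FILE
# Print the value of the archive-directory key, dropping any trailing comment.
archive_dir() {
  sed -n 's/^[[:space:]]*archive-directory:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1"
}

# check_archive_dir FILE
# Succeed only if archive-directory is set and names an existing directory.
check_archive_dir() {
  dir=$(archive_dir "$1")
  if [ -n "$dir" ] && [ -d "$dir" ]; then
    echo "archive-directory OK: $dir"
  else
    echo "archive-directory missing or not a directory: '$dir'" >&2
    return 1
  fi
}
```

Run check_archive_dir config.yaml after editing the file. To review the remaining parameters against the previous setup, a plain diff -u /home/controller/backup/config.yaml-2.12 config.yaml shows every line that differs.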

3. Install RADM Services

  1. Copy the new RADM binary:

    sudo cp radm /usr/bin/
    

  2. Upgrade dependencies:

    sudo radm dependency --config config.yaml
    

  3. Remove the old assets deployment:

    kubectl delete deploy -n rafay-core rafay-core-rafay-assets
    

  4. Upgrade the application:

    sudo radm application --config config.yaml
    

4. Verify Infrastructure

After the application upgrade, verify that all MKS clusters are still connected to the controller:

# Open a shell in the salt orchestrator pod
kubectl exec -it -n rafay-core infra-salt-orchestrator-0 -- bash
# Inside the pod, ping every connected MKS cluster
salt '*' test.ping

MKS Cluster Connectivity

The salt '*' test.ping command will show all MKS clusters that are currently connected to the controller. This verification is crucial to ensure no cluster connections were lost during the upgrade process.
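With many clusters, scanning the ping output by eye is error-prone. The sketch below filters it down to the entries that did not answer True. It assumes Salt's default text output, where each minion ID appears on its own line ending in a colon, followed by an indented result line; failed_minions is a hypothetical helper name.

```shell
# failed_minions
# Read `salt '*' test.ping` output on stdin and print the ID of every
# minion whose result line is anything other than "True".
failed_minions() {
  awk '
    # An unindented line ending in ":" is a minion ID.
    /^[^[:space:]].*:$/ { id = substr($0, 1, length($0) - 1); next }
    # The next line is that minion'"'"'s result.
    id != "" { if ($1 != "True") print id; id = "" }
  '
}
```

Inside the orchestrator pod this would be used as salt '*' test.ping | failed_minions. Empty output means every minion that Salt reported answered the ping.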

5. Upgrade Cluster Components

This step uploads the cluster images and manifests for the new version to the nexus registry:

sudo radm cluster --config config.yaml

Cluster Images Update

This step ensures that your nexus registry contains the latest cluster images and manifests required for the new version. This is crucial for maintaining compatibility between your clusters and the upgraded controller.

6. Update Blueprint Version

Update the blueprint of each existing cluster to the latest version available for 3.1.

Post-upgrade Verification

After completing the upgrade:

  1. Verify all pods are running correctly
  2. Check system health and metrics
  3. Validate cluster connectivity
  4. Test application deployments
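For the first check, a small filter over the kubectl pod listing can surface problem pods quickly. This is a sketch under assumptions: unhealthy_pods is a hypothetical helper, and it expects the default column layout of kubectl get pods -A --no-headers (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# unhealthy_pods
# Read `kubectl get pods -A --no-headers` output on stdin and print pods that
# are neither Completed nor fully ready and Running.
unhealthy_pods() {
  awk '
    $4 == "Completed" { next }            # finished jobs are fine
    {
      split($3, ready, "/")               # READY column, e.g. "1/2"
      if ($4 != "Running" || ready[1] != ready[2])
        print $1 "/" $2 " " $3 " " $4
    }
  '
}
```

Typical usage is kubectl get pods -A --no-headers | unhealthy_pods. Empty output suggests all pods are up, though it does not replace the health, connectivity, and deployment checks listed above.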

Troubleshooting

If you encounter issues during the upgrade:

  1. Check the logs in /tmp/radm.log
  2. Ensure sufficient disk space is available
  3. Contact Rafay support if issues persist
