
Upgrade

This document provides detailed instructions for upgrading the air-gapped Rafay Controller.


Prerequisites

Before proceeding with the upgrade, ensure the following:

  • All clusters under required projects are up and healthy
  • CPU and memory usage are within normal parameters
  • Applications/services installed using custom blueprints are running
  • Sufficient disk space is available for the new image installation
  • Cluster backup is available (refer to backup documentation)
  • The management VM has at least 30 GB free on the root filesystem (/)
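
The disk-space prerequisite can be checked from the shell. The following is a minimal sketch, not part of the Rafay tooling: `has_free_gb` is a hypothetical helper name, and it assumes a POSIX `df` and `awk` are available on the management VM.

```shell
# Return success if the filesystem holding PATH has at least MIN_GB gigabytes free.
# has_free_gb is a hypothetical helper, not a radm command.
has_free_gb() {
  path=$1 min_gb=$2
  # df -Pk prints POSIX-format output in 1 KiB blocks; column 4 is "Available".
  avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Warn if the root filesystem has less than the 30 GB this guide asks for.
has_free_gb / 30 || echo "Less than 30 GB free on /" >&2
```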

Important

The controller may be unavailable for part of the upgrade. Make sure no deployments are in progress before you start.


Pre-upgrade Steps

1. Clean Up Temporary Files

Navigate to the /tmp directory and remove the old radm log and the Rafay controller package folders:

cd /tmp
sudo rm -rf radm.log rafay-dep rafay-cluster rafay-core istio

2. Backup Configuration

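Move the existing config.yaml into a backup directory, suffixing the file name with the controller version it belongs to (2.12 in this example). Run the commands from the directory that contains config.yaml: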

mkdir -p /home/controller/backup
mv config.yaml /home/controller/backup/config.yaml-2.12
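
If you prefer to confirm that the backup is intact before the original file is replaced, the backup step can be wrapped in a small function. This is a sketch under assumptions: `backup_config` is a hypothetical helper, and it copies the file rather than moving it, so the original stays in place until you overwrite it with the new template.

```shell
# Copy a config file into a backup directory, tagged with the version it belongs to.
# Prints the path of the backup on success. backup_config is a hypothetical helper.
backup_config() {
  src=$1 backup_dir=$2 old_version=$3
  dest="$backup_dir/$(basename "$src")-$old_version"
  mkdir -p "$backup_dir" &&
  cp -p "$src" "$dest" &&
  cmp -s "$src" "$dest" &&   # confirm the copy is byte-identical
  printf '%s\n' "$dest"
}
```

For the paths used in this guide, the call would be `backup_config config.yaml /home/controller/backup 2.12`.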

Upgrade Process

1. Download and Prepare Upgrade Package

  1. Download the airgap setup binary from the URL provided by the support team:

    wget <airgap_upgrade_package>
    

    Download Tip

    The package is around 30 GB and may take ~15 minutes to download with wget.
    For faster downloads, use aria2c, which supports parallel connections:

    time aria2c -x 16 <URL_of_airgap_installation_package>
    
    This can significantly reduce download time by using up to 16 connections.

  2. Extract the package:

    tar -xf <airgap_upgrade_package>
    

    Extraction Tip

    To speed up extraction of large files (like the ~30GB air-gapped package), you can use pigz.

    If pigz is installed, use the following command instead, which can reduce untar time:

    tar -I pigz -xvf <name-of-downloaded-package>.tar.gz
    

    pigz decompresses on a single thread but offloads reading, writing, and checksum calculation to separate threads, which can shorten extraction of large archives.

    On Ubuntu, you can install pigz using:

    sudo apt install pigz
    
  3. Create the new config.yaml from the template:

    cp -rp config.yaml-airgap-tmpl config.yaml
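
The choice between plain tar and pigz in the extraction step can be made automatically. The sketch below assumes GNU tar (for the `-I` option) and a gzip-compressed package; `extract_package` is a hypothetical helper name.

```shell
# Extract a .tar.gz archive into DEST (default: current directory),
# using pigz as the decompressor when it is on PATH.
# extract_package is a hypothetical helper, not part of the Rafay package.
extract_package() {
  archive=$1 dest=${2:-.}
  if command -v pigz >/dev/null 2>&1; then
    tar -I pigz -xf "$archive" -C "$dest"   # GNU tar runs pigz -d for extraction
  else
    tar -xzf "$archive" -C "$dest"          # fall back to tar's built-in gzip support
  fi
}
```

Call it with the downloaded file name in place of the placeholder used above, for example `extract_package <name-of-downloaded-package>.tar.gz`.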
    

2. Validate Configuration for Upgrade

Before proceeding, ensure the following:

  1. Compare the new config.yaml with the configuration from your previous version
  2. Update the archive-directory path in the config file to point to the location of your upgrade tarball
  3. Verify that all other configuration parameters match your previous setup

The new config.yaml should carry over settings such as the following from your previous configuration:

spec:
  deployment:
    ha: true  # Set to true for HA controller
  repo:
    archive-directory: /path/to/upgradetar/location  # Update this path to your upgrade tar ball location
    unarchive-path: /tmp
  app-config:
    generate-self-signed-certs: true  # If using self-signed certificates
    partner:
      star-domain: "*.example.com"

Configuration Verification

Double-check that the archive-directory path points to the location of your upgrade tarball, because radm uses this path to access the configuration files during the upgrade process.
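
The archive-directory check can be scripted. The following sketch reads the value with `sed` instead of a YAML parser, so it assumes the key sits on one line with an unquoted path, as in the sample above. Both function names are hypothetical.

```shell
# Print the value of the first archive-directory key found in a config file.
# Assumes an unquoted path on the same line as the key; trailing comments are ignored.
archive_dir_from_config() {
  sed -n 's/^[[:space:]]*archive-directory:[[:space:]]*\([^#[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Succeed only if that value is non-empty and names an existing directory.
check_archive_dir() {
  dir=$(archive_dir_from_config "$1")
  [ -n "$dir" ] && [ -d "$dir" ]
}
```

Running `check_archive_dir config.yaml || echo "archive-directory is missing or wrong"` before starting the upgrade catches a stale path early. To see every difference from the old file, `diff -u /home/controller/backup/config.yaml-2.12 config.yaml` compares the backup with the new configuration.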


3. Controller Upgrade Steps

About radm

radm is a Go-based CLI tool used to manage the full lifecycle of a Rafay air-gapped controller. It handles both initial installation and subsequent upgrades of the controller, including infrastructure add-ons, Kubernetes cluster management, software provisioning, and ongoing maintenance. During upgrades, radm manages the process of updating controller components, dependencies, and cluster configurations while maintaining system stability.

  1. Copy the new radm binary:

    sudo cp radm /usr/bin/
    
  2. Upgrade dependencies:

    sudo radm dependency --config config.yaml
    
  3. Remove the old assets deployment:

    kubectl delete deploy -n rafay-core rafay-core-rafay-assets
    
  4. Upgrade the application:

    sudo radm application --config config.yaml
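
The four steps above must run in order, and a failure in one should stop the rest. A minimal wrapper might look like the sketch below. It uses only the commands listed in this section; the `run` and `upgrade_controller` names and the `DRY_RUN` variable are hypothetical conveniences, not radm features.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical convenience switch).
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Execute the controller upgrade steps in order, stopping at the first failure.
upgrade_controller() {
  run sudo cp radm /usr/bin/ &&
  run sudo radm dependency --config config.yaml &&
  run kubectl delete deploy -n rafay-core rafay-core-rafay-assets &&
  run sudo radm application --config config.yaml
}
```

Setting `DRY_RUN=1` before calling `upgrade_controller` prints the commands without executing them, which is a cheap way to review the sequence first.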
    

4. Verify Clusters

Check if all clusters are healthy and accessible after the controller upgrade. Here are different ways to verify:

  1. Through the Rafay UI Console:
       • Navigate to the Clusters page
       • Verify all clusters show as "Healthy" status
       • Test kubectl operations through the UI console

  2. For MKS clusters, you can perform additional verification:

    # Check MKS cluster connectivity: open a shell in the salt orchestrator pod
    kubectl exec -it -n rafay-core infra-salt-orchestrator-0 -- bash
    # Inside the pod, ping every connected minion
    salt '*' test.ping
    

    MKS Cluster Verification

    The salt '*' test.ping command will show all MKS clusters that are currently connected to the controller. This verification is crucial to ensure no cluster connections were lost during the upgrade process.
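
With many clusters, scanning the ping output by eye is error-prone. The sketch below filters it for entries that did not answer True. It assumes Salt's default output layout, where each minion ID appears on its own line ending in a colon and the result follows on the next, indented line. `unresponsive_minions` is a hypothetical helper name.

```shell
# Read `salt '*' test.ping` output on stdin and print the IDs
# of minions whose result is anything other than True.
unresponsive_minions() {
  awk '
    /^[^[:space:]].*:$/ { id = substr($0, 1, length($0) - 1); next }  # "minion-id:" line
    id != ""            { if ($1 != "True") print id; id = "" }       # result line
  '
}
```

One way to feed it is to run the ping non-interactively from the management VM: `kubectl exec -n rafay-core infra-salt-orchestrator-0 -- salt '*' test.ping | unresponsive_minions`. An empty result means every minion that appears in the output answered.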


5. Upgrade Cluster Dependencies

This step uploads the latest cluster images and manifests for the new controller version to the Nexus registry:

sudo radm cluster --config config.yaml

Cluster Images Update

This step ensures that your Nexus registry contains the cluster images and manifests required for the new version, which keeps your clusters compatible with the upgraded controller.


Post-upgrade Verification

After completing the upgrade:

  1. Verify all pods are running correctly
  2. Check system health and metrics
  3. Validate cluster connectivity
  4. Test application deployments
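
The first check in the list can be narrowed to the pods that need attention. This sketch assumes the standard column layout of `kubectl get pods -A --no-headers` (namespace, name, ready, status, and so on). `unhealthy_pods` is a hypothetical helper name.

```shell
# Read `kubectl get pods -A --no-headers` output on stdin and print
# namespace/name plus status for every pod that is neither Running nor Completed.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage: `kubectl get pods -A --no-headers | unhealthy_pods`. No output means every pod reports Running or Completed. It does not check the READY column, so a Running pod with failing containers still needs a manual look.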

Recommendation

Update Cluster Blueprint Version

Update the blueprint version of existing clusters to match the controller version you upgraded to.
For example, if you upgraded the controller to version 3.1, then update the system default blueprint to version 3.1.0.


Troubleshooting

If you encounter issues during the upgrade:

  1. Check the logs in /tmp/radm.log
  2. Ensure sufficient disk space is available
  3. Contact Rafay support if issues persist
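
When reading /tmp/radm.log, a quick filter for error lines narrows the search. The log format is not documented here, so the sketch below simply matches the words "error" and "fail" in any case. `recent_errors` is a hypothetical helper name.

```shell
# Show the last COUNT lines (default 20) of a log that mention an error or failure.
# Each line is prefixed with its line number in the log file.
recent_errors() {
  log=${1:-/tmp/radm.log} count=${2:-20}
  grep -inE 'error|fail' "$log" | tail -n "$count"
}
```

Called without arguments, it reads /tmp/radm.log. Including the matching lines when you contact Rafay support can speed up diagnosis.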