July

v3.5 - Self-Hosted

22 July, 2025

What's New in This Release

  • Fresh controller installation support for EKS, GKE, and Airgapped Controller.
  • Upgrade path from v2.11 to v2.12 for GKE Controller, EKS Controller, and Airgapped Controller.

Blogs for New Features in This Release

1. k8s 1.33 for Rafay MKS
2. Revamped UI
3. User Namespaces in k8s v1.33

The section below provides a brief description of the new functionality and enhancements in this release.

Revamped UI/UX

The Rafay Console UI has been revamped to provide administrators and end users with an improved user experience. A refreshed look and feel is being introduced to the Infrastructure Portal, focusing on a cleaner, more modern design. The update delivers a more efficient use of screen space, enhanced readability, and improved overall usability.

Note

For more details on this enhancement, refer to our blog post.

Console UI


Self-Hosted Controller

MinIO Backup and Restore Support

This release introduces support for MinIO-based backup and restore for the self-hosted controller. Users with existing MinIO setups can now leverage their MinIO infrastructure for controller backup and restore operations, providing an alternative to AWS S3.

Previously, backup and restore functionality was limited to AWS S3. With this enhancement, organizations using MinIO for their object storage can now seamlessly integrate it with Rafay's self-hosted controller backup and restore workflows, offering greater flexibility in storage backend choices.


Preflight Check Tool

A new preflight check tool is now available to help validate node configurations before controller deployment. Users can download this tool from:

https://rafay-airgap-controller.s3.us-west-2.amazonaws.com/Publish/preflight-check.sh

This tool performs comprehensive validation checks including:

  • Node sizing verification - Ensures nodes meet minimum resource requirements
  • Operating system compatibility - Validates that nodes are running supported OS versions
  • System configuration checks - Verifies other essential configurations needed for successful controller deployment

Running this preflight check tool helps identify potential issues early in the deployment process, reducing deployment failures and improving the overall installation experience.
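
As a rough illustration, the tool can be downloaded and run on each candidate node along the following lines (the exact invocation and any supported options may vary; check the script's built-in usage output):

# Download the preflight check script from the location above
curl -fsSLO https://rafay-airgap-controller.s3.us-west-2.amazonaws.com/Publish/preflight-check.sh

# Make it executable and run it on the node being validated
chmod +x preflight-check.sh
sudo ./preflight-check.sh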

Note

The preflight check tool is not specific to this release and can be used with any controller release to validate node configurations before deployment.


Upstream Kubernetes for Bare Metal and VMs

Kubernetes v1.33 Support

Support is being added for Kubernetes v1.33 in Rafay's Kubernetes distribution. This includes:

  • Provisioning of new clusters with v1.33
  • Upgrading existing clusters from earlier versions to v1.33

1.33 k8s version

Note

For more details on Kubernetes v1.33 support in Rafay MKS, refer to our blog post.

The following new Kubernetes patch versions have also been added in this release:

  • v1.30.12
  • v1.31.8
  • v1.32.4

These versions are available for both new cluster provisioning and cluster upgrades from earlier minor or patch versions.
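
For teams managing clusters declaratively with RCTL, moving to one of these versions generally means updating the Kubernetes version in the cluster specification and re-applying it. The snippet below is only a sketch: the field name and file name are illustrative placeholders, not the exact cluster spec schema.

# Illustrative only: bump the Kubernetes version in an existing cluster spec
# (the field name/path shown here is a placeholder for the actual schema),
# then re-apply the spec with RCTL to trigger provisioning or an upgrade.
sed -i 's/kubernetesVersion: v1.32.4/kubernetesVersion: v1.33.0/' my-mks-cluster.yaml
rctl apply -f my-mks-cluster.yaml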


CNCF Conformance

Upstream Kubernetes clusters based on Kubernetes v1.33 (similar to prior Kubernetes versions) will be fully CNCF conformant.


Platform Version Support for Core Components

This release introduces Platform Versioning, starting with version v1.0.0. It enables upgrade management of critical software components (such as etcd) that are deployed alongside Kubernetes clusters. Users can now centrally manage Kubernetes clusters, the software running inside the cluster, and the critical software components running outside and alongside the cluster. This allows customers to respond quickly by rolling out updates to these components from the central management platform.

Platform version v1.0.0 includes the following component versions:

  • CRI: v2.0.4
  • etcd: v3.5.21
  • Salt Minion: v3006.9

Upgrade Guidance

  • New clusters will be provisioned with this v1.0.0 platform version by default
  • Existing clusters can upgrade to this platform version either:
      • Independently
      • Along with Kubernetes upgrades
  • Platform versioning gives users control to manage critical component upgrades and respond to vulnerabilities

Note

For existing clusters, the initial platform version will be shown as v0.1.0, which is assigned for reference purposes to older clusters that were created before platform versioning was introduced. Please perform the upgrade to v1.0.0 during scheduled downtime, as it involves updates to core components such as etcd and CRI.

We will continue releasing updated platform versions as component-level enhancements or security fixes are made available.


Platform Version Display During Kubernetes Upgrade

Part of K8s Upgrade


Individual Platform Version Upgrade Option

Individual Upgrade

Note: For additional documentation, see the Upstream Platform Version docs.


OS Field Update Capability in Cluster Configurations

Previously, the OS field in the cluster configuration was immutable once the cluster was provisioned. This posed challenges in cases where the underlying operating system on the nodes was updated out-of-band (outside Rafay).

With this enhancement, users can now manually update the OS field in the deployed configuration file. This ensures that the configuration stays in sync with the actual state of the nodes, improving operational visibility during Day 2 operations.


RCTL Cluster Upgrade Response Enhancement

Enhancements have been made to the RCTL CLI response during cluster upgrades. The response now includes both the source version and target version of the upgrade.

This enhancement improves user experience by clearly showing the current and target Kubernetes versions during upgrades, making the process more transparent and traceable.

Sample Response

Dry Run:

{
  "operations": [
    {
      "operationName": "ClusterUpgrade",
      "resourceName": "test-consul-rocky"
    }
  ],
  "comments": "Cluster will be upgraded from  v1.31.4 to v1.32.0"
}
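
For reference, a dry-run response of this form is typically obtained by applying the updated cluster specification with the dry-run option (assuming the standard RCTL apply workflow; the file name below is a placeholder):

# Preview the upgrade without applying it; the response lists the planned
# operations together with the source and target Kubernetes versions.
rctl apply -f test-consul-rocky.yaml --dry-run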

Display of CNI Add-on Version in System Blueprints

System blueprints will display the version of the CNI add-on in the UI. Previously, this information was not visible, making it unclear which CNI version would be applied by the default CNI system blueprint.

This enhancement improves transparency and user experience by providing clear visibility into the CNI version being deployed.

CNI Version


Rook Upgrade in 3.5-Based Blueprints (Managed Storage Enabled)

With the latest 3.5-based blueprints that have Managed Storage enabled, the new Rook version v1.17.1 will be deployed by default for new clusters created on the following supported operating systems:

  • Ubuntu 24.x
  • Ubuntu 22.x
  • RHEL
  • Rocky Linux 9
  • AlmaLinux
  • Flatcar

This update addresses a security vulnerability identified in Rook version v1.15.x, which was previously being used. Upgrading to v1.17.1 ensures clusters remain secure and up to date.

For existing clusters running one of the above OSes (excluding Ubuntu 24.x), users can upgrade to the latest Rook-Ceph version by creating a new version of the blueprint based on 3.5 and applying it to the cluster.

Note: While new clusters on Ubuntu 24.x are supported, upgrading Rook on existing Ubuntu 24.04 clusters is currently not supported due to an upstream compatibility issue. We are actively monitoring this and will release a fix as soon as it becomes available.


Google GKE

In this release, Kubernetes v1.32 is supported for both provisioning and upgrades, and v1.29 has been deprecated as it is no longer supported by Google GKE.

1.32


Amazon EKS

EKS Auto Mode Support

As part of Rafay's commitment to maintaining parity with native AWS EKS capabilities, this release introduces support for EKS Auto Mode.

Amazon EKS Auto Mode streamlines Kubernetes cluster management by automatically provisioning infrastructure, selecting optimal compute instances, dynamically scaling resources, continually optimizing compute for costs, patching operating systems (OS), and integrating with AWS security services.

By integrating support for EKS Auto, Rafay enables users to benefit from a simplified cluster creation experience with baked-in AWS best practices, enhanced operational efficiency, and reduced configuration overhead.

For more details, refer to the AWS EKS Auto blog post.

Note: For additional documentation, see the Rafay EKS Auto Mode Docs.


Combined Cluster Label & Upgrade Operations

Users can now combine cluster label updates with Nodegroup AMI upgrades or Kubernetes version upgrades as part of a single operation. Previously, these actions had to be performed in isolation. This enhancement streamlines upgrade workflows and reduces operational overhead by enabling users to batch changes into a unified process, improving efficiency during Day 2 operations.


GitOps

System Sync

Commits made by the GitOps pipeline now include the author's email address as the Git username in commit metadata. Previously, commits would display the author as "Unknown", often due to missing or unrecognized user metadata. With this change, the author's email is explicitly included, ensuring that commits always contain a meaningful and identifiable username.

This improves auditability and traceability of GitOps operations across all integrated Git providers.


Centralized Agent Configuration

This enhancement introduces centralized management of agent configurations via the controller. Administrators can now centrally define the following settings:

  • CPU and Memory Limits
  • Number of Engine Agent Workers (Determines how many worker pods are launched for executing Environment Manager activities)
  • Concurrency Limit for CD Agent RPC Calls (Controls parallel fetches of artifacts from configured repositories)
  • Tolerations, Node Selectors, and Affinity Rules (Provides control over agent placement within the Kubernetes cluster)

Note

Agent configuration will initially be available only through non-UI interfaces. UI support will be introduced in a subsequent release.
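
Since agent configuration is initially exposed only through non-UI interfaces, it will typically be declared in a spec file and applied with RCTL or the API. The sketch below is purely illustrative; the field names are hypothetical placeholders rather than the actual agent spec schema.

# Hypothetical agent spec (field names are illustrative placeholders,
# not the actual schema), applied via RCTL.
cat > cd-agent-config.yaml <<'EOF'
kind: Agent
metadata:
  name: prod-cd-agent
spec:
  resources:                 # CPU and memory limits for agent pods
    limits: { cpu: "2", memory: 4Gi }
  engineWorkers: 4           # worker pods for Environment Manager activities
  rpcConcurrency: 8          # parallel artifact fetches for CD agent RPC calls
  nodeSelector: { workload: rafay-agents }
  tolerations:               # control agent placement within the cluster
    - key: dedicated
      value: agents
      effect: NoSchedule
EOF
rctl apply -f cd-agent-config.yaml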


Repositories

Configuration

Customers are strongly encouraged to configure their own agents even for publicly accessible repositories. If no agent is specified for internet-accessible endpoints, the controller's hosted agents will be used by default (not recommended for production use). For production environments, configuring dedicated agents helps avoid issues such as rate limits imposed by registries like Docker Hub.

Repository


Env Manager

State unlock

Terraform uses a locking mechanism to prevent concurrent modifications that could corrupt the state file. However, locks may sometimes persist after failed or canceled runs, often caused by lost connections to the build agent or updates to the storage backend holding the state file.

This enhancement introduces support for manually unlocking Terraform state for environments that use OpenTofu-based resource templates and a compatible backend. The operation should be used cautiously, and only after verifying that the lock is no longer valid.
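
For context, this is the managed equivalent of OpenTofu's force-unlock operation, which releases a stale lock by the ID reported in the failed run's error output (the lock ID below is a placeholder):

# Low-level equivalent in OpenTofu: release a stale state lock by its ID.
# Only do this after confirming that no other run is still in progress.
tofu force-unlock 8f2c4a7e-1b3d-4c5e-9a0f-1234567890ab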

Draft versions


Staggered deployments

The Schedule Policy defined within an environment template supports periodic triggering of environment deployments using a specified cron expression. Multiple environment deployments triggered at the same time can cause a spike in resource usage on the agent. With the introduction of a stagger interval, deployments can now be randomized within a defined time window.

For example, if environment deployments are scheduled every 4 hours and a stagger interval of 2 hours is configured, deployments will be triggered at random times between the 4–6 hour mark, the 8–10 hour mark, and so on, helping to distribute the load on the agents more evenly.
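
As a concrete illustration of the example above (field names are hypothetical placeholders, not the actual schedule policy schema), the combination of a 4-hour cron trigger and a 2-hour stagger window might look roughly like this:

# Illustrative schedule policy fragment (field names are placeholders):
# a cron expression firing every 4 hours, plus a 2-hour stagger window
# within which each deployment's start time is randomized.
cat > schedule-policy.yaml <<'EOF'
schedule:
  cron: "0 */4 * * *"     # trigger every 4 hours
  staggerInterval: 2h     # randomize start within a 2-hour window
EOF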

Note

Configuration will initially be available only through non-UI interfaces. UI support will be introduced in a subsequent release.


User Management

IDP Integration

Optimizations are being implemented to improve performance in scenarios where users are associated with 100 or more IDP groups.


API Keys

Console view

Previously, when an API key was generated, it was unclear which portion represented the Key ID and which was the Secret. A visual indicator has now been added to clearly distinguish between the two, improving usability.

API key


System Template Catalog Updates

Use Case–Driven Experience

The System Template Catalog is being enhanced to provide a more intuitive, use case–driven experience, making it easier for end users to discover and apply relevant templates based on their specific needs.

Catalog UX


Enhancements to Existing Templates

The following system templates have been updated to support the latest Kubernetes versions, improving compatibility and streamlining cluster provisioning workflows:

  • system-mks
  • system-vsphere-mks

These templates now support the following Kubernetes versions:

  • v1.30.12
  • v1.31.8
  • v1.32.4
  • v1.33.0

⚠️ Please note: Older patch versions such as v1.30.8, v1.31.4, and v1.32.0 have been deprecated for new cluster provisioning. However, existing clusters running these versions can be upgraded to the newly supported versions, including the latest v1.33.0, as part of your cluster lifecycle operations.

For existing environment templates, remember to update the list of values in the Cluster Kubernetes Version input variable. This ensures that new versions are available to end users when provisioning or upgrading clusters.


Approval Workflow Handlers

Support for workflow handlers is being added to the catalog to enable approval steps as part of the execution process. This ensures alignment with organizational policies by allowing integration with approval systems such as ServiceNow or JIRA before a workflow can proceed.

Rafay K8s Distro on Nutanix

A new system template is being added to the catalog to provision and manage Rafay's Kubernetes distribution on Nutanix infrastructure. This enables users to deploy fully managed clusters with integrated networking, storage, and add-ons—all automated via the Rafay platform.

Overview & Get Started Guide — Learn how to configure, launch, and manage Nutanix-based clusters using this template.


Bug Fixes

Bug ID Description
RC-42227 Fixed an issue where workload search was not working
RC-42027 Resolved a bug where limits and offset parameters were not functioning in the V3 workloads list API
RC-41878 Environment Manager: Fixed workload mutating webhook to support shell command substitutions
RC-41833 Added kube-apiserver to the system user list to prevent Rafay drift webhook from blocking updates
RC-41636 Fixed an issue where the GitOps CD Agent upgrade process showed "Upgrade Successful" without actually upgrading
RC-41543 UI: Fixed bug where workloads created using rctl had additional_reference added in UI even when not enabled
RC-41386 Added validation for Blueprint version when it is disabled
RC-40984 Upstream K8s: Fixed an issue where the View/Rotate Certificates activity tab was visible only to Org Admins
RC-40536 Resolved permission issue where Namespace Admin + Project Read-Only role could create SecretStore
RC-40198 Fixed login failure for IdP users with more than 100 groups