
v4.1 - SaaS

Upstream Kubernetes for Bare Metal and VMs

Kubernetes Component Configuration (Control Plane Overrides)

Customizing control plane flags or feature gates previously required directly editing static pod manifests, which is error-prone and unsupported.

This release introduces Control Plane Overrides, which enable configuration of extra arguments, volumes, and volume mounts for the API Server, Controller Manager, and Scheduler. Overrides can be set via the console, the cluster spec, or other supported interfaces at Day-0 or Day-2.

Benefit

Safely tune control plane behavior without touching static pod manifests, reducing operational risk and keeping clusters in a supported state.
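As an illustrative sketch only (the field and flag names below are assumptions for illustration, not the documented schema), a control plane override in the cluster spec might look like:

```yaml
# Illustrative sketch -- field names are assumptions, not the documented schema.
spec:
  config:
    kubeAPIServer:
      extraArgs:
        audit-log-maxage: "30"          # extra flag passed to the API Server
      extraVolumes:
      - name: audit-logs
        hostPath: /var/log/kubernetes/audit
        mountPath: /var/log/kubernetes/audit
```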

Control Plane Overrides

Control Plane Overrides UI

For more information, see Kubernetes Component Configuration.


Cluster and Node Annotations Support

Attaching metadata to clusters and nodes previously required workarounds outside of cluster configuration.

Annotations can now be set directly in Cluster Settings at Day-0 or Day-2, via the UI, Terraform, rctl, or APIs. Node-level annotations merge with cluster-level ones, with node values taking precedence on conflicts.

Benefit

Attach ownership, environment, and compliance metadata in-config for consistent governance without external workarounds.

Annotations Configuration

Cluster Settings

metadata:
  annotations:
    env: prod
spec:
  config:
    nodes:
    - hostname: <hostname>
      annotations:
        envmnt: preprod

For more information, see Cluster Settings.


Enhancement: Preflight Checks in Conjurer

Cluster provisioning previously could fail due to pre-existing Kubernetes installations, residual CNI configurations, or conflicting binaries on nodes.

This release adds node-level preflight validations that detect these conflicts before provisioning begins. If issues are found, installation halts with explicit error messages before any changes are made to the node.

Benefit

Get early, actionable feedback before provisioning starts, eliminating time-consuming troubleshooting of mid-installation failures.

For more information, see Preflight Checks.


Amazon EKS

Repair Configuration Overrides for Node Auto Repair

Node repair previously used a single default behavior regardless of the failure condition.

This release adds Repair Configuration Overrides — per-condition rules that specify the monitoring condition, unhealthy reason, wait time, and repair action (Replace, Reboot, or NoAction). Up to 49 overrides per node group are supported.

Benefit

Apply the right remediation per failure type, reducing unnecessary node replacements and disruption to running workloads.
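A per-condition rule of the shape described above could be expressed roughly as follows; the field names are assumptions based on the description, not the documented schema:

```yaml
# Illustrative sketch -- field names are assumptions, not the documented schema.
nodeRepairConfig:
  enabled: true
  overrides:                              # up to 49 per node group
  - monitoringCondition: KernelDeadlock   # condition reported by node monitoring
    unhealthyReason: KernelHang           # reason that must match for this rule
    waitTimeMinutes: 10                   # how long to wait before acting
    repairAction: Reboot                  # Replace | Reboot | NoAction
```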

Note

Repair Configuration Overrides is currently supported via RCTL, Terraform, API, and System Sync. UI support will be added in an upcoming release.

Repair Configuration Overrides

For more information, see Node Auto Repair — Repair Configuration Overrides.


ARC Zonal Shift Support

Redirecting EKS cluster traffic away from an impaired AWS Availability Zone previously required manual intervention outside the platform. This release adds support for AWS ARC Zonal Shift, covering both manual operator-initiated shifts and Zonal Autoshift with CloudWatch alarm evaluation and configurable practice run windows, at Day-0 or post-provisioning.

Benefit

Respond to AZ-level impairments directly from Rafay Console without manual AWS-side intervention, reducing mean time to recovery.

Note

ARC Zonal Shift configuration is currently supported via programmatic interfaces (for example, the Terraform provider). UI support will be added in a future release.
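As a rough sketch of what enabling zonal shift in a cluster spec could look like (the field names here are illustrative assumptions, not the documented schema):

```yaml
# Illustrative sketch -- field names are assumptions, not the documented schema.
zonalShiftConfig:
  enabled: true        # allow ARC to shift traffic away from an impaired AZ
autoShiftConfig:
  enabled: true        # Zonal Autoshift, gated by CloudWatch alarm evaluation
```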


Blueprints

Cluster Overrides

This release allows injection of custom nodeAffinity rules into a specific add-on's Helm config per cluster via Cluster Overrides without modifying the base blueprint.

Benefit

Control add-on placement per cluster without forking or modifying shared blueprints.
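For example, a Cluster Override could inject standard Kubernetes nodeAffinity into an add-on's Helm values, pinning the add-on to labeled nodes. This assumes the add-on chart exposes a conventional `affinity` value; the label key is illustrative:

```yaml
# Helm values fragment injected via a Cluster Override; assumes the add-on
# chart exposes a standard `affinity` value. Label key is illustrative.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-role.kubernetes.io/infra
          operator: Exists
```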

Node Affinity Override Configuration

For more information, see Override Node Affinity for Addons.


Policy Management

OPA Gatekeeper

This release adds support for OPA Gatekeeper v3.21.0.

Benefit

Stay current with the latest Gatekeeper capabilities and security fixes, ensuring policy enforcement remains reliable and up to date across clusters.

OPA Gatekeeper Installation Profile


Security

IP Whitelisting via RCTL

Previously, IP whitelisting was only configurable through the console UI. This release adds RCTL support to create, update, retrieve, and delete whitelisted IP addresses or CIDR ranges programmatically.

Benefit

Automate and script IP access controls as part of broader infrastructure workflows, without manual console updates.

For more information, see Managing IP Whitelisting Using RCTL.


GitOps

File Exclusion in Git-to-System Sync (.rafayignore)

Non-spec files in a repository (e.g., README files, docs, examples) were previously evaluated during Git-to-System sync, causing validation errors and pipeline noise. This release introduces .rafayignore, a pattern-based ignore file committed to the repository; matched files are skipped entirely during validation and sync. An example .rafayignore:

README.md
*.md
docs/**
examples/**

Git to System Sync - File Exclusion

Benefit

Keep auxiliary files alongside specs in the same repo without causing sync failures or validation noise.

For more information, see File Exclusion with .rafayignore.

Agent Deployment

The privileged init container previously used by the cd-agent has been removed in this release.

Benefit

Improves security by eliminating privileged access in the CD agent deployment.

Agent Version Control

When creating an agent, users could previously only select the latest available version. This created challenges for customers who tested specific agent versions and later needed access to those versions after new releases became available. This release introduces enhanced agent version control capabilities:

  • Users can now select N-1 and N-2 agent versions when creating an agent
  • Support for rolling back an agent to a previously deployed version is now available

Benefit

Enables continued use of approved/test agent versions and faster recovery from upgrade issues by rolling back to a previously stable version.


Cost Management

App Resizing

This release introduces App Resizing, enabling platform teams to generate reports that compare configured CPU and memory requests against actual usage metrics (P90, P95, and peak) per pod across clusters, projects, and namespaces.

Note: This feature requires the Rafay Prometheus stack to be enabled to collect the necessary metrics.

Benefit

Quickly identify over-provisioned workloads and reclaim wasted cluster capacity with data-backed insights.

App Resizing

Generated Reports

For more information, see App Resizing.


UI Enhancements

Several UI improvements have been introduced to enhance usability and operational efficiency across cluster resources:

  • Added the ability to search based on specific column names when viewing resources
  • Users can now save custom views for frequently used resource filters and layouts
  • Init containers are now visible in the UI
  • Added the ability to restart StatefulSets and DaemonSets directly from the Resources tab
  • Improved grouping and search capabilities for cluster and namespace labels, extending enhancements previously introduced for node labels
  • The YAML manifest upload interface now supports both .yaml and .yml file formats

Benefit

Improves operational efficiency by making cluster resources easier to search, manage, and troubleshoot directly from the UI.


The following bug fixes are included in the 4.1 release:

Bug ID Description
RC-41891 MKS: Fixed an issue where the kubelet config.yaml file was not restored after an upgrade

v1.1.60 - Terraform Provider

An updated version of the Terraform provider is now available.

rafay_eks_cluster

  • ARC Zonal Shift and Auto Zonal Shift configuration support
  • Repair Configuration Overrides for Node Auto Repair

For an example with these fields, see An example EKS cluster with node repair, zonal shift, and auto zonal shift.

rafay_mks_cluster

  • Cluster and Node Annotations – Annotations can now be set at the cluster level via metadata.annotations and at the node level via nodes.<hostname>.annotations. Node-level annotations take precedence on conflict.

    resource "rafay_mks_cluster" "example" {
      api_version = "infra.k8smgmt.io/v3"
      kind        = "Cluster"
      metadata = {
        name    = "mks-ha-cluster"
        project = "terraform"
        annotations = {
          "app"   = "infra"
          "infra" = "true"
        }
      }
      spec = {
        config = {
          nodes = {
            "hostname1" = {
              hostname = "hostname1"
              roles    = ["ControlPlane", "Worker"]
              annotations = {
                "app"   = "infra"
                "infra" = "true"
              }
            }
          }
        }
      }
    }
    
  • Kubernetes Component Configuration (Control Plane Overrides)

For an example with these fields, see Examples of Control Plane Overrides.


v4.0 Update 7 - SaaS

Azure AKS

Bootstrap VM Configuration

Previously, the VM size and image used for the AKS bootstrap node were fixed to a smaller default, with no option to customize. Organizations can now configure bootstrapVmParams in the cluster spec to define a VM size and image that meets their standards — for example, a CIS-hardened or custom gallery image with an appropriate VM size.

The following bootstrapVmParams fields are supported under spec.clusterConfig.spec:

bootstrapVmParams:
  image:
    id: /subscriptions/<subscription-id>/resourceGroups/<rg>/providers/Microsoft.Compute/galleries/<gallery>/images/<image-def>/versions/<version>
    osState: Generalized   # or Specialized
  vmSize: Standard_B4ms

Full cluster spec example:

apiVersion: rafay.io/v1alpha1
kind: Cluster
metadata:
  name: <cluster-name>
  project: <project-name>
spec:
  blueprint: default-aks
  blueprintversion: latest
  cloudprovider: <cloud-credential-name>
  clusterConfig:
    apiVersion: rafay.io/v1alpha1
    kind: aksClusterConfig
    metadata:
      name: <cluster-name>
    spec:
      bootstrapVmParams:
        image:
          id: /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Compute/galleries/<gallery-name>/images/<image-definition>/versions/<version>
          osState: Generalized
        vmSize: Standard_B4ms
      managedCluster:
        apiVersion: "2025-01-01"
        identity:
          type: UserAssigned
          userAssignedIdentities:
            ? /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>
            : {}
        location: <azure-region>
        properties:
          apiServerAccessProfile:
            enablePrivateCluster: true
          autoUpgradeProfile:
            nodeOsUpgradeChannel: None
            upgradeChannel: none
          dnsPrefix: <cluster-name>-dns
          enableRBAC: true
          kubernetesVersion: 1.33.7
          networkProfile:
            dnsServiceIP: 10.0.0.10
            loadBalancerSku: standard
            networkPlugin: azure
            networkPolicy: azure
            serviceCidr: 10.0.0.0/16
          powerState:
            code: Running
        sku:
          name: Base
          tier: Free
        type: Microsoft.ContainerService/managedClusters
      nodePools:
      - apiVersion: "2025-01-01"
        name: primary
        properties:
          count: 1
          enableAutoScaling: true
          maxCount: 1
          maxPods: 110
          minCount: 1
          mode: System
          orchestratorVersion: 1.33.7
          osType: Linux
          type: VirtualMachineScaleSets
          vmSize: Standard_DS2_v2
          vnetSubnetID: /subscriptions/<subscription-id>/resourceGroups/<network-resource-group>/providers/Microsoft.Network/virtualNetworks/<vnet-name>/subnets/<subnet-name>
        type: Microsoft.ContainerService/managedClusters/agentPools
      resourceGroupName: <resource-group>
  type: aks

For more information, see Bootstrap VM Configuration.

Benefit

Allows organizations to enforce standard VM sizes and hardened images for the bootstrap node in line with organizational guidelines, without being constrained to the platform default.


v1.1.59 - Terraform Provider

An updated version of the Terraform provider is now available.

Cloud credentials v3 resource auto-recreation – Cloud credential resources deleted out of band are now automatically recreated on the next terraform apply, eliminating the need for manual state management after out-of-band changes.


v4.0 Update 6 - SaaS

12 Mar, 2026

Azure AKS

Proxy Configuration

Previously, updating the proxy configuration required a manual blueprint update to propagate changes to the operator components. This is now handled automatically as part of the proxy configuration update.

Bug Fixes

Bug ID Description
RC-47500 Fixed an issue where AKS proxy updates failed when DiskEncryptionSetID was configured with the deny-aks-without-cmk policy assigned
RC-47415 Fixed an issue where the AKS proxy config map was not updated unless the blueprint was manually republished

v1.1.58 - Terraform Provider

An updated version of the Terraform provider is now available.

The following changes are included in this release:

Resource recreation on rename or project change – Resources now support automatic recreation when metadata or project fields are updated. On the next terraform apply, affected resources are recreated automatically, eliminating the need for manual state management after out-of-band changes.

Updated resources:

Addon, Blueprint, BreakGlassAccess, Cluster, ConfigContext, Credentials, Driver, Environment, EnvironmentTemplate, Repository, Resource, ResourceTemplate, WorkflowHandler

For example, cloud credentials deleted out of band from the console previously caused a 404 error during terraform apply. Affected resources are now automatically refreshed and recreated as part of terraform apply instead of failing.


v4.0 Update 5 - SaaS

5 Mar, 2026

Upstream Kubernetes for Bare Metal and VMs

Kubernetes patch versions update

Previously, only the latest patch version for each minor version was available in the upgrade flow. With this release, all patch versions available for new cluster creation are also available for cluster upgrades.

For the full Kubernetes version support matrix, see Support Matrix.

Bug Fixes

Bug ID Description
RC-46840 Fixed an error when clicking Save Changes on Cluster Overrides created via CLI with an invalid resource selector

v1.1.57 - Terraform Provider

An updated version of the Terraform provider is now available.

aks_cluster_v3 – The following configuration options have been added to the AKS v3 Terraform resource:

  • Azure managed Web Application Routing add-on
  • Azure managed Istio service mesh add-on
  • Key Vault Secret Provider CSI Driver
  • Custom kubelet config
  • http_proxy, https_proxy, no_proxy configuration
  • Snapshot support
  • Network dataplane: Cilium
  • Network policy: Cilium
  • Node Auto Provisioning (NAP)

For an example reference with these fields, see rafay_aks_cluster_v3/resource.tf.

Documentation update for EKS resource

  • EKS system components placement