In this release, we have added an improved cluster deletion experience in the UI. This enhancement streamlines the deletion process and provides greater clarity during cluster removal.
By default, the "Delete Cluster Completely" option will now be pre-selected for managed clusters. This simplifies the process for complete cluster removal.
Users still have the flexibility to choose an alternative option based on their specific cluster state. The UI will continue to display available deletion options.
This update ensures a more user-friendly and efficient cluster deletion workflow. Below is a screenshot showcasing the enhanced cluster deletion user experience:
New EKS clusters can now be provisioned based on Kubernetes v1.30. Existing clusters managed by the controller can be upgraded "in-place" to Kubernetes v1.30.
New Cluster
In-Place Upgrade
Important Note: Please review the information for EKS 1.30 here before creating new clusters based on EKS 1.30.
Debugging & Troubleshooting: Enhanced Cloud Messaging for EKS Provisioning
Further enhancements have been implemented to the provisioning logs to help users pinpoint issues causing provisioning failures.
Prior to this release, provisioning failure logs were limited to the control plane and default node groups. Starting with this release, cloud-init logs for the Bootstrap NodeGroup are also displayed, offering deeper insight into the node initialization process. Real-time visibility into CloudFormation events for your EKS cluster during provisioning can be obtained using:
* RCTL CLI: The rctl cloudevents command allows retrieval of CloudEvents for specific infrastructure resources such as clusters, node groups, and EKS managed add-ons
* Swagger API: CloudEvents can be accessed through the /apis/infra.k8smgmt.io/v3/projects/defaultproject/clusters/<cluster-name>/provision/cloudevents endpoint
This enhanced monitoring capability aids in effectively troubleshooting and diagnosing provisioning issues. A sketch of querying the CloudEvents endpoint is shown below.
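For example, if you want to surface these provisioning events from Terraform output while debugging, a minimal sketch using the hashicorp/http data source follows. The controller host, cluster name, and API-key header/variable are assumptions for illustration; substitute the values and authentication scheme used in your organization (any HTTP client, or the rctl command above, works equally well).

```hcl
# Illustrative sketch only: fetch CloudEvents for an EKS cluster's provisioning run
# via the documented endpoint. Host, cluster name, and auth header are placeholders.
variable "rafay_api_key" {
  type      = string
  sensitive = true
}

data "http" "eks_provision_cloudevents" {
  url = "https://console.example.com/apis/infra.k8smgmt.io/v3/projects/defaultproject/clusters/demo-eks-cluster/provision/cloudevents"

  request_headers = {
    "X-API-KEY" = var.rafay_api_key # hypothetical header name; use your org's auth scheme
  }
}

output "eks_provision_cloudevents" {
  value     = data.http.eks_provision_cloudevents.response_body
  sensitive = true
}
```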
A previous release added support for Azure CNI Overlay for AKS clusters via the supported automation interfaces, i.e., RCTL CLI, Terraform, GitOps (System Sync), and Swagger API. This release adds the same capability to the UI.
New upstream clusters based on Rafay's MKS distribution can be provisioned based on Kubernetes v1.30.x. Existing upstream Kubernetes clusters managed by the controller can be upgraded in-place to Kubernetes v1.30.x. Read more about this in this blog post
Upstream Kubernetes clusters based on Kubernetes v1.30 (and prior Kubernetes versions) will be fully CNCF conformant.
Known Issue: Upgrading Kubernetes Cluster to 1.30.x with Windows Nodes
Upgrading a Kubernetes cluster to version 1.30.x with Windows nodes will result in upgrade failure. This is a known upstream Kubernetes issue tracked on GitHub (issue).
Workaround: Before initiating the upgrade, drain the Windows nodes using the kubectl drain <nodename> command, then retry the upgrade.
Fix: The fix is included in k8s version 1.30.2 and will be available in the 2.8 release.
New VMware clusters can be provisioned based on Kubernetes v1.29.x and v1.30.x. Existing VMware Kubernetes clusters managed by the controller can be upgraded in-place to Kubernetes v1.29.x and v1.30.x. Read more about the Kubernetes versions on this page
By default, only the Rafay management operator components are treated as “CRITICAL”. Customers will have the option to designate custom add-ons as CRITICAL in a blueprint based on their nature/importance. If a critical add-on fails to install during a blueprint sync operation, the blueprint state will be marked as FAILED and operations on the cluster, such as workload deployment, will be blocked until the issue is resolved.
The CRITICAL badge can be used for add-ons, such as security and monitoring tools, that are deemed critical and mandatory to maintain compliance. In summary,
* Blueprint state is marked as FAILED and operations on the cluster are blocked only if one or more critical add-ons fail to install
* Blueprint state is marked as PARTIAL SUCCESS if one or more non-critical add-ons fail to install. Operations such as workload deployment and scaling nodes up/down are still allowed in this condition.
The ingress class name for Rafay's managed ingress controller (shipped with the managed ingress add-on) is being updated from nginx to default-rafay-nginx to prevent potential naming conflicts with other ingress controllers. This name change also resolves various ingress-related issues encountered during blueprint synchronization.
Note
The new ingress class name is used when the base blueprint version is 2.7+ for both new and existing managed ingress add-on installations.
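For illustration, a minimal sketch of an Ingress that targets the managed controller via the new class name, written with the Terraform Kubernetes provider (plain YAML manifests work just as well); the namespace, host, and service names below are placeholders:

```hcl
# Minimal, illustrative Ingress referencing Rafay's managed ingress controller
# via the new class name. All names other than "default-rafay-nginx" are placeholders.
resource "kubernetes_ingress_v1" "demo" {
  metadata {
    name      = "demo-ingress"
    namespace = "demo"
  }

  spec {
    ingress_class_name = "default-rafay-nginx" # previously "nginx"

    rule {
      host = "demo.example.com"
      http {
        path {
          path      = "/"
          path_type = "Prefix"
          backend {
            service {
              name = "demo-service"
              port {
                number = 80
              }
            }
          }
        }
      }
    }
  }
}
```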
An indicator is being added in the Group listing page to distinguish between an "IDP group" and an "Override/Local group". This will enable customers to easily differentiate between the two group types in the Rafay Console.
Security guidelines may mandate that the Rafay Operator and managed add-on images be pulled from a custom registry instead of the Rafay-hosted registry. This release adds the ability to do so.
Please reach out to the Rafay Customer Success team if this is a requirement in your organization.
In this release, we have added driver support in hooks for both resource templates and environment templates, offering enhanced flexibility and reusability in your workflows.
What's New
* Select Drivers in Hooks: Users can now choose a driver within hooks, enabling reuse of pre-configured drivers.
* Override Driver Timeouts: Timeouts configured in hooks now take precedence over driver timeouts.
* Interface Support: This functionality is available across all supported interfaces, including UI, System Sync, Terraform, and Swagger API.
Benefits
* Improved Reusability: Simplify workflows by leveraging existing drivers within hooks.
* Enhanced Control: Tailor timeouts to specific scenarios with hook-level overrides.
In previous releases, Volume Backup and Restore functionality was accessible through the UI and API. This functionality is now also integrated into GitOps System Sync. It enables users to save configuration data stored in volumes associated with multi-template deployments. Volumes serve as storage locations for configuration data required by resources within your environment. Enabling backup ensures you can restore this data if needed, minimizing downtime and data loss. When the environment is destroyed, these volumes are cleaned up.
This release includes enhancements and bug fixes for the resources listed below:
Existing Resources
rafay_eks_cluster: The rafay_eks_cluster resource now offers improved policy configuration options.
You can now specify policy information in JSON format using the new attach_policy_v2 field, which provides more flexibility for defining complex policies (see the sketch after the list below).
The addons and nodegroup IAM sections within the resource now support additional fields in attach_policy:
* Condition
* NotAction
* NotPrincipal
* NotResource
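As a minimal sketch of the JSON form, the policy document below is built with jsonencode() and assigned to attach_policy_v2. The statement contents are placeholders, and the exact IAM block the field is placed in depends on your existing rafay_eks_cluster configuration:

```hcl
# Illustrative policy document for attach_policy_v2 (JSON form). The statement is a
# placeholder; elements such as Condition, NotAction, NotPrincipal, and NotResource
# can now also be expressed in attach_policy.
locals {
  nodegroup_policy_json = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:GetObject"]
        Resource = "arn:aws:s3:::example-bucket/*"
        Condition = {
          StringEquals = { "aws:RequestedRegion" = "us-west-2" }
        }
      }
    ]
  })
}

# Assign the document inside the relevant IAM section (node group or addon) of your
# existing rafay_eks_cluster definition; the placement shown here is indicative only:
#   attach_policy_v2 = local.nodegroup_policy_json
```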
rafay_eks_cluster: Previously, Terraform would reject configurations lacking mandatory addons in the rafay_eks_cluster definition. This validation has been removed. Terraform now accepts configurations without explicit addon definitions and implicitly adds them during cluster creation using terraform apply.
rafay_cloud_credential and rafay_cloud_credential_v3:
These resources now allow updating credentials without errors. Simply update the resource definition and run terraform apply to reflect the changes.
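For example, assuming an AWS role-based credential (the attribute names below are illustrative placeholders and should mirror whatever your existing resource already declares), rotating the role is just an in-place edit followed by terraform apply:

```hcl
# Hypothetical credential definition; attribute names are placeholders and should
# match your existing rafay_cloud_credential resource.
resource "rafay_cloud_credential" "aws" {
  name    = "aws-provisioner"
  project = "defaultproject"

  # Update the value in place (e.g., a rotated role ARN) and run `terraform apply`;
  # the change is now accepted without errors.
  rolearn = "arn:aws:iam::123456789012:role/rafay-provisioner-v2"
}
```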