Ankur Pandita

Interact with Your Rafay Managed Kubernetes Clusters Using MCP-compatible AI clients

The Model Context Protocol (MCP) is an open standard that enables AI assistants to securely interact with external tools and systems. When used with Kubernetes, MCP allows an AI assistant to execute operations (for example, kubectl commands), retrieve live cluster state, and reason about results without requiring users to manually copy and paste output into a chat interface.

This blog uses Claude Desktop as an example AI assistant. The same approach applies to any MCP-compatible AI client.

For platform administrators, this capability enables controlled, auditable, and policy-driven AI-assisted cluster operations.


For production environments, the recommended approach is to run the MCP server locally and connect to your Kubernetes cluster using a Rafay Zero Trust Kubectl Access (ZTKA) kubeconfig.

In this model:

  • The MCP server runs on the administrator’s workstation
  • Cluster access is established through Rafay’s ZTKA secure relay
  • No inbound access to the cluster is required
  • No VPN tunnels or exposed Kubernetes API endpoints are needed

This architecture aligns with zero-trust security principles and enterprise compliance requirements.
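As a minimal sketch of what this looks like in practice: an MCP-compatible client such as Claude Desktop is typically configured with a JSON entry that launches the MCP server locally and points it at a kubeconfig. The server binary name and file paths below are illustrative assumptions, not Rafay-specific values; the key idea is that `KUBECONFIG` references the ZTKA kubeconfig downloaded from the Rafay console.

```json
{
  "mcpServers": {
    "kubernetes": {
      "command": "/usr/local/bin/kubernetes-mcp-server",
      "env": {
        "KUBECONFIG": "/home/admin/.kube/ztka-kubeconfig.yaml"
      }
    }
  }
}
```

With this in place, every kubectl-style operation the assistant performs is routed through Rafay's ZTKA relay and subject to the same RBAC and audit controls as a human operator using the same kubeconfig.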

Kubernetes v1.35 for Rafay MKS

As part of our continuous effort to bring the latest Kubernetes versions to our users, support for Kubernetes v1.35 will be added soon to the Rafay Operations Platform for MKS cluster types.

Both new cluster provisioning and in-place upgrades of existing clusters are supported. As with most Kubernetes releases, this version deprecates and removes a number of features. To ensure zero impact to our customers, we have validated every feature in the Rafay Kubernetes Operations Platform on this Kubernetes version. Support will be promoted from Preview to Production in a few days and made available to all customers.

Important: Platform Version 1.2.0 Required

Kubernetes v1.35 requires etcd version 3.5.24, which is delivered as part of Rafay Platform Version 1.2.0. When creating new clusters based on Kubernetes v1.35, select Platform Version 1.2.0 along with it. When upgrading existing clusters to Kubernetes v1.35, upgrade to Platform Version 1.2.0 first or together with the Kubernetes upgrade. Clusters cannot be provisioned on or upgraded to Kubernetes v1.35 without Platform Version 1.2.0.

Kubernetes v1.35 Release

Managing Environments at Scale with Fleet Plans

As organizations scale their cloud infrastructure, managing dozens or even hundreds of environments becomes increasingly complex. Whether you are rolling out security patches, updating configuration variables, or deploying new template versions, performing these operations manually on each environment is time-consuming, error-prone, and simply unsustainable.

Fleet Plans solve this challenge. This powerful feature eliminates the need to manage environments individually by enabling bulk operations across multiple environments in parallel.

Fleet Plans General Flow

Fleet Plans provide a streamlined workflow for managing multiple environments at scale, enabling bulk operations with precision and control.

Note: Fleet Plans currently support day 2 operations only, focusing on managing and updating existing environments rather than initial provisioning.

Granular Control of Your EKS Auto Mode Managed Nodes with Custom Node Classes and Node Pools

A couple of releases back, we added EKS Auto Mode support to our platform, covering both quick configuration and custom configuration. In this blog, we will explore how you can create an EKS cluster using quick configuration, then dive deep into creating custom node classes and node pools and deploying them via addons onto EKS Auto Mode enabled clusters.
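For orientation, EKS Auto Mode manages nodes through Karpenter-derived custom resources: a `NodeClass` (API group `eks.amazonaws.com`) describes infrastructure settings, and a `NodePool` (API group `karpenter.sh`) references it and constrains scheduling. The sketch below is illustrative; names and field values are assumptions, and in the workflow described here these manifests would be packaged as a Rafay addon rather than applied by hand.

```yaml
# Custom NodeClass: infrastructure-level settings for Auto Mode managed nodes
apiVersion: eks.amazonaws.com/v1
kind: NodeClass
metadata:
  name: custom-nodeclass
spec:
  ephemeralStorage:
    size: 100Gi
---
# Custom NodePool: references the NodeClass and restricts instance selection
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: custom-pool
spec:
  template:
    spec:
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: custom-nodeclass
      requirements:
        - key: eks.amazonaws.com/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
```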

Dynamic Resource Allocation for GPU Allocation on Rafay's MKS (Kubernetes 1.34)

This blog demonstrates how to leverage Dynamic Resource Allocation (DRA) for efficient GPU allocation using Multi-Instance GPU (MIG) strategy on Rafay's Managed Kubernetes Service (MKS) running Kubernetes 1.34.

In our previous blog series, we covered various aspects of Dynamic Resource Allocation (DRA) in Kubernetes:

DRA is GA in Kubernetes 1.34

With Kubernetes 1.34, Dynamic Resource Allocation (DRA) is Generally Available (GA) and enabled by default on MKS clusters. This means you can immediately start using DRA features without additional configuration.

Prerequisites

Before we begin, ensure you have:

  • A Rafay MKS cluster running Kubernetes 1.34 (see MKS v1.34 Blog)
  • GPU nodes with compatible NVIDIA GPUs (A100, H100, or similar MIG-capable GPUs)
  • Container Device Interface (CDI) enabled (automatically enabled in MKS for Kubernetes 1.34)
  • Basic understanding of Dynamic Resource Allocation concepts (covered in our previous blog series)
  • Active Rafay account with appropriate permissions to manage MKS clusters and addons
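To make the prerequisites concrete, here is a minimal sketch of a DRA workload on Kubernetes 1.34. It assumes the NVIDIA DRA driver is installed on the cluster and publishes a `mig.nvidia.com` DeviceClass; the claim, pod, and image names are illustrative assumptions. Field names follow the `resource.k8s.io/v1` API that went GA in 1.34.

```yaml
# ResourceClaimTemplate: each consuming pod gets its own claim for one MIG slice
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: mig-slice
spec:
  spec:
    devices:
      requests:
        - name: mig
          exactly:
            deviceClassName: mig.nvidia.com
---
# Pod that consumes the claim instead of using classic resource limits
apiVersion: v1
kind: Pod
metadata:
  name: cuda-job
spec:
  restartPolicy: Never
  resourceClaims:
    - name: gpu
      resourceClaimTemplateName: mig-slice
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi", "-L"]
      resources:
        claims:
          - name: gpu
```

Unlike the traditional device-plugin model (`nvidia.com/gpu` resource limits), the scheduler here allocates a specific MIG device to the claim before placing the pod, which is what enables the finer-grained MIG partitioning strategies discussed below.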

Kubernetes v1.34 for Rafay MKS

As part of our continuous effort to bring the latest Kubernetes versions to our users, support for Kubernetes v1.34 will be added soon to the Rafay Operations Platform for MKS cluster types.

Both new cluster provisioning and in-place upgrades of existing clusters are supported. As with most Kubernetes releases, this version deprecates and removes a number of features. To ensure zero impact to our customers, we have validated every feature in the Rafay Kubernetes Operations Platform on this Kubernetes version. Support will be promoted from Preview to Production in a few days and made available to all customers.

Kubernetes v1.34 Release

Upstream Kubernetes on RHEL 10 using Rafay

Our upcoming release update will add support for a number of new features and enhancements. This blog is focused on the upcoming support for Upstream Kubernetes on nodes based on Red Hat Enterprise Linux (RHEL) v10.0. Both new cluster provisioning and in-place upgrades of Kubernetes clusters will be supported for lifecycle management.

Introducing Platform Version with Rafay MKS clusters

Our upcoming release introduces support for a number of new features and enhancements. One such enhancement is the introduction of Platform Versioning for Rafay MKS clusters, a major feature in our v3.5 release. This new capability is designed to simplify and standardize the upgrade lifecycle of critical components in upstream Kubernetes clusters managed by Rafay MKS.

Why Platform Version?

Upgrading Kubernetes clusters is essential, but the core components, such as etcd, CRI, and Salt Minion, also require updates for:

  • Security patches
  • Compatibility with new Kubernetes features
  • Performance improvements

Platform Versioning introduces a structured, reliable, and repeatable upgrade path for these foundational components, reducing risk and operational overhead.

What is a Platform Version?

A Platform Version defines a tested and validated set of component versions that can be safely upgraded together. This ensures compatibility and stability across your clusters.

We are introducing v1.0.0 as the very first Platform Version for new clusters. This version includes:

  • CRI: v2.0.4
  • etcd: v3.5.21
  • Salt Minion: v3006.9

Note

For existing clusters, the initial platform version will be shown as v0.1.0, which is assigned for reference purposes to older clusters that were created before platform versioning was introduced. Please perform the upgrade to v1.0.0 during scheduled downtime, as it involves updates to core components such as etcd and CRI.

How Does Platform Versioning Work?

You can upgrade the Platform Version in two ways:

  • During a Kubernetes version upgrade
  • As a standalone platform upgrade

This flexibility allows you to keep your clusters secure and up to date, regardless of your Kubernetes upgrade schedule.

Platform Version

Controlled and Responsive Update Cadence

Platform Versions are not released frequently. New versions are published only when:

  • A high severity CVE or vulnerability is addressed
  • A major performance or compatibility feature is introduced
  • There are significant version changes in core components

This approach ensures that upgrades are meaningful and necessary, minimizing disruption.

Whenever a new Platform Version is released, existing clusters can seamlessly upgrade to the latest version, ensuring they benefit from the latest security patches and improvements without manual intervention.

Evolving Platform Versions and Expanding Coverage

We are committed to continuously improving Platform Versioning. In future releases, we will introduce new Platform Versions that expand its scope to include more critical components. For this initial release, we have started with three foundational components, etcd, CRI, and Salt Minion, because of their critical importance to cluster stability. Over time, we will enhance Platform Versioning to cover additional components, ensuring your clusters remain robust, secure, and up to date.

Platform Version Documentation

For detailed documentation, see: Platform Version Docs

In Summary

Platform Versioning makes it easier than ever to keep your clusters current and secure by managing the upgrade lifecycle of foundational components like etcd, CRI, and Salt Minion.

Whether you apply it alongside a Kubernetes version bump or independently, Platform Versioning ensures your infrastructure remains stable, secure, and optimized now and in the future.

Kubernetes v1.33 for Rafay MKS

As part of our upcoming May release, alongside other enhancements and features, we are adding support for Kubernetes v1.33 with Rafay MKS (i.e., upstream Kubernetes for bare-metal and VM-based environments).

Both new cluster provisioning and in-place upgrades of existing clusters are supported. As with most Kubernetes releases, v1.33 deprecates and removes several features. To ensure zero impact to our customers, we have validated every feature of the Rafay Kubernetes Operations Platform on this Kubernetes version.

Kubernetes v1.33 Release