
Mohan Atreya

How to Select the Right GPU for Open Source LLMs?

Deploying and operating an open-source Large Language Model (LLM) requires careful planning when selecting the right GPU model and memory capacity. Choosing the optimal configuration is crucial for performance, cost efficiency, and scalability. However, this process comes with several challenges.

In this blog, we will describe the factors that you need to consider to select the optimal GPU model for your LLM. We have also published a table capturing the optimal GPU models for deploying and using the top 10 open-source LLMs.
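As a rough illustration of the memory side of this decision, the sketch below estimates how much GPU memory a model's weights need and how many GPUs that implies. It relies on the standard sizes of common numeric formats (for example, 2 bytes per parameter for FP16). The function names and the `overhead` multiplier (a stand-in for KV cache, activations, and runtime buffers) are assumptions for illustration, not values from this blog's table.

```python
import math

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_weight_memory_gb(params_billions, precision="fp16"):
    """Return the memory needed for model weights alone, in GB (10^9 bytes)."""
    # 1 billion parameters at 1 byte each is 1 GB, so the units cancel.
    return params_billions * BYTES_PER_PARAM[precision]


def estimate_gpu_count(params_billions, gpu_memory_gb, precision="fp16", overhead=1.2):
    """Return a rough GPU count for serving a model.

    `overhead` is an assumed multiplier covering KV cache, activations,
    and runtime buffers; real values depend on batch size and context length.
    """
    required_gb = estimate_weight_memory_gb(params_billions, precision) * overhead
    return math.ceil(required_gb / gpu_memory_gb)


if __name__ == "__main__":
    # Hypothetical example: a 70B-parameter model in FP16 on 80 GB GPUs.
    print(estimate_weight_memory_gb(70, "fp16"))
    print(estimate_gpu_count(70, 80, "fp16"))
```

Treat the result as a starting point only: long context windows, large batch sizes, and the choice of serving framework can all push the real requirement well above this estimate.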

How many GPUs

Developer Self Service Access to DeepSeek on Amazon EKS

A few weeks back, Tiago Reichert from AWS published a very interesting blog on AWS Community showcasing how you can deploy and use the DeepSeek-R1 LLM on an Amazon EKS Cluster operating in Auto Mode. Detailed step-by-step instructions for this are documented in this Git Repo.

In this blog, we will describe how we took AWS's excellent blog and packaged it to provide a turnkey, 1-click self-service experience for users in a typical enterprise who are not AWS administrators. It took one of our solution architects 30 minutes to wrap AWS's example code using Rafay's Environment Manager and PaaS.

Over the last few weeks, we have been asked to demonstrate this every day to several customers and partners. Given the significant interest in DeepSeek and the self-service experience, we believe others will benefit from this blog.

Rafay, DeepSeek and EKS

Turnkey Integration with Cilium CNI

In the first blog, we discussed how organizations can use Hubble for Cilium for observability. In this blog, we will look at how the Rafay Platform provides a tight, turnkey integration with Cilium, making life easy for platform teams. In the next blog, my colleague will describe and showcase how an administrator can configure and enable Hubble on a Rafay MKS based Kubernetes cluster with the Cilium CNI.

Rafay + Cilium

Supercharge Kubernetes Networking Observability using Hubble and Cilium

Networking observability in Kubernetes environments is essential for troubleshooting, security, and performance optimization. Hubble, an observability platform for the Cilium CNI, addresses this challenge by providing real-time insights into network traffic, security policies, and application-layer interactions. Hubble is built on eBPF (Extended Berkeley Packet Filter) and provides deep visibility into packet flows, service-to-service communication, and security enforcement without requiring intrusive packet mirroring or modifications to application code. In a nutshell, Hubble is a fully distributed networking and security observability platform for cloud native workloads.

In this introductory blog about Hubble for Cilium, we will start with a real-life example highlighting where traditional monitoring tools fall short. We will then look at how Hubble + Cilium can address these gaps. In the second blog, I will describe how Rafay provides our customers with a tight, turnkey integration with Cilium for various cluster types (i.e. Rafay MKS for Data Centers and Public Cloud Distributions such as Amazon EKS).

Hubble Intro

Encrypt your Kubernetes Backups using Server Side Encryption

As Kubernetes adoption grows rapidly in enterprises, protecting cluster data is critical. Backups ensure business continuity in case of failures, accidental deletions, or security breaches. For over 2 years, users have depended on the integrated backup/restore capability in the Rafay Platform to dramatically simplify Kubernetes backup and restore operations. When backup artifacts are stored in public cloud environments, organizations may be concerned about their security. One of the most effective ways to secure these backups is by using Server-Side Encryption (SSE). SSE encrypts data at rest within cloud storage services, protecting it from unauthorized access while minimizing operational overhead.

In this blog, I describe the value of SSE for Kubernetes backups and how it enhances security and compliance. I will also describe how administrators can configure and use SSE for backups in the Rafay Platform.

Encryption

Info

Learn about the integrated Backup/Restore capabilities in the Rafay Platform.

Flatcar Linux: A Great Fit for Kubernetes

In the fast-evolving landscape of containerized applications and cloud-native technologies, choosing the right operating system for your Kubernetes cluster can sometimes make a very big difference. Enter Flatcar Container Linux, an open-source, minimal, and immutable Linux distribution tailored specifically for running containers.

Flatcar is an excellent choice for Kubernetes and modern cloud-native environments. In Aug 2024, Flatcar Linux was accepted as a CNCF project.

This is a 3-part blog series. In this blog, we'll explore what Flatcar Linux is, why it’s uniquely suited for Kubernetes, and the benefits it brings relative to generic Linux.

Flatcar Logo

EKS Auto Mode - Considerations

In the introductory blog on Auto Mode for Amazon EKS, we described the basics of this new capability that was announced at AWS re:Invent 2024. In this blog, we will review considerations that organizations need to factor in before using EKS in Auto Mode.

Note

Please consider this a living document. EKS Auto Mode is relatively new, and we will update this blog with new learnings and findings.

Considerations for EKS Auto Mode

EKS Auto Mode - An Introduction

The Rafay team just got back late last week from an incredibly busy AWS re:Invent 2024. Congratulations to the EKS Product team led by our friend, Nate Taber, for the launch of Auto Mode for EKS.

Since this announcement last week, we have had several customers reach out and ask us for our thoughts on the newly launched EKS Auto Mode service. There are several blogs that already describe how Auto Mode for EKS works. In this blog series, I will attempt to provide perspective on "Why?", "Why Now?" and "What this means for the industry".

EKS Auto Mode

The Kube-OVN CNI: A Powerful Networking Solution for Kubernetes

Kubernetes has become the de facto standard for orchestrating containerized applications, but efficient networking remains one of the biggest challenges. For Kubernetes networking, Container Network Interface (CNI) plugins handle the essential task of managing the network configuration between pods, nodes, and external systems. Among these CNI plugins, Kube-OVN stands out as a feature-rich and enterprise-ready solution, designed for cloud-native applications requiring robust networking features.

In this blog, we will discuss how Kube-OVN differs from popular CNI plugins such as Calico and Cilium, and the use cases where it is particularly useful.

Kube-OVN Logo

Spatial Partitioning of GPUs using Nvidia MIG

In the prior blogs, we discussed why GPUs are managed differently in Kubernetes, how the GPU Operator helps streamline management and various strategies to share GPUs on Kubernetes. In 2020, Nvidia introduced Multi-Instance GPU (MIG) that takes GPU sharing to a different level.

In this blog, we will start by reviewing some common industry use cases where MIG is used and then dive deeper into how MIG is configured and used.

Nvidia MIG