
Mohan Atreya

Cost Management for SageMaker AI: The Case for Strong Administrative Guardrails

Enterprises are increasingly leveraging Amazon SageMaker AI to empower their data science teams with scalable, managed machine learning (ML) infrastructure. However, without proper administrative controls, SageMaker AI usage can lead to unexpected cost overruns and significant waste.

In large organizations where dozens or hundreds of data scientists may be experimenting concurrently, this risk compounds quickly.
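To make the guardrail idea concrete, here is a minimal sketch of one common control: a scheduled job that stops notebook instances that look idle. This is an illustration, not the blog's prescribed method; it assumes boto3 credentials with SageMaker permissions, the eight-hour threshold is an arbitrary assumption, and LastModifiedTime is only a coarse proxy for real activity.

```python
import boto3
from datetime import datetime, timedelta, timezone

sagemaker = boto3.client("sagemaker")

# Anything untouched for 8+ hours is treated as idle (illustrative threshold).
cutoff = datetime.now(timezone.utc) - timedelta(hours=8)

paginator = sagemaker.get_paginator("list_notebook_instances")
for page in paginator.paginate(StatusEquals="InService"):
    for nb in page["NotebookInstances"]:
        # LastModifiedTime is a coarse proxy for recent activity.
        if nb["LastModifiedTime"] < cutoff:
            print(f"Stopping idle notebook: {nb['NotebookInstanceName']}")
            sagemaker.stop_notebook_instance(
                NotebookInstanceName=nb["NotebookInstanceName"]
            )
```

In practice, teams pair reactive scripts like this with preventative controls, such as IAM or Service Control Policies that cap allowed instance types and enforce cost-allocation tagging.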


BioContainers: Streamlining Bioinformatics with the Power of Portability

In today's fast-paced world of bioinformatics, the constant evolution of tools, dependencies, and operating system environments presents a significant challenge. Researchers often spend countless hours grappling with software installation, configuration, and version conflicts, hindering their ability to focus on scientific discovery. Enter BioContainers – a revolutionary approach that leverages containerization technology to package bioinformatics software and its entire environment into self-contained, portable units.

Imagine a meticulously organized lab where every experiment, regardless of its complexity, can be instantly replicated with identical results.

This is the promise of BioContainers. Built upon established container platforms like Docker and Singularity, BioContainers encapsulate everything a bioinformatics tool needs to run: the application itself, its libraries, dependencies, and even specific operating system configurations.
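As a small sketch of that portability, the snippet below runs samtools from a BioContainers image via Docker without installing anything on the host. It assumes Docker is installed and running; the image tag shown is illustrative, so check the BioContainers registry on quay.io for current tags.

```python
import subprocess

# Illustrative tag; consult quay.io/biocontainers for real, pinned tags.
image = "quay.io/biocontainers/samtools:1.19--h50ea8bc_0"

# The image bundles samtools plus every library it needs, so the host
# only needs a container runtime: no local install or configuration.
result = subprocess.run(
    ["docker", "run", "--rm", image, "samtools", "--version"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.splitlines()[0])  # e.g. "samtools 1.19"
```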


Why Inventory Management is Table Stakes for GPU Clouds

In the world of GPU clouds, where speed, scalability, and efficiency are paramount, it’s surprising how many “Neo cloud” providers still manage their infrastructure the old-fashioned way—through spreadsheets.

As laughable as it sounds, this is the harsh reality. Inventory management, one of the most foundational aspects of a reliable cloud platform, is often overlooked or under-built. And for modern GPU clouds, that’s a deal breaker.
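What does "more than a spreadsheet" look like? At a minimum, inventory becomes a typed, queryable record with explicit state transitions instead of free-form cells. The sketch below is a deliberately simplified illustration; the field names and states are assumptions, and a real system would sit on a database with an audit trail.

```python
from dataclasses import dataclass
from enum import Enum

class NodeState(Enum):
    AVAILABLE = "available"
    ALLOCATED = "allocated"
    MAINTENANCE = "maintenance"

@dataclass
class GpuNode:
    """One physical GPU server tracked as a first-class inventory record."""
    hostname: str
    gpu_model: str          # e.g. "H100"
    gpu_count: int
    rack: str
    state: NodeState = NodeState.AVAILABLE
    tenant: str | None = None  # who currently holds the node, if anyone

# A dict keyed by hostname stands in for a real datastore here.
inventory: dict[str, GpuNode] = {}

def allocate(hostname: str, tenant: str) -> GpuNode:
    """Move a node to a tenant, refusing invalid state transitions."""
    node = inventory[hostname]
    if node.state is not NodeState.AVAILABLE:
        raise ValueError(f"{hostname} is not available")
    node.state, node.tenant = NodeState.ALLOCATED, tenant
    return node
```

The point is not the code itself but the discipline it encodes: every node has exactly one state, transitions are validated, and a double-booking becomes an error instead of a silently overwritten spreadsheet row.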


Comparing HPA and KEDA: Choosing the Right Tool for Kubernetes Autoscaling

In Kubernetes, autoscaling is key to ensuring application performance while managing infrastructure costs. Two powerful tools that help achieve this are the Horizontal Pod Autoscaler (HPA) and Kubernetes Event-Driven Autoscaling (KEDA). While they share the goal of scaling workloads, their approaches and capabilities are quite different.

In this introductory blog, we will provide a bird's-eye view of how they compare and when you might choose one over the other.
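As a taste of the HPA side, here is a sketch that creates a CPU-based autoscaler using the official Kubernetes Python client. It assumes a reachable cluster, a local kubeconfig, and an existing Deployment named web; all names and thresholds are illustrative.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig

# Scale the "web" Deployment between 2 and 10 replicas,
# targeting 70% average CPU utilization.
hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa", namespace="default"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"
        ),
        min_replicas=2,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="Resource",
                resource=client.V2ResourceMetricSource(
                    name="cpu",
                    target=client.V2MetricTarget(
                        type="Utilization", average_utilization=70
                    ),
                ),
            )
        ],
    ),
)
client.AutoscalingV2Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```

KEDA, by contrast, is installed as its own operator and driven by a ScaledObject custom resource, which already hints at the architectural difference between the two.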


A Fresh New Look: Rafay Console Gets a UI/UX Makeover for Enhanced Usability

At Rafay, we believe that user experience is as critical as the powerful automation capabilities we deliver. With that commitment in mind, we’ve been working for the last few months on a revamp of the Rafay Console User Interface (UI). The changes are purposeful and designed to streamline navigation, increase operational clarity, and elevate your productivity. Whether you’re managing clusters, deploying workloads, or orchestrating environments, the new interface will put everything you need right at your fingertips.

The new UI will launch as part of our v3.5 Release, scheduled to roll out at the end of May 2025. We understand change is hard, and it can take a few hours for users to get used to the new experience. Note that existing projects and configurations remain unchanged, and users can continue managing their infrastructure and applications without interruption.

In this blog, we provide a closer look at the most impactful improvements and how they will benefit our users.


Introduction to User Namespaces in Kubernetes

In Kubernetes, some features arrive quietly but leave a massive impact, and v1.33 is shaping up to be one such release. In the previous blog, my colleague described how you can provision and operate Kubernetes v1.33 clusters on bare metal and VM-based environments using Rafay.

In this blog, we will discuss a new feature in v1.33 called User Namespaces. It is not a headline grabber like a service mesh, but it is a game changer for container security.
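To preview the feature, the sketch below creates a pod that opts into a user namespace by setting hostUsers: false, so UID 0 inside the container maps to an unprivileged UID on the host. It assumes a v1.33 cluster with user namespace support enabled and a local kubeconfig; the pod name and image are illustrative.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig

# hostUsers: false asks the kubelet to run this pod in its own user
# namespace, so "root" in the container is a harmless UID on the host.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "userns-demo"},
    "spec": {
        "hostUsers": False,
        "containers": [
            {"name": "app", "image": "busybox", "command": ["sleep", "3600"]}
        ],
    },
}

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```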


Powering Multi-Tenant, Serverless AI Inference for Cloud Providers

The AI revolution is here, and Large Language Models (LLMs) are at its forefront. Cloud providers are uniquely positioned to offer powerful AI inference services to their enterprise and retail customers. However, delivering these services in a scalable, multi-tenant, and cost-effective serverless manner presents significant operational challenges.

Rafay enables cloud providers to deliver Serverless Inference to hundreds of users and enterprises.

Info

Earlier this week, we announced our Multi-Tenant Serverless Inference offering for GPU & Sovereign Cloud Providers. Learn more about this here.
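From a tenant's point of view, serverless inference typically reduces to an authenticated, OpenAI-compatible HTTP endpoint: no GPUs to provision, just a request. The sketch below shows that shape; the URL, model name, and API key are placeholder assumptions, not details of the Rafay offering.

```python
import requests

BASE_URL = "https://inference.example.com/v1"  # placeholder endpoint
API_KEY = "YOUR_TENANT_API_KEY"                # placeholder credential

# Standard OpenAI-style chat completions request.
resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "llama-3-8b-instruct",  # illustrative model name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```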


Family vs. Lineage: Unpacking Two Often-Confused Ideas in the LLM World

LLMs have begun to resemble sprawling family trees. Folks who are relatively new to LLMs will notice two words appear constantly in technical blogs: "family" and "lineage".

They sound interchangeable, and users frequently conflate them. But they describe different slices of an LLM’s life story.

Important

Understanding the differences is more than trivia: it determines how you pick models, tune them, and keep inference predictable at scale.


Why “Family” Matters in the World of LLMs

When GPU bills run into six digits and every millisecond of latency counts, platform teams learn that vocabulary choices and hidden-unit counts aren’t the only things that separate one model checkpoint from another.

LLMs travel in families: groups of models that share a common architecture, tokenizer, and training recipe. Think of them the way you might think of Apple’s M-series chips or Toyota’s Prius line: the tuning changes, the size varies, but the underlying design stays stable enough that tools, drivers, and workflows remain interchangeable.

In this blog, we will explain what we mean by a family of LLMs and why it matters for inference.
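One practical consequence is easy to demonstrate: siblings in a family normally ship the same tokenizer, so token counts, and therefore cost and latency estimates, transfer across sizes. A minimal sketch with Hugging Face transformers, using two Qwen2.5 sizes as illustrative family members:

```python
from transformers import AutoTokenizer

# Two sizes from the same family; any sibling pair works the same way.
small = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
large = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

text = "GPU bills run into six digits."

# Same tokenizer means identical token IDs regardless of model size.
assert small.encode(text) == large.encode(text)
print("Shared token IDs across the family:", small.encode(text))
```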
