Profile

Machine Learning Platform, Distributed Systems, and AI Infrastructure

About

I like building the parts of AI systems that most people only notice when they fail: the infrastructure, deployment paths, inference layers, and platform pieces that keep models usable in production. My work spans distributed systems, machine learning and AI platforms, agentic platforms, and the practical details that make these systems reliable.

I currently work at Oracle and studied at Carnegie Mellon University and the Indian Institute of Technology (BHU). This site is where I keep the writing, research artifacts, experiments, talks, and patent work that connect back to the same thread: making ML/AI systems scale without losing sight of how they actually run.

Core Areas

Distributed systems for ML/AI workloads
Machine learning platforms
Machine learning
AI and agentic platforms
LLM inference and serving infrastructure
Cloud reliability, scale, and performance

Public Footprint

Forbes Technology Council contributor
Research article archived with DOI on Zenodo
LinkedIn profile with 500+ connections
Blog posts on GPT-2 internals, plus articles for Forbes and the Oracle AI blog
Patent filing on optimized model deployment

Research

Research Articles

Preprint | Published Jun 13, 2026 | DOI: 10.5281/zenodo.20682521
FailFast-Fargate: Predictive Container Restart Policies for SLO-Driven ECS/Fargate Services

FailFast-Fargate studies a proactive task replacement framework for Amazon ECS/Fargate services. The goal is to detect degradation before user-visible SLO violations occur, instead of waiting for hard failures or health-check breaches after error budgets have already been consumed.

The approach estimates short-horizon SLO risk from task telemetry and trace signals, then compares the likely cost of leaving a degrading task in service against the cost of replacing it. It is designed to work through standard ECS mechanisms without requiring changes to the underlying AWS ECS architecture.
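
As a rough illustration of that keep-versus-replace comparison, here is a minimal Python sketch. It is not the model from the paper: the `TaskSignals` fields, the threshold-proximity risk score, and the cost parameters are all hypothetical stand-ins chosen to make the decision rule concrete.

```python
from dataclasses import dataclass


@dataclass
class TaskSignals:
    """Hypothetical per-task telemetry snapshot (field names are assumptions)."""
    p99_latency_ms: float
    error_rate: float        # fraction of failed requests, 0.0 to 1.0
    requests_per_min: float


def estimate_slo_risk(s: TaskSignals, latency_slo_ms: float, error_slo: float) -> float:
    """Toy risk score in [0, 1] based on how close the task is to its SLO limits.

    This stands in for a learned short-horizon predictor; it is an assumption,
    not the estimator described in the article.
    """
    latency_ratio = s.p99_latency_ms / latency_slo_ms
    error_ratio = s.error_rate / error_slo
    return min(1.0, max(latency_ratio, error_ratio))


def should_replace(
    s: TaskSignals,
    horizon_min: float,
    cost_per_bad_request: float,
    replacement_cost: float,
    latency_slo_ms: float = 500.0,
    error_slo: float = 0.01,
) -> bool:
    """Replace the task when the expected cost of keeping it exceeds the cost of a restart."""
    risk = estimate_slo_risk(s, latency_slo_ms, error_slo)
    # Expected number of SLO-violating requests over the look-ahead horizon.
    expected_bad_requests = risk * s.requests_per_min * horizon_min
    cost_of_keeping = expected_bad_requests * cost_per_bad_request
    return cost_of_keeping > replacement_cost


healthy = TaskSignals(p99_latency_ms=100.0, error_rate=0.001, requests_per_min=600.0)
degrading = TaskSignals(p99_latency_ms=450.0, error_rate=0.008, requests_per_min=600.0)

keep_healthy = should_replace(healthy, horizon_min=5, cost_per_bad_request=0.05, replacement_cost=50.0)
swap_degrading = should_replace(degrading, horizon_min=5, cost_per_bad_request=0.05, replacement_cost=50.0)
```

In this sketch the degrading task is flagged before it crosses either SLO threshold, because its projected cost over the horizon already outweighs the assumed restart cost. A real policy would also need rate limiting so that restarts stay controlled.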

Synthetic degradation experiments show reductions in error-budget burn and SLO impact while keeping restart rates controlled. The work frames predictive restart policies as a practical path toward self-healing, SLO-aware services on ECS/Fargate.

Keywords: ECS, Fargate, SLO, cloud computing, distributed systems, predictive restart, self-healing.

Blog Posts

This is my corner of the internet where I nerd out about the stuff I love — systems, infrastructure, and AI. If it scales, breaks, or learns, I’m probably writing about it.

Jun 19, 2025 | Personal blog
Deconstruction Series #1: Rebuilding GPT-2 in Pure C

Welcome to the GPT-2 Deconstruction Series, a deep dive into how GPT-2 really works, built from the ground up in pure C. No Python. No PyTorch. No magic. Just raw logic, memory management, and the beauty (and pain) of doing everything yourself. Whether you’re here to learn how transformers tick or just enjoy bending C to your will, this is your guide to building GPT-2 step by step, from tokenization to text generation.

Check out the GPT-2 C implementation: gpt2.c

Jan 22, 2026 | Forbes Technology Council
The New Frontier Of LLM Inference: Where The Next Tenfold Gains Will Come From

A Forbes Technology Council article on how brute-force scaling is giving way to inference engine improvements rooted in core computer systems design.

Oracle AI and Data Science Blog
Deploy an LLM on OCI Data Science with NVIDIA Triton

Oracle blog post on deploying a large language model using OCI Data Science and NVIDIA Triton.

Talks and Judging

Talks

May 1, 2026 | UCLA LA Hacks
UCLA LA Hacks at Pauley Pavilion

Judged UCLA's flagship hackathon, reviewed demos from 1,000+ participants, and highlighted the Codebreaker winning team.

Patents

Application No. 19/088,846 | Filed Mar 18, 2025
Model Deployment System for Generating Optimized Models for a Target Environment

Patent application covering a model deployment system for generating optimized models for a target environment. Listed with other inventors.
