Learning Journal

Today I Learned

A collection of daily discoveries, insights, and notes from building AI infrastructure. Each entry captures something new learned along the journey.

Saturday, February 28, 2026

Nano-RL and Nano-Coder: Producing and Consuming Agentic Models

Starting two complementary projects: nano-RL for training agentic models with reinforcement learning, and nano-coder for building agent systems that consume those models.

rl agents llm reinforcement-learning tool-use code-execution

Nano-Train: From Minimal Training Loop to Distributed Training Framework

Journey from building a basic training loop to understanding the architecture of a modern LLM training framework, including monitoring, model inspection, and distributed parallelism

llm training distributed-training megatron parallelism monitoring

Mixed Precision Training: FP16 → FP8 → DeepSeek-Style Fine-Grained Scaling

Implemented mixed-precision training in nano-train and studied FP8 quantization theory, from loss scaling basics to DeepSeek-V3's fine-grained tile/block scaling approach

training fp8 quantization deepseek nano-train

From Simple Model Dump to Comprehensive Roofline Analysis: Evolution of dump_model_info()

Learned how a simple model architecture dumping function evolved into a comprehensive performance analysis tool with roofline bottleneck detection and optimization guidance

performance-analysis roofline model-analysis bottleneck-detection llm-training

SGLang Model Support: Native DeepSeek-V2 Mapping and Transformers Backend Fallback

Learned how SGLang supports models like GLM-5 natively by inheriting from its DeepSeek-V2 implementation, and how it falls back to the Transformers backend, with hot-path replacement, for models it does not support natively

sglang vllm glm5 deepseek transformers model-architecture

Building Production LLM Infrastructure on Kubernetes

Learned how to build production-grade Kubernetes infrastructure for LLM serving with custom CRDs, operators, and fleet autoscaling patterns

kubernetes llm vllm crd operators autoscaling