Sahaib's Tech Stack

InsideTheStack

Choosing the Right Model for the Right Job
The decision framework most developers never build. The idea of a “best …
GPU vs CPU Inference: Real Truths
The real truths most people never tell you. This debate is usually …
RAG That Actually Works
And why 90 percent of people implement it wrong. Most RAG systems …
Coding Models: Qwen2.5 vs GPT vs Claude
Why Claude 4.5 changes the entire game. For years, coding models …
Cloud LLM Playbook (OpenRouter, Cost vs Latency)
When you should use cloud instead of local models. Local models are …
Local LLM Playbook
Run strong models on your machine without a GPU. For a long …
KV Cache: Why Models Become Fast
The hidden mechanism that makes modern LLMs feel instant. Most people think …
How Tokenization Actually Works
The hidden layer behind every LLM. Most people talk about models, parameters, …
🚀 InsideTheStack: The Kickoff
A series for the builders who don’t want to stay in …
Sahaib's Tech Stack © 2026