Choosing the Right Model for the Right Job

The decision framework most developers never build

The idea of a “best model” is comforting.
It is also wrong.

There is no best model.
There is only the best fit for a specific job.

Once you accept that, model selection stops being emotional and starts being engineering.

Why model choice actually matters

The cost of picking the wrong model shows up quickly:

  • unnecessary cost
  • slow responses
  • inconsistent outputs
  • hallucinations in edge cases
  • infrastructure that is bigger than the problem

Many failures blamed on “LLMs” are actually selection failures.


What model selection should be based on

A model should be chosen using clear constraints, not vibes.

At minimum, you need to evaluate:

  1. Task type
    Code, summarization, extraction, classification, planning
  2. Context window requirements
    How much input the model needs to see at once
  3. Accuracy expectations
    Good enough vs must-be-correct
  4. Latency constraints
    Interactive vs background processing
  5. Cost budget
    Per request, per user, per month
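
One way to make these five constraints concrete is to write them down as data and filter candidates against them. The sketch below is a minimal illustration, not a definitive implementation: the class names, field names, and the "cheapest eligible model wins" rule are all assumptions made for this example, and the candidate profiles are expected to come from your own measurements.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Requirements:
    """Constraints for one job. All field names are hypothetical."""
    task_type: str
    min_context_tokens: int
    min_accuracy: float          # score on your own eval set, 0.0 to 1.0
    max_latency_ms: float        # tight for interactive, loose for background
    max_cost_per_request: float  # in whatever unit you budget in


@dataclass(frozen=True)
class Candidate:
    """Measured profile of one model, filled in from your own benchmarks."""
    name: str
    task_types: frozenset
    context_tokens: int
    accuracy: float
    p95_latency_ms: float
    cost_per_request: float


def is_eligible(c: Candidate, r: Requirements) -> bool:
    """A candidate must satisfy every constraint, not just most of them."""
    return (
        r.task_type in c.task_types
        and c.context_tokens >= r.min_context_tokens
        and c.accuracy >= r.min_accuracy
        and c.p95_latency_ms <= r.max_latency_ms
        and c.cost_per_request <= r.max_cost_per_request
    )


def pick_model(candidates: Sequence[Candidate], r: Requirements) -> Optional[Candidate]:
    """Return the cheapest eligible candidate, or None if nothing fits."""
    eligible = [c for c in candidates if is_eligible(c, r)]
    return min(eligible, key=lambda c: c.cost_per_request, default=None)
```

Returning None when nothing fits is deliberate in this sketch: it forces a conversation about which constraint to relax instead of silently falling back to the biggest model.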

Example rules that work in practice:

  • use an open-weight model such as Llama or Qwen for structured, repeatable tasks
  • use Claude when understanding large inputs and keeping long outputs coherent matter

Why one model is never enough

Production systems need more than a single benchmark score.

Real selection requires:

  • benchmarking on your own data
  • A/B testing outputs
  • latency profiling under load
  • consistency checks across retries
  • long-context stress testing

One model rarely performs best across all dimensions.
Accepting that early saves months of refactoring later.
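
Two of the checks above, consistency across retries and latency profiling, can be sketched in a few lines. This is a rough illustration under stated assumptions: `call_model` is a hypothetical callable that takes a prompt and returns text (wrap whatever client you actually use), consistency is measured as exact-match agreement, and the latency loop is sequential, so it does not simulate real load.

```python
import math
import time
from collections import Counter
from typing import Callable, Dict, Sequence


def consistency(call_model: Callable[[str], str], prompt: str, retries: int = 5) -> float:
    """Fraction of retries that agree with the most common output (exact match)."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    outputs = [call_model(prompt) for _ in range(retries)]
    top_count = Counter(outputs).most_common(1)[0][1]
    return top_count / retries


def latency_profile(call_model: Callable[[str], str], prompts: Sequence[str]) -> Dict[str, float]:
    """Time each call and report nearest-rank p50 and p95 in milliseconds."""
    if not prompts:
        raise ValueError("need at least one prompt")
    samples_ms = []
    for prompt in prompts:
        start = time.perf_counter()
        call_model(prompt)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()

    def percentile(p: float) -> float:
        # Nearest-rank percentile: the smallest sample covering p% of the data.
        rank = max(1, math.ceil(p / 100.0 * len(samples_ms)))
        return samples_ms[rank - 1]

    return {"p50_ms": percentile(50), "p95_ms": percentile(95)}
```

Exact-match agreement suits extraction and classification, where the output should be identical on every retry. For free-form text you would likely swap in a similarity measure of your choosing.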


How I think about model tiers

This mental model keeps systems predictable:

  • small models
    extraction, classification, tagging, routing
  • medium models
    coding, reasoning, transformations
  • large models
    strategy, planning, multi-step logic, synthesis

This is not about power.
It is about alignment between task and capability.
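
The tiers above translate naturally into a routing table. The sketch below is one possible shape for it, assuming task types are known up front: the task-to-tier mapping mirrors the list above, while the model names are placeholders invented for this example and stand in for whatever you actually deploy.

```python
# Task-to-tier mapping taken from the tier list above.
TIER_BY_TASK = {
    "extraction": "small",
    "classification": "small",
    "tagging": "small",
    "routing": "small",
    "coding": "medium",
    "reasoning": "medium",
    "transformation": "medium",
    "strategy": "large",
    "planning": "large",
    "multi_step_logic": "large",
    "synthesis": "large",
}

# Placeholder model identifiers (hypothetical); replace with your own.
MODEL_BY_TIER = {
    "small": "small-model-placeholder",
    "medium": "medium-model-placeholder",
    "large": "large-model-placeholder",
}


def route(task_type: str) -> str:
    """Return the model identifier for a task type, failing loudly if unknown."""
    try:
        tier = TIER_BY_TASK[task_type]
    except KeyError:
        raise ValueError(f"no tier defined for task type: {task_type!r}") from None
    return MODEL_BY_TIER[tier]
```

Raising on an unknown task type is a design choice in this sketch: a silent default to the largest tier is how cost drifts upward without anyone deciding it should.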

The real takeaway

Model selection is an architectural decision.
Treating it like a preference is how systems drift into chaos.

When models are chosen intentionally, AI systems become boring.
And boring is exactly what production needs.


Closing

This post is part of InsideTheStack, focused on building AI systems that behave predictably under real constraints.

Follow along for more.

#InsideTheStack #ModelSelection