Occam’s razor when you pick models
People reach for Occam’s razor when a design review balloons or when a bug has five competing stories. The move is right more often than wrong if you treat it as discipline about what you add to the picture, not as a license to stop thinking.
Entities are not to be multiplied beyond necessity.
That line is the usual ontological reading of the razor. The Stanford Encyclopedia of Philosophy article on simplicity treats it as one side of a larger idea. Philosophers separate parsimony (how many kinds of thing your theory says exist) from elegance (how lean the hypotheses and machinery are). Everyday engineering talk often mashes those together, yet the two can pull in different directions: postulating an extra entity can shrink the amount of law you have to write down, and shaving entities can force uglier rules. The overview’s point for practitioners is that “do not multiply beyond necessity” is only decisive when you know what counts as necessity for the problem at hand.
The same article flags another constraint that matters on teams. Competing explanations rarely line up as perfect twins, with one carrying a strict subset of the other’s ontology. When the evidence does not settle the matter, parsimony is a tie-breaker, not a verdict. Use it to postpone commitment until the facts earn the heavier story.
When two architectures cover the same behavior and failure modes you care about, the one with fewer concepts, fewer failure domains, and fewer hand-maintained invariants is easier to operate. That is parsimony in the “kinds of thing” sense. When the heavier design buys a capability you have already promised (latency, isolation, auditability), necessity has entered the room and the razor stops arguing against it.
Rank hypotheses by what they add to the world model. A flaky test that points to clock skew needs fewer new moving parts than a theory that requires a fresh concurrency bug in the runtime. Reach for the smaller story first, then let instrumentation falsify it. If you skip that order, you pay in time and in credibility with the people waiting on a fix.
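That ordering can be made mechanical. The sketch below ranks candidate explanations for a flaky test by how many new assumptions each one adds to the world model, and pairs each with the instrumentation that could falsify it. The hypotheses, assumption counts, and falsifiers are illustrative, not taken from any real incident.

```python
# Sketch: rank competing failure hypotheses by how many new moving
# parts each story asks you to believe in. Contents are illustrative.

from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    name: str
    new_assumptions: list[str] = field(default_factory=list)  # entities the story adds
    falsifier: str = ""  # instrumentation that could rule the story out


candidates = [
    Hypothesis(
        name="clock skew between CI hosts",
        new_assumptions=["NTP drift on one runner"],
        falsifier="log wall-clock vs. monotonic time inside the test",
    ),
    Hypothesis(
        name="fresh concurrency bug in the runtime",
        new_assumptions=[
            "a data race in the runtime",
            "a scheduler change that exposes it",
            "our workload hitting the window",
        ],
        falsifier="bisect runtime versions under a stress harness",
    ),
]

# Smaller story first: fewest added entities gets the first round of checks.
for h in sorted(candidates, key=lambda h: len(h.new_assumptions)):
    print(f"{len(h.new_assumptions)} new assumption(s): {h.name} -> try: {h.falsifier}")
```

Nothing here replaces the instrumentation step; the sort only decides which falsifier you run first.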
In machine learning systems, training-serving skew, data contracts, and evaluation harnesses dominate early failure modes. Google’s Rules of Machine Learning put the point in product language. Rule #4 tells teams to keep the first model plain and harden the pipeline first, because simple features are easier to pipe correctly and a plain baseline gives you somewhere stable to stand while you test richer models. That is the razor as a methodological rule (what to build next), not a claim that the universe prefers small neural nets.
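A plain baseline in that spirit can be almost trivially small. The sketch below stands up a majority-class predictor, the kind of stable reference point any richer model has to beat before it earns its extra machinery; the labels and examples are made up for illustration.

```python
# Sketch: a majority-class baseline as the "plain first model".
# Data and labels are invented for the example.

from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that always emits the most common training label."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda example: most_common_label


train_labels = ["spam", "ham", "ham", "ham", "spam"]
predict = majority_baseline(train_labels)

held_out = [("cheap pills now", "spam"), ("lunch at noon?", "ham"), ("see you", "ham")]
accuracy = sum(predict(x) == y for x, y in held_out) / len(held_out)
print(f"baseline accuracy: {accuracy:.2f}")
```

The value is not the number itself but the stable floor: if a richer model cannot clear it through the same pipeline, the pipeline is the suspect, not the model.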
The through line is conservative bookkeeping about what you introduce. KISS is the blunter engineering habit on the same axis. Occam asks what you can leave out of the explanation. KISS asks what a tired engineer at 2 a.m. still understands without you on Zoom. Parsimony is a check on sprawl. It does not replace proof, measurement, or the slower work of saying what “necessity” means for your system.