Rate Limiting Is a Product Decision, Not Just an Infrastructure One

Teams often implement rate limits like pure backend plumbing. In reality, those limits shape user experience, customer trust, and who gets blocked when the system is under pressure.

This piece argues that rate limiting should be treated as part of product behavior, not only as a technical safeguard, because customers experience throttling as policy.

Jay McBride

Software Engineer

Introduction

Rate limiting gets treated like a backend checkbox far too often.

Protect the system. Prevent abuse. Keep one noisy client from crushing everyone else. All reasonable.

But the moment a real customer hits a limit, the technical safeguard becomes product behavior. It stops being a quiet implementation detail and starts becoming part of how the service feels.

That is why bad rate limiting decisions create so much frustration. The engineering logic may be valid while the user experience is still awful.

This article is for teams building APIs, admin tools, and products with shared resources where throttling is inevitable. The point is not whether rate limits are necessary. It is whether you are treating them like a system policy with user consequences.

The Core Judgment: A Rate Limit Is a Promise About Fairness, Not Just a Defense Mechanism

When a customer gets a 429, the system is telling them something about expectations:

  • how much use is acceptable
  • what kind of burst behavior is tolerated
  • who gets priority when resources are tight

Those are product decisions, whether the team labels them that way or not.

This is why one-size-fits-all rate limiting often feels sloppy. Different endpoints, customers, and workflows do not all carry the same cost or urgency. Treating them identically is easy to implement, and it often feels wrong to the people on the receiving end.
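One way to make those choices visible is a token bucket, where the tolerated burst and the acceptable sustained rate are two separate, named parameters. The following is a minimal sketch, assuming in-memory state in a single process; the `TokenBucket` class and its parameter names are illustrative and not taken from any particular library.

```python
import time
from typing import Optional


class TokenBucket:
    """In-memory token bucket (illustrative sketch, single process only).

    capacity    -> how large a burst is tolerated
    refill_rate -> sustained requests per second considered acceptable
    """

    def __init__(self, capacity: float, refill_rate: float,
                 now: Optional[float] = None) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity  # start full so a fresh client can burst
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        """Spend `cost` tokens and return True if the request fits, else False."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With this shape, an interactive endpoint and a bulk import can share the same code and still get different treatment, for example a small `capacity` with a fast `refill_rate` for one and a large `capacity` with a slow `refill_rate` for the other. Which numbers to pick is the product decision; the code only enforces it.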

How This Breaks in the Real World

The naive implementation is common:

  • same limit for every customer
  • same window for every endpoint
  • generic error response
  • no visibility for users into what happened

Then predictable friction shows up.

A batch import hits limits intended for interactive usage. An internal admin workflow gets throttled by rules designed for public traffic. A high-value customer gets blocked during a perfectly legitimate burst because the system does not distinguish abuse from growth.

Technically, the limiter worked.

Operationally, the product made poor decisions about who to slow down and why.

A Real Example: Abuse Prevention That Punished the Wrong Users

I watched a team add strict request caps after a scraping incident. Totally understandable.

The problem was that the new limits punished legitimate customers more than the abusive traffic they were trying to stop. Bulk workflows slowed down, integrations started failing in bursts, and support had no useful language beyond “please wait and try again.”

The limiter was real. The policy behind it was immature.

Once the team split limits by endpoint type, exposed clearer headers, and handled privileged/internal workflows differently, the system became both safer and less hostile.

That is what happens when rate limiting stops being treated like a firewall rule and starts being designed like part of the product.
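Splitting limits by endpoint type and by kind of caller can be as simple as a lookup table that names each policy and records why it exists. The sketch below assumes hypothetical endpoint classes (`interactive`, `bulk`, `internal`) and customer tiers (`standard`, `enterprise`); every number is a placeholder, not a recommendation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LimitPolicy:
    burst: int       # requests tolerated in a short spike
    per_minute: int  # sustained requests per minute
    reason: str      # what the limit protects; reusable in docs and support replies


# Keys are (endpoint_class, tier); "*" matches any tier.
# All classes, tiers, and numbers here are hypothetical.
POLICIES: Dict[Tuple[str, str], LimitPolicy] = {
    ("interactive", "*"): LimitPolicy(20, 120, "keeps the UI responsive for everyone"),
    ("bulk", "standard"): LimitPolicy(200, 300, "bulk jobs are bursty by design"),
    ("bulk", "enterprise"): LimitPolicy(1000, 1500, "contracted import volume"),
    ("internal", "*"): LimitPolicy(500, 3000, "trusted admin workflows"),
}

DEFAULT_POLICY = LimitPolicy(10, 60, "conservative default for unclassified traffic")


def policy_for(endpoint_class: str, tier: str) -> LimitPolicy:
    """Return the most specific policy: exact tier, then any tier, then default."""
    for key in ((endpoint_class, tier), (endpoint_class, "*")):
        if key in POLICIES:
            return POLICIES[key]
    return DEFAULT_POLICY
```

Keeping a `reason` next to each limit is a deliberate choice: it gives support and documentation something more useful to say than "please wait and try again."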

What I Would Do Instead

I want rate limiting to answer a few questions clearly:

  • what behavior are we protecting against?
  • which workflows are bursty by design?
  • which users or systems deserve different treatment?
  • what message does the client receive when throttled?

Then I want the limiter to reflect reality, not only implementation convenience.

That often means:

  • different limits for different endpoint classes
  • clearer retry guidance, such as a Retry-After header on 429 responses
  • visibility in docs and responses
  • exceptions for trusted internal or enterprise workflows
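A throttled response can carry most of that guidance by itself. Here is a hedged sketch of a helper that builds one, independent of any web framework. `Retry-After` is a standard HTTP header; the `X-RateLimit-*` names are a widely used convention rather than a standard, and the function name, body fields, and documentation URL are assumptions made for illustration.

```python
import json
import math
from typing import Dict, Tuple


def throttled_response(limit: int, retry_after_seconds: float, scope: str,
                       docs_url: str) -> Tuple[int, Dict[str, str], str]:
    """Build (status, headers, body) for a request that hit a rate limit."""
    # Round up so a client that waits exactly this long is not rejected again.
    wait = max(1, math.ceil(retry_after_seconds))
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(wait),          # standard header, delay in seconds
        "X-RateLimit-Limit": str(limit),   # common convention, not a standard
        "X-RateLimit-Remaining": "0",
    }
    body = json.dumps({
        "error": "rate_limited",
        "message": (
            f"Limit of {limit} requests reached for {scope}. "
            f"Retry in {wait} seconds."
        ),
        "scope": scope,                    # which policy was hit
        "retry_after_seconds": wait,
        "docs": docs_url,                  # placeholder link to published limits
    })
    return 429, headers, body
```

For example, `throttled_response(120, 12.3, "interactive", "https://example.com/docs/rate-limits")` tells the client which limit it hit, how long to wait, and where the rule is documented, which is the difference between a policy and a generic error.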

The goal is not to eliminate friction. It is to make the friction intentional.

Closing

Rate limiting is not merely infrastructure hygiene.

It is part of how your product governs access, fairness, and recovery under load.

If the rules are arbitrary, users will experience them as arbitrary. If the errors are vague, customers will experience the system as hostile.

Good limiters protect the platform.

Great ones protect the platform without making legitimate users feel like the problem.

About the Author
Jay McBride

Software engineer with 20 years building production systems and mentoring developers. I write about the tradeoffs nobody mentions, the decisions that break at scale, and what actually matters when you ship. If you've already seen the AI summaries, you're in the right place.
