If you’ve operated a software service, you will have inevitably hit one of the following problems:
A network call with a missing timeout. Some kind of remote procedure call or other networking call is blocked waiting … forever, because there’s no timeout configured on the call.
Missing time-to-live (TTL). Some data that was intended to be ephemeral never had a TTL explicitly set on it and wasn’t removed by normal means, so its unexpected presence bit you.
A queue with no explicit size limit. A queue somewhere doesn’t have an explicitly configured upper bound on its size, the producers consistently outpace the consumers, and the queue eventually grows to a size you never expected.
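To make the three cases concrete, here is a minimal Python sketch of what setting each limit explicitly might look like, using only the standard library. The helper names (`recv_with_timeout`, `TTLStore`, `try_enqueue`) are made up for illustration, and the behavior on hitting a limit (returning `None` or `False`) is just one possible choice.

```python
import queue
import socket
import time


# 1. Timeout: a blocking read that gives up instead of waiting forever.
def recv_with_timeout(sock, timeout_s):
    sock.settimeout(timeout_s)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None  # the caller decides how to handle the timeout


# 2. TTL: every entry carries an expiry time and is dropped once it passes.
class TTLStore:
    def __init__(self, ttl_s, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}     # key -> (value, expires_at)

    def put(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl_s)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: remove it lazily on read
            return default
        return value


# 3. Queue size: a full queue rejects new items instead of growing unbounded.
def try_enqueue(q, item):
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False


# The bound is chosen explicitly by the caller.
jobs = queue.Queue(maxsize=100)
```

In each case the limit is an argument someone had to think about and pass in, which is exactly the step that tends to get forgotten.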
Unfortunately, the only good solution to these problems is diligence. We have to remember to explicitly set timeouts, TTLs, and queue sizes. There are two reasons why libraries leave this to us:
It’s impossible for a library author to define a reasonable default for these values. Appropriate timeouts, TTLs, and queue sizes vary enormously from one use case to another. There simply isn’t a “reasonable” value to pick without picking one so large that it’s basically the same as being unbounded.
Forcing users to always specify values is a lousy experience for new users. Library authors could make these values required rather than optional. However, this makes the library more annoying to pick up: it’s an extra step that forces new users to make a decision they don’t really want to think about. They probably don’t even know what a reasonable value is when they’re first setting out.
I think forcing users to specify these limits would lead to more robust software, but I can see many users complaining about being forced to set these limits rather than defaulting to infinity. At least, that’s my guess about why library authors don’t do it.
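As an illustration of what "forcing users to specify" could look like, here is a small Python sketch of a wrapper whose size limit is a required keyword-only argument with no default. `WorkQueue`, `max_size`, and `offer` are hypothetical names, not an existing library API. The sketch also rejects zero and negative sizes, because Python's `queue.Queue` treats a `maxsize` of zero or less as "infinite".

```python
import queue


class WorkQueue:
    """Hypothetical wrapper: max_size has no default, so callers must pick one."""

    def __init__(self, *, max_size: int):
        # queue.Queue treats maxsize <= 0 as unbounded, so refuse those values.
        if max_size <= 0:
            raise ValueError("max_size must be a positive integer")
        self._q = queue.Queue(maxsize=max_size)

    def offer(self, item) -> bool:
        """Try to enqueue without blocking; return False if the queue is full."""
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            return False

    def __len__(self) -> int:
        return self._q.qsize()


# WorkQueue() raises TypeError: the caller has to make the decision up front.
q = WorkQueue(max_size=2)
```

The cost is the one described above: a newcomer who just wants a queue now has to come up with a number before writing a single line of real code.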