Proposal: Local curl request caching in babashka/neil

Created: Oct 25, 2022 · Published: Nov 01, 2022 · Last modified: Apr 05, 2023

Simplicity vs convenience is a difficult balance to strike! When is complexity worth it?

I recently proposed a feature for babashka/neil that ended up being rejected. There are good reasons for the feature to be integrated, but it would also introduce complexity that needs to be understood by other devs and by neil's end users. Leaving it out is pragmatic and leaves the code in a simpler state, which is ultimately more important.

The github issue is here: Proposal: Local curl request caching · Issue #128 · babashka/neil

Background

neil is a babashka-based tool for managing clojure dependencies (really, for managing a deps.edn). It has other features as well, such as generating new projects from templates.

I had just finished up an open PR adding neil dep upgrade, which checks the versions in your local deps.edn against the latest counterparts on maven, clojars, or github, updating them if newer versions are found.

The upgrade process hits the maven/clojars/github APIs to check for the latest versions - those APIs are subject to rate limiting. Github is the limiting factor here - it allows only 60 hits per hour without authentication.
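(For reference, github will tell you exactly where you stand against that limit. Here's a quick check you can run with bb, using babashka.curl and cheshire, both of which ship with babashka - the numbers in the comment are just illustrative.)

```clojure
(require '[babashka.curl :as curl]
         '[cheshire.core :as json])

;; GitHub's /rate_limit endpoint does not count against your quota.
(-> (curl/get "https://api.github.com/rate_limit")
    :body
    (json/parse-string true)
    (get-in [:resources :core]))
;; => {:limit 60, :remaining 57, :reset 1666700000, :used 3}
```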

Adding test coverage for neil revealed this problem - working with neil in a dev capacity can quickly hit the github limit, which means the tests start to drag and fail for strange reasons.

The proposed solution is for each dev on the project to create a personal access token and supply env vars with the token and their github username when running the tests.

Some things that irk me

> New Contributor Overhead

If you are new to working with neil (as a contributor), it is very likely that you'll run into this.

This increases the overhead for new contributions to neil.

> Security

I'd prefer to have fewer tokens floating around in my name that can access github. Fortunately these tokens can be configured with limited access and an expiration, which mitigates some of the longer-term security issues.

Still, this increases the surface area/attack vectors.

> The rate-limit 403 surfaces as 'no remote found', which is misleading

This is the most solvable of these, though fixing it introduces similar code complexity to the cache solution (if not more), which is why I opted to propose the cache instead.

Still, a PR for this will probably save some poor soul some time.
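For the curious, the shape of that fix is small - something along these lines. This is a hedged sketch, not neil's actual code; the function name and endpoint here are just illustrative.

```clojure
(require '[babashka.curl :as curl])

;; Sketch: don't report "no remote found" until we've ruled out a rate limit.
(defn github-tags [repo]
  (let [resp (curl/get (str "https://api.github.com/repos/" repo "/tags")
                       {:throw false})]
    (cond
      (= 200 (:status resp))
      (:body resp)

      ;; GitHub's rate-limit 403s come back with x-ratelimit-remaining: 0
      (and (= 403 (:status resp))
           (= "0" (get-in resp [:headers "x-ratelimit-remaining"])))
      (throw (ex-info "github rate limit exceeded" {:repo repo :status 403}))

      :else
      nil))) ;; genuinely no remote found
```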

> Network noise/overhead

It irks me that the tools create so many more http requests than necessary - I feel like there's some responsibility on the tool creators not to blast these APIs with requests on every run, whether by the tests or by end users.

I could use some feedback on this opinion - I'm not sure if I'm overvaluing it or if this overhead is truly negligible at scale. Maybe this actually makes no difference, and there's no waste created in this way...

It feels like rate limits exist to protect services from accidental DDoS attacks and spam - maybe anything under that limit, or anything authenticated/attached to a user account, is fine and not something to avoid?

Still, I don't love the idea of blasting so many requests (and the resulting log entries) at these servers just b/c I wanted to run the test suite again.

My personal desired usage

My original proposal was a file-based cache with a 60-minute TTL per request.
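Roughly what I had in mind - this is a minimal sketch rather than the code from the proposal, and the cache directory, key scheme, and fetch! argument are all assumptions for illustration:

```clojure
(require '[clojure.java.io :as io]
         '[clojure.edn :as edn])

(def cache-dir (io/file (System/getProperty "user.home") ".cache" "neil"))
(def ttl-ms (* 60 60 1000)) ;; 60 minutes

(defn cached-request
  "Returns the cached response for cache-key if it is younger than the ttl;
  otherwise calls fetch! (a no-arg fn that performs the real http request),
  writes the result to disk, and returns it."
  [cache-key fetch!]
  (let [f (io/file cache-dir (str (hash cache-key) ".edn"))
        fresh? (and (.exists f)
                    (< (- (System/currentTimeMillis) (.lastModified f)) ttl-ms))]
    (if fresh?
      (edn/read-string (slurp f))
      (let [resp (fetch!)]
        (io/make-parents f)
        (spit f (pr-str resp))
        resp))))
```

Busting the cache then just means deleting that directory (or running a small command that wraps doing so).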

As a user of neil, I'd prefer to run it as often as I want, and I'd expect the results to be the same without making multiple requests.

If I want to bust the cache, I'll run a command for that, but otherwise, let me run it, realize my template isn't quite right, delete and recreate the project.

Or, let me upgrade my deps, find a problem, undo the upgrade, then upgrade deps individually.

You can do this now, it's just that it makes multiple requests to the api services every run, when it could make one request per dependency per hour.

> It occurred to me later: the tests run in the same process

So a simple (non-file-based) cache could reduce the repeat requests the tests make. I wrote a proof-of-concept branch for this, but did not open a PR.

Since writing that, it occurred to me that an even easier solution with the same semantics is to just use clojure's built-in memoize.
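Something like this - the lookup fn here is a hypothetical stand-in (neil's real version-lookup code is shaped differently), but it shows the semantics:

```clojure
(require '[babashka.curl :as curl]
         '[cheshire.core :as json])

;; Hypothetical lookup fn - one real request per call.
(defn latest-github-release* [repo]
  (-> (curl/get (str "https://api.github.com/repos/" repo "/releases/latest"))
      :body
      (json/parse-string true)
      :tag_name))

;; memoize caches by argument for the lifetime of the process, so a test run
;; makes at most one request per repo instead of one per test.
(def latest-github-release (memoize latest-github-release*))
```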

This would cut down the contributor overhead by making it possible to run the tests closer to 20 times per hour - it would cut back on the likelihood of hitting the rate limit while deving.

However, mitigating the problem like this is not likely to be worth it - this doesn't really solve the problem the way a persisted (file-based) cache would.

Cache complexity

Two hard things in programming: naming and cache invalidation.

Introducing a cache anywhere in a process can lead to confusion. Consumers (devs and end-users) would need to learn why the cache exists to understand what kinds of problems might come from it.

It also adds more code that is not simple to reason about.

In reality, no one has this problem

Github-based deps are rare, so reaching the 60 hits/hour limit as an end user is highly unlikely. It could happen, and if it happens to you, you can create a token like all the neil contributors had to.

The decision to leave this out of neil seems reasonable for the moment (though I'm not as sure in the long term). No one is complaining about the issue right now. Users likely don't have many github deps, and if they did, they'd need to run neil dep upgrade several times to hit the limit.

Maybe there is a problem?

It occurred to me while writing this that the --dry-run | fzf | <select upgrade> flow might be the quickest way an end user would hit the limit, because neil dep upgrade --dry-run hits the api services on every run.

How do other deps tools deal with this?

I suppose this must be normal for deps-management tools like this. Do people hit github's api limit when using yarn, or poetry? How do other tools deal with this?

