Summary
A user asks whether multiple self-hosted satellites can share one name so that builds are load balanced between them, and learns that this is not yet possible because the buildkit process Earthly uses internally cannot share its cache with another process. The user would prefer an automatically distributed setup, ideally with dynamic autoscaling, so that Kubernetes node reboots are not a concern. The suggested workarounds are to use a larger instance or to partition targets across specific satellites; there is no limit on the number of self-hosted satellites as long as each has a unique name and address. Another participant, kieran.mann, considers feeding a git branch name into a maglev hash to encourage cache re-use, which should be doable client-side but would require wrapping earthly. brandon notes that some users create a satellite per pull request or per user, or round-robin across a pool of satellites, although in his view this is not the best use of cache.
brandon
Some users have done things like that - a satellite per-PR, or per user. Or round-robin on a pool of satellites. I don’t think it’s the best use of cache, but it seems to be worth the tradeoff for some people
kieran.mann
I was thinking about how I might tackle this if I run into the need to scale horizontally. Feeding a git branch name into a maglev hash to encourage cache re-use could be interesting. Should be do-able from the client side although would require wrapping earthly
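A minimal sketch of that idea, assuming a fixed pool of uniquely named satellites. The satellite names, the table size of 251, and all function names here are hypothetical, and this is a simplified Maglev-style lookup table rather than anything Earthly provides. It maps a branch name to one satellite so that repeated builds of the same branch land on the same cache:

```python
import hashlib


def _hash(value: str, salt: str) -> int:
    """Stable 64-bit hash; the salt yields independent hash functions."""
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_maglev_table(satellites, table_size=251):
    """Build a Maglev-style lookup table mapping slots to satellite names.

    table_size must be prime and larger than the number of satellites,
    so that every satellite's permutation visits every slot.
    """
    if not satellites:
        raise ValueError("at least one satellite is required")
    names = sorted(satellites)  # order-independent result
    offsets = [_hash(n, "offset") % table_size for n in names]
    skips = [_hash(n, "skip") % (table_size - 1) + 1 for n in names]
    next_index = [0] * len(names)
    table = [None] * table_size
    filled = 0
    while True:
        # Satellites take turns claiming their next preferred free slot.
        for i, name in enumerate(names):
            slot = (offsets[i] + next_index[i] * skips[i]) % table_size
            while table[slot] is not None:
                next_index[i] += 1
                slot = (offsets[i] + next_index[i] * skips[i]) % table_size
            table[slot] = name
            next_index[i] += 1
            filled += 1
            if filled == table_size:
                return table


def pick_satellite(branch: str, table) -> str:
    """Return the satellite assigned to this branch name."""
    return table[_hash(branch, "key") % len(table)]


def earthly_command(branch: str, target: str, table):
    """Build (but do not run) the wrapped earthly invocation."""
    return ["earthly", "--sat", pick_satellite(branch, table), target]
```

A wrapper could read the branch with `git rev-parse --abbrev-ref HEAD`, call `earthly_command(branch, "+build", table)`, and pass the result to `subprocess.run`. Adding or removing a satellite should move only a fraction of branches to a different satellite, which is the property that makes this preferable to a plain modulo for cache re-use.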
brandon
I don't think so, as long as they each have a unique name and address
me1548
is there a limit to the number of satellites I can host in my homelab?
brandon
Yeah, I agree it would be better. It is ultimately what we'd like to get to in the future (and we've put a lot of thought into how we might make it work). The reason it doesn't work now is that the buildkit process we use internally can't handle sharing its cache with another process.
me1548
(not to mention having dynamic autoscaling)
me1548
ah, makes sense, but is slightly annoying, I'd like it to be a "no thinking auto distributed" setup so that I don't have to worry about k8s nodes rebooting
brandon
I wrote a bit about this in a best-practice guide: https://docs.earthly.dev/earthly-cloud/satellites/best-practices
brandon
Unfortunately not yet, due to technical limitations.
What most people do is just use a larger instance, or partition your workload so that subsets of targets are handled by specific satellites.
As a simple example, if you use BUILD statements in your Earthfile, those are good candidates to partition on different satellites, i.e.
```
BUILD +thing-1
BUILD +thing-2
```
could turn into
```
earthly --sat thing-1-sat +thing-1
earthly --sat thing-2-sat +thing-2
```
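The partitioning above could also be scripted. The following is only a sketch: the target-to-satellite mapping and the function names are hypothetical, and the only earthly usage it relies on is the `earthly --sat <name> <target>` form shown in the example.

```python
import subprocess

# Hypothetical mapping of Earthfile targets to the satellites that build them.
TARGET_TO_SATELLITE = {
    "+thing-1": "thing-1-sat",
    "+thing-2": "thing-2-sat",
}


def partitioned_commands(mapping):
    """Return one earthly invocation per target, pinned to its satellite."""
    return [["earthly", "--sat", sat, target] for target, sat in mapping.items()]


def run_partitioned(mapping):
    """Start all builds concurrently and return their exit codes in order."""
    procs = [subprocess.Popen(cmd) for cmd in partitioned_commands(mapping)]
    return [proc.wait() for proc in procs]
```

Calling `run_partitioned(TARGET_TO_SATELLITE)` would start both builds at once, each on its own satellite, so each satellite keeps a warm cache for its subset of targets.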
me1548
is it possible to make multiple self-hosted satellites with the same name and have builds get load balanced between them?