Add WithContains, WithRegexp style options for Get and Watch operations #19667
Comments
I am interested in working on this; I'll see if I can get an etcd developer environment set up this weekend.
Quick note: this feature proposal has not been adopted by SIG-Etcd. I've added it as a discussion item for the next community meeting; I don't know what the costs of implementing this are, and it's not necessarily a feature we need for other use cases.
With the way data are stored in bbolt, it is much faster to do
Etcd is useful beyond the "kube-centric" view of the world ;) This makes it substantially more useful.
Indeed. While there is of course an added performance cost, it's significantly less than downloading large amounts of the database and processing it client side.
All of this just points to a need to evaluate the tradeoffs of the proposal before anyone works on implementing it.
From a project direction standpoint, implementing features like this seems counterproductive. We cannot simply add SQL-database features and expect to become competitive. Storage has never been etcd's primary strength, and fundamentally changing this is unlikely. Instead, we should emphasize etcd's core strengths: high availability and efficient watch capabilities, the latter being a key factor in Kubernetes' success.

Kubernetes faces a similar challenge in efficiently filtering keys based on arbitrary properties (e.g., label selectors). Rather than implementing direct key filtering, Kubernetes leverages its watch functionality: it uses watch for state replication, both to the API server and to its client-side local caches. This replication not only avoids round trips but also allows for building local, arbitrary indexes, leading to extremely efficient reads. For instance, Kubernetes supports 5,000-node clusters with 150,000 pods, each represented by a key. Each node efficiently lists only its assigned ~30 pods thanks to per-node indexing within the Kubernetes API server. I've even seen deployments six times that size, with 30,000-node clusters running on etcd.

While this approach is currently tailored to Kubernetes, we are working on a generic version for etcd (#19371). Could a solution like this address your needs?

Finally, I'd be very interested in better understanding the https://github.com/purpleidea/mgmt/ project and the challenges it faces.
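To make that pattern concrete, here is a rough Go sketch of a watch-backed local cache with one secondary index, assuming the `/something/<foo>/<bar>/` key layout from this issue. The `localCache` type, the `fieldOf` helper, and the index layout are invented for illustration; this is not an existing etcd or Kubernetes API, and index maintenance is simplified.

```go
// Illustrative only: a watch-plus-local-index sketch in the spirit of the
// Kubernetes pattern described above. All names here are made up.
package main

import (
	"context"
	"strings"
	"sync"

	clientv3 "go.etcd.io/etcd/client/v3"
)

type localCache struct {
	mu      sync.RWMutex
	byKey   map[string]string   // full replica of the watched prefix
	byField map[string][]string // secondary index: field value -> keys
}

// fieldOf stands in for whatever property you want to index on; here it
// naively takes the third path segment of the key.
func fieldOf(key string) string {
	parts := strings.Split(key, "/")
	if len(parts) > 3 {
		return parts[3]
	}
	return ""
}

func (c *localCache) run(ctx context.Context, cli *clientv3.Client, prefix string) error {
	// Seed the cache with a single range read...
	resp, err := cli.Get(ctx, prefix, clientv3.WithPrefix())
	if err != nil {
		return err
	}
	c.mu.Lock()
	for _, kv := range resp.Kvs {
		k := string(kv.Key)
		c.byKey[k] = string(kv.Value)
		c.byField[fieldOf(k)] = append(c.byField[fieldOf(k)], k)
	}
	c.mu.Unlock()

	// ...then keep it fresh by watching from the next revision. Reads can
	// now be served locally from byField, with no round trip to etcd.
	wch := cli.Watch(ctx, prefix, clientv3.WithPrefix(),
		clientv3.WithRev(resp.Header.Revision+1))
	for wresp := range wch {
		for _, ev := range wresp.Events {
			k := string(ev.Kv.Key)
			c.mu.Lock()
			if ev.Type == clientv3.EventTypeDelete {
				delete(c.byKey, k) // index cleanup elided for brevity
			} else {
				c.byKey[k] = string(ev.Kv.Value)
				c.byField[fieldOf(k)] = append(c.byField[fieldOf(k)], k)
			}
			c.mu.Unlock()
		}
	}
	return nil
}
```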
@serathius Hey Marek!
Quite right, although not arbitrary. It's more of a secondary filter on the substructure of the prefixed request.
I don't expect this kind of operation would prevent any of that. In effect, it behaves like a normal WithPrefix request, but with a post-lookup filtering step applied server side, to avoid handling all of that post-processing client side. Internally it could use the existing lookup mechanisms and then apply the filter before sending results to the client.
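For illustration only, a tiny sketch of what that post-lookup step might look like. This is not actual etcd server code; `filterKVs` and its placement in the request path are invented for this example.

```go
// Illustrative only: a post-lookup, pre-send filter step.
package main

import (
	"regexp"

	"go.etcd.io/etcd/api/v3/mvccpb"
)

// filterKVs keeps only the key-value pairs whose key matches re. The
// existing range lookup would run unchanged; this step would run just
// before the response is sent back to the client.
func filterKVs(kvs []*mvccpb.KeyValue, re *regexp.Regexp) []*mvccpb.KeyValue {
	out := kvs[:0] // reuse the backing array; no extra allocation
	for _, kv := range kvs {
		if re.Match(kv.Key) {
			out = append(out, kv)
		}
	}
	return out
}
```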
This is a neat approach, and something I might have to eventually do in mgmt. I haven't implemented it yet since, AIUI, it would require a large startup cost, and there's lots of complexity there. We do, however, make extensive use of Watches for all the patterns we're interested in. We'd obviously want the WithFoo options to apply to watches too, again because we wouldn't need to constantly filter client side.
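For context, this is roughly what the client-side workaround looks like today with the standard clientv3 Watch API; only the key pattern is invented for the example.

```go
// What clients have to do today: watch a broad prefix and drop
// non-matching events client side.
package main

import (
	"context"
	"fmt"
	"regexp"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func watchFiltered(ctx context.Context, cli *clientv3.Client) {
	re := regexp.MustCompile(`^/something/[^/]+/bar[0-9]+/`)
	for wresp := range cli.Watch(ctx, "/something/", clientv3.WithPrefix()) {
		for _, ev := range wresp.Events {
			if !re.Match(ev.Kv.Key) {
				continue // every non-matching event still crossed the wire
			}
			fmt.Printf("%s %q : %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}
```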
This is neat; I'll keep an eye on it, thanks for the link. However, I don't think it is a proper substitute for this kind of solution.
At the moment I've mostly been able to use key trickery and client-side filtering to solve my needs, but it's clear that etcd would benefit from a little more filtering power. For example, there are a number of attempts at writing an ORM layer for etcd which have either failed or gone stale, and I suspect this might be one way to unblock that possibility. Perhaps there is another approach which I haven't considered.

Looking at this from another angle, more and more people are using PostgreSQL where etcd would be a nicer solution. I am NOT suggesting etcd become a fully-featured relational database, but rather that it gain some fancier filtering powers. Thanks for reading and for your comments.
Based on my experience working on Postgres, here's a laundry list of the things we'd need for that:
Among other things, we'd need to make these indexes optional for users, since having them would at least double the size of etcd databases, and probably triple or quadruple the size. It would also triple the number of synchronous writes we do on each member, greatly increasing I/O usage, and sync write speed is already our primary performance bottleneck, so realistically we'd need to implement backend batch writing or something in bbolt as well. I don't really see this happening unless we get someone who wants to make adding this feature their PhD thesis, or some startup that wants to build an etcd-based tool that needs it; we're talking at least a year of work for a senior engineer, probably more.
@jberkus thanks for sharing your Postgres experience. One thing to note: etcd currently uses one in-memory index. That would reduce the complexity; however, we pay the cost at etcd bootstrap. Even with one index, etcd can take tens of seconds to rebuild it for large datasets.
GIN indexes are really slow and CPU-intensive to rebuild, and hard to parallelize. That's why I didn't suggest an in-memory possibility.
I think for @purpleidea's case, they probably do not care about performance. If the only concern is avoiding "downloading large amounts of the database and processing it client side", one could just do a full scan per request, and indexing would not be required. But I agree that would not be very useful in general, and building a fully performant feature would be too complex to take on.
What would you like to be added?
Querying and watching anything but simple key prefixes is not possible. This issue proposes adding new WithFoo(...) style options to make it possible.
The most basic Watch or Get filtering operation is done with WithPrefix: https://godocs.io/go.etcd.io/etcd/client/v3#WithPrefix
This is useful because clients can namespace their data (KV data) as follows:
```
/something/foo1/bar1/somethingelse = data1
/something/foo2/bar2/somethingelse = data2
/something/foo3/bar3/somethingelse = data3
```
If I want to query for entries that start with `foo` in the second segment, it's easy to do. But if I want to match on `bar` in the third segment, it's not possible. The only way is to write a more general query, which is more expensive, and then filter it client side (see the sketch below). It would be ideal if a slightly more advanced querying/watching system existed, so we would be able to have more precise key watches (and avoid erroneous events) as well as more efficient queries that don't need to send as much data across the wire to the client.
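For concreteness, here is that workaround as it looks today with the real clientv3 API, assuming the example keys above. Everything under `/something/` crosses the wire even though only the `bar2` entries are wanted.

```go
// Today's workaround: fetch the whole prefix, filter client side.
package main

import (
	"context"
	"fmt"
	"regexp"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func getBar2Entries(ctx context.Context, cli *clientv3.Client) error {
	resp, err := cli.Get(ctx, "/something/", clientv3.WithPrefix())
	if err != nil {
		return err
	}
	re := regexp.MustCompile(`^/something/[^/]+/bar2/`)
	for _, kv := range resp.Kvs {
		if re.Match(kv.Key) {
			fmt.Printf("%q : %q\n", kv.Key, kv.Value)
		}
	}
	return nil
}
```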
Some ideas include (see the hypothetical sketch below):
- WithContains(...): match keys containing a given substring
- WithRegexp(...): match keys against a regular expression
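Purely as a strawman, the options might be used like this. Neither WithContains nor WithRegexp exists in clientv3 today; the names, signatures, and semantics below are invented to make the proposal concrete, and the real API would be decided in review.

```go
// Hypothetical only: these options do not exist in clientv3.

// Server-side substring match on keys under a prefix:
resp, err := cli.Get(ctx, "/something/",
	clientv3.WithPrefix(), clientv3.WithContains("/bar2/"))

// Server-side RE2 match, applied to watch events before they are sent:
wch := cli.Watch(ctx, "/something/",
	clientv3.WithPrefix(), clientv3.WithRegexp(`^/something/[^/]+/bar[0-9]+/`))
```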
Obviously we need to make sure we pick simple enough matching systems, because we don't want to slow down the server-side computation, but these basics should be easy first goals.
I'd appreciate other ideas people have for other matching options, as well as ideas on this general proposal.
One more design note: I noticed that nobody has ever really implemented a proper "ORM" on top of etcd. I considered doing it, but realized the lack of this kind of feature makes it impossible and/or inefficient. So I think this could unlock quite a lot of use cases for etcd.
In my personal situation, this would make the queries that my https://github.com/purpleidea/mgmt/ project performs much more efficient.
Lastly, if someone is interested in writing this patch, I'd be happy to mentor it in my free time if I can.
Thanks!
Why is this needed?
Such a relatively simple feature would make etcd significantly more capable: more precise watches and server-side filtering, without shipping whole prefixes to the client.