Add WithContains, WithRegexp style options for Get and Watch operations #19667

Open
purpleidea opened this issue Mar 26, 2025 · 12 comments
@purpleidea
Contributor

What would you like to be added?

Querying and watching anything but simple key prefixes is not possible. This issue proposes adding new WithFoo(...)-style options to make it possible.

The most basic Watch or Get filtering operation is done with WithPrefix: https://godocs.io/go.etcd.io/etcd/client/v3#WithPrefix

This is useful because clients can namespace their data (KV data) as follows:

/something/foo1/bar1/somethingelse = data1
/something/foo2/bar2/somethingelse = data2
/something/foo3/bar3/somethingelse = data3

If I want to query for entries that start with foo in the second segment, it's easy to do. But if I want to match on bar in the third segment, it's not possible. The only option is to issue a more general query, which is more expensive, and then filter the results client side.
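To make the current workaround concrete, here is a minimal sketch of the client-side post-filtering step described above. The helper is hypothetical (not part of the etcd API); in real code the keys would come from a clientv3 Get on "/something/" with clientv3.WithPrefix().

```go
package main

import (
	"fmt"
	"strings"
)

// filterBySegment keeps only keys whose path segment at index idx
// (0-based, after splitting on "/") starts with want. This is the
// client-side filtering you are forced to do today: fetch the whole
// "/something/" prefix, then discard non-matches locally.
func filterBySegment(keys []string, idx int, want string) []string {
	var out []string
	for _, k := range keys {
		segs := strings.Split(strings.TrimPrefix(k, "/"), "/")
		if idx < len(segs) && strings.HasPrefix(segs[idx], want) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	// Hard-coded here for brevity; these would come from a prefix Get.
	keys := []string{
		"/something/foo1/bar1/somethingelse",
		"/something/foo2/bar2/somethingelse",
		"/something/foo3/baz3/somethingelse",
	}
	// Keep keys whose third segment starts with "bar".
	fmt.Println(filterBySegment(keys, 2, "bar"))
}
```

Note that every non-matching key still crossed the wire before being dropped, which is exactly the waste the proposal aims to eliminate.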

It would be ideal if a slightly more advanced querying/watching system existed, so that we could have more precise key watches (avoiding spurious events) as well as more efficient queries that don't need to send as much data across the wire to the client.

Some ideas include:

  • WithRegexp("") that works alongside the WithPrefix option.
  • WithContains("some string") that matches keys containing a certain pattern.
  • And so on...

Obviously we need to make sure we pick simple enough matching systems, because we don't want to slow down the server-side computation, but these basics should be easy first goals.

I'd appreciate other ideas people have for other matching options, as well as ideas on this general proposal.

One more design note: I noticed that nobody has ever really implemented a proper "ORM" on top of etcd. I considered doing it, but realized that the lack of this kind of feature makes it impossible and/or inefficient. So I think this could unlock quite a lot of use cases for etcd.

In my personal situation, this would make the queries that my https://github.com/purpleidea/mgmt/ project performs much more efficient.

Lastly, if someone is interested in writing this patch, I'd be happy to mentor it in my free time if I can.

Thanks!

Why is this needed?

Such a simple feature would make etcd vastly more capable and powerful.

@townsag

townsag commented Apr 4, 2025

I am interested in working on this; I'll see if I can get an etcd developer environment set up this weekend.

@jberkus

jberkus commented Apr 10, 2025

Quick note: this feature proposal has not been adopted by SIG-Etcd. I've added it as a discussion item for the next community meeting; I don't know what the costs of implementing this are, and it's not necessarily a feature we need for other use cases.

@siyuanfoundation
Contributor

With the way data is stored in bbolt, WithPrefix is much faster.
For WithContains or WithRegexp, we would need to search through the whole db. @purpleidea have you considered the performance impact of this feature?

@purpleidea
Contributor Author

it's not necessarily a feature we need for other use cases

Etcd is useful beyond the "kube-centric" view of the world ;) This makes it substantially more useful.

@purpleidea
Contributor Author

have you considered the performance impact of this feature?

Indeed, while of course there is an added performance cost, it's significantly cheaper than downloading large amounts of the database and processing it client side. WithContains should be the primary goal; WithRegexp should be a secondary goal.

@jberkus

jberkus commented Apr 10, 2025

All of this just points to a need to evaluate the tradeoffs of the proposal before anyone works on implementing it.

@serathius
Member

Your request for WithFoo(...) style options to enable more flexible querying and watching beyond simple key prefixes is understandable. This essentially asks for a "WHERE" clause to enable arbitrary operations on keys. While standard in SQL databases, this is atypical for key-value stores. KV stores typically limit key access to prefix-based range requests, requiring users to structure their keys hierarchically. This limitation, however, offers benefits like improved horizontal scalability through sharding.

Implementing features like WithRegexp and WithContains necessitates considering the entire keyspace. To avoid full scans, case-specific indexes (like a prefix tree for regex and GIN for contains) would be required. While technically feasible, enabling these by default would significantly impact user performance. Introducing optional index configuration and management, similar to SQL databases, would add another layer of complexity.

From a project direction standpoint, pursuing this path seems counterproductive. We cannot simply add SQL database features and expect to become competitive. Storage has never been etcd's primary strength, and fundamentally changing this is unlikely.

Instead, we should emphasize etcd's core strengths: high availability and efficient watch capabilities, the latter being a key factor in Kubernetes' success. Kubernetes faces a similar challenge in efficiently filtering keys based on arbitrary properties (e.g., label selectors). Rather than implementing direct key filtering, Kubernetes leverages its watch functionality.

Kubernetes uses watch for state replication, both to the API server and its client-side local caches. This replication not only avoids round trips but also allows for building local, arbitrary indexes, leading to extremely efficient reads. For instance, Kubernetes supports 5,000-node clusters with 150,000 pods, each represented by a key. Each node efficiently lists only its assigned 30 pods thanks to per-node indexing within the Kubernetes API server. I've even seen deployments six times that size, with 30,000-node clusters running on etcd.
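The watch-fed local index pattern described above can be sketched in a few lines. The types and field names here are illustrative only (not the Kubernetes or etcd API): watch events are folded into a local cache that maintains a secondary index from node to pod keys, so a read like "pods on node N" becomes a map lookup instead of a server-side scan.

```go
package main

import "fmt"

// event is a simplified watch event: a pod key and the node it is
// scheduled to. An empty node stands in for a delete event.
type event struct {
	key  string
	node string
}

// cache is the client-side replica plus a per-node secondary index.
type cache struct {
	byNode map[string]map[string]struct{} // node -> set of pod keys
	nodeOf map[string]string              // pod key -> current node
}

func newCache() *cache {
	return &cache{
		byNode: map[string]map[string]struct{}{},
		nodeOf: map[string]string{},
	}
}

// apply folds one watch event into the cache, keeping the index fresh.
func (c *cache) apply(e event) {
	if old, ok := c.nodeOf[e.key]; ok {
		delete(c.byNode[old], e.key) // remove stale index entry
	}
	if e.node == "" { // deletion
		delete(c.nodeOf, e.key)
		return
	}
	c.nodeOf[e.key] = e.node
	if c.byNode[e.node] == nil {
		c.byNode[e.node] = map[string]struct{}{}
	}
	c.byNode[e.node][e.key] = struct{}{}
}

// podsOn answers "how many pods run on this node" from the local index,
// with no round trip to the server at all.
func (c *cache) podsOn(node string) int { return len(c.byNode[node]) }

func main() {
	c := newCache()
	c.apply(event{key: "/pods/a", node: "node-1"})
	c.apply(event{key: "/pods/b", node: "node-1"})
	c.apply(event{key: "/pods/c", node: "node-2"})
	c.apply(event{key: "/pods/a", node: ""}) // pod deleted
	fmt.Println(c.podsOn("node-1"), c.podsOn("node-2")) // 1 1
}
```

The trade-off, raised later in this thread, is that the replica and index must be rebuilt at startup, which is where the bootstrap cost comes from.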

While this approach is currently tailored to Kubernetes, we are working on a generic version for etcd (#19371). Could a solution like this address your needs?

Finally, I'd be very interested in better understanding the https://github.com/purpleidea/mgmt/ project and the challenges it faces.

@purpleidea
Contributor Author

@serathius Hey Marek!

This essentially asks for a "WHERE" clause to enable arbitrary operations on keys.

Quite right, although not arbitrary. It's more of a secondary filter on the substructure of the prefixed request.

This limitation, however, offers benefits like improved horizontal scalability through sharding.

I don't expect this kind of operation would prevent any of that. In effect, it's as normal as a WithPrefix request, but it applies a post-lookup filtering step server side, to avoid having to do all of that post-processing client side. Internally it could use the existing lookup mechanisms and then apply the filter before sending the results to the client.

Kubernetes uses watch for state replication, both to the API server and its client-side local caches.

This is a neat approach, and something I might eventually have to do in mgmt. I haven't implemented it yet since, AIUI, it would incur a large startup cost, and there's a lot of complexity there. We do, however, make extensive use of Watches for all the patterns we're interested in. We'd obviously want the WithFoo options to apply to watches too, again so we wouldn't need to constantly filter client side.

we are working on a generic version for etcd

This is neat; I'll keep an eye on it, thanks for the link. However, I don't think it is a proper substitute for this kind of solution.

Finally, I'd be very interested in better understanding the https://github.com/purpleidea/mgmt/ project and the challenges it faces.

At the moment I've been able to mostly use key trickery and client-side filtering to solve my needs, but it's clear that etcd would benefit from a little more filtering power. For example, there are a number of attempts at writing an ORM layer for etcd which have either failed or become stale, and I suspect this might be one way to unblock that possibility. Perhaps there is another approach which I haven't considered.

Looking at this from another angle, more and more people are using PostgreSQL in places where etcd would be a nicer solution. I am NOT suggesting etcd become a fully-featured relational database, but rather that it gain some fancier filtering powers.

Thanks for reading and for your comments.

@jberkus

jberkus commented Apr 17, 2025

Implementing features like WithRegexp and WithContains necessitates considering the entire keyspace. To avoid full scans, case-specific indexes (like a prefix tree for regex and GIN for contains) would be required. While technically feasible, enabling these by default would significantly impact user performance. Introducing optional index configuration and management, similar to SQL databases, would add another layer of complexity.

Based on my experience working on Postgres, here's a laundry list of the things we'd need for that:

  • tooling and commands to support secondary indexes
  • machinery to replicate those indexes between cluster members
  • ability to restore those indexes from backup
  • compaction for the secondary indexes (which will bloat much faster than our current primary index)
  • lots of documentation for all this, including on how to write queries

Among other things, we'd need to make these indexes optional for users, since having them would at least double the size of etcd databases, and probably triple or quadruple it. It would also triple the number of synchronous writes we do on each member, greatly increasing IO usage. Sync write speed is already our primary performance bottleneck, so realistically we'd need to implement backend batch writing or something in bbolt as well.

I don't really see this happening unless we get someone who wants to make adding this feature their PhD thesis or some startup who wants to build an etcd-based tool that needs it; we're talking at least a year of work for a senior engineer, probably more.

@serathius
Member

@jberkus thanks for sharing your Postgres experience. One thing to note: etcd currently uses one in-memory index. That would reduce the complexity; however, we pay the cost at etcd bootstrap. Even with one index, etcd can take tens of seconds to rebuild it for large datasets.

@jberkus

jberkus commented Apr 18, 2025

GIN indexes are really slow and CPU-intensive to rebuild, and also hard to parallelize. That's why I didn't suggest an in-memory possibility.

@siyuanfoundation
Contributor

I think for @purpleidea's case, they probably do not care about performance. If the only concern is avoiding "downloading large amounts of the database and processing it client side", one could just do a full scan per request, and indexing would not be required. But I agree that would not be very useful in general, and building a fully performant feature would be too complex to take on.
