Use size of events to batch Watch responses #18975
base: main
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: serathius. The full list of commands accepted by this bot can be found here; the pull request process is described here.
Codecov Report: All modified and coverable lines are covered by tests ✅ (plus 17 files with indirect coverage changes).

@@            Coverage Diff             @@
##             main   #18975      +/-   ##
==========================================
- Coverage   68.82%   68.76%   -0.07%
==========================================
  Files         420      420
  Lines       35621    35627       +6
==========================================
- Hits        24517    24498      -19
- Misses       9683     9700      +17
- Partials     1421     1429       +8
Split out #18976.
/retest
Part of #16839
Instead of batching by number of revisions (I do not understand its purpose), batch by size of events to avoid resync generating huge responses. For example, when a Watch opens on a very old revision (say 5 minutes behind, but still within the compaction window), resync will load the whole history and try to push it in one response. If the watch is congested or the client is not reading, this can cause huge spikes in memory.
Implements #18965 without needing to reconfigure clients.
This basically implements the watch fragment feature without needing to configure it on the client side. We still need to maintain the watch Atomic property, so we cannot split a revision between responses. This is simple: if we start a revision, we continue it even if we are over the limit (see the sketch below).
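A minimal sketch of this batching rule, assuming a hypothetical Event type and sizes (not etcd's actual mvccpb types or watch server code): events accumulate until the byte budget is hit, but the batch is only flushed at a revision boundary, so a revision that is already started can push a response over the budget.

```go
package main

import "fmt"

// Event stands in for etcd's mvccpb.Event; Rev and Size are
// illustrative fields, not the real API.
type Event struct {
	Rev  int64
	Size int
}

// batchBySize splits events (sorted by revision) into batches whose
// total size stays near maxBytes, but never splits one revision across
// batches: once a revision is started, all its events stay in the
// current batch even if the budget is exceeded.
func batchBySize(events []Event, maxBytes int) [][]Event {
	var batches [][]Event
	var cur []Event
	curBytes := 0
	for i, ev := range events {
		sameRev := i > 0 && events[i-1].Rev == ev.Rev
		// Flush only at a revision boundary to keep watch responses atomic.
		if len(cur) > 0 && curBytes+ev.Size > maxBytes && !sameRev {
			batches = append(batches, cur)
			cur, curBytes = nil, 0
		}
		cur = append(cur, ev)
		curBytes += ev.Size
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	mb := 1 << 20
	evs := []Event{
		{Rev: 1, Size: mb}, // fits
		{Rev: 2, Size: mb}, // fills the 2 MB budget
		{Rev: 2, Size: mb}, // same revision, so the batch overflows to 3 MB
		{Rev: 3, Size: mb}, // new revision, starts a fresh batch
	}
	for i, b := range batchBySize(evs, 2*mb) {
		fmt.Printf("batch %d: %d events\n", i, len(b))
	}
	// Output:
	// batch 0: 3 events
	// batch 1: 1 events
}
```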
Proposing to use the same response size as fragmenting uses: the maximum request size plus gRPC overhead (2 MB). As we cannot split revisions, without fragmenting the maximum response size is 2x that (4 MB).
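For reference, a sketch of the size-budget arithmetic behind the 2 MB figure; the constant names are illustrative, not etcd's actual identifiers (etcd's default --max-request-bytes is 1.5 MiB):

```go
// Illustrative size-budget arithmetic; names are hypothetical.
const (
	maxRequestBytes   = 1536 * 1024                         // 1.5 MiB, the default --max-request-bytes
	grpcOverheadBytes = 512 * 1024                          // headroom for gRPC framing
	maxBatchBytes     = maxRequestBytes + grpcOverheadBytes // 2 MiB per response
	// Since one revision's events are never split, a response can exceed
	// the budget by up to one more budget's worth: ~4 MiB worst case.
)
```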
For the etcd overload scenario (>=100 KB object size) it significantly reduced memory use (5 GB -> 1.8 GB) while improving latency (7.7 s -> 6.9 s). For all other scenarios it maintains similar memory use and latency.
Tested using
go run ./tools/benchmark/main.go watch-latency --watch-per-stream 1000 --streams 1 --put-total 200 --val-size 100000