consume: cache batches, not fetches #423
Closed
+351
−540
Change the caching to cache topic-partition batches instead of whole fetch responses. Cache entries are smaller, their keys can be computed directly from the batch coordinates, and only the relevant batches are returned; the large fetch data blob no longer needs to be handled. With Infinispan caching, the topic-partition batches are distributed across the cache cluster, which will likely lead to loading more entries from neighbor nodes on clusters with more than one broker per AZ. Those entries are smaller, though: with fetch caching the worst case is loading a full fetch response from a neighbor. In the OMB benchmarking, the average fetch entry was 4 MiB, while the average batch entry is 500 KiB.
Cache sizing must be changed accordingly.
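The batch-keyed lookup described above can be sketched as follows. This is a minimal illustration, not code from this PR: `BatchCoordinate` and `BatchCache` are hypothetical names, and a plain map stands in for the Infinispan cache. The point is that a consumer fetch resolves to per-batch cache keys, so only the batches covering the requested offsets are loaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical key: a batch is addressed by its coordinates, so a lookup
// needs no fetch-level blob at all.
record BatchCoordinate(String topic, int partition, long baseOffset) {}

public class BatchCache {
    // In the real change this would be an Infinispan cache; a plain
    // concurrent map is enough to show the lookup shape.
    private final Map<BatchCoordinate, byte[]> cache = new ConcurrentHashMap<>();

    public void put(BatchCoordinate key, byte[] batch) {
        cache.put(key, batch);
    }

    // Return only the cached batches covering [fetchOffset, fetchOffset + maxBatches),
    // assuming (for simplicity) one batch per base offset.
    public List<byte[]> fetch(String topic, int partition, long fetchOffset, int maxBatches) {
        List<byte[]> result = new ArrayList<>();
        for (long offset = fetchOffset; offset < fetchOffset + maxBatches; offset++) {
            byte[] batch = cache.get(new BatchCoordinate(topic, partition, offset));
            if (batch == null) {
                break; // cache miss: the remaining batches must come from the broker
            }
            result.add(batch);
        }
        return result;
    }
}
```

Because each entry is a single batch rather than a full fetch response, a miss or a remote load from a neighbor node moves at most one batch-sized payload.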