feat(blooms): Blooms/v2 encoding multipart series #13093

Merged Jun 6, 2024 (39 commits)

Member

@owen-d owen-d commented May 31, 2024

V2 bloom format supporting multipart blooms

This PR includes a lot of refactoring in pursuit of multipart blooms. We've found that certain series can easily produce blooms that are too large to handle reasonably. This hurts in a few ways:

  1. Creating blocks from arbitrarily large streams (tens of thousands of chunks per day) was causing OOMs in the bloom-compactors. This was made more noticeable because the bloom compactor doesn't stream data as well as it could.
  2. Large individual blooms increased the page sizes on the bloom-gateways (read path). This also exposed us to OOMs under heavy load.

We initially addressed this with the stopgap #12796, which abandoned bloom creation once a bloom hit a configurable size. This helped stabilize our read & write paths, but it meant we stopped ingesting a lot of data into the blooms themselves.

This PR introduces a new V2 version for bloom blocks which encodes large blooms into multiple sections. Keeping the size of any individual bloom manageable & consistent makes memory usage less volatile on both the read & write paths.

I'll be testing this out on some of our clusters shortly.

/cc @honganan for awareness -- this adds a new v2 version to the bloom blocks binary format.

owen-d added 26 commits May 13, 2024 16:08
Signed-off-by: Owen Diehl <ow.diehl@gmail.com>
…m metrics

…uiting queries on each bloom

@owen-d owen-d requested a review from a team as a code owner May 31, 2024 21:01
…multipart-series

@owen-d owen-d changed the title Blooms/v2 encoding multipart series feat(blooms): Blooms/v2 encoding multipart series May 31, 2024
@owen-d owen-d force-pushed the blooms/v2-encoding-multipart-series branch from dc4ba83 to cd49144 Compare May 31, 2024 21:28
@owen-d owen-d force-pushed the blooms/v2-encoding-multipart-series branch from cd49144 to be6f549 Compare May 31, 2024 21:41
}

func (b *BlockQuerierIter) Next() bool {
	return b.LazySeriesIter.Next()
Contributor

nit: I think we can remove this since the method will be inherited.

@owen-d owen-d force-pushed the blooms/v2-encoding-multipart-series branch from 4aa6e83 to 5c6d4ae Compare June 4, 2024 18:59
…multipart-series

@owen-d owen-d force-pushed the blooms/v2-encoding-multipart-series branch from 9c263fb to b682a8c Compare June 4, 2024 19:57
Collaborator

@slim-bean slim-bean left a comment

LGTM

4 participants