Implement the data structure and partial commands of the Redis stream #745
Can you add some design documents so that people can understand your design ideas :)
Thanks to @torwig for the great contribution. For the replication part, Kvrocks replicates the data after writing it into the DB, so we needn't do anything unless there is a special case. As for the new column family, I prefer keeping fewer column families, but feel free to add one if we MUST.
@caipengbo Yes, sure, I will :)
@git-hulk I mean that if I run
Yes, you are right. Kvrocks' default replication only sends the data to replicas, so we need to recognize those special write batches and wake up the waiting clients.
@caipengbo Added design notes.
Many thanks to @torwig for the detailed explanation and for bringing this good feature to the Kvrocks community. We will take a look soon.
I took a brief look and the overall design looks good to me. There are two small issues in the internal design:
Thank you for your review. I'll fix the issues.
Yes, I mean we can also save the number of encoded values. As in this example, the encoded value can be
@git-hulk Fixed possible crashes caused by accessing non-existent command arguments by index. Created a column family for streams. I will also use it for replication purposes, to unblock blocked readers.
I've got your idea about writing the number of strings to the encoded value. However, I didn't understand how this information can help us. I mean, if we know that there are 2 strings encoded, how can this number help in the decoding process? Could you please provide an example?
I thought about this point again. The current implementation is OK since a C++ string doesn't depend on '\0' at all, so using the string length to determine where a value ends is fine. Sorry that I misunderstood the implementation. Just disregard it.
@git-hulk Added batch handling on replicas so that the respective XADD unblocks clients that were blocked by the XREAD command.
Such great work! I'm not familiar with Redis streams, so I just commented on a few minor code smells.
Thanks for your great contribution, I will try and review the PR in a few days.
@git-hulk I've removed the LIMIT option from the trim options to be compliant with the Redis protocol. Also, I will resolve some conflicts with the
Available commands: - XADD - XDEL - XINFO STREAM - XLEN - XRANGE - XREAD - XREVRANGE - XTRIM
@torwig Can you help to resolve the conflict?
@git-hulk Done
Thanks @torwig. We can merge this PR after one of @Alfejik or @ShooterIT approves. Since the new commit only resolved a command number conflict, I think the previous approvals are still valid. Of course, you can approve it again if you're free. cc @caipengbo @PragmaTwice
I don't have time to go through this patch. However, it seems the patch looks good to two reviewers. If we accept such a new feature, remember to track updates on our doc site :)
Thanks to @Tison for the warm reminder. I will merge this PR tomorrow if there is no further feedback, then update the doc site after merging.
Hello, thanks to @torwig for bringing this awesome feature to the Kvrocks community, and to everyone who reviewed this PR. I will summarize and merge this PR. You are welcome to create a new thread for further discussion. Cheers!!!
Thanks @torwig and @aleksraiden again.
As we know, a prefix bloom filter can improve the performance of range reads, but the current encoding makes it hard to use since we can't know the key length or how to set the prefix length. One PR is still in draft: #508
For the entry encoding, I think we can put the version at the front so that the prefix bloom filter will be useful. Furthermore, maybe we should redesign the encoding for all complex data structures for better range read performance. Of course, we should keep compatibility, for example by adding an encoding version to distinguish different versions when decoding. This would also allow us to implement new features easily.
I think we use variable length encoding, such as
I think we can change all complex data structures at the next major release to make the prefix bloom filter more effective.
Available commands: XADD, XDEL, XINFO STREAM, XLEN, XRANGE, XREAD, XREVRANGE, XTRIM
Design notes
Link to the Redis data type description: https://redis.io/docs/manual/data-types/streams/
A stream is a sequence of entries.
Each entry has a unique ID in the form "1-2" where two 64-bit numbers are divided by a hyphen.
By default, the first number is set to a millisecond timestamp, and the second one is a so-called sequence number - for cases when more than one entry was added at the same millisecond.
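For illustration, here is a minimal sketch of how an ID in this form could be parsed and compared. The names StreamEntryID, ParseUint64 and ParseStreamEntryID are hypothetical and are not taken from the patch. The sketch handles only the full "ms-seq" form and ignores special IDs such as *.

```cpp
#include <cstdint>
#include <string>

// Hypothetical type for illustration: the two 64-bit halves of an entry ID.
struct StreamEntryID {
  uint64_t ms;   // millisecond timestamp part
  uint64_t seq;  // sequence number part
};

// Entries are ordered by timestamp first, then by sequence number.
bool operator<(const StreamEntryID &a, const StreamEntryID &b) {
  return a.ms != b.ms ? a.ms < b.ms : a.seq < b.seq;
}

// Parses a non-empty string of decimal digits; rejects overflow.
bool ParseUint64(const std::string &s, uint64_t *out) {
  if (s.empty()) return false;
  uint64_t v = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return false;
    uint64_t d = static_cast<uint64_t>(c - '0');
    if (v > (UINT64_MAX - d) / 10) return false;  // v * 10 + d would overflow
    v = v * 10 + d;
  }
  *out = v;
  return true;
}

// Parses "ms-seq" (for example "1-2"). Leaves *id untouched on failure.
bool ParseStreamEntryID(const std::string &input, StreamEntryID *id) {
  size_t dash = input.find('-');
  if (dash == std::string::npos) return false;
  StreamEntryID parsed{0, 0};
  if (!ParseUint64(input.substr(0, dash), &parsed.ms)) return false;
  if (!ParseUint64(input.substr(dash + 1), &parsed.seq)) return false;
  *id = parsed;
  return true;
}
```

With this sketch, "1-2" parses to ms = 1 and seq = 2, while inputs such as "1-" or "abc" are rejected.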
The value of an entry is a set of key-value pairs.
In case of the command:
XADD s1 1-0 key1 val1 key2 val2
the ID is 1-0 and the value is key1 val1 key2 val2.
In RocksDB entries are represented as key-value pairs where the key is formed as:
key | version | entry-ID-milliseconds-value | entry-ID-sequence-number-value
and the value is encoded as:
key1-length(fixed-32) | key1 | val1-length(fixed-32) | val1 | key2-length(fixed-32) | key2 | val2-length(fixed-32) | val2
Thanks to the structure of a key, all entries in a stream are sorted in chronological order.
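To show why this key layout gives chronological ordering, here is a rough sketch of the key construction. It assumes that the version and both ID parts are written as fixed-width 64-bit big-endian integers, which the notes above do not state explicitly. PutFixed64BE and EncodeEntryKey are hypothetical names, not functions from the patch. With big-endian fixed-width numbers, a bytewise key comparison (RocksDB's default comparator) agrees with the numeric order of the IDs.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: appends a 64-bit integer in big-endian byte order,
// so that bytewise comparison of the bytes matches numeric comparison.
void PutFixed64BE(std::string *dst, uint64_t v) {
  for (int shift = 56; shift >= 0; shift -= 8) {
    dst->push_back(static_cast<char>((v >> shift) & 0xFF));
  }
}

// Builds: key | version | entry-ID-milliseconds | entry-ID-sequence-number.
// Assumption: all three numeric parts are 8 bytes wide.
std::string EncodeEntryKey(const std::string &stream_key, uint64_t version,
                           uint64_t ms, uint64_t seq) {
  std::string out = stream_key;
  PutFixed64BE(&out, version);
  PutFixed64BE(&out, ms);
  PutFixed64BE(&out, seq);
  return out;
}
```

Under these assumptions, the keys for entries 1-0, 1-1 and 2-0 of the same stream and version compare in exactly that order, so iterating over the key range returns entries from oldest to newest.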
As for value encoding: this is the first idea that came to my mind, and maybe it's not very efficient because it has an overhead (4 bytes on every argument).
Why did I introduce such a weird encoding scheme? Because when you read entries, Redis responds not with a single string but with every key and value of the entry as a separate string.
Perhaps the command args could be joined with a ' ' (space) into a single string, and this string could be saved in RocksDB? After reading, it would be split while constructing the response. With that encoding scheme, however, I was thinking about possible spaces inside arguments and how to deal with them.
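The length-prefixed scheme can be sketched as follows. This is an illustration, not the code from the patch: EncodeEntryValue and DecodeEntryValue are hypothetical names, and the byte order of the 32-bit length (big-endian here) is an assumption. The point is that arguments survive a round trip even if they contain spaces or '\0' bytes, because the decoder relies only on the stored lengths.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encodes each argument as: length (fixed 32-bit, big-endian here) | bytes.
std::string EncodeEntryValue(const std::vector<std::string> &args) {
  std::string out;
  for (const auto &arg : args) {
    uint32_t len = static_cast<uint32_t>(arg.size());
    for (int shift = 24; shift >= 0; shift -= 8) {
      out.push_back(static_cast<char>((len >> shift) & 0xFF));
    }
    out.append(arg);
  }
  return out;
}

// Decodes the format above. Returns false if the input is truncated.
bool DecodeEntryValue(const std::string &encoded,
                      std::vector<std::string> *args) {
  args->clear();
  size_t pos = 0;
  while (pos < encoded.size()) {
    if (encoded.size() - pos < 4) return false;  // incomplete length prefix
    uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
      len = (len << 8) | static_cast<uint8_t>(encoded[pos + i]);
    }
    pos += 4;
    if (encoded.size() - pos < len) return false;  // incomplete payload
    args->emplace_back(encoded, pos, len);
    pos += len;
  }
  return true;
}
```

For the arguments key1 val1, the encoded value in this sketch is 16 bytes long: two 4-byte lengths plus two 4-byte strings. A space-joined string would be shorter, but it could not distinguish a separator from a space inside an argument.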
Differences from Redis
- XTRIM and XADD with trim possibility: nearly exact trimming (via ~) is not possible due to implementation details (no radix tree here). However, the LIMIT option is working, while in Redis it is allowed only in combination with ~. LIMIT can be disallowed to be consistent with the Redis protocol. I didn't do that because I want to hear opinions from kvrocks maintainers.
- Replication is not implemented yet. Basically, I didn't test streams in a cluster configuration. Perhaps the plain XREAD on a replica will work, but for a blocking XREAD that unblocks after XADD on the master, I'm sure that some code should be written. It would be greatly appreciated if maintainers provide me with some hints about how to implement this.
- Consumer groups are not implemented. I'm thinking about the possible implementation.
Right now I'm looking for any maintainers' feedback from the adding-new-data-type perspective (maybe I didn't add the new column family to some filter/checker/extractor, etc.) and information about properly replicating a stream from the master to other nodes.
This closes #532