Support to use lz4 compression in RocksDB #584

Merged
merged 10 commits into from
Jun 23, 2022

xiaobiaozhao
Contributor

@xiaobiaozhao xiaobiaozhao commented May 17, 2022

Description

Why is this PR needed?
It provides more Rocksdb.compression configuration options for customization.

This closes #601

@ShooterIT
Member

The biggest problem we need to solve is how to compile these compression algorithms.

@xiaobiaozhao xiaobiaozhao force-pushed the add_compression branch 2 times, most recently from 5d5e108 to f1d605c Compare June 8, 2022 01:30
@git-hulk
Member

Do you guys think this change makes sense for Kvrocks?

Member

@PragmaTwice PragmaTwice left a comment

Great job! Thanks for your hard work. Comments inline.

cmake/lz4.cmake Outdated
Comment on lines 43 to 67
if (EXISTS ${LZ4_ZIP_ABSOLUT_PATH})
  file(MD5 ${LZ4_ZIP_ABSOLUT_PATH} LZ4_MD5SUM_)
  if (NOT ${LZ4_MD5SUM} STREQUAL ${LZ4_MD5SUM_})
    message(STATUS "remove -f ${LZ4_ZIP_ABSOLUT_PATH}")
    execute_process(COMMAND sh -c "rm -f ${LZ4_ZIP_ABSOLUT_PATH}")
  endif()
endif()

if (NOT EXISTS ${LZ4_ZIP_ABSOLUT_PATH})
  # download
  message(STATUS "Downloading ${LZ4_ZIP_NAME} to ${LZ4_ZIP_ABSOLUT_PATH}")
  file(DOWNLOAD ${LZ4_DOWNLOAD_URL}
    ${LZ4_ZIP_ABSOLUT_PATH}
    TIMEOUT ${DOWNLOAD_LZ4_TIMEOUT}
    STATUS ERR SHOW_PROGRESS)
endif()

if (EXISTS ${LZ4_ZIP_ABSOLUT_PATH})
  file(MD5 ${LZ4_ZIP_ABSOLUT_PATH} LZ4_MD5SUM_)
  if (NOT ${LZ4_MD5SUM} STREQUAL ${LZ4_MD5SUM_})
    message(FATAL_ERROR "${LZ4_ZIP_ABSOLUT_PATH} seems be something wrong, please check")
  endif()
else()
  message(FATAL_ERROR "${LZ4_ZIP_ABSOLUT_PATH} seems be something wrong, please check")
endif()
Member

Could you use FetchContent instead of these manual download commands?
As in lua.cmake, it is easier to maintain and more straightforward.

Member

The lz4 directories are under the GPL license except for lib, which is the one we need, so I'm NOT sure whether it is OK to use FetchContent as we do for the others.

Member

@PragmaTwice PragmaTwice Jun 18, 2022

We can just fetch the files without calling add_subdirectory (that is, without using the cmake files in lz4), as in lua.cmake, so I think it makes little difference to the license issue compared with the manual way.

Member

@PragmaTwice PragmaTwice Jun 18, 2022

To facilitate the modification, I show a concrete example below:

FetchContent_Declare(lz4
  URL https://github.com/lz4/lz4/archive/v1.9.3.tar.gz
)

FetchContent_GetProperties(lz4)
if(NOT lz4_POPULATED)
  FetchContent_Populate(lz4)

  add_custom_target(make_lz4 COMMAND make ...)
  ...
endif()
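
For illustration only, the outline above could be fleshed out roughly as follows. This is a sketch under assumptions: it assumes the Makefile in lz4's lib directory provides a liblz4.a target, and the target names make_lz4 and lz4 and the commented-out checksum line are placeholders rather than what this PR necessarily ends up using.

```cmake
include(FetchContent)

FetchContent_Declare(lz4
  URL https://github.com/lz4/lz4/archive/v1.9.3.tar.gz
  # URL_HASH MD5=<checksum>  # placeholder: pin the archive checksum here
)

FetchContent_GetProperties(lz4)
if(NOT lz4_POPULATED)
  # Downloads and unpacks the archive; sets lz4_SOURCE_DIR.
  FetchContent_Populate(lz4)

  # Build only the static library from the lib directory
  # (assumes lib/Makefile has a liblz4.a target).
  add_custom_target(make_lz4
    COMMAND make liblz4.a
    WORKING_DIRECTORY ${lz4_SOURCE_DIR}/lib
    BYPRODUCTS ${lz4_SOURCE_DIR}/lib/liblz4.a
  )

  # Expose headers and the built archive through an interface target
  # (the target name lz4 is hypothetical).
  add_library(lz4 INTERFACE)
  target_include_directories(lz4 INTERFACE ${lz4_SOURCE_DIR}/lib)
  target_link_libraries(lz4 INTERFACE ${lz4_SOURCE_DIR}/lib/liblz4.a)
  add_dependencies(lz4 make_lz4)
endif()
```

Because only the fetched lib directory is built and none of lz4's own cmake files are included, this touches the same files as the manual approach, while FetchContent handles download, checksum verification, and unpacking.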

Member

Cool, thanks for Twice's explanation.

cmake/lz4.cmake Outdated
Comment on lines 69 to 70
message(STATUS "remove ${LZ4_SOURCE_DIR}")
execute_process(COMMAND sh -c "rm -rf ${LZ4_SOURCE_DIR}")
Member

@PragmaTwice PragmaTwice Jun 18, 2022

I think these commands are not friendly to incremental builds. Could you consider removing them?
We can use FetchContent to handle this file system work automatically.

Cool

Comment on lines 41 to 43
lz4_ROOT_DIR=${LZ4_LIB_SOURCE_DIR}
lz4_LIBRARIES=${LZ4_LIB_SOURCE_DIR}/liblz4.a
lz4_INCLUDE_DIRS=${LZ4_LIB_SOURCE_DIR}
Member

@PragmaTwice PragmaTwice Jun 18, 2022

As with the cmake files in cmake/modules, injecting the lz4 target by hooking into the find_package mechanism propagates the dependency relation into the rocksdb target. In addition, this makes the dependencies more modular, so we can easily add switches to enable them selectively.
Of course, if you are not familiar with this, I can follow up and help you implement it : )
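
A minimal sketch of what such a find_package hook could look like. The file name cmake/modules/Findlz4.cmake, the pre-existing `lz4` target, and the idea that the rocksdb build calls find_package(lz4) and links against lz4::lz4 are all assumptions made for illustration, not details confirmed in this thread.

```cmake
# cmake/modules/Findlz4.cmake (hypothetical file name and location)
#
# Assumption: an `lz4` library target was already defined earlier in the
# build. This module only re-exports it under the namespaced name that a
# find_package(lz4) caller is assumed to link against.
if(TARGET lz4)
  if(NOT TARGET lz4::lz4)
    add_library(lz4::lz4 ALIAS lz4)
  endif()
  set(lz4_FOUND TRUE)
else()
  set(lz4_FOUND FALSE)
endif()
```

For the hook to be picked up, the directory holding the module would need to be on CMAKE_MODULE_PATH before the rocksdb build is configured. A switch that enables or disables lz4 then only has to decide whether the `lz4` target gets defined.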

.gitignore (outdated, resolved)
cmake/rocksdb.cmake (outdated, resolved)
.gitignore (outdated, resolved)
@PragmaTwice
Member

The unit test failed in https://github.com/apache/incubator-kvrocks/runs/7003199495?check_suite_focus=true for 1ee66be.
I am recording this for further investigation and will try to rerun the CI.

Member

@PragmaTwice PragmaTwice left a comment

Member

@tisonkun tisonkun left a comment

Cool! Is there any test coverage for the lz4 compression option? Or did you verify it locally?

@git-hulk
Member

> Cool! Is there any test coverage for the lz4 compression option? Or did you verify it locally?

I tested it on my side: running with LZ4 works, and I checked the symbols:

000000000072a890 t LZ4HC_compress_generic_dictCtx
0000000000725ec0 t LZ4HC_compress_generic_noDictCtx.part.0
00000000007216f0 t LZ4HC_compress_optimal
0000000000730790 T LZ4_attach_HC_dictionary
0000000000714790 T LZ4_attach_dictionary
0000000000721590 T LZ4_compress
000000000070b9a0 T LZ4_compressBound
0000000000730be0 T LZ4_compressHC
0000000000730ec0 T LZ4_compressHC2
00000000007319f0 T LZ4_compressHC2_continue
...

Member

@tisonkun tisonkun left a comment

@git-hulk thanks for your input. +1 to merge.

BTW, can you confirm that introducing lz4 support in this way won't cause a dependency issue? IIRC lz4 can contain GPLv2 code, but if we use lib only, it should be fine.

@git-hulk
Member

> @git-hulk thanks for your input. +1 to merge.
>
> BTW, can you confirm that introducing lz4 support in this way won't cause a dependency issue? IIRC lz4 can contain GPLv2 code, but if we use lib only, it should be fine.

Yes, we use lib only, but I'm NOT sure whether we need to involve more people in the discussion (previous discussion thread). I saw that some Apache projects also use LZ4 by downloading the repo: https://github.com/search?l=CMake&q=org%3Aapache+lz4%2Flz4&type=Code.

@PragmaTwice
Member

> @git-hulk thanks for your input. +1 to merge.
>
> BTW, can you confirm that introducing lz4 support in this way won't cause a dependency issue? IIRC lz4 can contain GPLv2 code, but if we use lib only, it should be fine.

Yeah, I think this way only uses files in the lib directory, so there will be no GPL issues involved.

@git-hulk
Member

> > @git-hulk thanks for your input. +1 to merge.
> > BTW, can you confirm that introducing lz4 support in this way won't cause a dependency issue? IIRC lz4 can contain GPLv2 code, but if we use lib only, it should be fine.
>
> Yeah, I think this way only uses files in the lib directory, so there will be no GPL issues involved.

So, let's merge this PR if this is crystal clear. cc @tisonkun @PragmaTwice @ShooterIT

@git-hulk git-hulk merged commit d42f029 into apache:unstable Jun 23, 2022
@git-hulk git-hulk changed the title from "feat(ops): Add more compression" to "Support to use lz4 compression in RocksDB" Jun 23, 2022
@git-hulk
Member

Thanks again for @xiaobiaozhao's contribution!

Development

Successfully merging this pull request may close these issues.

[NEW] add lz4 lib
6 participants