Support TLS for replication #1630

Merged on Sep 12, 2023 (32 commits)

PragmaTwice (Member) commented Aug 2, 2023

It closes #1501.

Currently we do not support splitting cert between server and client.

This PR does not plan to support TLS for cluster mode, like slot migration.

@gofort It would be great if you still want this feature and could run some tests on the patch.

gofort commented Aug 3, 2023

I have asked @sheyt0 to test, but please don't block this PR on our reply. We will test this anyway, but I am not sure whether it will be this week or next.

PragmaTwice (Member, Author) commented Aug 3, 2023

> I have asked @sheyt0 to test, but please don't block this PR on our reply. We will test this anyway, but I am not sure whether it will be this week or next.

Thanks! It is fine to test it next week, since we do not need this PR to be merged ASAP. There is still work to be done before we can merge it.

git-hulk previously approved these changes Aug 12, 2023

sheyt0 commented Aug 14, 2023

@PragmaTwice could you leave a short doc on how to use TLS for replication?

With this config on the master

kvrocks --tls-key-file ... --tls-cert-file ... --tls-ca-cert-file ...

and on the slave

kvrocks --tls-replication yes --tls-key-file ... --tls-cert-file ... --tls-ca-cert-file ...

I get the following errors:

E20230814 08:22:48.497459 957356 redis_connection.cc:103] [connection] Going to remove the client: ...:59388, while encounter error: Success, SSL Error: error:0A0000C7:SSL routines::peer did not return a certificate
E20230814 08:22:49.507014 957356 redis_connection.cc:103] [connection] Going to remove the client: ...:59392, while encounter error: Success, SSL Error: error:0A0000C7:SSL routines::peer did not return a certificate

PragmaTwice (Member, Author) commented Aug 15, 2023

@sheyt0 Sure. Here are my test steps:

  1. Generate or obtain some certs (e.g. using minica):
  • ca.crt
  • server.crt
  • server.key

(all of them are in PEM format)

  2. Start a kvrocks server instance
./kvrocks --dir datadir1 --port 6666 --tls-port 6676 --log-dir stdout --tls-cert-file cert/server.crt --tls-key-file cert/server.key --tls-ca-cert-file cert/ca.crt
  3. Start a replica (on the same host, for convenience)
./kvrocks --dir datadir2 --port 6667 --tls-port 6677 --log-dir stdout --tls-cert-file cert/server.crt --tls-key-file cert/server.key --tls-ca-cert-file cert/ca.crt --tls-replication yes --slaveof "127.0.0.1 6676"
  4. Test
➜  build git:(tls-replica) ✗ redis-cli -p 6666
127.0.0.1:6666> get a
(nil)
127.0.0.1:6666> set a 1
OK
127.0.0.1:6666> set b 2
OK
127.0.0.1:6666> set c 3
OK
127.0.0.1:6666> exit
➜  build git:(tls-replica) ✗ redis-cli -p 6667
127.0.0.1:6667> get a
"1"
127.0.0.1:6667> get b
"2"
127.0.0.1:6667> get c
"3"
127.0.0.1:6667>

Server logs:

WARNING: No config file specified, using the default configuration. In order to specify a config file use 'kvrocks -c /path/to/kvrocks.conf'
I20230815 13:19:36.561048 20476 main.cc:328] kvrocks unstable (commit 83ad999c)
I20230815 13:19:36.591779 20476 storage.cc:332] [storage] Success to load the data from disk: 24 ms
I20230815 13:19:36.595307 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.595387 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
I20230815 13:19:36.595523 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.595558 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
I20230815 13:19:36.595630 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.595645 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
I20230815 13:19:36.595757 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.595808 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
I20230815 13:19:36.595898 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.595913 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
I20230815 13:19:36.595993 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.596009 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
I20230815 13:19:36.596077 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.596091 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
I20230815 13:19:36.596154 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6666
I20230815 13:19:36.596168 20476 worker.cc:72] [worker] Listening on: 0.0.0.0:6676
W20230815 13:19:36.596231 20476 server.cc:1629] [server] Increased maximum number of open files to 18464 (it's originally set to 1024)
I20230815 13:19:36.596359 20476 worker.cc:516] [worker] Thread #139993055979072 started
I20230815 13:19:36.596398 20476 worker.cc:516] [worker] Thread #139993064371776 started
I20230815 13:19:36.596438 20476 worker.cc:516] [worker] Thread #139993072764480 started
I20230815 13:19:36.596958 20476 worker.cc:516] [worker] Thread #139993127319104 started
I20230815 13:19:36.597033 20476 worker.cc:516] [worker] Thread #139993118926400 started
I20230815 13:19:36.597082 20476 worker.cc:516] [worker] Thread #139993110533696 started
I20230815 13:19:36.597136 20476 worker.cc:516] [worker] Thread #139993102140992 started
I20230815 13:19:36.597185 20476 worker.cc:516] [worker] Thread #139993093748288 started
I20230815 13:19:36.597337 20476 server.cc:208] [server] Ready to accept connections
I20230815 13:22:06.778137 20577 cmd_replication.cc:59] Slave 127.0.0.1:43632, listening port: 6667, announce ip: 127.0.0.1 asks for synchronization with next sequence: 1 replication id: not supported, and local sequence: 2
I20230815 13:22:06.778566 20577 cmd_replication.cc:112] New replica: 127.0.0.1:43632 was added, start incremental syncing

Replica logs:

WARNING: No config file specified, using the default configuration. In order to specify a config file use 'kvrocks -c /path/to/kvrocks.conf'
I20230815 13:22:06.714430 20842 main.cc:328] kvrocks unstable (commit 83ad999c)
I20230815 13:22:06.767542 20842 storage.cc:332] [storage] Success to load the data from disk: 12 ms
I20230815 13:22:06.769820 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.769878 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
I20230815 13:22:06.770000 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.770045 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
I20230815 13:22:06.770133 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.770148 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
I20230815 13:22:06.770259 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.770275 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
I20230815 13:22:06.770339 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.770357 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
I20230815 13:22:06.770440 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.770459 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
I20230815 13:22:06.770542 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.770557 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
I20230815 13:22:06.770632 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6667
I20230815 13:22:06.770645 20842 worker.cc:72] [worker] Listening on: 0.0.0.0:6677
W20230815 13:22:06.770705 20842 server.cc:1629] [server] Increased maximum number of open files to 18464 (it's originally set to 1024)
W20230815 13:22:06.770941 20842 replication.cc:348] Clean old synced checkpoint successfully
I20230815 13:22:06.770998 20842 worker.cc:516] [worker] Thread #140695450900032 started
I20230815 13:22:06.771035 20842 worker.cc:516] [worker] Thread #140695459292736 started
I20230815 13:22:06.771433 20842 worker.cc:516] [worker] Thread #140695532729920 started
I20230815 13:22:06.771507 20842 worker.cc:516] [worker] Thread #140695522240064 started
I20230815 13:22:06.771544 20842 worker.cc:516] [worker] Thread #140695513847360 started
I20230815 13:22:06.771581 20842 worker.cc:516] [worker] Thread #140695505454656 started
I20230815 13:22:06.771620 20842 worker.cc:516] [worker] Thread #140695497061952 started
I20230815 13:22:06.771665 20842 worker.cc:516] [worker] Thread #140695488669248 started
I20230815 13:22:06.771783 20842 server.cc:208] [server] Ready to accept connections
I20230815 13:22:06.775859 20954 replication.cc:421] [replication] Check db name request was sent, waiting for response
I20230815 13:22:06.777531 20954 replication.cc:441] [replication] DB name is valid, continue...
I20230815 13:22:06.777562 20954 replication.cc:459] [replication] replconf request was sent, waiting for response
I20230815 13:22:06.777760 20954 replication.cc:485] [replication] replconf is ok, start psync
I20230815 13:22:06.777988 20954 replication.cc:515] [replication] Try to use psync, next seq: 1
I20230815 13:22:06.779289 20954 replication.cc:552] [replication] PSync is ok, start increment batch loop
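
Since step 1 requires all three files to be in PEM format, a quick pre-flight check may catch a missing or DER-encoded file before kvrocks is started. The following is only a minimal sketch and not part of this PR: the helper name `is_pem` is made up for illustration, and the `cert/` paths simply mirror the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of kvrocks): succeed only if the file exists
# and contains a PEM "-----BEGIN ..." header line.
is_pem() {
    [ -f "$1" ] && grep -q -- '^-----BEGIN ' "$1"
}

# Check the three files used in the steps above and report each result.
for f in cert/ca.crt cert/server.crt cert/server.key; do
    if is_pem "$f"; then
        echo "ok: $f"
    else
        echo "not PEM or missing: $f"
    fi
done
```

The check only looks for the PEM header. It does not verify that the key matches the cert or that the cert is signed by the CA.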

sheyt0 commented Aug 15, 2023

@PragmaTwice

> Currently we do not support splitting cert between server and client.

Do I understand correctly that one set of certs should be used (same certs for master and slave)?
Or is it just about config options?

PragmaTwice (Member, Author) commented Aug 15, 2023

> @PragmaTwice
>
> > Currently we do not support splitting cert between server and client.
>
> Do I understand correctly that one set of certs should be used (same certs for master and slave)? Or is it just about config options?

Yeah, certs for master and slave must be the same.

sheyt0 commented Aug 15, 2023

> Yeah, certs for master and slave must be the same.

In the current implementation, this is not suitable for us. We need mTLS, not just TLS.

PragmaTwice (Member, Author) commented

> In the current implementation, this is not suitable for us. We need mTLS, not just TLS.

Although I did not originally plan to support it in this PR, the feature is not hard to implement.

I will try to support it later.

PragmaTwice (Member, Author) commented Aug 15, 2023

@sheyt0 Sorry, I think I misunderstood the problem.

The certs of master and slave can be different. I have tried it and it works well.

But note that on the slave, the server connection (listening for kvrocks clients) and the client connection (connecting to the master) must use the same cert. They cannot currently be split into different certs.

So internally there are four SSL contexts:

  • master: server-side cert, client-side cert
  • slave: server-side cert, client-side cert

The two certs on the master cannot differ from each other (and likewise on the slave), but the server-side cert on the master and the client-side cert on the slave can be different.
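
To make the four-context picture concrete, here is a dry-run sketch that only prints the two command lines instead of starting anything. The flags are the ones already used in this thread. The helper name `node_cmd` and the file names `cert/master.*` and `cert/replica.*` are hypothetical, and the sketch assumes both certs are signed by the CA in `cert/ca.crt`. Each node passes a single cert/key pair, which is used for both its listening side and its outgoing replication connection.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print a kvrocks command line for one node.
# $1 = data dir, $2 = port, $3 = TLS port, $4 = cert/key basename under cert/
# $5 = (optional) master TLS port; if set, the node becomes a TLS replica
#      of 127.0.0.1 on that port.
node_cmd() {
    cmd="./kvrocks --dir $1 --port $2 --tls-port $3 --log-dir stdout"
    cmd="$cmd --tls-cert-file cert/$4.crt --tls-key-file cert/$4.key"
    cmd="$cmd --tls-ca-cert-file cert/ca.crt"
    if [ -n "${5:-}" ]; then
        cmd="$cmd --tls-replication yes --slaveof \"127.0.0.1 $5\""
    fi
    printf '%s\n' "$cmd"
}

# Master and replica use different cert/key pairs (assumed to share one CA).
node_cmd datadir1 6666 6676 master
node_cmd datadir2 6667 6677 replica 6676
```

Copying the printed lines into two terminals should reproduce the test setup above, with the only difference being that the two nodes no longer share `server.crt` and `server.key`.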

PragmaTwice (Member, Author) commented

@sheyt0 Also, I have found why you could not establish a TLS connection between master and slave.

Currently the slave must have a tls-port configured in order to open a TLS connection to the master. This is a bug, since it should not be necessary.

I will fix it soon.

PragmaTwice (Member, Author) commented Aug 15, 2023

> Currently the slave must have a tls-port configured in order to open a TLS connection to the master. This is a bug, since it should not be necessary.

It is fixed. Now I think the slave can disable the tls-port (if you want).

sheyt0 commented Aug 29, 2023

> It is fixed. Now I think the slave can disable the tls-port (if you want).

Now it works

git-hulk previously approved these changes Sep 2, 2023
PragmaTwice merged commit f3d796d into apache:unstable on Sep 12, 2023
26 checks passed
niumcdao commented

[image] I used minica to generate the CA file. https://github.com/jsha/minica

PragmaTwice (Member, Author) commented Nov 11, 2023

> [image] I used minica to generate the CA file. https://github.com/jsha/minica

You need to build kvrocks with the CMake option ENABLE_OPENSSL=ON.

Please refer to https://github.com/apache/kvrocks#build for details.

niumcdao commented Dec 2, 2023

Should I configure the tls-replication parameter on the master node or on the slave node?

niumcdao commented Dec 2, 2023

I configured tls-replication on the slave node, then logged in to the master node and ran the info command:
[image]
The replication info for the slave node shows port 6666. Why not 6676 (the master node's TLS port)?

PragmaTwice (Member, Author) commented Dec 2, 2023

You need to set the slaveof port to 6676 yourself.
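
In other words, with tls-replication enabled the port passed to slaveof has to be the master's TLS port, as in the test steps earlier in this thread. A small sanity check along these lines could flag the mix-up before the replica is started. This is only a sketch: the function name `check_slaveof_port` is hypothetical, and the ports are the example values from this thread.

```shell
#!/bin/sh
# Hypothetical sanity check (not part of kvrocks).
# $1 = port given to slaveof, $2 = master's plain port, $3 = master's TLS port.
check_slaveof_port() {
    if [ "$1" = "$3" ]; then
        echo "ok: slaveof targets the master's TLS port $3"
    elif [ "$1" = "$2" ]; then
        echo "mismatch: slaveof targets the plain port $2; use the TLS port $3"
        return 1
    else
        echo "unknown: port $1 is neither $2 nor $3"
        return 1
    fi
}

# Example values from this thread: plain port 6666, TLS port 6676.
check_slaveof_port 6676 6666 6676
```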

Also, please open a discussion or ask in the Slack channel rather than replying in this PR thread.
