Allow for private CA on the Agent side #3494

dbutenhof · 2023-07-11T19:29:30Z

PBENCH-1209

The staging server and local development servers now use HTTPS with certs signed by a private pbench CA. An HTTPS connection can't be validated without a reference to this CA.

The primary change here is in the contrib/containerized-pbench/pbench script which now looks for a private CA definition, maps the CA bundle file into the container, and defines REQUESTS_CA_BUNDLE within the container.

In an attempt to handle RPM installs, there's also logic to support a new [results] section pbench_ca configuration variable to define a CA path that will be used to verify the PUT to a server.

Note, this isn't being used for the --relay path, as that's currently not using https and we'll need to figure out how we want to configure this in the future.

dbutenhof · 2023-07-14T12:33:29Z

I'm taking this out of draft mode, because it's probably worth moving forward, even though I'm also not entirely satisfied. For convenience, here's the text of my second commit:

I'm not entirely sure how far I want to take this. What I have so far allows a
user to use the pbench script to run a Pbench Agent command with a specified
CA bundle against a "runlocal" or the staging server by specifying the CA file
with export PB_AGENT_CA=server/pbenchinacan/etc/pki/tls/certs/pbench_CA.crt
before using ./contrib/containerized-pbench/pbench pbench-results-move.

If you use --server or --relay you don't need to worry about having a
~/.config/pbench/pbench-agent.cfg file.

The remaining "problem" is that if you don't want to use --server and expect
to get your configuration from ~/.config/pbench/pbench-agent.cfg, even if it
has the new server_ca configuration value set, there's no logic to cause the
CA to be mapped into the container so it won't actually work. (That is, you
still need the PB_AGENT_CA environment variable.) I find this unsatisfying;
but any solution I can see (e.g., unconditionally mapping in the private CA if
we can find it, or building a mechanism to look for [results] server_ca in
the config file, given there's no standard config-parser CLI available outside
the container) at least equally unsatisfying.

So for now, this is at least a hack allowing us to use the pbench container
script against staging and runlocal servers. Maybe that's enough.

contrib/containerized-pbench/pbench

ndokos · 2023-07-17T17:45:37Z

The server functional tests are failing with a traceback in TestInventory.test_compare.

Odd that unit tests didn't see this. It's possible this is due to the new pquisby 0.0.9, which I believe was published to fix Varshini's comparison problems ... except I'm not finding the relevant code in the GitHub project to compare. 😦

dbutenhof · 2023-07-18T11:52:36Z

The server functional tests are failing with a traceback in TestInventory.test_compare.

Odd that unit tests didn't see this. It's possible this is due to the new pquisby 0.0.9, which I believe was published to fix Varshini's comparison problems ... except I'm not finding the relevant code in the GitHub project to compare. frowning

We're now up to pquisby 0.0.11, which appears to fix the problem.

webbnh

I'm requesting changes mostly because of the placeholder in the pbench-agent.cfg file, and partly for the fallback verify= value (and slightly for the readlink usage). However, I have some concerns about this change in general and I would like to offer a different approach.

One of the thrusts of the contrib/containerized-pbench/pbench script is that it provides for a "no-install" invocation of the Agent: you can download a single file, and run it without doing an RPM install or creating a Git checkout. Entangling it with the CA file doubles the complication.

Moreover, my/our expectation is that access to the Pbench Server will not require the Pbench private CA once we have TLS deployed in production, so an external user will not require the CA file. Nevertheless, it's quite true that access to the Staging server or to a development server will require the private CA, and only the user will know which of those they are targeting with the pbench script, so there does need to be some sort of "knob" which the user can employ to tell the script which configuration it is running in.

But, that knob does not necessarily have to be PB_AGENT_CA as presented, and there is an alternative to mapping a user-provided CA into the container (which is, I think, the root of my concerns): we can build the CA into the container (it is very small, relative to the size of the container, so the impact of doing so will be imperceptible). Then all we need is for the script to set REQUESTS_CA_BUNDLE inside the container, if the user requests it.

FWIW, rather than extending the code in the pbench script which tries to use or generate a local config file, I would prefer to see it torn out! The principal purpose of that was to allow the user to specify the Pbench Server "on the fly", but, with the wonderful --server switch that you added, we no longer need a config file to accomplish that. So, I recommend removing all that code -- if the user needs to use a custom Agent config file, s/he can use PB_AGENT_PODMAN_OPTIONS to map one over the default config file inside the container.

agent/config/pbench-agent.cfg

lib/pbench/agent/results.py

contrib/containerized-pbench/pbench

dbutenhof

But, that knob does not necessarily have to be PB_AGENT_CA as presented, and there is an alternative to mapping a user-provided CA into the container (which is, I think, the root of my concerns): we can build the CA into the container (it is very small, relative to the size of the container, so the impact of doing so will be imperceptible). Then all we need is for the script to set REQUESTS_CA_BUNDLE inside the container, if the user requests it.

I really don't like the idea of building our hack development CA into the real agent container. Frankly a "product deployed" agent shouldn't be able to trust our staging server. And while we could add another layer to the "real" agent container for development, that's a difference and complication I'd rather avoid.

FWIW, rather than extending the code in the pbench script which tries to use or generate a local config file, I would prefer to see it torn out!

I very nearly did that along the way, but I felt reluctant to tear out all your hard work so I struggled to retain it. 😆

agent/config/pbench-agent.cfg

contrib/containerized-pbench/pbench

lib/pbench/agent/results.py

webbnh · 2023-07-19T15:20:13Z

I really don't like the idea of building our hack development CA into the real agent container.

Why? Simply including it is not a security hazard, and it will increase the size of the image by about 0.00006%.

Frankly a "product deployed" agent shouldn't be able to trust our staging server.

Again, why? I'm guessing that using our CA will not be the default (because the default should be to target the production Server or to require the user to explicitly select a Server to target), and so this would just be an unused option in that case.

And while we could add another layer to the "real" agent container for development, that's a difference and complication I'd rather avoid.

I agree -- and it's not worth adding a layer for just 1.2Kb.

FWIW, rather than extending the code in the pbench script which tries to use or generate a local config file, I would prefer to see it torn out!

I very nearly did that along the way, but I felt reluctant to tear out all your hard work so I struggled to retain it. 😆

Thank you -- I very much appreciate that sentiment -- but, while the config file is an excellent solution on the Server side, I don't feel like it is a good one on the client/Agent side, and, now that we have --server, I would like to see the Agent config file disappear...so feel free to excise it from here! 😀

dbutenhof

I really don't like the idea of building our hack development CA into the real agent container.

Why? Simply including it is not a security hazard, and it will increase the size of the image by about 0.00006%.

It feels very much like including testing code in the product: it's impure, and exposes what should be internal. It opens the risk that REQUESTS_CA_BUNDLE=/etc/pki/tls/certs will trust something that shouldn't be trusted. The idea makes me extremely uncomfortable, and I can't convince myself that it's "innocuous".

I very nearly did that along the way, but I felt reluctant to tear out all your hard work so I struggled to retain it. 😆

Thank you -- I very much appreciate that sentiment -- but, while the config file is an excellent solution on the Server side, I don't feel like it is a good one on the client/Agent side, and, now that we have --server, I would like to see the Agent config file disappear...so feel free to excise it from here! 😀

Yeah: done. That closes out the discussion about virtual tabbing and HERE files, too. (I have GitHub set to 4 space tabs just like vscode, by the way, so it looks fine to me: and I don't see any advantage in a different setting since it still doesn't ensure a wayward tab is evident.)

agent/config/pbench-agent.cfg

dbutenhof · 2023-07-19T16:03:36Z

Why? Simply including it is not a security hazard, and it will increase the size of the image by about 0.00006%.

It feels very much like including testing code in the product: it's impure, and exposes what should be internal. It opens the risk that REQUESTS_CA_BUNDLE=/etc/pki/tls/certs will trust something that shouldn't be trusted. The idea makes me extremely uncomfortable, and I can't convince myself that it's "innocuous".

Final thought before I leave to hike... it occurs to me that this would bother me a bit less if we packaged it in our own file tree rather than in a standard system location, like /opt/pbench-agent/config/develop/pbench-devel-CA.crt where it's highly improbable that it could be referenced accidentally and where the path and name makes the purpose more obvious.

webbnh

I really don't like the idea of building our hack development CA into the real agent container.

It feels very much like including testing code in the product: it's impure, and exposes what should be internal.

I'll grant that. But, it's a very small impurity (of which it is far from the first...), and it enables substantial simplifications for the user. So, I think it is a good trade-off.

It opens the risk that REQUESTS_CA_BUNDLE=/etc/pki/tls/certs will trust something that shouldn't be trusted.

That's a fair point. (See below.)

now that we have --server, I would like to see the Agent config file disappear...so feel free to excise it from here! 😀

Yeah: done.

🎉

That closes out the discussion about virtual tabbing and HERE files, too.

🎊

(I have GitHub set to 4 space tabs just like vscode, by the way, so it looks fine to me: and I don't see any advantage in a different setting since it still doesn't ensure a wayward tab is evident.)

I was unaware of that setting. (However, I'm not sure I would like that better.)

it occurs to me that this would bother me a bit less if we packaged it in our own file tree rather than in a standard system location, like /opt/pbench-agent/config/develop/pbench-devel-CA.crt where it's highly improbable that it could be referenced accidentally and where the path and name makes the purpose more obvious.

I would be happy with that solution. I only put it in /etc/pki/tls/certs because I needed to put it somewhere and that seemed like a conventional location. (I was thinking that we would actually trust it in test configurations, but never carried through on that; but, IIUC, trusting it requires having it in there.)

As for what to name the CA cert file, we could go with pbench-devel-CA.crt or pbench-private-CA.crt, but I don't see much advantage to those over the current pbench_CA.crt.

Note that currently, we use the private CA:

in func-test to connect between the Server and Keycloak and between the client and the "inner" Nginx proxy; and,
in Staging to connect between the client and the "inner" proxy.

Once we move the Staging server behind the IntLab "outer" proxy, we won't use the CA in staging at all (just as we won't in production) -- instead, we'll use a cert signed by the IntLab "intermediate CA" -- so access to the private CA will only be an issue for func-test and development.

agent/config/pbench-agent.cfg

webbnh

I think the usage function is missing from the container_build.sh script. And, I'm requesting the ability to override the definition of WORKSPACE_TMP, because I actually do this in practice.

The rest of the comments are nits and small things.

agent/containers/images/container_build.sh

contrib/containerized-pbench/pbench

webbnh · 2023-07-21T16:47:34Z

lib/pbench/agent/results.py

        if server:
            path = config.get("results", "rest_endpoint")
            uri = f"{server}/{path}"
+            self.ca = None
        else:
            uri = config.get("results", "server_rest_url")
+
+            # If the "server_ca" config variable isn't defined, we expect to verify
+            # using a registered CA.
+            self.ca = config.get("results", "server_ca", fallback=None)


Is this code up-to-date with our current thinking?

As the code now stands (AFAICT), if we specify server, then we ignore the server_ca option in the config file, and we'll depend on REQUESTS_CA_BUNDLE if an override is needed; if, instead, we pull the uri from the config file, then we'll honor server_ca if it is provided...why the bifurcation? Why not use the config file CA value in both cases if it is provided? Or, why have the server_ca option at all when the user can use REQUESTS_CA_BUNDLE?

The config file defines a server and optionally a CA to go with that server. If you don't create a custom config file and use --server, there's not going to be a server_ca anyway. But if you have a config file, even with a server_ca, we can't necessarily assume that the --server target cert uses that CA.

So basically, config file server goes with config file CA; while --server goes with REQUESTS_CA_BUNDLE.

I'm unclear why we need this change and why it's better than just relying on REQUESTS_CA_BUNDLE for both cases. 🤷

So your idea is that if someone installs the Agent RPM and relies on a server configured in pbench-agent.cfg so they can just pbench-results-move (e.g., the passthrough server) we should still require them to put a customized REQUESTS_CA_BUNDLE in the environment?

Granted, in the long run, we don't want to force users to rely on private CA certs at all, but this still seems worth doing.

My thinking is that, in production, access to the Pbench Server will require only publicly-available and trusted CAs (for some definition of "public"...e.g., "Red Hat internal", perhaps...), and so no specific action (e.g., defining REQUESTS_CA_BUNDLE) on the part of the user would be required.

For Staging, we should be operating in a similar environment (i.e., behind the IntLab proxies and using IntLab-signed or RH-IT-signed certs).

So, it's only for func-test and development that we would need to be using our private CA. "Normal users" wouldn't need it, unless they are working with bits that are too hot for Staging.

And, I think we have func-test already covered. So, it all really boils down to how do we support our developers. And, for that, I think we can rely on defining REQUESTS_CA_BUNDLE (and, we can probably do it implicitly inside a wrapper script or two, so it doesn't have to be reflected at all in the actual Agent or Server code).

PBENCH-1209 The staging server and local development servers now use HTTPS with certs signed by a private pbench CA. An HTTPS connection can't be validated without a reference to this CA. The primary change here is in the `contrib/containerized-pbench/pbench` script which now looks for a private CA definition, maps the CA bundle file into the container, and defines `REQUESTS_CA_BUNDLE` within the container. In an attempt to handle RPM installs, there's also logic to support a new `[results]` section `pbench_ca` configuration variable to define a CA path that will be used to verify the `PUT` to a server. Note, this isn't being used for the `--relay` path, as that's currently not using `https` and we'll need to figure out how we want to configure this in the future.

I'm not entirely sure how far I want to take this. What I have so far allows a user to use the `pbench` script to run a Pbench Agent command with a specified CA bundle against a "runlocal" or the staging server by specifying the CA file with `export PB_AGENT_CA=server/pbenchinacan/etc/pki/tls/certs/pbench_CA.crt` before using `./contrib/containerized-pbench/pbench pbench-results-move`. If you use `--server` or `--relay` you don't need to worry about having a `~/.config/pbench/pbench-agent.cfg` file. The remaining "problem" is that if you don't want to use `--server` and expect to get your configuration from `~/.config/pbench/pbench-agent.cfg`, even if it has the new `server_ca` configuration value set, there's no logic to cause the CA to be mapped into the container so it won't actually work. (That is, you still need the `PB_AGENT_CA` environment variable.) I find this unsatisfying; but any solution I can see (e.g., unconditionally mapping in the private CA if we can find it, or building a mechanism to look for `[results] server_ca` in the config file, given there's no standard config-parser CLI available outside the container) at least equally unsatisfying. So for now, this is at least a hack allowing us to use the `pbench` container script against staging and runlocal servers. Maybe that's enough.

Drop the attempt to create and manage a pbench-agent.cfg file.

Might as well default `verify=None` everywhere.

webbnh

🦐

webbnh · 2023-07-25T00:59:48Z

lib/pbench/agent/results.py

        if server:
            path = config.get("results", "rest_endpoint")
            uri = f"{server}/{path}"
+            self.ca = None
        else:
            uri = config.get("results", "server_rest_url")
+
+            # If the "server_ca" config variable isn't defined, we expect to verify
+            # using a registered CA.
+            self.ca = config.get("results", "server_ca", fallback=None)


I'm unclear why we need this change and why it's better than just relying on REQUESTS_CA_BUNDLE for both cases. 🤷

pravins · 2023-07-25T11:53:11Z

I am just curious, can we use --server option to get rid of pbench-agent.conf file? May be out of context on this PR

webbnh · 2023-07-25T13:51:39Z

I am just curious, can we use --server option to get rid of pbench-agent.cfg file?

I had the very same thought, but, when I started to look at it, it became clear that the answer is "no". If/when we get to using conventional packaging (a la the Fedora rules), we get much closer, but even then the answer might well be "no".

pbench-agent.cfg is just the user-configurable end (i.e., the overrides) of pbench-agent-default.cfg, where there is quite a bit of configuration. It's true that the biggest config item is the Server location, and we can now specify that via command line options, so there is little need for pbench-agent.cfg itself, other than as a mechanism for the Agent to load the rest of the configuration from pbench-agent-default.cfg...so, perhaps we could separate the two, but we still need the ability to specify the configuration or at least the parts of it which aren't custom to a given user.

May be out of context on this PR

Definitely. 😀

But, we probably should be considering how we specify the access schema for the Pbench Server -- having that in pbench-agent-defaults.cfg instead of inside the pbench_web_server value is inconvenient, although, I suppose we could just have people override the server_rest_url in pbench-agent.cfg as well. (I believe that pbench_web_server is used only to define server_rest_url and host_info_url, and I don't think that host_info_url is used after 0.69....)

dbutenhof added the Agent label Jul 11, 2023

dbutenhof requested review from ndokos and MVarshini July 11, 2023 19:29

dbutenhof self-assigned this Jul 11, 2023

dbutenhof marked this pull request as draft July 12, 2023 11:58

ndokos previously approved these changes Jul 12, 2023

View reviewed changes

dbutenhof dismissed ndokos’s stale review via 2e99d90 July 14, 2023 12:29

dbutenhof force-pushed the agentca branch from d5e104b to 2e99d90 Compare July 14, 2023 12:29

dbutenhof marked this pull request as ready for review July 14, 2023 12:31

dbutenhof requested a review from ndokos July 14, 2023 12:33

ndokos requested changes Jul 17, 2023

View reviewed changes

contrib/containerized-pbench/pbench Outdated Show resolved Hide resolved

dbutenhof force-pushed the agentca branch from 2e99d90 to 26ba95f Compare July 17, 2023 15:13

ndokos self-requested a review July 17, 2023 16:40

ndokos previously approved these changes Jul 17, 2023

View reviewed changes

dbutenhof requested a review from webbnh July 18, 2023 11:51

webbnh requested changes Jul 18, 2023

View reviewed changes

dbutenhof commented Jul 19, 2023

View reviewed changes

agent/config/pbench-agent.cfg Outdated Show resolved Hide resolved

webbnh reviewed Jul 19, 2023

View reviewed changes

agent/config/pbench-agent.cfg Outdated Show resolved Hide resolved

dbutenhof dismissed ndokos’s stale review via b34e4ad July 19, 2023 20:45

dbutenhof requested review from webbnh and ndokos July 19, 2023 20:45

ndokos previously approved these changes Jul 21, 2023

View reviewed changes

webbnh requested changes Jul 21, 2023

View reviewed changes

dbutenhof added 5 commits July 21, 2023 16:03

Cleanup and commentary

8b0a349

Review comments & cleanup

f25cacf

Drop the attempt to create and manage a pbench-agent.cfg file.

Minor tweak I missed

a088575

Might as well default `verify=None` everywhere.

Cleanup

aab4a16

dbutenhof dismissed ndokos’s stale review via aab4a16 July 21, 2023 20:03

dbutenhof force-pushed the agentca branch from f78c7ad to aab4a16 Compare July 21, 2023 20:03

dbutenhof requested review from webbnh and ndokos July 21, 2023 20:07

webbnh approved these changes Jul 25, 2023

View reviewed changes

dbutenhof merged commit 2d0b189 into distributed-system-analysis:main Jul 26, 2023
3 checks passed

dbutenhof deleted the agentca branch July 26, 2023 12:53

webbnh mentioned this pull request Jul 28, 2023

Update the pbench_demo script to add --server #3508

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow for private CA on the Agent side #3494

Allow for private CA on the Agent side #3494

dbutenhof commented Jul 11, 2023

dbutenhof commented Jul 14, 2023

ndokos commented Jul 17, 2023 •

edited by dbutenhof

Loading

dbutenhof commented Jul 18, 2023

webbnh left a comment

dbutenhof left a comment

webbnh commented Jul 19, 2023

dbutenhof left a comment

dbutenhof commented Jul 19, 2023

webbnh left a comment

webbnh left a comment

webbnh Jul 21, 2023

dbutenhof Jul 21, 2023

webbnh Jul 25, 2023

dbutenhof Jul 25, 2023

webbnh Jul 25, 2023

webbnh left a comment

webbnh Jul 25, 2023

pravins commented Jul 25, 2023

webbnh commented Jul 25, 2023

Allow for private CA on the Agent side #3494

Allow for private CA on the Agent side #3494

Conversation

dbutenhof commented Jul 11, 2023

dbutenhof commented Jul 14, 2023

ndokos commented Jul 17, 2023 • edited by dbutenhof Loading

dbutenhof commented Jul 18, 2023

webbnh left a comment

Choose a reason for hiding this comment

dbutenhof left a comment

Choose a reason for hiding this comment

webbnh commented Jul 19, 2023

dbutenhof left a comment

Choose a reason for hiding this comment

dbutenhof commented Jul 19, 2023

webbnh left a comment

Choose a reason for hiding this comment

webbnh left a comment

Choose a reason for hiding this comment

webbnh Jul 21, 2023

Choose a reason for hiding this comment

dbutenhof Jul 21, 2023

Choose a reason for hiding this comment

webbnh Jul 25, 2023

Choose a reason for hiding this comment

dbutenhof Jul 25, 2023

Choose a reason for hiding this comment

webbnh Jul 25, 2023

Choose a reason for hiding this comment

webbnh left a comment

Choose a reason for hiding this comment

webbnh Jul 25, 2023

Choose a reason for hiding this comment

pravins commented Jul 25, 2023

webbnh commented Jul 25, 2023

ndokos commented Jul 17, 2023 •

edited by dbutenhof

Loading