This repository has been archived by the owner on Jul 9, 2024. It is now read-only.

Dynamic rule reloading from external store #18

Open
nhoughto opened this issue Feb 23, 2020 · 7 comments · May be fixed by #25

Comments

@nhoughto
Contributor

Statically configured rules are a useful start, but being able to configure a rules source that is external to the proxy itself would be very useful. Being able to configure a local file source plus a remote source, such as an S3 bucket and path, would extend the use cases where inkfish is valuable and solve existing development pains in its current use cases.

Some considerations:

  • How often rules are pulled/polled/refreshed: we could end up with eventual-consistency woes for new deployments, where code is deployed but proxy rules aren't being honoured yet. The alternatives are an aggressive polling schedule (kinda annoying) or some event-based process, which is complicated to set up and maintain. Either way, there needs to be a deadline that consumers of the service can reason about.
  • Auditing / logging changes appropriately, etc.
@nhoughto
Contributor Author

nhoughto commented Oct 7, 2020

Any appetite for this to happen? It would be nice to have it to fix the update ergonomics.

@nhoughto
Contributor Author

nhoughto commented Oct 8, 2020

SNS/SQS just to receive S3 events is a pretty complicated setup. A 60-second poll against the nominated S3 bucket and path is much simpler and will be good enough to solve the problem.

Expect we want to support -config <local path> -config <s3 bucket and path> ..., where the configs are additive. As the config is made up of a number of related files, we need to be careful to avoid loading a half-written set of config. The simplest approach seems to be to wait a configurable number of seconds after the last detected change (bucket object modified time or local file modified time) before reloading; subsequent changes in that window would reset the timer. This gives config changes an N-second window to settle before they are picked up.
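A rough sketch of the settle window described above, in case it helps the discussion. The `settleTracker` type and its method names are made up for illustration and are not part of inkfish; the caller is assumed to feed it the modification times it sees on each poll.

```go
package main

import (
	"fmt"
	"time"
)

// settleTracker is a hypothetical helper (not inkfish code). It remembers the
// newest modification time seen across all config sources and reports when
// the config has been quiet for at least the settle window.
type settleTracker struct {
	window     time.Duration
	lastChange time.Time // newest modification time observed so far
	loaded     time.Time // modification time covered by the last reload
}

// Observe records a modification time from a local file or bucket object.
// A newer time effectively resets the settle timer.
func (s *settleTracker) Observe(modTime time.Time) {
	if modTime.After(s.lastChange) {
		s.lastChange = modTime
	}
}

// ShouldReload reports whether there is an unloaded change that has been
// stable for the whole settle window.
func (s *settleTracker) ShouldReload(now time.Time) bool {
	if !s.lastChange.After(s.loaded) {
		return false // nothing new since the last reload
	}
	return now.Sub(s.lastChange) >= s.window
}

// MarkLoaded notes that everything observed so far has been loaded.
func (s *settleTracker) MarkLoaded() { s.loaded = s.lastChange }

// demoSettle replays a small timeline with a 30s window and returns the
// ShouldReload answers at +10s, +40s and +50s, with a second change at +20s.
func demoSettle() []bool {
	t0 := time.Date(2020, 10, 8, 0, 0, 0, 0, time.UTC)
	s := &settleTracker{window: 30 * time.Second}
	s.Observe(t0)
	a := s.ShouldReload(t0.Add(10 * time.Second))
	s.Observe(t0.Add(20 * time.Second)) // second file written, timer resets
	b := s.ShouldReload(t0.Add(40 * time.Second))
	c := s.ShouldReload(t0.Add(50 * time.Second))
	return []bool{a, b, c}
}

func main() {
	fmt.Println(demoSettle())
}
```

The tracker only decides when to reload; pulling and validating the files would happen after `ShouldReload` returns true, followed by `MarkLoaded`.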

Once changes are ready to process, they need to be pulled down and validated. Once we are happy they are all valid, an atomic swap of the active config needs to happen for the rules to become active.
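For the validate-then-swap step, something along these lines might work. It is only a sketch: `ruleSet`, `validate` and `swapIfValid` are placeholder names, the validation is a stand-in for real parsing, and it assumes Go 1.19+ for `atomic.Pointer`.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// ruleSet is a hypothetical stand-in for the fully parsed proxy config.
type ruleSet struct {
	allowedHosts []string
}

// active holds the rule set currently used to serve traffic. Readers call
// active.Load() and always see either the old or the new set, never a mix.
var active atomic.Pointer[ruleSet]

// validate is a placeholder check; real validation would parse every file.
func validate(rs *ruleSet) error {
	if len(rs.allowedHosts) == 0 {
		return errors.New("rule set is empty")
	}
	for _, h := range rs.allowedHosts {
		if h == "" {
			return errors.New("rule set contains an empty host")
		}
	}
	return nil
}

// swapIfValid replaces the active rules only when the candidate validates;
// otherwise the previous rules stay in place.
func swapIfValid(candidate *ruleSet) error {
	if err := validate(candidate); err != nil {
		return fmt.Errorf("keeping previous rules: %w", err)
	}
	active.Store(candidate)
	return nil
}

func main() {
	_ = swapIfValid(&ruleSet{allowedHosts: []string{"example.com"}})
	err := swapIfValid(&ruleSet{allowedHosts: []string{""}})
	fmt.Println(err, active.Load().allowedHosts)
}
```

Building the whole candidate off to the side and publishing it with a single pointer store is what keeps request handlers from ever observing a half-loaded config.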

@nhoughto
Contributor Author

nhoughto commented Oct 8, 2020

S3 failures should trigger a warning in the logs, and it would be nice to have a metric as well, but they should not prevent the process from starting, fail a healthcheck, or in any other way endanger the proxy serving traffic with the rules it does have. It is important that it flags that it's not healthy, though.
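One possible shape for that failure handling, as a sketch only. The `reloader` type, its `fetch` field and the `degraded` flag are assumptions made up here; the idea is that a failed refresh logs a warning, bumps a counter that a metric could read, and leaves the last good rules untouched. It is single-goroutine code, so a real version would need locking or the atomic swap discussed earlier.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errStoreDown is a made-up error used to simulate a remote store outage.
var errStoreDown = errors.New("remote store unavailable")

// reloader is a hypothetical wrapper around a remote rule source.
type reloader struct {
	fetch    func() ([]string, error) // pulls rules from the remote store
	rules    []string                 // last rules that loaded successfully
	failures int                      // consecutive failed refreshes; could back a metric
}

// refresh tries to pull new rules. On failure it logs a warning and keeps
// serving with the rules it already has.
func (r *reloader) refresh() {
	rules, err := r.fetch()
	if err != nil {
		r.failures++
		log.Printf("warning: rule refresh failed (%d in a row), keeping current rules: %v", r.failures, err)
		return
	}
	r.rules = rules
	r.failures = 0
}

// degraded reports that the last refresh failed. It is meant to be surfaced
// separately from the healthcheck, so traffic keeps flowing while the
// problem is still visible.
func (r *reloader) degraded() bool { return r.failures > 0 }

func main() {
	calls := 0
	r := &reloader{fetch: func() ([]string, error) {
		calls++
		if calls == 2 {
			return nil, errStoreDown // simulate one outage
		}
		return []string{"allow example.com"}, nil
	}}
	for i := 0; i < 3; i++ {
		r.refresh()
		fmt.Println(len(r.rules), r.degraded())
	}
}
```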

@bls
Contributor

bls commented Nov 30, 2020

How do we feel about DynamoDB for dynamic proxy rules, rather than S3? It has slightly better ergonomics from the rule-update perspective (basically, rules map 1:1 to DDB table entries / Terraform resources). Expect the inkfish side would just use a polling strategy to pick up updates from the table.

@mengxuzhao
Collaborator

mengxuzhao commented Nov 30, 2020

I think the tricky bit is the rolling updates on the proxy EC2 instances. As they're not part of the kube cluster, we will need an external pub/sub mechanism set up to trigger updates on the EC2 instances (either reload or restart proxy.service). Alternatively, a relatively proactive approach: install the SSM agent on the proxy EC2 instances so that any update to either S3 or DynamoDB (whichever stores the proxy rules) can trigger a RunCommand (a document with shell-based instructions for updating proxy.service) on the EC2 instances. That doesn't need a polling daemon; proxy updates would be event-triggered.

@bls
Contributor

bls commented Nov 30, 2020

Was thinking the proxy process itself could poll a DynamoDB table or bucket (say every 20s?) so it wouldn't need to restart.
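A minimal sketch of what in-process polling could look like, using only the standard library. `pollLoop` and `demoPollCount` are illustrative names, and `refresh` stands in for whatever would read the table or bucket; the demo uses millisecond intervals so it finishes quickly, where the real interval would be something like the 20s suggested above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollLoop calls refresh once immediately and then on every tick until ctx
// is cancelled. refresh is a placeholder for the code that would read the
// remote store and update the rules.
func pollLoop(ctx context.Context, interval time.Duration, refresh func()) {
	refresh()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			refresh()
		}
	}
}

// demoPollCount runs the loop for a short time with a short interval and
// returns how many times refresh was called.
func demoPollCount() int {
	ctx, cancel := context.WithTimeout(context.Background(), 55*time.Millisecond)
	defer cancel()
	n := 0
	pollLoop(ctx, 10*time.Millisecond, func() { n++ })
	return n
}

func main() {
	fmt.Println("refresh calls:", demoPollCount())
}
```

In the proxy this would run in its own goroutine next to the listener, with the context cancelled on shutdown, so rules change without restarting the process.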

Another question is whether we would want to keep the existing static rules (which require a roll to change) for platform (non-app) stuff, or whether all rules should be updatable dynamically.

@nhoughto
Contributor Author

nhoughto commented Nov 30, 2020

So a DDB table item = what? Roughly equivalent to a file in the S3 implementation?
Does this work around the problem of atomic updates to multiple files/keys, which was the biggest problem from memory?

Seems like if you had a way to apply updates dynamically, it could/would be used for everything? It's an implementation detail, but why wouldn't you?

One consideration for how granular to make DDB items would be IAM access to update them. Roughly equivalent to an S3 object though, I guess; could be important in a shared env.

3 participants