-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vizier feature flags are set only at deploy time #1632
Labels
area/deployment
Issues replated to deployments
area/vizier
good first issue
Good for newcomers
kind/bug
Something isn't working
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
Comments
aimichelle
added
kind/bug
Something isn't working
good first issue
Good for newcomers
area/deployment
Issues replated to deployments
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
area/vizier
labels
Jul 21, 2023
vihangm
pushed a commit
that referenced
this issue
Aug 9, 2023
Summary: This PR allows the VizierID to be used when getting the org as opposed to only relying on the DeployKey when adding feature flags. Relevant Issues: #1632 Type of change: /kind bug Test Plan: Another PR needs to come in on the operator side to retrieve/send the VizierID to the cloud. Getting this PR in on the cloud side will simplify the process of testing the integration between the two later. --------- Signed-off-by: Kartik Pattaswamy <kpattaswamy@pixielabs.ai>
aimichelle
pushed a commit
that referenced
this issue
Aug 11, 2023
Summary: After testing the interaction between the operator and config manager service, it seems that the config manager service segmentation faults when it receives the VizierID from the operator as it's not able to properly convert it to type uuid, causing the service to restart. This needs to be fixed and in the meantime these changes should be reverted. Relevant Issues: #1632 Type of change: /kind bug Test Plan: Make sure it still builds Signed-off-by: Kartik Pattaswamy <kpattaswamy@pixielabs.ai>
This was referenced Aug 11, 2023
vihangm
pushed a commit
that referenced
this issue
Aug 15, 2023
Summary: This PR allows the VizierID to be used when getting the Vizier's org as opposed to only relying on the DeployKey when adding feature flags. This PR also contains changes to prevent a segmentation fault that occurred previously. Relevant Issues: #1632 Type of change: /kind bug Test Plan: Skaffold the cloud changes on the testing cluster and then skaffold the operator changes on a dev cluster. Manually trigger an update on the operator side to send the VizierID to the config manager service. Check the config manager service's logs to see that the new logic is being executed as expected to handle the VizierID. --------- Signed-off-by: Kartik Pattaswamy <kpattaswamy@pixielabs.ai>
copybaranaut
pushed a commit
to pixie-io/pxapi.go
that referenced
this issue
Aug 15, 2023
Summary: This PR allows the VizierID to be used when getting the Vizier's org as opposed to only relying on the DeployKey when adding feature flags. This PR also contains changes to prevent a segmentation fault that occurred previously. Relevant Issues: pixie-io/pixie#1632 Type of change: /kind bug Test Plan: Skaffold the cloud changes on the testing cluster and then skaffold the operator changes on a dev cluster. Manually trigger an update on the operator side to send the VizierID to the config manager service. Check the config manager service's logs to see that the new logic is being executed as expected to handle the VizierID. --------- Signed-off-by: Kartik Pattaswamy <kpattaswamy@pixielabs.ai> GitOrigin-RevId: 87180347a8b9bfca31b9b561712b64d29bd97807
JamesMBartlett
pushed a commit
that referenced
this issue
Aug 16, 2023
Summary: This PR adjusts the cloud side to accept a type uuidpb.UUID for the VizierID instead of a string. Relevant Issues: #1632 Type of change: /kind bug Test Plan: Skaffold the cloud changes on the testing cluster and then skaffold the operator changes on a dev cluster. Manually trigger an update on the operator side to send the VizierID to the config manager service. Check the config manager service's logs to see that the new logic is being executed as expected to handle the VizierID. Signed-off-by: Kartik Pattaswamy <kpattaswamy@pixielabs.ai>
JamesMBartlett
pushed a commit
that referenced
this issue
Aug 16, 2023
…1670) Summary: This PR lets the operator get the VizierID from the cluster's secrets and then sends it to the config manager service for it to ultimately set feature flags. Relevant Issues: Fixes #1632 Type of change: /kind bug Test Plan: Skaffold the cloud changes on the testing cluster and then skaffold the operator changes on a dev cluster. Manually trigger an update on the operator side to send the VizierID to the config manager service. Check the config manager service's logs to see that the new logic is being executed as expected to handle the VizierID. --------- Signed-off-by: Kartik Pattaswamy <kpattaswamy@pixielabs.ai>
ddelnano
added a commit
that referenced
this issue
May 13, 2024
#1899) Summary: Revert change that causes `px deploy` to timeout on initial deployment This reverts #1670, which creates a circular dependency between the operator and the cloud connector service. The cloud connector service is responsible for [registering a vizier](https://github.com/pixie-io/pixie/blob/fb18345808e6fbd8aed02dac8163469322b84205/src/vizier/services/cloud_connector/bridge/server.go#L290-L302) and populating the `pl-cluster-secrets` k8s secret with its vizier ID. The operator is responsible for creating the vizier services (including the cloud connector), so [this call](https://github.com/pixie-io/pixie/blob/fb18345808e6fbd8aed02dac8163469322b84205/src/operator/controllers/vizier_controller.go#L503) will never succeed on a fresh install. This bug doesn't cause pixie installs to fail completely. A `px deploy` cli command will time out after 10 minutes and then the vizier is deployed following that timeout expiration. While it eventually converges to a healthy vizier, this is a poor user experience. The perf tool is also experiencing this problem, but because it requires `px deploy` to return a successful status code it is causing it to fail completely. Relevant Issues: Reopens #1632 Type of change: /kind bug Test Plan: Reverted this change and verified that the a `skaffold`'ed operator doesn't timeout Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
@kpattaswamy FYI this was reverted because it was causing |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/deployment
Issues replated to deployments
area/vizier
good first issue
Good for newcomers
kind/bug
Something isn't working
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
Describe the bug
We have a system for setting feature flags on Viziers. The flow is as follows:
pixie/src/api/proto/cloudpb/cloudapi.proto
Line 945 in ea9a357
deployKey
is included as part of the vzspec.However, this flow only works when the deployKey is valid (so basically, only at deploy time). Most times, a deployKey is generated when deploying a Vizier, and automatically deleted once the Vizier is up-and-running. When the Vizier updates, since the deployKey is no longer valid, the config manager service is unable to fetch the associated orgID. If it is unable to get an orgID, it simply does not add any of the feature flags to the Vizier YAMLs.
Expected behavior
Regardless of whether a user is deploying or updating a Vizier, the feature flags should be set properly based on which org is deploying the Vizier. Since the deploy key is not always valid, we can determine the org associated with a Vizier through its Vizier ID.
ConfigForVizierRequest
protos:src/api/proto/cloudpb/cloudapi.proto
,src/cloud/config_manager/configmanagerpb/service.proto
ConfigForVizierRequest
:pixie/src/operator/controllers/vizier_controller.go
Line 809 in ea9a357
pixie/src/operator/controllers/vizier_controller.go
Lines 1018 to 1024 in ea9a357
pixie/src/cloud/config_manager/controllers/server.go
Lines 222 to 223 in ea9a357
pixie/src/cloud/vzmgr/vzmgrpb/service.proto
Line 47 in ea9a357
The text was updated successfully, but these errors were encountered: