Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: dataset update permission out of sync #25043

Merged

Conversation

zephyring
Copy link
Contributor

SUMMARY

The perm and schema_perm attributes for dataset model is out of sync due to unreliably using sqlalchemy get_history method where the history state depends on whether the session has flushed or not.
Previously we have logic in before_update and after_update and permission sync logic in after_update will not be deterministic due to session flush.

This PR moves permission sync logic into before_update and directly using connection.execute to check current db state instead of relying on session state to make sure we can detect actual db/table/schema changes to trigger permission update.

It also clean up shadow_dataset_write logic out as we no longer support it but the database attribute is still not lazily loaded when dataset.database attribute is accessed. Thus I leave the load_database method there still.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@@ -284,34 +282,6 @@ def wraps(self: "BaseSupersetView", *args: Any, **kwargs: Any) -> FlaskResponse:
return functools.update_wrapper(wraps, f)


def validate_sqlatable(table: models.SqlaTable) -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is not used anywhere, thus removing

else:
return None

if not DatasetDAO.validate_uniqueness(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check has already been made in dataset update command and proper domain exception thrown there, thus removing this extra check here.
We also have unique constraints at db layer for (db, schema, table) tuple

Copy link
Member

@craig-rueda craig-rueda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No tests?

Copy link
Member

@jfrag1 jfrag1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I still think it would be best to eventually move this logic into the DAO layer, but this is a great way to get things working properly consistently with sqlalchemy events.

To @craig-rueda's point, there are several existing tests prefixed test_after_update_dataset that test this functionality. I'm not sure any new tests are needed since this PR is pretty much just a refactor/improvement of this existing functionality, but it would probably be good to rename them to test_before_update_dataset to be consistent with naming

@zephyring
Copy link
Contributor Author

LGTM! I still think it would be best to eventually move this logic into the DAO layer, but this is a great way to get things working properly consistently with sqlalchemy events.

To @craig-rueda's point, there are several existing tests prefixed test_after_update_dataset that test this functionality. I'm not sure any new tests are needed since this PR is pretty much just a refactor/improvement of this existing functionality, but it would probably be good to rename them to test_before_update_dataset to be consistent with naming

agreed. Consolidating to domain's command layer is likely the better way of handling side effect like permission update, as there's existing patterns here.
But I would defer the problem to a larger refactoring until we all agree upon ideal state/target state of how permissions system should work in Superset.

This PR is purely continuing with existing patterns but more on a fix side of thing.

if self.database_id and (
not self.database or self.database.id != self.database_id
):
session = inspect(self).session
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for ensuring we're using the same session.

copy.id = model.id
# create a copy of the model here
# so it doesn't trigger before_update/after_update sqla listener each time
# we set a property
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an important change to the base update method.
By creating a out of session copy of the original model instance, we can set attributes on the copy without worrying the before/after update sqla listener will be triggered for each property.

The listener will only be triggered when session is committed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's one test test_run_sync_query_dont_exist that I cannot get passed by with this baseDao.update method change(using a copy model instance out of session and merge)
We probably need more thoughts on how to reduce number of events triggered on each property update, thus reverting here

@zephyring zephyring force-pushed the zef/ch73388/fixing_dataset_update_perm branch from cc1ec80 to f4c4568 Compare August 22, 2023 18:05
Copy link
Member

@eschutho eschutho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zephyring!

@zephyring zephyring force-pushed the zef/ch73388/fixing_dataset_update_perm branch from ed4d9f8 to 6dbee2f Compare August 22, 2023 23:48
Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! just one comment, would be nice to have a test also to assert what changed on this PR


# When database name changes
if history_database.has_changes() and history_database.deleted:
if current_db_id != target.database_id:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the dataset permission is a string that includes the database_id, table_name and schema if those change we have to update the permission string also

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forget it, those cases are covered on the next if statements

zef added 15 commits August 24, 2023 12:00
… sure sqla before/after update listener is only called once
…c method since we already have create. If we want upsert, we can create it separately.
This reverts commit 8c8098ee4bbe5b3d45d7fcf2eb5597cf0b74bbe0.
…ve atomic method since we already have create. If we want upsert, we can create it separately."

This reverts commit f4c45680b2e0073afd339295e80388ab801bef1e.
zef added 2 commits August 24, 2023 12:00
This reverts commit 5380ef80f1790711abd389e9c9e82f937d191b88.
… to make sure sqla before/after update listener is only called once"

This reverts commit c69412684368e6aadfc456a5c7100df33a03e3d2.
@zephyring zephyring force-pushed the zef/ch73388/fixing_dataset_update_perm branch from 6dbee2f to 225be2b Compare August 24, 2023 19:02
@eschutho eschutho merged commit 5168475 into apache:master Aug 25, 2023
29 checks passed
@eschutho eschutho deleted the zef/ch73388/fixing_dataset_update_perm branch August 25, 2023 18:34
@michael-s-molina michael-s-molina added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Aug 29, 2023
cccs-rc pushed a commit to CybercentreCanada/superset that referenced this pull request Mar 6, 2024
@mistercrunch mistercrunch added 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🚢 3.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants