Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Get the cache storage passing mypy #11354

Closed
wants to merge 17 commits into from
Closed

Conversation

DMRobertson
Copy link
Contributor

This was a PITA wrangling the inheritance sphagetti.

Review commit-by-commit.

@DMRobertson DMRobertson requested a review from a team as a code owner November 16, 2021 11:44
@DMRobertson DMRobertson marked this pull request as draft November 16, 2021 11:44
David Robertson added 2 commits November 16, 2021 11:48
for e.g. get_latest_event_ids_in_room

Needed on CacheInvalidationWorkerStore for many functions, e.g.
`have_seen_event`.

Fixup MRO in CensorEventsStore.

The BaseSlavedStore is trickier. The removals should be safe because the
removed classes are in parents, via SlavedEventStore -> BaseSlavedStore
-> CacheInvalidationWorkerStore
@DMRobertson
Copy link
Contributor Author

Need to fix portdb.

@clokep clokep removed the request for review from a team November 16, 2021 17:37
@DMRobertson
Copy link
Contributor Author

That sytest failure on buster multi-postgres is a suspected flake, see matrix-org/sytest#1167

@DMRobertson
Copy link
Contributor Author

Think this is reviewable now. Suggest going commit-by-commit up to
5343d6f , then the last three are probably best viewed together.

@DMRobertson DMRobertson marked this pull request as ready for review November 16, 2021 18:02
David Robertson added 10 commits November 16, 2021 18:46
Could use a fancy TypeVar here but I restrained myself.
for e.g. get_relations_for_event.

Remove it from SlavedEventsStore. (Still there via SlavedEventsStore ->
BaseSlavedStore -> CacheInvalidationWorkerStore -> RelationsWorkerStore.
for _membership_stream_cache
I need StateDeltasStore for _curr_state_delta_stream_cache.
I also need the `clock` attribute, which pathologically only seems to be
defined on StatsStore. StatsStore is a subclass of StateDeltasStore, so
let's just use that to keep the list of parents minimal.

To be explicit: I think it's very very odd that only StatsStore provides
a `clock`. I think pulling it in here is silly: why should
CacheInvalidationWorkerStore care about stats? Consider this a protest
commit to raise attention to the bizarre status quo.
for get_rooms_for_user_with_stream_ordering.

This was the most painful MRO problem to resolve.

Remove BaseSlavedStore and SlavedEventStore from
GenericWorkerSlavedStore. These are already pulled in indirectly by
inheriting from SlavedPushRuleStore.
Being honest I just tried to do the minimal thing possible to get it to
work.
@DMRobertson
Copy link
Contributor Author

I've rebased to try and trim down the commits. There are still quite a few small commits trying to make one surgical change to the storage class inheritance, with comments to try and justify them. I still think it's worth going through commit-by-commit.

I'm still quite scared by all this. I'm not sure if we should go ahead with all of this change (input welcome). I think it's useful to see what dependencies we have, even if they might not all make sense. Maybe this can inform us on how to do things better (#11165) if nothing else.

Debugging this to get a consistent MRO between all the datastore classes was the motivation for #11357.

@clokep clokep requested a review from a team November 16, 2021 18:53
synapse/storage/databases/main/cache.py Show resolved Hide resolved
synapse/replication/slave/storage/events.py Outdated Show resolved Hide resolved
synapse/storage/databases/main/cache.py Outdated Show resolved Hide resolved
David Robertson added 3 commits November 18, 2021 18:42
And while we're at it, StatsStore doesn't needs its own clock attribute.
already has it via SlavedEventStore -> BaseEventStore ->
CacheInvalidationWorkerStore -> StreamWorkerStore
@DMRobertson
Copy link
Contributor Author

@clokep thanks for your comments! Could you take another look once CI has run?

@clokep
Copy link
Member

clokep commented Nov 18, 2021

@DMRobertson Looks like portdb is failing.

@DMRobertson
Copy link
Contributor Author

And trial too:

   File "/home/runner/work/synapse/synapse/synapse/visibility.py", line 149, in allowed
    oldest_allowed_ts = storage.main.clock.time_msec() - max_lifetime
builtins.AttributeError: 'DataStore' object has no attribute 'clock'

@DMRobertson DMRobertson marked this pull request as draft November 18, 2021 19:58
@DMRobertson
Copy link
Contributor Author

And trial too:

   File "/home/runner/work/synapse/synapse/synapse/visibility.py", line 149, in allowed
    oldest_allowed_ts = storage.main.clock.time_msec() - max_lifetime
builtins.AttributeError: 'DataStore' object has no attribute 'clock'

Why didn't mypy spot this?

                    reveal_type(storage.main)
synapse/visibility.py:149: note: Revealed type is "Any"

Gonads.

I think this has now become a Tomorrow Problem.

@clokep
Copy link
Member

clokep commented Nov 18, 2021

I believe this comes back to stores.main being DataStoreT?

@DMRobertson
Copy link
Contributor Author

I believe this comes back to stores.main being DataStoreT?

Agreed, but I don't think mypy can see that because Storage isn't generic over DataStoreT itself. (?)

@DMRobertson
Copy link
Contributor Author

@clokep I had a quick go at getting mypy to spot the problem that made the tests fail. I think it'll be a bit involved though; I'd like to handle that in a separate change and get this wrapped up.

To proceed, I can see three options:

  1. revert the last three changes, so that CacheInvalidationWorkerStore inherits from StatsStore again. (i.e. reset back to dfe4779)
  2. keep the last three changes, and change the .clock attribute reference that's causing tests to fail to ._clock.
  3. keep the last three changes. Introduce a clock read-only @property on SQLBaseStore that exposes the _clock.

I think my preferences would be 3 > 1 > 2; what do you think? Perhaps there's another way to proceed that I'm not seeing.

@clokep
Copy link
Member

clokep commented Nov 19, 2021

@DMRobertson I think my preference would be a twist on idea 2 -- the twist being splitting it out to a separate PR, which I hope will help isolate the change.

@clokep
Copy link
Member

clokep commented Nov 23, 2021

Removing my review since I've given thoughts already -- @DMRobertson if you're still waiting for info for me please clarify what I can do to help move this along!

@clokep clokep removed their request for review November 23, 2021 12:42
@DMRobertson
Copy link
Contributor Author

DMRobertson commented Nov 24, 2021

@DMRobertson I think my preference would be a twist on idea 2 -- the twist being splitting it out to a separate PR, which I hope will help isolate the change.

Thanks. I had a quick go at this (I wanted to make Storage generic over DataStoreT) but found it to be quick tricky. I suggest we go with plan 2 and that I leave a comment summarising the situation.

@DMRobertson
Copy link
Contributor Author

More failures, some in portdb and some in sytest.

Feeling a bit swamped by this. I'm wondering if it might be better to close this and try to pull out some easier changes.

@DMRobertson
Copy link
Contributor Author

This has rotted. The last time we talked about the DB monstrosity I think we were in favour of a more drastic, composition-over-inheritance approach; see #11733.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants