Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Pull out less state when handling gaps #12828

Closed
wants to merge 3 commits into from

Conversation

erikjohnston
Copy link
Member

@erikjohnston erikjohnston commented May 21, 2022

After #12689 we can make the old_state parameter on compute_event_context take a StateMap[str] instead of a list of full events. As such, we no longer need to pull out all the state events from the DB when handling gaps in the room DAG.

@erikjohnston erikjohnston marked this pull request as ready for review May 23, 2022 09:36
@erikjohnston erikjohnston requested a review from a team as a code owner May 23, 2022 09:36
@richvdh richvdh self-assigned this May 23, 2022
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed bits of this, but haven't yet looked at others. TBH I'm finding it hard to keep track of all the threads, and am wondering if we can break it down.

For instance, start by changing compute_event_context to take a StateMap[str], and change the call sites which currently pass an Iterable[EventBase] to build the StateMap[str]. Then gradually chase it up each of the call trees.

@@ -273,12 +273,12 @@ async def compute_event_context(

Args:
event:
old_state: The state at the event if it can't be
state_ids_before_event: The state at the event if it can't be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
state_ids_before_event: The state at the event if it can't be
state_ids_before_event: The event ids of the state at the event if it can't be

@@ -288,11 +288,7 @@ async def compute_event_context(
#
# first of all, figure out the state before the event
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# first of all, figure out the state before the event
# first of all, figure out the state before the event, unless we already have it.

Comment on lines +797 to +798
if we already had all the prev events, `None`. Otherwise, returns
the state at `event`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if we already had all the prev events, `None`. Otherwise, returns
the state at `event`.
if we already had all the prev events, `None`. Otherwise, returns
the event ids of the state at `event`.

@@ -878,14 +861,14 @@ async def _resolve_state_at_missing_prevs(
"We can't get valid state history.",
affected=event_id,
)
return state
return state_map

async def _get_state_after_missing_prev_event(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename this to reflect that it returns event ids?

Suggested change
async def _get_state_after_missing_prev_event(
async def _get_state_ids_after_missing_prev_event(

This is probably relevant to other places too.

@@ -894,7 +877,7 @@ async def _get_state_after_missing_prev_event(
event_id: The id of the event we want the state at.

Returns:
A list of events in the state, including the event itself
The state *after* the given event.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The state *after* the given event.
The event ids of the state *after* the given event.

Comment on lines +153 to +155
clause, args = make_in_list_sql_clause(
self.database_engine, "e.event_id", event_ids
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to do some batching here? or assert that event_ids is not overlong?

Comment on lines +158 to +160
SELECT e.event_id, e.room_id, e.type, e.state_key FROM events AS e
LEFT JOIN state_events USING (event_id)
WHERE {clause}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrmph. This is exactly the sort of thing #11496 is supposed to fix, by making it so we don't have to hit two tables.

I suppose we're going to have to have fallback code anyway, so this is fine for now, but I'd really love it if we could find some tuits to keep progressing that.

for event_id, event in fetched_events.items()
if event.room_id != room_id
]
event_metadata = await self._store.get_metadata_for_events(state_event_ids)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate line?

Comment on lines 944 to 947
# check for events which were in the wrong room.
#
# this can happen if a remote server claims that the state or
# auth_events at an event in room A are actually events in room B
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment looks a bit lost now. Move it down?

@@ -1040,7 +1026,7 @@ async def _process_received_pdu(
self,
origin: str,
event: EventBase,
state: Optional[Iterable[EventBase]],
state: Optional[StateMap[str]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general theme: I'd love it if we could change the generic state parameters to state_ids_before_event or similar, to make it easier to understand whether we are dealing with event ids or complete events, without having to go and look at the type.

Similarly let's rename methods which now return collections of event ids where they previously returned events.

@erikjohnston
Copy link
Member Author

Replaced by #12852

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants