Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata Validation: prevent reuse of visited_type_ids #1066

Closed
jsdw opened this issue Jul 13, 2023 · 0 comments · Fixed by #1075
Closed

Metadata Validation: prevent reuse of visited_type_ids #1066

jsdw opened this issue Jul 13, 2023 · 0 comments · Fixed by #1075
Assignees

Comments

@jsdw
Copy link
Collaborator

jsdw commented Jul 13, 2023

We pass a visited_type_ids HashSet into a bunch of functions during validation of metadata. The point of this is to spot when we are recursing into the same type ID we've already seen, and insetad of getting stuck in an infintie loop, returning some "magic" hash.

The problem is that we reuse this visited_type_ids when hashing multiple types in some cases, so eg if we hash the types:

enum Foo {
    A(Bar, Wibble),
}

enum Bar { ... }

enum Wibble { ... }

We end up with a hashing approach something like:

hash("Foo") + 
    xor(
        hash("A") + hash(Bar) + hash(Wibble)
    )

hash("Bar") == RECURSIVE_HASH

hash("Wibble") == RECURSIVE_HASH

Ie, we hash Foo without any issues, but then when we try to hash Bar and Wibble reusing visited_type_ids, we have seen both types already and so both get replaced with some constant RECURSIVE_HASH to avoid infinite loops.

The solution is to ensure that visited_ids is not reused across type hashings at all.

One way to help ensure this is to make it so that get_type_hash does not take visited_ids as an argument and instead creates it internally. It then internally will call get_type_def_hash and whatever by passing this new visited_ids in, and anything needing to recusviely call get_type_hash again can call some inner get_type_hash_recursive call or something which does accept visited_ids and is not exposed/available for anything else to call.

With that, every attempt to hash a type will atuomatically use a new internal "recursive ids set" and we'll prevent accidental re-use.

See #1041 (comment) for a bit of context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants