Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ambiguities and Priorities in 1.2.1 earley #1443

Open
MaxOstrowski opened this issue Aug 13, 2024 · 6 comments
Open

Ambiguities and Priorities in 1.2.1 earley #1443

MaxOstrowski opened this issue Aug 13, 2024 · 6 comments
Labels

Comments

@MaxOstrowski
Copy link

Since lark version 1.2.1, there seem to be (more?) ambiguities produced for our syntax.
While the syntax is ambiguous, we used prioritiies to fix this before.

%import common.INT
%import common.FLOAT
?start: (condition"."?)?

...

unsigned_integer.1: INT
unsigned_float    : FLOAT 

So whenever a condition ends with a number, like:
x = 5.
there are 2 interpretations, either the dot comes from the start rule, and 5 is an INTEGER, or the DOT comes from a float 5. and the dot from the start rule is omitted.
The priority unsigned_integer.1: INT fixed this before.

  1. Is it normal that we still get 2 different parse trees (earley parser)
  2. We implemented the following rule in the transformer to simply select the preferred ast:
    def _ambig(self, args: list[Any]) -> Any:
        """Take most preferred version."""
        return args[0]

but it feels like a big waste of performance that all possible ambiguities are produced.
Is there a way to enforce only producing the most preferred ast?

Might be related to #1441.

@erezsh
Copy link
Member

erezsh commented Aug 13, 2024

Hi @MaxOstrowski ,

What arguments are you giving to Lark? Specifically as the "ambiguity" parameter?

@MaxOstrowski
Copy link
Author

We tried all of them, none of them made any difference, so we thought that we might have used it wrong?
Lark.open(file, ambiguity="resolve")

@erezsh
Copy link
Member

erezsh commented Aug 13, 2024

Thank you for reporting this bug.

Looks like the issue is caused by a bugfix I included in v1.2.1.

I just created a PR with a fix: #1444

If you can test it, let me know if it fixed the issue.

I will probably make a new release later today.

@erezsh
Copy link
Member

erezsh commented Aug 13, 2024

Okay, new version released - https://github.com/lark-parser/lark/releases/tag/1.2.2

It should be now fixed. Let me know otherwise.

Sorry for the inconvenience!

@erezsh erezsh closed this as completed Aug 13, 2024
@erezsh
Copy link
Member

erezsh commented Aug 13, 2024

Actually I will re-open. Perhaps your problem isn't entirely solved.

P.S. regarding the efficiency concerns - Earley must be aware of all the ambiguities in order to parse the text correctly. Lark stores the ambiguities in the SPPF structure, which is fairly efficient.

I think the final step can be rewritten a bit more efficiently, because we currently produce the trees even for the root solutions that we don't return. It just takes a bit of reshuffling, I'll try and do it tomorrow.

@erezsh erezsh reopened this Aug 13, 2024
@MaxOstrowski
Copy link
Author

MaxOstrowski commented Aug 14, 2024

Regarding our problem, with the newest version (1.2.2) we do not get any ambiguous solutions anymore. Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants