Online yacc/lex grammar editor/tester #36

Open
mingodad opened this issue Sep 7, 2023 · 11 comments

@mingodad

mingodad commented Sep 7, 2023

I'm trying to build an online yacc/lex (LALR(1)) grammar editor/tester to help develop, debug, and document grammars. The main repository is https://github.com/mingodad/parsertl-playground and the online playground, with several non-trivial examples, is at https://mingodad.github.io/parsertl-playground/playground/ .

Select a grammar/example from the "Examples" select box and then click "Parse" to see a parse tree for the source in the "Input source" editor.

It's based on https://github.com/BenHanson/gram_grep and https://github.com/BenHanson/lexertl14 .

Any feedback is welcome!

The grammars available so far (in varying states of correctness):

  • Ada parser
  • Akwa parser (not working)
  • Aliceml parser (partially working)
  • Ansi C11 parser (partially working)
  • Ansi C18 parser (partially working)
  • Ansi C parser
  • Antlr4.5 parser
  • Bison parser (partially working)
  • Blawn parser
  • Braille parser (not working)
  • C3c parser
  • Calculator parser
  • Carbon parser (need review of '*')
  • Chapel parser
  • CocoR parser (partially working)
  • Cpp5-v2 parser (not working)
  • Cql parser
  • Cxx parser (not working)
  • Doxygen code scanner torture
  • Ecere parser (not working)
  • GramGrep parser
  • HTML parser
  • idl2cpp parser
  • Ispc parser
  • Java11 parser
  • JavascriptCore parser
  • Jq parser (partially working)
  • Json5 parser
  • Json lexer
  • Json parser
  • LALR parser
  • Lark parser
  • Lezer parser (partially working)
  • LFortran parser (partially working)
  • Linden Script parser
  • Lox parser
  • LPegrex parser (partially working)
  • LPython parser (not working)
  • Lua2ljs parser
  • Lua-5.3 parser
  • Lua parser
  • LuaPP parser (partially working)
  • Minic parser
  • Minizinc parser (not working)
  • MSTA parser
  • Mulang parser (not working)
  • Openscad parser (partially working)
  • Peg parser (partially working)
  • Pikchr parser
  • Playground3 parser
  • Playground parser
  • PnetC parser
  • PnetCSHarp parser
  • PnetDPas parser
  • PnetJava parser
  • PnetVBasic parser
  • Postgresql parser (be patient)
  • Preprocessor parser (not working)
  • Rust parser
  • Scheme parser
  • Souffle parser
  • Tameparse parser (not working)
  • Textdiagram parser
  • Textmapper parser
  • Webassembly interpreter parser
  • XML parser
  • Z80 assembler parser
@gwenn
Owner

gwenn commented Sep 8, 2023

For your information,

  • The SQLite lexer does some lookahead to avoid conflicts here.
  • The SQLite lemon parser has two specific features: fallback and wildcard.

So I guess you will not be able to easily port the SQLite grammar.
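
To make the fallback idea concrete, here is a minimal Python sketch. It is not lemon's implementation: the token names, the `FALLBACK_TO_ID` set, and the fixed `EXPECTED` sequence are all hypothetical, and a real parser consults its action table rather than a flat list. It only illustrates the principle that a keyword token which cannot be used as a keyword at the current position is retried as `ID`.

```python
# Hypothetical token kinds for a toy "CREATE TABLE name ( column type ) ;" statement.
KEYWORDS = {"CREATE", "TABLE", "VIEW", "TEXT"}
# Assumption: keywords that may degrade to ID when they do not fit as keywords.
FALLBACK_TO_ID = {"VIEW", "TEXT"}
PUNCTUATION = {"(", ")", ";"}
# The only statement shape this toy "grammar" accepts.
EXPECTED = ["CREATE", "TABLE", "ID", "(", "ID", "ID", ")", ";"]


def tokenize(sql):
    """Split the statement into (kind, text) pairs."""
    for ch in PUNCTUATION:
        sql = sql.replace(ch, f" {ch} ")
    tokens = []
    for word in sql.split():
        if word in PUNCTUATION:
            tokens.append((word, word))
        elif word.upper() in KEYWORDS:
            tokens.append((word.upper(), word))
        else:
            tokens.append(("ID", word))
    return tokens


def parses(sql, use_fallback):
    """Return True if the statement matches EXPECTED, optionally using fallback."""
    tokens = tokenize(sql)
    if len(tokens) != len(EXPECTED):
        return False
    for (kind, _text), want in zip(tokens, EXPECTED):
        if kind == want:
            continue
        # Fallback: the keyword does not fit here, so retry it as a plain ID.
        if use_fallback and want == "ID" and kind in FALLBACK_TO_ID:
            continue
        return False
    return True


print(parses("CREATE TABLE test (view TEXT);", use_fallback=False))  # False
print(parses("CREATE TABLE test (view TEXT);", use_fallback=True))   # True
```

A plain yacc/LALR(1) grammar has no such retry step, so every keyword that should also be usable as an identifier has to be listed explicitly in a grammar rule.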

@mingodad
Author

mingodad commented Sep 8, 2023

Maybe, but there is https://github.com/ricomariani/CG-SQL-author, which is a superset of SQLite that you can also try online at https://mingodad.github.io/CG-SQL-Lua-playground/ . Its grammar is available in the playground under the name Cql parser.

@gwenn
Owner

gwenn commented Sep 8, 2023

As expected, it doesn't work:

CREATE TABLE test (view TEXT); -- view falls back to ID instead of keyword

gives

code.cql:1:1: error: syntax error, unexpected VIEW
Parse errors found, no further passes will run.

And

CREATE VIRTUAL TABLE t3 using fts5(a,b,c); -- a,b,c are wildcards 

gives

code.cql:1:1: error: syntax error, unexpected ';', expecting AS
Parse errors found, no further passes will run.

@mingodad
Author

mingodad commented Sep 8, 2023

Thank you for pointing it out!
I've tested CREATE TABLE test (view TEXT); with the Postgresql parser (be patient) and it parses fine. Going back to the Cql parser, on line 415:

name :
	ID
	| TEXT
	| TRIGGER
	| ROWID
	| REPLACE
	| KEY
	| VIRTUAL
	| TYPE
	| HIDDEN
	| PRIVATE
	| VIEW /* added here so that VIEW is accepted as an ID */
	;

You can see that the authors of CG/SQL decided not to allow view as a valid ID, but if we add it there (as shown above) it parses fine too. This is one of the reasons I'm developing this tool, https://mingodad.github.io/parsertl-playground/playground/ : to allow quick experimentation with, and debugging and development of, YACC/LEX LALR(1) grammars.

In the case of CREATE VIRTUAL TABLE t3 using fts5(a,b,c); there is no support for the fts5 extension in CG/SQL so far.

@mingodad
Author

mingodad commented Sep 8, 2023

I just added a partially working sqlite3 grammar, converted from the original parser using the changes I made to lemon here https://github.com/mingodad/lalr-parser-test/tree/main/lemon :

mylemon -h
Valid command line options for "lemon-nb" are:
  -b           Print only the basis in report.
  -c           Don't compress the action table.
  -d<string>   Output directory.  Default '.'
  -D<string>   Define an %ifdef macro.
  -E           Print input file after preprocessing.
  -f<string>   Ignored.  (Placeholder for -f compiler options.)
  -g           Print grammar without actions.
  -y           Print yacc grammar without actions.
  -Y           Print yacc grammar without actions with full precedences.
  -z           Use yacc rule precedence
  -u           Ignore all precedences
  -I<string>   Ignored.  (Placeholder for '-I' compiler options.)
  -m           Output a makeheaders compatible file.
  -l           Do not print #line statements.
  -O<string>   Ignored.  (Placeholder for '-O' compiler options.)
  -p           Show conflicts resolved by precedence rules
  -q           (Quiet) Don't print the report file.
  -r           Do not sort or renumber states
  -s           Print parser stats to standard output.
  -S           Generate the *.sql file describing the parser tables.
  -x           Print the version number.
  -T<string>   Specify a template file.
  -W<string>   Ignored.  (Placeholder for '-W' compiler options.)

mylemon -Y parser.y

Then added the lexer part by hand.

It doesn't handle CREATE TABLE test (view TEXT); because I didn't add a rule that lets non-reserved keywords be accepted as ID, but it does handle CREATE VIRTUAL TABLE t3 using fts5(a,b,c); .

On https://mingodad.github.io/parsertl-playground/playground/ select SQLite3 parser (partially working) and play with it. Any fixes (pull requests) are welcome.

@ricomariani

One big regret I have from building CQL is that I didn't just start from the SQLite grammar. My life would have been so much simpler...

We started from some MySQL and it was good enough for us, but then getting more of the grammar in place became harder and harder. Oh well, that ship has sailed.

Note that the CG/SQL grammar is not a strict superset of SQLite. It's a Venn diagram. For instance, CQL does not and cannot reasonably support column names that are not valid identifiers. And it's stricter in many areas. But it does support some useful sugar that isn't in the original.

@ricomariani

Oh, I should add: because CQL uses yacc we get LALR(1), and that means some fallbacks that SQLite can do, we can't. A few other choices were made to avoid shift/reduce conflicts.

The grammar is pretty good but I would never call it a superset. The presence of keywords is crucial for ambiguity removal and indeed if you added a lot of names as ids you could find the grammar in a bad state.
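
As a rough illustration of why keywords matter for disambiguation, the Python sketch below counts the parses of a token sequence under a toy grammar. The grammar is hypothetical and is neither CQL's nor SQLite's: `table_ref` is a name with an optional alias, and a join is an optional `LEFT` followed by `JOIN` and another `table_ref`. Once `LEFT` is also allowed as a name, `t LEFT JOIN u` has two readings, and an ambiguous grammar like that cannot be LALR(1) without conflicts.

```python
def count_parses(tokens, name_kinds):
    """Count parses of: table_ref (['LEFT'] 'JOIN' table_ref)*

    table_ref is `name [alias]`, where both name and alias must be a token
    kind listed in name_kinds. This is a toy grammar for illustration only.
    """

    def at(i):
        return tokens[i] if i < len(tokens) else None

    def table_ref_ends(i):
        # Every position at which a table_ref starting at i may end.
        ends = []
        if at(i) in name_kinds:
            ends.append(i + 1)          # name without alias
            if at(i + 1) in name_kinds:
                ends.append(i + 2)      # name followed by alias
        return ends

    def joins(i):
        # Number of ways to parse tokens[i:] as zero or more joins.
        if i == len(tokens):
            return 1
        if at(i) == "LEFT" and at(i + 1) == "JOIN":
            start = i + 2
        elif at(i) == "JOIN":
            start = i + 1
        else:
            return 0
        return sum(joins(end) for end in table_ref_ends(start))

    return sum(joins(end) for end in table_ref_ends(0))


tokens = ["ID", "LEFT", "JOIN", "ID"]            # t LEFT JOIN u
print(count_parses(tokens, {"ID"}))              # 1: LEFT is only a keyword
print(count_parses(tokens, {"ID", "LEFT"}))      # 2: LEFT may also be an alias of t
```

With `LEFT` allowed as a name, the second reading treats `LEFT` as the alias of `t` followed by a plain `JOIN`, which is the kind of conflict a generator like yacc would report.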

@ricomariani

One other thing. CG/SQL departs significantly from SQLite on virtual tables because it can't do its job at all unless it knows the datatypes of the columns in the virtual table -- it offers strict typing. So there is totally new syntax for specifying the types of the columns as well as the module.

@ricomariani

ricomariani commented Sep 8, 2023

CREATE VIRTUAL TABLE t3 using fts5(a,b,c);

Could be made to work, I think, but you need to tell it the shape of the resulting table. CG/SQL doesn't care about virtual tables other than needing to know what columns they have.

e.g.

CREATE VIRTUAL TABLE t3 using fts5(a,b,c) AS (a1 int, a2 text, b1 int, b2 text, b3 text, c1 int);

Note the AS portion is unique to CG/SQL -- it has whatever the combined indexed columns will be.

then you can do

select * from t3 where a2 match 'something';
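
To show what "telling it the shape" amounts to, here is a small Python sketch that pulls the declared columns out of the trailing AS (...) clause. It is not how CG/SQL processes the statement: the function name `declared_shape` and the regular expression are assumptions made for illustration, and it only handles simple `name type` column pairs.

```python
import re


def declared_shape(stmt):
    """Return {column: type} from a trailing AS (...) clause, or None if absent.

    Hypothetical helper: handles only simple "name type" pairs separated by commas.
    """
    match = re.search(r"\)\s+AS\s*\((.*)\)\s*;?\s*$", stmt, re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    shape = {}
    for column in match.group(1).split(","):
        name, type_ = column.split()
        shape[name] = type_
    return shape


with_shape = (
    "CREATE VIRTUAL TABLE t3 using fts5(a,b,c) "
    "AS (a1 int, a2 text, b1 int, b2 text, b3 text, c1 int);"
)
print(declared_shape(with_shape)["a2"])  # text
print(declared_shape("CREATE VIRTUAL TABLE t3 using fts5(a,b,c);"))  # None
```

With the shape available, a strictly typed checker can know that `a2` is text when it sees `a2 match 'something'`. Without the AS clause there is nothing to check against.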

@mingodad
Author

mingodad commented Sep 9, 2023

Hello @ricomariani!
Thank you so much for your great work and for taking the time to help us understand the issues raised here and the general view of CG/SQL!

@ricomariani

ricomariani commented Sep 9, 2023

FWIW I just added "add" and "view" to the allow list. That didn't cause any conflicts. So you could just pull again.

And you're welcome :D
