What's wrong with this ANTLR grammar?
I want to parse query expressions that look like this:
Person Name=%John%
(Person Name=John% and Address=%Ontario%)
Person Fullname_3="John C. Smith"
But I'm totally new to ANTLR4 and can't even figure out how to parse a single TABLE FIELD=QUERY clause. When I run the grammar below with Go as the target, I get
line 1:7 mismatched input 'Name' expecting {'not', '(', FIELDNAME}
for a simple query like
Person Name=John
Why can't the Grammar parse FIELDNAME via parsing fieldsearch->field EQ searchterm->FIELDNAME?
I guess I'm misunderstanding something very fundamental here about how Antlr Grammars work, but what?
/* ANTLR Grammar for Minidb Query Language */
grammar Mdb;
start : searchclause EOF ;
searchclause
: table expr
;
expr
: fieldsearch
| unop fieldsearch
| LPAREN expr relop expr RPAREN
;
unop
: NOT
;
relop
: AND
| OR
;
fieldsearch
: field EQ searchterm
;
field
: FIELDNAME
;
table
: TABLENAME
;
searchterm
: STRING
;
AND
: 'and'
;
OR
: 'or'
;
NOT
: 'not'
;
EQ
: '='
;
LPAREN
: '('
;
RPAREN
: ')'
;
fragment VALID_ID_START
: ('a' .. 'z') | ('A' .. 'Z') | '_'
;
fragment VALID_ID_CHAR
: VALID_ID_START | ('0' .. '9')
;
TABLENAME
: VALID_ID_START VALID_ID_CHAR*
;
FIELDNAME
: VALID_ID_START VALID_ID_CHAR*
;
STRING: '"' ~('\n'|'"')* ('"' | { panic("syntax-error - unterminated string literal") } ) ;
WS
: [ \r\n\t] + -> skip
;
antlr antlr4 context-free-grammar
asked Nov 10 at 18:24
Eric '3ToedSloth'
1 Answer
accepted
Try looking at the tokens produced for that input using grun Mdb tokens -tokens. It will tell you that the input consists of two table names, an equals sign and then another table name. To match your grammar, it would have needed to be a table name, a field name, an equals sign and a string.
The first problem is that TABLENAME and FIELDNAME have the exact same definition. In cases where two lexer rules would produce a match of the same length on the current input, ANTLR prefers the one that comes first in the grammar. So it will never produce a FIELDNAME token. To fix that, just replace both of those rules with a single ID rule. If you want to keep the names, you can then introduce parser rules tableName : ID ; and fieldName : ID ;.
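As a rough sketch of that first fix (not a tested grammar): the fragment below reuses the question's existing parser rule names table and field instead of tableName and fieldName, so the rest of the grammar would not need to change. That naming choice is an assumption made for illustration.

```antlr
// Replaces the two identical lexer rules TABLENAME and FIELDNAME.
// Keep this rule below AND, OR and NOT: 'and' matches both AND and ID
// with the same length, and the rule listed first wins the tie.
ID
    : VALID_ID_START VALID_ID_CHAR*
    ;

// The parser, not the lexer, decides from position whether an
// identifier is a table name or a field name.
table
    : ID
    ;

field
    : ID
    ;
```

With this in place, Person and Name would both be lexed as ID, and the parser would assign them to table and field based on where they appear in searchclause.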
The other problem is more straightforward: John simply does not match your rule for a string, since it's not in quotes. If you do want to allow John as a valid search term, you might want to define it as searchterm : STRING | ID ; instead of only allowing STRINGs.
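Written out as a rule, the second fix might look like the fragment below. It assumes the single ID lexer rule suggested above has already replaced TABLENAME and FIELDNAME.

```antlr
// Accept either a quoted string or a bare identifier as the search term.
searchterm
    : STRING
    | ID
    ;
```

Note that this only covers unquoted terms made of identifier characters. A pattern such as %John% from the question would still fail at the lexer, because % is not part of VALID_ID_CHAR; supporting it would need an additional token or a wider character set.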
Perfect reply, I understand the problem now. Thank you so much!
– Eric '3ToedSloth'
Nov 10 at 22:07
edited Nov 12 at 23:24
answered Nov 10 at 18:44
sepp2k