What's wrong with this ANTLR grammar?

I want to parse query expressions that look like this:

Person Name=%John%

(Person Name=John% and Address=%Ontario%)

Person Fullname_3="John C. Smith"

But I'm totally new to ANTLR4 and can't even figure out how to parse a single TABLE FIELD=QUERY clause. When I run the grammar below with Go as the target, I get

line 1:7 mismatched input 'Name' expecting {'not', '(', FIELDNAME}

for a simple query like

Person Name=John

Why can't the grammar parse FIELDNAME via fieldsearch->field EQ searchterm->FIELDNAME?

I guess I'm misunderstanding something very fundamental about how ANTLR grammars work, but what?

/* ANTLR Grammar for Minidb Query Language */

grammar Mdb;

start : searchclause EOF ;

searchclause
: table expr
;

expr
: fieldsearch
| unop fieldsearch
| LPAREN expr relop expr RPAREN
;

unop
: NOT
;

relop
: AND
| OR
;

fieldsearch
: field EQ searchterm
;

field
: FIELDNAME
;

table
: TABLENAME
;

searchterm
: STRING
;

AND
: 'and'
;

OR
: 'or'
;

NOT
: 'not'
;
EQ
: '='
;

LPAREN
: '('
;

RPAREN
: ')'
;

fragment VALID_ID_START
: ('a' .. 'z') | ('A' .. 'Z') | '_'
;

fragment VALID_ID_CHAR
: VALID_ID_START | ('0' .. '9')
;

TABLENAME
: VALID_ID_START VALID_ID_CHAR*
;

FIELDNAME
: VALID_ID_START VALID_ID_CHAR*
;

STRING: '"' ~('\n'|'"')* ('"' | { panic("syntax-error - unterminated string literal") } ) ;

WS
: [ \r\n\t]+ -> skip
;

Tags: antlr, antlr4, context-free-grammar

asked Nov 10 at 18:24 by Eric '3ToedSloth'

1 Answer (accepted)

Try looking at the tokens produced for that input using grun Mdb tokens -tokens. It will tell you that the input consists of two table names, an equals sign and then another table name (TABLENAME TABLENAME EQ TABLENAME). To match your grammar, it would have needed to be a table name, a field name, an equals sign and a string.



The first problem is that TABLENAME and FIELDNAME have the exact same definition. The lexer runs independently of the parser and does not know which token the parser expects next. When two lexer rules match the same number of characters at the current position, ANTLR picks the one that comes first in the grammar, so the lexer will never produce a FIELDNAME token. To fix that, replace both rules with a single ID rule. If you want to keep the names, you can then introduce parser rules tableName : ID ; and fieldName : ID ;.
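
As a rough sketch of that fix, the two identical lexer rules could be collapsed as shown below. It assumes the question's existing parser rule names table and field are kept (rather than tableName and fieldName), so that the rest of the grammar needs no changes. That naming is a choice made for this illustration only.

```antlr
// Parser rules: the descriptive names stay, but both now refer to the same token.
table
    : ID
    ;

field
    : ID
    ;

// Lexer rule: a single identifier token replaces TABLENAME and FIELDNAME.
// It must still come after AND, OR and NOT in the grammar file.
ID
    : VALID_ID_START VALID_ID_CHAR*
    ;
```

Keeping ID below the keyword rules matters because of the same tie-break: for the input and, both AND and ID match three characters, and the rule listed first (AND) wins.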



The other problem is more straightforward: John does not match your rule for a string because it's not in quotes. If you want to allow John as a valid search term, you could define searchterm : STRING | ID ; instead of only allowing STRING tokens.
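
A minimal sketch of the relaxed rule, assuming the grammar already has a single ID token in place of TABLENAME and FIELDNAME as described for the first problem:

```antlr
// Accept either a quoted string or a bare identifier as the search term.
searchterm
    : STRING   // e.g. "John C. Smith"
    | ID       // e.g. John
    ;
```

This sketch does not cover the wildcard forms from the question, such as %John%. No lexer rule in the grammar matches the % character, so those inputs would still be rejected by the lexer and would need an additional token definition.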






answered Nov 10 at 18:44 by sepp2k (edited Nov 12 at 23:24)

• Perfect reply, I understand the problem now. Thank you so much!
  – Eric '3ToedSloth', Nov 10 at 22:07