Airflow Failed: ParseException line 2:0 cannot recognize input near


























I'm trying to run a test task on Airflow but I keep getting the following error:




FAILED: ParseException 2:0 cannot recognize input near 'create_import_table_fct_latest_values' '.' 'hql'




Here is my Airflow DAG file:



import airflow
from datetime import datetime, timedelta
from airflow.operators.hive_operator import HiveOperator
from airflow.models import DAG

args = {
    'owner': 'raul',
    'start_date': datetime(2018, 11, 12),
    'provide_context': True,
    'depends_on_past': False,
    'retries': 2,
    'retry_delay': timedelta(minutes=5),
    'email': ['raul.gregglino@leroymerlin.ru'],
    'email_on_failure': True,
    'email_on_retry': False
}

dag = DAG('opus_data',
          default_args=args,
          max_active_runs=6,
          schedule_interval="@daily"
          )

import_lv_data = HiveOperator(
    task_id='fct_latest_values',
    hive_cli_conn_id='metastore_default',
    hql='create_import_table_fct_latest_values.hql ',
    hiveconf_jinja_translate=True,
    dag=dag
)

deps = {}

# Explicitly define the dependencies in the DAG
for downstream, upstream_list in deps.iteritems():
    for upstream in upstream_list:
        dag.set_dependency(upstream, downstream)
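As a side note, `dict.iteritems()` only exists in Python 2; under Python 3 the dependency loop above raises `AttributeError` before the DAG even parses. A portable sketch of the same wiring (the `deps` entry is a hypothetical example, and the `dag.set_dependency` call is replaced by a `print` so the snippet runs without Airflow):

```python
# deps maps each downstream task id to the list of task ids it depends on,
# mirroring the (currently empty) dict in the DAG file above.
deps = {
    'fct_latest_values': ['create_staging_table'],  # hypothetical entry for illustration
}

# dict.items() works on both Python 2 and Python 3; iteritems() is Python 2 only.
for downstream, upstream_list in deps.items():
    for upstream in upstream_list:
        # In the real DAG this line would be: dag.set_dependency(upstream, downstream)
        print('%s -> %s' % (upstream, downstream))
```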


Here is the content of my HQL file, in case it is the issue; I can't figure it out:



(I'm testing the connection to check whether the table gets created; then I'll try to LOAD DATA, hence the LOAD DATA is commented out.)
CREATE TABLE IF NOT EXISTS opus_data.fct_latest_values_new_data (
id_product STRING,
id_model STRING,
id_attribute STRING,
attribute_value STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED ',';

#LOAD DATA LOCAL INPATH
#'/media/windows_share/schemas/opus/fct_latest_values_20181106.csv'
#OVERWRITE INTO TABLE opus_data.fct_latest_values_new_data;









Tags: hadoop, hive, airflow






asked Nov 12 at 10:36 by RGregg
edited Nov 12 at 17:51 by kaxil
























2 Answers
































          In the HQL file it should be FIELDS TERMINATED BY ',':



          CREATE TABLE IF NOT EXISTS opus_data.fct_latest_values_new_data (
          id_product STRING,
          id_model STRING,
          id_attribute STRING,
          attribute_value STRING
          ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';


Also, comments in an HQL file should start with --, not #.
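For example, the commented-out LOAD DATA block from the question, rewritten with Hive comment syntax (path and table name taken verbatim from the question):

```sql
-- Hive treats '--' as a line comment; '#' is not valid HQL and triggers a parse error.
-- LOAD DATA LOCAL INPATH
-- '/media/windows_share/schemas/opus/fct_latest_values_20181106.csv'
-- OVERWRITE INTO TABLE opus_data.fct_latest_values_new_data;
```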



This also looks incorrect and is what causes the exception: hql='create_import_table_fct_latest_values.hql ' (the file name, including a trailing space, gets sent to Hive as the query text, which is exactly what the ParseException complains about).



          Have a look at this example:



# Create the full path for the file
hql_file_path = os.path.join(os.path.dirname(__file__), source['hql'])
print(hql_file_path)
run_hive_query = HiveOperator(
    task_id='run_hive_query',
    dag=dag,
    hql="""
    {{ local_hive_settings }}
    """ + "\n " + open(hql_file_path, 'r').read()
)


          See here for more details.
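A self-contained variant of the same idea, runnable without Airflow: build an absolute path next to the DAG file and pass the file's contents, not its name, as the query string. The temp directory and the CREATE TABLE text stand in for the real DAG folder and HQL file and are assumptions of this sketch:

```python
import os
import tempfile

# Stand-in for the DAG folder; in a real DAG you would use os.path.dirname(__file__).
dag_dir = tempfile.mkdtemp()

# Stand-in HQL file (name taken from the question, contents invented for the sketch).
hql_name = 'create_import_table_fct_latest_values.hql'
with open(os.path.join(dag_dir, hql_name), 'w') as f:
    f.write('CREATE TABLE IF NOT EXISTS opus_data.fct_latest_values_new_data (id_product STRING);')

# Resolve an absolute path and read the query text; this string, not the file
# name, is what HiveOperator's hql parameter should ultimately receive.
hql_file_path = os.path.join(dag_dir, hql_name)
with open(hql_file_path, 'r') as f:
    hql_text = f.read()

print(hql_text)
```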



          Or put all HQL into hql parameter:



          hql='CREATE TABLE IF NOT EXISTS opus_data.fct_latest_values_new_data ...'




























• 1st, your nickname made me laugh, nice one. 2nd, thanks for the update on commenting in Hive; it's been a while since I used it. 3rd, what I'm trying to figure out is exactly what is causing the error, and even with the example you gave me I'm not able to interpret and apply it.
  – RGregg, Nov 12 at 11:49

































I managed to find the answer to my issue.

It was related to the path from which my HiveOperator was calling the file. As no Variable had been defined to tell Airflow where to look, I was getting the error I mentioned in my post.

Once I had defined it using the webserver interface (see picture), my DAG started to work properly.

I made a change to my DAG code regarding the file location, for organization only, and this is how my HiveOperator looks now:



import_lv_data = HiveOperator(
    task_id='fct_latest_values',
    hive_cli_conn_id='metastore_default',
    hql='hql/create_import_table_fct_latest_values2.hql',
    hiveconf_jinja_translate=True,
    dag=dag
)


Thanks to @panov.st, who helped me in person to identify my issue.






(First answer: answered Nov 12 at 10:46 by leftjoin; edited Nov 12 at 11:28)












(Second answer: answered Nov 13 at 11:13 by RGregg)





























