R: Two Identically Structured Excel Files Return Different Data Types in Data Frames











up vote
0
down vote

favorite












I have two different Excel files, excel1 and excel2.



I am reading them in using separate but identical functions:



df1<- readxl::read_xlsx("excel1.xlsx", sheet= "Ad Awareness", skip= 7)
df2<- readxl::read_xlsx("excel2.xlsx", sheet= "Ad Awareness", skip= 7)


However, when I run head() on each, here is what df` returns:



calDate             Score
<dttm> <dbl>
1 2016-10-17 00:00:00 17.8
2 2016-10-18 00:00:00 17.2
3 2016-10-19 00:00:00 20.3


And here is what df2 returns:



  calDate Score
<dbl> <lgl>
1 43025 NA
2 43026 NA
3 43027 NA


Any reason why the data type are being read-in different? There is nothing different about the files.










share|improve this question


























    up vote
    0
    down vote

    favorite












    I have two different Excel files, excel1 and excel2.



    I am reading them in using separate but identical functions:



    df1<- readxl::read_xlsx("excel1.xlsx", sheet= "Ad Awareness", skip= 7)
    df2<- readxl::read_xlsx("excel2.xlsx", sheet= "Ad Awareness", skip= 7)


    However, when I run head() on each, here is what df` returns:



    calDate             Score
    <dttm> <dbl>
    1 2016-10-17 00:00:00 17.8
    2 2016-10-18 00:00:00 17.2
    3 2016-10-19 00:00:00 20.3


    And here is what df2 returns:



      calDate Score
    <dbl> <lgl>
    1 43025 NA
    2 43026 NA
    3 43027 NA


    Any reason why the data type are being read-in different? There is nothing different about the files.










    share|improve this question
























      up vote
      0
      down vote

      favorite









      up vote
      0
      down vote

      favorite











      I have two different Excel files, excel1 and excel2.



      I am reading them in using separate but identical functions:



      df1<- readxl::read_xlsx("excel1.xlsx", sheet= "Ad Awareness", skip= 7)
      df2<- readxl::read_xlsx("excel2.xlsx", sheet= "Ad Awareness", skip= 7)


      However, when I run head() on each, here is what df` returns:



      calDate             Score
      <dttm> <dbl>
      1 2016-10-17 00:00:00 17.8
      2 2016-10-18 00:00:00 17.2
      3 2016-10-19 00:00:00 20.3


      And here is what df2 returns:



        calDate Score
      <dbl> <lgl>
      1 43025 NA
      2 43026 NA
      3 43027 NA


      Any reason why the data type are being read-in different? There is nothing different about the files.










      share|improve this question













      I have two different Excel files, excel1 and excel2.



      I am reading them in using separate but identical functions:



      df1<- readxl::read_xlsx("excel1.xlsx", sheet= "Ad Awareness", skip= 7)
      df2<- readxl::read_xlsx("excel2.xlsx", sheet= "Ad Awareness", skip= 7)


      However, when I run head() on each, here is what df` returns:



      calDate             Score
      <dttm> <dbl>
      1 2016-10-17 00:00:00 17.8
      2 2016-10-18 00:00:00 17.2
      3 2016-10-19 00:00:00 20.3


      And here is what df2 returns:



        calDate Score
      <dbl> <lgl>
      1 43025 NA
      2 43026 NA
      3 43027 NA


      Any reason why the data type are being read-in different? There is nothing different about the files.







      r readxl






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 10 at 18:23









      enriqueStateSpacias

      668




      668
























          1 Answer
          1






          active

          oldest

          votes

















          up vote
          1
          down vote



          accepted










          read_xlsx() will guess the variable types based on your data (see here for more information).



          So what you are describing could be due to:




          • different amount of data in your different files (not enough data in one of them to get to a correct guess)


          • changes you might have made in Excel to the cell format (those changes are not always visually obvious in Excel)



          Without seeing your data, it is hard to give you more answer than this.



          But you can control this with the col_types argument:




          col_types: Either ‘NULL’ to guess all from the spreadsheet or a
          character vector containing one entry per column from these
          options: "skip", "guess", "logical", "numeric", "date",
          "text" or "list". If exactly one ‘col_type’ is specified, it
          will be recycled. The content of a cell in a skipped column
          is never read and that column will not appear in the data
          frame output. A list cell loads a column as a list of length
          1 vectors, which are typed using the type guessing logic from
          ‘col_types = NULL’, but on a cell-by-cell basis.







          share|improve this answer





















          • Complete Excel error. Shut Excel down and started back up, re-read the data back in and it was magically able to read both files in the same format. Thank you.
            – enriqueStateSpacias
            Nov 11 at 1:04










          • Glad that worked. But when this doesn't work, you can always use col_types to set the variable types manually.
            – prosoitos
            Nov 11 at 1:43










          • even if they are different data types? one was datetime, the other was a numeric... is there indexing in col_types? I tried passing c("date" , "time") and list("date", "time") and neither worked.
            – enriqueStateSpacias
            Nov 11 at 1:53










          • This readxl.tidyverse.org/articles/cell-and-column-types.html and the help files should help you. There is no indexing and you set the type by column.
            – prosoitos
            Nov 11 at 1:55











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














           

          draft saved


          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53242070%2fr-two-identically-structured-excel-files-return-different-data-types-in-data-fr%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          1
          down vote



          accepted










          read_xlsx() will guess the variable types based on your data (see here for more information).



          So what you are describing could be due to:




          • different amount of data in your different files (not enough data in one of them to get to a correct guess)


          • changes you might have made in Excel to the cell format (those changes are not always visually obvious in Excel)



          Without seeing your data, it is hard to give you more answer than this.



          But you can control this with the col_types argument:




          col_types: Either ‘NULL’ to guess all from the spreadsheet or a
          character vector containing one entry per column from these
          options: "skip", "guess", "logical", "numeric", "date",
          "text" or "list". If exactly one ‘col_type’ is specified, it
          will be recycled. The content of a cell in a skipped column
          is never read and that column will not appear in the data
          frame output. A list cell loads a column as a list of length
          1 vectors, which are typed using the type guessing logic from
          ‘col_types = NULL’, but on a cell-by-cell basis.







          share|improve this answer





















          • Complete Excel error. Shut Excel down and started back up, re-read the data back in and it was magically able to read both files in the same format. Thank you.
            – enriqueStateSpacias
            Nov 11 at 1:04










          • Glad that worked. But when this doesn't work, you can always use col_types to set the variable types manually.
            – prosoitos
            Nov 11 at 1:43










          • even if they are different data types? one was datetime, the other was a numeric... is there indexing in col_types? I tried passing c("date" , "time") and list("date", "time") and neither worked.
            – enriqueStateSpacias
            Nov 11 at 1:53










          • This readxl.tidyverse.org/articles/cell-and-column-types.html and the help files should help you. There is no indexing and you set the type by column.
            – prosoitos
            Nov 11 at 1:55















          up vote
          1
          down vote



          accepted










          read_xlsx() will guess the variable types based on your data (see here for more information).



          So what you are describing could be due to:




          • different amount of data in your different files (not enough data in one of them to get to a correct guess)


          • changes you might have made in Excel to the cell format (those changes are not always visually obvious in Excel)



          Without seeing your data, it is hard to give you more answer than this.



          But you can control this with the col_types argument:




          col_types: Either ‘NULL’ to guess all from the spreadsheet or a
          character vector containing one entry per column from these
          options: "skip", "guess", "logical", "numeric", "date",
          "text" or "list". If exactly one ‘col_type’ is specified, it
          will be recycled. The content of a cell in a skipped column
          is never read and that column will not appear in the data
          frame output. A list cell loads a column as a list of length
          1 vectors, which are typed using the type guessing logic from
          ‘col_types = NULL’, but on a cell-by-cell basis.







          share|improve this answer





















          • Complete Excel error. Shut Excel down and started back up, re-read the data back in and it was magically able to read both files in the same format. Thank you.
            – enriqueStateSpacias
            Nov 11 at 1:04










          • Glad that worked. But when this doesn't work, you can always use col_types to set the variable types manually.
            – prosoitos
            Nov 11 at 1:43










          • even if they are different data types? one was datetime, the other was a numeric... is there indexing in col_types? I tried passing c("date" , "time") and list("date", "time") and neither worked.
            – enriqueStateSpacias
            Nov 11 at 1:53










          • This readxl.tidyverse.org/articles/cell-and-column-types.html and the help files should help you. There is no indexing and you set the type by column.
            – prosoitos
            Nov 11 at 1:55













          up vote
          1
          down vote



          accepted







          up vote
          1
          down vote



          accepted






          read_xlsx() will guess the variable types based on your data (see here for more information).



          So what you are describing could be due to:




          • different amount of data in your different files (not enough data in one of them to get to a correct guess)


          • changes you might have made in Excel to the cell format (those changes are not always visually obvious in Excel)



          Without seeing your data, it is hard to give you more answer than this.



          But you can control this with the col_types argument:




          col_types: Either ‘NULL’ to guess all from the spreadsheet or a
          character vector containing one entry per column from these
          options: "skip", "guess", "logical", "numeric", "date",
          "text" or "list". If exactly one ‘col_type’ is specified, it
          will be recycled. The content of a cell in a skipped column
          is never read and that column will not appear in the data
          frame output. A list cell loads a column as a list of length
          1 vectors, which are typed using the type guessing logic from
          ‘col_types = NULL’, but on a cell-by-cell basis.







          share|improve this answer












          read_xlsx() will guess the variable types based on your data (see here for more information).



          So what you are describing could be due to:




          • different amount of data in your different files (not enough data in one of them to get to a correct guess)


          • changes you might have made in Excel to the cell format (those changes are not always visually obvious in Excel)



          Without seeing your data, it is hard to give you more answer than this.



          But you can control this with the col_types argument:




          col_types: Either ‘NULL’ to guess all from the spreadsheet or a
          character vector containing one entry per column from these
          options: "skip", "guess", "logical", "numeric", "date",
          "text" or "list". If exactly one ‘col_type’ is specified, it
          will be recycled. The content of a cell in a skipped column
          is never read and that column will not appear in the data
          frame output. A list cell loads a column as a list of length
          1 vectors, which are typed using the type guessing logic from
          ‘col_types = NULL’, but on a cell-by-cell basis.








          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 10 at 21:19









          prosoitos

          752218




          752218












          • Complete Excel error. Shut Excel down and started back up, re-read the data back in and it was magically able to read both files in the same format. Thank you.
            – enriqueStateSpacias
            Nov 11 at 1:04










          • Glad that worked. But when this doesn't work, you can always use col_types to set the variable types manually.
            – prosoitos
            Nov 11 at 1:43










          • even if they are different data types? one was datetime, the other was a numeric... is there indexing in col_types? I tried passing c("date" , "time") and list("date", "time") and neither worked.
            – enriqueStateSpacias
            Nov 11 at 1:53










          • This readxl.tidyverse.org/articles/cell-and-column-types.html and the help files should help you. There is no indexing and you set the type by column.
            – prosoitos
            Nov 11 at 1:55


















          • Complete Excel error. Shut Excel down and started back up, re-read the data back in and it was magically able to read both files in the same format. Thank you.
            – enriqueStateSpacias
            Nov 11 at 1:04










          • Glad that worked. But when this doesn't work, you can always use col_types to set the variable types manually.
            – prosoitos
            Nov 11 at 1:43










          • even if they are different data types? one was datetime, the other was a numeric... is there indexing in col_types? I tried passing c("date" , "time") and list("date", "time") and neither worked.
            – enriqueStateSpacias
            Nov 11 at 1:53










          • This readxl.tidyverse.org/articles/cell-and-column-types.html and the help files should help you. There is no indexing and you set the type by column.
            – prosoitos
            Nov 11 at 1:55
















          Complete Excel error. Shut Excel down and started back up, re-read the data back in and it was magically able to read both files in the same format. Thank you.
          – enriqueStateSpacias
          Nov 11 at 1:04




          Complete Excel error. Shut Excel down and started back up, re-read the data back in and it was magically able to read both files in the same format. Thank you.
          – enriqueStateSpacias
          Nov 11 at 1:04












          Glad that worked. But when this doesn't work, you can always use col_types to set the variable types manually.
          – prosoitos
          Nov 11 at 1:43




          Glad that worked. But when this doesn't work, you can always use col_types to set the variable types manually.
          – prosoitos
          Nov 11 at 1:43












          even if they are different data types? one was datetime, the other was a numeric... is there indexing in col_types? I tried passing c("date" , "time") and list("date", "time") and neither worked.
          – enriqueStateSpacias
          Nov 11 at 1:53




          even if they are different data types? one was datetime, the other was a numeric... is there indexing in col_types? I tried passing c("date" , "time") and list("date", "time") and neither worked.
          – enriqueStateSpacias
          Nov 11 at 1:53












          This readxl.tidyverse.org/articles/cell-and-column-types.html and the help files should help you. There is no indexing and you set the type by column.
          – prosoitos
          Nov 11 at 1:55




          This readxl.tidyverse.org/articles/cell-and-column-types.html and the help files should help you. There is no indexing and you set the type by column.
          – prosoitos
          Nov 11 at 1:55


















           

          draft saved


          draft discarded



















































           


          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53242070%2fr-two-identically-structured-excel-files-return-different-data-types-in-data-fr%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Florida Star v. B. J. F.

          Danny Elfman

          Lugert, Oklahoma