Schema conflict when storing dataframes with datetime objects using load_table_from_dataframe()












I'm trying to load data from a pandas DataFrame into a BigQuery table. The DataFrame has a column of dtype datetime64[ns], and when I try to store the df using load_table_from_dataframe(), I get




google.api_core.exceptions.BadRequest: 400 Provided Schema does not match Table [table name]. Field computation_triggered_time has changed type from DATETIME to TIMESTAMP.




The table has a schema which reads



CREATE TABLE `[table name]` (
...
computation_triggered_time DATETIME NOT NULL,
...
)


In the DataFrame, computation_triggered_time is a datetime64[ns] column. When I read the original DataFrame from CSV, I convert it from text to datetime like so:



import pandas as pd

df['computation_triggered_time'] = pd.to_datetime(df['computation_triggered_time']).values.astype('datetime64[ms]')
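
For illustration, here is a minimal self-contained sketch of the conversion step, assuming the CSV holds the timestamps as plain text without a UTC offset. The sample rows and the `csv_text` name are made up for this example; only the column name comes from the question.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = (
    "id,computation_triggered_time\n"
    "1,2018-11-16 08:06:00.123\n"
    "2,2018-11-16 09:38:00.456\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# Parse the text column into pandas datetimes.
df["computation_triggered_time"] = pd.to_datetime(df["computation_triggered_time"])

# Check that the column is a datetime column with no timezone attached.
is_datetime = pd.api.types.is_datetime64_any_dtype(df["computation_triggered_time"])
is_naive = df["computation_triggered_time"].dt.tz is None

print(df.dtypes)
print(is_datetime, is_naive)
```

A timezone-naive column like this is what one would expect to line up with a DATETIME field; the error message above suggests the loader still described it as TIMESTAMP.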


Note:



The .values.astype('datetime64[ms]') part is necessary because load_table_from_dataframe() uses PyArrow to serialize the DataFrame, and serialization fails if the data has nanosecond precision. The error is something like




[...] Casting from timestamp[ns] to timestamp[ms] would lose data
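
To make the precision issue concrete, here is a small sketch (sample values invented) that finds rows carrying sub-millisecond detail and floors them to whole milliseconds with pandas. It uses `Series.dt.floor`, which is one way to drop the extra digits explicitly; it is not necessarily what the BigQuery client or PyArrow does internally.

```python
import pandas as pd

# Invented sample values: the first has nanosecond detail, the second does not.
raw = pd.Series(
    ["2018-11-16 08:06:00.123456789", "2018-11-16 09:38:00.500000000"]
)
ts = pd.to_datetime(raw)

# Round every value down to a whole millisecond.
floored = ts.dt.floor("ms")

# Rows where flooring changed the value would lose data in a ns -> ms cast.
loses_data = (ts != floored).tolist()

print(floored.tolist())
print(loses_data)
```

Checking `loses_data` before truncating shows whether the cast would actually discard anything, rather than silently dropping digits.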











  • Can you give an example of the format of 'computation_triggered_time' in the dataframe?

    – Bobbylank
    Nov 16 '18 at 9:38











  • @Bobbylank: see Issue github.com/googleapis/google-cloud-python/issues/6542

    – Johannes Bauer
    Nov 16 '18 at 20:14
















pandas google-cloud-platform google-bigquery pyarrow






asked Nov 16 '18 at 8:06









Johannes Bauer













1 Answer
This looks like a problem with Google's google-cloud-python package. Can you report the bug there? https://github.com/googleapis/google-cloud-python






  • Done: github.com/googleapis/google-cloud-python/issues/6542

    – Johannes Bauer
    Nov 16 '18 at 20:14












answered Nov 16 '18 at 14:44









Wes McKinney












