Why is every item in a pd.DataFrame column a float, but the column's dtype is object?
results_table is a pd.DataFrame.
When I run
print(type(results_table.loc[0, 'Mean recall score']))
it returns
<class 'numpy.float64'>
Every item is a float.
But when I run
print(results_table['Mean recall score'].dtype)
it returns
object
Why does this happen?
python pandas
edited Nov 16 '18 at 0:39 by RafaelC
asked Nov 16 '18 at 0:37 by SIRIUS
There are some scenarios where every item in a series is a float but the dtype is object. For example, an error when reading from a file that was coerced, or mixed types (e.g. floats and strings) where the strings were later replaced with floats. Just use pd.to_numeric(df['score']) or .astype(float) directly.
– RafaelC, Nov 16 '18 at 0:41
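The two conversions suggested in the comment can be sketched as follows. This is a minimal illustration, not the asker's actual data: the values in results_table are made up, and the column is forced to object dtype only to mimic the situation in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's table: float values held in an object column.
results_table = pd.DataFrame(
    {"Mean recall score": pd.Series([np.float64(0.91), np.float64(0.87)], dtype=object)}
)
before = results_table["Mean recall score"].dtype

# Option 1: pd.to_numeric infers a numeric dtype (by default it raises on unparseable values).
as_numeric = pd.to_numeric(results_table["Mean recall score"])

# Option 2: astype(float) casts the column explicitly.
as_float = results_table["Mean recall score"].astype(float)

print(before, as_numeric.dtype, as_float.dtype)
```

Either result can be assigned back to the column if the numeric dtype should be kept.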
2 Answers
First, note that df.loc[0, x] only considers the value at row label 0 and column label x, not your entire dataframe. Now let's consider an example:
import pandas as pd
df = pd.DataFrame({'A': [1.5, 'hello', 'test', 2]}, dtype=object)
print(type(df.loc[0, 'A']))  # type of a single element in the series
<class 'float'>
print(df['A'].dtype)  # dtype of the series
object
As you can see, an object dtype series can hold arbitrary Python objects. You can even, if you wish, extract the type of each element of your series:
print(df['A'].map(type))
0    <class 'float'>
1      <class 'str'>
2      <class 'str'>
3      <class 'int'>
Name: A, dtype: object
An object dtype series is simply a collection of pointers to various objects, which are not held in a contiguous memory block as the values of a numeric series may be. This is comparable to a Python list, and it explains why performance is poor when you work with object instead of numeric series.
See also this answer for a visual representation of the above.
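As a follow-up to the per-element type check above, here is a small sketch of how one might summarise the element types in an object column and then see which values would not convert to numbers. It reuses the answer's example frame; the variable names type_counts and numeric are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1.5, 'hello', 'test', 2]}, dtype=object)

# Count how many elements of each Python type the column holds.
type_counts = df['A'].map(type).value_counts()
print(type_counts)

# errors='coerce' turns values that cannot be parsed as numbers into NaN,
# so the result can be stored with a numeric dtype.
numeric = pd.to_numeric(df['A'], errors='coerce')
print(numeric.dtype)         # float64
print(numeric.isna().sum())  # 2
```

If the count of non-numeric types is zero, the column can usually be converted without coercion.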
edited Nov 16 '18 at 0:55
answered Nov 16 '18 at 0:50 by jpp
In the first print statement you are selecting a single element from your dataframe. That single item is a float.
In the second print statement you are pulling out a pandas series (i.e. the whole column) and printing its dtype.
The series has dtype object, but each entry in it is a float. That is why you get the results you did.
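The difference between an element's type and the column's dtype can be reproduced with a short sketch. It assumes one plausible cause (a string placeholder that was later replaced by a float); the series s and its values are hypothetical, and the elements here are plain Python floats rather than the numpy.float64 values in the question.

```python
import pandas as pd

# Mixed floats and a string placeholder, so pandas stores the column as object.
s = pd.Series([0.91, "n/a", 0.93])

# Replacing the string with a float does not change the column's dtype.
s.loc[1] = 0.87

all_floats = all(isinstance(v, float) for v in s)
print(all_floats)  # True: every element is a float
print(s.dtype)     # object: the column dtype was never re-inferred

# infer_objects() asks pandas to pick a better dtype for an object column.
print(s.infer_objects().dtype)  # float64
```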
answered Nov 16 '18 at 0:55 by James Fulton