Why does model.fit() raise ValueError with tf.train.AdamOptimizer using categorical_crossentropy loss...
I'm following the TensorFlow basic classification example with the Keras API provided in the "Getting Started" docs. I get through the tutorial as-is just fine, but if I change the loss function from sparse_categorical_crossentropy to categorical_crossentropy, the code below:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)
fails during the training/fitting step with the following error:
ValueError: Error when checking target: expected dense_1 to have shape (10,) but got array with shape (1,)
The documentation on the loss functions doesn't say much about their expected input and output shapes. Obviously there is a dimensionality issue here, but can anyone give a detailed explanation: what is it about this loss function (or any other loss function) that raises this ValueError?
python tensorflow machine-learning keras neural-network
edited Nov 13 '18 at 7:02
asked Nov 13 '18 at 6:29 by nmurthy
1 Answer
sparse_categorical_crossentropy loss expects the provided labels to be integers such as 0, 1, 2 and so on, where each integer indicates a particular class. For example, class 0 might be dogs, class 1 cats, and class 2 lions. categorical_crossentropy loss, on the other hand, takes one-hot encoded labels such as [1,0,0], [0,1,0] and [0,0,1], interpreted so that the index of the 1 indicates the class of the sample. For example, [0,0,1] means this sample belongs to class 2 (i.e. lions). Further, since the output of a classification model is usually a probability distribution produced by a softmax layer, one-hot labels are themselves a probability distribution and so match the shape of the model's output: [0,0,1] means that with probability one we know this sample belongs to class 2.
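To make the two label formats concrete, here is a small NumPy sketch (the three classes are the illustrative dogs/cats/lions above) showing that a one-hot row is just the integer label expanded along the class axis, and that argmax recovers the integer:

```python
import numpy as np

# Integer ("sparse") labels: one class index per sample.
sparse_labels = np.array([0, 1, 2])  # dog, cat, lion

# Equivalent one-hot labels: one length-3 row per sample,
# with a 1 at the index of the true class. Indexing the
# identity matrix by the labels produces exactly this.
one_hot_labels = np.eye(3)[sparse_labels]

# argmax along the class axis recovers the integer labels.
recovered = one_hot_labels.argmax(axis=1)
```

Note the shapes: the sparse labels are (3,) while the one-hot labels are (3, 3), which is why a model whose output layer has 10 units reports "expected ... shape (10,) but got array with shape (1,)" when given integer labels with categorical_crossentropy.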
sparse_categorical_crossentropy is essentially a convenience wrapper around categorical_crossentropy: Keras (or its backend) handles the integer labels internally, so you don't need to convert them to one-hot encoded form yourself. However, if the labels you provide are already one-hot encoded, then you must use categorical_crossentropy as the loss function.
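In the question's case, the tutorial's train_labels are integers of shape (num_samples,), so the fix is either to keep sparse_categorical_crossentropy, or to one-hot encode the labels first. The sketch below performs that conversion in plain NumPy (roughly what keras.utils.to_categorical does for 1-D integer labels), so no TensorFlow install is assumed; the sample digit values are illustrative:

```python
import numpy as np

def to_one_hot(labels, num_classes):
    """One-hot encode 1-D integer labels (NumPy sketch of keras.utils.to_categorical)."""
    out = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    out[np.arange(labels.shape[0]), labels] = 1.0
    return out

train_labels = np.array([5, 0, 4, 1])   # e.g. a few MNIST digit labels
one_hot = to_one_hot(train_labels, 10)  # shape (4, 10): one row per sample
```

With labels in this (num_samples, 10) form, model.fit(train_images, one_hot, epochs=5) trains fine under loss='categorical_crossentropy'; with the original integer labels, keep loss='sparse_categorical_crossentropy' instead.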
You might also be interested in this answer, where I briefly explain the activation functions, loss functions, and label formats used in different kinds of classification tasks.
edited Nov 13 '18 at 10:11
answered Nov 13 '18 at 6:49 by today