How to use different loss functions at different time steps of an LSTM in Keras

I have a set of sentences and their scores. I would like to train a marking system that predicts the score for a given sentence. One example looks like this:

(X = Tomorrow is a good day, Y = 0.9)

I would like to use an LSTM to build this marking system and also take into account the sequential relationship between the words in the sentence, so the training example shown above is transformed as follows:

(x1=Tomorrow, y1=is) (x2=is, y2=a) (x3=a, y3=good) (x4=day, y4=0.9)

When training this LSTM, I would like the first three time steps to use a softmax classifier and the final step to use MSE. The overall loss is therefore composed of two different loss functions. Keras does not seem to provide a direct way to do this. I am also not sure whether this is the right way to build the marking system.
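To make the combined objective concrete, here is a minimal pure-Python sketch of the arithmetic for the example above. The toy vocabulary, the softmax outputs, the predicted score of 0.7, and the two weights are all made-up values for illustration; averaging the cross-entropy over the three steps is also an assumption, not something stated in the question.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[target_index])

def squared_error(prediction, target):
    return (prediction - target) ** 2

# Hypothetical toy vocabulary and model outputs, for illustration only.
vocab = ["Tomorrow", "is", "a", "good", "day"]

# Softmax outputs at steps 1-3 (each row sums to 1) and their target words.
step_probs = [
    [0.1, 0.6, 0.1, 0.1, 0.1],  # input "Tomorrow", target "is"
    [0.1, 0.1, 0.5, 0.2, 0.1],  # input "is", target "a"
    [0.1, 0.1, 0.1, 0.6, 0.1],  # input "a", target "good"
]
step_targets = ["is", "a", "good"]

# Regression output at the final step and its target score.
predicted_score = 0.7
true_score = 0.9

# Mean cross-entropy over the softmax steps, squared error on the last step.
lm_loss = sum(
    cross_entropy(p, vocab.index(t)) for p, t in zip(step_probs, step_targets)
) / len(step_targets)
score_loss = squared_error(predicted_score, true_score)

# Weighted sum of the two losses; the weights are free hyperparameters.
lm_weight, score_weight = 1.0, 1.0
total_loss = lm_weight * lm_loss + score_weight * score_loss
print(lm_loss, score_loss, total_loss)
```

The weights control how strongly each part of the objective influences training; with both set to 1.0 the two terms are simply added.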

asked Nov 13 '18 at 8:43 by Kevin Sun
edited Nov 13 '18 at 9:23 by Amir

keras lstm

1 Answer

Keras supports multiple loss functions, one per model output:

    from keras.models import Model

    # inputs is the model's input tensor; lang_model and sent_model are the
    # output tensors of the language-model head and the score head.
    model = Model(inputs=inputs,
                  outputs=[lang_model, sent_model])

    model.compile(optimizer='sgd',
                  loss=['categorical_crossentropy', 'mse'],
                  metrics=['accuracy'], loss_weights=[1., 1.])

Based on your explanation, I think you need a model that first predicts each token from the previous tokens (in NLP this is usually called a language model) and then computes a score, which I assume is a sentiment score (the same approach applies to other domains).

To do so, you can train your language model with an LSTM and use the last output of the LSTM for your ranking task. For this you need to define two loss functions: categorical_crossentropy for the language model and MSE for the ranking task.
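As a rough illustration of how `inputs`, `lang_model` and `sent_model` could be built, here is a minimal sketch using the Keras functional API. All sizes and the random training data are hypothetical placeholders. It also swaps in `sparse_categorical_crossentropy` so that the next-word targets can be integer ids rather than one-hot vectors, and it applies the language-model loss at every time step, which differs slightly from the scheme in the question. No padding or masking is handled here.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# All sizes below are hypothetical placeholders.
vocab_size, max_len, embed_dim, hidden_units = 50, 4, 8, 16

inputs = keras.Input(shape=(max_len,), dtype="int32")
embedded = layers.Embedding(vocab_size, embed_dim)(inputs)

# return_sequences yields one output per time step (for the language model);
# return_state additionally returns the final hidden state (for the score).
seq_out, last_h, _ = layers.LSTM(
    hidden_units, return_sequences=True, return_state=True
)(embedded)

# Language-model head: softmax over the vocabulary at every time step.
lang_model = layers.Dense(vocab_size, activation="softmax", name="lang_model")(seq_out)
# Score head: a single regression value from the final hidden state.
sent_model = layers.Dense(1, name="sent_model")(last_h)

model = keras.Model(inputs=inputs, outputs=[lang_model, sent_model])
model.compile(optimizer="sgd",
              loss=["sparse_categorical_crossentropy", "mse"],
              loss_weights=[1.0, 1.0])

# Random toy data, only to show the expected input and target shapes.
rng = np.random.default_rng(0)
x = rng.integers(0, vocab_size, size=(32, max_len)).astype("int32")
y_next_word = rng.integers(0, vocab_size, size=(32, max_len)).astype("int32")
y_score = rng.random((32, 1)).astype("float32")

model.fit(x, [y_next_word, y_score], epochs=1, batch_size=8, verbose=0)
pred_words, pred_score = model.predict(x[:2], verbose=0)
print(pred_words.shape, pred_score.shape)
```

Both heads share the embedding and LSTM weights, so each training step updates them using the weighted sum of the two losses.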

This tutorial would be helpful: https://www.pyimagesearch.com/2018/06/04/keras-multiple-outputs-and-multiple-losses/

answered Nov 13 '18 at 9:07 by Amir
edited Nov 13 '18 at 21:04

Hi Amir, thanks very much for your reply. Does the "token" in your response mean the features of the sentence, i.e., the input to the softmax at the last time step?

– Kevin Sun
Nov 13 '18 at 20:21

@KevinSun I mean the things that you pass to your LSTM. These are usually word vectors (GloVe or word2vec).

– Amir
Nov 13 '18 at 21:04

Thanks very much again. As I understand from the tutorial you referred to, the multiple losses are attached to different output layers that are not connected to each other. For example, if we want two losses built on two layers named layer1 and layer2, then in the tutorial layer1 and layer2 have no connection to each other. In my problem, the losses are built on the outputs of layer3 and layer4, and the input of layer4 is the output of layer3. In that case, can I still use these multiple losses?

– Kevin Sun
Nov 13 '18 at 21:28

You're welcome. You could, but I am unsure whether the model would converge.

– Amir
Nov 13 '18 at 21:44

Does your first reply to my question mean the same thing that I asked about in my previous post?

– Kevin Sun
Nov 13 '18 at 21:53
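On the chained-layer case raised in the comments, the following minimal sketch suggests that Keras accepts two outputs where one feeds the other. The layer names `layer3` and `layer4` follow the comment; the input shape and layer sizes are hypothetical, and the sketch says nothing about whether such a model converges well.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical input: 10 time steps of 8-dimensional word vectors.
inputs = keras.Input(shape=(10, 8))
hidden = layers.LSTM(16)(inputs)

# layer3 is an output in its own right and also the input of layer4.
layer3 = layers.Dense(5, activation="softmax", name="layer3")(hidden)
layer4 = layers.Dense(1, name="layer4")(layer3)

model = keras.Model(inputs=inputs, outputs=[layer3, layer4])
model.compile(optimizer="sgd",
              loss=["categorical_crossentropy", "mse"],
              loss_weights=[1.0, 1.0])
print(len(model.outputs))
```

In this arrangement the MSE loss on `layer4` also backpropagates through `layer3`, so the weights of `layer3` receive gradients from both losses.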
