How to pre-train a deep neural network (or RNN) with unlabeled data?
Recently, I was asked how to pre-train a deep neural network with unlabeled data, meaning: instead of initializing the model weights with small random numbers, we set the initial weights from a model pre-trained on unlabeled data.

Well, intuitively, I kind of get it: it probably helps with the vanishing-gradient issue and shortens training time when there isn't much labeled data available. But I still don't really know how it is done. How can you train a neural network with unlabeled data? Is it something like a SOM or a Boltzmann machine?

Has anybody heard about this? If yes, can you provide some links to sources or papers? I am curious. Greatly appreciated!
neural-network deep-learning
asked Nov 14 '18 at 23:33 by Aaron_Geng, edited Nov 14 '18 at 23:47
  • I've answered, but this question may be more appropriate on a site like cross-validated. Would not be surprised to see it migrated.

    – user3390629
    Nov 14 '18 at 23:54
1 Answer
There are lots of ways to learn from unlabeled data with deep networks. Greedy layer-wise pre-training was developed back in the 2000s by Geoff Hinton's group (originally with stacked Restricted Boltzmann Machines), though that approach has generally fallen out of favor.



More modern unsupervised deep learning methods include auto-encoders, variational auto-encoders, and generative adversarial networks. I won't dive into the details of all of them, but the simplest of these, the auto-encoder, works by compressing an unlabeled input into a low-dimensional real-valued representation and then using that compressed representation to reconstruct the original input. Intuitively, a compressed code that can effectively be used to recreate an input is likely to capture some useful features of that input. There are plenty of illustrated tutorials on auto-encoders, and example implementations in your deep learning library of choice.
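For concreteness, here is a minimal sketch of the auto-encoder idea, assuming PyTorch; the layer sizes and the dummy stand-in for `unlabeled_loader` are placeholder assumptions for illustration, not part of the original post:

    import torch
    import torch.nn as nn

    # A small fully-connected auto-encoder: the encoder compresses the input to a
    # low-dimensional code, the decoder reconstructs the input from that code.
    input_dim, code_dim = 784, 32        # placeholder sizes (e.g. flattened 28x28 images)

    encoder = nn.Sequential(
        nn.Linear(input_dim, 256), nn.ReLU(),
        nn.Linear(256, code_dim),
    )
    decoder = nn.Sequential(
        nn.Linear(code_dim, 256), nn.ReLU(),
        nn.Linear(256, input_dim),
    )
    autoencoder = nn.Sequential(encoder, decoder)

    optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    # Stand-in for a real DataLoader over unlabeled examples of shape (batch, input_dim).
    unlabeled_loader = [torch.randn(64, input_dim) for _ in range(100)]

    for epoch in range(5):
        for x in unlabeled_loader:
            x_hat = autoencoder(x)       # try to reproduce the input
            loss = loss_fn(x_hat, x)     # the target is the input itself, so no labels are needed
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

The key point is that the reconstruction loss compares the output to the input itself, which is what makes this trainable on purely unlabeled data.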



I guess in some sense any of the listed methods could be used as pre-training, e.g. to prepare a network for a discriminative task like classification, though I'm not aware of that being a particularly common practice. Modern initialization schemes, activation functions, and other optimization tricks are generally good enough to train well without more complicated initialization procedures.
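If one did want to use the auto-encoder above for pre-training, a minimal sketch of re-using its encoder weights could look like this, continuing the assumptions from the previous snippet (the trained `encoder`, `input_dim`, and `code_dim`, plus a placeholder `num_classes` and dummy labeled data; none of this is from the original post):

    import torch
    import torch.nn as nn

    num_classes = 10                     # placeholder

    # Re-use the encoder trained on unlabeled data as the initial layers of a
    # classifier, and add a new, randomly initialized classification head.
    classifier = nn.Sequential(
        encoder,
        nn.ReLU(),
        nn.Linear(code_dim, num_classes),
    )

    # Optionally freeze the pre-trained layers at first and train only the head.
    for p in encoder.parameters():
        p.requires_grad = False

    optimizer = torch.optim.Adam(
        (p for p in classifier.parameters() if p.requires_grad), lr=1e-3
    )
    ce_loss = nn.CrossEntropyLoss()

    # Stand-in for a (possibly small) labeled dataset of (features, label) pairs.
    labeled_loader = [
        (torch.randn(64, input_dim), torch.randint(0, num_classes, (64,)))
        for _ in range(20)
    ]

    for epoch in range(5):
        for x, y in labeled_loader:
            loss = ce_loss(classifier(x), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # One can later unfreeze the encoder and keep training end-to-end at a lower
    # learning rate, so the pre-trained weights are only fine-tuned, not overwritten.

Again, this is just one way to wire it up under those assumptions, not a claim about how any particular paper does it.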






answered Nov 14 '18 at 23:53 by user3390629
  • Thanks, it sounds weird but makes sense. So it means using auto-encoders to train each layer separately and then taking those weights as the initial weights of the original neural network?

    – Aaron_Geng
    Nov 15 '18 at 0:15











  • If you were to use an auto-encoder as a pre-training technique, I don't think you would use it to train each layer separately. Rather you would train an auto-encoder, then grab the encoder portion of the network and re-use those layers in another architecture. In that case, those encoder layers have been trained jointly, not separately. Again, I'm not sure I've ever seen a paper take that approach, but it wouldn't surprise me.

    – user3390629
    Nov 15 '18 at 14:55











  • Yeah, you are right.

    – Aaron_Geng
    Nov 15 '18 at 23:36










