What is the difference between SGD and back-propagation?












Can you please tell me the difference between Stochastic Gradient Descent (SGD) and back-propagation?

machine-learning artificial-intelligence difference backpropagation gradient-descent

asked Jun 21 '16 at 20:02 by Влад Концевич, edited Nov 14 '18 at 21:29 by nbro
3 Answers


44 votes














Backpropagation is an efficient method of computing gradients in directed computation graphs, such as neural networks. It is not a learning method in itself, but rather a computational trick that learning methods often use. It is essentially an implementation of the chain rule of derivatives, which lets you compute all the required partial derivatives in time linear in the size of the graph (whereas naive gradient computation would scale exponentially with depth).

SGD is one of many optimization methods, namely a first-order optimizer, meaning that it is based on the gradient of the objective. Consequently, for neural networks it is usually applied together with backprop to make efficient updates. You could also apply SGD to gradients obtained in other ways (from sampling, numerical approximation, etc.). Symmetrically, you can use other optimization techniques with backprop: anything that can consume a gradient or Jacobian.

The common misconception comes from the fact that, for simplicity, people sometimes say a model was "trained with backprop", which actually means (if they do not specify the optimizer) "trained with SGD, using backprop as the gradient-computing technique". In old textbooks you can also find things like the "delta rule" and other slightly confusing terms that describe exactly the same thing (the neural-network community was for a long time somewhat independent of the general optimization community).

Thus you have two layers of abstraction:

• gradient computation - where backprop comes into play
• optimization level - where techniques like SGD, Adam, Rprop, BFGS, etc. come into play, and which (if they are first order or higher) use the gradient computed above
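
A minimal sketch of these two layers, assuming NumPy and a linear model with a squared loss (the names gradients and sgd_step are purely illustrative, not from any library):

    import numpy as np

    def gradients(w, b, x, y):
        # "backprop" layer: chain rule for y_hat = w.x + b with loss (y_hat - y)**2
        y_hat = np.dot(w, x) + b
        dL_dyhat = 2.0 * (y_hat - y)   # derivative of the loss w.r.t. the prediction
        dL_dw = dL_dyhat * x           # chain rule: d(y_hat)/dw = x
        dL_db = dL_dyhat               # chain rule: d(y_hat)/db = 1
        return dL_dw, dL_db

    def sgd_step(w, b, grads, lr=0.01):
        # optimizer layer: SGD does not care how the gradients were obtained
        dL_dw, dL_db = grads
        return w - lr * dL_dw, b - lr * dL_db

    # one stochastic update from a single (x, y) example
    w, b = np.zeros(3), 0.0
    x, y = np.array([1.0, 2.0, 3.0]), 4.0
    w, b = sgd_step(w, b, gradients(w, b, x, y))

Swapping sgd_step for any other first-order update rule (Adam, Rprop, etc.) leaves gradients untouched, which is exactly the separation described above.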






answered Jun 21 '16 at 20:22 by lejlot, edited Jun 19 '18 at 6:22 by QINGYUAN FENG


8 votes
Stochastic gradient descent (SGD) is an optimization method, used e.g. to minimize a loss function.

In SGD, you use one example at each iteration to update the weights of your model, based on the error on that example, instead of using the average error over all examples (as in "plain" gradient descent) at each iteration. To do so, SGD needs to compute the "gradient of your model".

Backpropagation is an efficient technique for computing this "gradient", which SGD then uses.
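
A rough sketch of that difference, assuming NumPy and a least-squares model (data, names and learning rate are made up for the illustration): the two loops below differ only in how many examples feed each gradient, one randomly chosen example per step for SGD versus the average over all examples for plain gradient descent.

    import numpy as np

    def grad_single(w, x_i, y_i):
        # gradient of the squared error on ONE example: what SGD uses per step
        return 2.0 * (np.dot(w, x_i) - y_i) * x_i

    def grad_full(w, X, Y):
        # average gradient over ALL examples: what plain gradient descent uses
        return np.mean([grad_single(w, x_i, y_i) for x_i, y_i in zip(X, Y)], axis=0)

    rng = np.random.default_rng(0)
    X, Y = rng.normal(size=(100, 3)), rng.normal(size=100)
    lr = 0.01

    # SGD: one randomly chosen example per update
    w_sgd = np.zeros(3)
    for _ in range(1000):
        i = rng.integers(len(X))
        w_sgd -= lr * grad_single(w_sgd, X[i], Y[i])

    # plain gradient descent: the full average gradient per update
    w_gd = np.zeros(3)
    for _ in range(1000):
        w_gd -= lr * grad_full(w_gd, X, Y)

In either case, for a neural network the per-example gradient itself would typically be computed by backpropagation.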






answered Feb 3 '18 at 12:53 by mohamed_18, edited Nov 14 '18 at 21:38 by nbro


1 vote
Back-propagation is just a method for calculating the multi-variable derivatives of your model, whereas SGD is a method for locating the minimum of your loss/cost function.
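
A small sketch of that separation, assuming NumPy (the loss, names and constants are invented for illustration): the descent loop below does not care where the derivatives come from; a crude finite-difference approximation stands in for back-propagation, and the update rule stays the same.

    import numpy as np

    def numerical_gradient(loss, w, eps=1e-6):
        # finite-difference approximation of dL/dw: a slow stand-in for backprop
        g = np.zeros_like(w)
        for j in range(len(w)):
            w_plus, w_minus = w.copy(), w.copy()
            w_plus[j] += eps
            w_minus[j] -= eps
            g[j] = (loss(w_plus) - loss(w_minus)) / (2 * eps)
        return g

    # gradient descent on a simple quadratic loss; swapping in backprop-computed
    # gradients would change nothing on the optimizer side
    loss = lambda w: np.sum((w - np.array([1.0, -2.0, 3.0])) ** 2)
    w = np.zeros(3)
    for _ in range(500):
        w -= 0.1 * numerical_gradient(loss, w)   # w converges toward [1, -2, 3]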






answered Mar 1 '18 at 4:41 by lf2225, edited Nov 14 '18 at 21:46 by nbro