Training speed on a shallow neural network with a small dataset












My data consists of one feature and one label per example,
e.g. ["smallBigTest", "toastBob"] <- features
and 4 possible labels: ["mix", "small", "big", "medium"]



I have converted my features to numbers based on the alphabet,
e.g.



smallBigTest -> 18, 12,  0, 53, 53, 27,  8,  6, 45,  4, 18, 19
toastBob -> 19, 14, 0, 18, 19, 27, 14, 1, -1, -1, -1, -1


which I later one-hot encoded and reshaped, so the final array of features looks like [[hotencoded(18, 12, 0, 53, 53, 27, 8, 6, 45, 4, 18, 19)], [hotencoded(19, 14, 0, 18, 19, 27, 14, 1, -1, -1, -1, -1)]]



I simply flattened it from a 3D array into a 2D array to match the shape of my labels, which I have also one-hot encoded.
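
Roughly, the encoding step looks like the following simplified sketch (the exact character set, padding handling, and helper names are placeholders here; my real code differs):

import numpy as np

chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"   # simplified character set
char_to_idx = {c: i for i, c in enumerate(chars)}
max_len = 12                      # length of the longest word
vocab_size = len(chars) + 1       # one extra slot for the -1 padding value

def encode_word(word):
    # map each character to its index, pad short words with -1
    idx = [char_to_idx[c] for c in word]
    return idx + [-1] * (max_len - len(idx))

def one_hot(indices):
    # one-hot encode every position, then flatten to a single vector per word
    out = np.zeros((max_len, vocab_size), dtype=np.float32)
    for pos, i in enumerate(indices):
        out[pos, i] = 1.0         # i == -1 lands in the last (padding) column
    return out.reshape(-1)

# stacking the flattened vectors gives a 2D array, one row per example
featuresOneHot = np.stack([one_hot(encode_word(w)) for w in ["smallBigTest", "toastBob"]])
print(featuresOneHot.shape)       # (2, max_len * vocab_size)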



The training data is a CSV file with about 60k lines of text (~1.2 MB).



and here is my model:



import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(16, activation=tf.nn.sigmoid))
model.add(tf.keras.layers.Dense(labelsDictSize, activation=tf.nn.softmax))

optimizer = tf.train.GradientDescentOptimizer(0.05)
model.compile(optimizer, loss=tf.losses.softmax_cross_entropy)
model.fit(featuresOneHot, labelsOneHot, steps_per_epoch=dataCount, epochs=5, verbose=1)


I'm new to ML, so I might be doing something completely wrong or stupid, but I thought this amount of data would be fine.
Training on my machine with a GTX 870M takes an hour per epoch, and on Google Colaboratory around 20-30 minutes per epoch.










      python tensorflow keras training-data






asked Nov 13 '18 at 22:05 by Higeath
























1 Answer




















It's not unusual for NLP models to take this long to train. The only change I would make to your model to speed up learning is switching to an optimizer that doesn't use a fixed learning rate. I would suggest Adam, as it's one of the fastest optimizers and generally performs well.

Just replace

model.compile(optimizer, loss=tf.losses.softmax_cross_entropy)

with

model.compile(optimizer='adam', loss=tf.losses.softmax_cross_entropy)
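
If you want to keep explicit control of the learning rate, a minimal sketch with the TF 1.x-style Adam optimizer (0.001 is just Adam's usual default here, not a value tuned for your data):

optimizer = tf.train.AdamOptimizer(learning_rate=0.001)  # adapts per-parameter step sizes, unlike plain gradient descent
model.compile(optimizer, loss=tf.losses.softmax_cross_entropy)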





answered Nov 13 '18 at 22:57 by Tadej Magajna
