Problem with python memory, flush, csv size

After sorting a dataset, I have a problem at this point in my code:

with open(fns_land[xx]) as infile:
    lines = infile.readlines()
    for line in lines:
        result_station.append(line.split(',')[0])
        result_date.append(line.split(',')[1])
        result_metar.append(line.split(',')[-1])

The problem is in the lines line: the data is sometimes so huge that the process gets killed (it runs out of memory).

Is there a short/nice way to rewrite this part?
python arrays memory flush






asked Nov 14 '18 at 14:17 by S.Kociok, edited Nov 14 '18 at 15:56 by toti08

  • Possible duplicate of Python readlines() usage and efficient practice for reading – The Pjot, Nov 14 '18 at 14:23

2 Answers

Use readline instead; this reads one line at a time without loading the entire file into memory.

with open(fns_land[xx]) as infile:
    while True:
        line = infile.readline()
        if not line:
            break
        result_station.append(line.split(',')[0])
        result_date.append(line.split(',')[1])
        result_metar.append(line.split(',')[-1])
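
A minor variant of the same idea (a sketch, not part of the answer above): the file object is itself an iterator over lines, so a plain for loop also reads one line at a time without buffering the whole file. fns_land[xx] and the result list names are taken from the question's code.

# Sketch: iterate over the file object directly; lines are streamed lazily,
# so memory use stays small. fns_land[xx] is the CSV path from the question.
result_station, result_date, result_metar = [], [], []

with open(fns_land[xx]) as infile:
    for line in infile:
        parts = line.rstrip('\n').split(',')
        result_station.append(parts[0])
        result_date.append(parts[1])
        result_metar.append(parts[-1])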

answered Nov 14 '18 at 14:24 by Rocky Li

If you are dealing with a dataset, I would suggest that you have a look at pandas, which is great for data wrangling.

If your problem is a large dataset, you could load the data in chunks.

import pandas as pd
tfr = pd.read_csv('fns_land{0}.csv'.format(xx), iterator=True, chunksize=1000)

Line 1: import the pandas module.
Line 2: read the data from your csv file in chunks of 1000 lines.

This will be of type pandas.io.parsers.TextFileReader. To load the entire csv file, you follow up with:

df = pd.concat(tfr, ignore_index=True)

The parameter ignore_index=True is added to avoid duplicate indexes.

You now have all your data loaded into a dataframe. Then do your data manipulation on the columns as vectors, which is also faster than working line by line.

Have a look at this question, which dealt with something similar.
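
A possible follow-up sketch (not part of the original answer): if memory is the main concern, the chunks do not even have to be concatenated; each chunk can be processed as it is read, keeping only the needed columns. The file name pattern mirrors the answer, and header=None is an assumption based on the question's code, which treats every line as data.

import pandas as pd

# Sketch: stream the CSV in chunks of 1000 rows and keep only three columns.
# 'fns_land{0}.csv'.format(xx) mirrors the answer; xx comes from the question.
result_station, result_date, result_metar = [], [], []

for chunk in pd.read_csv('fns_land{0}.csv'.format(xx), header=None, chunksize=1000):
    result_station.extend(chunk.iloc[:, 0])   # first column
    result_date.extend(chunk.iloc[:, 1])      # second column
    result_metar.extend(chunk.iloc[:, -1])    # last column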

answered Nov 14 '18 at 14:41 by Philip



            • Thanks. But for my using the open methode was the best way. I only want to read in three colums out of 1000 colums. For the next time it is maybe a better way with pandas.
