Problem with Python memory, flush, CSV size
After sorting a dataset, I have a problem at this point in my code:
with open(fns_land[xx]) as infile:
    lines = infile.readlines()
    for line in lines:
        result_station.append(line.split(',')[0])
        result_date.append(line.split(',')[1])
        result_metar.append(line.split(',')[-1])
I have a problem with the lines line: the data are sometimes too huge and I get a kill error (the process runs out of memory).
Is there a short/nice way to rewrite this point?
python arrays memory flush
asked Nov 14 '18 at 14:17 by S.Kociok, edited Nov 14 '18 at 15:56 by toti08
Possible duplicate of Python readlines() usage and efficient practice for reading – The Pjot, Nov 14 '18 at 14:23
2 Answers
Use readline instead; it reads one line at a time without loading the entire file into memory.
with open(fns_land[xx]) as infile:
    while True:
        line = infile.readline()
        if not line:
            break
        result_station.append(line.split(',')[0])
        result_date.append(line.split(',')[1])
        result_metar.append(line.split(',')[-1])
answered Nov 14 '18 at 14:24 by Rocky Li
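As a side note to the answer above, iterating directly over the file object streams lines the same way and is the more idiomatic spelling; a minimal sketch, reusing the fns_land[xx] path and the three result lists from the question:

with open(fns_land[xx]) as infile:
    for line in infile:                       # the file object yields one line at a time
        parts = line.rstrip('\n').split(',')  # split once instead of three times
        result_station.append(parts[0])
        result_date.append(parts[1])
        result_metar.append(parts[-1])

Splitting each line only once also avoids doing the same work three times per line.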
If you are dealing with a dataset, I would suggest that you have a look at pandas, which is great for data wrangling.
If your problem is a large dataset, you could load the data in chunks.
import pandas as pd
tfr = pd.read_csv('fns_land{0}.csv'.format(xx), iterator=True, chunksize=1000)
- Line 1: imports the pandas module.
- Line 2: reads the data from your csv file in chunks of 1000 lines.
This will be of type pandas.io.parsers.TextFileReader. To load the entire csv file, you follow up with:
df = pd.concat(tfr, ignore_index=True)
The parameter ignore_index=True is added to avoid duplicate indexes.
You now have all your data loaded into a dataframe. Then do your data manipulation on the columns as vectors, which is also faster than regular line-by-line processing.
Have a look at this question, which dealt with something similar.
answered Nov 14 '18 at 14:41 by Philip
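Put together, a minimal end-to-end sketch of this chunked approach might look like the following; the file name fns_land0.csv and header=None are placeholder assumptions, not taken from the original post:

import pandas as pd

# Read the csv in chunks of 1000 rows; tfr yields one DataFrame per chunk.
tfr = pd.read_csv('fns_land0.csv', header=None, iterator=True, chunksize=1000)

# Stitch the chunks back into a single DataFrame with a fresh index.
df = pd.concat(tfr, ignore_index=True)

# Work on whole columns (vectorised) instead of looping line by line,
# e.g. the first, second and last columns from the question:
result_station = df.iloc[:, 0]
result_date = df.iloc[:, 1]
result_metar = df.iloc[:, -1]

Note that concatenating all chunks still needs enough memory for the full table; chunking mainly helps when each chunk is processed and discarded in turn.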
Thanks. But for my use the open method was the best way. I only want to read in three columns out of 1000 columns. Next time pandas may be the better way. – S.Kociok, Nov 14 '18 at 14:54
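For that case, pandas can also restrict the read to just the needed columns with usecols, so only those three columns are kept in memory. A minimal sketch, where the file name, header=None and the index of the last column (999 for 1000 columns) are assumptions, not from the original post:

import pandas as pd

LAST_COL = 999  # assumed position of the last of the 1000 columns

# Only the listed column positions are loaded into the DataFrame.
df = pd.read_csv('fns_land0.csv', header=None, usecols=[0, 1, LAST_COL])

result_station = df[0].tolist()
result_date = df[1].tolist()
result_metar = df[LAST_COL].tolist()

This can also be combined with chunksize from the answer above if even three columns are too large to hold at once.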