Most efficient way to create a list of dictionaries in Python
Suppose I have a list of objects and for each of these objects I want to create a dictionary and put all the dictionaries I generated in a list.
What I am doing is the following:
def f(x):
    # some function f that returns a dictionary given x
    ...

list_of_dict = []
xlist = [x1, x2, ..., xN]
for x in xlist:
    list_of_dict.append(f(x))
I am wondering whether there is a more efficient (faster) way to create a list of dictionaries than the one I am proposing.
Thank you.
python list dictionary
Faster to type in, or faster in execution? Did you time your function? Did it take a perceptually long time? For how many items?
– usr2564301
Nov 11 at 0:07
Faster in execution. The xlist contains around 2,000,000 elements, and on top of making the function f more efficient I am wondering whether I can make the list creation faster.
– mgiom
Nov 11 at 0:09
The append loop can be replaced with list_of_dict = [f(x) for x in xlist], but you'll have to time it to see if it's any faster. Without seeing what f(x) does, we cannot reasonably recommend anything. Perhaps cache the result, if you get lots of repeats.
– usr2564301
Nov 11 at 0:12
Thank you. The xlist is a list of links; the function f requests and parses an HTML page from a link with BeautifulSoup, creates two lists of strings, and then zips them together as a dictionary.
– mgiom
Nov 11 at 0:18
map is usually one of the faster ways to apply a function to each element of a list. So you would do: list_of_dict = list(map(f, xlist))
– dawg
Nov 11 at 0:41
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
Suppose I have a list of objects and for each of these objects I want to create a dictionary and put all the dictionaries I generated in a list.
What I am doing is the following:
def f(x):
some function f that returns a dictionary given x
list_of_dict =
xlist = [x1, x2, ..., xN]
for x in xlist:
list_of_dict.append(f(x))
I am wondering whether there is a more efficient (faster) way to create a list of dictionaries than the one I am proposing.
Thank you.
python list dictionary
Suppose I have a list of objects and for each of these objects I want to create a dictionary and put all the dictionaries I generated in a list.
What I am doing is the following:
def f(x):
some function f that returns a dictionary given x
list_of_dict =
xlist = [x1, x2, ..., xN]
for x in xlist:
list_of_dict.append(f(x))
I am wondering whether there is a more efficient (faster) way to create a list of dictionaries than the one I am proposing.
Thank you.
python list dictionary
python list dictionary
asked Nov 11 at 0:02 by mgiom
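Following up on the caching suggestion in the comments: if the same link can occur more than once in xlist, memoizing f avoids refetching. A minimal sketch, using a hypothetical stand-in for f (the real one fetches and parses a page):

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # results are cached per distinct link
def f(link):
    # hypothetical stand-in for the real f, which fetches and parses a page;
    # note that repeated links will share the *same* dict object
    return {"url": link, "length": len(link)}

xlist = ["a.html", "b.html", "a.html"]  # "a.html" repeats
list_of_dict = [f(x) for x in xlist]    # the repeat is served from the cache
```

Caching only requires the argument (the link string) to be hashable; whether it helps depends entirely on how many repeats the list contains.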
2 Answers
Accepted answer (score 2):
Since you are dealing with HTTP requests (which is not obvious from your question), the rest of the answer is irrelevant: communications will dominate the computations by a wide margin. I'll leave the answer here, anyway.
The original approach seems to be the slowest:
In [20]: %%timeit
    ...: list_of_dict = []
    ...: for x in xlist:
    ...:     list_of_dict.append(f(x))
    ...:
13.5 µs ± 39.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Mapping is the best way to go:
In [21]: %timeit list(map(f, xlist))
8.45 µs ± 17 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
List comprehension is somewhere in the middle:
In [22]: %timeit [f(x) for x in xlist]
10.2 µs ± 22.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
answered Nov 11 at 2:45, edited Nov 11 at 2:57, by DYZ
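A self-contained version of the comparison can be run with timeit; here f is a hypothetical stand-in that builds a small dict, since the original f was never shown:

```python
import timeit

def f(x):
    # stand-in: build a small dict from x (the real f parses a web page)
    return {"value": x, "square": x * x}

xlist = list(range(1000))

def with_append():
    list_of_dict = []
    for x in xlist:
        list_of_dict.append(f(x))
    return list_of_dict

def with_comprehension():
    return [f(x) for x in xlist]

def with_map():
    return list(map(f, xlist))

# all three build the same list; relative speed varies with the workload
assert with_append() == with_comprehension() == with_map()
for fn in (with_append, with_comprehension, with_map):
    print(fn.__name__, timeit.timeit(fn, number=100))
```

The differences between the three are small constant factors per element; when f does real work (let alone network I/O), they all but disappear.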
Answer (score -1):
Since you mentioned in the comments that you want faster execution overall, maybe async the requests?
Check out this example adapted from this blog post: http://skipperkongen.dk/2016/09/09/easy-parallel-http-requests-with-python-and-asyncio/
import asyncio
import requests

async def main():
    xlist = [...]
    list_of_dict = []
    loop = asyncio.get_event_loop()
    futures = [
        loop.run_in_executor(None, requests.get, i)
        for i in xlist
    ]
    for response in await asyncio.gather(*futures):
        response_dict = your_parser(response)  # your parsing to dict from before
        list_of_dict.append(response_dict)
    return list_of_dict

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
I think this can help you implement parallelism. Just note that you should be sure you aren't flooding somebody with requests that they wouldn't appreciate.
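If asyncio feels heavyweight, a thread pool gives the same I/O parallelism with less ceremony. A sketch under the same assumptions, with a hypothetical fetch_and_parse stand-in in place of the real requests + BeautifulSoup work:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_and_parse(link):
    # hypothetical stand-in: the real version would do requests.get(link)
    # and parse the response with BeautifulSoup into a dict
    return {"url": link, "ok": True}

xlist = ["http://example.com/a", "http://example.com/b"]

# threads overlap the network waits; executor.map preserves input order
with ThreadPoolExecutor(max_workers=20) as executor:
    list_of_dict = list(executor.map(fetch_and_parse, xlist))
```

Because fetching is I/O-bound, threads sidestep the GIL problem here; the same flooding caveat about hammering a server still applies.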
answered Nov 11 at 0:47 by Charles Landau
Does this answer respond to the OP's question?
– Chiheb Nexus
Nov 11 at 1:45
In general, async does not speed up computational tasks.
– DYZ
Nov 11 at 2:47
In the example above I'm using it to make the requests in parallel, not the computation.
– Charles Landau
Nov 11 at 2:48
The OP never mentions requests in their post.
– DYZ
Nov 11 at 2:51
Correct, they mentioned it in the comments @DYZ
– Charles Landau
Nov 11 at 2:52