Most efficient way to create a list of dictionaries in Python











Suppose I have a list of objects and for each of these objects I want to create a dictionary and put all the dictionaries I generated in a list.



What I am doing is the following:



def f(x):
    ...  # some function f that returns a dictionary given x

list_of_dict = []

xlist = [x1, x2, ..., xN]

for x in xlist:
    list_of_dict.append(f(x))


I am wondering whether there is a more efficient (faster) way to create a list of dictionaries than the one I am proposing.



Thank you.







python list dictionary
















asked Nov 11 at 0:02









mgiom













  • Faster to type in, or faster in execution? Did you time your function? Did it take a perceptually long time? For how many items?
    – usr2564301
    Nov 11 at 0:07










  • Faster in execution. The xlist contains around 2,000,000 elements, and on top of making the function f more efficient I am wondering whether I can make the list creation faster.
    – mgiom
    Nov 11 at 0:09






  • The append loop can be replaced with list_of_dict = [f(x) for x in xlist], but you'll have to time it to see if it's any faster. Without seeing what f(x) does, we cannot reasonably recommend anything. Perhaps cache the result, if you get lots of repeats.
    – usr2564301
    Nov 11 at 0:12












  • Thank you. The xlist is a list of links, and the function f requests an HTML page from a link, parses it with BeautifulSoup, creates two lists of strings, and then zips them together into a dictionary.
    – mgiom
    Nov 11 at 0:18








  • map is usually one of the faster ways to apply a function to each element of a list. So you would do: list_of_dict = list(map(f, xlist))
    – dawg
    Nov 11 at 0:41
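
The caching suggestion from the comments above can be sketched with functools.lru_cache. This is only an illustration: the body of f, the call counter, and the sample xlist are made up, and it assumes every x is hashable (a URL string, for example).

```python
from functools import lru_cache

calls = 0  # counts how often the expensive body actually runs


@lru_cache(maxsize=None)
def f(x):
    """Hypothetical stand-in for the real f; x must be hashable."""
    global calls
    calls += 1
    return {"value": x, "length": len(x)}


xlist = ["a", "bb", "a", "ccc", "bb"]  # made-up input with repeats
list_of_dict = [f(x) for x in xlist]
```

Note that a cached call returns the same dict object each time, so repeated inputs share one dictionary in the list. Copy the result if the dictionaries are going to be modified later.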






























2 Answers



        accepted




















        Since you are dealing with HTTP requests (which is not obvious from your question), the rest of this answer is largely irrelevant: communication will dominate computation by a wide margin. I'll leave the answer here anyway.




        The original approach seems to be the slowest:



        In [20]: %%timeit
            ...: list_of_dict = []
            ...: for x in xlist:
            ...:     list_of_dict.append(f(x))
            ...:
        13.5 µs ± 39.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


        Mapping is the fastest of the three:



        In [21]: %timeit list(map(f,xlist))                                             
        8.45 µs ± 17 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


        List comprehension is somewhere in the middle:



        In [22]: %timeit [f(x) for x in xlist]                                          
        10.2 µs ± 22.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
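
For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch using the standard timeit module. The f and xlist below are made-up stand-ins, not the ones used for the timings above, so the absolute numbers will differ.

```python
import timeit


def f(x):
    """Hypothetical cheap stand-in for the real f."""
    return {"value": x, "square": x * x}


xlist = list(range(100))  # made-up input


def with_append():
    list_of_dict = []
    for x in xlist:
        list_of_dict.append(f(x))
    return list_of_dict


def with_map():
    return list(map(f, xlist))


def with_comprehension():
    return [f(x) for x in xlist]


if __name__ == "__main__":
    # Time each variant over the same number of runs
    for func in (with_append, with_map, with_comprehension):
        seconds = timeit.timeit(func, number=1_000)
        print(f"{func.__name__}: {seconds:.4f} s for 1,000 runs")
```

All three variants build the same list, so the choice only matters when f itself is cheap. With a slow f, the differences between them are likely to disappear in the noise.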














        edited Nov 11 at 2:57

























        answered Nov 11 at 2:45









        DYZ

























            Since you mentioned in the comments that you want faster execution overall, maybe make the requests asynchronously?



            Check out this example adapted from this blog post: http://skipperkongen.dk/2016/09/09/easy-parallel-http-requests-with-python-and-asyncio/



            import asyncio
            import requests

            async def main():
                xlist = [...]
                list_of_dict = []
                loop = asyncio.get_running_loop()
                # Run each blocking requests.get call in the default thread pool
                futures = [
                    loop.run_in_executor(None, requests.get, i)
                    for i in xlist
                ]
                # gather waits for all responses and returns them in input order
                for response in await asyncio.gather(*futures):
                    response_dict = your_parser(response)  # Your parsing to dict from before
                    list_of_dict.append(response_dict)
                return list_of_dict

            list_of_dict = asyncio.run(main())


            I think this can help you implement parallelism. Just make sure you aren't flooding a server with more requests than its owner would appreciate.
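
One way to keep the request rate in check is to cap the number of worker threads. The sketch below uses concurrent.futures.ThreadPoolExecutor directly, which is the same kind of pool that run_in_executor falls back to when its first argument is None. The fetch_and_parse function, the example.com URLs, and the max_workers value are all assumptions for illustration; the stand-in sleeps instead of doing a real HTTP request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_and_parse(url):
    """Hypothetical stand-in for f: sleeps instead of doing a real request."""
    time.sleep(0.05)  # simulated network latency
    return {"url": url, "length": len(url)}


def build_list(urls, max_workers=8):
    # max_workers caps how many requests are in flight at once
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input
        return list(pool.map(fetch_and_parse, urls))


urls = [f"https://example.com/page/{i}" for i in range(16)]  # made-up links
list_of_dict = build_list(urls)
```

Lowering max_workers trades speed for politeness toward the server. For real requests, swap the body of fetch_and_parse for the actual download and parsing code.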

















            answered Nov 11 at 0:47









            Charles Landau













            • Does this answer respond to the OP's question?
              – Chiheb Nexus
              Nov 11 at 1:45










            • In general, async does not speed up computational tasks.
              – DYZ
              Nov 11 at 2:47










            • In the example above I'm using it to make the requests in parallel, not the computation.
              – Charles Landau
              Nov 11 at 2:48










            • The OP never mentions requests in their post.
              – DYZ
              Nov 11 at 2:51






            • Correct, they mentioned it in the comments @DYZ
              – Charles Landau
              Nov 11 at 2:52

















