Convert list of edges dataframe to adjacency matrix dataframe

My dataframe represents a list of edges of a graph and has the following format:



      node1 node2  weight
    0     a     c       1
    1     b     c       2
    2     d     c       3


My goal is to generate the equivalent adjacency matrix:



       a  b  c  d
    a  0  0  1  0
    b  0  0  2  0
    c  0  0  0  0
    d  0  0  3  0


At the moment, while constructing the dataframe of edges, I count the number of nodes, create an NxN dataframe, and fill in the values manually. What is the pandas way of generating the second dataframe from the first one?
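For reference, a minimal sketch of the setup assumed by the snippets below (the imports and the construction of df are an assumption, not shown in the question; only the data itself comes from the example above):

    import numpy as np
    import pandas as pd

    # edge list from the question: node1 -> node2 with an edge weight
    df = pd.DataFrame({'node1': ['a', 'b', 'd'],
                       'node2': ['c', 'c', 'c'],
                       'weight': [1, 2, 3]})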










Tags: python pandas






asked Nov 11 at 5:26 by Hamza

          2 Answers
          Use pivot with reindex



    In [20]: vals = np.unique(df[['node1', 'node2']])

    In [21]: df.pivot(index='node1', columns='node2', values='weight'
                ).reindex(columns=vals, index=vals, fill_value=0)
    Out[21]:
    node2  a  b  c  d
    node1
    a      0  0  1  0
    b      0  0  2  0
    c      0  0  0  0
    d      0  0  3  0


          Or use set_index and unstack



    In [27]: (df.set_index(['node1', 'node2'])['weight'].unstack()
                .reindex(columns=vals, index=vals, fill_value=0))
    Out[27]:
    node2  a  b  c  d
    node1
    a      0  0  1  0
    b      0  0  2  0
    c      0  0  0  0
    d      0  0  3  0
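One caveat: fill_value in reindex only fills positions introduced by the reindexing itself. If the edge list had more than one distinct target node, the pivot would leave NaN for label pairs with no edge, and fill_value would not touch those. A minimal defensive sketch, chaining fillna afterwards:

    # reindex to the full node set, then fill any NaN left by the pivot itself
    adj = (df.pivot(index='node1', columns='node2', values='weight')
             .reindex(index=vals, columns=vals)
             .fillna(0)
             .astype(int))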





answered Nov 11 at 5:31 by Zero

            Decided to have a little fun with the problem.



            You can convert node1 and node2 to Categorical dtype and then use groupby.



    from functools import partial

    vals = np.unique(df[['node1', 'node2']])
    p = partial(pd.Categorical, categories=vals)
    df['node1'], df['node2'] = p(df['node1']), p(df['node2'])

    (df.groupby(['node1', 'node2'])
       .first()
       .fillna(0, downcast='infer')
       .weight
       .unstack())

    node2  a  b  c  d
    node1
    a      0  0  1  0
    b      0  0  2  0
    c      0  0  0  0
    d      0  0  3  0
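On newer pandas versions this may need two small tweaks (a sketch under that assumption): the downcast argument of fillna has been deprecated, and passing observed explicitly keeps the empty category combinations in the groupby result.

    # same idea without fillna(downcast=...); NaNs for missing pairs are filled after unstacking
    adj = (df.groupby(['node1', 'node2'], observed=False)['weight']
             .first()
             .unstack()
             .fillna(0)
             .astype(int))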




            Another option is setting the underlying array values directly.



    df2 = pd.DataFrame(0, index=vals, columns=vals)
    f = df2.index.get_indexer
    df2.values[f(df.node1), f(df.node2)] = df.weight.values

    print(df2)
       a  b  c  d
    a  0  0  1  0
    b  0  0  2  0
    c  0  0  0  0
    d  0  0  3  0
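Writing through .values relies on df2 being backed by a single writable integer block; if that is ever a concern (for example with copy-on-write enabled in newer pandas), a sketch of the same indexing idea using a plain NumPy buffer first:

    # build the matrix as a NumPy array, then wrap it in a DataFrame
    idx = pd.Index(vals)
    arr = np.zeros((len(vals), len(vals)), dtype=int)
    arr[idx.get_indexer(df['node1']), idx.get_indexer(df['node2'])] = df['weight'].to_numpy()
    adj = pd.DataFrame(arr, index=vals, columns=vals)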





answered Nov 11 at 5:53 (edited Nov 11 at 9:13) by coldspeed