How to Sort rows based on Column matching data using python?












-2















I have table as below and i want to sort as expected output as below. How to perform using Python. Table is inside Excel/csv format. ?



enter image description here



I want to Match column1 data with Column2 data and want to add new 2 columns(5&6) with sorted data as below.



enter image description here



How to perform above operation using python ?










share|improve this question





























    -2















    I have table as below and i want to sort as expected output as below. How to perform using Python. Table is inside Excel/csv format. ?



    enter image description here



    I want to Match column1 data with Column2 data and want to add new 2 columns(5&6) with sorted data as below.



    enter image description here



    How to perform above operation using python ?










    share|improve this question



























      -2












      -2








      -2








      I have table as below and i want to sort as expected output as below. How to perform using Python. Table is inside Excel/csv format. ?



      enter image description here



      I want to Match column1 data with Column2 data and want to add new 2 columns(5&6) with sorted data as below.



      enter image description here



      How to perform above operation using python ?










      share|improve this question
















      I have table as below and i want to sort as expected output as below. How to perform using Python. Table is inside Excel/csv format. ?



      enter image description here



      I want to Match column1 data with Column2 data and want to add new 2 columns(5&6) with sorted data as below.



      enter image description here



      How to perform above operation using python ?







      python excel sorting






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 15 '18 at 5:42









      Thilina Nakkawita

      9511228




      9511228










      asked Nov 15 '18 at 4:41









      user2006419user2006419

      1




      1
























          2 Answers
          2






          active

          oldest

          votes


















          2














          It looks like your question is trying to do two things:




          1. Read the data from excel into your python program.

          2. Manipulate the data once it's in python.


          For #1, you could use something like Python's xlrd or there are several other xls(x) parsers.



          I would start with that and see if you can get the data into python. It would look something like this:



          import xlrd

          # open your workbook
          wb = xlrd.open_workbook('mybook.xlsx')
          sh = wb.sheet_by_index(0)

          # now go through the rows and do what you want with them
          for x in range(sh.nrows):
          for y in range(sh.ncols):
          value = sh.cell(x,y).value
          # and do something with this value.


          I hope the above gets you started on the above.






          share|improve this answer
























          • Hi Henry H, Thanks for Answer. But i already have code to read data. Just i want to Compare with second column and sort rows as shown in expected output.

            – user2006419
            Nov 15 '18 at 4:53





















          2














          One of the approach is as follows,



          Create an empty dataframe and append the matched column values as per your logic



          Merge the created dataframe with your original dataframe.



          I have made an attempt using your example data and here goes it:



          import pandas as pd
          dat = pd.read_excel(<location_to_file>) # Reading excel in to pandas
          dat = pd.DataFrame(dat) # Converting to a pandas dataframe
          dat1 = pd.DataFrame()
          for n in range(dat.shape[0]):
          for m in range(dat.shape[0]):
          if dat['Col1'][n] == dat['Col2'][m]:
          dat1 = dat1.append(pd.DataFrame({'Column5': dat.iloc[m][2], 'Column6': dat.iloc[m][3]}, index=[0]), ignore_index=True)
          # print(dat1)
          df = pd.concat([dat, dat1], axis=1)
          print(df)


          Input(as dataframe):



            Col1 Col2     Col3  Col4
          0 ABC DEF 12 DGMN
          1 PQR MNO 17 DGSD
          2 DEF JPG United DGFS
          3 JPG PQR 21Hi DFPR
          4 SQL STF STM DGBC
          5 PQR YZW Hello90 DGSF
          6 MNO ABC DQT DGCV
          7 STF SQL A18B DGFD


          Intermediate/Temp dataframe:



            Column5 Column6
          0 DQT DGCV
          1 21Hi DFPR
          2 12 DGMN
          3 United DGFS
          4 A18B DGFD
          5 21Hi DFPR
          6 17 DGSD
          7 STM DGBC


          Output(df):



            Col1 Col2     Col3  Col4 Column5 Column6
          0 ABC DEF 12 DGMN DQT DGCV
          1 PQR MNO 17 DGSD 21Hi DFPR
          2 DEF JPG United DGFS 12 DGMN
          3 JPG PQR 21Hi DFPR United DGFS
          4 SQL STF STM DGBC A18B DGFD
          5 PQR YZW Hello90 DGSF 21Hi DFPR
          6 MNO ABC DQT DGCV 17 DGSD
          7 STF SQL A18B DGFD STM DGBC


          While this snippet could be improved further for performance by vectorizing the operations. Hope this would help you get started.



          Note
          Please show your research efforts in addressing/solving the issue you're posting. That would motivate SO members to help you.






          share|improve this answer

























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53312537%2fhow-to-sort-rows-based-on-column-matching-data-using-python%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            2














            It looks like your question is trying to do two things:




            1. Read the data from excel into your python program.

            2. Manipulate the data once it's in python.


            For #1, you could use something like Python's xlrd or there are several other xls(x) parsers.



            I would start with that and see if you can get the data into python. It would look something like this:



            import xlrd

            # open your workbook
            wb = xlrd.open_workbook('mybook.xlsx')
            sh = wb.sheet_by_index(0)

            # now go through the rows and do what you want with them
            for x in range(sh.nrows):
            for y in range(sh.ncols):
            value = sh.cell(x,y).value
            # and do something with this value.


            I hope the above gets you started on the above.






            share|improve this answer
























            • Hi Henry H, Thanks for Answer. But i already have code to read data. Just i want to Compare with second column and sort rows as shown in expected output.

              – user2006419
              Nov 15 '18 at 4:53


















            2














            It looks like your question is trying to do two things:




            1. Read the data from excel into your python program.

            2. Manipulate the data once it's in python.


            For #1, you could use something like Python's xlrd or there are several other xls(x) parsers.



            I would start with that and see if you can get the data into python. It would look something like this:



            import xlrd

            # open your workbook
            wb = xlrd.open_workbook('mybook.xlsx')
            sh = wb.sheet_by_index(0)

            # now go through the rows and do what you want with them
            for x in range(sh.nrows):
            for y in range(sh.ncols):
            value = sh.cell(x,y).value
            # and do something with this value.


            I hope the above gets you started on the above.






            share|improve this answer
























            • Hi Henry H, Thanks for Answer. But i already have code to read data. Just i want to Compare with second column and sort rows as shown in expected output.

              – user2006419
              Nov 15 '18 at 4:53
















            2












            2








            2







            It looks like your question is trying to do two things:




            1. Read the data from excel into your python program.

            2. Manipulate the data once it's in python.


            For #1, you could use something like Python's xlrd or there are several other xls(x) parsers.



            I would start with that and see if you can get the data into python. It would look something like this:



            import xlrd

            # open your workbook
            wb = xlrd.open_workbook('mybook.xlsx')
            sh = wb.sheet_by_index(0)

            # now go through the rows and do what you want with them
            for x in range(sh.nrows):
            for y in range(sh.ncols):
            value = sh.cell(x,y).value
            # and do something with this value.


            I hope the above gets you started on the above.






            share|improve this answer













            It looks like your question is trying to do two things:




            1. Read the data from excel into your python program.

            2. Manipulate the data once it's in python.


            For #1, you could use something like Python's xlrd or there are several other xls(x) parsers.



            I would start with that and see if you can get the data into python. It would look something like this:



            import xlrd

            # open your workbook
            wb = xlrd.open_workbook('mybook.xlsx')
            sh = wb.sheet_by_index(0)

            # now go through the rows and do what you want with them
            for x in range(sh.nrows):
            for y in range(sh.ncols):
            value = sh.cell(x,y).value
            # and do something with this value.


            I hope the above gets you started on the above.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Nov 15 '18 at 4:49









            David LDavid L

            31611




            31611













            • Hi Henry H, Thanks for Answer. But i already have code to read data. Just i want to Compare with second column and sort rows as shown in expected output.

              – user2006419
              Nov 15 '18 at 4:53





















            • Hi Henry H, Thanks for Answer. But i already have code to read data. Just i want to Compare with second column and sort rows as shown in expected output.

              – user2006419
              Nov 15 '18 at 4:53



















            Hi Henry H, Thanks for Answer. But i already have code to read data. Just i want to Compare with second column and sort rows as shown in expected output.

            – user2006419
            Nov 15 '18 at 4:53







            Hi Henry H, Thanks for Answer. But i already have code to read data. Just i want to Compare with second column and sort rows as shown in expected output.

            – user2006419
            Nov 15 '18 at 4:53















            2














            One of the approach is as follows,



            Create an empty dataframe and append the matched column values as per your logic



            Merge the created dataframe with your original dataframe.



            I have made an attempt using your example data and here goes it:



            import pandas as pd
            dat = pd.read_excel(<location_to_file>) # Reading excel in to pandas
            dat = pd.DataFrame(dat) # Converting to a pandas dataframe
            dat1 = pd.DataFrame()
            for n in range(dat.shape[0]):
            for m in range(dat.shape[0]):
            if dat['Col1'][n] == dat['Col2'][m]:
            dat1 = dat1.append(pd.DataFrame({'Column5': dat.iloc[m][2], 'Column6': dat.iloc[m][3]}, index=[0]), ignore_index=True)
            # print(dat1)
            df = pd.concat([dat, dat1], axis=1)
            print(df)


            Input(as dataframe):



              Col1 Col2     Col3  Col4
            0 ABC DEF 12 DGMN
            1 PQR MNO 17 DGSD
            2 DEF JPG United DGFS
            3 JPG PQR 21Hi DFPR
            4 SQL STF STM DGBC
            5 PQR YZW Hello90 DGSF
            6 MNO ABC DQT DGCV
            7 STF SQL A18B DGFD


            Intermediate/Temp dataframe:



              Column5 Column6
            0 DQT DGCV
            1 21Hi DFPR
            2 12 DGMN
            3 United DGFS
            4 A18B DGFD
            5 21Hi DFPR
            6 17 DGSD
            7 STM DGBC


            Output(df):



              Col1 Col2     Col3  Col4 Column5 Column6
            0 ABC DEF 12 DGMN DQT DGCV
            1 PQR MNO 17 DGSD 21Hi DFPR
            2 DEF JPG United DGFS 12 DGMN
            3 JPG PQR 21Hi DFPR United DGFS
            4 SQL STF STM DGBC A18B DGFD
            5 PQR YZW Hello90 DGSF 21Hi DFPR
            6 MNO ABC DQT DGCV 17 DGSD
            7 STF SQL A18B DGFD STM DGBC


            While this snippet could be improved further for performance by vectorizing the operations. Hope this would help you get started.



            Note
            Please show your research efforts in addressing/solving the issue you're posting. That would motivate SO members to help you.






            share|improve this answer






























              2














              One of the approach is as follows,



              Create an empty dataframe and append the matched column values as per your logic



              Merge the created dataframe with your original dataframe.



              I have made an attempt using your example data and here goes it:



              import pandas as pd
              dat = pd.read_excel(<location_to_file>) # Reading excel in to pandas
              dat = pd.DataFrame(dat) # Converting to a pandas dataframe
              dat1 = pd.DataFrame()
              for n in range(dat.shape[0]):
              for m in range(dat.shape[0]):
              if dat['Col1'][n] == dat['Col2'][m]:
              dat1 = dat1.append(pd.DataFrame({'Column5': dat.iloc[m][2], 'Column6': dat.iloc[m][3]}, index=[0]), ignore_index=True)
              # print(dat1)
              df = pd.concat([dat, dat1], axis=1)
              print(df)


              Input(as dataframe):



                Col1 Col2     Col3  Col4
              0 ABC DEF 12 DGMN
              1 PQR MNO 17 DGSD
              2 DEF JPG United DGFS
              3 JPG PQR 21Hi DFPR
              4 SQL STF STM DGBC
              5 PQR YZW Hello90 DGSF
              6 MNO ABC DQT DGCV
              7 STF SQL A18B DGFD


              Intermediate/Temp dataframe:



                Column5 Column6
              0 DQT DGCV
              1 21Hi DFPR
              2 12 DGMN
              3 United DGFS
              4 A18B DGFD
              5 21Hi DFPR
              6 17 DGSD
              7 STM DGBC


              Output(df):



                Col1 Col2     Col3  Col4 Column5 Column6
              0 ABC DEF 12 DGMN DQT DGCV
              1 PQR MNO 17 DGSD 21Hi DFPR
              2 DEF JPG United DGFS 12 DGMN
              3 JPG PQR 21Hi DFPR United DGFS
              4 SQL STF STM DGBC A18B DGFD
              5 PQR YZW Hello90 DGSF 21Hi DFPR
              6 MNO ABC DQT DGCV 17 DGSD
              7 STF SQL A18B DGFD STM DGBC


              While this snippet could be improved further for performance by vectorizing the operations. Hope this would help you get started.



              Note
              Please show your research efforts in addressing/solving the issue you're posting. That would motivate SO members to help you.






              share|improve this answer




























                2












                2








                2







                One of the approach is as follows,



                Create an empty dataframe and append the matched column values as per your logic



                Merge the created dataframe with your original dataframe.



                I have made an attempt using your example data and here goes it:



                import pandas as pd
                dat = pd.read_excel(<location_to_file>) # Reading excel in to pandas
                dat = pd.DataFrame(dat) # Converting to a pandas dataframe
                dat1 = pd.DataFrame()
                for n in range(dat.shape[0]):
                for m in range(dat.shape[0]):
                if dat['Col1'][n] == dat['Col2'][m]:
                dat1 = dat1.append(pd.DataFrame({'Column5': dat.iloc[m][2], 'Column6': dat.iloc[m][3]}, index=[0]), ignore_index=True)
                # print(dat1)
                df = pd.concat([dat, dat1], axis=1)
                print(df)


                Input(as dataframe):



                  Col1 Col2     Col3  Col4
                0 ABC DEF 12 DGMN
                1 PQR MNO 17 DGSD
                2 DEF JPG United DGFS
                3 JPG PQR 21Hi DFPR
                4 SQL STF STM DGBC
                5 PQR YZW Hello90 DGSF
                6 MNO ABC DQT DGCV
                7 STF SQL A18B DGFD


                Intermediate/Temp dataframe:



                  Column5 Column6
                0 DQT DGCV
                1 21Hi DFPR
                2 12 DGMN
                3 United DGFS
                4 A18B DGFD
                5 21Hi DFPR
                6 17 DGSD
                7 STM DGBC


                Output(df):



                  Col1 Col2     Col3  Col4 Column5 Column6
                0 ABC DEF 12 DGMN DQT DGCV
                1 PQR MNO 17 DGSD 21Hi DFPR
                2 DEF JPG United DGFS 12 DGMN
                3 JPG PQR 21Hi DFPR United DGFS
                4 SQL STF STM DGBC A18B DGFD
                5 PQR YZW Hello90 DGSF 21Hi DFPR
                6 MNO ABC DQT DGCV 17 DGSD
                7 STF SQL A18B DGFD STM DGBC


                While this snippet could be improved further for performance by vectorizing the operations. Hope this would help you get started.



                Note
                Please show your research efforts in addressing/solving the issue you're posting. That would motivate SO members to help you.






                share|improve this answer















                One of the approach is as follows,



                Create an empty dataframe and append the matched column values as per your logic



                Merge the created dataframe with your original dataframe.



                I have made an attempt using your example data and here goes it:



                import pandas as pd
                dat = pd.read_excel(<location_to_file>) # Reading excel in to pandas
                dat = pd.DataFrame(dat) # Converting to a pandas dataframe
                dat1 = pd.DataFrame()
                for n in range(dat.shape[0]):
                for m in range(dat.shape[0]):
                if dat['Col1'][n] == dat['Col2'][m]:
                dat1 = dat1.append(pd.DataFrame({'Column5': dat.iloc[m][2], 'Column6': dat.iloc[m][3]}, index=[0]), ignore_index=True)
                # print(dat1)
                df = pd.concat([dat, dat1], axis=1)
                print(df)


                Input(as dataframe):



                  Col1 Col2     Col3  Col4
                0 ABC DEF 12 DGMN
                1 PQR MNO 17 DGSD
                2 DEF JPG United DGFS
                3 JPG PQR 21Hi DFPR
                4 SQL STF STM DGBC
                5 PQR YZW Hello90 DGSF
                6 MNO ABC DQT DGCV
                7 STF SQL A18B DGFD


                Intermediate/Temp dataframe:



                  Column5 Column6
                0 DQT DGCV
                1 21Hi DFPR
                2 12 DGMN
                3 United DGFS
                4 A18B DGFD
                5 21Hi DFPR
                6 17 DGSD
                7 STM DGBC


                Output(df):



                  Col1 Col2     Col3  Col4 Column5 Column6
                0 ABC DEF 12 DGMN DQT DGCV
                1 PQR MNO 17 DGSD 21Hi DFPR
                2 DEF JPG United DGFS 12 DGMN
                3 JPG PQR 21Hi DFPR United DGFS
                4 SQL STF STM DGBC A18B DGFD
                5 PQR YZW Hello90 DGSF 21Hi DFPR
                6 MNO ABC DQT DGCV 17 DGSD
                7 STF SQL A18B DGFD STM DGBC


                While this snippet could be improved further for performance by vectorizing the operations. Hope this would help you get started.



                Note
                Please show your research efforts in addressing/solving the issue you're posting. That would motivate SO members to help you.







                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited Nov 15 '18 at 6:43

























                answered Nov 15 '18 at 6:05









                RussellBRussellB

                8191330




                8191330






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53312537%2fhow-to-sort-rows-based-on-column-matching-data-using-python%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Florida Star v. B. J. F.

                    Danny Elfman

                    Lugert, Oklahoma