How to sort rows based on column-matching data using Python?
I have a table as shown below and I want to sort it into the expected output shown below. How can I do this using Python? The table is in Excel/CSV format.
I want to match the Column1 data with the Column2 data and add two new columns (5 and 6) containing the sorted data, as below.
How can I perform the above operation using Python?
python excel sorting
edited Nov 15 '18 at 5:42
Thilina Nakkawita
asked Nov 15 '18 at 4:41
user2006419
2 Answers
It looks like your question involves two things:
- Reading the data from Excel into your Python program.
- Manipulating the data once it's in Python.
For #1, you could use something like the xlrd package, or one of several other xls(x) parsers.
I would start with that and see if you can get the data into Python. It would look something like this:

    import xlrd

    # open your workbook
    # (note: xlrd 2.0 and later read only .xls files; for .xlsx, use another parser such as openpyxl)
    wb = xlrd.open_workbook('mybook.xlsx')
    sh = wb.sheet_by_index(0)

    # now go through the rows and do what you want with them
    for x in range(sh.nrows):
        for y in range(sh.ncols):
            value = sh.cell(x, y).value
            # and do something with this value.

I hope this gets you started.
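Since the question mentions that the table may also be in CSV format, here is a minimal sketch of the same reading step using Python's standard `csv` module. The column names (`Col1` to `Col4`), the two sample rows, and the file name `mybook.csv` are assumptions for illustration only; an in-memory string stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical sample standing in for the CSV file. To read a real file,
# replace io.StringIO(...) with open('mybook.csv', newline='').
sample = io.StringIO(
    "Col1,Col2,Col3,Col4\n"
    "ABC,DEF,12,DGMN\n"
    "PQR,MNO,17,DGSD\n"
)


def read_rows(handle):
    """Return the CSV contents as a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


rows = read_rows(sample)
for row in rows:
    # every value is read as a string; convert types yourself if needed
    print(row['Col1'], row['Col2'])
```

Each row comes back as a dictionary, so later matching logic can refer to columns by name rather than by position.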
answered Nov 15 '18 at 4:49
David L
Hi Henry H, thanks for the answer, but I already have code to read the data. I just want to compare with the second column and sort the rows as shown in the expected output.
– user2006419
Nov 15 '18 at 4:53
One approach is as follows:
- Create an empty dataframe and append the matched column values as per your logic.
- Merge the created dataframe with your original dataframe.
I made an attempt using your example data; here it is:

    import pandas as pd

    dat = pd.read_excel('<location_to_file>')  # read the Excel file into a pandas DataFrame
    matched = []  # one record per match (DataFrame.append was removed in pandas 2.0)
    for n in range(dat.shape[0]):
        for m in range(dat.shape[0]):
            if dat['Col1'][n] == dat['Col2'][m]:
                # copy Col3 and Col4 (positions 2 and 3) from the matching row
                matched.append({'Column5': dat.iloc[m, 2], 'Column6': dat.iloc[m, 3]})
    dat1 = pd.DataFrame(matched)
    # print(dat1)
    df = pd.concat([dat, dat1], axis=1)  # place the new columns beside the original ones
    print(df)

Input (as dataframe):

      Col1 Col2     Col3  Col4
    0  ABC  DEF       12  DGMN
    1  PQR  MNO       17  DGSD
    2  DEF  JPG   United  DGFS
    3  JPG  PQR     21Hi  DFPR
    4  SQL  STF      STM  DGBC
    5  PQR  YZW  Hello90  DGSF
    6  MNO  ABC      DQT  DGCV
    7  STF  SQL     A18B  DGFD

Intermediate/temp dataframe:

      Column5 Column6
    0     DQT    DGCV
    1    21Hi    DFPR
    2      12    DGMN
    3  United    DGFS
    4    A18B    DGFD
    5    21Hi    DFPR
    6      17    DGSD
    7     STM    DGBC

Output (df):

      Col1 Col2     Col3  Col4 Column5 Column6
    0  ABC  DEF       12  DGMN     DQT    DGCV
    1  PQR  MNO       17  DGSD    21Hi    DFPR
    2  DEF  JPG   United  DGFS      12    DGMN
    3  JPG  PQR     21Hi  DFPR  United    DGFS
    4  SQL  STF      STM  DGBC    A18B    DGFD
    5  PQR  YZW  Hello90  DGSF    21Hi    DFPR
    6  MNO  ABC      DQT  DGCV      17    DGSD
    7  STF  SQL     A18B  DGFD     STM    DGBC

This snippet could be improved further for performance by vectorizing the operations, but I hope it helps you get started.
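As a rough illustration of what a vectorized version might look like, the sketch below builds a lookup table keyed on `Col2` and reindexes it by `Col1`. It assumes that when a value occurs more than once in `Col2` only its first row should be used, and it fills `NaN` where a `Col1` value has no match in `Col2` (the loop version skips such rows instead). The sample data is typed in by hand from the input table above, with `Col3` stored as strings.

```python
import pandas as pd

# Sample data copied from the input table above (Col3 kept as strings)
dat = pd.DataFrame({
    'Col1': ['ABC', 'PQR', 'DEF', 'JPG', 'SQL', 'PQR', 'MNO', 'STF'],
    'Col2': ['DEF', 'MNO', 'JPG', 'PQR', 'STF', 'YZW', 'ABC', 'SQL'],
    'Col3': ['12', '17', 'United', '21Hi', 'STM', 'Hello90', 'DQT', 'A18B'],
    'Col4': ['DGMN', 'DGSD', 'DGFS', 'DFPR', 'DGBC', 'DGSF', 'DGCV', 'DGFD'],
})

# Lookup table: Col2 value -> (Col3, Col4) of that row.
# drop_duplicates keeps the first row per Col2 value so the index is unique.
lookup = dat.drop_duplicates('Col2').set_index('Col2')[['Col3', 'Col4']]

# For every Col1 value, fetch the row whose Col2 equals it (NaN if none).
matched = lookup.reindex(dat['Col1']).reset_index(drop=True)
matched.columns = ['Column5', 'Column6']

# Place the looked-up columns beside the original ones.
df = pd.concat([dat, matched], axis=1)
print(df)
```

This replaces the nested loop with a single indexed lookup, which should scale much better as the number of rows grows.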
Note: Please show your research efforts toward solving the issue you're posting. That would motivate SO members to help you.
edited Nov 15 '18 at 6:43
answered Nov 15 '18 at 6:05
RussellB