Pandas pivot table using custom conditions on the dataframe

I want to build a pivot table based on custom conditions on a dataframe.

The dataframe looks like this:

>>> import pandas as pd
>>> df = pd.DataFrame({"Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
                       "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
                       "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
                       "Population": [100, 150, 50, 200, 170, 390, 80, 100],
                       "Pincode": ["X1", "Y1", "Z1", "P1", "Q1", "R1", "S1", "X2"]})

>>> df
  Area City Condition  Population Pincode
0    A    X      Good         100      X1
1    A    Y       Bad         150      Y1
2    B    Z      Good          50      Z1
3    A    P      Good         200      P1
4    C    Q      Good         170      Q1
5    A    R       Bad         390      R1
6    D    S      Good          80      S1
7    A    X      Good         100      X2

Now I want to pivot df so that, for each area, I can see the count of unique cities, the count of unique "Good" cities, and the total population of the area.

I expect an output like this:

Area  city_count  good_city_count  Population
A              4                2         940
B              1                1          50
C              1                1         170
D              1                1          80
All            7                5        1240
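As a sanity check on the expected numbers, the row for Area A can be worked out by hand. The sketch below rebuilds the sample dataframe from the question and computes the three values directly; the variable names `area_a`, `city_count`, `good_city_count` and `population` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
    "Population": [100, 150, 50, 200, 170, 390, 80, 100],
    "Pincode": ["X1", "Y1", "Z1", "P1", "Q1", "R1", "S1", "X2"],
})

# Rows belonging to Area A only.
area_a = df[df["Area"] == "A"]

# Unique cities in A: X, Y, P, R (X appears twice but is counted once).
city_count = area_a["City"].nunique()

# Unique cities in A whose Condition is "Good": X and P.
good_city_count = area_a.loc[area_a["Condition"] == "Good", "City"].nunique()

# Total population over all rows of A: 100 + 150 + 200 + 390 + 100.
population = int(area_a["Population"].sum())

print(city_count, good_city_count, population)  # 4 2 940
```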

I can pass a dictionary to the aggfunc parameter, but this doesn't give me a separate count of the good cities.

>>> city_count = df.pivot_table(index=["Area"],
                                values=["City", "Population"],
                                aggfunc={"City": lambda x: len(x.unique()),
                                         "Population": "sum"},
                                margins=True)

    Area    City    Population
0   A       4       940
1   B       1       50
2   C       1       170
3   D       1       80
4   All     7       1240

I could merge two different pivot tables, one with the city counts and the other with the population, but this does not scale to a large dataset with a big aggfunc dictionary.

edited Nov 10 at 21:19

asked Nov 10 at 13:36

Pratiek Malhotra

python pandas pivot-table


2 Answers

Add the columns parameter together with fill_value; you can also use nunique as the aggregation function:

city_count = df.pivot_table(index="Area",
                            values="City",
                            columns="Condition",
                            aggfunc=lambda x: x.nunique(),
                            margins=True,
                            fill_value=0)
print(city_count)

Condition  Bad  Good  All
Area                     
A            2     2    4
B            0     1    1
C            0     1    1
D            0     1    1
All          2     5    7

Finally, if needed, convert the index to a column and rename the columns:

# axis must be passed by keyword in recent pandas versions
city_count = city_count.add_suffix('_count').reset_index().rename_axis(None, axis=1)
print(city_count)

  Area  Bad_count  Good_count  All_count
0    A          2           2          4
1    B          0           1          1
2    C          0           1          1
3    D          0           1          1
4  All          2           5          7

EDIT:

import numpy as np
import pandas as pd

d = {'City': 'nunique', 'Population': 'sum', 'good_city_count': 'nunique'}
d1 = {'City': 'city_count'}

mask = df["Condition"] == 'Good'
# keep the city name only for "Good" rows; NaN is ignored by nunique
df = (df.assign(good_city_count=lambda x: np.where(mask, x['City'], np.nan))
        .groupby('Area')
        .agg(d)
        .rename(columns=d1))

# DataFrame.append was removed in pandas 2.0, so add the totals row with pd.concat
df = pd.concat([df, df.sum().rename('All').to_frame().T]).rename_axis('Area').reset_index()
print(df)

  Area  city_count  Population  good_city_count
0    A           4         940                2
1    B           1          50                1
2    C           1         170                1
3    D           1          80                1
4  All           7        1240                5
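The same idea can also be written with named aggregation, which sets the output column names inside agg and avoids a separate rename step. This is only a sketch, assuming pandas 0.25 or later; the helper column `good_city` and the names `summary`, `totals` and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
    "Population": [100, 150, 50, 200, 170, 390, 80, 100],
    "Pincode": ["X1", "Y1", "Z1", "P1", "Q1", "R1", "S1", "X2"],
})

# Keep the city name only where Condition is "Good"; the other rows become
# missing values, which nunique ignores by default.
summary = (
    df.assign(good_city=df["City"].where(df["Condition"] == "Good"))
      .groupby("Area")
      .agg(
          city_count=("City", "nunique"),
          good_city_count=("good_city", "nunique"),
          Population=("Population", "sum"),
      )
)

# Build a one-row frame of column sums labelled "All" and stack it underneath.
totals = summary.sum().rename("All").to_frame().T
result = pd.concat([summary, totals]).rename_axis("Area").reset_index()

print(result)
```

Because every aggregation is a separate keyword argument, adding another conditional count only needs one more helper column and one more line inside agg, and the original df is left unchanged.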

edited Nov 10 at 22:25

answered Nov 10 at 13:40

jezrael

This wouldn't give me the total_count of cities against the good_count
– Pratiek Malhotra
Nov 10 at 14:18

@PratiekMalhotra - Sorry, you are right. I rolled back to the previous answer.
– jezrael
Nov 10 at 14:20

@PratiekMalhotra - If your question was answered, please accept the most helpful answer by clicking the grey check to the left of that answer to toggle it green. It helps the community. Thanks.
– jezrael
Nov 10 at 14:24

@jezrael, in dataframes where I need to aggregate values from multiple columns, I cannot use the "columns" parameter, as it would aggregate all the values based on that.
– Pratiek Malhotra
Nov 10 at 14:31

@PratiekMalhotra - Can you create a minimal, complete, and verifiable example? I am not sure I understand. Thank you.
– jezrael
Nov 10 at 14:36

Another method without using pivot_table. Use np.where with groupby+agg:

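import numpy as np

# replace Condition with the city name for "Good" rows and NaN otherwise
df['Condition'] = np.where(df['Condition'] == 'Good', df['City'], np.nan)
# the chained calls are wrapped in parentheses so they can span several lines
df = (df.groupby('Area')
        .agg({'City': 'nunique', 'Condition': 'nunique', 'Population': 'sum'})
        .rename(columns={'City': 'city_count', 'Condition': 'good_city_count'}))
# add the totals row, then restore integer dtype
df.loc['All', :] = df.sum()
df = df.astype(int).reset_index()

print(df)

  Area  city_count  good_city_count  Population
0    A           4                2         940
1    B           1                1          50
2    C           1                1         170
3    D           1                1          80
4  All           7                5        1240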
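One thing to keep in mind with a totals row built from df.sum(): it adds up the per-area unique counts, which is not the same as the number of unique cities overall when a city name appears in more than one area. In the sample data from the question the two happen to agree. The sketch below uses a small hypothetical frame (`demo`, not from the question) where they differ.

```python
import pandas as pd

# Hypothetical data: city "X" appears in both Area A and Area B.
demo = pd.DataFrame({
    "Area": ["A", "A", "B"],
    "City": ["X", "Y", "X"],
})

# Unique cities per area: A has 2 (X, Y), B has 1 (X).
per_area = demo.groupby("Area")["City"].nunique()

# Summing the per-area counts counts "X" once for each area it appears in.
sum_of_per_area = int(per_area.sum())

# Counting unique cities over the whole frame counts "X" only once.
overall_unique = int(demo["City"].nunique())

print(sum_of_per_area, overall_unique)  # 3 2
```

Which of the two numbers belongs in the "All" row depends on what the total is meant to express, so it is worth deciding explicitly.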

edited Nov 11 at 2:44

answered Nov 10 at 13:48

Sandeep Kadapa

When using this, the good city count will not be unique, as it also sums all the duplicate city appearances within an Area.
– Pratiek Malhotra
Nov 10 at 20:48

@PratiekMalhotra Check the update.
– Sandeep Kadapa
Nov 11 at 2:46
