Pandas dataframe select rows where a list-column contains any of a list of strings





I've got a pandas DataFrame that looks like this:

  molecule            species
0        a              [dog]
1        b       [horse, pig]
2        c         [cat, dog]
3        d  [cat, horse, pig]
4        e     [chicken, pig]

and I'd like to extract a DataFrame containing only those rows that contain any of selection = ['cat', 'dog']. So the result should look like this:

  molecule            species
0        a              [dog]
1        c         [cat, dog]
2        d  [cat, horse, pig]

What would be the simplest way to do this?

For testing:

import pandas as pd

selection = ['cat', 'dog']
df = pd.DataFrame({'molecule': ['a', 'b', 'c', 'd', 'e'], 'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'], ['cat', 'horse', 'pig'], ['chicken', 'pig']]})






python pandas dataframe






asked Nov 16 '18 at 17:29

NicoH













Use df = df.loc[df.species.str.contains('cat|dog'),:]

– Sandeep Kadapa
Nov 16 '18 at 17:44





















6 Answers























          share|improve this answer





































IIUC, re-create a DataFrame from the list column; using isin with any should then be faster than apply:

df[pd.DataFrame(df.species.tolist()).isin(selection).any(axis=1)]
Out[64]:
  molecule            species
0        a              [dog]
2        c         [cat, dog]
3        d  [cat, horse, pig]
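
A hedged variant of the isin idea above (not part of the original answer): Series.explode, available since pandas 0.25, flattens the lists while repeating each row's index label, so the resulting mask can be grouped back to one value per original row. A minimal sketch on the question's test data:

```python
import pandas as pd

selection = ['cat', 'dog']
df = pd.DataFrame({
    'molecule': ['a', 'b', 'c', 'd', 'e'],
    'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'],
                ['cat', 'horse', 'pig'], ['chicken', 'pig']],
})

# One row per list element; the original index label is repeated per element.
exploded = df['species'].explode()

# Mark elements that are in selection, then collapse back to one value per row.
mask = exploded.isin(selection).groupby(level=0).any()

result = df[mask]
print(result['molecule'].tolist())  # ['a', 'c', 'd']
```

Because the mask is built from the original index labels, this should also line up with df when the index is not the default 0..n-1, provided the labels are unique.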






answered Nov 16 '18 at 17:56

Wen-Ben








































Using NumPy would be much faster than using pandas in this case.

Option 1: Using NumPy intersection:

mask = df.species.apply(lambda x: np.intersect1d(x, selection).size > 0)
df[mask]
450 µs ± 21.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

  molecule            species
0        a              [dog]
2        c         [cat, dog]
3        d  [cat, horse, pig]

Option 2: A similar solution using NumPy in1d (deprecated since NumPy 2.0 in favor of np.isin, which takes the same arguments here):

df[df.species.apply(lambda x: np.any(np.in1d(x, selection)))]
420 µs ± 17.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Option 3: Interestingly, using a pure Python set is quite fast here:

df[df.species.apply(lambda x: bool(set(x) & set(selection)))]
305 µs ± 5.22 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
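
A small sketch building on the set idea in Option 3 (my own variation, not timed and not from the original answer): the selection set can be built once outside the lambda, and set.isdisjoint accepts the row's list directly, so no per-row set has to be created.

```python
import pandas as pd

selection = ['cat', 'dog']
df = pd.DataFrame({
    'molecule': ['a', 'b', 'c', 'd', 'e'],
    'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'],
                ['cat', 'horse', 'pig'], ['chicken', 'pig']],
})

# Build the lookup set once instead of once per row.
selection_set = set(selection)

# isdisjoint takes any iterable; a row matches when it is NOT disjoint
# from the selection, i.e. when it shares at least one element.
mask = df['species'].map(lambda x: not selection_set.isdisjoint(x))

result = df[mask]
print(result['molecule'].tolist())  # ['a', 'c', 'd']
```

Whether this is actually faster than Option 3 depends on the data, so it is worth timing on the real DataFrame.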






edited Nov 16 '18 at 18:04

answered Nov 16 '18 at 17:53

Vaishali






































You can build a boolean mask with apply here.

selection = ['cat', 'dog']

mask = df.species.apply(lambda x: any(item in x for item in selection))
df1 = df[mask]

For the DataFrame you've provided as an example above, df1 will be:

  molecule            species
0        a              [dog]
2        c         [cat, dog]
3        d  [cat, horse, pig]
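
If some rows might hold None or NaN instead of a list, the lambda above would raise a TypeError on them. A minimal sketch of one way to guard against that, assuming such rows should simply count as "no match"; the helper name has_any and the extra row 'f' are made up for illustration:

```python
import pandas as pd

selection = ['cat', 'dog']
df = pd.DataFrame({
    'molecule': ['a', 'b', 'c', 'f'],
    'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'], None],
})

def has_any(items, wanted):
    """Hypothetical helper: True if items is list-like and shares an element with wanted."""
    if not isinstance(items, (list, tuple, set)):
        return False  # None, NaN and other scalars count as "no match"
    return any(item in wanted for item in items)

# Extra keyword arguments given to apply are forwarded to has_any.
mask = df['species'].apply(has_any, wanted=set(selection))

result = df[mask]
print(result['molecule'].tolist())  # ['a', 'c']
```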






edited Nov 16 '18 at 17:42

answered Nov 16 '18 at 17:34

Wes Doyle








Given that @NicoH is looking for the presence of 'cat' or 'dog', I would recommend changing the mask to this: mask = df.species.apply(lambda x: any(item for item in selection if item in x))

– rs311
Nov 16 '18 at 17:40

@rs311 agreed - updated the lambda with selection example

– Wes Doyle
Nov 16 '18 at 17:43
























This is an easy and basic approach.
You can create a function that checks whether any of the elements in the selection list are present in a row's species list.

def check(speciesList):
    flag = False
    for animal in selection:
        if animal in speciesList:
            flag = True
    return flag

You could then use this function to create a column that contains True or False based on whether the record contains at least one element of the selection list, and create a new data frame based on it.

df['containsCatDog'] = df.species.apply(check)
newDf = df[df.containsCatDog]

Hope it helps.
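
A possible refinement of the check function above, as a sketch rather than the answer's own code: return as soon as a match is found, pass the selection in as a parameter (here named wanted, my choice) instead of reading a global, and filter with the mask directly so df gets no helper column.

```python
import pandas as pd

def check(species_list, wanted):
    """Return True as soon as one wanted value is found in species_list."""
    for animal in wanted:
        if animal in species_list:
            return True
    return False

selection = ['cat', 'dog']
df = pd.DataFrame({
    'molecule': ['a', 'b', 'c', 'd', 'e'],
    'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'],
                ['cat', 'horse', 'pig'], ['chicken', 'pig']],
})

# args passes extra positional arguments to check after each row's value.
mask = df['species'].apply(check, args=(selection,))

new_df = df[mask]  # df itself is left unchanged
print(new_df['molecule'].tolist())  # ['a', 'c', 'd']
```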







answered Nov 16 '18 at 17:55

Command




































                                        import  pandas as pd
                                        import numpy as np
                                        selection = ['cat', 'dog']
                                        df = pd.DataFrame({'molecule': ['a','b','c','d','e'], 'species' : [['dog'], ['horse','pig'],['cat', 'dog'], ['cat','horse','pig'], ['chicken','pig']]})

                                        df1 = df[df['species'].apply((lambda x: 'dog' in x) )]
                                        df2=df[df['species'].apply((lambda x: 'cat' in x) )]
                                        frames = [df1, df2]
                                        result = pd.concat(frames,join='inner',ignore_index=False)
                                        print("result",result)
                                        result = result[~result.index.duplicated(keep='first')]
                                        print(result)






                                        share|improve this answer












                                        share|improve this answer



                                        share|improve this answer










answered Nov 16 '18 at 20:03
ALEN M A

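Using pandas str.contains (which treats the pattern as a regular expression). str.contains returns NaN for cells that hold lists rather than strings, so cast the list column with astype(str) first:

df[df["species"].astype(str).str.contains('cat|dog', regex=True)]

Output:

  molecule            species
0 a [dog]
2 c [cat, dog]
3 d [cat, horse, pig]

Prefix the mask with ~ to get the complement instead (rows b and e).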
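
One caveat with str.contains is that it matches substrings, so 'cat' would also hit a hypothetical 'caterpillar'. If exact matches are needed, something along these lines might work. It is a sketch that assumes the cells are real Python lists and that `Series.explode` is available (pandas 0.25 or later); `exploded` and `mask` are illustrative names:

```python
import pandas as pd

selection = ['cat', 'dog']
df = pd.DataFrame({
    'molecule': ['a', 'b', 'c', 'd', 'e'],
    'species': [['dog'], ['horse', 'pig'], ['cat', 'dog'],
                ['cat', 'horse', 'pig'], ['chicken', 'pig']],
})

# One row per list element; the original index label is repeated for each element
exploded = df['species'].explode()

# Exact membership test per element, then collapse back to one flag per original row
mask = exploded.isin(selection).groupby(level=0).any()

result = df[mask]
print(result)
```

Here `isin` compares whole strings, so no regular expression escaping is needed for the entries in `selection`.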





answered Nov 16 '18 at 19:30
Ken Dekalb