python xpath returns empty list - exilead

I'm fairly new to scraping with Python.
I am trying to obtain the number of search results from a query on Exalead. In this example I would like to get "586,564 results".

This is the code I am running:

import requests
from lxml import html

r = requests.get(URL, headers=headers)
tree = html.fromstring(r.text)
stats = tree.xpath('//*[@id="searchform"]/div/div/small/text()')

This returns an empty list.

I copied the XPath directly from the browser's Elements panel.

As an alternative, I have tried using Beautiful Soup:

from bs4 import BeautifulSoup

html = r.text
soup = BeautifulSoup(html, 'xml')
stats = soup.find('small', {'class': 'pull-right'}).text

which raises an AttributeError: 'NoneType' object has no attribute 'text'.

When I checked the HTML source, I realised that I cannot find the element I am looking for (the number of results) in it.

Does anyone know why this is happening and how this can be resolved?
Thanks a lot!

asked Nov 14 '18 at 21:37

Elisa Macchi

Did you try the XPath without the /text()? Then get the innerHTML.

– Ywapom
Nov 14 '18 at 21:52

python xpath web-scraping beautifulsoup empty-list

2 Answers

"When I checked the html source I realised I actually cannot find the element I am looking for (the number of results) on the source."

This suggests that the data you're looking for is generated dynamically with JavaScript. To scrape it with requests, the element has to be present in the HTML source that the server returns.

To confirm whether this is the cause of your error, you could try something really simple like:

html = r.text
soup = BeautifulSoup(html, 'lxml')

*Note the 'lxml' above.

Then manually inspect 'soup' to see whether your desired element is there.
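The manual check can also be automated. The following is only a minimal sketch using the standard library's HTMLParser; the names ElementFinder and element_in_source are hypothetical helpers, not part of requests, lxml, or Beautiful Soup. It reports whether a tag with a given class appears in the raw HTML text at all, which is what tells you whether the element is served by the server or added later by JavaScript.

```python
from html.parser import HTMLParser


class ElementFinder(HTMLParser):
    """Records whether a start tag with the given name and class was seen."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.class_name in classes:
            self.found = True


def element_in_source(html_text, tag, class_name):
    """Return True if <tag class="... class_name ..."> occurs in html_text."""
    finder = ElementFinder(tag, class_name)
    finder.feed(html_text)
    return finder.found
```

With the question's code, this would be called as element_in_source(r.text, "small", "pull-right"). If it returns False, no parser or selector can find the element in that response.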

answered Nov 14 '18 at 21:51

Matthew5Johnson

I can get that with the CSS selector small.pull-right, which combines the tag name and the class name of the element.

from bs4 import BeautifulSoup
import requests

url = 'https://www.exalead.com/search/web/results/?q=lead+poisoning'
res = requests.get(url)
# Parse the response as HTML with the lxml parser (not as XML)
soup = BeautifulSoup(res.content, "lxml")
# select_one returns the first element matching the CSS selector
print(soup.select_one('small.pull-right').text)
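Since select_one returns None when nothing matches, calling .text on the result can raise the same AttributeError as in the question. A slightly more defensive variant might look like the sketch below. It assumes bs4 is installed, uses Python's built-in "html.parser" instead of "lxml" only to avoid the extra dependency, and extract_result_count is a hypothetical helper name.

```python
from bs4 import BeautifulSoup


def extract_result_count(html_text, selector="small.pull-right"):
    """Return the stripped text of the first match, or None if nothing matches."""
    # "html.parser" ships with Python; "lxml" would be used the same way here
    soup = BeautifulSoup(html_text, "html.parser")
    node = soup.select_one(selector)
    if node is None:
        # Element absent from the served HTML (or the selector is wrong)
        return None
    return node.get_text(strip=True)
```

With the code above, this would be called as extract_result_count(res.text); a None result means the element is not in the response rather than a crash.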

answered Nov 14 '18 at 22:03

QHarr

This one works.

– Kamikaze_goldfish
Nov 14 '18 at 22:32

This did the trick! thanks a lot! :)

– Elisa Macchi
Nov 14 '18 at 22:48

You are most welcome.

– QHarr
Nov 14 '18 at 22:49

Please remember to consider hitting the check mark next to the answer to mark the question as resolved. stackoverflow.com/help/someone-answers

– QHarr
Nov 15 '18 at 19:50
