Using htmlagilitypack to capture 'p class' in a website
I want to capture the 'text' shown in the picture below, in a loop every minute, because this text changes every few minutes.
Here's the code I am using, with HtmlAgilityPack.HtmlDocument:
$metro = 'greatesthits'
$URL = "https://triplem.scadigital.com.au/stations/$metro/live"
[Reflection.Assembly]::LoadFile("C:\Users\makean\Downloads\htmlagilitypack.1.8.10\lib\Net45\HtmlAgilityPack.dll")
[HtmlAgilityPack.HtmlWeb]$web = @{}
[HtmlAgilityPack.HtmlDocument]$doc = $web.Load($url)
$doc.DocumentNode.SelectNodes(".//*[contains(@class,'sc-bdVaJa iHZvIS')]")
The code below is similar; it does the same thing, just in a different way:
$metro = 'greatesthits'
$URL = "https://triplem.scadigital.com.au/stations/$metro/live"
Add-Type -Path 'C:\Users\makean\Downloads\htmlagilitypack.1.8.10\lib\Net45\HtmlAgilityPack.dll'
$doc = New-Object HtmlAgilityPack.HtmlDocument
$wc = New-Object System.Net.WebClient
$doc.LoadHtml($wc.DownloadString($url))
$doc.DocumentNode.SelectNodes(".//*[contains(@class,'sc-bdVaJa iHZvIS')]")
The class sc-bdVaJa iHZvIS belongs to a div that sits just above the element with class PlayerNowPlaying__TrackInfo-kia103-1 gDXfGh, and PlayerNowPlaying__TrackInfo-kia103-1 gDXfGh is what I want to capture. However, when I use that class in my code, it returns blank.
How can I return just the text I want? Any help is greatly appreciated.
powershell html-agility-pack
asked Nov 11 at 7:22
Marc Kean
1331314
1. You can't achieve this using HtmlAgilityPack, because the content you are trying to get is loaded via AJAX. 2. Do you really need to use PowerShell scripts? It is much better/simpler to use Java or C# to accomplish this task (using headless Selenium WebDriver), and if that's acceptable I can show an example. Afterwards you can even compile my example to a .dll and wrap it in a PowerShell script.
– Andrew Kotov
Nov 12 at 17:37
I am only familiar with PowerShell (IT Pro background). I am familiar with loading .dll files into PowerShell, but have no real exposure to Java or C#. If you want to show me something new, I'd be very keen to learn.
– Marc Kean
Nov 12 at 22:52
Marc, sorry for the late answer, I was a bit busy. I've prepared everything here: pastebin.com/GK2JGt3H. All the information and a screenshot showing usage are there. If you feel that this is what you need, I can duplicate my answer here on Stack Overflow as well.
– Andrew Kotov
Nov 13 at 21:46
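One quick way to check the first point raised in the comments is to look for the class name in the raw HTML the server returns, before any JavaScript has run. This is only a minimal sketch: the helper name Test-RawHtmlHasClass is hypothetical, and it assumes the page is fetched as plain text (for example with Invoke-WebRequest -UseBasicParsing).

```powershell
# Hypothetical helper: returns $true when the given class fragment appears
# inside a class="..." attribute in the raw (unrendered) HTML text.
function Test-RawHtmlHasClass {
    param(
        [Parameter(Mandatory)] [string] $Html,
        [Parameter(Mandatory)] [string] $ClassFragment
    )
    # Match class="..." or class='...' containing the (escaped) fragment
    $pattern = 'class\s*=\s*["''][^"'']*' + [regex]::Escape($ClassFragment)
    return [bool]($Html -match $pattern)
}

# Usage (needs network access, so it is left commented out):
# $html = (Invoke-WebRequest -Uri $URL -UseBasicParsing).Content
# Test-RawHtmlHasClass -Html $html -ClassFragment 'PlayerNowPlaying__TrackInfo'
```

If this returns False for the page, the element is most likely added by script after the page loads, and an HTML parser alone will not see it.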
2 Answers
In this case, the F12 -> Network tab is your friend. Look at all the JavaScript files.
The data you are probably looking for is here:
https://master.myradio-api.prod.scadigital.com.au/mmm/stations
Write code to download the JSON string from the URL. See for instance https://stackoverflow.com/a/11891101/4180382
Copy the whole JSON string from your F12 response tab.
In Visual Studio, create a new class file.
Click Edit > Paste Special > Paste JSON As Classes.
In your code you will need the name of the first class that you pasted. It is the parent class of all the classes underneath. I would say it is something like 'Rootobject', but verify. So do (C#):
var obj = JsonConvert.DeserializeObject<Rootobject>(downloadedJson);
Now you can loop through the Rootobject children to extract all of the info you need.
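Since the question is about PowerShell, the same idea can probably be tried without Visual Studio, because ConvertFrom-Json turns a JSON string into objects directly. The following is a rough sketch that assumes nothing about the shape of the real response: the sample JSON and the function name Get-JsonPropertyNames are made up for illustration.

```powershell
# Hypothetical helper: parse a JSON string and list the property names of the
# top-level object (or of the first element, if the top level is an array).
function Get-JsonPropertyNames {
    param([Parameter(Mandatory)] [string] $Json)
    $data  = $Json | ConvertFrom-Json
    $first = @($data)[0]
    return @($first.PSObject.Properties | ForEach-Object { $_.Name })
}

# Made-up sample; the real response will have different fields.
$sampleJson = '[{"name":"example","id":1}]'
Get-JsonPropertyNames -Json $sampleJson

# Against the live API (needs network access, so it is left commented out):
# $json = (Invoke-WebRequest -Uri 'https://master.myradio-api.prod.scadigital.com.au/mmm/stations' -UseBasicParsing).Content
# Get-JsonPropertyNames -Json $json
```

Listing the property names first is a cheap way to see whether the response contains anything that looks like track information before writing more code against it.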
edited Nov 13 at 13:11
answered Nov 13 at 12:44
Ole EH Dufour
1,0541021
He needs to get the name of the song. How can he get it from this data? This JSON doesn't contain the song name.
– Andrew Kotov
Nov 13 at 21:48
Thanks to the person above who pointed me in the right direction, I looked further and checked the Network tab in Chrome's 'Inspect' tool. I grabbed the metadata from the stream URL:
$metro = '2classicrock'
$URL = 'https://wz2web.scahw.com.au/live/' + $metro + '_32.stream/playlist.m3u8'
# Find the https...m3u8 URL inside the playlist response
$null = (Invoke-WebRequest -Uri $URL).RawContent -match '(https.*m3u8.*)'
$StreamURL = $Matches[0]
$streamMetaData = Invoke-WebRequest -Uri $StreamURL
# Capture the text after the comma on the #EXTINF line
$null = $streamMetaData.RawContent -match '#EXTINF:4.*?,(.*)'
$Matches[1]
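The question asked for a loop that runs every minute, so the extraction step above could be wrapped roughly as follows. This is a sketch under assumptions: the function name Get-ExtInfTitle and the sample playlist text are made up, and the pattern is simply the one used in the code above.

```powershell
# Hypothetical helper: return the text after the comma on the first matching
# #EXTINF line, using the same pattern as above. Returns $null if no match.
function Get-ExtInfTitle {
    param([Parameter(Mandatory)] [string] $PlaylistText)
    if ($PlaylistText -match '#EXTINF:4.*?,(.*)') {
        return $Matches[1].Trim()
    }
    return $null
}

# Made-up playlist fragment, for illustration only.
$sample = "#EXTM3U`n#EXTINF:4.0,Artist - Song Title`nmedia_1.aac"
Get-ExtInfTitle -PlaylistText $sample

# Polling once a minute (needs network access; $StreamURL as obtained above):
# while ($true) {
#     $text = (Invoke-WebRequest -Uri $StreamURL -UseBasicParsing).RawContent
#     Get-ExtInfTitle -PlaylistText $text
#     Start-Sleep -Seconds 60
# }
```

Keeping the parsing in its own function makes it possible to test the pattern against saved playlist text without hitting the server each time.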
answered Nov 14 at 3:40
Marc Kean
1331314
I noticed that my answer didn't quite resolve the problem... Is it ok now?
– Ole EH Dufour
Nov 14 at 7:22
True story. I've learnt a few things from this question.
– Andrew Kotov
Nov 14 at 11:16