How to get my Python script to go to a URL, download the latest file
I have written this Python script to create a sheet containing only the athletes from our sports club, taken from the national rankings. At the moment I have to download the rankings file manually and then rename it.
#import the writer
import xlwt
#import the reader
import xlrd
#open the rankings spreadsheet
book = xlrd.open_workbook('rankings.xls')
#open the first sheet
first_sheet = book.sheet_by_index(0)
#print the values in the second column of the first sheet
print(first_sheet.col_values(1))
#create the output workbook
workbook = xlwt.Workbook()
#add a sheet named "Club BFA ranking"
worksheet1 = workbook.add_sheet("Club BFA ranking")
#in cell 0,0 (first cell of the first row) write "Ranking"
worksheet1.write(0, 0, "Ranking")
#in cell 0,1 (second cell of the first row) write "Name"
worksheet1.write(0, 1, "Name")
#save and create the spreadsheet file
workbook.save("saxons.xls")
#collect the name and rank of every row whose column 3 contains 'Saxon'
name = []
rank = []
for i in range(first_sheet.nrows):
    #print(first_sheet.cell_value(i,3))
    if 'Saxon' in first_sheet.cell_value(i,3):
        name.append(first_sheet.cell_value(i,1))
        rank.append(first_sheet.cell_value(i,8))
        print('a')
#write one row per matching athlete below the header row
for j in range(len(name)):
    worksheet1.write(j+1,0,rank[j])
    worksheet1.write(j+1,1,name[j])
workbook.save("saxons.xls")
As a next iteration, I would like it to go to a specific URL and download the latest spreadsheet to use as rankings.xls.
How can I do that?
python url xls xlrd xlwt
asked Nov 11 at 11:20
J4G
docs.python-requests.org/en/master
– petezurich
Nov 11 at 11:25
2 Answers
accepted
You could use the requests library. For example:
import requests
url = "YOUR_URL"
downloaded_file = requests.get(url)
with open("YOUR_PATH/rankings.xls", 'wb') as file:
    file.write(downloaded_file.content)
EDIT: You mentioned that you want to download the latest version of the file. You can use time.strftime, as below, to fill in the year and month:
import time
time.strftime("https://www.britishfencing.com/wp-content/uploads/%Y/%m/ranking_file.xls")
Use the result as YOUR_URL to get the latest month's rankings.
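One caveat with a date-based URL: early in a month the new file may not have been uploaded yet. Below is a minimal sketch, assuming the archive really follows the uploads/%Y/%m/ pattern shown in this answer. The helper name candidate_urls and its months_back parameter are hypothetical, and ranking_file.xls is still a placeholder file name. It builds the URL for the current month and for the months before it, so a script can try each in turn.

```python
from datetime import date


def candidate_urls(template, today=None, months_back=3):
    """Return URLs for the current month and the months before it, newest first.

    template is a strftime pattern such as ".../uploads/%Y/%m/ranking_file.xls".
    """
    today = today or date.today()
    # Count months from year 0 so that stepping back across January works.
    month_index = today.year * 12 + (today.month - 1)
    urls = []
    for offset in range(months_back):
        year, month0 = divmod(month_index - offset, 12)
        urls.append(date(year, month0 + 1, 1).strftime(template))
    return urls


if __name__ == "__main__":
    # Placeholder file name, as in the answer; the real name is not known here.
    template = "https://www.britishfencing.com/wp-content/uploads/%Y/%m/ranking_file.xls"
    for url in candidate_urls(template):
        print(url)
```

Each URL can then be passed to requests.get in order, stopping at the first response whose status_code is 200.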
edited Nov 11 at 22:15
answered Nov 11 at 11:30
Faquarl
I'm not sure what you mean by the "latest" spreadsheet, but there are various ways to download files from the web. I'd suggest the popular requests library, which is very easy to use.
First install it:
pip install requests
Then fetch the file:
import requests
url = "http://foobar.com/rankings.xls"
r = requests.get(url)
Then write the contents to a file, opened in binary mode because r.content is bytes:
with open('./rankings.xls', 'wb') as f:
    f.write(r.content)
You could then check whether the newly downloaded rankings.xls differs from a previously downloaded rankings.xls by comparing their hashes.
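The hash comparison could be sketched as follows. This is a minimal sketch using only the standard library; the helper names sha256_of and save_if_changed are hypothetical, and new_bytes stands for r.content from the request above. Note that a differing hash only shows that the file changed, not which copy is newer.

```python
import hashlib
from pathlib import Path


def sha256_of(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def save_if_changed(new_bytes, path):
    """Write new_bytes to path unless an identical file is already there.

    Returns True when the file was written, False when it was left untouched.
    """
    target = Path(path)
    if target.exists() and sha256_of(target.read_bytes()) == sha256_of(new_bytes):
        return False
    target.write_bytes(new_bytes)
    return True
```

Calling save_if_changed(r.content, 'rankings.xls') after each download would then tell the rest of the script whether the club sheet needs to be regenerated.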
EDIT: The OP asked for a way to extract the latest xls file from the page. I'd suggest parsing the HTML for hrefs containing xls, since the page the OP wants to parse does not use a common naming format for the xls files.
A good way to do this is with BeautifulSoup:
pip install beautifulsoup4
from bs4 import BeautifulSoup
import requests
x = requests.get('https://www.britishfencing.com/results-rankings/mens-foil-ranking-archive/')
soup = BeautifulSoup(x.content, 'html.parser')
# collect the href of every link that mentions 'xls'
result = [xls['href'] for xls in soup.find_all('a', href=True) if 'xls' in xls['href']]
# the first link on the page is assumed to be the most recent file
print(result[0])
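Two details in the snippet above may need care: the substring test 'xls' in href also matches links that merely mention xls somewhere in the URL, and href values can be relative. Here is a minimal sketch that handles both. It uses the standard library's html.parser and urllib.parse instead of BeautifulSoup, so it runs without extra installs. The names XlsLinkCollector and find_xls_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class XlsLinkCollector(HTMLParser):
    """Collect absolute URLs of <a href> links whose path ends in .xls or .xlsx."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Check the file extension of the path only, ignoring any query string.
        if urlparse(absolute).path.lower().endswith((".xls", ".xlsx")):
            self.links.append(absolute)


def find_xls_links(html, base_url):
    """Return spreadsheet links in the order they appear in the HTML."""
    collector = XlsLinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

With the page fetched as in the answer, find_xls_links(x.text, x.url) would return absolute URLs. Whether the first entry is the newest file still depends on the page listing files newest first, as assumed above.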
edited Nov 12 at 23:10
answered Nov 11 at 11:32
ferdy
Apologies, I should have mentioned that the page I want to download from is an archive that has a file added once a month: britishfencing.com/results-rankings/mens-foil-ranking-archive. Is it possible to download the last uploaded file?
– J4G
Nov 11 at 19:25
I'd use BeautifulSoup to get all the links, then filter them for xls files. Going by the order in which they appear on the page, the first one will be the most recent.
– ferdy
Nov 12 at 22:57
how would I do that? I like the sound of it
– J4G
Nov 12 at 22:58
Updated my answer. This should help you. Cheers!
– ferdy
Nov 12 at 23:12