Running Scrapyd as a daemon on CentOS 6.10 with Python 3.6

I am trying to run my scrapers on my dedicated CentOS 6.10 server. I have Python 3.6.6 installed, created a venv, and installed and ran scrapyd from a pip install. The command scrapyd shows this:

2018-10-24T12:23:56-0700 [-] Loading /usr/local/lib/python3.6/site-packages/scrapyd/txapp.py...

2018-10-24T12:23:57-0700 [-] Scrapyd web console available at http://127.0.0.1:6800/

2018-10-24T12:23:57-0700 [-] Loaded.

2018-10-24T12:23:57-0700 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 18.7.0 (/usr/local/bin/python3.6 3.6.6) starting up.

2018-10-24T12:23:57-0700 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.

2018-10-24T12:23:57-0700 [-] Site starting on 6800

2018-10-24T12:23:57-0700 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x7f4661cdf940>

2018-10-24T12:23:57-0700 [Launcher] Scrapyd 1.2.0 started: max_proc=16, runner='scrapyd.runner'

Totally cool. Now I have a couple questions.

1- If this is running on my dedicated server, does that mean the scrapyd web console is at [serverIP]:6800? Or, at least, is it supposed to be there? While the command is running, the page doesn't load; the browser says the site can't be found. So I've hit a brick wall with this.

2- Another thing is that I don't want to have to leave a browser or SSH terminal open to keep scrapyd running. All of the articles I have read say that there is no proper RPM package for scrapyd, and that until somebody makes one I am out of luck. I am not a Linux expert, so I am surprised I made it this far.

So I guess the issue with running scrapyd as a daemon on the server is that it needs special files. Can I install scrapyd directly from the git repository? Even the repository didn't seem to have the files I apparently need for this to work.

If somebody could put me on the right track, guide me, or point me to an article where somebody has done the whole process on CentOS 6.10, that would be awesome.

edited Nov 10 at 23:55

Eray Balkanli

3,85741943

asked Oct 24 at 19:31

Pixelknight1398

69114

python scrapy centos twisted scrapyd

3 Answers

accepted

+50

1 - Use the scrapyd config file and add bind_address=0.0.0.0 to it:

# cat ~/.scrapyd.conf
[scrapyd]
bind_address=0.0.0.0

Start scrapyd and you should see something like:

2018-11-11T13:58:08-0800 [-] Scrapyd web console available at http://0.0.0.0:6800/

Now you should be able to access the web interface at [serverIP]:6800.

2 - You can always use tmux for this; see https://hackernoon.com/a-gentle-introduction-to-tmux-8d784c404340
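For step 1, the config file can also be edited from a short script rather than by hand. This is only a sketch, assuming the file lives at ~/.scrapyd.conf as in the answer; the helper name set_bind_address is made up for illustration and is not part of scrapyd.

```python
import configparser
from pathlib import Path


def set_bind_address(conf_path, bind_address="0.0.0.0"):
    """Set [scrapyd] bind_address in conf_path, keeping any other options.

    Hypothetical helper for illustration; scrapyd itself only reads the file.
    """
    conf_path = Path(conf_path).expanduser()
    # interpolation=None keeps values containing '%' untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(str(conf_path))  # a missing file is silently skipped
    if not parser.has_section("scrapyd"):
        parser.add_section("scrapyd")
    parser.set("scrapyd", "bind_address", bind_address)
    with conf_path.open("w") as handle:
        parser.write(handle)
    return conf_path
```

Calling set_bind_address("~/.scrapyd.conf") and then restarting scrapyd should give the same result as editing the file manually. Note that configparser drops comments when it rewrites a file, and that 0.0.0.0 means "listen on every interface", so the firewall decides who can actually reach the port.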

answered Nov 11 at 22:03

Rene Xu

6021712

Ok cool :) I'll have to give this a shot. You and PROW both gave valuable answers, so who do I mark as the answer?
– Pixelknight1398
Nov 14 at 23:09

Does not matter, as long as I helped, I am happy. :)
– Rene Xu
Nov 15 at 18:32


You can use @Rene_Xu's answer and check the firewall to see whether it's dropping external connections. To keep scrapyd alive, you can write a simple script and turn it into a daemon, or just use crontab as explained here
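The "simple script" idea could look roughly like the sketch below: check whether something answers on the scrapyd port and start scrapyd only if nothing does. The function names are made up for illustration, and the sketch assumes scrapyd listens on port 6800 and that the scrapyd command is on the PATH (with a venv, pass the full path to the venv's scrapyd executable instead).

```python
import socket
import subprocess


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ensure_scrapyd(host="127.0.0.1", port=6800, command=("scrapyd",)):
    """Start scrapyd in the background unless its port already answers.

    Returns the Popen object if a new process was started, else None.
    """
    if port_is_open(host, port):
        return None
    return subprocess.Popen(
        list(command),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,  # this sketch discards scrapyd's console output
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the terminal or cron session
    )
```

Saved as a script that ends with a call to ensure_scrapyd(), this could be run from crontab every minute or so with the venv's Python interpreter. Because the check happens first, a scrapyd that is already running is left alone.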

answered Nov 14 at 1:26

PROW

1527

Thank you for adding onto his answer it is helpful :)
– Pixelknight1398
Nov 14 at 23:10

Glad that I could help :) !!
– PROW
Nov 16 at 17:18


Also, check the settings of your hosting environment. For example, if you are hosted on AWS, you need to set up your security groups, network ACLs, etc. to allow incoming requests on this particular port.
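To tell a firewall or security-group problem apart from a scrapyd problem, it can help to query the server from another machine. Below is a minimal sketch, assuming the installed Scrapyd exposes the daemonstatus.json endpoint (recent releases do); the function name scrapyd_status is made up for illustration.

```python
import json
import urllib.error
import urllib.request


def scrapyd_status(base_url, timeout=5.0):
    """Fetch daemonstatus.json; return the parsed dict, or None if unreachable."""
    url = base_url.rstrip("/") + "/daemonstatus.json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
```

If scrapyd_status("http://127.0.0.1:6800") returns a dict when run on the server itself but the same call with the server's public IP returns None from your own machine, the request is most likely being blocked somewhere in between (iptables on the host, or the provider's network rules), or scrapyd is still bound to 127.0.0.1 only.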

answered Nov 16 at 18:41

Guillaume

9331624
