Can't deploy to Scrapy Cloud: problem importing even common modules
I am attempting to deploy a spider to Scrapy Cloud, but I repeatedly run into requirements problems, even when importing commonly used modules such as io, re, shlex, and PyPDF2. I am using Python 3.5. My scrapinghub.yml file contains these lines:
projects:
  default: 358310
stacks:
  default: scrapy:1.3-py3
requirements:
  file: requirements.txt
My requirements.txt file contains these lines:
io
re
shlex
PyPDF2==1.26.0
This is the error I got:
Login succeeded
Building an image:
Step 1/12 : FROM scrapinghub/scrapinghub-stack-scrapy:1.3-py3
# Executing 2 build triggers...
Step 1/1 : ENV PIP_TRUSTED_HOST $PIP_TRUSTED_HOST PIP_INDEX_URL $PIP_INDEX_URL
---> Using cache
Step 1/1 : RUN test -n $APT_PROXY && echo 'Acquire::http::Proxy "$APT_PROXY";' >/etc/apt/apt.conf.d/proxy
---> Using cache
---> 8d2af0ecc1ce
Step 2/12 : ENV PYTHONUSERBASE /app/python
---> Using cache
---> c5bc537289c7
Step 3/12 : ADD eggbased-entrypoint /usr/local/sbin/
---> Using cache
---> 210ce92ef42e
Step 4/12 : ADD run-pipcheck /usr/local/bin/
---> Using cache
---> 2d0a46143fa4
Step 5/12 : RUN chmod +x /usr/local/bin/run-pipcheck
---> Using cache
---> a2eefa41c642
Step 6/12 : RUN chmod +x /usr/local/sbin/eggbased-entrypoint && ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/start-crawl && ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/scrapy-list && ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/shub-image-info && ln -sf /usr/local/sbin/eggbased-entrypoint /usr/local/sbin/run-pipcheck
---> Using cache
---> f3f5b5c713e3
Step 7/12 : ADD requirements.txt /app/requirements.txt
---> Using cache
---> e6417a1c3fea
Step 8/12 : RUN mkdir /app/python && chown nobody:nogroup /app/python
---> Using cache
---> 6720be2ef393
Step 9/12 : RUN sudo -u nobody -E PYTHONUSERBASE=$PYTHONUSERBASE pip install --user --no-cache-dir -r /app/requirements.txt
---> Running in 8d8694511a74
Collecting io (from -r /app/requirements.txt (line 1))
  Could not find a version that satisfies the requirement io (from -r /app/requirements.txt (line 1)) (from versions: )
No matching distribution found for io (from -r /app/requirements.txt (line 1))
You are using pip version 9.0.3, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
{"message": "The command '/bin/sh -c sudo -u nobody -E PYTHONUSERBASE=$PYTHONUSERBASE pip install --user --no-cache-dir -r /app/requirements.txt' returned a non-zero code: 1", "details": {"message": "The command '/bin/sh -c sudo -u nobody -E PYTHONUSERBASE=$PYTHONUSERBASE pip install --user --no-cache-dir -r /app/requirements.txt' returned a non-zero code: 1", "code": 1}, "error": "requirements_error"}
Where am I going wrong? BTW, I have the latest version of pip installed locally (contrary to what the error message states).
Tags: python, scrapy
asked Nov 14 '18 at 17:16 by Code Monkey
1 Answer
io, re, and shlex are part of the Python standard library, not installable packages, as far as I know. They should not be in the requirements file.
answered Nov 14 '18 at 17:43 by vezunchik
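Following that advice, a corrected requirements.txt would keep only the third-party dependency, leaving pip nothing to resolve for the standard-library names:

PyPDF2==1.26.0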
But what about PyPDF2? I am pretty sure it's not an internal library, but it is not being imported into the Scrapy project...
– Code Monkey
Nov 14 '18 at 19:29
It is a separate library: pypi.org/project/PyPDF2. So it should be installed as you wrote it in requirements.txt.
– vezunchik
Nov 15 '18 at 9:52
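A quick way to tell which names belong in requirements.txt is to try importing them in a clean environment: anything that imports in a bare interpreter ships with Python. A minimal sketch (assuming a fresh virtualenv with no packages installed, so that third-party names fail to import):

import importlib

# In a bare environment, standard-library modules import fine,
# while third-party packages raise ImportError until pip-installed.
for name in ("io", "re", "shlex", "PyPDF2"):
    try:
        importlib.import_module(name)
        print("{}: ships with Python, omit from requirements.txt".format(name))
    except ImportError:
        print("{}: third-party, keep it in requirements.txt".format(name))

Running this in the project's virtualenv before deploying would catch stray standard-library names early.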