structuring a large python repository, to not import everything
I'm having an issue managing imports with a big software repo that we have. For sake of clarity, let's pretend the repo looks something like this:
repo/
    __init__.py
    utils/
        __init__.py
        math.py
        readers.py
        ...
    ...
Now our __init__.py files are set up so that we can do something like this:
from repo.utils import IniReader
In this example, repo/utils/__init__.py would have:
from .readers import IniReader, DatReader
This structure has worked out well for us from a readability standpoint, but we are now facing issues when trying to deploy applications.
The issue is this... let's pretend I'm writing an app that looks like this:
from repo.utils import IniReader
if __name__ == '__main__':
    r = IniReader('blah.ini')
    print(r.fields)
Now the from repo.utils import IniReader will execute repo/utils/__init__.py, which in this case will import IniReader and DatReader. Let's pretend that DatReader looks something like this:
import numpy as np
import scipy
import tensorflow
from .math import transform
class DatReader():
    ...
which adheres to PEP8, with all the imports at the top of the file.
The problem here is that DatReader requires some heavyweight imports (numpy, scipy and tensorflow are all huge libraries). To make matters worse, the module behind from .math import transform might itself contain something like from repo.contrib import lookup, which then hits repo/contrib/__init__.py, which starts a chain reaction and ends up importing our entire repository.
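One way to see the chain reaction concretely is to compare sys.modules before and after the import. The sketch below builds a throwaway package in a temporary directory that roughly mirrors the layout above. The package name demo_repo and the module heavy (a stand-in for numpy, scipy or tensorflow) are made up for illustration and are not part of the real repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a tiny package that mirrors the layout described above."""
    utils = root / "demo_repo" / "utils"
    utils.mkdir(parents=True)
    (root / "demo_repo" / "__init__.py").write_text("")
    # Eager re-exports, as in repo/utils/__init__.py.
    (utils / "__init__.py").write_text(
        "from .readers import IniReader, DatReader\n"
    )
    # 'heavy' is a hypothetical stand-in for a large third-party library.
    (utils / "heavy.py").write_text("LOADED = True\n")
    (utils / "readers.py").write_text(
        "from . import heavy\n"
        "\n"
        "class IniReader:\n"
        "    pass\n"
        "\n"
        "class DatReader:\n"
        "    pass\n"
    )


def modules_loaded_by(import_name: str) -> set:
    """Return the module names that importing `import_name` adds to sys.modules."""
    before = set(sys.modules)
    importlib.import_module(import_name)
    return set(sys.modules) - before


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    try:
        loaded = modules_loaded_by("demo_repo.utils")
    finally:
        sys.path.remove(tmp)

# Show only the modules that belong to the demo package.
print(sorted(name for name in loaded if name.startswith("demo_repo")))
```

Even though the caller only wants IniReader, the heavy module shows up in the list, because the package's __init__.py pulls in readers.py as a whole.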
This really hasn't been a problem for all of us developers with a full development environment stood up, but now that we're trying to ship applications (internally) this import hell is becoming an issue.
Is there a standard solution to this problem? We've talked about just keeping the __init__.py files empty, or about not putting all the imports at the top of a file as PEP8 recommends. Both of these solutions come with compromises, so if anyone has suggestions or references, I'd love to hear them.
Thanks!
python deployment import
asked Nov 10 at 17:10
matt
1 Answer
accepted
It might be helpful to take a step back for a moment and look at the fundamental issue that you seem to be faced with, namely: "How do I deal with missing Python packages on users' machines?"
Basically there are two categories of solutions to this problem:
- Help to make the missing packages available on the user's machine.
  - You could distribute your code as a package that users can install with pip. Just include dependency specifications in your distributed package, and pip will automatically download and install any missing packages.
  - You could freeze your code, i.e. convert your code to a self-standing application that already includes all the required packages.
- Divide your package dependencies into mandatory and optional ones, and adapt your code such that the absence of an optional package doesn't cause all of the code to break.
  - As you already noted, you could sanitize the module-level imports (i.e. imports in __init__.py files) such that optional packages are not loaded 'prematurely'. In your case that would mean removing the DatReader imports.
  - As you also already noted, you could move optional package imports inside the classes or functions that need them. Style-wise this is not really optimal, but the code itself will still be perfectly valid. It normally doesn't matter that the import statement gets executed again every time the function is run, because the actual import will still only take place once; later executions just look the module up in sys.modules.
  - You could wrap the imports of the optional packages in try-except clauses. This will prevent any import errors from occurring at import time (though of course you'll still encounter an error once you try to run a class or function that depends upon the missing package).
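If you want to keep the short from repo.utils import IniReader spelling while sanitizing __init__.py, one possible middle ground is a module-level __getattr__ (PEP 562, Python 3.7+), which resolves exported names only when they are first requested. The sketch below is only an illustration under some assumptions: the package name lazy_repo, the _EXPORTS table, and the split of readers.py into ini_reader.py and dat_reader.py are all hypothetical. The split matters, because heavy imports at the top of a shared readers.py would still run as soon as either reader is requested.

```python
import sys
import tempfile
from pathlib import Path

# Contents of a hypothetical lazy utils/__init__.py (PEP 562, Python 3.7+).
LAZY_INIT = '''
import importlib

# Public name -> submodule that defines it (hypothetical layout).
_EXPORTS = {"IniReader": ".ini_reader", "DatReader": ".dat_reader"}

__all__ = list(_EXPORTS)


def __getattr__(name):
    # Only called when `name` is not already defined in this module.
    if name in _EXPORTS:
        module = importlib.import_module(_EXPORTS[name], __name__)
        value = getattr(module, name)
        globals()[name] = value  # cache so later lookups skip __getattr__
        return value
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "lazy_repo" / "utils"
    pkg.mkdir(parents=True)
    (pkg.parent / "__init__.py").write_text("")
    (pkg / "__init__.py").write_text(LAZY_INIT)
    (pkg / "ini_reader.py").write_text("class IniReader:\n    pass\n")
    # The heavy third-party imports would sit at the top of this module.
    (pkg / "dat_reader.py").write_text("class DatReader:\n    pass\n")

    sys.path.insert(0, tmp)
    try:
        from lazy_repo.utils import IniReader
        # dat_reader has not been touched yet.
        dat_loaded_early = "lazy_repo.utils.dat_reader" in sys.modules
        from lazy_repo.utils import DatReader
        # Now it has, because DatReader was explicitly requested.
        dat_loaded_late = "lazy_repo.utils.dat_reader" in sys.modules
    finally:
        sys.path.remove(tmp)

print(dat_loaded_early, dat_loaded_late)
```

The trade-off is that static tools (linters, IDE auto-completion) may no longer see the re-exported names, so this is a compromise like the other options, not a free win.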
Example of an import in try-except clause:
import warnings
try:
    import scipy
except ImportError:
    warnings.warn("The python package `scipy` could not be imported. As a result "
                  "the class `repo.utils.DatReader` will not be functional.")
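A variation on the same idea is to stay silent at import time and raise a clear error only when the dependent class is actually used. This is just a sketch: the helper optional_import, the deliberately nonexistent package name, and the body of DatReader are placeholders I made up for illustration.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# A made-up package name, used here to force the "missing" code path.
heavy = optional_import("a_package_that_is_not_installed")
# A standard-library module, to show the "available" code path.
json = optional_import("json")


class DatReader:
    """Placeholder reader that needs the optional dependency only when used."""

    def __init__(self, path):
        if heavy is None:
            raise ImportError(
                "DatReader requires the optional package "
                "'a_package_that_is_not_installed'."
            )
        self.path = path


# Importing this module succeeded; the failure only shows up on use.
try:
    DatReader("blah.dat")
except ImportError as exc:
    message = str(exc)

print(message)
```

Users who never touch DatReader are not bothered by a warning, while users who do get an error message that names the missing package.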
Now to come back to your original question, "Is there a standard solution to this problem?": I'd say no. There's no silver bullet. All solutions come with their own advantages and disadvantages, and you'll have to decide which solution is the optimal one for your specific situation.
edited Nov 10 at 23:43
answered Nov 10 at 23:36
Xukrao