Using numba functions in map_blocks
I have successfully used map_blocks a few times on dask arrays. I'm now trying to use a numba function that acts on each block and modifies one of its inputs.
The numba function takes two numpy arrays and updates the second one in place. It then returns that second array so that map_blocks can use it as the result.
The function works fine on a numpy array, but Python just crashes when it is called from map_blocks. numba functions that do not modify an input array behave normally (although it is difficult to get them to do anything useful in that case).
Is this a known limitation? A bug? Am I using it wrong?
Update
I've finally boiled it down to a reproducible example with a trivial numba function, and I now have a clearer idea of the problem. However, I'm still unclear on how to resolve it. Here's the code:
import numpy as np
from numba import jit, float64
from dask.distributed import Client, LocalCluster
import dask.array as da

cluster = LocalCluster()
client = Client(cluster)

size = int(1e5)
a = np.arange(size, dtype='float64')
b = np.zeros((size,), dtype='float64')
dista = da.from_array(a, chunks=size // 4)
distb = da.from_array(b, chunks=size // 4)

# Explicit signature: two float64 arrays in, one float64 array out
@jit(float64[:](float64[:], float64[:]))
def crasher(x, y):
    for i in range(x.shape[0]):
        y[i] = x[i] * 2  # writes into the second input
    return y

distc = da.map_blocks(crasher, dista, distb, dtype='float64')
result = distc.compute()  # it all crashes at this point
And I now get a more comprehensible error rather than just a straight-up crash:
TypeError: No matching definition for argument type(s) readonly array(float64, 1d, C), readonly array(float64, 1d, C)
So if numba is receiving numpy arrays with the writeable flag set to False, how do you get numba to do any useful work? You can't put an array creation line in the numba function, and you can't feed it writeable arrays.
Any views on how to achieve this?
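For readers who want to see the read-only behaviour the error message describes without starting a cluster, here is a minimal numpy-only sketch. It assumes the blocks arrive with the writeable flag cleared, as the error suggests; the name block is a hypothetical stand-in for one chunk, and the flag is cleared by hand rather than by dask.

```python
import numpy as np

# Hypothetical stand-in for one chunk; the flag is cleared by hand here
# to mimic the "readonly array" reported in the error message.
block = np.arange(4, dtype="float64")
block.flags.writeable = False

# Writing into a read-only array is rejected by numpy itself.
try:
    block[0] = 1.0
    wrote_in_place = True
except ValueError:
    wrote_in_place = False

# A freshly allocated array is writable, even when its template is not.
out = np.empty_like(block)
out[:] = block * 2

print(wrote_in_place, out.flags.writeable)
```

The point of the sketch is only that the input cannot be used as an output buffer, while a newly allocated array can.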
dask numba
asked Nov 8 '18 at 0:59 by Ross
edited Nov 9 '18 at 9:34
Can you please show a minimal version, with code we can run, that shows the problem?
– mdurant
Nov 8 '18 at 3:33
"You can't put an array creation line" - yes, you can do exactly that, and you should.
– mdurant
Nov 13 '18 at 14:45
1 Answer
Here is a version of your code with the array creation inside the function, which runs fine in numba's nopython mode:
import numpy as np
from numba import jit
from dask.distributed import Client, LocalCluster
import dask.array as da

cluster = LocalCluster()
client = Client(cluster)

size = int(1e5)
a = np.arange(size, dtype='float64')
dista = da.from_array(a, chunks=size // 4)

@jit(nopython=True)
def crasher(x):
    y = np.empty_like(x)  # allocate a fresh, writable output array
    for i in range(x.shape[0]):
        y[i] = x[i] * 2
    return y

distc = da.map_blocks(crasher, dista, dtype='float64')
result = distc.compute()
Note the y = np.empty_like(x) line. Also note the list of supported numpy functions in the numba documentation.
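If you would rather keep a two-argument kernel that fills an output array, one possible variation (a sketch, not something tested against your setup) is to allocate the output in a thin Python wrapper and hand both arrays to the compiled function. The names double_into and double_block are hypothetical, and the fallback decorator is only an assumption so that the sketch still runs where numba is not installed. No explicit signature is given, so numba is free to compile for a read-only first argument.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch runs without numba.
    def njit(func):
        return func


@njit
def double_into(x, y):
    # Kernel: only reads from x and only writes to y.
    for i in range(x.shape[0]):
        y[i] = x[i] * 2


def double_block(x):
    # Wrapper: allocate a writable output for this block, then fill it.
    y = np.empty_like(x)
    double_into(x, y)
    return y


# Hypothetical read-only block, standing in for what a worker receives.
block = np.arange(5, dtype="float64")
block.flags.writeable = False

result = double_block(block)
print(result)
```

The wrapper would then be passed to map_blocks in place of crasher, for example da.map_blocks(double_block, dista, dtype='float64'), with no second dask array needed.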
answered Nov 13 '18 at 14:49 by mdurant
Many thanks for your input. I'll need to investigate this a bit further. I can clearly use numpy functions in numba, but for some reason I had this failing via dask map_blocks. If this works for you, I've clearly got some misconfiguration on my side that I'll need to look into.
– Ross
Nov 15 '18 at 3:14