Using numba functions in map_blocks



























I have successfully used map_blocks a few times on dask arrays. I'm now trying to apply a numba function to each block and have it modify one of its inputs.

The numba function takes two numpy arrays and updates the second one. The updated array is then returned, so that map_blocks receives it as the result.

The function works fine on a numpy array, but Python just crashes when it is called from map_blocks. numba functions that do not write to an input array behave normally (although it is difficult to get them to do anything useful in that case).

Is this a known limitation? A bug? Am I using it wrong?
Update
I've finally boiled it down to a reproducible example with a trivial numba function, and I get a clearer idea of the problem. However I'm still unclear on how to resolve the issue. Here's the code:



    import numpy as np
    from numba import jit, float64, int64
    from dask.distributed import Client, LocalCluster
    import dask.array as da

    cluster=LocalCluster()
    c=Client(cluster)
    size=int(1e5)
    a=np.arange(size,dtype='float64')
    b=np.zeros((size,),dtype='float64')
    dista=da.from_array(a,chunks=size//4)
    distb=da.from_array(b,chunks=size//4)

    @jit(float64[:](float64[:],float64[:]))
    def crasher(x,y):
        for i in range(x.shape[0]):
            y[i]=x[i]*2
        return y

    distc=da.map_blocks(crasher,dista,distb,dtype='float64')
    c=distc.compute() # it all crashes at this point


And I now get a more comprehensible error rather than just a straight up crash:



    TypeError: No matching definition for argument type(s) readonly array(float64, 1d, C), readonly array(float64, 1d, C)

So if numba is receiving numpy arrays with the writeable flag set to False, how do you get numba to do any useful work? You can't put an array creation line in the numba function, and you can't feed it writeable arrays.



Any views on how to achieve this?
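For readers following along, the read-only behaviour named in the error message can be reproduced with NumPy alone. The sketch below is only an illustration under an assumption: it mimics a block whose writeable flag has been cleared by calling `setflags(write=False)` by hand, and it does not involve dask or numba at all. The names `block` and `scratch` are made up for the example.

```python
import numpy as np

# Stand-in for one block passed to the mapped function.
# Assumption: the block arrives with its writeable flag cleared,
# which is what "readonly array" in the error message suggests.
block = np.arange(4, dtype="float64")
block.setflags(write=False)

# Writing into a read-only array is rejected by NumPy itself.
try:
    block[0] = 1.0
except ValueError as exc:
    write_error = str(exc)
else:
    write_error = None

# A copy owns its own data, so it is writeable again.
scratch = block.copy()
scratch[0] = 1.0

print("block writeable:  ", block.flags.writeable)
print("scratch writeable:", scratch.flags.writeable)
print("error on write:   ", write_error)
```

Checking `arr.flags.writeable` inside the mapped function is a quick way to confirm whether this is what is happening in a given setup.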






























  • Can you please show a minimal version, with code we can run, that shows the problem? – mdurant, Nov 8 '18 at 3:33

  • "You can't put an array creation line" - yes, you can do exactly that, and you should. – mdurant, Nov 13 '18 at 14:45
















Tags: dask, numba

asked Nov 8 '18 at 0:59 by Ross, edited Nov 9 '18 at 9:34








1 Answer














Here is a version of your code with the array creation inside the function, which runs fine in numba's nopython mode:



    import numpy as np
    from numba import jit, float64, int64
    from dask.distributed import Client, LocalCluster
    import dask.array as da

    cluster=LocalCluster()
    c=Client(cluster)
    size=int(1e5)
    a=np.arange(size,dtype='float64')
    dista=da.from_array(a,chunks=size//4)

    @jit(nopython=True)
    def crasher(x):
        # allocate the output inside the function instead of passing it in
        y = np.empty_like(x)
        for i in range(x.shape[0]):
            y[i]=x[i]*2
        return y

    # only the input array is mapped; each block returns its own new array
    distc=da.map_blocks(crasher,dista,dtype='float64')
    c=distc.compute()


Note the y = np.empty_like(x) line. Note also the list of numpy functions that numba supports, according to its documentation.
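If the kernel really has to keep its two-argument form, another option is to leave the kernel alone and make a writeable copy of the output buffer in a thin wrapper. The following is a minimal sketch of that idea, not something taken from the answer: `kernel` is a plain-Python stand-in for the jitted function, `block_func` is a hypothetical wrapper name, and the read-only inputs are simulated with `setflags(write=False)` rather than produced by dask.

```python
import numpy as np

def kernel(x, y):
    # Plain-Python stand-in for the jitted two-argument function:
    # reads from x and writes the result into y.
    for i in range(x.shape[0]):
        y[i] = x[i] * 2
    return y

def block_func(x, y):
    # Hypothetical wrapper: the function that would be handed to
    # map_blocks in place of the kernel. It copies y so the kernel
    # always gets a writeable buffer and never touches the input.
    out = y.copy()
    return kernel(x, out)

# Simulated read-only blocks (an assumption about what arrives).
x = np.arange(4, dtype="float64")
x.setflags(write=False)
y = np.zeros(4, dtype="float64")
y.setflags(write=False)

result = block_func(x, y)
print(result)
```

The copy costs one extra allocation per block, which is the same cost as creating the array inside the function. Note that `x` stays read-only here; a kernel compiled with an explicit signature that only matches writeable arrays would presumably still reject it, so dropping the signature or copying `x` as well may be needed.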






























  • Many thanks for your input. I'll need to investigate this a bit further. I can clearly use numpy functions in numba, but for some reason I had this failing via dask map_blocks. If this works for you, I've clearly got some misconfiguration on my side that I'll need to look into. – Ross, Nov 15 '18 at 3:14











answered Nov 13 '18 at 14:49 by mdurant












