Can I make the falling sand game run in parallel on the GPU?
I've been trying to put together a falling-sand-style game in Metal that uses a compute shader to do the work. My initial work looked OK, until I ran into the following problem:
In one update, particles 1A and 1B could both flow into space T.
Sequential
In a sequential loop, I could understand how I would simply move 1A into T; then, when it came time to evaluate 1B, I could see that the space was already filled.
Parallel
My problem is that on the GPU, the checks for 1A and 1B run on different threads, so in one update loop they can both move into position T, essentially killing one of them.
My question, then, is: can I make a parallel solution to this, or can it only be solved sequentially?
Here's my calculation code, running a GPU thread for every pixel in a 640 by 480 grid:
struct Particle {
    float4 color;
    int live;
    int particleType;
};

kernel void calculateFrame(
    device Particle *src [[buffer(1)]],
    device Particle *dst [[buffer(2)]],
    uint2 threadgroup_position_in_grid   [[ threadgroup_position_in_grid ]],
    uint2 thread_position_in_threadgroup [[ thread_position_in_threadgroup ]],
    uint2 threads_per_threadgroup        [[ threads_per_threadgroup ]]) {

    uint2 gid = threadgroup_position_in_grid;
    int index = (gid.y * 640) + gid.x;

    Particle particle    = src[index];
    Particle bottom      = src[toLinear(gid + uint2( 0, 1))]; // bottom
    Particle bottomLeft  = src[toLinear(gid + uint2(-1, 1))]; // bottom left
    Particle bottomRight = src[toLinear(gid + uint2( 1, 1))]; // bottom right

    if (particle.live == 1 && particle.particleType == SAND) { // move down
        if (bottom.live == 0) {
            dst[index].live = 0;
            dst[toLinear(gid + uint2( 0, 1))].particleType = particle.particleType;
            dst[toLinear(gid + uint2( 0, 1))].color = particle.color;
            dst[toLinear(gid + uint2( 0, 1))].live = 1;
        }
        else if (bottomLeft.live == 0) { // move down to the left if clear
            dst[index].live = 0;
            dst[toLinear(gid + uint2(-1, 1))].particleType = particle.particleType;
            dst[toLinear(gid + uint2(-1, 1))].color = red;
            dst[toLinear(gid + uint2(-1, 1))].live = 1;
        }
        else if (bottomRight.live == 0) { // move down to the right if clear
            dst[index].live = 0;
            dst[toLinear(gid + uint2( 1, 1))].particleType = particle.particleType;
            dst[toLinear(gid + uint2( 1, 1))].color = red;
            dst[toLinear(gid + uint2( 1, 1))].live = 1;
        }
        else { // can't move
            dst[index] = src[index];
        }
    }
    if (particle.particleType == WALL) {
        dst[index] = particle;
        dst[index].live = 1;
    }
    if (particle.particleType == ERASER) {
        dst[index].color = blue;
        dst[index].live = 0;
    }
}
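(toLinear and the SAND / WALL / ERASER / red / blue constants aren't shown in the post and are presumably defined elsewhere in the shader file. toLinear most likely just flattens a 2D cell coordinate into an index for the 640-wide buffer; this is a guessed definition, not the original helper:)

    // Assumed helper, not from the original post: map a 2D cell
    // coordinate to a linear index into the 640 x 480 particle buffer.
    static int toLinear(uint2 pos) {
        return int(pos.y * 640u + pos.x);
    }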
gpu simulation metal
Your "code" is confusing because there are unexplained undeclared variables, likereverse
andparticleRef
. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
– Ken Thomases
Nov 10 at 20:20
Sorry @KenThomases, I had some atomic code in there that I was trying; I've cleaned it up now.
– Chris
Nov 10 at 20:23
I guess I need to investigate how to do an atomic compare and exchange in this scenario.
– Chris
Nov 10 at 20:29
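The compare-and-exchange primitive in the Metal Shading Language is atomic_compare_exchange_weak_explicit, which operates on atomic_int / atomic_uint values and only supports memory_order_relaxed. Here is a minimal illustration of the "only one thread wins" behaviour Ken describes; the kernel name and both buffers are invented for the example:

    #include <metal_stdlib>
    using namespace metal;

    // Every thread tries to flip the same flag from 0 to 1. Exactly one
    // compare-exchange succeeds; the others fall through to their fallback.
    kernel void claimFlag(device atomic_int *flag   [[buffer(0)]],
                          device int        *winner [[buffer(1)]],
                          uint tid [[thread_position_in_grid]]) {
        int expected = 0; // the value we expect to find in the flag
        if (atomic_compare_exchange_weak_explicit(flag, &expected, 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
            winner[0] = int(tid); // this thread won the claim
        } else {
            // losing threads run their fallback logic here
        }
    }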
Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
– Ken Thomases
Nov 10 at 21:25
Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
– Ken Thomases
Nov 11 at 19:44
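Putting Ken Thomases's suggestions together, here is one hedged sketch of the "claim the destination cell" idea, not a drop-in replacement for the kernel above. The claim buffer, its buffer index, the kernel name, the use of thread_position_in_grid, and the assumption that dst is cleared to dead cells before each dispatch are all assumptions; only the 640-wide grid comes from the question.

    #include <metal_stdlib>
    using namespace metal;

    // Sketch only; uses the Particle struct and SAND constant from the question.
    // Assumes:
    //  - `claim` holds one atomic_int per cell and is zeroed before every frame
    //  - `dst` starts each frame cleared to dead (live == 0) particles
    //  - one thread per cell, bounds checks omitted as in the question
    kernel void moveSandAtomic(device Particle   *src   [[buffer(1)]],
                               device Particle   *dst   [[buffer(2)]],
                               device atomic_int *claim [[buffer(3)]],
                               uint2 gid [[thread_position_in_grid]]) {
        uint index = gid.y * 640u + gid.x;
        Particle particle = src[index];
        if (particle.live != 1 || particle.particleType != SAND) { return; }

        uint below    = (gid.y + 1) * 640u + gid.x; // cell directly below
        int  expected = 0;                          // destination must be unclaimed

        // "if (bottom is clear && I win the atomic compare-and-exchange)",
        // as in the comment above. A weak CAS can fail spuriously; here that
        // just means the grain waits until the next frame.
        if (src[below].live == 0 &&
            atomic_compare_exchange_weak_explicit(&claim[below], &expected, 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
            dst[below] = particle;   // won the cell: move down
        } else {
            dst[index] = particle;   // lost the race or blocked: stay put
            // Fallback: bottom-left / bottom-right could be tried the same way,
            // and a grain that stays put should probably claim its own cell too.
        }
    }

The same pattern extends to the diagonal moves: each candidate destination is tried in turn until one claim succeeds or the grain gives up for this frame.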