Can I make the falling sand game run in parallel on the GPU?
I've been trying to put together a falling-sand-style game in Metal that uses a compute shader to do the work. My initial work looked OK, until I ran into the following problem:
In one update, particles 1A and 1B could both flow into space T.
Sequential
In a sequential loop, I could understand how I would simply move 1A into T; then, when it came time to evaluate 1B, I could see that the space was already filled.
Parallel
My problem is that on the GPU, the checks for 1A and 1B run on different threads, so in one update loop they can both move into position T, essentially killing one of them.
My question, then, is: can I make a parallel solution to this, or can it only be solved sequentially?
Here's my calculation code, running a GPU thread for every pixel in a 640 by 480 grid:
struct Particle {
    float4 color;
    int live;
    int particleType;
};

kernel void calculateFrame(
    device Particle *src [[buffer(1)]],
    device Particle *dst [[buffer(2)]],
    uint2 threadgroup_position_in_grid   [[ threadgroup_position_in_grid ]],
    uint2 thread_position_in_threadgroup [[ thread_position_in_threadgroup ]],
    uint2 threads_per_threadgroup        [[ threads_per_threadgroup ]]) {

    uint2 gid = threadgroup_position_in_grid;
    int index = (gid.y * 640) + gid.x;

    Particle particle    = src[index];
    Particle bottom      = src[toLinear(gid + uint2( 0, 1))]; // bottom
    Particle bottomLeft  = src[toLinear(gid + uint2(-1, 1))]; // bottom left
    Particle bottomRight = src[toLinear(gid + uint2( 1, 1))]; // bottom right

    if (particle.live == 1 && particle.particleType == SAND) { // move down
        if (bottom.live == 0) {
            dst[index].live = 0;
            dst[toLinear(gid + uint2( 0, 1))].particleType = particle.particleType;
            dst[toLinear(gid + uint2( 0, 1))].color = particle.color;
            dst[toLinear(gid + uint2( 0, 1))].live = 1;
        }
        else if (bottomLeft.live == 0) { // move down to the left if clear
            dst[index].live = 0;
            dst[toLinear(gid + uint2(-1, 1))].particleType = particle.particleType;
            dst[toLinear(gid + uint2(-1, 1))].color = red;
            dst[toLinear(gid + uint2(-1, 1))].live = 1;
        }
        else if (bottomRight.live == 0) { // move down to the right if clear
            dst[index].live = 0;
            dst[toLinear(gid + uint2( 1, 1))].particleType = particle.particleType;
            dst[toLinear(gid + uint2( 1, 1))].color = red;
            dst[toLinear(gid + uint2( 1, 1))].live = 1;
        }
        else { // can't move
            dst[index] = src[index];
        }
    }
    if (particle.particleType == WALL) {
        dst[index] = particle;
        dst[index].live = 1;
    }
    if (particle.particleType == ERASER) {
        dst[index].color = blue;
        dst[index].live = 0;
    }
}
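(toLinear and the SAND / WALL / ERASER / red / blue constants aren't shown in the post and are presumably defined elsewhere in the shader file. toLinear most likely just flattens a 2D cell coordinate into an index for the 640-wide buffer; this is a guessed definition, not the original helper:)

    // Assumed helper, not from the original post: map a 2D cell
    // coordinate to a linear index into the 640 x 480 particle buffer.
    static int toLinear(uint2 pos) {
        return int(pos.y * 640u + pos.x);
    }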
gpu simulation metal
Your "code" is confusing because there are unexplained undeclared variables, likereverse
andparticleRef
. That said, the general solution to this type of problem is atomic compare-and-exchange. Each thread says "if the value is how I expect it to be, then change it to this other value". Only one thread "wins". There will have to be some fallback logic that the losing thread does, instead.
– Ken Thomases
Nov 10 at 20:20
Sorry @KenThomases, I had some atomic code in there that I was trying; I've cleaned it up now.
– Chris
Nov 10 at 20:23
I guess I need to investigate how to do an atomic compare and exchange in this scenario.
– Chris
Nov 10 at 20:29
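The compare-and-exchange primitive in the Metal Shading Language is atomic_compare_exchange_weak_explicit, which operates on atomic_int / atomic_uint values and only supports memory_order_relaxed. Here is a minimal illustration of the "only one thread wins" behaviour Ken describes; the kernel name and both buffers are invented for the example:

    #include <metal_stdlib>
    using namespace metal;

    // Every thread tries to flip the same flag from 0 to 1. Exactly one
    // compare-exchange succeeds; the others fall through to their fallback.
    kernel void claimFlag(device atomic_int *flag   [[buffer(0)]],
                          device int        *winner [[buffer(1)]],
                          uint tid [[thread_position_in_grid]]) {
        int expected = 0; // the value we expect to find in the flag
        if (atomic_compare_exchange_weak_explicit(flag, &expected, 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
            winner[0] = int(tid); // this thread won the claim
        } else {
            // losing threads run their fallback logic here
        }
    }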
Does the dst buffer start with undefined contents? Or is it initially a copy of src? Or perhaps cleared out? Does src need to remain unmodified?
– Ken Thomases
Nov 10 at 21:25
Then you can use logic like "if (bottom.live == 0 && I win the atomic compare dst[toLinear( gid + uint2( 0, 1))].live to 0 and exchange with 1) …".
– Ken Thomases
Nov 11 at 19:44
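Putting Ken Thomases's suggestions together, here is one hedged sketch of the "claim the destination cell" idea, not a drop-in replacement for the kernel above. The claim buffer, its buffer index, the kernel name, the use of thread_position_in_grid, and the assumption that dst is cleared to dead cells before each dispatch are all assumptions; only the 640-wide grid comes from the question.

    #include <metal_stdlib>
    using namespace metal;

    // Sketch only; uses the Particle struct and SAND constant from the question.
    // Assumes:
    //  - `claim` holds one atomic_int per cell and is zeroed before every frame
    //  - `dst` starts each frame cleared to dead (live == 0) particles
    //  - one thread per cell, bounds checks omitted as in the question
    kernel void moveSandAtomic(device Particle   *src   [[buffer(1)]],
                               device Particle   *dst   [[buffer(2)]],
                               device atomic_int *claim [[buffer(3)]],
                               uint2 gid [[thread_position_in_grid]]) {
        uint index = gid.y * 640u + gid.x;
        Particle particle = src[index];
        if (particle.live != 1 || particle.particleType != SAND) { return; }

        uint below    = (gid.y + 1) * 640u + gid.x; // cell directly below
        int  expected = 0;                          // destination must be unclaimed

        // "if (bottom is clear && I win the atomic compare-and-exchange)",
        // as in the comment above. A weak CAS can fail spuriously; here that
        // just means the grain waits until the next frame.
        if (src[below].live == 0 &&
            atomic_compare_exchange_weak_explicit(&claim[below], &expected, 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
            dst[below] = particle;   // won the cell: move down
        } else {
            dst[index] = particle;   // lost the race or blocked: stay put
            // Fallback: bottom-left / bottom-right could be tried the same way,
            // and a grain that stays put should probably claim its own cell too.
        }
    }

The same pattern extends to the diagonal moves: each candidate destination is tried in turn until one claim succeeds or the grain gives up for this frame.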