Calculate “position in run” in SQL












1















I have a table of consecutive ids (integers, 1 ... n), and values (integers), like this:



Input Table:



id value
-- -----
1 1
2 1
3 2
4 3
5 1
6 1
7 1


Going down the table i.e. in order of increasing id, I want to count how many times in a row the same value has been seen consecutively, i.e. the position in a run:



Output Table:



id value position in run  
-- ----- ---------------
1 1 1
2 1 2
3 2 1
4 3 1
5 1 1
6 1 2
7 1 3


Any ideas? I've searched for a combination of windowing functions including lead and lag, but can't come up with it. Note that the same value can appear in the value column as part of different runs, so partitioning by value may not help solve this. I'm on Hive 1.2.










share|improve this question























  • I'm pretty sure using a variable for representing the previous value, and another variable for keeping track of its run position would be the ideal solution for generating such an output. Honestly though, I'd modify the schema to calculate this kind of stuff when it's inserted/update/delete to help speed up selects.

    – Ultimater
    Nov 15 '18 at 3:47
















1















I have a table of consecutive ids (integers, 1 ... n), and values (integers), like this:



Input Table:



id value
-- -----
1 1
2 1
3 2
4 3
5 1
6 1
7 1


Going down the table i.e. in order of increasing id, I want to count how many times in a row the same value has been seen consecutively, i.e. the position in a run:



Output Table:



id value position in run  
-- ----- ---------------
1 1 1
2 1 2
3 2 1
4 3 1
5 1 1
6 1 2
7 1 3


Any ideas? I've searched for a combination of windowing functions including lead and lag, but can't come up with it. Note that the same value can appear in the value column as part of different runs, so partitioning by value may not help solve this. I'm on Hive 1.2.










share|improve this question























  • I'm pretty sure using a variable for representing the previous value, and another variable for keeping track of its run position would be the ideal solution for generating such an output. Honestly though, I'd modify the schema to calculate this kind of stuff when it's inserted/update/delete to help speed up selects.

    – Ultimater
    Nov 15 '18 at 3:47














1












1








1


1






I have a table of consecutive ids (integers, 1 ... n), and values (integers), like this:



Input Table:



id value
-- -----
1 1
2 1
3 2
4 3
5 1
6 1
7 1


Going down the table i.e. in order of increasing id, I want to count how many times in a row the same value has been seen consecutively, i.e. the position in a run:



Output Table:



id value position in run  
-- ----- ---------------
1 1 1
2 1 2
3 2 1
4 3 1
5 1 1
6 1 2
7 1 3


Any ideas? I've searched for a combination of windowing functions including lead and lag, but can't come up with it. Note that the same value can appear in the value column as part of different runs, so partitioning by value may not help solve this. I'm on Hive 1.2.










share|improve this question














I have a table of consecutive ids (integers, 1 ... n), and values (integers), like this:



Input Table:



id value
-- -----
1 1
2 1
3 2
4 3
5 1
6 1
7 1


Going down the table i.e. in order of increasing id, I want to count how many times in a row the same value has been seen consecutively, i.e. the position in a run:



Output Table:



id value position in run  
-- ----- ---------------
1 1 1
2 1 2
3 2 1
4 3 1
5 1 1
6 1 2
7 1 3


Any ideas? I've searched for a combination of windowing functions including lead and lag, but can't come up with it. Note that the same value can appear in the value column as part of different runs, so partitioning by value may not help solve this. I'm on Hive 1.2.







hive hiveql






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 15 '18 at 3:28









batteybattey

349212




349212













  • I'm pretty sure using a variable for representing the previous value, and another variable for keeping track of its run position would be the ideal solution for generating such an output. Honestly though, I'd modify the schema to calculate this kind of stuff when it's inserted/update/delete to help speed up selects.

    – Ultimater
    Nov 15 '18 at 3:47



















  • I'm pretty sure using a variable for representing the previous value, and another variable for keeping track of its run position would be the ideal solution for generating such an output. Honestly though, I'd modify the schema to calculate this kind of stuff when it's inserted/update/delete to help speed up selects.

    – Ultimater
    Nov 15 '18 at 3:47

















I'm pretty sure using a variable for representing the previous value, and another variable for keeping track of its run position would be the ideal solution for generating such an output. Honestly though, I'd modify the schema to calculate this kind of stuff when it's inserted/update/delete to help speed up selects.

– Ultimater
Nov 15 '18 at 3:47





I'm pretty sure using a variable for representing the previous value, and another variable for keeping track of its run position would be the ideal solution for generating such an output. Honestly though, I'd modify the schema to calculate this kind of stuff when it's inserted/update/delete to help speed up selects.

– Ultimater
Nov 15 '18 at 3:47












1 Answer
1






active

oldest

votes


















1














One way is to use a difference of row numbers approach to classify consecutive same values into one group. Then a row number function to get the desired positions in each group.



Query to assign groups (Running this will help you understand how the groups are assigned.)



select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t


Final Query using row_number to get positions in each group assigned with the above query.



select id,value,row_number() over(partition by value,rnum_diff order by id) as pos_in_grp
from (select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t
) t





share|improve this answer
























  • Excellent!!.. this is somewhat complex and you solved it.

    – stack0114106
    Nov 20 '18 at 21:16











  • hi @Vamsi.. could you pls give a try at stackoverflow.com/questions/53752311/…

    – stack0114106
    Dec 13 '18 at 17:39











  • @stack0114106 .. been thinking about this..couldn't find a way to solve this yet :) possible with recursive cte in SQL..however, it is not supported in Spark SQL and Hive.

    – Vamsi Prabhala
    Dec 13 '18 at 18:47













  • thanks for your reply..whatever way you are solving..fine..pls let me know once you complete

    – stack0114106
    Dec 13 '18 at 19:03











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53312001%2fcalculate-position-in-run-in-sql%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














One way is to use a difference of row numbers approach to classify consecutive same values into one group. Then a row number function to get the desired positions in each group.



Query to assign groups (Running this will help you understand how the groups are assigned.)



select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t


Final Query using row_number to get positions in each group assigned with the above query.



select id,value,row_number() over(partition by value,rnum_diff order by id) as pos_in_grp
from (select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t
) t





share|improve this answer
























  • Excellent!!.. this is somewhat complex and you solved it.

    – stack0114106
    Nov 20 '18 at 21:16











  • hi @Vamsi.. could you pls give a try at stackoverflow.com/questions/53752311/…

    – stack0114106
    Dec 13 '18 at 17:39











  • @stack0114106 .. been thinking about this..couldn't find a way to solve this yet :) possible with recursive cte in SQL..however, it is not supported in Spark SQL and Hive.

    – Vamsi Prabhala
    Dec 13 '18 at 18:47













  • thanks for your reply..whatever way you are solving..fine..pls let me know once you complete

    – stack0114106
    Dec 13 '18 at 19:03
















1














One way is to use a difference of row numbers approach to classify consecutive same values into one group. Then a row number function to get the desired positions in each group.



Query to assign groups (Running this will help you understand how the groups are assigned.)



select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t


Final Query using row_number to get positions in each group assigned with the above query.



select id,value,row_number() over(partition by value,rnum_diff order by id) as pos_in_grp
from (select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t
) t





share|improve this answer
























  • Excellent!!.. this is somewhat complex and you solved it.

    – stack0114106
    Nov 20 '18 at 21:16











  • hi @Vamsi.. could you pls give a try at stackoverflow.com/questions/53752311/…

    – stack0114106
    Dec 13 '18 at 17:39











  • @stack0114106 .. been thinking about this..couldn't find a way to solve this yet :) possible with recursive cte in SQL..however, it is not supported in Spark SQL and Hive.

    – Vamsi Prabhala
    Dec 13 '18 at 18:47













  • thanks for your reply..whatever way you are solving..fine..pls let me know once you complete

    – stack0114106
    Dec 13 '18 at 19:03














1












1








1







One way is to use a difference of row numbers approach to classify consecutive same values into one group. Then a row number function to get the desired positions in each group.



Query to assign groups (Running this will help you understand how the groups are assigned.)



select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t


Final Query using row_number to get positions in each group assigned with the above query.



select id,value,row_number() over(partition by value,rnum_diff order by id) as pos_in_grp
from (select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t
) t





share|improve this answer













One way is to use a difference of row numbers approach to classify consecutive same values into one group. Then a row number function to get the desired positions in each group.



Query to assign groups (Running this will help you understand how the groups are assigned.)



select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t


Final Query using row_number to get positions in each group assigned with the above query.



select id,value,row_number() over(partition by value,rnum_diff order by id) as pos_in_grp
from (select t.*
,row_number() over(order by id) - row_number() over(partition by value order by id) as rnum_diff
from tbl t
) t






share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 16 '18 at 16:04









Vamsi PrabhalaVamsi Prabhala

41.5k42039




41.5k42039













  • Excellent!!.. this is somewhat complex and you solved it.

    – stack0114106
    Nov 20 '18 at 21:16











  • hi @Vamsi.. could you pls give a try at stackoverflow.com/questions/53752311/…

    – stack0114106
    Dec 13 '18 at 17:39











  • @stack0114106 .. been thinking about this..couldn't find a way to solve this yet :) possible with recursive cte in SQL..however, it is not supported in Spark SQL and Hive.

    – Vamsi Prabhala
    Dec 13 '18 at 18:47













  • thanks for your reply..whatever way you are solving..fine..pls let me know once you complete

    – stack0114106
    Dec 13 '18 at 19:03



















  • Excellent!!.. this is somewhat complex and you solved it.

    – stack0114106
    Nov 20 '18 at 21:16











  • hi @Vamsi.. could you pls give a try at stackoverflow.com/questions/53752311/…

    – stack0114106
    Dec 13 '18 at 17:39











  • @stack0114106 .. been thinking about this..couldn't find a way to solve this yet :) possible with recursive cte in SQL..however, it is not supported in Spark SQL and Hive.

    – Vamsi Prabhala
    Dec 13 '18 at 18:47













  • thanks for your reply..whatever way you are solving..fine..pls let me know once you complete

    – stack0114106
    Dec 13 '18 at 19:03

















Excellent!!.. this is somewhat complex and you solved it.

– stack0114106
Nov 20 '18 at 21:16





Excellent!!.. this is somewhat complex and you solved it.

– stack0114106
Nov 20 '18 at 21:16













hi @Vamsi.. could you pls give a try at stackoverflow.com/questions/53752311/…

– stack0114106
Dec 13 '18 at 17:39





hi @Vamsi.. could you pls give a try at stackoverflow.com/questions/53752311/…

– stack0114106
Dec 13 '18 at 17:39













@stack0114106 .. been thinking about this..couldn't find a way to solve this yet :) possible with recursive cte in SQL..however, it is not supported in Spark SQL and Hive.

– Vamsi Prabhala
Dec 13 '18 at 18:47







@stack0114106 .. been thinking about this..couldn't find a way to solve this yet :) possible with recursive cte in SQL..however, it is not supported in Spark SQL and Hive.

– Vamsi Prabhala
Dec 13 '18 at 18:47















thanks for your reply..whatever way you are solving..fine..pls let me know once you complete

– stack0114106
Dec 13 '18 at 19:03





thanks for your reply..whatever way you are solving..fine..pls let me know once you complete

– stack0114106
Dec 13 '18 at 19:03




















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53312001%2fcalculate-position-in-run-in-sql%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Florida Star v. B. J. F.

Danny Elfman

Retrieve a Users Dashboard in Tumblr with R and TumblR. Oauth Issues