Collapsing data in SAS with PROC SQL
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}
I've been trying unsuccesfully for sometime now to collapse a data set using a PROC SQL
with GROUPBY
and was wondering if I could get some help. Here is an example of what I am trying to do. Suppose we have the following data:
id year parent_id age
"01" 1990 "23" 17
"01" 1991 "23" 18
"01" 1992 "23" 19
"02" 1978 "18" 24
"02" 1979 "18" 25
that we wanted to collapse by id
preserving the row with the min
age
across years to get the following dataset
id year parent_id age
"01" 1990 "23" 17
"02" 1978 "18" 24
I tried something along the lines of
proc sql;
CREATE TABLE output_tablename as
SELECT DISTINCT id, year, parent_id, min(age) as age
FROM input_tablename
GROUPBY id;
quit;
to no avail.
group-by sas proc-sql
add a comment |
I've been trying unsuccesfully for sometime now to collapse a data set using a PROC SQL
with GROUPBY
and was wondering if I could get some help. Here is an example of what I am trying to do. Suppose we have the following data:
id year parent_id age
"01" 1990 "23" 17
"01" 1991 "23" 18
"01" 1992 "23" 19
"02" 1978 "18" 24
"02" 1979 "18" 25
that we wanted to collapse by id
preserving the row with the min
age
across years to get the following dataset
id year parent_id age
"01" 1990 "23" 17
"02" 1978 "18" 24
I tried something along the lines of
proc sql;
CREATE TABLE output_tablename as
SELECT DISTINCT id, year, parent_id, min(age) as age
FROM input_tablename
GROUPBY id;
quit;
to no avail.
group-by sas proc-sql
2
Using adata
step here would be easy. Forproc sql
, you could join theyear
andparent_id
after themin(age)
calculation onid
andage
. something like -proc sql; create table output_tablename as select a.*, b.year, b.parent_id from ( select id, min(age) as age from input_tablename group by id) a left join input_tablename b on a.id=b.id and a.age=b.age; quit;
– samkart
Nov 16 '18 at 19:49
What does to no avail mean? What is the error or undesired result? Why not minimize all values except the grouping variable, id?
– Parfait
Nov 16 '18 at 20:46
Sorry for the confusion, I was getting duplicate results in my output dataset but I think this was actually a result of another mistake. However, the alternate methods suggested are very useful!
– student_t
Nov 16 '18 at 21:33
add a comment |
I've been trying unsuccesfully for sometime now to collapse a data set using a PROC SQL
with GROUPBY
and was wondering if I could get some help. Here is an example of what I am trying to do. Suppose we have the following data:
id year parent_id age
"01" 1990 "23" 17
"01" 1991 "23" 18
"01" 1992 "23" 19
"02" 1978 "18" 24
"02" 1979 "18" 25
that we wanted to collapse by id
preserving the row with the min
age
across years to get the following dataset
id year parent_id age
"01" 1990 "23" 17
"02" 1978 "18" 24
I tried something along the lines of
proc sql;
CREATE TABLE output_tablename as
SELECT DISTINCT id, year, parent_id, min(age) as age
FROM input_tablename
GROUPBY id;
quit;
to no avail.
group-by sas proc-sql
I've been trying unsuccesfully for sometime now to collapse a data set using a PROC SQL
with GROUPBY
and was wondering if I could get some help. Here is an example of what I am trying to do. Suppose we have the following data:
id year parent_id age
"01" 1990 "23" 17
"01" 1991 "23" 18
"01" 1992 "23" 19
"02" 1978 "18" 24
"02" 1979 "18" 25
that we wanted to collapse by id
preserving the row with the min
age
across years to get the following dataset
id year parent_id age
"01" 1990 "23" 17
"02" 1978 "18" 24
I tried something along the lines of
proc sql;
CREATE TABLE output_tablename as
SELECT DISTINCT id, year, parent_id, min(age) as age
FROM input_tablename
GROUPBY id;
quit;
to no avail.
group-by sas proc-sql
group-by sas proc-sql
asked Nov 16 '18 at 19:24
student_tstudent_t
1566
1566
2
Using adata
step here would be easy. Forproc sql
, you could join theyear
andparent_id
after themin(age)
calculation onid
andage
. something like -proc sql; create table output_tablename as select a.*, b.year, b.parent_id from ( select id, min(age) as age from input_tablename group by id) a left join input_tablename b on a.id=b.id and a.age=b.age; quit;
– samkart
Nov 16 '18 at 19:49
What does to no avail mean? What is the error or undesired result? Why not minimize all values except the grouping variable, id?
– Parfait
Nov 16 '18 at 20:46
Sorry for the confusion, I was getting duplicate results in my output dataset but I think this was actually a result of another mistake. However, the alternate methods suggested are very useful!
– student_t
Nov 16 '18 at 21:33
add a comment |
2
Using adata
step here would be easy. Forproc sql
, you could join theyear
andparent_id
after themin(age)
calculation onid
andage
. something like -proc sql; create table output_tablename as select a.*, b.year, b.parent_id from ( select id, min(age) as age from input_tablename group by id) a left join input_tablename b on a.id=b.id and a.age=b.age; quit;
– samkart
Nov 16 '18 at 19:49
What does to no avail mean? What is the error or undesired result? Why not minimize all values except the grouping variable, id?
– Parfait
Nov 16 '18 at 20:46
Sorry for the confusion, I was getting duplicate results in my output dataset but I think this was actually a result of another mistake. However, the alternate methods suggested are very useful!
– student_t
Nov 16 '18 at 21:33
2
2
Using a
data
step here would be easy. For proc sql
, you could join the year
and parent_id
after the min(age)
calculation on id
and age
. something like - proc sql; create table output_tablename as select a.*, b.year, b.parent_id from ( select id, min(age) as age from input_tablename group by id) a left join input_tablename b on a.id=b.id and a.age=b.age; quit;
– samkart
Nov 16 '18 at 19:49
Using a
data
step here would be easy. For proc sql
, you could join the year
and parent_id
after the min(age)
calculation on id
and age
. something like - proc sql; create table output_tablename as select a.*, b.year, b.parent_id from ( select id, min(age) as age from input_tablename group by id) a left join input_tablename b on a.id=b.id and a.age=b.age; quit;
– samkart
Nov 16 '18 at 19:49
What does to no avail mean? What is the error or undesired result? Why not minimize all values except the grouping variable, id?
– Parfait
Nov 16 '18 at 20:46
What does to no avail mean? What is the error or undesired result? Why not minimize all values except the grouping variable, id?
– Parfait
Nov 16 '18 at 20:46
Sorry for the confusion, I was getting duplicate results in my output dataset but I think this was actually a result of another mistake. However, the alternate methods suggested are very useful!
– student_t
Nov 16 '18 at 21:33
Sorry for the confusion, I was getting duplicate results in my output dataset but I think this was actually a result of another mistake. However, the alternate methods suggested are very useful!
– student_t
Nov 16 '18 at 21:33
add a comment |
1 Answer
1
active
oldest
votes
You can use the HAVING clause to pick only records where age = min(age).
proc sql;
create table want as
select * from have
group by ID
having age=min(age);
quit;
PROC SORT option:
proc sort data=have; by id descending age;
run;
proc sort data=have nodupkey out=want;
by id;
run;
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53344174%2fcollapsing-data-in-sas-with-proc-sql%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
You can use the HAVING clause to pick only records where age = min(age).
proc sql;
create table want as
select * from have
group by ID
having age=min(age);
quit;
PROC SORT option:
proc sort data=have; by id descending age;
run;
proc sort data=have nodupkey out=want;
by id;
run;
add a comment |
You can use the HAVING clause to pick only records where age = min(age).
proc sql;
create table want as
select * from have
group by ID
having age=min(age);
quit;
PROC SORT option:
proc sort data=have; by id descending age;
run;
proc sort data=have nodupkey out=want;
by id;
run;
add a comment |
You can use the HAVING clause to pick only records where age = min(age).
proc sql;
create table want as
select * from have
group by ID
having age=min(age);
quit;
PROC SORT option:
proc sort data=have; by id descending age;
run;
proc sort data=have nodupkey out=want;
by id;
run;
You can use the HAVING clause to pick only records where age = min(age).
proc sql;
create table want as
select * from have
group by ID
having age=min(age);
quit;
PROC SORT option:
proc sort data=have; by id descending age;
run;
proc sort data=have nodupkey out=want;
by id;
run;
answered Nov 16 '18 at 20:34
ReezaReeza
13.4k21227
13.4k21227
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53344174%2fcollapsing-data-in-sas-with-proc-sql%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
Using a
data
step here would be easy. Forproc sql
, you could join theyear
andparent_id
after themin(age)
calculation onid
andage
. something like -proc sql; create table output_tablename as select a.*, b.year, b.parent_id from ( select id, min(age) as age from input_tablename group by id) a left join input_tablename b on a.id=b.id and a.age=b.age; quit;
– samkart
Nov 16 '18 at 19:49
What does to no avail mean? What is the error or undesired result? Why not minimize all values except the grouping variable, id?
– Parfait
Nov 16 '18 at 20:46
Sorry for the confusion, I was getting duplicate results in my output dataset but I think this was actually a result of another mistake. However, the alternate methods suggested are very useful!
– student_t
Nov 16 '18 at 21:33