awk subtract different columns


























I want to group rows that share the same key in the first column and subtract the values in the remaining columns of the later rows from those of the first row. Any idea how I can do this?


Example:

dbstat 100 90 80 60 1000
dbstat 10 10 10 20
test 5 5

Output should be:

dbstat 90 80 70 40 1000
test 5 5


Update:
For other files I have to sum the values of matching rows, and that works fine using:


awk '{for (i=2;i<=NF;i++) {a[$1][i]+=$i}} END{ for (j in a) {s=j; for (i=2;i<=NF;i++) {s=s" "a[j][i]}; print s}}'

But changing += to -= doesn't give the differences; it still sums the values and just puts a minus sign in front.

Thank you.










  • You have to show us a sample of the expected output, as well as your own attempts, in code tags; kindly do so and let us know.
    – RavinderSingh13, Nov 12 '18 at 13:15

  • Please edit your question to show your best attempt at solving the problem. Also please clarify: is the rule to always subtract later dbstat record values from the first instance of that record? Can there be secondary records with negative values, e.g. -10 -10 10 20, and if so, what should the result be? Good luck.
    – shellter, Nov 12 '18 at 14:00


















linux awk






edited Nov 14 '18 at 13:58
asked Nov 12 '18 at 13:13 – jerio97
2 Answers
$ cat file
test1 100 20 25 30
test1 10 10
test1 30 0 2
test2 500
test2 100 50 90
test2 10 0
test3 100 100 100
$
$ cat tst.awk
p != $1 {
    for (i in c)
        p = p OFS c[i]
    if (p)
        print p
    p = $1
    delete c
    for (i = 2; i <= NF; ++i)
        c[i] = $i
    next
}
{
    for (i = 2; i <= NF; ++i)
        c[i] -= $i
}
END {
    for (i in c)
        p = p OFS c[i]
    print p
}
$
$ awk -f tst.awk file
test1 60 10 23 30
test2 390 -50 -90
test3 100 100 100




If duplicate rows are not grouped:



$ cat file2
test2 500
test1 100 20 25 30
test2 100 50 90
test1 30 0 2
test2 10 0
test3 100 100 100
test1 10 10
$
$ cat tst2.awk
{
    f = ($1 in a)
    for (i = 2; i <= NF; ++i)
        a[$1][i] -= (f ? $i : -$i)
}
END {
    for (k in a) {
        o = k
        for (i in a[k])
            o = o OFS a[k][i]
        print o
    }
}
$
$ awk -f tst2.awk file2
test1 60 10 23 30
test2 390 -50 -90
test3 100 100 100
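Note that the a[$1][i] arrays in tst2.awk require GNU awk 4.0+. As a hypothetical sketch (not part of the answer above), the same approach can be written in POSIX awk by flattening the two indices into one subscript and tracking the widest row seen per key:

```shell
# Sketch of a POSIX-awk equivalent of tst2.awk (no gawk arrays of arrays).
# a[key, i] flattens the two indices via SUBSEP; w[key] records the widest row,
# since a flattened a[k, i] cannot be walked per-key with "for (i in ...)".
printf 'test2 500\ntest1 100 20 25 30\ntest2 100 50 90\ntest1 30 0 2\ntest2 10 0\ntest3 100 100 100\ntest1 10 10\n' |
awk '
{
    f = ($1 in w)                        # seen this key before?
    if (!f || NF - 1 > w[$1]) w[$1] = NF - 1
    for (i = 2; i <= NF; i++)
        a[$1, i] += f ? -$i : $i         # first row adds, later rows subtract
}
END {
    for (k in w) {
        o = k
        for (i = 2; i <= w[k] + 1; i++)
            o = o OFS a[k, i]
        print o
    }
}'
```

It produces the same results as tst2.awk on this input (line order may vary, as for (k in w) is unordered).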





edited Nov 12 '18 at 15:02
answered Nov 12 '18 at 14:26 – oguzismail
If it doesn't have to be awk, this perl script gives your desired output for your sample input:

#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use List::MoreUtils qw/pairwise/; # Non-core; install through your OS or CPAN.

my %records;

while (<>) {
    my ($key, @cols) = split /\s+/;
    $records{$key} =
        [ pairwise { defined $a ? $a - ($b // 0) : $b } @{$records{$key}}, @cols ];
}

foreach my $key (sort keys %records) {
    say "$key @{$records{$key}}";
}

It produces output in sorted order, like your example. If rows are supposed to stay in the order they first appear in your real input, that's doable too; it wasn't clear from your sample, since that starts out sorted.

It also doesn't require duplicates to be consecutive - is that important?







answered Nov 12 '18 at 14:09 – Shawn