How to fill missing data according to the date previous and next to it in R?












1















Two more questions about this topic:
A
B



Take Fig.1 as an example, we can see that data in 10/12/2016 12:07 is missing. I want to use the previous and next row of data (i.e., 10/10/2016 10:50 5.73; 10/24/2016 08:53 6.09) to linear interpolate this missing data (not the mean value of "5.73" and "6.09", but
according to the "date"). The example data file is attached below:



09/26/2016 11:57    5.42
10/10/2016 10:50 5.73
10/12/2016 12:07
10/24/2016 08:53 6.09
11/07/2016 11:25 6.43
11/21/2016 13:57 6.33
12/05/2016 14:01 7.97
12/19/2016 13:00 8.47


You can see Fig.2, we can use "Trend()" to attain this goal.



=TREND(M22:M23,L22:L23,O22)


I was wondering if there is a useful function as well in R?



Fig-1Fig-2










share|improve this question

























  • Pls post data no pictures! Check out ?dput.

    – vaettchen
    Nov 14 '18 at 4:02











  • You could do a weighted average, so if the previous date is closer to the missing date it's weighted more.

    – mickey
    Nov 14 '18 at 4:03











  • Have a look at difftime. Then you can get two time differences: "before and missing" and "missing and after". The value you can interpolate by "value before" + "value after" * diff1 / (diff1 + diff2)

    – CIAndrews
    Nov 14 '18 at 5:05











  • I'm so sorry I didn't post data. I have uploaded it.@vaettchen

    – T X
    Nov 14 '18 at 7:33











  • Thanks!@CIAndrews. Yet I'am afraid your equation should be: "value before" + "value after - value before" * diff1 / (diff1 + diff2)

    – T X
    Nov 14 '18 at 7:53
















1















Two more questions about this topic:
A
B



Take Fig.1 as an example, we can see that data in 10/12/2016 12:07 is missing. I want to use the previous and next row of data (i.e., 10/10/2016 10:50 5.73; 10/24/2016 08:53 6.09) to linear interpolate this missing data (not the mean value of "5.73" and "6.09", but
according to the "date"). The example data file is attached below:



09/26/2016 11:57    5.42
10/10/2016 10:50 5.73
10/12/2016 12:07
10/24/2016 08:53 6.09
11/07/2016 11:25 6.43
11/21/2016 13:57 6.33
12/05/2016 14:01 7.97
12/19/2016 13:00 8.47


You can see Fig.2, we can use "Trend()" to attain this goal.



=TREND(M22:M23,L22:L23,O22)


I was wondering if there is a useful function as well in R?



Fig-1Fig-2










share|improve this question

























  • Pls post data no pictures! Check out ?dput.

    – vaettchen
    Nov 14 '18 at 4:02











  • You could do a weighted average, so if the previous date is closer to the missing date it's weighted more.

    – mickey
    Nov 14 '18 at 4:03











  • Have a look at difftime. Then you can get two time differences: "before and missing" and "missing and after". The value you can interpolate by "value before" + "value after" * diff1 / (diff1 + diff2)

    – CIAndrews
    Nov 14 '18 at 5:05











  • I'm so sorry I didn't post data. I have uploaded it.@vaettchen

    – T X
    Nov 14 '18 at 7:33











  • Thanks!@CIAndrews. Yet I'am afraid your equation should be: "value before" + "value after - value before" * diff1 / (diff1 + diff2)

    – T X
    Nov 14 '18 at 7:53














1












1








1








Two more questions about this topic:
A
B



Take Fig.1 as an example, we can see that data in 10/12/2016 12:07 is missing. I want to use the previous and next row of data (i.e., 10/10/2016 10:50 5.73; 10/24/2016 08:53 6.09) to linear interpolate this missing data (not the mean value of "5.73" and "6.09", but
according to the "date"). The example data file is attached below:



09/26/2016 11:57    5.42
10/10/2016 10:50 5.73
10/12/2016 12:07
10/24/2016 08:53 6.09
11/07/2016 11:25 6.43
11/21/2016 13:57 6.33
12/05/2016 14:01 7.97
12/19/2016 13:00 8.47


You can see Fig.2, we can use "Trend()" to attain this goal.



=TREND(M22:M23,L22:L23,O22)


I was wondering if there is a useful function as well in R?



Fig-1Fig-2










share|improve this question
















Two more questions about this topic:
A
B



Take Fig.1 as an example, we can see that data in 10/12/2016 12:07 is missing. I want to use the previous and next row of data (i.e., 10/10/2016 10:50 5.73; 10/24/2016 08:53 6.09) to linear interpolate this missing data (not the mean value of "5.73" and "6.09", but
according to the "date"). The example data file is attached below:



09/26/2016 11:57    5.42
10/10/2016 10:50 5.73
10/12/2016 12:07
10/24/2016 08:53 6.09
11/07/2016 11:25 6.43
11/21/2016 13:57 6.33
12/05/2016 14:01 7.97
12/19/2016 13:00 8.47


You can see Fig.2, we can use "Trend()" to attain this goal.



=TREND(M22:M23,L22:L23,O22)


I was wondering if there is a useful function as well in R?



Fig-1Fig-2







r datetime






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 15 '18 at 7:25









zx8754

29.7k76399




29.7k76399










asked Nov 14 '18 at 3:41









T XT X

83116




83116













  • Pls post data no pictures! Check out ?dput.

    – vaettchen
    Nov 14 '18 at 4:02











  • You could do a weighted average, so if the previous date is closer to the missing date it's weighted more.

    – mickey
    Nov 14 '18 at 4:03











  • Have a look at difftime. Then you can get two time differences: "before and missing" and "missing and after". The value you can interpolate by "value before" + "value after" * diff1 / (diff1 + diff2)

    – CIAndrews
    Nov 14 '18 at 5:05











  • I'm so sorry I didn't post data. I have uploaded it.@vaettchen

    – T X
    Nov 14 '18 at 7:33











  • Thanks!@CIAndrews. Yet I'am afraid your equation should be: "value before" + "value after - value before" * diff1 / (diff1 + diff2)

    – T X
    Nov 14 '18 at 7:53



















  • Pls post data no pictures! Check out ?dput.

    – vaettchen
    Nov 14 '18 at 4:02











  • You could do a weighted average, so if the previous date is closer to the missing date it's weighted more.

    – mickey
    Nov 14 '18 at 4:03











  • Have a look at difftime. Then you can get two time differences: "before and missing" and "missing and after". The value you can interpolate by "value before" + "value after" * diff1 / (diff1 + diff2)

    – CIAndrews
    Nov 14 '18 at 5:05











  • I'm so sorry I didn't post data. I have uploaded it.@vaettchen

    – T X
    Nov 14 '18 at 7:33











  • Thanks!@CIAndrews. Yet I'am afraid your equation should be: "value before" + "value after - value before" * diff1 / (diff1 + diff2)

    – T X
    Nov 14 '18 at 7:53

















Pls post data no pictures! Check out ?dput.

– vaettchen
Nov 14 '18 at 4:02





Pls post data no pictures! Check out ?dput.

– vaettchen
Nov 14 '18 at 4:02













You could do a weighted average, so if the previous date is closer to the missing date it's weighted more.

– mickey
Nov 14 '18 at 4:03





You could do a weighted average, so if the previous date is closer to the missing date it's weighted more.

– mickey
Nov 14 '18 at 4:03













Have a look at difftime. Then you can get two time differences: "before and missing" and "missing and after". The value you can interpolate by "value before" + "value after" * diff1 / (diff1 + diff2)

– CIAndrews
Nov 14 '18 at 5:05





Have a look at difftime. Then you can get two time differences: "before and missing" and "missing and after". The value you can interpolate by "value before" + "value after" * diff1 / (diff1 + diff2)

– CIAndrews
Nov 14 '18 at 5:05













I'm so sorry I didn't post data. I have uploaded it.@vaettchen

– T X
Nov 14 '18 at 7:33





I'm so sorry I didn't post data. I have uploaded it.@vaettchen

– T X
Nov 14 '18 at 7:33













Thanks!@CIAndrews. Yet I'am afraid your equation should be: "value before" + "value after - value before" * diff1 / (diff1 + diff2)

– T X
Nov 14 '18 at 7:53





Thanks!@CIAndrews. Yet I'am afraid your equation should be: "value before" + "value after - value before" * diff1 / (diff1 + diff2)

– T X
Nov 14 '18 at 7:53












1 Answer
1






active

oldest

votes


















2














Example data:



df <- data.frame(date = mdy_hm(
c("10/10/2016 10:50",
"10/12/2016 12:07",
"10/24/2016 08:53")),
figure = c(5.73, NA_real_, 6.09))


Using the zoo package:



library(zoo)    
library(magrittr)

zoo(df$figure, df$date) %>%
na.approx() %>%
as.data.frame()


Using lubridate and dplyr



library(dplyr)
library(lubridate)

df %>%
mutate(figure = ifelse(is.na(figure),
lag(figure, 1) + (lead(figure, 1) - lag(figure, 1)) *
as.numeric(difftime(date, lag(date, 1))) /
as.numeric((difftime(lead(date, 1), date) + difftime(date, lag(date, 1)))),
figure)) %>%
mutate(figure = round(figure, 2))





share|improve this answer


























  • Appreciate it. I tried your code, and the result is "13.92" for "2016-10-12 12:07:00". However, I think this value may be wrong. Because what I want to fill is like CIAndrews says in Comments. So the result should be "5.78". You can see my edited question, I have changed it with an example in Excel.

    – T X
    Nov 14 '18 at 8:00











  • Updated following your comment.

    – Jay Achar
    Nov 14 '18 at 8:46











  • I was wondering if you could see the question (stackoverflow.com/questions/53302125/…) when you have time. This extra question is another 50% part of the whole "missing data" problem.

    – T X
    Nov 14 '18 at 14:11











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53292882%2fhow-to-fill-missing-data-according-to-the-date-previous-and-next-to-it-in-r%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2














Example data:



df <- data.frame(date = mdy_hm(
c("10/10/2016 10:50",
"10/12/2016 12:07",
"10/24/2016 08:53")),
figure = c(5.73, NA_real_, 6.09))


Using the zoo package:



library(zoo)    
library(magrittr)

zoo(df$figure, df$date) %>%
na.approx() %>%
as.data.frame()


Using lubridate and dplyr



library(dplyr)
library(lubridate)

df %>%
mutate(figure = ifelse(is.na(figure),
lag(figure, 1) + (lead(figure, 1) - lag(figure, 1)) *
as.numeric(difftime(date, lag(date, 1))) /
as.numeric((difftime(lead(date, 1), date) + difftime(date, lag(date, 1)))),
figure)) %>%
mutate(figure = round(figure, 2))





share|improve this answer


























  • Appreciate it. I tried your code, and the result is "13.92" for "2016-10-12 12:07:00". However, I think this value may be wrong. Because what I want to fill is like CIAndrews says in Comments. So the result should be "5.78". You can see my edited question, I have changed it with an example in Excel.

    – T X
    Nov 14 '18 at 8:00











  • Updated following your comment.

    – Jay Achar
    Nov 14 '18 at 8:46











  • I was wondering if you could see the question (stackoverflow.com/questions/53302125/…) when you have time. This extra question is another 50% part of the whole "missing data" problem.

    – T X
    Nov 14 '18 at 14:11
















2














Example data:



df <- data.frame(date = mdy_hm(
c("10/10/2016 10:50",
"10/12/2016 12:07",
"10/24/2016 08:53")),
figure = c(5.73, NA_real_, 6.09))


Using the zoo package:



library(zoo)    
library(magrittr)

zoo(df$figure, df$date) %>%
na.approx() %>%
as.data.frame()


Using lubridate and dplyr



library(dplyr)
library(lubridate)

df %>%
mutate(figure = ifelse(is.na(figure),
lag(figure, 1) + (lead(figure, 1) - lag(figure, 1)) *
as.numeric(difftime(date, lag(date, 1))) /
as.numeric((difftime(lead(date, 1), date) + difftime(date, lag(date, 1)))),
figure)) %>%
mutate(figure = round(figure, 2))





share|improve this answer


























  • Appreciate it. I tried your code, and the result is "13.92" for "2016-10-12 12:07:00". However, I think this value may be wrong. Because what I want to fill is like CIAndrews says in Comments. So the result should be "5.78". You can see my edited question, I have changed it with an example in Excel.

    – T X
    Nov 14 '18 at 8:00











  • Updated following your comment.

    – Jay Achar
    Nov 14 '18 at 8:46











  • I was wondering if you could see the question (stackoverflow.com/questions/53302125/…) when you have time. This extra question is another 50% part of the whole "missing data" problem.

    – T X
    Nov 14 '18 at 14:11














2












2








2







Example data:



df <- data.frame(date = mdy_hm(
c("10/10/2016 10:50",
"10/12/2016 12:07",
"10/24/2016 08:53")),
figure = c(5.73, NA_real_, 6.09))


Using the zoo package:



library(zoo)    
library(magrittr)

zoo(df$figure, df$date) %>%
na.approx() %>%
as.data.frame()


Using lubridate and dplyr



library(dplyr)
library(lubridate)

df %>%
mutate(figure = ifelse(is.na(figure),
lag(figure, 1) + (lead(figure, 1) - lag(figure, 1)) *
as.numeric(difftime(date, lag(date, 1))) /
as.numeric((difftime(lead(date, 1), date) + difftime(date, lag(date, 1)))),
figure)) %>%
mutate(figure = round(figure, 2))





share|improve this answer















Example data:



df <- data.frame(date = mdy_hm(
c("10/10/2016 10:50",
"10/12/2016 12:07",
"10/24/2016 08:53")),
figure = c(5.73, NA_real_, 6.09))


Using the zoo package:



library(zoo)    
library(magrittr)

zoo(df$figure, df$date) %>%
na.approx() %>%
as.data.frame()


Using lubridate and dplyr



library(dplyr)
library(lubridate)

df %>%
mutate(figure = ifelse(is.na(figure),
lag(figure, 1) + (lead(figure, 1) - lag(figure, 1)) *
as.numeric(difftime(date, lag(date, 1))) /
as.numeric((difftime(lead(date, 1), date) + difftime(date, lag(date, 1)))),
figure)) %>%
mutate(figure = round(figure, 2))






share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 14 '18 at 10:21

























answered Nov 14 '18 at 6:53









Jay AcharJay Achar

1357




1357













  • Appreciate it. I tried your code, and the result is "13.92" for "2016-10-12 12:07:00". However, I think this value may be wrong. Because what I want to fill is like CIAndrews says in Comments. So the result should be "5.78". You can see my edited question, I have changed it with an example in Excel.

    – T X
    Nov 14 '18 at 8:00











  • Updated following your comment.

    – Jay Achar
    Nov 14 '18 at 8:46











  • I was wondering if you could see the question (stackoverflow.com/questions/53302125/…) when you have time. This extra question is another 50% part of the whole "missing data" problem.

    – T X
    Nov 14 '18 at 14:11



















  • Appreciate it. I tried your code, and the result is "13.92" for "2016-10-12 12:07:00". However, I think this value may be wrong. Because what I want to fill is like CIAndrews says in Comments. So the result should be "5.78". You can see my edited question, I have changed it with an example in Excel.

    – T X
    Nov 14 '18 at 8:00











  • Updated following your comment.

    – Jay Achar
    Nov 14 '18 at 8:46











  • I was wondering if you could see the question (stackoverflow.com/questions/53302125/…) when you have time. This extra question is another 50% part of the whole "missing data" problem.

    – T X
    Nov 14 '18 at 14:11

















Appreciate it. I tried your code, and the result is "13.92" for "2016-10-12 12:07:00". However, I think this value may be wrong. Because what I want to fill is like CIAndrews says in Comments. So the result should be "5.78". You can see my edited question, I have changed it with an example in Excel.

– T X
Nov 14 '18 at 8:00





Appreciate it. I tried your code, and the result is "13.92" for "2016-10-12 12:07:00". However, I think this value may be wrong. Because what I want to fill is like CIAndrews says in Comments. So the result should be "5.78". You can see my edited question, I have changed it with an example in Excel.

– T X
Nov 14 '18 at 8:00













Updated following your comment.

– Jay Achar
Nov 14 '18 at 8:46





Updated following your comment.

– Jay Achar
Nov 14 '18 at 8:46













I was wondering if you could see the question (stackoverflow.com/questions/53302125/…) when you have time. This extra question is another 50% part of the whole "missing data" problem.

– T X
Nov 14 '18 at 14:11





I was wondering if you could see the question (stackoverflow.com/questions/53302125/…) when you have time. This extra question is another 50% part of the whole "missing data" problem.

– T X
Nov 14 '18 at 14:11


















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53292882%2fhow-to-fill-missing-data-according-to-the-date-previous-and-next-to-it-in-r%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Florida Star v. B. J. F.

Error while running script in elastic search , gateway timeout

Adding quotations to stringified JSON object values