R: Extracting elements for loop from lapply object
In short: is there a way to loop through each element of the lapply object allModelsResults described below?
For example, allModelsResults$'1' gives me the 1st element of the object, and allModelsResults$'2' would be the 2nd element. I would like to create a for loop to extract each element, run some commands, and store the results.
Detailed description below...
I have the following code, where I run a simple ML model using "knn" across multiple model specifications. The model specifications are stored in allModelsList, and all the results are stored in allModelsResults.
A single model from allModelsList looks like:
y ~ x1 + x2 + x3
or
y ~ x1 + x5 + x4
and so on... in short, a series of combinations of model specifications.
allModelsResults <- lapply(allModelsList, function(x) train(x, data = All_categories_merged_done, method = "knn"))
I would now like to extract each element (the results from each model) one by one to run analysis on. For example, I can manually take allModelsResults$'1' to get results from the first model, or allModelsResults$'5' to get results from the 5th model, and so on.
Ideally I would loop through these in a for loop, where each time I select one of the elements and run a series of commands on it.
Any help on how to extract the elements from the allModelsResults object would really help! I have about 50 model specifications, so I need to create a loop or something similar to extract them one by one automatically.
Specifically, and to share with the community, this is what I would like to do for each model, one by one.
As an example I am extracting model 1 here (this does not work, obviously):
aggregate_results <- NULL
for(z in 1:length(categories)){
  element_number_ID <- (element_number[z])
  # element_number_ID should equal '1' to extract the right model
  model_1_result <- allModelsResults$'1'
  ResultsTestPred <- predict(model_1_result, testing_data)
  results_to_store <- confusionMatrix(ResultsTestPred, testing_data$outcome)
  aggregate_results <- rbind(aggregate_results, results_to_store)
}
The results_to_store output for one element looks like:
Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 14  2
         1  4 19

               Accuracy : 0.8462
                 95% CI : (0.6947, 0.9414)
    No Information Rate : 0.5385
    P-Value [Acc > NIR] : 0.00005274

                  Kappa : 0.688
 Mcnemar's Test P-Value : 0.6831

            Sensitivity : 0.7778
            Specificity : 0.9048
         Pos Pred Value : 0.8750
         Neg Pred Value : 0.8261
             Prevalence : 0.4615
         Detection Rate : 0.3590
   Detection Prevalence : 0.4103
      Balanced Accuracy : 0.8413

       'Positive' Class : 0
I want to save the Accuracy value for each element/model, so that I can compare the model specifications with regard to accuracy.
Any insight would be greatly appreciated!
r for-loop machine-learning lapply
lapply(allModelsResults, predict, testing_data). If you have a list (of results) use a form of *apply to process it.
– Rui Barradas Nov 10 at 19:48
Thank you Rui Barradas, how would I go about storing specific results? From my results_to_store for example?
– Peter Alexander Nov 10 at 19:51
That depends. If what is extracted is a vector with a fixed length I use sapply. It will return a vector or matrix or similar object, and these are in a tidy format that is easy to process further. Otherwise I use lapply and keep what is extracted in a list.
– Rui Barradas Nov 10 at 19:55
Use [[ not $ to access list elements. This works well with variables. allModelResults[[1]] is the first element, and if i = 3 then allModelResults[[i]] is the third element.
– Gregor Nov 10 at 20:07
And lapply always returns a list. You don't need to call it an "lapply object", it's just a list.
– Gregor Nov 10 at 20:08
edited Nov 10 at 20:03
asked Nov 10 at 19:38
Peter Alexander
1 Answer
accepted
You seem to want to get predictions and a confusion matrix for each model. Without a reproducible example and with some confusing terminology, I'm doing a lot of guesswork, but I think I understand what you want (or close enough). I'll show you how I would do it with lapply and sapply, and then we can do it with a for loop too.
First, get predictions on the testing data. All of these methods give exactly the same result:
# allModelsResults holds the fitted models returned by train()

# lapply way
predictions = lapply(allModelsResults, predict, newdata = testingdata)

# for loop way
predictions = list()
for (i in seq_along(allModelsResults)) {
  predictions[[i]] = predict(allModelsResults[[i]], newdata = testingdata)
}

# manual way - just so you understand exactly what's going on
predictions = list(
  predict(allModelsResults[[1]], newdata = testingdata),
  predict(allModelsResults[[2]], newdata = testingdata),
  predict(allModelsResults[[3]], newdata = testingdata),
  ...
)
Now, predictions is a list, so we access each element with [[. The first one is predictions[[1]], the kth one is predictions[[k]] if we want to define some variable k (for example, to use in a loop). We could also add descriptive names and use the names instead of the indices.
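To illustrate the last point about names, here is a minimal sketch on a made-up list. The `preds` object and the names `model_1`, `model_2`, `model_3` are hypothetical stand-ins, not objects from the question.

```r
# Toy stand-in for a list of predictions (hypothetical data)
preds <- list(c(0, 1, 1), c(1, 1, 0), c(0, 0, 1))

# Give each element a descriptive name
names(preds) <- paste0("model_", seq_along(preds))

# Access by position, by a position held in a variable, or by name
k <- 2
first_by_index <- preds[[1]]
kth_by_index   <- preds[[k]]
kth_by_name    <- preds[["model_2"]]

# A name stored in a variable works with [[ as well
nm <- "model_3"
third_by_name <- preds[[nm]]
```

Names survive lapply and sapply, so naming the input list once usually means every derived list is labelled too.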
Similarly, we can calculate all the confusion matrices:
# lapply way
conf_matrices = lapply(predictions, confusionMatrix, reference = testingdata$outcome)

# for loop way
conf_matrices = list()
for (p in seq_along(predictions)) {
  # p is an index, so pass the p-th prediction, not p itself
  conf_matrices[[p]] = confusionMatrix(predictions[[p]], reference = testingdata$outcome)
}

# manual way (for illustration)
conf_matrices = list(
  confusionMatrix(predictions[[1]], reference = testingdata$outcome),
  confusionMatrix(predictions[[2]], reference = testingdata$outcome),
  ...
)
Again, we have a list. The first confusion matrix is conf_matrices[[1]], and so on, the same as above.
Hopefully that helps us understand how to use lapply or a for loop to create a list.
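The way lapply forwards an extra named argument (like reference = above) can be tried out without caret. This is only a sketch: `accuracy_of`, `truth` and `preds` are hypothetical, made-up names and data, and the helper is a plain share-of-matches calculation, not confusionMatrix.

```r
# Hypothetical helper: share of predictions that match the reference labels
accuracy_of <- function(pred, reference) mean(pred == reference)

truth <- c(0, 1, 1, 0, 1)          # made-up reference labels
preds <- list(
  m1 = c(0, 1, 1, 0, 1),
  m2 = c(0, 0, 1, 0, 1),
  m3 = c(1, 0, 0, 1, 1)
)

# lapply: each list element becomes the first argument of accuracy_of;
# reference = truth is passed along unchanged on every call
acc_list <- lapply(preds, accuracy_of, reference = truth)

# for loop doing the same thing
acc_loop <- list()
for (i in seq_along(preds)) {
  acc_loop[[i]] <- accuracy_of(preds[[i]], reference = truth)
}
names(acc_loop) <- names(preds)
```

Both versions produce the same named list; the loop just makes the bookkeeping (creating the container, indexing, naming) explicit.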
Now, toward the bottom of your question you seem to imply that you want the Accuracy part of the confusion matrix. I ran the example at the bottom of the help page ?confusionMatrix and looked at the result. Running str(conf_mat) on a result showed me that it is a list, and that the "overall" element of the list is a named vector, including the "Accuracy". So, for an individual confusion matrix cm we can extract the accuracy with cm[["overall"]]["Accuracy"]. We use [[ for the list part and [ for the regular vector part. (We could also use cm$overall["Accuracy"]. $ works when we type the literal name; it cannot take a variable that holds the name. A lot of your issues seem to be related to trying to use $ with variables. You just can't do that. See fortunes::fortune(312).)
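A small sketch of the difference, using a hand-built list that only mimics the shape described above (a list whose "overall" element is a named numeric vector). `cm_mock` is a hypothetical stand-in, not a real confusionMatrix result.

```r
# Mock object with the same shape as the part of the result we care about
cm_mock <- list(
  overall = c(Accuracy = 0.8462, Kappa = 0.688)
)

# [[ for the list level, [ or [[ for the named-vector level
acc_named <- cm_mock[["overall"]]["Accuracy"]    # length-1 vector, keeps its name
acc_plain <- cm_mock[["overall"]][["Accuracy"]]  # bare number, name dropped

# $ uses the text after it literally; it does not look inside a variable
part <- "overall"
via_dollar   <- cm_mock$part      # NULL: no element is literally called "part"
via_brackets <- cm_mock[[part]]   # the "overall" vector
```

Whether to keep the name (single bracket) or drop it (double bracket) is mostly a matter of what the downstream code expects.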
So, we can extract the accuracies from our confusion matrix list:
# sapply way
# I use *s*apply here so the result will be *s*implified into a vector
acc = sapply(conf_matrices, function(cm) cm[["overall"]]["Accuracy"])

# for loop way
acc = numeric(length(conf_matrices))
for (i in seq_along(conf_matrices)) {
  acc[i] = conf_matrices[[i]][["overall"]]["Accuracy"]
}
Or, if you know from the beginning you only want the accuracy, we could get there directly without saving the intermediate steps:
# sapply way
acc = sapply(allModelsResults, function(x) {
  pred = predict(x, newdata = testingdata)
  cm = confusionMatrix(pred, reference = testingdata$outcome)
  return(cm[["overall"]]["Accuracy"])
})

# for loop
acc = numeric(length(allModelsResults))
for (i in seq_along(allModelsResults)) {
  pred = predict(allModelsResults[[i]], newdata = testingdata)
  cm = confusionMatrix(pred, reference = testingdata$outcome)
  acc[i] = cm[["overall"]]["Accuracy"]
}
Notes: As mentioned above, without a reproducible example I'm guessing quite a bit, and none of this is tested because I don't have any inputs to test on. I'm presuming that what I see in your question in terms of individual steps, like that we want to predict on each element of allModelsResults, is correct. (If so, it seems like, say, fittedModels would be a much better name than allModelsResults.) I don't know what you mean by "model specifications", and I have no idea what's in allModelsList, but hopefully this gives you enough examples of working with lists that you can work out any kinks. (There may also be, say, mismatched parentheses or missing brackets.)
lapply and sapply are convenient for letting you do less typing than a for loop, but they're not really any different. They set up an object to hold the results, and they fill it up. If you want to create multiple results at the same time, you may want to just use a for loop. And as the number of steps inside gets longer, it can be easier to debug a for loop anyway. Use what you like and what makes sense to you.
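As a rough illustration of the "multiple results at the same time" case, the sketch below fills two result vectors in one for loop and then lines the models up for comparison. The `cms` list is hypothetical mock data shaped like the "overall" element discussed above; with real confusionMatrix objects the extraction should look the same, assuming they have that structure.

```r
# Mock confusion-matrix-like objects (made-up numbers)
cms <- list(
  m1 = list(overall = c(Accuracy = 0.85, Kappa = 0.69)),
  m2 = list(overall = c(Accuracy = 0.79, Kappa = 0.58)),
  m3 = list(overall = c(Accuracy = 0.90, Kappa = 0.80))
)

# Pre-allocate both result vectors, then fill them in a single pass
acc   <- numeric(length(cms))
kappa <- numeric(length(cms))
for (i in seq_along(cms)) {
  acc[i]   <- cms[[i]][["overall"]][["Accuracy"]]
  kappa[i] <- cms[[i]][["overall"]][["Kappa"]]
}

# sapply alternative: returns a matrix with one column per model
both <- sapply(cms, function(cm) cm[["overall"]][c("Accuracy", "Kappa")])

# Collect everything in a data frame to compare models side by side
comparison <- data.frame(model = names(cms), accuracy = acc, kappa = kappa)
best_model <- comparison$model[which.max(comparison$accuracy)]
```

A data frame with one row per model is one convenient way to store the accuracies the question asks about; sorting or filtering it is then ordinary data frame work.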
This answer is absolutely amazing. Indeed I am used to using data frames with the $ usage. Thank you for explaining each step so well. I hope this is useful for others as well.
– Peter Alexander Nov 11 at 22:03
Note that using [[ and [ with variables works for data frames too. (Data frames are lists, just with a few extra properties.) So while mtcars$mpg is the mpg column from the mtcars data frame, if you have a column name in a variable col = "mpg", then mtcars$col won't work. You can use mtcars[, col] or mtcars[[col]] - use [[ when you know you want a single column only as a vector, use [, col] when you want possibly multiple columns.
– Gregor Nov 12 at 13:52
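A quick check of this with the built-in mtcars data set (the variable names `col`, `as_vector` and so on are only illustrative):

```r
col <- "mpg"

as_vector  <- mtcars[[col]]            # single column as a plain numeric vector
same_thing <- mtcars[, col]            # also a vector when one column is selected
two_cols   <- mtcars[, c(col, "cyl")]  # data frame with two columns

# $ does not evaluate the variable: there is no column literally named "col"
missing_col <- mtcars$col              # NULL
```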
add a comment |
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
accepted
You seem to want to get predictions and confusion matrix for each model. Without a reproducible example and with some confusion terminology, I'm doing a lot of guesswork, but I think I understand what you want (or close enough). I'll show you how I would do it with lapply
and Map
, and then we can do it with a for
loop too.
First, get predictions on the testing data. All of these methods are exactly the same:
# lapply way
predictions = lapply(allModelsList, predict, newdata = testingdata)
# for loop way
predictions = list()
for (i in 1:length(allModelsList)) {
predictions[[i]] = predict(allModelsList[[i]], newdata = testingdata)
}
# manual way - just so you understand exactly what's going on
predictions = list(
predict(allModelsList[[1]], newdata = testingdata),
predict(allModelsList[[2]], newdata = testingdata),
predict(allModelsList[[3]], newdata = testingdata),
...
)
Now, predictions
is a list
, so we access each element with [[
. The first one is predictions[[1]]
, the k
th one is predictions[[k]]
if we want to define some variable k
(like to use in loop). We could also add descriptive names and use the names instead of the indices.
Similarly, we can calculate all the confusion matrices:
# lapply way
conf_matrices = lapply(predictions, confusionMatrix, reference = testingdata$outcome)
# for loop way
conf_matrices = list()
for (p in 1:length(predictions)) {
conf_matrices[[p]] = confusionMatrix(p, reference = testingdata$outcome)
}
# manual way (for illustration)
conf_matrices = list(
confusionMatrix(predictions[[1]], reference = testingdata$outcome),
confusionMatrix(predictions[[2]], reference = testingdata$outcome),
...
)
Again, we have a list
. The first confusion matrix is conf_matrices[[1]]
and all the same as above.
Hopefully that's helps us understand how to use lapply
or a for
loop to create a list.
Now, toward the bottom of your question you seem to imply that the Accuracy
part of the confusion matrix. I ran the example at the bottom of the help page ?confusionMatrix
and looked at the result. Running str(conf_mat)
on a result showed me that it is a list
, and that the "overall"
element of the list is a named vector, including the "Accuracy"
. So, for an individual confusion matrix cm
we can extract the accuracy with cm[["overall"]]["Accuracy"]
. We use [[
for the list
part and [
for the regular vector part. (We could also use cm$overall["Accuracy"]
. $
works when we give it the exact name, no quotes, no variables. A lot of your issues seem to be related to trying to use $
with quotes or variables. You just can't do that. See fortunes::fortune(312)
).
So, we can extract the accuracies from our confusion matrix list:
# I use *s*apply here so the result will be *s*implified into a vector
acc = sapply(conf_matrices, function(cm) cm[["overall"]]["Accuracy"])
acc = numeric(length(conf_matrices))
for (i in 1:length(conf_matrices)) {
acc[i] = conf_matrices[[i]][["overall"]]["Accuracy"]
}
Or, if you know from the beginning you only want the accuracy, we could get there directly without saving the intermediate steps:
# apply
acc = sapply(allModelsList, function(x) {
pred = predict(x, newdata = testingdata)
cm = confusionMatrix(pred, reference = testingdata$outcome
return(cm[["overall"]]["Accuracy"]
}
)
# for loop
acc = numeric(length(allModelsList))
for (i in 1:length(allModelsList)) {
pred = predict(allModelsList[[i]], newdata = testingdata)
cm = confusionMatrix(pred, reference = testingdata$outcome
acc[i] = (cm[["overall"]]["Accuracy"]
}
Notes: As mentioned above, without a reproducible example I'm guessing quite a bit and none of this is tested because I don't have any inputs to test on. I'm presuming that what I see in your question in terms of individual steps, like that we want to predict on each element of allModelResults
, are correct. (If so, it seems like, say, fittedModels
would be a much better name than allModelResults
.) I don't know what you mean by "model specifications", and I have no idea what's in allModelList
, but hopefully this gives you enough examples of working with lists that you can work out any kinks. (There may also be, say, mismatched parentheses or missing brackets.)
lapply
and sapply
are convenient for letting you do less typing than a for
loop, but they're not really any different. They set up an object to hold the results, and they fill it up. If you want to create multiple results at the same time, you may want to just us a for
loop. And as the number of steps inside gets longer, it can be easier to debug a for loop anyway. Use what you like and what makes sense to you.
1
This answer is absolutely amazing. Indeed I am used to using data-frames with the $ usage. Thank you for explaining each step so well. I hope this is useful for others as well
– Peter Alexander
Nov 11 at 22:03
Note that using[[
and[
with variables works for data frames too. (Data frames are lists, just with a few extra properties.) So whilemtcars$mpg
is thempg
column from themtcars
data frame, if you have a column name in a variablecol = "mpg"
, thenmtcars$col
won't work.You can usemtcars[, col]
ormtcars[[col]]
- use[[
when you know you want a single column only as a vector, use[, col]
when you want possibly multiple columns.
– Gregor
Nov 12 at 13:52
add a comment |
up vote
1
down vote
accepted
You seem to want to get predictions and confusion matrix for each model. Without a reproducible example and with some confusion terminology, I'm doing a lot of guesswork, but I think I understand what you want (or close enough). I'll show you how I would do it with lapply
and Map
, and then we can do it with a for
loop too.
First, get predictions on the testing data. All of these methods are exactly the same:
# lapply way
predictions = lapply(allModelsList, predict, newdata = testingdata)
# for loop way
predictions = list()
for (i in 1:length(allModelsList)) {
predictions[[i]] = predict(allModelsList[[i]], newdata = testingdata)
}
# manual way - just so you understand exactly what's going on
predictions = list(
predict(allModelsList[[1]], newdata = testingdata),
predict(allModelsList[[2]], newdata = testingdata),
predict(allModelsList[[3]], newdata = testingdata),
...
)
Now, predictions
is a list
, so we access each element with [[
. The first one is predictions[[1]]
, the k
th one is predictions[[k]]
if we want to define some variable k
(like to use in loop). We could also add descriptive names and use the names instead of the indices.
Similarly, we can calculate all the confusion matrices:
# lapply way
conf_matrices = lapply(predictions, confusionMatrix, reference = testingdata$outcome)
# for loop way
conf_matrices = list()
for (p in 1:length(predictions)) {
conf_matrices[[p]] = confusionMatrix(p, reference = testingdata$outcome)
}
# manual way (for illustration)
conf_matrices = list(
confusionMatrix(predictions[[1]], reference = testingdata$outcome),
confusionMatrix(predictions[[2]], reference = testingdata$outcome),
...
)
Again, we have a list
. The first confusion matrix is conf_matrices[[1]]
and all the same as above.
Hopefully that's helps us understand how to use lapply
or a for
loop to create a list.
Now, toward the bottom of your question you seem to imply that the Accuracy
part of the confusion matrix. I ran the example at the bottom of the help page ?confusionMatrix
and looked at the result. Running str(conf_mat)
on a result showed me that it is a list
, and that the "overall"
element of the list is a named vector, including the "Accuracy"
. So, for an individual confusion matrix cm
we can extract the accuracy with cm[["overall"]]["Accuracy"]
. We use [[
for the list
part and [
for the regular vector part. (We could also use cm$overall["Accuracy"]
. $
works when we give it the exact name, no quotes, no variables. A lot of your issues seem to be related to trying to use $
with quotes or variables. You just can't do that. See fortunes::fortune(312)
).
So, we can extract the accuracies from our confusion matrix list:
# I use *s*apply here so the result will be *s*implified into a vector
acc = sapply(conf_matrices, function(cm) cm[["overall"]]["Accuracy"])
acc = numeric(length(conf_matrices))
for (i in 1:length(conf_matrices)) {
acc[i] = conf_matrices[[i]][["overall"]]["Accuracy"]
}
Or, if you know from the beginning you only want the accuracy, we could get there directly without saving the intermediate steps:
# apply
acc = sapply(allModelsList, function(x) {
pred = predict(x, newdata = testingdata)
cm = confusionMatrix(pred, reference = testingdata$outcome
return(cm[["overall"]]["Accuracy"]
}
)
# for loop
acc = numeric(length(allModelsList))
for (i in 1:length(allModelsList)) {
pred = predict(allModelsList[[i]], newdata = testingdata)
cm = confusionMatrix(pred, reference = testingdata$outcome
acc[i] = (cm[["overall"]]["Accuracy"]
}
Notes: As mentioned above, without a reproducible example I'm guessing quite a bit and none of this is tested because I don't have any inputs to test on. I'm presuming that what I see in your question in terms of individual steps, like that we want to predict on each element of allModelResults
, are correct. (If so, it seems like, say, fittedModels
would be a much better name than allModelResults
.) I don't know what you mean by "model specifications", and I have no idea what's in allModelList
, but hopefully this gives you enough examples of working with lists that you can work out any kinks. (There may also be, say, mismatched parentheses or missing brackets.)
lapply
and sapply
are convenient for letting you do less typing than a for
loop, but they're not really any different. They set up an object to hold the results, and they fill it up. If you want to create multiple results at the same time, you may want to just us a for
loop. And as the number of steps inside gets longer, it can be easier to debug a for loop anyway. Use what you like and what makes sense to you.
1
This answer is absolutely amazing. Indeed I am used to using data-frames with the $ usage. Thank you for explaining each step so well. I hope this is useful for others as well
– Peter Alexander
Nov 11 at 22:03
Note that using[[
and[
with variables works for data frames too. (Data frames are lists, just with a few extra properties.) So whilemtcars$mpg
is thempg
column from themtcars
data frame, if you have a column name in a variablecol = "mpg"
, thenmtcars$col
won't work.You can usemtcars[, col]
ormtcars[[col]]
- use[[
when you know you want a single column only as a vector, use[, col]
when you want possibly multiple columns.
– Gregor
Nov 12 at 13:52
add a comment |
up vote
1
down vote
accepted
up vote
1
down vote
accepted
You seem to want predictions and a confusion matrix for each model. Without a reproducible example and with some confusing terminology, I'm doing a lot of guesswork, but I think I understand what you want (or close enough). I'll show you how I would do it with lapply and sapply, and then we can do it with a for loop too.
First, get predictions on the testing data. All of these approaches produce the same predictions:
# lapply way (the fitted models live in allModelsResults; allModelsList only holds the formulas)
predictions = lapply(allModelsResults, predict, newdata = testingdata)
# for loop way
predictions = list()
for (i in seq_along(allModelsResults)) {
  predictions[[i]] = predict(allModelsResults[[i]], newdata = testingdata)
}
# manual way - just so you understand exactly what's going on
predictions = list(
  predict(allModelsResults[[1]], newdata = testingdata),
  predict(allModelsResults[[2]], newdata = testingdata),
  predict(allModelsResults[[3]], newdata = testingdata),
  ...
)
Now, predictions is a list, so we access each element with [[. The first one is predictions[[1]], the kth one is predictions[[k]] if we want to define some variable k (like to use in a loop). We could also add descriptive names and use the names instead of the indices.
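As a minimal sketch of that access pattern, here is a toy list standing in for the real predictions. The values and the names model_x1, model_x2, model_x3 are made up for illustration; only base R is used.

```r
# Hypothetical stand-in for the real predictions: a plain list of vectors.
predictions <- list(c("a", "b"), c("b", "b"), c("a", "a"))

# Positional access with [[ ]], using a literal or a variable.
first <- predictions[[1]]
k <- 3
kth <- predictions[[k]]

# Descriptive names (invented here) allow access by name as well.
names(predictions) <- c("model_x1", "model_x2", "model_x3")
by_name <- predictions[["model_x2"]]

# A name stored in a variable also works with [[ ]].
wanted <- "model_x3"
by_var <- predictions[[wanted]]
```

The same [[ call works whether the index is a number, a string, or a variable holding either, which is what makes it usable inside a loop.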
Similarly, we can calculate all the confusion matrices:
# lapply way
conf_matrices = lapply(predictions, confusionMatrix, reference = testingdata$outcome)
# for loop way
conf_matrices = list()
for (p in seq_along(predictions)) {
  # pass the p-th set of predictions, not the index p itself
  conf_matrices[[p]] = confusionMatrix(predictions[[p]], reference = testingdata$outcome)
}
# manual way (for illustration)
conf_matrices = list(
  confusionMatrix(predictions[[1]], reference = testingdata$outcome),
  confusionMatrix(predictions[[2]], reference = testingdata$outcome),
  ...
)
Again, we have a list. The first confusion matrix is conf_matrices[[1]], and so on, just as above.
Hopefully that helps us understand how to use lapply or a for loop to create a list.
Now, toward the bottom of your question you seem to imply that you want the Accuracy part of the confusion matrix. I ran the example at the bottom of the help page ?confusionMatrix and looked at the result. Running str(conf_mat) on a result showed me that it is a list, and that the "overall" element of the list is a named vector that includes "Accuracy". So, for an individual confusion matrix cm we can extract the accuracy with cm[["overall"]]["Accuracy"]. We use [[ for the list part and [ for the regular vector part. (We could also use cm$overall["Accuracy"]. $ works when we give it the exact, literal name; it never evaluates a variable. A lot of your issues seem to be related to trying to use $ with variables. You just can't do that. See fortunes::fortune(312).)
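To make the difference between $, [[ and [ concrete, here is a small sketch using a mock object. The list cm below only imitates the "overall" part of a confusion matrix result; a real one comes from confusionMatrix() and has more elements, and the numbers are invented.

```r
# Mock object imitating the relevant part of a confusion matrix result.
cm <- list(overall = c(Accuracy = 0.9, Kappa = 0.8))

acc1 <- cm[["overall"]]["Accuracy"]    # named numeric vector of length 1
acc2 <- cm$overall["Accuracy"]         # same thing, $ with the literal name
acc3 <- cm[["overall"]][["Accuracy"]]  # bare number, name dropped

# $ does not evaluate variables: it looks for an element literally called "part".
part <- "overall"
via_dollar <- cm$part       # NULL, there is no element named "part"
via_bracket <- cm[[part]]   # the "overall" vector
```

Whether you keep the name (single bracket) or drop it (double bracket) on the last step is a matter of taste; both give the same number.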
So, we can extract the accuracies from our confusion matrix list:
# sapply way - I use *s*apply here so the result will be *s*implified into a vector
acc = sapply(conf_matrices, function(cm) cm[["overall"]]["Accuracy"])
# for loop way
acc = numeric(length(conf_matrices))
for (i in seq_along(conf_matrices)) {
  acc[i] = conf_matrices[[i]][["overall"]]["Accuracy"]
}
Or, if you know from the beginning that you only want the accuracy, we could get there directly without saving the intermediate steps:
# sapply way
acc = sapply(allModelsResults, function(x) {
  pred = predict(x, newdata = testingdata)
  cm = confusionMatrix(pred, reference = testingdata$outcome)
  return(cm[["overall"]]["Accuracy"])
})
# for loop way
acc = numeric(length(allModelsResults))
for (i in seq_along(allModelsResults)) {
  pred = predict(allModelsResults[[i]], newdata = testingdata)
  cm = confusionMatrix(pred, reference = testingdata$outcome)
  acc[i] = cm[["overall"]]["Accuracy"]
}
Notes: As mentioned above, without a reproducible example I'm guessing quite a bit, and none of this is tested because I don't have any inputs to test on. I'm presuming that the individual steps I see in your question, like wanting to predict on each element of allModelsResults, are correct. (If so, it seems like, say, fittedModels would be a much better name than allModelsResults.) I don't know what you mean by "model specifications", and I have no idea what's in allModelsList, but hopefully this gives you enough examples of working with lists that you can work out any kinks. (There may also be, say, mismatched parentheses or missing brackets.)
lapply and sapply are convenient for letting you do less typing than a for loop, but they're not really any different. They set up an object to hold the results, and they fill it up. If you want to create multiple results at the same time, you may want to just use a for loop. And as the number of steps inside gets longer, it can be easier to debug a for loop anyway. Use what you like and what makes sense to you.
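Here is a rough, runnable sketch of the "multiple results at the same time" case. Since I don't have your data, it uses stand-ins throughout: lm() fits on the built-in mtcars data instead of caret models, an arbitrary train/test split, and RMSE and maximum absolute error instead of accuracy. The point is only the pattern of filling several result columns in one for loop.

```r
# Stand-in model specifications and fits (lm on mtcars, not caret/knn).
formulas <- list(mpg ~ wt, mpg ~ wt + hp, mpg ~ wt + hp + qsec)
train_rows <- 1:24                      # arbitrary split, for illustration only
test_set <- mtcars[-train_rows, ]
fits <- lapply(formulas, lm, data = mtcars[train_rows, ])

# Pre-allocate one row per model, then fill several columns per iteration.
results <- data.frame(model = character(length(fits)),
                      rmse = numeric(length(fits)),
                      max_abs_error = numeric(length(fits)),
                      stringsAsFactors = FALSE)
for (i in seq_along(fits)) {
  pred <- predict(fits[[i]], newdata = test_set)
  err <- pred - test_set$mpg
  results$model[i] <- deparse(formulas[[i]])
  results$rmse[i] <- sqrt(mean(err^2))
  results$max_abs_error[i] <- max(abs(err))
}
```

With real caret models you would swap the two error lines for a confusionMatrix() call and pull out whichever elements of it you want to keep.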
answered Nov 10 at 22:19
Gregor
61.4k988163
1
This answer is absolutely amazing. Indeed I am used to using data-frames with the $ usage. Thank you for explaining each step so well. I hope this is useful for others as well
– Peter Alexander
Nov 11 at 22:03
Note that using [[ and [ with variables works for data frames too. (Data frames are lists, just with a few extra properties.) So while mtcars$mpg is the mpg column from the mtcars data frame, if you have a column name in a variable col = "mpg", then mtcars$col won't work. You can use mtcars[, col] or mtcars[[col]] - use [[ when you know you want a single column only as a vector, use [, col] when you want possibly multiple columns.
– Gregor
Nov 12 at 13:52
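A quick sketch of that comment using the built-in mtcars data. The variable names (via_dollar, as_vector, and so on) are just for illustration.

```r
col <- "mpg"

via_dollar <- mtcars$col       # NULL: there is no column literally named "col"
as_vector <- mtcars[[col]]     # numeric vector
also_vector <- mtcars[, col]   # a single column is dropped to a vector here too

cols <- c("mpg", "cyl")
as_df <- mtcars[, cols]        # data frame with two columns

# drop = FALSE keeps the data frame shape even for a single column.
one_col_df <- mtcars[, col, drop = FALSE]
```

If the number of selected columns can vary between one and many, drop = FALSE avoids getting a vector in one case and a data frame in the other.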
lapply(allModelsResults, predict, testing_data). If you have a list (of results) use a form of *apply to process it.
– Rui Barradas
Nov 10 at 19:48
Thank you Rui Barradas, how would I go about storing specific results? From my results_to_store for example?
– Peter Alexander
Nov 10 at 19:51
1
That depends. If what is extracted is a vector with a fixed length I use sapply. It will return a vector or matrix or similar object and these are in a tidy format easy to further process. Otherwise I use lapply and keep what is extracted in a list.
– Rui Barradas
Nov 10 at 19:55
3
Use [[ not $ to access list elements. This works well with variables. allModelsResults[[1]] is the first element, and if i = 3 then allModelsResults[[i]] is the third element.
– Gregor
Nov 10 at 20:07
3
And lapply always returns a list. You don't need to call it an "lapply object", it's just a list.
– Gregor
Nov 10 at 20:08