purrr::map() - reading files in groups
I need to read a bunch of files from a grouped list and combine them based on the group (files from the same group have the same columns and can therefore be reduced with bind_rows()).
I can't seem to get a handle on how the data changes as I move through the purrr::map() calls, because I keep getting warnings that I can't use $ on atomic vectors.
The first thing I do is split by the group, so that I get a list holding, for each group, the files I want to read. Then I use map to go through each item in that list, and a second map to go through the rows of each sublist and read the files. However, something happens at that second level where the data is no longer treated the same way as when I work with a single group at the top level.
(The lack of being able to debug and look at my environment inside a map() function is really an issue in understanding the mechanics.)
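One thing that does help a little, though it is not in the reprex below: a browser() call dropped inside the anonymous function pauses execution there, so the environment can at least be inspected interactively. A toy sketch:
library(purrr)

map(list(1, 2), function(el) {
  # browser()  # uncomment to pause here and look at `el` and the surrounding environment
  el * 2
})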
require(tidyverse)
#> Loading required package: tidyverse
x <- structure(list(survey = c("adm2014", "adm2015", "adm2016", "eap2008",
"eap2009", "eap2011", "eap2012", "eap2013", "eap2014", "eap2015",
"eap2016", "ef2008a", "ef2008b", "ef2008c", "ef2008cp", "ef2008d",
"ef2009a", "ef2009b", "ef2009c", "ef2009d", "ef2010a", "ef2010b",
"ef2010c", "ef2010cp", "ef2010d", "ef2011a", "ef2011b", "ef2011c",
"ef2011d", "ef2012a", "ef2012b", "ef2012c", "ef2012cp", "ef2012d",
"ef2013a", "ef2013b", "ef2013c", "ef2013d", "ef2014a", "ef2014b",
"ef2014c", "ef2014cp", "ef2014d", "ef2015a", "ef2015b", "ef2015c",
"ef2015d", "ef2016a", "ef2016b", "ef2016c", "ef2016cp", "ef2016d",
"efest2008", "efest2009", "effy2008", "effy2009", "effy2010",
"effy2011", "effy2012", "effy2013", "effy2014", "effy2015", "effy2016",
"effy2017", "efia2008", "efia2009", "efia2011", "efia2012", "efia2013",
"efia2014", "efia2015", "efia2016", "efia2017", "f0708_f1a",
"f0708_f2", "f0708_f3", "f0809_f1a", "f0809_f2", "f0809_f3",
"f0910_f1a", "f0910_f2", "f0910_f3", "f1011_f1a", "f1011_f2",
"f1011_f3", "f1112_f1a", "f1112_f2", "f1112_f3", "f1213_f1a",
"f1213_f2", "f1213_f3", "f1314_f1a", "f1314_f2", "f1314_f3",
"f1415_f1a", "f1415_f2", "f1415_f3", "f1516_f1a", "f1516_f2",
"f1516_f3", "gr2008", "gr2008_l2", "gr2009", "gr2009_l2", "gr200_08",
"gr200_09", "gr200_10", "gr200_11", "gr200_12", "gr200_13", "gr200_14",
"gr200_15", "gr200_16", "gr2010", "gr2010_l2", "gr2011", "gr2011_l2",
"gr2012", "gr2012_l2", "gr2013", "gr2013_l2", "gr2014", "gr2014_l2",
"gr2015", "gr2015_l2", "gr2016", "gr2016_l2", "hd2008", "hd2009",
"hd2010", "hd2011", "hd2012", "hd2013", "hd2014", "hd2015", "hd2017",
"ic2008", "ic2008_ay", "ic2008_py", "ic2009", "ic2009_ay", "ic2009_py",
"ic2010", "ic2010_ay", "ic2010_py", "ic2011", "ic2011_ay", "ic2011_py",
"ic2012", "ic2012_ay", "ic2012_py", "ic2013", "ic2013_ay", "ic2013_py",
"ic2014", "ic2014_ay", "ic2014_py", "ic2015", "ic2015_ay", "ic2015_py",
"ic2016", "ic2016_ay", "ic2016_py", "ic2017", "ic2017_ay", "ic2017_py",
"s2008_abd", "s2008_cn", "s2008_f", "s2008_g", "s2009_abd", "s2009_cn",
"s2009_f", "s2009_g", "s2010_abd", "s2010_cn", "s2010_f", "s2010_g",
"s2011_abd", "s2011_cn", "s2011_f", "s2011_g", "sal2008_a", "sal2008_a_lt9",
"sal2008_b", "sal2008_faculty", "sal2009_a", "sal2009_a_lt9",
"sal2009_b", "sal2009_faculty", "sal2010_a", "sal2010_a_lt9",
"sal2010_b", "sal2010_faculty", "sal2011_a", "sal2011_a_lt9",
"sal2011_faculty"), survgroup = c("adm", "adm", "adm", "eap",
"eap", "eap", "eap", "eap", "eap", "eap", "eap", "efa", "efb",
"efc", "efcp", "efd", "efa", "efb", "efc", "efd", "efa", "efb",
"efc", "efcp", "efd", "efa", "efb", "efc", "efd", "efa", "efb",
"efc", "efcp", "efd", "efa", "efb", "efc", "efd", "efa", "efb",
"efc", "efcp", "efd", "efa", "efb", "efc", "efd", "efa", "efb",
"efc", "efcp", "efd", "efest", "efest", "effy", "effy", "effy",
"effy", "effy", "effy", "effy", "effy", "effy", "effy", "efia",
"efia", "efia", "efia", "efia", "efia", "efia", "efia", "efia",
"f_f1a", "f_f2", "f_f3", "f_f1a", "f_f2", "f_f3", "f_f1a", "f_f2",
"f_f3", "f_f1a", "f_f2", "f_f3", "f_f1a", "f_f2", "f_f3", "f_f1a",
"f_f2", "f_f3", "f_f1a", "f_f2", "f_f3", "f_f1a", "f_f2", "f_f3",
"f_f1a", "f_f2", "f_f3", "gr", "gr_l2", "gr", "gr_l2", "gr_08",
"gr_09", "gr_10", "gr_11", "gr_12", "gr_13", "gr_14", "gr_15",
"gr_16", "gr", "gr_l2", "gr", "gr_l2", "gr", "gr_l2", "gr", "gr_l2",
"gr", "gr_l2", "gr", "gr_l2", "gr", "gr_l2", "hd", "hd", "hd",
"hd", "hd", "hd", "hd", "hd", "hd", "ic", "ic_ay", "ic_py", "ic",
"ic_ay", "ic_py", "ic", "ic_ay", "ic_py", "ic", "ic_ay", "ic_py",
"ic", "ic_ay", "ic_py", "ic", "ic_ay", "ic_py", "ic", "ic_ay",
"ic_py", "ic", "ic_ay", "ic_py", "ic", "ic_ay", "ic_py", "ic",
"ic_ay", "ic_py", "s_abd", "s_cn", "s_f", "s_g", "s_abd", "s_cn",
"s_f", "s_g", "s_abd", "s_cn", "s_f", "s_g", "s_abd", "s_cn",
"s_f", "s_g", "sal_a", "sal_a_lt9", "sal_b", "sal_faculty", "sal_a",
"sal_a_lt9", "sal_b", "sal_faculty", "sal_a", "sal_a_lt9", "sal_b",
"sal_faculty", "sal_a", "sal_a_lt9", "sal_faculty")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -197L))
x %>%
  split(.$survgroup) %>%
  map(function(currentgroup) {
    # currentgroup should now be a tibble of each group.
    currentgroup %>%
      map(function(singlesurvey) { # singlesurvey should be each row in the group
        x <- read_csv(path_expand(paste0("~data/IPEDS/API Pulls/datadownloaded/", singlesurvey$survey, ".csv")))
      }) %>% bind_rows()
  })
#> Error in path_expand(paste0("~data/IPEDS/API Pulls/datadownloaded/", singlesurvey$survey, : could not find function "path_expand"
Created on 2018-11-12 by the reprex package (v0.2.1)
Tags: r, purrr
path.expand instead of path_expand maybe? – Jrakru56, Nov 12 '18 at 20:36
From package fs: path_expand() differs from base::path.expand() in the interpretation of the home directory of Windows. In particular path_expand() uses the path set in the USERPROFILE environment variable and, if unset, then uses HOMEDRIVE/HOMEPATH. – jzadra, Nov 12 '18 at 21:12
Interesting! A smarter file/path management ... Definitely will check this package out – Jrakru56, Nov 12 '18 at 21:17
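For context, a minimal sketch of what path building with the fs package could look like (the directory is the one from the question and is only assumed to exist):
library(fs)  # provides path() for joining components and path_expand() for "~" expansion

csv_path <- path_expand(path("~data/IPEDS/API Pulls/datadownloaded", "adm2014.csv"))
# On Windows, path_expand() resolves the home directory via USERPROFILE
# (falling back to HOMEDRIVE/HOMEPATH), as quoted from the fs docs above.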
2 Answers
The issue is that we need to loop through the individual file names in the column instead of looping through the columns of the dataset. In the OP's post, the second map receives a data.frame with a single column, so the basic unit it iterates over is that column, not the rows. If the column is instead extracted as a vector, the unit becomes the vector and map loops through each of its elements:
x %>%
  split(.$survgroup) %>%
  map(~ .x %>%
        pull(survey) %>%
        map(~ .x %>%
              paste0("~data/IPEDS/API Pulls/datadownloaded/", ., '.csv') %>%
              path.expand %>%
              read_csv))
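To make the column-vs-element distinction concrete, a small illustration on a made-up two-row tibble (not from the post):
library(tidyverse)

toy <- tibble(survey = c("adm2014", "adm2015"), survgroup = c("adm", "adm"))

map(toy, class)                                     # iterates over *columns*: a list with $survey and $survgroup
toy %>% pull(survey) %>% map(~ paste0(.x, ".csv"))  # iterates over *elements*: "adm2014.csv", "adm2015.csv"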
This has 2 maps, same as mine. The only difference seems to be the pull to isolate the survey, however this doesn't work regardless: x %>% split(.$survgroup) %>% map(~ .x %>% pull(survey) %>% map(~ .x %>% paste0("~data/IPEDS/API Pulls/datadownloaded/", .x, '.csv') %>% path.expand %>% read_csv)) gives Error: 'adm2014~data/IPEDS/API Pulls/datadownloaded/adm2014.csv' does not exist in current working directory. For some reason it puts the survey value in front. Trying to fix it back to my original... – jzadra, Nov 12 '18 at 21:02
But adding the pull gives bad results as well: x %>% split(.$survgroup) %>% map(~ .x %>% pull(survey) %>% map(function(dat) { read_csv(path_expand(paste0("~/Google Drive/SI/DataScience/data/gates/IPEDS/API Pulls/data/downloaded/", dat$survey, '.csv'))) })) gives Error in dat$survey : $ operator is invalid for atomic vectors. – jzadra, Nov 12 '18 at 21:02
@jzadra I think I made a mistake. Sorry. It should be . instead of .x – akrun, Nov 12 '18 at 21:05
Aha, got it. Thanks – jzadra, Nov 12 '18 at 21:10
@jzadra Yeah, actually I intended that, but somehow wrote it wrongly. Updated the post. Thanks – akrun, Nov 12 '18 at 22:45
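The "." vs ".x" fix above comes down to magrittr's placeholder rule: when "." appears as a direct argument on the right-hand side of %>%, the left-hand side is not also inserted as the first argument. A toy illustration (made-up prefix, not the real path):
library(magrittr)

s <- "adm2014"
s %>% paste0("prefix/", ., ".csv")  # "." used directly -> "prefix/adm2014.csv"
s %>% paste0("prefix/", s, ".csv")  # no "." placeholder -> s is also prepended -> "adm2014prefix/adm2014.csv"
# The second form is what .x produced inside the inner map, hence the mangled file path in the error above.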
An alternative solution is to use list-columns to read the data frames into a column, and do the split afterwards.
x %>%
  mutate(data = map(survey, ~ read_csv(path.expand(paste0("~data/IPEDS/API Pulls/datadownloaded/", .x, ".csv"))))) %>%
  unnest() %>%
  split(.$survgroup)
The problem is that I actually have other functions that I need to apply within the lowest-level map. – jzadra, Nov 12 '18 at 21:04
Could you add these to the question? You might be able to do these with list-columns and/or group_by – Scott Gigante, Nov 12 '18 at 21:43
Unfortunately it's too much data to really do a reprex (pulling from multiple files to combine dictionaries with data in order to rename and recode). – jzadra, Nov 12 '18 at 22:39
Create a minimum reproducible example? stackoverflow.com/help/mcve – Scott Gigante, Nov 13 '18 at 18:12
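Following on from that exchange, a sketch of how per-file steps could be folded into the list-column approach before grouping; rename_all(tolower) is only a placeholder for whatever per-file processing is actually needed, and the directory is the (assumed) one from the question:
library(tidyverse)

x %>%
  mutate(data = map(survey, function(s) {
    read_csv(path.expand(paste0("~data/IPEDS/API Pulls/datadownloaded/", s, ".csv"))) %>%
      rename_all(tolower)                      # placeholder: per-file cleaning/recoding goes here
  })) %>%
  group_by(survgroup) %>%
  summarise(combined = list(bind_rows(data)))  # one combined tibble per survgroup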