Updating use of superseded map_df() to map() %>% list_rbind() (#753)
Fixes #746
SokolovAnatoliy authored Aug 15, 2024
1 parent 5d06aae commit f3eb9e5
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions vignettes/articles/readxl-workflows.Rmd
@@ -134,17 +134,18 @@ What if the datasets found on different sheets have the same variables? Then you

readxl ships with an example sheet `deaths.xlsx`, containing data on famous people who died in 2016 or 2017. It has two worksheets named "arts" and "other", but the spreadsheet layout is the same in each and the data tables have the same variables, e.g., name and date of death.

-The `map_df()` function from purrr makes it easy to iterate over worksheets and glue together the resulting data frames, all at once.
+The `map()` function from purrr makes it easy to iterate over worksheets. Use `purrr::list_rbind()` to glue together the resulting data frames.

* Store a self-named vector of worksheet names (critical for the ID variable below).
-* Use `purrr::map_df()` to import the data, create an ID variable for the source worksheet, and row bind.
+* Use `purrr::map() %>% purrr::list_rbind()` to import the data, create an ID variable for the source worksheet, and row bind.

```{r}
path <- readxl_example("deaths.xlsx")
deaths <- path %>%
excel_sheets() %>%
set_names() %>%
-map_df(~ read_excel(path = path, sheet = .x, range = "A5:F15"), .id = "sheet")
+map(~ read_excel(path = path, sheet = .x, range = "A5:F15")) %>%
+list_rbind(names_to = "sheet")
print(deaths, n = Inf)
```
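
The replacement pattern in this hunk can be tried without an Excel file. The following is a minimal sketch, assuming purrr 1.0.0 or later (where `list_rbind()` is available); `fake_read()` is a hypothetical stand-in for `read_excel()` and is not part of readxl or of this commit.

```r
library(purrr)

# Hypothetical stand-in for read_excel(): returns a small data frame per sheet name
fake_read <- function(sheet) {
  data.frame(name = paste0(sheet, "_person", 1:2), age = c(60, 70))
}

# Self-named vector: the names equal the values, as set_names() does in the vignette
sheet_names <- set_names(c("arts", "other"))

combined <- sheet_names %>%
  map(fake_read) %>%               # named list of data frames, one per sheet
  list_rbind(names_to = "sheet")   # list names become the "sheet" ID column

print(combined)
```

The ID column comes from the names of the list, which is why the self-named vector matters: without names, `names_to` has nothing meaningful to record.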

@@ -162,7 +163,7 @@ Even though the worksheets in `deaths.xlsx` have the same layout, we'll pretend

* Store a self-named vector of worksheet names.
* Store a vector of cell range specifications.
-* Use `purrr::map2_df()` to iterate over those two vectors in parallel, importing the data, row binding, and creating an ID variable for the source worksheet.
+* Use `purrr::map2() %>% purrr::list_rbind()` to iterate over those two vectors in parallel, importing the data, row binding, and creating an ID variable for the source worksheet.
* Cache the unified data to CSV.

```{r}
@@ -171,12 +172,13 @@ sheets <- path %>%
excel_sheets() %>%
set_names()
ranges <- list("A5:F15", cell_rows(5:15))
-deaths <- map2_df(
+deaths <- map2(
sheets,
ranges,
~ read_excel(path, sheet = .x, range = .y),
.id = "sheet"
) %>%
+list_rbind() %>%
write_csv("deaths.csv")
print(deaths, n = Inf)
```
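
A possible follow-up, offered as a review note rather than as part of this commit: in the hunk above, `.id = "sheet"` remains inside the `map2()` call, and `list_rbind()` is called without `names_to`. As far as I can tell, `map2()` has no `.id` argument, so the value is passed through `...` to the lambda and ignored, and `list_rbind()` drops the list names by default, so the `sheet` column would no longer be created. A minimal sketch of one way to keep the ID column, assuming purrr 1.0.0 or later; `fake_read()` and `n_rows` are hypothetical stand-ins for `read_excel()` and `ranges`.

```r
library(purrr)

# Hypothetical stand-in for read_excel(path, sheet = .x, range = .y)
fake_read <- function(sheet, n_rows) {
  data.frame(name = paste0(sheet, "_", seq_len(n_rows)))
}

sheets <- set_names(c("arts", "other"))   # self-named vector of sheet names
n_rows <- list(2, 3)                      # stands in for the `ranges` list

deaths_sketch <- map2(sheets, n_rows, ~ fake_read(.x, .y)) %>%
  list_rbind(names_to = "sheet")          # names of `sheets` become the ID column

print(deaths_sketch)
```

Moving the ID handling into `list_rbind(names_to = "sheet")` mirrors what the first hunk of this commit already does for `map()`.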