Support tabularray #804

Closed

vincentarelbundock
Collaborator

vincentarelbundock commented Dec 12, 2023

I propose to add support for tabularray. Using that package, I should be able to bring us to feature parity, while solving all the issues above (and maybe more). Another benefit is that our LaTeX code would be much cleaner and more readable, because tabularray allows us to fix global settings for whole rows and columns, instead of having to modify each cell.

  • border_right and border_left
  • add_header_above() arguments
  • scale_down
  • longtblr sometimes leaves a single row on an otherwise empty page; weird line breaks
  • kable_styling(position = "center")
  • images in columns
  • Hex colors do not work in cell_spec(), but I'm not sure there's a good path forward here.
  • Snapshot tests
  • tabularray should not support vectorized arguments in row_spec() and column_spec(), because this goes against the fundamental package design, which treats rows and columns as "blocks" with separate settings, instead of applying styles to individual cells. We should return an informative error or warning.
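One way the informative error from the last item could look, as a minimal sketch in base R. The helper name `assert_scalar_style()` and the wording of the message are assumptions for illustration, not code from the branch:

```r
# Hypothetical helper, not part of kableExtra: reject vectorized style
# arguments before they reach the tabularray code path.
assert_scalar_style <- function(x, name) {
  if (is.null(x)) {
    return(invisible(NULL))
  }
  if (length(x) != 1) {
    stop(sprintf(
      "`%s` must be a single value when tabular = \"tblr\"; got length %d. Call row_spec() or column_spec() once per block instead.",
      name, length(x)
    ), call. = FALSE)
  }
  invisible(x)
}

# A scalar passes silently and is returned unchanged.
assert_scalar_style("pink", "background")

# A vector triggers the error; capture the message to inspect it.
res <- tryCatch(
  assert_scalar_style(c("pink", "blue"), "background"),
  error = function(e) conditionMessage(e)
)
print(res)
```

Raising an error rather than silently recycling or dropping values keeps the "rows and columns are blocks" design visible to the user.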

Problem

We use the tabu LaTeX package for full_width and background. Unfortunately, tabu is unmaintained: it has not been updated for 10 years (except for an emergency patch), and it has major bugs and limitations (e.g., handling conflicting colors). We hit those limitations often, as attested by all the GitHub issues related to striping and row_spec/column_spec/cell_spec.

After playing around with this for many hours, I've come to the conclusion that many kableExtra bugs will be very hard, or more likely impossible, to fix with tabu. Take these, for example:

Implementation

To use tabularray, the user needs to explicitly specify the tabular = "tblr" argument in kbl(). If this argument is not specified, everything works exactly as before. If the argument is supplied explicitly, we bypass the default row_spec_latex()/column_spec_latex()/cell_spec_latex() functions and use the analogues stored in the R/tabularray.R file instead.

Accordingly, this PR should have zero effect on current functionality and should essentially not be a breaking change (unless someone already used that specific argument, in which case all other functions would not work anyway).
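The opt-in dispatch described above can be pictured roughly as follows. This is a sketch under assumptions, not the code in R/tabularray.R: the two stand-in functions and `row_spec_dispatch()` are hypothetical names, and the accepted values are the three environments mentioned in this thread.

```r
# Stand-ins for the real implementations (hypothetical names).
row_spec_latex_default <- function(x, ...) "default path"
row_spec_latex_tblr    <- function(x, ...) "tabularray path"

# Values of `tabular` that select the tabularray code path.
tblr_envs <- c("tblr", "talltblr", "longtblr")

row_spec_dispatch <- function(x, tabular = NULL, ...) {
  # isTRUE() is FALSE for NULL, for vectors, and for any other value,
  # so every call that does not opt in keeps the existing behavior.
  if (isTRUE(tabular %in% tblr_envs)) {
    row_spec_latex_tblr(x, ...)
  } else {
    row_spec_latex_default(x, ...)
  }
}

print(row_spec_dispatch("table"))                    # no argument: old path
print(row_spec_dispatch("table", tabular = "tblr"))  # opt-in: new path
```

Because the branch is taken only on an exact match, existing calls that never set the argument cannot reach the new functions.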

Long run

In the long run, I think that tabu should be deprecated completely, because it is fundamentally unsuited for kableExtra's needs, which leads to a poor user experience in some cases. But that is a discussion that we can obviously have much later, when tabularray (or some other solution) has proven itself.

Working example (code and screenshot below)

Install my branch:

remotes::install_github("vincentarelbundock/kableExtra@tabularray")

Then, render this Rmarkdown document:

---
output:
  pdf_document:
    keep_tex: true
header-includes: 
  - \newcommand{\kableExtraTabularrayUnderline}[1]{\underline{#1}}
  - \newcommand{\kableExtraTabularrayStrikeout}[1]{\sout{#1}}
---

```{r}
library(kableExtra)
d <- mtcars[1:3, 1:3]
d[2, 3] <- cell_spec(d[2, 3],
  format = "tblr",
  underline = TRUE,
  background = "pink",
  escape = FALSE,
  font_size = 20)

kbl(d,
  format = "latex",
  tabular = "tblr",
  escape = FALSE,
  caption = "Blah blah") |>

  row_spec(
    row = c(1, 3),
    background = "pink",
    color = "blue",
    bold = TRUE,
    italic = TRUE,
    strikeout = FALSE,
    align = "c") |>

  column_spec(
    1:2,
    background = "yellow",
    color = "red",
    monospace = TRUE,
    strikeout = TRUE,
    width = c("4cm", "6cm"),
    border_left = TRUE)
```

[Screenshot: tabularray]

@haozhu233

This comment was marked as resolved.

@vincentarelbundock

This comment was marked as resolved.

@vincentarelbundock
Collaborator Author

vincentarelbundock commented Dec 14, 2023

This PR is ready for a first review.

Note that this code should not make any change to existing behavior. The new functions are triggered only by the tabular argument:

kbl(dat, tabular = "tblr")
kbl(dat, tabular = "talltblr")
kbl(dat, tabular = "longtblr")

If @haozhu233 or @dmurdoch are interested, I wrote a detailed vignette to highlight the main features and workflow:

tabularray.pdf

I am super excited about this. As noted in the vignette, this helps us solve at least 7 thorny bugs (and probably more), and it opens up a lot of cool possibilities for LaTeX tables. Row and column processing via regexes should eventually be much more robust, because we only need to manipulate the "header" of a table and not its individual cells.
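To illustrate the "header only" point, here is a minimal base R sketch that appends one key to the inner specification of a tblr environment without touching any cell. The function `add_tblr_key()` is a hypothetical name, and the sketch assumes the `\begin{tblr}{...}` line sits on a single line that ends with the closing brace of the specification:

```r
# Hypothetical helper: add one key (e.g. a row or column setting) to the
# second brace group of \begin{tblr}{...}. Table cells are left untouched.
add_tblr_key <- function(tex, key) {
  lines <- strsplit(tex, "\n", fixed = TRUE)[[1]]
  i <- grep("^\\\\begin\\{(tblr|talltblr|longtblr)\\}\\{", lines)[1]
  if (is.na(i)) stop("no tblr environment found", call. = FALSE)
  if (!endsWith(lines[i], "}")) stop("unexpected header layout", call. = FALSE)
  # Drop the final closing brace, append the key, then close the group again.
  head <- substr(lines[i], 1, nchar(lines[i]) - 1)
  lines[i] <- paste0(head, ", ", key, "}")
  paste(lines, collapse = "\n")
}

tex <- paste(
  "\\begin{tblr}{colspec={lrr}}",
  "car & mpg & cyl \\\\",
  "Mazda RX4 & 21.0 & 6 \\\\",
  "\\end{tblr}",
  sep = "\n"
)

out <- add_tblr_key(tex, "row{1}={bg=pink, font=\\bfseries}")
cat(out, "\n")
```

Only the first line of the table changes; styling a whole row no longer requires wrapping each of its cells in color or font commands.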

@vincentarelbundock vincentarelbundock marked this pull request as ready for review December 14, 2023 15:41
@dmurdoch

This comment was marked as resolved.

@dmurdoch

This comment was marked as resolved.

@vincentarelbundock
Collaborator Author

vincentarelbundock commented Dec 15, 2023

@dmurdoch, thanks for looking at this. I really appreciate your time!

tabularray does not seem to recognize this syntax:

\arrayrulecolor{lightgray}\toprule[4pt]

But this works:

\toprule[4pt, lightgray]

The other issues should now be fixed.

  • New examples with \\midrule and friends added to the vignette.
  • add_header_above() can now use the \SetCell mechanism for spanning column labels.
  • Check for ifelse(is.character(color) && length(color) == 1, color, ""). Eventually we might support vectorized arguments, but I would prefer this to be in a different PR.
  • No longer rely on match.call(); pass all the arguments to init_tabularray() instead.
  • No longer trying to be clever about linesep and vline. The arguments behave strictly as in the CRAN version, and tabularray-specific options can only be passed via kable_styling.
  • Updated the vignette: tabularray.pdf

.Rmd code and screenshot:

remotes::install_github("vincentarelbundock/kableExtra@tabularray")
---
output: pdf_document
---

```{r}
library(kableExtra)
kbl(mtcars[1:5, 1:4],
    tabular = "tblr",
    booktabs = TRUE,
    vline = "",
    linesep = "\\midrule[2pt, lightgray]",
    toprule = "\\midrule[4pt, orange]",
    midrule = "\\midrule[3pt, lightgray]",
    bottomrule = "\\midrule[5pt, orange]"
    )
```

```{r}
kbl(mtcars[1:4, 1:5], tabular = "tblr", align = "c", booktabs = TRUE) |> 
    add_header_above(
        c(" " = 1, "$\\alpha$" = 2, "$\\beta$" = 3),
        escape = FALSE) |>
    add_header_above(
        c( "First Three" = 3, " " = 1, "Penultimate" = 1, " " = 1),
        italic = TRUE)
```

[Screenshot 2023-12-15 075047]

@vincentarelbundock
Collaborator Author

I think I've come to the conclusion that this is not possible or desirable within kableExtra. Closing this now, so you don't wait on it before releasing a new version. Sorry to have made you waste your time with this.
