Clearer error messages for invalid aesthetics #3091

Merged Apr 29, 2019 (14 commits)

clairemcwhite
Contributor

Fixes #3060

Previously, if a computed stat wasn't valid for a layer, the error message was confusing:

# wrong, `fill` aesthetic mapped to stat-generated data column that
# doesn't exist in `geom_point()` layer
ggplot(mtcars, aes(mpg, 1, fill = stat(density))) +
  geom_tile(stat = "density") +
  geom_point(position = "jitter")
#> Error in is.finite(x): default method not implemented for type 'closure'

The new error message for this code is:

#> Error: Aesthetics must be valid computed stats: fill. Did you map your stat in the wrong layer?

In the same vein, if a function such as density is mapped to an aesthetic, the error message changes from the old:

# wrong, `fill` aesthetic mapped to data column that doesn't exist
ggplot(mtcars, aes(mpg, 1, fill = density)) +
  geom_tile(stat = "density") +
  geom_point(position = "jitter")
#> Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
#> Error: Column `fill` must be a 1d atomic vector or a list

to the new error message below, which prompts the user to use stat(density):

#> Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
#> Error: Aesthetics must be valid data columns: fill. Did you forget to add stat()? 

This was done by adding two checks in R/layer.r.

For the first change:

    # Check that all columns in aesthetic stats are valid data
    nondata_stat_cols <- check_nondata_cols(stat_data)
    if(length(nondata_stat_cols) > 0){
      msg <- paste0(
        "Aesthetics must be valid computed stats: ", nondata_stat_cols,
        ". Did you map your stat in the wrong layer?"
      )
      stop(msg, call. = FALSE)
    }

and for the second change:

    nondata_cols <- check_nondata_cols(evaled)
    if(length(nondata_cols) > 0){
      msg <- paste0(
        "Aesthetics must be valid data columns: ", nondata_cols,
        ". Did you forget to add stat()?"
      )
      stop(msg, call. = FALSE)
    }

Both use a new function in R/utilities.r, check_nondata_cols():

# This function checks that all columns of a dataframe `x` are data and
# returns the names of any columns that are not.
# We define "data" as atomic types or lists, not functions or otherwise
check_nondata_cols <- function(x) {
  idx <- vapply(x, function(col) rlang::is_atomic(col) || rlang::is_list(col), logical(1))
  names(x)[!idx]
}
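
To see what the helper reports, here is a minimal sketch that needs only base R. It is not the code from the PR: `is.atomic()` and `is.list()` stand in for the rlang predicates, and the name `check_nondata_cols_sketch` and the sample list `evaled` are assumptions made for illustration.

```r
# Base-R stand-in for check_nondata_cols(); is.atomic()/is.list() replace
# the rlang predicates (an assumption, made to avoid package dependencies).
check_nondata_cols_sketch <- function(x) {
  is_data <- vapply(x, function(col) is.atomic(col) || is.list(col), logical(1))
  names(x)[!is_data]
}

# Hypothetical evaluated aesthetics: `fill` picked up the function
# stats::density instead of a data column.
evaled <- list(x = mtcars$mpg, y = 1, fill = stats::density)

bad <- check_nondata_cols_sketch(evaled)
print(bad)
#> [1] "fill"
```

Columns that are atomic vectors or lists pass the check, so only `fill` is reported.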

There are two tests in tests/testthat/test-layer.r for these two situations:

test_that("function aesthetics are wrapped with stat()", {
  df <- data_frame(x = 1:10)
  expect_error(
    ggplot_build(ggplot(df, aes(density)) + geom_tile(stat = "density")),
    "Aesthetics must be valid data columns:"
  )
})

test_that("computed stats are in appropriate layer", {
  df <- data_frame(x = 1:10)
  expect_error(
    ggplot_build(ggplot(df, aes(x = x, stat(density))) + geom_tile(stat = "density") + geom_point()),
    "Aesthetics must be valid computed stats:"
  )
})

@clauswilke
Member

@thomasp85 Could you take a look at the code? It looks good to me but would be good to get a second pair of eyes on it.

@batpigandme Would you mind looking this over for the phrasing of the error messages?

@thomasp85
Member

It appears to be good, but I can for the love of me not shake the feeling that it will error out in some fringe cases where it should not...

But I'm pretty sure that's just me — can't think of any instances where data could be other than atomic or recursive...

@thomasp85
Member

@batpigandme if you are ok with the wording you can approve and merge

R/layer.r Outdated
    if(length(nondata_cols) > 0){
      msg <- paste0(
        "Aesthetics must be valid data columns: ", nondata_cols,
        ". Did you forget to add stat()?"
Member

Is forgetting stat() really the most likely cause?

Member

There are two ways to get to this point: forgetting stat(), or mistyping a column name in such a way that the result is a function, e.g. typing mean when the data column is called Mean. How about: "Did you mistype the name of a data column or forget to add stat()?"
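
The mistyped-name case can be reproduced without ggplot2. The sketch below assumes that an aesthetic expression is looked up in the data first and in an enclosing environment second, which is how data masking generally behaves. The data frame `df`, its column `Mean`, and the helper `lookup()` are hypothetical.

```r
# Hypothetical data with a column called `Mean` (capital M).
df <- data.frame(Mean = c(1.5, 2.5, 3.5))

# Look up a symbol in the data first, then in the enclosing environment.
lookup <- function(expr, data) eval(expr, data, enclos = baseenv())

right <- lookup(quote(Mean), df)  # finds the data column
wrong <- lookup(quote(mean), df)  # not a column, so it falls through to base::mean

is.numeric(right)
#> [1] TRUE
is.function(wrong)
#> [1] TRUE
```

Because `wrong` is a function rather than a vector, a check like check_nondata_cols() would flag the aesthetic it was mapped to.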

Member

Can we display the thing that they actually typed? Then we could hint to check that blah is a column in the data frame.

Contributor Author

I don't know how to get it to print the actual typed statement, but now it prints the problematic aesthetic (along with the tilde).

Situation 1:
ggplot(mtcars, aes(x = mpg, stat(density))) + geom_tile(stat = "density") + geom_point()

Error: Aesthetics must be valid computed stats: ~stat(density). Did you map your stat in the wrong layer?

Situation 2:

ggplot(mtcars, aes(x = mpg, fill = density)) + geom_tile(stat = "density") + geom_point()

Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error: Aesthetics must be valid data columns: ~density. Did you mistype the name of a data column or forget to add stat()?

@clauswilke
Member

> It appears to be good, but I can for the love of me not shake the feeling that it will error out in some fringe cases where it should not...

I discussed this with @hadley at the developers day and the more I think about it the more I think it's right. Note that we're already performing a similar check on all incoming data, here:

ggplot2/R/utilities.r

Lines 440 to 444 in e9d4e5d

# Check inputs with tibble but allow column vectors (see #2609 and #2374)
as_gg_data_frame <- function(x) {
  x <- lapply(x, validate_column_vec)
  new_data_frame(tibble::as_tibble(x))
}

Though now that I think about it, maybe this check and the new check in this PR should be combined into one? Do we need to call tibble to validate the input data or could we simply run something like the check_nondata_cols() defined here?

@thomasp85 added this to the ggplot2 3.2.0 milestone on Apr 11, 2019
@thomasp85
Member

@clairemcwhite do you want to have a go at finishing this PR off?

@clairemcwhite
Contributor Author

I can take a shot this weekend

R/layer.r Outdated
    nondata_cols <- check_nondata_cols(evaled)
    if (length(nondata_cols) > 0) {
      msg <- paste0(
        "Aesthetics must be valid data columns: `", deparse(aesthetics[[nondata_cols]]),
Member

I think it would be better to use as_label() instead of deparse().

R/layer.r Outdated
    nondata_stat_cols <- check_nondata_cols(stat_data)
    if (length(nondata_stat_cols) > 0) {
      msg <- paste0(
        "Aesthetics must be valid computed stats: `", deparse(aesthetics[[nondata_stat_cols]]),
Member

See above.

@clauswilke
Member

As I said in the code comments, I think you should use as_label() instead of deparse(). However, be aware that there's a dependency on #3242, which isn't merged yet. For now, you should write rlang::as_label(). We can then hold off on merging this PR until #3242 is merged, and then change rlang::as_label() to as_label() here and merge this.

@clairemcwhite
Contributor Author

deparse() is now replaced with rlang::as_label(). Current behavior is:

ggplot(mtcars, aes(x = mpg, stat(density))) + geom_tile(stat = "density") + geom_point()

Error: Aesthetics must be valid computed stats: stat(density). Did you map your stat in the wrong layer?

ggplot(mtcars, aes(x = mpg, fill = density)) + geom_tile(stat = "density") + geom_point()

Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error: Aesthetics must be valid data columns: density. Did you mistype the name of a data column or forget to add stat()?

@clauswilke
Member

I've looked at everything carefully once more and saw a few more minor nitpicks (see code comments). Also, I'm still not 100% convinced by the error message. I feel it needs a better lead into the aesthetic names that are the problem. E.g.:

Error: Aesthetics must be valid computed stats. Problematic aesthetic(s): stat(density).

@batpigandme Could you comment?

Finally, Claire, you'll have to resolve the merge conflict before merging. You can do that by merging the current ggplot2 master into your branch.

@clauswilke
Member

Claire, could you check what happens when there are multiple problematic aesthetics in one layer? I just realized that that case may not work yet. It probably needs a paste0() and a vapply():

paste0(vapply(aesthetics[nondata_stat_cols], rlang::as_label, character(1)), collapse = ", ")

Also make sure the test functions check against the final text in the error message. You can probably just delete the final colon.
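
The reason the multi-aesthetic case needs a collapse step can be shown in base R alone. In this sketch the vector `bad` is a hypothetical result of check_nondata_cols(), and the message text follows the wording proposed earlier in this thread.

```r
bad <- c("fill", "colour")  # hypothetical result of check_nondata_cols()

# paste0() is vectorised, so without collapsing it returns one message
# per problematic aesthetic instead of a single message.
per_aes <- paste0("Aesthetics must be valid computed stats: ", bad, ".")
length(per_aes)
#> [1] 2

# Collapsing the names first yields a single message.
msg <- paste0(
  "Aesthetics must be valid computed stats. Problematic aesthetic(s): ",
  paste0(bad, collapse = ", "), "."
)
msg
#> [1] "Aesthetics must be valid computed stats. Problematic aesthetic(s): fill, colour."
```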

@yutannihilation
Member

#3242 is merged now.

NEWS.md Outdated
@@ -81,6 +81,10 @@ core developer team.

* ggplot2 now works in Turkish locale (@yutannihilation, #3011).

* Clearer error messages for inappropriate aesthetics (@clairemcwhite, #3060).

* `geom_rug()` gains an "outside" option to allow for moving the rug tassels to outside the plot area. (@njtierney, #3085)
Member

Do we have an idea of where this news item comes from? It should have been present in NEWS.md already, but it isn't, so I'm confused about both why it was lost and how it made its reappearance here.

Member

Ah, it was moved in this commit: 230e8f7 so you can just delete it here.

Contributor Author

Got it, will do in next commit.

@clauswilke
Member

Claire, just a reminder, the deadline to merge this for the 3.2.0 release is April 30. It would be great if you could complete this by this deadline. Thanks!

@clairemcwhite
Contributor Author

@clauswilke

Multiple problem aesthetics now work.

Case 1:

ggplot(mtcars, aes(x = mpg, fill = density, color = density)) + geom_tile(stat = "density") + geom_point(stat = "density")
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error: Aesthetics must be valid data columns. Problematic aesthetic(s): density, density.
Did you mistype the name of a data column or forget to add stat()?

Case 2:

ggplot(mtcars, aes(x = mpg, fill = stat(density), colour = stat(density))) + geom_tile(stat = "density") + geom_point()
Error: Aesthetics must be valid computed stats. Problematic aesthetic(s): stat(density), stat(density).
Did you map your stat in the wrong layer?

The only improvement I can think of would be to have the error include the name of the aesthetic, like: "Problematic aesthetic(s): fill = stat(density), color = stat(density)".

The variable "nondata_cols" is a vector c("fill", "color"), so there should be a quick way to construct this kind of error statement.

@clauswilke
Member

clauswilke commented Apr 27, 2019

Very good! Final set of comments, I promise. :-)

  1. To improve the error message, I think you can replace the function(x) {...} statement with:
function(x) {paste0(x, " = ", as_label(aesthetics[[x]]))}

  2. I would use the case with multiple broken aesthetics in the unit tests, and check that the aesthetics are listed correctly (after addressing my point 1).

  3. Do you need two geoms to create the multi-aesthetics case, or is one geom sufficient as long as you have aes(color = ..., fill = ...)?

  4. What happens when both types of mistakes appear in the same plot?
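
As a rough illustration of the first point, the sketch below builds the "name = expression" labels in base R. It is not the code in R/layer.r: `deparse()` stands in for as_label() so that no packages are needed, and the `aesthetics` list and `nondata_stat_cols` vector are hypothetical inputs.

```r
# Hypothetical aesthetic mappings, stored as unevaluated expressions.
aesthetics <- list(
  x = quote(mpg),
  fill = quote(stat(density)),
  colour = quote(stat(density))
)
nondata_stat_cols <- c("fill", "colour")  # assumed output of the check

# deparse() stands in for as_label() here (an assumption for this sketch).
issues <- vapply(
  nondata_stat_cols,
  function(x) paste0(x, " = ", deparse(aesthetics[[x]])),
  character(1)
)

msg <- paste0(
  "Aesthetics must be valid computed stats. Problematic aesthetic(s): ",
  paste0(issues, collapse = ", "), "."
)
print(msg)
```

Iterating over the aesthetic names, rather than over the expressions, is what makes both the name and its label available inside the function.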

@clairemcwhite
Contributor Author

clairemcwhite commented Apr 28, 2019

  1. Cool, the new function is function(x) {paste0(x, " = ", as_label(aesthetics[[x]]))}

Case 1:

ggplot(mtcars, aes(x = mpg, fill = density, color = density)) +  geom_point()
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
 Error: Aesthetics must be valid data columns. Problematic aesthetic(s): fill = density, colour = density. 
Did you mistype the name of a data column or forget to add stat()? 

Case 2:

ggplot(mtcars, aes(x = mpg, fill = stat(density), color = stat(density))) + geom_point()
 Error: Aesthetics must be valid computed stats. Problematic aesthetic(s): fill = stat(density), colour = stat(density). 
Did you map your stat in the wrong layer? 

One note is that this approach anglicizes my "color" to "colour"

  2. I changed the tests to be more like what you suggested.

  3. One geom is sufficient: "ggplot(mtcars, aes(fill = stat(density), color = stat(density))) + geom_point()" and "ggplot(mtcars, aes(fill = density, color = density)) + geom_point()" produce the two errors.

  4. You get the errors in the order in which the checks appear in layer.r. Fix the cause of the first error, and you get the second error:

ggplot(mtcars, aes(x = mpg, fill = stat(density), color = density)) + geom_point()
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
 Error: Aesthetics must be valid data columns. Problematic aesthetic(s): colour = density. 
Did you mistype the name of a data column or forget to add stat()? 
# get rid of color = density
> ggplot(mtcars, aes(x = mpg, fill = stat(density))) + geom_point()
 Error: Aesthetics must be valid computed stats. Problematic aesthetic(s): fill = stat(density). 
Did you map your stat in the wrong layer? 
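
This ordering is what one would expect when two checks run in sequence and stop() ends evaluation at the first failure. The sketch below illustrates that with a hypothetical function `build_sketch()`; it is not the code in the layer and uses shortened messages.

```r
# Hypothetical stand-in for a build step that runs both checks in order.
build_sketch <- function(bad_data_cols, bad_stat_cols) {
  if (length(bad_data_cols) > 0) {
    stop("Aesthetics must be valid data columns.", call. = FALSE)
  }
  if (length(bad_stat_cols) > 0) {
    stop("Aesthetics must be valid computed stats.", call. = FALSE)
  }
  invisible(TRUE)
}

# Both mistakes present: only the first check's error is reported.
first <- tryCatch(build_sketch("colour", "fill"), error = conditionMessage)

# After removing the data-column mistake, the second check's error appears.
second <- tryCatch(build_sketch(character(0), "fill"), error = conditionMessage)

print(first)
print(second)
```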

@clauswilke
Member

Super, thanks! Assuming the automated regression tests pass I'll merge.

@clauswilke
Member

The AppVeyor build fails, for reasons that I can't discern. It doesn't seem to have anything to do with the changes that were made in this PR. The tests that break are visual tests for the date scale.

@thomasp85, any idea what's happening? Can we merge and hope for the best?

@thomasp85
Member

thomasp85 commented Apr 29, 2019

It has been going on since 5aacacb... I think it is fair to assume it has nothing to do with this PR

If it propagates to master I'll fix it in a dedicated PR

@clauswilke merged commit 837e601 into tidyverse:master on Apr 29, 2019
@clauswilke
Member

Thanks!

@lock

lock bot commented Oct 26, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock bot locked and limited conversation to collaborators on Oct 26, 2019
Successfully merging this pull request may close these issues.

Better error messages for incorrect aesthetic mappings
5 participants