Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: sf-ify regular layers #3655

Closed
wants to merge 4 commits into from
Closed

Conversation

clauswilke
Copy link
Member

First attempt at implementing the idea proposed in #3654.

library(ggplot2)

nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

df <- data.frame(
  lat = 35.76667,
  long = -78.63333,
  name = "Raleigh"
)

# doesn't work, non-sf layers are not transformed
ggplot(df, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  geom_point() +
  geom_text(aes(label = name), hjust = -.1, vjust = 1.1) +
  coord_sf(crs = 2264)

# with_sf() fixes it
ggplot(df, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  with_sf(geom_point()) +
  with_sf(geom_text(aes(label = name), hjust = -.1, vjust = 1.1)) +
  coord_sf(crs = 2264)

Created on 2019-12-04 by the reprex package (v0.3.0)

@paleolimbot
Copy link
Member

In ggspatial I implemented this as a Stat...not sure if that's the best way to go, but it might help to look at the code here: https://github.com/paleolimbot/ggspatial/blob/master/R/geom-spatial.R#L162 . Usage is for that is more stat-like:

library(ggplot2)

nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

df <- data.frame(
  lat = 35.76667,
  long = -78.63333,
  name = "Raleigh"
)

# as a stat
ggplot(df, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  geom_point(stat = ggspatial:::StatSpatialIdentity) +
  geom_text(aes(label = name), stat = ggspatial:::StatSpatialIdentity, hjust = -.1, vjust = 1.1) +
  coord_sf(crs = 2264)
#> Assuming crs = 4326 in stat_spatial_identity()
#> Assuming crs = 4326 in stat_spatial_identity()

# as with_sf()
with_sf <- function(layer, crs = NULL) {
  layer$stat <- ggspatial:::create_spatial_stat_class(
    layer$stat, 
    ggplot2:::snake_class(layer$stat)[1]
  )
  
  layer$stat$crs <- crs
  layer
}

ggplot(df, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  with_sf(geom_point()) +
  with_sf(geom_text(aes(label = name), hjust = -.1, vjust = 1.1)) +
  coord_sf(crs = 2264)
#> Assuming crs = 4326 in stat_identity()
#> Assuming crs = 4326 in stat_identity()

Created on 2019-12-04 by the reprex package (v0.2.1)

@markpayneatwork
Copy link
Contributor

One thing I can see that is missing from the with_sf() approach is the ability to tell with_sf what the default actually is. Maybe this could be supplied as an argument?

@clauswilke
Copy link
Member Author

@paleolimbot Since we don't have a good way of chaining stats, I'm not sure that using a stat just for the purpose of a coordinate transformation is the right way to go. Also not sure how that would play out with geoms such as geom_hline() and geom_vline() which need to know about the extent of the coord. But I'd be interested to hear other opinions. @yutannihilation @thomasp85?

@markpayneatwork Yes, of course. This is a work in progress. Source crs can be specified now.

library(ggplot2)

nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

df <- data.frame(
  lat = 35.76667,
  long = -78.63333,
  name = "Raleigh"
)

ggplot(df, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  with_sf(
    geom_point(),
    geom_text(aes(label = name), hjust = -.1, vjust = 1.1)
  ) +
  coord_sf(crs = 2264)

# transformed data
df_trans <- data.frame(
  lat = 734165.51156997,
  long = 2108774.28357641,
  name = "Raleigh"
)

ggplot(df_trans, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  with_sf(
    geom_point(),
    geom_text(aes(label = name), hjust = -.1, vjust = 1.1),
    crs = 2264
  ) +
  coord_sf(crs = 2264)

Created on 2019-12-04 by the reprex package (v0.3.0)

@clauswilke
Copy link
Member Author

Thinking some more about this, I don't think a stat can do this.

library(ggplot2)

nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

ggplot(nc) +
  geom_sf(size = 0.1, fill = "white") +
  with_sf(
    geom_hline(yintercept = 34:36),
    geom_vline(xintercept = c(-84, -80, -76))
  ) +
  coord_sf(crs = 2264)

Created on 2019-12-04 by the reprex package (v0.3.0)

@clauswilke
Copy link
Member Author

Or this.

library(ggplot2)
library(readr)
library(ggforce)

nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

df <- read_csv(file = "name,population,lat,long
Charlotte,827097,35.227,-80.843
Raleigh,451066,35.772,-78.639
Greensboro,285342,36.073,-79.792
Durham,257636,35.994,-78.899
Winston-Salem,241218,36.1,-80.244")

ggplot(df, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  with_sf(
    geom_point(),
    geom_mark_hull()
  ) +
  coord_sf(crs = 2264)

Created on 2019-12-04 by the reprex package (v0.3.0)

@clauswilke
Copy link
Member Author

Fix #3653.

library(ggplot2)
library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

ggplot() +
  with_sf(
    annotation_map(map_data("state"), fill="antiquewhite", colour="darkgrey")
  ) +
  geom_sf(data = nc) +
  coord_sf(crs = st_crs(3347))

Created on 2019-12-04 by the reprex package (v0.3.0)

@microly

This comment has been minimized.

@clauswilke

This comment has been minimized.

@clauswilke
Copy link
Member Author

@hadley Are you Ok with this addition to the grammar?

@microly

This comment has been minimized.

@clauswilke

This comment has been minimized.

@microly

This comment has been minimized.

@yutannihilation
Copy link
Member

yutannihilation commented Dec 5, 2019

Is it possible to let CoordSf transform all data in this way if the user wants it? For example:

ggplot(df, aes(x = long, y = lat)) +
  geom_sf(data = nc, size = 0.1, fill = "white", inherit.aes = FALSE) +
  geom_point() +
  geom_text(aes(label = name), hjust = -.1, vjust = 1.1) +
  coord_sf(crs = 2264, transform_non_sf_layers = TRUE)

I feel with_*() syntax is a bit overspec when we need just a single modification; I think the main advantage of with_*() is that it can be nested to express the series of modifications (e.g. with_blur(with_sobel(...))), but it doesn't seem the case to me.

(I don't mean to say this is unacceptable, but it's probably depends on how common this with_*() syntax will be and I just don't imagine the future yet.)

@clauswilke
Copy link
Member Author

@yutannihilation The problem I see is that the transform() function of coord_sf() doesn't know anything about the layer that calls it, and therefore doesn't know what to assume about the coordinates coming in:

ggplot2/R/coord-sf.R

Lines 43 to 58 in 913e936

transform = function(self, data, panel_params) {
data[[ geom_column(data) ]] <- sf_rescale01(
data[[ geom_column(data) ]],
panel_params$x_range,
panel_params$y_range
)
# Assume x and y supplied directly already in common CRS
data <- transform_position(
data,
function(x) sf_rescale01_x(x, panel_params$x_range),
function(x) sf_rescale01_x(x, panel_params$y_range)
)
transform_position(data, squish_infinite, squish_infinite)
},

So you'd have to first transform all incoming sf layers to long-lat or similar in setup_data(), and then transform again to the target crs in transform(). But you also have to transform all incoming sf layers to the target crs at some point before transform() is called, because you need to figure out the limits of the plot. So you're going from one transformation per sf layer to three. I don't know if this can be avoided.

@yutannihilation
Copy link
Member

because you need to figure out the limits of the plot

If we assume most of the usage of non-sf layers are for annotation, this step might not be necessary, but your explanation makes sense, thanks. Hmm...

@clauswilke
Copy link
Member Author

Another way to think about the problem: The current coord_sf() is essentially a Cartesian coordinate system operating in projected coordinates. To have regular geoms work natively with longitude/latitude data, we'd have to implement a non-Cartesian coordinate system operating in non-projected coordinates. These two coordinate systems are sufficiently different that we'd end up with two coords, e.g. coord_sf() and coord_sf2() (or whatever better name somebody can come up with). While this is an approach that could be implemented, I'm not sure it's less confusing.

@paleolimbot
Copy link
Member

Just a quick vote that I think coord_sf(crs = ..., default.crs = crs) would be the best option. It could return a different coordinate system object if crs != default.crs, or return a classed object implementing ggplot_add() that with_sf()s all non-sf layers and returns the modified plot.

@clauswilke
Copy link
Member Author

Thanks for the suggestion, Dewey. This may work, though I foresee some issues with limits. In what CRS would they be specified? Do we need a third parameter, limits.crs, that defines the limits CRS to be used? If limits are specified in long-lat, do we run into problems with non-rectangular areas and/or areas that cross the international dateline?

@paleolimbot
Copy link
Member

It's a hard one for sure...I think it would be impossible for scales to be in anything except final projected coordinates, although the coord xlim/ylim could be anything (probably more useful in mapping anyway, since limiting might result in weird clipping for anything except points). I always assumed this was in the datum CRS, but I've never tried.

@hadley
Copy link
Member

hadley commented Dec 5, 2019

IIRC my original intention was for other layers to be automatically transformed, assuming that the coordinates were specified in lat and long, and this is supported by one of the examples:

ggplot(nc) +
  geom_sf() +
  annotate("point", x = -80, y = 35, colour = "red", size = 4)

I'd really prefer if we could perform this transformation automatically — but I don't know enough about the constraints to know whether or not this is possible.

@clauswilke
Copy link
Member Author

@hadley I think the main constraint is the API: We'd have to add an argument to the transform() function of coord_sf(), so that the calling layers can specify what CRS they are assuming. The least invasive change would be to require the layers that don't want to be transformed to announce so. This would only require changes in geom_sf() and related geoms. It's a breaking change though for any downstream packages that have defined their own sf-related geoms.

@clauswilke
Copy link
Member Author

Oh, and it's also a breaking change for any code currently in existence that mixes geom_sf() with other geoms using pre-transformed data, though I'm not sure whether people commonly do this.

@hadley
Copy link
Member

hadley commented Dec 5, 2019

@clauswilke I think those breakages should be minor, and would be worth it (esp since I'm pretty sure that this is what I always intended)

@clauswilke
Copy link
Member Author

@hadley Ok, let me take a stab at this and then we can see what it looks like.

@clauswilke
Copy link
Member Author

@hadley It turns out changing the transform() API doesn't work after all, because geoms don't in general know what data they are drawing. E.g., in stat_sf_coordinates(geom = "point"), the geom has no idea it is working on properly transformed sf data.

Thus, the only other alternative I see is to transform everything to a common CRS first and then transform a second time right before drawing. Maybe I'll take a stab at that to see where it leads.

@microly
Copy link
Contributor

microly commented Dec 6, 2019

What is about that with_sf works with data? Like:

ggplot() +
geom_sf(data = nc) +
geom_point(aes(x = long, y = lat), data = with_sf(df, crs = the_crs_of_df))

In essence, it is a issue about transforming data, so we can only wrap the data...
Please forgive me if this is a foolish idea...

@paleolimbot
Copy link
Member

Would it be possible to use a scale_geometry() to train sf-enabled positions (which have a CRS attached)? I think that Coords can modify scales before they're trained, so CoordSf could set the target CRS in the geometry scale. Then, when setting the plot limits (also done by CoordSf it could combine the info from both types of scales. Not sure this solves all of the problems, but it's really hard to break the "coords/scales don't know about layers" issue without undoing the nice orthogonal pieces of the grammar of graphics thing.

Also, I think stat_sf_coordinates() has enough information to backtransform to the CoordSf's default.crs, which would limit transforming twice to that particular type of layer.

@clauswilke
Copy link
Member Author

Thanks @paleolimbot, I think your comment just cut the Gordian knot. We can continue treating geometry objects as we currently do, no change required, and we adopt the convention that regular position data is in long-lat (to be configurable in coord_sf()). Problems arise when geometry data is converted into regular position data, as stat_sf_coordinates() does, but it can just transform the geometry data as a first step. The final open question is whether we can get coord_sf() and stat_sf_coordinates() to talk to each other, so one can know what the other's default crs is, but worst case scenario both would take a default crs argument.

@paleolimbot
Copy link
Member

I'm glad!

I think in Stat$compute_layer() you have access to the layout, which has access to the Coord. Is that what you were looking for?

@clauswilke
Copy link
Member Author

@paleolimbot See #3659.

@clauswilke
Copy link
Member Author

Closing this as it has been superseded by #3659.

@clauswilke clauswilke closed this Apr 30, 2020
@clauswilke clauswilke deleted the with-sf branch April 30, 2020 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants