-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
move POINT geometries out of a list column, into a matrix #2059
base: main
Are you sure you want to change the base?
Conversation
Memory size: decreases with a factor 18: library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
n = 1e6
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
st_as_sf(df, coords = c("x", "y")) |>
st_geometry() -> pts
(s1 = object.size(pts))
# 24002864 bytes
(s2 = object.size(pts[]))
# 432002312 bytes
as.numeric(s2)/as.numeric(s1)
# [1] 17.99795 |
This one is pretty obvious: library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
n = 1e6
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
st_as_sf(df, coords = c("x", "y")) |>
st_geometry() -> pts
pts |> st_is_empty() |> system.time()
# user system elapsed
# 1.692 0.072 1.764
system.time(pts0 <- pts[])
# user system elapsed
# 1.104 0.036 1.140
pts0 |> st_is_empty() |> system.time()
# user system elapsed
# 1.853 0.057 1.909
pts |> st_combine() |> system.time()
# user system elapsed
# 0.002 0.000 0.002
pts0 |> st_combine() |> system.time()
# user system elapsed
# 3.119 0.155 3.275 |
This is a very cool idea! The use of an attribute to maintain a semblance of backward compatibility with the structure is clever. I do wonder if this change is going to result in enough performance gains to justify the maintenance cost of the additional representation. My adventures playing with this branch are below...basically all the ways I know about based on the benchmarks above. While I certainly don't mind if this representation exists in sf, my first impression is that it's rather complicated and not that fast. If performance is the motivation, I wonder if broadening the sf API to work with other S3 classes (e.g., Another low-hanging performance fruit would be to drop the cached bbox and geometry classes (that's most of why the geos package is faster than sf in most cases). library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.6.0, PROJ 9.1.1; sf_use_s2() is TRUE
n = 1e5
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
st_as_sf(df, coords = c("x", "y")) |>
st_geometry() -> pts
pts0 <- pts[]
wk_xy <- wk::as_xy(df)
geos_pts <- geos::as_geos_geometry(wk_xy)
lobstr::obj_size(pts)
#> 2.40 MB
lobstr::obj_size(pts0)
#> 12.80 MB
lobstr::obj_size(wk_xy)
#> 2.40 MB
# not really accurate...only includes R allocs
lobstr::obj_size(geos_pts)
#> 7.20 MB
bench::mark(
st_is_empty(pts),
st_is_empty(pts0),
wk::wk_meta(wk_xy)$size == 0,
wk::wk_meta(geos_pts)$size == 0,
geos::geos_is_empty(geos_pts)
)
#> # A tibble: 5 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 st_is_empty(pts) 33.83ms 34.48ms 28.9 1.16MB 12.8
#> 2 st_is_empty(pts0) 46.59ms 46.8ms 21.4 1.15MB 49.8
#> 3 wk::wk_meta(wk_xy)$size == 0 2.06ms 2.25ms 421. 3.1MB 53.9
#> 4 wk::wk_meta(geos_pts)$size == 0 4.21ms 4.43ms 221. 3.06MB 31.1
#> 5 geos::geos_is_empty(geos_pts) 756.29µs 840.21µs 1180. 406.18KB 22.7
bench::mark(
st_combine(pts),
st_combine(pts0),
wk::wk_collection(wk_xy, wk::wk_geometry_type("multipoint")),
geos::geos_make_collection(geos_pts, "multipoint"),
check = FALSE
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 4 × 6
#> expression min median
#> <bch:expr> <bch:tm> <bch:tm>
#> 1 st_combine(pts) 97.99µs 100.61µs
#> 2 st_combine(pts0) 139.98ms 181.44ms
#> 3 wk::wk_collection(wk_xy, wk::wk_geometry_type("multipoint")) 1.85ms 2.03ms
#> 4 geos::geos_make_collection(geos_pts, "multipoint") 6.09ms 6.51ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the POINT object (XY and XYZ) more like sp
, but:
> data.frame(z=rnorm(3), x=1:3, y=3:1) |> st_as_sf(coords = c("x", "y")) -> x
> as(x$geometry[], "Spatial")
SpatialPoints:
coords.x1 coords.x2
[1,] 1 3
[2,] 2 2
[3,] 3 1
Coordinate Reference System (CRS) arguments: NA
> as(x$geometry, "Spatial")
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function 'coordinates' for signature '"NULL"'
> as(x, "Spatial")
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'addAttrToGeom': unable to find an inherited method for function 'coordinates' for signature '"NULL"'
> sp_x <- as(x$geometry[], "Spatial")
> st_as_sfc(sp_x)
Geometry set for 3 features
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 1 ymin: 1 xmax: 3 ymax: 3
CRS: NA
POINT (1 3)
POINT (2 2)
POINT (3 1)
> st_as_sfc(sp_x) |> unclass() |> str()
List of 3
$ : 'XY' num [1:2] 1 3
$ : 'XY' num [1:2] 2 2
$ : 'XY' num [1:2] 3 1
- attr(*, "precision")= num 0
- attr(*, "bbox")= 'bbox' Named num [1:4] 1 1 3 3
..- attr(*, "names")= chr [1:4] "xmin" "ymin" "xmax" "ymax"
- attr(*, "crs")=List of 2
..$ input: chr NA
..$ wkt : chr NA
..- attr(*, "class")= chr "crs"
- attr(*, "n_empty")= int 0
@rsbivand thanks, that should now work; I finished implementation so far only to pass |
I just encountered an other case where this pull request could provide a big speedup. When using |
... which is rather stupid (the 20 seconds), given that #2046 suggests |
Today ALTREP support for lists appeared in R SVN. Do you think it can somehow affect |
Thanks for reporting! I'm not an altrep expert, but would expect no miracles; I think it's mostly useful for (i) data structures out of memory (e.g. in a memory mapped area) or (ii) unrealised arrays (e.g. 1:1e12, where you're going to compute the value when you need it rather than realize it in memory). The first would help with massive point sets (but merely make it possible, not fast), the second doesn't help us here. The speedup of this PR (if any) is due to the flat memory model of matrices vs. the arbitrary memory layout of lists; accessing lists elements is more expensive because of that; the smaller memory footprint may be even more important. |
FWIW I actually think ALTREP for lists will be very useful here. It will let you use the data structure you've prototyped here without breaking backwards compatibility with users of the previous data structure (i.e., everybody). |
Great idea, do you know of an example of such a thing? |
I just played around with this pull request out of curiosity what the consequences would be for move2. I see substantial memory usage reductions, in the example data from 7mb to 2mb. That is great! require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
n <- 4
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
st_as_sf(df, coords = c("x", "y")) |>
st_geometry() -> pts
empty <- st_as_sfc("POINT(EMPTY)", crs = st_crs(pts))
c(pts, empty)
#> Geometry set for 5 features (with 1 geometry empty)
#> Geometry type: POINT
#> Error in a$points[i, , drop = FALSE]: subscript out of bounds
st_distance(c(pts, empty), c(empty, pts), by_element = T)
#> [1] NaN NaN NaN NaN
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.1.2 (2021-11-01)
#> os Ubuntu 22.04.2 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Amsterdam
#> date 2023-06-29
#> pandoc 2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> class 7.3-22 2023-05-03 [1] CRAN (R 4.1.2)
#> classInt 0.4-9 2023-02-28 [1] CRAN (R 4.1.2)
#> cli 3.6.1 2023-03-23 [1] CRAN (R 4.1.2)
#> DBI 1.1.3 2022-06-18 [1] CRAN (R 4.1.2)
#> digest 0.6.32 2023-06-26 [1] CRAN (R 4.1.2)
#> dplyr 1.1.2 2023-04-20 [1] CRAN (R 4.1.2)
#> e1071 1.7-13 2023-02-01 [1] CRAN (R 4.1.2)
#> evaluate 0.21 2023-05-05 [1] CRAN (R 4.1.2)
#> fansi 1.0.4 2023-01-22 [1] CRAN (R 4.1.2)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.1.2)
#> fs 1.6.2 2023-04-25 [1] CRAN (R 4.1.2)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.1.2)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.1.2)
#> htmltools 0.5.5 2023-03-23 [1] CRAN (R 4.1.2)
#> KernSmooth 2.23-21 2023-05-03 [1] CRAN (R 4.1.2)
#> knitr 1.43 2023-05-25 [1] CRAN (R 4.1.2)
#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.1.2)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.1.2)
#> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.1.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.2)
#> proxy 0.4-27 2022-06-09 [1] CRAN (R 4.1.2)
#> purrr 1.0.1 2023-01-10 [1] CRAN (R 4.1.2)
#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.1.2)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.1.2)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.1.2)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.1.2)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.2)
#> Rcpp 1.0.10 2023-01-22 [1] CRAN (R 4.1.2)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.1.2)
#> rlang 1.1.1 2023-04-28 [1] CRAN (R 4.1.2)
#> rmarkdown 2.22 2023-06-01 [1] CRAN (R 4.1.2)
#> rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.1.2)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2)
#> sf * 1.0-13 2023-06-29 [1] Github (r-spatial/sf@e3def06)
#> styler 1.10.1 2023-06-05 [1] CRAN (R 4.1.2)
#> tibble 3.2.1 2023-03-20 [1] CRAN (R 4.1.2)
#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.1.2)
#> units 0.8-2 2023-04-27 [1] CRAN (R 4.1.2)
#> utf8 1.2.3 2023-01-31 [1] CRAN (R 4.1.2)
#> vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.1.2)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.2)
#> xfun 0.39 2023-04-20 [1] CRAN (R 4.1.2)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.1.2)
#>
#> [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.1
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── |
Great example; that should run better, now. |
Thanks @edzer , I tested a bit more and noticed two other examples changing for unexpected behavior. Sorry for having these issues iteratively, the popup up in some of the unit test. Let me also know if your not looking for these detailed tests yet require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
require(dplyr)
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
n <- 4
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
dfsf<-st_as_sf(df, coords = c("x", "y"))
# The filtered data turns empty:
dfsf |> filter(rep(T,4))
#> Simple feature collection with 4 features and 1 field (with 4 geometries empty)
#> Geometry type: GEOMETRY
#> Dimension: XY
#> Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
#> CRS: NA
#> z geometry
#> 1 -0.32671172 GEOMETRYCOLLECTION EMPTY
#> 2 -0.05574357 GEOMETRYCOLLECTION EMPTY
#> 3 -1.34424667 GEOMETRYCOLLECTION EMPTY
#> 4 -0.24700773 GEOMETRYCOLLECTION EMPTY
# there is no empty point inserted
dfsf$geometry[2]<-st_point()
dfsf
#> Simple feature collection with 4 features and 1 field (with 4 geometries empty)
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 0.02291049 ymin: 0.2227787 xmax: 0.6861906 ymax: 0.4822213
#> CRS: NA
#> z geometry
#> 1 -0.29728105 POINT (0.3049286 0.4746749)
#> 2 -0.33397322 POINT (0.5350945 0.2988531)
#> 3 -0.93209304 POINT (0.6861906 0.4822213)
#> 4 0.08929689 POINT (0.02291049 0.2227787)
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.1.2 (2021-11-01)
#> os Ubuntu 22.04.2 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Amsterdam
#> date 2023-06-30
#> pandoc 2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> ...
#> dplyr * 1.1.2 2023-04-20 [1] CRAN (R 4.1.2)
#> ...
#> sf * 1.0-14 2023-06-29 [1] Github (r-spatial/sf@3ec76b8)
#> ...
#>
#> [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.1
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── |
This should now also run. |
That works for the examples above but filter fails if there is an empty location: require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
n <- 4
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
dfsf<-st_as_sf(df, coords = c("x", "y"))
dfsf$geometry[2]<-st_point()
dfsf |> dplyr::filter(rep(T,4))
#> Error in st_as_sf.data.frame(NextMethod(), coords = attr(.data, "sf_column"), : missing values in coordinates not allowed I also see a few other changes in behavior/test error but I have not been able to make small reproducible examples out of those yet |
It seems dplyr filter drops the CRS: require(dplyr)
require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
data(meuse, package = "sp")
meuse_sf = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
st_crs(meuse_sf)
#> Coordinate Reference System:
#> User input: EPSG:28992
#> wkt:
....
#> BBOX[50.75,3.2,53.7,7.22]],
#> ID["EPSG",28992]]
st_crs(dplyr::filter(meuse_sf, copper>80))
#> Coordinate Reference System: NA
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
.....
#> sf * 1.0-14 2023-07-01 [1] Github (r-spatial/sf@203187d)
.....
#>
#> [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.1
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── Created on 2023-07-02 with reprex v2.0.2 |
Hi, I have been doing some more testing. In general I use the test suite from I have found a few more cases that show some strange/undesired? behavior. require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
d<-st_as_sf(data.frame(x=c(1:4), y=c(1:3,NA)), coords=c("x","y"), na.fail = F)
st_is_empty(st_geometry(d))
#> [1] FALSE FALSE FALSE FALSE
st_is_empty(d)
#> [1] FALSE FALSE FALSE FALSE
# should be c(F,F,F, T)?, now is c(F,F,F,F)
dput(st_geometry(d))
#> structure(list(NULL, NULL, NULL, NULL), points = structure(c(1,
#> 2, 3, 4, 1, 2, 3, NA), dim = c(4L, 2L)), points_dim = "XY", n_empty = 0L, precision = 0, crs = structure(list(
#> input = NA_character_, wkt = NA_character_), class = "crs"), bbox = structure(c(xmin = 1,
#> ymin = 1, xmax = 4, ymax = 3), class = "bbox"), class = c("sfc_POINT",
#> "sfc"))
dput(st_geometry(d[1:4,]))
#> structure(list(structure(c(1, 1), class = c("XY", "POINT", "sfg"
#> )), structure(c(2, 2), class = c("XY", "POINT", "sfg")), structure(c(3,
#> 3), class = c("XY", "POINT", "sfg")), structure(c(4, NA), class = c("XY",
#> "POINT", "sfg"))), n_empty = 0L, precision = 0, crs = structure(list(
#> input = NA_character_, wkt = NA_character_), class = "crs"), bbox = structure(c(xmin = 1,
#> ymin = 1, xmax = 4, ymax = 3), class = "bbox"), class = c("sfc_POINT",
#> "sfc"))
# sub setting directly reduces to the old structure, keeping the matrix might be nicer and makes sure the benefits are throughout scripts
e<-st_as_sf(data.frame(x=c(1:3, NA), y=c(1:3,NA)), coords=c("x","y"), na.fail = F)
st_is_empty(st_geometry(e))
#> [1] FALSE FALSE FALSE TRUE
# works correctly
st_is_empty(st_geometry(dplyr::filter(e, rep(T,4))))
#> Error in scan(text = lst[[length(lst)]], quiet = TRUE): scan() expected 'a real', got 'ParseException:'
#> Error in (function (msg) : ParseException: Unexpected EOF parsing WKB
# fails, one difference I see is that a Z dimension is added to points_dim.
sessioninfo::session_info(pkgs = "attached")
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.2 (2023-10-31)
#> os Ubuntu 22.04.3 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Amsterdam
#> date 2024-01-22
#> pandoc 3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> sf * 1.0-14 2024-01-22 [1] Github (r-spatial/sf@d38b3dc)
#>
#> [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.3
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── Created on 2024-01-22 with reprex v2.0.2 |
Another small change/corner case casting and empty require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
st_cast(st_sfc(),"POINT")
#> Error in cls[2, ]: incorrect number of dimensions
sessioninfo::session_info("attached")
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.2 (2023-10-31)
#> os Ubuntu 22.04.3 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Amsterdam
#> date 2024-01-23
#> pandoc 3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> sf * 1.0-14 2024-01-22 [1] Github (r-spatial/sf@d38b3dc)
#>
#> [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.3
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── This used to work fine: require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
st_cast(st_sfc(),"POINT")
#> Geometry set for 0 features
#> Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
#> CRS: NA
sessioninfo::session_info("attached")
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.2 (2023-10-31)
#> os Ubuntu 22.04.3 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Amsterdam
#> date 2024-01-23
#> pandoc 3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> sf * 1.0-15 2023-12-18 [1] CRAN (R 4.3.2)
#>
#> [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.3
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── Created on 2024-01-23 with reprex v2.0.2 |
Great to see progress! @edzer am I right to notice that this resolves the cast error I posted today but does not address the remarks from yesterday? |
Additionally I found this issue with replacing values: require(sf)
d<-st_as_sf(data.frame(x=1:3,y=1:3), coords = c("x","y"))
e<-st_as_sf(data.frame(x=4:5,y=4:5), coords = c("x","y"))
# Works correctly if both are of the list type:
replace(st_geometry(d)[T,], c(F,T,T), st_geometry(e)[T,])
#> Geometry set for 3 features
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 1 ymin: 1 xmax: 5 ymax: 5
#> CRS: NA
#> POINT (1 1)
#> POINT (4 4)
#> POINT (5 5)
# However not of the replacement is to the matrix type:
replace(st_geometry(d)[T,], c(F,T,T), st_geometry(e))
#> Geometry set for 3 features (with 2 geometries empty)
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 1 ymin: 1 xmax: 1 ymax: 1
#> CRS: NA
#> POINT (1 1)
#> POINT EMPTY
#> POINT EMPTY
# Replace is equivalent to:
dd<-st_geometry(d)[T,]
ee<-st_geometry(e)
dd[c(F,T,T)]<-ee
dd
#> Geometry set for 3 features (with 2 geometries empty)
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 1 ymin: 1 xmax: 1 ymax: 1
#> CRS: NA
#> POINT (1 1)
#> POINT EMPTY
#> POINT EMPTY
sessioninfo::package_info("attached")
#> package * version date (UTC) lib source
#> sf * 1.0-16 2024-01-23 [1] Github (r-spatial/sf@4ce501e) |
Fix value replacement for sfc_POINT
ensure vctrs operations work
This PR moves
POINT
geometries out of the list column, leaves a list column filled withNULL
values in place, and moves the point coordinates in ann
x 2 (or 3 for XYM/XYZ, or 4 for XYZM) matrix attribute calledpoints
. The class dim label (XY/XYZ/XYM/XYZM) is put in attributepoints_dim
. This happens when:st_as_sf.data.frame
is called with e.g.coords
specifying the coordinate columssfc
, i.e. when coming back fromst_read()
or any of the GEOS functions.Right now,
a[]
restores the original list ofsfg
objects:This may give some performance changes, both in memory footprint and runtime; still need to do benchmarks.
Comments welcome!