Return Tables instead of NamedTuples. #59
I also refactored the canopy/ground switches for ICESat-2 and GEDI, by just calling the underlying methods multiple times (one time for ground, one time for canopy, if both are enabled). This significantly reduces boilerplate and possible bugs. |
Failing tests are unrelated (one slow download, nightly fails on HDF5, for which an issue has been made). |
Working my way through breaking changes. I had an internal function:
"""
points_plus(granule::ICESat_Granule{}; extent::Extent = world)
returns the ICESat granule *WITH* granule information for each track
"""
function points_plus(
granule::ICESat_Granule{};
extent::Extent = world,
)
ptsplus = merge(points(granule, bbox = extent), (; granule_info = granule))
return ptsplus
end
Since points now returns a table, I need to append granule_info to it. Do you know of an easy way to append a new column to a table? Thanks. |
I'll add a merge method to Table so this can keep working. Note that it's a SpaceLiDAR-specific table; it's not a Table from Tables.jl, but it does implement the Tables.jl interface. I'm not sure if your code ever worked for ICESat-2 and GEDI, as those returned a vector of NamedTuples (now it returns a PartitionedTable). Also, wouldn't it make more sense to store the granule info as metadata (depending on your output format)? I wrote points with the intent to have a table of equal-length columns and simple types. Storing a non-vector custom type goes against that, and can cause some headaches. |
I also had a method for ICESat2 and GEDI:
function points_plus(
granule::Union{GEDI_Granule{}, ICESat2_Granule{}};
extent::Extent = world,
)
ptsplus = merge.(points(granule, bbox = extent), Ref((; granule_info = granule)))
return ptsplus
end |
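As an aside on the broadcast above, here is a minimal sketch of why `Ref` is needed when merging one extra field into every per-track NamedTuple. The `tracks` and `granule` values are hypothetical stand-ins, not SpaceLiDAR objects; only Base Julia is used.

```julia
# Hypothetical stand-ins for the per-track results of `points` (one NamedTuple per track).
tracks = [
    (longitude = [1.0, 2.0], height = Float32[10.0, 11.0]),
    (longitude = [3.0], height = Float32[12.0]),
]

# Hypothetical stand-in for a granule object.
granule = (id = "example.h5", cycle = 1)

# `Ref` makes the extra NamedTuple behave as a scalar in the broadcast,
# so the same field is merged into every element of `tracks`.
with_info = merge.(tracks, Ref((; granule_info = granule)))

println(keys(with_info[1]))             # (:longitude, :height, :granule_info)
println(with_info[2].granule_info.id)   # example.h5
```

Without the `Ref`, the call would fail, because Base reserves broadcasting over a NamedTuple and throws an `ArgumentError`.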
Can you comment on how you use the granule info? For example, you could further explode it, or store it as metadata in arrow. We could make it default? I now recover similar info from the filename (my files only have the extensions renamed), which also isn't ideal. |
The way I've set up my pipeline is to segment the data by "geotiles", that is, X by X degree geographic extents that contain all mission data within the X degree bounding box. Data is extracted from the raw files using This approach is working well, but I will eventually overhaul the whole pipeline so that each row of a dataframe contains a single point. When I do this I will make heavy use of FillArrays. |
The biggest bottleneck for my global processing pipeline is file I/O. This is why I've moved to implementing |
I've added |
Did you find other instances where this PR broke your code? I will hold off on releasing a new version until I've got this compatible with EarthData and have some version of HDF5Tables in. |
using SpaceLiDAR
g = ICESat_Granule{:GLAH06}("GLAH06_634_1102_001_0073_1_01_0001.H5", "/Users/gardnera/data/icesat/GLAH06/034/raw/GLAH06_634_1102_001_0073_1_01_0001.H5", (type=:GLAH06, phase=1, rgt=0, instance=73, cycle=1, segment=1, version=1, revision=634), Vector{Vector{Vector{Float64}}}[])
tbl = merge(points(geotile_info), (; g))
DataFrame(tbl) results in a I think in this case we want the length of :ICESat_Granule to be 1 so that the named tuple occupies its own table cell |
Yeah, that won't work. Defining granule as something to iterate on (which also requires a getindex on it) will lead to ERROR: DimensionMismatch: column :longitude has length 729117 and column :g has length 1. But the error comes because DataFrame wants matching length columns, your points_plus function never did that, right? So |
You're correct, apologies. The exact code that I'm trying to get working again is:
if mission == :ICESat
df = DataFrame(points_plus.(row.granules, extent =row.extent));
elseif mission == :ICESat2 || mission == :GEDI
df = reduce(vcat, (DataFrame.(points_plus.(row.granules, extent = row.extent))));
end
With this PR I can create the tables:
using SpaceLiDAR
using DataFrames
g = [ICESat_Granule{:GLAH06}("GLAH06_634_1102_001_0073_1_01_0001.H5", "/Users/gardnera/data/icesat/GLAH06/034/raw/GLAH06_634_1102_001_0073_1_01_0001.H5", (type = :GLAH06, phase = 1, rgt = 0, instance = 73, cycle = 1, segment = 1, version = 1, revision = 634), Vector{Vector{Vector{Float64}}}[]),
ICESat_Granule{:GLAH06}("GLAH06_634_1102_001_0073_2_01_0001.H5", "/Users/gardnera/data/icesat/GLAH06/034/raw/GLAH06_634_1102_001_0073_2_01_0001.H5", (type = :GLAH06, phase = 1, rgt = 0, instance = 73, cycle = 1, segment = 2, version = 1, revision = 634), Vector{Vector{Vector{Float64}}}[])]
tbl = merge.(points.(g), [(; granule_info = f) for f in g])
but I'm unable to make a DataFrame where each row makes up a single granule (ICESat) or beam (GEDI & ICESat2). This makes sense, as points now returns a table where each observation occupies a single row. Your new approach is absolutely the way to go, but I need some way to provide backwards compatibility. To make my code work I need a way to reverse the Fill arrays and table properties so that I can treat a SpaceLiDAR Table as the original NamedTuple. |
As I mentioned before, I have been meaning to refactor my code to move away from storing each granule as its own row. Maybe this PR will force me to finally make the change. It's a fairly major change on my end as everything is built around rows = granules. The one thing that is really lacking when moving from What do you think is the best path forward? As of now |
Maybe something changed in DataFrames over time in terms of automatic repeating non-iterable column values? I've added julia> SL.points(g)
SpaceLiDAR Table with 6 partitions
julia> parent(SL.points(g))
6-element Vector{NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :phr, :sensitivity, :scattered, :saturated, :clouds, :track, :strong_beam, :classification, :height_reference, :detector_id, :reflectance, :nphotons), Tuple{Vector{Float32}, Vector{Float32}, Vector{Float32}, Vector{Float32}, ...
If that doesn't work, code along the following lines would do the trick:
reduce(vcat, DataFrame.(Pair.(:track, SL.points(g)), :granule=>g))
0.019452 seconds (2.51 k allocations: 3.784 MiB)
6×2 DataFrame
Row │ track granule
│ NamedTup… ICESat2_…
─────┼──────────────────────────────────────────────────────────────────────
1 │ (longitude = Float32[117.077, 11… ICESat2_Granule{:ATL08}("ATL08_2…
2 │ (longitude = Float32[117.078, 11… ICESat2_Granule{:ATL08}("ATL08_2…
3 │ (longitude = Float32[117.112, 11… ICESat2_Granule{:ATL08}("ATL08_2…
4 │ (longitude = Float32[117.096, 11… ICESat2_Granule{:ATL08}("ATL08_2…
5 │ (longitude = Float32[117.137, 11… ICESat2_Granule{:ATL08}("ATL08_2…
6 │ (longitude = Float32[117.134, 11… ICESat2_Granule{:ATL08}("ATL08_2…
I think metadata is something we should support, as it could be passed if you save granules individually, and it will work with most IO (Arrow, GeoDataFrames), so you would get it back after you open the file again. But indeed, if you concatenate Tables further, the metadata will be lost. I save each granule with the same filename as the HDF5 (just with a different extension), so the unique id of the granule is preserved. From the filename, all granule information can be restored (except for the polygon, which we get from search). But logic in filenames is a bit frowned upon, so another solution is to make a String column SL.icesat2_info(fn)
(type = :ATL08, date = Dates.DateTime("2020-08-12T23:54:29"), rgt = 742, cycle = 8, segment = 14, version = 6, revision = 1, ascending = true, descending = false) If you want, you could even store these attributes as their own (Fill) columns. We just need a |
If we went down the path of storing |
Empty granules return a |
@evetion once the empty granules issue is fixed I should be able to use |
Adding the metadata to the table is not breaking on top of this. Adding extra columns might be a grey area, but for performance I would rather not have the extra columns that I don't need (yet), and instead make it easy to add them. Besides, I think doing a Fill(basename(granule.id)) probably requires less column overhead/memory than all the exploded attributes? If you also need it quick as a vector, you could use InlineStrings? |
So the id of the granule alone takes less memory than the exploded info NamedTuple. The InlineString takes just as much memory (a String63 occupies 64 bytes).
julia> info(g)
(type = :ATL08, date = Dates.DateTime("2020-08-12T23:54:29"), rgt = 742, cycle = 8, segment = 14, version = 5, revision = 1, ascending = true, descending = false)
julia> sizeof(info(g))
64
julia> sizeof(g.id)
39
julia> typeof(InlineString(g.id))
String63 |
You didn't specify which product(s) has this problem, but I think I fixed this for GLAH06. Let me know if I missed one. |
Ok, last big change. I've added support for metadata, and included functions to add either id or the granule info to the tables. DataAPI metadata support. Arrow.jl will get support for it soon: apache/arrow-julia#481, so Tables/DataFrames saved with Arrow will retain their metadata. julia> g = SL.granule(fn)
ICESat2_Granule{:ATL08}("ATL08_20200812235429_07420814_005_01.h5", "/Users/evetion/Downloads/ATL08_20200812235429_07420814_005_01.h5", (type = :ATL08, date = Dates.DateTime("2020-08-12T23:54:29"), rgt = 742, cycle = 8, segment = 14, version = 5, revision = 1, ascending = true, descending = false), Vector{Vector{Vector{Float64}}}[])
julia> t = points(g)
SpaceLiDAR Table with 6 partitions
julia> DataAPI.metadata(t)
Dict{String, Any} with 10 entries:
"cycle" => 8
"descending" => false
"revision" => 1
"segment" => 14
"id" => "ATL08_20200812235429_07420814_005_01.h5"
"rgt" => 742
"date" => DateTime("2020-08-12T23:54:29")
"ascending" => true
"type" => :ATL08
"version" => 5
julia> df = DataFrame(t)
julia> DataAPI.metadata(df) == DataAPI.metadata(t) # metadata is propagated.
Furthermore, I've included the following functions, which should help your workflow.
julia> t = SL.add_info(t) # adds multiple columns from `info(g)`
julia> t = SL.add_id(t) # adds :id column
julia> t[1].id
10509-element Fill{String}, with entries equal to ATL08_20200812235429_07420814_005_01.h5
julia> t[end].revision # info fields are added to all tracks
5180-element Fill{Int64}, with entries equal to 1 |
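To illustrate the idea behind constant (Fill) columns, the following is a Base-only sketch. `ConstVector` and `add_id_sketch` are hypothetical names invented for this example; they are not the FillArrays or SpaceLiDAR implementations, and the partitions are made-up data. The point is that a constant column can report the same length as the data columns while storing only one value.

```julia
# Hypothetical lazy constant vector: stores one value plus a length
# (a simplified stand-in for what FillArrays.Fill provides).
struct ConstVector{T} <: AbstractVector{T}
    value::T
    len::Int
end
Base.size(v::ConstVector) = (v.len,)
function Base.getindex(v::ConstVector, i::Int)
    checkbounds(v, i)   # throws a BoundsError for out-of-range indices
    return v.value
end

# Hypothetical helper: append a constant :id column to every partition,
# sized to match the first column of that partition.
function add_id_sketch(partitions, id)
    return [merge(p, (; id = ConstVector(id, length(first(p))))) for p in partitions]
end

# Made-up partitions (one NamedTuple of equal-length columns per track).
partitions = [
    (longitude = [117.077, 117.078], height = Float32[10.5, 11.0]),
    (longitude = [117.112], height = Float32[9.8]),
]

tagged = add_id_sketch(partitions, "ATL08_20200812235429_07420814_005_01.h5")
println(length(tagged[1].id))   # 2
println(tagged[2].id[1])        # ATL08_20200812235429_07420814_005_01.h5
```

Because every column in a partition then has the same length, the result still satisfies the "equal-length columns and simple types" goal stated earlier in the thread.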
Looks like there is an issue with subsetting:
using Extents, DataFrames, SpaceLiDAR, Dates
g = ICESat2_Granule{:ATL06}("ATL06_20181201095523_09740106_005_01.h5", "/Users/gardnera/data/icesat2/ATL06/005/raw/ATL06_20181201095523_09740106_005_01.h5", (type=:ATL06, date=Dates.DateTime("2018-12-01T09:55:23"), rgt=974, cycle=1, segment=6, version=5, revision=1, ascending=false, descending=true), Vector{Vector{Vector{Float64}}}[]);
points(g) works great, but
ext = Extent(X = (-128.0, -126.0), Y = (50.0, 52.0))
points(g, bbox=ext) return this error: ERROR: MethodError: no method matching SpaceLiDAR.PartitionedTable(::Tuple{NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{Dates.DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{Dates.DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{Dates.DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{Dates.DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, 
Vector{Float64}, Vector{Dates.DateTime}, Vector{Bool}, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float64}, Vector{Dates.DateTime}, Vector{Bool}, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}})
Closest candidates are:
SpaceLiDAR.PartitionedTable(::NamedTuple)
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/granule.jl:169
SpaceLiDAR.PartitionedTable(::Tuple{Vararg{NamedTuple{K, V}, N}}) where {N, K, V}
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/granule.jl:167
Stacktrace:
[1] points(granule::ICESat2_Granule{:ATL06}; tracks::NTuple{6, String}, step::Int64, bbox::Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}})
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL06.jl:49
[2] points
@ ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL06.jl:24 [inlined]
[3] points_plus(granule::ICESat2_Granule{:ATL06}; extent::Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}})
@ Altim ~/Documents/GitHub/Altim.jl/src/utilities.jl:103
[4] points_plus
@ ~/Documents/GitHub/Altim.jl/src/utilities.jl:97 [inlined]
[5] #43
@ ./broadcast.jl:1297 [inlined]
[6] _broadcast_getindex_evalf
@ ./broadcast.jl:683 [inlined]
[7] _broadcast_getindex
@ ./broadcast.jl:656 [inlined]
[8] _getindex
@ ./broadcast.jl:680 [inlined]
[9] _broadcast_getindex
@ ./broadcast.jl:655 [inlined]
[10] getindex
@ ./broadcast.jl:610 [inlined]
[11] copyto_nonleaf!(dest::Vector{DataFrame}, bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Tuple{Base.OneTo{Int64}}, Type{DataFrame}, Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, Base.Broadcast.var"#43#44"{Base.Pairs{Symbol, Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}}, Tuple{Symbol}, NamedTuple{(:extent,), Tuple{Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}}}}}, typeof(points_plus)}, Tuple{Base.Broadcast.Extruded{Vector{ICESat2_Granule{:ATL06}}, Tuple{Bool}, Tuple{Int64}}}}}}, iter::Base.OneTo{Int64}, state::Int64, count::Int64)
@ Base.Broadcast ./broadcast.jl:1068
[12] copy
@ ./broadcast.jl:920 [inlined]
[13] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, Type{DataFrame}, Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, Base.Broadcast.var"#43#44"{Base.Pairs{Symbol, Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}}, Tuple{Symbol}, NamedTuple{(:extent,), Tuple{Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}}}}}, typeof(points_plus)}, Tuple{Vector{ICESat2_Granule{:ATL06}}}}}})
@ Base.Broadcast ./broadcast.jl:873
[14] geotile_build(geotile_granules::DataFrame, geotile_dir::String, mission::Symbol; warnings::Bool)
@ Altim ~/Documents/GitHub/Altim.jl/src/utilities.jl:531
[15] top-level scope
@ ~/Documents/GitHub/Altim.jl/src/geotile_build_archive.jl:71
ERROR: MethodError: no method matching points(::String)
Closest candidates are:
points(::ICESat2_Granule{:ATL03}; tracks, step, bbox)
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL03.jl:26
points(::ICESat2_Granule{:ATL03}, ::HDF5.H5DataStore, ::AbstractString, ::Float64)
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL03.jl:89
points(::ICESat2_Granule{:ATL03}, ::HDF5.H5DataStore, ::AbstractString, ::Float64, ::Any)
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL03.jl:89
...
Stacktrace:
[1] top-level scope
@ ~/Documents/GitHub/Altim.jl/src/geotile_build_archive.jl:87
ERROR: UndefVarError: `Dates` not defined
Stacktrace:
[1] top-level scope
@ ~/Documents/GitHub/Altim.jl/src/geotile_build_archive.jl:86
SpaceLiDAR Table with 6 partitions
ERROR: MethodError: no method matching points(::ICESat2_Granule{:ATL06}; extent::Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}})
Closest candidates are:
points(::ICESat2_Granule{:ATL06}; tracks, step, bbox) got unsupported keyword argument "extent"
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL06.jl:24
points(::ICESat2_Granule{:ATL06}, ::HDF5.H5DataStore, ::AbstractString, ::Float64) got unsupported keyword argument "extent"
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL06.jl:52
points(::ICESat2_Granule{:ATL06}, ::HDF5.H5DataStore, ::AbstractString, ::Float64, ::Any) got unsupported keyword argument "extent"
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL06.jl:52
...
Stacktrace:
[1] kwerr(::NamedTuple{(:extent,), Tuple{Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}}}}, ::Function, ::ICESat2_Granule{:ATL06})
@ Base ./error.jl:165
[2] top-level scope
@ ~/Documents/GitHub/Altim.jl/src/geotile_build_archive.jl:87
ERROR: MethodError: no method matching SpaceLiDAR.PartitionedTable(::Tuple{NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float64}, Vector{DateTime}, Vector{Bool}, 
FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float64}, Vector{DateTime}, Vector{Bool}, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}})
Closest candidates are:
SpaceLiDAR.PartitionedTable(::NamedTuple)
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/granule.jl:169
SpaceLiDAR.PartitionedTable(::Tuple{Vararg{NamedTuple{K, V}, N}}) where {N, K, V}
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/granule.jl:167
Stacktrace:
[1] points(granule::ICESat2_Granule{:ATL06}; tracks::NTuple{6, String}, step::Int64, bbox::Extent{(:X, :Y), Tuple{Tuple{Float64, Float64}, Tuple{Float64, Float64}}})
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/ra53x/src/ICESat-2/ATL06.jl:49
[2] top-level scope
@ ~/Documents/GitHub/Altim.jl/src/geotile_build_archive.jl:87 |
Are you sure you checked out this branch, including the latest commits? The type signature of your methods is old and I can't replicate over here. |
I ext = Extent{(:X, :Y),Tuple{Tuple{Float64,Float64},Tuple{Float64,Float64}}}((X=(-128.0, -126.0), Y=(50.0, 52.0)))
g = ICESat2_Granule{:ATL06}("ATL06_20181201095523_09740106_005_01.h5", "/Users/gardnera/data/icesat2/ATL06/005/raw/ATL06_20181201095523_09740106_005_01.h5", (type=:ATL06, date=DateTime("2018-12-01T09:55:23"), rgt=974, cycle=1, segment=6, version=5, revision=1, ascending=false, descending=true), Vector{Vector{Vector{Float64}}}[])
points(g, bbox=ext) results in: ERROR: MethodError: no method matching SpaceLiDAR.PartitionedTable(::Tuple{NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float64}, Vector{DateTime}, 
Vector{Bool}, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}, NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float64}, Vector{DateTime}, Vector{Bool}, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}}, ::ICESat2_Granule{:ATL06})
Closest candidates are:
SpaceLiDAR.PartitionedTable(::Tuple{Vararg{NamedTuple{K, V}, N}}, ::G) where {N, K, V, G}
@ SpaceLiDAR ~/.julia/packages/SpaceLiDAR/tJtlT/src/granule.jl:170 |
Thanks, that error message does correspond with the latest changes. I think I fixed it; the problem was in the empty defaults, where we had a Float64[] instead of the Float32[] used by the non-empty data.
ERROR: MethodError: no method matching # --> scroll to the right a bit
SpaceLiDAR.PartitionedTable(::Tuple{
NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}},
NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}},
NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}},
NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float32}, Vector{DateTime}, BitVector, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}},
NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float64}, Vector{DateTime}, Vector{Bool}, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}},
NamedTuple{(:longitude, :latitude, :height, :height_error, :datetime, :quality, :track, :strong_beam, :detector_id, :height_reference), Tuple{Vector{Float64}, Vector{Float64}, Vector{Float32}, Vector{Float64}, Vector{DateTime}, Vector{Bool}, FillArrays.Fill{String, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Bool, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Int8, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float32}}}}, ::ICESat2_Granule{:ATL06}) |
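The MethodError above follows from how a signature like `Tuple{Vararg{NamedTuple{K, V}, N}}` dispatches: every partition must have exactly the same names and column types. A minimal Base-only sketch, where `same_schema` is a hypothetical stand-in for such a constructor and the column values are made up:

```julia
# Hypothetical stand-in for a constructor that only accepts a tuple of
# partitions that all share one identical NamedTuple type.
same_schema(::Tuple{Vararg{NamedTuple{K, V}, N}}) where {N, K, V} = true
same_schema(::Tuple) = false   # fallback for mixed partition types

full      = (height = Float32[1.0, 2.0], height_error = Float32[0.1, 0.2])
empty_ok  = (height = Float32[], height_error = Float32[])
empty_bad = (height = Float32[], height_error = Float64[])   # wrong element type

println(same_schema((full, empty_ok)))    # true
println(same_schema((full, empty_bad)))   # false
```

In the error message, the last two partitions carry `Vector{Float64}` for `height_error` (and `Vector{Bool}` instead of `BitVector` for `quality`), which is the kind of mismatch this sketch reproduces.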
That seems to have fixed it. |
I'll test this today |
These are fantastic. One issue is that if julia> DataFrame(t)
6×11 DataFrame
Row │ longitude latitude height height_error datetime quality track strong_beam detector_id height_reference granule_info
│ Array… Array… Array… Array… Array… BitVector Fill… Fill… Fill… Array… ICESat2_Gra…
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ Float64[] Float64[] Float32[] Float32[] DateTime[] Bool[] Fill("gt1l", 0) Fill(false, 0) Fill(6, 0) Float32[] ICESat2_Granule{:ATL06}("ATL06_2…
2 │ Float64[] Float64[] Float32[] Float32[] DateTime[] Bool[] Fill("gt1r", 0) Fill(true, 0) Fill(5, 0) Float32[] ICESat2_Granule{:ATL06}("ATL06_2…
3 │ Float64[] Float64[] Float32[] Float32[] DateTime[] Bool[] Fill("gt2l", 0) Fill(false, 0) Fill(4, 0) Float32[] ICESat2_Granule{:ATL06}("ATL06_2…
4 │ Float64[] Float64[] Float32[] Float32[] DateTime[] Bool[] Fill("gt2r", 0) Fill(true, 0) Fill(3, 0) Float32[] ICESat2_Granule{:ATL06}("ATL06_2…
5 │ Float64[] Float64[] Float32[] Float32[] DateTime[] Bool[] Fill("gt3l", 0) Fill(false, 0) Fill(2, 0) Float32[] ICESat2_Granule{:ATL06}("ATL06_2…
6 │ Float64[] Float64[] Float32[] Float32[] DateTime[] Bool[] Fill("gt3r", 0) Fill(true, 0) Fill(1, 0) Float32[] ICESat2_Granule{:ATL06}("ATL06_2…
This behavior made it easy to keep track of which granules had been searched. Not sure if you have any clever ideas on how to handle this; I suspect that with the updated implementation (single observation per row) this becomes difficult. |
Cool, in that case I will merge this! Regarding per row saving, I save to a single file per granule. So I can just check the Note that I will probably not release this as a version immediately (unless you want me to), as I would like to refactor the search (to use |
Great work! This is shaping up nicely |
See #61 for the workflow discussion. |
Also fixes #50
This changes the return types of points and lines to Table or PartitionedTable, both of which implement the Tables interface. Technically breaking, but the tests are not broken, so I think the impact in practice is negligible. Old patterns like reduce(vcat, DataFrame.(points(g))) still work, but DataFrame(points(g)) is now possible, shorter and faster.
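Conceptually, turning a partitioned result into one flat table means concatenating the partitions column by column. The sketch below shows that idea with plain NamedTuples in Base Julia; it is an illustration with made-up data, not how SpaceLiDAR or DataFrames implement it.

```julia
# Two made-up partitions (tracks) with identical column names.
parts = [
    (longitude = [117.077, 117.078], height = Float32[10.5, 11.0]),
    (longitude = [117.112], height = Float32[9.8]),
]

# `map` over NamedTuples with the same names applies `vcat` field by field,
# so reducing over the partitions yields one NamedTuple of concatenated columns.
flat = reduce((a, b) -> map(vcat, a, b), parts)

println(flat.longitude)      # [117.077, 117.078, 117.112]
println(length(flat.height)) # 3
```

A NamedTuple of equal-length vectors like `flat` is one of the simplest shapes a column table can take, with one observation per row.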