Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for Float16 #341

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
70 changes: 59 additions & 11 deletions src/HDF5.jl
Original file line number Diff line number Diff line change
Expand Up @@ -222,6 +222,16 @@ const H5T_NATIVE_DOUBLE = read_const(:H5T_NATIVE_DOUBLE_g)
# Library versions
const H5F_LIBVER_EARLIEST = 0
const H5F_LIBVER_LATEST = 1
# Constructed types (occurs at runtime)
function make_float16()
FLOAT16 = h5t_copy(H5T_NATIVE_FLOAT)
h5t_set_fields(FLOAT16, 15, 10, 5, 0, 10)
h5t_set_size(FLOAT16, 2)
h5t_set_ebias(FLOAT16, 15)
h5t_lock(FLOAT16)
return FLOAT16
end
# const H5T_FLOAT16 = make_float16() (in `__init__()`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I don't really understand why this has to be in __init__ do you think you could please explain this to me?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed for the same reason that H5open() is called inside __init__: make_float16() is allocating new data structures inside the underlying hdf5 library and must be called each time the library is loaded.

Actually it's interesting that the rest of the code assumes the value of globals like H5T_NATIVE_FLOAT_g will be consistent between different runs of the julia program. Looking at the C library, these globals are allocated inside H5open(), so it seems this isn't really guaranteed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'm wondering of a way to trigger a bug to show that.
In matlab.jl we have references https://github.com/JuliaInterop/MATLAB.jl/blob/master/src/init.jl
that are filled in with the pointers in inside __init__ ref https://github.com/JuliaInterop/MATLAB.jl/blob/master/src/MATLAB.jl#L53
we might have to do the same here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea if we can trigger a bug/make a test for this... would it happen if the HDF5 binary library was updated without this package being precompiled again? And yes, to me, filling references at run-time during __init__ seems the safest possible thing to do.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To trigger this possible bug, it looks like you'd need to first load a library which creates a custom id dynamically (through a call to H5Iregister() I guess?), before H5open() is called. I don't know whether the libhdf5 API contract prevents this from happening.

After that, load HDF5.jl into the same process and the ids may be different from when HDF5.jl was loaded in build.jl.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems quite possible. Another library (e.g. PETSc) could depend on HDF5_jll so that HDF5_jll is loaded and initialized before HDF5.jl is initialized. It would be quite difficult to trigger this in a test case though.


# Object reference types
immutable HDF5ReferenceObj
Expand All @@ -238,11 +248,12 @@ hdf5_type_id(::Type{Int32}) = H5T_NATIVE_INT32
hdf5_type_id(::Type{UInt32}) = H5T_NATIVE_UINT32
hdf5_type_id(::Type{Int64}) = H5T_NATIVE_INT64
hdf5_type_id(::Type{UInt64}) = H5T_NATIVE_UINT64
#hdf5_type_id(::Type{Float16}) = H5T_FLOAT16 (in `__init__()`)
hdf5_type_id(::Type{Float32}) = H5T_NATIVE_FLOAT
hdf5_type_id(::Type{Float64}) = H5T_NATIVE_DOUBLE
hdf5_type_id(::Type{HDF5ReferenceObj}) = H5T_STD_REF_OBJ

const HDF5BitsKind = Union{Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Float32, Float64}
const HDF5BitsKind = Union{Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Float16, Float32, Float64}
const HDF5Scalar = Union{HDF5BitsKind, HDF5ReferenceObj}
const ScalarOrString = Union{HDF5Scalar, String}

Expand All @@ -256,6 +267,7 @@ const hdf5_type_map = Dict(
(H5T_INTEGER, H5T_SGN_NONE, convert(Csize_t, 2)) => UInt16,
(H5T_INTEGER, H5T_SGN_NONE, convert(Csize_t, 4)) => UInt32,
(H5T_INTEGER, H5T_SGN_NONE, convert(Csize_t, 8)) => UInt64,
(H5T_FLOAT, nothing, convert(Csize_t, 2)) => Float16,
(H5T_FLOAT, nothing, convert(Csize_t, 4)) => Float32,
(H5T_FLOAT, nothing, convert(Csize_t, 8)) => Float64,
)
Expand Down Expand Up @@ -1788,17 +1800,30 @@ function hdf5_to_julia_eltype(objtype)
error("character set ", cset, " not recognized")
end
elseif class_id == H5T_INTEGER || class_id == H5T_FLOAT
native_type = h5t_get_native_type(objtype.id)
try
native_size = h5t_get_size(native_type)
if class_id == H5T_INTEGER
is_signed = h5t_get_sign(native_type)
else
is_signed = nothing
# First look in the type list for a match
# otherwise fall back to a native datatype
# Allows for users to dynamically add types to the typemap
t_size = h5t_get_size(objtype)
if class_id == H5T_INTEGER
is_signed = h5t_get_sign(objtype)
else
is_signed = nothing # probably should include the mantissa size, etc...
end
if haskey(hdf5_type_map, (class_id, is_signed, t_size))
T = hdf5_type_map[(class_id, is_signed, t_size)]
else
native_type = h5t_get_native_type(objtype.id)
try
native_size = h5t_get_size(native_type)
if class_id == H5T_INTEGER
is_signed = h5t_get_sign(native_type)
else
is_signed = nothing
end
T = hdf5_type_map[(class_id, is_signed, native_size)]
finally
h5t_close(native_type)
end
T = hdf5_type_map[(class_id, is_signed, native_size)]
finally
h5t_close(native_type)
end
elseif class_id == H5T_ENUM
super_type = h5t_get_super(objtype.id)
Expand Down Expand Up @@ -1982,7 +2007,10 @@ for (jlname, h5name, outtype, argtypes, argsyms, msg) in
(:h5s_select_hyperslab, :H5Sselect_hyperslab, Herr, (Hid, Cint, Ptr{Hsize}, Ptr{Hsize}, Ptr{Hsize}, Ptr{Hsize}), (:dspace_id, :seloper, :start, :stride, :count, :block), "Error selecting hyperslab"),
(:h5t_commit, :H5Tcommit2, Herr, (Hid, Ptr{UInt8}, Hid, Hid, Hid, Hid), (:loc_id, :name, :dtype_id, :lcpl_id, :tcpl_id, :tapl_id), "Error committing type"),
(:h5t_close, :H5Tclose, Herr, (Hid,), (:dtype_id,), "Error closing datatype"),
(:h5t_lock, :H5Tlock, Herr, (Hid,), (:dtype_id,), "Error locking datatype"),
(:h5t_set_cset, :H5Tset_cset, Herr, (Hid, Cint), (:dtype_id, :cset), "Error setting character set in datatype"),
(:h5t_set_ebias, :H5Tset_ebias, Herr, (Hid, Csize_t), (:dtype_id, :ebias), "Error setting exponential bias of floating-point type"),
(:h5t_set_fields, :H5Tset_fields, Herr, (Hid, Csize_t, Csize_t, Csize_t, Csize_t, Csize_t), (:dtype_id, :spos, :epos, :esize, :mpos, :msize), "Error setting floating-point type fields"),
(:h5t_set_size, :H5Tset_size, Herr, (Hid, Csize_t), (:dtype_id, :sz), "Error setting size of datatype"),
)

Expand Down Expand Up @@ -2068,6 +2096,7 @@ for (jlname, h5name, outtype, argtypes, argsyms, ex_error) in
(:h5t_get_array_ndims, :H5Tget_array_ndims, Cint, (Hid,), (:dtype_id,), :(error("Error getting ndims of array"))),
(:h5t_get_class, :H5Tget_class, Cint, (Hid,), (:dtype_id,), :(error("Error getting class"))),
(:h5t_get_cset, :H5Tget_cset, Cint, (Hid,), (:dtype_id,), :(error("Error getting character set encoding"))),
(:h5t_get_ebias, :H5Tget_ebias, Csize_t, (Hid,), (:dtype_id,), :(error("Error getting exponential bias"))),
(:h5t_get_member_class, :H5Tget_member_class, Cint, (Hid, Cuint), (:dtype_id, :index), :(error("Error getting class of compound datatype member #", index))),
(:h5t_get_member_index, :H5Tget_member_index, Cint, (Hid, Ptr{UInt8}), (:dtype_id, :membername), :(error("Error getting index of compound datatype member \"", membername, "\""))),
(:h5t_get_member_offset, :H5Tget_member_offset, Csize_t, (Hid, Cuint), (:dtype_id, :index), :(error("Error getting offset of compound datatype member #", index))),
Expand Down Expand Up @@ -2158,6 +2187,21 @@ function h5s_get_simple_extent_dims(space_id::Hid)
h5s_get_simple_extent_dims(space_id, dims, maxdims)
return tuple(reverse!(dims)...), tuple(reverse!(maxdims)...)
end
function h5t_get_fields(type_id::Hid)
spos = Ref{Csize_t}()
epos = Ref{Csize_t}()
esize = Ref{Csize_t}()
mpos = Ref{Csize_t}()
msize = Ref{Csize_t}()
herr = ccall((:H5Tget_fields, libhdf5),
Herr,
(Hid, Ptr{Csize_t}, Ptr{Csize_t}, Ptr{Csize_t}, Ptr{Csize_t}, Ptr{Csize_t}),
type_id, spos, epos, esize, mpos, msize)
if herr < 0
error("Error getting fields of floating-point datatype")
end
return (spos[], epos[], esize[], mpos[], msize[])
end
function h5t_get_member_name(type_id::Hid, index::Integer)
pn = ccall((:H5Tget_member_name, libhdf5),
Ptr{UInt8},
Expand Down Expand Up @@ -2388,6 +2432,10 @@ function __init__()
UTF8_ATTRIBUTE_PROPERTIES[] = p_create(H5P_ATTRIBUTE_CREATE)
h5p_set_char_encoding(UTF8_ATTRIBUTE_PROPERTIES[].id, H5T_CSET_UTF8)

# Set up Float16 (must occur at runtime)
@eval(const H5T_FLOAT16 = make_float16())
@eval hdf5_type_id(::Type{Float16}) = H5T_FLOAT16

rehash!(hdf5_type_map, length(hdf5_type_map.keys))
rehash!(hdf5_prop_get_set, length(hdf5_prop_get_set.keys))

Expand Down
8 changes: 8 additions & 0 deletions test/plain.jl
Original file line number Diff line number Diff line change
Expand Up @@ -332,4 +332,12 @@ using Compat.String
str = read(fid["test"])
@test str == "Hello World"
end

# Test Float16 support
arr_float16 = Float16[1.0, 0.25, 0.5, 8.0]
h5write("test_float16.h5", "x", arr_float16)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should clean this file up when the test is done, otherwise stale results from previous runs could give misleading results

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it more robust to check for existence and delete it before the test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if you want it to stick around afterwards for debugging, maybe? But then it sticks around in the package dir (I think that's where Pkg.test will usually run from). It's not checked in so it wouldn't make the package dirty in the git sense, but it would be an untracked file.

Would be consistent with the earlier tests to put it under joinpath(tmpdir, "test_float16.h5") and move the rm(tmpdir, recursive=true) to after this test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

arr_float16_2 = h5read("test_float16.h5", "x")
@test isa(arr_float16_2, Vector{Float16})
@test arr_float16_2 == arr_float16

end