-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zfs-2.1.13 patchset #15268
zfs-2.1.13 patchset #15268
Conversation
Two problems led to unexpected behaviour of the scriptlets: 1) Newer DKMS versions change the formatting of "dkms status": (old) zfs, 2.1.2, 5.14.10-300.fc35.x86_64, x86_64: installed (new) zfs/2.1.2, 5.14.10-300.fc35.x86_64, x86_64: installed Which broke a conditional determining whether to uninstall. 2) zfs_config.h not packaged properly, but was attempted to be read in the %preun scriptlet: CONFIG_H="/var/lib/dkms/zfs/2.1.2/*/*/zfs_config.h" Which broke the uninstallation of the module, which left behind a dangling symlink, which broke DKMS entirely with this error: Error! Could not locate dkms.conf file. File: /var/lib/dkms/zfs/2.1.1/source/dkms.conf does not exist. This change attempts to simplify life by: * Avoiding parsing anything (less prone to future breakage) * Uses %posttrans instead of %post for module installation, because %post happens before %preun, while %posttrans happens afterwards * Unconditionally reinstall module on upgrade, which is less efficient but the trade-off is that it's more reliable Alternative approaches could involve fixing the existing parsing bugs or improving the logic, but this comes at the cost of complexity and possible future bugs. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jeremy Visser <[email protected]> Closes openzfs#10463 Closes openzfs#13182
Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes openzfs#13259
Given: /sbin/zfs list filling/a-zvol<TAB> -o space,refratio The rest of the cmdline gets vored by: /sbin/zfs list filling/a-zvolcannot open 'filling/a-zvol': operation not applicable to datasets of this type With -x (fragment): + COMPREPLY=($(compgen -W "$(__zfs_match_snapshot)" -- "$cur")) +++ __zfs_match_snapshot +++ local base_dataset=filling/dziadtop-nowe-duchy +++ [[ filling/dziadtop-nowe-duchy != filling/dziadtop-nowe-duchy ]] +++ [[ filling/dziadtop-nowe-duchy != '' ]] +++ __zfs_list_datasets filling/dziadtop-nowe-duchy +++ /sbin/zfs list -H -o name -s name -t filesystem -r filling/dziadtop-nowe-duchy +++ tail -n +2 cannot open 'filling/dziadtop-nowe-duchy': operation not applicable to datasets of this type +++ echo filling/dziadtop-nowe-duchy +++ echo filling/dziadtop-nowe-duchy@ ++ compgen -W 'filling/dziadtop-nowe-duchy This properly completes with: $ /sbin/zfs list filling/a-zvol<TAB> -o space,refratio filling/a-zvol filling/a-zvol@ $ /sbin/zfs list filling/a-zvol<cursor> -o space,refratio Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Backport-of: 5ece420 Closes openzfs#12820
482da24 missed arc_buf_destroy() calls on log parse errors, possibly leaking up to 128KB of memory per dataset during ZIL replay. Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc.
l2arc_evict() performs the adjustment of the size of buffers to be written on L2ARC unnecessarily. l2arc_write_size() is called right before l2arc_evict() and performs those adjustments. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes openzfs#14828
l2arc_write_size() should return the write size after adjusting for trim and overhead of the L2ARC log blocks. Also take into account the allocated size of log blocks when deciding when to stop writing buffers to L2ARC. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes openzfs#14939
While commit bcd5321 adjusts the write size based on the size of the log block, this happens after comparing the unadjusted write size to the evicted (target) size. In this case l2ad_hand will exceed l2ad_evict and violate an assertion at the end of l2arc_write_buffers(). Fix this by adding the max log block size to the allocated size of the buffer to be committed before comparing the result to the target size. Also reset the l2arc_trim_ahead ZFS module variable when the adjusted write size exceeds the size of the L2ARC device. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes openzfs#14936 Closes openzfs#14954
If this is not done, and the pool has an ashift other than the default (at the moment 9) then the following happens: 1) vdev_alloc() assigns the ashift of the pool to L2ARC device, but upon export it is not stored anywhere 2) at the first import, vdev_open() sees an vdev_ashift() of 0 and assigns the logical_ashift, which is 9 3) reading the contents of L2ARC, including the header fails 4) L2ARC buffers are not restored in ARC. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes openzfs#14313 Closes openzfs#14963
With the latest L2ARC fixes, 2 seconds is too long to wait for quiescence of arcstats like l2_size. Shorten this interval to avoid having the persistent L2ARC tests in ZTS prematurely terminated. Signed-off-by: George Amanakis <[email protected]>
The previous code was checking zfs_is_namespace_prop() only for the last property on the list. If one was not "namespace", then remount wasn't called. To fix that move zfs_is_namespace_prop() inside the loop and remount if at least one of properties was "namespace". Reviewed-by: Umer Saleem <[email protected]> Reviewed-by: Ameer Hamza <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#15000
Unlike regular receive, raw receive require destination to have the same block structure as the source. In case of dnode reclaim this triggers two special cases, requiring special handling: - If dn_nlevels == 1, we can change the ibs, but dnode_set_blksz() should not dirty the data buffer if block size does not change, or durign receive dbuf_dirty_lightweight() will trigger assertion. - If dn_nlevels > 1, we just can't change the ibs, dnode_set_blksz() would fail and receive_object would trigger assertion, so we should destroy and recreate the dnode from scratch. Reviewed-by: Paul Dagnelie <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#15039
We would see zed assert on one of our systems if we powered off a slot. Further examination showed zfs_retire_recv() was reporting a GUID of 0, which in turn would return a NULL nvlist. Add in a check for a zero GUID. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes openzfs#15084
For large JBODs the log message "zfs_iter_vdev: no match" can account for the bulk of the log messages (over 70%). Since this message is purely informational and not that useful we remove it. Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#15086 Closes openzfs#15094
Update the META file to reflect compatibility with the 6.4 kernel. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Lahfa <[email protected]> Closes openzfs#15125
On Linux, sometimes, when ZFS goes to unmount an automounted snap, it fails a VERIFY check on debug builds, because taskq_cancel_id returned ENOENT after not finding the taskq it was trying to cancel. This presumably happens when it already died for some reason; in this case, we don't really mind it already being dead, since we're just going to dispatch a new task to unmount it right after. So we just ignore it if we get back ENOENT trying to cancel here, retry a couple times if we get back the only other possible condition (EBUSY), and log to dbgmsg if we got anything but ENOENT or success. (We also add some locking around taskqid, to avoid one or two cases of two instances of trying to cancel something at once.) Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes openzfs#11632 Closes openzfs#12670
Unfortunately, even after e79b680, I still, much more rarely, tripped asserts when playing with many ctldir mounts at once. Since this appears to happen if we dispatched twice too fast, just ignore it. We don't actually need to do anything if someone already started doing it for us. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes openzfs#14462
When a kmem cache is exhausted and needs to be expanded a new slab is allocated. KM_SLEEP callers can block and wait for the allocation, but KM_NOSLEEP callers were incorrectly allowed to block as well. Resolve this by attempting an emergency allocation as a best effort. This may fail but that's fine since any KM_NOSLEEP consumer is required to handle an allocation failure. Signed-off-by: Brian Behlendorf <[email protected]> Reviewed-by: Adam Moss <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
This change was introduced in Linux commit 7ba150834b840f6f5cdd07ca69a4ccf39df59a66 Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15059
The disk_check_media_change() function was added which replaces bdev_check_media_change. This change was introduced in 6.5rc1 444aa2c58cb3b6cfe3b7cc7db6c294d73393a894 and the new function takes a gendisk* as its argument, no longer a block_device*. Thus, bdev->bd_disk is now used to pass the expected data. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15060
Additionally, the .child element of ctl_table has been removed in 6.5. This change adds a new test for the pre-6.5 register_sysctl_table() function, and uses the old code in that case. If it isn't found, then the parentage entries in the tables are removed, and the register_sysctl call is provided the paths of "kernel/spl", "kernel/spl/kmem", and "kernel/spl/kstat" directly, to populate each subdirectory over three calls, as is the new API. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15098
When disk_check_media_change() exists, then define zfs_check_media_change() to simply call disk_check_media_change() on the bd_disk member of its argument. Since disk_check_media_change() is newer than when revalidate_disk was present in bops, we should be able to safely do this via a macro, instead of recreating a new implementation of the inline function that forces revalidation. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15101
Multiple changes to the blkdev API were introduced in Linux 6.5. This includes passing (void* holder) to blkdev_put, adding a new blk_holder_ops* arg to blkdev_get_by_path, adding a new blk_mode_t type that replaces uses of fmode_t, and removing an argument from the release handler on block_device_operations that we weren't using. The open function definition has also changed to take gendisk* and blk_mode_t, so update it accordingly, too. Implement local wrappers for blkdev_get_by_path() and vdev_blkdev_put() so that the in-line calls are cleaner, and place the conditionally-compiled implementation details inside of both of these local wrappers. Both calls are exclusively used within vdev_disk.c, at this time. Add blk_mode_is_open_write() to test FMODE_WRITE / BLK_OPEN_WRITE The wrapper function is now used for testing using the appropriate method for the kernel, whether the open mode is writable or not. Emphasize fmode_t arg in zvol_release is not used Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15099
The iov_iter->iov member is now iov_iter->__iov and must be accessed via the accessor function iter_iov(). Create a wrapper that is conditionally compiled to use the access method appropriate for the target kernel version. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15100
An iov_iter_type() function to access the "type" member of the struct iov_iter was added at one point. Move the conditional logic to decide which method to use for accessing it into a macro and simplify the zpl_uio_init code. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15100
This reverts commit b35374f as there are error messages when loading the SPL module. Errors seemed to be tied to duplicate a duplicate entry. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Brian Atkinson <[email protected]> Closes openzfs#15134
Additionally, the .child element of ctl_table has been removed in 6.5. This change adds a new test for the pre-6.5 register_sysctl_table() function, and uses the old code in that case. If it isn't found, then the parentage entries in the tables are removed, and the register_sysctl call is provided the paths of "kernel/spl", "kernel/spl/kmem", and "kernel/spl/kstat" directly, to populate each subdirectory over three calls, as is the new API. Reviewed-by: Brian Atkinson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15138
…e_read The generic_file_splice_read function was removed in Linux 6.5 in favor of filemap_splice_read. Add an autoconf test for filemap_splice_read and use it if it is found as the handler for .splice_read in the file_operations struct. Additionally, ITER_PIPE was removed in 6.5. This change removes the ITER_* macros that OpenZFS doesn't use from being tested in config/kernel-vfs-iov_iter.m4. The removal of ITER_PIPE was causing the test to fail, which also affected the code responsible for setting the .splice_read handler, above. That behavior caused run-time panics on Linux 6.5. Reviewed-by: Brian Atkinson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15155
Using the filemap_splice_read function for the splice_read handler was leading to occasional data corruption under certain circumstances. Favor using copy_splice_read instead, which does not demonstrate the same erroneous behavior under the tested failure cases. Reviewed-by: Brian Atkinson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes openzfs#15164
If we fail to create a proc entry in spl_proc_init() we may end up calling unregister_sysctl_table() twice: one in the failure path of spl_proc_init() and another time during spl_proc_fini(). Avoid the double call to unregister_sysctl_table() and while at it refactor the code a bit to reduce code duplication. This was accidentally introduced when the spl code was updated for Linux 6.5 compatibility. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ameer Hamza <[email protected]> Signed-off-by: Andrea Righi <[email protected]> Closes openzfs#15234 Closes openzfs#15235
When register_sysctl_table() is unavailable we fail to properly unregister sysctl entries under "kernel/spl". This leads to errors like the following when spl is unloaded/reloaded, making impossible to properly reload the spl module: [ 746.995704] sysctl duplicate entry: /kernel/spl/kmem/slab_kvmem_total Fix by cleaning up all the sub-entries inside "kernel/spl" when the spl module is unloaded. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Signed-off-by: Andrea Righi <[email protected]> Closes openzfs#15239
Reviewed-by: Don Brady <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Serapheim Dimitropoulos <[email protected]> Closes openzfs#15220
Update the META file to reflect compatibility with the 6.5 kernel. Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]>
I built this against 6.5.3 on arch and got a crash on boot during import-cache
|
I got a similar crash today while testing with kernel 6.5.4. I do not have logs. The core dump was only on the screen before the journal started. |
@ipha thanks for reporting this - I'm able to reproduce it on Fedora 38 as well.
I'm looking into it further |
Note that the crash happened while typing |
Ok, this part of 9c8fabf fixes the crash: diff --git a/module/os/linux/zfs/zfs_debug.c b/module/os/linux/zfs/zfs_debug.c
index 98c9923d5..595806373 100644
--- a/module/os/linux/zfs/zfs_debug.c
+++ b/module/os/linux/zfs/zfs_debug.c
@@ -30,7 +30,7 @@ typedef struct zfs_dbgmsg {
procfs_list_node_t zdm_node;
uint64_t zdm_timestamp;
int zdm_size;
- char zdm_msg[1]; /* variable length allocation */
+ char zdm_msg[]; /* variable length allocation */
} zfs_dbgmsg_t;
procfs_list_t zfs_dbgmsgs;
@@ -134,7 +134,7 @@ __set_error(const char *file, const char *func, int line, int err)
void
__zfs_dbgmsg(char *buf)
{
- int size = sizeof (zfs_dbgmsg_t) + strlen(buf);
+ int size = sizeof (zfs_dbgmsg_t) + strlen(buf) + 1;
zfs_dbgmsg_t *zdm = kmem_zalloc(size, KM_SLEEP);
zdm->zdm_size = size;
zdm->zdm_timestamp = gethrestime_sec(); Let me see about pulling that patch in |
The Linux 5.16.14 kernel's coccicheck caught this. The semantic patch that caught it was: ./scripts/coccinelle/misc/flexible_array.cocci However, unlike the cases where the GNU zero length array extension had been used, coccicheck would not suggest patches for the older style single member arrays. That was good because blindly changing them would break size calculations in most cases. Therefore, this required care to make sure that we did not break size calculations. In the case of `indirect_split_t`, we use `offsetof(indirect_split_t, is_child[is->is_children])` to calculate size. This might be subtly wrong according to an old mailing list thread: https://inbox.sourceware.org/gcc-prs/[email protected]/T/ That is because the C99 specification should consider the flexible array members to start at the end of a structure, but compilers prefer to put padding at the end. A suggestion was made to allow compilers to allocate padding after the VLA like compilers already did: http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n983.htm However, upon thinking about it, whether or not we allocate end of structure padding does not matter, so using offsetof() to calculate the size of the structure is fine, so long as we do not mix it with sizeof() on structures with no array members. In the case that we mix them and padding causes offsetof(struct_t, vla_member[0]) to differ from sizeof(struct_t), we would be doing unsafe operations if we underallocate via `offsetof()` and then overcopy via sizeof(). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes openzfs#14372
0f00f5a
to
ab81714
Compare
9c8fabf is pulled in now, so the kernel bug should be gone. |
@tonyhutter you can also pull 9192ab7 to resolve the checkstyle warnings. |
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b341b20d648bb7e9a3307c33163e7399f0913e66 Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes openzfs#15282 Closes openzfs#15284
ab81714
to
15d6de0
Compare
I can not apply this patchset anymore. It always fails with the latest patch to the META file. I am applying this PR to the latest 2.1.12 tar ball: https://github.com/zfsonlinux/zfs/releases/download/zfs-2.1.12/zfs-2.1.12.tar.gz
|
@mabod I'm not really sure how you're patching, but if you want to test out this patchset I'd recommend you check it out with git:
|
META file and changelog updated. Signed-off-by: Tony Hutter <[email protected]>
15d6de0
to
eb62221
Compare
I pulled this PR as a patch like this:
and applied it as part of the PKGBUILD with I will do it now like you suggest |
Testing this on Fedora 39 and getting the following, kernel 6.5.5-300.fc39.x86_64. If it runs the same/similar test later in the loop, it's not crashing: [root@fedora zfs]# sudo ./scripts/zloop.sh warning: Can't open file /tmp/ztest.data.6xckxB (deleted) during file-backed mapping note processing warning: Section `.reg-xstate/25552' in core file too small. warning: Section `.reg-xstate/25552' in core file too small. Coredump available at https://drive.google.com/drive/folders/1l5fyl4_xefHxo0trJqBLyNMx6lU3EKsu?usp=sharing EDIT: Correction, with enough iterations it will generate the same core dump again, but only seeing it on the raidz ones, not on the draid ones. |
@seanjnkns thanks for reporting this. I should mention that Fedora 39 is set to be released Oct 17th, and hasn't had any testing against ZFS yet. zloop also is not 100% stable either. We will definitely keep an eye out for zloop failures when we start Fedora 39 testing though. |
@tonyhutter - yeah, I'm aware Fedora 39 isn't released yet, just knew that they're including the 6.5 kernel by default there and figured I could see if we were going to have issues once that's released. Also, Fedora 38 just officially released the 6.5.x kernel so I'll be testing there as well and can provide any feedback. My main setup is using Fedora 38, hence why I want to test both environments. ATM I'm running the test suite on Fedora 39 which I'll provide when it completes, then switch to the same series of testing on Fedora 38. |
Would it be possible to release a new rc for 2.2 with support for 6.5 as well? |
Motivation and Context
Support 6.5 kernel, misc bugfixes
Description
Proposed patchset for zfs-2.1.13.
How Has This Been Tested?
Built on F38 with 6.5.3.0-rc1 kernel.
Types of changes
Checklist:
Signed-off-by
.