This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

Pull in bpf/for-next #169

Merged
merged 10,000 commits into from
Mar 29, 2024
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Mar 21, 2024

  1. Merge tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/l…

    …inux/kernel/git/rw/ubifs
    
    Pull UBI and UBIFS updates from Richard Weinberger:
     "UBI:
       - Add Zhihao Cheng as reviewer
       - Attach via device tree
       - Add NVMEM layer
       - Various fastmap related fixes
    
      UBIFS:
       - Add Zhihao Cheng as reviewer
       - Convert to folios
       - Various fixes (memory leaks in error paths, function prototypes)"
    
    * tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (34 commits)
      mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems
      mtd: ubi: provide NVMEM layer over UBI volumes
      mtd: ubi: populate ubi volume fwnode
      mtd: ubi: introduce pre-removal notification for UBI volumes
      mtd: ubi: attach from device tree
      mtd: ubi: block: use notifier to create ubiblock from parameter
      dt-bindings: mtd: ubi-volume: allow UBI volumes to provide NVMEM
      dt-bindings: mtd: add basic bindings for UBI
      ubifs: Queue up space reservation tasks if retrying many times
      ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path
      ubifs: dbg_check_idx_size: Fix kmemleak if loading znode failed
      ubi: Correct the number of PEBs after a volume resize failure
      ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130
      ubi: correct the calculation of fastmap size
      ubifs: Remove unreachable code in dbg_check_ltab_lnum
      ubifs: fix function pointer cast warnings
      ubifs: fix sort function prototype
      ubi: Check for too small LEB size in VTBL code
      MAINTAINERS: Add Zhihao Cheng as UBI/UBIFS reviewer
      ubifs: Convert populate_page() to take a folio
      ...
    torvalds committed Mar 21, 2024
    85a7912
  2. Merge tag 'siox/for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/ukleinek/linux
    
    Pull siox updates from Uwe Kleine-König:
     "This reworks how siox device registration works yielding a saner API.
    
      This allows us to simplify the gpio bus driver using two new devm
      functions"
    
    * tag 'siox/for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
      siox: bus-gpio: Simplify using devm_siox_* functions
      siox: Provide a devm variant of siox_master_register()
      siox: Provide a devm variant of siox_master_alloc()
      siox: Don't pass the reference on a master in siox_master_register()
    torvalds committed Mar 21, 2024
    0045341
  3. Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

    Cross-merge networking fixes after downstream PR.
    
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 21, 2024
    537c2e9
  4. Merge tag 'drm-misc-next-fixes-2024-03-21' of https://gitlab.freedesk…

    …top.org/drm/misc/kernel into drm-next
    
    Short summary of fixes pull:
    
    core:
    - fix rounding in drm_fixp2int_round()
    
    bridge:
    - fix documentation for DRM_BRIDGE_OP_EDID
    
    nouveau:
    - don't check devinit disable on GSP
    
    sun4i:
    - fix 64-bit division on 32-bit architectures
    
    tests:
    - fix dependency on DRM_KMS_HELPER
    
    Signed-off-by: Dave Airlie <[email protected]>
    
    From: Thomas Zimmermann <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    airlied committed Mar 21, 2024
    921074a

Commits on Mar 22, 2024

  1. Merge tag 'rtc-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/…

    …abelloni/linux
    
    Pull RTC updates from Alexandre Belloni:
     "Subsystem:
       - rtc_class is now const
    
      Drivers:
       - ds1511: cleanup, set date and time range and alarm offset limit
       - max31335: fix interrupt handler
       - pcf8523: improve suspend support"
    
    * tag 'rtc-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (28 commits)
      MAINTAINER: Include linux-arm-msm for Qualcomm RTC patches
      dt-bindings: rtc: zynqmp: Add support for Versal/Versal NET SoCs
      rtc: class: make rtc_class constant
      dt-bindings: rtc: abx80x: Improve checks on trickle charger constraints
      MAINTAINERS: adjust file entry in ARM/Mediatek RTC DRIVER
      rtc: nct3018y: fix possible NULL dereference
      rtc: max31335: fix interrupt status reg
      rtc: mt6397: select IRQ_DOMAIN instead of depending on it
      dt-bindings: rtc: abx80x: convert to yaml
      rtc: m41t80: Use the unified property API get the wakeup-source property
      dt-bindings: at91rm9260-rtt: add sam9x7 compatible
      dt-bindings: rtc: convert MT7622 RTC to the json-schema
      dt-bindings: rtc: convert MT2717 RTC to the json-schema
      rtc: pcf8523: add suspend handlers for alarm IRQ
      rtc: ds1511: set alarm offset limit
      rtc: ds1511: set range
      rtc: ds1511: drop inline/noinline hints
      rtc: ds1511: rename pdata
      rtc: ds1511: implement ds1511_rtc_read_alarm properly
      rtc: ds1511: remove partial alarm support
      ...
    torvalds committed Mar 22, 2024
    3faae16
  2. Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/…

    …git/arm64/linux
    
    Pull arm64 fixes from Catalin Marinas:
    
     - Re-instate the CPUMASK_OFFSTACK option for arm64 when NR_CPUS > 256.
       The bug that led to the initial revert was the cpufreq-dt code not
       using zalloc_cpumask_var().
    
     - Make the STARFIVE_STARLINK_PMU config option depend on 64BIT to
       prevent compile-test failures on 32-bit architectures due to missing
       writeq().
    
    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
      perf: starfive: fix 64-bit only COMPILE_TEST condition
      ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
    torvalds committed Mar 22, 2024
    661dc19
  3. Merge tag 'amd-drm-fixes-6.9-2024-03-21' of https://gitlab.freedeskto…

    …p.org/agd5f/linux into drm-next
    
    amd-drm-fixes-6.9-2024-03-21:
    
    amdgpu:
    - Freesync fixes
    - UAF IOCTL fixes
    - Fix mmhub client ID mapping
    - IH 7.0 fix
    - DML2 fixes
    - VCN 4.0.6 fix
    - GART bind fix
    - GPU reset fix
    - SR-IOV fix
    - OD table handling fixes
    - Fix TA handling on boards without display hardware
    - DML1 fix
    - ABM fix
    - eDP panel fix
    - DPPCLK fix
    - HDCP fix
    - Revert incorrect error case handling in ioremap
    - VPE fix
    - HDMI fixes
    - SDMA 4.4.2 fix
    - Other misc fixes
    
    amdkfd:
    - Fix duplicate BO handling in process restore
    
    Signed-off-by: Dave Airlie <[email protected]>
    
    From: Alex Deucher <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    airlied committed Mar 22, 2024
    cafd86c
  4. Merge tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm…

    …/kernel
    
    Pull drm fixes from Dave Airlie:
     "Fixes from the last week (or 3 weeks in amdgpu case), after amdgpu,
      it's xe and nouveau then a few scattered core fixes.
    
      core:
       - fix rounding in drm_fixp2int_round()
    
      bridge:
       - fix documentation for DRM_BRIDGE_OP_EDID
    
      sun4i:
       - fix 64-bit division on 32-bit architectures
    
      tests:
       - fix dependency on DRM_KMS_HELPER
    
      probe-helper:
       - never return negative values from .get_modes() plus driver fixes
    
      xe:
       - invalidate userptr vma on page pin fault
       - fail early on sysfs file creation error
       - skip VMA pinning on xe_exec if no batches
    
      nouveau:
       - clear bo resource bus after eviction
       - documentation fixes
       - don't check devinit disable on GSP
    
      amdgpu:
       - Freesync fixes
       - UAF IOCTL fixes
       - Fix mmhub client ID mapping
       - IH 7.0 fix
       - DML2 fixes
       - VCN 4.0.6 fix
       - GART bind fix
       - GPU reset fix
       - SR-IOV fix
       - OD table handling fixes
       - Fix TA handling on boards without display hardware
       - DML1 fix
       - ABM fix
       - eDP panel fix
       - DPPCLK fix
       - HDCP fix
       - Revert incorrect error case handling in ioremap
       - VPE fix
       - HDMI fixes
       - SDMA 4.4.2 fix
       - Other misc fixes
    
      amdkfd:
       - Fix duplicate BO handling in process restore"
    
    * tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel: (50 commits)
      drm/amdgpu/pm: Don't use OD table on Arcturus
      drm/amdgpu: drop setting buffer funcs in sdma442
      drm/amd/display: Fix noise issue on HDMI AV mute
      drm/amd/display: Revert Remove pixle rate limit for subvp
      Revert "drm/amdgpu/vpe: don't emit cond exec command under collaborate mode"
      Revert "drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()"
      drm/amd/display: Add a dc_state NULL check in dc_state_release
      drm/amd/display: Return the correct HDCP error code
      drm/amd/display: Implement wait_for_odm_update_pending_complete
      drm/amd/display: Lock all enabled otg pipes even with no planes
      drm/amd/display: Amend coasting vtotal for replay low hz
      drm/amd/display: Fix idle check for shared firmware state
      drm/amd/display: Update odm when ODM combine is changed on an otg master pipe with no plane
      drm/amd/display: Init DPPCLK from SMU on dcn32
      drm/amd/display: Add monitor patch for specific eDP
      drm/amd/display: Allow dirty rects to be sent to dmub when abm is active
      drm/amd/display: Override min required DCFCLK in dml1_validate
      drm/amdgpu: Bypass display ta if display hw is not available
      drm/amdgpu: correct the KGQ fallback message
      drm/amdgpu/pm: Check the validity of overdiver power limit
      ...
    torvalds committed Mar 22, 2024
    7ee0490
  5. Merge tag '6.9-rc-smb3-client-fixes-part2' of git://git.samba.org/sfr…

    …ench/cifs-2.6
    
    Pull smb client fixes from Steve French:
    
     - Various get_inode_info fixes
    
     - Fix for querying xattrs of cached dirs
    
     - Four minor cleanup fixes (including adding some header corrections
       and a missing flag)
    
     - Performance improvement for deferred close
    
     - Two query interface fixes
    
    * tag '6.9-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
      smb311: additional compression flag defined in updated protocol spec
      smb311: correct incorrect offset field in compression header
      cifs: Move some extern decls from .c files to .h
      cifs: remove redundant variable assignment
      cifs: fixes for get_inode_info
      cifs: open_cached_dir(): add FILE_READ_EA to desired access
      cifs: reduce warning log level for server not advertising interfaces
      cifs: make sure server interfaces are requested only for SMB3+
      cifs: defer close file handles having RH lease
    torvalds committed Mar 22, 2024
    8e938e3
  6. binfmt: replace deprecated strncpy

    strncpy() is deprecated for use on NUL-terminated destination strings
    [1] and as such we should prefer more robust and less ambiguous string
    interfaces.
    
    There is a _nearly_ identical implementation of fill_psinfo present in
    binfmt_elf.c -- except that one uses get_task_comm over strncpy(). Let's
    mirror that in binfmt_elf_fdpic.c
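    
    To illustrate the ambiguity, here is a minimal user-space sketch. The
    helper name copy_bounded() is hypothetical and is not the kernel's
    get_task_comm(); it only shows a copy that always NUL-terminates the
    destination, whereas strncpy() leaves the destination unterminated
    whenever the source is at least as long as the buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only (not a kernel API).
 * Copies at most dst_size - 1 bytes from src and always NUL-terminates dst.
 * Returns the number of bytes copied, excluding the terminator.
 */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    if (dst_size == 0)
        return 0;

    n = strlen(src);
    if (n >= dst_size)
        n = dst_size - 1;   /* truncate, leaving room for the NUL */

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

    With a 4-byte buffer and the source "abcdefgh", this helper stores "abc"
    plus a terminator, so later string functions cannot run past the buffer.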
    
    Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
    Link: KSPP/linux#90
    Cc:  <[email protected]>
    Signed-off-by: Justin Stitt <[email protected]>
    Link: https://lore.kernel.org/r/20240321-strncpy-fs-binfmt_elf_fdpic-c-v2-1-0b6daec6cc56@google.com
    Signed-off-by: Kees Cook <[email protected]>
    JustinStitt authored and kees committed Mar 22, 2024
    5248f40
  7. x86/kexec: Do not update E820 kexec table for setup_data

    crashkernel reservation failed on a Thinkpad t440s laptop recently.
    Actually the memblock reservation succeeded, but later insert_resource()
    failed.
    
    Test steps:
      kexec load -> /* make sure add crashkernel param eg. crashkernel=160M */
        kexec reboot ->
            dmesg|grep "crashkernel reserved";
                crashkernel memory range like below reserved successfully:
                  0x00000000d0000000 - 0x00000000da000000
            But no such "Crash kernel" region in /proc/iomem
    
    The background story:
    
    Currently the E820 code reserves setup_data regions for both the current
    kernel and the kexec kernel, and it inserts them into the resources list.
    
    Before the kexec kernel reboots nobody passes the old setup_data, and
    kexec only passes fresh SETUP_EFI/SETUP_IMA/SETUP_RNG_SEED if needed.
    Thus the old setup data memory is not used at all.
    
    Because the old kernel also updates the kexec e820 table, the kexec kernel
    sees them as E820_TYPE_RESERVED_KERN regions, and later the old setup_data
    regions are inserted into the resources list in the kexec kernel by
    e820__reserve_resources().
    
    Note that because no setup_data is passed in for those old regions, they
    are not reserved early (by early_reserve_memory()), so the crashkernel
    memblock reservation treats them as usable memory and can reserve a
    crashkernel region that overlaps the old setup_data regions. As in the bug
    I noticed here, the kdump insert_resource() call then fails because
    e820__reserve_resources() has already added the overlapping chunks to
    /proc/iomem.
    
    Finally, looking at the code, the old setup_data regions are not used
    at all as no setup_data is passed in by the kexec boot loader. Although
    something like SETUP_PCI etc could be needed, kexec should pass
    the info as new setup_data so that kexec kernel can take care of them.
    This should be taken care of in other separate patches if needed.
    
    Thus drop the useless buggy code here.
    
    Signed-off-by: Dave Young <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Cc: Jiri Bohac <[email protected]>
    Cc: Eric DeVolder <[email protected]>
    Cc: Baoquan He <[email protected]>
    Cc: Ard Biesheuvel <[email protected]>
    Cc: Kees Cook <[email protected]>
    Cc: "Kirill A. Shutemov" <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    daveyoung authored and Ingo Molnar committed Mar 22, 2024
    fc7f27c
  8. nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet

    syzbot reported the following uninit-value access issue [1][2]:
    
    nci_rx_work() parses and processes received packet. When the payload
    length is zero, each message type handler reads uninitialized payload
    and KMSAN detects this issue. The receipt of a packet with a zero-size
    payload is considered unexpected, and therefore, such packets should be
    silently discarded.
    
    This patch resolves the issue by checking the payload size before calling
    each message type handler.
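    
    The check can be sketched in user-space C as follows. This is not the
    patched kernel code: rx_packet(), SKETCH_HDR_SIZE and the result enum are
    hypothetical names, the sketch assumes a 3-byte header whose third byte
    carries the payload length, and the extra truncation check is an addition
    made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed header size for this sketch: 3 bytes, payload length in byte 2. */
#define SKETCH_HDR_SIZE 3u

enum rx_result { RX_DROPPED = 0, RX_HANDLED = 1 };

/*
 * Hypothetical receive path: validate the packet before any handler
 * is allowed to read the payload bytes.
 */
enum rx_result rx_packet(const uint8_t *pkt, size_t len)
{
    uint8_t plen;

    if (len < SKETCH_HDR_SIZE)
        return RX_DROPPED;          /* not even a full header */

    plen = pkt[2];
    if (plen == 0)
        return RX_DROPPED;          /* zero-size payload: silently discard */

    if (len - SKETCH_HDR_SIZE < plen)
        return RX_DROPPED;          /* sketch-only: header claims more than we hold */

    /* A real handler would now read pkt[SKETCH_HDR_SIZE .. SKETCH_HDR_SIZE + plen). */
    return RX_HANDLED;
}
```

    Doing the length check once, before dispatch, means no individual handler
    can be reached with a payload it would read uninitialized.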
    
    Fixes: 6a2968a ("NFC: basic NCI protocol implementation")
    Reported-and-tested-by: [email protected]
    Reported-and-tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=7ea9413ea6749baf5574 [1]
    Closes: https://syzkaller.appspot.com/bug?extid=29b5ca705d2e0f4a44d2 [2]
    Signed-off-by: Ryosuke Yasuoka <[email protected]>
    Reviewed-by: Jeremy Cline <[email protected]>
    Reviewed-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    YsuOS authored and davem330 committed Mar 22, 2024
    d24b035
  9. x86/pm: Work around false positive kmemleak report in msr_build_conte…

    …xt()
    
    Since:
    
      7ee18d6 ("x86/power: Make restore_processor_context() sane")
    
    kmemleak reports this issue:
    
      unreferenced object 0xf68241e0 (size 32):
        comm "swapper/0", pid 1, jiffies 4294668610 (age 68.432s)
        hex dump (first 32 bytes):
          00 cc cc cc 29 10 01 c0 00 00 00 00 00 00 00 00  ....)...........
          00 42 82 f6 cc cc cc cc cc cc cc cc cc cc cc cc  .B..............
        backtrace:
          [<461c1d50>] __kmem_cache_alloc_node+0x106/0x260
          [<ea65e13b>] __kmalloc+0x54/0x160
          [<c3858cd2>] msr_build_context.constprop.0+0x35/0x100
          [<46635aff>] pm_check_save_msr+0x63/0x80
          [<6b6bb938>] do_one_initcall+0x41/0x1f0
          [<3f3add60>] kernel_init_freeable+0x199/0x1e8
          [<3b538fde>] kernel_init+0x1a/0x110
          [<938ae2b2>] ret_from_fork+0x1c/0x28
    
    Which is a false positive.
    
    Reproducer:
    
      - Run rsync of whole kernel tree (multiple times if needed).
      - start a kmemleak scan
      - Note this is just an example: a lot of our internal tests hit these.
    
    The root cause is similar to the fix in:
    
      b0b592c x86/pm: Fix false positive kmemleak report in msr_build_context()
    
    i.e. the alignment within the packed struct saved_context. Everything in
    it is now unaligned, because there is only a single "u16 gs;" at the start
    of the struct, whereas in the past there were four u16 members there,
    which aligned everything after them. Kmemleak only searches for pointers
    that are aligned (see how pointers are scanned in kmemleak.c), so when the
    struct members are not aligned it does not see them.
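    
    The effect can be reproduced with a small user-space sketch. The struct
    names and aligned_scan_finds() are hypothetical, and the scanner is a
    simplified stand-in that compares for an exact pointer value; it is not
    kmemleak's code. It assumes a GCC or Clang compiler for the packed
    attribute.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One u16 ahead of the pointer: the pointer lands at offset 2. */
struct one_u16 {
    uint16_t gs;
    void *ptr;
} __attribute__((packed));

/* Four u16 ahead of the pointer: the pointer lands at offset 8. */
struct four_u16 {
    uint16_t seg[4];
    void *ptr;
} __attribute__((packed));

/*
 * Simplified stand-in for an aligned pointer scan: look only at
 * pointer-sized words located at pointer-aligned offsets in the object.
 * Returns 1 if a word equal to needle is found, 0 otherwise.
 */
int aligned_scan_finds(const void *obj, size_t size, const void *needle)
{
    const unsigned char *bytes = obj;
    size_t off;

    for (off = 0; off + sizeof(void *) <= size; off += sizeof(void *)) {
        void *word;

        memcpy(&word, bytes + off, sizeof(word));
        if (word == needle)
            return 1;
    }
    return 0;
}
```

    Scanning a struct one_u16 that points at some object misses the pointer,
    because it straddles two aligned words; scanning the struct four_u16
    layout finds it.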
    
    Testing:
    
    We run a lot of tests with our CI, and after applying this fix we do not
    see any kmemleak issues any more whilst without it we see hundreds of
    the above report. From a single, simple test run consisting of 416 individual test
    cases on kernel 5.10 x86 with kmemleak enabled we got 20 failures due to this,
    which is quite a lot. With this fix applied we get zero kmemleak related failures.
    
    Fixes: 7ee18d6 ("x86/power: Make restore_processor_context() sane")
    Signed-off-by: Anton Altaparmakov <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Acked-by: "Rafael J. Wysocki" <[email protected]>
    Cc: [email protected]
    Cc: Linus Torvalds <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    AntonAltaparmakov authored and Ingo Molnar committed Mar 22, 2024
    e3f269e
  10. kprobes/x86: Use copy_from_kernel_nofault() to read from unsafe address

    Read from an unsafe address with copy_from_kernel_nofault() in
    arch_adjust_kprobe_addr(), because this function is called before the
    address has been checked to be in text. The syzkaller bot found and
    reported that if the user specifies an inaccessible data area,
    arch_adjust_kprobe_addr() causes a kernel panic.
    
    [ mingo: Clarified the comment. ]
    
    Fixes: cc66bb9 ("x86/ibt,kprobes: Cure sym+0 equals fentry woes")
    Reported-by: Qiang Zhang <[email protected]>
    Tested-by: Jinghao Jia <[email protected]>
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Link: https://lore.kernel.org/r/171042945004.154897.2221804961882915806.stgit@devnote2
    mhiramat authored and Ingo Molnar committed Mar 22, 2024
    4e51653
  11. Revert "crypto: pkcs7 - remove sha1 support"

    This reverts commit 16ab7cb because it
    broke iwd.  iwd uses the KEYCTL_PKEY_* UAPIs via its dependency libell,
    and apparently it is relying on SHA-1 signature support.  These UAPIs
    are fairly obscure, and their documentation does not mention which
    algorithms they support.  iwd really should be using a properly
    supported userspace crypto library instead.  Regardless, since something
    broke we have to revert the change.
    
    It may be possible that some parts of this commit can be reinstated
    without breaking iwd (e.g. probably the removal of MODULE_SIG_SHA1), but
    for now this just does a full revert to get things working again.
    
    Reported-by: Karel Balej <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]
    Cc: Dimitri John Ledkov <[email protected]>
    Signed-off-by: Eric Biggers <[email protected]>
    Tested-by: Karel Balej <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    ebiggers authored and herbertx committed Mar 22, 2024
    203a676
  12. crypto: iaa - Fix nr_cpus < nr_iaa case

    If nr_cpus < nr_iaa, the calculated cpus_per_iaa will be 0, which
    causes a divide-by-0 in rebalance_wq_table().
    
    Make sure cpus_per_iaa is 1 in that case, and also in the nr_iaa == 0
    case, even though cpus_per_iaa is never used if nr_iaa == 0, for
    paranoia.
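    
    A minimal sketch of the clamp, assuming plain unsigned counters. The
    function names are hypothetical and cpu_group() is only an illustrative
    consumer of the value, not the driver's rebalance_wq_table().

```c
/*
 * Hypothetical helper: number of CPUs served by each IAA device,
 * never less than 1 so that it is always safe to divide by.
 */
unsigned int cpus_per_iaa(unsigned int nr_cpus, unsigned int nr_iaa)
{
    unsigned int per;

    if (nr_iaa == 0)
        return 1;               /* value is unused in this case; clamped anyway */

    per = nr_cpus / nr_iaa;     /* integer division: 0 when nr_cpus < nr_iaa */
    return per ? per : 1;
}

/*
 * Illustrative consumer: divides by the per-device count, which would
 * fault if that count were allowed to be 0.
 */
unsigned int cpu_group(unsigned int cpu, unsigned int nr_cpus,
                       unsigned int nr_iaa)
{
    return cpu / cpus_per_iaa(nr_cpus, nr_iaa);
}
```

    For example, 2 CPUs and 4 devices give 2 / 4 == 0 under integer division;
    the clamp turns that into 1 before anything divides by it.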
    
    Cc: <[email protected]> # v6.8+
    Reported-by: Jerry Snitselaar <[email protected]>
    Signed-off-by: Tom Zanussi <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    tzanussi authored and herbertx committed Mar 22, 2024
    5a7e89d
  13. efi/libstub: fix efi_random_alloc() to allocate memory at alloc_min o…

    …r higher address
    
    The following warning is sometimes observed while booting my servers:
      [    3.594838] DMA: preallocated 4096 KiB GFP_KERNEL pool for atomic allocations
      [    3.602918] swapper/0: page allocation failure: order:10, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0-1
      ...
      [    3.851862] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocation
    
    If 'nokaslr' boot option is set, the warning always happens.
    
    On x86, ZONE_DMA is a small zone covering the first 16MB of the physical
    address space. When this problem happens, most of that space seems to be
    used by the decompressed kernel, so there is not enough space left in
    ZONE_DMA to satisfy the DMA pool allocation request.
    
    The commit 2f77465 ("x86/efistub: Avoid placing the kernel below
    LOAD_PHYSICAL_ADDR") tried to fix this problem by introducing lower
    bound of allocation.
    
    But the fix is not complete.
    
    efi_random_alloc() allocates pages in the following steps:
    1. Count the total available slots ('total_slots')
    2. Randomly select a slot ('target_slot') to allocate
    3. Calculate a starting address ('target') that falls within target_slot
    4. Allocate pages starting at 'target'
    
    In step 1, 'alloc_min' is used to offset the starting address of the
    memory chunk. But in step 3, 'alloc_min' is not considered at all. As a
    result, 'target' can be miscalculated and end up lower than 'alloc_min'.
    
    When KASLR is disabled, 'target_slot' is always 0 and the problem
    happens every time if the EFI memory map of the system meets the
    condition.
    
    Fix this problem by calculating 'target' considering 'alloc_min'.
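    
    The idea can be sketched for a single memory region as follows. This is
    not the libstub code: struct region, region_slots() and slot_address()
    are hypothetical names, the sketch ignores the EFI memory map walk, and
    it assumes 'align' is a power of two. The point is that the slot count
    and the slot address are both derived from the same base, which is never
    below 'alloc_min'.

```c
#include <stdint.h>

/* Hypothetical free region covering [start, end). */
struct region {
    uint64_t start;
    uint64_t end;
};

/* Round v up to a multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t v, uint64_t align)
{
    return (v + align - 1) & ~(align - 1);
}

/* Lowest usable aligned address in the region, honouring alloc_min. */
uint64_t first_slot(const struct region *r, uint64_t align, uint64_t alloc_min)
{
    uint64_t base = r->start > alloc_min ? r->start : alloc_min;

    return align_up(base, align);
}

/* Step 1: number of aligned positions where 'size' bytes still fit. */
uint64_t region_slots(const struct region *r, uint64_t size,
                      uint64_t align, uint64_t alloc_min)
{
    uint64_t first = first_slot(r, align, alloc_min);

    if (first > r->end || r->end - first < size)
        return 0;
    return (r->end - first - size) / align + 1;
}

/* Step 3: address of a chosen slot, using the same base as step 1. */
uint64_t slot_address(const struct region *r, uint64_t slot,
                      uint64_t align, uint64_t alloc_min)
{
    return first_slot(r, align, alloc_min) + slot * align;
}
```

    Because slot 0 maps to first_slot(), the address chosen when KASLR is
    disabled is at or above 'alloc_min' rather than at the raw region start.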
    
    Cc: [email protected]
    Cc: Tom Englund <[email protected]>
    Cc: [email protected]
    Fixes: 2f77465 ("x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR")
    Signed-off-by: Kazuma Kondo <[email protected]>
    Signed-off-by: Ard Biesheuvel <[email protected]>
    KONDO KAZUMA(近藤 和真) authored and ardbiesheuvel committed Mar 22, 2024
    3cb4a48
  14. Merge tag 'i2c-for-6.9-rc1-part2' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/wsa/linux
    
    Pull more i2c updates from Wolfram Sang:
     "Some more I2C updates after the dependencies have been merged now.
    
      Plus a DT binding fix"
    
    * tag 'i2c-for-6.9-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
      dt-bindings: i2c: qcom,i2c-cci: Fix OV7251 'data-lanes' entries
      i2c: muxes: pca954x: Allow sharing reset GPIO
      i2c: nomadik: sort includes
      i2c: nomadik: support Mobileye EyeQ5 I2C controller
      i2c: nomadik: fetch i2c-transfer-timeout-us property from devicetree
      i2c: nomadik: replace jiffies by ktime for FIFO flushing timeout
      i2c: nomadik: support short xfer timeouts using waitqueue & hrtimer
      i2c: nomadik: use bitops helpers
      i2c: nomadik: simplify IRQ masking logic
      i2c: nomadik: rename private struct pointers from dev to priv
      dt-bindings: i2c: nomadik: add mobileye,eyeq5-i2c bindings and example
    torvalds committed Mar 22, 2024
    5ee2433
  15. Merge tag 'sound-fix2-6.9-rc1' of git://git.kernel.org/pub/scm/linux/…

    …kernel/git/tiwai/sound
    
    Pull more sound fixes from Takashi Iwai:
     "The remaining fixes for 6.9-rc1 that have been gathered in this week.
    
      More about ASoC at this time (one long-standing fix for compress
      offload, SOF, AMD ACP, Rockchip, Cirrus and tlv320 stuff) while
      another regression fix in ALSA core and a couple of HD-audio quirks as
      usual are included"
    
    * tag 'sound-fix2-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
      ALSA: control: Fix unannotated kfree() cleanup
      ALSA: hda/realtek: Add quirks for some Clevo laptops
      ALSA: hda/realtek: Add quirk for HP Spectre x360 14 eu0000
      ALSA: hda/realtek: fix the hp playback volume issue for LG machines
      ASoC: soc-compress: Fix and add DPCM locking
      ASoC: SOF: amd: Skip IRAM/DRAM size modification for Steam Deck OLED
      ASoC: SOF: amd: Move signed_fw_image to struct acp_quirk_entry
      ASoC: amd: yc: Revert "add new YC platform variant (0x63) support"
      ASoC: amd: yc: Revert "Fix non-functional mic on Lenovo 21J2"
      ASoC: soc-core.c: Skip dummy codec when adding platforms
      ASoC: rockchip: i2s-tdm: Fix inaccurate sampling rates
      ASoC: dt-bindings: cirrus,cs42l43: Fix 'gpio-ranges' schema
      ASoC: amd: yc: Fix non-functional mic on ASUS M7600RE
      ASoC: tlv320adc3xxx: Don't strip remove function when driver is builtin
    torvalds committed Mar 22, 2024
    6b571e2
  16. Merge tag 'regulator-fix-v6.9-merge-window' of git://git.kernel.org/p…

    …ub/scm/linux/kernel/git/broonie/regulator
    
    Pull regulator fix from Mark Brown:
     "One fix that came in during the merge window, fixing a problem with
      bootstrapping the state of exclusive regulators which have a parent
      regulator"
    
    * tag 'regulator-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
      regulator: core: Propagate the regulator state in case of exclusive get
    torvalds committed Mar 22, 2024
    8c826bd
  17. Merge tag 'spi-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm…

    …/linux/kernel/git/broonie/spi
    
    Pull spi fixes from Mark Brown:
     "A small collection of fixes that came in since the merge window. Most
      of them are relatively minor driver-specific fixes; there are also
      fixes for error handling with SPI flash devices and a fix restoring
      delay control functionality for non-GPIO chip selects managed by the
      core"
    
    * tag 'spi-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
      spi: spi-mt65xx: Fix NULL pointer access in interrupt handler
      spi: docs: spidev: fix echo command format
      spi: spi-imx: fix off-by-one in mx51 CPU mode burst length
      spi: lm70llp: fix links in doc and comments
      spi: Fix error code checking in spi_mem_exec_op()
      spi: Restore delays for non-GPIO chip select
      spi: lpspi: Avoid potential use-after-free in probe()
    torvalds committed Mar 22, 2024
    Configuration menu
    Copy the full SHA
    4073195 View commit details
    Browse the repository at this point in the history
  18. selftests/bpf: Use syscall(SYS_gettid) instead of gettid() wrapper in…

    … bench
    
    With glibc 2.28, selftests compilation fails for benchs/bench_trigger.c:
    
    benchs/bench_trigger.c: In function ‘inc_counter’:
    benchs/bench_trigger.c:25:23: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration]
       25 |                 tid = gettid();
          |                       ^~~~~~
          |                       getgid
    cc1: all warnings being treated as errors
    
    It appears that support for the gettid() wrapper varies across glibc
    versions, so it may be safer to use syscall(SYS_gettid) instead.
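    
    A minimal Linux-only sketch of the replacement call. The wrapper name
    portable_gettid() is hypothetical and is not what the benchmark uses; it
    only shows the raw system call, which does not depend on the C library
    providing gettid().

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: fetch the calling thread's ID through the raw
 * system call instead of the optional glibc gettid() wrapper.
 */
pid_t portable_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

    Calling portable_gettid() in place of gettid() builds the same way
    whether or not the installed glibc declares the wrapper.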
    
    Fixes: 520fad2 ("selftests/bpf: scale benchmark counting by using per-CPU counters")
    Signed-off-by: Alan Maguire <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    alan-maguire authored and anakryiko committed Mar 22, 2024
    1684d6e
  19. selftests/bpf: Mark uprobe trigger functions with nocf_check attribute

    Some distros seem to enable -fcf-protection=branch by default, which
    places an endbr64 instruction at the start of the uprobe trigger
    functions and so breaks our setup, which relies on their first
    instruction.
    
    Mark them with the nocf_check attribute to skip that.
    
    Ignore the unknown attribute warning in gcc for bench objects, because
    nocf_check can be used only when -fcf-protection=branch is enabled;
    otherwise we get a warning that breaks compilation.
    
    Signed-off-by: Jiri Olsa <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    olsajiri authored and anakryiko committed Mar 22, 2024
    af8d27b
  20. Merge tag 'fbdev-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/k…

    …ernel/git/deller/linux-fbdev
    
    Pull fbdev updates from Helge Deller:
    
     - Allow console fonts up to 64x128 pixels (Samuel Thibault)
    
     - Prevent division-by-zero in fb monitor code (Roman Smirnov)
    
     - Drop Renesas ARM platforms from Mobile LCDC framebuffer driver (Geert
       Uytterhoeven)
    
     - Various code cleanups in viafb, uvesafb and mb862xxfb drivers by
       Aleksandr Burakov, Li Zhijian and Michael Ellerman
    
    * tag 'fbdev-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
      fbdev: panel-tpo-td043mtea1: Convert sprintf() to sysfs_emit()
      fbmon: prevent division by zero in fb_videomode_from_videomode()
      fbcon: Increase maximum font width x height to 64 x 128
      fbdev: viafb: fix typo in hw_bitblt_1 and hw_bitblt_2
      fbdev: mb862xxfb: Fix defined but not used error
      fbdev: uvesafb: Convert sprintf/snprintf to sysfs_emit
      fbdev: Restrict FB_SH_MOBILE_LCDC to SuperH
    torvalds committed Mar 22, 2024
    4f55aa8
  21. Merge tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kerne…

    …l/git/chenhuacai/linux-loongson
    
    Pull LoongArch updates from Huacai Chen:
    
     - Add objtool support for LoongArch
    
     - Add ORC stack unwinder support for LoongArch
    
     - Add kernel livepatching support for LoongArch
    
     - Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
    
     - Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
    
     - Some bug fixes and other small changes
    
    * tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
      LoongArch/crypto: Clean up useless assignment operations
      LoongArch: Define the __io_aw() hook as mmiowb()
      LoongArch: Remove superfluous flush_dcache_page() definition
      LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h
      LoongArch: Change __my_cpu_offset definition to avoid mis-optimization
      LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
      LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
      LoongArch: Add kernel livepatching support
      LoongArch: Add ORC stack unwinder support
      objtool: Check local label in read_unwind_hints()
      objtool: Check local label in add_dead_ends()
      objtool/LoongArch: Enable orc to be built
      objtool/x86: Separate arch-specific and generic parts
      objtool/LoongArch: Implement instruction decoder
      objtool/LoongArch: Enable objtool to be built
    torvalds committed Mar 22, 2024
    1e3cd03
  22. Merge tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/l…

    …inux/kernel/git/riscv/linux
    
    Pull RISC-V updates from Palmer Dabbelt:
    
     - Support for various vector-accelerated crypto routines
    
     - Hibernation is now enabled for portable kernel builds
    
     - mmap_rnd_bits_max is larger on systems with larger VAs
    
     - Support for fast GUP
    
     - Support for membarrier-based instruction cache synchronization
    
     - Support for the Andes hart-level interrupt controller and PMU
    
     - Some cleanups around unaligned access speed probing and Kconfig
       settings
    
     - Support for ACPI LPI and CPPC
    
     - Various cleanups related to barriers
    
     - A handful of fixes
    
    * tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits)
      riscv: Fix syscall wrapper for >word-size arguments
      crypto: riscv - add vector crypto accelerated AES-CBC-CTS
      crypto: riscv - parallelize AES-CBC decryption
      riscv: Only flush the mm icache when setting an exec pte
      riscv: Use kcalloc() instead of kzalloc()
      riscv/barrier: Add missing space after ','
      riscv/barrier: Consolidate fence definitions
      riscv/barrier: Define RISCV_FULL_BARRIER
      riscv/barrier: Define __{mb,rmb,wmb}
      RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ
      cpufreq: Move CPPC configs to common Kconfig and add RISC-V
      ACPI: RISC-V: Add CPPC driver
      ACPI: Enable ACPI_PROCESSOR for RISC-V
      ACPI: RISC-V: Add LPI driver
      cpuidle: RISC-V: Move few functions to arch/riscv
      riscv: Introduce set_compat_task() in asm/compat.h
      riscv: Introduce is_compat_thread() into compat.h
      riscv: add compile-time test into is_compat_task()
      riscv: Replace direct thread flag check with is_compat_task()
      riscv: Improve arch_get_mmap_end() macro
      ...
    torvalds committed Mar 22, 2024
    c150b80
  23. Merge tag 'xfs-6.9-merge-9' of git://git.kernel.org/pub/scm/fs/xfs/xf…

    …s-linux
    
    Pull xfs fixes from Chandan Babu:
    
     - Fix invalid pointer dereference by initializing xmbuf before
       tracepoint function is invoked
    
     - Use memalloc_nofs_save() when inserting into quota radix tree
    
    * tag 'xfs-6.9-merge-9' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
      xfs: quota radix tree allocations need to be NOFS on insert
      xfs: fix dev_t usage in xmbuf tracepoints
    torvalds committed Mar 22, 2024
    6f6efce
  24. Merge tag 'ceph-for-6.9-rc1' of https://github.com/ceph/ceph-client

    Pull ceph updates from Ilya Dryomov:
     "A patch to minimize blockage when processing very large batches of
      dirty caps and two fixes to better handle EOF in the face of multiple
      clients performing reads and size-extending writes at the same time"
    
    * tag 'ceph-for-6.9-rc1' of https://github.com/ceph/ceph-client:
      ceph: set correct cap mask for getattr request for read
      ceph: stop copying to iter at EOF on sync reads
      ceph: remove SLAB_MEM_SPREAD flag usage
      ceph: break the check delayed cap loop every 5s
    torvalds committed Mar 22, 2024
    ff9c18e
  25. Merge tag 'for-6.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/device-mapper/linux-dm
    
    Pull device mapper fixes from Mike Snitzer:
    
     - Fix a memory leak in DM integrity recheck code that was added during
       the 6.9 merge. Also fix the recheck code to ensure it issues bios
       with proper alignment.
    
     - Fix DM snapshot's dm_exception_table_exit() to schedule while
       handling a large exception table during snapshot device shutdown.
    
    * tag 'for-6.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
      dm-integrity: align the outgoing bio in integrity_recheck
      dm snapshot: fix lockup in dm_exception_table_exit
      dm-integrity: fix a memory leak when rechecking the data
    torvalds committed Mar 22, 2024
    64f799f
  26. Merge tag 'io_uring-6.9-20240322' of git://git.kernel.dk/linux

    Pull more io_uring updates from Jens Axboe:
     "One patch just missed the initial pull, the rest are either fixes or
      small cleanups that make our life easier for the next kernel:
    
       - Fix a potential leak in error handling of pinned pages, and clean
         it up (Gabriel, Pavel)
    
       - Fix an issue with how read multishot returns retry (me)
    
       - Fix a problem with waitid/futex removals, if we hit the case of
         needing to remove all of them at exit time (me)
    
       - Fix for a regression introduced in this merge window, where we
         don't always have sr->done_io initialized if the ->prep_async()
         path is used (me)
    
       - Fix for SQPOLL setup error handling (me)
    
       - Fix for a poll removal request being delayed (Pavel)
    
       - Rename of a struct member which had a confusing name (Pavel)"
    
    * tag 'io_uring-6.9-20240322' of git://git.kernel.dk/linux:
      io_uring/sqpoll: early exit thread if task_context wasn't allocated
      io_uring: clear opcode specific data for an early failure
      io_uring/net: ensure async prep handlers always initialize ->done_io
      io_uring/waitid: always remove waitid entry for cancel all
      io_uring/futex: always remove futex entry for cancel all
      io_uring: fix poll_remove stalled req completion
      io_uring: Fix release of pinned pages when __io_uaddr_map fails
      io_uring/kbuf: rename is_mapped
      io_uring: simplify io_pages_free
      io_uring: clean rings on NO_MMAP alloc fail
      io_uring/rw: return IOU_ISSUE_SKIP_COMPLETE for multishot retry
      io_uring: don't save/restore iowait state
    torvalds committed Mar 22, 2024
    19dba09
  27. Merge tag 'block-6.9-20240322' of git://git.kernel.dk/linux

    Pull more block updates from Jens Axboe:
    
     - NVMe pull request via Keith:
         - Make an informative message less ominous (Keith)
         - Enhanced trace decoding (Guixin)
         - TCP updates (Hannes, Li)
         - Fabrics connect deadlock fix (Chunguang)
         - Platform API migration update (Uwe)
         - A new device quirk (Jiawei)
    
     - Remove dead assignment in fd (Yufeng)
    
    * tag 'block-6.9-20240322' of git://git.kernel.dk/linux:
      nvmet-rdma: remove NVMET_RDMA_REQ_INVALIDATE_RKEY flag
      nvme: remove redundant BUILD_BUG_ON check
      floppy: remove duplicated code in redo_fd_request()
      nvme/tcp: Add wq_unbound modparam for nvme_tcp_wq
      nvme-tcp: Export the nvme_tcp_wq to sysfs
      drivers/nvme: Add quirks for device 126f:2262
      nvme: parse format command's lbafu when tracing
      nvme: add tracing of reservation commands
      nvme: parse zns command's zsa and zrasf to string
      nvme: use nvme_disk_is_ns_head helper
      nvme: fix reconnection fail due to reserved tag allocation
      nvmet: add tracing of zns commands
      nvmet: add tracing of authentication commands
      nvme-apple: Convert to platform remove callback returning void
      nvmet-tcp: do not continue for invalid icreq
      nvme: change shutdown timeout setting message
    torvalds committed Mar 22, 2024
    e3111d9
  28. Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/jejb/scsi
    
    Pull more SCSI updates from James Bottomley:
     "The vfs has long had a write lifetime hint mechanism that gives the
      expected longevity on storage of the data being written. f2fs was the
      original consumer of this and used the hint for flash data placement
      (mostly to avoid write amplification by placing objects with similar
      lifetimes in the same erase block).
    
      More recently the SCSI based UFS (Universal Flash Storage) drivers
      have wanted to take advantage of this as well, for the same reasons as
      f2fs, necessitating plumbing the write hints through the block layer
      and then adding it to the SCSI core.
    
      The vfs write_hints changes already taken plumb this as far as the
      block layer, and this completes the SCSI core enabling based on a
      recently agreed reuse of the old write command group number. The
      additions to the scsi_debug driver are for emulating this property so
      we can run tests on it in the absence of an actual UFS device"
    
    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
      scsi: scsi_debug: Maintain write statistics per group number
      scsi: scsi_debug: Implement GET STREAM STATUS
      scsi: scsi_debug: Implement the IO Advice Hints Grouping mode page
      scsi: scsi_debug: Allocate the MODE SENSE response from the heap
      scsi: scsi_debug: Rework subpage code error handling
      scsi: scsi_debug: Rework page code error handling
      scsi: scsi_debug: Support the block limits extension VPD page
      scsi: scsi_debug: Reduce code duplication
      scsi: sd: Translate data lifetime information
      scsi: scsi_proto: Add structures and constants related to I/O groups and streams
      scsi: core: Query the Block Limits Extension VPD page
    torvalds committed Mar 22, 2024
    bfa8f18
  29. libbpf: Add new sec_def "sk_skb/verdict"

    The new sec_def specifies sk_skb program type with
    BPF_SK_SKB_VERDICT attachment type. This way, libbpf
    will set expected_attach_type properly for the program.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Yonghong Song authored and anakryiko committed Mar 22, 2024
    61df575
  30. overflow: Change DEFINE_FLEX to take __counted_by member

    The norm should be flexible array structures with __counted_by
    annotations, so DEFINE_FLEX() is updated to expect that. Rename
    the non-annotated version to DEFINE_RAW_FLEX(), and update the
    few existing users. Additionally add selftests for the macros.
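
To illustrate what a __counted_by annotation expresses, here is a userspace sketch. It is not the kernel's DEFINE_FLEX() macro; the `COUNTED_BY` wrapper, the struct, and the allocator are hypothetical, and the attribute is applied only when the compiler reports support for it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the counted_by attribute where available, otherwise expand to nothing. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical structure: 'count' states how many elements 'items' holds. */
struct sample_buf {
    size_t count;
    int items[] COUNTED_BY(count);
};

/* Allocate a zeroed buffer with room for n items and record the count. */
static struct sample_buf *sample_buf_alloc(size_t n)
{
    struct sample_buf *buf = calloc(1, sizeof(*buf) + n * sizeof(int));

    if (buf)
        buf->count = n;  /* set the counter before touching items[] */
    return buf;
}
```

With the annotation in place, bounds-checking instrumentation can compare an index into `items` against `count` at run time.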
    
    Reviewed-by: Gustavo A. R. Silva <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Przemek Kitszel <[email protected]>
    Signed-off-by: Kees Cook <[email protected]>
    kees committed Mar 22, 2024
    d8e45f2
  31. lkdtm/bugs: Improve warning message for compilers without counted_by …

    …support
    
    The current message for telling the user that their compiler does not
    support the counted_by attribute in the FAM_BOUNDS test does not make
    much sense either grammatically or semantically. Fix it to make it
    correct in both aspects.
    
    Signed-off-by: Nathan Chancellor <[email protected]>
    Reviewed-by: Gustavo A. R. Silva <[email protected]>
    Link: https://lore.kernel.org/r/20240321-lkdtm-improve-lack-of-counted_by-msg-v1-1-0fbf7481a29c@kernel.org
    Signed-off-by: Kees Cook <[email protected]>
    nathanchance authored and kees committed Mar 22, 2024
    231dc3f

Commits on Mar 23, 2024

  1. tools: ynl: fix setting presence bits in simple nests

    When we set members of simple nested structures in requests
    we need to set "presence" bits for all the nesting layers
    below. This has nothing to do with the presence type of
    the last layer.
    
    Fixes: be5bea1 ("net: add basic C code generators for Netlink")
    Reviewed-by: Breno Leitao <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 23, 2024
    f6c8f5e
  2. nexthop: fix uninitialized variable in nla_put_nh_group_stats()

    The "*hw_stats_used" value needs to be set on the success paths to prevent
    an uninitialized variable bug in the caller, nla_put_nh_group_stats().
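
The class of bug described here can be sketched in isolation. The functions below are hypothetical stand-ins, not the nexthop code: the callee writes its out-parameter before every successful return, so the caller never reads an indeterminate value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callee that reports through an out-parameter. */
static int query_hw_stats(size_t n_entries, bool *hw_stats_used)
{
    /* Give the out-parameter a defined value before any success return. */
    *hw_stats_used = false;

    if (n_entries == 0)
        return 0;  /* early success: the caller still sees 'false' */

    *hw_stats_used = true;
    return 0;
}

/* Hypothetical caller that relies on the callee to fill in 'used'. */
static int report_stats(size_t n_entries)
{
    bool used;  /* intentionally left uninitialized by the caller */
    int err = query_hw_stats(n_entries, &used);

    if (err)
        return err;
    return used ? 1 : 0;
}
```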
    
    Fixes: 5072ae0 ("net: nexthop: Expose nexthop group HW stats to user space")
    Signed-off-by: Dan Carpenter <[email protected]>
    Reviewed-by: Jiri Pirko <[email protected]>
    Reviewed-by: Ido Schimmel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Dan Carpenter authored and kuba-moo committed Mar 23, 2024
    9145e22
  3. ipv6: Fix address dump when IPv6 is disabled on an interface

    Cited commit started returning an error when user space requests to dump
    the interface's IPv6 addresses and IPv6 is disabled on the interface.
    Restore the previous behavior and do not return an error.
    
    Before cited commit:
    
     # ip address show dev dummy1
     2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
         link/ether 1a:52:02:5a:c2:6e brd ff:ff:ff:ff:ff:ff
         inet6 fe80::1852:2ff:fe5a:c26e/64 scope link proto kernel_ll
            valid_lft forever preferred_lft forever
     # ip link set dev dummy1 mtu 1000
     # ip address show dev dummy1
     2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1000 qdisc noqueue state UNKNOWN group default qlen 1000
         link/ether 1a:52:02:5a:c2:6e brd ff:ff:ff:ff:ff:ff
    
    After cited commit:
    
     # ip address show dev dummy1
     2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
         link/ether 1e:9b:94:00:ac:e8 brd ff:ff:ff:ff:ff:ff
         inet6 fe80::1c9b:94ff:fe00:ace8/64 scope link proto kernel_ll
            valid_lft forever preferred_lft forever
     # ip link set dev dummy1 mtu 1000
     # ip address show dev dummy1
     RTNETLINK answers: No such device
     Dump terminated
    
    With this patch:
    
     # ip address show dev dummy1
     2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
         link/ether 42:35:fc:53:66:cf brd ff:ff:ff:ff:ff:ff
         inet6 fe80::4035:fcff:fe53:66cf/64 scope link proto kernel_ll
            valid_lft forever preferred_lft forever
     # ip link set dev dummy1 mtu 1000
     # ip address show dev dummy1
     2: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1000 qdisc noqueue state UNKNOWN group default qlen 1000
         link/ether 42:35:fc:53:66:cf brd ff:ff:ff:ff:ff:ff
    
    Fixes: 9cc4cc3 ("ipv6: use xa_array iterator to implement inet6_dump_addr()")
    Reported-by: Gal Pressman <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Tested-by: Gal Pressman <[email protected]>
    Signed-off-by: Ido Schimmel <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    idosch authored and kuba-moo committed Mar 23, 2024
    c04f7df
  4. bpf: verifier: fix addr_space_cast from as(1) to as(0)

    The verifier currently converts addr_space_cast from as(1) to as(0) that
    is: BPF_ALU64 | BPF_MOV | BPF_X with off=1 and imm=1
    to
    BPF_ALU | BPF_MOV | BPF_X with imm=1 (32-bit mov)
    
    Because of this imm=1, the JITs that have bpf_jit_needs_zext() == true,
    interpret the converted instruction as BPF_ZEXT_REG(DST) which is a
    special form of mov32, used for doing explicit zero extension on dst.
    These JITs will just zero extend the dst reg and will not move the src to
    dst before the zext.
    
    Fix do_misc_fixups() to set imm=0 when converting addr_space_cast to a
    normal mov32.
    
    The JITs that have bpf_jit_needs_zext() == true rely on the verifier to
    emit zext instructions. Mark dst_reg as subreg when doing cast from
    as(1) to as(0) so the verifier emits a zext instruction after the mov.
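
The rewrite described above can be modeled in a few lines. This is a simplified sketch, not the code in do_misc_fixups(): `struct insn_model` and `lower_cast_to_mov32` are hypothetical, while the opcode values match the BPF UAPI definitions.

```c
#include <stdint.h>

/* Opcode bits as defined in the Linux BPF UAPI headers. */
#define BPF_ALU   0x04
#define BPF_ALU64 0x07
#define BPF_X     0x08
#define BPF_MOV   0xb0

/* Simplified stand-in for struct bpf_insn (no bitfields). */
struct insn_model {
    uint8_t code;
    uint8_t dst_reg;
    uint8_t src_reg;
    int16_t off;
    int32_t imm;
};

/*
 * Convert an as(1) -> as(0) addr_space_cast into a plain 32-bit mov.
 * Returns 1 if the instruction was rewritten, 0 otherwise.
 */
static int lower_cast_to_mov32(struct insn_model *insn)
{
    if (insn->code != (BPF_ALU64 | BPF_MOV | BPF_X) ||
        insn->off != 1 || insn->imm != 1)
        return 0;

    insn->code = BPF_ALU | BPF_MOV | BPF_X;
    insn->off = 0;
    /* Leaving imm == 1 would make this look like the zext special form. */
    insn->imm = 0;
    return 1;
}
```

Clearing `imm` is the key step: a mov32 with imm=1 is what JITs with bpf_jit_needs_zext() == true treat as an explicit zero extension of dst.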
    
    Fixes: 6082b6c ("bpf: Recognize addr_space_cast instruction in the verifier.")
    Signed-off-by: Puranjay Mohan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed Mar 23, 2024
    f7f5d18
  5. selftests/bpf: verifier_arena: fix mmap address for arm64

    The arena_list selftest uses (1ull << 32) in the mmap address
    computation for arm64. Use the same in the verifier_arena selftest.
    
    This makes the selftest pass for arm64 on the CI[1].
    
    [1] kernel-patches/bpf#6622
    
    Signed-off-by: Puranjay Mohan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed Mar 23, 2024
    fa3550d
  6. bpf: verifier: reject addr_space_cast insn without arena

    The verifier allows using the addr_space_cast instruction in a program
    that doesn't have an associated arena. This was caught in the form of an
    invalid memory access in do_misc_fixups(): while converting
    addr_space_cast to a normal 32-bit mov, env->prog->aux->arena was
    dereferenced to check for the BPF_F_NO_USER_CONV flag.
    
    Reject programs that include the addr_space_cast instruction but don't
    have an associated arena.
    
    root@rv-tester:~# ./reproducer
     Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000030
     Oops [#1]
     [<ffffffff8017eeaa>] do_misc_fixups+0x43c/0x1168
     [<ffffffff801936d6>] bpf_check+0xda8/0x22b6
     [<ffffffff80174b32>] bpf_prog_load+0x486/0x8dc
     [<ffffffff80176566>] __sys_bpf+0xbd8/0x214e
     [<ffffffff80177d14>] __riscv_sys_bpf+0x22/0x2a
     [<ffffffff80d2493a>] do_trap_ecall_u+0x102/0x17c
     [<ffffffff80d3048c>] ret_from_exception+0x0/0x64
    
    Fixes: 6082b6c ("bpf: Recognize addr_space_cast instruction in the verifier.")
    Reported-by: xingwei lee <[email protected]>
    Reported-by: yue sun <[email protected]>
    Closes: https://lore.kernel.org/bpf/CABOYnLz09O1+2gGVJuCxd_24a-7UueXzV-Ff+Fr+h5EKFDiYCQ@mail.gmail.com/
    Signed-off-by: Puranjay Mohan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed Mar 23, 2024
    122fdbd
  7. x86/cpu: Ensure that CPU info updates are propagated on UP

    The boot sequence evaluates CPUID information twice:
    
      1) During early boot
    
      2) When finalizing the early setup right before
         mitigations are selected and alternatives are patched.
    
    In both cases the evaluation is stored in boot_cpu_data, but on UP the
    copying of boot_cpu_data to the per CPU info of the boot CPU happens
    between #1 and #2. So any update which happens in #2 is never propagated to
    the per CPU info instance.
    
    Consolidate the whole logic and copy boot_cpu_data right before applying
    alternatives as that's the point where boot_cpu_data is in its final
    state and not supposed to change anymore.
    
    This also removes the voodoo mb() from smp_prepare_cpus_common() which
    had absolutely no purpose.
    
    Fixes: 71eb489 ("x86/percpu: Cure per CPU madness on UP")
    Reported-by: Guenter Roeck <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Tested-by: Guenter Roeck <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    KAGA-KOKO authored and bp3tk0v committed Mar 23, 2024
    c90399f
  8. x86/topology: Don't evaluate logical IDs during early boot

    The local APICs have not yet been enumerated so the logical ID evaluation
    from the topology bitmaps does not work and would return an error code.
    
    Skip the evaluation during the early boot CPUID evaluation and only apply
    it on the final run.
    
    Fixes: 380414b ("x86/cpu/topology: Use topology logical mapping mechanism")
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Tested-by: Guenter Roeck <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    KAGA-KOKO authored and bp3tk0v committed Mar 23, 2024
    7af541c
  9. x86/topology: Handle the !APIC case gracefully

    If there is no local APIC enumerated and registered then the topology
    bitmaps are empty. Therefore, topology_init_possible_cpus() will die with
    a division by zero exception.
    
    Prevent this by registering a fake APIC id to populate the topology
    bitmap. This also allows all topology query interfaces to be used
    unconditionally. It does not affect the actual APIC code because either
    the local APIC address was not registered or no local APIC could be
    detected.
    
    Fixes: f1f758a ("x86/topology: Add a mechanism to track topology via APIC IDs")
    Reported-by: Guenter Roeck <[email protected]>
    Reported-by: Linus Torvalds <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Tested-by: Guenter Roeck <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    KAGA-KOKO authored and bp3tk0v committed Mar 23, 2024
    5e25eb2
  10. x86/mpparse: Register APIC address only once

    The APIC address is registered twice. First during the early detection and
    afterwards when actually scanning the table for APIC IDs. The APIC and
    topology core warn about the second attempt.
    
    Restrict it to the early detection call.
    
    Fixes: 81287ad ("x86/apic: Sanitize APIC address setup")
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Tested-by: Guenter Roeck <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    KAGA-KOKO authored and bp3tk0v committed Mar 23, 2024
    f2208aa
  11. Merge tag 'hardening-v6.9-rc1-fixes' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/kees/linux
    
    Pull more hardening updates from Kees Cook:
    
     - CONFIG_MEMCPY_SLOW_KUNIT_TEST is no longer needed (Guenter Roeck)
    
     - Fix needless UTF-8 character in arch/Kconfig (Liu Song)
    
     - Improve __counted_by warning message in LKDTM (Nathan Chancellor)
    
     - Refactor DEFINE_FLEX() for default use of __counted_by
    
     - Disable signed integer overflow sanitizer on GCC < 8
    
    * tag 'hardening-v6.9-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
      lkdtm/bugs: Improve warning message for compilers without counted_by support
      overflow: Change DEFINE_FLEX to take __counted_by member
      Revert "kunit: memcpy: Split slow memcpy tests into MEMCPY_SLOW_KUNIT_TEST"
      arch/Kconfig: eliminate needless UTF-8 character in Kconfig help
      ubsan: Disable signed integer overflow sanitizer on GCC < 8
    torvalds committed Mar 23, 2024
    b718713
  12. Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

    Pull ARM updates from Russell King:
    
     - remove a misuse of kernel-doc comment
    
     - use "Call trace:" for backtraces like other architectures
    
     - implement copy_from_kernel_nofault_allowed() to fix a LKDTM test
    
     - add a "cut here" line for prefetch aborts
    
     - remove unnecessary Kconfig entry for FRAME_POINTER
    
     - remove iwmmxt support for PJ4/PJ4B cores
    
     - use bitfield helpers in ptrace to improve readability
    
     - check if folio is reserved before flushing
    
    * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
      ARM: 9359/1: flush: check if the folio is reserved for no-mapping addresses
      ARM: 9354/1: ptrace: Use bitfield helpers
      ARM: 9352/1: iwmmxt: Remove support for PJ4/PJ4B cores
      ARM: 9353/1: remove unneeded entry for CONFIG_FRAME_POINTER
      ARM: 9351/1: fault: Add "cut here" line for prefetch aborts
      ARM: 9350/1: fault: Implement copy_from_kernel_nofault_allowed()
      ARM: 9349/1: unwind: Add missing "Call trace:" line
      ARM: 9334/1: mm: init: remove misuse of kernel-doc comment
    torvalds committed Mar 23, 2024
    02fb638
  13. Merge tag 'powerpc-6.9-2' of git://git.kernel.org/pub/scm/linux/kerne…

    …l/git/powerpc/linux
    
    Pull more powerpc updates from Michael Ellerman:
    
     - Handle errors in mark_rodata_ro() and mark_initmem_nx()
    
     - Make struct crash_mem available without CONFIG_CRASH_DUMP
    
    Thanks to Christophe Leroy and Hari Bathini.
    
    * tag 'powerpc-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency
      powerpc/kexec: split CONFIG_KEXEC_FILE and CONFIG_CRASH_DUMP
      kexec/kdump: make struct crash_mem available without CONFIG_CRASH_DUMP
      powerpc: Handle error in mark_rodata_ro() and mark_initmem_nx()
    torvalds committed Mar 23, 2024
    484193f
  14. Merge tag 'core-entry-2024-03-23' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/tip/tip
    
    Pull core entry fix from Thomas Gleixner:
     "A single fix for the generic entry code:
    
      The trace_sys_enter() tracepoint can modify the syscall number via
      kprobes or BPF in pt_regs, but that requires that the syscall number
      is re-evaluated from pt_regs after the tracepoint.
    
      A seccomp fix in that area removed the re-evaluation so the change
      does not take effect as the code just uses the locally cached number.
    
      Restore the original behaviour by re-evaluating the syscall number
      after the tracepoint"
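
The behaviour being restored can be modeled outside the kernel. In this sketch `struct regs_model`, the hook type, and both functions are hypothetical stand-ins for pt_regs, the tracepoint, and the generic entry code.

```c
#include <stddef.h>

/* Hypothetical stand-in for pt_regs: only the syscall number matters here. */
struct regs_model {
    long syscall_nr;
};

/* Hypothetical tracepoint-style hook that may rewrite the register state. */
typedef void (*enter_hook_t)(struct regs_model *regs);

/* Returns the syscall number that should actually be dispatched. */
static long enter_syscall(struct regs_model *regs, enter_hook_t hook)
{
    long nr = regs->syscall_nr;  /* locally cached copy */

    if (hook) {
        hook(regs);
        /* Re-read after the hook, so a modified number takes effect. */
        nr = regs->syscall_nr;
    }
    return nr;
}

/* Example hook: redirects every call to syscall number 39. */
static void redirect_hook(struct regs_model *regs)
{
    regs->syscall_nr = 39;
}
```

Dropping the re-read inside the `if` block reproduces the regression: the hook's change lands in the register state, but the stale cached number is what gets used.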
    
    * tag 'core-entry-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      entry: Respect changes to system call number by trace_sys_enter()
    torvalds committed Mar 23, 2024
    976b029
  15. Merge tag 'irq-urgent-2024-03-23' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/tip/tip
    
    Pull irq fixes from Thomas Gleixner:
     "A series of fixes for the Renesas RZG2L interrupt chip driver to
      prevent spurious and misrouted interrupts.
    
       - Ensure that posted writes are flushed in the eoi() callback
    
       - Ensure that interrupts are masked at the chip level when the
         trigger type is changed
    
       - Clear the interrupt status register when setting up edge type
         trigger modes
    
       - Ensure that the trigger type and routing information is set before
         the interrupt is enabled"
    
    * tag 'irq-urgent-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      irqchip/renesas-rzg2l: Do not set TIEN and TINT source at the same time
      irqchip/renesas-rzg2l: Prevent spurious interrupts when setting trigger type
      irqchip/renesas-rzg2l: Rename rzg2l_irq_eoi()
      irqchip/renesas-rzg2l: Rename rzg2l_tint_eoi()
      irqchip/renesas-rzg2l: Flush posted write in irq_eoi()
    torvalds committed Mar 23, 2024
    1a39193
  16. Merge tag 'timers-core-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull more clocksource updates from Thomas Gleixner:
     "A set of updates for clocksource and clockevent drivers:
    
       - A fix for the prescaler of the ARM global timer where the prescaler
         mask define only covered 4 bits while it is actually 8 bits wide.
         This obviously restricted the possible range of prescaler
         adjustments
    
       - A fix for the RISC-V timer which prevents a timer interrupt being
         raised while the timer is initialized
    
       - A set of device tree updates to support new system on chips in
         various drivers
    
       - Kernel-doc and other cleanups all over the place"
    
    * tag 'timers-core-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      clocksource/drivers/timer-riscv: Clear timer interrupt on timer initialization
      dt-bindings: timer: Add support for cadence TTC PWM
      clocksource/drivers/arm_global_timer: Simplify prescaler register access
      clocksource/drivers/arm_global_timer: Guard against division by zero
      clocksource/drivers/arm_global_timer: Make gt_target_rate unsigned long
      dt-bindings: timer: add Ralink SoCs system tick counter
      clocksource: arm_global_timer: fix non-kernel-doc comment
      clocksource/drivers/arm_global_timer: Remove stray tab
      clocksource/drivers/arm_global_timer: Fix maximum prescaler value
      clocksource/drivers/imx-sysctr: Add i.MX95 support
      clocksource/drivers/imx-sysctr: Drop use global variables
      dt-bindings: timer: nxp,sysctr-timer: support i.MX95
      dt-bindings: timer: renesas: ostm: Document RZ/Five SoC
      dt-bindings: timer: renesas,tmu: Document input capture interrupt
      clocksource/drivers/ti-32K: Fix misuse of "/**" comment
      clocksource/drivers/stm32: Fix all kernel-doc warnings
      dt-bindings: timer: exynos4210-mct: Add google,gs101-mct compatible
      clocksource/drivers/imx: Fix -Wunused-but-set-variable warning
    torvalds committed Mar 23, 2024
    00164f4
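The prescaler item in the clocksource pull above comes down to a field mask that is narrower than the hardware field. A minimal sketch, assuming a hypothetical register layout with the prescaler starting at bit 8; the macro and function names are illustrative and are not the driver's own:

```c
#include <stdint.h>

/* Hypothetical register layout: the prescaler field starts at bit 8. */
#define PRESCALER_SHIFT      8
#define PRESCALER_MASK_4BIT  (UINT32_C(0x0f) << PRESCALER_SHIFT)  /* too narrow */
#define PRESCALER_MASK_8BIT  (UINT32_C(0xff) << PRESCALER_SHIFT)  /* full field */

/* Insert a prescaler value into a control word using the given mask. */
uint32_t set_prescaler(uint32_t ctrl, uint32_t mask, uint32_t psv)
{
    ctrl &= ~mask;
    ctrl |= (psv << PRESCALER_SHIFT) & mask;
    return ctrl;
}

/* Read the prescaler back using the full 8-bit field. */
uint32_t get_prescaler(uint32_t ctrl)
{
    return (ctrl & PRESCALER_MASK_8BIT) >> PRESCALER_SHIFT;
}
```

With the 4-bit mask any prescaler value above 15 loses its upper bits when it is written, which is one way to picture the restricted adjustment range mentioned in the pull message.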
  17. Merge tag 'timers-urgent-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull timer fixes from Thomas Gleixner:
     "Two regression fixes for the timer and timer migration code:
    
       - Prevent endless timer requeuing which is caused by two CPUs racing
         out of idle. This happens when the last CPU goes idle, and therefore
         has to make sure that the pending global timers are expired, while
         some other CPU comes out of idle at the same time, wins the race
         and expires the global queue. This causes the last CPU to chase
         ghost timers forever and to reprogram its clockevent device
         endlessly.
    
         Cure this by re-evaluating the wakeup time unconditionally.
    
       - The split into local (pinned) and global timers in the timer wheel
         caused a regression for NOHZ full as it broke the idle tracking of
         global timers. On NOHZ full this prevents a self IPI from being
         sent, which in turn causes the timer not to be programmed and not
         to expire on time.
    
         Restore the idle tracking for the global timer base so that the
         self IPI condition for NOHZ full is working correctly again"
    
    * tag 'timers-urgent-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      timers: Fix removed self-IPI on global timer's enqueue in nohz_full
      timers/migration: Fix endless timer requeue after idle interrupts
    torvalds committed Mar 23, 2024
    7029324

Commits on Mar 24, 2024

  1. Documentation/x86: Document that resctrl bandwidth control units are MiB

    The memory bandwidth software controller uses 2^20 units rather than
    10^6. See mbm_bw_count() which computes bandwidth using the "SZ_1M"
    Linux define for 0x00100000.
    
    Update the documentation to use MiB when describing this feature.
    It's too late to fix the mount option "mba_MBps" as that is now an
    established user interface.
    
    Signed-off-by: Tony Luck <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    aegl authored and Ingo Molnar committed Mar 24, 2024
    a8ed59a
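The difference between the two units in the resctrl documentation commit is easy to see with a small calculation. The helpers below are illustrative only; `SZ_1M` uses the value quoted in the commit message, while `MB_DEC`, `bytes_to_mib()` and `bytes_to_mb()` are names made up for this sketch.

```c
#include <stdint.h>

#define SZ_1M   UINT64_C(0x00100000)  /* 2^20, the value quoted in the commit */
#define MB_DEC  UINT64_C(1000000)     /* 10^6, a decimal megabyte */

/* Bandwidth counted in binary units (MiB). */
uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes / SZ_1M;
}

/* The same byte count expressed in decimal units (MB). */
uint64_t bytes_to_mb(uint64_t bytes)
{
    return bytes / MB_DEC;
}
```

For 1048576000 bytes the first helper yields 1000 and the second yields 1048, a gap of roughly 5 percent between the two readings of the same limit.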
  2. x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD

    Commit 6723654 ("x86/fpu: Update XFD state where required") and
    commit 8bf2675 ("x86/fpu: Add XFD state to fpstate") introduced a
    per CPU variable xfd_state to keep the MSR_IA32_XFD value cached, in
    order to avoid unnecessary writes to the MSR.
    
    On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which
    wipes out any stale state. But the per CPU cached xfd value is not
    reset, which brings them out of sync.
    
    As a consequence a subsequent xfd_update_state() might fail to update
    the MSR which in turn can result in XRSTOR raising a #NM in kernel
    space, which crashes the kernel.
    
    To fix this, introduce xfd_set_state() to write xfd_state together
    with MSR_IA32_XFD, and use it in all places that set MSR_IA32_XFD.
    
    Fixes: 6723654 ("x86/fpu: Update XFD state where required")
    Signed-off-by: Adamos Ttofari <[email protected]>
    Signed-off-by: Chang S. Bae <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Reviewed-by: Thomas Gleixner <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    
    Closes: https://lore.kernel.org/lkml/[email protected]
    Adamos Ttofari authored and Ingo Molnar committed Mar 24, 2024
    10e4b51
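The cache and MSR going out of sync, as described in the xfd_state commit, can be modelled in userspace. This is a sketch under stated assumptions: `fake_msr_xfd`, `cached_xfd_state`, `fake_wrmsr()`, `set_state()`, `update_state()` and `reset_msr_only()` are hypothetical stand-ins, and the feature bit used in the usage note is arbitrary.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the hardware MSR and the per-CPU cache. */
uint64_t fake_msr_xfd;
uint64_t cached_xfd_state;
unsigned int msr_writes;

void fake_wrmsr(uint64_t val)
{
    fake_msr_xfd = val;
    msr_writes++;
}

/* Write the MSR and the cache together, so they cannot diverge. */
void set_state(uint64_t xfd)
{
    fake_wrmsr(xfd);
    cached_xfd_state = xfd;
}

/* Skip the MSR write when the cache already holds the wanted value. */
void update_state(uint64_t xfd)
{
    if (cached_xfd_state != xfd)
        set_state(xfd);
}

/* Buggy reset path: touches only the MSR and leaves the cache stale. */
void reset_msr_only(uint64_t init_xfd)
{
    fake_wrmsr(init_xfd);
}
```

If `reset_msr_only()` sets a feature bit while the cache still holds 0, a later `update_state(0)` is skipped and the MSR keeps the reset value. Routing the reset through `set_state()` instead keeps both copies consistent, which mirrors the approach the commit describes.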
  3. x86/cpu: Add model number for another Intel Arrow Lake mobile processor

    This one is the regular laptop CPU.
    
    Signed-off-by: Tony Luck <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    aegl authored and Ingo Molnar committed Mar 24, 2024
    8a8a9c9
  4. x86/boot/64: Apply encryption mask to 5-level pagetable update

    When running with 5-level page tables, the kernel mapping PGD entry is
    updated to point to the P4D table. The assignment uses _PAGE_TABLE_NOENC,
    which, when SME is active (mem_encrypt=on), results in a page table
    entry without the encryption mask set, causing the system to crash on
    boot.
    
    Change the assignment to use _PAGE_TABLE instead of _PAGE_TABLE_NOENC so
    that the encryption mask is set for the PGD entry.
    
    Fixes: 533568e ("x86/boot/64: Use RIP_REL_REF() to access early_top_pgt[]")
    Signed-off-by: Tom Lendacky <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Reviewed-by: Ard Biesheuvel <[email protected]>
    Link: https://lore.kernel.org/r/8f20345cda7dbba2cf748b286e1bc00816fe649a.1711122067.git.thomas.lendacky@amd.com
    tlendacky authored and Ingo Molnar committed Mar 24, 2024
    4d0d7e7
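A rough picture of why the choice of flags matters for the PGD entry: the entry is the table's physical address OR-ed with flag bits, and the encryption mask is just one more bit that is either included or not. The bit positions, `TABLE_FLAGS_NOENC`, `table_flags()` and the other names below are assumptions for illustration; the real mask position is platform-dependent.

```c
#include <stdint.h>

/* Hypothetical flag bits and encryption-mask position, for illustration only. */
#define PTE_PRESENT        (UINT64_C(1) << 0)
#define PTE_WRITABLE       (UINT64_C(1) << 1)
#define TABLE_FLAGS_NOENC  (PTE_PRESENT | PTE_WRITABLE)

uint64_t enc_mask = UINT64_C(1) << 47;  /* assumed; would be 0 with encryption off */

/* Table flags including the encryption mask, analogous to _PAGE_TABLE. */
uint64_t table_flags(void)
{
    return TABLE_FLAGS_NOENC | enc_mask;
}

/* Build a top-level entry that points at a lower-level table. */
uint64_t make_entry(uint64_t table_phys, uint64_t flags)
{
    return table_phys | flags;
}

/* Report whether an entry carries the encryption mask. */
int entry_is_encrypted(uint64_t entry)
{
    return (entry & enc_mask) != 0;
}
```

An entry built with the NOENC flags makes the walker read the next-level table as unencrypted memory, which matches the boot crash the commit describes when memory encryption is active.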
  5. x86/boot/64: Move 5-level paging global variable assignments back

    Commit 63bed96 ("x86/startup_64: Defer assignment of 5-level paging
    global variables") moved assignment of 5-level global variables to later
    in the boot in order to avoid having to use RIP relative addressing in
    order to set them. However, when running with 5-level paging and SME
    active (mem_encrypt=on), the variables are needed as part of the page
    table setup needed to encrypt the kernel (using pgd_none(), p4d_offset(),
    etc.). Since the variables haven't been set, the page table manipulation
    is done as if 4-level paging is active, causing the system to crash on
    boot.
    
    While only a subset of the assignments that were moved need to be set
    early, move all of the assignments back into check_la57_support() so that
    these assignments aren't spread between two locations. Instead of just
    reverting the fix, this uses the new RIP_REL_REF() macro when assigning
    the variables.
    
    Fixes: 63bed96 ("x86/startup_64: Defer assignment of 5-level paging global variables")
    Signed-off-by: Tom Lendacky <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Reviewed-by: Ard Biesheuvel <[email protected]>
    Link: https://lore.kernel.org/r/2ca419f4d0de719926fd82353f6751f717590a86.1711122067.git.thomas.lendacky@amd.com
    tlendacky authored and Ingo Molnar committed Mar 24, 2024
    9843231
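Why unset 5-level variables make the page table code behave as if 4-level paging were active can be sketched as follows. The shift and count values (39 and 1 by default, 48 and 512 with 5-level paging) are stated here as an assumption about the x86-64 layout; `enable_5level()` and `pgd_index_of()` are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

/* Defaults describe 4-level paging until the 5-level values are assigned. */
unsigned int pgdir_shift = 39;
unsigned int ptrs_per_p4d = 1;

/* Assign the 5-level values, as would happen once LA57 is detected. */
void enable_5level(void)
{
    pgdir_shift = 48;
    ptrs_per_p4d = 512;
}

/* Index into the top-level table; depends on the current pgdir_shift. */
unsigned int pgd_index_of(uint64_t addr)
{
    return (unsigned int)((addr >> pgdir_shift) & 511);
}
```

The same address selects a different top-level slot depending on whether the assignment has already happened, so any page table walk that runs before the assignment uses the wrong index on a 5-level system.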
  6. x86/efistub: Call mixed mode boot services on the firmware's stack

    Normally, the EFI stub calls into the EFI boot services using the stack
    that was live when the stub was entered. According to the UEFI spec,
    this stack needs to be at least 128k in size - this might seem large but
    all asynchronous processing and event handling in EFI runs from the same
    stack and so quite a lot of space may be used in practice.
    
    In mixed mode, the situation is a bit different: the bootloader calls
    the 32-bit EFI stub entry point, which calls the decompressor's 32-bit
    entry point, where the boot stack is set up, using a fixed allocation
    of 16k. This stack is still in use when the EFI stub is started in
    64-bit mode, and so all calls back into the EFI firmware will be using
    the decompressor's limited boot stack.
    
    Due to the placement of the boot stack right after the boot heap, any
    stack overruns have gone unnoticed. However, commit
    
      5c4feadb0011983b ("x86/decompressor: Move global symbol references to C code")
    
    moved the definition of the boot heap into C code, and now the boot
    stack is placed right at the base of BSS, where any overruns will
    corrupt the end of the .data section.
    
    While it would be possible to work around this by increasing the size of
    the boot stack, doing so would affect all x86 systems, and mixed mode
    systems are a tiny (and shrinking) fraction of the x86 installed base.
    
    So instead, record the firmware stack pointer value when entering from
    the 32-bit firmware, and switch to this stack every time an EFI boot
    service call is made.
    
    Cc: <[email protected]> # v6.1+
    Signed-off-by: Ard Biesheuvel <[email protected]>
    ardbiesheuvel committed Mar 24, 2024
    cefcd4f
  7. x86/efistub: Don't clear BSS twice in mixed mode

    Clearing BSS should only be done once, at the very beginning.
    efi_pe_entry() is the entrypoint from the firmware, which may not clear
    BSS and so it is done explicitly. However, efi_pe_entry() is also used
    as an entrypoint by the mixed mode startup code, in which case BSS will
    already have been cleared, and doing it again at this point will corrupt
    global variables holding the firmware's GDT/IDT and segment selectors.
    
    So make the memset() conditional on whether the EFI stub is running in
    native mode.
    
    Fixes: b3810c5 ("x86/efistub: Clear decompressor BSS in native EFI entrypoint")
    Signed-off-by: Ard Biesheuvel <[email protected]>
    ardbiesheuvel committed Mar 24, 2024
    df7ecce
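The conditional clearing described in the BSS commit can be pictured with a shared entry point that only zeroes its globals on the native path. `struct fake_bss`, `bss` and `stub_entry()` are hypothetical names; the sketch does not reproduce efi_pe_entry() itself.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for BSS globals saved by the mixed mode startup code. */
struct fake_bss {
    unsigned long saved_gdt_base;
} bss;

/* Shared entry point: only the native path still needs to zero BSS. */
void stub_entry(bool native_mode)
{
    if (native_mode)
        memset(&bss, 0, sizeof(bss));
}
```

When the mixed mode path has already stored firmware state in these globals, skipping the memset() preserves it; an unconditional memset() would wipe it.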
  8. efi: fix panic in kdump kernel

    Check if get_next_variable() is actually a valid pointer before
    calling it. In the kdump kernel this method is set to NULL, which
    causes a panic during the kexec-ed kernel boot.
    
    Tested with QEMU and OVMF firmware.
    
    Fixes: bad267f ("efi: verify that variable services are supported")
    Signed-off-by: Oleksandr Tymoshenko <[email protected]>
    Signed-off-by: Ard Biesheuvel <[email protected]>
    gonzoua authored and ardbiesheuvel committed Mar 24, 2024
    62b71cd
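The kind of guard this fix adds can be sketched in a few lines. `struct var_ops`, `fake_get_next_variable()` and `variable_services_supported()` are hypothetical and only mirror the shape of the check, not the EFI code itself.

```c
#include <stddef.h>

typedef int (*get_next_variable_t)(char *name, size_t len);

/* Hypothetical operations table; a kdump kernel may leave the hook NULL. */
struct var_ops {
    get_next_variable_t get_next_variable;
};

/* Dummy implementation that reports success with an empty name. */
int fake_get_next_variable(char *name, size_t len)
{
    if (len > 0)
        name[0] = '\0';
    return 0;
}

/* Return 1 if variable services look usable, 0 otherwise. */
int variable_services_supported(const struct var_ops *ops)
{
    char name[8];

    if (!ops || !ops->get_next_variable)
        return 0;  /* nothing to call: report "unsupported" instead of crashing */
    return ops->get_next_variable(name, sizeof(name)) == 0;
}
```

Without the NULL test the call would jump through a null function pointer, which is the panic the commit message refers to.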
  9. Merge tag 'dma-mapping-6.9-2024-03-24' of git://git.infradead.org/users/hch/dma-mapping
    
    Pull dma-mapping fixes from Christoph Hellwig:
     "This has a set of swiotlb alignment fixes for sometimes very long
      standing bugs from Will. We've been discussing them for a while and
      they should be solid now"
    
    * tag 'dma-mapping-6.9-2024-03-24' of git://git.infradead.org/users/hch/dma-mapping:
      swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE
      iommu/dma: Force swiotlb_max_mapping_size on an untrusted device
      swiotlb: Fix alignment checks when both allocation and DMA masks are present
      swiotlb: Honour dma_alloc_coherent() alignment in swiotlb_alloc()
      swiotlb: Enforce page alignment in swiotlb_alloc()
      swiotlb: Fix double-allocation of slots due to broken alignment handling
    torvalds committed Mar 24, 2024
    864ad04
  10. Merge tag 'sched-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull scheduler doc clarification from Thomas Gleixner:
     "A single update for the documentation of the base_slice_ns tunable to
      clarify that any value which is less than the tick slice has no effect
      because the scheduler tick is not guaranteed to happen within the set
      time slice"
    
    * tag 'sched-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      sched/doc: Update documentation for base_slice_ns and CONFIG_HZ relation
    torvalds committed Mar 24, 2024
    b136f68
  11. Merge tag 'x86-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull x86 fixes from Thomas Gleixner:
    
     - Ensure that the encryption mask at boot is properly propagated on
       5-level page tables, otherwise the PGD entry is incorrectly set to
       non-encrypted, which causes system crashes during boot.
    
     - Undo the deferred 5-level page table setup as it cannot work with
       memory encryption enabled.
    
     - Prevent inconsistent XFD state on CPU hotplug, where the MSR is reset
       to the default value but the cached variable is not, so subsequent
       comparisons might yield the wrong result and as a consequence the
       result prevents updating the MSR.
    
     - Register the local APIC address only once in the MPPARSE enumeration
       to prevent triggering the related WARN_ONs() in the APIC and topology
       code.
    
     - Handle the case where no APIC is found gracefully by registering a
       fake APIC in the topology code. That makes all related topology
       functions work correctly and does not affect the actual APIC driver
       code at all.
    
     - Don't evaluate logical IDs during early boot as the local APIC IDs
       are not yet enumerated and the invoked function returns an error
       code. Nothing requires the logical IDs before the final CPUID
       enumeration takes place, which happens after the enumeration.
    
     - Cure the fallout of the per CPU rework on UP which misplaced the
       copying of boot_cpu_data to per CPU data so that the final update to
       boot_cpu_data got lost which caused inconsistent state and boot
       crashes.
    
     - Use copy_from_kernel_nofault() in the kprobes setup as there is no
       guarantee that the address can be safely accessed.
    
     - Reorder struct members in struct saved_context to work around another
       kmemleak false positive
    
     - Remove the buggy code which tries to update the E820 kexec table for
       setup_data as that is never passed to the kexec kernel.
    
     - Update the resource control documentation to use the proper units.
    
     - Fix a Kconfig warning observed with tinyconfig
    
    * tag 'x86-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      x86/boot/64: Move 5-level paging global variable assignments back
      x86/boot/64: Apply encryption mask to 5-level pagetable update
      x86/cpu: Add model number for another Intel Arrow Lake mobile processor
      x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD
      Documentation/x86: Document that resctrl bandwidth control units are MiB
      x86/mpparse: Register APIC address only once
      x86/topology: Handle the !APIC case gracefully
      x86/topology: Don't evaluate logical IDs during early boot
      x86/cpu: Ensure that CPU info updates are propagated on UP
      kprobes/x86: Use copy_from_kernel_nofault() to read from unsafe address
      x86/pm: Work around false positive kmemleak report in msr_build_context()
      x86/kexec: Do not update E820 kexec table for setup_data
      x86/config: Fix warning for 'make ARCH=x86_64 tinyconfig'
    torvalds committed Mar 24, 2024
    5e74df2
  12. Merge tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
    
    Pull EFI fixes from Ard Biesheuvel:
    
     - Fix logic that is supposed to prevent placement of the kernel image
       below LOAD_PHYSICAL_ADDR
    
     - Use the firmware stack in the EFI stub when running in mixed mode
    
     - Clear BSS only once when using mixed mode
    
     - Check efi.get_variable() function pointer for NULL before trying to
       call it
    
    * tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
      efi: fix panic in kdump kernel
      x86/efistub: Don't clear BSS twice in mixed mode
      x86/efistub: Call mixed mode boot services on the firmware's stack
      efi/libstub: fix efi_random_alloc() to allocate memory at alloc_min or higher address
    torvalds committed Mar 24, 2024
    ab8de2d
  13. Linux 6.9-rc1

    torvalds committed Mar 24, 2024
    4cece76

Commits on Mar 25, 2024

  1. fs/9p: fix uaf in in v9fs_stat2inode_dotl

    Accessing the st object in the wrong logical order in v9fs_fid_iget_dotl
    causes this use-after-free (uaf).
    
    Fixes: 724a084 ("fs/9p: simplify iget to remove unnecessary paths")
    Reported-and-tested-by: [email protected]
    Signed-off-by: Lizhi Xu <[email protected]>
    Tested-by: Breno Leitao <[email protected]>
    Reviewed-by: Dominique Martinet <[email protected]>
    Signed-off-by: Eric Van Hensbergen <[email protected]>
    Lizhi Xu authored and ericvh committed Mar 25, 2024
    11763a8
  2. fs/9p: remove redundant pointer v9ses

    Pointer v9ses is being assigned the value from the return of inlined
    function v9fs_inode2v9ses (which just returns inode->i_sb->s_fs_info).
    The pointer is not used after the assignment, so the variable is
    redundant and can be removed.
    
    Cleans up clang scan warnings such as:
    fs/9p/vfs_inode_dotl.c:300:28: warning: variable 'v9ses' set but not
    used [-Wunused-but-set-variable]
    
    Signed-off-by: Colin Ian King <[email protected]>
    Reviewed-by: Dominique Martinet <[email protected]>
    Signed-off-by: Eric Van Hensbergen <[email protected]>
    ColinIanKing authored and ericvh committed Mar 25, 2024
    10211b4
  3. erofs: drop experimental warning for FSDAX

    As with the EXT4/XFS filesystems, FSDAX functionality is considered to be stable.
    Let's drop this warning.
    
    Reviewed-by: Jingbo Xu <[email protected]>
    Signed-off-by: Gao Xiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    hsiangkao committed Mar 25, 2024
    a97b59e
  4. MAINTAINERS: erofs: add myself as reviewer

    I have been contributing to erofs for some time and I would like to help
    with code reviews as well.
    
    Signed-off-by: Sandeep Dhavale <[email protected]>
    Acked-by: Chao Yu <[email protected]>
    Reviewed-by: Gao Xiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Gao Xiang <[email protected]>
    dhavale authored and hsiangkao committed Mar 25, 2024
    7557d29
  5. pwm: img: fix pwm clock lookup

    22e8e19 has introduced a regression in the imgchip->pwm_clk lookup: the
    clock name was also renamed to "imgchip". This causes the driver to
    fail to load:
    
    [    0.546905] img-pwm 18101300.pwm: failed to get imgchip clock
    [    0.553418] img-pwm: probe of 18101300.pwm failed with error -2
    
    Fix this lookup by reverting the clock name back to "pwm".
    
    Signed-off-by: Zoltan HERPAI <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: 22e8e19 ("pwm: img: Rename variable pointing to driver private data")
    Signed-off-by: Uwe Kleine-König <[email protected]>
    wigyori authored and Uwe Kleine-König committed Mar 25, 2024
    9eb0587
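The failure mode in the pwm fix is a name-based lookup that no longer matches what the platform provides. A minimal sketch, assuming a hypothetical table of provided clock names and a hypothetical `lookup_clock()` helper; it does not use the kernel's clock API.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of clock names the platform provides for this device. */
static const char *const provided_clocks[] = { "pwm", "sys" };

/* Return 0 if a clock with this name exists, -ENOENT otherwise. */
int lookup_clock(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(provided_clocks) / sizeof(provided_clocks[0]); i++) {
        if (strcmp(provided_clocks[i], name) == 0)
            return 0;
    }
    return -ENOENT;
}
```

Looking up "imgchip" fails with -ENOENT here, while "pwm" succeeds. On Linux ENOENT is 2, which lines up with the "error -2" in the probe log quoted in the commit message.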
  6. tracing: probes: Fix to zero initialize a local variable

    Initialize the local variable 'val' with zero.
    Dan reported that the Smatch static code checker reports an error that the
    local variable 'val' needs to be initialized. Actually, 'val' is expected to
    be initialized by FETCH_OP_ARG in the same loop, but that is not obvious. So
    initialize it with zero.
    
    Link: https://lore.kernel.org/all/171092223833.237219.17304490075697026697.stgit@devnote2/
    
    Reported-by: Dan Carpenter <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Fixes: 25f00e4 ("tracing/probes: Support $argN in return probe (kprobe and fprobe)")
    Reviewed-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    mhiramat committed Mar 25, 2024
    0add699
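The pattern behind this fix, a local that is normally set by the first operation of a loop but is not provably set, can be sketched like this. The op codes and `run_ops()` are hypothetical; only the idea of an argument-fetching first op is borrowed from the commit text.

```c
enum op_kind { OP_ARG, OP_ADD, OP_END };

struct op {
    enum op_kind kind;
    unsigned long operand;
};

/* Run a tiny op sequence; OP_ARG is expected to come first and set val. */
unsigned long run_ops(const struct op *ops, const unsigned long *args)
{
    unsigned long val = 0;  /* explicit init: defined even without OP_ARG */

    for (; ops->kind != OP_END; ops++) {
        switch (ops->kind) {
        case OP_ARG:
            val = args[ops->operand];
            break;
        case OP_ADD:
            val += ops->operand;
            break;
        default:
            break;
        }
    }
    return val;
}
```

A static checker cannot know that every sequence starts with the argument fetch, so the explicit zero keeps the result well defined for sequences that do not.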
  7. mlxbf_gige: stop PHY during open() error paths

    The mlxbf_gige_open() routine starts the PHY as part of normal
    initialization.  The mlxbf_gige_open() routine must stop the
    PHY during its error paths.
    
    Fixes: f92e186 ("Add Mellanox BlueField Gigabit Ethernet driver")
    Signed-off-by: David Thompson <[email protected]>
    Reviewed-by: Asmaa Mnebhi <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Reviewed-by: Jiri Pirko <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    dthompso authored and davem330 committed Mar 25, 2024
    d6c30c5
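Stopping the PHY on every failure after it was started is typically done with a single unwind label. The sketch below uses a hypothetical `struct fake_dev` and helper functions to make the unwinding observable; it is not the driver's open routine.

```c
/* Hypothetical device state used only to make the unwinding observable. */
struct fake_dev {
    int phy_running;
    int fail_step;  /* which later step should fail; 0 means none */
};

int start_phy(struct fake_dev *d)
{
    d->phy_running = 1;
    return 0;
}

void stop_phy(struct fake_dev *d)
{
    d->phy_running = 0;
}

/* Stand-in for any initialization step that can fail after the PHY started. */
int later_step(struct fake_dev *d, int step)
{
    return d->fail_step == step ? -1 : 0;
}

/* Open routine: every failure after start_phy() unwinds through stop_phy(). */
int fake_open(struct fake_dev *d)
{
    int err;

    err = start_phy(d);
    if (err)
        return err;

    err = later_step(d, 1);
    if (err)
        goto err_stop_phy;

    err = later_step(d, 2);
    if (err)
        goto err_stop_phy;

    return 0;

err_stop_phy:
    stop_phy(d);
    return err;
}
```

A failing second step leaves `phy_running` at 0, whereas an open routine that returned directly from the failure would leave the PHY running.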
  8. fs/9p: fix uninitialized values during inode evict

    If an iget fails due to not being able to retrieve information
    from the server then the inode structure is only partially
    initialized.  When the inode gets evicted, references to
    uninitialized structures (like fscache cookies) were being
    made.
    
    This patch checks for a bad_inode before doing anything other
    than clearing the inode from the cache.  Since the inode is
    bad, it shouldn't have any state associated with it that needs
    to be written back (and there really isn't a way to complete
    those anyways).
    
    Reported-by: [email protected]
    Signed-off-by: Eric Van Hensbergen <[email protected]>
    ericvh committed Mar 25, 2024
    6630036
  9. wifi: mac80211: fix mlme_link_id_dbg()

    Make sure that the new mlme_link_id_dbg() macro honours
    CONFIG_MAC80211_MLME_DEBUG as intended to avoid spamming the log with
    messages like:
    
    	wlan0: no EHT support, limiting to HE
    	wlan0: determined local STA to be HE, BW limited to 160 MHz
    	wlan0: determined AP xx:xx:xx:xx:xx:xx to be VHT
    	wlan0: connecting with VHT mode, max bandwidth 160 MHz
    
    Fixes: 310c838 ("wifi: mac80211: clean up connection process")
    Signed-off-by: Johan Hovold <[email protected]>
    Link: https://msgid.link/[email protected]
    Tested-by: Kalle Valo <[email protected]>
    Signed-off-by: Johannes Berg <[email protected]>
    jhovold authored and jmberg-intel committed Mar 25, 2024
    27f8f10
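A debug macro that honours a build-time switch usually compiles to nothing when the switch is off, while still letting the compiler check the format string. A minimal sketch, assuming a hypothetical `HYPOTHETICAL_MLME_DEBUG` define and a hypothetical `link_dbg()` macro rather than the real mac80211 ones:

```c
#include <stdio.h>

/* Counts how many debug messages were actually emitted. */
int dbg_emitted;

/* Define HYPOTHETICAL_MLME_DEBUG at build time to enable the messages. */
#ifdef HYPOTHETICAL_MLME_DEBUG
#define link_dbg(...) \
    do { dbg_emitted++; printf(__VA_ARGS__); } while (0)
#else
/* Dead branch: arguments are still type-checked, but nothing is printed. */
#define link_dbg(...) \
    do { if (0) printf(__VA_ARGS__); } while (0)
#endif

void report_mode(const char *ifname)
{
    link_dbg("%s: no EHT support, limiting to HE\n", ifname);
}
```

Built without the define, calling `report_mode("wlan0")` prints nothing and leaves `dbg_emitted` at 0; a macro that ignored the switch would log on every connection attempt.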
  10. wifi: mac80211: check/clear fast rx for non-4addr sta VLAN changes

    When moving a station out of a VLAN and deleting the VLAN afterwards, the
    fast_rx entry still holds a pointer to the VLAN's netdev, which can cause
    use-after-free bugs. Fix this by immediately calling ieee80211_check_fast_rx
    after the VLAN change.
    
    Cc: [email protected]
    Reported-by: [email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    nbd168 authored and jmberg-intel committed Mar 25, 2024
    4f2bdb3
  11. wifi: mac80211: fix ieee80211_bss_*_flags kernel-doc

    Running kernel-doc on ieee80211_i.h flagged the following:
    net/mac80211/ieee80211_i.h:145: warning: expecting prototype for enum ieee80211_corrupt_data_flags. Prototype was for enum ieee80211_bss_corrupt_data_flags instead
    net/mac80211/ieee80211_i.h:162: warning: expecting prototype for enum ieee80211_valid_data_flags. Prototype was for enum ieee80211_bss_valid_data_flags instead
    
    Fix these warnings.
    
    Signed-off-by: Jeff Johnson <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Jeff Johnson authored and jmberg-intel committed Mar 25, 2024
    774f884
  12. wifi: cfg80211: add a flag to disable wireless extensions

    Wireless extensions are already disabled if MLO is enabled,
    given that we cannot support MLO there with all the hard-
    coded assumptions about BSSID etc.
    
    However, the WiFi7 ecosystem is still stabilizing, and some
    devices may need MLO disabled while that happens. In that
    case, we might end up with a device that supports wext (but
    not MLO) in one kernel, and then breaks wext in the future
    (by enabling MLO), which is not desirable.
    
    Add a flag to let such drivers/devices disable wext even if
    MLO isn't yet enabled.
    
    Cc: [email protected]
    Link: https://msgid.link/20240314110951.b50f1dc4ec21.I656ddd8178eedb49dc5c6c0e70f8ce5807afb54f@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel committed Mar 25, 2024
    be23b2d
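The resulting rule for when wireless extensions stay available can be written as a single flag test. The bit assignments and the `wext_available()` helper are assumptions made for this sketch; only the two conditions (MLO enabled, or the new disable flag set) come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; only the two conditions mirror the commit text. */
#define FLAG_SUPPORTS_MLO  (UINT32_C(1) << 0)
#define FLAG_DISABLE_WEXT  (UINT32_C(1) << 1)

/* wext stays available only if neither MLO nor the disable flag is set. */
bool wext_available(uint32_t flags)
{
    return !(flags & (FLAG_SUPPORTS_MLO | FLAG_DISABLE_WEXT));
}
```

A driver that sets the disable flag today and turns MLO on later reports the same answer in both kernels, which is the stability the commit is after.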
  13. wifi: iwlwifi: mvm: disable MLO for the time being

    MLO ended up not being fully stable yet, and we want to make
    sure it works well with the ecosystem before enabling it.
    Thus, remove the flag, but set WIPHY_FLAG_DISABLE_WEXT so
    we don't get wireless extensions back until we enable MLO
    for this hardware.
    
    Cc: [email protected]
    Reviewed-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240314110951.d6ad146df98d.I47127e4fdbdef89e4ccf7483641570ee7871d4e6@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel committed Mar 25, 2024
    5f40400
  14. wifi: cfg80211: fix rdev_dump_mpp() arguments order

    Fix the order of arguments in the TP_ARGS macro
    for the rdev_dump_mpp tracepoint event.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    Signed-off-by: Igor Artemiev <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Igor Artemiev authored and jmberg-intel committed Mar 25, 2024
    ec50f31
  15. wifi: mac80211: fix prep_connection error path

    If prep_channel fails in prep_connection, the code releases
    the deflink's chanctx, which is wrong since we may be using
    a different link. It's already wrong to even do that always
    though, since we might still have the station. Remove it
    only if prep_channel succeeded and later updates fail.
    
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240318184907.2780c1f08c3d.I033c9b15483933088f32a2c0789612a33dd33d82@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel committed Mar 25, 2024
    2e6bd24
  16. wifi: iwlwifi: mvm: pick the version of SESSION_PROTECTION_NOTIF

    When we want to know whether we should look for the mac_id or the
    link_id in struct iwl_mvm_session_prot_notif, we should look at the
    version of SESSION_PROTECTION_NOTIF.
    
    This causes WARNINGs:
    
    WARNING: CPU: 0 PID: 11403 at drivers/net/wireless/intel/iwlwifi/mvm/time-event.c:959 iwl_mvm_rx_session_protect_notif+0x333/0x340 [iwlmvm]
    RIP: 0010:iwl_mvm_rx_session_protect_notif+0x333/0x340 [iwlmvm]
    Code: 00 49 c7 84 24 48 07 00 00 00 00 00 00 41 c6 84 24 78 07 00 00 ff 4c 89 f7 e8 e9 71 54 d9 e9 7d fd ff ff 0f 0b e9 23 fe ff ff <0f> 0b e9 1c fe ff ff 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90
    RSP: 0018:ffffb4bb00003d40 EFLAGS: 00010202
    RAX: 0000000000000000 RBX: ffff9ae63a361000 RCX: ffff9ae4a98b60d4
    RDX: ffff9ae4588499c0 RSI: 0000000000000305 RDI: ffff9ae4a98b6358
    RBP: ffffb4bb00003d68 R08: 0000000000000003 R09: 0000000000000010
    R10: ffffb4bb00003d00 R11: 000000000000000f R12: ffff9ae441399050
    R13: ffff9ae4761329e8 R14: 0000000000000001 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:ffff9ae7af400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000055fb75680018 CR3: 00000003dae32006 CR4: 0000000000f70ef0
    PKRU: 55555554
    Call Trace:
     <IRQ>
     ? show_regs+0x69/0x80
     ? __warn+0x8d/0x150
     ? iwl_mvm_rx_session_protect_notif+0x333/0x340 [iwlmvm]
     ? report_bug+0x196/0x1c0
     ? handle_bug+0x45/0x80
     ? exc_invalid_op+0x1c/0xb0
     ? asm_exc_invalid_op+0x1f/0x30
     ? iwl_mvm_rx_session_protect_notif+0x333/0x340 [iwlmvm]
     iwl_mvm_rx_common+0x115/0x340 [iwlmvm]
     iwl_mvm_rx_mq+0xa6/0x100 [iwlmvm]
     iwl_pcie_rx_handle+0x263/0xa10 [iwlwifi]
     iwl_pcie_napi_poll_msix+0x32/0xd0 [iwlwifi]
    
    Fixes: 085d33c ("wifi: iwlwifi: support link id in SESSION_PROTECTION_NOTIF")
    Signed-off-by: Emmanuel Grumbach <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240311081938.39d5618f7b9d.I564d863e53c6cbcb49141467932ecb6a9840b320@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    egrumbach authored and jmberg-intel committed Mar 25, 2024
    Commit bbe806c
  17. wifi: iwlwifi: mvm: consider having one active link

    Do not call iwl_mvm_mld_get_primary_link if only one link
    is active.
    In that case, the sole active link should be used.
    
    iwl_mvm_mld_get_primary_link returns -1 if only one link
    is active causing a warning.
    
    Fixes: 8c9bef2 ("wifi: iwlwifi: mvm: d3: implement suspend with MLO")
    Signed-off-by: Shaul Triebitz <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240311081938.6c50061bf69b.I05b0ac7fa7149eabaa5570a6f65b0d9bfb09a6f1@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    striebit authored and jmberg-intel committed Mar 25, 2024
    Commit 847d735
  18. wifi: iwlwifi: mvm: Configure the link mapping for non-MLD FW

    In the non MLD firmware flows, although the deflink is used, the mapping
    of link ID to BSS configuration was missing, which causes flows that need
    this mapping to crash.
    
    Fix this by adding the link ID to BSS configuration mapping to non MLD
    flows as well.
    
    Signed-off-by: Ilan Peer <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240311081938.0b5c361e8f0c.Ib11f41815d2efa5d1ec57f855de4c8563142987b@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    ilanpeer2 authored and jmberg-intel committed Mar 25, 2024
    Commit a8b5d48
  19. wifi: mac80211: correctly set active links upon TTLM

    Fix ieee80211_ttlm_set_links() to not set all active links,
    but instead let the driver know that valid links status changed
    and select the active links properly.
    
    Fixes: 8f500fb ("wifi: mac80211: process and save negotiated TID to Link mapping request")
    Signed-off-by: Ayala Beker <[email protected]>
    Reviewed-by: Ilan Peer <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240318184907.acddbbf39584.Ide858f95248fcb3e483c97fcaa14b0cd4e964b10@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    AyalaBkr authored and jmberg-intel committed Mar 25, 2024
    Commit 134d715
  20. wifi: iwlwifi: mvm: rfi: fix potential response leaks

    If the rx payload length check fails, or if kmemdup() fails,
    we still need to free the command response. Fix that.
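    
    A minimal userspace sketch of the cleanup pattern described above,
    not the driver code: every exit path after the response is obtained
    goes through one label that frees it. The names resp_alloc(),
    resp_free() and get_table() are hypothetical stand-ins, and
    live_resps exists only to make a leak observable.
    
    ```c
    #include <stdlib.h>
    #include <string.h>
    
    /* Hypothetical stand-in for a command response owned by the caller. */
    struct resp {
    	size_t len;
    	unsigned char *data;
    };
    
    static int live_resps;	/* responses allocated but not yet freed */
    
    static struct resp *resp_alloc(size_t len)
    {
    	struct resp *r = malloc(sizeof(*r));
    
    	if (!r)
    		return NULL;
    	r->data = calloc(1, len ? len : 1);
    	if (!r->data) {
    		free(r);
    		return NULL;
    	}
    	r->len = len;
    	live_resps++;
    	return r;
    }
    
    static void resp_free(struct resp *r)
    {
    	free(r->data);
    	free(r);
    	live_resps--;
    }
    
    /* Copy the payload out; a length mismatch or failed copy still frees r. */
    static void *get_table(size_t got, size_t want)
    {
    	void *copy = NULL;
    	struct resp *r = resp_alloc(got);
    
    	if (!r)
    		return NULL;
    	if (r->len != want)
    		goto out;	/* length check failed: fall through to the free */
    	copy = malloc(want);
    	if (copy)
    		memcpy(copy, r->data, want);
    out:
    	resp_free(r);
    	return copy;
    }
    ```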
    
    Fixes: 2125490 ("iwlwifi: mvm: add RFI-M support")
    Co-authored-by: Anjaneyulu <[email protected]>
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240319100755.db2fa0196aa7.I116293b132502ac68a65527330fa37799694b79c@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel and panjaney committed Mar 25, 2024
    Commit 06a0938
  21. wifi: iwlwifi: fw: don't always use FW dump trig

    Since the dump_data (struct iwl_fwrt_dump_data) is a union,
    it's not safe to unconditionally access and use the 'trig'
    member, it might be 'desc' instead. Access it only if it's
    known to be 'trig' rather than 'desc', i.e. if ini-debug
    is present.
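    
    The hazard can be sketched in plain C, assuming a simplified layout:
    a union holds either a 'trig' or a 'desc' pointer, and a separate
    flag says which one is live. The struct and function names below are
    hypothetical and do not mirror the real iwl_fwrt_dump_data.
    
    ```c
    #include <stdbool.h>
    #include <stddef.h>
    
    struct trig { int id; };
    struct desc { int type; };
    
    /* Hypothetical stand-in: only one union member is valid at a time. */
    struct dump_data {
    	bool ini_debug;	/* true means 'trig' is the live member */
    	union {
    		const struct trig *trig;
    		const struct desc *desc;
    	} u;
    };
    
    /* Return the trigger only when the discriminator says it is valid. */
    static const struct trig *dump_trig(const struct dump_data *d)
    {
    	return d->ini_debug ? d->u.trig : NULL;
    }
    ```
    
    Callers that go through dump_trig() never interpret a 'desc' pointer
    as a 'trig'.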
    
    Cc: [email protected]
    Fixes: 0eb50c6 ("iwlwifi: yoyo: send hcmd to fw after dump collection completes.")
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240319100755.e2976bc58b29.I72fbd6135b3623227de53d8a2bb82776066cb72b@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel committed Mar 25, 2024
    Commit 045a5b6
  22. wifi: iwlwifi: read txq->read_ptr under lock

    If we read txq->read_ptr without lock, we can read the same
    value twice, then obtain the lock, and reclaim from there
    to two different places, but crucially reclaim the same
    entry twice, resulting in the WARN_ONCE() a little later.
    Fix that by reading txq->read_ptr under lock.
    
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240319100755.bf4c62196504.I978a7ca56c6bd6f1bf42c15aa923ba03366a840b@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel committed Mar 25, 2024
    Commit c2ace63
  23. wifi: iwlwifi: mvm: guard against invalid STA ID on removal

    Guard against invalid station IDs in iwl_mvm_mld_rm_sta_id as that would
    result in out-of-bounds array accesses. This prevents issues should the
    driver get into a bad state during error handling.
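    
    A minimal sketch of such a guard, assuming a fixed-size table indexed
    by station ID. STA_COUNT, INVALID_STA and rm_sta_id() are hypothetical
    names chosen for illustration, not the driver's.
    
    ```c
    #include <stdbool.h>
    #include <stddef.h>
    
    #define STA_COUNT	16	/* hypothetical table size */
    #define INVALID_STA	0xff	/* hypothetical "no station" marker */
    
    static void *sta_table[STA_COUNT];
    
    /* Clear a table slot; refuse IDs that would index outside the array. */
    static bool rm_sta_id(unsigned int sta_id)
    {
    	if (sta_id >= STA_COUNT)
    		return false;
    	sta_table[sta_id] = NULL;
    	return true;
    }
    ```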
    
    Signed-off-by: Benjamin Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240320232419.d523167bda9c.I1cffd86363805bf86a95d8bdfd4b438bb54baddc@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    benzea authored and jmberg-intel committed Mar 25, 2024
    Commit 17f6451
  24. wifi: iwlwifi: mvm: handle debugfs names more carefully

    With debugfs=off, we can get here with the dbgfs_dir being
    an ERR_PTR(). Instead of checking for all this, which is
    often flagged as a mistake, simply handle the names here
    more carefully by printing them, then we don't need extra
    checks.
    
    Also, while checking, I noticed theoretically 'buf' is too
    small, so fix that size as well.
    
    Cc: [email protected]
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218422
    Fixes: c36235a ("wifi: iwlwifi: mvm: rework debugfs handling")
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240320232419.4dc1eb3dd015.I32f308b0356ef5bcf8d188dd98ce9b210e3ab9fd@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel committed Mar 25, 2024
    Commit 19d82bd
  25. wifi: iwlwifi: mvm: include link ID when releasing frames

    When releasing frames from the reorder buffer, the link ID was not
    included in the RX status information. This subsequently led mac80211 to
    drop the frame. Change it so that the link information is set
    immediately when possible so that it does not need to be filled in
    anymore when submitting the frame to mac80211.
    
    Fixes: b8a85a1 ("wifi: iwlwifi: mvm: rxmq: report link ID to mac80211")
    Signed-off-by: Benjamin Berg <[email protected]>
    Tested-by: Emmanuel Grumbach <[email protected]>
    Reviewed-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240320232419.bbbd5e9bfe80.Iec1bf5c884e371f7bc5ea2534ed9ea8d3f2c0bf6@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    benzea authored and jmberg-intel committed Mar 25, 2024
    Commit e78d787
  26. net: mark racy access on sk->sk_rcvbuf

    sk->sk_rcvbuf in __sock_queue_rcv_skb() and __sk_receive_skb() can be
    changed by other threads. Mark this as benign using READ_ONCE().
    
    This patch is aimed at reducing the number of benign races reported by
    KCSAN in order to focus future debugging effort on harmful races.
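    
    As a rough userspace illustration, READ_ONCE() can be approximated
    with a volatile access so the field is loaded exactly once. The macro
    below is a simplified stand-in for the kernel's (it relies on the GNU
    C __typeof__ extension), and struct sock_sketch and would_overflow()
    are hypothetical.
    
    ```c
    /* Simplified stand-in for the kernel macro: a single volatile load. */
    #define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
    
    struct sock_sketch {
    	int rcvbuf;	/* may be changed concurrently by another thread */
    	int rmem_alloc;
    };
    
    /* Load the limit once so the comparison uses one consistent value. */
    static int would_overflow(struct sock_sketch *sk, int truesize)
    {
    	int limit = READ_ONCE(sk->rcvbuf);
    
    	return sk->rmem_alloc + truesize > limit;
    }
    ```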
    
    Signed-off-by: linke li <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    linke li authored and davem330 committed Mar 25, 2024
    Commit c2deb2e
  27. bpf: Sync uapi bpf.h to tools directory

    There is a difference between kernel uapi bpf.h and tools
    uapi bpf.h. There is no functionality difference, but let
    us sync properly to make it easy for later bpf.h update.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Yonghong Song authored and borkmann committed Mar 25, 2024
    Commit 476a5e9
  28. selftests/bpf: Use start_server in bpf_tcp_ca

    To simplify the code, use BPF selftests helper start_server() in
    bpf_tcp_ca.c instead of open-coding it. This helper is defined in
    network_helpers.c, and exported in network_helpers.h, which is already
    included in bpf_tcp_ca.c.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Link: https://lore.kernel.org/bpf/9926a79118db27dd6d91c4854db011c599cabd0e.1711331517.git.tanggeliang@kylinos.cn
    Geliang Tang authored and borkmann committed Mar 25, 2024
    Commit c29083f
  29. bpf: Avoid get_kernel_nofault() to fetch kprobe entry IP

    get_kernel_nofault() (or, rather, underlying copy_from_kernel_nofault())
    is not free and it does pop up in performance profiles when
    kprobes are heavily utilized with CONFIG_X86_KERNEL_IBT=y config.
    
    Let's avoid using it if we know that fentry_ip - 4 can't cross page
    boundary. We do that by masking lowest 12 bits and checking if they are
    >= 4, in which case we can do direct memory read.
    
    Another benefit (and actually what caused a closer look at this part of
    code) is that now LBR record is (typically) not wasted on
    copy_from_kernel_nofault() call and code, which helps tools like
    retsnoop that grab LBR records from inside BPF code in kretprobes.
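    
    The page-boundary test can be sketched as follows, assuming 4 KiB
    pages (12 offset bits): the 4 bytes ending at the address stay on the
    same page exactly when the page offset is at least 4. The helper name
    and PAGE_SIZE_SKETCH are hypothetical.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>
    
    #define PAGE_SIZE_SKETCH 4096u	/* assumption: 4 KiB pages */
    
    /* True if the 4 bytes at [ip - 4, ip) lie on the same page as ip. */
    static bool prev4_on_same_page(uintptr_t ip)
    {
    	return (ip & (PAGE_SIZE_SKETCH - 1)) >= 4;
    }
    ```
    
    Only when this returns false is the slower fault-tolerant read needed.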
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Acked-by: Jiri Olsa <[email protected]>
    Acked-by: Masami Hiramatsu (Google) <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    anakryiko authored and borkmann committed Mar 25, 2024
    Commit a849750
  30. bpf: implement insn_is_cast_user() helper for JITs

    Implement a helper function to check if an instruction is
    addr_space_cast from as(0) to as(1). Use this helper in the x86 JIT.
    
    Other JITs can use this helper when they add support for this instruction.
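    
    A self-contained sketch of what such a check might look like. The
    opcode constants are the usual BPF values; the BPF_ADDR_SPACE_CAST
    value and the imm encoding, (dst_as << 16) | src_as, are assumptions
    here, and struct insn_sketch is a hypothetical stand-in for struct
    bpf_insn.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>
    
    #define BPF_ALU64		0x07
    #define BPF_MOV			0xb0
    #define BPF_X			0x08
    #define BPF_ADDR_SPACE_CAST	1	/* assumption: 'off' value marking the cast */
    
    struct insn_sketch {
    	uint8_t code;
    	int16_t off;
    	int32_t imm;	/* assumption: (dst_as << 16) | src_as */
    };
    
    /* True for a register move that casts from as(0) to as(1). */
    static bool is_cast_user(const struct insn_sketch *insn)
    {
    	return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) &&
    	       insn->off == BPF_ADDR_SPACE_CAST &&
    	       insn->imm == (1 << 16);
    }
    ```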
    
    Signed-off-by: Puranjay Mohan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed Mar 25, 2024
    Commit 770546a
  31. selftests/bpf: Fix flaky test btf_map_in_map/lookup_update

    Recently, I frequently hit the following test failure:
    
      [root@arch-fb-vm1 bpf]# ./test_progs -n 33/1
      test_lookup_update:PASS:skel_open 0 nsec
      [...]
      test_lookup_update:PASS:sync_rcu 0 nsec
      test_lookup_update:FAIL:map1_leak inner_map1 leaked!
      #33/1    btf_map_in_map/lookup_update:FAIL
      #33      btf_map_in_map:FAIL
    
    In the test, after map is closed and then after two rcu grace periods,
    it is assumed that map_id is not available to user space.
    
    But the above assumption cannot be guaranteed. After zero or one
    or two rcu grace periods in different situations, the actual
    freeing-map-work is put into a workqueue. Later on, when the work
    is dequeued, the map will be actually freed.
    See bpf_map_put() in kernel/bpf/syscall.c.
    
    By using workqueue, there is no guarantee that map will be actually
    freed after a couple of rcu grace periods. This patch removed
    such map leak detection and then the test can pass consistently.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Yonghong Song authored and borkmann committed Mar 25, 2024
    Commit 14bb1e8
  32. kunit: fix wireless test dependencies

    For the wireless tests, CONFIG_WLAN and CONFIG_NETDEVICES are
    needed, though seem to be available by default on ARCH=um, so
    we didn't notice this before. Add them to fix kunit running
    on other architectures.
    
    Fixes: 28b3df1 ("kunit: add wireless unit tests")
    Reported-by: Mark Brown <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]/
    Signed-off-by: Johannes Berg <[email protected]>
    jmberg-intel committed Mar 25, 2024
    Commit dbde9fd
  33. ice: Refactor FW data type and fix bitmap casting issue

    According to the datasheet, the recipe association data is an 8-byte
    little-endian value. It is described as 'Bitmap of the recipe indexes
    associated with this profile' and occupies bytes 24 to 31 in FW.
    Therefore, define it as '__le64 recipe_assoc' in struct
    ice_aqc_recipe_to_profile. Also fix the bitmap casting issue, as we
    must never use casts for bitmap types.
    
    Fixes: 1e0f988 ("ice: Flesh out implementation of support for SRIOV on bonded interface")
    Reviewed-by: Przemek Kitszel <[email protected]>
    Reviewed-by: Andrii Staikov <[email protected]>
    Reviewed-by: Jan Sokolowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Steven Zou <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    hzouSteven authored and anguy11 committed Mar 25, 2024
    Commit 817b189
  34. ice: fix memory corruption bug with suspend and rebuild

    The ice driver would previously panic after suspend. This is caused
    by the driver *only* calling the ice_vsi_free_q_vectors() function by
    itself, when it is suspending. Since commit b3e7b3a ("ice: prevent
    NULL pointer deref during reload") the driver has zeroed out
    num_q_vectors, and only restored it in ice_vsi_cfg_def().
    
    This further causes the ice_rebuild() function to allocate a zero length
    buffer, after which num_q_vectors is updated, and then the new value of
    num_q_vectors is used to index into the zero length buffer, which
    corrupts memory.
    
    The fix entails making sure all the code referencing num_q_vectors only
    does so after it has been reset via ice_vsi_cfg_def().
    
    I didn't perform a full bisect, but I was able to test against 6.1.77
    kernel and that ice driver works fine for suspend/resume with no panic,
    so sometime since then, this problem was introduced.
    
    Also clean up an un-needed init of a local variable in the function
    being modified.
    
    PANIC from 6.8.0-rc1:
    
    [1026674.915596] PM: suspend exit
    [1026675.664697] ice 0000:17:00.1: PTP reset successful
    [1026675.664707] ice 0000:17:00.1: 2755 msecs passed between update to cached PHC time
    [1026675.667660] ice 0000:b1:00.0: PTP reset successful
    [1026675.675944] ice 0000:b1:00.0: 2832 msecs passed between update to cached PHC time
    [1026677.137733] ixgbe 0000:31:00.0 ens787: NIC Link is Up 1 Gbps, Flow Control: None
    [1026677.190201] BUG: kernel NULL pointer dereference, address: 0000000000000010
    [1026677.192753] ice 0000:17:00.0: PTP reset successful
    [1026677.192764] ice 0000:17:00.0: 4548 msecs passed between update to cached PHC time
    [1026677.197928] #PF: supervisor read access in kernel mode
    [1026677.197933] #PF: error_code(0x0000) - not-present page
    [1026677.197937] PGD 1557a7067 P4D 0
    [1026677.212133] ice 0000:b1:00.1: PTP reset successful
    [1026677.212143] ice 0000:b1:00.1: 4344 msecs passed between update to cached PHC time
    [1026677.212575]
    [1026677.243142] Oops: 0000 [#1] PREEMPT SMP NOPTI
    [1026677.247918] CPU: 23 PID: 42790 Comm: kworker/23:0 Kdump: loaded Tainted: G        W          6.8.0-rc1+ #1
    [1026677.257989] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
    [1026677.269367] Workqueue: ice ice_service_task [ice]
    [1026677.274592] RIP: 0010:ice_vsi_rebuild_set_coalesce+0x130/0x1e0 [ice]
    [1026677.281421] Code: 0f 84 3a ff ff ff 41 0f b7 74 ec 02 66 89 b0 22 02 00 00 81 e6 ff 1f 00 00 e8 ec fd ff ff e9 35 ff ff ff 48 8b 43 30 49 63 ed <41> 0f b7 34 24 41 83 c5 01 48 8b 3c e8 66 89 b7 aa 02 00 00 81 e6
    [1026677.300877] RSP: 0018:ff3be62a6399bcc0 EFLAGS: 00010202
    [1026677.306556] RAX: ff28691e28980828 RBX: ff28691e41099828 RCX: 0000000000188000
    [1026677.314148] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ff28691e41099828
    [1026677.321730] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    [1026677.329311] R10: 0000000000000007 R11: ffffffffffffffc0 R12: 0000000000000010
    [1026677.336896] R13: 0000000000000000 R14: 0000000000000000 R15: ff28691e0eaa81a0
    [1026677.344472] FS:  0000000000000000(0000) GS:ff28693cbffc0000(0000) knlGS:0000000000000000
    [1026677.353000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [1026677.359195] CR2: 0000000000000010 CR3: 0000000128df4001 CR4: 0000000000771ef0
    [1026677.366779] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [1026677.374369] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [1026677.381952] PKRU: 55555554
    [1026677.385116] Call Trace:
    [1026677.388023]  <TASK>
    [1026677.390589]  ? __die+0x20/0x70
    [1026677.394105]  ? page_fault_oops+0x82/0x160
    [1026677.398576]  ? do_user_addr_fault+0x65/0x6a0
    [1026677.403307]  ? exc_page_fault+0x6a/0x150
    [1026677.407694]  ? asm_exc_page_fault+0x22/0x30
    [1026677.412349]  ? ice_vsi_rebuild_set_coalesce+0x130/0x1e0 [ice]
    [1026677.418614]  ice_vsi_rebuild+0x34b/0x3c0 [ice]
    [1026677.423583]  ice_vsi_rebuild_by_type+0x76/0x180 [ice]
    [1026677.429147]  ice_rebuild+0x18b/0x520 [ice]
    [1026677.433746]  ? delay_tsc+0x8f/0xc0
    [1026677.437630]  ice_do_reset+0xa3/0x190 [ice]
    [1026677.442231]  ice_service_task+0x26/0x440 [ice]
    [1026677.447180]  process_one_work+0x174/0x340
    [1026677.451669]  worker_thread+0x27e/0x390
    [1026677.455890]  ? __pfx_worker_thread+0x10/0x10
    [1026677.460627]  kthread+0xee/0x120
    [1026677.464235]  ? __pfx_kthread+0x10/0x10
    [1026677.468445]  ret_from_fork+0x2d/0x50
    [1026677.472476]  ? __pfx_kthread+0x10/0x10
    [1026677.476671]  ret_from_fork_asm+0x1b/0x30
    [1026677.481050]  </TASK>
    
    Fixes: b3e7b3a ("ice: prevent NULL pointer deref during reload")
    Reported-by: Robert Elliott <[email protected]>
    Signed-off-by: Jesse Brandeburg <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Aleksandr Loktionov <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    jbrandeb authored and anguy11 committed Mar 25, 2024
    Commit 1cb7fdb
  35. ixgbe: avoid sleeping allocation in ixgbe_ipsec_vf_add_sa()

    Change kzalloc() flags used in ixgbe_ipsec_vf_add_sa() to GFP_ATOMIC, to
    avoid sleeping in IRQ context.
    
    Dan Carpenter, with the help of Smatch, found the following issue:
    The patch eda0333: "ixgbe: add VF IPsec management" from Aug 13,
    2018 (linux-next), leads to the following Smatch static checker
    warning: drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c:917 ixgbe_ipsec_vf_add_sa()
    	warn: sleeping in IRQ context
    
    The call tree that Smatch is worried about is:
    ixgbe_msix_other() <- IRQ handler
    -> ixgbe_msg_task()
       -> ixgbe_rcv_msg_from_vf()
          -> ixgbe_ipsec_vf_add_sa()
    
    Fixes: eda0333 ("ixgbe: add VF IPsec management")
    Reported-by: Dan Carpenter <[email protected]>
    Link: https://lore.kernel.org/intel-wired-lan/[email protected]
    Reviewed-by: Michal Kubiak <[email protected]>
    Signed-off-by: Przemek Kitszel <[email protected]>
    Reviewed-by: Shannon Nelson <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    pkitszel authored and anguy11 committed Mar 25, 2024
    Commit aec806f
  36. igc: Remove stale comment about Tx timestamping

    The initial igc Tx timestamping implementation used only one register for
    retrieving Tx timestamps. Commit 3ed247e ("igc: Add support for
    multiple in-flight TX timestamps") added support for utilizing all four of
    them e.g., for multiple domain support. Remove the stale comment/FIXME.
    
    Fixes: 3ed247e ("igc: Add support for multiple in-flight TX timestamps")
    Signed-off-by: Kurt Kanzenbach <[email protected]>
    Acked-by: Vinicius Costa Gomes <[email protected]>
    Reviewed-by: Przemek Kitszel <[email protected]>
    Tested-by: Naama Meir <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    shifty91 authored and anguy11 committed Mar 25, 2024
    Commit 47ce295
  37. Merge tag 'v6.9-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
    
    Pull crypto fixes from Herbert Xu:
     "This fixes a regression that broke iwd as well as a divide by zero in
      iaa"
    
    * tag 'v6.9-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
      crypto: iaa - Fix nr_cpus < nr_iaa case
      Revert "crypto: pkcs7 - remove sha1 support"
    torvalds committed Mar 25, 2024
    Commit 174fdc9
  38. Merge tag 'gfs2-v6.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
    
    Pull gfs2 fix from Andreas Gruenbacher:
    
     - Fix boundary check in punch_hole
    
    * tag 'gfs2-v6.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
      gfs2: Fix invalid metadata access in punch_hole
    torvalds committed Mar 25, 2024
    Commit 928a87e
  39. riscv, bpf: Fix kfunc parameters incompatibility between bpf and riscv abi
    
    We encountered a failing case when running selftest in no_alu32 mode:
    
    The failure case is `kfunc_call/kfunc_call_test4` and its source code is
    like below:
    ```
    long bpf_kfunc_call_test4(signed char a, short b, int c, long d) __ksym;
    int kfunc_call_test4(struct __sk_buff *skb)
    {
    	...
    	tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000);
    	...
    }
    ```
    
    And its corresponding asm code is:
    ```
    0: r1 = -3
    1: r2 = -30
    2: r3 = 0xffffff38 # opcode: 18 03 00 00 38 ff ff ff 00 00 00 00 00 00 00 00
    4: r4 = -1000
    5: call bpf_kfunc_call_test4
    ```
    
    insn 2 is parsed to ld_imm64 insn to emit 0x00000000ffffff38 imm, and
    converted to int type and then sent to bpf_kfunc_call_test4. But since
    it is zero-extended in the bpf calling convention, riscv jit will
    directly treat it as an unsigned 32-bit int value, and then fails with
    the message "actual 4294966063 != expected -1234".
    
    The reason is the incompatibility between bpf and riscv abi, that is,
    bpf will do zero-extension on uint, but riscv64 requires sign-extension
    on int or uint. We can solve this problem by sign extending the 32-bit
    parameters in kfunc.
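    
    The arithmetic can be reproduced in a small userspace sketch. It
    assumes the callee simply sums its four arguments as 64-bit values;
    sum_regs() and sext32() are hypothetical helpers, not JIT code.
    Passing the zero-extended register as-is yields 4294966063, while
    sign-extending the low 32 bits first yields -1233.
    
    ```c
    #include <stdint.h>
    
    /* Sum four raw 64-bit register values, as a callee reading them would. */
    static int64_t sum_regs(int64_t r1, int64_t r2, int64_t r3, int64_t r4)
    {
    	return r1 + r2 + r3 + r4;
    }
    
    /*
     * Sign-extend the low 32 bits of a register value. The uint32_t to
     * int32_t conversion is implementation-defined in C, but wraps as
     * expected on common two's-complement compilers.
     */
    static int64_t sext32(uint64_t reg)
    {
    	return (int64_t)(int32_t)(uint32_t)reg;
    }
    ```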
    
    The issue is related to [0], and thanks to Yonghong and Alexei.
    
    Link: llvm/llvm-project#84874 [0]
    Fixes: d40c384 ("riscv, bpf: Add kfunc support for RV64")
    Signed-off-by: Pu Lehui <[email protected]>
    Tested-by: Puranjay Mohan <[email protected]>
    Reviewed-by: Puranjay Mohan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Pu Lehui authored and Alexei Starovoitov committed Mar 25, 2024
    Commit 443574b

Commits on Mar 26, 2024

  1. dpll: indent DPLL option type by a tab

    Indent config option type by a tab. It helps Kconfig parsers
    to read file without error.
    
    Fixes: 9431063 ("dpll: core: Add DPLL framework base functions")
    Signed-off-by: Prasad Pandit <[email protected]>
    Reviewed-by: Vadim Fedorenko <[email protected]>
    Reviewed-by: Jiri Pirko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Prasad Pandit authored and kuba-moo committed Mar 26, 2024
    Commit cc26992
  2. s390/qeth: handle deferred cc1

    The IO subsystem expects a driver to retry a ccw_device_start, when the
    subsequent interrupt response block (irb) contains a deferred
    condition code 1.
    
    Symptoms before this commit:
    On the read channel we always trigger the next read anyhow, so no
    different behaviour here.
    On the write channel we may experience timeout errors, because the
    expected reply will never be received without the retry.
    Other callers of qeth_send_control_data() may wrongly assume that the ccw
    was successful, which may cause problems later.
    
    Note that since
    commit 2297791 ("s390/cio: dont unregister subchannel from child-drivers")
    and
    commit 5ef1dc4 ("s390/cio: fix invalid -EBUSY on ccw_device_start")
    deferred CC1s are much more likely to occur. See the commit message of the
    latter for more background information.
    
    Fixes: 2297791 ("s390/cio: dont unregister subchannel from child-drivers")
    Signed-off-by: Alexandra Winter <[email protected]>
    Co-developed-by: Thorsten Winkler <[email protected]>
    Signed-off-by: Thorsten Winkler <[email protected]>
    Reviewed-by: Peter Oberparleiter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    SandyWinter authored and kuba-moo committed Mar 26, 2024
    Commit afb373f
  3. net: ll_temac: platform_get_resource replaced by wrong function

    The function platform_get_resource was replaced with
    devm_platform_ioremap_resource_byname and is called using 0 as name.
    
    This eventually ends up in platform_get_resource_byname in the call
    stack, where it causes a null pointer dereference in strcmp.
    
    	if (type == resource_type(r) && !strcmp(r->name, name))
    
    It should have been replaced with devm_platform_ioremap_resource.
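    
    The difference between the two lookups can be sketched in userspace
    C. struct res, res_by_name() and res_by_index() are hypothetical
    stand-ins for the platform resource helpers; the point is only that a
    by-name lookup hands its 'name' argument to strcmp(), so passing 0
    there dereferences a null pointer, whereas a by-index lookup takes 0
    as a valid position.
    
    ```c
    #include <stddef.h>
    #include <string.h>
    
    struct res {
    	const char *name;
    	int base;
    };
    
    /* Lookup by name: 'name' must be a valid string, never NULL. */
    static const struct res *res_by_name(const struct res *r, size_t n,
    				     const char *name)
    {
    	for (size_t i = 0; i < n; i++)
    		if (r[i].name && !strcmp(r[i].name, name))
    			return &r[i];
    	return NULL;
    }
    
    /* Lookup by position: the right tool when the caller only has an index. */
    static const struct res *res_by_index(const struct res *r, size_t n,
    				      size_t idx)
    {
    	return idx < n ? &r[idx] : NULL;
    }
    ```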
    
    Fixes: bd69058 ("net: ll_temac: Use devm_platform_ioremap_resource_byname()")
    Signed-off-by: Claus Hansen Ries <[email protected]>
    Cc: [email protected]
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    ClausRies authored and kuba-moo committed Mar 26, 2024
    Commit 3a38a82
  4. net: hsr: hsr_slave: Fix the promiscuous mode in offload mode

    commit e748d0f ("net: hsr: Disable promiscuous mode in
    offload mode") disables promiscuous mode of slave devices
    while creating an HSR interface. But while deleting the
    HSR interface, it does not take care of it. It decreases the
    promiscuous mode count, which eventually enables promiscuous
    mode on the slave devices when creating HSR interface again.
    
    Fix this by not decrementing the promiscuous mode count while
    deleting the HSR interface when offload is enabled.
    
    Fixes: e748d0f ("net: hsr: Disable promiscuous mode in offload mode")
    Signed-off-by: Ravi Gunasekaran <[email protected]>
    Reviewed-by: Jiri Pirko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Ravi Gunasekaran authored and kuba-moo committed Mar 26, 2024
    Commit b11c817
  5. tcp: properly terminate timers for kernel sockets

    We had various syzbot reports about tcp timers firing after
    the corresponding netns has been dismantled.
    
    Fortunately Josef Bacik could trigger the issue more often,
    and could test a patch I wrote two years ago.
    
    When TCP sockets are closed, we call inet_csk_clear_xmit_timers()
    to 'stop' the timers.
    
    inet_csk_clear_xmit_timers() can be called from any context,
    including when socket lock is held.
    This is the reason it uses sk_stop_timer(), aka del_timer().
    This means that ongoing timers might finish much later.
    
    For user sockets, this is fine because each running timer
    holds a reference on the socket, and the user socket holds
    a reference on the netns.
    
    For kernel sockets, we risk that the netns is freed before
    timer can complete, because kernel sockets do not hold
    reference on the netns.
    
    This patch adds the inet_csk_clear_xmit_timers_sync() function,
    which uses sk_stop_timer_sync() to make sure all timers
    are terminated before the kernel socket is released.
    Modules using kernel sockets close them in their netns exit()
    handler.
    
    Also add sock_not_owned_by_me() helper to get LOCKDEP
    support: inet_csk_clear_xmit_timers_sync() must not be called
    while socket lock is held.
    
    It is very possible we can revert in the future commit
    3a58f13 ("net: rds: acquire refcount on TCP sockets")
    which attempted to solve the issue in rds only.
    (net/smc/af_smc.c and net/mptcp/subflow.c have similar code)
    
    We probably can remove the check_net() tests from
    tcp_out_of_resources() and __tcp_close() in the future.
    
    Reported-by: Josef Bacik <[email protected]>
    Closes: https://lore.kernel.org/netdev/20240314210740.GA2823176@perftesting/
    Fixes: 26abe14 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
    Fixes: 8a68173 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket")
    Link: https://lore.kernel.org/bpf/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsatjw@mail.gmail.com/
    Signed-off-by: Eric Dumazet <[email protected]>
    Tested-by: Josef Bacik <[email protected]>
    Cc: Tetsuo Handa <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed Mar 26, 2024
    151c9c7
  6. net: wwan: t7xx: Split 64bit accesses to fix alignment issues

    Some of the registers are aligned on a 32bit boundary, causing
    alignment faults on 64bit platforms.
    
     Unable to handle kernel paging request at virtual address ffffffc084a1d004
     Mem abort info:
     ESR = 0x0000000096000061
     EC = 0x25: DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
     FSC = 0x21: alignment fault
     Data abort info:
     ISV = 0, ISS = 0x00000061, ISS2 = 0x00000000
     CM = 0, WnR = 1, TnD = 0, TagAccess = 0
     GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
     swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000046ad6000
     [ffffffc084a1d004] pgd=100000013ffff003, p4d=100000013ffff003, pud=100000013ffff003, pmd=0068000020a00711
     Internal error: Oops: 0000000096000061 [#1] SMP
     Modules linked in: mtk_t7xx(+) qcserial pppoe ppp_async option nft_fib_inet nf_flow_table_inet mt7921u(O) mt7921s(O) mt7921e(O) mt7921_common(O) iwlmvm(O) iwldvm(O) usb_wwan rndis_host qmi_wwan pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7996e(O) mt792x_usb(O) mt792x_lib(O) mt7915e(O) mt76_usb(O) mt76_sdio(O) mt76_connac_lib(O) mt76(O) mac80211(O) iwlwifi(O) huawei_cdc_ncm cfg80211(O) cdc_ncm cdc_ether wwan usbserial usbnet slhc sfp rtc_pcf8563 nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mt6577_auxadc mdio_i2c libcrc32c compat(O) cdc_wdm cdc_acm at24 crypto_safexcel pwm_fan i2c_gpio i2c_smbus industrialio i2c_algo_bit i2c_mux_reg i2c_mux_pca954x i2c_mux_pca9541 i2c_mux_gpio i2c_mux dummy oid_registry tun sha512_arm64 sha1_ce sha1_generic seqiv
     md5 geniv des_generic libdes cbc authencesn authenc leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd nvme nvme_core gpio_button_hotplug(O) dm_mirror dm_region_hash dm_log dm_crypt dm_mod dax usbcore usb_common ptp aquantia pps_core mii tpm encrypted_keys trusted
     CPU: 3 PID: 5266 Comm: kworker/u9:1 Tainted: G O 6.6.22 #0
     Hardware name: Bananapi BPI-R4 (DT)
     Workqueue: md_hk_wq t7xx_fsm_uninit [mtk_t7xx]
     pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : t7xx_cldma_hw_set_start_addr+0x1c/0x3c [mtk_t7xx]
     lr : t7xx_cldma_start+0xac/0x13c [mtk_t7xx]
     sp : ffffffc085d63d30
     x29: ffffffc085d63d30 x28: 0000000000000000 x27: 0000000000000000
     x26: 0000000000000000 x25: ffffff80c804f2c0 x24: ffffff80ca196c05
     x23: 0000000000000000 x22: ffffff80c814b9b8 x21: ffffff80c814b128
     x20: 0000000000000001 x19: ffffff80c814b080 x18: 0000000000000014
     x17: 0000000055c9806b x16: 000000007c5296d0 x15: 000000000f6bca68
     x14: 00000000dbdbdce4 x13: 000000001aeaf72a x12: 0000000000000001
     x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
     x8 : ffffff80ca1ef6b4 x7 : ffffff80c814b818 x6 : 0000000000000018
     x5 : 0000000000000870 x4 : 0000000000000000 x3 : 0000000000000000
     x2 : 000000010a947000 x1 : ffffffc084a1d004 x0 : ffffffc084a1d004
     Call trace:
     t7xx_cldma_hw_set_start_addr+0x1c/0x3c [mtk_t7xx]
     t7xx_fsm_uninit+0x578/0x5ec [mtk_t7xx]
     process_one_work+0x154/0x2a0
     worker_thread+0x2ac/0x488
     kthread+0xe0/0xec
     ret_from_fork+0x10/0x20
     Code: f9400800 91001000 8b214001 d50332bf (f9000022)
     ---[ end trace 0000000000000000 ]---
    
    The inclusion of io-64-nonatomic-lo-hi.h indicates that all 64bit
    accesses can be replaced by pairs of nonatomic 32bit access.  Fix
    alignment by forcing all accesses to be 32bit on 64bit platforms.
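
    The idea of splitting one 64bit access into a low/high pair of 32bit
    accesses can be sketched as follows. This is a userspace model over a
    plain array, not the driver's MMIO accessors; write64_lo_hi() and
    read64_lo_hi() are hypothetical names.

```c
#include <stdint.h>

/*
 * Store a 64-bit value as two 32-bit stores, low word first. The target
 * only needs 32-bit alignment.
 */
static void write64_lo_hi(volatile uint32_t *reg, uint64_t val)
{
	reg[0] = (uint32_t)val;         /* low 32 bits at the base offset */
	reg[1] = (uint32_t)(val >> 32); /* high 32 bits at base + 4 */
}

/* Read the value back as two 32-bit loads and recombine them. */
static uint64_t read64_lo_hi(const volatile uint32_t *reg)
{
	uint32_t lo = reg[0];
	uint32_t hi = reg[1];

	return ((uint64_t)hi << 32) | lo;
}
```

    The pair is not atomic: another observer may see the low word updated
    before the high word, which is the trade-off the nonatomic header name
    refers to.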
    
    Link: https://forum.openwrt.org/t/fibocom-fm350-gl-support/142682/72
    Fixes: 39d4390 ("net: wwan: t7xx: Add control DMA interface")
    Signed-off-by: Bjørn Mork <[email protected]>
    Reviewed-by: Sergey Ryazanov <[email protected]>
    Tested-by: Liviu Dudau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    bmork authored and kuba-moo committed Mar 26, 2024
    7d5a7dd
  7. net: dsa: mt7530: fix improper frames on all 25MHz and 40MHz XTAL MT7530

    The MT7530 switch after reset initialises with a core clock frequency that
    works with a 25MHz XTAL connected to it. For 40MHz XTAL, the core clock
    frequency must be set to 500MHz.
    
    The mt7530_pll_setup() function is responsible for setting the core clock
    frequency. Currently, it runs on MT7530 with 25MHz and 40MHz XTAL. This
    causes MT7530 switch with 25MHz XTAL to egress and ingress frames
    improperly.
    
    Introduce a check to run it only on MT7530 with 40MHz XTAL.
    
    The core clock frequency is set by writing to a switch PHY's register.
    Access to the PHY's register is done via the MDIO bus the switch is also
    on. Therefore, it works only when the switch makes switch PHYs listen on
    the MDIO bus the switch is on. This is controlled either by the state of
    the ESW_P1_LED_1 pin after reset deassertion or modifying bit 5 of the
    modifiable trap register.
    
    When ESW_P1_LED_1 is pulled high, PHY indirect access is used. That means
    accessing PHY registers via the PHY indirect access control register of the
    switch.
    
    When ESW_P1_LED_1 is pulled low, PHY direct access is used. That means
    accessing PHY registers via the MDIO bus the switch is on.
    
    For MT7530 switch with 40MHz XTAL on a board with ESW_P1_LED_1 pulled high,
    the core clock frequency won't be set to 500MHz, causing the switch to
    egress and ingress frames improperly.
    
    Run mt7530_pll_setup() after PHY direct access is set on the modifiable
    trap register.
    
    With these two changes, all MT7530 switches with 25MHz and 40MHz, and
    P1_LED_1 pulled high or low, will egress and ingress frames properly.
    
    Link: https://github.com/BPI-SINOVOIP/BPI-R2-bsp/blob/4a5dd143f2172ec97a2872fa29c7c4cd520f45b5/linux-mt/drivers/net/ethernet/mediatek/gsw_mt7623.c#L1039
    Fixes: b8f126a ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
    Signed-off-by: Arınç ÜNAL <[email protected]>
    Link: https://lore.kernel.org/r/20240320-for-net-mt7530-fix-25mhz-xtal-with-direct-phy-access-v1-1-d92f605f1160@arinc9.com
    Signed-off-by: Paolo Abeni <[email protected]>
    arinc9 authored and Paolo Abeni committed Mar 26, 2024
    5f563c3
  8. dns_resolver: correct module name in dns resolver documentation

    Fix an incorrect module name and sysfs path in dns resolver
    documentation.
    
    Signed-off-by: Bharath SM <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    bharathsm-ms authored and Paolo Abeni committed Mar 26, 2024
    75925fa
  9. trace: move to TP_STORE_ADDRS related macro to net_probe_common.h

    Put the macro into another standalone file for better extension.
    Some tracepoints can use this common part in the future.
    
    Signed-off-by: Jason Xing <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    JasonXing authored and Paolo Abeni committed Mar 26, 2024
    b3af904
  10. trace: use TP_STORE_ADDRS() macro in inet_sk_error_report()

    As the title says, use the macro directly, as patch [1] did,
    to avoid duplication. No functional change.
    
    [1]
    commit 6a6b0b9 ("tcp: Avoid preprocessor directives in tracepoint macro args")
    
    Signed-off-by: Jason Xing <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    JasonXing authored and Paolo Abeni committed Mar 26, 2024
    a24c855
  11. trace: use TP_STORE_ADDRS() macro in inet_sock_set_state()

    As the title says, use the macro directly, as patch [1] did,
    to avoid duplication. No functional change.
    
    [1]
    commit 6a6b0b9 ("tcp: Avoid preprocessor directives in tracepoint macro args")
    
    Signed-off-by: Jason Xing <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    JasonXing authored and Paolo Abeni committed Mar 26, 2024
    646700c
  12. Merge branch 'trace-use-tp_store_addrs-macro'

    Jason Xing says:
    
    ====================
    trace: use TP_STORE_ADDRS macro
    
    Use the macro in other tracepoints to make them more concise.
    No functional change.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed Mar 26, 2024
    26f44b7
  13. MAINTAINERS: split Renesas Ethernet drivers entry

    Since the Renesas Ethernet Switch driver was added by Yoshihiro Shimoda,
    I started receiving the patches to review for it -- which I was unable to
    do, as I don't know this hardware and don't even have the manuals for it.
    Fortunately, Shimoda-san has volunteered to be a reviewer for this new
    driver, thus let's now split the single entry into 3 per-driver entries,
    each with its own reviewer...
    
    Signed-off-by: Sergey Shtylyov <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Acked-by: Yoshihiro Shimoda <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Sergey Shtylyov authored and Paolo Abeni committed Mar 26, 2024
    8c05813
  14. net: Remove conditional threaded-NAPI wakeup based on task state.

    A NAPI thread is scheduled by first setting NAPI_STATE_SCHED bit. If
    successful (the bit was not yet set) then the NAPI_STATE_SCHED_THREADED
    is set but only if thread's state is not TASK_INTERRUPTIBLE (is
    TASK_RUNNING) followed by task wakeup.
    
    If the task is idle (TASK_INTERRUPTIBLE) then the
    NAPI_STATE_SCHED_THREADED bit is not set. The thread is not relying on
    the bit but always leaving the wait-loop after returning from schedule()
    because there must have been a wakeup.
    
    The smpboot-threads implementation for per-CPU threads requires an
    explicit condition and does not support "if we get out of schedule()
    then there must be something to do".
    
    Removing this optimisation simplifies the following integration.
    
    Set NAPI_STATE_SCHED_THREADED unconditionally on wakeup and rely on it
    in the wait path by removing the `woken' condition.
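
    The resulting rule ("set the bit unconditionally, test the bit in the wait
    path") can be sketched as a single-threaded model. The names below are
    hypothetical and the real code uses atomic bit operations and task
    wakeups, which are left out here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two NAPI state bits discussed above. */
enum {
	STATE_SCHED          = 1u << 0,
	STATE_SCHED_THREADED = 1u << 1,
};

/*
 * Schedule: succeeds only if SCHED was not yet set. The THREADED bit is
 * then set unconditionally, regardless of what the thread is doing.
 */
static bool model_napi_schedule(unsigned int *state)
{
	if (*state & STATE_SCHED)
		return false;
	*state |= STATE_SCHED | STATE_SCHED_THREADED;
	return true;
}

/* Wait path: the explicit condition is the bit, not "we were woken". */
static bool model_napi_thread_has_work(unsigned int state)
{
	return (state & STATE_SCHED_THREADED) != 0;
}

/* Completion clears both bits so the next schedule can succeed. */
static void model_napi_complete(unsigned int *state)
{
	*state &= ~(unsigned int)(STATE_SCHED | STATE_SCHED_THREADED);
}
```

    Because the wait condition is now an explicit state test, it fits thread
    frameworks that ask "is there work?" instead of inferring work from a
    wakeup.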
    
    Acked-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Sebastian Andrzej Siewior authored and Paolo Abeni committed Mar 26, 2024
    56364c9
  15. net: Allow to use SMP threads for backlog NAPI.

    Backlog NAPI is a per-CPU NAPI struct only (with no device behind it)
    used by drivers which don't do NAPI themselves, RPS and parts of the
    stack which need to avoid recursive deadlocks while processing a packet.
    
    Non-NAPI drivers use the CPU local backlog NAPI. If RPS is enabled
    then a flow for the skb is computed and based on the flow the skb can be
    enqueued on a remote CPU. Scheduling/ raising the softirq (for backlog's
    NAPI) on the remote CPU isn't trivial because the softirq is only
    scheduled on the local CPU and performed after the hardirq is done.
    In order to schedule a softirq on the remote CPU, an IPI is sent to the
    remote CPU which schedules the backlog-NAPI on the then local CPU.
    
    On PREEMPT_RT interrupts are force-threaded. The soft interrupts are
    raised within the interrupt thread and processed after the interrupt
    handler completed still within the context of the interrupt thread. The
    softirq is handled in the context where it originated.
    
    With force-threaded interrupts enabled, ksoftirqd is woken up if a
    softirq is raised from hardirq context. This is the case if it is raised
    from an IPI. Additionally there is a warning on PREEMPT_RT if the
    softirq is raised from the idle thread.
    This was done for two reasons:
    - With threaded interrupts the processing should happen in thread
      context (where it originated) and ksoftirqd is the only thread for
      this context if raised from hardirq. Using the currently running task
      instead would "punish" a random task.
    - Once ksoftirqd is active it consumes all further softirqs until it
      stops running. This changed recently and is no longer the case.
    
    Instead of keeping the backlog NAPI in ksoftirqd (in force-threaded/
    PREEMPT_RT setups) I am proposing NAPI-threads for backlog.
    The "proper" setup with threaded-NAPI is not doable because the threads
    are not pinned to an individual CPU and can be modified by the user.
    Additionally a dummy network device would have to be assigned. Also
    CPU-hotplug has to be considered if additional CPUs show up.
    All this can be probably done/ solved but the smpboot-threads already
    provide this infrastructure.
    
    Sending UDP packets over loopback expects that the packet is processed
    within the call. Delaying it by handing it over to the thread hurts
    performance. It is not beneficial to the outcome if the context switch
    happens immediately after enqueue or after a while to process a few
    packets in a batch.
    There is no need to always use the thread if the backlog NAPI is
    requested on the local CPU. This restores the loopback throughput. The
    performance drops mostly to the same value after enabling RPS on the
    loopback comparing the IPI and the thread result.
    
    Create NAPI-threads for backlog if requested during boot. The thread runs
    the inner loop from napi_threaded_poll(), the wait part is different. It
    checks for NAPI_STATE_SCHED (the backlog NAPI can not be disabled).
    
    The NAPI threads for backlog are optional; they have to be enabled via the
    boot argument "thread_backlog_napi". They are mandatory for PREEMPT_RT to
    avoid the wakeup of ksoftirqd from the IPI.
    
    Acked-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Sebastian Andrzej Siewior authored and Paolo Abeni committed Mar 26, 2024
    dad6b97
  16. net: Use backlog-NAPI to clean up the defer_list.

    The defer_list is a per-CPU list which is used to free skbs outside of
    the socket lock and on the CPU on which they have been allocated.
    The list is processed during NAPI callbacks so ideally the list is
    cleaned up.
    Should the amount of skbs on the list exceed a certain water mark then
    the softirq is triggered remotely on the target CPU by invoking a remote
    function call. The raise of the softirqs via a remote function call
    leads to waking the ksoftirqd on PREEMPT_RT which is undesired.
    The backlog-NAPI threads already provide the infrastructure which can be
    utilized to perform the cleanup of the defer_list.
    
    The NAPI state is updated with the input_pkt_queue.lock acquired. In
    order not to break the state, it is needed to also wake the backlog-NAPI
    thread with the lock held. This requires acquiring the lock in
    rps_lock_irq*() if the backlog-NAPI threads are used even with RPS
    disabled.
    
    Move the logic of remotely starting softirqs to clean up the defer_list
    into kick_defer_list_purge(). Make sure a lock is held in
    rps_lock_irq*() if backlog-NAPI threads are used. Schedule backlog-NAPI
    for defer_list cleanup if backlog-NAPI is available.
    
    Acked-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Sebastian Andrzej Siewior authored and Paolo Abeni committed Mar 26, 2024
    80d2eef
  17. net: Rename rps_lock to backlog_lock.

    The rps_lock.*() functions use the inner lock of a sk_buff_head for
    locking. This lock is used if RPS is enabled, otherwise the list is
    accessed lockless and disabling interrupts is enough for the
    synchronisation because it is only accessed CPU local. Not only the list
    is protected but also the NAPI state.
    With the addition of backlog threads, the lock is also needed because of
    the cross CPU access even without RPS. The cleanup of the defer_list
    is also done via backlog threads (if enabled).
    
    It has been suggested to rename the locking function since it is no
    longer just RPS.
    
    Rename the rps_lock*() functions to backlog_lock*().
    
    Suggested-by: Jakub Kicinski <[email protected]>
    Acked-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Sebastian Andrzej Siewior authored and Paolo Abeni committed Mar 26, 2024
    765b11f
  18. Merge branch 'net-provide-smp-threads-for-backlog-napi'

    Sebastian Andrzej Siewior says:
    
    ====================
    net: Provide SMP threads for backlog NAPI
    
    The RPS code and "deferred skb free" both send IPI/ function call
    to a remote CPU in which a softirq is raised. This leads to a warning on
    PREEMPT_RT because raising softirqs from function call led to undesired
    behaviour in the past. I had duct tape in RT for the "deferred skb free"
    and Wander Lairson Costa reported the RPS case.
    
    This series only provides support for SMP threads for backlog NAPI, I
    did not attach a patch to make it default and remove the IPI related
    code to avoid confusion. I can post it for reference if asked.
    
    The RedHat performance team was so kind to provide some testing here.
    The series (with the IPI code removed) has been tested and no regression
    vs without the series has been found. For testing iperf3 was used on 25G
    interface, provided by mlx5, ix40e or ice driver and RPS was enabled. I
    can provide the individual test results if needed.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed Mar 26, 2024
    1a3e4d6
  19. selftests: vxlan_mdb: Fix failures with old libnet

    Locally generated IP multicast packets (such as the ones used in the
    test) do not perform routing and simply egress the bound device.
    
    However, as explained in commit 8bcfb4a ("selftests: forwarding:
    Fix failing tests with old libnet"), old versions of libnet (used by
    mausezahn) do not use the "SO_BINDTODEVICE" socket option. Specifically,
    the library started using the option for IPv6 sockets in version 1.1.6
    and for IPv4 sockets in version 1.2. This explains why on Ubuntu - which
    uses version 1.1.6 - the IPv4 overlay tests are failing whereas the IPv6
    ones are passing.
    
    Fix by specifying the source and destination MAC of the packets which
    will cause mausezahn to use a packet socket instead of an IP socket.
    
    Fixes: 62199e3 ("selftests: net: Add VXLAN MDB test")
    Reported-by: Mirsad Todorovac <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Tested-by: Mirsad Todorovac <[email protected]>
    Signed-off-by: Ido Schimmel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    idosch authored and Paolo Abeni committed Mar 26, 2024
    f142552
  20. Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel…

    …/git/bpf/bpf
    
    Daniel Borkmann says:
    
    ====================
    pull-request: bpf 2024-03-25
    
    The following pull-request contains BPF updates for your *net* tree.
    
    We've added 17 non-merge commits during the last 12 day(s) which contain
    a total of 19 files changed, 184 insertions(+), 61 deletions(-).
    
    The main changes are:
    
    1) Fix an arm64 BPF JIT bug in BPF_LDX_MEMSX implementation's offset handling
       found via test_bpf module, from Puranjay Mohan.
    
    2) Various fixups to the BPF arena code in particular in the BPF verifier and
       around BPF selftests to match latest corresponding LLVM implementation,
       from Puranjay Mohan and Alexei Starovoitov.
    
    3) Fix xsk to not assume that metadata is always requested in TX completion,
       from Stanislav Fomichev.
    
    4) Fix riscv BPF JIT's kfunc parameter incompatibility between BPF and the riscv
       ABI which requires sign-extension on int/uint, from Pu Lehui.
    
    5) Fix s390x BPF JIT's bpf_plt pointer arithmetic which triggered a crash when
       testing struct_ops, from Ilya Leoshkevich.
    
    6) Fix libbpf's arena mmap handling which had incorrect u64-to-pointer cast on
       32-bit architectures, from Andrii Nakryiko.
    
    7) Fix libbpf to define MFD_CLOEXEC when not available, from Arnaldo Carvalho de Melo.
    
    8) Fix arm64 BPF JIT implementation for 32bit unconditional bswap which
       resulted in an incorrect swap as indicated by test_bpf, from Artem Savkov.
    
    9) Fix BPF man page build script to use silent mode, from Hangbin Liu.
    
    * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
      riscv, bpf: Fix kfunc parameters incompatibility between bpf and riscv abi
      bpf: verifier: reject addr_space_cast insn without arena
      selftests/bpf: verifier_arena: fix mmap address for arm64
      bpf: verifier: fix addr_space_cast from as(1) to as(0)
      libbpf: Define MFD_CLOEXEC if not available
      arm64: bpf: fix 32bit unconditional bswap
      bpf, arm64: fix bug in BPF_LDX_MEMSX
      libbpf: fix u64-to-pointer cast on 32-bit arches
      s390/bpf: Fix bpf_plt pointer arithmetic
      xsk: Don't assume metadata is always requested in TX completion
      selftests/bpf: Add arena test case for 4Gbyte corner case
      selftests/bpf: Remove hard coded PAGE_SIZE macro.
      libbpf, selftests/bpf: Adjust libbpf, bpftool, selftests to match LLVM
      bpf: Clarify bpf_arena comments.
      MAINTAINERS: Update email address for Quentin Monnet
      scripts/bpf_doc: Use silent mode when exec make cmd
      bpf: Temporarily disable atomic operations in BPF arena
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed Mar 26, 2024
    37ccdf7
  21. MAINTAINERS: wifi: mwifiex: add Francesco as reviewer

    As discussed on the mailing list, add myself as mwifiex driver reviewer.
    
    Link: https://lore.kernel.org/all/20240318112830.GA9565@francesco-nb/
    Signed-off-by: Francesco Dolcini <[email protected]>
    Acked-by: Brian Norris <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://msgid.link/[email protected]
    dolcini authored and Kalle Valo committed Mar 26, 2024
    8ea3f4f
  22. net: hns3: fix index limit to support all queue stats

    Currently, hns hardware supports more than 512 queues and the index limit
    in hclge_comm_tqps_update_stats is wrong. So this patch removes it.
    
    Fixes: 287db5c ("net: hns3: create new set of common tqp stats APIs for PF and VF reuse")
    Signed-off-by: Jie Wang <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Michal Kubiak <[email protected]>
    Reviewed-by: Kalesh AP <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Jie Wang authored and Paolo Abeni committed Mar 26, 2024
    47e39d2
  23. net: hns3: fix kernel crash when devlink reload during pf initialization

    The devlink reload process will access the hardware resources,
    but the register operation is done before the hardware is initialized.
    So, processing the devlink reload during initialization may lead to kernel
    crash. This patch fixes this by taking devl_lock during initialization.
    
    Fixes: b741269 ("net: hns3: add support for registering devlink for PF")
    Signed-off-by: Yonglong Liu <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    liuyonglong86 authored and Paolo Abeni committed Mar 26, 2024
    93305b7
  24. net: hns3: mark unexcuted loopback test result as UNEXECUTED

    Currently, loopback test may be skipped when resetting, but the test
    result will still show as 'PASS', because the driver doesn't set
    ETH_TEST_FL_FAILED flag. Fix it by setting the flag and
    initializing the value to UNEXECUTED.
    
    Fixes: 4c8dab1 ("net: hns3: reconstruct function hns3_self_test")
    Signed-off-by: Jian Shen <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Michal Kubiak <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    IronShen authored and Paolo Abeni committed Mar 26, 2024
    5bd088d
  25. Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'

    Jijie Shao says:
    
    ====================
    There are some bugfixes for the HNS3 ethernet driver
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed Mar 26, 2024
    c1fd3a9
  26. net: remove skb_free_datagram_locked()

    Last user of skb_free_datagram_locked() went away in 2016
    with commit 850cbad ("udp: use it's own memory
    accounting schema").
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Jason Xing <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Eric Dumazet authored and Paolo Abeni committed Mar 26, 2024
    6e06312
  27. btrfs: zoned: fix use-after-free in do_zone_finish()

    Shinichiro reported the following use-after-free triggered by the device
    replace operation in fstests btrfs/070.
    
     BTRFS info (device nullb1): scrub: finished on devid 1 with status: 0
     ==================================================================
     BUG: KASAN: slab-use-after-free in do_zone_finish+0x91a/0xb90 [btrfs]
     Read of size 8 at addr ffff8881543c8060 by task btrfs-cleaner/3494007
    
     CPU: 0 PID: 3494007 Comm: btrfs-cleaner Tainted: G        W          6.8.0-rc5-kts #1
     Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020
     Call Trace:
      <TASK>
      dump_stack_lvl+0x5b/0x90
      print_report+0xcf/0x670
      ? __virt_addr_valid+0x200/0x3e0
      kasan_report+0xd8/0x110
      ? do_zone_finish+0x91a/0xb90 [btrfs]
      ? do_zone_finish+0x91a/0xb90 [btrfs]
      do_zone_finish+0x91a/0xb90 [btrfs]
      btrfs_delete_unused_bgs+0x5e1/0x1750 [btrfs]
      ? __pfx_btrfs_delete_unused_bgs+0x10/0x10 [btrfs]
      ? btrfs_put_root+0x2d/0x220 [btrfs]
      ? btrfs_clean_one_deleted_snapshot+0x299/0x430 [btrfs]
      cleaner_kthread+0x21e/0x380 [btrfs]
      ? __pfx_cleaner_kthread+0x10/0x10 [btrfs]
      kthread+0x2e3/0x3c0
      ? __pfx_kthread+0x10/0x10
      ret_from_fork+0x31/0x70
      ? __pfx_kthread+0x10/0x10
      ret_from_fork_asm+0x1b/0x30
      </TASK>
    
     Allocated by task 3493983:
      kasan_save_stack+0x33/0x60
      kasan_save_track+0x14/0x30
      __kasan_kmalloc+0xaa/0xb0
      btrfs_alloc_device+0xb3/0x4e0 [btrfs]
      device_list_add.constprop.0+0x993/0x1630 [btrfs]
      btrfs_scan_one_device+0x219/0x3d0 [btrfs]
      btrfs_control_ioctl+0x26e/0x310 [btrfs]
      __x64_sys_ioctl+0x134/0x1b0
      do_syscall_64+0x99/0x190
      entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
     Freed by task 3494056:
      kasan_save_stack+0x33/0x60
      kasan_save_track+0x14/0x30
      kasan_save_free_info+0x3f/0x60
      poison_slab_object+0x102/0x170
      __kasan_slab_free+0x32/0x70
      kfree+0x11b/0x320
      btrfs_rm_dev_replace_free_srcdev+0xca/0x280 [btrfs]
      btrfs_dev_replace_finishing+0xd7e/0x14f0 [btrfs]
      btrfs_dev_replace_by_ioctl+0x1286/0x25a0 [btrfs]
      btrfs_ioctl+0xb27/0x57d0 [btrfs]
      __x64_sys_ioctl+0x134/0x1b0
      do_syscall_64+0x99/0x190
      entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
     The buggy address belongs to the object at ffff8881543c8000
      which belongs to the cache kmalloc-1k of size 1024
     The buggy address is located 96 bytes inside of
      freed 1024-byte region [ffff8881543c8000, ffff8881543c8400)
    
     The buggy address belongs to the physical page:
     page:00000000fe2c1285 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1543c8
     head:00000000fe2c1285 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
     flags: 0x17ffffc0000840(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
     page_type: 0xffffffff()
     raw: 0017ffffc0000840 ffff888100042dc0 ffffea0019e8f200 dead000000000002
     raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
     page dumped because: kasan: bad access detected
    
     Memory state around the buggy address:
      ffff8881543c7f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      ffff8881543c7f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     >ffff8881543c8000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                            ^
      ffff8881543c8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ffff8881543c8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    
    This UAF happens because we're accessing stale zone information of an
    already removed btrfs_device in do_zone_finish().
    
    The sequence of events is as follows:
    
    btrfs_dev_replace_start
      btrfs_scrub_dev
       btrfs_dev_replace_finishing
        btrfs_dev_replace_update_device_in_mapping_tree <-- devices replaced
        btrfs_rm_dev_replace_free_srcdev
         btrfs_free_device                              <-- device freed
    
    cleaner_kthread
     btrfs_delete_unused_bgs
      btrfs_zone_finish
       do_zone_finish              <-- refers to the freed device
    
    The reason for this is that we're using a cached pointer to the chunk_map
    from the block group, but on device replace this cached pointer can
    contain stale device entries.
    
    The staleness comes from the fact that btrfs_block_group::physical_map is
    not a pointer to a btrfs_chunk_map but a memory copy of it.
    
    Also take the fs_info::dev_replace::rwsem to prevent
    btrfs_dev_replace_update_device_in_mapping_tree() from changing the device
    underneath us again.
    
    Note: btrfs_dev_replace_update_device_in_mapping_tree() is holding
    fs_info::mapping_tree_lock, but as this is a spinning read/write lock we
    cannot take it: the call to blkdev_zone_mgmt() requires a memory
    allocation, and we may not sleep while holding a spinning lock.
    However, btrfs_dev_replace_update_device_in_mapping_tree() is always called
    with the fs_info::dev_replace::rwsem held in write mode.
    
    Many thanks to Shinichiro for analyzing the bug.
    
    Reported-by: Shinichiro Kawasaki <[email protected]>
    CC: [email protected] # 6.8
    Reviewed-by: Filipe Manana <[email protected]>
    Signed-off-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    morbidrsa authored and kdave committed Mar 26, 2024
    1ec17ef
  28. btrfs: validate device maj:min during open

    Boris managed to create a device capable of changing its maj:min without
    altering its device path.
    
    Only multi-devices can be scanned. A device that gets scanned and remains
    in the btrfs kernel cache might end up with an incorrect maj:min.
    
    Although the temp-fsid feature patch did not introduce this bug, it could
    lead to issues if the above multi-device is converted to a single device
    with a stale maj:min. Subsequently, attempting to mount the same device
    with the correct maj:min might mistake it for another device with the same
    fsid, potentially resulting in wrongly auto-enabling the temp-fsid feature.
    
    To address this, this patch validates the device's maj:min at the time of
    device open and updates it if it has changed since the last scan.
    
    CC: [email protected] # 6.7+
    Fixes: a5b8a5f ("btrfs: support cloned-device mount capability")
    Reported-by: Boris Burkov <[email protected]>
    Co-developed-by: Boris Burkov <[email protected]>
    Reviewed-by: Boris Burkov <[email protected]>
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Mar 26, 2024
    9f7eb84
  29. btrfs: fix extent map leak in unexpected scenario at unpin_extent_cache()

    At unpin_extent_cache() if we happen to find an extent map with an
    unexpected start offset, we jump to the 'out' label and never release the
    reference we added to the extent map through the call to
    lookup_extent_mapping(), therefore resulting in a leak. So fix this by
    moving the free_extent_map() under the 'out' label.
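    
    The shape of this fix can be sketched in plain userspace C. This is a
    minimal illustration, not the btrfs code: struct demo_map, demo_lookup(),
    demo_put() and demo_unpin() are hypothetical stand-ins for an extent map,
    the lookup that takes a reference, the release helper, and the function
    being fixed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted extent map. */
struct demo_map {
    int refs;
    unsigned long start;
};

/* Take a reference, as a lookup helper would. */
struct demo_map *demo_lookup(struct demo_map *m)
{
    if (m)
        m->refs++;
    return m;
}

/* Drop a reference; tolerates NULL so it is safe on every exit path. */
void demo_put(struct demo_map *m)
{
    if (m)
        m->refs--;
}

/* Returns 0 on success, -1 if the map is missing or starts elsewhere. */
int demo_unpin(struct demo_map *candidate, unsigned long expected_start)
{
    int ret = 0;
    struct demo_map *m = demo_lookup(candidate);

    if (!m || m->start != expected_start) {
        ret = -1;
        goto out;   /* the error path still reaches the release below */
    }

    /* ... the real work on the map would happen here ... */

out:
    demo_put(m);    /* single release point shared by all paths */
    return ret;
}
```

    With the release placed under the label, the reference count is the same
    before and after the call regardless of which path was taken.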
    
    Fixes: c03c89f ("btrfs: handle errors returned from unpin_extent_cache()")
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    fdmanana authored and kdave committed Mar 26, 2024
    8a565ec
  30. btrfs: fix warning messages not printing interval at unpin_extent_range()

    At unpin_extent_range() we print warning messages that are supposed to
    print an interval in the form "[X, Y)", with the first element being an
    inclusive start offset and the second element being the exclusive end
    offset of a range. However, we end up printing the range's length instead
    of the range's exclusive end offset, so fix that to avoid confusing and
    nonsensical messages in case we hit one of these unexpected scenarios.
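    
    As a small illustration of the half-open interval convention described
    above, the sketch below formats a range from a start offset and a length.
    The helper name demo_format_range() is hypothetical and is not the btrfs
    logging code.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format a range as "[start, end)", where end = start + len is the
 * exclusive end offset. Printing len in the second slot would be the
 * mistake described above.
 * Example: start = 4096, len = 8192 gives "[4096, 12288)".
 */
int demo_format_range(char *buf, size_t size,
                      unsigned long long start, unsigned long long len)
{
    return snprintf(buf, size, "[%llu, %llu)", start, start + len);
}
```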
    
    Fixes: 00deaf0 ("btrfs: log messages at unpin_extent_range() during unexpected cases")
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    fdmanana authored and kdave committed Mar 26, 2024
    4dc1d69
  31. btrfs: fix message not properly printing interval when adding extent map

    At btrfs_add_extent_mapping(), if we are unable to merge the existing
    extent map, we print a warning message that suggests interval ranges in
    the form "[X, Y)", where the first element is the inclusive start offset
    of a range and the second element is the exclusive end offset. However,
    we end up printing the length of the ranges instead of the exclusive end
    offsets. So fix this by printing the range end offsets.
    
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    fdmanana authored and kdave committed Mar 26, 2024
    379c872
  32. btrfs: use btrfs_warn() to log message at btrfs_add_extent_mapping()

    At btrfs_add_extent_mapping(), if we failed to merge the extent map, which
    is unexpected and theoretically should never happen, we use WARN_ONCE() to
    log a message. That is not ideal because we don't get information about
    which filesystem it relates to in case we have multiple btrfs filesystems
    mounted. So change this to use btrfs_warn() and surround the error check
    with WARN_ON() so we always get a useful stack trace and the condition is
    flagged as "unlikely" since it's not expected to ever happen.
    
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    fdmanana authored and kdave committed Mar 26, 2024
    2133460
  33. btrfs: zoned: don't skip block groups with 100% zone unusable

    Commit f4a9f21 ("btrfs: do not delete unused block group if it may be
    used soon") changed the behaviour of deleting unused block-groups on zoned
    filesystems. Starting with this commit, we're using
    btrfs_space_info_used() to calculate the number of used bytes in a
    space_info. But btrfs_space_info_used() also accounts
    btrfs_space_info::bytes_zone_unusable as used bytes.
    
    So if a block group is 100% zone_unusable it is skipped from the deletion
    step.
    
    In order not to skip fully zone_unusable block-groups, also check if the
    block-group has bytes left that can be used on a zoned filesystem.
    
    Fixes: f4a9f21 ("btrfs: do not delete unused block group if it may be used soon")
    CC: [email protected] # 6.1+
    Reviewed-by: Filipe Manana <[email protected]>
    Signed-off-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    morbidrsa authored and kdave committed Mar 26, 2024
    a8b70c7
  34. btrfs: return accurate error code on open failure in open_fs_devices()

    When attempting to exclusively open a device which has no exclusive open
    permission, such as a physical device associated with the flakey dm
    device, the open operation will fail, resulting in a mount failure.
    
    In this particular scenario, we erroneously return -EINVAL instead of the
    correct error code provided by the bdev_open_by_path() function, which is
    -EBUSY.
    
    Fix this by returning the error code from the bdev_open_by_path() function.
    With this correction, the mount error message will align with that of
    ext4 and xfs.
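    
    The idea of passing the underlying error through, rather than replacing
    it with a blanket -EINVAL, can be sketched in userspace as follows. This
    is only an analogy built on open(2) and errno; demo_open_device() is a
    hypothetical helper and not the btrfs open path.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a device node read-only. On failure, return the negated errno
 * reported by open(2) so the caller sees the real cause (for example
 * -EBUSY or -ENOENT) instead of a generic error code.
 */
int demo_open_device(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -errno;
    return fd;
}
```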
    
    Reviewed-by: Boris Burkov <[email protected]>
    Signed-off-by: Anand Jain <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    asj authored and kdave committed Mar 26, 2024
    2f1aeab
  35. btrfs: fix race in read_extent_buffer_pages()

    There are reports of tree-checker detecting corrupted nodes without any
    obvious pattern, so possibly an overwrite in memory. After some debugging
    it turns out there's a race when reading an extent buffer, in which the
    uptodate status can be missed.
    
    To prevent concurrent reads for the same extent buffer,
    read_extent_buffer_pages() performs these checks:
    
        /* (1) */
        if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))
            return 0;
    
        /* (2) */
        if (test_and_set_bit(EXTENT_BUFFER_READING, &eb->bflags))
            goto done;
    
    At this point, it seems safe to start the actual read operation. Once
    that completes, end_bbio_meta_read() does
    
        /* (3) */
        set_extent_buffer_uptodate(eb);
    
        /* (4) */
        clear_bit(EXTENT_BUFFER_READING, &eb->bflags);
    
    Normally, this is enough to ensure only one read happens, and all other
    callers wait for it to finish before returning.  Unfortunately, there is
    a racy interleaving:
    
        Thread A | Thread B | Thread C
        ---------+----------+---------
           (1)   |          |
                 |    (1)   |
           (2)   |          |
           (3)   |          |
           (4)   |          |
                 |    (2)   |
                 |          |    (1)
    
    When this happens, thread B kicks off an unnecessary read. Worse, thread
    C will see UPTODATE set and return immediately, while the read from
    thread B is still in progress.  This race could result in tree-checker
    errors like this as the extent buffer is concurrently modified:
    
        BTRFS critical (device dm-0): corrupted node, root=256
        block=8550954455682405139 owner mismatch, have 11858205567642294356
        expect [256, 18446744073709551360]
    
    Fix it by testing UPTODATE again after setting the READING bit, and if
    it's been set, skip the unnecessary read.
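    
    The check, claim, re-check sequence can be sketched with C11 atomics. This
    is a userspace illustration under stated assumptions, not the kernel
    implementation: struct demo_buffer, the DEMO_* flag bits and both helpers
    are hypothetical, and waiting for a reader that is already in flight is
    left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits mirroring the roles of UPTODATE and READING. */
enum { DEMO_UPTODATE = 1, DEMO_READING = 2 };

struct demo_buffer {
    atomic_uint flags;
};

/*
 * Returns true if the caller must start the read itself, false if the
 * data is already up to date or another thread owns the read.
 */
bool demo_should_start_read(struct demo_buffer *eb)
{
    /* (1) fast path: already up to date */
    if (atomic_load(&eb->flags) & DEMO_UPTODATE)
        return false;

    /* (2) claim the READING bit; if it was set, someone else is reading */
    if (atomic_fetch_or(&eb->flags, DEMO_READING) & DEMO_READING)
        return false;

    /*
     * Re-check after winning READING: a complete read may have finished
     * between (1) and (2). If so, release READING and skip the read.
     */
    if (atomic_load(&eb->flags) & DEMO_UPTODATE) {
        atomic_fetch_and(&eb->flags, ~(unsigned int)DEMO_READING);
        return false;
    }
    return true;
}

/* (3) and (4): publish the data, then release the READING bit. */
void demo_end_read(struct demo_buffer *eb)
{
    atomic_fetch_or(&eb->flags, DEMO_UPTODATE);
    atomic_fetch_and(&eb->flags, ~(unsigned int)DEMO_READING);
}
```

    In the interleaving shown in the table, thread B would now observe the
    up-to-date flag in the re-check and return without issuing a second read.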
    
    Fixes: d7172f5 ("btrfs: use per-buffer locking for extent_buffer reading")
    Link: https://lore.kernel.org/linux-btrfs/CAHk-=whNdMaN9ntZ47XRKP6DBes2E5w7fi-0U3H2+PS18p+Pzw@mail.gmail.com/
    Link: https://lore.kernel.org/linux-btrfs/f51a6d5d7432455a6a858d51b49ecac183e0bbc9.1706312914.git.wqu@suse.com/
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    CC: [email protected] # 6.5+
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Tavian Barnes <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    [ minor update of changelog ]
    Signed-off-by: David Sterba <[email protected]>
    tavianator authored and kdave committed Mar 26, 2024
    ef1e682
  36. Merge tag 'pwm/for-6.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux

    Pull pwm fix from Uwe Kleine-König:
     "This contains a single fix for a regression introduced in v5.18-rc1
      which made the img pwm driver fail to bind"
    
    * tag 'pwm/for-6.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
      pwm: img: fix pwm clock lookup
    torvalds committed Mar 26, 2024
    576bb2d
  37. Merge tag 'printk-for-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

    Pull printk fix from Petr Mladek:
    
     - Prevent scheduling in an atomic context when printk() takes over the
       console flushing duty
    
    * tag 'printk-for-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
      printk: Update @console_may_schedule in console_trylock_spinning()
    torvalds committed Mar 26, 2024
    7033999
  38. mm/memory: fix missing pte marker for !page on pte zaps

    Commit 0cf18e8 of large folio zap work broke uffd-wp.  Now mm's uffd
    unit test "wp-unpopulated" will trigger this WARN_ON_ONCE().
    
    The WARN_ON_ONCE() asserts that a VMA cannot be registered with
    userfaultfd-wp if it contains a !normal page, but it's actually possible.
    One example is an anonymous VMA registered with uffd-wp: reading anything
    will install a zero page, and zapping it afterwards triggers the warning.
    
    What's more, removing that WARN_ON_ONCE may not be enough either, because
    we should also not rely on "whether it's a normal page" to decide whether
    pte marker is needed.  For example, one can register wr-protect over some
    DAX regions to track writes when UFFD_FEATURE_WP_ASYNC is enabled, in which
    case it can have page==NULL for a devmap but we may want to keep the
    marker around.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 0cf18e8 ("mm/memory: handle !page case in zap_present_pte() separately")
    Signed-off-by: Peter Xu <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>
    Cc: Muhammad Usama Anjum <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    xzpeter authored and akpm00 committed Mar 26, 2024
    f857236
  39. selftests/mm: Fix build with _FORTIFY_SOURCE

    Add missing mode argument to open(2) call with O_CREAT.
    
    Some tests fail to compile if _FORTIFY_SOURCE is defined (to any valid
    value) (together with -O), resulting in similar error messages such as:
    
      In file included from /usr/include/fcntl.h:342,
                       from gup_test.c:1:
      In function 'open',
          inlined from 'main' at gup_test.c:206:10:
      /usr/include/bits/fcntl2.h:50:11: error: call to '__open_missing_mode' declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
         50 |           __open_missing_mode ();
            |           ^~~~~~~~~~~~~~~~~~~~~~
    
    _FORTIFY_SOURCE is enabled by default in some distributions, so the
    tests are not built by default and are skipped.
    
    open(2) man-page warns about missing mode argument: "if it is not
    supplied, some arbitrary bytes from the stack will be applied as the
    file mode."
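    
    For reference, a minimal sketch of a correct three-argument call. The
    helper name demo_create_file() and the 0600 permission bits are arbitrary
    choices for this illustration and are not taken from the selftests.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or open) a file for reading and writing. Because O_CREAT is
 * present, the third "mode" argument must be supplied; it sets the
 * permission bits of a newly created file (subject to the umask).
 */
int demo_create_file(const char *path)
{
    return open(path, O_CREAT | O_RDWR, 0600);
}
```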
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: aeb85ed ("tools/testing/selftests/vm/gup_benchmark.c: allow user specified file")
    Fixes: fbe3750 ("mm: huge_memory: debugfs for file-backed THP split")
    Fixes: c942f5b ("selftests: soft-dirty: add test for mprotect")
    Signed-off-by: Vitaly Chikunov <[email protected]>
    Reviewed-by: Zi Yan <[email protected]>
    Reviewed-by: David Hildenbrand <[email protected]>
    Cc: Keith Busch <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: Yang Shi <[email protected]>
    Cc: Andrea Arcangeli <[email protected]>
    Cc: Nadav Amit <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    vt-alt authored and akpm00 committed Mar 26, 2024
    8b65ef5
  40. init: open /initrd.image with O_LARGEFILE

    If initrd data is larger than 2GB, we'll eventually fail to write to the
    /initrd.image file when we hit that limit, unless O_LARGEFILE is set.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: John Sperbeck <[email protected]>
    Cc: Jens Axboe <[email protected]>
    Cc: Nick Desaulniers <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    John Sperbeck authored and akpm00 committed Mar 26, 2024
    4624b34
  41. mailmap: update entry for Leonard Crestez

    Put my personal email first because NXP employment ended some time ago.
    Also add my old intel email address.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Leonard Crestez <[email protected]>
    Cc: Florian Fainelli <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    cdleonard authored and akpm00 committed Mar 26, 2024
    3290032
  42. mm,page_owner: fix recursion

    Prior to 217b211 ("mm,page_owner: implement the tracking of the
    stacks count") the only place where page_owner could potentially go into
    recursion due to its need of allocating more memory was in save_stack(),
    which ends up calling into stackdepot code with the possibility of
    allocating memory.
    
    We made sure to guard against that by signaling that the current task was
    already in page_owner code, so in case a recursion attempt was made, we
    could catch that and return dummy_handle.
    
    After the above commit, a new place in page_owner code was introduced where
    we could allocate memory, meaning we could go into recursion should we take
    that path.
    
    Make sure to signal that we are in page_owner in that codepath as well.
    Move the guard code into two helpers {un}set_current_in_page_owner() and
    use them before calling into the two functions that might allocate memory.
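    
    The recursion guard can be illustrated with a small userspace sketch.
    Everything here is hypothetical (struct demo_task, the demo_* helpers and
    the counters); it only models the idea of a per-task "already inside the
    tracker" flag that is set and cleared through a pair of helpers.

```c
#include <stdbool.h>

/* Hypothetical per-task state standing in for the "in page_owner" flag. */
struct demo_task {
    bool in_tracker;   /* set while tracking code is running */
    int tracked;       /* allocations that were recorded */
    int skipped;       /* nested allocations that were not recorded */
};

void demo_set_in_tracker(struct demo_task *t)   { t->in_tracker = true; }
void demo_unset_in_tracker(struct demo_task *t) { t->in_tracker = false; }

void demo_alloc(struct demo_task *t);

/* Tracking code that itself needs memory, which re-enters demo_alloc(). */
void demo_track(struct demo_task *t)
{
    if (t->in_tracker) {    /* recursion attempt: bail out early */
        t->skipped++;
        return;
    }
    demo_set_in_tracker(t);
    demo_alloc(t);          /* nested allocation made by the tracker */
    t->tracked++;
    demo_unset_in_tracker(t);
}

/* Every allocation is reported to the tracker. */
void demo_alloc(struct demo_task *t)
{
    demo_track(t);
}
```

    A single top-level demo_alloc() call records one allocation and skips the
    nested one, instead of recursing without bound.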
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Oscar Salvador <[email protected]>
    Fixes: 217b211 ("mm,page_owner: implement the tracking of the stacks count")
    Reviewed-by: Vlastimil Babka <[email protected]>
    Cc: Alexander Potapenko <[email protected]>
    Cc: Andrey Konovalov <[email protected]>
    Cc: Marco Elver <[email protected]>
    Cc: Michal Hocko <[email protected]>
    Cc: Oscar Salvador <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    osalvadorvilardaga authored and akpm00 committed Mar 26, 2024
    7844c01
  43. mm: increase folio batch size

    On a 104 thread, 2 socket Skylake system, Intel report a 4.7% performance
    reduction with will-it-scale page_fault2.  This was due to reducing the
    size of the batch from 32 to 15.  Increasing the folio batch size from 15
    to 31 gives a performance increase of 12.5% relative to the original, or
    17.2% relative to the reduced performance commit.
    
    The penalty of this commit is an additional 128 bytes of stack usage.  Six
    folio_batches are also allocated from percpu memory in cpu_fbatches so
    that will be an additional 768 bytes of percpu memory (per CPU).  Tim Chen
    originally submitted a patch like this in 2020:
    https://lore.kernel.org/linux-mm/d1cc9f12a8ad6c2a52cb600d93b06b064f2bbc57.1593205965.git.tim.c.chen@linux.intel.com/
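    
    The stack and percpu figures follow from the pointer array growing by 16
    entries. The sketch below reproduces the arithmetic with a hypothetical
    layout (a three-byte header followed by an array of pointers); the struct
    and function names are assumptions for illustration. On a 64-bit build
    with 8-byte pointers this works out to 16 * 8 = 128 extra bytes per batch
    and 6 * 128 = 768 extra bytes per CPU.

```c
#include <stddef.h>

/* Hypothetical batch layouts: small header plus an array of pointers. */
struct demo_batch15 { unsigned char nr, i; _Bool drained; void *slots[15]; };
struct demo_batch31 { unsigned char nr, i; _Bool drained; void *slots[31]; };

/* Extra bytes needed by one batch after growing from 15 to 31 slots. */
size_t demo_extra_bytes_per_batch(void)
{
    return sizeof(struct demo_batch31) - sizeof(struct demo_batch15);
}

/* Extra percpu bytes when six such batches are kept per CPU. */
size_t demo_extra_percpu_bytes(void)
{
    return 6 * demo_extra_bytes_per_batch();
}
```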
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 99fbb6b ("mm: make folios_put() the basis of release_pages()")
    Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
    Tested-by: Yujie Liu <[email protected]>
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-lkp/[email protected]
    Signed-off-by: Andrew Morton <[email protected]>
    Matthew Wilcox (Oracle) authored and akpm00 committed Mar 26, 2024
    9cecde8
  44. mm: cachestat: fix two shmem bugs

    When cachestat on shmem races with swapping and invalidation, there
    are two possible bugs:
    
    1) A swapin error can have resulted in a poisoned swap entry in the
       shmem inode's xarray. Calling get_shadow_from_swap_cache() on it
       will result in an out-of-bounds access to swapper_spaces[].
    
       Validate the entry with non_swap_entry() before going further.
    
    2) When we find a valid swap entry in the shmem's inode, the shadow
       entry in the swapcache might not exist yet (swap IO is still in
       progress and we're before __remove_mapping) or might no longer
       exist (swapin, invalidation, or swapoff have removed the shadow
       from swapcache after we saw the shmem swap entry).
    
       This will send a NULL to workingset_test_recent(). The latter
       purely operates on pointer bits, so it won't crash - node 0, memcg
       ID 0, eviction timestamp 0, etc. are all valid inputs - but it's a
       bogus test. In theory that could result in a false "recently
       evicted" count.
    
       Such a false positive wouldn't be the end of the world. But for
       code clarity and (future) robustness, be explicit about this case.
    
       Bail on get_shadow_from_swap_cache() returning NULL.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cf264e1 ("cachestat: implement cachestat syscall")
    Signed-off-by: Johannes Weiner <[email protected]>
    Reported-by: Chengming Zhou <[email protected]>	[Bug #1]
    Reported-by: Jann Horn <[email protected]>		[Bug #2]
    Reviewed-by: Chengming Zhou <[email protected]>
    Reviewed-by: Nhat Pham <[email protected]>
    Cc: <[email protected]>				[v6.5+]
    Signed-off-by: Andrew Morton <[email protected]>
    hnaz authored and akpm00 committed Mar 26, 2024
    d5d39c7
  45. tools/Makefile: remove cgroup target

    The tools/cgroup directory no longer contains a Makefile.  This patch
    updates the top-level tools/Makefile to remove references to building and
    installing cgroup components.  This change reflects the current structure
    of the tools directory and fixes the build failure when building tools in
    the top-level directory.
    
    linux/tools$ make cgroup
      DESCEND cgroup
    make[1]: *** No targets specified and no makefile found.  Stop.
    make: *** [Makefile:73: cgroup] Error 2
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Cong Liu <[email protected]>
    Acked-by: Stanislav Fomichev <[email protected]>
    Reviewed-by: Dmitry Rokosov <[email protected]>
    Cc: Cong Liu <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Cong Liu authored and akpm00 committed Mar 26, 2024
    950bf45
  46. selftests: mm: restore settings from only parent process

    The atexit() handler runs in the parent process as well as in forked
    child processes.  Hence the child restores the settings at exit while the
    parent is still executing.  Fix this by checking the pid of the process
    running the atexit() handler and only restoring the THP number from the
    parent process.
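    
    A minimal sketch of such a pid check, assuming the parent records its pid
    before registering the handler. All demo_* names are hypothetical and the
    actual restore is replaced by a counter.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

static pid_t demo_parent_pid;
static int demo_restore_calls;

/* Exit handler: forked children inherit it, so filter by pid. */
void demo_restore_settings(void)
{
    if (getpid() != demo_parent_pid)
        return;             /* child exiting: leave the settings alone */
    demo_restore_calls++;   /* the real restore would happen here */
}

/* Record the parent pid, then register the exit handler. */
int demo_register_restore(void)
{
    demo_parent_pid = getpid();
    return atexit(demo_restore_settings);
}

/* Accessor so callers can observe how often the restore ran. */
int demo_restore_count(void)
{
    return demo_restore_calls;
}
```

    Registration has to happen in the parent before any fork(), so that every
    child inherits both the handler and the recorded parent pid.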
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: c23ea61 ("selftests/mm: protection_keys: save/restore nr_hugepages settings")
    Signed-off-by: Muhammad Usama Anjum <[email protected]>
    Tested-by: Joey Gouly <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    musamaanjum authored and akpm00 committed Mar 26, 2024
    c52eb6d
  47. mm: zswap: fix kernel BUG in sg_init_one

    sg_init_one() relies on linearly mapped low memory for the safe
    utilization of virt_to_page().  Otherwise, we trigger a kernel BUG:
    
    kernel BUG at include/linux/scatterlist.h:187!
    Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
    Modules linked in:
    CPU: 0 PID: 2997 Comm: syz-executor198 Not tainted 6.8.0-syzkaller #0
    Hardware name: ARM-Versatile Express
    PC is at sg_set_buf include/linux/scatterlist.h:187 [inline]
    PC is at sg_init_one+0x9c/0xa8 lib/scatterlist.c:143
    LR is at sg_init_table+0x2c/0x40 lib/scatterlist.c:128
    Backtrace:
    [<807e16ac>] (sg_init_one) from [<804c1824>] (zswap_decompress+0xbc/0x208 mm/zswap.c:1089)
     r7:83471c80 r6:def6d08c r5:844847d0 r4:ff7e7ef4
    [<804c1768>] (zswap_decompress) from [<804c4468>] (zswap_load+0x15c/0x198 mm/zswap.c:1637)
     r9:8446eb80 r8:8446eb80 r7:8446eb84 r6:def6d08c r5:00000001 r4:844847d0
    [<804c430c>] (zswap_load) from [<804b9644>] (swap_read_folio+0xa8/0x498 mm/page_io.c:518)
     r9:844ac800 r8:835e6c00 r7:00000000 r6:df955d4c r5:00000001 r4:def6d08c
    [<804b959c>] (swap_read_folio) from [<804bb064>] (swap_cluster_readahead+0x1c4/0x34c mm/swap_state.c:684)
     r10:00000000 r9:00000007 r8:df955d4b r7:00000000 r6:00000000 r5:00100cca
     r4:00000001
    [<804baea0>] (swap_cluster_readahead) from [<804bb3b8>] (swapin_readahead+0x68/0x4a8 mm/swap_state.c:904)
     r10:df955eb8 r9:00000000 r8:00100cca r7:84476480 r6:00000001 r5:00000000
     r4:00000001
    [<804bb350>] (swapin_readahead) from [<8047cde0>] (do_swap_page+0x200/0xcc4 mm/memory.c:4046)
     r10:00000040 r9:00000000 r8:844ac800 r7:84476480 r6:00000001 r5:00000000
     r4:df955eb8
    [<8047cbe0>] (do_swap_page) from [<8047e6c4>] (handle_pte_fault mm/memory.c:5301 [inline])
    [<8047cbe0>] (do_swap_page) from [<8047e6c4>] (__handle_mm_fault mm/memory.c:5439 [inline])
    [<8047cbe0>] (do_swap_page) from [<8047e6c4>] (handle_mm_fault+0x3d8/0x12b8 mm/memory.c:5604)
     r10:00000040 r9:842b3900 r8:7eb0d000 r7:84476480 r6:7eb0d000 r5:835e6c00
     r4:00000254
    [<8047e2ec>] (handle_mm_fault) from [<80215d28>] (do_page_fault+0x148/0x3a8 arch/arm/mm/fault.c:326)
     r10:00000007 r9:842b3900 r8:7eb0d000 r7:00000207 r6:00000254 r5:7eb0d9b4
     r4:df955fb0
    [<80215be0>] (do_page_fault) from [<80216170>] (do_DataAbort+0x38/0xa8 arch/arm/mm/fault.c:558)
     r10:7eb0da7c r9:00000000 r8:80215be0 r7:df955fb0 r6:7eb0d9b4 r5:00000207
     r4:8261d0e0
    [<80216138>] (do_DataAbort) from [<80200e3c>] (__dabt_usr+0x5c/0x60 arch/arm/kernel/entry-armv.S:427)
    Exception stack(0xdf955fb0 to 0xdf955ff8)
    5fa0:                                     00000000 00000000 22d5f800 0008d158
    5fc0: 00000000 7eb0d9a4 00000000 00000109 00000000 00000000 7eb0da7c 7eb0da3c
    5fe0: 00000000 7eb0d9a0 00000001 00066bd4 00000010 ffffffff
     r8:824a9044 r7:835e6c00 r6:ffffffff r5:00000010 r4:00066bd4
    Code: 1a000004 e1822003 e8860094 e89da8f0 (e7f001f2)
    ---[ end trace 0000000000000000 ]---
    ----------------
    Code disassembly (best guess):
       0:	1a000004 	bne	0x18
       4:	e1822003 	orr	r2, r2, r3
       8:	e8860094 	stm	r6, {r2, r4, r7}
       c:	e89da8f0 	ldm	sp, {r4, r5, r6, r7, fp, sp, pc}
    * 10:	e7f001f2 	udf	#18 <-- trapping instruction
    
    Consequently, we have two choices: either employ kmap_to_page() alongside
    sg_set_page(), or resort to copying high memory contents to a temporary
    buffer residing in low memory.  However, considering the introduction of
    the WARN_ON_ONCE in commit ef6e06b ("highmem: fix kmap_to_page() for
    kmap_local_page() addresses"), which specifically addresses high memory
    concerns, it appears that memcpy remains the sole viable option.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 270700d ("mm/zswap: remove the memcpy if acomp is not sleepable")
    Signed-off-by: Barry Song <[email protected]>
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: [email protected]
    Acked-by: Yosry Ahmed <[email protected]>
    Reviewed-by: Nhat Pham <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Cc: Chris Li <[email protected]>
    Cc: Ira Weiny <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Barry Song authored and akpm00 committed Mar 26, 2024
    Commit 9c50083
  48. MAINTAINERS: remove incorrect M: tag for [email protected]

    The [email protected] mailing list should only be listed under the
    L: (List) tag in the MAINTAINERS file.  However, it was incorrectly listed
    under both L: and M: (Maintainers) tags, which is not accurate.  Remove
    the M: tag for [email protected] in the MAINTAINERS file to reflect
    the correct categorization.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Kuan-Wei Chiu <[email protected]>
    Cc: Ching-Chun (Jim) Huang <[email protected]>
    Cc: Matthew Sakai <[email protected]>
    Cc: Michael Sclafani <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    visitorckw authored and akpm00 committed Mar 26, 2024
    Commit db09f2d
  49. prctl: generalize PR_SET_MDWE support check to be per-arch

    Patch series "ARM: prctl: Reject PR_SET_MDWE where not supported".
    
    I noticed after a recent kernel update that my ARM926 system started
    segfaulting on any execve() after calling prctl(PR_SET_MDWE).  After some
    investigation it appears that ARMv5 is incapable of providing the
    appropriate protections for MDWE, since any readable memory is also
    implicitly executable.
    
    The prctl_set_mdwe() function already had some special-case logic added
    disabling it on PARISC (commit 7938381, "prctl: Disable
    prctl(PR_SET_MDWE) on parisc"); this patch series (1) generalizes that
    check to use an arch_*() function, and (2) adds a corresponding override
    for ARM to disable MDWE on pre-ARMv6 CPUs.
    
    With the series applied, prctl(PR_SET_MDWE) is rejected on ARMv5 and
    subsequent execve() calls (as well as mmap(PROT_READ|PROT_WRITE)) can
    succeed instead of unconditionally failing; on ARMv6 the prctl works as it
    did previously.
    
    [0] https://lore.kernel.org/all/2023112456-linked-nape-bf19@gregkh/
    
    
    This patch (of 2):
    
    There exist systems other than PARISC where MDWE may not be feasible to
    support; rather than cluttering up the generic code with additional
    arch-specific logic let's add a generic function for checking MDWE support
    and allow each arch to override it as needed.
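
The override pattern described above can be sketched in plain C. This is a minimal illustration, not the patch itself: the names demo_arch_mdwe_supported() and demo_set_mdwe() and the DEMO_ARCH_NO_MDWE switch are hypothetical stand-ins for the generic check and an architecture's override.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical arch header: an architecture that cannot support the
 * feature provides the hook plus a same-named macro. Build with
 * -DDEMO_ARCH_NO_MDWE to simulate such an architecture.
 */
#ifdef DEMO_ARCH_NO_MDWE
static inline bool demo_arch_mdwe_supported(void)
{
	return false;
}
#define demo_arch_mdwe_supported demo_arch_mdwe_supported
#endif

/* Generic header: supply the default only when no override exists. */
#ifndef demo_arch_mdwe_supported
static inline bool demo_arch_mdwe_supported(void)
{
	return true;
}
#define demo_arch_mdwe_supported demo_arch_mdwe_supported
#endif

/* Generic code stays free of arch-specific conditionals. */
static int demo_set_mdwe(void)
{
	if (!demo_arch_mdwe_supported())
		return -EINVAL;
	return 0;
}
```

Built without DEMO_ARCH_NO_MDWE, demo_set_mdwe() returns 0; with it, the call is rejected with -EINVAL.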
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Zev Weiss <[email protected]>
    Acked-by: Helge Deller <[email protected]>	[parisc]
    Cc: Borislav Petkov <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Florent Revest <[email protected]>
    Cc: "James E.J. Bottomley" <[email protected]>
    Cc: Josh Triplett <[email protected]>
    Cc: Kees Cook <[email protected]>
    Cc: Miguel Ojeda <[email protected]>
    Cc: Mike Rapoport (IBM) <[email protected]>
    Cc: Oleg Nesterov <[email protected]>
    Cc: Ondrej Mosnacek <[email protected]>
    Cc: Rick Edgecombe <[email protected]>
    Cc: Russell King (Oracle) <[email protected]>
    Cc: Sam James <[email protected]>
    Cc: Stefan Roesch <[email protected]>
    Cc: Yang Shi <[email protected]>
    Cc: Yin Fengwei <[email protected]>
    Cc: <[email protected]>	[6.3+]
    Signed-off-by: Andrew Morton <[email protected]>
    zevweiss authored and akpm00 committed Mar 26, 2024
    Commit d5aad4c
  50. ARM: prctl: reject PR_SET_MDWE on pre-ARMv6

    On v5 and lower CPUs we can't provide MDWE protection, so ensure we fail
    any attempt to enable it via prctl(PR_SET_MDWE).
    
    Previously such an attempt would misleadingly succeed, leading to any
    subsequent mmap(PROT_READ|PROT_WRITE) or execve() failing unconditionally
    (the latter somewhat violently via force_fatal_sig(SIGSEGV) due to
    READ_IMPLIES_EXEC).
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Zev Weiss <[email protected]>
    Cc: <[email protected]>	[6.3+]
    Cc: Borislav Petkov <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Florent Revest <[email protected]>
    Cc: Helge Deller <[email protected]>
    Cc: "James E.J. Bottomley" <[email protected]>
    Cc: Josh Triplett <[email protected]>
    Cc: Kees Cook <[email protected]>
    Cc: Miguel Ojeda <[email protected]>
    Cc: Mike Rapoport (IBM) <[email protected]>
    Cc: Oleg Nesterov <[email protected]>
    Cc: Ondrej Mosnacek <[email protected]>
    Cc: Rick Edgecombe <[email protected]>
    Cc: Russell King (Oracle) <[email protected]>
    Cc: Sam James <[email protected]>
    Cc: Stefan Roesch <[email protected]>
    Cc: Yang Shi <[email protected]>
    Cc: Yin Fengwei <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    zevweiss authored and akpm00 committed Mar 26, 2024
    Commit 166ce84
  51. mm: zswap: fix writeback shrinker GFP_NOIO/GFP_NOFS recursion

    Kent forwards this bug report of zswap re-entering the block layer
    from an IO request allocation and locking up:
    
    [10264.128242] sysrq: Show Blocked State
    [10264.128268] task:kworker/20:0H   state:D stack:0     pid:143   tgid:143   ppid:2      flags:0x00004000
    [10264.128271] Workqueue: bcachefs_io btree_write_submit [bcachefs]
    [10264.128295] Call Trace:
    [10264.128295]  <TASK>
    [10264.128297]  __schedule+0x3e6/0x1520
    [10264.128303]  schedule+0x32/0xd0
    [10264.128304]  schedule_timeout+0x98/0x160
    [10264.128308]  io_schedule_timeout+0x50/0x80
    [10264.128309]  wait_for_completion_io_timeout+0x7f/0x180
    [10264.128310]  submit_bio_wait+0x78/0xb0
    [10264.128313]  swap_writepage_bdev_sync+0xf6/0x150
    [10264.128317]  zswap_writeback_entry+0xf2/0x180
    [10264.128319]  shrink_memcg_cb+0xe7/0x2f0
    [10264.128322]  __list_lru_walk_one+0xb9/0x1d0
    [10264.128325]  list_lru_walk_one+0x5d/0x90
    [10264.128326]  zswap_shrinker_scan+0xc4/0x130
    [10264.128327]  do_shrink_slab+0x13f/0x360
    [10264.128328]  shrink_slab+0x28e/0x3c0
    [10264.128329]  shrink_one+0x123/0x1b0
    [10264.128331]  shrink_node+0x97e/0xbc0
    [10264.128332]  do_try_to_free_pages+0xe7/0x5b0
    [10264.128333]  try_to_free_pages+0xe1/0x200
    [10264.128334]  __alloc_pages_slowpath.constprop.0+0x343/0xde0
    [10264.128337]  __alloc_pages+0x32d/0x350
    [10264.128338]  allocate_slab+0x400/0x460
    [10264.128339]  ___slab_alloc+0x40d/0xa40
    [10264.128345]  kmem_cache_alloc+0x2e7/0x330
    [10264.128348]  mempool_alloc+0x86/0x1b0
    [10264.128349]  bio_alloc_bioset+0x200/0x4f0
    [10264.128352]  bio_alloc_clone+0x23/0x60
    [10264.128354]  alloc_io+0x26/0xf0 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
    [10264.128361]  dm_submit_bio+0xb8/0x580 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
    [10264.128366]  __submit_bio+0xb0/0x170
    [10264.128367]  submit_bio_noacct_nocheck+0x159/0x370
    [10264.128368]  bch2_submit_wbio_replicas+0x21c/0x3a0 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
    [10264.128391]  btree_write_submit+0x1cf/0x220 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
    [10264.128406]  process_one_work+0x178/0x350
    [10264.128408]  worker_thread+0x30f/0x450
    [10264.128409]  kthread+0xe5/0x120
    
    The zswap shrinker resumes the swap_writepage()s that were intercepted
    by the zswap store. This will enter the block layer, and may even
    enter the filesystem depending on the swap backing file.
    
    Make it respect GFP_NOIO and GFP_NOFS.
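
As a rough userspace model of that rule (the flag values and the demo_* names are illustrative assumptions, not the kernel's definitions), a shrinker that may issue writeback should bail out unless the allocation context permits both I/O and filesystem entry:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the kernel's real GFP values differ. */
#define DEMO_GFP_IO (1u << 0) /* caller may start block I/O */
#define DEMO_GFP_FS (1u << 1) /* caller may call into the filesystem */

/* True only when the context allows both I/O and filesystem entry. */
static bool demo_gfp_has_io_fs(unsigned int gfp_mask)
{
	const unsigned int need = DEMO_GFP_IO | DEMO_GFP_FS;

	return (gfp_mask & need) == need;
}

/* Returns the number of entries written back. */
static unsigned long demo_shrinker_scan(unsigned int gfp_mask,
					unsigned long nr_to_scan)
{
	/* A NOIO or NOFS context must not re-enter the block layer. */
	if (!demo_gfp_has_io_fs(gfp_mask))
		return 0;

	return nr_to_scan; /* stand-in for the real writeback walk */
}
```

In the trace above, the allocation came from inside bio submission, so a check of this kind would have made the shrinker do nothing instead of waiting on swap writeback.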
    
    Link: https://lore.kernel.org/linux-mm/rc4pk2r42oyvjo4dc62z6sovquyllq56i5cdgcaqbd7wy3hfzr@n4nbxido3fme/
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: b5ba474 ("zswap: shrink zswap pool based on memory pressure")
    Signed-off-by: Johannes Weiner <[email protected]>
    Reported-by: Kent Overstreet <[email protected]>
    Acked-by: Yosry Ahmed <[email protected]>
    Reported-by: Jérôme Poulin <[email protected]>
    Reviewed-by: Nhat Pham <[email protected]>
    Reviewed-by: Chengming Zhou <[email protected]>
    Cc: [email protected]	[v6.8]
    Signed-off-by: Andrew Morton <[email protected]>
    hnaz authored and akpm00 committed Mar 26, 2024
    Commit 30fb6a8
  52. selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM

    The sigbus-wp test requires the UFFD_FEATURE_WP_HUGETLBFS_SHMEM flag for
    shmem and hugetlb targets.  Otherwise it is not backwards compatible with
    kernels <5.19 and fails with EINVAL.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 73c1ea9 ("selftests/mm: move uffd sig/events tests into uffd unit tests")
    Signed-off-by: Edward Liaw <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    edliaw authored and akpm00 committed Mar 26, 2024
    Commit 105840e
  53. tmpfs: fix race on handling dquot rbtree

    A syzkaller reproducer found a race while attempting to remove dquot
    information from the rb tree.
    
    Fetching the rb_tree root node must also be protected by
    dqopt->dqio_sem.  Otherwise, given the right timing, shmem_release_dquot()
    will trigger a warning because it couldn't find a node in the tree, when
    the real reason was the root node changing before the search started:
    
    Thread 1				Thread 2
    - shmem_release_dquot()			- shmem_{acquire,release}_dquot()
    
    - fetch ROOT				- Fetch ROOT
    
    					- acquire dqio_sem
    - wait dqio_sem
    
    					- do something, trigger a tree rebalance
    					- release dqio_sem
    
    - acquire dqio_sem
    - start searching for the node, but
      from the wrong location, missing
      the node, and triggering a warning.
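
The ordering requirement can be shown with a small userspace sketch. It assumes a plain binary search tree and a pthread mutex standing in for dqio_sem; all demo_* names are hypothetical. The key point is that the root pointer is read only after the lock is held:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_node {
	int id;
	struct demo_node *left, *right;
};

struct demo_tree {
	pthread_mutex_t lock;   /* plays the role of dqio_sem */
	struct demo_node *root; /* may change whenever the lock is not held */
};

static struct demo_node *demo_find(struct demo_tree *tree, int id)
{
	struct demo_node *node;

	pthread_mutex_lock(&tree->lock);
	/* Fetch the root only now, so a concurrent rebalance cannot
	 * leave us searching from a stale starting point. */
	node = tree->root;
	while (node && node->id != id)
		node = id < node->id ? node->left : node->right;
	pthread_mutex_unlock(&tree->lock);

	/* Simplification: a real caller would use the node under the lock. */
	return node;
}
```

Reading tree->root before pthread_mutex_lock() would reproduce the race in the diagram: the search could start from a node that is no longer the root.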
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: eafc474 ("shmem: prepare shmem quota infrastructure")
    Signed-off-by: Carlos Maiolino <[email protected]>
    Reported-by: Ubisectech Sirius <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    cmaiolino authored and akpm00 committed Mar 26, 2024
    Commit 0a69b6b
  54. userfaultfd: fix deadlock warning when locking src and dst VMAs

    Use down_read_nested() to avoid the warning.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 867a43a ("userfaultfd: use per-vma locks in userfaultfd operations")
    Reported-by: [email protected]
    Signed-off-by: Lokesh Gidra <[email protected]>
    Cc: Andrea Arcangeli <[email protected]>
    Cc: Axel Rasmussen <[email protected]>
    Cc: Brian Geffon <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Hillf Danton <[email protected]>
    Cc: Jann Horn <[email protected]> [Bug #2]
    Cc: Kalesh Singh <[email protected]>
    Cc: Lokesh Gidra <[email protected]>
    Cc: Mike Rapoport (IBM) <[email protected]>
    Cc: Nicolas Geoffray <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: Suren Baghdasaryan <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    SENSEIIIII authored and akpm00 committed Mar 26, 2024
    Commit 30af24f
  55. hexagon: vmlinux.lds.S: handle attributes section

    After the linked LLVM change, the build fails with
    CONFIG_LD_ORPHAN_WARN_LEVEL="error", which happens with allmodconfig:
    
      ld.lld: error: vmlinux.a(init/main.o):(.hexagon.attributes) is being placed in '.hexagon.attributes'
    
    Handle the attributes section in a similar manner as arm and riscv by
    adding it after the primary ELF_DETAILS grouping in vmlinux.lds.S, which
    fixes the error.
    
    Link: https://lkml.kernel.org/r/20240319-hexagon-handle-attributes-section-vmlinux-lds-s-v1-1-59855dab8872@kernel.org
    Fixes: 113616e ("hexagon: select ARCH_WANT_LD_ORPHAN_WARN")
    Link: llvm/llvm-project@31f4b32
    Signed-off-by: Nathan Chancellor <[email protected]>
    Reviewed-by: Brian Cain <[email protected]>
    Cc: Bill Wendling <[email protected]>
    Cc: Justin Stitt <[email protected]>
    Cc: Nick Desaulniers <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    nathanchance authored and akpm00 committed Mar 26, 2024
    Commit 549aa96
  56. selftests/mm: fix ARM related issue with fork after pthread_create

    The following issue was observed while running the uffd-unit-tests
    selftest on ARM devices.  On x86_64 no issues were detected:
    
    pthread_create() followed by fork() caused a deadlock in certain cases
    wherein fork() required some work to be completed by the created thread.
    Use synchronization to ensure that the created thread's start function
    has started before invoking fork().
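
A minimal sketch of that synchronization, assuming an atomic_bool handshake as the bracketed note below mentions. The function and variable names are hypothetical and this is not the selftest's code:

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static atomic_bool demo_thread_started;
static atomic_bool demo_thread_stop;

static void *demo_worker(void *arg)
{
	(void)arg;
	/* Announce that the start function is running. */
	atomic_store(&demo_thread_started, true);
	while (!atomic_load(&demo_thread_stop))
		sched_yield();
	return NULL;
}

/* Returns the child's exit status, or -1 on failure. */
static int demo_fork_after_thread_ready(void)
{
	pthread_t thread;
	pid_t pid;
	int result = -1;

	atomic_store(&demo_thread_started, false);
	atomic_store(&demo_thread_stop, false);
	if (pthread_create(&thread, NULL, demo_worker, NULL) != 0)
		return -1;

	/* Do not fork until the new thread has entered its start function. */
	while (!atomic_load(&demo_thread_started))
		sched_yield();

	pid = fork();
	if (pid == 0)
		_exit(7); /* child: leave immediately */
	if (pid > 0) {
		int status;

		if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
			result = WEXITSTATUS(status);
	}

	atomic_store(&demo_thread_stop, true);
	pthread_join(thread, NULL);
	return result;
}
```

Calling demo_fork_after_thread_ready() should return the child's exit code (7 here) once the handshake has completed.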
    
    [[email protected]: refactored to use atomic_bool]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 760aee0 ("selftests/mm: add tests for RO pinning vs fork()")
    Signed-off-by: Lokesh Gidra <[email protected]>
    Signed-off-by: Edward Liaw <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    edliaw authored and akpm00 committed Mar 26, 2024
    Commit 8c86437
  57. mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices

    Zhongkun He reports data corruption when combining zswap with zram.
    
    The issue is the exclusive loads we're doing in zswap. They assume
    that all reads are going into the swapcache, which can assume
    authoritative ownership of the data and so the zswap copy can go.
    
    However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try to
    bypass the swapcache.  This results in an optimistic read of the swap data
    into a page that will be dismissed if the fault fails due to races.  In
    this case, zswap mustn't drop its authoritative copy.
    
    Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
    Fixes: b9c91c4 ("mm: zswap: support exclusive loads")
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Johannes Weiner <[email protected]>
    Reported-by: Zhongkun He <[email protected]>
    Tested-by: Zhongkun He <[email protected]>
    Acked-by: Yosry Ahmed <[email protected]>
    Acked-by: Barry Song <[email protected]>
    Reviewed-by: Chengming Zhou <[email protected]>
    Reviewed-by: Nhat Pham <[email protected]>
    Acked-by: Chris Li <[email protected]>
    Cc: <[email protected]>	[6.5+]
    Signed-off-by: Andrew Morton <[email protected]>
    hnaz authored and akpm00 committed Mar 26, 2024
    Commit 25cd241
  58. crash: use macro to add crashk_res into iomem early for specific arch

    There are regression reports[1][2] that the crashkernel region on x86_64
    sometimes can't be added into the iomem tree.  This causes the later
    failure of kdump loading.
    
    This happened after commit 4a693ce ("kdump: defer the insertion of
    crashkernel resources") was merged.
    
    Even though these reported issues have been shown to be related to
    other components and were merely exposed once the above commit was
    applied, I would still like to keep crashk_res and crashk_low_res being
    added into iomem early as before, because the early adding has always
    been there on x86_64 and has worked very well.  For the safety of kdump,
    let's change it back.
    
    Here, add a macro HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY so that only
    an ARCH defining the macro adds crashk_res/_low_res into iomem early.
    Then define HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY on x86 to enable it.
    
    Note: In reserve_crashkernel_low(), there's a remnant of crashk_low_res
    handling which was mistakenly added back in commit 85fcde4 ("kexec:
    split crashkernel reservation code out from crash_core.c").
    
    [1]
    [PATCH V2] x86/kexec: do not update E820 kexec table for setup_data
    https://lore.kernel.org/all/[email protected]/T/#u
    
    [2]
    Question about Address Range Validation in Crash Kernel Allocation
    https://lore.kernel.org/all/[email protected]/T/#u
    
    Link: https://lkml.kernel.org/r/ZgDYemRQ2jxjLkq+@MiWiFi-R3L-srv
    Fixes: 4a693ce ("kdump: defer the insertion of crashkernel resources")
    Signed-off-by: Baoquan He <[email protected]>
    Cc: Dave Young <[email protected]>
    Cc: Huacai Chen <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Jiri Bohac <[email protected]>
    Cc: Li Huafei <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Baoquan He authored and akpm00 committed Mar 26, 2024
    Commit 32fbe52

Commits on Mar 27, 2024

  1. net: pin system percpu page_pools to the corresponding NUMA nodes

    System page_pools are percpu and one instance can be used only on
    one CPU.
    %NUMA_NO_NODE is fine for allocating pages, as the PP core always
    allocates local pages in this case. But for the struct &page_pool
    itself, this node ID means they are allocated on the boot CPU,
    which may belong to a different node than the target CPU.
    Pin system page_pools to the corresponding nodes when creating,
    so that all the allocated data will always be local. Use
    cpu_to_mem() to account for memoryless nodes.
    Nodes != 0 win some Kpps when testing with xdp-trafficgen.
    
    Signed-off-by: Alexander Lobakin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    alobakin authored and kuba-moo committed Mar 27, 2024
    Commit 341ee1a
  2. net: amd8111e: Drop unused copy of pm_cap

    The copy of pdev->pm_cap in struct amd8111e_priv is never used.  Drop it.
    
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    bjorn-helgaas authored and kuba-moo committed Mar 27, 2024
    Commit ee36b1e
  3. tls: recv: process_rx_list shouldn't use an offset with kvec

    Only MSG_PEEK needs to copy from an offset during the final
    process_rx_list call, because the bytes we copied at the beginning of
    tls_sw_recvmsg were left on the rx_list. In the KVEC case, we removed
    data from the rx_list as we were copying it, so there's no need to use
    an offset, just like in the normal case.
    
    Fixes: 692d7b5 ("tls: Fix recvmsg() to be able to peek across multiple records")
    Signed-off-by: Sabrina Dubroca <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/e5487514f828e0347d2b92ca40002c62b58af73d.1711120964.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <[email protected]>
    qsn authored and kuba-moo committed Mar 27, 2024
    Commit 7608a97
  4. tls: adjust recv return with async crypto and failed copy to userspace

    process_rx_list may not copy as many bytes as we want to the userspace
    buffer, for example in case we hit an EFAULT during the copy. If this
    happens, we should only count the bytes that were actually copied,
    which may be 0.
    
    Subtracting async_copy_bytes is correct in both peek and !peek cases,
    because decrypted == async_copy_bytes + peeked for the peek case: peek
    is always !ZC, and we can go through either the sync or async path. In
    the async case, we add chunk to both decrypted and
    async_copy_bytes. In the sync case, we add chunk to both decrypted and
    peeked. I missed that in commit 6caaf10 ("tls: fix peeking with
    sync+async decryption").
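
The accounting above can be checked with a small arithmetic model. This is a simplified sketch, not the kernel function: demo_adjust_decrypted() is a hypothetical helper, and err stands for the number of bytes the final rx_list drain actually copied (or a negative error code).

```c
#include <assert.h>

/*
 * decrypted:        bytes counted so far (async_copy_bytes + peeked when
 *                   peeking, async_copy_bytes alone otherwise in this model)
 * async_copy_bytes: bytes whose copy was deferred to the final drain
 * err:              bytes actually copied by the drain, or a negative errno
 */
static long demo_adjust_decrypted(long decrypted, long async_copy_bytes,
				  long err)
{
	long copied = err > 0 ? err : 0;

	/* Replace the optimistic async count with what was really copied. */
	return decrypted - async_copy_bytes + copied;
}
```

For example, with 100 async bytes and a drain that fails outright, the model reports 0 bytes; if 50 further bytes had been peeked through the sync path, those 50 remain counted.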
    
    Fixes: 4d42cd6 ("tls: rx: fix return value for async crypto")
    Signed-off-by: Sabrina Dubroca <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/1b5a1eaab3c088a9dd5d9f1059ceecd7afe888d1.1711120964.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <[email protected]>
    qsn authored and kuba-moo committed Mar 27, 2024
    Commit 85eef9a
  5. selftests: tls: add test with a partially invalid iov

    Make sure that we don't return more bytes than we actually received if
    the userspace buffer was bogus. We expect to receive at least the rest
    of rec1, and possibly some of rec2 (currently, we don't, but that
    would be ok).
    
    Signed-off-by: Sabrina Dubroca <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/720e61b3d3eab40af198a58ce2cd1ee019f0ceb1.1711120964.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <[email protected]>
    qsn authored and kuba-moo committed Mar 27, 2024
    Commit dc54b81
  6. tls: get psock ref after taking rxlock to avoid leak

    At the start of tls_sw_recvmsg, we take a reference on the psock, and
    then call tls_rx_reader_lock. If that fails, we return directly
    without releasing the reference.
    
    Instead of adding a new label, just take the reference after locking
    has succeeded, since we don't need it before.
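
The reordering can be illustrated with a toy model in which the lock step may fail. The struct and function names are hypothetical, and the reference count is a plain integer rather than the real psock API:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct demo_sock {
	int psock_refs;  /* outstanding psock references */
	bool lock_fails; /* simulate an interrupted lock attempt */
	bool locked;
};

static int demo_reader_lock(struct demo_sock *sk)
{
	if (sk->lock_fails)
		return -EINTR;
	sk->locked = true;
	return 0;
}

static int demo_recvmsg(struct demo_sock *sk)
{
	int err = demo_reader_lock(sk);

	if (err < 0)
		return err; /* nothing has been acquired yet, nothing to undo */

	sk->psock_refs++; /* take the reference only after locking succeeded */

	/* ... receive path would run here ... */

	sk->psock_refs--;
	sk->locked = false;
	return 0;
}
```

Because the reference is taken after the only early-return point, the failure path needs no extra cleanup label.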
    
    Fixes: 4cbc325 ("tls: rx: allow only one reader at a time")
    Signed-off-by: Sabrina Dubroca <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/fe2ade22d030051ce4c3638704ed58b67d0df643.1711120964.git.sd@queasysnail.net
    Signed-off-by: Jakub Kicinski <[email protected]>
    qsn authored and kuba-moo committed Mar 27, 2024
    Commit 417e91e
  7. Merge branch 'tls-recvmsg-fixes'

    Sabrina Dubroca says:
    
    ====================
    tls: recvmsg fixes
    
    The first two fixes are again related to async decrypt. The last one
    is unrelated but I stumbled upon it while reading the code.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 27, 2024
    Commit 646fc4b
  8. mlxbf_gige: call request_irq() after NAPI initialized

    The mlxbf_gige driver encounters a NULL pointer exception in
    mlxbf_gige_open() when kdump is enabled.  The sequence to reproduce
    the exception is as follows:
    a) enable kdump
    b) trigger kdump via "echo c > /proc/sysrq-trigger"
    c) kdump kernel executes
    d) kdump kernel loads mlxbf_gige module
    e) the mlxbf_gige module runs its open() as
       the "oob_net0" interface is brought up
    f) mlxbf_gige module will experience an exception
       during its open(), something like:
    
         Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
         Mem abort info:
           ESR = 0x0000000086000004
           EC = 0x21: IABT (current EL), IL = 32 bits
           SET = 0, FnV = 0
           EA = 0, S1PTW = 0
           FSC = 0x04: level 0 translation fault
         user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000
         [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
         Internal error: Oops: 0000000086000004 [#1] SMP
         CPU: 0 PID: 812 Comm: NetworkManager Tainted: G           OE     5.15.0-1035-bluefield #37-Ubuntu
         Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024
         pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
         pc : 0x0
         lr : __napi_poll+0x40/0x230
         sp : ffff800008003e00
         x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff
         x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8
         x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000
         x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000
         x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0
         x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c
         x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398
         x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2
         x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100
         x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238
         Call trace:
          0x0
          net_rx_action+0x178/0x360
          __do_softirq+0x15c/0x428
          __irq_exit_rcu+0xac/0xec
          irq_exit+0x18/0x2c
          handle_domain_irq+0x6c/0xa0
          gic_handle_irq+0xec/0x1b0
          call_on_irq_stack+0x20/0x2c
          do_interrupt_handler+0x5c/0x70
          el1_interrupt+0x30/0x50
          el1h_64_irq_handler+0x18/0x2c
          el1h_64_irq+0x7c/0x80
          __setup_irq+0x4c0/0x950
          request_threaded_irq+0xf4/0x1bc
          mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige]
          mlxbf_gige_open+0x5c/0x170 [mlxbf_gige]
          __dev_open+0x100/0x220
          __dev_change_flags+0x16c/0x1f0
          dev_change_flags+0x2c/0x70
          do_setlink+0x220/0xa40
          __rtnl_newlink+0x56c/0x8a0
          rtnl_newlink+0x58/0x84
          rtnetlink_rcv_msg+0x138/0x3c4
          netlink_rcv_skb+0x64/0x130
          rtnetlink_rcv+0x20/0x30
          netlink_unicast+0x2ec/0x360
          netlink_sendmsg+0x278/0x490
          __sock_sendmsg+0x5c/0x6c
          ____sys_sendmsg+0x290/0x2d4
          ___sys_sendmsg+0x84/0xd0
          __sys_sendmsg+0x70/0xd0
          __arm64_sys_sendmsg+0x2c/0x40
          invoke_syscall+0x78/0x100
          el0_svc_common.constprop.0+0x54/0x184
          do_el0_svc+0x30/0xac
          el0_svc+0x48/0x160
          el0t_64_sync_handler+0xa4/0x12c
          el0t_64_sync+0x1a4/0x1a8
         Code: bad PC value
         ---[ end trace 7d1c3f3bf9d81885 ]---
         Kernel panic - not syncing: Oops: Fatal exception in interrupt
         Kernel Offset: 0x2870a7a00000 from 0xffff800008000000
         PHYS_OFFSET: 0x80000000
         CPU features: 0x0,000005c1,a3332a5a
         Memory Limit: none
         ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
    
    The exception happens because there is a pending RX interrupt before the
    call to request_irq(RX IRQ) executes.  Then, the RX IRQ handler fires
    immediately after this request_irq() completes. The RX IRQ handler runs
    "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()"
    and "napi_enable()", both of which happen later in the open() logic.
    
    The logic in mlxbf_gige_open() must fully initialize NAPI before any calls
    to request_irq() execute.
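
The ordering constraint can be modelled in userspace. This sketch uses hypothetical demo_* stand-ins for the driver and NAPI calls and assumes that registering a handler immediately delivers an interrupt that was already pending:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_dev {
	bool irq_pending;                /* interrupt latched before open() */
	void (*poll)(struct demo_dev *); /* installed by demo_napi_add() */
	bool napi_enabled;
	int polls;                       /* successful poll invocations */
	int bad_schedules;               /* schedules on uninitialized NAPI */
};

static void demo_poll(struct demo_dev *dev)
{
	dev->polls++;
}

static void demo_napi_add(struct demo_dev *dev)
{
	dev->poll = demo_poll;
}

static void demo_napi_enable(struct demo_dev *dev)
{
	dev->napi_enabled = true;
}

static void demo_rx_irq_handler(struct demo_dev *dev)
{
	if (!dev->poll || !dev->napi_enabled) {
		/* The real driver would end up calling a NULL poll pointer. */
		dev->bad_schedules++;
		return;
	}
	dev->poll(dev);
}

/* Registering the handler delivers any interrupt that is already pending. */
static void demo_request_irq(struct demo_dev *dev)
{
	if (dev->irq_pending) {
		dev->irq_pending = false;
		demo_rx_irq_handler(dev);
	}
}

/* Corrected ordering: NAPI is fully set up before the IRQ is requested. */
static void demo_open(struct demo_dev *dev)
{
	demo_napi_add(dev);
	demo_napi_enable(dev);
	demo_request_irq(dev);
}
```

With a pending interrupt, demo_open() results in one clean poll, whereas calling demo_request_irq() on a device that has not been set up records a bad schedule, which corresponds to the "pc : 0x0" oops above.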
    
    Fixes: f92e186 ("Add Mellanox BlueField Gigabit Ethernet driver")
    Signed-off-by: David Thompson <[email protected]>
    Reviewed-by: Asmaa Mnebhi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    dthompso authored and kuba-moo committed Mar 27, 2024
    Commit f7442a6
  9. Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/tnguy/net-queue
    
    Tony Nguyen says:
    
    ====================
    Intel Wired LAN Driver Updates 2024-03-25 (ice, ixgbe, igc)
    
    This series contains updates to ice, ixgbe, and igc drivers.
    
    Steven fixes incorrect casting of bitmap type for ice driver.
    
    Jesse fixes memory corruption issue with suspend flow on ice.
    
    Przemek adds GFP_ATOMIC flag to avoid sleeping in IRQ context for ixgbe.
    
    Kurt Kanzenbach removes no longer valid comment on igc.
    
    * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
      igc: Remove stale comment about Tx timestamping
      ixgbe: avoid sleeping allocation in ixgbe_ipsec_vf_add_sa()
      ice: fix memory corruption bug with suspend and rebuild
      ice: Refactor FW data type and fix bitmap casting issue
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 27, 2024
    Commit c4d2d23
  10. net: wan: framer: Add missing static inline qualifiers

    Compilation with CONFIG_GENERIC_FRAMER disabled leads to the following
    warnings:
      framer.h:184:16: warning: no previous prototype for function 'framer_get' [-Wmissing-prototypes]
      184 | struct framer *framer_get(struct device *dev, const char *con_id)
      framer.h:184:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
      184 | struct framer *framer_get(struct device *dev, const char *con_id)
      framer.h:189:6: warning: no previous prototype for function 'framer_put' [-Wmissing-prototypes]
      189 | void framer_put(struct device *dev, struct framer *framer)
      framer.h:189:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
      189 | void framer_put(struct device *dev, struct framer *framer)
    
    Add missing 'static inline' qualifiers for these functions.
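
For illustration, a header that offers no-op stubs when a feature is compiled out typically looks like the sketch below. The demo_* names and the DEMO_CONFIG_FRAMER switch are hypothetical, and the stub return value is an assumption rather than what framer.h returns:

```c
#include <assert.h>
#include <stddef.h>

struct demo_framer;

#ifdef DEMO_CONFIG_FRAMER
/* Feature enabled: real implementations live in a .c file. */
struct demo_framer *demo_framer_get(const char *con_id);
void demo_framer_put(struct demo_framer *framer);
#else
/*
 * Feature disabled: stubs are defined in the header itself, so they must
 * be 'static inline'. Without it, every includer would emit a global
 * definition with no prior prototype (-Wmissing-prototypes) and risk
 * duplicate symbols at link time.
 */
static inline struct demo_framer *demo_framer_get(const char *con_id)
{
	(void)con_id;
	return NULL;
}

static inline void demo_framer_put(struct demo_framer *framer)
{
	(void)framer;
}
#endif
```

Built without DEMO_CONFIG_FRAMER, callers compile unchanged and simply receive NULL from the stub.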
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Fixes: 82c944d ("net: wan: Add framer framework support")
    Cc: [email protected]
    Signed-off-by: Herve Codina <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    hcodina authored and davem330 committed Mar 27, 2024
    ea2c092
  11. selftests: netdevsim: set test timeout to 10 minutes

    The longest running netdevsim test, nexthop.sh, currently takes
    5 min to finish. Around 260s to be exact, and 310s on a debug kernel.
    The default timeout in selftest is 45sec, so we need an explicit
    config. Give ourselves some headroom and use 10min.
    
    Commit under Fixes isn't really to "blame" but prior to that
    netdevsim tests weren't integrated with kselftest infra
    so blaming the tests themselves doesn't seem right, either.
    
    Fixes: 8ff25da ("netdevsim: add Makefile for selftests")
    Signed-off-by: Jakub Kicinski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    kuba-moo authored and davem330 committed Mar 27, 2024
    afbf75e
  12. Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel…

    …/git/bpf/bpf-next
    
    Daniel Borkmann says:
    
    ====================
    pull-request: bpf-next 2024-03-25
    
    We've added 38 non-merge commits during the last 13 day(s) which contain
    a total of 50 files changed, 867 insertions(+), 274 deletions(-).
    
    The main changes are:
    
    1) Add the ability to specify and retrieve BPF cookie also for raw
       tracepoint programs in order to ease migration from classic to raw
       tracepoints, from Andrii Nakryiko.
    
    2) Allow the use of bpf_get_{ns_,}current_pid_tgid() helper for all
       program types and add additional BPF selftests, from Yonghong Song.
    
    3) Several improvements to bpftool and its build, for example, enabling
       libbpf logs when loading pid_iter in debug mode, from Quentin Monnet.
    
    4) Check the return code of all BPF-related set_memory_*() functions during
       load and bail out in case they fail, from Christophe Leroy.
    
    5) Avoid a goto in regs_refine_cond_op() such that the verifier can
       be better integrated into Agni tool which doesn't support backedges
       yet, from Harishankar Vishwanathan.
    
    6) Add a small BPF trie perf improvement by always inlining
       longest_prefix_match, from Jesper Dangaard Brouer.
    
    7) Small BPF selftest refactor in bpf_tcp_ca.c to utilize start_server()
       helper instead of open-coding it, from Geliang Tang.
    
    8) Improve test_tc_tunnel.sh BPF selftest to prevent client connect
       before the server bind, from Alessandro Carminati.
    
    9) Fix BPF selftest benchmark for older glibc and use syscall(SYS_gettid)
       instead of gettid(), from Alan Maguire.
    
    10) Implement a backward-compatible method for struct_ops types with
        additional fields which are not present in older kernels,
        from Kui-Feng Lee.
    
    11) Add a small helper to check if an instruction is addr_space_cast
        from as(0) to as(1) and utilize it in x86-64 JIT, from Puranjay Mohan.
    
    12) Small cleanup to remove unnecessary error check in
        bpf_struct_ops_map_update_elem, from Martin KaFai Lau.
    
    13) Improvements to libbpf fd validity checks for BPF map/programs,
        from Mykyta Yatsenko.
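    
    For item 9, the gettid() wrapper only exists in glibc 2.30 and later,
    so calling the raw system call keeps the benchmark building on older
    C libraries. A minimal sketch of the pattern follows; sketch_gettid()
    is a hypothetical name used only for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the caller's kernel thread ID without relying on the glibc
 * gettid() wrapper, which older glibc versions do not provide.
 */
pid_t sketch_gettid(void)
{
	return (pid_t)syscall(SYS_gettid);
}
```

    In the main thread of a process, the value returned is the same as
    the one returned by getpid().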
    
    * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (38 commits)
      selftests/bpf: Fix flaky test btf_map_in_map/lookup_update
      bpf: implement insn_is_cast_user() helper for JITs
      bpf: Avoid get_kernel_nofault() to fetch kprobe entry IP
      selftests/bpf: Use start_server in bpf_tcp_ca
      bpf: Sync uapi bpf.h to tools directory
      libbpf: Add new sec_def "sk_skb/verdict"
      selftests/bpf: Mark uprobe trigger functions with nocf_check attribute
      selftests/bpf: Use syscall(SYS_gettid) instead of gettid() wrapper in bench
      bpf-next: Avoid goto in regs_refine_cond_op()
      bpftool: Clean up HOST_CFLAGS, HOST_LDFLAGS for bootstrap bpftool
      selftests/bpf: scale benchmark counting by using per-CPU counters
      bpftool: Remove unnecessary source files from bootstrap version
      bpftool: Enable libbpf logs when loading pid_iter in debug mode
      selftests/bpf: add raw_tp/tp_btf BPF cookie subtests
      libbpf: add support for BPF cookie for raw_tp/tp_btf programs
      bpf: support BPF cookie in raw tracepoint (raw_tp, tp_btf) programs
      bpf: pass whole link instead of prog when triggering raw tracepoint
      bpf: flatten bpf_probe_register call chain
      selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh
      selftests/bpf: Add a sk_msg prog bpf_get_ns_current_pid_tgid() test
      ...
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 27, 2024
    2a702c2
  13. bpf: fix warning for crash_kexec

    With [1], crash dump specific code is moved out of CONFIG_KEXEC_CORE
    and placed under CONFIG_CRASH_DUMP, where it is more appropriate.
    Since that change also makes the CONFIG_KEXEC & !CONFIG_CRASH_DUMP
    build option supported, such builds now emit the warning below:
    
      "WARN: resolve_btfids: unresolved symbol crash_kexec"
    
    Fix it by using the appropriate #ifdef.
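    
    The general shape of this kind of fix can be sketched as follows: a
    table entry that refers to a symbol must sit under the same guard as
    the symbol's definition, otherwise the entry names something that was
    never built. SKETCH_CRASH_DUMP and all sketch_* names are hypothetical
    and stand in for the real config option and kernel tables.

```c
#include <stddef.h>
#include <string.h>

/* Uncomment to model a build with crash dump support enabled. */
/* #define SKETCH_CRASH_DUMP 1 */

struct sketch_entry {
	const char *name;
	void (*fn)(void);
};

void sketch_always_built(void) { }

#ifdef SKETCH_CRASH_DUMP
/* Only defined when crash dump support is built. */
void sketch_crash_kexec(void) { }
#endif

const struct sketch_entry sketch_table[] = {
	{ "always_built", sketch_always_built },
#ifdef SKETCH_CRASH_DUMP
	/* Same guard as the definition above, so the entry never dangles. */
	{ "crash_kexec", sketch_crash_kexec },
#endif
};

/* Return 1 if 'name' is present in the table, 0 otherwise. */
int sketch_is_registered(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++)
		if (strcmp(sketch_table[i].name, name) == 0)
			return 1;
	return 0;
}
```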
    
    [1] https://lore.kernel.org/all/[email protected]/
    
    Acked-by: Baoquan He <[email protected]>
    Fixes: 02aff84 ("crash: split crash dumping code out from kexec_core.c")
    Acked-by: Jiri Olsa <[email protected]>
    Acked-by: Stanislav Fomichev <[email protected]>
    Signed-off-by: Hari Bathini <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    hbathini authored and Alexei Starovoitov committed Mar 27, 2024
    96b98a6
  14. Fix memory leak in posix_clock_open()

    If the clk ops.open() function returns an error, we don't release the
    pccontext we allocated for this clock.
    
    Re-organize the code slightly to make it all more obvious.
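    
    A userspace sketch of the pattern, assuming simplified stand-in types:
    the context is allocated first, and every error return after that
    point has to release it. The struct and function names below are
    hypothetical and do not mirror the kernel code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_context { void *private_data; };
struct sketch_file { struct sketch_context *private_data; };

typedef int (*sketch_open_fn)(struct sketch_context *ctx);

int sketch_failing_open(struct sketch_context *ctx) { (void)ctx; return -EIO; }
int sketch_working_open(struct sketch_context *ctx) { (void)ctx; return 0; }

int sketch_clock_open(struct sketch_file *fp, sketch_open_fn do_open)
{
	struct sketch_context *ctx;
	int err;

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;

	if (do_open) {
		err = do_open(ctx);
		if (err) {
			/* The release that an early return would skip. */
			free(ctx);
			return err;
		}
	}

	/* Ownership moves to the file only on success. */
	fp->private_data = ctx;
	return 0;
}
```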
    
    Reported-by: Rohit Keshri <[email protected]>
    Acked-by: Oleg Nesterov <[email protected]>
    Fixes: 60c6946 ("posix-clock: introduce posix_clock_context concept")
    Cc: Jakub Kicinski <[email protected]>
    Cc: David S. Miller <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    torvalds committed Mar 27, 2024
    5b4cdd9
  15. Fix build errors due to new UIO_MEM_DMA_COHERENT mess

    Commit 576882e ("uio: introduce UIO_MEM_DMA_COHERENT type")
    introduced a new use-case for 'struct uio_mem' where the 'mem' field now
    contains a kernel virtual address when 'memtype' is set to
    UIO_MEM_DMA_COHERENT.
    
    That in turn causes build errors, because 'mem' is of type
    'phys_addr_t', and a virtual address is a pointer type.  When the code
    just blindly uses casts to mix the two, it causes problems when
    phys_addr_t isn't the same size as a pointer - notably on 32-bit
    architectures with PHYS_ADDR_T_64BIT.
    
    The proper thing to do would probably be to use a union member, and not
    have any casts, and make the 'mem' member be a union of 'mem.physaddr'
    and 'mem.vaddr', based on 'memtype'.
    
    This is not that proper thing.  This is just fixing the ugly casts to be
    even uglier, but at least not cause build errors on 32-bit platforms
    with 64-bit physical addresses.
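    
    The size mismatch can be illustrated with a small sketch. It assumes a
    64-bit phys_addr_t regardless of pointer width and routes every
    conversion through uintptr_t, so the compiler never sees a direct
    cast between a pointer and an integer of a different size. The
    sketch_region type and its helpers are hypothetical.

```c
#include <stdint.h>

/* Assumption: models PHYS_ADDR_T_64BIT, where phys_addr_t is 64 bits wide. */
typedef uint64_t phys_addr_t;

struct sketch_region {
	phys_addr_t addr;	/* physical address, or a virtual address */
};

/* Store a virtual address: pointer -> uintptr_t -> phys_addr_t. */
void sketch_region_set_vaddr(struct sketch_region *r, void *vaddr)
{
	r->addr = (phys_addr_t)(uintptr_t)vaddr;
}

/* Recover it: phys_addr_t -> uintptr_t -> pointer. */
void *sketch_region_get_vaddr(const struct sketch_region *r)
{
	return (void *)(uintptr_t)r->addr;
}
```

    A union of a physical and a virtual address member, as the message
    suggests, would avoid the casts entirely.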
    
    Reported-by: Guenter Roeck <[email protected]>
    Fixes: 576882e ("uio: introduce UIO_MEM_DMA_COHERENT type")
    Fixes: 7722151 ("uio_pruss: UIO_MEM_DMA_COHERENT conversion")
    Fixes: 0199478 ("uio_dmem_genirq: UIO_MEM_DMA_COHERENT conversion")
    Cc: Greg Kroah-Hartman <[email protected]>
    Cc: Chris Leech <[email protected]>
    Cc: Nilesh Javali <[email protected]>
    Cc: Christoph Hellwig <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    torvalds committed Mar 27, 2024
    498e47c
  16. bpf: Check bloom filter map value size

    This patch adds a missing check to bloom filter creation, rejecting
    values above KMALLOC_MAX_SIZE. This brings the bloom map in line with
    many other map types.
    
    The lack of this protection can cause kernel crashes for value sizes
    that overflow an int. Such a crash was caught by syzkaller. The next
    patch adds more guard-rails at a lower level.
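    
    A minimal sketch of such a creation-time check is shown below. The
    limit is a placeholder (the real KMALLOC_MAX_SIZE depends on the
    kernel configuration), and the function name and error codes are
    assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder limit; the real KMALLOC_MAX_SIZE is configuration dependent. */
#define SKETCH_KMALLOC_MAX_SIZE (4u * 1024u * 1024u)

/*
 * Validate a user-supplied value size before any allocation happens.
 * The size arrives as an unsigned 32-bit value, so anything above
 * INT_MAX would turn negative if later stored in a signed int.
 */
int sketch_bloom_check_value_size(uint32_t value_size)
{
	if (value_size == 0)
		return -EINVAL;
	if (value_size > SKETCH_KMALLOC_MAX_SIZE)
		return -E2BIG;
	return 0;
}
```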
    
    Signed-off-by: Andrei Matei <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    andreimatei authored and Alexei Starovoitov committed Mar 27, 2024
    a8d89fe
  17. bpf: Protect against int overflow for stack access size

    This patch re-introduces protection against the size of access to stack
    memory being negative; the access size can appear negative as a result
    of overflowing its signed int representation. This should not actually
    happen, as there are other protections along the way, but we should
    protect against it anyway. One code path was missing such protections
    (fixed in the previous patch in the series), causing out-of-bounds array
    accesses in check_stack_range_initialized(). This patch causes the
    verification of a program with such a nonsensical access size to fail.
    
    This check used to exist in a more indirect way, but was inadvertently
    removed in a833a17.
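    
    The kind of guard described here can be sketched as below. A negative
    size is rejected first with -EFAULT, matching the series notes that
    this error is meant to indicate a bug. The remaining range checks,
    the function name, and the -EACCES codes are illustrative assumptions;
    512 matches the kernel's MAX_BPF_STACK.

```c
#include <errno.h>

#define SKETCH_MAX_STACK 512	/* same value as the kernel's MAX_BPF_STACK */

/*
 * 'off' is a negative offset from the frame pointer and 'access_size'
 * is the number of bytes accessed starting at that offset.
 */
int sketch_check_stack_access(int off, int access_size)
{
	/* A negative size can only come from an earlier int overflow. */
	if (access_size < 0)
		return -EFAULT;

	/* The start of the access must lie inside the stack frame. */
	if (off >= 0 || off < -SKETCH_MAX_STACK)
		return -EACCES;

	/* The access must not run past the frame pointer. */
	if (access_size > -off)
		return -EACCES;

	return 0;
}
```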
    
    Fixes: a833a17 ("bpf: Fix verification of indirect var-off stack access")
    Reported-by: [email protected]
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/bpf/CAADnVQLORV5PT0iTAhRER+iLBTkByCYNBYyvBSgjN1T31K+gOw@mail.gmail.com/
    Acked-by: Andrii Nakryiko <[email protected]>
    Signed-off-by: Andrei Matei <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    andreimatei authored and Alexei Starovoitov committed Mar 27, 2024
    ecc6a21
  18. Merge branch 'check-bloom-filter-map-value-size'

    Andrei Matei says:
    
    ====================
    Check bloom filter map value size
    
    v1->v2:
    - prepend a patch addressing the bloom map specifically
    - change low-level rejection error to EFAULT, to indicate a bug
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Alexei Starovoitov committed Mar 27, 2024
    a4e02d6
  19. Merge tag 'execve-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/ker…

    …nel/git/kees/linux
    
    Pull execve fixes from Kees Cook:
    
     - Fix selftests to conform to the TAP output format (Muhammad Usama
       Anjum)
    
     - Fix NOMMU linux_binprm::exec pointer in auxv (Max Filippov)
    
     - Replace deprecated strncpy usage (Justin Stitt)
    
     - Replace another /bin/sh instance in selftests
    
    * tag 'execve-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
      binfmt: replace deprecated strncpy
      exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack()
      selftests/exec: Convert remaining /bin/sh to /bin/bash
      selftests/exec: execveat: Improve debug reporting
      selftests/exec: recursion-depth: conform test to TAP format output
      selftests/exec: load_address: conform test to TAP format output
      selftests/exec: binfmt_script: Add the overall result line according to TAP
    torvalds committed Mar 27, 2024
    f4a4329
  20. Merge tag 'probes-fixes-v6.9-rc1' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/trace/linux-trace
    
    Pull probes fixlet from Masami Hiramatsu:
    
     - tracing/probes: initialize a 'val' local variable with zero.
    
       This variable is read by FETCH_OP_ST_EDATA in a loop, and is
       initialized by FETCH_OP_ARG in the same loop. Since this
       initialization is not obvious, smatch warns about it.
    
       Explicitly initializing 'val' with zero fixes this warning.
    
    * tag 'probes-fixes-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
      tracing: probes: Fix to zero initialize a local variable
    torvalds committed Mar 27, 2024
    9624905
  21. bpf: update BPF LSM designated reviewer list

    Adding myself in place of both Brendan and Florent as both have since
    moved on from working on the BPF LSM and will no longer be devoting
    their time to maintaining the BPF LSM.
    
    Signed-off-by: Matt Bobrowski <[email protected]>
    Acked-by: KP Singh <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    mattbobrowski authored and Alexei Starovoitov committed Mar 27, 2024
    4dd6510
  22. Merge tag 'mm-hotfixes-stable-2024-03-27-11-25' of git://git.kernel.o…

    …rg/pub/scm/linux/kernel/git/akpm/mm
    
    Pull misc fixes from Andrew Morton:
     "Various hotfixes. About half are cc:stable and the remainder address
      post-6.8 issues or aren't considered suitable for backporting.
    
      zswap figures prominently in the post-6.8 issues - followup against the
      large amount of changes we have just made to that code.
    
      Apart from that, all over the map"
    
    * tag 'mm-hotfixes-stable-2024-03-27-11-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
      crash: use macro to add crashk_res into iomem early for specific arch
      mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
      selftests/mm: fix ARM related issue with fork after pthread_create
      hexagon: vmlinux.lds.S: handle attributes section
      userfaultfd: fix deadlock warning when locking src and dst VMAs
      tmpfs: fix race on handling dquot rbtree
      selftests/mm: sigbus-wp test requires UFFD_FEATURE_WP_HUGETLBFS_SHMEM
      mm: zswap: fix writeback shinker GFP_NOIO/GFP_NOFS recursion
      ARM: prctl: reject PR_SET_MDWE on pre-ARMv6
      prctl: generalize PR_SET_MDWE support check to be per-arch
      MAINTAINERS: remove incorrect M: tag for [email protected]
      mm: zswap: fix kernel BUG in sg_init_one
      selftests: mm: restore settings from only parent process
      tools/Makefile: remove cgroup target
      mm: cachestat: fix two shmem bugs
      mm: increase folio batch size
      mm,page_owner: fix recursion
      mailmap: update entry for Leonard Crestez
      init: open /initrd.image with O_LARGEFILE
      selftests/mm: Fix build with _FORTIFY_SOURCE
      ...
    torvalds committed Mar 27, 2024
    dc189b8
  23. Merge tag 'for-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/ker…

    …nel/git/kdave/linux
    
    Pull btrfs fixes from David Sterba:
    
     - fix race when reading extent buffer and 'uptodate' status is missed
       by one thread (introduced in 6.5)
    
     - do additional validation of devices using major:minor numbers
    
     - zoned mode fixes:
         - use zone-aware super block access during scrub
         - fix use-after-free during device replace (found by KASAN)
         - also delete zones that are 100% unusable to reclaim space
    
     - extent unpinning fixes:
         - fix extent map leak after error handling
         - print correct range in error message
    
     - error code and message updates
    
    * tag 'for-6.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
      btrfs: fix race in read_extent_buffer_pages()
      btrfs: return accurate error code on open failure in open_fs_devices()
      btrfs: zoned: don't skip block groups with 100% zone unusable
      btrfs: use btrfs_warn() to log message at btrfs_add_extent_mapping()
      btrfs: fix message not properly printing interval when adding extent map
      btrfs: fix warning messages not printing interval at unpin_extent_range()
      btrfs: fix extent map leak in unexpected scenario at unpin_extent_cache()
      btrfs: validate device maj:min during open
      btrfs: zoned: fix use-after-free in do_zone_finish()
      btrfs: zoned: use zone aware sb location for scrub
    torvalds committed Mar 27, 2024
    400dd45
  24. Merge tag '9p-fixes-for-6.9-rc1' of git://git.kernel.org/pub/scm/linu…

    …x/kernel/git/ericvh/v9fs
    
    Pull 9p fixes from Eric Van Hensbergen:
     "Two of these fix syzbot reported issues, and the other fixes an unused
      variable in some configurations"
    
    * tag '9p-fixes-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
      fs/9p: fix uninitialized values during inode evict
      fs/9p: remove redundant pointer v9ses
      fs/9p: fix uaf in in v9fs_stat2inode_dotl
    torvalds committed Mar 27, 2024
    4076fa1
  25. Merge tag 'wireless-2024-03-27' of git://git.kernel.org/pub/scm/linux…

    …/kernel/git/wireless/wireless
    
    Kalle Valo says:
    
    ====================
    wireless fixes for v6.9-rc2
    
    The first fixes for v6.9. Ping-Ke Shih now maintains a separate tree
    for Realtek drivers; document that in MAINTAINERS. Plenty of fixes
    for both the stack and iwlwifi. Our kunit tests were working only on
    the um architecture but that's fixed now.
    
    * tag 'wireless-2024-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (21 commits)
      MAINTAINERS: wifi: mwifiex: add Francesco as reviewer
      kunit: fix wireless test dependencies
      wifi: iwlwifi: mvm: include link ID when releasing frames
      wifi: iwlwifi: mvm: handle debugfs names more carefully
      wifi: iwlwifi: mvm: guard against invalid STA ID on removal
      wifi: iwlwifi: read txq->read_ptr under lock
      wifi: iwlwifi: fw: don't always use FW dump trig
      wifi: iwlwifi: mvm: rfi: fix potential response leaks
      wifi: mac80211: correctly set active links upon TTLM
      wifi: iwlwifi: mvm: Configure the link mapping for non-MLD FW
      wifi: iwlwifi: mvm: consider having one active link
      wifi: iwlwifi: mvm: pick the version of SESSION_PROTECTION_NOTIF
      wifi: mac80211: fix prep_connection error path
      wifi: cfg80211: fix rdev_dump_mpp() arguments order
      wifi: iwlwifi: mvm: disable MLO for the time being
      wifi: cfg80211: add a flag to disable wireless extensions
      wifi: mac80211: fix ieee80211_bss_*_flags kernel-doc
      wifi: mac80211: check/clear fast rx for non-4addr sta VLAN changes
      wifi: mac80211: fix mlme_link_id_dbg()
      MAINTAINERS: wifi: add git tree for Realtek WiFi drivers
      ...
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 27, 2024
    56d2f48

Commits on Mar 28, 2024

  1. netfilter: nf_tables: reject destroy command to remove basechain hooks

    Report EOPNOTSUPP if NFT_MSG_DESTROYCHAIN is used to delete hooks in an
    existing netdev basechain, thus, only NFT_MSG_DELCHAIN is allowed.
    
    Fixes: 7d937b1 ("netfilter: nf_tables: support for deleting devices in an existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    ummakynes committed Mar 28, 2024
    b32ca27
  2. netfilter: nf_tables: reject table flag and netdev basechain updates

    netdev basechain updates are stored in the transaction object hook list.
    When the table dormant flag is set, the code iterates over the existing
    hooks in the basechain and thus skips the hooks that are being
    added/deleted in this transaction, which leaves hook registration in an
    inconsistent state.
    
    Reject table flag updates in combination with netdev basechain updates
    in the same batch:
    
    - Update table flags and add/delete basechain: Check from basechain update
      path if there are pending flag updates for this table.
    - add/delete basechain and update table flags: Iterate over the transaction
      list to search for basechain updates from the table update path.
    
    In both cases, the batch is rejected. Based on suggestion from Florian Westphal.
    
    Fixes: b9703ed ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
    Fixes: 7d937b1 ("netfilter: nf_tables: support for deleting devices in an existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    ummakynes committed Mar 28, 2024
    1e1fb6f
  3. netfilter: nf_tables: skip netdev hook unregistration if table is dor…

    …mant
    
    Skip hook unregistration when adding or deleting devices from an
    existing netdev basechain if the table is dormant. Otherwise, the
    commit/abort path tries to unregister hooks which are not enabled.
    
    Fixes: b9703ed ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
    Fixes: 7d937b1 ("netfilter: nf_tables: support for deleting devices in an existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    ummakynes committed Mar 28, 2024
    216e7bf
  4. netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_t…

    …ables.c
    
    syzkaller started to report the warning below [0] after consuming
    commit 4654467 ("netfilter: arptables: allow xtables-nft only
    builds").
    
    The change accidentally removed the dependency on NETFILTER_FAMILY_ARP
    from IP_NF_ARPTABLES.
    
    If NF_TABLES_ARP is not enabled on Kconfig, NETFILTER_FAMILY_ARP will
    be removed and some code necessary for arptables will not be compiled.
    
      $ grep -E "(NETFILTER_FAMILY_ARP|IP_NF_ARPTABLES|NF_TABLES_ARP)" .config
      CONFIG_NETFILTER_FAMILY_ARP=y
      # CONFIG_NF_TABLES_ARP is not set
      CONFIG_IP_NF_ARPTABLES=y
    
      $ make olddefconfig
    
      $ grep -E "(NETFILTER_FAMILY_ARP|IP_NF_ARPTABLES|NF_TABLES_ARP)" .config
      # CONFIG_NF_TABLES_ARP is not set
      CONFIG_IP_NF_ARPTABLES=y
    
    So, when nf_register_net_hooks() is called for arptables, it will
    trigger the splat below.
    
    Now IP_NF_ARPTABLES is only enabled by IP_NF_ARPFILTER, so let's
    restore the dependency on NETFILTER_FAMILY_ARP in IP_NF_ARPFILTER.
    
    [0]:
    WARNING: CPU: 0 PID: 242 at net/netfilter/core.c:316 nf_hook_entry_head+0x1e1/0x2c0 net/netfilter/core.c:316
    Modules linked in:
    CPU: 0 PID: 242 Comm: syz-executor.0 Not tainted 6.8.0-12821-g537c2e91d354 #10
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    RIP: 0010:nf_hook_entry_head+0x1e1/0x2c0 net/netfilter/core.c:316
    Code: 83 fd 04 0f 87 bc 00 00 00 e8 5b 84 83 fd 4d 8d ac ec a8 0b 00 00 e8 4e 84 83 fd 4c 89 e8 5b 5d 41 5c 41 5d c3 e8 3f 84 83 fd <0f> 0b e8 38 84 83 fd 45 31 ed 5b 5d 4c 89 e8 41 5c 41 5d c3 e8 26
    RSP: 0018:ffffc90000b8f6e8 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffffffff83c42164
    RDX: ffff888106851180 RSI: ffffffff83c42321 RDI: 0000000000000005
    RBP: 0000000000000000 R08: 0000000000000005 R09: 000000000000000a
    R10: 0000000000000003 R11: ffff8881055c2f00 R12: ffff888112b78000
    R13: 0000000000000000 R14: ffff8881055c2f00 R15: ffff8881055c2f00
    FS:  00007f377bd78800(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000496068 CR3: 000000011298b003 CR4: 0000000000770ef0
    PKRU: 55555554
    Call Trace:
     <TASK>
     __nf_register_net_hook+0xcd/0x7a0 net/netfilter/core.c:428
     nf_register_net_hook+0x116/0x170 net/netfilter/core.c:578
     nf_register_net_hooks+0x5d/0xc0 net/netfilter/core.c:594
     arpt_register_table+0x250/0x420 net/ipv4/netfilter/arp_tables.c:1553
     arptable_filter_table_init+0x41/0x60 net/ipv4/netfilter/arptable_filter.c:39
     xt_find_table_lock+0x2e9/0x4b0 net/netfilter/x_tables.c:1260
     xt_request_find_table_lock+0x2b/0xe0 net/netfilter/x_tables.c:1285
     get_info+0x169/0x5c0 net/ipv4/netfilter/arp_tables.c:808
     do_arpt_get_ctl+0x3f9/0x830 net/ipv4/netfilter/arp_tables.c:1444
     nf_getsockopt+0x76/0xd0 net/netfilter/nf_sockopt.c:116
     ip_getsockopt+0x17d/0x1c0 net/ipv4/ip_sockglue.c:1777
     tcp_getsockopt+0x99/0x100 net/ipv4/tcp.c:4373
     do_sock_getsockopt+0x279/0x360 net/socket.c:2373
     __sys_getsockopt+0x115/0x1e0 net/socket.c:2402
     __do_sys_getsockopt net/socket.c:2412 [inline]
     __se_sys_getsockopt net/socket.c:2409 [inline]
     __x64_sys_getsockopt+0xbd/0x150 net/socket.c:2409
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x46/0x4e
    RIP: 0033:0x7f377beca6fe
    Code: 1f 44 00 00 48 8b 15 01 97 0a 00 f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 f3 0f 1e fa 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 0a c3 66 0f 1f 84 00 00 00 00 00 48 8b 15 c9
    RSP: 002b:00000000005df728 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
    RAX: ffffffffffffffda RBX: 00000000004966e0 RCX: 00007f377beca6fe
    RDX: 0000000000000060 RSI: 0000000000000000 RDI: 0000000000000003
    RBP: 000000000042938a R08: 00000000005df73c R09: 00000000005df800
    R10: 00000000004966e8 R11: 0000000000000246 R12: 0000000000000003
    R13: 0000000000496068 R14: 0000000000000003 R15: 00000000004bc9d8
     </TASK>
    
    Fixes: 4654467 ("netfilter: arptables: allow xtables-nft only builds")
    Reported-by: syzkaller <[email protected]>
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    q2ven authored and ummakynes committed Mar 28, 2024
    15fba56
  5. Merge tag 'erofs-for-6.9-rc2-fixes' of git://git.kernel.org/pub/scm/l…

    …inux/kernel/git/xiang/erofs
    
    Pull erofs fixes from Gao Xiang:
    
     - Add a new reviewer Sandeep Dhavale to build a healthier community
    
     - Drop experimental warning for FSDAX
    
    * tag 'erofs-for-6.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
      MAINTAINERS: erofs: add myself as reviewer
      erofs: drop experimental warning for FSDAX
    torvalds committed Mar 28, 2024
    8d025e2
  6. Merge tag 'for-net' of https://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/bpf/bpf
    
    Alexei Starovoitov says:
    
    ====================
    pull-request: bpf 2024-03-27
    
    The following pull-request contains BPF updates for your *net* tree.
    
    We've added 4 non-merge commits during the last 1 day(s) which contain
    a total of 5 files changed, 26 insertions(+), 3 deletions(-).
    
    The main changes are:
    
    1) Fix bloom filter value size validation and protect the verifier
       against such mistakes, from Andrei.
    
    2) Fix build due to CONFIG_KEXEC_CORE/CRASH_DUMP split, from Hari.
    
    3) Update bpf_lsm maintainers entry, from Matt.
    
    * tag 'for-net' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
      bpf: update BPF LSM designated reviewer list
      bpf: Protect against int overflow for stack access size
      bpf: Check bloom filter map value size
      bpf: fix warning for crash_kexec
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed Mar 28, 2024
    7e6f4b2
  7. Merge tag 'nf-24-03-28' of git://git.kernel.org/pub/scm/linux/kernel/…

    …git/netfilter/nf
    
    Pablo Neira Ayuso says:
    
    ====================
    Netfilter fixes for net
    
    The following patchset contains Netfilter fixes for net:
    
    Patch #1 rejects the destroy chain command to delete device hooks in the
             netdev family, hence, only delchain commands are allowed.
    
    Patch #2 rejects table flag updates that interfere with netdev basechain
             hook updates, as this can leave hooks in an inconsistent
             registration/unregistration state.
    
    Patch #3 does not unregister netdev basechain hooks if the table is
             dormant. Otherwise, a splat with double unregistration is possible.
    
    Patch #4 fixes Kconfig to allow restoring IP_NF_ARPTABLES,
             from Kuniyuki Iwashima.
    
    There are more fixes still in progress on my side that need more work.
    
    * tag 'nf-24-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
      netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c
      netfilter: nf_tables: skip netdev hook unregistration if table is dormant
      netfilter: nf_tables: reject table flag and netdev basechain updates
      netfilter: nf_tables: reject destroy command to remove basechain hooks
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed Mar 28, 2024
    005e528
  8. net: phy: qcom: at803x: fix kernel panic with at8031_probe

    The rework that split the at803x driver introduced a NULL dereference
    bug in the at8031 probe function: priv is read before it is actually
    allocated, and the is_1000basex and is_fiber fields are then written
    through it, which writes to the wrong address.
    
    Fix this by correctly setting priv local variable only after
    at803x_probe is called and actually allocates priv in the phydev struct.
    
    Reported-by: William Wortel <[email protected]>
    Cc: <[email protected]>
    Fixes: 25d2ba9 ("net: phy: at803x: move specific at8031 probe mode check to dedicated probe")
    Signed-off-by: Christian Marangi <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Ansuel authored and Paolo Abeni committed Mar 28, 2024
    6a4aee2
  9. net: bcmasp: Bring up unimac after PHY link up

    The unimac requires the PHY RX clk during reset or it may be put
    into a bad state. Bring up the unimac after link up to ensure the
    PHY RX clk exists.
    
    Fixes: 490cb41 ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
    Signed-off-by: Justin Chen <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Ryceancurry authored and Paolo Abeni committed Mar 28, 2024
    dfd222e
  10. net: bcmasp: Remove phy_{suspend/resume}

    phy_{suspend/resume} is redundant. It gets called from phy_{stop/start}.
    
    Fixes: 490cb41 ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
    Signed-off-by: Justin Chen <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Ryceancurry authored and Paolo Abeni committed Mar 28, 2024
    4494c10
  11. Merge branch 'net-bcmasp-phy-managements-fixes'

    Justin Chen says:
    
    ====================
    net: bcmasp: phy managements fixes
    
    Fix two issues.
    
    - The unimac may be put in a bad state if PHY RX clk doesn't exist
      during reset. Work around this by bringing the unimac out of reset
      during phy up.
    
    - Remove redundant phy_{suspend/resume}
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed Mar 28, 2024
    eb67cdb
  12. net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips

    PCI11x1x Rev B0 devices might drop packets when receiving back to back frames
    at 2.5G link speed. Change the B0 Rev device's Receive filtering Engine FIFO
    threshold parameter from its hardware default of 4 to 3 dwords to prevent the
    problem. Rev C0 and later hardware already defaults to 3 dwords.
    
    Fixes: bb4f6bf ("net: lan743x: Add PCI11010 / PCI11414 device IDs")
    Signed-off-by: Raju Lakkaraju <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    lakkarajun authored and Paolo Abeni committed Mar 28, 2024
    e4a5898
  13. Octeontx2-af: fix pause frame configuration in GMP mode

    The Octeontx2 MAC block (CGX) has separate data paths (SMU and GMP) for
    different speeds, allowing for efficient data transfer.
    
    The previous patch, which added pause frame configuration, has a bug
    that prevents the pause frame feature from working in GMP mode.
    
    This patch fixes the issue by configuring the appropriate registers.
    
    Fixes: f7e086e ("octeontx2-af: Pause frame configuration at cgx")
    Signed-off-by: Hariprasad Kelam <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Hariprasad Kelam authored and Paolo Abeni committed Mar 28, 2024
    40d4b48
  14. inet: inet_defrag: prevent sk release while still in use

    ip_local_out() and other functions can pass skb->sk as function argument.
    
    If the skb is a fragment and reassembly happens before such function call
    returns, the sk must not be released.
    
    This affects skb fragments reassembled via netfilter or similar
    modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.
    
    Eric Dumazet made an initial analysis of this bug.  Quoting Eric:
      Calling ip_defrag() in output path is also implying skb_orphan(),
      which is buggy because output path relies on sk not disappearing.
    
      A relevant old patch about the issue was :
      8282f27 ("inet: frag: Always orphan skbs inside ip_defrag()")
    
      [..]
    
      net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
      inet socket, not an arbitrary one.
    
      If we orphan the packet in ipvlan, then downstream things like FQ
      packet scheduler will not work properly.
    
      We need to change ip_defrag() to only use skb_orphan() when really
      needed, ie whenever frag_list is going to be used.
    
    Eric suggested to stash sk in fragment queue and made an initial patch.
    However there is a problem with this:
    
    If skb is refragmented again right after, ip_do_fragment() will copy
    head->sk to the new fragments, and sets up destructor to sock_wfree.
    IOW, we have no choice but to fix up sk_wmem accounting to reflect the
    fully reassembled skb, else wmem will underflow.
    
    This change moves the orphan down into the core, to last possible moment.
    As ip_defrag_offset is aliased with sk_buff->sk member, we must move the
    offset into the FRAG_CB, else skb->sk gets clobbered.
    
    This allows delaying the orphaning long enough to learn if the skb has
    to be queued or if the skb is completing the reasm queue.
    
    In the former case, things work as before, skb is orphaned.  This is
    safe because skb gets queued/stolen and won't continue past reasm engine.
    
    In the latter case, we will steal the skb->sk reference, reattach it to
    the head skb, and fix up wmem accounting when inet_frag inflates truesize.
    
    Fixes: 7026b1d ("netfilter: Pass socket pointer down through okfn().")
    Diagnosed-by: Eric Dumazet <[email protected]>
    Reported-by: xingwei lee <[email protected]>
    Reported-by: yue sun <[email protected]>
    Reported-by: [email protected]
    Signed-off-by: Florian Westphal <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Florian Westphal authored and Paolo Abeni committed Mar 28, 2024
    1868545
  15. Merge tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/…

    …git/netdev/net
    
    Pull networking fixes from Paolo Abeni:
     "Including fixes from bpf, WiFi and netfilter.
    
      Current release - regressions:
    
       - ipv6: fix address dump when IPv6 is disabled on an interface
    
      Current release - new code bugs:
    
       - bpf: temporarily disable atomic operations in BPF arena
    
       - nexthop: fix uninitialized variable in nla_put_nh_group_stats()
    
      Previous releases - regressions:
    
       - bpf: protect against int overflow for stack access size
    
       - hsr: fix the promiscuous mode in offload mode
    
       - wifi: don't always use FW dump trig
    
       - tls: adjust recv return with async crypto and failed copy to
         userspace
    
       - tcp: properly terminate timers for kernel sockets
    
       - ice: fix memory corruption bug with suspend and rebuild
    
       - at803x: fix kernel panic with at8031_probe
    
       - qeth: handle deferred cc1
    
      Previous releases - always broken:
    
       - bpf: fix bug in BPF_LDX_MEMSX
    
       - netfilter: reject table flag and netdev basechain updates
    
       - inet_defrag: prevent sk release while still in use
    
       - wifi: pick the version of SESSION_PROTECTION_NOTIF
    
       - wwan: t7xx: split 64bit accesses to fix alignment issues
    
       - mlxbf_gige: call request_irq() after NAPI initialized
    
       - hns3: fix kernel crash when devlink reload during pf
         initialization"
    
    * tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
      inet: inet_defrag: prevent sk release while still in use
      Octeontx2-af: fix pause frame configuration in GMP mode
      net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips
      net: bcmasp: Remove phy_{suspend/resume}
      net: bcmasp: Bring up unimac after PHY link up
      net: phy: qcom: at803x: fix kernel panic with at8031_probe
      netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c
      netfilter: nf_tables: skip netdev hook unregistration if table is dormant
      netfilter: nf_tables: reject table flag and netdev basechain updates
      netfilter: nf_tables: reject destroy command to remove basechain hooks
      bpf: update BPF LSM designated reviewer list
      bpf: Protect against int overflow for stack access size
      bpf: Check bloom filter map value size
      bpf: fix warning for crash_kexec
      selftests: netdevsim: set test timeout to 10 minutes
      net: wan: framer: Add missing static inline qualifiers
      mlxbf_gige: call request_irq() after NAPI initialized
      tls: get psock ref after taking rxlock to avoid leak
      selftests: tls: add test with a partially invalid iov
      tls: adjust recv return with async crypto and failed copy to userspace
      ...
    torvalds committed Mar 28, 2024
    50108c3

Commits on Mar 29, 2024

  1. Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

    Cross-merge networking fixes after downstream PR.
    
    No conflicts, or adjacent changes.
    
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 29, 2024
    5e47fbe
  2. selftests: net: libs: Change variable fallback syntax

    The current syntax of X=${X:=Y} first evaluates the ${X:=Y} expression,
    which either uses the existing value of $X if there is one, or uses the
    value of "Y" as a fallback, and assigns it to X. The expression is then
    replaced with the now-current value of $X. Assigning that value to X once
    more is meaningless.
    
    So avoid the outer X=... bit, and instead express the same idea through the
    do-nothing ":" built-in as : "${X:=Y}". This also cleans up the block
    nicely and makes it more readable.
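    A minimal sketch of the two forms side by side. The variable names and the
    value "fallback" are made up for illustration and are not taken from the
    selftest libraries:

```shell
#!/bin/bash
# Hypothetical variable names, for illustration only.

# Old form: the inner expansion already assigns the default to X when X is
# unset or empty, so the outer assignment is redundant.
unset X
X=${X:=fallback}

# New form: ":" is a do-nothing builtin. Its argument is still expanded, and
# the expansion performs the assignment as a side effect.
unset Y
: "${Y:=fallback}"

# A value that is already set is left untouched.
Z=custom
: "${Z:=fallback}"

echo "X=$X Y=$Y Z=$Z"
```

    Quoting the expansion keeps the shell from word-splitting or globbing the
    value before ":" discards it.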
    
    Signed-off-by: Petr Machata <[email protected]>
    Reviewed-by: Benjamin Poirier <[email protected]>
    Link: https://lore.kernel.org/r/1890ddc58420c2c0d5ba3154c87ecc6d9faf6947.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    fa61e9a
  3. selftests: forwarding.config.sample: Move overrides to lib.sh

    forwarding.config.sample, net/lib.sh and net/forwarding/lib.sh contain
    definitions and redefinitions of some of the same variables. The overlap
    between net/forwarding/lib.sh and forwarding.config.sample is especially
    large. This duplication is a potential source of confusion and problems.
    
    It would be overall less error prone if each variable were defined in one
    place only. In this patch set, that place is the library itself. Therefore
    move all comments from forwarding.config.sample to net/forwarding/lib.sh.
    
    Move over also a definition of TC_FLAG, which was missing from lib.sh
    entirely.
    
    Additionally, add to lib.sh a default definition of the topology variables.
    The logic behind this is that forgetting to specify forwarding.config was a
    frequent source of frustration for the selftest users. But really, most of
    the time the default veth based topology is just fine. We considered just
    sourcing forwarding.config.sample instead if forwarding.config is not
    available, but this is a cleaner solution.
    
    That means the syntax of the forwarding.config.sample override has to
    change to an array assignment, so that the whole variable is overwritten,
    not just individual keys, which could leave the value of some keys
    unchanged. Do the same in lib.sh for any cut'n'pasters out there.
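    The difference between per-key and whole-array overrides can be sketched as
    follows. This assumes a bash associative array; the interface names and the
    PER_KEY variable are illustrative and are not copied from lib.sh or from
    forwarding.config.sample:

```shell
#!/bin/bash
# Illustrative defaults standing in for whatever the library would define.
declare -A NETIFS=(
	[p1]=veth0
	[p2]=veth1
	[p3]=veth2
	[p4]=veth3
)

# Per-key override on a copy of the defaults: only p1 and p2 change, while
# p3 and p4 silently keep their default values.
declare -A PER_KEY
for k in "${!NETIFS[@]}"; do
	PER_KEY[$k]=${NETIFS[$k]}
done
PER_KEY[p1]=swp1
PER_KEY[p2]=swp2

# Whole-array override: the variable is replaced, so no stale keys survive.
NETIFS=(
	[p1]=swp1
	[p2]=swp2
)

echo "per-key override has ${#PER_KEY[@]} keys, p3=${PER_KEY[p3]}"
echo "array override has ${#NETIFS[@]} keys"
```

    With the array form, a config file that names only two interfaces ends up
    with exactly two, rather than two overridden and two inherited ones.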
    
    The config file is then given a sort of carte blanche to redefine whatever
    variables it sees fit from the libraries. This is described in a comment in
    the file. Only a handful of variables are left behind, to illustrate the
    customization.
    
    The fact that the variables are now missing from forwarding.config.sample,
    and therefore would be missing from a forwarding.config derived from that
    file as well, should not change anything. This is just the sample file.
    Users that keep their own forwarding.config would retain it as before.
    
    The only observable change is the introduction of TC_FLAG to lib.sh,
    because no attempt would now be made to install the filters to the HW
    datapath. For veth pairs this does not change anything. For HW deployments,
    users presumably have forwarding.config with this value overridden.
    
    Signed-off-by: Petr Machata <[email protected]>
    Reviewed-by: Benjamin Poirier <[email protected]>
    Link: https://lore.kernel.org/r/b9b8a11a22821a7aa532211ff461a34f596e26bf.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    fd36fd2
  4. selftests: forwarding: README: Document customization

    That any sort of customization is possible at all, let alone how it should
    be done, is currently not at all clear. Document the whats and hows in
    README.
    
    Signed-off-by: Petr Machata <[email protected]>
    Reviewed-by: Benjamin Poirier <[email protected]>
    Link: https://lore.kernel.org/r/e819623af6aaeea49e9dc36cecd95694fad73bb8.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    0cb8628
  5. selftests: forwarding: ipip_lib: Do not import lib.sh

    This library is always sourced in the context where lib.sh has already been
    sourced as well. Therefore drop the explicit sourcing and expect the client
    to already have done it. This will simplify moving some of the clients to a
    different directory.
    
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/a4da5e9cd42a34cbace917a048ca71081719d6ac.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    0faa565
  6. selftests: forwarding: Move several selftests

    The tests in net/forwarding are generally expected to be HW-independent.
    There are however several tests that, while not depending on any HW in
    particular, nevertheless depend on being used on HW interfaces. Placing
    these selftests in net/forwarding is confusing, because the selftest will
    just report it can't be run on veth pairs. At the same time, placing them
    in a particular driver's selftests subdirectory would be wrong.
    
    Instead, add a new directory, drivers/net/hw, where these generic but HW
    independent selftests should be placed. Move over several such tests
    including one helper library.
    
    Since typically these tests will not be expected to run, omit the directory
    drivers/net/hw from the TARGETS list in selftests/Makefile. Retain a
    Makefile in the new directory itself, so that a user can make -C into that
    directory and act on those tests explicitly.
    
    Cc: Roger Quadros <[email protected]>
    Cc: Tobias Waldekranz <[email protected]>
    Cc: Danielle Ratson <[email protected]>
    Cc: Davide Caratti <[email protected]>
    Cc: Johannes Nixdorf <[email protected]>
    Suggested-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/e11dae1f62703059e9fc2240004288ac7cc15756.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    40d269c
  7. selftests: forwarding: Ditch skip_on_veth()

    Since the selftests that are not supposed to run on veth pairs are now in
    their own dedicated directory, the skip_on_veth logic can go away. Drop it
    from the selftests, and from lib.sh.
    
    Cc: Danielle Ratson <[email protected]>
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/63b470e10d65270571ee7de709b31672ce314872.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    0c499a3
  8. selftests: forwarding: Change inappropriate log_test_skip() calls

    The SKIP return should be used for cases where tooling of the machine under
    test is lacking. For cases where HW is lacking, the appropriate outcome is
    XFAIL.
    
    This is the case with ethtool_rmon and mlxsw_lib. For these, introduce a
    new helper, log_test_xfail().
    
    Do the same for router_mpath_nh_lib. Note that it will be fixed using a
    more reusable way in a following patch.
    
    For the two resource_scale selftests, the log should simply not be written,
    because there is no problem.
    
    Cc: Tobias Waldekranz <[email protected]>
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/3d668d8fb6fa0d9eeb47ce6d9e54114348c7c179.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    677f394
  9. selftests: lib: Define more kselftest exit codes

    The following patches will operate with more exit codes besides
    ksft_skip. Add them here.
    
    Additionally, move a duplicated skip exit code definition from
    forwarding/tc_tunnel_key.sh. Keep a similar duplicate in
    forwarding/devlink_lib.sh, because even though lib.sh will have
    been sourced in all cases where devlink_lib is, the inclusion is not
    visible in the file itself, and relying on it would be confusing.
    
    Cc: Davide Caratti <[email protected]>
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/545a03046c7aca0628a51a389a9b81949ab288ce.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    51ccf26
  10. selftests: forwarding: Have RET track kselftest framework constants

    The variable RET keeps track of whether the test under execution has so far
    failed or not. Currently it works in binary fashion: zero means everything
    is fine, non-zero means something failed. log_test() then uses the value to
    give a human-readable message.
    
    In order to allow log_test() to report skips and xfails, the semantics of
    RET need to be more fine-grained. Therefore have RET value be one of
    kselftest framework constants: $ksft_fail, $ksft_xfail, etc.
    
    The current logic in check_err() is such that first non-zero value of RET
    trumps all those that follow. But that is not right when RET has more
    fine-grained value semantics. Different outcomes have different weights.
    
    The results of PASS and XFAIL are mostly the same: they both communicate a
    test that did not go wrong. SKIP communicates lack of tooling, which the
    user should go and try to fix, and as such should not be overridden by the
    passes. So far, the higher-numbered statuses can be considered weightier.
    But FAIL should be the weightiest.
    
    Add a helper, ksft_status_merge(), which merges two statuses in a way that
    respects the above conditions. Express it in a generic manner, because exit
    status merge is subtly different, and we want to reuse the same logic.
    
    Use the new helper when setting RET in check_err().
    
    Re-express check_fail() in terms of check_err() to avoid duplication.
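    The weighting described above can be sketched as follows. This is not the
    code from lib.sh: status_weight and status_merge are hypothetical names
    chosen for illustration, and only the numeric exit codes follow the
    kselftest framework:

```shell
#!/bin/bash
# kselftest exit codes.
ksft_pass=0
ksft_fail=1
ksft_xfail=2
ksft_skip=4

# Hypothetical helper: map a status to its weight, PASS < XFAIL < SKIP < FAIL.
status_weight()
{
	case "$1" in
	"$ksft_pass")  echo 0 ;;
	"$ksft_xfail") echo 1 ;;
	"$ksft_skip")  echo 2 ;;
	*)             echo 3 ;;	# FAIL, and any unknown code, weighs most
	esac
}

# Hypothetical helper: print whichever of the two statuses weighs more.
status_merge()
{
	local a=$1 b=$2

	if [ "$(status_weight "$b")" -gt "$(status_weight "$a")" ]; then
		echo "$b"
	else
		echo "$a"
	fi
}

RET=$ksft_pass
RET=$(status_merge "$RET" "$ksft_xfail")	# XFAIL outweighs PASS
RET=$(status_merge "$RET" "$ksft_skip")		# SKIP outweighs XFAIL
RET=$(status_merge "$RET" "$ksft_pass")		# a later PASS changes nothing
echo "RET=$RET"
```

    Comparing weights rather than raw exit codes is what lets FAIL (1) win over
    SKIP (4) even though its numeric value is lower.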
    
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/7dfff51cc925c7a3ac879b9050a0d6a327c8d21f.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    596c881
  11. selftests: forwarding: Convert log_test() to recognize RET values

    In a previous patch, the interpretation of RET value was changed to mean
    the kselftest framework constant with the test outcome: $ksft_pass,
    $ksft_xfail, etc.
    
    Update log_test() to recognize the various possible RET values.
    
    Then have EXIT_STATUS track the RET value of the current test. This differs
    subtly from the way RET tracks the value: while for RET we want to
    recognize XFAIL as a separate status, for purposes of exit code, we want
    to conflate XFAIL and PASS, because they both communicate non-failure. Thus
    add a new helper, ksft_exit_status_merge().
    
    With this log_test_skip() and log_test_xfail() can be reexpressed as thin
    wrappers around log_test.
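    A rough sketch of how the exit status merge differs from the RET merge.
    The helper name exit_status_merge is hypothetical and the body is an
    assumption about one way to conflate XFAIL with PASS, not the code in
    lib.sh:

```shell
#!/bin/bash
# kselftest exit codes.
ksft_pass=0
ksft_fail=1
ksft_xfail=2
ksft_skip=4

# Hypothetical helper: fold one test's RET into the running exit status.
# XFAIL is first reduced to PASS, then PASS < SKIP < FAIL picks the winner.
exit_status_merge()
{
	local cur=$1 new=$2

	if [ "$new" -eq "$ksft_xfail" ]; then
		new=$ksft_pass
	fi

	if [ "$cur" -eq "$ksft_fail" ] || [ "$new" -eq "$ksft_fail" ]; then
		echo "$ksft_fail"
	elif [ "$cur" -eq "$ksft_skip" ] || [ "$new" -eq "$ksft_skip" ]; then
		echo "$ksft_skip"
	else
		echo "$ksft_pass"
	fi
}

EXIT_STATUS=$ksft_pass
EXIT_STATUS=$(exit_status_merge "$EXIT_STATUS" "$ksft_xfail")
AFTER_XFAIL=$EXIT_STATUS
EXIT_STATUS=$(exit_status_merge "$EXIT_STATUS" "$ksft_skip")
AFTER_SKIP=$EXIT_STATUS
echo "after XFAIL: $AFTER_XFAIL, after SKIP: $AFTER_SKIP"
```

    An expected failure therefore leaves the script's exit code at PASS, while
    a skip or a real failure still propagates.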
    
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/e5f807cb5476ab795fd14ac74da53a731a9fc432.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    a923af1
  12. selftests: forwarding: Support for performance sensitive tests

    Several tests in the suite use large amounts of traffic to e.g. cause
    congestion and evaluate RED or shaper performance. These tests will not run
    well on a slow machine, be it one with heavy debug kernel, or a VM, or e.g.
    a single-board computer. Allow users to specify an environment variable,
    KSFT_MACHINE_SLOW=yes, to indicate that the tests are being run on one such
    machine.
    
    Performance sensitive tests can then use a new helper, xfail_on_slow(), to
    mark parts of the test that are sensitive to low-performance machines.
    The helper can be used to just mark the whole suite, like so:
    
    	xfail_on_slow tests_run
    
    ... or, on the other side of the granularity spectrum, to override
    individual checks:
    
    	xfail_on_slow check_err $? "Expected much, got little."
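    One way such a wrapper could work is sketched below. Only the
    KSFT_MACHINE_SLOW variable comes from the description above; the function
    names with a _sketch suffix and the FAIL_TO_XFAIL flag are assumptions made
    for this example and should not be read as the lib.sh implementation:

```shell
#!/bin/bash
# kselftest exit codes used by this sketch.
ksft_pass=0
ksft_fail=1
ksft_xfail=2

# Hypothetical stand-in for a check helper: records a failure in RET, or an
# expected failure when the caller asked for failures to be downgraded.
check_err_sketch()
{
	local err=$1

	if [ "$err" -ne 0 ] && [ "$RET" -eq "$ksft_pass" ]; then
		if [ "${FAIL_TO_XFAIL:-no}" = yes ]; then
			RET=$ksft_xfail
		else
			RET=$ksft_fail
		fi
	fi
}

# Hypothetical wrapper: run the given command with failures downgraded to
# XFAIL, but only when the user declared the machine to be slow.
xfail_on_slow_sketch()
{
	local FAIL_TO_XFAIL=no

	if [ "${KSFT_MACHINE_SLOW:-no}" = yes ]; then
		FAIL_TO_XFAIL=yes
	fi
	"$@"
}

KSFT_MACHINE_SLOW=yes
RET=$ksft_pass
xfail_on_slow_sketch check_err_sketch 1
SLOW_RET=$RET

KSFT_MACHINE_SLOW=no
RET=$ksft_pass
xfail_on_slow_sketch check_err_sketch 1
FAST_RET=$RET

echo "slow machine: RET=$SLOW_RET, fast machine: RET=$FAST_RET"
```

    Because the wrapper simply runs its arguments as a command, the same
    function can wrap a single check or a whole test run.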
    
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/99a376a2d2ffdaeee7752b1910cb0c3ea5d80fbe.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    e16a8d2
  13. selftests: forwarding: Mark performance-sensitive tests

    When run on a slow machine, the scheduler traffic tests can be expected to
    fail, and should be reported as XFAIL in that case. Therefore run these
    tests through the xfail_on_slow wrapper.
    
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/9a357f8cf34f5ececac08d43a3eb023008996035.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    e103910
  14. selftests: forwarding: router_mpath_nh_lib: Don't skip, xfail on veth

    When the NH group stats tests are currently run on a veth topology, the
    HW-stats leg of each test is SKIP'ped. But kernel networking CI interprets
    skips as a sign that tooling is missing, and prompts maintainer
    investigation. Lack of capability to pass a test should be expressed as
    XFAIL.
    
    Selftests that require HW should normally be put in drivers/net/hw, but
    doing so for the NH counter selftests would just lead to a lot of
    duplication.
    
    So instead, introduce a helper, xfail_on_veth(), which can be used to mark
    selftests that should XFAIL instead of FAILing when run on a veth topology.
    On a non-veth topology, the helper does not change anything.
    
    Use the helper in the HW-stats part of router_mpath_nh_lib selftest.
    
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/15f0ab9637aa0497f164ec30e83c1c8f53d53719.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    6db870b
  15. selftests: forwarding: Add a test for testing lib.sh functionality

    Rerunning various scenarios to make sure lib.sh changes do not impact the
    observable behavior is no fun. Add a selftest at least for the bare basics
    -- the mechanics of setting RET, retmsg, and EXIT_STATUS.
    
    Since the selftest itself uses lib.sh, it would be possible to break lib.sh
    in such a way that invalidates result of the selftest. Since the metatest
    only uses the bare basics (just pass/fail), hopefully such fundamental
    breakages would be noticed.
    
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://lore.kernel.org/r/6d25cedbf2d4b83614944809a34fe023fbe8db38.1711464583.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    pmachata authored and kuba-moo committed Mar 29, 2024
    8ff2d7a
  16. Merge branch 'selftests-fixes-for-kernel-ci'

    Petr Machata says:
    
    ====================
    selftests: Fixes for kernel CI
    
    As discussed on the bi-weekly call on Jan 30, and in mail threads around
    the kernel CI effort, some changes are desirable in the suite of forwarding
    selftests, the better to work with the CI tooling. Namely:
    
    - The forwarding selftests use a configuration file where names of
      interfaces are defined and various variables can be overridden. There
      is also forwarding.config.sample that users can use as a template to
      refer to when creating the config file. What happens a fair bit is
      that users either do not know about this at all, or simply forget, and
      are confused by cryptic failures about interfaces that cannot be
      created.
    
      In patches #1 - #3 have lib.sh just be the single source of truth with
      regards to which variables exist. That includes the topology variables
      which were previously only in the sample file, and any "tweak
      variables", such as what tools to use, sleep times, etc.
    
      forwarding.config.sample then becomes just a placeholder with a couple
      examples. Unless specific HW should be exercised, or specific tools
      used, the defaults are usually just fine.
    
    - Several net/forwarding/ selftests (and one net/ one) cannot be run on
      veth pairs, they need an actual HW interface to run on. They are
      generic in the sense that any capable HW should pass them, which is
      why they have been put to net/forwarding/ as opposed to drivers/net/,
      but they do not generalize to veth. The fact that these tests are in
      net/forwarding/, but still complaining when run, is confusing.
    
      In patches #4 - #6 move these tests to a new directory
      drivers/net/hw.
    
    - The following patches extend the codebase to properly handle test
      results other than pass and fail.
    
      Patch #7 is preparatory. It converts several log_test_skip to XFAIL,
      so that tests do not spuriously end up returning non-0 when they
      are not supposed to.
    
      In patches #8 - #10, introduce some missing ksft constants, then support
      having those constants in RET, and then finally in EXIT_STATUS.
    
    - The traffic scheduler tests generate a large amount of network traffic
      to test the behavior of the scheduler. This demands a relatively
      high-performance computer. On slow machines, such as with a debugging
      kernel, the test would spuriously fail.
    
      It can still be useful to "go through the motions" though, to possibly
      catch bugs in setup of the scheduler graph and passing packets around.
      Thus we still want to run the tests, just with lowered demands.
    
      To that end, in patches #11 - #12, introduce an environment variable
      KSFT_MACHINE_SLOW, with obvious meaning. Tests can then make checks
      more lenient, such as mark failures as XFAIL. A helper, xfail_on_slow,
      is provided to mark performance-sensitive parts of the selftest.
    
    - In patch #13, use a similar mechanism to mark a NH group stats
      selftest to XFAIL HW stats tests when run on VETH pairs.
    
    - All these changes complicate the hitherto straightforward logging and
      checking logic, so in patch #14, add a selftest that checks this
      functionality in lib.sh.
    
    v1 (vs. an RFC circulated through linux-kselftest):
    - Patch #9:
        - Clarify intended usage by s/set_ret/ret_set_ksft_status/,
          s/nret/ksft_status/
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 29, 2024
    51cf49f
  17. ynl: support hex display_hint for integer

    Sometimes it would be convenient to read an integer as hex, for example
    for mask values.
    
    Suggested-by: Donald Hunter <[email protected]>
    Reviewed-by: Donald Hunter <[email protected]>
    Signed-off-by: Hangbin Liu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    liuhangbin authored and kuba-moo committed Mar 29, 2024
    b334f5e
  18. doc/netlink/specs: Add vlan attr in rt_link spec

    With command:
     # ./tools/net/ynl/cli.py \
     --spec Documentation/netlink/specs/rt_link.yaml \
     --do getlink --json '{"ifname": "eno1.2"}' --output-json | \
     jq -C '.linkinfo'
    
    Before:
    Exception: No message format for 'vlan' in sub-message spec 'linkinfo-data-msg'
    
    After:
     {
       "kind": "vlan",
       "data": {
         "protocol": "8021q",
         "id": 2,
         "flag": {
           "flags": [
             "reorder-hdr"
           ],
           "mask": "0xffffffff"
         },
         "egress-qos": {
           "mapping": [
             {
               "from": 1,
               "to": 2
             },
             {
               "from": 4,
               "to": 4
             }
           ]
         }
       }
     }
    
    Signed-off-by: Hangbin Liu <[email protected]>
    Reviewed-by: Donald Hunter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    liuhangbin authored and kuba-moo committed Mar 29, 2024
    782c108
  19. Merge branch 'doc-netlink-specs-add-vlan-support'

    Hangbin Liu says:
    
    ====================
    doc/netlink/specs: Add vlan support
    
    Add vlan support in rt_link spec.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 29, 2024
    fb984d1
  20. dt-bindings: net: renesas,etheravb: Add optional MDIO bus node

    The Renesas Ethernet AVB bindings do not allow the MDIO bus to be
    described. This has not been needed as only a single PHY is
    supported and no MDIO bus properties have been needed.
    
    Add an optional mdio node to the binding which allows the MDIO bus to be
    described and allow bus properties to be set.
    
    Signed-off-by: Niklas Söderlund <[email protected]>
    Reviewed-by: Sergey Shtylyov <[email protected]>
    Reviewed-by: Rob Herring <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Niklas Söderlund authored and kuba-moo committed Mar 29, 2024
    a87590c
  21. ravb: Add support for an optional MDIO mode

    The driver used the DT node of the device itself when registering the
    MDIO bus. While this works, it creates a problem: it forces any MDIO bus
    properties to also be set on the device's DT node. This mixes the
    properties of two distinctly different things and is confusing.
    
    This change adds support for an optional mdio node to be defined as a
    child to the device DT node. The child node can then be used to describe
    MDIO bus properties that the MDIO core can act on when registering the
    bus.
    
    If no mdio child node is found, the driver falls back to the old behavior
    and registers the MDIO bus using the device DT node. This change is
    backward compatible with old bindings in use.
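
The fallback described above can be modelled in plain C as follows. This is a userspace sketch, not the driver code: `struct dt_node` and `mdio_bus_node` are hypothetical stand-ins for the kernel's device tree types and lookup helpers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a device tree node. */
struct dt_node {
	const char *name;
	const struct dt_node *children;
	size_t n_children;
};

/*
 * Pick the node used to register the MDIO bus: the "mdio" child if one
 * exists, otherwise the device node itself (the old behavior).
 */
static const struct dt_node *mdio_bus_node(const struct dt_node *dev)
{
	for (size_t i = 0; i < dev->n_children; i++) {
		if (strcmp(dev->children[i].name, "mdio") == 0)
			return &dev->children[i];
	}
	return dev;
}
```

Because the device node is returned when no child is found, existing device trees without an mdio child keep working unchanged.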
    
    Signed-off-by: Niklas Söderlund <[email protected]>
    Reviewed-by: Sergey Shtylyov <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Niklas Söderlund authored and kuba-moo committed Mar 29, 2024
    2c60c4c
  22. Merge branch 'ravb-support-describing-the-mdio-bus'

    Niklas Söderlund says:
    
    ====================
    ravb: Support describing the MDIO bus
    
    This series adds support to the binding and driver of the Renesas
    Ethernet AVB to describe the MDIO bus. Currently the driver uses
    the OF node of the device itself when registering the MDIO bus.
    This forces any MDIO bus properties the MDIO core should react to
    to be set on the device OF node. This is confusing and none of
    the MDIO bus properties are described in the Ethernet AVB bindings.
    
    Patch 1/2 extends the bindings with an optional mdio child-node
    to the device that can be used to contain the MDIO bus settings.
    Patch 2/2 changes the driver to use this node (if present)
    when registering the MDIO bus.
    
    If the new optional mdio child-node is not present, the driver
    falls back to the old behavior and uses the device OF node like before.
    This change is fully backward compatible with existing usage
    of the bindings.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed Mar 29, 2024
    c602f4c
  23. bpf: Add support for passing mark with bpf_fib_lookup

    Extend the bpf_fib_lookup() helper to use the mark if
    the BPF_FIB_LOOKUP_MARK flag is set. To pass the mark, four
    bytes of struct bpf_fib_lookup are used, shared with the
    output-only smac/dmac fields.
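
The sharing of storage between the input mark and the output MAC addresses can be pictured with a union. The struct below is a simplified, hypothetical model of only that part of the layout (`fib_tail_model` is not a kernel type), assuming the mark occupies the first four bytes of the 12 bytes used for smac/dmac.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the shared tail of the lookup parameters. */
struct fib_tail_model {
	union {
		/* output: filled in by a successful lookup */
		struct {
			uint8_t smac[6];
			uint8_t dmac[6];
		};
		/* input: read only when the mark flag is set */
		struct {
			uint32_t mark;
		};
	};
};
```

Since the MAC fields are written only on output, reusing their bytes for an input value does not grow the structure.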
    
    Signed-off-by: Anton Protopopov <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Acked-by: Daniel Borkmann <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    aspsk authored and Alexei Starovoitov committed Mar 29, 2024
    5311591
  24. selftests/bpf: Add BPF_FIB_LOOKUP_MARK tests

    This patch extends the fib_lookup test suite by adding a few test
    cases for each IP family to test the new BPF_FIB_LOOKUP_MARK flag
    of bpf_fib_lookup():
    
      * Test destination IP address selection with and without a mark
        and/or the BPF_FIB_LOOKUP_MARK flag set
    
    Signed-off-by: Anton Protopopov <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Acked-by: Daniel Borkmann <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    aspsk authored and Alexei Starovoitov committed Mar 29, 2024
    6efec2c
  25. bpf: Add a check for struct bpf_fib_lookup size

    The struct bpf_fib_lookup should not grow outside of its 64 bytes.
    Add a static assert to validate this.
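
A compile-time size check of this kind can be written with C11 `_Static_assert`. The struct here is a hypothetical 64-byte placeholder (`lookup_model`), not the real struct bpf_fib_lookup, and the exact assertion macro used in the kernel is not shown in this message.

```c
#include <stdint.h>

/* Hypothetical placeholder with the same total size as the real struct. */
struct lookup_model {
	uint32_t words[13];	/* 52 bytes of other parameters */
	uint8_t  smac[6];
	uint8_t  dmac[6];
};

/* Compilation fails if a later change makes the struct grow or shrink. */
_Static_assert(sizeof(struct lookup_model) == 64,
	       "struct lookup_model must stay 64 bytes");
```

The check costs nothing at run time; a violation is reported by the compiler as soon as a field is added.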
    
    Suggested-by: David Ahern <[email protected]>
    Signed-off-by: Anton Protopopov <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    aspsk authored and Alexei Starovoitov committed Mar 29, 2024
    59b418c
  26. bpf: improve error message for unsupported helper

    The BPF verifier emits an "unknown func" message when the given BPF program
    type does not support a BPF helper. This message may confuse users, because
    it omits the important context that the helper is unknown only to the
    current program type.
    
    This patch changes the message to "program of this type cannot use helper "
    and aligns dependent code in libbpf and tests. Any suggestions on
    improving/changing this message are welcome.
    
    Signed-off-by: Mykyta Yatsenko <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Acked-by: Quentin Monnet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    mykyta5 authored and Alexei Starovoitov committed Mar 29, 2024
    786bf0e
  27. bpf,arena: Use helper sizeof_field in struct accessors

    Use the well defined helper sizeof_field() to calculate the size of a
    struct member, instead of doing custom calculations.
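
For readers unfamiliar with the helper, sizeof_field() can be reproduced in userspace as shown here. The macro body matches the usual kernel definition; `struct sample` is a hypothetical type used only for the demonstration.

```c
#include <stddef.h>

/* Size of a struct member, without needing an instance of the struct. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Hypothetical example type. */
struct sample {
	char      tag;
	long long value;
	char      name[16];
};

static size_t sample_name_size(void)
{
	/* sizeof does not evaluate its operand, so no pointer is dereferenced. */
	return sizeof_field(struct sample, name);
}
```

Using the helper keeps the member name in one place, which is less error-prone than open-coded pointer arithmetic on offsets.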
    
    Signed-off-by: Haiyue Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    haiyuewa authored and Alexei Starovoitov committed Mar 29, 2024
    55fc888
  28. selftests/bpf: rename and clean up userspace-triggered benchmarks

    Rename uprobe-base to the more precise usermode-count (it will match other
    baseline-like benchmarks, kernel-count and syscall-count). Also use the
    BENCH_TRIG_USERMODE() macro to define all usermode-based triggering
    benchmarks, which include usermode-count and uprobe/uretprobe benchmarks.
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    anakryiko authored and Alexei Starovoitov committed Mar 29, 2024
    1175f8d
  29. selftests/bpf: add batched, mostly in-kernel BPF triggering benchmarks

    Existing kprobe/fentry triggering benchmarks have a 1-to-1 mapping between
    one syscall execution and one BPF program run. While we use a fast
    get_pgid() syscall, syscall overhead can still be non-trivial.
    
    This patch adds a set of kprobe/fentry benchmarks that significantly amortize
    the cost of the syscall relative to the actual BPF triggering overhead. We do
    this by employing the BPF_PROG_TEST_RUN command to trigger a "driver" raw_tp
    program, which runs a tight parameterized loop calling a cheap BPF helper
    (bpf_get_numa_node_id()), to which kprobe/fentry programs are
    attached for benchmarking.
    
    This way, one bpf() syscall causes N executions of the BPF program being
    benchmarked. N defaults to 100, but can be adjusted with the
    --trig-batch-iters CLI argument.
    
    For comparison we also implement a new baseline program that instead of
    triggering another BPF program just does N atomic per-CPU counter
    increments, establishing the limit for all other types of program within
    this batched benchmarking setup.
    
    Taking the final set of benchmarks added in this patch set (including
    tp/raw_tp/fmodret, added in later patch), and keeping for now "legacy"
    syscall-driven benchmarks, we can capture all triggering benchmarks in
    one place for comparison, before we remove the legacy ones (and rename
    xxx-batched into just xxx).
    
    $ benchs/run_bench_trigger.sh
    usermode-count       :   79.500 ± 0.024M/s
    kernel-count         :   49.949 ± 0.081M/s
    syscall-count        :    9.009 ± 0.007M/s
    
    fentry-batch         :   31.002 ± 0.015M/s
    fexit-batch          :   20.372 ± 0.028M/s
    fmodret-batch        :   21.651 ± 0.659M/s
    rawtp-batch          :   36.775 ± 0.264M/s
    tp-batch             :   19.411 ± 0.248M/s
    kprobe-batch         :   12.949 ± 0.220M/s
    kprobe-multi-batch   :   15.400 ± 0.007M/s
    kretprobe-batch      :    5.559 ± 0.011M/s
    kretprobe-multi-batch:    5.861 ± 0.003M/s
    
    fentry-legacy        :    8.329 ± 0.004M/s
    fexit-legacy         :    6.239 ± 0.003M/s
    fmodret-legacy       :    6.595 ± 0.001M/s
    rawtp-legacy         :    8.305 ± 0.004M/s
    tp-legacy            :    6.382 ± 0.001M/s
    kprobe-legacy        :    5.528 ± 0.003M/s
    kprobe-multi-legacy  :    5.864 ± 0.022M/s
    kretprobe-legacy     :    3.081 ± 0.001M/s
    kretprobe-multi-legacy:   3.193 ± 0.001M/s
    
    Note how xxx-batch variants are measured with significantly higher
    throughput, even though it's exactly the same in-kernel overhead. As
    such, results can be compared only between benchmarks of the same kind
    (syscall vs batched):
    
    fentry-legacy        :    8.329 ± 0.004M/s
    fentry-batch         :   31.002 ± 0.015M/s
    
    kprobe-multi-legacy  :    5.864 ± 0.022M/s
    kprobe-multi-batch   :   15.400 ± 0.007M/s
    
    Note also that syscall-count is setting a theoretical limit for
    syscall-triggered benchmarks, while kernel-count is setting similar
    limits for batch variants. usermode-count is a happy and unachievable
    case of user space counting without doing any syscalls, and is mostly
    the measure of CPU speed for such a trivial benchmark.
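
To make the throughput figures above easier to compare, they can be converted into time per triggering. The helper below is a small arithmetic sketch; `ns_per_op` is a hypothetical name and the only inputs are the M/s values quoted in this message.

```c
/*
 * Convert a throughput in millions of operations per second into
 * nanoseconds per operation: 1 s / (mops * 1e6) = 1000 / mops ns.
 */
static double ns_per_op(double mops)
{
	return 1000.0 / mops;
}
```

With the numbers above, fentry-legacy at 8.329 M/s is about 120 ns per triggering, while fentry-batch at 31.002 M/s is about 32 ns. Following the explanation in this message, the gap would largely be syscall cost that batching spreads over N program runs.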
    
    As was mentioned, tp/raw_tp/fmodret require a kernel-side kfunc to produce
    a similar benchmark, which we address in a separate patch.
    
    Note that run_bench_trigger.sh allows overriding the list of benchmarks
    to run, which is very useful for performance work.
    
    Cc: Jiri Olsa <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    anakryiko authored and Alexei Starovoitov committed Mar 29, 2024
    7df4e59
  30. selftests/bpf: remove syscall-driven benchs, keep syscall-count only

    Remove "legacy" benchmarks triggered by syscalls in favor of newly added
    in-kernel/batched benchmarks. Drop -batched suffix now as well.
    Next patch will restore "feature parity" by adding back
    tp/raw_tp/fmodret benchmarks based on in-kernel kfunc approach.
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    anakryiko authored and Alexei Starovoitov committed Mar 29, 2024
    208c439
  31. selftests/bpf: lazy-load trigger bench BPF programs

    Instead of front-loading all possible benchmarking BPF programs for
    trigger benchmarks, explicitly specify which BPF programs are used by
    a specific benchmark and load only those.
    
    This allows more flexibility in supporting older kernels, where some
    program types might not be possible to load (e.g., those that rely on
    a newly added kfunc).
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    anakryiko authored and Alexei Starovoitov committed Mar 29, 2024
    b4ccf91
  32. bpf: add bpf_modify_return_test_tp() kfunc triggering tracepoint

    Add a simple bpf_modify_return_test_tp() kfunc, available to all program
    types, that is useful for various testing and benchmarking scenarios, as
    it allows triggering most tracing BPF program types from the BPF side,
    enabling complex testing and benchmarking scenarios.
    
    It can also be attached to by fmod_ret programs, making it a good and
    simple way to trigger an fmod_ret program under test/benchmark.
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    anakryiko authored and Alexei Starovoitov committed Mar 29, 2024
    3124591
  33. selftests/bpf: add batched tp/raw_tp/fmodret tests

    Utilize bpf_modify_return_test_tp() kfunc to have a fast way to trigger
    tp/raw_tp/fmodret programs from another BPF program, which gives us
    comparable batched benchmarks to (batched) kprobe/fentry benchmarks.
    
    We don't switch kprobe/fentry batched benchmarks to this kfunc, so that
    the bench tool stays usable on older kernels as well.
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    anakryiko authored and Alexei Starovoitov committed Mar 29, 2024
    985d068
  34. Merge branch 'bench-fast-in-kernel-triggering-benchmarks'

    Andrii Nakryiko says:
    
    ====================
    bench: fast in-kernel triggering benchmarks
    
    Remove "legacy" triggering benchmarks which rely on syscalls (and thus syscall
    overhead is, unfortunately, a noticeable part of the benchmark). Replace them
    with faster versions that rely on triggering BPF programs in-kernel through
    another simple "driver" BPF program. See patch #2 for comparison results.
    
    raw_tp/tp/fmodret benchmarks required adding a simple kfunc in the kernel to be
    able to trigger a simple tracepoint from a BPF program (plus it is also allowed
    to be replaced by fmod_ret programs). This limits raw_tp/tp/fmodret benchmarks
    to new kernels only, but it keeps the bench tool itself very portable, and most
    of the other benchmarks will still work on a wide variety of kernels without the
    need to worry about building and deploying a custom kernel module. See patches #5
    and #6 for details.
    
    v1->v2:
      - move new TP closer to BPF test run code;
      - rename/move kfunc and register it for fmod_rets (Alexei);
      - limit --trig-batch-iters param to [1, 1000] (Alexei).
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Alexei Starovoitov committed Mar 29, 2024
    a461a51
  35. bpf: Mitigate latency spikes associated with freeing non-preallocated…

    … htab
    
    Following the recent upgrade of one of our BPF programs, we encountered
    significant latency spikes affecting other applications running on the same
    host. After thorough investigation, we identified that these spikes were
    primarily caused by the prolonged duration required to free a
    non-preallocated htab with approximately 2 million keys.
    
    Notably, our kernel configuration lacks the presence of CONFIG_PREEMPT. In
    scenarios where kernel execution extends excessively, other threads might
    be starved of CPU time, resulting in latency issues across the system. To
    mitigate this, we've adopted a proactive approach by incorporating
    cond_resched() calls within the kernel code. This ensures that during
    lengthy kernel operations, the scheduler is invoked periodically to provide
    opportunities for other threads to execute.
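
The pattern of yielding during a long teardown can be sketched in userspace as follows. This is an analogy only: `sched_yield()` stands in for the kernel's cond_resched(), the two are not equivalent, and `struct elem`, `free_all_buckets` and the choice to yield once per bucket are assumptions of this sketch.

```c
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hash table element: a singly linked chain per bucket. */
struct elem {
	struct elem *next;
};

/*
 * Free every element in every bucket and return how many were freed.
 * After each bucket, give other runnable threads a chance to execute.
 */
static size_t free_all_buckets(struct elem **buckets, size_t n_buckets)
{
	size_t freed = 0;

	for (size_t i = 0; i < n_buckets; i++) {
		struct elem *e = buckets[i];

		while (e) {
			struct elem *next = e->next;

			free(e);
			freed++;
			e = next;
		}
		buckets[i] = NULL;
		sched_yield();	/* userspace stand-in for cond_resched() */
	}
	return freed;
}
```

Yielding at bucket granularity bounds the time spent between scheduling points by the length of one chain rather than by the size of the whole table.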
    
    Signed-off-by: Yafang Shao <[email protected]>
    Acked-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    laoar authored and Alexei Starovoitov committed Mar 29, 2024
    ee3bad0
  36. bpf: Remove CONFIG_X86 and CONFIG_DYNAMIC_FTRACE guard from the tcp-c…

    …c kfuncs
    
    The commit 7aae231 ("bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE")
    added CONFIG_DYNAMIC_FTRACE guard because pahole was only generating
    btf for ftrace-able functions. The ftrace filter had already been
    removed from pahole, so the CONFIG_DYNAMIC_FTRACE guard can be
    removed.
    
    The commit 569c484 ("bpf: Limit static tcp-cc functions in the .BTF_ids list to x86")
    added the CONFIG_X86 guard because it failed on the powerpc arch, which
    prepended a "." to the local static function, so "cubictcp_init" becomes
    ".cubictcp_init". "__bpf_kfunc" has been added to kfunc
    since then and it uses the __used compiler attribute.
    There is an existing
    "__bpf_kfunc static u32 bpf_kfunc_call_test_static_unused_arg(u32 arg, u32 unused)"
    test in bpf_testmod.c to cover the static kfunc case.
    
    cross compile on ppc64 with CONFIG_DYNAMIC_FTRACE disabled:
    > readelf -s vmlinux | grep cubictcp_
    56938: c00000000144fd00   184 FUNC    LOCAL  DEFAULT    2 cubictcp_cwnd_event 	    [<localentry>: 8]
    56939: c00000000144fdb8   200 FUNC    LOCAL  DEFAULT    2 cubictcp_recalc_[...]   [<localentry>: 8]
    56940: c00000000144fe80   296 FUNC    LOCAL  DEFAULT    2 cubictcp_init 	    [<localentry>: 8]
    56941: c00000000144ffa8   228 FUNC    LOCAL  DEFAULT    2 cubictcp_state 	    [<localentry>: 8]
    56942: c00000000145008c  1908 FUNC    LOCAL  DEFAULT    2 cubictcp_cong_avoid  [<localentry>: 8]
    56943: c000000001450800  1644 FUNC    LOCAL  DEFAULT    2 cubictcp_acked 	    [<localentry>: 8]
    
    > bpftool btf dump file vmlinux | grep cubictcp_
    [51540] FUNC 'cubictcp_acked' type_id=38137 linkage=static
    [51541] FUNC 'cubictcp_cong_avoid' type_id=38122 linkage=static
    [51543] FUNC 'cubictcp_cwnd_event' type_id=51542 linkage=static
    [51544] FUNC 'cubictcp_init' type_id=9186 linkage=static
    [51545] FUNC 'cubictcp_recalc_ssthresh' type_id=35021 linkage=static
    [51547] FUNC 'cubictcp_state' type_id=38141 linkage=static
    
    The patch removed both config guards.
    
    Cc: Jiri Olsa <[email protected]>
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed Mar 29, 2024
    42e4ebd
  37. selftests/bpf: Test loading bpf-tcp-cc prog calling the kernel tcp-cc…

    … kfuncs
    
    This patch adds a test to ensure all static tcp-cc kfuncs are visible to
    the struct_ops bpf programs. It is checked by successfully loading
    the struct_ops programs calling these tcp-cc kfuncs.
    
    This patch needs to enable CONFIG_TCP_CONG_DCTCP and
    CONFIG_TCP_CONG_BBR.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed Mar 29, 2024
    5da7fb0
  38. selftests/bpf: Replace CHECK with ASSERT macros for ksyms test

    Replace CHECK with ASSERT macros for ksyms tests.
    This test failed earlier with a clang LTO kernel, but the
    issue is gone with the latest code base. Replacing
    CHECK with ASSERT still improves the code, as ASSERT is
    preferred in selftests.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    cdfd9cc
  39. libbpf: Mark libbpf_kallsyms_parse static function

    Currently the libbpf_kallsyms_parse() function is declared as a global
    function, but it is not actually an API and there are no external
    users in bpftool/bpf-selftests. So let us mark the function as
    static.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    ad2b052
  40. libbpf: Handle <orig_name>.llvm.<hash> symbol properly

    With CONFIG_LTO_CLANG_THIN enabled and an earlier
    version of the kernel code base ([1]), I hit the following
    error:
       test_ksyms:PASS:kallsyms_fopen 0 nsec
       test_ksyms:FAIL:ksym_find symbol 'bpf_link_fops' not found
       #118     ksyms:FAIL
    
    The reason is that 'bpf_link_fops' is renamed to
       bpf_link_fops.llvm.8325593422554671469
    Due to cross-file inlining, the static variable 'bpf_link_fops'
    in syscall.c is used by a function in another file. To avoid
    potentially duplicated names, LLVM added the suffix
    '.llvm.<hash>' ([2]) to the 'bpf_link_fops' variable.
    Such renaming causes a problem in libbpf if 'bpf_link_fops'
    is used in a bpf prog as a ksym, because 'bpf_link_fops' then does not
    match any symbol in /proc/kallsyms.
    
    To fix this issue, libbpf needs to understand that the suffix '.llvm.<hash>'
    is produced by a clang LTO kernel build and to process such symbols properly.
    
    With the latest bpf-next code base built with CONFIG_LTO_CLANG_THIN,
    I cannot reproduce the above failure any more. But such an issue
    could happen with other symbols, or in the future for the bpf_link_fops symbol.
    
    For example, with my current kernel, I got the following from
    /proc/kallsyms:
      ffffffff84782154 d __func__.net_ratelimit.llvm.6135436931166841955
      ffffffff85f0a500 d tk_core.llvm.726630847145216431
      ffffffff85fdb960 d __fs_reclaim_map.llvm.10487989720912350772
      ffffffff864c7300 d fake_dst_ops.llvm.54750082607048300
    
    I could not easily create a selftest to test newly-added
    libbpf functionality with a static C test since I do not know
    which symbol is cross-file inlined. But based on my particular kernel,
    the following test change can run successfully.
    
    >  diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms.c b/tools/testing/selftests/bpf/prog_tests/ksyms.c
    >  index 6a86d1f07800..904a103f7b1d 100644
    >  --- a/tools/testing/selftests/bpf/prog_tests/ksyms.c
    >  +++ b/tools/testing/selftests/bpf/prog_tests/ksyms.c
    >  @@ -42,6 +42,7 @@ void test_ksyms(void)
    >          ASSERT_EQ(data->out__bpf_link_fops, link_fops_addr, "bpf_link_fops");
    >          ASSERT_EQ(data->out__bpf_link_fops1, 0, "bpf_link_fops1");
    >          ASSERT_EQ(data->out__btf_size, btf_size, "btf_size");
    >  +       ASSERT_NEQ(data->out__fake_dst_ops, 0, "fake_dst_ops");
    >          ASSERT_EQ(data->out__per_cpu_start, per_cpu_start_addr, "__per_cpu_start");
    >
    >   cleanup:
    >  diff --git a/tools/testing/selftests/bpf/progs/test_ksyms.c b/tools/testing/selftests/bpf/progs/test_ksyms.c
    >  index 6c9cbb5a3bdf..fe91eef54b66 100644
    >  --- a/tools/testing/selftests/bpf/progs/test_ksyms.c
    >  +++ b/tools/testing/selftests/bpf/progs/test_ksyms.c
    >  @@ -9,11 +9,13 @@ __u64 out__bpf_link_fops = -1;
    >   __u64 out__bpf_link_fops1 = -1;
    >   __u64 out__btf_size = -1;
    >   __u64 out__per_cpu_start = -1;
    >  +__u64 out__fake_dst_ops = -1;
    >
    >   extern const void bpf_link_fops __ksym;
    >   extern const void __start_BTF __ksym;
    >   extern const void __stop_BTF __ksym;
    >   extern const void __per_cpu_start __ksym;
    >  +extern const void fake_dst_ops __ksym;
    >   /* non-existing symbol, weak, default to zero */
    >   extern const void bpf_link_fops1 __ksym __weak;
    >
    >  @@ -23,6 +25,7 @@ int handler(const void *ctx)
    >          out__bpf_link_fops = (__u64)&bpf_link_fops;
    >          out__btf_size = (__u64)(&__stop_BTF - &__start_BTF);
    >          out__per_cpu_start = (__u64)&__per_cpu_start;
    >  +       out__fake_dst_ops = (__u64)&fake_dst_ops;
    >
    >          out__bpf_link_fops1 = (__u64)&bpf_link_fops1;
    
    This patch fixes the issue in libbpf by ignoring
    the suffix '.llvm.<hash>' when comparing a
    bpf prog ksym against the symbols in /proc/kallsyms.
    Currently, only static variables in /proc/kallsyms are checked
    for the '.llvm.<hash>' suffix, since in bpf programs function ksyms
    with a '.llvm.<hash>' suffix are most likely kfuncs and unlikely
    to be cross-file inlined.
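
The comparison described above can be sketched as follows. This is not the libbpf code: `base_name_len` and `ksym_name_matches` are hypothetical names, and the sketch assumes the suffix always starts at the first occurrence of ".llvm." in the kallsyms name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of a kallsyms name with any ".llvm.<hash>" suffix excluded. */
static size_t base_name_len(const char *name)
{
	const char *suffix = strstr(name, ".llvm.");

	return suffix ? (size_t)(suffix - name) : strlen(name);
}

/*
 * True if the symbol requested by the BPF program matches the kallsyms
 * entry once the LTO-generated suffix is ignored.
 */
static bool ksym_name_matches(const char *kallsyms_name, const char *wanted)
{
	size_t len = base_name_len(kallsyms_name);

	return strlen(wanted) == len &&
	       strncmp(kallsyms_name, wanted, len) == 0;
}
```

Comparing lengths as well as bytes keeps a mere prefix such as 'bpf_link_fops' from matching a different symbol like 'bpf_link_fops1'.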
    
    Note that the kernel currently does not support GCC builds with LTO.
    
      [1] https://lore.kernel.org/bpf/[email protected]/
      [2] https://github.com/llvm/llvm-project/blob/release/18.x/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1714-L1719
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    c56e597
  41. selftests/bpf: Refactor some functions for kprobe_multi_test

    Refactor some functions in kprobe_multi_test.c to extract
    some helper functions that will be used in later patches
    to avoid code duplication.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    d132064
  42. selftests/bpf: Refactor trace helper func load_kallsyms_local()

    Refactor trace helper function load_kallsyms_local() such that
    it invokes a common function with a compare function as input.
    The common function will be used later for other local functions.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    9475dac
  43. selftests/bpf: Add {load,search}_kallsyms_custom_local()

    These two functions allow selftests to do loading/searching
    kallsyms based on their specific compare functions.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    d1f0258
  44. selftests/bpf: Fix kprobe_multi_bench_attach test failure with LTO ke…

    …rnel
    
    In my locally built clang LTO kernel (enabling CONFIG_LTO and
    CONFIG_LTO_CLANG_THIN), the kprobe_multi_bench_attach/kernel subtest
    failed like:
      test_kprobe_multi_bench_attach:PASS:get_syms 0 nsec
      test_kprobe_multi_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec
      libbpf: prog 'test_kprobe_empty': failed to attach: No such process
      test_kprobe_multi_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts unexpected error: -3
      #117/1   kprobe_multi_bench_attach/kernel:FAIL
    
    Multiple symbols in /sys/kernel/debug/tracing/available_filter_functions
    are renamed in /proc/kallsyms due to cross-file inlining. One example is the
      static function __access_remote_vm in mm/memory.c.
    In a non-LTO kernel, we have the following call stack:
      ptrace_access_vm (global, kernel/ptrace.c)
        access_remote_vm (global, mm/memory.c)
          __access_remote_vm (static, mm/memory.c)
    
    With an LTO kernel, access_remote_vm() may be inlined into
    ptrace_access_vm(), so we end up with the following call stack:
      ptrace_access_vm (global, kernel/ptrace.c)
        __access_remote_vm (static, mm/memory.c)
    The compiler renames __access_remote_vm to __access_remote_vm.llvm.<hash>
    to prevent potential name collision.
    
    The kernel's bpf_kprobe_multi_link_attach() and ftrace_lookup_symbols()
    resolve addresses through /proc/kallsyms, hence the current test fails
    with an LTO kernel.
    
    This patch consults /proc/kallsyms to find the corresponding entry
    for each ksym, which solves the issue.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    9edaafa
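The lookup described in this commit, mapping a name from available_filter_functions to its possibly renamed /proc/kallsyms spelling, might look roughly like the following. It is a minimal sketch, not the patch's code: `ksym_name_matches()` and `find_kallsyms_name()` are hypothetical helpers, the kallsyms names are assumed to be already loaded into an array, and the scan is linear for brevity.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if kallsyms_name is exactly `name`, or `name` followed
 * by a ".llvm.<hash>" suffix as added by the compiler under LTO.
 */
static bool ksym_name_matches(const char *name, const char *kallsyms_name)
{
	size_t len = strlen(name);

	if (strncmp(name, kallsyms_name, len) != 0)
		return false;
	return kallsyms_name[len] == '\0' ||
	       strncmp(kallsyms_name + len, ".llvm.", 6) == 0;
}

/* Return the kallsyms spelling to hand to the kernel, or NULL. */
static const char *find_kallsyms_name(const char *name,
				      const char *const *kallsyms, size_t cnt)
{
	for (size_t i = 0; i < cnt; i++)
		if (ksym_name_matches(name, kallsyms[i]))
			return kallsyms[i];
	return NULL;
}
```

Checking the character right after the prefix keeps a short name such as `__access` from matching `__access_remote_vm`.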
  45. selftests/bpf: Add a kprobe_multi subtest to use addrs instead of syms

    Get addrs directly from available_filter_functions_addrs and
    send them to the kernel during kprobe_multi_attach. This avoids
    consulting /proc/kallsyms. Since available_filter_functions_addrs
    was only introduced recently (in 6.5), the test is skipped
    if the kernel does not support it.
    
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Yonghong Song authored and Alexei Starovoitov committed Mar 29, 2024
    6302bde
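Reading addresses instead of names could be sketched as follows. The sketch assumes each line of available_filter_functions_addrs has the shape `<hex address> <function name>`; `parse_addr_line()`, `addrs_file_supported()` and `SYM_NAME_MAX` are hypothetical names introduced for illustration, not taken from the selftest.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define SYM_NAME_MAX 256

/* A test can skip itself when the file is absent (kernels before 6.5). */
static bool addrs_file_supported(const char *path)
{
	return access(path, R_OK) == 0;
}

/*
 * Parse one "<hex address> <name>" line (assumed format).
 * Returns 0 on success, -1 if the line does not have that shape.
 */
static int parse_addr_line(const char *line, unsigned long long *addr,
			   char name[SYM_NAME_MAX])
{
	if (sscanf(line, "%llx %255s", addr, name) != 2)
		return -1;
	return 0;
}
```

The collected addresses would then be passed to the kprobe multi attach call in place of symbol names, so no name lookup is needed on either side.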
  46. Merge branch 'bpf-fix-a-couple-of-test-failures-with-lto-kernel'

    Yonghong Song says:
    
    ====================
    bpf: Fix a couple of test failures with LTO kernel
    
    With an LTO kernel built with clang, on an earlier kernel version,
    I encountered two test failures, ksyms and kprobe_multi_bench_attach/kernel.
    Now with the latest bpf-next, only kprobe_multi_bench_attach/kernel fails.
    But it is possible that the ksyms selftest may fail again in the future.
    
    Both test failures are caused by static variables/functions being renamed
    due to cross-file inlining. For the ksyms failure, the solution is
    to strip .llvm.<hash> suffixes from symbols in /proc/kallsyms before
    comparing against the ksym in the bpf program.
    For the kprobe_multi_bench_attach/kernel failure, the solution is
    to either provide names from /proc/kallsyms to the kernel or
    ignore those names that have a .llvm.<hash> suffix, since the kernel's
    sym name comparison is against /proc/kallsyms.
    
    Please see the individual patches for details.
    
    Changelogs:
      v2 -> v3:
        - no need to check the config file, directly do strstr with '.llvm.'.
        - for kprobe_multi_bench with syms, instead of skipping the syms,
          consult /proc/kallsyms to find the corresponding names.
        - add a test that passes addrs to the kernel for kprobe
          multi attach.
      v1 -> v2:
        - Let libbpf handle .llvm.<hash> suffixes since it may impact
          bpf program ksym.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Alexei Starovoitov committed Mar 29, 2024
    e478cf2
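The suffix stripping mentioned in the cover letter for the ksyms case can be illustrated with a few lines of C. This is a sketch only: `strip_llvm_suffix()` is a hypothetical helper name, and the changelog's "strstr with '.llvm.'" is the only detail taken from the series.

```c
#include <string.h>

/*
 * Truncate a symbol name at the first ".llvm." so that
 * "foo.llvm.<hash>" compares equal to "foo". Names without
 * the suffix are left untouched.
 */
static void strip_llvm_suffix(char *name)
{
	char *p = strstr(name, ".llvm.");

	if (p)
		*p = '\0';
}
```

Because the name is modified in place, a caller that still needs the original /proc/kallsyms spelling would have to work on a copy.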
  47. 368ada6
  48. fe1f6ad