Pull in bpf/for-next #169

htejun · 2024-03-29T02:15:12Z

e478cf2 ("Merge branch 'bpf-fix-a-couple-of-test-failures-with-lto-kernel'")

…inux/kernel/git/rw/ubifs Pull UBI and UBIFS updates from Richard Weinberger: "UBI: - Add Zhihao Cheng as reviewer - Attach via device tree - Add NVMEM layer - Various fastmap related fixes UBIFS: - Add Zhihao Cheng as reviewer - Convert to folios - Various fixes (memory leaks in error paths, function prototypes)" * tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (34 commits) mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems mtd: ubi: provide NVMEM layer over UBI volumes mtd: ubi: populate ubi volume fwnode mtd: ubi: introduce pre-removal notification for UBI volumes mtd: ubi: attach from device tree mtd: ubi: block: use notifier to create ubiblock from parameter dt-bindings: mtd: ubi-volume: allow UBI volumes to provide NVMEM dt-bindings: mtd: add basic bindings for UBI ubifs: Queue up space reservation tasks if retrying many times ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path ubifs: dbg_check_idx_size: Fix kmemleak if loading znode failed ubi: Correct the number of PEBs after a volume resize failure ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130 ubi: correct the calculation of fastmap size ubifs: Remove unreachable code in dbg_check_ltab_lnum ubifs: fix function pointer cast warnings ubifs: fix sort function prototype ubi: Check for too small LEB size in VTBL code MAINTAINERS: Add Zhihao Cheng as UBI/UBIFS reviewer ubifs: Convert populate_page() to take a folio ...

…rnel/git/ukleinek/linux Pull siox updates from Uwe Kleine-König: "This reworks how siox device registration works yielding a saner API. This allows us to simplify the gpio bus driver using two new devm functions" * tag 'siox/for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: siox: bus-gpio: Simplify using devm_siox_* functions siox: Provide a devm variant of siox_master_register() siox: Provide a devm variant of siox_master_alloc() siox: Don't pass the reference on a master in siox_master_register()

Cross-merge networking fixes after downstream PR. Signed-off-by: Jakub Kicinski <[email protected]>

…top.org/drm/misc/kernel into drm-next Short summary of fixes pull: core: - fix rounding in drm_fixp2int_round() bridge: - fix documentation for DRM_BRIDGE_OP_EDID nouveau: - don't check devinit disable on GSP sun4i: - fix 64-bit division on 32-bit architectures tests: - fix dependency on DRM_KMS_HELPER Signed-off-by: Dave Airlie <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

…abelloni/linux Pull RTC updates from Alexandre Belloni: "Subsytem: - rtc_class is now const Drivers: - ds1511: cleanup, set date and time range and alarm offset limit - max31335: fix interrupt handler - pcf8523: improve suspend support" * tag 'rtc-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (28 commits) MAINTAINER: Include linux-arm-msm for Qualcomm RTC patches dt-bindings: rtc: zynqmp: Add support for Versal/Versal NET SoCs rtc: class: make rtc_class constant dt-bindings: rtc: abx80x: Improve checks on trickle charger constraints MAINTAINERS: adjust file entry in ARM/Mediatek RTC DRIVER rtc: nct3018y: fix possible NULL dereference rtc: max31335: fix interrupt status reg rtc: mt6397: select IRQ_DOMAIN instead of depending on it dt-bindings: rtc: abx80x: convert to yaml rtc: m41t80: Use the unified property API get the wakeup-source property dt-bindings: at91rm9260-rtt: add sam9x7 compatible dt-bindings: rtc: convert MT7622 RTC to the json-schema dt-bindings: rtc: convert MT2717 RTC to the json-schema rtc: pcf8523: add suspend handlers for alarm IRQ rtc: ds1511: set alarm offset limit rtc: ds1511: set range rtc: ds1511: drop inline/noinline hints rtc: ds1511: rename pdata rtc: ds1511: implement ds1511_rtc_read_alarm properly rtc: ds1511: remove partial alarm support ...

…git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Re-instate the CPUMASK_OFFSTACK option for arm64 when NR_CPUS > 256. The bug that led to the initial revert was the cpufreq-dt code not using zalloc_cpumask_var(). - Make the STARFIVE_STARLINK_PMU config option depend on 64BIT to prevent compile-test failures on 32-bit architectures due to missing writeq(). * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: perf: starfive: fix 64-bit only COMPILE_TEST condition ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512

…p.org/agd5f/linux into drm-next amd-drm-fixes-6.9-2024-03-21: amdgpu: - Freesync fixes - UAF IOCTL fixes - Fix mmhub client ID mapping - IH 7.0 fix - DML2 fixes - VCN 4.0.6 fix - GART bind fix - GPU reset fix - SR-IOV fix - OD table handling fixes - Fix TA handling on boards without display hardware - DML1 fix - ABM fix - eDP panel fix - DPPCLK fix - HDCP fix - Revert incorrect error case handling in ioremap - VPE fix - HDMI fixes - SDMA 4.4.2 fix - Other misc fixes amdkfd: - Fix duplicate BO handling in process restore Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

…/kernel Pull drm fixes from Dave Airlie: "Fixes from the last week (or 3 weeks in amdgpu case), after amdgpu, it's xe and nouveau then a few scattered core fixes. core: - fix rounding in drm_fixp2int_round() bridge: - fix documentation for DRM_BRIDGE_OP_EDID sun4i: - fix 64-bit division on 32-bit architectures tests: - fix dependency on DRM_KMS_HELPER probe-helper: - never return negative values from .get_modes() plus driver fixes xe: - invalidate userptr vma on page pin fault - fail early on sysfs file creation error - skip VMA pinning on xe_exec if no batches nouveau: - clear bo resource bus after eviction - documentation fixes - don't check devinit disable on GSP amdgpu: - Freesync fixes - UAF IOCTL fixes - Fix mmhub client ID mapping - IH 7.0 fix - DML2 fixes - VCN 4.0.6 fix - GART bind fix - GPU reset fix - SR-IOV fix - OD table handling fixes - Fix TA handling on boards without display hardware - DML1 fix - ABM fix - eDP panel fix - DPPCLK fix - HDCP fix - Revert incorrect error case handling in ioremap - VPE fix - HDMI fixes - SDMA 4.4.2 fix - Other misc fixes amdkfd: - Fix duplicate BO handling in process restore" * tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel: (50 commits) drm/amdgpu/pm: Don't use OD table on Arcturus drm/amdgpu: drop setting buffer funcs in sdma442 drm/amd/display: Fix noise issue on HDMI AV mute drm/amd/display: Revert Remove pixle rate limit for subvp Revert "drm/amdgpu/vpe: don't emit cond exec command under collaborate mode" Revert "drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()" drm/amd/display: Add a dc_state NULL check in dc_state_release drm/amd/display: Return the correct HDCP error code drm/amd/display: Implement wait_for_odm_update_pending_complete drm/amd/display: Lock all enabled otg pipes even with no planes drm/amd/display: Amend coasting vtotal for replay low hz drm/amd/display: Fix idle check for shared firmware state drm/amd/display: Update odm when ODM combine is changed on an otg master pipe with no plane drm/amd/display: Init DPPCLK from SMU on dcn32 drm/amd/display: Add monitor patch for specific eDP drm/amd/display: Allow dirty rects to be sent to dmub when abm is active drm/amd/display: Override min required DCFCLK in dml1_validate drm/amdgpu: Bypass display ta if display hw is not available drm/amdgpu: correct the KGQ fallback message drm/amdgpu/pm: Check the validity of overdiver power limit ...

…ench/cifs-2.6 Pull smb client fixes from Steve French: - Various get_inode_info_fixes - Fix for querying xattrs of cached dirs - Four minor cleanup fixes (including adding some header corrections and a missing flag) - Performance improvement for deferred close - Two query interface fixes * tag '6.9-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: smb311: additional compression flag defined in updated protocol spec smb311: correct incorrect offset field in compression header cifs: Move some extern decls from .c files to .h cifs: remove redundant variable assignment cifs: fixes for get_inode_info cifs: open_cached_dir(): add FILE_READ_EA to desired access cifs: reduce warning log level for server not advertising interfaces cifs: make sure server interfaces are requested only for SMB3+ cifs: defer close file handles having RH lease

strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. There is a _nearly_ identical implementation of fill_psinfo present in binfmt_elf.c -- except that one uses get_task_comm over strncpy(). Let's mirror that in binfmt_elf_fdpic.c Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: KSPP/linux#90 Cc: <[email protected]> Signed-off-by: Justin Stitt <[email protected]> Link: https://lore.kernel.org/r/20240321-strncpy-fs-binfmt_elf_fdpic-c-v2-1-0b6daec6cc56@google.com Signed-off-by: Kees Cook <[email protected]>

crashkernel reservation failed on a Thinkpad t440s laptop recently. Actually the memblock reservation succeeded, but later insert_resource() failed. Test steps: kexec load -> /* make sure add crashkernel param eg. crashkernel=160M */ kexec reboot -> dmesg|grep "crashkernel reserved"; crashkernel memory range like below reserved successfully: 0x00000000d0000000 - 0x00000000da000000 But no such "Crash kernel" region in /proc/iomem The background story: Currently the E820 code reserves setup_data regions for both the current kernel and the kexec kernel, and it inserts them into the resources list. Before the kexec kernel reboots nobody passes the old setup_data, and kexec only passes fresh SETUP_EFI/SETUP_IMA/SETUP_RNG_SEED if needed. Thus the old setup data memory is not used at all. Due to old kernel updates the kexec e820 table as well so kexec kernel sees them as E820_TYPE_RESERVED_KERN regions, and later the old setup_data regions are inserted into resources list in the kexec kernel by e820__reserve_resources(). Note, due to no setup_data is passed in for those old regions they are not early reserved (by function early_reserve_memory), and the crashkernel memblock reservation will just treat them as usable memory and it could reserve the crashkernel region which overlaps with the old setup_data regions. And just like the bug I noticed here, kdump insert_resource failed because e820__reserve_resources has added the overlapped chunks in /proc/iomem already. Finally, looking at the code, the old setup_data regions are not used at all as no setup_data is passed in by the kexec boot loader. Although something like SETUP_PCI etc could be needed, kexec should pass the info as new setup_data so that kexec kernel can take care of them. This should be taken care of in other separate patches if needed. Thus drop the useless buggy code here. Signed-off-by: Dave Young <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Jiri Bohac <[email protected]> Cc: Eric DeVolder <[email protected]> Cc: Baoquan He <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Link: https://lore.kernel.org/r/[email protected]

syzbot reported the following uninit-value access issue [1][2]: nci_rx_work() parses and processes received packet. When the payload length is zero, each message type handler reads uninitialized payload and KMSAN detects this issue. The receipt of a packet with a zero-size payload is considered unexpected, and therefore, such packets should be silently discarded. This patch resolved this issue by checking payload size before calling each message type handler codes. Fixes: 6a2968a ("NFC: basic NCI protocol implementation") Reported-and-tested-by: [email protected] Reported-and-tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=7ea9413ea6749baf5574 [1] Closes: https://syzkaller.appspot.com/bug?extid=29b5ca705d2e0f4a44d2 [2] Signed-off-by: Ryosuke Yasuoka <[email protected]> Reviewed-by: Jeremy Cline <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: David S. Miller <[email protected]>

…xt() Since: 7ee18d6 ("x86/power: Make restore_processor_context() sane") kmemleak reports this issue: unreferenced object 0xf68241e0 (size 32): comm "swapper/0", pid 1, jiffies 4294668610 (age 68.432s) hex dump (first 32 bytes): 00 cc cc cc 29 10 01 c0 00 00 00 00 00 00 00 00 ....)........... 00 42 82 f6 cc cc cc cc cc cc cc cc cc cc cc cc .B.............. backtrace: [<461c1d50>] __kmem_cache_alloc_node+0x106/0x260 [<ea65e13b>] __kmalloc+0x54/0x160 [<c3858cd2>] msr_build_context.constprop.0+0x35/0x100 [<46635aff>] pm_check_save_msr+0x63/0x80 [<6b6bb938>] do_one_initcall+0x41/0x1f0 [<3f3add60>] kernel_init_freeable+0x199/0x1e8 [<3b538fde>] kernel_init+0x1a/0x110 [<938ae2b2>] ret_from_fork+0x1c/0x28 Which is a false positive. Reproducer: - Run rsync of whole kernel tree (multiple times if needed). - start a kmemleak scan - Note this is just an example: a lot of our internal tests hit these. The root cause is similar to the fix in: b0b592c x86/pm: Fix false positive kmemleak report in msr_build_context() ie. the alignment within the packed struct saved_context which has everything unaligned as there is only "u16 gs;" at start of struct where in the past there were four u16 there thus aligning everything afterwards. The issue is with the fact that Kmemleak only searches for pointers that are aligned (see how pointers are scanned in kmemleak.c) so when the struct members are not aligned it doesn't see them. Testing: We run a lot of tests with our CI, and after applying this fix we do not see any kmemleak issues any more whilst without it we see hundreds of the above report. From a single, simple test run consisting of 416 individual test cases on kernel 5.10 x86 with kmemleak enabled we got 20 failures due to this, which is quite a lot. With this fix applied we get zero kmemleak related failures. Fixes: 7ee18d6 ("x86/power: Make restore_processor_context() sane") Signed-off-by: Anton Altaparmakov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Cc: [email protected] Cc: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/r/[email protected]

Read from an unsafe address with copy_from_kernel_nofault() in arch_adjust_kprobe_addr() because this function is used before checking the address is in text or not. Syzcaller bot found a bug and reported the case if user specifies inaccessible data area, arch_adjust_kprobe_addr() will cause a kernel panic. [ mingo: Clarified the comment. ] Fixes: cc66bb9 ("x86/ibt,kprobes: Cure sym+0 equals fentry woes") Reported-by: Qiang Zhang <[email protected]> Tested-by: Jinghao Jia <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/171042945004.154897.2221804961882915806.stgit@devnote2

This reverts commit 16ab7cb because it broke iwd. iwd uses the KEYCTL_PKEY_* UAPIs via its dependency libell, and apparently it is relying on SHA-1 signature support. These UAPIs are fairly obscure, and their documentation does not mention which algorithms they support. iwd really should be using a properly supported userspace crypto library instead. Regardless, since something broke we have to revert the change. It may be possible that some parts of this commit can be reinstated without breaking iwd (e.g. probably the removal of MODULE_SIG_SHA1), but for now this just does a full revert to get things working again. Reported-by: Karel Balej <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Cc: Dimitri John Ledkov <[email protected]> Signed-off-by: Eric Biggers <[email protected]> Tested-by: Karel Balej <[email protected]> Signed-off-by: Herbert Xu <[email protected]>

If nr_cpus < nr_iaa, the calculated cpus_per_iaa will be 0, which causes a divide-by-0 in rebalance_wq_table(). Make sure cpus_per_iaa is 1 in that case, and also in the nr_iaa == 0 case, even though cpus_per_iaa is never used if nr_iaa == 0, for paranoia. Cc: <[email protected]> # v6.8+ Reported-by: Jerry Snitselaar <[email protected]> Signed-off-by: Tom Zanussi <[email protected]> Signed-off-by: Herbert Xu <[email protected]>

…r higher address Following warning is sometimes observed while booting my servers: [ 3.594838] DMA: preallocated 4096 KiB GFP_KERNEL pool for atomic allocations [ 3.602918] swapper/0: page allocation failure: order:10, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0-1 ... [ 3.851862] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocation If 'nokaslr' boot option is set, the warning always happens. On x86, ZONE_DMA is small zone at the first 16MB of physical address space. When this problem happens, most of that space seems to be used by decompressed kernel. Thereby, there is not enough space at DMA_ZONE to meet the request of DMA pool allocation. The commit 2f77465 ("x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR") tried to fix this problem by introducing lower bound of allocation. But the fix is not complete. efi_random_alloc() allocates pages by following steps. 1. Count total available slots ('total_slots') 2. Select a slot ('target_slot') to allocate randomly 3. Calculate a starting address ('target') to be included target_slot 4. Allocate pages, which starting address is 'target' In step 1, 'alloc_min' is used to offset the starting address of memory chunk. But in step 3 'alloc_min' is not considered at all. As the result, 'target' can be miscalculated and become lower than 'alloc_min'. When KASLR is disabled, 'target_slot' is always 0 and the problem happens everytime if the EFI memory map of the system meets the condition. Fix this problem by calculating 'target' considering 'alloc_min'. Cc: [email protected] Cc: Tom Englund <[email protected]> Cc: [email protected] Fixes: 2f77465 ("x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR") Signed-off-by: Kazuma Kondo <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>

…ux/kernel/git/wsa/linux Pull more i2c updates from Wolfram Sang: "Some more I2C updates after the dependencies have been merged now. Plus a DT binding fix" * tag 'i2c-for-6.9-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: dt-bindings: i2c: qcom,i2c-cci: Fix OV7251 'data-lanes' entries i2c: muxes: pca954x: Allow sharing reset GPIO i2c: nomadik: sort includes i2c: nomadik: support Mobileye EyeQ5 I2C controller i2c: nomadik: fetch i2c-transfer-timeout-us property from devicetree i2c: nomadik: replace jiffies by ktime for FIFO flushing timeout i2c: nomadik: support short xfer timeouts using waitqueue & hrtimer i2c: nomadik: use bitops helpers i2c: nomadik: simplify IRQ masking logic i2c: nomadik: rename private struct pointers from dev to priv dt-bindings: i2c: nomadik: add mobileye,eyeq5-i2c bindings and example

…kernel/git/tiwai/sound Pull more sound fixes from Takashi Iwai: "The remaining fixes for 6.9-rc1 that have been gathered in this week. More about ASoC at this time (one long-standing fix for compress offload, SOF, AMD ACP, Rockchip, Cirrus and tlv320 stuff) while another regression fix in ALSA core and a couple of HD-audio quirks as usual are included" * tag 'sound-fix2-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: control: Fix unannotated kfree() cleanup ALSA: hda/realtek: Add quirks for some Clevo laptops ALSA: hda/realtek: Add quirk for HP Spectre x360 14 eu0000 ALSA: hda/realtek: fix the hp playback volume issue for LG machines ASoC: soc-compress: Fix and add DPCM locking ASoC: SOF: amd: Skip IRAM/DRAM size modification for Steam Deck OLED ASoC: SOF: amd: Move signed_fw_image to struct acp_quirk_entry ASoC: amd: yc: Revert "add new YC platform variant (0x63) support" ASoC: amd: yc: Revert "Fix non-functional mic on Lenovo 21J2" ASoC: soc-core.c: Skip dummy codec when adding platforms ASoC: rockchip: i2s-tdm: Fix inaccurate sampling rates ASoC: dt-bindings: cirrus,cs42l43: Fix 'gpio-ranges' schema ASoC: amd: yc: Fix non-functional mic on ASUS M7600RE ASoC: tlv320adc3xxx: Don't strip remove function when driver is builtin

…ub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "One fix that came in during the merge window, fixing a problem with bootstrapping the state of exclusive regulators which have a parent regulator" * tag 'regulator-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: Propagate the regulator state in case of exclusive get

…/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A small collection of fixes that came in since the merge window. Most of it is relatively minor driver specific fixes, there's also fixes for error handling with SPI flash devices and a fix restoring delay control functionality for non-GPIO chip selects managed by the core" * tag 'spi-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-mt65xx: Fix NULL pointer access in interrupt handler spi: docs: spidev: fix echo command format spi: spi-imx: fix off-by-one in mx51 CPU mode burst length spi: lm70llp: fix links in doc and comments spi: Fix error code checking in spi_mem_exec_op() spi: Restore delays for non-GPIO chip select spi: lpspi: Avoid potential use-after-free in probe()

… bench With glibc 2.28, selftests compilation fails for benchs/bench_trigger.c: benchs/bench_trigger.c: In function ‘inc_counter’: benchs/bench_trigger.c:25:23: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration] 25 | tid = gettid(); | ^~~~~~ | getgid cc1: all warnings being treated as errors It appears support for the gettid() wrapper is variable across glibc versions, so may be safer to use syscall(SYS_gettid) instead. Fixes: 520fad2 ("selftests/bpf: scale benchmark counting by using per-CPU counters") Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

Some distros seem to enable the -fcf-protection=branch by default, which breaks our setup on first instruction of uprobe trigger functions and place there endbr64 instruction. Marking them with nocf_check attribute to skip that. Ignoring unknown attribute warning in gcc for bench objects, because nocf_check can be used only when -fcf-protection=branch is enabled, otherwise we get a warning and break compilation. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

…ernel/git/deller/linux-fbdev Pull fbdev updates from Helge Deller: - Allow console fonts up to 64x128 pixels (Samuel Thibault) - Prevent division-by-zero in fb monitor code (Roman Smirnov) - Drop Renesas ARM platforms from Mobile LCDC framebuffer driver (Geert Uytterhoeven) - Various code cleanups in viafb, uveafb and mb862xxfb drivers by Aleksandr Burakov, Li Zhijian and Michael Ellerman * tag 'fbdev-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: panel-tpo-td043mtea1: Convert sprintf() to sysfs_emit() fbmon: prevent division by zero in fb_videomode_from_videomode() fbcon: Increase maximum font width x height to 64 x 128 fbdev: viafb: fix typo in hw_bitblt_1 and hw_bitblt_2 fbdev: mb862xxfb: Fix defined but not used error fbdev: uvesafb: Convert sprintf/snprintf to sysfs_emit fbdev: Restrict FB_SH_MOBILE_LCDC to SuperH

…l/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Add objtool support for LoongArch - Add ORC stack unwinder support for LoongArch - Add kernel livepatching support for LoongArch - Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig - Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig - Some bug fixes and other small changes * tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch/crypto: Clean up useless assignment operations LoongArch: Define the __io_aw() hook as mmiowb() LoongArch: Remove superfluous flush_dcache_page() definition LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h LoongArch: Change __my_cpu_offset definition to avoid mis-optimization LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig LoongArch: Add kernel livepatching support LoongArch: Add ORC stack unwinder support objtool: Check local label in read_unwind_hints() objtool: Check local label in add_dead_ends() objtool/LoongArch: Enable orc to be built objtool/x86: Separate arch-specific and generic parts objtool/LoongArch: Implement instruction decoder objtool/LoongArch: Enable objtool to be built

…inux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for various vector-accelerated crypto routines - Hibernation is now enabled for portable kernel builds - mmap_rnd_bits_max is larger on systems with larger VAs - Support for fast GUP - Support for membarrier-based instruction cache synchronization - Support for the Andes hart-level interrupt controller and PMU - Some cleanups around unaligned access speed probing and Kconfig settings - Support for ACPI LPI and CPPC - Various cleanus related to barriers - A handful of fixes * tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits) riscv: Fix syscall wrapper for >word-size arguments crypto: riscv - add vector crypto accelerated AES-CBC-CTS crypto: riscv - parallelize AES-CBC decryption riscv: Only flush the mm icache when setting an exec pte riscv: Use kcalloc() instead of kzalloc() riscv/barrier: Add missing space after ',' riscv/barrier: Consolidate fence definitions riscv/barrier: Define RISCV_FULL_BARRIER riscv/barrier: Define __{mb,rmb,wmb} RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ cpufreq: Move CPPC configs to common Kconfig and add RISC-V ACPI: RISC-V: Add CPPC driver ACPI: Enable ACPI_PROCESSOR for RISC-V ACPI: RISC-V: Add LPI driver cpuidle: RISC-V: Move few functions to arch/riscv riscv: Introduce set_compat_task() in asm/compat.h riscv: Introduce is_compat_thread() into compat.h riscv: add compile-time test into is_compat_task() riscv: Replace direct thread flag check with is_compat_task() riscv: Improve arch_get_mmap_end() macro ...

…s-linux Pull xfs fixes from Chandan Babu: - Fix invalid pointer dereference by initializing xmbuf before tracepoint function is invoked - Use memalloc_nofs_save() when inserting into quota radix tree * tag 'xfs-6.9-merge-9' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: quota radix tree allocations need to be NOFS on insert xfs: fix dev_t usage in xmbuf tracepoints

Pull ceph updates from Ilya Dryomov: "A patch to minimize blockage when processing very large batches of dirty caps and two fixes to better handle EOF in the face of multiple clients performing reads and size-extending writes at the same time" * tag 'ceph-for-6.9-rc1' of https://github.com/ceph/ceph-client: ceph: set correct cap mask for getattr request for read ceph: stop copying to iter at EOF on sync reads ceph: remove SLAB_MEM_SPREAD flag usage ceph: break the check delayed cap loop every 5s

…rnel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix a memory leak in DM integrity recheck code that was added during the 6.9 merge. Also fix the recheck code to ensure it issues bios with proper alignment. - Fix DM snapshot's dm_exception_table_exit() to schedule while handling an large exception table during snapshot device shutdown. * tag 'for-6.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-integrity: align the outgoing bio in integrity_recheck dm snapshot: fix lockup in dm_exception_table_exit dm-integrity: fix a memory leak when rechecking the data

Pull more io_uring updates from Jens Axboe: "One patch just missed the initial pull, the rest are either fixes or small cleanups that make our life easier for the next kernel: - Fix a potential leak in error handling of pinned pages, and clean it up (Gabriel, Pavel) - Fix an issue with how read multishot returns retry (me) - Fix a problem with waitid/futex removals, if we hit the case of needing to remove all of them at exit time (me) - Fix for a regression introduced in this merge window, where we don't always have sr->done_io initialized if the ->prep_async() path is used (me) - Fix for SQPOLL setup error handling (me) - Fix for a poll removal request being delayed (Pavel) - Rename of a struct member which had a confusing name (Pavel)" * tag 'io_uring-6.9-20240322' of git://git.kernel.dk/linux: io_uring/sqpoll: early exit thread if task_context wasn't allocated io_uring: clear opcode specific data for an early failure io_uring/net: ensure async prep handlers always initialize ->done_io io_uring/waitid: always remove waitid entry for cancel all io_uring/futex: always remove futex entry for cancel all io_uring: fix poll_remove stalled req completion io_uring: Fix release of pinned pages when __io_uaddr_map fails io_uring/kbuf: rename is_mapped io_uring: simplify io_pages_free io_uring: clean rings on NO_MMAP alloc fail io_uring/rw: return IOU_ISSUE_SKIP_COMPLETE for multishot retry io_uring: don't save/restore iowait state

The driver used the DT node of the device itself when registering the MDIO bus. While this works, it creates a problem: it forces any MDIO bus properties to also be set on the devices DT node. This mixes the properties of two distinctly different things and is confusing. This change adds support for an optional mdio node to be defined as a child to the device DT node. The child node can then be used to describe MDIO bus properties that the MDIO core can act on when registering the bus. If no mdio child node is found the driver fallback to the old behavior and register the MDIO bus using the device DT node. This change is backward compatible with old bindings in use. Signed-off-by: Niklas Söderlund <[email protected]> Reviewed-by: Sergey Shtylyov <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>

Niklas Söderlund says: ==================== ravb: Support describing the MDIO bus This series adds support to the binding and driver of the Renesas Ethernet AVB to described the MDIO bus. Currently the driver uses the OF node of the device itself when registering the MDIO bus. This forces any MDIO bus properties the MDIO core should react on to be set on the device OF node. This is confusing and none of the MDIO bus properties are described in the Ethernet AVB bindings. Patch 1/2 extends the bindings with an optional mdio child-node to the device that can be used to contain the MDIO bus settings. While patch 2/2 changes the driver to use this node (if present) when registering the MDIO bus. If the new optional mdio child-node is not present the driver fallback to the old behavior and uses the device OF node like before. This change is fully backward compatible with existing usage of the bindings. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>

Extend the bpf_fib_lookup() helper by making it to utilize mark if the BPF_FIB_LOOKUP_MARK flag is set. In order to pass the mark the four bytes of struct bpf_fib_lookup are used, shared with the output-only smac/dmac fields. Signed-off-by: Anton Protopopov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: David Ahern <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

This patch extends the fib_lookup test suite by adding a few test cases for each IP family to test the new BPF_FIB_LOOKUP_MARK flag to the bpf_fib_lookup: * Test destination IP address selection with and without a mark and/or the BPF_FIB_LOOKUP_MARK flag set Signed-off-by: Anton Protopopov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

The struct bpf_fib_lookup should not grow outside of its 64 bytes. Add a static assert to validate this. Suggested-by: David Ahern <[email protected]> Signed-off-by: Anton Protopopov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

BPF verifier emits "unknown func" message when given BPF program type does not support BPF helper. This message may be confusing for users, as important context that helper is unknown only to current program type is not provided. This patch changes message to "program of this type cannot use helper " and aligns dependent code in libbpf and tests. Any suggestions on improving/changing this message are welcome. Signed-off-by: Mykyta Yatsenko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Quentin Monnet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Use the well defined helper sizeof_field() to calculate the size of a struct member, instead of doing custom calculations. Signed-off-by: Haiyue Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Rename uprobe-base to more precise usermode-count (it will match other baseline-like benchmarks, kernel-count and syscall-count). Also use BENCH_TRIG_USERMODE() macro to define all usermode-based triggering benchmarks, which include usermode-count and uprobe/uretprobe benchmarks. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Existing kprobe/fentry triggering benchmarks have 1-to-1 mapping between one syscall execution and BPF program run. While we use a fast get_pgid() syscall, syscall overhead can still be non-trivial. This patch adds kprobe/fentry set of benchmarks significantly amortizing the cost of syscall vs actual BPF triggering overhead. We do this by employing BPF_PROG_TEST_RUN command to trigger "driver" raw_tp program which does a tight parameterized loop calling cheap BPF helper (bpf_get_numa_node_id()), to which kprobe/fentry programs are attached for benchmarking. This way 1 bpf() syscall causes N executions of BPF program being benchmarked. N defaults to 100, but can be adjusted with --trig-batch-iters CLI argument. For comparison we also implement a new baseline program that instead of triggering another BPF program just does N atomic per-CPU counter increments, establishing the limit for all other types of program within this batched benchmarking setup. Taking the final set of benchmarks added in this patch set (including tp/raw_tp/fmodret, added in later patch), and keeping for now "legacy" syscall-driven benchmarks, we can capture all triggering benchmarks in one place for comparison, before we remove the legacy ones (and rename xxx-batched into just xxx). $ benchs/run_bench_trigger.sh usermode-count : 79.500 ± 0.024M/s kernel-count : 49.949 ± 0.081M/s syscall-count : 9.009 ± 0.007M/s fentry-batch : 31.002 ± 0.015M/s fexit-batch : 20.372 ± 0.028M/s fmodret-batch : 21.651 ± 0.659M/s rawtp-batch : 36.775 ± 0.264M/s tp-batch : 19.411 ± 0.248M/s kprobe-batch : 12.949 ± 0.220M/s kprobe-multi-batch : 15.400 ± 0.007M/s kretprobe-batch : 5.559 ± 0.011M/s kretprobe-multi-batch: 5.861 ± 0.003M/s fentry-legacy : 8.329 ± 0.004M/s fexit-legacy : 6.239 ± 0.003M/s fmodret-legacy : 6.595 ± 0.001M/s rawtp-legacy : 8.305 ± 0.004M/s tp-legacy : 6.382 ± 0.001M/s kprobe-legacy : 5.528 ± 0.003M/s kprobe-multi-legacy : 5.864 ± 0.022M/s kretprobe-legacy : 3.081 ± 0.001M/s kretprobe-multi-legacy: 3.193 ± 0.001M/s Note how xxx-batch variants are measured with significantly higher throughput, even though it's exactly the same in-kernel overhead. As such, results can be compared only between benchmarks of the same kind (syscall vs batched): fentry-legacy : 8.329 ± 0.004M/s fentry-batch : 31.002 ± 0.015M/s kprobe-multi-legacy : 5.864 ± 0.022M/s kprobe-multi-batch : 15.400 ± 0.007M/s Note also that syscall-count is setting a theoretical limit for syscall-triggered benchmarks, while kernel-count is setting similar limits for batch variants. usermode-count is a happy and unachievable case of user space counting without doing any syscalls, and is mostly the measure of CPU speed for such a trivial benchmark. As was mentioned, tp/raw_tp/fmodret require kernel-side kfunc to produce similar benchmark, which we address in a separate patch. Note that run_bench_trigger.sh allows to override a list of benchmarks to run, which is very useful for performance work. Cc: Jiri Olsa <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Remove "legacy" benchmarks triggered by syscalls in favor of newly added in-kernel/batched benchmarks. Drop -batched suffix now as well. Next patch will restore "feature parity" by adding back tp/raw_tp/fmodret benchmarks based on in-kernel kfunc approach. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Instead of front-loading all possible benchmarking BPF programs for trigger benchmarks, explicitly specify which BPF programs are used by specific benchmark and load only it. This allows to be more flexible in supporting older kernels, where some program types might not be possible to load (e.g., those that rely on newly added kfunc). Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Add a simple bpf_modify_return_test_tp() kfunc, available to all program types, that is useful for various testing and benchmarking scenarios, as it allows to trigger most tracing BPF program types from BPF side, allowing to do complex testing and benchmarking scenarios. It is also attachable to for fmod_ret programs, making it a good and simple way to trigger fmod_ret program under test/benchmark. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Utilize bpf_modify_return_test_tp() kfunc to have a fast way to trigger tp/raw_tp/fmodret programs from another BPF program, which gives us comparable batched benchmarks to (batched) kprobe/fentry benchmarks. We don't switch kprobe/fentry batched benchmarks to this kfunc to make bench tool usable on older kernels as well. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Andrii Nakryiko says: ==================== bench: fast in-kernel triggering benchmarks Remove "legacy" triggering benchmarks which rely on syscalls (and thus syscall overhead is a noticeable part of benchmark, unfortunately). Replace them with faster versions that rely on triggering BPF programs in-kernel through another simple "driver" BPF program. See patch #2 with comparison results. raw_tp/tp/fmodret benchmarks required adding a simple kfunc in kernel to be able to trigger a simple tracepoint from BPF program (plus it is also allowed to be replaced by fmod_ret programs). This limits raw_tp/tp/fmodret benchmarks to new kernels only, but it keeps bench tool itself very portable and most of other benchmarks will still work on wide variety of kernels without the need to worry about building and deploying custom kernel module. See patches #5 and #6 for details. v1->v2: - move new TP closer to BPF test run code; - rename/move kfunc and register it for fmod_rets (Alexei); - limit --trig-batch-iters param to [1, 1000] (Alexei). ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

… htab Following the recent upgrade of one of our BPF programs, we encountered significant latency spikes affecting other applications running on the same host. After thorough investigation, we identified that these spikes were primarily caused by the prolonged duration required to free a non-preallocated htab with approximately 2 million keys. Notably, our kernel configuration lacks the presence of CONFIG_PREEMPT. In scenarios where kernel execution extends excessively, other threads might be starved of CPU time, resulting in latency issues across the system. To mitigate this, we've adopted a proactive approach by incorporating cond_resched() calls within the kernel code. This ensures that during lengthy kernel operations, the scheduler is invoked periodically to provide opportunities for other threads to execute. Signed-off-by: Yafang Shao <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

…c kfuncs The commit 7aae231 ("bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE") added CONFIG_DYNAMIC_FTRACE guard because pahole was only generating btf for ftrace-able functions. The ftrace filter had already been removed from pahole, so the CONFIG_DYNAMIC_FTRACE guard can be removed. The commit 569c484 ("bpf: Limit static tcp-cc functions in the .BTF_ids list to x86") has added CONFIG_X86 guard because it failed the powerpc arch which prepended a "." to the local static function, so "cubictcp_init" becomes ".cubictcp_init". "__bpf_kfunc" has been added to kfunc since then and it uses the __unused compiler attribute. There is an existing "__bpf_kfunc static u32 bpf_kfunc_call_test_static_unused_arg(u32 arg, u32 unused)" test in bpf_testmod.c to cover the static kfunc case. cross compile on ppc64 with CONFIG_DYNAMIC_FTRACE disabled: > readelf -s vmlinux | grep cubictcp_ 56938: c00000000144fd00 184 FUNC LOCAL DEFAULT 2 cubictcp_cwnd_event [<localentry>: 8] 56939: c00000000144fdb8 200 FUNC LOCAL DEFAULT 2 cubictcp_recalc_[...] [<localentry>: 8] 56940: c00000000144fe80 296 FUNC LOCAL DEFAULT 2 cubictcp_init [<localentry>: 8] 56941: c00000000144ffa8 228 FUNC LOCAL DEFAULT 2 cubictcp_state [<localentry>: 8] 56942: c00000000145008c 1908 FUNC LOCAL DEFAULT 2 cubictcp_cong_avoid [<localentry>: 8] 56943: c000000001450800 1644 FUNC LOCAL DEFAULT 2 cubictcp_acked [<localentry>: 8] > bpftool btf dump file vmlinux | grep cubictcp_ [51540] FUNC 'cubictcp_acked' type_id=38137 linkage=static [51541] FUNC 'cubictcp_cong_avoid' type_id=38122 linkage=static [51543] FUNC 'cubictcp_cwnd_event' type_id=51542 linkage=static [51544] FUNC 'cubictcp_init' type_id=9186 linkage=static [51545] FUNC 'cubictcp_recalc_ssthresh' type_id=35021 linkage=static [51547] FUNC 'cubictcp_state' type_id=38141 linkage=static The patch removed both config guards. Cc: Jiri Olsa <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

… kfuncs This patch adds a test to ensure all static tcp-cc kfuncs is visible to the struct_ops bpf programs. It is checked by successfully loading the struct_ops programs calling these tcp-cc kfuncs. This patch needs to enable the CONFIG_TCP_CONG_DCTCP and the CONFIG_TCP_CONG_BBR. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Replace CHECK with ASSERT macros for ksyms tests. This test failed earlier with clang lto kernel, but the issue is gone with latest code base. But replacing CHECK with ASSERT still improves code as ASSERT is preferred in selftests. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Currently libbpf_kallsyms_parse() function is declared as a global function but actually it is not a API and there is no external users in bpftool/bpf-selftests. So let us mark the function as static. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

With CONFIG_LTO_CLANG_THIN enabled, with some of previous version of kernel code base ([1]), I hit the following error: test_ksyms:PASS:kallsyms_fopen 0 nsec test_ksyms:FAIL:ksym_find symbol 'bpf_link_fops' not found #118 ksyms:FAIL The reason is that 'bpf_link_fops' is renamed to bpf_link_fops.llvm.8325593422554671469 Due to cross-file inlining, the static variable 'bpf_link_fops' in syscall.c is used by a function in another file. To avoid potential duplicated names, the llvm added suffix '.llvm.<hash>' ([2]) to 'bpf_link_fops' variable. Such renaming caused a problem in libbpf if 'bpf_link_fops' is used in bpf prog as a ksym but 'bpf_link_fops' does not match any symbol in /proc/kallsyms. To fix this issue, libbpf needs to understand that suffix '.llvm.<hash>' is caused by clang lto kernel and to process such symbols properly. With latest bpf-next code base built with CONFIG_LTO_CLANG_THIN, I cannot reproduce the above failure any more. But such an issue could happen with other symbols or in the future for bpf_link_fops symbol. For example, with my current kernel, I got the following from /proc/kallsyms: ffffffff84782154 d __func__.net_ratelimit.llvm.6135436931166841955 ffffffff85f0a500 d tk_core.llvm.726630847145216431 ffffffff85fdb960 d __fs_reclaim_map.llvm.10487989720912350772 ffffffff864c7300 d fake_dst_ops.llvm.54750082607048300 I could not easily create a selftest to test newly-added libbpf functionality with a static C test since I do not know which symbol is cross-file inlined. But based on my particular kernel, the following test change can run successfully. > diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms.c b/tools/testing/selftests/bpf/prog_tests/ksyms.c > index 6a86d1f07800..904a103f7b1d 100644 > --- a/tools/testing/selftests/bpf/prog_tests/ksyms.c > +++ b/tools/testing/selftests/bpf/prog_tests/ksyms.c > @@ -42,6 +42,7 @@ void test_ksyms(void) > ASSERT_EQ(data->out__bpf_link_fops, link_fops_addr, "bpf_link_fops"); > ASSERT_EQ(data->out__bpf_link_fops1, 0, "bpf_link_fops1"); > ASSERT_EQ(data->out__btf_size, btf_size, "btf_size"); > + ASSERT_NEQ(data->out__fake_dst_ops, 0, "fake_dst_ops"); > ASSERT_EQ(data->out__per_cpu_start, per_cpu_start_addr, "__per_cpu_start"); > > cleanup: > diff --git a/tools/testing/selftests/bpf/progs/test_ksyms.c b/tools/testing/selftests/bpf/progs/test_ksyms.c > index 6c9cbb5a3bdf..fe91eef54b66 100644 > --- a/tools/testing/selftests/bpf/progs/test_ksyms.c > +++ b/tools/testing/selftests/bpf/progs/test_ksyms.c > @@ -9,11 +9,13 @@ __u64 out__bpf_link_fops = -1; > __u64 out__bpf_link_fops1 = -1; > __u64 out__btf_size = -1; > __u64 out__per_cpu_start = -1; > +__u64 out__fake_dst_ops = -1; > > extern const void bpf_link_fops __ksym; > extern const void __start_BTF __ksym; > extern const void __stop_BTF __ksym; > extern const void __per_cpu_start __ksym; > +extern const void fake_dst_ops __ksym; > /* non-existing symbol, weak, default to zero */ > extern const void bpf_link_fops1 __ksym __weak; > > @@ -23,6 +25,7 @@ int handler(const void *ctx) > out__bpf_link_fops = (__u64)&bpf_link_fops; > out__btf_size = (__u64)(&__stop_BTF - &__start_BTF); > out__per_cpu_start = (__u64)&__per_cpu_start; > + out__fake_dst_ops = (__u64)&fake_dst_ops; > > out__bpf_link_fops1 = (__u64)&bpf_link_fops1; This patch fixed the issue in libbpf such that the suffix '.llvm.<hash>' will be ignored during comparison of bpf prog ksym vs. symbols in /proc/kallsyms, this resolved the issue. Currently, only static variables in /proc/kallsyms are checked with '.llvm.<hash>' suffix since in bpf programs function ksyms with '.llvm.<hash>' suffix are most likely kfunc's and unlikely to be cross-file inlined. Note that currently kernel does not support gcc build with lto. [1] https://lore.kernel.org/bpf/[email protected]/ [2] https://github.com/llvm/llvm-project/blob/release/18.x/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1714-L1719 Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Refactor some functions in kprobe_multi_test.c to extract some helper functions who will be used in later patches to avoid code duplication. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Refactor trace helper function load_kallsyms_local() such that it invokes a common function with a compare function as input. The common function will be used later for other local functions. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

These two functions allow selftests to do loading/searching kallsyms based on their specific compare functions. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

…rnel In my locally build clang LTO kernel (enabling CONFIG_LTO and CONFIG_LTO_CLANG_THIN), kprobe_multi_bench_attach/kernel subtest failed like: test_kprobe_multi_bench_attach:PASS:get_syms 0 nsec test_kprobe_multi_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec libbpf: prog 'test_kprobe_empty': failed to attach: No such process test_kprobe_multi_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts unexpected error: -3 #117/1 kprobe_multi_bench_attach/kernel:FAIL There are multiple symbols in /sys/kernel/debug/tracing/available_filter_functions are renamed in /proc/kallsyms due to cross file inlining. One example is for static function __access_remote_vm in mm/memory.c. In a non-LTO kernel, we have the following call stack: ptrace_access_vm (global, kernel/ptrace.c) access_remote_vm (global, mm/memory.c) __access_remote_vm (static, mm/memory.c) With LTO kernel, it is possible that access_remote_vm() is inlined by ptrace_access_vm(). So we end up with the following call stack: ptrace_access_vm (global, kernel/ptrace.c) __access_remote_vm (static, mm/memory.c) The compiler renames __access_remote_vm to __access_remote_vm.llvm.<hash> to prevent potential name collision. The kernel bpf_kprobe_multi_link_attach() and ftrace_lookup_symbols() try to find addresses based on /proc/kallsyms, hence the current test failed with LTO kenrel. This patch consulted /proc/kallsyms to find the corresponding entries for the ksym and this solved the issue. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Get addrs directly from available_filter_functions_addrs and send to the kernel during kprobe_multi_attach. This avoids consultation of /proc/kallsyms. But available_filter_functions_addrs is introduced in 6.5, i.e., it is introduced recently, so I skip the test if the kernel does not support it. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

Yonghong Song says: ==================== bpf: Fix a couple of test failures with LTO kernel With a LTO kernel built with clang, with one of earlier version of kernel, I encountered two test failures, ksyms and kprobe_multi_bench_attach/kernel. Now with latest bpf-next, only kprobe_multi_bench_attach/kernel failed. But it is possible in the future ksyms selftest may fail again. Both test failures are due to static variable/function renaming due to cross-file inlining. For Ksyms failure, the solution is to strip .llvm.<hash> suffixes for symbols in /proc/kallsyms before comparing against the ksym in bpf program. For kprobe_multi_bench_attach/kernel failure, the solution is to either provide names in /proc/kallsyms to the kernel or ignore those names who have .llvm.<hash> suffix since the kernel sym name comparison is against /proc/kallsyms. Please see each individual patches for details. Changelogs: v2 -> v3: - no need to check config file, directly so strstr with '.llvm.'. - for kprobe_multi_bench with syms, instead of skipping the syms, consult /proc/kallyms to find corresponding names. - add a test with populating addrs to the kernel for kprobe multi attach. v1 -> v2: - Let libbpf handle .llvm.<hash suffixes since it may impact bpf program ksym. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>

torvalds and others added 30 commits March 21, 2024 15:09

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

537c2e9

Cross-merge networking fixes after downstream PR. Signed-off-by: Jakub Kicinski <[email protected]>

Niklas Söderlund and others added 28 commits March 28, 2024 18:17

Merge branch 'bpf/for-next' into sync-bpf-next

368ada6

scx: Build fix after pulling bpf/for-next

fe1f6ad

htejun merged commit 5121551 into sched_ext Mar 29, 2024
1 check passed

htejun deleted the sync-bpf-next branch March 29, 2024 02:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pull in bpf/for-next #169

Pull in bpf/for-next #169

htejun commented Mar 29, 2024

Pull in bpf/for-next #169

Pull in bpf/for-next #169

Conversation

htejun commented Mar 29, 2024