Don't block use of Intel on Nvidia hybrid systems. #93

Closed
flukejones opened this issue Mar 11, 2024 · 4 comments · Fixed by #95

Comments

@flukejones

The recent commit 30c4fa4 makes applications such as Zed run on the dGPU full time. This is not an acceptable solution for #88, as it causes excessive battery drain, heat, etc.

In that issue I reference gfx-rs/wgpu#4110 because it appears to be a very similar use case.

My own system is currently:

Operating System: Fedora Linux 40
Kernel Version: 6.8.0-rc7+ (64-bit)
Graphics Platform: Wayland
Processors: 32 × Intel® Core™ i9-14900HX
Memory: 62.4 GiB of RAM
Graphics Processor: Mesa Intel® Graphics
Manufacturer: ASUSTeK COMPUTER INC.
Product Name: ROG Strix SCAR 16 G634JYR_G634JYR_000045397
System Version: 1.0

Installed mesa version is: 24.0.0

Installed nvidia version: 550.54.14

I also have older laptops I can test on, and I have tested on Fedora 39 (which used 6.6.x and 6.7.x kernels) without problems. The desktop is irrelevant here: I've tested on COSMIC, GNOME, and KDE. What is of note, however, is that I do not use Xorg sessions and haven't for years.

The proper and expected solution is to find the exact cause of #88 and either fix it or work around that one specific case. A blanket block on all Intel/Nvidia hybrid systems handicaps everyone, whether or not they are affected.

Logs

Output from example using Intel:

The VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/intel_icd.x86_64.json wasn't actually required here.

❯ VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/intel_icd.x86_64.json RUST_LOG=debug cargo run --example particle
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/examples/particle`
[2024-03-11T04:35:49Z DEBUG egui_winit::clipboard] Initializing arboard clipboard…
[2024-03-11T04:35:49Z DEBUG egui_winit::clipboard] Initializing smithay clipboard…
[2024-03-11T04:35:49Z WARN  blade_graphics::hal::init] Requested layer is not found: "VK_LAYER_KHRONOS_validation"
[2024-03-11T04:35:49Z INFO  blade_graphics::hal::init] Adapter "Intel(R) Graphics (RPL-S)"
[2024-03-11T04:35:49Z INFO  blade_graphics::hal::init] No ray tracing extensions are supported
[2024-03-11T04:35:49Z DEBUG blade_graphics::hal::init] Adapter AdapterCapabilities {
        api_version: 4206866,
        properties: PhysicalDeviceProperties {
            api_version: 4206866,
            driver_version: 100663296,
            vendor_id: 32902,
            device_id: 42888,
            device_type: INTEGRATED_GPU,
            device_name: "Intel(R) Graphics (RPL-S)",
            pipeline_cache_uuid: [
                30,
                116,
                185,
                68,
                191,
                206,
                235,
                242,
                182,
                25,
                50,
                41,
                188,
                127,
                83,
                255,
            ],
            limits: PhysicalDeviceLimits {
                max_image_dimension1_d: 16384,
                max_image_dimension2_d: 16384,
                max_image_dimension3_d: 2048,
                max_image_dimension_cube: 16384,
                max_image_array_layers: 2048,
                max_texel_buffer_elements: 134217728,
                max_uniform_buffer_range: 1073741824,
                max_storage_buffer_range: 4294967295,
                max_push_constants_size: 128,
                max_memory_allocation_count: 4294967295,
                max_sampler_allocation_count: 65536,
                buffer_image_granularity: 1,
                sparse_address_space_size: 17587891077120,
                max_bound_descriptor_sets: 8,
                max_per_stage_descriptor_samplers: 65535,
                max_per_stage_descriptor_uniform_buffers: 64,
                max_per_stage_descriptor_storage_buffers: 65535,
                max_per_stage_descriptor_sampled_images: 65535,
                max_per_stage_descriptor_storage_images: 65535,
                max_per_stage_descriptor_input_attachments: 64,
                max_per_stage_resources: 4294967295,
                max_descriptor_set_samplers: 393210,
                max_descriptor_set_uniform_buffers: 384,
                max_descriptor_set_uniform_buffers_dynamic: 8,
                max_descriptor_set_storage_buffers: 393210,
                max_descriptor_set_storage_buffers_dynamic: 8,
                max_descriptor_set_sampled_images: 393210,
                max_descriptor_set_storage_images: 393210,
                max_descriptor_set_input_attachments: 256,
                max_vertex_input_attributes: 29,
                max_vertex_input_bindings: 31,
                max_vertex_input_attribute_offset: 2047,
                max_vertex_input_binding_stride: 4095,
                max_vertex_output_components: 128,
                max_tessellation_generation_level: 64,
                max_tessellation_patch_size: 32,
                max_tessellation_control_per_vertex_input_components: 128,
                max_tessellation_control_per_vertex_output_components: 128,
                max_tessellation_control_per_patch_output_components: 128,
                max_tessellation_control_total_output_components: 2048,
                max_tessellation_evaluation_input_components: 128,
                max_tessellation_evaluation_output_components: 128,
                max_geometry_shader_invocations: 32,
                max_geometry_input_components: 128,
                max_geometry_output_components: 128,
                max_geometry_output_vertices: 256,
                max_geometry_total_output_components: 1024,
                max_fragment_input_components: 116,
                max_fragment_output_attachments: 8,
                max_fragment_dual_src_attachments: 1,
                max_fragment_combined_output_resources: 131078,
                max_compute_shared_memory_size: 65536,
                max_compute_work_group_count: [
                    65535,
                    65535,
                    65535,
                ],
                max_compute_work_group_invocations: 1024,
                max_compute_work_group_size: [
                    1024,
                    1024,
                    1024,
                ],
                sub_pixel_precision_bits: 8,
                sub_texel_precision_bits: 8,
                mipmap_precision_bits: 8,
                max_draw_indexed_index_value: 4294967295,
                max_draw_indirect_count: 4294967295,
                max_sampler_lod_bias: 16.0,
                max_sampler_anisotropy: 16.0,
                max_viewports: 16,
                max_viewport_dimensions: [
                    16384,
                    16384,
                ],
                viewport_bounds_range: [
                    -32768.0,
                    32767.0,
                ],
                viewport_sub_pixel_bits: 13,
                min_memory_map_alignment: 4096,
                min_texel_buffer_offset_alignment: 16,
                min_uniform_buffer_offset_alignment: 64,
                min_storage_buffer_offset_alignment: 4,
                min_texel_offset: -8,
                max_texel_offset: 7,
                min_texel_gather_offset: -32,
                max_texel_gather_offset: 31,
                min_interpolation_offset: -0.5,
                max_interpolation_offset: 0.4375,
                sub_pixel_interpolation_offset_bits: 4,
                max_framebuffer_width: 16384,
                max_framebuffer_height: 16384,
                max_framebuffer_layers: 2048,
                framebuffer_color_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                framebuffer_depth_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                framebuffer_stencil_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                framebuffer_no_attachments_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                max_color_attachments: 8,
                sampled_image_color_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                sampled_image_integer_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                sampled_image_depth_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                sampled_image_stencil_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                storage_image_sample_counts: TYPE_1,
                max_sample_mask_words: 1,
                timestamp_compute_and_graphics: 1,
                timestamp_period: 52.083332,
                max_clip_distances: 8,
                max_cull_distances: 8,
                max_combined_clip_and_cull_distances: 8,
                discrete_queue_priorities: 2,
                point_size_range: [
                    0.125,
                    255.875,
                ],
                line_width_range: [
                    0.0,
                    8.0,
                ],
                point_size_granularity: 0.125,
                line_width_granularity: 0.0078125,
                strict_lines: 0,
                standard_sample_locations: 1,
                optimal_buffer_copy_offset_alignment: 128,
                optimal_buffer_copy_row_pitch_alignment: 128,
                non_coherent_atom_size: 64,
            },
            sparse_properties: PhysicalDeviceSparseProperties {
                residency_standard2_d_block_shape: 1,
                residency_standard2_d_multisample_block_shape: 0,
                residency_standard3_d_block_shape: 1,
                residency_aligned_mip_size: 0,
                residency_non_resident_strict: 1,
            },
        },
        queue_family_index: 0,
        layered: false,
        ray_tracing: false,
        buffer_marker: true,
        shader_info: false,
    }

Output from example using Nvidia:

❯ RUST_LOG=debug switcherooctl launch cargo run --example particle
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/examples/particle`
[2024-03-11T04:37:58Z DEBUG egui_winit::clipboard] Initializing arboard clipboard…
[2024-03-11T04:37:58Z DEBUG egui_winit::clipboard] Initializing smithay clipboard…
[2024-03-11T04:37:58Z WARN  blade_graphics::hal::init] Requested layer is not found: "VK_LAYER_KHRONOS_validation"
DRM kernel driver 'nvidia-drm' in use. NVK requires nouveau.
TU: error: ../src/freedreno/vulkan/tu_knl.cc:232: device /dev/dri/renderD128 (i915) is not compatible with turnip (VK_ERROR_INCOMPATIBLE_DRIVER)
TU: error: ../src/freedreno/vulkan/tu_knl.cc:232: device /dev/dri/renderD129 (nvidia-drm) is not compatible with turnip (VK_ERROR_INCOMPATIBLE_DRIVER)
[2024-03-11T04:37:58Z INFO  blade_graphics::hal::init] Adapter "NVIDIA GeForce RTX 4090 Laptop GPU"
[2024-03-11T04:37:58Z INFO  blade_graphics::hal::init] Ray tracing is supported
[2024-03-11T04:37:58Z DEBUG blade_graphics::hal::init] Ray tracing properties: PhysicalDeviceAccelerationStructurePropertiesKHR {
        s_type: PHYSICAL_DEVICE_ACCELERATION_STRUCTURE_PROPERTIES_KHR,
        p_next: 0x00007ffc7dbb8d40,
        max_geometry_count: 16777215,
        max_instance_count: 16777215,
        max_primitive_count: 536870911,
        max_per_stage_descriptor_acceleration_structures: 1048576,
        max_per_stage_descriptor_update_after_bind_acceleration_structures: 1048576,
        max_descriptor_set_acceleration_structures: 1048576,
        max_descriptor_set_update_after_bind_acceleration_structures: 1048576,
        min_acceleration_structure_scratch_offset_alignment: 128,
    }
[2024-03-11T04:37:58Z DEBUG blade_graphics::hal::init] Adapter AdapterCapabilities {
        api_version: 4206867,
        properties: PhysicalDeviceProperties {
            api_version: 4206869,
            driver_version: 2307752832,
            vendor_id: 4318,
            device_id: 10071,
            device_type: DISCRETE_GPU,
            device_name: "NVIDIA GeForce RTX 4090 Laptop GPU",
            pipeline_cache_uuid: [
                106,
                157,
                243,
                178,
                252,
                57,
                140,
                51,
                193,
                158,
                239,
                82,
                244,
                219,
                236,
                237,
            ],
            limits: PhysicalDeviceLimits {
                max_image_dimension1_d: 32768,
                max_image_dimension2_d: 32768,
                max_image_dimension3_d: 16384,
                max_image_dimension_cube: 32768,
                max_image_array_layers: 2048,
                max_texel_buffer_elements: 134217728,
                max_uniform_buffer_range: 65536,
                max_storage_buffer_range: 4294967295,
                max_push_constants_size: 256,
                max_memory_allocation_count: 4294967295,
                max_sampler_allocation_count: 4000,
                buffer_image_granularity: 1024,
                sparse_address_space_size: 1099511627775,
                max_bound_descriptor_sets: 32,
                max_per_stage_descriptor_samplers: 1048576,
                max_per_stage_descriptor_uniform_buffers: 1048576,
                max_per_stage_descriptor_storage_buffers: 1048576,
                max_per_stage_descriptor_sampled_images: 1048576,
                max_per_stage_descriptor_storage_images: 1048576,
                max_per_stage_descriptor_input_attachments: 1048576,
                max_per_stage_resources: 4294967295,
                max_descriptor_set_samplers: 1048576,
                max_descriptor_set_uniform_buffers: 1048576,
                max_descriptor_set_uniform_buffers_dynamic: 15,
                max_descriptor_set_storage_buffers: 1048576,
                max_descriptor_set_storage_buffers_dynamic: 16,
                max_descriptor_set_sampled_images: 1048576,
                max_descriptor_set_storage_images: 1048576,
                max_descriptor_set_input_attachments: 1048576,
                max_vertex_input_attributes: 32,
                max_vertex_input_bindings: 32,
                max_vertex_input_attribute_offset: 2047,
                max_vertex_input_binding_stride: 2048,
                max_vertex_output_components: 128,
                max_tessellation_generation_level: 64,
                max_tessellation_patch_size: 32,
                max_tessellation_control_per_vertex_input_components: 128,
                max_tessellation_control_per_vertex_output_components: 128,
                max_tessellation_control_per_patch_output_components: 120,
                max_tessellation_control_total_output_components: 4216,
                max_tessellation_evaluation_input_components: 128,
                max_tessellation_evaluation_output_components: 128,
                max_geometry_shader_invocations: 32,
                max_geometry_input_components: 128,
                max_geometry_output_components: 128,
                max_geometry_output_vertices: 1024,
                max_geometry_total_output_components: 1024,
                max_fragment_input_components: 128,
                max_fragment_output_attachments: 8,
                max_fragment_dual_src_attachments: 1,
                max_fragment_combined_output_resources: 4294967295,
                max_compute_shared_memory_size: 49152,
                max_compute_work_group_count: [
                    2147483647,
                    65535,
                    65535,
                ],
                max_compute_work_group_invocations: 1024,
                max_compute_work_group_size: [
                    1024,
                    1024,
                    64,
                ],
                sub_pixel_precision_bits: 8,
                sub_texel_precision_bits: 8,
                mipmap_precision_bits: 8,
                max_draw_indexed_index_value: 4294967295,
                max_draw_indirect_count: 4294967295,
                max_sampler_lod_bias: 15.0,
                max_sampler_anisotropy: 16.0,
                max_viewports: 16,
                max_viewport_dimensions: [
                    32768,
                    32768,
                ],
                viewport_bounds_range: [
                    -65536.0,
                    65536.0,
                ],
                viewport_sub_pixel_bits: 8,
                min_memory_map_alignment: 64,
                min_texel_buffer_offset_alignment: 16,
                min_uniform_buffer_offset_alignment: 64,
                min_storage_buffer_offset_alignment: 16,
                min_texel_offset: -8,
                max_texel_offset: 7,
                min_texel_gather_offset: -32,
                max_texel_gather_offset: 31,
                min_interpolation_offset: -0.5,
                max_interpolation_offset: 0.4375,
                sub_pixel_interpolation_offset_bits: 4,
                max_framebuffer_width: 32768,
                max_framebuffer_height: 32768,
                max_framebuffer_layers: 2048,
                framebuffer_color_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8,
                framebuffer_depth_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8,
                framebuffer_stencil_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                framebuffer_no_attachments_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                max_color_attachments: 8,
                sampled_image_color_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8,
                sampled_image_integer_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8,
                sampled_image_depth_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8,
                sampled_image_stencil_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8 | TYPE_16,
                storage_image_sample_counts: TYPE_1 | TYPE_2 | TYPE_4 | TYPE_8,
                max_sample_mask_words: 1,
                timestamp_compute_and_graphics: 1,
                timestamp_period: 1.0,
                max_clip_distances: 8,
                max_cull_distances: 8,
                max_combined_clip_and_cull_distances: 8,
                discrete_queue_priorities: 2,
                point_size_range: [
                    1.0,
                    2047.9375,
                ],
                line_width_range: [
                    1.0,
                    64.0,
                ],
                point_size_granularity: 0.0625,
                line_width_granularity: 0.0625,
                strict_lines: 1,
                standard_sample_locations: 1,
                optimal_buffer_copy_offset_alignment: 1,
                optimal_buffer_copy_row_pitch_alignment: 1,
                non_coherent_atom_size: 64,
            },
            sparse_properties: PhysicalDeviceSparseProperties {
                residency_standard2_d_block_shape: 1,
                residency_standard2_d_multisample_block_shape: 1,
                residency_standard3_d_block_shape: 1,
                residency_aligned_mip_size: 0,
                residency_non_resident_strict: 1,
            },
        },
        queue_family_index: 0,
        layered: false,
        ray_tracing: true,
        buffer_marker: true,
        shader_info: false,
    }

In all cases the example presented fine.
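
For anyone cross-checking the logs above against the installed driver versions, here is a minimal sketch of how the packed integers can be decoded. The function names are hypothetical and not part of blade. The `api_version` layout follows the standard Vulkan version packing; the Nvidia `driver_version` layout (10/8/8/6 bits) is a vendor convention that I am assuming here, not something the Vulkan specification defines. Mesa drivers are assumed to use the ordinary Vulkan packing for `driver_version`.

```rust
/// Decode a Vulkan-packed version into (major, minor, patch).
/// Layout: 3 variant bits, 7 major bits, 10 minor bits, 12 patch bits.
/// The variant bits are ignored in this sketch.
pub fn decode_vk_version(v: u32) -> (u32, u32, u32) {
    ((v >> 22) & 0x7f, (v >> 12) & 0x3ff, v & 0xfff)
}

/// Decode `driver_version` as packed by the Nvidia proprietary driver.
/// ASSUMPTION: 10 bits major, 8 bits minor, 8 bits patch, 6 bits build.
/// This is a vendor convention, not part of the Vulkan specification.
pub fn decode_nvidia_driver_version(v: u32) -> (u32, u32, u32, u32) {
    ((v >> 22) & 0x3ff, (v >> 14) & 0xff, (v >> 6) & 0xff, v & 0x3f)
}

/// Map a PCI vendor ID, as reported in `vendor_id`, to a short name.
pub fn vendor_name(vendor_id: u32) -> &'static str {
    match vendor_id {
        0x8086 => "Intel",
        0x10DE => "NVIDIA",
        0x1002 => "AMD",
        _ => "unknown",
    }
}
```

Applied to the logs, `vendor_id: 32902` is 0x8086 (Intel) and `vendor_id: 4318` is 0x10DE (Nvidia). The Intel `api_version: 4206866` decodes to 1.3.274 and its `driver_version: 100663296` to 24.0.0, matching the installed Mesa version. Under the assumed Nvidia packing, `driver_version: 2307752832` decodes to 550.54.14.0, matching the installed Nvidia driver.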

@flukejones
Author

As noted on the linked issue, it looks like the root cause of the other users' problems is that they are using Xorg with a config that makes Xorg run on the Nvidia dGPU by default. This is a special case that will be phased out very soon. I'm surprised it isn't already, but then I guess "Ubuntu".

My own GPU management tool removed that hack a long time ago.

@kvark
Owner

kvark commented Mar 12, 2024

I don't have a system at hand that would be subject to this problem, and so it's very hard to investigate and find a minimal workaround. Any ideas on how exactly to detect the affected platform are appreciated!

@flukejones
Author

I would prefer the blocking code be removed because it was added for an edge case that is rarely used and will be even more uncommon with the coming distro releases.

At the very least, we could restrict it to Xorg only. That would need an environment variable check at minimum.

@flukejones
Author

It would be safe to check these two environment variables:

echo $XDG_SESSION_TYPE
x11

echo $XDG_SESSION_TYPE
wayland

echo $DESKTOP_SESSION
gnome-xorg

echo $DESKTOP_SESSION
gnome-wayland

I think that if xorg-nvidia is used, then glxinfo -B | grep Device will return the Nvidia card name. That could be a secondary check, to avoid blocking people who use Xorg normally.
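
A minimal sketch of what the environment check could look like in Rust, assuming the two variables behave as in the shell output above. The names `SessionKind`, `classify_session`, `current_session`, and `may_need_intel_block` are hypothetical and do not exist in blade, and the policy (only consider blocking on X11) is just the restriction suggested in this thread, not the actual fix. The secondary `glxinfo` check is not covered here.

```rust
use std::env;

/// Display-server kind inferred from the session environment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionKind {
    X11,
    Wayland,
    Unknown,
}

/// Classify a session from the values of `XDG_SESSION_TYPE` and
/// `DESKTOP_SESSION`. Kept pure (no environment access) so it is easy to test.
pub fn classify_session(
    xdg_session_type: Option<&str>,
    desktop_session: Option<&str>,
) -> SessionKind {
    // Prefer XDG_SESSION_TYPE, which reports "x11" or "wayland" directly.
    match xdg_session_type
        .map(|s| s.trim().to_ascii_lowercase())
        .as_deref()
    {
        Some("x11") => return SessionKind::X11,
        Some("wayland") => return SessionKind::Wayland,
        _ => {}
    }
    // Fall back to the session name, e.g. "gnome-xorg" or "gnome-wayland".
    match desktop_session.map(|s| s.trim().to_ascii_lowercase()) {
        Some(name) if name.contains("wayland") => SessionKind::Wayland,
        Some(name) if name.contains("xorg") || name.contains("x11") => SessionKind::X11,
        _ => SessionKind::Unknown,
    }
}

/// Read both variables from the current process environment.
pub fn current_session() -> SessionKind {
    let xdg = env::var("XDG_SESSION_TYPE").ok();
    let desktop = env::var("DESKTOP_SESSION").ok();
    classify_session(xdg.as_deref(), desktop.as_deref())
}

/// Hypothetical policy: only X11 sessions are candidates for the
/// Intel-on-Nvidia block; Wayland and unknown sessions are left alone.
pub fn may_need_intel_block(kind: SessionKind) -> bool {
    kind == SessionKind::X11
}
```

Adapter selection could then call `may_need_intel_block(current_session())` before applying any Intel/Nvidia restriction, so Wayland users keep the integrated GPU by default.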
