Expose gpu allocation configuration options
This commit adds hints that control memory allocation strategies to the configuration options. These hints allow for automatic profiles, such as optimizing for performance (the default, which makes sense for a game) or optimizing for memory usage (typically more useful for a web browser or UI library), as well as specifying settings manually.
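
As a rough illustration of the three options described above, the sketch below uses a local stand-in for the `MemoryHints` enum added by this commit (the real type lives in `wgpu-types`). The `AppKind` enum and the `hints_for` helper are hypothetical and exist only for this example; they are not part of wgpu.

```rust
use std::ops::Range;

/// Local stand-in mirroring the shape of the `MemoryHints` enum from this commit.
#[derive(Clone, Debug, Default, PartialEq)]
pub enum MemoryHints {
    /// Favor performance over memory usage (the default).
    #[default]
    Performance,
    /// Favor memory usage over performance.
    MemoryUsage,
    /// Explicit range of block sizes for sub-allocated resources.
    Manual {
        suballocated_device_memory_block_size: Range<u64>,
    },
}

/// Hypothetical application categories, for illustration only.
pub enum AppKind {
    Game,
    Ui,
    Tuned { block_size: Range<u64> },
}

/// Picks a hint for a given kind of application (an assumed policy, not wgpu's).
pub fn hints_for(kind: AppKind) -> MemoryHints {
    match kind {
        AppKind::Game => MemoryHints::Performance,
        AppKind::Ui => MemoryHints::MemoryUsage,
        AppKind::Tuned { block_size } => MemoryHints::Manual {
            suballocated_device_memory_block_size: block_size,
        },
    }
}
```

In real code the chosen value would go into the `memory_hints` field of the device descriptor, as the example changes in this commit do.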

The details of GPU allocation are still in flux. The goal is to switch Vulkan and Metal to gpu_allocator, which is currently used with D3D12. gpu_allocator will also likely receive more configuration options, in particular the ability to start with smaller memory block sizes and progressively grow the block size, so the manual settings already provide for this upcoming option. Another approach could be to wait and add the manual option after the dust settles.
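
The progressive growth mentioned above is not implemented by this commit. As a minimal sketch of what such a policy could look like, the hypothetical helper below starts at the range's `start` and doubles the block size for each new block, capping at the range's `end`. The function name and the doubling factor are assumptions for illustration, not gpu_allocator behavior.

```rust
use std::ops::Range;

/// Returns the size to use for the next memory block.
///
/// `previous` is the size of the most recently created block, or `None`
/// if no block has been created yet. The doubling policy is an assumption.
pub fn next_block_size(previous: Option<u64>, sizes: &Range<u64>) -> u64 {
    // Guard against an inverted range so that `clamp` never panics.
    let upper = sizes.end.max(sizes.start);
    match previous {
        None => sizes.start,
        Some(prev) => prev.saturating_mul(2).clamp(sizes.start, upper),
    }
}
```

Treating `end` as the largest allowed block size follows the wording of the `Manual` documentation in this commit ("up to a limit specified by the end of the range").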

The reason for providing presets and defining values in the backends is that I am convinced that optimal configurations should take hardware capabilities into consideration. It's a deep rabbit hole, though, so that will be an exercise for later.
nical committed Jun 25, 2024
1 parent e9f1aee commit a818b5c
Showing 27 changed files with 130 additions and 12 deletions.
1 change: 1 addition & 0 deletions examples/src/framework.rs
@@ -319,6 +319,7 @@ impl ExampleContext {
label: None,
required_features: (optional_features & adapter_features) | required_features,
required_limits: needed_limits,
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
trace_dir.ok().as_ref().map(std::path::Path::new),
)
1 change: 1 addition & 0 deletions examples/src/hello_compute/mod.rs
@@ -50,6 +50,7 @@ async fn execute_gpu(numbers: &[u32]) -> Option<Vec<u32>> {
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/hello_synchronization/mod.rs
@@ -19,6 +19,7 @@ async fn run() {
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::Performance,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/hello_triangle/mod.rs
@@ -32,6 +32,7 @@ async fn run(event_loop: EventLoop<()>, window: Window) {
// Make sure we use the texture resolution limits from the adapter, so we can support images the size of the swapchain.
required_limits: wgpu::Limits::downlevel_webgl2_defaults()
.using_resolution(adapter.limits()),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/hello_windows/mod.rs
@@ -75,6 +75,7 @@ async fn run(event_loop: EventLoop<()>, viewports: Vec<(Arc<Window>, wgpu::Color
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/hello_workgroups/mod.rs
@@ -32,6 +32,7 @@ async fn run() {
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/render_to_texture/mod.rs
@@ -21,6 +21,7 @@ async fn run(_path: Option<String>) {
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/repeated_compute/mod.rs
@@ -172,6 +172,7 @@ impl WgpuContext {
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::Performance,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/storage_texture/mod.rs
@@ -35,6 +35,7 @@ async fn run(_path: Option<String>) {
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/timestamp_queries/mod.rs
@@ -216,6 +216,7 @@ async fn run() {
label: None,
required_features: features,
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions examples/src/uniform_values/mod.rs
@@ -115,6 +115,7 @@ impl WgpuContext {
label: None,
required_features: wgpu::Features::empty(),
required_limits: wgpu::Limits::downlevel_defaults(),
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
1 change: 1 addition & 0 deletions player/tests/test.rs
@@ -112,6 +112,7 @@ impl Test<'_> {
label: None,
required_features: self.features,
required_limits: wgt::Limits::default(),
+ memory_hints: wgt::MemoryHints::default(),
},
None,
Some(device_id),
1 change: 1 addition & 0 deletions tests/src/init.rs
@@ -97,6 +97,7 @@ pub async fn initialize_device(
label: None,
required_features: features,
required_limits: limits,
+ memory_hints: wgpu::MemoryHints::MemoryUsage,
},
None,
)
8 changes: 5 additions & 3 deletions wgpu-core/src/instance.rs
@@ -363,9 +363,11 @@ impl<A: HalApi> Adapter<A> {
}

let open = unsafe {
- self.raw
- .adapter
- .open(desc.required_features, &desc.required_limits)
+ self.raw.adapter.open(
+ desc.required_features,
+ &desc.required_limits,
+ &desc.memory_hints,
+ )
}
.map_err(|err| match err {
hal::DeviceError::Lost => RequestDeviceError::DeviceLost,
2 changes: 1 addition & 1 deletion wgpu-hal/examples/halmark/main.rs
@@ -125,7 +125,7 @@ impl<A: hal::Api> Example<A> {

let hal::OpenDevice { device, queue } = unsafe {
adapter
- .open(wgt::Features::empty(), &wgt::Limits::default())
+ .open(wgt::Features::empty(), &wgt::Limits::default(), &wgt::MemoryHints::default())
.unwrap()
};

2 changes: 1 addition & 1 deletion wgpu-hal/examples/raw-gles.rs
@@ -126,7 +126,7 @@ fn fill_screen(exposed: &hal::ExposedAdapter<hal::api::Gles>, width: u32, height
let od = unsafe {
exposed
.adapter
- .open(wgt::Features::empty(), &wgt::Limits::downlevel_defaults())
+ .open(wgt::Features::empty(), &wgt::Limits::downlevel_defaults(), &wgt::MemoryHints::default())
}
.unwrap();

11 changes: 9 additions & 2 deletions wgpu-hal/examples/ray-traced-triangle/main.rs
@@ -249,8 +249,15 @@ impl<A: hal::Api> Example<A> {
.expect("Surface doesn't support presentation");
log::info!("Surface caps: {:#?}", surface_caps);

- let hal::OpenDevice { device, queue } =
- unsafe { adapter.open(features, &wgt::Limits::default()).unwrap() };
+ let hal::OpenDevice { device, queue } = unsafe {
+ adapter
+ .open(
+ features,
+ &wgt::Limits::default(),
+ &wgt::MemoryHints::Performance,
+ )
+ .unwrap()
+ };

let window_size: (u32, u32) = window.inner_size().into();
dbg!(&surface_caps.formats);
2 changes: 2 additions & 0 deletions wgpu-hal/src/dx12/adapter.rs
@@ -503,6 +503,7 @@ impl crate::Adapter for super::Adapter {
&self,
_features: wgt::Features,
limits: &wgt::Limits,
+ memory_hints: &wgt::MemoryHints,
) -> Result<crate::OpenDevice<super::Api>, crate::DeviceError> {
let queue = {
profiling::scope!("ID3D12Device::CreateCommandQueue");
@@ -520,6 +521,7 @@ impl crate::Adapter for super::Adapter {
self.device.clone(),
queue.clone(),
limits,
+ memory_hints,
self.private_caps,
&self.library,
self.dxc_container.clone(),
3 changes: 2 additions & 1 deletion wgpu-hal/src/dx12/device.rs
@@ -28,12 +28,13 @@ impl super::Device {
raw: d3d12::Device,
present_queue: d3d12::CommandQueue,
limits: &wgt::Limits,
+ memory_hints: &wgt::MemoryHints,
private_caps: super::PrivateCapabilities,
library: &Arc<d3d12::D3D12Lib>,
dxc_container: Option<Arc<shader_compilation::DxcContainer>>,
) -> Result<Self, DeviceError> {
let mem_allocator = if private_caps.suballocation_supported {
- super::suballocation::create_allocator_wrapper(&raw)?
+ super::suballocation::create_allocator_wrapper(&raw, memory_hints)?
} else {
None
};
20 changes: 19 additions & 1 deletion wgpu-hal/src/dx12/suballocation.rs
@@ -46,13 +46,31 @@ mod placed {

pub(crate) fn create_allocator_wrapper(
raw: &d3d12::Device,
+ memory_hints: &wgt::MemoryHints,
) -> Result<Option<Mutex<GpuAllocatorWrapper>>, crate::DeviceError> {
let device = raw.as_ptr();

+ // TODO: the allocator's configuration should take hardware capability into
+ // account.
+ let mb = 1024 * 1024;
+ let allocation_sizes = match memory_hints {
+ wgt::MemoryHints::Performance => gpu_allocator::AllocationSizes::default(),
+ wgt::MemoryHints::MemoryUsage => gpu_allocator::AllocationSizes::new(4 * mb, 2 * mb),
+ wgt::MemoryHints::Manual {
+ suballocated_device_memory_block_size,
+ } => {
+ // TODO: Would it be useful to expose the host size in memory hints
+ // instead of always using half of the device size?
+ let device_size = suballocated_device_memory_block_size.start;
+ let host_size = device_size / 2;
+ gpu_allocator::AllocationSizes::new(device_size, host_size)
+ }
+ };
+
match gpu_allocator::d3d12::Allocator::new(&gpu_allocator::d3d12::AllocatorCreateDesc {
device: gpu_allocator::d3d12::ID3D12DeviceVersion::Device(device.as_windows().clone()),
debug_settings: Default::default(),
- allocation_sizes: gpu_allocator::AllocationSizes::default(),
+ allocation_sizes,
}) {
Ok(allocator) => Ok(Some(Mutex::new(GpuAllocatorWrapper { allocator }))),
Err(e) => {
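
To make the mapping in this hunk easier to follow, here is a minimal standalone sketch of it. `MemoryHints` is a local stand-in for `wgt::MemoryHints`, and `block_sizes` is a hypothetical helper that returns `(device, host)` block sizes in bytes, with `None` standing in for "use the allocator's own defaults". It does not call gpu_allocator.

```rust
use std::ops::Range;

/// Local stand-in mirroring `wgt::MemoryHints` from this commit.
pub enum MemoryHints {
    Performance,
    MemoryUsage,
    Manual {
        suballocated_device_memory_block_size: Range<u64>,
    },
}

/// Hypothetical helper: (device, host) block sizes in bytes, or `None`
/// when the allocator's default sizes should be used.
pub fn block_sizes(hints: &MemoryHints) -> Option<(u64, u64)> {
    const MB: u64 = 1024 * 1024;
    match hints {
        MemoryHints::Performance => None,
        MemoryHints::MemoryUsage => Some((4 * MB, 2 * MB)),
        MemoryHints::Manual {
            suballocated_device_memory_block_size: sizes,
        } => {
            // Only the start of the range is used on this path; the host
            // size is always half of the device size.
            Some((sizes.start, sizes.start / 2))
        }
    }
}
```

As in the diff, the `end` of the manual range has no effect on the D3D12 path, since the allocator used there takes a single block size per memory type.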
1 change: 1 addition & 0 deletions wgpu-hal/src/empty.rs
@@ -89,6 +89,7 @@ impl crate::Adapter for Context {
&self,
features: wgt::Features,
_limits: &wgt::Limits,
+ _memory_hints: &wgt::MemoryHints,
) -> DeviceResult<crate::OpenDevice<Api>> {
Err(crate::DeviceError::Lost)
}
1 change: 1 addition & 0 deletions wgpu-hal/src/gles/adapter.rs
@@ -930,6 +930,7 @@ impl crate::Adapter for super::Adapter {
&self,
features: wgt::Features,
_limits: &wgt::Limits,
+ _memory_hints: &wgt::MemoryHints,
) -> Result<crate::OpenDevice<super::Api>, crate::DeviceError> {
let gl = &self.shared.context.lock();
unsafe { gl.pixel_store_i32(glow::UNPACK_ALIGNMENT, 1) };
1 change: 1 addition & 0 deletions wgpu-hal/src/lib.rs
@@ -558,6 +558,7 @@ pub trait Adapter: WasmNotSendSync {
&self,
features: wgt::Features,
limits: &wgt::Limits,
+ memory_hints: &wgt::MemoryHints,
) -> Result<OpenDevice<Self::A>, DeviceError>;

/// Return the set of supported capabilities for a texture format.
1 change: 1 addition & 0 deletions wgpu-hal/src/metal/adapter.rs
@@ -25,6 +25,7 @@ impl crate::Adapter for super::Adapter {
&self,
features: wgt::Features,
_limits: &wgt::Limits,
+ _memory_hints: &wgt::MemoryHints,
) -> Result<crate::OpenDevice<super::Api>, crate::DeviceError> {
let queue = self
.shared
38 changes: 37 additions & 1 deletion wgpu-hal/src/vulkan/adapter.rs
@@ -1584,6 +1584,7 @@ impl super::Adapter {
handle_is_owned: bool,
enabled_extensions: &[&'static CStr],
features: wgt::Features,
+ memory_hints: &wgt::MemoryHints,
family_index: u32,
queue_index: u32,
) -> Result<crate::OpenDevice<super::Api>, crate::DeviceError> {
@@ -1834,7 +1835,40 @@ impl super::Adapter {

let mem_allocator = {
let limits = self.phd_capabilities.properties.limits;
- let config = gpu_alloc::Config::i_am_prototyping(); //TODO
+ // TODO: These configuration options should take hardware capabilities
+ // into consideration.
+ let mb = 1024 * 1024;
+ let perf_cfg = gpu_alloc::Config {
+ starting_free_list_chunk: 128 * mb,
+ final_free_list_chunk: 512 * mb,
+ minimal_buddy_size: 1,
+ initial_buddy_dedicated_size: 8 * mb,
+ dedicated_threshold: 32 * mb,
+ preferred_dedicated_threshold: mb,
+ transient_dedicated_threshold: 128 * mb,
+ };
+ let mem_usage_cfg = gpu_alloc::Config {
+ starting_free_list_chunk: 4 * mb,
+ final_free_list_chunk: 64 * mb,
+ minimal_buddy_size: 1,
+ initial_buddy_dedicated_size: 4 * mb,
+ dedicated_threshold: 8 * mb,
+ preferred_dedicated_threshold: mb,
+ transient_dedicated_threshold: 16 * mb,
+ };
+ let config = match memory_hints {
+ wgt::MemoryHints::Performance => perf_cfg,
+ wgt::MemoryHints::MemoryUsage => mem_usage_cfg,
+ wgt::MemoryHints::Manual {
+ suballocated_device_memory_block_size,
+ } => gpu_alloc::Config {
+ starting_free_list_chunk: suballocated_device_memory_block_size.start,
+ final_free_list_chunk: suballocated_device_memory_block_size.end,
+ initial_buddy_dedicated_size: suballocated_device_memory_block_size.start,
+ ..perf_cfg
+ },
+ };
+
let max_memory_allocation_size =
if let Some(maintenance_3) = self.phd_capabilities.maintenance_3 {
maintenance_3.max_memory_allocation_size
@@ -1896,6 +1930,7 @@ impl crate::Adapter for super::Adapter {
&self,
features: wgt::Features,
_limits: &wgt::Limits,
+ memory_hints: &wgt::MemoryHints,
) -> Result<crate::OpenDevice<super::Api>, crate::DeviceError> {
let enabled_extensions = self.required_device_extensions(features);
let mut enabled_phd_features = self.physical_device_features(&enabled_extensions, features);
@@ -1929,6 +1964,7 @@ impl crate::Adapter for super::Adapter {
true,
&enabled_extensions,
features,
+ memory_hints,
family_info.queue_family_index,
0,
)
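
The `Manual` arm in the Vulkan hunk above overrides three fields and inherits the rest from the performance preset through struct update syntax. The standalone sketch below shows that pattern in isolation. `AllocConfig` is a hypothetical stand-in holding only a subset of the `gpu_alloc::Config` fields shown in the diff, `MemoryHints` is a local mirror of `wgt::MemoryHints`, and `config_for` is an illustrative name.

```rust
use std::ops::Range;

/// Local stand-in mirroring `wgt::MemoryHints` from this commit.
pub enum MemoryHints {
    Performance,
    MemoryUsage,
    Manual {
        suballocated_device_memory_block_size: Range<u64>,
    },
}

/// Hypothetical stand-in for a subset of the allocator configuration fields.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct AllocConfig {
    pub starting_free_list_chunk: u64,
    pub final_free_list_chunk: u64,
    pub initial_buddy_dedicated_size: u64,
    pub dedicated_threshold: u64,
}

const MB: u64 = 1024 * 1024;

/// Maps a hint to a configuration, using the values from the diff.
pub fn config_for(hints: &MemoryHints) -> AllocConfig {
    let perf = AllocConfig {
        starting_free_list_chunk: 128 * MB,
        final_free_list_chunk: 512 * MB,
        initial_buddy_dedicated_size: 8 * MB,
        dedicated_threshold: 32 * MB,
    };
    let mem_usage = AllocConfig {
        starting_free_list_chunk: 4 * MB,
        final_free_list_chunk: 64 * MB,
        initial_buddy_dedicated_size: 4 * MB,
        dedicated_threshold: 8 * MB,
    };
    match hints {
        MemoryHints::Performance => perf,
        MemoryHints::MemoryUsage => mem_usage,
        MemoryHints::Manual {
            suballocated_device_memory_block_size: sizes,
        } => AllocConfig {
            starting_free_list_chunk: sizes.start,
            final_free_list_chunk: sizes.end,
            initial_buddy_dedicated_size: sizes.start,
            // Everything not listed above falls back to the performance preset.
            ..perf
        },
    }
}
```

One consequence of this design is that a manual range only controls chunk sizes; the dedicated-allocation thresholds stay at the performance preset's values.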
37 changes: 36 additions & 1 deletion wgpu-types/src/lib.rs
@@ -1777,11 +1777,43 @@ pub struct AdapterInfo {
pub backend: Backend,
}

+ /// Hints to the device about the memory allocation strategy.
+ ///
+ /// Some backends may ignore these hints.
+ #[derive(Clone, Debug, Default)]
+ #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
+ pub enum MemoryHints {
+ /// Favor performance over memory usage (the default value).
+ #[default]
+ Performance,
+ /// Favor memory usage over performance.
+ MemoryUsage,
+ /// Applications that have control over the content that is rendered
+ /// (typically games) may find an optimal compromise between memory
+ /// usage and performance by specifying the allocation configuration.
+ Manual {
+ /// Defines the range of allowed memory block sizes for sub-allocated
+ /// resources.
+ ///
+ /// The backend may attempt to group multiple resources into fewer
+ /// device memory blocks (sub-allocation) for performance reasons.
+ /// The start of the provided range specifies the initial memory
+ /// block size for sub-allocated resources. After running out of
+ /// space in existing memory blocks, the backend may choose to
+ /// progressively increase the block size of subsequent allocations
+ /// up to a limit specified by the end of the range.
+ ///
+ /// This does not limit resource sizes. If a resource does not fit
+ /// in the specified range, it will typically be placed in a dedicated
+ /// memory block.
+ suballocated_device_memory_block_size: Range<u64>,
+ },
+ }
+
/// Describes a [`Device`](../wgpu/struct.Device.html).
///
/// Corresponds to [WebGPU `GPUDeviceDescriptor`](
/// https://gpuweb.github.io/gpuweb/#gpudevicedescriptor).
- #[repr(C)]
#[derive(Clone, Debug, Default)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct DeviceDescriptor<L> {
@@ -1799,6 +1831,8 @@ pub struct DeviceDescriptor<L> {
/// Exactly the specified limits, and no better or worse,
/// will be allowed in validation of API calls on the resulting device.
pub required_limits: Limits,
+ /// Hints for memory allocation strategies.
+ pub memory_hints: MemoryHints,
}

impl<L> DeviceDescriptor<L> {
@@ -1808,6 +1842,7 @@ impl<L> DeviceDescriptor<L> {
label: fun(&self.label),
required_features: self.required_features,
required_limits: self.required_limits.clone(),
+ memory_hints: self.memory_hints.clone(),
}
}
}
2 changes: 1 addition & 1 deletion wgpu/src/lib.rs
@@ -53,7 +53,7 @@ pub use wgt::{
DepthStencilState, DeviceLostReason, DeviceType, DownlevelCapabilities, DownlevelFlags,
Dx12Compiler, DynamicOffset, Extent3d, Face, Features, FilterMode, FrontFace,
Gles3MinorVersion, ImageDataLayout, ImageSubresourceRange, IndexFormat, InstanceDescriptor,
- InstanceFlags, Limits, MaintainResult, MultisampleState, Origin2d, Origin3d,
+ InstanceFlags, Limits, MaintainResult, MemoryHints, MultisampleState, Origin2d, Origin3d,
PipelineStatisticsTypes, PolygonMode, PowerPreference, PredefinedColorSpace, PresentMode,
PresentationTimestamp, PrimitiveState, PrimitiveTopology, PushConstantRange, QueryType,
RenderBundleDepthStencil, SamplerBindingType, SamplerBorderColor, ShaderLocation, ShaderModel,