Runtime API
IREE runtime Python bindings.
- enum iree.runtime.BufferCompatibility(value)
Valid values are as follows:
- NONE = BufferCompatibility.NONE
- ALLOCATABLE = BufferCompatibility.ALLOCATABLE
- IMPORTABLE = BufferCompatibility.IMPORTABLE
- EXPORTABLE = BufferCompatibility.EXPORTABLE
- QUEUE_TRANSFER = BufferCompatibility.QUEUE_TRANSFER
- QUEUE_DISPATCH = BufferCompatibility.QUEUE_DISPATCH
- enum iree.runtime.BufferUsage(value)
Valid values are as follows:
- NONE = BufferUsage.NONE
- TRANSFER_SOURCE = BufferUsage.TRANSFER_SOURCE
- TRANSFER_TARGET = BufferUsage.TRANSFER_TARGET
- TRANSFER = BufferUsage.TRANSFER
- DISPATCH_INDIRECT_PARAMETERS = BufferUsage.DISPATCH_INDIRECT_PARAMETERS
- DISPATCH_UNIFORM_READ = BufferUsage.DISPATCH_UNIFORM_READ
- DISPATCH_STORAGE_READ = BufferUsage.DISPATCH_STORAGE_READ
- DISPATCH_STORAGE_WRITE = BufferUsage.DISPATCH_STORAGE_WRITE
- DISPATCH_STORAGE = BufferUsage.DISPATCH_STORAGE
- DISPATCH_IMAGE_READ = BufferUsage.DISPATCH_IMAGE_READ
- DISPATCH_IMAGE_WRITE = BufferUsage.DISPATCH_IMAGE_WRITE
- DISPATCH_IMAGE = BufferUsage.DISPATCH_IMAGE
- SHARING_EXPORT = BufferUsage.SHARING_EXPORT
- SHARING_REPLICATE = BufferUsage.SHARING_REPLICATE
- SHARING_CONCURRENT = BufferUsage.SHARING_CONCURRENT
- SHARING_IMMUTABLE = BufferUsage.SHARING_IMMUTABLE
- MAPPING_SCOPED = BufferUsage.MAPPING_SCOPED
- MAPPING_PERSISTENT = BufferUsage.MAPPING_PERSISTENT
- MAPPING_OPTIONAL = BufferUsage.MAPPING_OPTIONAL
- MAPPING_ACCESS_RANDOM = BufferUsage.MAPPING_ACCESS_RANDOM
- MAPPING_ACCESS_SEQUENTIAL_WRITE = BufferUsage.MAPPING_ACCESS_SEQUENTIAL_WRITE
- MAPPING = BufferUsage.MAPPING
- DEFAULT = BufferUsage.DEFAULT
- class iree.runtime.Config(driver_name: Optional[str] = None, *, device: Optional[iree._runtime_libs._runtime.HalDevice] = None)
System configuration.
- default_vm_modules: Tuple[iree._runtime_libs._runtime.VmModule, ...]
- vm_instance: iree._runtime_libs._runtime.VmInstance
- class iree.runtime.DeviceArray(device: iree._runtime_libs._runtime.HalDevice, buffer_view: iree._runtime_libs._runtime.HalBufferView, implicit_host_transfer: bool = False, override_dtype=None)
An IREE device array.
Device arrays can be in one of two states:
Host accessible: The array will be backed by host accessible memory and can have the usual things done with it that one expects to be able to do with an ndarray.
Device resident: The array is just a handle to a device resident Buffer (and BufferView wrapper). Metadata about the array (shape and dtype) is accessible, but the data itself cannot be accessed in this state.
How a device array comes into existence controls how it can transition between these states:
A user can create a DeviceArray explicitly with a device allocator. Such an array will not be implicitly convertible to host accessible, although accessors exist to do so.
When the platform creates a DeviceArray with a synchronization policy, implicit transfer back to the host triggers the appropriate waits and is performed automatically (this is the common case for function return values unless otherwise configured).
- astype(dtype, casting='unsafe', copy=True)
- property dtype
- property is_host_accessible
Whether this array is currently host accessible.
- reshape(*args)
- property shape
- to_host() numpy.ndarray
Return the array as a host accessible NumPy ndarray. This may map the memory or create a copy, depending on whether the array is mappable to the host.
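The two states described for DeviceArray can be exercised with asdevicearray and to_host. The following is a minimal sketch, assuming the iree.runtime package and NumPy are installed and that the "local-task" driver is available. The helper name roundtrip_through_device is hypothetical and not part of the bindings.

```python
def roundtrip_through_device(values, device_uri="local-task"):
    """Copy `values` into a DeviceArray and read it back as an ndarray."""
    # Imported lazily so the sketch can be defined even where the
    # runtime is not installed.
    import numpy as np
    import iree.runtime as ireert

    device = ireert.get_device(device_uri)
    device_array = ireert.asdevicearray(
        device, np.asarray(values, dtype=np.float32)
    )
    # Shape and dtype are metadata and are available in either state.
    print(device_array.shape, device_array.dtype,
          device_array.is_host_accessible)
    # to_host() maps the memory or makes a copy, as described above.
    return device_array.to_host()
```

Calling roundtrip_through_device([1.0, 2.0, 3.0]) should return a NumPy array holding the same values, provided the assumed driver is present.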
- flag iree.runtime.ExternalTimepointType(value)
- Member Type
Valid values are as follows:
- NONE = 0
- WAIT_PRIMITIVE = 1
- CUDA_EVENT = 2
- HIP_EVENT = 3
- class iree.runtime.FileHandle
- property fd
(self) -> int
- property host_allocation
(self) -> object
- property is_fd
(self) -> bool
- property is_host_allocation
(self) -> bool
- wrap_fd = <nanobind.nb_func object>
- wrap_memory = <nanobind.nb_func object>
- class iree.runtime.FunctionInvoker(vm_context: iree._runtime_libs._runtime.VmContext, device: iree._runtime_libs._runtime.HalDevice, vm_function: iree._runtime_libs._runtime.VmFunction)
Wraps a VmFunction, enabling invocations against it.
- property vm_function: iree._runtime_libs._runtime.VmFunction
- class iree.runtime.HalAllocator
- allocate_buffer
Allocates a new buffer with requested characteristics (does not initialize with specific data).
- allocate_buffer_copy
Allocates a new buffer and initializes it from a Python buffer object. If an element type is specified, wraps in a BufferView matching the characteristics of the Python buffer. The format is requested as ND/C-Contiguous, which may incur copies if not already in that format.
- allocate_host_staging_buffer_copy
Allocates a new buffer and initializes it from a Python buffer object. The buffer is configured as optimal for use on the device as a transfer buffer. For buffers of unknown provenance, this is a last resort method for making them compatible for transfer to arbitrary devices.
- property formatted_statistics
(self) -> str
- property has_statistics
(self) -> bool
- query_buffer_compatibility
- property statistics
(self) -> dict
- trim
- class iree.runtime.HalBuffer
- allowed_usage
- byte_length
- create_view
- fill_zero
- map
- memory_type
- property ref
(self) -> iree::python::VmRef
- class iree.runtime.HalBufferView(*args, **kwargs)
- property byte_length
(self) -> int
- property element_type
(self) -> int
- get_buffer
- map
- property ref
(self) -> iree::python::VmRef
- property shape
(self) -> list
- class iree.runtime.HalCommandBuffer(*args, **kwargs)
- begin
- copy
Copies a range from a source to target buffer. If the length is not specified, then it is taken from the source/target buffer, which must match.
- end
- fill
- property ref
(self) -> iree::python::VmRef
- class iree.runtime.HalDevice
- property allocator
(self) -> iree::python::HalAllocator
- begin_profiling
- create_dlpack_capsule
- create_semaphore
- end_profiling
- flush_profiling
- from_dlpack_capsule
- queue_alloca
Reserves and returns a device-local queue-ordered transient buffer.
- Parameters
allocation_size – The size in bytes of the allocation.
wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. The allocation will be made once these semaphores are satisfied.
signal_semaphores – Semaphores/Fence to signal.
- Returns
HalBuffer.
- queue_copy
Copy data from a source buffer to destination buffer.
- Parameters
source_buffer – HalBuffer that holds the source data.
target_buffer – HalBuffer that will receive data.
wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. The copy will be performed once these semaphores are satisfied.
signal_semaphores – Semaphores/Fence to signal.
- queue_dealloca
Deallocates a queue-ordered transient buffer.
- Parameters
wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. The deallocation will be performed once these semaphores are satisfied.
signal_semaphores – Semaphores/Fence to signal.
- Returns
HalBuffer.
- queue_execute
Executes a sequence of command buffers.
- Parameters
command_buffers – Sequence of command buffers to enqueue.
wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. Execution will begin once these semaphores are satisfied.
signal_semaphores – Semaphores/Fence to signal.
- class iree.runtime.HalDeviceLoopBridge(*args, **kwargs)
Bridges device semaphore signalling to asyncio futures.
This is intended to be run alongside an asyncio loop, allowing arbitrary semaphore timepoints to be bridged to the loop, satisfying futures.
Internally, it starts a thread which spins to poll the requested semaphores (which all must be from the same device). It can be used in single-device cases as a simpler implementation than a full integration with an asyncio event loop, theoretically resulting in fewer heavy-weight, kernel/device synchronization interactions.
- on_semaphore
- stop
- class iree.runtime.HalDriver
- create_default_device
- create_device
- create_device_by_uri
- dump_device_info
- query = <nanobind.nb_func object>
- query_available_devices
- enum iree.runtime.HalElementType(value)
Valid values are as follows:
- NONE = HalElementType.NONE
- OPAQUE_8 = HalElementType.OPAQUE_8
- OPAQUE_16 = HalElementType.OPAQUE_16
- OPAQUE_32 = HalElementType.OPAQUE_32
- OPAQUE_64 = HalElementType.OPAQUE_64
- BOOL_8 = HalElementType.BOOL_8
- INT_4 = HalElementType.INT_4
- INT_8 = HalElementType.INT_8
- INT_16 = HalElementType.INT_16
- INT_32 = HalElementType.INT_32
- INT_64 = HalElementType.INT_64
- SINT_4 = HalElementType.SINT_4
- SINT_8 = HalElementType.SINT_8
- SINT_16 = HalElementType.SINT_16
- SINT_32 = HalElementType.SINT_32
- SINT_64 = HalElementType.SINT_64
- UINT_4 = HalElementType.UINT_4
- UINT_8 = HalElementType.UINT_8
- UINT_16 = HalElementType.UINT_16
- UINT_32 = HalElementType.UINT_32
- UINT_64 = HalElementType.UINT_64
- FLOAT_16 = HalElementType.FLOAT_16
- FLOAT_32 = HalElementType.FLOAT_32
- FLOAT_64 = HalElementType.FLOAT_64
- BFLOAT_16 = HalElementType.BFLOAT_16
- COMPLEX_64 = HalElementType.COMPLEX_64
- COMPLEX_128 = HalElementType.COMPLEX_128
- FLOAT_8_E4M3_FN = HalElementType.FLOAT_8_E4M3_FN
- FLOAT_8_E4M3_FNUZ = HalElementType.FLOAT_8_E4M3_FNUZ
- FLOAT_8_E5M2 = HalElementType.FLOAT_8_E5M2
- FLOAT_8_E5M2_FNUZ = HalElementType.FLOAT_8_E5M2_FNUZ
- FLOAT_8_E8M0_FNU = HalElementType.FLOAT_8_E8M0_FNU
The Enum and its members also have the following methods:
- map_to_dtype = <nanobind.nb_func object>
- is_byte_aligned = <nanobind.nb_func object>
- dense_byte_count = <nanobind.nb_func object>
- class iree.runtime.HalExternalTimepoint(*args, **kwargs)
- property compatibility
(self) -> int
- property cuda_event
(self) -> int
- property flags
(self) -> int
- property hip_event
(self) -> int
- property type
(self) -> int
- class iree.runtime.HalFence(*args, **kwargs)
- create_at = <nanobind.nb_func object>
- extend
- fail
- insert
- join = <nanobind.nb_func object>
- property ref
(self) -> iree::python::VmRef
- signal
- property timepoint_count
(self) -> int
- wait
Waits until the semaphore or fence is signalled or errored.
- Three wait cases are supported:
timeout: Relative nanoseconds to wait.
deadline: Absolute nanoseconds to wait.
Neither: Waits for infinite time.
Returns whether the wait succeeded (True) or timed out (False). If the fence was asynchronously failed, an exception is raised.
- class iree.runtime.HalModuleDebugSink(*args, **kwargs)
- property buffer_view_trace_callback
(self) -> collections.abc.Callable[[str, list[iree._runtime_libs._runtime.HalBufferView]], None]
- class iree.runtime.HalSemaphore
- export_timepoint
- fail
- import_timepoint
- query
- signal
- wait
Waits until the semaphore or fence is signalled or errored.
- Three wait cases are supported:
timeout: Relative nanoseconds to wait.
deadline: Absolute nanoseconds to wait.
Neither: Waits for infinite time.
Returns whether the wait succeeded (True) or timed out (False). If the semaphore failed asynchronously, an exception is raised.
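The three wait cases (shared by HalFence.wait and HalSemaphore.wait) can be made explicit with a small helper that builds the keyword arguments. This is a sketch only: the helper wait_arguments is hypothetical, and it assumes the bindings accept timeout and deadline as keyword arguments under the names used in the description above.

```python
def wait_arguments(timeout_s=None, deadline_ns=None):
    """Build keyword arguments for one of the three documented wait cases."""
    if timeout_s is not None and deadline_ns is not None:
        raise ValueError("pass a timeout or a deadline, not both")
    if timeout_s is not None:
        # Relative wait: convert seconds to nanoseconds.
        return {"timeout": round(timeout_s * 1_000_000_000)}
    if deadline_ns is not None:
        # Absolute wait: already expressed in nanoseconds.
        return {"deadline": int(deadline_ns)}
    # Neither given: wait indefinitely.
    return {}
```

A call might then look like semaphore.wait(**wait_arguments(timeout_s=0.5)), with the boolean result indicating success or timeout.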
- enum iree.runtime.Linkage(value)
Valid values are as follows:
- INTERNAL = Linkage.INTERNAL
- IMPORT = Linkage.IMPORT
- IMPORT_OPTIONAL = Linkage.IMPORT_OPTIONAL
- EXPORT = Linkage.EXPORT
- EXPORT_OPTIONAL = Linkage.EXPORT_OPTIONAL
- enum iree.runtime.MemoryAccess(value)
Valid values are as follows:
- NONE = MemoryAccess.NONE
- READ = MemoryAccess.READ
- WRITE = MemoryAccess.WRITE
- DISCARD = MemoryAccess.DISCARD
- DISCARD_WRITE = MemoryAccess.DISCARD_WRITE
- ALL = MemoryAccess.ALL
- enum iree.runtime.MemoryType(value)
Valid values are as follows:
- NONE = MemoryType.NONE
- OPTIMAL = MemoryType.OPTIMAL
- HOST_VISIBLE = MemoryType.HOST_VISIBLE
- HOST_COHERENT = MemoryType.HOST_COHERENT
- HOST_CACHED = MemoryType.HOST_CACHED
- HOST_LOCAL = MemoryType.HOST_LOCAL
- DEVICE_VISIBLE = MemoryType.DEVICE_VISIBLE
- DEVICE_LOCAL = MemoryType.DEVICE_LOCAL
- class iree.runtime.ParameterIndex(*args, **kwargs)
- add_buffer
- add_from_file_handle
- add_splat
- create_archive_file
- create_provider
- items
- load
- load_from_file_handle
- reserve
- class iree.runtime.ParameterIndexEntry
- property file_storage
(self) -> tuple
- property file_view
(self) -> object
- property is_file
(self) -> bool
- property is_splat
(self) -> bool
- property key
(self) -> str
- property length
(self) -> int
- property metadata
(self) -> bytes
- property splat_pattern
(self) -> bytes
- class iree.runtime.ParameterProvider
- class iree.runtime.PyModuleInterface(*args, **kwargs)
- create
- property destroyed
(self) -> bool
- export
- property initialized
(self) -> bool
- flag iree.runtime.SemaphoreCompatibility(value)
- Member Type
Valid values are as follows:
- NONE = 0
- HOST_WAIT = 1
- DEVICE_WAIT = 2
- HOST_SIGNAL = 4
- DEVICE_SIGNAL = 8
- HOST_ONLY = 5
- DEVICE_ONLY = 10
- ALL = 15
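The combined members are bitwise unions of the single-bit members. The sketch below mirrors the listed values in a stand-in enum.IntFlag so the relationships can be checked without the runtime installed. SemaphoreCompatibilitySketch and supports are hypothetical names, not the binding's own class or API.

```python
import enum


class SemaphoreCompatibilitySketch(enum.IntFlag):
    """Stand-in mirroring the values listed above (not the real binding)."""
    NONE = 0
    HOST_WAIT = 1
    DEVICE_WAIT = 2
    HOST_SIGNAL = 4
    DEVICE_SIGNAL = 8
    HOST_ONLY = HOST_WAIT | HOST_SIGNAL          # 1 | 4 == 5
    DEVICE_ONLY = DEVICE_WAIT | DEVICE_SIGNAL    # 2 | 8 == 10
    ALL = HOST_ONLY | DEVICE_ONLY                # 5 | 10 == 15


def supports(compatibility, required):
    """Return True if every bit in `required` is set in `compatibility`."""
    return (compatibility & required) == required
```

For example, supports(SemaphoreCompatibilitySketch.HOST_ONLY, SemaphoreCompatibilitySketch.HOST_WAIT) is True, while the same check against DEVICE_WAIT is False.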
- class iree.runtime.Shape(*args, **kwargs)
- class iree.runtime.SplatValue(pattern: Union[array.array, numpy.ndarray], count: Union[Sequence[int], int])
- class iree.runtime.SystemContext(vm_modules=None, config: Optional[iree.runtime.system_api.Config] = None)
Global system.
- add_module_dependency(name, minimum_version=0)
- add_vm_module(vm_module)
- add_vm_modules(vm_modules)
- property config: iree.runtime.system_api.Config
- property instance: iree._runtime_libs._runtime.VmInstance
- property modules: iree.runtime.system_api.BoundModules
- property vm_context: iree._runtime_libs._runtime.VmContext
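A typical way to combine Config, SystemContext and VmModule is sketched below. It assumes the iree.runtime package is installed and that flatbuffer holds the bytes of a compiled module. The argument order of VmModule.copy_buffer (instance first, then the buffer) and the index-style lookup on modules are assumptions drawn from common usage rather than from this reference. The helper name invoke_from_flatbuffer is hypothetical.

```python
def invoke_from_flatbuffer(flatbuffer, module_name, function_name, *args,
                           driver_name="local-task"):
    """Load a compiled module into a fresh SystemContext and call a function."""
    # Imported lazily so the sketch can be defined without the runtime.
    import iree.runtime as ireert

    config = ireert.Config(driver_name)
    context = ireert.SystemContext(config=config)
    # Assumption: copy_buffer takes the VM instance followed by the bytes.
    vm_module = ireert.VmModule.copy_buffer(context.instance, flatbuffer)
    context.add_vm_module(vm_module)
    # Assumption: bound modules and their functions can be looked up by name.
    function = context.modules[module_name][function_name]
    return function(*args)
```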
- class iree.runtime.VmContext(*args, **kwargs)
- property context_id
(self) -> int
- invoke
- register_modules
- class iree.runtime.VmFunction
- property linkage
(self) -> int
- property module_name
(self) -> str
- property name
(self) -> str
- property ordinal
(self) -> int
- property reflection
(self) -> dict
- class iree.runtime.VmInstance(*args, **kwargs)
- class iree.runtime.VmModule
- copy_buffer = <nanobind.nb_func object>
- from_buffer = <nanobind.nb_func object>
- from_flatbuffer = <nanobind.nb_func object>
- property function_names
(self) -> list
- lookup_function
- mmap = <nanobind.nb_func object>
- property name
(self) -> str
- resolve_module_dependency = <nanobind.nb_func object>
- property stashed_flatbuffer_blob
(self) -> object
- property version
(self) -> int
- wrap_buffer = <nanobind.nb_func object>
- class iree.runtime.VmVariantList(*args, **kwargs)
- get_as_list
- get_as_object
- get_as_ref
- get_serialized_trace_value
- get_variant
- push_float
- push_int
- push_list
- push_ref
- property ref
(self) -> iree::python::VmRef
- property size
(self) -> int
- iree.runtime.asdevicearray(device: iree._runtime_libs._runtime.HalDevice, a, dtype=None, *, implicit_host_transfer: bool = False, memory_type=MemoryType.DEVICE_LOCAL, allowed_usage=150998019, element_type: Optional[iree._runtime_libs._runtime.HalElementType] = None) iree.runtime.array_interop.DeviceArray
Helper to create a DeviceArray from an arbitrary array-like object.
This is similar in purpose and usage to np.asarray, except that it takes a device as the first argument. This may not be the best mechanism for getting a DeviceArray, depending on your use case, but it is reliable and simple. This function may make a defensive copy or cause implicit transfers to satisfy the request. If this is important to you, then a lower level API is likely more appropriate.
Note that additional flags memory_type, allowed_usage and element_type are only hints if creating a new DeviceArray. If a is already a DeviceArray, they are ignored.
- iree.runtime.benchmark_exe()
- iree.runtime.benchmark_module(module: Union[iree._runtime_libs._runtime.VmModule, os.PathLike], entry_function=None, inputs=[], timeout=None, **kwargs)
- iree.runtime.get_device(device_uri: str, cache: bool = True) iree._runtime_libs._runtime.HalDevice
Gets a cached device by URI.
- Parameters
device_uri – The URI of the device, either just a driver name for the default or a fully qualified “driver://path?params”.
cache – Whether to cache the device (default True).
- Returns
A HalDevice.
- iree.runtime.get_driver(device_uri: str) iree._runtime_libs._runtime.HalDriver
Returns a HAL driver by device_uri (or driver name).
- Parameters
device_uri – The URI of the device, either just a driver name for the default or a fully qualified “driver://path?params”.
- iree.runtime.get_first_device(device_uris: Optional[Sequence[str]] = None, cache: bool = True) iree._runtime_libs._runtime.HalDevice
Gets the first valid (cached) device for a prioritized list of names.
If no device_uris are given and the IREE_DEFAULT_DEVICE environment variable is set, it is treated as a comma-delimited list of driver names to try.
This is meant to be used for default/automagic startup and is not suitable for any kind of multi-device setup.
- Parameters
device_uris – Explicit list of device URIs to try.
cache – Whether to cache the device (default True).
- Returns
A HalDevice instance.
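The fallback order described for get_first_device can be illustrated in plain Python. This is a sketch of the documented behavior, not the library's implementation. The helper candidate_uris is hypothetical, and stripping whitespace around each entry is an added assumption.

```python
import os


def candidate_uris(device_uris=None, environ=os.environ):
    """Return the device URIs to try, in priority order."""
    if device_uris:
        # An explicit list takes precedence.
        return list(device_uris)
    # Otherwise fall back to the comma-delimited environment variable.
    raw = environ.get("IREE_DEFAULT_DEVICE", "")
    return [part.strip() for part in raw.split(",") if part.strip()]
```

With IREE_DEFAULT_DEVICE set to "vulkan,local-task" and no explicit list, the helper yields ["vulkan", "local-task"], and the first entry that produces a valid device would be used.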
- iree.runtime.load_vm_flatbuffer(vm_flatbuffer: bytes, *, driver: Optional[str] = None, backend: Optional[str] = None) iree.runtime.system_api.BoundModule
Loads a VM Flatbuffer into a callable module.
Either ‘driver’ or ‘backend’ must be specified.
Note that this API makes a defensive copy to ensure proper alignment and is therefore not suitable for large flatbuffers. See load_vm_flatbuffer_file() or mmap APIs on VmModule.
- iree.runtime.load_vm_flatbuffer_file(path: str, *, driver: Optional[str] = None, backend: Optional[str] = None, destroy_callback=None) iree.runtime.system_api.BoundModule
Loads a file containing a VM Flatbuffer into a callable module.
Either ‘driver’ or ‘backend’ must be specified.
Note that this delegates to the lower level VmModule.mmap() API, which, as the name implies, memory maps the file. This can be fiddly across platforms and for maximum compatibility, ensure that the file is not otherwise open for write or deleted while in use.
If provided, ‘destroy_callback’ will be passed to VmModule.mmap and will be invoked when no further references to the mapping exist. This can be used to clean up test state, etc. (in a Windows compatible way).
- iree.runtime.load_vm_module(vm_module, config: Optional[iree.runtime.system_api.Config] = None)
Loads a VmModule into a new SystemContext and returns it.
- iree.runtime.load_vm_modules(*vm_modules, config: Optional[iree.runtime.system_api.Config] = None)
Loads VmModules into a new SystemContext and returns them.
- iree.runtime.normalize_value(value: Any) Optional[Union[numpy.ndarray, List[Any], Tuple[Any]]]
Normalizes the given value for input to (or comparison with) IREE.
- iree.runtime.parameter_index_add_numpy_ndarray(index: iree._runtime_libs._runtime.ParameterIndex, name: str, array: numpy.ndarray)
Adds an ndarray to the index.
- iree.runtime.parameter_index_entry_as_numpy_flat_ndarray(index_entry: iree._runtime_libs._runtime.ParameterIndexEntry) numpy.ndarray
Accesses the contents as a uint8 flat tensor.
If it is a splat, then the tensor will be a view of the splat pattern.
Raises a ValueError on unsupported entries.
- iree.runtime.parameter_index_entry_as_numpy_ndarray(index_entry: iree._runtime_libs._runtime.ParameterIndexEntry) numpy.ndarray
Returns a tensor viewed with appropriate shape/dtype from metadata.
Raises a ValueError if unsupported.
- iree.runtime.query_available_drivers() Collection[str]
Returns a collection of driver names that are available.
- iree.runtime.save_archive_file(entries: dict[str, typing.Union[typing.Any, iree.runtime.io.SplatValue]], file_path: os.PathLike)
Creates an IRPA (IREE Parameter Archive) from contents.
Similar to the safetensors.numpy.save_file function, this takes a dictionary of key-value pairs where the value is a buffer. It writes a file with the contents.
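A minimal sketch of writing an archive follows, assuming the iree.runtime package and NumPy are installed. The entry names "weight" and "bias" and the helper save_example_archive are illustrative only.

```python
def save_example_archive(file_path):
    """Write two small arrays to an IRPA file and return the entry names."""
    # Imported lazily so the sketch can be defined without the runtime.
    import numpy as np
    import iree.runtime as ireert

    entries = {
        "weight": np.ones((2, 3), dtype=np.float32),
        "bias": np.zeros((3,), dtype=np.float32),
    }
    ireert.save_archive_file(entries, file_path)
    return sorted(entries)
```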