Runtime API

IREE runtime Python bindings.

enum iree.runtime.BufferCompatibility(value)

Valid values are as follows:

NONE = BufferCompatibility.NONE
ALLOCATABLE = BufferCompatibility.ALLOCATABLE
IMPORTABLE = BufferCompatibility.IMPORTABLE
EXPORTABLE = BufferCompatibility.EXPORTABLE
QUEUE_TRANSFER = BufferCompatibility.QUEUE_TRANSFER
QUEUE_DISPATCH = BufferCompatibility.QUEUE_DISPATCH
enum iree.runtime.BufferUsage(value)

Valid values are as follows:

NONE = BufferUsage.NONE
TRANSFER_SOURCE = BufferUsage.TRANSFER_SOURCE
TRANSFER_TARGET = BufferUsage.TRANSFER_TARGET
TRANSFER = BufferUsage.TRANSFER
DISPATCH_INDIRECT_PARAMETERS = BufferUsage.DISPATCH_INDIRECT_PARAMETERS
DISPATCH_UNIFORM_READ = BufferUsage.DISPATCH_UNIFORM_READ
DISPATCH_STORAGE_READ = BufferUsage.DISPATCH_STORAGE_READ
DISPATCH_STORAGE_WRITE = BufferUsage.DISPATCH_STORAGE_WRITE
DISPATCH_STORAGE = BufferUsage.DISPATCH_STORAGE
DISPATCH_IMAGE_READ = BufferUsage.DISPATCH_IMAGE_READ
DISPATCH_IMAGE_WRITE = BufferUsage.DISPATCH_IMAGE_WRITE
DISPATCH_IMAGE = BufferUsage.DISPATCH_IMAGE
SHARING_EXPORT = BufferUsage.SHARING_EXPORT
SHARING_REPLICATE = BufferUsage.SHARING_REPLICATE
SHARING_CONCURRENT = BufferUsage.SHARING_CONCURRENT
SHARING_IMMUTABLE = BufferUsage.SHARING_IMMUTABLE
MAPPING_SCOPED = BufferUsage.MAPPING_SCOPED
MAPPING_PERSISTENT = BufferUsage.MAPPING_PERSISTENT
MAPPING_OPTIONAL = BufferUsage.MAPPING_OPTIONAL
MAPPING_ACCESS_RANDOM = BufferUsage.MAPPING_ACCESS_RANDOM
MAPPING_ACCESS_SEQUENTIAL_WRITE = BufferUsage.MAPPING_ACCESS_SEQUENTIAL_WRITE
MAPPING = BufferUsage.MAPPING
DEFAULT = BufferUsage.DEFAULT
class iree.runtime.Config(driver_name: Optional[str] = None, *, device: Optional[iree._runtime_libs._runtime.HalDevice] = None)

System configuration.

default_vm_modules: Tuple[iree._runtime_libs._runtime.VmModule, ...]
device: iree._runtime_libs._runtime.HalDevice
vm_instance: iree._runtime_libs._runtime.VmInstance
class iree.runtime.DeviceArray(device: iree._runtime_libs._runtime.HalDevice, buffer_view: iree._runtime_libs._runtime.HalBufferView, implicit_host_transfer: bool = False, override_dtype=None)

An IREE device array.

Device arrays can be in one of two states:

  1. Host accessible: The array will be backed by host accessible memory and can have the usual things done with it that one expects to be able to do with an ndarray.

  2. Device resident: The array is just a handle to a device resident Buffer (and BufferView wrapper). Metadata about the array are accessible (shape and dtype) but anything that touches the data cannot be accessed in this state.

How a device array comes into existence controls how it can transition between these states:

  • A user can create a DeviceArray explicitly with a device allocator. Such an array will not be implicitly convertible to host accessible, although accessors exist to do so.

  • When created by the platform with a synchronization policy, then implicit transfer back to the host will trigger appropriate waits and be performed automatically (this is the common case for function return values if not otherwise configured, as an example).

astype(dtype, casting='unsafe', copy=True)
property dtype
property is_host_accessible

Whether this array is currently host accessible.

reshape(*args)
property shape
to_host() → numpy.ndarray

Return the array as a host-accessible NumPy ndarray. This may map the memory or create a copy, depending on whether the array is mappable to the host.
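As a rough illustration of the two states described above, the sketch below uploads a host array with asdevicearray() and reads it back explicitly with to_host(). The helper name roundtrip_through_device is hypothetical, and iree.runtime is imported inside the function only so that the snippet can be loaded on a machine without the runtime installed.

```python
def roundtrip_through_device(device_uri, host_array):
    """Uploads host_array to a device and copies it back to the host.

    Hypothetical helper for illustration; not part of iree.runtime.
    """
    # Imported lazily so this sketch can be imported without IREE installed.
    import iree.runtime as ireert

    device = ireert.get_device(device_uri)
    # With implicit_host_transfer=False the array is not implicitly
    # convertible to host accessible; the transfer is requested explicitly.
    device_array = ireert.asdevicearray(
        device, host_array, implicit_host_transfer=False
    )
    # to_host() may map the device memory or create a copy.
    return device_array.to_host()
```

On a machine where a CPU driver such as "local-task" is available, calling roundtrip_through_device("local-task", some_ndarray) would be expected to return an ndarray equal to the input; the driver name here is an assumption about the local installation.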

flag iree.runtime.ExternalTimepointFlags(value)
Member Type

int

Valid values are as follows:

NONE = 0
flag iree.runtime.ExternalTimepointType(value)
Member Type

int

Valid values are as follows:

NONE = 0
WAIT_PRIMITIVE = 1
CUDA_EVENT = 2
HIP_EVENT = 3
class iree.runtime.FileHandle
property fd

(self) -> int

property host_allocation

(self) -> object

property is_fd

(self) -> bool

property is_host_allocation

(self) -> bool

wrap_fd = <nanobind.nb_func object>
wrap_memory = <nanobind.nb_func object>
class iree.runtime.FunctionInvoker(vm_context: iree._runtime_libs._runtime.VmContext, device: iree._runtime_libs._runtime.HalDevice, vm_function: iree._runtime_libs._runtime.VmFunction)

Wraps a VmFunction, enabling invocations against it.

property vm_function: iree._runtime_libs._runtime.VmFunction
class iree.runtime.HalAllocator
allocate_buffer

Allocates a new buffer with requested characteristics (does not initialize with specific data).

allocate_buffer_copy

Allocates a new buffer and initializes it from a Python buffer object. If an element type is specified, wraps in a BufferView matching the characteristics of the Python buffer. The format is requested as ND/C-Contiguous, which may incur copies if not already in that format.

allocate_host_staging_buffer_copy

Allocates a new buffer and initializes it from a Python buffer object. The buffer is configured as optimal for use on the device as a transfer buffer. For buffers of unknown provenance, this is a last-resort method for making them compatible for transfer to arbitrary devices.

property formatted_statistics

(self) -> str

property has_statistics

(self) -> bool

query_buffer_compatibility
property statistics

(self) -> dict

trim
class iree.runtime.HalBuffer
allowed_usage
byte_length
create_view
fill_zero
map
memory_type
property ref

(self) -> iree::python::VmRef

class iree.runtime.HalBufferView(*args, **kwargs)
property byte_length

(self) -> int

property element_type

(self) -> int

get_buffer
map
property ref

(self) -> iree::python::VmRef

property shape

(self) -> list

class iree.runtime.HalCommandBuffer(*args, **kwargs)
begin
copy

Copies a range from a source to target buffer. If the length is not specified, then it is taken from the source/target buffer, which must match.

end
fill
property ref

(self) -> iree::python::VmRef

class iree.runtime.HalDevice
property allocator

(self) -> iree::python::HalAllocator

begin_profiling
create_dlpack_capsule
create_semaphore
end_profiling
flush_profiling
from_dlpack_capsule
queue_alloca

Reserves and returns a device-local queue-ordered transient buffer.

Parameters
  • allocation_size – The size in bytes of the allocation.

  • wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. The allocation will be made once these semaphores are satisfied.

  • signal_semaphores – Semaphores/Fence to signal.

Returns

HalBuffer.

queue_copy

Copy data from a source buffer to destination buffer.

Parameters
  • source_buffer – HalBuffer that holds the source data.

  • target_buffer – HalBuffer that will receive the data.

  • wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. The copy will be performed once these semaphores are satisfied.

  • signal_semaphores – Semaphores/Fence to signal.

queue_dealloca

Deallocates a queue-ordered transient buffer.

Parameters
  • wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. The deallocation will be performed once these semaphores are satisfied.

  • signal_semaphores – Semaphores/Fence to signal.

Returns

HalBuffer.

queue_execute

Executes a sequence of command buffers.

Parameters
  • command_buffers – Sequence of command buffers to enqueue.

  • wait_semaphores – List[Tuple[HalSemaphore, int]] of semaphore values or a HalFence. The command buffers will be executed once these semaphores are satisfied.

  • signal_semaphores – Semaphores/Fence to signal.

class iree.runtime.HalDeviceLoopBridge(*args, **kwargs)

Bridges device semaphore signalling to asyncio futures.

This is intended to be run alongside an asyncio loop, allowing arbitrary semaphore timepoints to be bridged to the loop, satisfying futures.

Internally, it starts a thread which spins to poll the requested semaphores (which all must be from the same device). It can be used in single-device cases as a simpler implementation than a full integration with an asyncio event loop, theoretically resulting in fewer heavy-weight, kernel/device synchronization interactions.

on_semaphore
stop
class iree.runtime.HalDriver
create_default_device
create_device
create_device_by_uri
dump_device_info
query = <nanobind.nb_func object>
query_available_devices
enum iree.runtime.HalElementType(value)

Valid values are as follows:

NONE = HalElementType.NONE
OPAQUE_8 = HalElementType.OPAQUE_8
OPAQUE_16 = HalElementType.OPAQUE_16
OPAQUE_32 = HalElementType.OPAQUE_32
OPAQUE_64 = HalElementType.OPAQUE_64
BOOL_8 = HalElementType.BOOL_8
INT_4 = HalElementType.INT_4
INT_8 = HalElementType.INT_8
INT_16 = HalElementType.INT_16
INT_32 = HalElementType.INT_32
INT_64 = HalElementType.INT_64
SINT_4 = HalElementType.SINT_4
SINT_8 = HalElementType.SINT_8
SINT_16 = HalElementType.SINT_16
SINT_32 = HalElementType.SINT_32
SINT_64 = HalElementType.SINT_64
UINT_4 = HalElementType.UINT_4
UINT_8 = HalElementType.UINT_8
UINT_16 = HalElementType.UINT_16
UINT_32 = HalElementType.UINT_32
UINT_64 = HalElementType.UINT_64
FLOAT_16 = HalElementType.FLOAT_16
FLOAT_32 = HalElementType.FLOAT_32
FLOAT_64 = HalElementType.FLOAT_64
BFLOAT_16 = HalElementType.BFLOAT_16
COMPLEX_64 = HalElementType.COMPLEX_64
COMPLEX_128 = HalElementType.COMPLEX_128
FLOAT_8_E4M3_FN = HalElementType.FLOAT_8_E4M3_FN
FLOAT_8_E4M3_FNUZ = HalElementType.FLOAT_8_E4M3_FNUZ
FLOAT_8_E5M2 = HalElementType.FLOAT_8_E5M2
FLOAT_8_E5M2_FNUZ = HalElementType.FLOAT_8_E5M2_FNUZ
FLOAT_8_E8M0_FNU = HalElementType.FLOAT_8_E8M0_FNU

The Enum and its members also have the following methods:

map_to_dtype = <nanobind.nb_func object>
is_byte_aligned = <nanobind.nb_func object>
dense_byte_count = <nanobind.nb_func object>
class iree.runtime.HalExternalTimepoint(*args, **kwargs)
property compatibility

(self) -> int

property cuda_event

(self) -> int

property flags

(self) -> int

property hip_event

(self) -> int

property type

(self) -> int

class iree.runtime.HalFence(*args, **kwargs)
create_at = <nanobind.nb_func object>
extend
fail
insert
join = <nanobind.nb_func object>
property ref

(self) -> iree::python::VmRef

signal
property timepoint_count

(self) -> int

wait

Waits until the semaphore or fence is signalled or errored.

Three wait cases are supported:
  • timeout: Relative nanoseconds to wait.

  • deadline: Absolute nanoseconds to wait.

  • Neither: Waits for infinite time.

Returns whether the wait succeeded (True) or timed out (False). If the fence was asynchronously failed, an exception is raised.
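The three wait cases can be wrapped in a small helper that takes seconds instead of nanoseconds. This is only a sketch: wait_for_fence is a hypothetical name, and it assumes that wait() accepts the relative timeout as a keyword argument named timeout, as the list above suggests. The deadline case is left out because the clock used for absolute times is not stated here.

```python
from typing import Optional

NANOS_PER_SECOND = 1_000_000_000


def wait_for_fence(fence, timeout_seconds: Optional[float] = None) -> bool:
    """Waits on a HalFence-like object, taking the timeout in seconds.

    Hypothetical convenience wrapper; not part of iree.runtime.
    Returns True if the wait succeeded and False if it timed out.
    """
    if timeout_seconds is None:
        # Neither timeout nor deadline given: wait indefinitely.
        return fence.wait()
    if timeout_seconds < 0:
        raise ValueError("timeout_seconds must be non-negative")
    # wait() takes a relative timeout expressed in nanoseconds.
    timeout_ns = round(timeout_seconds * NANOS_PER_SECOND)
    return fence.wait(timeout=timeout_ns)
```

Because the helper only relies on a wait() method, it can be exercised with any object that provides one; an exception raised by an asynchronously failed fence simply propagates to the caller.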

class iree.runtime.HalModuleDebugSink(*args, **kwargs)
property buffer_view_trace_callback

(self) -> collections.abc.Callable[[str, list[iree._runtime_libs._runtime.HalBufferView]], None]

class iree.runtime.HalSemaphore
export_timepoint
fail
import_timepoint
query
signal
wait

Waits until the semaphore or fence is signalled or errored.

Three wait cases are supported:
  • timeout: Relative nanoseconds to wait.

  • deadline: Absolute nanoseconds to wait.

  • Neither: Waits for infinite time.

Returns whether the wait succeeded (True) or timed out (False). If the semaphore was asynchronously failed, an exception is raised.

enum iree.runtime.Linkage(value)

Valid values are as follows:

INTERNAL = Linkage.INTERNAL
IMPORT = Linkage.IMPORT
IMPORT_OPTIONAL = Linkage.IMPORT_OPTIONAL
EXPORT = Linkage.EXPORT
EXPORT_OPTIONAL = Linkage.EXPORT_OPTIONAL
class iree.runtime.MappedMemory
asarray
enum iree.runtime.MemoryAccess(value)

Valid values are as follows:

NONE = MemoryAccess.NONE
READ = MemoryAccess.READ
WRITE = MemoryAccess.WRITE
DISCARD = MemoryAccess.DISCARD
DISCARD_WRITE = MemoryAccess.DISCARD_WRITE
ALL = MemoryAccess.ALL
enum iree.runtime.MemoryType(value)

Valid values are as follows:

NONE = MemoryType.NONE
OPTIMAL = MemoryType.OPTIMAL
HOST_VISIBLE = MemoryType.HOST_VISIBLE
HOST_COHERENT = MemoryType.HOST_COHERENT
HOST_CACHED = MemoryType.HOST_CACHED
HOST_LOCAL = MemoryType.HOST_LOCAL
DEVICE_VISIBLE = MemoryType.DEVICE_VISIBLE
DEVICE_LOCAL = MemoryType.DEVICE_LOCAL
class iree.runtime.ParameterIndex(*args, **kwargs)
add_buffer
add_from_file_handle
add_splat
create_archive_file
create_provider
items
load
load_from_file_handle
reserve
class iree.runtime.ParameterIndexEntry
property file_storage

(self) -> tuple

property file_view

(self) -> object

property is_file

(self) -> bool

property is_splat

(self) -> bool

property key

(self) -> str

property length

(self) -> int

property metadata

(self) -> bytes

property splat_pattern

(self) -> bytes

class iree.runtime.ParameterProvider
class iree.runtime.PyModuleInterface(*args, **kwargs)
create
property destroyed

(self) -> bool

export
property initialized

(self) -> bool

flag iree.runtime.SemaphoreCompatibility(value)
Member Type

int

Valid values are as follows:

NONE = 0
HOST_WAIT = 1
DEVICE_WAIT = 2
HOST_SIGNAL = 4
DEVICE_SIGNAL = 8
HOST_ONLY = 5
DEVICE_ONLY = 10
ALL = 15
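The composite values listed above are consistent with bitwise ORs of the single-bit members (5 = 1 | 4, 10 = 2 | 8, 15 = 5 | 10). The sketch below reproduces that arithmetic with a stand-in enum.IntFlag class from the standard library. SemaphoreCompatibilitySketch and supports are hypothetical names for illustration and are not the real iree.runtime objects.

```python
import enum


class SemaphoreCompatibilitySketch(enum.IntFlag):
    """Stand-in mirroring the values listed above (not the real class)."""

    NONE = 0
    HOST_WAIT = 1
    DEVICE_WAIT = 2
    HOST_SIGNAL = 4
    DEVICE_SIGNAL = 8
    HOST_ONLY = HOST_WAIT | HOST_SIGNAL        # 1 | 4 == 5
    DEVICE_ONLY = DEVICE_WAIT | DEVICE_SIGNAL  # 2 | 8 == 10
    ALL = HOST_ONLY | DEVICE_ONLY              # 5 | 10 == 15


def supports(compatibility: int, required: int) -> bool:
    """Returns True if every bit in `required` is set in `compatibility`."""
    return (compatibility & required) == required
```

A check such as supports(value, SemaphoreCompatibilitySketch.HOST_WAIT) works the same way on a plain int, which is convenient for properties that are documented as returning int.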
class iree.runtime.Shape(*args, **kwargs)
class iree.runtime.SplatValue(pattern: Union[array.array, numpy.ndarray], count: Union[Sequence[int], int])
class iree.runtime.SystemContext(vm_modules=None, config: Optional[iree.runtime.system_api.Config] = None)

Global system.

add_module_dependency(name, minimum_version=0)
add_vm_module(vm_module)
add_vm_modules(vm_modules)
property config: iree.runtime.system_api.Config
property instance: iree._runtime_libs._runtime.VmInstance
property is_dynamic: bool
property modules: iree.runtime.system_api.BoundModules
property vm_context: iree._runtime_libs._runtime.VmContext
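A minimal sketch of how Config, SystemContext, and VmModule might be combined is shown below. The helper name load_module_from_bytes is hypothetical. The sketch assumes that VmModule.copy_buffer takes a VmInstance followed by the module bytes, and that the modules property supports lookup by module name; neither detail is spelled out in this reference, so check both against the installed version.

```python
def load_module_from_bytes(flatbuffer_bytes, driver_name="local-task"):
    """Registers a compiled module in a new SystemContext.

    Hypothetical helper for illustration; not part of iree.runtime.
    Returns the bound module registered under the module's own name.
    """
    # Imported lazily so this sketch can be imported without IREE installed.
    import iree.runtime as ireert

    config = ireert.Config(driver_name)
    context = ireert.SystemContext(config=config)
    # Assumption: copy_buffer(instance, buffer) copies the bytes into a
    # VmModule that belongs to the context's VmInstance.
    vm_module = ireert.VmModule.copy_buffer(context.instance, flatbuffer_bytes)
    context.add_vm_module(vm_module)
    # Assumption: modules can be indexed by module name.
    return context.modules[vm_module.name]
```

The default driver name "local-task" is an assumption about which drivers are installed; query_available_drivers() reports what is actually available.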
class iree.runtime.VmBuffer(*args, **kwargs)
property ref

(self) -> iree::python::VmRef

class iree.runtime.VmContext(*args, **kwargs)
property context_id

(self) -> int

invoke
register_modules
class iree.runtime.VmFunction
property linkage

(self) -> int

property module_name

(self) -> str

property name

(self) -> str

property ordinal

(self) -> int

property reflection

(self) -> dict

class iree.runtime.VmInstance(*args, **kwargs)
class iree.runtime.VmModule
copy_buffer = <nanobind.nb_func object>
from_buffer = <nanobind.nb_func object>
from_flatbuffer = <nanobind.nb_func object>
property function_names

(self) -> list

lookup_function
mmap = <nanobind.nb_func object>
property name

(self) -> str

resolve_module_dependency = <nanobind.nb_func object>
property stashed_flatbuffer_blob

(self) -> object

property version

(self) -> int

wrap_buffer = <nanobind.nb_func object>
class iree.runtime.VmRef
deref
isinstance
class iree.runtime.VmVariantList(*args, **kwargs)
get_as_list
get_as_object
get_as_ref
get_serialized_trace_value
get_variant
push_float
push_int
push_list
push_ref
property ref

(self) -> iree::python::VmRef

property size

(self) -> int

iree.runtime.asdevicearray(device: iree._runtime_libs._runtime.HalDevice, a, dtype=None, *, implicit_host_transfer: bool = False, memory_type=MemoryType.DEVICE_LOCAL, allowed_usage=150998019, element_type: Optional[iree._runtime_libs._runtime.HalElementType] = None) → iree.runtime.array_interop.DeviceArray

Helper to create a DeviceArray from an arbitrary array-like object.

This is similar in purpose and usage to np.asarray, except that it takes a device as the first argument. This may not be the best mechanism for getting a DeviceArray, depending on your use case, but it is reliable and simple. This function may make a defensive copy or cause implicit transfers to satisfy the request. If this is important to you, then a lower level API is likely more appropriate.

Note that additional flags memory_type, allowed_usage and element_type are only hints if creating a new DeviceArray. If a is already a DeviceArray, they are ignored.

iree.runtime.benchmark_exe()
iree.runtime.benchmark_module(module: Union[iree._runtime_libs._runtime.VmModule, os.PathLike], entry_function=None, inputs=[], timeout=None, **kwargs)
iree.runtime.get_device(device_uri: str, cache: bool = True) → iree._runtime_libs._runtime.HalDevice

Gets a cached device by URI.

Parameters
  • device_uri – The URI of the device, either just a driver name for the default or a fully qualified “driver://path?params”.

  • cache – Whether to cache the device (default True).

Returns

A HalDevice.

iree.runtime.get_driver(device_uri: str) → iree._runtime_libs._runtime.HalDriver

Returns a HAL driver by device_uri (or driver name).

Parameters

device_uri – The URI of the device, either just a driver name for the default or a fully qualified “driver://path?params”.
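To make the two accepted URI shapes concrete, the sketch below splits a string of the form "driver://path?params" into its parts using only the standard library. It is illustrative and is not the parser the runtime uses; split_device_uri is a hypothetical name.

```python
def split_device_uri(device_uri: str):
    """Splits 'driver://path?params' into (driver, path, params).

    Illustrative only; this is not the parser the runtime uses.
    A bare driver name yields an empty path and empty params.
    """
    driver, _, rest = device_uri.partition("://")
    path, _, params = rest.partition("?")
    return driver, path, params
```

For a bare driver name such as "local-task", the path and params come back as empty strings, which corresponds to asking for the driver's default device.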

iree.runtime.get_first_device(device_uris: Optional[Sequence[str]] = None, cache: bool = True) → iree._runtime_libs._runtime.HalDevice

Gets the first valid (cached) device for a prioritized list of names.

If no device_uris are given and the IREE_DEFAULT_DEVICE environment variable is set, it is treated as a comma-delimited list of driver names to try.

This is meant to be used for default/automagic startup and is not suitable for any kind of multi-device setup.

Parameters
  • device_uris – Explicit list of device URIs to try.

  • cache – Whether to cache the device (default True).

Returns

A HalDevice instance.
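The candidate selection described above can be sketched in plain Python. This is not the library's implementation: candidate_device_uris is a hypothetical name, stripping whitespace around each name is an assumption, and the sketch returns an empty list when neither source is available because the fallback in that case is not described here.

```python
import os
from typing import List, Mapping, Optional, Sequence


def candidate_device_uris(
    device_uris: Optional[Sequence[str]] = None,
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    """Returns the prioritized list of names that would be tried.

    Hypothetical helper for illustration; not part of iree.runtime.
    """
    if device_uris is not None:
        # An explicit list always wins over the environment.
        return list(device_uris)
    raw = environ.get("IREE_DEFAULT_DEVICE")
    if raw is None:
        return []
    # Assumption: surrounding whitespace and empty items are ignored.
    return [name.strip() for name in raw.split(",") if name.strip()]
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.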

iree.runtime.load_vm_flatbuffer(vm_flatbuffer: bytes, *, driver: Optional[str] = None, backend: Optional[str] = None) → iree.runtime.system_api.BoundModule

Loads a VM Flatbuffer into a callable module.

Either ‘driver’ or ‘backend’ must be specified.

Note that this API makes a defensive copy to ensure proper alignment and is therefore not suitable for large flatbuffers. See load_vm_flatbuffer_file() or mmap APIs on VmModule.

iree.runtime.load_vm_flatbuffer_file(path: str, *, driver: Optional[str] = None, backend: Optional[str] = None, destroy_callback=None) → iree.runtime.system_api.BoundModule

Loads a file containing a VM Flatbuffer into a callable module.

Either ‘driver’ or ‘backend’ must be specified.

Note that this delegates to the lower level VmModule.mmap() API, which, as the name implies, memory maps the file. This can be fiddly across platforms and for maximum compatibility, ensure that the file is not otherwise open for write or deleted while in use.

If provided, ‘destroy_callback’ will be passed to VmModule.mmap and will be invoked when no further references to the mapping exist. This can be used to clean up test state, etc (in a Windows compatible way).

iree.runtime.load_vm_module(vm_module, config: Optional[iree.runtime.system_api.Config] = None)

Loads a VmModule into a new SystemContext and returns it.

iree.runtime.load_vm_modules(*vm_modules, config: Optional[iree.runtime.system_api.Config] = None)

Loads VmModules into a new SystemContext and returns them.

iree.runtime.normalize_value(value: Any) → Optional[Union[numpy.ndarray, List[Any], Tuple[Any]]]

Normalizes the given value for input to (or comparison with) IREE.

iree.runtime.parameter_index_add_numpy_ndarray(index: iree._runtime_libs._runtime.ParameterIndex, name: str, array: numpy.ndarray)

Adds an ndarray to the index.

iree.runtime.parameter_index_entry_as_numpy_flat_ndarray(index_entry: iree._runtime_libs._runtime.ParameterIndexEntry) → numpy.ndarray

Accesses the contents as a uint8 flat tensor.

If it is a splat, then the tensor will be a view of the splat pattern.

Raises a ValueError on unsupported entries.

iree.runtime.parameter_index_entry_as_numpy_ndarray(index_entry: iree._runtime_libs._runtime.ParameterIndexEntry) → numpy.ndarray

Returns a tensor viewed with appropriate shape/dtype from metadata.

Raises a ValueError if unsupported.

iree.runtime.query_available_drivers() → Collection[str]

Returns a collection of driver names that are available.

iree.runtime.save_archive_file(entries: dict[str, typing.Union[typing.Any, iree.runtime.io.SplatValue]], file_path: os.PathLike)

Creates an IRPA (IREE Parameter Archive) from contents.

Similar to the safetensors.numpy.save_file function, this takes a dictionary of key-value pairs where the value is a buffer. It writes a file with the contents.
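A possible round trip through an archive is sketched below, combining save_archive_file with ParameterIndex.load, ParameterIndex.items, and parameter_index_entry_as_numpy_ndarray from this page. The helper name archive_roundtrip is hypothetical, and two details are assumptions: that load() accepts a string path and that items() yields (key, entry) pairs.

```python
def archive_roundtrip(arrays, file_path):
    """Writes named ndarrays to an IRPA file and reads them back.

    Hypothetical helper for illustration; not part of iree.runtime.
    `arrays` is a dict mapping parameter names to NumPy ndarrays.
    """
    # Imported lazily so this sketch can be imported without IREE installed.
    import iree.runtime as ireert

    ireert.save_archive_file(arrays, file_path)

    index = ireert.ParameterIndex()
    # Assumption: load() takes the archive path as a string.
    index.load(str(file_path))
    # Assumption: items() yields (key, ParameterIndexEntry) pairs.
    # Each array is copied so the result does not depend on the mapped file.
    return {
        key: ireert.parameter_index_entry_as_numpy_ndarray(entry).copy()
        for key, entry in index.items()
    }
```

If the assumptions hold, the returned dictionary would be expected to contain arrays with the same shapes, dtypes, and contents as the input.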