Release Notes: Rolling
This document outlines changes introduced to the Intel® software for general-purpose GPU capabilities in rolling releases. As the software includes several different projects, the changes for each release are grouped by project.
Support for each rolling release continues only until the next rolling release becomes available, with no updates provided for previous rolling releases. Therefore, we recommend upgrading to the latest rolling release as soon as it becomes available. To install packages for the latest rolling release, refer to the installation guide for your distribution. For a list of packages published on repositories.intel.com/gpu for each release and operating system, see Provided Packages.
2025-07-24
The 2523.12 release supports the following operating systems:
Red Hat Enterprise Linux (RHEL): 8.10, 9.4, and 9.6
Ubuntu 22.04 and 24.04
SUSE Linux Enterprise (SLES): 15 SP4, 15 SP5, and 15 SP6
Known issues
In this release, the OpenCL compiler enforces stricter type conversion checks. Kernel code that implicitly converts a global pointer to an integer without an explicit cast may now fail to compile with an “incompatible pointer to integer conversion” error. Most applications are unaffected, but if you encounter this error, update your kernel code to add an explicit cast or compile with the -Wno-error=int-conversion flag.
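The explicit-cast fix can be sketched in plain C, where the same pointer-to-integer rule applies. This is a minimal illustration, not code from any Intel project: the helper names address_of and byte_offset are hypothetical, and an ordinary host pointer stands in for the kernel's global pointer. In kernel code, the cast is applied to the global pointer in the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the address held in a pointer as an integer.
 * Writing `uintptr_t addr = p;` without the cast is the implicit conversion
 * that stricter compilers reject; the explicit cast states the intent. */
uintptr_t address_of(const float *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: byte distance from the start of a buffer to one of
 * its elements, computed on the integer addresses. */
size_t byte_offset(const float *base, const float *elem)
{
    uintptr_t lo = address_of(base);
    uintptr_t hi = address_of(elem);
    assert(hi >= lo); /* elem must not precede base */
    return (size_t)(hi - lo);
}
```

Adding the cast is usually preferable to suppressing the diagnostic, because it keeps the stricter check active for the rest of the kernel source.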
Features
Intel® XPU Manager and XPU System Management Interface
Added support for updating specific GPU firmware in recovery mode.
Introduced security improvements.
Improvements
Intel® Graphics Driver Backports for Linux* OS (i915)
Improved GPU error reporting by including UUID resources for better diagnostics.
Enhanced responsiveness during memory management tasks.
Refined tbb thread handling to improve scheduling efficiency, avoid redundant parking during cancellations, and ensure proper wake-up behavior.
2025-07-02
The 2523.10 release supports the following operating systems:
Red Hat Enterprise Linux (RHEL): 8.10, 9.4, 9.5, and 9.6
Ubuntu 22.04 and 24.04
SUSE Linux Enterprise (SLES): 15 SP4, 15 SP5, and 15 SP6
Known issues
In this release, the OpenCL compiler enforces stricter type conversion checks. Kernel code that implicitly converts a global pointer to an integer without an explicit cast may now fail to compile with an “incompatible pointer to integer conversion” error. Most applications are unaffected, but if you encounter this error, update your kernel code to add an explicit cast or compile with the -Wno-error=int-conversion flag.
Features
General
Incorporated the latest security updates to address recent vulnerabilities, enhance protection, and ensure greater system reliability.
Intel® C for Metal Compiler
Disabled global fence on platforms other than Intel Data Center GPU Max Series.
Added 8-bit floating point conversion intrinsics.
Added a helper function to retrieve the global thread ID along its dimension.
Added support for new Battlemage and Panther Lake devices.
Updated the CM specification including the LSC memory interface, cache controls, and CM macro requirements.
Added the stochastic rounding intrinsic declaration.
Included the main <cm/cm.h> header implicitly, enabling caching when compiling from the CM source.
Added intrinsics for 2D block load and store operations.
Intel® Graphics Driver Backports for Linux* OS (i915)
Enabled the dynamic ICS via the opt-in KLV feature.
Updated the Graphics Micro Controller (GuC) to version 70.44.1.
Extended 2M userptr support to 1G.
Enabled backport support for kernel version 6.13.
Intel® Graphics Compiler for OpenCL™
Implemented GenISA predicated load/store intrinsics with promotion pass.
Added the ActiveThreadsOnlyBarrier option for OpenCL shaders.
Improved call site inlining heuristic.
Added a call merger pass that merges mutually exclusive function calls when they are too large to inline.
Added the __builtin_IB_disable_ieee_exception_trap and GenISA_disable_ieee_exception_trap intrinsics.
Introduced additional transpose block 2D SPIR-V APIs.
Added a flag to disable merging allocas of different types, providing better control over the merge alloca pass and disabling aggressive merging by default.
Added 32-bit ELF type support to ZeBin for x86 use cases.
Enhanced MergeAllocas performance by replacing all allocas, generating casts at the point of use, handling select instructions in liveness analysis, avoiding merging allocas across ContinuationHL calls in raytracing, and disabling allocas merging for raytracing.
Added support for recognizing OpenCL/SPIR-V built-ins represented as TargetExtTy to the ProcessFuncAttributes pass.
Enabled SIMD16 drop for Xe3 to minimize register spills.
Added the hasLscStoresWithNonDefaultL1CacheControls flag to zeinfo, enabling 3D clients to detect Load Store Cache (LSC) stores with non-default L1 cache policies for proper UAV coherency flushing.
Started using MergeAllocas for private memory merging, allowing reuse of non-overlapping private memory allocations to reduce overall memory usage.
Added support for SPIR-V MulExtended instructions to the Vector Compiler (VC).
Set the default General Register File (GRF) size to 128.
Added VISA support for HF8 conversion instruction and Panther Lake devices.
Enabled SetHasSample for the gather4* instructions.
Added support for stochastic round bf8 intrinsic in the Vector Compiler (VC).
Added Panther Lake support.
Added support for new Battlemage device IDs.
Introduced support for Floating-Point DIVide (FDIV) instructions inside IGCVectorizer.
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
Added support for the BUFFER_SIZE explicit argument.
Added the Level Zero API for querying kernel argument data.
Intel® Graphics Memory Management Library
Improved handling of coherent and compressible resources.
Added support for new Battlemage device IDs.
Introduced the MOCS variable for Xe2.
Enabled GO:L3 for OpenCL usages.
Intel® ME TEE Library
Integrated support for libmei version 1.6.4.
Added a C++ wrapper.
Introduced the TeeGetKind API.
Intel® Metrics Discovery Application Programming Interface
Updated API version to 1.14.
Added support for offline metric calculation:
OpenOfflineMetricsDeviceFromBuffer: Opens an offline metrics device object from a buffer.
SaveMetricsDeviceToBuffer: Saves a metrics device to a buffer for future offline calculations.
CloseOfflineMetricsDevice: Closes the offline metrics device object.
Added support in Xe KMD for configurable Observation Architecture (OA) buffer size and half-full OA buffer interrupt handling.
Added new Battlemage device IDs.
Introduced the new EQUATION_ELEM_PREV_METRIC_SYMBOL equation element where prev$$"SymbolName" allows referencing the previous metric value within the local set.
Added support for the VectorEngine metric group.
Intel® Metrics Library for Metrics Discovery API
Added support for Panther Lake.
Optimized copy query by reducing the number of GPU commands.
Added support for configurable Observation Architecture (OA) buffer size in Xe KMD.
Improved performance of OA configuration updates.
Reduced build time.
Optimized debug helper.
Intel® oneAPI Level Zero
Added support for registering a TeardownCallback to notify clients upon release of Level Zero resources.
Added support for sorting drivers based on provided devices.
Implemented basic leak checker in the validation layer.
Added zeImageViewCreateExt and zeMemFreeExt support to the leak checker.
Added API call logging to the validation layer.
Added the static Level Zero loader support.
Introduced support for 1.7 specification in the static loader.
Intel® Video Processing Library
Introduced Intel® Video Processing Library API 2.15 support, including new property-based capability queries interface, extended decoder and encoder capabilities reporting, and definitions for VVC main 10 still picture profile and level 6.3.
Added the explicit INSTALL_EXAMPLES build option to control installation of example source code and content.
Updated the default Ubuntu build to version 24.04.
Intel® Video Processing Library GPU Runtime
Improved AV1 decoding performance when all decode frame surfaces are in use.
Enabled property-based capability queries.
Intel® Video Processing Library Tools
Introduced support for Intel® Media Transcode Accelerator.
Added new strings to the vpl-inspect tool to improve output readability.
Added the -props option to the vpl-inspect tool to support querying capabilities based on properties.
Updated the default Ubuntu build to version 24.04.
Improvements
Intel® Graphics Compiler for OpenCL™
Fixed a crash in GenerateBlockMemOpsPass that could occur in complex loops during analysis of memory access patterns when certain PHI node conditions were not met.
Fixed u8/u16 2D block read emulation by enforcing 4-column block width for alignment. Additionally, disabled decomposition of emulated d8/d16 transpose reads due to complexity and added emulation of d8/d16 transpose reads using native 2D block transform followed by mov instructions.
Fixed type arguments for creating the GenISA_sampleDCMlodptr call.
Removed unnecessary tracking and 32-bit truncation of bindless image offsets in kernel arguments, improving handling of bindless images and eliminating redundant instructions.
Started allocating the General Register File (GRF) number for Vector Compiler (VC) untyped load 2D intrinsics.
Fixed an issue with incorrect emission of phi values for structures.
Implemented a cycle-proof deletion strategy in the Intel® Graphics Compiler Vectorizer to ensure reliable cleanup when discarding vectorizer chains.
Fixed a crash in ProgramScopeConstantAnalysis that occurred during recompilation and resolved crashes encountered during the compilation of Blender kernels using the SPV_INTEL_bindless_images extension.
Set cache control for SPIR-V 2D block prefetch when cache control decoration is missing or invalid, provided the target device supports cache control.
Fixed an iterator invalidation issue.
Fixed an issue with generating spill temporary variables for 4GRF operands.
Fixed a segfault caused by default output stream flags in Intel® Graphics Compiler with SYCL by replacing them with printf.
Ensured consistent instruction order for PrivateMem to produce identical dumps in debug and ndebug compilations.
Started skipping debug calls in complex UnrollLoop loops.
Replaced std::map with llvm::MapVector.
Ensured uniform prefetch source address is GRF-aligned, enabled implicit arguments optimizations by default, and introduced GlobalOffset across Xe1, Xe2, and Xe3 platforms for improved performance and payload size reduction.
Added a check for the TotalGRFNum flag value before trying to return the value passed from an API option.
Started emitting bitcast after selecting a value in SimplifyConstant.
Stopped removing implicit kernel arguments, as they might be used by subroutines.
Started using the correct spill size for non-send destinations by rounding up to the nearest GRF size. Additionally, added a VISA option to enable spill cleanup within specified BB ID ranges to target transformations safely.
Started skipping dbg calls for vector aliasing heuristic.
Fixed potential nullptr dereference by passing the function as a reference to avoid null checks and improve efficiency.
Started reporting a warning when non-null/acc Architected Register File (ARF) register is used on ternary instruction.
Fixed issues related to opaque pointers support in GenXPacketizer.
Fixed issues causing incorrect chunk sizes in the ConstantCoalescing pass.
Optimized Read-Modify-Write (RMW) for strided first definitions in the entry basic block.
Moved VRT General Register File (GRF) bump-up after GRA optimizations.
Changed RayQueryDynamicRayManagement flag to be off by default due to stability issues. It can be enabled through Application Intelligence Layer (AIL).
Unified RayInfo between sync and async raytracing.
Fixed the fill checked built-in and implemented MAD built-ins for large shapes in the joint matrix.
Added a reserved VISA opcode and updated Intel Graphics Assembler (IGA).
Fixed a VISA assert issue.
Fixed a boundary condition issue.
Resolved an issue in the Vector Compiler (VC) affecting wrregion operations with bf16 source data types.
Fixed the lgamma_r behavior.
Fixed an issue where the atomic branch predicate was incorrectly selected when multiple modes were enabled, and removed the PreservesCFG flag from the InsertBranchOpt pass to avoid potential crashes.
Fixed initialization of PHI instructions of the i1 type.
Fixed handling the genx volatile pointer as a function argument in the Vector Compiler (VC).
Corrected the HWTID computation to use state registers when WMTP is unsupported by the shader type.
Stopped inserting the check/release intrinsics if the shader has discards.
Fixed the address register restriction in the destination.
Added support for fcvt with bf8 and hf8 data types.
Fixed the printf issues.
Fixed incorrect alignment in MergeUniformLoad with early return.
Fixed creation of chunk loads in the ConstantCoalescing pass.
Improved size reporting in payload_arguments in zebin.
Added a null check to prevent nullptr dereference in GenerateBlockMemOpsPass.
Fixed copying uniform variables.
Fixed the local copy propagation issue for indirect VxH source.
Stopped cloning debug instructions in CodeSinking and CodeLoopSinking passes.
Added conditional warning for dumped vector size, which is printed only when the ShaderDumpEnable flag is enabled.
Added correct predicate to the mov instruction when handling split samples.
Added lifetime.start emission for classic resource loops inside nested loops.
Enabled stateful rt stack for synchronized raytracing and separated sync and async rt stacks in raytracing magic types.
Initialized structure members to prevent potential nullptr dereference.
Fixed a DebugInfo issue in LLVM to avoid out-of-order evaluation.
Simplified the call to readFirstLanes for multiple getFirstLaneIDs.
Added support for functions with no return values and no arguments in SIMDCF.
Added a pass to remove freeze instructions prior to code generation.
Fixed handling rdregion operations with widths crossing register boundaries in the Vector Compiler (VC).
Corrected the maximum sub slices value in SIP.
Updated the execution mask to 32 bit on Xe2.
Corrected the sample_d_c and sample_d_c_mlod sampler message type.
Migrated ProgramScopeConstantResolution and StatelessToStateful to fix opaque pointers issues.
Disabled building legacy SPIR-V Reader.
Updated CopyVariableRaw to use SIMD32 as the maximum Single Instruction Multiple Data (SIMD) size on Xe2.
Deprecated the initial set of generation 9 Vector Compiler (VC) LITs, replacing them with XeHPG equivalents.
Made code assumption in get_global_id optional.
Extended the application of the multiplication pattern in GetElementPtr Loop Strength Reduction (LSR) pass to improve performance.
Added options for controlling the depressurizer thresholds.
Removed too strict restrictions from the LICM pass.
Restored the SIMD16 drop functionality on Xe2, enabling support for spilling kernels using SIMD16 on this architecture.
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
Updated the COMPUTE_WALKER to fix incorrect RawData array length.
Added the FillImage1dBuffer built-in kernel.
Started blocking zeContextMakeImageResident.
Started failing device initialization if kernel debugging is misconfigured, with a detailed error message printed to stderr.
Started passing the Deallocate2 callback to the Graphics Memory Manager (GMM).
Corrected the Xe sysfs paths for the Compute Command Streamer (CCS) mode setting.
Made external semaphore controller thread-safe and ensured proxy events are destroyed only when the semaphore thread controller releases resources to prevent sporadic failures.
Improved ULLS light ring handling by managing new ring buffer residency, extending mutex protection for safe stop operations, and updating USM cleaner to properly stop ULLS light during resource eviction.
Ensured Zebin is dumped during program build when unpackSingleDeviceBinary is not called, provided the debug key is enabled.
Corrected gfx_core_helper definitions for EUSS.
Corrected logic for retrieving valid timestamp bits.
Removed overflow check in calculations for Xe2+ cores using EUSS.
Improved media engine handling.
Started returning an error via paraminfo if it is queried with a parameter count of 0, but the programmable actually has one or more parameters.
Started patching payload arguments in inline data in case of indirect kernels.
Ensured payload arguments are patched before fetching the walker command.
Added initial support for single temporary allocations list and ensured flush of split task count.
Replaced sfence with mfence on discrete devices and moved ULLS semaphore to shared memory on Xe2.
Disabled deferring Memory Object Control State (MOCS) on WSL for Lunar Lake and unified deferring MOCS to the Page Attribute Table (PAT).
Updated implementation to expose the THREAD_SCRATCH debug register only when running in the heapless mode.
Improved cache handling by invalidating texture and heap caches before reuse or image reads.
Improved compression handling by disabling it for pre-Xe2 platforms, enforcing capability table flags, and removing a redundant workaround for Alchemist GPUs.
Added input/output control helper for context destruction.
Changed the stype member type in Level Zero Core and Tools driver extensions to a uint32_t alias to prevent casting outside the ze_structure_type_t/zet_structure_type_t enum range.
Improved fence allocation and synchronization by passing product helper to isFenceAllocationRequired, using global fence helper, simplifying fence selection in ULLS, and removing global fence from CW post-synchronization on Battlemage.
Improved container management by reserving residency before addition, unifying non-append method calls, and preventing queue buffer consumption during command list execution.
Enabled staging infrastructure for 3D images.
Updated semaphoreBuffer and ringBuffer usage on integrated devices.
Improved the NonCopyableOrMovable and NonCopyable concepts.
Unified the local memory size getter for i915 and Xe.
Set the vmbind user fence in makeMemoryResident to resolve the memory reporting issue.
Corrected the allocation size in freeSVMAlloc to prevent crashes.
Added an option to enable and disable the heapless mode in the OpenCL offline compiler.
Parsed the Compute Command Streamer (CCS) mode setting for platforms other than Intel Data Center GPU Max Series.
Implemented per-element BLT copying for tiled 1D arrays and began treating tiled 1D images as 2D with a height of 1 for BLT operations.
Stopped enabling compression on xe_lpg for Linux and WSL.
Corrected blit properties for CL_MEM_OBJECT_IMAGE1D_BUFFER images.
Corrected the ZE_MEMORY_ACCESS_CAP_FLAG_CONCURRENT reporting.
Enabled the Unified Shared Memory (USM) compression on Linux.
Aligned allocation sizes of 2MB or larger in local memory to a 2MB boundary. This behavior is controlled by the is2MBLocalMemAlignmentEnabled function. Additionally, implemented a pool allocator for gpuTimestampDeviceBuffer. Allocations are shared per device and controlled via the EnableTimestampPoolAllocator debug flag.
Corrected allocation in MemObj::getMemObjectInfo and created graphicsAllocation per each rootDevice.
Corrected Level Zero versioning.
Corrected a Device IDs mismatch.
Started passing the ReadOnly flag only for page-misaligned input pointers.
Corrected the returned metric value counter for EU stall.
Fixed an issue causing memory allocation crash.
Fixed a shared memory failure issue.
Fixed a hang issue in the Toolbox Interface (TBX) page fault manager.
Implemented changes to prevent race conditions during resource eviction.
Configured scratch pages for the debugger.
Corrected allocation handling for the increment Command Buffer (CB) event.
Enabled support for image array types with an array size of 1 on Xe2 and later platforms.
Corrected the order of passing arguments to obtainCommandStream.
Added a check for the Shared Virtual Memory (SVM) allocated host pointer in clCreateBuffer.
Merged preliminary and non-preliminary code for the Legacy Sysman Memory module.
Removed the patchtoken fallback.
Corrected the error code for the deprecated clSetCommandQueueProperty.
Removed operation access for unsupported types.
Fixed an issue with setting up the Compute Command Streamer (CCS) mode.
Corrected literal raw strings handling in the printf formatter.
Added support for MetricCreateFromProgrammableExp2.
Corrected DSH generation and programming of inline samplers with bindless addressing in Level Zero.
Started passing the root device when creating secondary contexts to ensure proper initialization of gfxCoreHelper in Direct Rendering Manager (DRM).
Enabled 2-way coherency for misaligned user memory.
Preserved the allocation type for memory objects.
Added support for passing -device and -device_options in multiple formats in the OpenCL Offline Compiler.
Fixed the status return value in getExternalMemoryProperties when operating in the Toolbox Interface (TBX) mode.
Removed the deprecated LayoutRight Graphics Memory Manager (GMM) flag.
Introduced the ImageSurfaceState helper class and relocated global functions into the class to reduce compilation time.
Added the Sysman device directory name as a parameter to SysmanKmdInterface.
Set the external semaphore version in Level Zero.
Intel® Graphics Driver Backports for Linux* OS (i915)
Fixed an issue where the sched_setattr_nocheck API was not exported in kernel versions earlier than 5.14.
Switched to locked variant of wake_up_interruptible for safer thread wake-ups.
Reduced spurious wake-ups for single-task shmem/userptr jobs.
Started propagating wake-up from suspended threads to avoid delayed task execution.
Replaced function type casting with typed function stubs.
Added a reference around vm_bind to maintain the Virtual Memory Area’s (VMA) validity.
Started clearing the Multi Die Fabric Interconnect (MDFI) boot time errors, as they are expected during the initialization of MDFI fabric and may be confused with runtime errors.
Started using the kobject attribute instead of the device attribute for num_cslices and ccs_mode sysfs entries on RHEL 8.X.
Started handling additional PCI AER corner cases to be able to reset devices without locking up the machine.
Reordered hardware waits and GPU reset logic during PCI faults to avoid blocking on unresponsive hardware while recovering from a hardware failure.
Fixed an issue where a mutex could be held indefinitely when attempting to remove an idle Virtual Memory Area (VMA) from the VM.
Prevented memory allocations during page faults triggered by GPU reset.
Updated GTT_MMAP_VERSION to align with corresponding changes in user space.
Allowed data to be discarded on forced unbinds, avoiding swaps to inaccessible system memory.
Set the lmem_offset to 0 after use so that the next local memory block does not carry the same offset leading to lost data during Single Root I/O Virtualization (SR-IOV) migrations.
Fixed incorrect annotations.
Fixed error unwinding in i915_virtualization_probe.
Added periodic checks for forward progress by monitoring context switches and user interrupts. If the same context remains active without interrupts since the last check, a warning is generated with no further action.
Prevented default context creation when wedged.
Cleaned up faulting initialization.
Prevented DPC NPD after initialization failure by early iaf setup and driver-device decoupling on probe failure.
Started protecting per-CPU px_cache from interrupts.
Started sending a TLB invalidation request after each Virtual Memory Area (VMA) binding for GuC use, instead of deferring until before enabling GuC, to prevent Single Root I/O Virtualization (SR-IOV) failures.
Started periodic check for mmio failures.
Started handling CT fault injection during early initialization by ensuring CT descriptor objects are not dereferenced before assignment, preventing failures on early faults.
Started checking for context creation failure during execbuf.
Added support for deferred context attachment to existing clients.
Removed the residual calls to the empty i915_oa_init_reg_state to completely excise an old use-after-free.
Skipped the HuC authentication register check as it is no longer needed.
Prevented soft lockup during defragmentation on eviction.
Prevented a potential compute hang on Alchemist GPUs.
Updated CT desc->head after consuming a receive chunk to prevent buffer overflow and slow GuC messaging.
Added device PCI IDs to GPU dumps.
Updated ce->vm on parallel child contexts.
Corrected the CSC hardware errors.
Added the eudbg event for deferred default context allocation.
Removed lockdep assertions around Global Graphics Translation Table (GGTT) updates to prevent conflicts.
Preserved Translation Lookaside Buffer (TLB) seqno when splitting clear pages into multiple smaller pages if there is an outstanding TLB invalidation for those pages.
Deferred the default context allocation until first use, reducing overhead when a device opens.
Intel® Graphics Memory Management Library
Disabled compression for the GMM_FORMAT_I420 format.
Added size validation when checking NoOptimizationPadding.
Enforced the Tile4 layout over Linear for flipchain resources.
Resolved type incompatibility issues.
Intel® ME TEE Library
Reduced hardware register polling timeout in EFI.
Refactored HECI_DEVICE_KIND handling in EFI for better maintainability.
Fixed an EFI issue by changing the propertyMap array type to CHAR8*.
Fixed EFI compilation errors with GCC.
Intel® Metrics Discovery Application Programming Interface
Fixed incorrect bitfield parsing in metric equations.
Corrected scaling of std.color.node_pbe_arb metric on Lunar Lake.
Optimized memory allocation size to improve performance and reduce overhead.
Updated the OpenIoStream behavior to return CC_ERROR_NOT_SUPPORTED when processId != 0.
Removed legacy global symbols to reduce namespace clutter and improve maintainability.
Intel® oneAPI Level Zero
Fixed Sysman-only initialization to prevent retrieval of the loader context when version compatibility is not met.
Corrected version and GUID updates for version 1.22.2.
Fixed GUID generation and updated to version 1.22.3.
Resolved an issue in zesInit to correctly initialize the requested API version.
Fixed artifact upload workflow.
Fixed the extension validation logic.
Improved initialization error checking to verify validation layer behavior.
Fixed experimental extension validation to accept unknown extensions within a valid range.
Corrected sType initialization in property query operations.
Improved teardown checks to prevent invalid context usage.
Added the missing header to ze_ddi_common.h.
Fixed enabling the Device Driver Interface (DDI) handle extensions.
Fixed the incorrect sType assignment in zello_world.
Modified context_t to always allocate dynamically and support delayed destruction.
Intel® Video Processing Library
Updated the model used in the interop example to a vehicle detection model.
Fixed the BUILD_EXAMPLES build option so it no longer depends on INSTALL_DEV to take effect.
Removed outdated Docker files provided with examples.
Intel® Video Processing Library GPU Runtime
Resolved crash issues during AV1, AVC, and HEVC decoding related to surface creation on resolution changes.
Intel® XPU Manager and XPU System Management Interface
Improved the firmware update under the recovery mode for Intel® Data Center GPU Flex Series.
Introduced security enhancements.
2025-04-29
The 2507.17 release supports the following operating systems:
Red Hat Enterprise Linux (RHEL): 8.8, 8.10, 9.2, 9.4, and 9.5
Ubuntu 22.04 and 24.04
SUSE Linux Enterprise (SLES): 15 SP4, 15 SP5, and 15 SP6
Improvements
Intel® Graphics Driver Backports for Linux* OS (i915)
Updated the Graphics Micro Controller (GuC) to version 70.44.1.
Resolved a hang detection issue on Intel Data Center GPU Max Series by re-enabling GPU hang checks. Hang detection now only logs a warning message without terminating the application.
Intel GPU Firmware
Updated the Graphics Micro Controller (GuC) to version 70.44.1.
2025-03-18
The 2507.12 release supports the following operating systems:
Red Hat Enterprise Linux (RHEL): 8.8, 8.10, 9.2, 9.4, and 9.5
Ubuntu 22.04 and 24.04
SUSE Linux Enterprise (SLES): 15 SP4, 15 SP5, and 15 SP6
Features
General
Incorporated the latest security updates to address recent vulnerabilities, enhance protection, and ensure greater system reliability.
Intel® Graphics Driver Backports for Linux* OS (i915)
Added support for the HBM_REPLACE bit to signal High Bandwidth Memory (HBM) health status and its transition to the REPLACE state. This enhancement enables the driver to detect the bit and prevent loading when the state changes to REPLACE, while also reporting the issue and prompting HBM replacement.
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
Started handling page fault events in the Xe debugger.
Added support for the cl_khr_expect_assume OpenCL extension that introduces mechanisms to supply the compiler with information that can enhance the performance of certain kernels.
Implemented the Level Zero zeKernelGetBinaryExp API that allows retrieving kernel binary program data.
Added support for shared system Unified Shared Memory (USM) allocation in appendLaunchKernel.
Implemented enhancements to the Unified Shared Memory (USM) reuse mechanism, including the introduction of a USM reuse cleaner that efficiently manages system and local memory across different reuse strategies, as well as an extension of the USM reuse limit infrastructure.
Improved cache management by supporting whitelisted includes.
Added support for handling new Reliability, Availability, and Serviceability (RAS) errors in Sysman.
Implemented alignment of host Unified Shared Memory (USM) to 2MB on discrete devices when the allocated size exceeds 2MB.
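The 2MB alignment described in the last item comes down to round-up arithmetic. The following is a minimal sketch, not the runtime's implementation: the names align_up and padded_alloc_size are hypothetical, and it assumes the alignment is applied to the allocation size (the same arithmetic rounds an address up to a boundary).

```c
#include <assert.h>
#include <stdint.h>

#define TWO_MB ((uint64_t)2 * 1024 * 1024)

/* Hypothetical helper: rounds value up to the next multiple of alignment.
 * The alignment must be a nonzero power of two. */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical policy modeled on the note: requests larger than 2MB are
 * rounded up to a 2MB multiple, smaller requests are left unchanged. */
uint64_t padded_alloc_size(uint64_t requested)
{
    return requested > TWO_MB ? align_up(requested, TWO_MB) : requested;
}
```

For example, under these assumptions a 3MB request is padded to 4MB, while a 4KB request keeps its size.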
Intel® Graphics Compiler
Modified the pass threshold to optimize the i64 multiplication performance.
Introduced Panther Lake support.
Improved vectorizer to support vector emission of ftrunc instructions.
Enabled the IndVarSimplification pass to improve performance.
Enabled access to the Workload Management and Thread Programming (WMTP) SIP kernel for the Xe3 core and introduced a default WMTP SIP configuration per Shared Local Memory (SLM) for Xe2.
Added more aggressive late rescheduling phase to the CodeLoopSinking pass and an option to disable the maximum sinking heuristic in the presence of 2D block reads.
Improved the InlineHelper LLVM utility.
Implemented the MergeAllocas pass and enabled allocation merging prior to the split asynchronous pass.
Enabled the emission of vectorized floating-point addition (FADD) instructions, allowing the VISA emitter to process them efficiently.
Implemented nested 3D resource loop unrolling.
Intel® oneAPI Level Zero
Upgraded specification to version 1.12.15.
Intel® Video Processing Library
Added support for Intel® VPL API 2.14, which introduces new quality and speed settings for AI-powered video frame interpolation. This update also expands algorithm and mode selection options for AI-based super resolution and adds support for High Efficiency Video Coding (HEVC) level 8.5 decoding.
Improved compatibility with Python 3.12 development environments.
Intel® Video Processing Library Tools
Integrated screen content coding tools for AV1 into sample_encode.
Added a new GTK renderer option to sample_decode and sample_multi_transcode.
Introduced a new -fullscreen option for GTK in sample_decode and sample_multi_transcode. Users can now toggle full screen using Ctrl+f and exit with Esc.
Enhanced support for Python 3.12 development environments.
Changes
General
Updated the signing key for KMD prebuilds to enhance security and ensure continued reliability. The new key, valid for one year, will be used to sign all new releases. To ensure compatibility with these updates while maintaining the secure boot functionality, you need to download and install a new Distinguished Encoding Rules (DER) certificate.
Intel® Graphics Compiler
Lowered bfloat ceil and floor intrinsics.
Refactored parameters in vc-lits in lit-config for LLVM 16 to not link the initializeGenX function.
Increased the early recompilation threshold for default General Register File (GRF) to 500.
Enabled the EnableWaveShuffleIndexSinking registry key by default.
Enabled the WaveAllJointReduction pass by default.
Added an extra assertion check to the SIMDInfo offset.
Improvements
Intel® Graphics Driver Backports for Linux* OS (i915)
Introduced page fault handling improvements.
Fixed an issue causing CSC hardware errors.
Removed unnecessary lockdep debugging checks from Global Graphics Translation Table (GGTT) updates.
Fixed timeout issues by preserving Translation Lookaside Buffer (TLB) seqno when splitting clear pages.
Fixed issues causing compilation errors on kernel 6.6 and later.
Fixed an issue where prefetch was attempted on empty objects.
Fixed an issue where pid_task() could fail if the target process had already exited.
Implemented a workaround for Address Translation Services for Memory (ATS-M) and introduced support for G8 power state to reduce idle power consumption.
Modified the logic to avoid calling pm_qos_request a second time on an existing request during breadcrumb reset.
Disabled C-states for breadcrumb interrupts to reduce Direct Memory Access (DMA) latency.
Cleaned up incomplete shmemfs obj->base.filp on failed swapout.
Hardcoded memory health status in sysfs to prevent breakage.
Implemented flushing of freed objects before reporting available memory to stabilize the reported memory levels.
Modified the implementation to retry eviction only when it is blocked by active or locked objects, which reduces response time.
Optimized Virtual Memory Area (VMA) prefetch by short-circuiting redundant operations.
Corrected Compressed Color Surface (CCS) copies for Single Root I/O Virtualization (SR-IOV) save and restore.
Restricted shmem flags to a valid set for swapin to resolve a page fault issue.
Modified the implementation to repeat the Translation Lookaside Buffer (TLB) flush invalidation request, resolving the issue with the failing Hardware Performance Library (HPL).
Removed early unlocked unbind from object free to avoid race conditions between lockless unbinding and eviction of non-persistent VMAs.
Introduced changes to protect i915_drm_client_fini from early shutdown.
Started supporting compilation with CONFIG_PAGE_TABLE_ISOLATION to fix a compilation issue on RHEL.
Optimized the unbind step in the GT IFR flow by skipping context runtime updates when the device is quiesced. This change reduces the execution time.
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
Disabled implicit callback conversion for wait events to resolve the wait operation hang issues.
Added the missing callback event cache flush to fix zeEventHostSynchronize hangs.
Fixed an issue with reporting EU counts for multi-slice platforms.
Fixed an issue where ZE_AFFINITY_MASK was not working when ZE_FLAT_DEVICE_HIERARCHY was set to COMBINED in OpenCL.
Implemented shared allocations to preserve reference timestamps and introduced a flag in the Inter-Process Communication (IPC) pool data to verify if the mapped timestamp flag is set.
Fixed an issue where event_profiling::command_start returned an incorrect result.
Set stateless addressing mode for buffers that are neither bindful nor bindless.
Started retrieving the minimal offset size for region barrier.
Fixed the scope of the result variable in initDriver to resolve an issue where it was defined in a narrower scope, causing the initialization result to be improperly discarded.
Started returning rawDataSize as zero when the readIoStream call fails.
Resolved issues with parsing and setting the Level Zero debugger bitmask.
Fixed performance issues on Battlemage GPUs.
Ensured memory residency by setting the vmbind user fence when making memory resident.
Prevented crashes due to over-allocation by introducing a defer backing flag to Graphics Execution Manager (GEM) create input/output control, ensuring memory is resident before locking.
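The two environment variables named in the ZE_AFFINITY_MASK fix above are read when the driver initializes, so they are typically set on the process that runs the workload. A minimal sketch, assuming the documented hierarchy values FLAT, COMPOSITE, and COMBINED; the helper name level_zero_env and the example mask value are illustrative only:

```python
import os
from typing import Dict, Optional

# Hierarchy modes accepted by ZE_FLAT_DEVICE_HIERARCHY.
VALID_HIERARCHIES = {"FLAT", "COMPOSITE", "COMBINED"}


def level_zero_env(hierarchy: str, affinity_mask: str,
                   base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the device hierarchy and
    affinity mask set. The mask format depends on the hierarchy mode."""
    if hierarchy not in VALID_HIERARCHIES:
        raise ValueError("unknown hierarchy mode: " + hierarchy)
    env = dict(os.environ if base is None else base)
    env["ZE_FLAT_DEVICE_HIERARCHY"] = hierarchy
    env["ZE_AFFINITY_MASK"] = affinity_mask
    return env


# Usage sketch: expose only the first device to a child process.
#   import subprocess
#   subprocess.run(["clinfo"], env=level_zero_env("COMBINED", "0"))
```

Building the environment in one place keeps the two variables consistent, which matters because the meaning of the mask changes with the hierarchy mode.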
Intel® Graphics Compiler
Fixed regression chart dump for General Register File (GRF) configurations with more than 128 registers.
Resolved issues in the vectorizer, improving its stability and performance.
Disabled the TrivialLocalMemoryOpsElimination pass from the pipeline.
Resized G4_Declare’s row size for atomic operations to prevent out-of-bounds (OOB) issues.
Fixed issues in VISA parser for Load Store Cache (LSC) 2D block operations to allow mixed register and immediate AddrX and AddrY operands for 2D block load and store instructions.
Fixed value tracker handling of Global Element Pointers (GEP) with zero indices by treating them as bitcasts to prevent confusion in kernel usage.
Implemented dynamic optimization threshold adjustment for the depressurizer based on the number and size of General Register File (GRF) registers.
Fixed incorrect instruction placement in the rollback functionality of the CodeLoopSinking pass.
Stopped using the %sp and %fp predefined variables.
Implemented dedicated logic for handling discards in DynamicRayManagementPass to prevent crashes.
Fixed the direct address destination restriction on SIMD32.
Improved the alignment calculation in constant coalescing and started supporting additional load and store intrinsics in SynchronizationObjectCoalescing.
Fixed an incorrect condition check in isRegionInvariant of WIAnalysis.
Stopped removing Built-in Function (BiF) module prebuilt stamp files to avoid redundant recompilations when CMake files are updated.
Fixed an issue with constant folding prevention inside loops.
Added intrinsic cache to KernelDebugInfo and prevented indirect access to Software Scoreboard (SWSB).
Reduced the number of atomics hitting the same cache line by performing atomic predication.
Enabled the Execution Out-of-Order Thread (EOT) to participate in the Software Scoreboard (SWSB) token assignment.
Fixed a channel mask issue in the src0 length for RenderTargetDataPayload, where the Alpha channel was incorrectly controlled.
Intel® oneAPI Level Zero
Fixed potential memory leaks.
Fixed issues in the generation of pkg-config files.
Corrected code generation for libddi table queries.
Corrected validation layer’s parameter checker for extensions.
Intel® Video Processing Library
Fixed the bootstrap process to support Debian distributions that do not define the ID_LIKE property.
Intel® Video Processing Library Tools
Fixed the bootstrap process to support Debian distributions that do not define the ID_LIKE property.
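The ID_LIKE fixes above concern distribution detection from /etc/os-release, where Debian itself typically defines ID=debian without an ID_LIKE entry while derivatives such as Ubuntu list debian in ID_LIKE. The bootstrap scripts themselves are not shown here, so the following is only a sketch of one way to handle the missing property; the function names are hypothetical:

```python
from typing import Dict, List


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines from os-release content, dropping quotes."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        fields[key.strip()] = raw.strip().strip("\"'")
    return fields


def distro_family(fields: Dict[str, str]) -> List[str]:
    """Return ID followed by any ID_LIKE entries; ID_LIKE may be absent."""
    ident = fields.get("ID", "")
    like = fields.get("ID_LIKE", "").split()
    return ([ident] if ident else []) + like


def is_debian_based(fields: Dict[str, str]) -> bool:
    """True when either ID or ID_LIKE names debian."""
    return "debian" in distro_family(fields)


# Usage sketch:
#   with open("/etc/os-release") as f:
#       print(is_debian_based(parse_os_release(f.read())))
```

Treating ID_LIKE as optional and checking ID first means the same test covers Debian and its derivatives.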