Release notes for commit 78d80a1cc628af76f09c53673ada906a3d2f0131
- New attributes for Intel FPGA devices :
num_simd_work_items,bank_bits,max_work_group_size,max_global_work_dim[61d60b6] [7becb9d] [9053642] [6441851] - The infrastructure for device standard C/C++ libraries has been introduced.
assertcan now be used in device code [0039ee0] - Add opencl-aot tool for AoT compilation for Intel CPU devices [400460b]
- Added support for
cl::sycl::intel::experimental::printfbuiltin [78d80a1] - Implemented device code split feature: compiler can now be instructed to
split a single device code module into multiple via the
-fsycl-device-code-splitoption. [9d7dba6] [cc93bc4] [5491486] [a339d4c] [9095749]
- Allowed applying
cl::reqd_work_group_sizeandcl::intel_reqd_sub_group_sizeattributes to a lambda function [b06fc66] - Allowed applying SYCL_EXTERNAL to functions with raw pointer arguments or
return value if
-Wno-error=sycl-strictoption is passed [2840458] - Added support for FPGA device libraries in fat static libraries [d39ab73]
- Removed limitation of virtual types in SYCL [98754ff]
- Added support for platform name and platform version in the device allowed list [7f2f668]
- Made kernels and programs caches thread-safe [118dc82]
- It's now possible to omit
clnamespace specifier when using SYCL API.sycl::buffercan be used insteadcl::sycl::buffer[67e4655] - Diagnostic on attempt to do several actions in one command group has been implemented [9f8ae50]
queue::submitnow throws synchronous exceptions [6a83d14]- Enqueue pending tasks after host accessor destructor [80d17b2]
- Implemented more "flat" kernel submission
cl::sycl::queuemethods [c5318c5]
- Added support for generation of SYCL documentation with Doxygen [de418d6]
- Design document which describes design of C/C++ standard library support has been added
- Fixed problem which prevented attaching multiple attributes to a SYCL kernel [b77e5b7]
- Fixed a possible deadlock with host accessor' creation from multiple threads [d1c6dbe]
- Compatibility issues that can happen when the SYCL library and SYCL application are compiled with different version of standard headers are fixed. [d854643]
- Fixed compile error for
handler::copywith-fsycl-unnamed-lambda[e73d2ce] - Fixed error which happened when autogenerated (
-fsycl-unnamed-lambda) kernel name containedcl::sycl::halftype [514fc0b] - Fixed crash on submitting a kernel with zero local size on the host device [b6806ea]
- Made
vec::loadandvec::storemember functions templated to align with the SYCL specification [4bd76de] - Made exceptions that are thrown in the copy back routine during SYCL memory object destructor asynchronous [2d6dcd0]
- Fixed creation of host accessor to sub-buffer leading to crash in some scenarios [f607520]
- Fixed build error on Windows caused by RuntimeLibrary value inconsistency for LLVM and apps/tests linked with it [f9296b6]
- [new] The size of object file is increased compared to the older compiler releases due to the recent fat object layout changes in the open source LLVM.
- Using
cl::sycl::programAPI to refer to a kernel defined in another translation unit leads to undefined behaviour - Linkage errors with the following message:
error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already definedcan happen when a SYCL application is built using MS Visual Studio 2019 version below 16.3.0 For MSVC version having the error the workaround is to use -std=c++17 switch.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2020.9.1.0.18_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 19.48.14977 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2020.9.1.0.18_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 26.20.100.7463 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
Please, see the runtime installation guide here
Release notes for commit e0a62df4e20eaf4bdff5c7dd46cbde566fbaee90
- Added more diagnostics on incorrect usage of SYCL options [f8a8785]
- Added diagnostic on capturing values by reference in a kernel lambda [0e7639a]
- Removed
--sycloption which was an alias to-fsycl-device-only - Added support for using integral template parameter as an argument to
intelfpga::ivdep,intelfpga::iiandintelfpga::max_concurrencyattributes [6a283ae] - Improved diagnostic for using incorrect triple for AOT [6af65e1]
cl::sycl::pipeclass was moved tocl::sycl::intelnamespace [23af059]- Added new query
info::device::kernel_kernel_pipe_supportwhich indicates whether device supports SYCL_INTEL_data_flow_pipes extension [8acf36e] - Added support for copying accessor with arbitrary dimensions in
handler::copy[ff30897] - Cache result of compilation and kernel creation of programs created via
cl::sycl::program::build_with_kernel_typewithout build options [8c2e09a] [4ba499c] [d442364] - Added diagnostic (static_assert) for dimensions mismatch when passing
cl::sycl::vecobjects to SYCL builtins [eb3ac32] - Added diagnostic (exception) on attempt to create invalid sub-buffer [0bd58df]
- Added
single_taskandparallel_formethods tocl::sycl::ordered_queue::single_task. These provide an alternative way to launch kernels [1d2f7ce] - Added forms of USM functions for allocations that take a queue reference [f57c2b6]
- Added C++ deduction guides to
cl::sycl::rangecl::sycl::idcl::sycl::nd_rangecl::sycl::veccl::sycl::multi_ptrcl::sycl::buffer[644013e] [3546a78] [728b904] [dba1c20]
- Added a new buffer constructor which takes a contiguous container as an argument [dba1c20]
- Added support of 1 byte type to load and store method of
cl::sycl::intel::sub_groupclass [c8cffcc] - Improved error reporting for kernel enqueue process [d27cff2]
- Instance of memory objects are now preserved on host after host accessor destruction [16ae15a]
- Added support for
SYCL_DEVICE_WHITE_LISTcontrol which can be used to specify devices that are visible to the SYCL implementation [4ad5263]
- Updated build instructions for Windows and added instructions for setting up FPGA emulator device to get started guide [6d0b326] [b2bb35b] [a7336c2]
- Updated SYCLCompilerAndRuntimeDesign to use proper names of AOT related options [b3ee6a2]
- Added unnamed lambda extension draft [47c4c71]
- Added kernel restrict all extension draft [47c4c71]
- Added initial draft of data flow pipes extension proposal [ee2b482]
- USM doc was updated with new version of allocation functions [0c32410]
- Added draft version of group collectives extension [e0a62df]
- Added deduction guides extension [4591d74]
- Improved environment variables controls documentation [6aae4ef]
- Fixed a problem with option duplication during offload [1edd217]
- Modified the driver to unbundle files with no extension like objects [84992a5]
- Fixed AOT compilation for FPGA target on Windows [6b789f9] [525da51]
- Fixed FPGA AOT compilation when using FPGA AOCX device archives [ec738f2] [16c530a]
- Fixed problem causing abort when
-fsycland-lstdc++passed simultaneously [e437fd4] - Fixed
-ooption for FPGA AOT on Windows [ddd24a3] - Default
-std=c++setting for device compilations was removed to avoid__cplusplusdifferences between host and device compilation. [29cabdf] - Suppressed warning about cdecl calling convention [0d2dab4]
- Fixed problem when two different lambdas with the same signature get the same mangled name [d6aa11b]
- Removed the error being issued for pseudo-destructor expressions [3c9006e]
- Removed the device linking step for cases with no valid objects being provided from the device compilation [25bbe42]
- Implemented a workaround for the issue with conversion of one element swizzle to a scalar [f752698]
- Fixed a UX problem with a sycl::stream mixing output of several work-items in some cases [377b3fa]
- Fixed problem with linker options getting passed to the bundler instead of the linker when -foffload-static-lib is supplied [e437fd4]
- Fixed problem with using wrong offset inside
cl::sycl::handler::copy[5c8e81f] - Fixed
get_countandget_sizemethods ofcl::sycl::accessorclass that used to return incorrect values for non-local accessor created with range [9dd68c5] - Fixed issue with asynchronous exceptions not being associated with the returned event [6c512d3]
- Fixed issue with host accessor not always providing exclusive access to a memory object [16ae15a]
- Fixed issue with ignoring
cl::sycl::property::buffer::use_host_ptrproperty [16ae15a]
- [new]
cl::sycl::programconstructor creates unnecessary context which is bound to the device which is chosen bycl::sycl::default_selector. This can lead to unexpected behavior, for example, process may hang if default selector chooses Intel FPGA emulator device but target device is Intel OpenCL CPU one - [new] Using cl::sycl::program API to refer to a kernel defined in another translation unit leads to undefined behaviour
- [new] -fsycl-unnamed-lambda doesn't work with kernel names that contain cl::sycl::half type
- The addition of the static keyword on an array in the presence of Intel FPGA memory attributes results in the empty kernel after translation
- A loop's attribute in device code may be lost during compilation
- Linkage errors with the following message:
error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already definedcan happen when a SYCL application is built using MS Visual Studio 2019 version below 16.3.0 For MSVC version having the error the workaround is to use -std=c++17 switch.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2019.9.12.0.12_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 19.48.14977 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2019.9.12.0.12_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 26.20.100.7463 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
Please, see the runtime installation guide here
Release notes for commit 918b285d8dede6ab0561fccc622f71cb858849a6
cl::sycl::queue::mem_advisemethod was implemented [4828db5]cl::sycl::handler::memcpyandcl::sycl::handler::memsetmethods that operate on USM pointer were implemented [d9e8467]- Implemented
ordered queueextension - Implemented support for half type in sub-group collectives: broadcast, reduce, inclusive_scan and exclusive_scan [0c78bc8]
- Added
cl::sycl::intel::ctzbuilt-in. [6a96b3c] - Added support for SYCL_EXTERNAL macro.
- Added support for passing device function pointers to a kernel [dc9db24]
- Added support for USM on host device [5b0952c]
- Enabled C++11 attribute spelling for clang
loop_unrollattribute [2f1e243] - Added full support of images on host device
- Added support for profiling info on host device [6c03c4f]
cl::sycl::handler::prefetchis implemented [feeacc1]- SYCL sub-buffers is mapped to OpenCL sub-buffers
- Added Intel FPGA Command line interface support for Windows [55ebcae]
- Added support for one-step compilation from source with
-fsycl-link[55ebcae] - Enabled additional aoc options for dependency files input and output report [55ebcae]
- Suppressed warning
"_declspec attribute 'dllexport' is not supported"when run with-fsycl. Emit error when import function is called in the sycl kernel. [b10bdbb] - Changed
-fsycl-device-onlyto override-fsycloption [d429243] - Added user-friendly diagnostic for unsupported math built-in functions usage in kernel [0476352]
- The linking stage is now skipped if -fsycl-device-only option is passed [93178d1]
- When unbundling static libraries on Windows, do not extract the host section as it is not being used. This fixes possible disk usage issues when working with fat static libraries [93ab97e]
- Passing
-fsycl-helpwith-###option now prints the actual call to tool being made. [8b8bfa9] - Allow for
-gNto override default setting with-fintelfpga[3b20615] - Update sub-group reduce/scan syntax [cd8194d]
- Prevent libraries from being considered for unbundling on Windows [3438a48]
- Improved Windows behaviors for calling
lib.exewhen creating an archive for Intel FPGA AOT [e7afcb1]
- Removed suppression of exceptions thrown by async_handler from
cl::sycl::queuedestructor [61574d8] - Added the support for output operator for half data types [6a2cd90]
- Improved efficiency of stream output of
cl::sycl::h_itemfor Intel FPGA device [80e97a0] - Added support for
std::numeric_limits<cl::sycl::half>[6edca52] - Marked barrier flags as constexpr to avoid its extra runtime translation [5635959]
- Added support for unary plus and minus for
cl::sycl::vecclass - Reversed mapping of SYCL range/ID dimensions to OpenCL, to provide expected performance through unit stride dimension. The highest dimension in SYCL (e.g. r2 in cl::sycl::range<3> R(r0,r1,r2)) now maps to the lowest dimension in OpenCL (e.g. an enqueue of size_t[3] cl_R = {r2,r1,r0}). The same applies to range and ID queries, in kernels defined through OpenCL interop. [40aa3f9]
- Added support for constructing
cl::sycl::imagewithout host ptr but with pitch provided [d1931fd] - Added
sycldlibrary on Windows which is compiled using/MDdoption. This library should be used when SYCL application is compiled with/MDdoption to avoid ABI issues [71a75c0] - Added driver and runtime support for AOT-compiled images for multiple devices. This handles the case when the device code is AOT-compiled for multiple targets [0d4eb49] [bcf38cf]
- Get started guide was reworked [9050a98] [94ee028]
- Added SYCL compiler command line guide [af63c6e]
- New document describing the SYCL Runtime Plugin Interface [bffdbcd]
- Updated interfaces in Sub-group extension specification [cc6e4ae]
- Updated interfaces in USM proposal [a6d7e12] [d9e8467]
- Fixed problem with using aliases as kernel names [a784071]
- Fixed address space in generation of annotate attribute for static vars and global Intel FPGA annotation [800c8c0]
- Suppressed emitting errors for TLS declarations [ddc1a7f]
- Suppressed device code link warnings that happen during linking
fatandnon-fatobject files [b38a8e0] - Fixed pointer width on 64-bit version of Windows [63e2b19]
- Fixed integration header generation when kernel name type is defined in cl, sycl or detail namespaces [5d22a8e]
- Fixed problem with incorrect generation of output filename caused by processing of libraries in SYCL device toolchain [d3d9d2c]
- Fixed problem with generation of depfile information for Intel FPGA AOT compilation [fbe951f]
- Fixed generation of help message in case of
-fsycl-help=getoption passed [8b8bfa9] - Improved use of
/Foon Windows in offload situations so intermediate temporary files are not renamed [6984794] - Resolved problem with unnamed lambdas having the same name [f4d182f]
- Fixed -fsycl-add-targets option to support multiple triple:binary arguments and to emit diagnostics for invalid target triples [21fa901]
- Fixed AOT compilation for GEN devices [cd2dd9b]
- Fixed problem with using 32 bits integer type as underlying type of
cl::sycl::vecclass when 64 bits integer types must be used on Windows [b4998f2] cl::sycl::aligned_alloc*now returns nullptr in case of error [9266cd5]- Fixed bug in conversion from float to half in the host version of
cl::sycl::halftype [6a2cd90] - Corrected automatic/rte mode conversion of
cl::sycl::vec::convertmethod [6a2cd90] - Fixed memory leak related to incorrectly destroying command group objects [d7b5c0d]
- Fixed layout and alignment of objects of 3 elements
cl::sycl::vectype, now they occupy memory for 4 elements underneath [32f0cd5] [8f7f4a0] - Fixed problem with reporting the same asynchronous exceptions multiple times [9040739]
- Fixed a bug with a wrong success code being returned for non-blocking pipes, that was resulting in incorrect array data passing through a pipe. [3339c45]
- Fixed problem with calling atomic_load for float types in
cl::sycl::atomic::load. Now it bitcasts float value to integer one then call atomic_load. [f4b7b17] - Fixed crash in case incorrect local size is passed. Now an exception is thrown in such cases. [1865c79]
cl::sycl::vectypes aliases are now aligned with the SYCL specification.- Fixed
cl::sycl::rotatemethod to correctly handle over-sized shift widths [d2e6a26] - Changed underlying address space of
cl::sycl::constant_ptrfrom constant to global to avoid casts between constant and generic address spaces [38c2960] - Aligned
cl::sycl::rangeclass with the SYCL specification by removing its default constructor [d3b6a49] - Fixed several thread safety problems in
cl::sycl::queueclass [349a0d3] - Fixed compare_exchange_strong to properly update expected inout parameter [627a137]
- Fixed issue with host version of
cl::sycl::sub_satfunction [7865dfc] - Fixed initialization of
cl::sycl::h_itemobject whencl::sycl::handler::parallel_formethod with flexible range is used [ab3e71e] - Fixed host version of
cl::sycl::mul_hibuilt-in to correctly handle negative arguments [8a3b7d9] - Fix host memory deallocation size of SYCL memory objects [866d634]
- Fixed bug preventing from passing structure containing accessor to a kernel on some devices [1d72965]
- Fixed bug preventing using types from "inline" namespace as kernel names [28d5931]
- Fixed bug when placeholder accessor behaved like a host accessor fetching memory to be available on the host and blocking further operations on the accessed memory object [d8505ad]
- Rectified precision issue with the float to half conversion [2de1379]
- Fixed
cl::sycl::buffer::reinterpretmethod which was working incorrectly with sub-buffers [7b2f630] [916c32d] [60b6e3f] - Fixed problem with allocating USM memory on the host [01869a0]
- Fixed compilation issues of built-in functions. [6bcf548]
- [new] The addition of the static keyword on an array in the presence of Intel FPGA memory attributes results in the empty kernel after translation.
- [new] A loop's attribute in device code may be lost during compilation.
- [new] Linkage errors with the following message:
error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already definedcan happen when a SYCL application is built using MS Visual Studio 2019 version below 16.3.0.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2019.9.11.0.1106_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 19.43.14583 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2019.9.11.0.1106_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 100.7372 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
Please, see the runtime installation guide here
Release notes for commit d4efd2ae3a708fc995e61b7da9c7419dac900372
- Added support for
reqd_work_group_sizeattribute. [68578d7] - SYCL task graph can now be printed to file in json format. See SYCL ENV VARIABLES for information how to enable it. [c615566]
- Added support for
cl::sycl::intel::fpga_regandcl::sycl::intel::fpga_selectorextensions. [e438d2b]
clCreateCommandQueueorclCreateCommandQueueWithPropertiesis used depending on version of OpenCL implementation. [3511e3d]- Added support querying kernel for
compile_sub_group_size. [bb7cb34] This resolves intel/llvm#367 - If device image is in SPIRV format and SPIRV is not supported by OpenCL implementation exception is thrown. [09e328f]
- Added support for USM pointer to
cl::sycl::handler::set_argmethod. [df410a5] - Added support for
-fsycl-help=argcompiler option which can be used to emit help message from corresponding offline compiler. [0e44dd2] - Added
-reuse-execompiler option which can be used to avoid recompilaton of device code (SPIRV) if it has not been changed from the previous compilation and all options are the same. This work for Intel FPGA AOT compilation. [2934fd8] - SYCL math builtins now work with regular pointers. [24fa42b]
- Made SYCL specific options available when using clang-cl. [f5a7522]
- USM now works on pre-context basis, so now it's possible to have two
cl::sycl::contextobjects with USM feature enabled. [e339962] "-fms-compatibility"and"-fdelayed-template-parsing"are not passed implicitly when-fsyclis used on MSVC. [9c3d98d]- Implemented
cl::sycl::vec::convertmethod, host device now supports all rounding modes while other devices support automatic rounding mode. [fe3bbf9] cl::sycl::imagenow uses internal aligned_allocator instead of standard one. [d7380f5]cl::sycl::programclass now throws exception containing build log in case of build failure.
- Fixed issue that prevented from using lambda functions in kernels. [1b83ae9]
- Fixed crash happening when
cl::sycl::eventmethods's are called for manually constructed (not obtained fromcl::sycl::queue::submit)cl::sycl::event. [90333c3] - Fixed problem with functions called from inializer list are being considered as device code. [071b581]
- Asynchronous error handler is now called in
cl::sycl::queuedestructor as the SYCL specification requires. [0182f72] - Fixed race condition which could happen if multiple host accessors to the same buffer are created simultaneously. [0c61c8c]
- Fixed problem preventing from using template type with template parameters as a kernel name. [45194f7]
- Fixed bug with offset passed to
cl::sycl::handler::parallel_formethods was ignored in case of host device or when item is an argument of a lambda. [0caeeae] - Fixed crash which happened when
-fsyclwas used with no source file provided. [c291fd3] - Resolved problem with using bool type as a kernel name. [07b4f09]
- Fixed crash which could happen if sycl objects are used during global objects destruction. [fff31fa]
- Fixed incorrect behavior of host version of
cl::sycl::abs_difffunction in case ifx - y < 0andxandyare unsigned types. [35fc029] - Aligned the type of exception being thrown if no device of requested type is
available with the SYCL specification. Now it throws
cl::sycl::runtime_error. [9d5faab] cl::sycl::accessorcan now be created fromcl::sycl::bufferwhich was constructed with non-default allocator. [8535b24]- Programs that failed to build (JIT compile) are not cached anymore. [d4efd2a]
- Partial initialization of a constant static array now works correctly. [4e52d44]
- Fixed casting from derived class to base in case of multiple inheritance. [76e223c]
- Fixed compilation of relational operations with pointer. [4541a7f]
- Fixed representation of long long int in
cl::sycl::vecclass on Windows. [336cb82] cl::sycl::streamclass has gained support for printing nan and inf values [b63a96f] This resolves intel/llvm#500- Fixed bug that prevented from using aliases as non-type template parameters (such as int) and full template specialization in kernel names. [a784071]
- Crash when using stl allocator with
cl::sycl::bufferandcl::sycl::imageclasses is fixed. [d7380f5]
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2019.8.8.0.0822_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 19.34.13890 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2019.9.9.0.0901 is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 100.7158 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
Please, see the runtime installation guide here
Release notes for commit c557eb740d55e828fcf74b28d2b686c928e45318.
- Support for
image accessorhas been landed. - Added support for unnamed lambda kernels, so
parallel_forworks without specifying "Kernel Name" type. This can be enabled by passing-fsycl-unnamed-lambdaoption. - Kernel to kernel blocking and non-blocking pipe feature is implemented.
- Added support for Unified Shared Memory (USM).
- Now OpenCL 1.2 clCreateSampler sampler is used for all version of OpenCL implementation.
- Added Intel FPGA specific command line interfaces for ahead of time compilation.
- Intel FPGA memory attributes are now supported for static variables.
- Hierarchical parallelism is improved to pass conformance tests.
private_memoryclass has been implemented.- sycl.lib is automatically linked with -fsycl switch on Windows.
- Added support for Windows getOSModuleHandle.
- Functions with variadic arguments doesn't trigger compiler diagnostic about calling convention.
- Added experimental support for building and linking SYCL runtime with libc++ library instead of libstdc++.
- Adjusted array and vec classes so they are more efficient for Intel FPGA devices.
- Exception will be thrown on attempt to create image accessor for the device which doesn't support images.
- Use
llvm-objcopyfor merging device and host objects instead of doing partial linking. - Check if online compiler is available before building the program.
- clCreateProgramWithILKHR is used if OpenCL implementation supports cl_khr_il_program extension.
- Reuse the pointer provided by the user in the
bufferconstructor (even ifuse_host_ptrproperty isn't specified) if its alignment is sufficient. -Ldirnow can be used to find libraries with-foffload-static-libmax_concurrencyIntel FPGA loop attribute now accepts zero.- Ignore incorrectly used Intel FPGA loop attribute emitting user friendly warning instead of compiler error.
- Added
depends_onmethods ofhandlerclass which can be used to provide additional dependency for a command group. - SYCL implementation can now be built on Windows using Visual Studio 2017 or higher.
- Fixed assertion failure when use
-save-tempsoption along with-fsycl. - Cached JITed programs are now released during destruction of
contextthey are bound to, earlier release happened during a process shutdown. - Fixed
get_linear_idmethod ofgroupclass, now it calculate according to row-major ordering. - Removed printing of error messages to stderr before throwing an exception.
- Explicit copy API of the handler class is asynchronous again.
fillmethod ofhandlerclass now takes element size into account.- Fixed bug with ignored several Intel FPGA loop attributes in case of argument is 1.
- Fixed bug which prevented Intel FPGA loop attributes work with infinite loops.
- Fixed problem which caused invalid/redundant memory copy operations to be generated.
- The commands created by Scheduler now cleaned up on destruction of
corresponding
SYCLmemory objects (buffer,image). - 1 dimensional sub buffer is passed as cl_mem obtained by calling clCreateSubBuffer when kernel is built from OpenCL C source.
- Now copy of entire memory is performed when data is needed in new context even if user requests accesses to only part of the memory.
- Fixed problem with one element
vecobjects relation/logical operation which was working like scalar. - Type casting and conditional operator (ternary 'if') with pointers now working correctly.
- The problem with calling inlined kernel from multiple TUs is fixed.
- Fixed compiler warnings for Intel FPGA attributes on host compilation.
- Fixed bug with passing values of
vec<#, half>type to the kernel. - Fixed buffer constructor which takes host data as shared_ptr. Now it increments shared_ptr reference counter and reuses provided memory if possible.
- Fixed a bug with nd_item.barrier not respecting fence_space flag
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support version 2019.8.7.0.0725_rel is recommended OpenCL CPU RT prerequisite for the SYCL compiler
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 19.29.13530 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
Release notes for commit 64c0262c0f0b9e1b7b2e2dcef57542a3fe3bdb97.
cl::sycl::streamclass support has been added.- New attributes for Intel FPGA device are added:
merge,max_replicatesandsimple_dual_port. - Initial support for new Plugin Interface (PI) layer is added to SYCL runtime library. This feature simplifies porting SYCL implementation to non-OpenCL APIs.
- New address space handling rules are implemented in the SYCL device
compiler. Raw pointers are allocated in generic address space by default and
address space inference is supposed to be done by LLVM pass. Old compiler
behavior can be recovered by enabling
DISABLE_INFER_ASenvironment variable. - Add basic implementation of hierarchical parallelism API.
- Add new clang built-in function
__unique_stable_name. SYCL compiler may use this built-in function to auto-generate SYCL kernel name for lambdas.
- SYCL integration header is excluded from the dependency list.
- Raw pointers capturing added to the SYCL device front-end compiler. This capability is required for Unified Shared Memory feature implementation.
- SYCL device compiler enabled support for OpenCL types like event, sampler,
images to simplify compilation of the SYCL code to SPIR-V format.
CXXReflowerpass used to make "SPIR-V friendly LLVM IR" has been removed. - Intel FPGA loop attributes were renamed to avoid potential name conflicts.
- Old scheduler has been removed.
samplertype support is added to theset_argmethods.- Internal SYCL device compiler design documentation was improved and updated. Development process documentation has been updated with more details.
- Initial support for
imageclass (w/o accessor support). - Static variables are allocated in global address space now.
- Made sub-group methods constant to enable more use cases.
- Added
-fsycl-linkoption to generate fat object "linkable" as regular host object. - Enable
set_final_datamethod withshared_ptrparameter. - Enable using of the copy method with
shared_ptrwithconst T.
- Fixed argument size calculation for zero-dimensional accessor.
- Removed incorrect source correlation from kernel instructions leading to incorrect profiling and debug information.
- A number of issues were fixed breaking build of the compiler on Windows
- Fixed unaligned access in load and store methods of the vector class.
global_mem_cache_typevalues were aligned with the latest revision of the SYCL specification.- Stubs for C++ standard headers were removed. This should fix compilation of and with SYCL device compiler.
- Unscoped enums were removed from global namespace to avoid conflicts with user defined symbols.
- Explicit copy API of the handler class is blocking i.e. data is copied once the command group has completed execution.
- Renamed
cl::sycl::group::get_lineartocl::sycl::group::get_linear_id. - SYCL kernel constructor from OpenCL handle now retains OpenCL object during SYCL object lifetime.
- Fixed forward declaration compilation inside a SYCL kernel.
- Fixed code generation for 3-element boolean vectors.
- Experimental Intel(R) CPU Runtime for OpenCL(TM) Applications with SYCL support is available now and recommended OpenCL CPU RT prerequisite for the SYCL compiler.
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 19.25.13237 is recommended OpenCL GPU RT prerequisite for the SYCL compiler.
- New address space handling approach might degrade compilation time (especially for GPU device).
- Some tests might fail on CPU device with the first experimental CPU
runtime due to new
address space handling by the SYCL compiler. The workaround for this kind of
issues while we wait for CPU runtime update is to set
DISABLE_INFER_ASenvironment variable during compilation. See intel/llvm#277 for more details.
The release notes contain information about changes that were done after previous release notes and up to commit d404d1c6767524c21b9c5d05f11b89510abc0ab9.
- New FPGA loop attributes supported:
intelfpga::ivdep(Safelen)intelfpga::ii(Interval)intelfpga::max_concurrency(NThreads)
- The new scheduler is implemented with the following improvements:
- optimize host memory allocation for
bufferobjects. - optimize data transfer between host and device by using Map/Unmap instead of Read/Write for 1D accessors.
- simultaneous read from a buffer is allowed: execution of two kernels reading from the same buffer are not serialized anymore.
- optimize host memory allocation for
- Memory attribute
intelfpga::max_concurrencywas renamed tointelfpga::max_private_copiesto avoid name conflict with fresh added loop attribute - Added support for const values and local accessors in
handler::set_argmethod.
- The new scheduler is implemented with the following bug fixes:
- host accessor now blocks subsequent operations(except RAR) with the buffer it provides accesses to until it is destroyed.
- OpenCL buffers now released on the buffer destruction.
accessor::operator[]like methods now take into account offset.- Non-SYCL compilation(without
-fsycl) was fixed, such application should work on host device, but fail on OpenCL devices. - Several warnings were cleaned up.
- buffer constructor was fixed to support const type as template parameter.
event::get_profiling_infonow waits for event to be completed.- Removed non-const overload of
item::operator[]as it's not present in SYCL specification. - Compiling multiple objects when using
-fsycl-link-targetsnow creates proper final .spv binary. - Fixed bug with crash in sampler destructor when sampler object is created using enumerations.
- Fixed
handler::set_arg, so now it works correctly with kernels created using program constructor which takescl_programorprogram::build_with_source. - Now
lgamma_rbuiltin works correctly when application is built without specifying-fsycloption to the compiler.
- Experimental Intel® CPU Runtime for OpenCL™ Applications with SYCL support is available now and recommended OpenCL CPU RT prerequisite for the SYCL compiler.
- The Intel(R) Graphics Compute Runtime for OpenCL(TM) version 19.21.13045 is recomended OpenCL GPU RT prerequisite for the SYCL compiler.
- Performance regressions can happen due to additional math for calculation of
offset that were added to the
accessor::operator[]like methods. - Applications can hang at exit when running on OpenCL CPU device because some
OpenCL handles allocated inside SYCL(e.g.
cl_command_queue) are not released.
- Added support for half type.
- Implemented sampler class.
- Implemented several methods of buffer class:
- buffer::has_property and buffer::get_property
- buffer::get_access with range and offset classes
- buffer::get_allocator as well as overall support for custom allocators.
- Implemented broadcasting vec::operator=.
- Added support for creating a sub-buffer from a SYCL buffer.
- Added diagnostic about capturing class static variable in kernel code.
- Added support for discard_write access::mode.
- Now SYCL buffer allocates 64 bytes aligned memory.
- Added support for case when object of accessor class is wrapped by some class.
- Added support for const void specialization of multi_ptr class.
- Implemented the following groups of SYCL built-in functions:
- integers
- geometric
- Support for variadic templates in SYCL kernel names.
- Disabled several methods of buffer class that were available for incompatible number of dimensions.
- Added initialization of range field in buffer interoperability constructor.
- Fixed buffer constructor with iterators, now the data is written back to the input iterator if it's not const iterator.
- Fixed the problem which didn't allow using buffer::set_final_data and creating of host accessor from buffer created using interoperability constructor.
- Now program::get_*_options returns options correctly if the program is created with interoperability constructor.
- Fixed linking multiple programs compiled using compile_with_kernel_type.
- Fixed initialization of device list in program class interoperability constructor.
- Aligned vec class with the SYCL Specification, changed argument to multi_ptr with const type.
- Fixed queue profiling device information query to work with OpenCL1.2.