Note that this attribute is supported only by Clang, so it is applied only to the kernel binary, not to the runtime.
+ Microarchitecture optimizations
+ 64-bit support
+ Xilinx FPGA support
+ LLVM-16 support
+ Refactoring and quality control fixes