Disable FMA in CUDA
The CUDA compiler applies an optimization known as Fused-Multiply-Add (FMA), which combines a multiplication and an addition into a single instruction that rounds the result only once.
For more information about Fused-Multiply-Add (FMA), see https://docs.nvidia.com/cuda/floating-point/index.html#fused-multiply-add-fma.
This improves performance. However, because the fused result is rounded once instead of twice, it can cause simulation results to change by a small amount. It is recommended to disable this optimization when developing new code and comparing results between versions.
You can disable this optimization by passing the following command-line option:
--nofma=1
You can also disable this optimization in the Advanced Settings dialog box in the Simulator options.
Note: This option can be enabled as the default setting.