Blender Git Loki
Revision 3e9265a by Sergey Sharybin (cycles-x) July 15, 2021, 14:59 (GMT)
Fix Cycles X adaptive sampling convergence check

The optimization of atomics and reduction was wrong: the warp voting functions operate on the threads of a single warp, so the result of the vote has to be accumulated once per warp. The thread index, however, is measured within a block, not within a warp: a block can contain a large (GPU-dependent) number of threads, while a warp has only 32 threads.

Now the code does the vote and atomically adds to the result.

This fixes a possible too-early stop of sampling on the GPU. Because the old code could finish too soon, the absolute render time may go up.

This is one of those things that is hard to see on a real file, but the same approach was giving wrong results during development of the shadow catcher occupancy improvement. The best visualization of the problem so far was to force `converged` to always be false and print the number of pixels and the number of active pixels after running the kernel. Before this change the number of active pixels was much smaller than the number of pixels; now the two values match.
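The diff itself is not reproduced on this page, so the following is only a rough CPU-side sketch of the counting problem the message describes, not the kernel code. It emulates a warp vote in plain C++ and compares two accumulation rules. The names `warp_ballot`, `popcount32` and `count_active` are hypothetical, and the assumption that the old code accumulated only for block thread index 0 is inferred from the message, not confirmed by it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kWarpSize = 32;  // warp size stated in the commit message

// Emulates a warp vote: bit i is set when lane i of the warp is active.
// Hypothetical helper; a real kernel would use a GPU ballot intrinsic.
static uint32_t warp_ballot(const std::vector<bool>& active, int warp_start) {
  assert(warp_start >= 0 && warp_start % kWarpSize == 0);
  uint32_t mask = 0;
  for (int lane = 0; lane < kWarpSize; ++lane) {
    const std::size_t thread = static_cast<std::size_t>(warp_start + lane);
    if (thread < active.size() && active[thread]) {
      mask |= (1u << lane);
    }
  }
  return mask;
}

// Counts the set bits of a 32-bit mask.
static int popcount32(uint32_t value) {
  int count = 0;
  while (value != 0) {
    value &= value - 1;  // clear the lowest set bit
    ++count;
  }
  return count;
}

// Simulates one block whose threads are described by `active`.
// per_warp == false: only block thread index 0 accumulates (assumed old rule).
// per_warp == true:  lane 0 of every warp accumulates (rule the message asks for).
int count_active(const std::vector<bool>& active, bool per_warp) {
  int num_active = 0;  // stands in for the atomically updated counter
  const int block_size = static_cast<int>(active.size());
  for (int thread_idx = 0; thread_idx < block_size; ++thread_idx) {
    const int lane_id = thread_idx % kWarpSize;
    const uint32_t mask = warp_ballot(active, thread_idx - lane_id);
    const bool accumulates = per_warp ? (lane_id == 0) : (thread_idx == 0);
    if (accumulates) {
      num_active += popcount32(mask);
    }
  }
  return num_active;
}
```

For a block of 128 threads that are all active, `count_active(threads, false)` returns 32 while `count_active(threads, true)` returns 128. This mirrors the symptom in the message, where the reported number of active pixels was much smaller than the number of pixels.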
Commit Details:
Full Hash: 3e9265a31a7573e93dc90d821bf6429474c7eb77
Parent Commit: 3a1eac0
Lines Changed: +2, -1
1 Modified Path:
/intern/cycles/kernel/device/cuda/kernel.cu (+2, -1)