Blender Git Commit Log
Revision 757c183 by Sergey Sharybin (cycles-x) April 6, 2021, 12:53 (GMT)
Cycles: Keep CUDA threads more coherent

The idea is to schedule smaller tiles with more samples per tile as a KernelWorkTile.

Currently the code attempts to schedule all samples into a single tile that is as big as the number of path states allows. This is not very efficient, in the sense that it might leave some of the path states unused. The way around it is to align the tile size down to a power of two and schedule more tiles.

In local tests, however, such tile size alignment actually eliminates the speedup from running many samples in parallel. This could be related to warp size, or to incoherency coming from other sources (the ones where coalescing might help). The code for aligning the tile size to a power of two is left as an if(false) branch for further investigation.

The number of samples is rounded to the closest power of two, which helps with path state divisions. In practice it might mean a somewhat longer interval between updates, but it shouldn't be that bad, since usually no more than 8 samples are scheduled.

In my own tests, the classroom render time without sync went down from 96 sec to 90 sec.
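The two rounding operations named in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the helper names `align_down_to_power_of_two`, `round_to_closest_power_of_two` and `pixels_per_tile` are hypothetical, the tie-breaking rule (ties round up) is an assumption, and the division of path states by the sample count is only one plausible reading of "helps with path state divisions".

```cpp
#include <cstdint>

// Hypothetical helper (not from the commit): largest power of two that is
// less than or equal to x. Assumes x >= 1.
constexpr uint32_t align_down_to_power_of_two(uint32_t x)
{
  uint32_t result = 1;
  while (result <= x / 2) {
    result *= 2;
  }
  return result;
}

// Hypothetical helper (not from the commit): power of two closest to x.
// Assumption: ties (e.g. 6, halfway between 4 and 8) round up.
// Assumes x >= 1 and that the rounded value fits in 32 bits.
constexpr uint32_t round_to_closest_power_of_two(uint32_t x)
{
  const uint32_t lower = align_down_to_power_of_two(x);
  const uint32_t upper = lower * 2;
  return (x - lower < upper - x) ? lower : upper;
}

// Hypothetical illustration: how many pixels fit into one tile when every
// pixel of the tile is rendered with num_samples samples in parallel.
// When both arguments are powers of two, the division leaves no remainder,
// so no path state is left unused.
constexpr uint32_t pixels_per_tile(uint32_t max_num_path_states, uint32_t num_samples)
{
  return max_num_path_states / num_samples;
}

// Compile-time checks of the behavior described in the comments above.
static_assert(align_down_to_power_of_two(1000) == 512, "align down");
static_assert(round_to_closest_power_of_two(5) == 4, "round down");
static_assert(round_to_closest_power_of_two(7) == 8, "round up");
static_assert(pixels_per_tile(1048576, 8) == 131072, "exact division");
```

With a power-of-two path state count and a power-of-two sample count, the number of pixels per tile is itself a power of two, which is the kind of clean division the commit message alludes to.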
Commit Details:
Full Hash: 757c183aaafecdb4368649b0a32a73605c663c50
Parent Commit: 63ac68c
Lines Changed: +150, -37
6 Modified Paths:
/intern/cycles/integrator/render_scheduler.cpp (+36, -1) (Diff)
/intern/cycles/integrator/tile.cpp (+48, -12) (Diff)
/intern/cycles/integrator/tile.h (+28, -3) (Diff)
/intern/cycles/integrator/work_tile_scheduler.cpp (+21, -11) (Diff)
/intern/cycles/integrator/work_tile_scheduler.h (+3, -2) (Diff)
/intern/cycles/test/integrator_tile_test.cpp (+14, -8) (Diff)