Blender Git Log

Revision 757c183 by Sergey Sharybin (cycles-x)
April 6, 2021, 12:53 (GMT)
Cycles: Keep CUDA threads more coherent

The idea is to schedule smaller tiles with more samples per tile as a
KernelWorkTile.

Currently the code attempts to schedule all samples into a single
tile which is as big as the number of path states allows. This is not
very efficient in the sense that it might leave some of the path states
unused. The way around it is to align the tile size down to a power of
two and schedule more tiles. In local tests, however, such tile size
alignment actually eliminates the speedup from many samples running in
parallel. This could be related to warp size, or to incoherency coming
from other sources (the ones where coalescing might help). The code
which aligns the tile size to a power of two is left as an if(false)
branch for further investigation.

The number of samples is rounded to the closest power of two, which
helps with path state divisions. In practice it might mean a somewhat
longer interval between updates, but it shouldn't be that bad since
usually no more than 8 samples are scheduled.
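
The rounding could look roughly like the following C++ sketch. The
function name round_to_closest_pow2 is hypothetical, and the tie-breaking
rule (values exactly halfway between two powers of two round up) is an
assumption, since the commit message does not specify it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round n to the closest power of two.
// Assumption: ties (for example 3, which is halfway between 2 and 4) round up.
// Requires 0 < n <= 2^31 so the upper candidate fits in 32 bits.
inline uint32_t round_to_closest_pow2(uint32_t n)
{
  assert(n > 0 && n <= (1u << 31));

  // Largest power of two that is <= n.
  uint32_t lower = 1;
  while (lower <= n / 2) {
    lower *= 2;
  }
  if (lower == n) {
    return n;  // Already a power of two.
  }

  const uint32_t upper = lower * 2;
  return (n - lower < upper - n) ? lower : upper;
}
```

For example, 5 rounds to 4, while 6 and 7 round to 8. A power-of-two
sample count divides a power-of-two number of path states evenly, which
is presumably the "path state divisions" benefit mentioned above.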

In my own tests the classroom render time without sync went down from
96 sec to 90 sec.

Commit Details:

Full Hash: 757c183aaafecdb4368649b0a32a73605c663c50
Parent Commit: 63ac68c
Lines Changed: +150, -37

6 Modified Paths:

/intern/cycles/integrator/render_scheduler.cpp (+36, -1)
/intern/cycles/integrator/tile.cpp (+48, -12)
/intern/cycles/integrator/tile.h (+28, -3)
/intern/cycles/integrator/work_tile_scheduler.cpp (+21, -11)
/intern/cycles/integrator/work_tile_scheduler.h (+3, -2)
/intern/cycles/test/integrator_tile_test.cpp (+14, -8)

Made by: Miika Hämäläinen. Last updated: 07.11.2014 14:18. MiikaH:n Sivut a.k.a. MiikaHweb | 2003-2021