Blender Git Commit Log

Revision 27db38f by Patrick Mours (cycles-x)
July 6, 2021, 13:51 (GMT)
Cycles X: Shading performance improvements by changing inlining behavior for SVM

The shading kernels (shade_surface, ...) are heavily memory-bound. I found several hotspots
where execution stalled while waiting for spilled values to be loaded back into registers.
This can be improved by changing the inlining logic:

For example, the compiler did not inline "kernel_write_denoising_features" (even though it
was marked __inline__), which forced a synchronization before the function call.
Forcing it inline avoided that and got rid of the hotspot.
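
As a rough illustration of the difference between an inline hint and forced inlining, here is a minimal host-side C++ sketch. The SKETCH_FORCEINLINE macro, the Features struct and both functions are hypothetical names invented for this example; they are not the Cycles code. In a CUDA kernel the corresponding qualifier would be __forceinline__.

```cpp
// Hypothetical macro for this sketch. On the host it maps to the compiler's
// "always inline" attribute; in CUDA device code the equivalent qualifier
// is __forceinline__.
#if defined(_MSC_VER)
#  define SKETCH_FORCEINLINE __forceinline
#else
#  define SKETCH_FORCEINLINE inline __attribute__((always_inline))
#endif

// Hypothetical per-sample feature data (not the real Cycles layout).
struct Features {
  float albedo;
  float depth;
};

// A plain "inline" is only a hint that the compiler may ignore. The forced
// variant asks the compiler to expand the body at every call site, so no
// call (and no register save/restore around it) is emitted.
SKETCH_FORCEINLINE void write_features(float *buffer, const Features &f, float weight)
{
  buffer[0] += f.albedo * weight;
  buffer[1] += f.depth * weight;
}

// Hypothetical caller standing in for a shading kernel.
float shade_sample(float *buffer, float weight)
{
  const Features f = {0.5f, 2.0f};
  write_features(buffer, f, weight);
  return buffer[0] + buffer[1];
}
```

Whether forcing a function inline actually helps depends on the compiler and on register pressure at the call site, so a change like this should be checked with a profiler.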

Then there was cubic texture filtering and NanoVDB, which introduced huge code chunks
into each texture sampling evaluation (increasing register and instruction cache pressure),
even though they are rarely used. Marking them __noinline__ moves that code out of line,
so the overhead is only paid when they are actually used.
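
The same idea in the opposite direction can be sketched as follows: a large, rarely taken path is kept out of line so that the common path stays small. This is a simplified host-side C++ sketch with a 1D texture. SKETCH_NOINLINE, the Interpolation enum and all function names are assumptions made for illustration, and the Catmull-Rom filter is only a stand-in for the much larger cubic and NanoVDB code the commit refers to.

```cpp
#include <cmath>

// Hypothetical macro for this sketch. In CUDA device code the equivalent
// qualifier is __noinline__.
#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

enum Interpolation { INTERP_LINEAR, INTERP_CUBIC };

// Clamp a texel index to the valid range [0, n - 1].
static int clamp_index(int i, int n)
{
  return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

// Rarely used path, kept out of line so its instructions and registers are
// not part of every call site. Catmull-Rom interpolation over four texels.
SKETCH_NOINLINE float sample_cubic(const float *texels, int n, float x)
{
  const int i = (int)std::floor(x);
  const float f = x - (float)i;
  const float p0 = texels[clamp_index(i - 1, n)];
  const float p1 = texels[clamp_index(i, n)];
  const float p2 = texels[clamp_index(i + 1, n)];
  const float p3 = texels[clamp_index(i + 2, n)];
  return p1 + 0.5f * f *
                  (p2 - p0 +
                   f * (2.0f * p0 - 5.0f * p1 + 4.0f * p2 - p3 +
                        f * (3.0f * (p1 - p2) + p3 - p0)));
}

// Common path: linear filtering is handled directly, and the cubic code is
// reached only through a call when it is requested.
float sample_texture(const float *texels, int n, float x, Interpolation interp)
{
  if (interp == INTERP_CUBIC) {
    return sample_cubic(texels, n, x);
  }
  const int i = (int)std::floor(x);
  const float f = x - (float)i;
  const float a = texels[clamp_index(i, n)];
  const float b = texels[clamp_index(i + 1, n)];
  return a + f * (b - a);
}
```

The trade-off is an extra call on the rare path in exchange for a smaller common path.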

Another case is the SVM. The compiler currently converts the node type switch statement
into a binary-searched branch sequence. This means that, depending on the SVM node hit, the
GPU has to branch over large portions of code, which increases instruction cache pressure
immensely (the GPU fetches lots of code even for stuff it immediately jumps away from
again while working through the binary-searched branches). This can be reduced somewhat
by making all the node functions __noinline__, so that the GPU only has to branch over a
bunch of call instructions, rather than all the inlined code.

The SVM "offset" value is now passed by value into the node functions and returned as the
function return value, so that the compiler keeps it in a register. When it was passed as
a pointer, the compiler was forced to move it into local memory in OptiX (since functions
are compiled separately there, the compiler is unaware of how that pointer is used).
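
The dispatch pattern described above can be sketched with a toy interpreter. This is not the Cycles SVM: the node types, the program encoding, SKETCH_NOINLINE and every function name are hypothetical. It only shows the two points made in the text: node functions are kept out of line so the switch contains little more than calls, and the offset goes in by value and comes back as the return value instead of being updated through a pointer.

```cpp
#include <cstdint>

// Hypothetical macro for this sketch. In CUDA device code the equivalent
// qualifier is __noinline__.
#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

// Hypothetical node types for a toy stack machine.
enum NodeType : uint32_t { NODE_END = 0, NODE_VALUE = 1, NODE_ADD = 2 };

// Encoding: NODE_VALUE, stack slot, integer constant.
// Takes the offset by value and returns the advanced offset.
SKETCH_NOINLINE int node_value(const uint32_t *program, float *stack, int offset)
{
  const uint32_t slot = program[offset++];
  const uint32_t value = program[offset++];
  stack[slot] = (float)value;
  return offset;
}

// Encoding: NODE_ADD, input slot a, input slot b, output slot.
SKETCH_NOINLINE int node_add(const uint32_t *program, float *stack, int offset)
{
  const uint32_t a = program[offset++];
  const uint32_t b = program[offset++];
  const uint32_t out = program[offset++];
  stack[out] = stack[a] + stack[b];
  return offset;
}

// Evaluation loop: each case is a single call, so skipping over a case
// means skipping one call instruction rather than a whole inlined body.
// The result is read from stack slot 0 when NODE_END is reached.
float eval_program(const uint32_t *program, float *stack)
{
  int offset = 0;
  for (;;) {
    const uint32_t type = program[offset++];
    switch (type) {
      case NODE_VALUE:
        offset = node_value(program, stack, offset);
        break;
      case NODE_ADD:
        offset = node_add(program, stack, offset);
        break;
      case NODE_END:
      default:
        return stack[0];
    }
  }
}
```

For example, a program that stores 3 in slot 1, stores 4 in slot 2, adds the two into slot 0 and then ends would be encoded as {NODE_VALUE, 1, 3, NODE_VALUE, 2, 4, NODE_ADD, 1, 2, 0, NODE_END}.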

Differential Revision: https://developer.blender.org/D11816

Commit Details:

Full Hash: 27db38f0a729411f13c99a60574d59c70d461be5
Parent Commit: 026a355
Lines Changed: +430, -396

48 Modified Paths:

/intern/cycles/kernel/device/cuda/image.h (+4, -3)
/intern/cycles/kernel/kernel_passes.h (+2, -2)
/intern/cycles/kernel/svm/svm.h (+40, -36)
/intern/cycles/kernel/svm/svm_ao.h (+7, -1)
/intern/cycles/kernel/svm/svm_attribute.h (+12, -9)
/intern/cycles/kernel/svm/svm_bevel.h (+7, -4)
/intern/cycles/kernel/svm/svm_blackbody.h (+5, -5)
/intern/cycles/kernel/svm/svm_brick.h (+6, -5)
/intern/cycles/kernel/svm/svm_brightness.h (+1, -1)
/intern/cycles/kernel/svm/svm_bump.h (+8, -8)
/intern/cycles/kernel/svm/svm_camera.h (+6, -6)
/intern/cycles/kernel/svm/svm_checker.h (+4, -4)
/intern/cycles/kernel/svm/svm_clamp.h (+9, -8)
/intern/cycles/kernel/svm/svm_closure.h (+37, -35)
/intern/cycles/kernel/svm/svm_convert.h (+1, -1)
/intern/cycles/kernel/svm/svm_displace.h (+12, -11)
/intern/cycles/kernel/svm/svm_fresnel.h (+2, -2)
/intern/cycles/kernel/svm/svm_gamma.h (+1, -1)
/intern/cycles/kernel/svm/svm_geometry.h (+6, -6)
/intern/cycles/kernel/svm/svm_gradient.h (+1, -1)
/intern/cycles/kernel/svm/svm_hsv.h (+2, -2)
/intern/cycles/kernel/svm/svm_ies.h (+4, -2)
/intern/cycles/kernel/svm/svm_image.h (+14, -13)
/intern/cycles/kernel/svm/svm_invert.h (+1, -1)
/intern/cycles/kernel/svm/svm_light_path.h (+7, -7)
/intern/cycles/kernel/svm/svm_magic.h (+4, -3)
/intern/cycles/kernel/svm/svm_mapping.h (+25, -24)
/intern/cycles/kernel/svm/svm_map_range.h (+10, -9)
/intern/cycles/kernel/svm/svm_math.h (+15, -15)
/intern/cycles/kernel/svm/svm_mix.h (+9, -8)
/intern/cycles/kernel/svm/svm_musgrave.h (+10, -9)
/intern/cycles/kernel/svm/svm_noisetex.h (+10, -9)
/intern/cycles/kernel/svm/svm_normal.h (+9, -8)
/intern/cycles/kernel/svm/svm_ramp.h (+14, -12)
/intern/cycles/kernel/svm/svm_sepcomb_hsv.h (+18, -16)
/intern/cycles/kernel/svm/svm_sky.h (+14, -13)
/intern/cycles/kernel/svm/svm_tex_coord.h (+28, -22)
/intern/cycles/kernel/svm/svm_types.h (+2, -2)
/intern/cycles/kernel/svm/svm_value.h (+4, -3)
/intern/cycles/kernel/svm/svm_vector_rotate.h (+5, -5)
/intern/cycles/kernel/svm/svm_vector_transform.h (+4, -4)
/intern/cycles/kernel/svm/svm_vertex_color.h (+18, -30)
/intern/cycles/kernel/svm/svm_voronoi.h (+10, -9)
/intern/cycles/kernel/svm/svm_voxel.h (+6, -5)
/intern/cycles/kernel/svm/svm_wave.h (+5, -4)
/intern/cycles/kernel/svm/svm_wavelength.h (+1, -1)
/intern/cycles/kernel/svm/svm_white_noise.h (+6, -7)
/intern/cycles/kernel/svm/svm_wireframe.h (+4, -4)