Blender Git Loki
Revision 27db38f by Patrick Mours (cycles-x) July 6, 2021, 13:51 (GMT)
Cycles X: Shading performance improvements by changing inlining behavior for SVM

The shading kernels (shade_surface, ...) are limited by memory a lot. I found several hotspots where execution was stalled waiting for spills to be loaded back into registers. That's something that can be adjusted by changing the inlining logic:

For example, the compiler did not inline "kernel_write_denoising_features" (even though it was marked __inline__), which caused it to force synchronization before the function call. Forcing it inline avoided that and got rid of that hotspot.

Then there was cubic texture filtering and NanoVDB, which introduced huge code chunks into each texture sampling evaluation (increasing register and instruction cache pressure), even though they are rarely actually used. Making them __noinline__ outsources that overhead to only occur when actually used.

Another case is the SVM. The compiler currently converts the node type switch statement into a binary searched branch sequence. This means depending on the SVM node hit, the GPU has to branch over large portions of code, which increases instruction cache pressure immensely (GPU is fetching lots of code even for stuff it immediately jumps away from again, while jumping through the binary searched branches). This can be reduced somewhat by making all the node functions __noinline__, so that the GPU only has to branch over a bunch of call instructions, rather than all the inlined code.

The SVM "offset" value is passed by value into the node functions now and returned through function return value, to make the compiler keep it in a register. Otherwise when passed as a pointer, in OptiX the compiler was forced to move it into local memory (since functions are compiled separately there, so the compiler is unaware of how that pointer is used).

Differential Revision: https://developer.blender.org/D11816
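The two SVM changes described above (node functions kept out of line, and the "offset" passed by value and returned) can be illustrated with a minimal host-side sketch. This is plain C++ standing in for the CUDA/OptiX kernel code, not the actual Cycles source: the macro SKETCH_NOINLINE, the node types, the program encoding and all function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portable stand-in for the CUDA __noinline__ qualifier.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE
#endif

// Hypothetical node types for a toy stack-based shader VM.
enum SketchNodeType : uint32_t {
  SKETCH_NODE_END = 0,
  SKETCH_NODE_VALUE = 1,  // operands: dst, integer constant
  SKETCH_NODE_ADD = 2,    // operands: a, b, dst
  SKETCH_NODE_MUL = 3     // operands: a, b, dst
};

// Each node takes the offset BY VALUE, consumes its operands and RETURNS the
// advanced offset, so the caller never has to expose the offset's address.
SKETCH_NOINLINE int sketch_node_value(const uint32_t *program, float *stack, int offset)
{
  const uint32_t dst = program[offset++];
  const uint32_t value = program[offset++];
  stack[dst] = static_cast<float>(value);
  return offset;
}

SKETCH_NOINLINE int sketch_node_add(const uint32_t *program, float *stack, int offset)
{
  const uint32_t a = program[offset++];
  const uint32_t b = program[offset++];
  const uint32_t dst = program[offset++];
  stack[dst] = stack[a] + stack[b];
  return offset;
}

SKETCH_NOINLINE int sketch_node_mul(const uint32_t *program, float *stack, int offset)
{
  const uint32_t a = program[offset++];
  const uint32_t b = program[offset++];
  const uint32_t dst = program[offset++];
  stack[dst] = stack[a] * stack[b];
  return offset;
}

// The dispatch loop: because the node bodies are not inlined, each case of the
// switch is only a call instruction plus an assignment to the local offset.
float sketch_svm_eval(const uint32_t *program, float *stack)
{
  int offset = 0;
  for (;;) {
    switch (program[offset++]) {
      case SKETCH_NODE_VALUE:
        offset = sketch_node_value(program, stack, offset);
        break;
      case SKETCH_NODE_ADD:
        offset = sketch_node_add(program, stack, offset);
        break;
      case SKETCH_NODE_MUL:
        offset = sketch_node_mul(program, stack, offset);
        break;
      case SKETCH_NODE_END:
        return stack[0];
      default:
        assert(!"unknown node type");
        return 0.0f;
    }
  }
}

// Builds and runs a tiny program computing (2 + 3) * 3, which evaluates to 15.
float sketch_run_example()
{
  const uint32_t program[] = {
      SKETCH_NODE_VALUE, 0, 2,   // stack[0] = 2
      SKETCH_NODE_VALUE, 1, 3,   // stack[1] = 3
      SKETCH_NODE_ADD, 0, 1, 2,  // stack[2] = stack[0] + stack[1]
      SKETCH_NODE_MUL, 2, 1, 0,  // stack[0] = stack[2] * stack[1]
      SKETCH_NODE_END};
  float stack[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  return sketch_svm_eval(program, stack);
}
```

Calling sketch_run_example() returns 15. On a CPU the effect of these qualifiers is small; the sketch only shows the shape of the change, whereas the register and instruction cache benefits described in the commit message apply to the GPU compilers.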
Commit Details:
Full Hash: 27db38f0a729411f13c99a60574d59c70d461be5
Parent Commit: 026a355
Lines Changed: +430, -396
48 Modified Paths:
/intern/cycles/kernel/device/cuda/image.h (+4, -3) (Diff)
/intern/cycles/kernel/kernel_passes.h (+2, -2) (Diff)
/intern/cycles/kernel/svm/svm.h (+40, -36) (Diff)
/intern/cycles/kernel/svm/svm_ao.h (+7, -1) (Diff)
/intern/cycles/kernel/svm/svm_attribute.h (+12, -9) (Diff)
/intern/cycles/kernel/svm/svm_bevel.h (+7, -4) (Diff)
/intern/cycles/kernel/svm/svm_blackbody.h (+5, -5) (Diff)
/intern/cycles/kernel/svm/svm_brick.h (+6, -5) (Diff)
/intern/cycles/kernel/svm/svm_brightness.h (+1, -1) (Diff)
/intern/cycles/kernel/svm/svm_bump.h (+8, -8) (Diff)
/intern/cycles/kernel/svm/svm_camera.h (+6, -6) (Diff)
/intern/cycles/kernel/svm/svm_checker.h (+4, -4) (Diff)
/intern/cycles/kernel/svm/svm_clamp.h (+9, -8) (Diff)
/intern/cycles/kernel/svm/svm_closure.h (+37, -35) (Diff)
/intern/cycles/kernel/svm/svm_convert.h (+1, -1) (Diff)
/intern/cycles/kernel/svm/svm_displace.h (+12, -11) (Diff)
/intern/cycles/kernel/svm/svm_fresnel.h (+2, -2) (Diff)
/intern/cycles/kernel/svm/svm_gamma.h (+1, -1) (Diff)
/intern/cycles/kernel/svm/svm_geometry.h (+6, -6) (Diff)
/intern/cycles/kernel/svm/svm_gradient.h (+1, -1) (Diff)
/intern/cycles/kernel/svm/svm_hsv.h (+2, -2) (Diff)
/intern/cycles/kernel/svm/svm_ies.h (+4, -2) (Diff)
/intern/cycles/kernel/svm/svm_image.h (+14, -13) (Diff)
/intern/cycles/kernel/svm/svm_invert.h (+1, -1) (Diff)
/intern/cycles/kernel/svm/svm_light_path.h (+7, -7) (Diff)
/intern/cycles/kernel/svm/svm_magic.h (+4, -3) (Diff)
/intern/cycles/kernel/svm/svm_mapping.h (+25, -24) (Diff)
/intern/cycles/kernel/svm/svm_map_range.h (+10, -9) (Diff)
/intern/cycles/kernel/svm/svm_math.h (+15, -15) (Diff)
/intern/cycles/kernel/svm/svm_mix.h (+9, -8) (Diff)
/intern/cycles/kernel/svm/svm_musgrave.h (+10, -9) (Diff)
/intern/cycles/kernel/svm/svm_noisetex.h (+10, -9) (Diff)
/intern/cycles/kernel/svm/svm_normal.h (+9, -8) (Diff)
/intern/cycles/kernel/svm/svm_ramp.h (+14, -12) (Diff)
/intern/cycles/kernel/svm/svm_sepcomb_hsv.h (+18, -16) (Diff)
/intern/cycles/kernel/svm/svm_sky.h (+14, -13) (Diff)
/intern/cycles/kernel/svm/svm_tex_coord.h (+28, -22) (Diff)
/intern/cycles/kernel/svm/svm_types.h (+2, -2) (Diff)
/intern/cycles/kernel/svm/svm_value.h (+4, -3) (Diff)
/intern/cycles/kernel/svm/svm_vector_rotate.h (+5, -5) (Diff)
/intern/cycles/kernel/svm/svm_vector_transform.h (+4, -4) (Diff)
/intern/cycles/kernel/svm/svm_vertex_color.h (+18, -30) (Diff)
/intern/cycles/kernel/svm/svm_voronoi.h (+10, -9) (Diff)
/intern/cycles/kernel/svm/svm_voxel.h (+6, -5) (Diff)
/intern/cycles/kernel/svm/svm_wave.h (+5, -4) (Diff)
/intern/cycles/kernel/svm/svm_wavelength.h (+1, -1) (Diff)
/intern/cycles/kernel/svm/svm_white_noise.h (+6, -7) (Diff)
/intern/cycles/kernel/svm/svm_wireframe.h (+4, -4) (Diff)