July 6, 2021, 14:46 (GMT) |
Cleanup: White space |
July 6, 2021, 14:46 (GMT) |
Fix surface objects |
July 6, 2021, 13:51 (GMT) |
Cycles X: Shading performance improvements by changing inlining behavior for SVM. The shading kernels (shade_surface, ...) are heavily limited by memory. I found several hotspots where execution was stalled waiting for spills to be loaded back into registers. That can be adjusted by changing the inlining logic. For example, the compiler did not inline "kernel_write_denoising_features" (even though it was marked __inline__), which caused it to force synchronization before the function call. Forcing it inline avoided that and got rid of that hotspot. Then there was cubic texture filtering and NanoVDB, which introduced huge code chunks into each texture sampling evaluation (increasing register and instruction cache pressure), even though they are rarely used. Making them __noinline__ confines that overhead to the cases where they are actually used. Another case is the SVM. The compiler currently converts the node type switch statement into a binary searched branch sequence. This means that, depending on the SVM node hit, the GPU has to branch over large portions of code, which increases instruction cache pressure immensely (the GPU fetches lots of code even for parts it immediately jumps away from again while stepping through the binary searched branches). This can be reduced somewhat by making all the node functions __noinline__, so that the GPU only has to branch over a bunch of call instructions rather than all the inlined code. The SVM "offset" value is now passed by value into the node functions and returned through the function return value, so that the compiler keeps it in a register. When it was passed as a pointer, the compiler in OptiX was forced to move it into local memory (functions are compiled separately there, so the compiler is unaware of how that pointer is used). Differential Revision: https://developer.blender.org/D11816 |
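The by-value offset change described above can be illustrated with a minimal sketch. It is not the Cycles SVM code: the node constants, `SketchState`, the `sketch_*` functions, and the `SKETCH_NOINLINE` macro are hypothetical stand-ins, and the toy program format (a node type word followed by one operand word) is an assumption made for the example. Each node handler is marked noinline, so the switch only has to jump over call instructions, and each handler receives the offset by value and returns the updated one instead of writing through a pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical macro: ask the compiler not to inline node handlers.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_NOINLINE __attribute__((noinline))
#else
#  define SKETCH_NOINLINE
#endif

// Hypothetical node types; not the actual SVM node list.
constexpr uint32_t SKETCH_NODE_END = 0;
constexpr uint32_t SKETCH_NODE_VALUE = 1;
constexpr uint32_t SKETCH_NODE_ADD = 2;

struct SketchState {
  float result = 0.0f;
};

// Reads one operand, then returns the advanced offset by value.
SKETCH_NOINLINE int sketch_node_value(const std::vector<uint32_t> &program,
                                      SketchState &state,
                                      int offset)
{
  assert(offset < static_cast<int>(program.size()));
  state.result = static_cast<float>(program[offset]);
  return offset + 1;
}

// Adds one operand to the running result, then returns the advanced offset.
SKETCH_NOINLINE int sketch_node_add(const std::vector<uint32_t> &program,
                                    SketchState &state,
                                    int offset)
{
  assert(offset < static_cast<int>(program.size()));
  state.result += static_cast<float>(program[offset]);
  return offset + 1;
}

// The evaluation loop owns the offset as a plain local variable.
float sketch_eval(const std::vector<uint32_t> &program)
{
  SketchState state;
  int offset = 0;
  while (offset < static_cast<int>(program.size())) {
    const uint32_t type = program[offset++];
    switch (type) {
      case SKETCH_NODE_VALUE:
        offset = sketch_node_value(program, state, offset);
        break;
      case SKETCH_NODE_ADD:
        offset = sketch_node_add(program, state, offset);
        break;
      default:  // SKETCH_NODE_END or an unknown node stops evaluation.
        return state.result;
    }
  }
  return state.result;
}
```

Because `offset` never has its address taken in this sketch, a compiler is free to keep it in a register across the calls, which is the effect the commit message describes for OptiX.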
July 6, 2021, 13:48 (GMT) |
Merge remote-tracking branch 'origin/master' into temp-lineart-contained |
July 6, 2021, 11:38 (GMT) |
Fix crash rendering some scenes after master merge. Now that lamps are handled as primitives in intersections, we must include them in the bitmask used for packing primitive type bits. |
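As a rough illustration of why a new primitive type has to be part of the packing mask, here is a small sketch. The flag values, the bit count, and the `sketch_*` names are hypothetical and do not reflect the actual Cycles definitions. It packs a primitive type flag together with a segment index, and shows that a mask which predates the lamp flag silently drops it on unpack.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical primitive type flags; not the real kernel values.
constexpr uint32_t SKETCH_PRIMITIVE_TRIANGLE = 1u << 0;
constexpr uint32_t SKETCH_PRIMITIVE_CURVE = 1u << 1;
constexpr uint32_t SKETCH_PRIMITIVE_LAMP = 1u << 2;

// Mask that includes every primitive type, lamps included.
constexpr uint32_t SKETCH_PRIMITIVE_ALL = SKETCH_PRIMITIVE_TRIANGLE |
                                          SKETCH_PRIMITIVE_CURVE |
                                          SKETCH_PRIMITIVE_LAMP;
// Stale mask from before lamps were treated as primitives.
constexpr uint32_t SKETCH_PRIMITIVE_ALL_STALE = SKETCH_PRIMITIVE_TRIANGLE |
                                                SKETCH_PRIMITIVE_CURVE;
// Number of low bits reserved for the type flags.
constexpr int SKETCH_PRIMITIVE_NUM_BITS = 3;

// Packs the segment index above the type bits.
constexpr uint32_t sketch_pack(uint32_t type, uint32_t segment)
{
  return (segment << SKETCH_PRIMITIVE_NUM_BITS) | (type & SKETCH_PRIMITIVE_ALL);
}

// Recovers the type flag using the given mask.
constexpr uint32_t sketch_unpack_type(uint32_t packed, uint32_t mask)
{
  return packed & mask;
}

// Recovers the segment index.
constexpr uint32_t sketch_unpack_segment(uint32_t packed)
{
  return packed >> SKETCH_PRIMITIVE_NUM_BITS;
}
```

With the complete mask the lamp flag round-trips; with the stale mask the unpacked type is zero, which is the kind of mismatch that can lead to a crash further down the line.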
July 6, 2021, 10:38 (GMT) |
Cycles X: Make pass definition more robust to changes. Previously, adding, removing, or even reordering passes in kernel_types.h would likely break the display pass enum, because the Python enum relied on an exact match of enum item values. Now we do an identifier-based lookup via `Pass::get_type_enum`, which makes it safer to change passes in the kernel without risk of breaking display passes. Additionally, conversion of a pass to a string now also goes through `Pass::get_type_enum`. All in all, with this change the pass type enum is the source of truth. Differential Revision: https://developer.blender.org/D11823 |
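The idea of an identifier-based lookup can be sketched as follows. This is not the actual `Pass::get_type_enum` implementation: the enum, the item table, and the `sketch_*` functions are hypothetical names chosen for the example. A single table pairs each pass type with a string identifier, and both directions (identifier to type, type to string) go through that table, so the numeric enum values can change without affecting callers that use identifiers.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical pass types; numeric values are free to change.
enum SketchPassType {
  SKETCH_PASS_NONE = 0,
  SKETCH_PASS_COMBINED,
  SKETCH_PASS_DEPTH,
  SKETCH_PASS_NORMAL,
};

struct SketchPassItem {
  SketchPassType type;
  const char *identifier;
};

// Single table acting as the source of truth for both conversions.
static const SketchPassItem sketch_pass_items[] = {
    {SKETCH_PASS_COMBINED, "COMBINED"},
    {SKETCH_PASS_DEPTH, "DEPTH"},
    {SKETCH_PASS_NORMAL, "NORMAL"},
};

// Looks up a pass type by identifier; returns SKETCH_PASS_NONE if unknown.
SketchPassType sketch_pass_type_from_identifier(const char *identifier)
{
  for (const SketchPassItem &item : sketch_pass_items) {
    if (std::strcmp(item.identifier, identifier) == 0) {
      return item.type;
    }
  }
  return SKETCH_PASS_NONE;
}

// Converts a pass type to its identifier; returns "" if not in the table.
const char *sketch_pass_type_as_string(SketchPassType type)
{
  for (const SketchPassItem &item : sketch_pass_items) {
    if (item.type == type) {
      return item.identifier;
    }
  }
  return "";
}
```

Inserting a new entry in the middle of the enum shifts the numeric values of later passes, but a caller that asks for "DEPTH" still gets the depth pass.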
July 6, 2021, 10:26 (GMT) |
Cycles X: Allow viewing denoising passes in viewport. Can be used without denoiser configured (acting as if the denoising data passes are enabled in the view layer options). Differential Revision: https://developer.blender.org/D11821 |
July 6, 2021, 10:25 (GMT) |
Cycles X: Only copy denoised passes for multi-device render. No functional changes, and timing of the denoising process should be about the same. The change opens the door for denoisers to modify data in-place, avoiding extra allocation in the denoisers and lowering the memory peak of the denoising process. Differential Revision: https://developer.blender.org/D11815 |
July 6, 2021, 09:36 (GMT) |
Merge branch 'master' into cycles-x |
July 6, 2021, 09:13 (GMT) |
Cycles X: Reduce memory usage when denoising in multi-device render. The idea is to create a full big tile buffer on the actual device which will be used for denoising. This avoids OptiX creating yet another copy of the render buffers on the actual device. Mainly moving some lines around from DeviceDenoiser to Denoiser to make the logic more accessible to all denoisers, and in the path tracer. Assume allocation is cheaper than data transfer, so that some TODOs are marked as done. It's possible to reduce memory even further by allowing OIDN and OptiX to modify the copy of the render buffers in-place, as it can be thrown away. This is considered an independent further improvement and is not tackled in this change. Differential Revision: https://developer.blender.org/D11814 |
July 6, 2021, 08:42 (GMT) |
Vulkan: Add support for Wayland. |
July 6, 2021, 07:38 (GMT) |
Merge branch 'master' into temp-D10912-change-group-socket-types |
July 6, 2021, 07:02 (GMT) |
Fix merge conflict. Missing parameter when creating EGLContext. |
July 6, 2021, 05:45 (GMT) |
Nicer labels for integer and color sockets. |
July 6, 2021, 05:33 (GMT) |
Remove the ICON_PLUGIN from node socket RNA. |
July 6, 2021, 05:19 (GMT) |
Removed the colored circle next to type button. |
July 6, 2021, 05:07 (GMT) |
Put back socket type enums that were removed. |
July 6, 2021, 04:12 (GMT) |
USD import crash workaround. Temporary workaround to prevent a crash when reading Bezier curves. This is a placeholder until Bezier curves are properly supported. |
July 6, 2021, 03:48 (GMT) |
Merge branch 'master' into curve-nodes-modifier |
July 6, 2021, 03:47 (GMT) |
Change report type from "Error" to "Warning" |
|