Blender Git Log

Blender Git "cycles-x" branch commits.


March 26, 2021, 08:58 (GMT)
Cycles: Use better check for CUDA graphics interoperability

The idea is to disable graphics interoperability if the compute device
is not part of the OpenGL context.

Test setup:
- Linux Debian Testing
- RTX 6000 used as non-display compute device
- NVIDIA driver 460.56

Viewport render time of a very simple scene is about 17% faster
than with interop.
March 26, 2021, 08:58 (GMT)
Cycles: Use const qualifier in GPUDisplay reset
March 26, 2021, 08:58 (GMT)
Cycles: Support CUDA Graphics Interoperability

Allows updating GPUDisplay from CUDA without copying pixels
from GPU to CPU and back to GPU.

This is mostly a refactor of the old code to fit into the new GPUDisplay
and PathTraceWork.

On a Titan V, which is the display card here, CUDA viewport rendering
of a simple scene is about 10% faster than with the round trip.

The compute device (RTX 6000) is 17% slower than with the round trip.
I would not consider this a blocker for this change, as a similar timing
difference can be observed with the old DisplayBuffer style of drawing.
March 26, 2021, 08:58 (GMT)
Cycles: Remove old pixels drawing and film conversion

The conversion to byte buffer is now dead code. A decision is needed
on whether to remove it or bring it back.
March 24, 2021, 13:42 (GMT)
Cleanup: minor cleanups in preparation of render pass changes
March 24, 2021, 13:42 (GMT)
Cleanup: recategorize some render passes

* Color/albedo/mist are now considered data passes
* Combined is now considered a light pass

Fits better with how they will be written out separately as part of data passes
and light accumulation.
March 24, 2021, 13:42 (GMT)
Cycles: restore writing of data render passes

* This includes depth, UV, crypto, albedo/color, mist and denoising features
* Now all are written at every bounce instead of at the end
* Mist is negated after rendering similar to alpha, so we don't have to track
it in the path state
March 24, 2021, 09:19 (GMT)
Cycles: Cleanup, mark argument as unused

Fixes strict compiler warning.
March 24, 2021, 09:10 (GMT)
Cycles: Avoid memory allocation on every copy_to_gpu_display()

Affects PathTraceWorkGPU. Measured on my own desktop, it is about
10% faster on a very simple scene with 32 samples.

Technically, this potentially increases peak GPU memory usage, but
hopefully not by much, and the performance gain should be worth it.
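
A minimal sketch of the general idea, not the actual Cycles code: keep one staging buffer alive between display updates and grow it only when a larger size is requested. The class DisplayStagingBuffer and its methods are hypothetical names used for illustration, and host memory stands in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical staging buffer that persists between display updates.
// A std::vector stands in for a device allocation in this sketch.
class DisplayStagingBuffer {
 public:
  // Returns storage for `num_pixels` RGBA float pixels.
  // Memory is (re)allocated only when the request exceeds the current size.
  float *ensure_size(std::size_t num_pixels)
  {
    const std::size_t num_floats = num_pixels * 4;
    if (num_floats > storage_.size()) {
      storage_.resize(num_floats);
      ++num_allocations_;
    }
    assert(storage_.size() >= num_floats);
    return storage_.data();
  }

  // Number of times the storage had to grow, for illustration only.
  std::size_t num_allocations() const
  {
    return num_allocations_;
  }

 private:
  std::vector<float> storage_;
  std::size_t num_allocations_ = 0;
};
```

Because the storage never shrinks, it stays at the largest size ever requested, which is the peak memory trade-off mentioned above.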
March 24, 2021, 09:10 (GMT)
Cycles: Update comment in the GPUDisplayParams

Make it explicit that display parameters are not aware of the
resolution divider.
March 24, 2021, 09:10 (GMT)
Cycles: Switch copy_to_gpu_display from Task to Queue
March 23, 2021, 17:25 (GMT)
Cleanup: deduplicate creation of kernel thread globals
March 23, 2021, 17:25 (GMT)
Cycles: add back displacement and background shader evaluation

* Add ShaderEval class similar to PathTrace which takes care of scheduling
these shader evaluations on device, instead of using DeviceTask.
* SVM shader evaluation can now have a NULL integrator state, since we
don't want to allocate a IntegratorState array for this.
March 23, 2021, 17:25 (GMT)
Cleanup: use meaningful KernelShaderEvalInput instead of uint4
March 23, 2021, 13:17 (GMT)
Cycles: Silence unused kg arguments in CPU kernels

The kernel globals might not be used by some of the kernels.
This change makes it so such kernels do not generate a strict compiler
warning.

The motivation is to still be able to see other unused arguments (which
could be left over from a refactor and should be removed) without
polluting the compile log with the warnings mentioned above.

Uses a C++17 attribute, since this seems to be the most portable
solution across compilers.
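
For illustration, the standard C++17 attribute for this purpose is [[maybe_unused]]. The sketch below assumes that attribute is the one meant here; the KernelGlobals stand-in and the kernel_scale_pixel function are hypothetical and not taken from the Cycles sources.

```cpp
#include <cassert>

// Hypothetical stand-in for the kernel globals; not the real Cycles type.
struct KernelGlobals {
  int sample_offset = 0;
};

// This kernel does not need the globals. Only `kg` is marked, so any other
// argument that becomes unused would still trigger a compiler warning.
inline float kernel_scale_pixel([[maybe_unused]] const KernelGlobals *kg,
                                float value,
                                float scale)
{
  assert(scale >= 0.0f);
  return value * scale;
}
```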
March 19, 2021, 18:15 (GMT)
Cycles: Avoid temporary buffer allocation for GPUDisplay update on CPU

The idea is to request GPUDisplay to map memory, so that the film
conversion can happen directly into that memory.

Additionally, move away from task to a kernel call.

In my own tests here the full viewport render time on a very simple
file is now about 6% faster, although it is unclear whether this is due
to TBB or due to avoiding the extra memory copy. Still nice.

The GPU part still needs work, and will be handled separately.
March 19, 2021, 18:12 (GMT)
Cycles: Make buffer parameters more coupled to PathTraceWork

It is annoying to pass scaled parameters to every call. A more correct
approach seems to be to make them more "sticky": they become a
property of path trace work, together with the render buffers.

The idea is to have the following in the long term:
- PathTrace allocates render buffers for every PathTraceWork
- PathTrace updates parameters of every PathTraceWork on resolution
divider change.
- PathTrace re-allocates the buffer when balancing is needed to
accommodate the performance difference between render devices.

This way PathTraceWork always knows its buffer pointer and effective
parameters and always uses them in all the API calls.
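
The "sticky" parameters described above can be sketched roughly as follows. This is a simplified illustration under assumptions: the two-field BufferParams, the scale_buffer_params helper and the set_effective_params and num_pixels methods are hypothetical and do not reflect the real class layout.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified buffer parameters; the real structure has more fields.
struct BufferParams {
  int width = 0;
  int height = 0;
};

// Scale full-resolution parameters by the resolution divider,
// keeping each dimension at least one pixel.
inline BufferParams scale_buffer_params(const BufferParams &full, int resolution_divider)
{
  assert(resolution_divider > 0);
  BufferParams scaled;
  scaled.width = std::max(1, full.width / resolution_divider);
  scaled.height = std::max(1, full.height / resolution_divider);
  return scaled;
}

class PathTraceWork {
 public:
  // Called by the owner (PathTrace in the text above) when the divider changes.
  void set_effective_params(const BufferParams &params)
  {
    effective_params_ = params;
  }

  // API calls read the stored parameters instead of taking them as arguments.
  int num_pixels() const
  {
    return effective_params_.width * effective_params_.height;
  }

 private:
  BufferParams effective_params_;
};
```

The owner updates the parameters once per divider change, and every later call on the work object uses the same stored values.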
March 19, 2021, 16:43 (GMT)
Cycles: Avoid explicit CPU-side buffer in GPUDisplay

The idea is to use OpenGL texture and pixel buffer object
to push pixels from PathTrace to the GPU for draw.

The PBO is needed because updating the texture from the
copy_pixels_to_texture() function turned out to be quite
expensive in terms of CPU time spent, degrading viewport
render performance. The downside of this approach is that
the peak memory might be higher when rendering on multiple
devices compared to partial direct texture updates, although
this is hard to tell for sure, since it depends on exactly
how the driver schedules updates and manages memory.

The upside of this change is that it makes it so PBO is
there, ready to be used by CUDA's GL graphics for direct
update without CPU memory round-trip.

The performance is a bit hard to measure, since the timing
varies quite a bit, but on the CPU it is the same ballpark.
March 19, 2021, 16:43 (GMT)
Draw manager: Add utility functions for context creation

Some special logic is needed to release the active DrawManager OpenGL
context when new context needs to be created by a render engine.

This change introduces the functionality needed for such a process,
including a simple example usage in the comment.
March 19, 2021, 15:58 (GMT)
Cycles: add back random walk subsurface scattering

* shade_surface now samples either a BSDF or BSSRDF to continue the path, and
depending on that will continue with intersect_closest or intersect_subsurface.
* The intersect_subsurface kernel then does ray-tracing and scattering below the
surface. If an exit point is found, it returns to shade_surface.
* Since shade_surface is called twice, direct lighting is evaluated both for
the BSSRDF and any (typically specular) BSDF, since only doing either is
quite noisy.
* Note most files use Christensen-Burley so will render differently.
* subsurface_random_walk should be split up into smaller functions, but I've
kept it like the current implementation now, will split it up in a later commit.
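
The kernel flow described in the list above can be sketched as a small transition table. The enum and function names are hypothetical, and ending the path when no exit point is found is an assumption, since the commit message does not say what happens in that case.

```cpp
// Hypothetical identifiers for the kernels named in the commit message.
enum class PathKernel {
  IntersectClosest,
  IntersectSubsurface,
  ShadeSurface,
  Terminate,
};

// Which kind of closure shade_surface sampled to continue the path.
enum class SampledClosure {
  BSDF,
  BSSRDF,
};

// After shading, a BSSRDF sample continues below the surface,
// while a BSDF sample continues with a regular scene intersection.
inline PathKernel next_after_shade_surface(SampledClosure sampled)
{
  return (sampled == SampledClosure::BSSRDF) ? PathKernel::IntersectSubsurface :
                                               PathKernel::IntersectClosest;
}

// After the random walk, a found exit point is shaded again.
// Assumption: without an exit point the path ends.
inline PathKernel next_after_intersect_subsurface(bool exit_point_found)
{
  return exit_point_found ? PathKernel::ShadeSurface : PathKernel::Terminate;
}
```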