Blender Git Log

Blender Git "cycles-x" branch commits.

April 2, 2021, 15:23 (GMT)
Cycles: Remove unused defines in CUDA device
April 2, 2021, 15:23 (GMT)
Cycles: Enable adaptive sampling tests

Got lost in one of the previous changes: they were disabled during
development.
April 2, 2021, 15:22 (GMT)
Cycles: restore fine grained bounce depth controls

Unsure if we want to keep all of this in the end, but useful now for more
accurate performance comparisons.
April 2, 2021, 14:27 (GMT)
Cycles: Scale all passes with adaptive samples count

Do it in the RenderBuffers, similar to how the combined pass was handled.

Scaling by the sample count here is slower than doing it on the
device, but the scaling has to be performed somewhere.

In practice the downside is that the per-pixel scaling by the inverse
sample count is now done from a single thread. However, this can be
improved by using tbb to scale the pass from multiple threads.

The benefit of this approach is that the pass scaling kernel is not
needed, which solves the ambiguity about when to run it. It also simplifies
the code in the sense that the scaling logic is not duplicated in the kernel.
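
A minimal sketch of the host-side scaling described above, not the actual RenderBuffers code. The `BufferSketch` struct, its field names, and the assumption that the per-pixel sample count is stored as a float inside each pixel are all hypothetical and only illustrate the idea of multiplying a pass by the inverse sample count from a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a render buffer: `pass_stride` floats per pixel,
// with the per-pixel sample count stored at `sample_count_offset`.
struct BufferSketch {
  std::vector<float> data;
  std::size_t pass_stride;
  std::size_t sample_count_offset;
};

// Scale `num_components` floats of one pass, starting at `pass_offset` within
// each pixel, by the inverse of that pixel's own sample count.
inline void scale_pass_by_sample_count(BufferSketch &buffer,
                                       std::size_t pass_offset,
                                       std::size_t num_components)
{
  assert(buffer.pass_stride > 0 && buffer.data.size() % buffer.pass_stride == 0);
  assert(pass_offset + num_components <= buffer.pass_stride);

  const std::size_t num_pixels = buffer.data.size() / buffer.pass_stride;
  for (std::size_t i = 0; i < num_pixels; ++i) {
    float *pixel = buffer.data.data() + i * buffer.pass_stride;
    const float num_samples = pixel[buffer.sample_count_offset];
    if (num_samples <= 0.0f) {
      continue; /* Nothing was accumulated for this pixel. */
    }
    const float scale = 1.0f / num_samples;
    for (std::size_t c = 0; c < num_components; ++c) {
      pixel[pass_offset + c] *= scale;
    }
  }
}
```

The outer loop has no dependencies between pixels, which is why it could later be split across threads.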
April 2, 2021, 14:27 (GMT)
Cycles: Allow const-pointer access to device_memory
April 2, 2021, 14:22 (GMT)
Cycles: reduce size of shadow path state by moving to own struct
April 2, 2021, 14:22 (GMT)
Cleanup: don't put unused CUDA KernelGlobals on the stack
April 2, 2021, 14:22 (GMT)
Cycles: use SoA layout for IntegratorState on the GPU
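
For readers unfamiliar with the term, the following is a generic sketch of a structure-of-arrays (SoA) layout, not the real IntegratorState. The member names `throughput` and `bounce` and the helper `advance_bounce` are hypothetical. With one array per member, neighbouring GPU threads that read the same member of neighbouring paths touch adjacent memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structures: all members of one path are stored together.
struct PathStateAoS {
  float throughput;
  int bounce;
};

// Structure-of-arrays: each member gets its own array, indexed by path.
struct PathStateSoA {
  std::vector<float> throughput;
  std::vector<int> bounce;

  explicit PathStateSoA(std::size_t num_paths)
      : throughput(num_paths, 1.0f), bounce(num_paths, 0)
  {
  }
};

// Update one path; every member access goes through the path index.
inline void advance_bounce(PathStateSoA &state, std::size_t path_index, float attenuation)
{
  assert(path_index < state.bounce.size());
  state.throughput[path_index] *= attenuation;
  state.bounce[path_index] += 1;
}
```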
April 2, 2021, 14:22 (GMT)
Cycles: put integrator queue pointer in constant memory

Instead of passing it around to various functions.
April 2, 2021, 14:22 (GMT)
Cycles: remove INTEGRATOR_STATE_COPY macro

This will be tricky with SoA; just do it manually in the one place that uses it.
April 2, 2021, 13:32 (GMT)
Fix error in Cycles versioning code after removal of branched path
April 2, 2021, 13:27 (GMT)
Fix error in removal of NLM denoiser causing CUDA failures
April 2, 2021, 12:12 (GMT)
Cycles: Speedup adaptive sampling on CPU

- Do an early output in the convergence test, to avoid the error calculation
for a pixel which is already known to have converged.

- Better thread scheduling in the path trace work, to avoid an extra
call of `parallel_for` which has (unmeasurable) overhead.

- The biggest change is to stop parallel samples for a pixel once it has
converged. Prior to this change the path trace work would attempt to
initialize path state for many samples in a row for this pixel.

Timing for sampling a simple file (diffuse monkey on diffuse plane):

- master is 3.89 sec
- before this change: 4.24 sec
- after this change: 4.04 sec
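
The first and third points above can be sketched roughly as follows. This is a simplified illustration, not the Cycles implementation: `PixelState`, the scalar `error` field, the threshold comparison, and checking convergence before every sample are all assumptions made to keep the example short.

```cpp
#include <cassert>
#include <functional>

// Hypothetical per-pixel bookkeeping for adaptive sampling.
struct PixelState {
  float error = 1.0f;
  bool converged = false;
  int error_evaluations = 0; /* Counts how often the error was really evaluated. */
};

// Convergence test with an early output: a pixel already marked as converged
// returns immediately without evaluating the error again.
inline bool test_convergence(PixelState &pixel, float threshold)
{
  if (pixel.converged) {
    return true;
  }
  ++pixel.error_evaluations;
  pixel.converged = pixel.error < threshold;
  return pixel.converged;
}

// Take up to `max_samples` samples, stopping as soon as the pixel converges
// instead of scheduling the remaining samples for it.
inline int sample_pixel(PixelState &pixel,
                        int max_samples,
                        float threshold,
                        const std::function<void(PixelState &)> &add_sample)
{
  assert(max_samples >= 0);
  int samples_taken = 0;
  while (samples_taken < max_samples && !test_convergence(pixel, threshold)) {
    add_sample(pixel);
    ++samples_taken;
  }
  return samples_taken;
}
```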
April 1, 2021, 17:14 (GMT)
Cycles: Move need display update check to own function

Currently no functional changes, but this allows more elaborate
logic in its implementation.
April 1, 2021, 17:14 (GMT)
Cycles: Implement convergence and filtering kernels for CUDA

At the user level this means that adaptive sampling works on CUDA.

The missing part for all devices is pass scaling at the end of the render.
Need to look into whether there is some smarter trick we can do.
April 1, 2021, 17:14 (GMT)
Cycles: Use better naming in RenderScheduler

No functional changes.
April 1, 2021, 17:14 (GMT)
Cycles: Remove hardcoded CUDA functions from CUDADevice

Use more dynamic and flexible CUDADeviceKernels.
April 1, 2021, 17:14 (GMT)
Cycles: More accurate time tracking for display update
April 1, 2021, 17:14 (GMT)
Cycles: Move adaptive sampling convergence test to own kernel

With this change the check is only done after all contributions to
the sample are known. It also makes the path tracing kernel
faster because it no longer does a per-bounce convergence test.

The CPU sample distribution now works similarly to the
master branch.

The timing on CPU is much closer to the master branch, but a more accurate
measurement with proper compilation flags is needed.

GPU adaptive sampling is temporarily broken.
April 1, 2021, 17:14 (GMT)
Cycles: Better display update scheduling for adaptive sampling

Adaptive sampling forces a maximum number of samples per render work in
order for all filtering points to happen. On simple scenes this
could lead to very frequent updates, which heavily degrades performance.

Now the rendering scheduler will ignore display updates if they happen
too often, even when adaptive sampling is used.
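
One way to picture the throttling described above is a minimum-interval check. The class below is a hypothetical sketch, not the RenderScheduler code; the name `DisplayUpdateThrottle`, the `need_update` method, and the fixed interval are assumptions.

```cpp
#include <cassert>

// Hypothetical helper which rejects display updates requested sooner than
// `min_interval_seconds` after the previously accepted update.
class DisplayUpdateThrottle {
 public:
  explicit DisplayUpdateThrottle(double min_interval_seconds)
      : min_interval_(min_interval_seconds)
  {
    assert(min_interval_seconds >= 0.0);
  }

  // Returns true when an update should happen at time `now_seconds`, and
  // records that time as the last update. Returns false otherwise.
  bool need_update(double now_seconds)
  {
    if (has_updated_ && now_seconds - last_update_ < min_interval_) {
      return false;
    }
    last_update_ = now_seconds;
    has_updated_ = true;
    return true;
  }

 private:
  double min_interval_;
  double last_update_ = 0.0;
  bool has_updated_ = false;
};
```

The caller passes in the current time, which keeps the check independent of any particular clock.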
Made by: Miika Hämäläinen. Last updated: 07.11.2014 14:18. MiikaH:n Sivut a.k.a. MiikaHweb | 2003-2021